java.lang.Object
  org.apache.james.ai.classic.BayesianAnalyzer
public class BayesianAnalyzer
Determines the probability that a text contains spam.
Based upon Paul Graham's A Plan for Spam, and extended to incorporate Paul Graham's Better Bayesian Filtering.
Sample method usage:
Use: void addHam(Reader) and void addSpam(Reader) to build up the Maps of ham and spam tokens and their occurrence counts. Both methods assume they read one message per call; if you feed more than one message in a single call, adjust the corresponding message counter (hamMessageCount or spamMessageCount) yourself. Then...
Use: void buildCorpus() to build the final Map of tokens to probabilities. Persistent storage is left to you: store the individual ham and spam token counts together with the message counts, the final corpus, or both. Then you can...
Use: double computeSpamProbability(Reader) to determine the probability that a particular text contains spam. A returned result of 0.9 or above indicates that the text is spam.
If you use persistent storage, call void setCorpus(Map) with the stored corpus before calling computeSpamProbability.
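Since persistence is left to the caller, the following is a minimal sketch of one way to store and reload the corpus Map. The `CorpusStore` class and its `save` and `load` methods are hypothetical helpers, not part of James, and the `java.util.Properties` text format is only one possible choice.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical helper, not part of James: stores a token -> probability Map as text.
class CorpusStore {

    // Writes each token/probability pair as one properties entry.
    static void save(Map<String, Double> corpus, Writer out) throws IOException {
        Properties props = new Properties();
        for (Map.Entry<String, Double> entry : corpus.entrySet()) {
            props.setProperty(entry.getKey(), Double.toString(entry.getValue()));
        }
        props.store(out, "token probabilities");
    }

    // Reads the pairs back into a Map of the type that setCorpus(Map<String,Double>) accepts.
    static Map<String, Double> load(Reader in) throws IOException {
        Properties props = new Properties();
        props.load(in);
        Map<String, Double> corpus = new HashMap<>();
        for (String token : props.stringPropertyNames()) {
            corpus.put(token, Double.parseDouble(props.getProperty(token)));
        }
        return corpus;
    }

    public static void main(String[] args) throws IOException {
        // Illustrative values only; a real corpus would come from getCorpus().
        Map<String, Double> corpus = new HashMap<>();
        corpus.put("viagra", 0.99);
        corpus.put("meeting", 0.01);

        StringWriter buffer = new StringWriter();
        save(corpus, buffer);
        Map<String, Double> restored = load(new StringReader(buffer.toString()));
        System.out.println("Round trip preserved the corpus: " + restored.equals(corpus));
    }
}
```

With the real class, the Map returned by `load` would be handed to setCorpus(Map<String,Double>) before calling computeSpamProbability. If you store the ham and spam token counts instead, the message counts from getHamMessageCount() and getSpamMessageCount() need to be saved alongside them.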
| Constructor Summary | |
|---|---|
| BayesianAnalyzer() | Basic class constructor. |
| Method Summary | | |
|---|---|---|
| void | addHam(Reader stream) | Adds a message to the ham list. |
| void | addSpam(Reader stream) | Adds a message to the spam list. |
| void | buildCorpus() | Builds the corpus from the existing ham & spam counts. |
| void | clear() | Clears all analysis repositories and counters. |
| double | computeSpamProbability(Reader stream) | Computes the probability that the stream contains SPAM. |
| Map<String,Double> | getCorpus() | Public getter for corpus. |
| int | getHamMessageCount() | Public getter for hamMessageCount. |
| Map<String,Integer> | getHamTokenCounts() | Public getter for the hamTokenCounts Map. |
| int | getSpamMessageCount() | Public getter for spamMessageCount. |
| Map<String,Integer> | getSpamTokenCounts() | Public getter for the spamTokenCounts Map. |
| void | setCorpus(Map<String,Double> corpus) | Public setter for corpus. |
| void | setHamMessageCount(int hamMessageCount) | Public setter for hamMessageCount. |
| void | setHamTokenCounts(Map<String,Integer> hamTokenCounts) | Public setter for the hamTokenCounts Map. |
| void | setSpamMessageCount(int spamMessageCount) | Public setter for spamMessageCount. |
| void | setSpamTokenCounts(Map<String,Integer> spamTokenCounts) | Public setter for the spamTokenCounts Map. |
| void | tokenCountsClear() | Clears token counters. |
| Methods inherited from class java.lang.Object | 
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait | 
| Constructor Detail | 
|---|
public BayesianAnalyzer()
| Method Detail | 
|---|
public void setHamTokenCounts(Map<String,Integer> hamTokenCounts)
hamTokenCounts - The new ham token counts Map.
public Map<String,Integer> getHamTokenCounts()
public void setSpamTokenCounts(Map<String,Integer> spamTokenCounts)
spamTokenCounts - The new spam token counts Map.
public Map<String,Integer> getSpamTokenCounts()
public void setSpamMessageCount(int spamMessageCount)
spamMessageCount - The new spam message count.
public int getSpamMessageCount()
public void setHamMessageCount(int hamMessageCount)
hamMessageCount - The new ham message count.
public int getHamMessageCount()
public void clear()
public void tokenCountsClear()
public void setCorpus(Map<String,Double> corpus)
corpus - The new corpus.
public Map<String,Double> getCorpus()
public void buildCorpus()
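This page does not spell out how buildCorpus() turns counts into probabilities. As a rough illustration, the sketch below applies the per-token formula published in A Plan for Spam. The class name `TokenProbabilitySketch`, the method name, and the assumption that the constants (doubling of ham counts, the threshold of 5, the 0.4 default, and the 0.01 to 0.99 clamp) match this class are all hypothetical.

```java
// Sketch of the per-token formula from "A Plan for Spam".
// Assumption: the constants are the essay's; this page does not confirm what buildCorpus() uses.
class TokenProbabilitySketch {

    static double tokenProbability(int hamCount, int spamCount,
                                   int hamMessages, int spamMessages) {
        if (hamMessages <= 0 || spamMessages <= 0) {
            throw new IllegalArgumentException("both message counts must be positive");
        }
        double g = 2.0 * hamCount; // ham occurrences are doubled to bias against false positives
        double b = spamCount;
        if (g + b < 5) {
            // The essay skips such rare tokens and later assigns 0.4 to tokens missing from its table.
            return 0.4;
        }
        double hamFreq = Math.min(1.0, g / hamMessages);
        double spamFreq = Math.min(1.0, b / spamMessages);
        double p = spamFreq / (hamFreq + spamFreq);
        // Clamp so that no single token is treated as absolute proof.
        return Math.max(0.01, Math.min(0.99, p));
    }
}
```

The inputs correspond to the values exposed by getHamTokenCounts(), getSpamTokenCounts(), getHamMessageCount(), and getSpamMessageCount(), which is why the message counters have to be accurate before the corpus is built.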
public void addHam(Reader stream)
            throws IOException
stream - A reader stream on the ham message to analyze
IOException - If any error occurs
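To make the idea of token occurrence counts concrete, here is a minimal sketch of counting tokens read from a Reader. It is not the tokenizer used by addHam or addSpam: the class `TokenCountSketch` is hypothetical, and the separator rule (alphanumerics, dashes, apostrophes, and dollar signs form tokens) is taken from A Plan for Spam as an assumption.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.Map;

// Hypothetical illustration of building a token -> occurrence count Map from one message.
class TokenCountSketch {

    // Adds every token found in the stream to counts. Case is preserved, so
    // "Buy" and "buy" are counted as different tokens in this sketch.
    static void countTokens(Reader stream, Map<String, Integer> counts) throws IOException {
        BufferedReader reader = new BufferedReader(stream);
        String line;
        while ((line = reader.readLine()) != null) {
            // Assumed rule: anything other than letters, digits, $, ' and - separates tokens.
            for (String token : line.split("[^A-Za-z0-9$'-]+")) {
                if (!token.isEmpty()) {
                    counts.merge(token, 1, Integer::sum);
                }
            }
        }
    }
}
```

A Map built this way has the same shape as the ones accepted by setHamTokenCounts and setSpamTokenCounts. If a single call covers several messages, the documented setHamMessageCount(int) and setSpamMessageCount(int) setters are where the message count can be corrected afterwards.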
public void addSpam(Reader stream)
             throws IOException
stream - A reader stream on the spam message to analyze
IOException - If any error occurs
public double computeSpamProbability(Reader stream)
                              throws IOException
stream - The text to be analyzed for Spamminess.
IOException - If any error occurs
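For orientation, A Plan for Spam combines the probabilities of a message's most interesting tokens into a single score. The sketch below shows that combination and the 0.9 cutoff mentioned in the class description. `CombineSketch` and its methods are hypothetical names, and how computeSpamProbability selects its tokens is not covered here.

```java
// Sketch of the combined-probability formula from "A Plan for Spam".
// Assumption: inputs are already clamped away from 0 and 1 (for example to 0.01..0.99),
// otherwise the result can become NaN.
class CombineSketch {

    // Returns prod(p) / (prod(p) + prod(1 - p)) over the given token probabilities.
    static double combine(double[] tokenProbabilities) {
        double spamProduct = 1.0;
        double hamProduct = 1.0;
        for (double p : tokenProbabilities) {
            spamProduct *= p;
            hamProduct *= (1.0 - p);
        }
        return spamProduct / (spamProduct + hamProduct);
    }

    // Applies the threshold given in the class description: 0.9 or above indicates spam.
    static boolean looksLikeSpam(double probability) {
        return probability >= 0.9;
    }

    public static void main(String[] args) {
        double score = combine(new double[] {0.99, 0.95, 0.20});
        System.out.println("score=" + score + " spam=" + looksLikeSpam(score));
    }
}
```

A caller of the real class would apply the same kind of threshold check to the value returned by computeSpamProbability(Reader).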