Class BayesianAnalysis

  extended by org.apache.mailet.base.GenericMailet
      extended by
All Implemented Interfaces:
Log, Mailet, MailetConfig

public class BayesianAnalysis
extends GenericMailet
implements Log

Spam detection mailet using Bayesian analysis techniques.

Sets an email message header indicating the probability that an email message is SPAM.

Based upon the principles described in: A Plan For Spam by Paul Graham. Extended to Paul Graham's Better Bayesian Filtering.
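For orientation, the combining step from Graham's A Plan For Spam can be sketched as below. This is an illustration of the formula from the essay, not code taken from this mailet; the class name GrahamCombineSketch and the method combine are hypothetical, and the per-token probabilities are assumed to lie strictly between 0 and 1.

```java
import java.util.List;

/** Illustrative only: combines per-token spam probabilities, Graham style. */
final class GrahamCombineSketch {

    /**
     * Returns prod(p) / (prod(p) + prod(1 - p)) for the given token
     * probabilities. Inputs are assumed to be strictly between 0 and 1,
     * so the denominator cannot be zero.
     */
    static double combine(List<Double> tokenProbabilities) {
        double spamProduct = 1.0;
        double hamProduct = 1.0;
        for (double p : tokenProbabilities) {
            spamProduct *= p;
            hamProduct *= (1.0 - p);
        }
        return spamProduct / (spamProduct + hamProduct);
    }

    public static void main(String[] args) {
        // Two strongly spammy tokens and one mildly hammy token.
        System.out.println(combine(List.of(0.99, 0.9, 0.2)));
    }
}
```

Tokens with probabilities near 0.5 barely move the result, while a few tokens near 0 or 1 dominate it.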

The analysis capabilities are based on token frequencies (the Corpus) learned through a training process (see BayesianAnalysisFeeder) and stored in a JDBC database. After a training session, the Corpus must be rebuilt from the database in order to acquire the new frequencies. Every 10 minutes a special thread in this mailet will check if any change was made to the database by the feeder, and rebuild the corpus if necessary.
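The periodic corpus check described above might look roughly like the following. This is a minimal sketch under stated assumptions, not the mailet's actual implementation: the class CorpusReloadSketch, the lastDatabaseUpdateTime supplier, and the rebuildCorpus callback are all hypothetical stand-ins for whatever the mailet uses internally.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

/** Illustrative only: rebuilds a corpus when the backing database has changed. */
final class CorpusReloadSketch {
    private final LongSupplier lastDatabaseUpdateTime; // hypothetical: time of the feeder's last write
    private final Runnable rebuildCorpus;              // hypothetical: reloads token frequencies
    private volatile long lastCorpusLoadTime;

    CorpusReloadSketch(LongSupplier lastDatabaseUpdateTime, Runnable rebuildCorpus,
                       long initialLoadTime) {
        this.lastDatabaseUpdateTime = lastDatabaseUpdateTime;
        this.rebuildCorpus = rebuildCorpus;
        this.lastCorpusLoadTime = initialLoadTime;
    }

    /** Rebuilds only if the database changed after the last load; returns true if rebuilt. */
    boolean checkOnce(long now) {
        if (lastDatabaseUpdateTime.getAsLong() > lastCorpusLoadTime) {
            rebuildCorpus.run();
            lastCorpusLoadTime = now;
            return true;
        }
        return false;
    }

    /** Starts a daemon thread that runs the check every 10 minutes. */
    ScheduledExecutorService start() {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "corpus-reload-sketch");
            t.setDaemon(true);
            return t;
        });
        scheduler.scheduleWithFixedDelay(
                () -> checkOnce(System.currentTimeMillis()), 10, 10, TimeUnit.MINUTES);
        return scheduler;
    }
}
```

Comparing timestamps before rebuilding keeps the common case cheap, since the corpus is reloaded only after a training session has actually written to the database.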

An org.apache.james.spam.probability mail attribute is created containing the computed spam probability as a Double. The message header named by headerName is created containing the same probability in floating-point representation.

Sample configuration:

 <mailet match="All" class="BayesianAnalysis">
     Set this to the header name to add with the spam probability
     (default is "X-MessageIsSpamProbability").
     Set this to true if you want to ignore messages coming from local senders
     (default is false).
     By local sender we mean a return-path with a local server part (server listed
     in <servernames> in config.xml).
     Set this to the maximum message size (in bytes) that a message may have
     to be considered spam (default is 100000).
     Set this to false if you do not want to tag the message when spam is detected (default is true).

The probability of being spam is prepended to the subject if it is greater than 0.1 (10%).
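The threshold rule can be illustrated with a short sketch. Only the 0.1 threshold and the fact that the probability goes in front of the subject come from the description above; the class SubjectTagSketch, the method tagSubject, and the bracketed two-decimal tag format are assumptions made for this example and may differ from what the mailet actually writes.

```java
import java.util.Locale;

/** Illustrative only: prepends a spam probability to a subject line. */
final class SubjectTagSketch {
    static final double TAG_THRESHOLD = 0.1;

    /** Returns the subject unchanged unless the probability exceeds the threshold. */
    static String tagSubject(String subject, double probability) {
        if (!(probability > TAG_THRESHOLD)) {
            return subject;
        }
        // Hypothetical tag format; Locale.ROOT keeps the decimal separator a dot.
        String tag = String.format(Locale.ROOT, "[%.2f]", probability);
        return (subject == null || subject.isEmpty()) ? tag : tag + " " + subject;
    }
}
```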

The required tables are automatically created if not already there (see sqlResources.xml). The token field in both the ham and spam tables is case sensitive.

See Also:
BayesianAnalysisFeeder, BayesianAnalyzer, JDBCBayesianAnalyzer

Constructor Summary
Method Summary
 long getLastCorpusLoadTime()
          Getter for property lastCorpusLoadTime.
 String getMailetInfo()
          Return a string describing this mailet.
 int getMaxSize()
          Getter for property maxSize.
 void init()
          Mailet initialization routine.
 void service(Mail mail)
          Scans the mail and determines the spam probability.
 void setDataSource(DataSource datasource)
 void setFileSystem(SystemContext fs)
 void setMaxSize(int maxSize)
          Setter for property maxSize.
Methods inherited from class org.apache.mailet.base.GenericMailet
arrayToString, checkInitParameters, destroy, getInitParameter, getInitParameter, getInitParameter, getInitParameterNames, getMailetConfig, getMailetContext, getMailetName, init, log, log
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface
log, log

Constructor Detail


public BayesianAnalysis()
Method Detail


public String getMailetInfo()
Return a string describing this mailet.

Specified by:
getMailetInfo in interface Mailet
Overrides:
getMailetInfo in class GenericMailet
Returns:
a string describing this mailet


public int getMaxSize()
Getter for property maxSize.

Returns:
Value of property maxSize.


public void setMaxSize(int maxSize)
Setter for property maxSize.

Parameters:
maxSize - New value of property maxSize.


public long getLastCorpusLoadTime()
Getter for property lastCorpusLoadTime.

Returns:
Value of property lastCorpusLoadTime.


public void setDataSource(DataSource datasource)


public void setFileSystem(SystemContext fs)


public void init()
          throws javax.mail.MessagingException
Mailet initialization routine.

Overrides:
init in class GenericMailet
Throws:
javax.mail.MessagingException - if a problem arises


public void service(Mail mail)
             throws javax.mail.MessagingException
Scans the mail and determines the spam probability.

Specified by:
service in interface Mailet
Specified by:
service in class GenericMailet
Parameters:
mail - The Mail message to be scanned.
Throws:
javax.mail.MessagingException - if a problem arises

Copyright © 2008-2012 The Apache Software Foundation. All Rights Reserved.