Provided by: dspam_3.10.2+dfsg-13_amd64

NAME

       dspamc - DSPAM Anti-Spam Agent (client)

SYNOPSIS

       dspamc [--mode=[teft|toe|tum|notrain|unlearn]] [--user user1 user2 ... userN]
       [--feature=[ch,no,wh,tb=N,sb]] [--class=[spam|innocent]]
       [--source=[error|corpus|inoculation]] [--profile=[PROFILE]] [--deliver=[spam,innocent]]
       [--help] [--process] [--classify] [--signature=[signature]] [--stdout] [--debug]
       [--daemon] [--client] [--rcpt-to] [--mail-from] [delivery_arguments]

DESCRIPTION

       The  DSPAM  agent  provides  a  direct  interface  to  mail  servers for command-line spam
       filtering. The agent can masquerade as the mail server's local  delivery  agent  and  will
       process  any  email  passed  to  it.  The agent will then call whatever delivery agent was
       specified at compile time or quarantine/tag/drop messages identified as  spam.  The  DSPAM
       agent  can  function  locally  or  as  a  proxy.  It  is  also  responsible for processing
       classification errors so that DSPAM can learn from its mistakes.   This  version  (dspamc)
       uses a connection to a dspam server rather than re-create contexts on each execution.

OPTIONS

       --user user1 user2 ... userN
              Specifies the destination users of the incoming message. In most cases this is
              the local user on the system, however some implementations  may  call  for  virtual
              usernames,  specific  to  DSPAM,  to  be assigned.  The agent processes an incoming
              message once for each user specified. If the message is to be delivered, the $u (or
              %u)  parameters  of  the  argument string will be interpolated for the current user
              being processed.
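
              The per-user interpolation described above can be pictured with a minimal
              Python sketch. The function name expand_delivery_args is hypothetical and is
              not part of DSPAM; the sketch only assumes that $u and %u are replaced literally
              by the name of the user currently being processed.

```python
def expand_delivery_args(delivery_args, users):
    """Yield one interpolated delivery argument list per destination user.

    Assumption: "$u" and "%u" are substituted literally with the user name.
    """
    for user in users:
        yield [arg.replace("$u", user).replace("%u", user) for arg in delivery_args]


# The message is processed once per user, so two users give two argument lists.
for argv in expand_delivery_args(["-d", "%u"], ["alice", "bob"]):
    print(argv)
```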

       --mode=[toe|tum|teft|notrain|unlearn]
              Configures the training mode to be used for this process, overriding any
              defaults in dspam.conf:

              teft  :  Train-Everything.   Trains  on  all  messages  processed.   This is a very
              thorough training approach and should be considered the standard training  approach
              for  most  users.   TEFT  may,  however,  prove  too volatile on installations with
              extremely high per-user traffic,  or  prove  not  very  scalable  on  systems  with
              extremely  large user-bases.  In the event that TEFT is proving ineffective, one of
              the other modes is recommended.

              toe : Train-on-Error.  Trains only on  a  classification  error,  once  the  user's
              metadata  has  matured  to 2500 innocent messages.  This training mode is much less
              resource intensive, as only occasional metadata writes are necessary.  It  is  also
              far  less  volatile than the TEFT mode of training.  One drawback, however, is that
              TOE only learns when DSPAM has made a mistake - which means the data  is  sometimes
              too static, and unable to "ease into" a different type of behavior.

              tum  :  Train-until-Mature.   This  training mode is a hybrid between the other two
              training modes and provides a great balance between volatility and static metadata.
              TuM trains, on a per-token basis, only those tokens that have had fewer than 25
              "hits", unless an error is being retrained, in which case all tokens are trained.
              This  training  mode  provides  a  solid  core  of  stable  tokens to keep accuracy
              consistent, but also allows for dynamic  adaptation  to  any  new  types  of  email
              behavior a user might be experiencing.

              notrain : No training.  Do not train the user's data, and do not keep totals.  This
              should only be used in cases where you want to process mail for a  particular  user
              (based on a group, for example), but don't want the user to accumulate any learning
              data.

              unlearn : Unlearn original training. Use this if you wish to unlearn a previously
              learned message. Be sure to specify --source=error and set --class to the
              classification under which the message was originally learned. If not using
              TrainPristine, this will require the original signature from training.
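
              The difference between the training modes can be sketched as a decision about
              which of a message's tokens get trained. This is an illustrative model, not
              DSPAM's implementation: the function name tokens_to_train is hypothetical, the
              unlearn mode is not modeled, and the sketch assumes that TOE trains on every
              message until the 2500-message maturity point is reached.

```python
TOE_MATURITY = 2500  # innocent messages, taken from the toe description
TUM_HIT_LIMIT = 25   # per-token hits, taken from the tum description


def tokens_to_train(mode, token_hits, is_error, innocent_total):
    """Return the set of tokens that would be trained for one message.

    token_hits maps each token in the message to its existing hit count.
    """
    if mode == "notrain":
        return set()
    if mode not in ("teft", "toe", "tum"):
        raise ValueError(f"mode not modeled in this sketch: {mode}")
    if is_error or mode == "teft":
        # Errors are always retrained in full; TEFT trains on everything.
        return set(token_hits)
    if mode == "toe":
        # Assumption: before maturity, TOE trains on every message.
        return set(token_hits) if innocent_total < TOE_MATURITY else set()
    # tum: only tokens that have not yet matured are trained.
    return {token for token, hits in token_hits.items() if hits < TUM_HIT_LIMIT}


print(tokens_to_train("tum", {"offer": 30, "meeting": 3}, False, 5000))
```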

       --feature=[chained,noise,tb=N,whitelist,sbph]
              Specifies the features that should be activated for this filter instance. The
              following features may be used individually or combined using a comma as a
              delimiter:

              chained : Chained Tokens (also known as bigrams). Chained Tokens combine adjacent
              tokens, presently with a window size of 2, to form token "chains". Chained tokens
              use additional storage resources, but greatly improve accuracy. Recommended as a
              default feature.

              noise :  Bayesian Noise Reduction (BNR).  Bayesian Noise Reduction kicks in at 2500
              innocent  messages  and  provides  an  advanced  progressive  noise logic to reduce
              Bayesian Noise (wordlist attacks) in spams.  See http://bnr.nuclearelephant.com for
              more information.

              tb=N  :   Sets  the  training loop buffering level.  Training loop buffering is the
              amount of statistical sedation performed to water down statistics and  avoid  false
              positives during the user's training loop. The training buffer sets the buffer
              sensitivity, and should be a number between 0 (no buffering whatsoever) and 10
              (heavy buffering). The default is 5, half of what previous versions of DSPAM used.
              To avoid dulling down statistics at all during the training loop, set this to 0.

              whitelist :  Automatic whitelisting.  DSPAM will keep track of the  entire  "From:"
              line  for each message received per user, and automatically whitelist messages from
              senders with more than 20 innocent messages and zero spams.  Once the user  reports
              a  spam  from  the sender, automatic whitelisting will automatically be deactivated
              for that sender.  Since DSPAM uses the  entire  "From:"  line,  and  not  just  the
              sender's email address, automatic whitelisting is a very safe approach to improving
              accuracy especially during initial training.

              sbph :  Sparse Binary Polynomial Hashing. Bill  Yerazunis'  tokenizer  method  from
              CRM114. Tokenizer method only - works with existing combination algorithms.
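
              Two of the features above lend themselves to a short illustration: chained
              tokens (a window of 2 over adjacent tokens) and the automatic whitelisting
              threshold (more than 20 innocent messages and zero spams from a sender). The
              function names and the "+" separator used to join a chain are assumptions made
              for this sketch; DSPAM's internal token representation may differ.

```python
def chained_tokens(tokens):
    """Pair each token with its right-hand neighbour (window size 2)."""
    return [f"{left}+{right}" for left, right in zip(tokens, tokens[1:])]


def is_auto_whitelisted(innocent_count, spam_count):
    """Apply the threshold stated above for a single "From:" line."""
    return innocent_count > 20 and spam_count == 0


print(chained_tokens(["buy", "cheap", "pills"]))
print(is_auto_whitelisted(21, 0))
```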

       --class=[spam|innocent]
              Identifies the disposition (if any) of the message being presented. This flag
              should be used when a misclassification has occurred, when the user is
              corpus-feeding  a  message,  or  when  an inoculation is being presented. This flag
              should not be used for standard processing. This flag must be used  in  conjunction
              with  the  --source  flag.  Omitting  this  flag  causes  DSPAM  to  determine  the
              disposition of the message on its own (the standard operating mode).

       --source=[error|corpus|inoculation]
              Where --class is used, the source of the classification must also be provided. The
              source tells dspam how to learn the message being presented:

              error  :  The  message  being  presented  was a message previously misclassified by
              DSPAM.  When 'error' is provided  as  a  source,  DSPAM  requires  that  the  DSPAM
              signature  be  present  in  the  message,  and will use the signature to recall the
              original training metadata.  If the signature is not present, the message  will  be
              rejected.   In  this  source  mode, DSPAM will also decrement each token's previous
              classification's count as well as the user totals.

              You should use error only when DSPAM has made an error in classifying the  message,
              and  should  present  the  modified version of the message with the DSPAM signature
              when doing so.

              corpus : The message being presented is from a mail corpus, and should  be  trained
              as  a new message, rather than re-trained based on a signature.  The message's full
              headers  and  body  will  be  analyzed  and  the  correct  classification  will  be
              incremented, without its opposite being decremented.

              You should use corpus only when feeding messages in from a corpus.

              inoculation  :  The  message  being  presented  is  in pristine form, and should be
              trained as an inoculation.  Inoculations  are  a  more  intense  mode  of  training
              designed to cause DSPAM to train the user's metadata repeatedly on previously
              unknown tokens, in an attempt to vaccinate the user against future messages similar to
              the  one  being  presented.   You  should use inoculation only on honeypots and the
              like.
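
              The bookkeeping difference between the error and corpus sources can be modeled
              with per-token counters. This is a simplified sketch, not DSPAM's storage
              logic: the function name apply_training is hypothetical, the tokens are passed
              in directly rather than recalled from a signature, inoculation is not modeled,
              and counts are assumed never to drop below zero.

```python
from collections import Counter


def apply_training(counts, tokens, message_class, source):
    """Update (token, class) counters for one trained message."""
    if message_class not in ("spam", "innocent"):
        raise ValueError(f"unknown class: {message_class}")
    if source not in ("error", "corpus"):
        raise ValueError(f"source not modeled in this sketch: {source}")
    opposite = "innocent" if message_class == "spam" else "spam"
    for token in tokens:
        counts[(token, message_class)] += 1
        # Only an error retrain undoes the earlier, wrong classification.
        if source == "error" and counts[(token, opposite)] > 0:
            counts[(token, opposite)] -= 1
    return counts


counts = Counter({("refinance", "innocent"): 1})
apply_training(counts, ["refinance"], "spam", "error")
print(counts[("refinance", "spam")], counts[("refinance", "innocent")])
```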

       --profile=[PROFILE]
              Specify a storage profile from dspam.conf. The storage profile selected will be
              used for all database connectivity. See dspam.conf for more information.

       --deliver=[innocent,spam]
              Tells DSPAM to deliver the message if its result falls within the criteria
              specified. For example, --deliver=innocent will cause DSPAM to only deliver the
              message if its classification has been determined as innocent. Providing
              --deliver=innocent,spam will cause DSPAM to deliver the message regardless of its
              classification. This flag provides a significant amount of flexibility for
              nonstandard implementations.
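
              The delivery criteria above amount to a set-membership test. A minimal sketch,
              assuming the comparison is case-insensitive; the function name should_deliver is
              hypothetical and not part of DSPAM:

```python
def should_deliver(result, deliver):
    """Return True if a classification result matches the --deliver value.

    deliver is the comma-separated string given to --deliver.
    """
    allowed = {item.strip().lower() for item in deliver.split(",") if item.strip()}
    return result.strip().lower() in allowed


print(should_deliver("spam", "innocent"))
print(should_deliver("spam", "innocent,spam"))
```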

       --stdout
              If the message is indeed deemed "deliverable" by the --deliver flag, this flag will
              cause DSPAM to deliver the message to stdout, rather than the configured delivery
              agent.

       --process
              Tells DSPAM to process the message. This is the default behavior, and the flag is
              implied unless --classify is used.

       --classify
              Tells DSPAM to only classify the message, and not perform any writes to the user's
              data or attempt to deliver/quarantine the message. The results of a classification
              are printed to stdout in the following format:

              X-DSPAM-Result: User; result="Spam"; probability=1.0000; confidence=0.80

              NOTE  : The output of the classification is specific to a user's own data, and does
              not include the output of any groups they  might  be  affiliated  with,  so  it  is
              entirely  possible  that  the  message  would be caught as spam by a group the user
              belongs to, and appear as innocent in the output of a classification. To get the
              classification for the group, use the group name as the user instead of an
              individual.
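
              A script that consumes the --classify output could parse the result line along
              these lines. The sketch relies only on the format shown above; the function name
              parse_dspam_result is hypothetical, and real output may carry additional fields,
              which are kept as strings.

```python
def parse_dspam_result(line):
    """Parse an X-DSPAM-Result line into a dictionary."""
    header, _, rest = line.partition(":")
    if header.strip() != "X-DSPAM-Result":
        raise ValueError("not an X-DSPAM-Result line")
    fields = [part.strip() for part in rest.split(";")]
    parsed = {"user": fields[0]}
    for field in fields[1:]:
        key, _, value = field.partition("=")
        parsed[key.strip()] = value.strip().strip('"')
    # Convert the two numeric fields shown in the documented format.
    parsed["probability"] = float(parsed["probability"])
    parsed["confidence"] = float(parsed["confidence"])
    return parsed


sample = 'X-DSPAM-Result: User; result="Spam"; probability=1.0000; confidence=0.80'
print(parse_dspam_result(sample))
```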

       --signature=[signature]
              If only the signature is available for training, and not the  entire  message,  the
              --signature  flag  may  be  used  to  feed  the signature into DSPAM and forego the
              reading of stdin. DSPAM will process the signature with whatever command-line
              classification was specified. NOTE: This should only be used with --source=error.
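
              A signature-only retrain might be assembled as an argument list like the one
              below. Only flags documented on this page are used; the helper name
              signature_retrain_argv and the signature value in the example are hypothetical
              placeholders.

```python
def signature_retrain_argv(user, signature, message_class):
    """Build a dspamc argument list that retrains from a signature alone."""
    if message_class not in ("spam", "innocent"):
        raise ValueError(f"unknown class: {message_class}")
    return [
        "dspamc",
        "--user", user,
        f"--class={message_class}",
        "--source=error",  # per the note above, --signature pairs with error
        f"--signature={signature}",
    ]


# "EXAMPLE-SIGNATURE" stands in for a real signature taken from a message.
print(signature_retrain_argv("alice", "EXAMPLE-SIGNATURE", "spam"))
```

              Passing the list to a process launcher without a shell avoids quoting problems
              if a user name or signature contains unusual characters.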

       --debug
              If DSPAM was compiled with --enable-debug then using --debug will write debugging
              messages to /tmp/dspam.debug.

       --daemon
              If DSPAM was compiled with --enable-daemon then using --daemon will cause DSPAM to
              enter  daemon  mode, where it will listen for DSPAM clients to connect and actively
              service requests.

       --client
              If DSPAM was compiled with --enable-daemon then using --client will cause DSPAM to act
              as a client and attempt to connect to the DSPAM server specified in the client's
              configuration within dspam.conf. If client behavior is desired, this option must be
              specified; otherwise the agent simply operates as a self-contained process and handles
              the message on its own, eliminating any benefit of using the daemon.

       --rcpt-to
              If DSPAM is configured to deliver via LMTP or SMTP, this flag may be used to
              define the RCPT TOs which will be used for the delivery of each user specified with
              --user. If no recipients are provided, the RCPT TOs will match the username.  NOTE:
              The  recipient  list  should  always  be  balanced  with  the  user list, or empty.
              Specifying an unbalanced number of recipients to users  will  result  in  undefined
              behavior.
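
              Because an unbalanced list leads to undefined behavior, a calling script may
              want to check the pairing before invoking the agent. A minimal sketch, assuming
              recipients pair with users by position; the function name pair_recipients is
              hypothetical:

```python
def pair_recipients(users, rcpt_to):
    """Pair each user with a RCPT TO address, rejecting unbalanced lists."""
    if not rcpt_to:
        # No recipients given: the RCPT TO matches the user name.
        return [(user, user) for user in users]
    if len(rcpt_to) != len(users):
        raise ValueError(
            f"{len(rcpt_to)} recipients given for {len(users)} users"
        )
    return list(zip(users, rcpt_to))


print(pair_recipients(["alice", "bob"], ["alice@example.org", "bob@example.org"]))
```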

       --mail-from
              If DSPAM is configured to deliver via LMTP or SMTP, this flag will set the MAIL
              FROM sent on delivery of the message. The default MAIL FROM depends on how the
              message was originally relayed to DSPAM. If it was relayed via the command line, an
              empty MAIL FROM will be used. If it was relayed via LMTP, the  original  MAIL  FROM
              will be used.

EXIT VALUE

       0      Operation was successful.
       other  Operation resulted in an error. If the error occurred while calling the
              delivery agent, the exit value of the delivery agent will be returned.
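
       A wrapper script can rely on the exit value convention above: zero means success
       and anything else is an error. The sketch below uses Python's standard subprocess
       module; the helper name run_agent is hypothetical, and the demonstration runs a
       stand-in Python command rather than dspamc so that it works without DSPAM
       installed.

```python
import subprocess
import sys


def run_agent(argv, message):
    """Run argv with message (bytes) on stdin.

    Returns (ok, exit_code, stdout), where ok is True only for exit value 0.
    """
    proc = subprocess.run(argv, input=message, capture_output=True)
    return proc.returncode == 0, proc.returncode, proc.stdout


# Stand-in for a dspamc invocation that fails with exit value 3.
stand_in = [sys.executable, "-c", "import sys; sys.exit(3)"]
ok, code, _ = run_agent(stand_in, b"Subject: test\n\nbody\n")
print(ok, code)
```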

AUTHORS

       Jonathan A. Zdziarski

       For more information, see http://dspam.nuclearelephant.com.

SEE ALSO

       dspam_stats(1), dspam_corpus(1), dspam_clean(1), dspam_dump(1), dspam_merge(1)

Jonathan A. Zdziarski <jonathan@nuclearelephant.com>     Sept 29, 2004                    DSPAMC(1)