Ubuntu Manpage: Mail::SpamAssassin::PerMsgStatus - per-message status (spam or not-spam)

Provided by: spamassassin_3.4.2-0ubuntu0.16.04.5_all

NAME

       Mail::SpamAssassin::PerMsgStatus - per-message status (spam or not-spam)

SYNOPSIS

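         use Mail::SpamAssassin;

         # Create the scanner with explicit rule and user-preference files.
         my $spamtest = Mail::SpamAssassin->new({
           'rules_filename'      => '/etc/spamassassin.rules',
           'userprefs_filename'  => $ENV{HOME}.'/.spamassassin/user_prefs'
         });
         # With no argument, parse() reads the message from STDIN.
         my $mail = $spamtest->parse();

         # check() returns a Mail::SpamAssassin::PerMsgStatus object.
         my $status = $spamtest->check ($mail);

         my $rewritten_mail;
         if ($status->is_spam()) {
           $rewritten_mail = $status->rewrite_mail ();
         }
         ...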

DESCRIPTION

       The Mail::SpamAssassin "check()" method returns an object of this class.  This object encapsulates all
       the per-message state.

METHODS

       $status->check ()
           Runs the SpamAssassin rules against the message pointed to by the object.

       $status->learn()
           After a mail message has been checked, this method can be called.  If the score is outside a certain
           range around the threshold, i.e. if the message is judged more-or-less definitely spam or definitely
           non-spam, it will be fed into SpamAssassin's learning systems (currently the naive Bayesian
           classifier), so that future similar mails will be caught.

       $score = $status->get_autolearn_points()
           Return the message's score as computed for auto-learning.  Certain tests are ignored:

             - rules with tflags set to 'learn' (the Bayesian rules)

             - rules with tflags set to 'userconf' (user white/black-listing rules, etc)

             - rules with tflags set to 'noautolearn'

           Also note that auto-learning occurs using scores from either scoreset  0  or  1,  depending  on  what
           scoreset  is  used  during  message check.  It is likely that the message check and auto-learn scores
           will be different.
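
           The exclusion rule above can be illustrated with a minimal sketch.  It is not SpamAssassin's own
           code: the helper name, the rule names and the scores are all made up, and it ignores the scoreset
           switch described above.  It only shows rules being skipped by tflag before the scores are summed.

```perl
use strict;
use warnings;

# Hypothetical helper: sum the scores of the rules that hit, skipping any
# rule whose tflags contain 'learn', 'userconf' or 'noautolearn'.
#   $hits   - hashref: rule name => score
#   $tflags - hashref: rule name => space-separated tflags string
sub autolearn_points_sketch {
    my ($hits, $tflags) = @_;
    my $total = 0;
    for my $rule (sort keys %$hits) {
        my %flags = map { $_ => 1 } split ' ', ($tflags->{$rule} // '');
        next if $flags{learn} || $flags{userconf} || $flags{noautolearn};
        $total += $hits->{$rule};
    }
    return $total;
}

# Made-up rule names and scores, for illustration only.
my %hits = (
    BODY_RULE      => 2,
    BAYES_RULE     => 3,
    USER_LIST_RULE => -100,
    QUIET_RULE     => 1,
);
my %tflags = (
    BAYES_RULE     => 'learn',
    USER_LIST_RULE => 'userconf nice',
    QUIET_RULE     => 'noautolearn',
);

# Only BODY_RULE carries none of the excluded tflags.
print autolearn_points_sketch(\%hits, \%tflags), "\n";
```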

       $score = $status->get_head_only_points()
           Return the message's score as computed for auto-learning, ignoring all rules except for  header-based
           ones.

       $score = $status->get_learned_points()
           Return  the  message's  score  as computed for auto-learning, ignoring all rules except for learning-
           based ones.

       $score = $status->get_body_only_points()
           Return the message's score as computed for auto-learning, ignoring all rules  except  for  body-based
           ones.

       $score = $status->get_autolearn_force_status()
           Return whether a message's score included any rules that are flagged as autolearn_force.

       $rule_names = $status->get_autolearn_force_names()
           Return a comma-separated list of the names of any rules flagged as autolearn_force that contributed
           to the message's score.

       $isspam = $status->is_spam ()
           After a mail message has been checked, this method  can  be  called.   It  will  return  1  for  mail
           determined likely to be spam, 0 if it does not seem spam-like.

       $list = $status->get_names_of_tests_hit ()
           After  a  mail  message has been checked, this method can be called. It will return a comma-separated
           string, listing all the symbolic test names of the tests which were triggered by the mail.

       $list = $status->get_names_of_tests_hit_with_scores_hash ()
           After a mail message has been checked, this method can be called. It will return a reference to a
           hash of rule => score pairs, giving the symbolic test name and individual score of each test that
           was triggered by the mail.

       $list = $status->get_names_of_tests_hit_with_scores ()
           After a mail message has been checked, this method can be called. It will  return  a  comma-separated
           string  of  rule=score pairs for all the symbolic test names and individual scores of the tests which
           were triggered by the mail.
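
           A caller that only has the comma-separated form can split it back into pairs.  The following is a
           small sketch under that assumption; parse_rule_scores is a hypothetical helper, and the sample
           string and its scores are invented for illustration.

```perl
use strict;
use warnings;

# Split a "RULE=score,RULE=score" string into a hash of rule name => score.
sub parse_rule_scores {
    my ($list) = @_;
    my %scores;
    for my $pair (split /,/, $list // '') {
        my ($rule, $score) = split /=/, $pair, 2;
        next unless defined $rule && length $rule && defined $score;
        $scores{$rule} = $score + 0;    # force numeric context
    }
    return \%scores;
}

# In real use the string would come from
#   $status->get_names_of_tests_hit_with_scores()
# The sample below is made up.
my $scores = parse_rule_scores('HTML_MESSAGE=0.001,MISSING_DATE=1.4');
printf "%-15s %s\n", $_, $scores->{$_} for sort keys %$scores;
```

           When the hash form is all that is needed, get_names_of_tests_hit_with_scores_hash() avoids the
           parsing step altogether.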

       $list = $status->get_names_of_subtests_hit ()
           After a mail message has been checked, this method can be called.  It will return  a  comma-separated
           string,  listing  all  the symbolic test names of the meta-rule sub-tests which were triggered by the
           mail.  Sub-tests are the normally-hidden rules, which score 0  and  have  names  beginning  with  two
           underscores, used in meta rules.

       $num = $status->get_score ()
           After  a  mail  message  has  been  checked, this method can be called.  It will return the message's
           score.

       $num = $status->get_required_score ()
           After a mail message has been checked, this method can be called.  It will return the score  required
           for a mail to be considered spam.

       $num = $status->get_autolearn_status ()
           After  a  mail  message  has  been  checked,  this  method  can be called.  It will return one of the
           following strings depending on whether the  mail  was  auto-learned  or  not:  "ham",  "no",  "spam",
           "disabled", "failed", "unavailable".

           If the message was flagged with autolearn_force, the returned string also includes that status and
           the rules hit.  For example: "autolearn_force=yes (AUTOLEARNTEST_BODY)"

       $report = $status->get_report ()
           Deliver a "spam report" on the checked  mail  message.   This  contains  details  of  how  many  spam
           detection rules it triggered.

           The report is returned as a multi-line string, with the lines separated by "\n" characters.

       $preview = $status->get_content_preview ()
           Give a "preview" of the content.

           This  is  returned  as a multi-line string, with the lines separated by "\n" characters, containing a
           fully-decoded, safe, plain-text sample of the first few lines of the message body.

       $msg = $status->get_message()
           Return the object representing the message being scanned.

       $status->rewrite_mail ()
           Rewrite the mail message.  This will at minimum add headers,  and  at  maximum  MIME-encapsulate  the
           message  text,  to  reflect  its  spam  or not-spam status.  The function will return a scalar of the
           rewritten message.

           The actual modifications  depend  on  the  configuration  (see  "Mail::SpamAssassin::Conf"  for  more
           information).

           The possible modifications are as follows:

           To:, From: and Subject: modification on spam mails
               Depending  on the configuration, the To: and From: lines can have a user-defined RFC 2822 comment
               appended for spam mail. The subject line may have a user-defined string prepended to it for  spam
               mail.

           X-Spam-* headers for all mails
               Depending  on the configuration, zero or more headers with names beginning with "X-Spam-" will be
               added to mail depending on whether it is spam or ham.

           spam message with report_safe
               If report_safe is  set  to  true  (1),  then  spam  messages  are  encapsulated  into  their  own
               message/rfc822 MIME attachment without any modifications being made.

               If  report_safe  is  set  to  false  (0),  then  the  message  will  only  have the above headers
               added/modified.

       $status->action_depends_on_tags($tags, $code, @args)
           Enqueue the supplied subroutine reference $code, to become  runnable  when  all  the  specified  tags
           become  available.  The  $tags  may  be  a simple scalar - a tag name, or a listref of tag names. The
           subroutine &$code when called will be passed a PerMsgStatus object as its first argument,
           followed by the supplied (optional) list @args.

       $status->set_tag($tagname, $value)
           Set  a template tag, as used in "add_header", report templates, etc.  This API is intended for use by
           plugins.  Tag names will be converted to an all-uppercase representation internally.

           $value can be a simple scalar (string or number), or a reference to  an  array,  in  which  case  the
           public  method  get_tag  will  join  array  elements using a space as a separator, returning a single
           string for backward compatibility.

           $value can also be a subroutine reference, which will be evaluated each time the template is
           expanded. The first argument passed by get_tag to a called subroutine will be a PerMsgStatus object
           (this module's object), followed by optional arguments provided by a caller to get_tag.

           Note that perl supports closures, which means that  variables  set  in  the  caller's  scope  can  be
           accessed inside this "sub". For example:

               my $text = "hello world!";
               $status->set_tag("FOO", sub {
                         my $pms = shift;
                         return $text;
                       });

           See  "Mail::SpamAssassin::Conf"'s  "TEMPLATE  TAGS" section for more details on how template tags are
           used.

           "undef" will be returned if a tag by that name has not been defined.

       $string = $status->get_tag($tagname)
           Get the current value of a template tag, as used in "add_header", report templates, etc. This API  is
           intended  for  use  by  plugins.   Tag  names  will  be  converted to an all-uppercase representation
           internally.  See "Mail::SpamAssassin::Conf"'s "TEMPLATE TAGS" section for more details on tags.

           "undef" will be returned if a tag by that name has not been defined.

       $string = $status->get_tag_raw($tagname, @args)
           Similar to "get_tag", but keeps a tag name unchanged (does not uppercase it), and  does  not  convert
           arrayref tag values into a single string.
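
           The value handling described for set_tag and get_tag can be sketched in plain Perl.  This is an
           illustration of the documented behaviour, not the module's code: render_tag_value_sketch is a
           hypothetical helper, and $pms stands in for the PerMsgStatus object.  get_tag_raw, by contrast,
           would hand back the array reference without joining it.

```perl
use strict;
use warnings;

# Resolve a tag value the way get_tag is documented to: call a subroutine
# value with the status object and any extra arguments, then join an array
# reference into one space-separated string.
sub render_tag_value_sketch {
    my ($value, $pms, @args) = @_;
    $value = $value->($pms, @args) if ref $value eq 'CODE';
    $value = join(' ', @$value)    if ref $value eq 'ARRAY';
    return $value;
}

my $text = "hello world!";
print render_tag_value_sketch('plain'), "\n";
print render_tag_value_sketch([ 'a', 'b', 'c' ]), "\n";
print render_tag_value_sketch(sub { my $pms = shift; return $text }), "\n";
```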

       $status->set_spamd_result_item($subref)
           Set  an  entry  for  the  spamd result log line.  $subref should be a code reference for a subroutine
           which will return a string in 'name=VALUE' format, similar to the other entries in the  spamd  result
           line:

             Jul 17 14:10:47 radish spamd[16670]: spamd: result: Y 22 - ALL_NATURAL,
             DATE_IN_FUTURE_03_06,DIET_1,DRUGS_ERECTILE,DRUGS_PAIN,
             TEST_FORGED_YAHOO_RCVD,TEST_INVALID_DATE,TEST_NOREALNAME,
             TEST_NORMAL_HTTP_TO_IP,UNDISC_RECIPS scantime=0.4,size=3138,user=jm,
             uid=1000,required_score=5.0,rhost=localhost,raddr=127.0.0.1,
             rport=33153,mid=<9PS291LhupY>,autolearn=spam

           "name"  and  "VALUE"  must not contain "=" or "," characters, as it is important that these log lines
           are easy to parse.
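
           One way to respect that restriction is to validate the parts before building the code reference.
           The sketch below assumes a hypothetical helper, make_result_item, and an invented entry name; only
           set_spamd_result_item itself comes from this API.

```perl
use strict;
use warnings;

# Build a code reference that returns "name=VALUE", refusing any name or
# value that contains "=" or "," so the log line stays easy to parse.
sub make_result_item {
    my ($name, $value) = @_;
    for my $part ($name, $value) {
        die "result item parts must not contain '=' or ','\n"
            if $part =~ /[=,]/;
    }
    return sub { return "$name=$value" };
}

my $item = make_result_item('lang', 'en');
print $item->(), "\n";

# In a plugin, with $status being the PerMsgStatus object:
#   $status->set_spamd_result_item($item);
```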

           The  code  reference  will  be  called  by  spamd  after  the  message  has  been  scanned,  and  the
           "PerMsgStatus::check()" method has returned.

       $status->finish ()
           Indicate that this $status object is finished with, and can be destroyed.

           If  you  are  using SpamAssassin in a persistent environment, or checking many mail messages from one
           "Mail::SpamAssassin" factory, this method should be called to ensure Perl's garbage  collection  will
           clean up old status objects.

       $name = $status->get_current_eval_rule_name()
           Return the name of the currently-running eval rule.  "undef" is returned if no eval rule is currently
           being  run.  Useful for plugins to determine the current rule name while inside an eval test function
           call.

       $status->get_decoded_body_text_array ()
           Returns the message body, with base64 or quoted-printable encodings decoded, and  non-text  parts  or
           non-inline attachments stripped.

           It  is  returned  as an array of strings, with each string representing one newline-separated line of
           the body.

       $status->get_decoded_stripped_body_text_array ()
           Returns the  message  body,  decoded  (as  described  in  get_decoded_body_text_array()),  with  HTML
           rendered, and with whitespace normalized.

           It  will  always render text/html, and will use a heuristic to determine if other text/* parts should
           be considered text/html.

           It is returned as an array of strings, with each string representing one 'paragraph'.  Paragraphs, in
           plain-text mails, are double-newline-separated blocks of multi-line text.

       $status->get (header_name [, default_value])
           Returns a message header, pseudo-header, real name or address.  "header_name" is the name of  a  mail
           header,  such as 'Subject', 'To', etc.  If "default_value" is given, it will be used if the requested
           "header_name" does not exist.

           Appending ":raw" to the header name will inhibit decoding  of  quoted-printable  or  base-64  encoded
           strings.

           Appending  a  modifier  ":addr"  to  a header field name will cause everything except the first email
           address to be removed from the header field.  It  is  mainly  applicable  to  header  fields  'From',
           'Sender',  'To',  'Cc'  along with their 'Resent-*' counterparts, and the 'Return-Path'. For example,
           all of the following will result in "example@foo":

           example@foo
           example@foo (Foo Blah)
           example@foo, example@bar
           display: example@foo (Foo Blah), example@bar ;
           Foo Blah <example@foo>
           "Foo Blah" <example@foo>
           "'Foo Blah'" <example@foo>

           Appending a modifier ":name" to a header field name will cause everything except  the  first  display
           name  to  be  removed  from  the  header field. It is mainly applicable to header fields containing a
           single  mail  address:  'From',  'Sender',  along  with  their  'Resent-From'   and   'Resent-Sender'
           counterparts.   For  example,  all  of  the  following will result in "Foo Blah". One level of single
           quotes is stripped too, as it is often seen.

           example@foo (Foo Blah)
           example@foo (Foo Blah), example@bar
           display: example@foo (Foo Blah), example@bar ;
           Foo Blah <example@foo>
           "Foo Blah" <example@foo>
           "'Foo Blah'" <example@foo>

           There are several special pseudo-headers that can be specified:

           "ALL" can be used to mean the text of all the message's headers.

           "ALL-TRUSTED" can be used to mean the text of all the message's headers that could only have been
               added by trusted relays.

           "ALL-INTERNAL" can be used to mean the text of all the message's headers that could only have been
               added by internal relays.

           "ALL-UNTRUSTED" can be used to mean the text of all the message's headers that may have been added by
               untrusted relays.  To make this pseudo-header more useful for header rules the 'Received' header
               that was added by the last trusted relay is included, even though it can be trusted.

           "ALL-EXTERNAL" can be used to mean the text of all the message's headers that may have been added by
               external relays.  Like "ALL-UNTRUSTED" the 'Received' header added by the last internal relay is
               included.

           "ToCc" can be used to mean the contents of both the 'To' and 'Cc' headers.

           "EnvelopeFrom" is the address used in the 'MAIL FROM:' phase of the SMTP transaction that delivered
               this message, if this data has been made available by the SMTP server.

           "MESSAGEID" is a symbol meaning all Message-Id's found in the message; some mailing list software
               moves the real 'Message-Id' to 'Resent-Message-Id' or 'X-Message-Id', then uses its own one in
               the 'Message-Id' header.  The value returned for this symbol is the text from all 3 headers,
               separated by newlines.

           "X-Spam-Relays-Untrusted" is the generated metadata of untrusted relays the message has passed
               through.

           "X-Spam-Relays-Trusted" is the generated metadata of trusted relays the message has passed through.

       $status->get_uri_list ()
           Returns an array of all unique URIs found in the message.  It takes a combination of the  URIs  found
           in  the  rendered  (decoded  and  HTML stripped) body and the URIs found when parsing the HTML in the
           message.  Will also set $status->{uri_list} (the array as returned by this function).

           The returned array will include the "raw" URI as well as "slightly cooked" versions.  For example,
           the single URI 'http://%77&#00119;%77.example.com/' will get turned into:

               ( 'http://%77&#00119;%77.example.com/', 'http://www.example.com/' )

       $status->get_uri_detail_list ()
           Returns a hash reference of all unique URIs found in the message and various  data  about  where  the
           URIs  were  found  in the message.  It takes a combination of the URIs found in the rendered (decoded
           and HTML stripped) body and the URIs found when parsing the HTML  in  the  message.   Will  also  set
           $status->{uri_detail_list}  (the  hash  reference  as returned by this function).  This function will
           also set $status->{uri_domain_count} (count of unique domains).

           The hash format looks something like this:

             raw_uri => {
               types => { a => 1, img => 1, parsed => 1 },
               cleaned => [ canonicalized_uri ],
               anchor_text => [ "click here", "no click here" ],
               domains => { domain1 => 1, domain2 => 1 },
             }

           "raw_uri" is whatever the URI was in the message itself (http://spamassassin.apache%2Eorg/).

           "types" is a hash of the HTML tags (lowercase) which referenced the raw_uri.  parsed is a faked  type
           which specifies that the raw_uri was seen in the rendered text.

           "cleaned"    is    an    array    of   the   raw   and   canonicalized   version   of   the   raw_uri
           (http://spamassassin.apache%2Eorg/, http://spamassassin.apache.org/).

           "anchor_text" is an array of the anchor text (text between <a> and </a>), if any, which linked to the
           URI.

           "domains" is a hash of the domains found in the canonicalized URIs.

           "hosts" is a hash of unstripped hostnames found in the canonicalized URIs as hash  keys,  with  their
           domain part stored as a value of each hash entry.
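
           As a rough illustration of walking this structure, the sketch below filters URIs by type and
           collects the domains.  The two helper names and the sample data are hypothetical; only the hash
           layout follows the format described above.

```perl
use strict;
use warnings;

# Return the sorted raw URIs referenced by the given type: an HTML tag
# name such as 'a' or 'img', or the pseudo-type 'parsed'.
sub uris_of_type {
    my ($detail, $type) = @_;
    my @found = grep { ($detail->{$_}{types} || {})->{$type} } keys %$detail;
    return sort @found;
}

# Return the sorted, de-duplicated domains across all URIs.
sub all_domains {
    my ($detail) = @_;
    my %seen;
    for my $info (values %$detail) {
        $seen{$_} = 1 for keys %{ $info->{domains} || {} };
    }
    return sort keys %seen;
}

# Made-up sample in the documented shape. In real use:
#   my $detail = $status->get_uri_detail_list();
my $detail = {
    'http://spamassassin.apache%2Eorg/' => {
        types       => { a => 1, parsed => 1 },
        cleaned     => [ 'http://spamassassin.apache%2Eorg/',
                         'http://spamassassin.apache.org/' ],
        anchor_text => [ 'click here' ],
        domains     => { 'apache.org' => 1 },
    },
    'http://www.example.com/logo.png' => {
        types   => { img => 1 },
        cleaned => [ 'http://www.example.com/logo.png' ],
        domains => { 'example.com' => 1 },
    },
};

print "link: $_\n"   for uris_of_type($detail, 'a');
print "domain: $_\n" for all_domains($detail);
```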

       $status->clear_test_state()
           Clear test state, including test log messages from "$status->test_log()".

       $status->got_hit ($rulename, $desc_prepend [, name => value, ...])
           Register a hit against a rule in the ruleset.

           There are two mandatory arguments. These are $rulename, the name of the rule that fired, and
           $desc_prepend, which is a short string that will be prepended to the rule's "describe" string in
           output reports.

           In addition, callers can supplement that with the following optional data:

           score => $num
               Optional:   the   score   to  use  for  the  rule  hit.   If  unspecified,  the  value  from  the
               "Mail::SpamAssassin::Conf" object's "{scores}" hash will be used (a configured score), and in its
               absence the "defscore" option value.

           defscore => $num
               Optional: the score to use for the rule hit if neither the option  "score"  is  provided,  nor  a
               configured score value is provided.

           value => $num
               Optional: the value to assign to the rule; the default value is 1.  Rules with the "multiple"
               tflag use values greater than 1 to indicate multiple hits.  This value is accessible to meta
               rules.

           ruletype => $type
               Optional, but recommended: the rule type string.  This is used in  the  "hit_rule"  plugin  call,
               called by this method.  If unset, 'unknown' is used.

           tflags => $string
               Optional:  a  string,  i.e.  a  space-separated  list  of  additional tflags to be appended to an
               existing list of flags in $self->{conf}->{tflags},  such  as:  "nice  noautolearn  multiple".  No
               syntax checks are performed.

           description => $string
               Optional:  a  custom rule description string.  This is used in the "hit_rule" plugin call, called
               by this method. If unset, the static description is used.

           Backward compatibility: the two mandatory arguments have been part of this API since SpamAssassin
           2.x.  The optional name => value pairs, however, are a new addition in SpamAssassin 3.2.0.
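
           The order in which got_hit picks a score, as described for the "score" and "defscore" options,
           can be sketched as follows.  This is an illustration only: effective_score_sketch is a
           hypothetical helper, and $configured stands in for the value from the Conf object's {scores} hash.

```perl
use strict;
use warnings;

# Resolve the score for a hit in the documented order:
#   1. the "score" option, 2. the configured score, 3. the "defscore" option.
sub effective_score_sketch {
    my ($configured, %opts) = @_;
    return $opts{score} if defined $opts{score};
    return $configured  if defined $configured;
    return $opts{defscore};
}

# No explicit or configured score: falls back to defscore.
print effective_score_sketch(undef, defscore => 1), "\n";
# A configured score takes precedence over defscore.
print effective_score_sketch(2, defscore => 1), "\n";
# An explicit score option takes precedence over both.
print effective_score_sketch(2, score => 3, defscore => 1), "\n";

# Inside a plugin, a call might look like this
# (MY_RULE is a hypothetical rule name):
#   $status->got_hit('MY_RULE', '', score => 3);
```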

       $status->create_fulltext_tmpfile (fulltext_ref)
           This function creates a temporary file containing the passed scalar reference data (typically the
           full/pristine text of the message).  This is typically used by external programs like pyzor and
           dccproc, to avoid hangs due to buffering issues.  Methods that need this should call
           $self->create_fulltext_tmpfile($fulltext) to retrieve the temporary filename; the file will be
           created if it has not already been.

           Note: This can only be called once until $status->delete_fulltext_tmpfile() is called.

       $status->delete_fulltext_tmpfile ()
           Cleans up after a $status->create_fulltext_tmpfile() call.  Deletes the temporary file and
           uncaches the filename.

       all_from_addrs_domains
           This function collects all the various From addresses in a message using all_from_addrs() and
           returns only their domain names.
