Ubuntu Manpage: Mail::SpamAssassin::Pyzor::Digest::Pieces

oracular (3) Mail::SpamAssassin::Pyzor::Digest::Pieces.3pm.gz

NAME

       Mail::SpamAssassin::Pyzor::Digest::Pieces - Pyzor backend logic module

DESCRIPTION

       This module houses backend logic for Mail::SpamAssassin::Pyzor::Digest.

       It reimplements logic found in pyzor's digest.py module
       (<https://github.com/SpamExperts/pyzor/blob/master/pyzor/digest.py>).

FUNCTIONS

   $strings_ar = digest_payloads( $EMAIL_MIME )
       This imitates the corresponding object method in digest.py.  It returns a reference to an array of
       strings. Each string can be either a byte string or a character string (e.g., UTF-8 decoded).

       NB: RFC 2822 stipulates that message bodies should use CRLF line breaks, not plain LF (nor plain CR).  We
       will thus convert any plain CRs in a quoted-printable message body into CRLF. Python, though, doesn't do
       this, so the output of our implementation of digest_payloads() diverges from that of the Python original.
       It doesn't ultimately make a difference since the line-ending whitespace gets trimmed regardless, but
       it's necessary to factor in when comparing the output of our implementation with the Python output.
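
       When comparing payloads against the Python output, one way to factor in the line-ending difference
       described above is to fold every line break to a single form first. The following is only a sketch:
       canonical_eol() is a hypothetical helper, not part of this module, and the two payload strings are
       made-up stand-ins for what each implementation might return.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the module): fold CRLF and bare CR
# into LF so that payloads from the two implementations compare equal.
sub canonical_eol {
    my ($payload) = @_;
    (my $copy = $payload) =~ s/\r\n?/\n/g;   # work on a copy, keep the input intact
    return $copy;
}

# Made-up stand-ins for one payload as each implementation might return it.
my $perl_payload   = "Hello\r\nworld\r\n";   # bare CRs already turned into CRLF
my $python_payload = "Hello\rworld\r";       # bare CRs left as they were

print canonical_eol($perl_payload) eq canonical_eol($python_payload)
    ? "payloads match\n"
    : "payloads differ\n";
```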

   normalize( $STRING )
       This imitates the corresponding object method in digest.py.  It modifies $STRING in-place.

       As with the original implementation, if $STRING contains (decoded) Unicode characters, those characters
       are handled as characters rather than as their encoded bytes. So:

           $str = "123\xc2\xa0";   # [ c2 a0 ] == \u00a0, non-breaking space

           normalize($str);

       The above will leave $str alone, but this:

           utf8::decode($str);

           normalize($str);

       ... will trim the trailing non-breaking space from $str (the bytes c2 a0 of the undecoded string).
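
       The byte/character distinction behind this can be checked with core Perl alone. The sketch below does
       not call normalize(); it only illustrates, using the same sample string, how utf8::decode() turns the
       two trailing octets into a single character that Perl's \s recognizes as whitespace. The variable
       names are illustrative.

```perl
use strict;
use warnings;

my $bytes = "123\xc2\xa0";     # five octets; the last two encode U+00A0
my $chars = $bytes;
utf8::decode($chars);          # four characters; the last one is U+00A0

printf "octets: %d, characters: %d\n", length($bytes), length($chars);

# Once decoded, the trailing U+00A0 is whitespace as far as \s is concerned.
print "decoded string ends in whitespace\n" if $chars =~ /\s\z/;
```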

   $yn = should_handle_line( $STRING )
       This imitates the corresponding object method in digest.py.  It returns a boolean.

   $sr = assemble_lines( \@LINES )
       This assembles a string buffer out of @LINES. The string is the buffer of octets that will be hashed to
       produce the message digest.

       Each member of @LINES is expected to be an octet string, not a character string.
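
       If some lines are character strings, they need to be encoded before being passed in. A minimal
       sketch, assuming UTF-8 is the intended encoding (the text above does not say which encoding to use)
       and using made-up sample lines:

```perl
use strict;
use warnings;

# Made-up input: one plain ASCII line and one character string (U+00E9).
my @lines = ( "hello", "caf\x{e9}" );

# Encode copies to UTF-8 octets, leaving @lines itself untouched.
my @octet_lines = map {
    my $copy = $_;
    utf8::encode($copy);   # in-place: characters -> UTF-8 octets
    $copy;
} @lines;

# A reference to @octet_lines is what would then be handed to assemble_lines().
printf "%d octets\n", length($_) for @octet_lines;
```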

   ($main, $sub, $encoding, $checkval) = parse_content_type( $CONTENT_TYPE )

   @lines = splitlines( $TEXT )
       Imitates Python's "str.splitlines()" (cf. "pydoc str").

       Returns a plain list in list context. In scalar context, returns the number of items that the list
       would contain.
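
       The context-dependent return value can be illustrated with a stand-in function. sketch_splitlines()
       below is hypothetical and is not this module's implementation: it breaks only on CRLF, CR and LF,
       and Perl's split drops trailing empty fields, so it is not a full equivalent of Python's
       str.splitlines().

```perl
use strict;
use warnings;

# Hypothetical stand-in, NOT the module's splitlines(): it handles only
# CRLF, CR and LF, and split() discards trailing empty fields.
sub sketch_splitlines {
    my ($text) = @_;
    my @lines = split /\r\n|\r|\n/, $text;
    return wantarray ? @lines : scalar @lines;   # list, or its item count
}

my @lines = sketch_splitlines("one\ntwo\r\nthree");   # list context
my $count = sketch_splitlines("one\ntwo\r\nthree");   # scalar context

print "$count lines: @lines\n";   # prints "3 lines: one two three"
```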