Provided by: sympa_6.2.66~dfsg-2_amd64

NAME

       Sympa::Tools::Text - Text-related functions

DESCRIPTION

       This package provides some text-related functions.

   Functions
       addrencode ( $addr, [ $phrase, [ $charset, [ $comment ] ] ] )
           Returns a formatted (and encoded) name-addr as defined in RFC 5322, section 3.4.

       canonic_email ( $email )
           Function.  Returns the canonical form of an e-mail address.

           Leading and trailing whitespace is removed.  Latin letters without accents are
           lower-cased.

           Returns "undef" for malformed input.
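
           The steps above can be sketched in plain Perl.  This is a minimal illustration,
           not Sympa's implementation: the name sketch_canonic_email is hypothetical, and the
           validity check is a stand-in, because the text does not say what counts as
           malformed.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Sympa::Tools::Text.
sub sketch_canonic_email {
    my ($email) = @_;
    return undef unless defined $email;

    # Remove leading and trailing whitespace.
    $email =~ s/\A\s+//;
    $email =~ s/\s+\z//;

    # Stand-in validity check (an assumption): one "@" with a non-empty,
    # whitespace-free part on each side.
    return undef unless $email =~ /\A[^\s\@]+\@[^\s\@]+\z/;

    # tr/// only touches ASCII letters, so accented letters stay as they are.
    $email =~ tr/A-Z/a-z/;
    return $email;
}

print sketch_canonic_email("  John.Doe\@Example.ORG \n"), "\n";
```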

       canonic_message_id ( $message_id )
           Returns the canonical form of a message ID, with leading and trailing whitespace
           and the enclosing "<" and ">" removed.

       canonic_text ( $text )
           Canonicalizes text.  $text should be either a binary string encoded in UTF-8 or a
           Unicode string.  Forbidden sequences in a binary string are replaced by U+FFFD
           REPLACEMENT CHARACTERs, and Normalization Form C (NFC) is applied.
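
           A minimal sketch of these two steps, using only the core modules Encode and
           Unicode::Normalize.  The name sketch_canonic_text is hypothetical, and the code
           illustrates the described behavior rather than Sympa's own implementation.

```perl
use strict;
use warnings;
use Encode ();
use Unicode::Normalize ();

# Hypothetical helper, not part of Sympa::Tools::Text.
sub sketch_canonic_text {
    my ($text) = @_;
    return undef unless defined $text;

    # Byte strings are decoded as UTF-8.  With the default CHECK value,
    # Encode substitutes U+FFFD for malformed sequences.
    $text = Encode::decode('UTF-8', $text) unless Encode::is_utf8($text);

    # Apply Normalization Form C.
    return Unicode::Normalize::NFC($text);
}

# "e" followed by COMBINING ACUTE ACCENT (as UTF-8 bytes), then a stray 0xFF byte.
my $out = sketch_canonic_text("caf\x65\xCC\x81 \xFF");
printf "U+%04X ", ord $_ for split //, $out;
print "\n";
```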

       clip ( $string, $length )
           Function.  Clips $string to $length bytes, respecting grapheme cluster boundaries.
           If $string is a byte string, it is assumed to be UTF-8.
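
           One way to picture byte-limited clipping on grapheme cluster boundaries is the
           sketch below.  It assumes a UTF-8 byte string as input and reads $length as an
           upper bound on the size of the result in bytes; both the helper name sketch_clip
           and that reading are assumptions, not taken from Sympa's code.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper, not part of Sympa::Tools::Text.
sub sketch_clip {
    my ($bytes, $length) = @_;
    my $text = Encode::decode('UTF-8', $bytes);

    my $clipped = '';
    # \X matches one extended grapheme cluster at a time.
    for my $cluster ($text =~ /(\X)/g) {
        my $octets = Encode::encode_utf8($cluster);
        # Stop before a cluster that would exceed the byte budget.
        last if length($clipped) + length($octets) > $length;
        $clipped .= $octets;
    }
    return $clipped;
}

# Each "e" + COMBINING ACUTE ACCENT cluster occupies three bytes.
print length(sketch_clip("e\xCC\x81" x 3, 4)), "\n";   # prints 3
```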

       decode_filesystem_safe ( $str )
           Function.  Decodes a string encoded by encode_filesystem_safe().

           Parameter:

           $str
               String to be decoded.

           Returns:

           Decoded string, with the "utf8" flag stripped, if any.

       decode_html ( $str )
           Function.  Decodes HTML entities in a string encoded by UTF-8 or a Unicode string.

           Parameter:

           $str
               String to be decoded.

           Returns:

           Decoded string, with the "utf8" flag stripped, if any.

       encode_filesystem_safe ( $str )
           Function.  Encodes a string $str so that it is suitable for use in the filesystem.

           Parameter:

           $str
               String to be encoded.

           Returns:

           Encoded string, with the "utf8" flag stripped, if any.  All bytes except '-', '+',
           '.', '@' and alphanumeric characters are encoded as '_' followed by two
           hexadecimal digits.

           Note that '/' will also be encoded.
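
           The encoding rule can be illustrated with a short round trip.  The helper names
           below are hypothetical, and the choice of lowercase hexadecimal digits is an
           assumption; the description above only says "two hexdigits".

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical re-implementation of the documented rule, not Sympa's code.
sub sketch_encode_fs_safe {
    my ($str) = @_;
    # Work on bytes: a Unicode string is encoded as UTF-8 first.
    $str = Encode::encode_utf8($str) if Encode::is_utf8($str);
    # Everything except - + . @ and ASCII alphanumerics becomes _XX.
    # Lowercase hex digits are an assumption.
    $str =~ s/([^-+.\@0-9A-Za-z])/sprintf '_%02x', ord $1/eg;
    return $str;
}

# Hypothetical inverse: turn each _XX back into the byte it stands for.
sub sketch_decode_fs_safe {
    my ($str) = @_;
    $str =~ s/_([0-9A-Fa-f]{2})/chr hex $1/eg;
    return $str;
}

my $encoded = sketch_encode_fs_safe('list/name_1@example.org');
print "$encoded\n";                              # list_2fname_5f1@example.org
print sketch_decode_fs_safe($encoded), "\n";     # list/name_1@example.org
```

           Because '_' itself is encoded, every '_' in the output starts an escape sequence,
           which keeps decoding unambiguous.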

       encode_html ( $str, [ $additional_unsafe ] )
           Function.  Encodes characters in a string $str to HTML entities.  By default '<', '>',
           '&' and '"' are encoded.

           Parameter:

           $str
               String to be encoded.

           $additional_unsafe
               Character or range of characters additionally encoded as entity references.

               This optional parameter was introduced in Sympa 6.2.37b.3.

           Returns:

           Encoded string.  The "utf8" flag, if any, is not stripped.
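
           The default behavior can be sketched as a simple substitution.  The helper name
           sketch_encode_html is hypothetical, the entity names are the standard HTML ones
           (the text above does not say which spelling Sympa emits), and the optional
           $additional_unsafe parameter is not covered.

```perl
use strict;
use warnings;

# Standard named entities for the four characters encoded by default.
my %ENTITY = (
    '<' => '&lt;',
    '>' => '&gt;',
    '&' => '&amp;',
    '"' => '&quot;',
);

# Hypothetical helper, not part of Sympa::Tools::Text.
sub sketch_encode_html {
    my ($str) = @_;
    # A single pass, so "&" inside freshly produced entities is not re-encoded.
    $str =~ s/([<>&"])/$ENTITY{$1}/g;
    return $str;
}

print sketch_encode_html('<a href="x">Q&A</a>'), "\n";
```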

       encode_uri ( $str, [ omit => $chars ] )
           Function.  Encodes potentially unsafe characters in the string using "percent"
           encoding suitable for URIs.

           Parameters:

           $str
               String to be encoded.

           omit => $chars
               By default, all characters except those defined as "unreserved" in RFC 3986 are
               encoded, that is, "[^-A-Za-z0-9._~]".  If this parameter is given, the characters
               it lists are also left unencoded.

           Returns:

           Encoded string, with the "utf8" flag stripped, if any.
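
           A minimal sketch of percent-encoding with an "omit" list follows.  The helper name
           sketch_encode_uri is hypothetical, and the use of uppercase hexadecimal digits is
           an assumption (it follows the recommendation in RFC 3986, not the text above).

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper, not part of Sympa::Tools::Text.
sub sketch_encode_uri {
    my ($str, %opts) = @_;
    # Percent-encoding works on bytes: encode a Unicode string as UTF-8 first.
    $str = Encode::encode_utf8($str) if Encode::is_utf8($str);

    # Characters listed in "omit" join the unreserved set and stay as they are.
    my $omit = defined $opts{omit} ? quotemeta $opts{omit} : '';
    $str =~ s/([^-A-Za-z0-9._~${omit}])/sprintf '%%%02X', ord $1/eg;
    return $str;
}

print sketch_encode_uri('a b/c?d'), "\n";                # a%20b%2Fc%3Fd
print sketch_encode_uri('a b/c?d', omit => '/'), "\n";   # a%20b/c%3Fd
```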

       escape_chars ( $str )
           Deprecated.  Use "encode_filesystem_safe".

           Escape weird characters.

       escape_url ( $str )
           Deprecated.  Use "encode_uri" or "mailtourl" instead.

       foldcase ( $str )
           Function.  Returns a "fold-case" string suitable for case-insensitive matching.  For
           example, the code below looks for a needle in a haystack regardless of case, even if
           they are non-ASCII UTF-8 strings.

             my $haystack = Sympa::Tools::Text::foldcase($HayStack);
             my $needle   = Sympa::Tools::Text::foldcase($NeedLe);
             # index() needs parentheses: otherwise ">= 0" binds to $needle.
             if (index($haystack, $needle) >= 0) {
                 ...
             }

           Parameter:

           $str
               A string.

       guessed_to_utf8 ( $text, [ lang, ... ] )
           Function.  Guesses the charset of the text, taking the language context into
           account, and returns the text re-encoded in UTF-8.

           Parameters:

           $text
               Text to be reencoded.

           lang, ...
               Language tag(s) which may be given by "implicated_langs" in Sympa::Language.

           Returns:

           Re-encoded text.  If no charset could be guessed, "iso-8859-1" is used as the last
           resort, simply because it covers the full 8-bit range.

       mailtourl ( $email, [ decode_html => 1 ], [ query => {key => val, ...} ] )
           Function.  Constructs a "mailto:" URL for the given e-mail address.

           Parameters:

           $email
               E-mail address.

           decode_html => 1
               If set, arguments are assumed to include HTML entities.

           query => {key => val, ...}
               Optional query.

           Returns:

           Constructed URL.

       pad ( $str, $width )
           Pads a string with spaces so that the result is not narrower than the given width.

           Parameters:

           $str
               A string.

           $width
               If $width is a false value, or the width of $str is not less than $width, nothing
               is done.  If $width is less than 0, pads on the right.  Otherwise, pads on the
               left.

           Returns:

           Padded string.
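
           The sign convention for $width can be sketched as follows.  The helper name
           sketch_pad is hypothetical, and it measures the string with length(), which counts
           characters; how Sympa measures the display width of wide or combining characters
           is not covered here.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Sympa::Tools::Text.
sub sketch_pad {
    my ($str, $width) = @_;
    return $str unless $width;              # false width: nothing to do

    my $missing = abs($width) - length($str);
    return $str if $missing <= 0;           # already wide enough

    return $width < 0
        ? $str . (' ' x $missing)           # negative width: pad on the right
        : (' ' x $missing) . $str;          # positive width: pad on the left
}

printf "[%s]\n", sketch_pad('ab', 5);       # prints "[   ab]"
printf "[%s]\n", sketch_pad('ab', -5);      # prints "[ab   ]"
```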

       qdecode_filename ( $filename )
           Q-Decodes web file name.

           ToDo: This should be made obsolete in a future release; "decode_filesystem_safe"
           is the preferred replacement.

       qencode_filename ( $filename )
           Q-Encodes web file name.

           ToDo: This should be made obsolete in a future release; "encode_filesystem_safe"
           is the preferred replacement.

       slurp ( $file )
           Gets the entire content of a file.  Normalization by canonic_text() is applied.
           $file is the path to a text file.

       unescape_chars ( $str )
           Deprecated.  Use "decode_filesystem_safe".

           Unescape weird characters.

       valid_email ( $string )
           Basic check of an email address.

       weburl ( $base, \@paths, [ decode_html => 1 ], [ fragment => $fragment ], [ query =>
       \%query ] )
           Constructs an "http:" or "https:" URL under the given base URI.

           Parameters:

           $base
               Base URI.

           \@paths
               Additional path components.

           decode_html => 1
               If set, arguments are assumed to include HTML entities.  Exception is $base: It is
               assumed not to include entities.

           fragment => $fragment
               Optional fragment.

           query => \%query
               Optional query.

           Returns:

           A URI.

       wrap_text ( $text, [ $init_tab, [ $subsequent_tab, [ $cols ] ] ] )
           Function.  Returns line-wrapped text.

           Parameters:

           $text
               The text to be folded.

           $init_tab
               Indentation prepended to the first line of paragraph.  Default is '', no
               indentation.

           $subsequent_tab
               Indentation prepended to each subsequent line of folded paragraph.  Default is '',
               no indentation.

           $cols
               Max number of columns of folded text.  Default is 78.
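
           To show how the four parameters interact, here is a sketch built on the core
           module Text::Wrap.  This is a swapped-in technique chosen for illustration; the
           text above does not say which line-folding module Sympa uses, and the helper name
           sketch_wrap_text is hypothetical.

```perl
use strict;
use warnings;
use Text::Wrap ();

# Hypothetical helper, not part of Sympa::Tools::Text.
sub sketch_wrap_text {
    my ($text, $init_tab, $subsequent_tab, $cols) = @_;
    $init_tab       = '' unless defined $init_tab;
    $subsequent_tab = '' unless defined $subsequent_tab;
    $cols           = 78 unless defined $cols;

    # Text::Wrap keeps every line shorter than $columns, hence the + 1.
    local $Text::Wrap::columns  = $cols + 1;
    # Keep spaces as spaces instead of converting runs of them to tabs.
    local $Text::Wrap::unexpand = 0;
    return Text::Wrap::wrap($init_tab, $subsequent_tab, $text);
}

# First line starts with "* ", continuation lines with two spaces.
print sketch_wrap_text(join(' ', ('word') x 30), '* ', '  ', 40), "\n";
```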

HISTORY

       Sympa::Tools::Text appeared in Sympa 6.2a.41.

       decode_filesystem_safe() and encode_filesystem_safe() were added in Sympa 6.2.10.

       decode_html(), encode_html(), encode_uri() and mailtourl() were added in Sympa 6.2.14, and
       escape_url() was deprecated.

       guessed_to_utf8() and pad() were added in Sympa 6.2.17.

       canonic_text() and slurp() were added in Sympa 6.2.53b.

       clip() was added in Sympa 6.2.61b.