Provided by: grepmail_5.3104-5_all

NAME

       grepmail - search mailboxes for mail matching a regular expression

SYNOPSIS

         grepmail [--help|--version] [-abBDFhHilLmMnqrRSuvVw] [-C <cache-file>]
           [-j <status>] [-s <sizespec>] [-d <date-specification>]
           [-X <signature-pattern>] [-Y <header-pattern>]
           [[-e] <pattern>|-E <expr>|-f <pattern-file>] <files...>

DESCRIPTION

         grepmail looks for mail messages containing a pattern, and prints the matching messages
         to standard output.

         By default grepmail looks in both header and body for the specified pattern.

         When redirected to a file, the result is another mailbox, which can, in turn, be handled
         by standard User Agents, such as elm, or even used as input for another instance of
         grepmail.

         At least one of -E, -e, -d, -s, or -u must be specified. The pattern is optional if -d,
         -s, and/or -u is used. The -e flag is optional if there is no file whose name is the
         pattern. The -E option can be used to specify complex search expressions involving
         logical operators.  (See below.)

         If a mailbox cannot be found directly, grepmail first searches the directory specified
         by the MAILDIR environment variable (if one is defined), then searches the $HOME/mail,
         $HOME/Mail, and $HOME/Mailbox directories.
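
         The lookup order described above can be sketched in Python as follows. This only
         illustrates the documented order and is not grepmail's own code (grepmail is written in
         Perl); the function name candidate_paths and its parameters are hypothetical.

```python
import os
import posixpath

def candidate_paths(mailbox, env=None):
    """Return the paths to try for `mailbox`, in the documented order."""
    env = os.environ if env is None else env
    paths = [mailbox]                      # first, the name exactly as given
    if env.get("MAILDIR"):                 # then $MAILDIR, if it is defined
        paths.append(posixpath.join(env["MAILDIR"], mailbox))
    home = env.get("HOME")
    if home:                               # then the three $HOME directories
        for sub in ("mail", "Mail", "Mailbox"):
            paths.append(posixpath.join(home, sub, mailbox))
    return paths

for path in candidate_paths("inbox", {"MAILDIR": "/var/mail", "HOME": "/home/me"}):
    print(path)
```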

OPTIONS AND ARGUMENTS

       Many of the options and arguments are analogous to those of grep.

       pattern
         The pattern to search for in the mail message.  May be any Perl regular expression, but
         should be quoted on the command line to protect against globbing (shell expansion). To
         search for more than one pattern, use the form "(pattern1|pattern2|...)".

         Note that complex pattern features such as "(?>...)" require that you use a version of
         perl which supports them. You can use the pattern "()" to indicate that you do not want
         to match anything. This is useful if you want to initialize the cache without printing
         any output.

       mailbox
         Mailboxes must be in the traditional UNIX "/bin/mail" mailbox format.  The mailboxes may be
         compressed by gzip, bzip2, lzip or xz, in which case the associated compression tool
         must be installed on the system, as well as a recent version of the
         Mail::Mbox::MessageParser Perl module that supports the format.

         If no mailbox is specified, grepmail takes input from stdin, which can be compressed or
         not. grepmail's behavior is undefined when ASCII and binary data are piped together as
         input.

       -a
         Use arrival date instead of sent date.

       -b
         Asserts that the pattern must match in the body of the email.

       -B
         Print the body but with only minimal ('From ', 'From:', 'Subject:', 'Date:') headers.
         This flag can be used with -H, in which case it will print only short headers and no
         email bodies.

       -C
         Specifies the location of the cache file. The default is $HOME/.grepmail-cache.

       -D
         Enable debug mode, which prints diagnostic messages.

       -d
         Date specifications must take one of the following forms:
           - a date like "today", "yesterday", "5/18/93", "5 days ago", "5 weeks ago",
           - OR "before", "after", or "since", followed by a date as defined above,
           - OR "between <date> and <date>", where <date> is defined as above.

         Simple date expressions will first be parsed by Date::Parse. If this fails, grepmail
         will attempt to parse the date with Date::Manip, if the module is installed on the
         system. Use an empty pattern (i.e. -d "") to find emails without a "Date: ..." line in
         the header.

         Date specifications without times are interpreted as having a time of midnight of that
         day (which is the morning), except for "after" and "since" specifications, which are
         interpreted as midnight of the following day.  For example, "between today and tomorrow"
         is the same as simply "today", and returns emails whose date has the current day. ("now"
         is interpreted as "today".) The date specification "after July 5th" will return emails
         whose date is midnight July 6th or later.
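
         The midnight rule above can be illustrated with a short Python sketch. It models only
         the rule as stated here, not grepmail's actual date handling (which relies on
         Date::Parse and Date::Manip); the function name boundary is hypothetical.

```python
from datetime import date, datetime, time, timedelta

def boundary(keyword, day):
    """Return the instant that a date-only specification is compared against."""
    if keyword in ("after", "since"):
        # "after" and "since" start at midnight of the following day
        day = day + timedelta(days=1)
    # every other specification is taken as midnight at the start of the day
    return datetime.combine(day, time.min)

print(boundary("after", date(2004, 7, 5)))   # midnight at the start of July 6th
print(boundary("before", date(1998, 6, 1)))  # midnight at the start of June 1st
```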

       -E
         Specify a complex search expression using logical operators. The current syntax allows
         the user to specify search expressions using Perl syntax. Three values can be used:
         $email (the entire email message), $email_header (just the header), or $email_body (just
         the body). A search is specified in the form "$email =~ /pattern/", and multiple
         searches can be combined using "&&" and "||" for "and" and "or".

         For example, the expression

           $email_header =~ /^From: .*\@coppit.org/ && $email =~ /grepmail/i

         will find all emails which originate from coppit.org (you must escape the "@" sign with
         a backslash), and which contain the keyword "grepmail" anywhere in the message, in any
         capitalization.

         -E is incompatible with -b, -h, and -e. Support for combining -E with -i, -M, -S, and -Y
         has not yet been implemented.

         NOTE: The syntax of search expressions may change in the future. In particular, support
         for size, date, and other constraints may be added. The syntax may also be simplified in
         order to make expression formation easier to use (and perhaps at the expense of reduced
         functionality).

       -e
         Explicitly specify the search pattern. This is useful for specifying patterns that begin
         with "-", which would otherwise be interpreted as a flag.

       -f
         Obtain patterns from the given pattern file, one per line.  An empty file contains zero
         patterns, and therefore matches nothing.

       -F
         Force grepmail to process all files and streams as though they were mailboxes.  (That
         is, skip the checks for non-mailbox ASCII files and for binary files that don't look
         like they are compressed using known schemes.)

       -h
         Asserts that the pattern must match in the header of the email.

       -H
         Print the header but not body of matching emails.

       -i
         Make the search case-insensitive (by analogy to grep -i).

       -j
         Asserts that the email "Status:" header must contain the given flags. Order and case are
         not important, so use -j AR or -j ra to search for emails which have been read and
         answered.
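
         As a rough illustration of "order and case are not important", the check can be
         modelled in Python as a set-membership test. This is a sketch of the documented
         behavior only; the function name status_matches is hypothetical.

```python
def status_matches(status_header, wanted):
    """True if every flag in `wanted` appears in the Status: header value."""
    present = set(status_header.upper())
    return all(flag in present for flag in wanted.upper())

print(status_matches("RA", "ar"))  # same flags, different order and case
print(status_matches("RO", "AR"))  # the "A" flag is missing
```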

       -l
         Output the names of files having an email matching the expression (by analogy to grep
         -l).

       -L
         Follow symbolic links. (Implies -R.)

       -M
         Causes grepmail to ignore non-text MIME attachments. This removes false positives
         resulting from binaries encoded as ASCII attachments.

       -m
         Append "X-Mailfolder: <folder>" to all email headers, indicating which folder contained
         the matched email.

       -n
         Prefix each line with line number information. If multiple files are specified, the
         filename will precede the line number. NOTE: When used in conjunction with -m, the
         X-Mailfolder header has the same line number as the next (blank) line.

       -q
         Quiet mode. Suppress the output of warning messages about non-mailbox files,
         directories, etc.

       -r
         Generate a report of the names of the files containing emails matching the expression,
         along with a count of the number of matching emails.

       -R
         Causes grepmail to recurse into any directories encountered.

       -s
         Return emails which match the size (in bytes) specified with this flag. Note that this
         size includes the length of the header.

         Size constraints must take one of the following forms:
          - 12345: match size of exactly 12345
          - <12345, <=12345, >12345, >=12345: match size less than, less than or equal,
            greater than, or greater than or equal to 12345
          - 10000-12345: match size between 10000 and 12345 inclusive
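
         The three forms above can be made concrete with a small Python sketch that checks a
         message size against a constraint string. It is an illustration of the documented
         syntax, not grepmail's parser; the name size_matches is hypothetical.

```python
import operator
import re

# Comparison used for each optional prefix; no prefix means an exact match.
_OPS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    None: operator.eq,
}

def size_matches(spec, size):
    """Return True if `size` (in bytes) satisfies the size constraint `spec`."""
    m = re.fullmatch(r"(\d+)-(\d+)", spec)
    if m:  # inclusive range, e.g. "10000-12345"
        return int(m.group(1)) <= size <= int(m.group(2))
    m = re.fullmatch(r"(<=|>=|<|>)?(\d+)", spec)
    if m is None:
        raise ValueError("unrecognized size constraint: " + spec)
    return _OPS[m.group(1)](size, int(m.group(2)))

print(size_matches("10000-12345", 12345))
print(size_matches("<12345", 12345))
```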

       -S
         Ignore signatures. The signature consists of everything after a line consisting of
         "-- " (two dashes followed by a single space).

       -u
         Output only unique emails, by analogy to sort -u. grepmail determines email uniqueness
         by the Message-ID header.
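
         Deduplication keyed on the Message-ID header can be sketched in Python as below. The
         name unique_messages is hypothetical, and keeping every message that lacks a Message-ID
         is an assumption of this sketch; this page does not say how grepmail treats such
         messages.

```python
from email import message_from_string

def unique_messages(raw_messages):
    """Keep the first message seen for each Message-ID, preserving order."""
    seen = set()
    unique = []
    for raw in raw_messages:
        msg_id = message_from_string(raw)["Message-ID"]
        if msg_id is not None:
            if msg_id in seen:
                continue        # duplicate of an earlier message
            seen.add(msg_id)
        unique.append(raw)      # assumption: messages without an ID are kept
    return unique

first = "Message-ID: <1@example.com>\nSubject: hello\n\nbody\n"
copy = "Message-ID: <1@example.com>\nSubject: hello again\n\nbody\n"
other = "Message-ID: <2@example.com>\nSubject: other\n\nbody\n"
print(len(unique_messages([first, copy, other])))
```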

       -v
         Invert the sense of the search, by analogy to grep -v. This results in the set of emails
         printed being the complement of those that would be printed without the -v switch.

       -V
         Print the version and exit.

       -w
         Search for only those lines which contain the pattern as part of a word group.  That is,
         the start of the pattern must match the start of a word, and the end of the pattern must
         match the end of a word. (Note that the start and end need not be for the same word.)

         If you are familiar with Perl regular expressions, this flag simply puts a "\b" before
         and after the search pattern.
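
         The effect of wrapping a pattern in "\b" can be tried out with Python's re module,
         whose "\b" behaves like Perl's for plain ASCII text. This is only a demonstration of
         the word-boundary rule; the helper name word_wrapped is hypothetical.

```python
import re

def word_wrapped(pattern):
    """Put a word boundary before and after the pattern, as -w is described to do."""
    return r"\b" + pattern + r"\b"

print(bool(re.search(word_wrapped("cat"), "the cat sat")))    # whole word
print(bool(re.search(word_wrapped("cat"), "concatenate")))    # inside a word
print(bool(re.search(word_wrapped("red car"), "a red car")))  # spans two words
```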

       -X
         Specify a regular expression for the signature separator. By default this pattern is
         '^-- $'.
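
         To show how the separator pattern relates to signature removal (the -S flag), here is
         a minimal Python sketch. It assumes the pattern is applied line by line to the message
         body; the function name strip_signature is hypothetical and this is not grepmail's
         implementation.

```python
import re

def strip_signature(body, separator=r"^-- $"):
    """Drop the separator line and everything after it, if a separator is found."""
    m = re.search(separator, body, flags=re.MULTILINE)
    return body if m is None else body[:m.start()]

print(repr(strip_signature("Hello\n-- \nAndy\n")))
print(repr(strip_signature("Hello\n--\nnot a signature\n")))
```

         Note that the default pattern requires the trailing space, so a line containing only
         "--" is not treated as a separator.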

       -Y
         Specify a pattern which indicates specific headers to be searched. The search will
         automatically treat headers which span multiple lines as one long line.  This flag
         implies -h.

         In the style of procmail, special strings in the pattern will be expanded as follows:

           If the regular expression contains "^TO:" it will be substituted by

             ^((Original-)?(Resent-)?(To|Cc|Bcc)|(X-Envelope|Apparently(-Resent)?)-To):

           which should match all headers with destination addresses.

           If the regular expression contains "^FROM_DAEMON:" it will be substituted by

             (^(Mailing-List:|Precedence:.*(junk|bulk|list)|To: Multiple recipients of |(((Resent-)?(From|Sender)|X-Envelope-From):|>?From )([^>]*[^(.%@a-z0-9])?(Post(ma?(st(e?r)?|n)|office)|(send)?Mail(er)?|daemon|m(mdf|ajordomo)|n?uucp|LIST(SERV|proc)|NETSERV|o(wner|ps)|r(e(quest|sponse)|oot)|b(ounce|bs\.smtp)|echo|mirror|s(erv(ices?|er)|mtp(error)?|ystem)|A(dmin(istrator)?|MMGR|utoanswer))(([^).!:a-z0-9][-_a-z0-9]*)?[%@>\t ][^<)]*(\(.*\).*)?)?

           which should catch mails coming from most daemons.

           If the regular expression contains "^FROM_MAILER:" it will be substituted by

             (^(((Resent-)?(From|Sender)|X-Envelope-From):|>?From)([^>]*[^(.%@a-z0-9])?(Post(ma(st(er)?|n)|office)|(send)?Mail(er)?|daemon|mmdf|n?uucp|ops|r(esponse|oot)|(bbs\.)?smtp(error)?|s(erv(ices?|er)|ystem)|A(dmin(istrator)?|MMGR))(([^).!:a-z0-9][-_a-z0-9]*)?[%@>\t][^<)]*(\(.*\).*)?)?$([^>]|$))

           (a stripped down version of "^FROM_DAEMON:"), which should catch mails coming from
           most mailer-daemons.

           So, to search for all emails to or from "Andy":

             grepmail -Y '(^TO:|^From:)' Andy mailbox
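
         The two steps described for -Y (joining wrapped headers into one line, then expanding
         "^TO:") can be sketched in Python as follows. The expansion string is copied from the
         text above; the function names unfold and header_matches, and the case-sensitive
         matching, are assumptions of this sketch rather than documented grepmail behavior.

```python
import re

# Expansion of "^TO:" as given in this manual page.
TO_EXPANSION = (
    r"^((Original-)?(Resent-)?(To|Cc|Bcc)"
    r"|(X-Envelope|Apparently(-Resent)?)-To):"
)

def unfold(header_text):
    """Join continuation lines (starting with a space or tab) onto the previous line."""
    return re.sub(r"\r?\n[ \t]+", " ", header_text)

def header_matches(header_text, header_pattern, pattern):
    """True if a header selected by `header_pattern` also matches `pattern`."""
    expanded = header_pattern.replace("^TO:", TO_EXPANSION)
    for line in unfold(header_text).splitlines():
        if re.search(expanded, line) and re.search(pattern, line):
            return True
    return False

headers = (
    "From: bob@example.com\n"
    "Cc: Someone <someone@example.com>,\n"
    " Andy <andy@example.com>\n"
    "Subject: lunch\n"
)
print(header_matches(headers, "(^TO:|^From:)", "Andy"))
```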

       --help
         Print a help message summarizing the usage.

       --
         All arguments following -- are treated as mail folders.

EXAMPLES

       Count the number of emails. ("." matches every email.)

         grepmail -r . sent-mail

       Get all email between 2000 and 3000 bytes about books:

         grepmail books -s 2000-3000 sent-mail

       Get all email that you mailed yesterday:

         grepmail -d yesterday sent-mail

       Get all email that you mailed before the first Thursday in June 1998 that pertains to
       research (requires Date::Manip):

         grepmail research -d "before 1st thursday in June 1998" sent-mail

       Get all email that you mailed before the first of June 1998 that pertains to research:

         grepmail research -d "before 6/1/98" sent-mail

       Get all email you received since 8/20/98 that wasn't about research or your job, ignoring
       case:

         grepmail -iv "(research|job)" -d "since 8/20/98" saved-mail

       Get all email about mime but not about Netscape. Constrain the search to match the body,
       since most headers contain the text "mime":

         grepmail -b mime saved-mail | grepmail Netscape -v

       Print a list of all mailboxes containing a message from Rodney. Constrain the search to
       the headers, since quoted emails may match the pattern:

         grepmail -hl "^From.*Rodney" saved-mail*

       Find all emails with the text "Pilot" in both the header and the body:

         grepmail -hb "Pilot" saved-mail*

       Print a count of the number of messages about grepmail in all saved-mail mailboxes:

         grepmail -br grepmail saved-mail*

       Remove any duplicates from a mailbox:

         grepmail -u saved-mail

       Convert a Gnus mailbox to mbox format:

         grepmail . gnus-mailbox-dir/* > mbox

       Search for all emails to or from an address (taking into account wrapped headers and
       different header names):

         grepmail -Y '(^TO:|^From:)' my@email.address saved-mail

       Find all emails from postmasters:

         grepmail -Y '^FROM_MAILER:' . saved-mail

FILES

       grepmail will not create temporary files while decompressing compressed archives. The last
       version to do this was 3.5. While the new design uses more memory, the code is much
       simpler, and there is less chance that email can be read by malicious third parties.
       Memory usage is determined by the size of the largest email message in the mailbox.

ENVIRONMENT

       The MAILDIR environment variable can be used to specify the default mail directory. This
       directory will be searched if the specified mailbox cannot be found directly.

       The HOME environment variable is also used to find mailboxes if they cannot be found
       directly. It also determines where grepmail stores state information, such as its cache
       file.

BUGS AND LIMITATIONS

       Patterns containing "$" may cause problems
         Currently I look for "$" followed by a non-word character and replace it with the line
         ending for the current file (either "\n" or "\r\n"). This may cause problems with
         complex patterns specified with -E, but I'm not aware of any.

       Mails without bodies cause problems
         According to RFC 822, mail messages need not have message bodies. I've found and removed
         one bug related to this. I'm not sure if there are others.

       Complex single-point dates not parsed correctly
         If you specify a point date like "September 1, 2004", grepmail creates a date range that
         includes the entire day of September 1, 2004. If you specify a complex point date such
         as "today", "1st Monday in July", or "9/1/2004 at 0:00" grepmail may parse the time
         incorrectly.

         The reason for this problem is that Date::Manip, as of version 5.42, forces default
         values for parsed dates and times. This means that grepmail has a hard time determining
         whether the user supplied certain time/date fields. (e.g. Did Date::Manip provide a
         default time of 0:00, or did the user specify it?)  grepmail tries to work around this
         problem, but the workaround is inherently incomplete in some rare cases.

       File names that look like flags cause problems
         In some special circumstances, grepmail will be confused by files whose names look like
         flags. In such cases, use the -e flag to specify the search pattern.

LICENSE

       This code is distributed under the GNU General Public License (GPL) Version 2.  See the
       file LICENSE in the distribution for details.

AUTHOR

       David Coppit <david@coppit.org>

SEE ALSO

       elm(1), mail(1), grep(1), perl(1), printmail(1), Mail::Internet(3), procmailrc(5).
       Crocker, D. H., Standard for the Format of ARPA Internet Text Messages, RFC 822.