Provided by: maildir-utils_1.2.0-2build1_amd64

NAME

       mu_index - index e-mail messages stored in Maildirs

SYNOPSIS

       mu index [options]

DESCRIPTION

       mu  index is the mu command for scanning the contents of Maildir directories and storing the results in a
       Xapian database. The data can then be queried using mu-find(1).

       index understands Maildirs as defined by Daniel Bernstein for qmail(7). In addition, it understands
       recursive Maildirs (Maildirs within Maildirs) and Maildir++. It can also deal with VFAT-based Maildirs,
       which use '!' as the separator instead of ':'.

       E-mail  messages  which are not stored in something resembling a maildir leaf-directory (cur and new) are
       ignored, as are the cache directories for notmuch and gnus, and any dot-directory.

       Symlinks are not followed.

       If there is a file called .noindex in a directory,  the  contents  of  that  directory  and  all  of  its
       subdirectories  will  be  ignored.  This  can  be useful to exclude certain directories from the indexing
       process, for example directories with spam-messages.

       If there is a file called .noupdate in a directory, the contents of that directory and all of its
       subdirectories will be ignored, unless we do a full rebuild (with --rebuild). This can be useful to speed
       things up if you have some maildirs that never change. Note that you can still search for these messages;
       this only affects updating the database.
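
       As a minimal sketch of the two marker files, the following builds a throw-away Maildir tree in a
       temporary directory and marks one folder with .noindex and another with .noupdate. The folder names
       (inbox, spam, archive-2010) are hypothetical, and the mu invocation is shown only as a comment so that
       running the sketch never touches an existing database.

```shell
# Build a scratch Maildir tree; all folder names here are illustrative only.
demo_root=$(mktemp -d)
for folder in inbox spam archive-2010; do
    mkdir -p "$demo_root/Maildir/$folder/cur" \
             "$demo_root/Maildir/$folder/new" \
             "$demo_root/Maildir/$folder/tmp"
done

# Never index anything in spam/ or below it.
touch "$demo_root/Maildir/spam/.noindex"

# Skip archive-2010/ on normal runs; a run with --rebuild still scans it.
touch "$demo_root/Maildir/archive-2010/.noupdate"

# The tree could then be indexed with, for example:
#   mu index --maildir="$demo_root/Maildir"
echo "scratch maildir created in $demo_root/Maildir"
```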

       There is also the --lazy-check option, which can greatly speed up indexing; see below for details.

       The first run of mu index may take a few minutes if you  have  a  lot  of  mail  (tens  of  thousands  of
       messages).  Fortunately, such a full scan needs to be done only once; after that it suffices to index the
       changes, which goes much faster. See the 'Note on performance (i,ii,iii)' below for more information.

       The  optional  'phase two' of the indexing-process is the removal of messages from the database for which
       there is no longer a corresponding file in the Maildir. If  you  do  not  want  this,  you  can  use  -n,
       --nocleanup.

       When mu index catches one of the signals SIGINT, SIGHUP or SIGTERM (e.g., when you press Ctrl-C during
       the indexing process), it tries to shut down gracefully: it tries to save and commit data, close the
       database, and so on. If it receives another signal (e.g., when pressing Ctrl-C once more), mu index will
       terminate immediately.

OPTIONS

       Note, some of the general options are described in the mu(1) man-page and not  here,  as  they  apply  to
       multiple mu commands.

       -m, --maildir=<maildir>
              starts  searching  at  <maildir>. By default, mu uses whatever the MAILDIR environment variable is
              set to; if it is not set, it tries ~/Maildir. See the note on mixing sub-maildirs below.

       --my-address=<my-email-address>
              specifies that some e-mail address is 'my-address' (--my-address can be used multiple times). This
              is used by mu cfind -- any e-mail address found in the address fields of a message which also  has
              <my-email-address>  in  one  of  its  address fields is considered a personal e-mail address. This
              allows you, for example, to filter out (mu cfind --personal) addresses which were merely  seen  in
              mailing list messages.

       --lazy-check
              in lazy-check mode, mu does not consider messages for which the time-stamp (ctime) of the
              directory they reside in has not changed since the previous indexing run. This is much faster than
              the non-lazy check, but won't update messages that have changed (rather than having been added or
              removed), since merely editing a message does not update the directory time-stamp. Of course, you
              can run mu index occasionally without --lazy-check, to pick up such messages.
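
              One way to combine both modes, sketched under the assumption that one full check per week is
              enough: a small wrapper selects --lazy-check on most days and leaves it out on Sundays. The
              helper index_flags is hypothetical and not part of mu; the sketch only prints the command it
              would run.

```shell
# Print the mu index options for a given ISO weekday (1 = Monday ... 7 = Sunday).
index_flags() {
    if [ "$1" -eq 7 ]; then
        echo "--quiet"                # full check, also picks up edited messages
    else
        echo "--quiet --lazy-check"   # skip directories whose ctime is unchanged
    fi
}

# date +%u prints the ISO weekday number for today.
flags=$(index_flags "$(date +%u)")
echo "would run: mu index $flags"
```

              Called from a daily cron job, such a wrapper keeps most runs fast while still picking up edited
              messages once a week.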

       --nocleanup
              disables the database cleanup that mu does by default after indexing.

       --rebuild
              clear all messages from the database before indexing. --rebuild guarantees that after the indexing
              has finished, there are no 'old' messages  in  the  database  anymore,  which  is  not  true  with
              --reindex  when  indexing  only  a  part  of  messages  (using  --maildir). For this reason, it is
              necessary to run mu index --rebuild when there is an upgrade in the database format. mu index will
              issue a warning about this.

       --autoupgrade
              automatically use -y, --empty when mu notices that the database version is  not  up-to-date.  This
              option  is  for use in cron scripts and the like, so they won't require any user interaction, even
              when mu introduces a new database version.

       --xbatchsize=<batch size>
              set the maximum number of messages to process in a single Xapian transaction.  In  practice,  this
              option  is  only useful if you find that mu is running out of memory while indexing; in that case,
              you can set the batch size to (for example) 1000, which will reduce memory consumption,  but  also
              substantially reduce the indexing performance.

       --max-msg-size=<max msg size>
              set the maximum size (in bytes) for messages. The default maximum (currently 500 MB) should be
              enough in most cases, but if you encounter warnings from mu about ignoring messages because they
              are too big, you may want to increase this. Note that the reason for having a maximum size is that
              big messages require big memory allocations, which may lead to problems.

              NOTE:  It  is  not  recommended  to mix maildirs and sub-maildirs within the hierarchy in the same
              database;  for  example,  it's  better  not  to  index   both   with   --maildir=~/MyMaildir   and
              --maildir=~/MyMaildir/foo,  as  this  may  lead  to  unexpected  results  when  searching with the
              'maildir:' search parameter (see below).
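
              A script that indexes several directories could guard against this kind of overlap with a check
              along the following lines. This is a sketch only: is_nested is a hypothetical helper, and the
              comparison is purely textual (it does not resolve symlinks or '..' components).

```shell
# Succeed (exit status 0) if directory $2 is the same as, or lies below, directory $1.
is_nested() {
    parent=${1%/}   # drop one trailing slash, if any
    child=${2%/}
    case "$child/" in
        "$parent"/*) return 0 ;;
        *)           return 1 ;;
    esac
}

# The pair of directories from the note above.
if is_nested "$HOME/MyMaildir" "$HOME/MyMaildir/foo"; then
    echo "warning: these maildirs overlap; index only one of them into this database"
fi
```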

   A note on performance (i)
       As a non-scientific benchmark, a simple test on the author's machine (a Thinkpad X61s laptop using  Linux
       2.6.35 and an ext3 file system) with no existing database, and a maildir with 27273 messages:

        $ sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
        $ time mu index --quiet
        66,65s user 6,05s system 27% cpu 4:24,20 total
       (about 103 messages per second)

       A second run, which is the more typical use case when there is a database already, goes much faster:

        $ sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
        $ time mu index --quiet
        0,48s user 0,76s system 10% cpu 11,796 total
       (more than 56818 messages per second, relative to the 0,48s of user CPU time)

       Note  that  each  test flushes the caches first; a more common use case might be to run mu index when new
       mail has arrived; the cache may stay quite 'warm' in that case:

        $ time mu index --quiet
        0,33s user 0,40s system 80% cpu 0,905 total
       which is more than 30000 messages per second.
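
       The first-run and warm-cache rates above follow from dividing the number of messages by the wall-clock
       total reported by time. A small sketch of that arithmetic (rate is a hypothetical helper; times are
       given in seconds with a decimal point):

```shell
# Print messages per second, rounded to a whole number.
# $1 = number of messages, $2 = elapsed wall-clock time in seconds.
rate() {
    awk -v n="$1" -v t="$2" 'BEGIN { printf "%.0f\n", n / t }'
}

rate 27273 264.20   # first run, 4:24,20 total  -> prints 103
rate 27273 0.905    # warm-cache run            -> prints 30136
```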

   A note on performance (ii)
       As of June 2012, we did the same non-scientific benchmark, this time with an Intel i5-2500 CPU @
       3.30GHz, an ext4 file system and a maildir with 22589 messages. We start without an existing database.

        $ sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
        $ time mu index --quiet
        27,79s user 2,17s system 48% cpu 1:01,47 total
       (about 813 messages per second, relative to the 27,79s of user CPU time)

       A second run, which is the more typical use case when there is a database already, goes much faster:

        $ sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
        $ time mu index --quiet
        0,13s user 0,30s system 19% cpu 2,162 total
       (more than 173000 messages per second, relative to the 0,13s of user CPU time)

   A note on performance (iii)
       As of July 2016, we did the same non-scientific benchmark, again with the Intel i5-2500 CPU @ 3.30GHz
       and an ext4 file system. This time, the maildir contains 72525 messages.

        $ sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
        $ time mu index --quiet
        40,34s user 2,56s system 64% cpu 1:06,17 total
       (about 1099 messages per second).

       As shown, mu has been getting faster with each release, even with relatively expensive new features such
       as text-normalization (for case-insensitive/accent-insensitive matching). The profiles are now dominated
       by operations in the Xapian database.

FILES

       By default, mu index stores its message database in ~/.mu/xapian; the database has  an  embedded  version
       number,  and  mu  will  automatically  update  it  when  it  notices a different version. This allows for
       automatic updating of mu-versions, without the need to clear out any old databases.

       However, note that versions of mu before 0.7  used  a  different  scheme,  which  puts  the  database  in
       ~/.mu/xapian-<version>.  These  older  databases  can  safely be deleted. Starting from version 0.7, this
       manual cleanup should no longer be needed.

       mu stores logs of its operations and queries in <muhome>/mu.log (by default, this is ~/.mu/mu.log).  Upon
       startup,  mu  checks the size of this log file. If it exceeds 1 MB, it will be moved to ~/.mu/mu.log.old,
       overwriting any existing file of that name, and start with an empty log  file.  This  scheme  allows  for
       continued use of mu without the need for any manual maintenance of log files.
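
       The rotation scheme described above can be illustrated as follows. This is a sketch of the described
       behaviour, not mu's own code: rotate_log is a hypothetical helper, the demonstration works on a scratch
       file rather than on ~/.mu/mu.log, and 1 MB is assumed to mean 1048576 bytes.

```shell
# Move $1 to $1.old when it is larger than $2 bytes, then start an empty log.
rotate_log() {
    log=$1
    limit=$2
    [ -f "$log" ] || return 0
    size=$(( $(wc -c < "$log") ))   # arithmetic expansion strips any padding from wc
    if [ "$size" -gt "$limit" ]; then
        mv -f "$log" "$log.old"     # overwrites any previous .old file
        : > "$log"                  # start again with an empty log file
    fi
}

# Demonstration on a scratch file of about 2 MB.
scratch=$(mktemp -d)
head -c 2000000 /dev/zero > "$scratch/mu.log"
rotate_log "$scratch/mu.log" 1048576
ls -l "$scratch"
```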

ENVIRONMENT

       mu  index  uses  MAILDIR  to  find  the  user's  Maildir  if  it  has  not been specified explicitly with
       --maildir=<maildir>. If MAILDIR is not set, mu index will try ~/Maildir.
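
       The lookup order can be mirrored in a wrapper script. In this sketch, resolve_maildir is a hypothetical
       helper whose argument stands in for the value given with --maildir; note that the shell expansion used
       here also treats an empty MAILDIR as unset, which is an assumption and not something the text above
       specifies.

```shell
# Resolve the maildir in the order described: explicit value, then MAILDIR, then ~/Maildir.
# $1 stands in for the value given with --maildir (empty if the option was not used).
resolve_maildir() {
    if [ -n "$1" ]; then
        echo "$1"
    else
        echo "${MAILDIR:-$HOME/Maildir}"
    fi
}

resolve_maildir ""               # MAILDIR if set, otherwise ~/Maildir
resolve_maildir "/srv/mail/me"   # an explicit value wins (hypothetical path)
```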

RETURN VALUE

       mu index returns 0 upon successful completion; any other number greater than 0 signals an error.
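
       In a script, the return value can be checked in the usual shell way. A minimal sketch, in which
       run_and_report is a hypothetical helper and true stands in for the real mu index command:

```shell
# Run a command, report on its exit status, and pass a failure status on to the caller.
run_and_report() {
    if "$@"; then
        echo "indexing succeeded"
    else
        status=$?
        echo "indexing failed with exit status $status" >&2
        return "$status"
    fi
}

# Stand-in for: run_and_report mu index --quiet
run_and_report true
```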

BUGS

       Please report bugs if you find them: https://github.com/djcb/mu/issues

AUTHOR

       Dirk-Jan C. Binnema <djcb@djcbsoftware.nl>

SEE ALSO

       maildir(5), mu(1), mu-find(1), mu-cfind(1)

User Manuals                                        July 2016                                        MU-INDEX(1)