Provided by: maildir-utils_1.10.8-2build3_amd64 bug

NAME

       muindex -- index e-mail messages stored in Maildirs

SYNOPSIS

       mu [common-options] index

DESCRIPTION

       mu  index  is  the mu command for scanning the contents of Maildir directories and storing
       the results in a Xapian database. The data can then be queried using mu-find(1).

       Before the first time you run mu index, you must run mu init to initialize the database.

       index understands Maildirs as defined by Daniel Bernstein for qmail(7).  In  addition,  it
       understands  recursive  Maildirs  (Maildirs  within Maildirs), Maildir++. It also supports
       VFAT-based Maildirs which use '!' or ';' as the separators instead of ':'.

       E-mail messages which are not stored in something resembling a maildir leaf-directory (cur
       and  new)  are  ignored,  as  are the cache directories for notmuch and gnus, and any dot-
       directory.

       Starting  with  mu  1.5.x,  symlinks  are  followed,  and  can  be  spread  over  multiple
       filesystems;  however  note  that  moving  files  around  is  much  faster  when  multiple
       filesystems are not involved.

       If there is a file called .noindex in a directory, the contents of that directory and  all
       of  its  subdirectories will be ignored. This can be useful to exclude certain directories
       from the indexing process, for example directories with spam-messages.

       If there is a file called .noupdate in a directory, the contents of that directory and all
       of  its  subdirectories  will be ignored, unless we do a full rebuild (with mu init). This
       can be useful to speed up things you have some maildirs that never change. Note  that  you
       can still search for these messages, this only affects updating the database. .noupdate is
       ignored when you start indexing with an empty database (such as directly after mu init.

       There also the --lazy-check which can greatly speed up indexing; see below for details.

       The first run of mu index may take a few minutes if you  have  a  lot  of  mail  (tens  of
       thousands  of  messages).  Fortunately, such a full scan needs to be done only once; after
       that it suffices to  index  the  changes,  which  goes  much  faster.  See  the  'Note  on
       performance (i,ii,iii)' below for more information.

       The  optional  'phase  two'  of  the  indexing-process is the removal of messages from the
       database for which there is no longer a corresponding file in the Maildir.  If you do  not
       want this, you can use -n, --nocleanup.

       When  mu  index catches one of the signals SIGINT, SIGHUP or SIGTERM (e.g., when you press
       Ctrl-C during the indexing process), it attempts to shutdown gracefully; it tries to  save
       and  commit  data,  and  close the database etc. If it receives another signal (e.g., when
       pressing Ctrl-C once more), mu index will terminate immediately.

INDEX OPTIONS

   --lazy-check
       in lazy-check mode, mu does not consider messages for which the time-stamp (ctime) of  the
       directory  they  reside  in  has not changed since the previous indexing run. This is much
       faster than the non-lazy check, but won't update messages that have  change  (rather  than
       having  been  added  or  removed),  since  merely  editing  a  message does not update the
       directory time-stamp. Of course, you can run mu-index occasionally  without  --lazy-check,
       to pick up such messages.

   --nocleanup
       disable the database cleanup that mu does by default after indexing.

   --muhome
       use  a  non-default  directory  to  store  and read the database, write the logs, etc.  By
       default, mu uses the XDG Base Directory Specification (e.g. on GNU/Linux this defaults  to
       ~/.cache/mu  and  ~/.config/mu).  Earlier  versions  of  mu  defaulted to ~/.mu, which now
       requires --muhome=~/.mu.

       The environment variable MUHOME can be used as an alternative to --muhome. The latter  has
       precedence.

COMMON OPTIONS

   -d, --debug
       makes  mu  generate  extra  debug information, useful for debugging the program itself. By
       default, debug information goes to the log file, ~/.cache/mu/mu.log.   It  can  safely  be
       deleted  when  mu  is not running. When running with --debug option, the log file can grow
       rather quickly. See the note on logging below.

   -q, --quiet
       causes mu not to output  informational  messages  and  progress  information  to  standard
       output,  but  only  to  the log file. Error messages will still be sent to standard error.
       Note that mu index is much faster with --quiet, so it is recommended you use  this  option
       when using mu from scripts etc.

   --log-stderr
       causes mu to not output log messages to standard error, in addition to sending them to the
       log file.

   --nocolor
       do not use ANSI colors. The environment variable NOCOLOR can be used as an alternative  to
       --nocolor.

   -V, --version
       prints mu version and copyright information.

   -h, --help
       lists the various command line options.

PERFORMANCE

   indexing in ancient times (2009?)
       As  a  non-scientific  benchmark,  a  simple test on the author's machine (a Thinkpad X61s
       laptop using Linux 2.6.35 and an ext3 file  system)  with  no  existing  database,  and  a
       maildir with 27273 messages:

              $ sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
              $ time mu index --quiet
              66,65s user 6,05s system 27% cpu 4:24,20 total

       (about 103 messages per second)

       A  second  run,  which is the more typical use case when there is a database already, goes
       much faster:

              $ sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
              $ time mu index --quiet
              0,48s user 0,76s system 10% cpu 11,796 total

       (more than 56818 messages per second)

       Note that each test flushes the caches first; a more common use case might be  to  run  mu
       index when new mail has arrived; the cache may stay quite 'warm' in that case:

              $ time mu index --quiet
              0,33s user 0,40s system 80% cpu 0,905 total

       which is more than 30000 messages per second.

   indexing in 2012
       As  per  June  2012,  we  did  the  same non-scientific benchmark, this time with an Intel
       i5-2500 CPU @ 3.30GHz, an ext4 file system and a maildir with  22589  messages.  We  start
       without an existing database.

              $ sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
              $ time mu index --quiet
              27,79s user 2,17s system 48% cpu 1:01,47 total

       (about 813 messages per second)

       A  second  run,  which is the more typical use case when there is a database already, goes
       much faster:

              $ sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
              $ time mu index --quiet
              0,13s user 0,30s system 19% cpu 2,162 total

       (more than 173000 messages per second)

   indexing in 2016
       As per July 2016, we did the same non-scientific benchmark, again with the  Intel  i5-2500
       CPU @ 3.30GHz, an ext4 file system. This time, the maildir contains 72525 messages.

              $ sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
              $ time mu index --quiet
              40,34s user 2,56s system 64% cpu 1:06,17 total

       (about 1099 messages per second).

   indexing in 2022
       A  few  years later and it is June 2022. There's a lot more happening during indexing, but
       indexing became multi-threaded and machines are faster; e.g. this is  with  an  AMD  Ryzen
       Threadripper 1950X (16 cores) @ 3.399GHz.

       The  instructions  are a little different since we have a proper repeatable benchmark now.
       After building,

               $ sudo sh -c 'sync && echo 3 > /proc/sys/vm/drop_caches'
              % THREAD_NUM=4 build/lib/tests/bench-indexer -m perf
              # random seed: R02Sf5c50e4851ec51adaf301e0e054bd52b
              1..1
              # Start of bench tests
              # Start of indexer tests
              indexed 5000 messages in 20 maildirs in 3763ms; 752 μs/message; 1328 messages/s (4 thread(s))
              ok 1 /bench/indexer/4-cores
              # End of indexer tests
              # End of bench tests

       Things are again a little faster, even though  the  index  does  a  lot  more  now  (text-
       normalizatian, and pre-generating message-sexps). A faster machine helps, too!

EXIT CODE

       This  command  returns  0  upon  successful completion, or a non-zero exit code otherwise.
       Typical values are 2 (no matches found), 11 (database schema mismatch) and 12  (failed  to
       acquire database lock).

   no matches found (2)
       Nothing matching found; try a different query

   database schema mismatch (11)
       You need to re-initialize mu, see mu-init(1)

   failed to acquire lock (19)
       Some other program has exclusive access to the mu (Xapian) database

REPORTING BUGS

       Please report bugs at https://github.com/djcb/mu/issues.

AUTHOR

       Dirk-Jan C. Binnema <djcb@djcbsoftware.nl>

COPYRIGHT

       This manpage is part of mu 1.10.8.

       Copyright  ©  2022-2023  Dirk-Jan  C.  Binnema. License GPLv3+: GNU GPL version 3 or later
       https://gnu.org/licenses/gpl.html. This is free software:  you  are  free  to  change  and
       redistribute it. There is NO WARRANTY, to the extent permitted by law.

SEE ALSO

       maildir(5), mu(1), mu-init(1), mu-find(1), mu-cfind(1)

                                                                                      MU INDEX(1)