Provided by: muchsync_5-1_amd64

NAME

       muchsync - synchronize maildirs and notmuch databases

SYNOPSIS

       muchsync options
       muchsync options server-name server-options
       muchsync options --init maildir server-name server-options

DESCRIPTION

       muchsync  synchronizes  the  contents  of maildirs and notmuch tags across machines.  Any given execution
       runs pairwise  between  two  replicas,  but  the  system  scales  to  an  arbitrary  number  of  replicas
       synchronizing  in  arbitrary  pairs.   For efficiency, version vectors and logical timestamps are used to
       limit synchronization to items a peer may not yet know about.
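
       As a rough illustration of how version vectors can limit the work done in one synchronization, the
       following sketch assumes (hypothetically; this is not muchsync's actual code or storage format) that
       each item records the replica that last changed it together with that replica's logical timestamp, and
       that a peer advertises the highest timestamp it has seen from each replica.  Only items stamped later
       than the peer's entry need to be sent.  The function name and both file layouts are invented for the
       example.

```sh
#!/bin/sh
# Hypothetical sketch only: file formats and names are assumptions.
#   PEER_VECTOR file: lines of "replica_id version"
#                     (highest version the peer has seen from that replica)
#   ITEMS file:       lines of "item replica_id version"
#                     (who last changed the item, and at which logical time)
# Note: awk compares numbers as floating point, which is fine for a demo.

# items_to_send PEER_VECTOR ITEMS
# Prints the items the peer may not yet know about.
items_to_send() {
    awk '
        # First file: load the peer version vector.
        FILENAME == ARGV[1] { seen[$1] = $2; next }
        # Second file: keep items from unknown replicas or with newer stamps.
        !($2 in seen) || $3 + 0 > seen[$2] + 0 { print $1 }
    ' "$1" "$2"
}
```

       An item whose stamp is at or below the peer's entry for that replica is skipped, which is what keeps a
       sync between two mostly up-to-date replicas cheap.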

       To use muchsync, both muchsync and notmuch should be installed someplace in your PATH  on  two  machines,
       and you must be able to access the remote machine via ssh.

       In  its  simplest  usage,  you  have  a  single  notmuch database on some server SERVER and wish to start
       replicating that database on a client, where the client currently does not have any mailboxes.   You  can
       initialize a new replica in $HOME/inbox by running the following command:

              muchsync --init $HOME/inbox SERVER

       This  command  may take some time, as it transfers the entire contents of your maildir from the server to
       the client and creates a new notmuch index on the client.  Depending on your setup,  you  may  be  either
       bandwidth  limited  or  CPU  limited.   (Sadly,  the  notmuch  library  on  which  muchsync  is  built is
       non-reentrant and forces all indexing to happen on a single core at a rate of about 10,000  messages  per
       minute.)

       From then on, to synchronize the client with the server, just run:

              muchsync SERVER

       Since  muchsync replicates the tags in the notmuch database itself, you should consider disabling maildir
       flag synchronization by executing:

              notmuch config set maildir.synchronize_flags false

       The reason is that the synchronize_flags feature only works on a small subset of pre-defined flags and so
       is not all that useful.  Moreover, it marks flags by renaming files, which is not particularly efficient.
       muchsync was largely motivated by the need for better flag synchronization.  If you  are  satisfied  with
       the  synchronize_flags  feature,  you  might  consider  a  tool  such as offlineimap as an alternative to
       muchsync.

   Synchronization algorithm
       muchsync separately synchronizes two classes of information: the message-to-directory mapping (henceforth
       link  counts)  and  the  message-id-to-tag  mapping  (henceforth tags).  Using logical timestamps, it can
       detect update conflicts for each type of information.  We describe link count and tag synchronization  in
       turn.

       Link count synchronization consists of ensuring that any given message (identified by its
       collision-resistant content hash) appears the same number of times in the same subdirectories on each
       replica.  Generally a message will appear only once in a single subdirectory.  If the message is moved
       or deleted on one replica, the change will propagate to the other replicas.

       If two replicas move or copy the same file between synchronization events (or one moves the file and  the
       other deletes it), this constitutes an update conflict.  Update conflicts are resolved by storing in each
       subdirectory a number of copies equal to the maximum of the number of copies in that subdirectory on  the
       two  replicas.   This  is  conservative, in the sense that a file will never be deleted after a conflict,
       though you may get extra copies of files.  (muchsync uses hard links, so at least these copies  will  not
       use too much disk space.)

       For  example, if one replica moves a message to subdirectory .box1/cur and another moves the same message
       to subdirectory .box2/cur, the conflict will be resolved by placing two links  to  the  message  on  each
       replica, one in .box1/cur and one in .box2/cur.  To respect the structure of maildirs, subdirectories
       ending in new and cur are special-cased; conflicts between sibling new and cur subdirectories are
       resolved in favor of cur without creating additional copies of messages.
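
       The maximum rule for conflicting link counts can be sketched as follows.  This is an illustration under
       assumptions, not muchsync's implementation: the function name and the "subdirectory count" input format
       are hypothetical, directory names are assumed to contain no whitespace, and the special handling of
       sibling new and cur directories is left out.

```sh
#!/bin/sh
# Hypothetical sketch of the conflict rule for one message.
# Each input file lists "subdirectory count" for one replica; a
# subdirectory missing from a file counts as zero copies there.

# merge_link_counts COUNTS_A COUNTS_B
# Prints "subdirectory count" with the per-directory maximum.
merge_link_counts() {
    awk '
        # Keep the largest count seen for each subdirectory.
        !($1 in max) || $2 + 0 > max[$1] { max[$1] = $2 + 0 }
        END { for (dir in max) print dir, max[dir] }
    ' "$1" "$2" | sort
}
```

       Fed one replica reporting ".box1/cur 1" and the other reporting ".box2/cur 1", the function prints both
       lines, matching the two-link outcome described above.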

       Message tags are synchronized based on notmuch's message-ID (usually the Message-ID header of a message),
       rather than message contents.  On conflict, tags are  combined  as  follows.   Any  tag  in  the  notmuch
       configuration parameter muchsync.and_tags is removed from the message unless it appears on both replicas.
       Any other tag is added if it appears on any replica.  In other words, tags in muchsync.and_tags are
       logically anded, while all other tags are logically ored.  (This approach will give the most predictable
       results if muchsync.and_tags has the same value in all your replicas.  The --init option ensures  uniform
       configurations initially, but subsequent changes to muchsync.and_tags must be manually propagated.)

       If  your configuration file does not specify a value for muchsync.and_tags, the default is to use the set
       of tags specified in the new.tags configuration option.  This should give intuitive  results  unless  you
       use  a two-pass tagging system such as the afew tool, in which case new.tags is used to flag input to the
       second pass while you likely want muchsync.and_tags to reflect the output of the second pass.
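
       The tag-combining rule for conflicts can be sketched as below.  The function name and its three
       space-separated list arguments are hypothetical, chosen only for illustration; the sketch covers just
       the and/or combination applied on conflict and assumes tags contain no whitespace or backslashes.

```sh
#!/bin/sh
# Hypothetical sketch of tag merging on an update conflict.

# merge_tags AND_TAGS TAGS_A TAGS_B
# Each argument is a space-separated list of tags.  Prints the merged
# tags, one per line, sorted.
merge_tags() {
    awk -v and_tags="$1" -v a="$2" -v b="$3" 'BEGIN {
        n = split(and_tags, t, " "); for (i = 1; i <= n; i++) is_and[t[i]] = 1
        n = split(a, t, " "); for (i = 1; i <= n; i++) { in_a[t[i]] = 1; all[t[i]] = 1 }
        n = split(b, t, " "); for (i = 1; i <= n; i++) { in_b[t[i]] = 1; all[t[i]] = 1 }
        for (tag in all)
            # and_tags must be on both replicas; any other tag needs only one.
            if (!(tag in is_and) || ((tag in in_a) && (tag in in_b)))
                print tag
    }' | sort
}
```

       For instance, with and_tags set to "unread new", a message tagged "inbox unread" on one replica and
       "inbox replied" on the other ends up with inbox and replied: unread is dropped because only one replica
       still carries it.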

   File deletion
       Because publishing software that actually deletes people's email is a scary prospect, muchsync currently
       never deletes mail files, though this may change in the future.  Instead, it moves any deleted messages
       to the directory .notmuch/muchsync/trash under your mail directory (naming deleted messages by their
       content hash).  If you really want to delete mail to reclaim disk space or for privacy reasons, you will
       need to run the following on each replica:

              cd "$(notmuch config get database.path)" &&
                  rm -rf .notmuch/muchsync/trash
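
       A slightly more defensive variant is sketched below.  The function name is hypothetical; it takes the
       mail directory as an argument, refuses an empty path, and reports how many files it is about to remove.

```sh
#!/bin/sh
# Hypothetical helper: empty the muchsync trash under a given maildir.

# purge_muchsync_trash MAILDIR
purge_muchsync_trash() {
    if [ -z "${1:-}" ]; then
        echo "usage: purge_muchsync_trash MAILDIR" >&2
        return 1
    fi
    trash="$1/.notmuch/muchsync/trash"
    if [ ! -d "$trash" ]; then
        echo "nothing to do: $trash does not exist" >&2
        return 0
    fi
    # Count regular files first so the user sees what is being removed.
    count=$(find "$trash" -type f | wc -l | tr -d ' ')
    echo "removing $count file(s) from $trash"
    rm -rf -- "$trash"
}

# Usage (requires notmuch to be installed):
#   purge_muchsync_trash "$(notmuch config get database.path)"
```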

OPTIONS

       -C file, --config file
              Specify the path of the notmuch configuration file to use.  If none is specified, the  default  is
              to use the contents of the environment variable $NOTMUCH_CONFIG, or if that variable is unset, the
              value $HOME/.notmuch-config.  (These are the same defaults as the notmuch command itself.)

       -F     Check for modified files.  Without this option, muchsync assumes that files in a maildir are never
              edited.   -F disables certain optimizations so as to make muchsync at least check the timestamp on
              every file, which will detect modified files at the cost of a longer startup  time.   If  muchsync
              dies  with the error "message received does not match hash," you likely need to run it with the -F
              option.

              Note that if your software regularly modifies the contents of mail files (e.g.,  because  you  are
              running  offlineimap  with  "synclabels  =  yes"),  then you will need to use -F each time you run
              muchsync.  Specify it as  a  server  option  (after  the  server  name)  if  the  editing  happens
              server-side.

       -r /path/to/muchsync
              Specifies  the path to muchsync on the server.  Ordinarily, muchsync should be in the default PATH
              on the server so this option is not required.  However, this option  is  useful  if  you  have  to
              install muchsync in a non-standard place or wish to test development versions of the code.

       -s ssh-cmd
              Specifies  a command line to pass to /bin/sh to execute a command on another machine.  The default
              value is "ssh -CTaxq".  Note that because this string is passed to the shell,  special  characters
              including spaces may need to be escaped.

       -v     The -v option increases verbosity.  The more times it is specified, the more verbose muchsync will
              become.

       --help Print a brief summary of muchsync's command-line options.

       --init maildir
              This option clones an existing mailbox on a remote server  into  maildir  on  the  local  machine.
              Neither maildir nor your notmuch configuration file (see --config above) should exist when you run
              this command, as both will be created.  The configuration file is copied from the server (adjusted
              to  reflect  the  local maildir), while maildir is created as a replica of the maildir you have on
              the server.

       --nonew
              Ordinarily, muchsync begins by running "notmuch new".  This option says not to run "notmuch new"
              before starting the muchsync operation.  It can be passed as either a client or a server option.
              For example, the command "muchsync myserver --nonew" will run "notmuch new" locally but not on
              myserver.

       --noup, --noupload
              Transfer files from the server to the client, but not vice versa.

       --upbg Transfer  files from the server to the client in the foreground.  Then fork into the background to
              upload any new files from the client to the server.  This option is useful when checking new mail,
              if  you  want  to  begin  reading  your  mail  as  soon as it has been downloaded while the upload
              continues.

       --self Print the 64-bit replica ID of the local maildir replica and exit.  Potentially useful in
              higher-level scripts (such as one named by the emacs notmuch-poll-script variable) to identify
              which replica the script is running on, particularly if network file systems allow a replica to
              be accessed from multiple machines.

       --newid
              Muchsync  requires  every  replica to have a unique 64-bit identifier.  If you ever copy a notmuch
              database to another machine, including the muchsync state, bad things will happen if  both  copies
              use muchsync, as they will both have the same identifier.  Hence, after making such a copy and
              before running muchsync to synchronize mail, run muchsync --newid to change the identifier of one
              of the copies.

       --version
              Report the muchsync version number.
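
       As an illustration of the lookup order described for -C above, the following sketch resolves the
       configuration file path the same way: an explicit path first, then $NOTMUCH_CONFIG if that variable is
       set, then $HOME/.notmuch-config.  The function name is hypothetical and the code only mirrors the text
       of this page; it is not taken from muchsync itself.

```sh
#!/bin/sh
# Hypothetical sketch of the configuration-file lookup order.

# notmuch_config_path [EXPLICIT_PATH]
notmuch_config_path() {
    if [ -n "${1:-}" ]; then
        # Path given explicitly, as with -C file.
        printf '%s\n' "$1"
    else
        # ${VAR-default} falls back only when the variable is unset.
        printf '%s\n' "${NOTMUCH_CONFIG-$HOME/.notmuch-config}"
    fi
}
```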

EXAMPLES

       To initialize the muchsync database, you can run:

              muchsync -vv

       This  first  executes  "notmuch new", then builds the initial muchsync database from the contents of your
       maildir (the directory specified as database.path in your notmuch configuration file).  This command  may
       take  several minutes the first time it is run, as it must compute a content hash of every message in the
       database.  Note that you do not need to run this command, as muchsync will initialize  the  database  the
       first time a client tries to synchronize anyway.

              muchsync --init ~/maildir myserver

       First  run  "notmuch  new"  on  myserver,  then create a directory ~/maildir containing a replica of your
       mailbox on myserver.  Note that neither  your  configuration  file  (by  default  ~/.notmuch-config)  nor
       ~/maildir should exist before running this command, as both will be created.

       To  create a notmuch-poll script that fetches mail from a remote server myserver, but on that server just
       runs notmuch new, do the following: First, run muchsync --self on the server to get the replica ID.  Then
       take the ID returned (e.g., 1968464194667562615) and embed it in a shell script as follows:

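              #!/bin/sh
              # Replica ID of the local maildir; give up if muchsync fails.
              self=$("$HOME/muchsync" --self) || exit 1
              if [ "$self" = 1968464194667562615 ]; then
                  # Running on the server itself: just index new mail.
                  exec notmuch new
              else
                  # Running on a client: download now, upload in the background.
                  exec "$HOME/muchsync" -r ./muchsync --upbg myserver
              fi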

       The path of such a script is a good candidate for the emacs notmuch-poll-script variable.

       Alternatively,  to  have  the  command  notmuch new  on a client automatically fetch new mail from server
       myserver, you can place the following in the file .notmuch/hooks/post-new under your mail directory:

              #!/bin/sh
              muchsync --nonew --upbg myserver
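
       Installing such a hook might look like the sketch below.  The function name is hypothetical.  The
       generated hook calls muchsync with --nonew, so that muchsync does not run "notmuch new" again from
       inside the hook, and it is made executable, since notmuch only runs executable hooks.  An existing hook
       is never overwritten.

```sh
#!/bin/sh
# Hypothetical helper: write a post-new hook that syncs with a server.

# install_post_new_hook MAILDIR SERVER
install_post_new_hook() {
    hook="$1/.notmuch/hooks/post-new"
    if [ -e "$hook" ]; then
        echo "refusing to overwrite existing $hook" >&2
        return 1
    fi
    mkdir -p "$1/.notmuch/hooks" || return 1
    # Two-line hook script: shebang, then the muchsync invocation.
    printf '#!/bin/sh\nexec muchsync --nonew --upbg %s\n' "$2" > "$hook" || return 1
    chmod +x "$hook"
}

# Usage (requires notmuch to be installed):
#   install_post_new_hook "$(notmuch config get database.path)" myserver
```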

FILES

       The default notmuch configuration file is $HOME/.notmuch-config.

       muchsync keeps all of its state in a subdirectory of your top maildir called .notmuch/muchsync.

SEE ALSO

       notmuch(1).

BUGS

       muchsync expects initially to create replicas from scratch.  If you have created a replica using  another
       tool such as offlineimap and you try to use muchsync to synchronize them, muchsync will assume every file
       has an update conflict.  This is okay if the two replicas are identical; if they are not, it will  result
       in artifacts such as files deleted in only one replica reappearing.  Ideally muchsync needs an option like
       --clobber that makes a local replica identical to the remote one without touching the remote one, so that
       an old version of a mail directory can be used as a disposable cache to bootstrap initialization.

       muchsync  never  deletes directories.  If you want to remove a subdirectory completely, you must manually
       execute rmdir on all replicas.  Even if you manually delete a  subdirectory,  it  will  live  on  in  the
       notmuch database.

       To  synchronize  deletions  and  re-creations  properly,  muchsync never deletes content hashes and their
       message IDs from its database, even after the last copy of a message has disappeared.  Such stale  hashes
       should not consume an inordinate amount of disk space, but could conceivably pose a privacy risk if users
       believe deleting a message removes all traces of it.

       Message tags are synchronized based on notmuch's message-ID (usually the Message-ID header of a message),
       rather than based on message contents.  This is slightly strange because very different messages can have
       the same Message-ID header, meaning the user will likely only read one of many messages bearing the  same
       Message-ID  header.   It  is conceivable that an attacker could suppress a message from a mailing list by
       sending another message with the same Message-ID.  This bug is in the design of notmuch,  and  hence  not
       something that muchsync can work around.  muchsync itself does not assume Message-ID equivalence, relying
       instead on content hashes to synchronize link counts.  Hence, any tools used to work around  the  problem
       should work on all replicas.

       Because  notmuch  and  Xapian  do  not  keep  any  kind  of  modification time on database entries, every
       invocation of muchsync requires a complete scan of all tags in the Xapian database to detect any  changed
       tags.   Fortunately  muchsync  heavily  optimizes the scan so that it should take well under a second for
       100,000 mail messages.  However, this means that interfaces such as those used by  notmuch-dump  are  not
       efficient enough (see the next paragraph).

       muchsync  makes  certain assumptions about the structure of notmuch's private types notmuch_message_t and
       notmuch_directory_t.  In particular, it assumes that the Xapian document ID is the second field of  these
       data structures.  Sadly, there is no efficient and clean way to extract this information from the notmuch
       library interface.  muchsync also makes other assumptions about  how  tokens  are  named  in  the  Xapian
       database.   These  assumptions  are  necessary because the notmuch library interface and the notmuch dump
       utility are too slow to support synchronization every time you check mail.

AUTHORS

       David Mazieres.

                                                                                                     muchsync(1)