Provided by: muchsync_7-1_amd64

NAME

       muchsync - synchronize maildirs and notmuch databases

SYNOPSIS

       muchsync options
       muchsync options server-name server-options
       muchsync options --init maildir server-name server-options

DESCRIPTION

       muchsync  synchronizes  the  contents  of  maildirs and notmuch tags across machines.  Any
       given execution runs pairwise between two replicas, but the system scales to an  arbitrary
       number  of replicas synchronizing in arbitrary pairs.  For efficiency, version vectors and
       logical timestamps are used to limit synchronization to items a  peer  may  not  yet  know
       about.

       To  use  muchsync, both muchsync and notmuch should be installed someplace in your PATH on
       two machines, and you must be able to access the remote machine via ssh.

       In its simplest usage, you have a single notmuch database on some server SERVER  and  wish
       to  start  replicating that database on a client, where the client currently does not have
       any mailboxes.  You can initialize a new replica in $HOME/inbox by running  the  following
       command:

              muchsync --init $HOME/inbox SERVER

       This  command may take some time, as it transfers the entire contents of your maildir from
       the server to the client and creates a new notmuch index on the client.  Depending on your
       setup, you may be either bandwidth limited or CPU limited.  (Sadly, the notmuch library on
       which muchsync is built is non-reentrant and forces all indexing to  happen  on  a  single
       core at a rate of about 10,000 messages per minute.)
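
       As a rough worked example of the indexing figure above, the sketch below estimates how
       long the initial index might take.  The function name estimate_index_minutes is made up
       for this illustration, and the 10,000 messages-per-minute rate is only the approximate
       single-core figure quoted above; real throughput depends on your hardware and mail.

```shell
#!/bin/sh
# Hypothetical helper, not part of muchsync: estimate initial indexing time.
# Assumes the approximate rate of 10,000 messages per minute quoted above.
estimate_index_minutes() {
    messages=$1
    rate=10000
    # Round up so that a partial minute counts as a whole one.
    echo $(( (messages + rate - 1) / rate ))
}
```

       For instance, estimate_index_minutes 250000 prints 25, suggesting roughly 25 minutes of
       indexing for a 250,000-message mailbox when the network is not the bottleneck.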

       From then on, to synchronize the client with the server, just run:

              muchsync SERVER

       Since  muchsync  replicates  the  tags in the notmuch database itself, you should consider
       disabling maildir flag synchronization by executing:

              notmuch config set maildir.synchronize_flags false

       The reason is that  the  synchronize_flags  feature  only  works  on  a  small  subset  of
       pre-defined  flags  and  so  is not all that useful.  Moreover, it marks flags by renaming
       files, which is not particularly efficient.  muchsync was largely motivated  by  the  need
       for better flag synchronization.  If you are satisfied with the synchronize_flags feature,
       you might consider a tool such as offlineimap as an alternative to muchsync.

   Synchronization algorithm
       muchsync separately synchronizes two  classes  of  information:  the  message-to-directory
       mapping  (henceforth  link  counts)  and  the message-id-to-tag mapping (henceforth tags).
       Using logical timestamps, it can detect update conflicts for each type of information.  We
       describe link count and tag synchronization in turn.

       Link  count synchronization consists of ensuring that any given message (identified by its
       collision-resistant  content  hash)  appears  the  same  number  of  times  in  the   same
       subdirectories  on  each  replica.   Generally a message will appear only once in a single
       subdirectory.  However, if the message is moved or  deleted  on  one  replica,  this  will
       propagate to other replicas.

       If  two  replicas  move or copy the same file between synchronization events (or one moves
       the file and the other deletes it), this constitutes an update conflict.  Update conflicts
       are  resolved  by  storing in each subdirectory a number of copies equal to the maximum of
       the number of copies in that subdirectory on the two replicas.  This is  conservative,  in
       the  sense  that  a  file will never be deleted after a conflict, though you may get extra
       copies of files.  (muchsync uses hard links, so at least these copies  will  not  use  too
       much disk space.)

       For  example,  if  one replica moves a message to subdirectory .box1/cur and another moves
       the same message to subdirectory .box2/cur, the conflict will be resolved by  placing  two
       links  to  the message on each replica, one in .box1/cur and one in .box2/cur.  To respect
       the structure of maildirs, subdirectories named new and cur are special-cased; conflicts
       between  sibling  new and cur subdirectories are resolved in favor of cur without creating
       additional copies of messages.
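
       The resolution rule above can be sketched in a few lines of shell.  This only illustrates
       the maximum-of-link-counts rule and is not muchsync's actual implementation; the function
       name resolve_link_count is invented here, and the special handling of sibling new and cur
       subdirectories is left out.

```shell
#!/bin/sh
# Sketch only: muchsync applies this rule internally, per message and per
# subdirectory. Nothing here is taken from its source code.

# resolve_link_count COUNT_A COUNT_B
# Print how many links a subdirectory should hold after an update conflict:
# the larger of the two replicas' counts, so no copy is ever lost.
resolve_link_count() {
    if [ "$1" -ge "$2" ]; then
        echo "$1"
    else
        echo "$2"
    fi
}
```

       In the .box1/cur versus .box2/cur example, one replica reports counts 1 and 0 and the
       other 0 and 1, so resolve_link_count 1 0 and resolve_link_count 0 1 both print 1: each
       replica ends up with one link in each subdirectory.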

       Message tags are synchronized based on notmuch’s message-ID (usually the Message-ID header
       of  a  message), rather than message contents.  On conflict, tags are combined as follows.
       Any tag in the notmuch configuration  parameter  muchsync.and_tags  is  removed  from  the
       message  unless  it appears on both replicas.  Any other tag is added if it appears on any
       replica.  In other words, tags in muchsync.and_tags are logically ANDed, while all other
       tags are logically ORed.  (This approach will give the most predictable results if
       muchsync.and_tags has the same value in all your  replicas.   The  --init  option  ensures
       uniform  configurations  initially,  but  subsequent  changes to muchsync.and_tags must be
       manually propagated.)

       If your configuration file does not specify a value for muchsync.and_tags, the default  is
       to  use  the set of tags specified in the new.tags configuration option.  This should give
       intuitive results unless you use a two-pass tagging system such as the afew tool, in which
       case  new.tags  is  used  to  flag  input  to  the  second  pass  while  you  likely  want
       muchsync.and_tags to reflect the output of the second pass.
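
       The tag-merging rule for conflicts can be illustrated with a small shell sketch.  It is a
       model of the rule as described above, not muchsync's code: the helper names has_tag and
       merge_tags are hypothetical, and tags are assumed to be plain words separated by single
       spaces, with no glob characters.

```shell
#!/bin/sh
# Sketch only: models the AND/OR tag-merge rule on conflicting tag sets.

# has_tag TAG LIST: succeed if TAG occurs in the space-separated LIST.
has_tag() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# merge_tags AND_TAGS TAGS_A TAGS_B
# Print the merged, sorted tag list for a message tagged TAGS_A on one
# replica and TAGS_B on the other.
merge_tags() {
    and_tags=$1
    tags_a=$2
    tags_b=$3
    for tag in $tags_a $tags_b; do
        if has_tag "$tag" "$and_tags"; then
            # ANDed tag: survives only if both replicas have it.
            if has_tag "$tag" "$tags_a" && has_tag "$tag" "$tags_b"; then
                echo "$tag"
            fi
        else
            # Any other tag: survives if either replica has it.
            echo "$tag"
        fi
    done | sort -u | tr '\n' ' ' | sed 's/ $//'
}
```

       For example, merge_tags "unread inbox" "inbox unread work" "inbox todo" prints
       "inbox todo work": unread is dropped because only one replica still has it, while work
       and todo are each kept because they appear on at least one replica.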

   File deletion
       Because publishing software that actually deletes people’s  email  is  a  scary  prospect,
       muchsync  for the moment never actually deletes mail files.  Though this may change in the
       future,  for  the  moment  muchsync  moves  any  deleted   messages   to   the   directory
       .notmuch/muchsync/trash  under  your  mail  directory  (naming  deleted  messages by their
       content hash).  If you really want to delete mail to reclaim disk  space  or  for  privacy
       reasons, you will need to run the following on each replica:

              cd "$(notmuch config get database.path)"
              rm -rf .notmuch/muchsync/trash
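
       Before removing the trash, you may want to see how much is in it.  The following is a
       minimal sketch, assuming the trash location described above; count_trash is a
       hypothetical helper name, not a muchsync command.

```shell
#!/bin/sh
# Hypothetical helper: count the files in muchsync's trash directory.
# Relies on "notmuch config get database.path", as in the commands above.
count_trash() {
    maildir=$(notmuch config get database.path) || return 1
    trash=$maildir/.notmuch/muchsync/trash
    # No trash directory yet means nothing has been deleted so far.
    [ -d "$trash" ] || { echo 0; return 0; }
    find "$trash" -type f | wc -l | tr -d ' '
}
```

       Running count_trash on a replica prints the number of trashed message files; a result of
       0 means there is nothing to reclaim.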

OPTIONS

       -C file, --config file
              Specify  the  path of the notmuch configuration file to use.  If none is specified,
              the default is to use the contents of the environment variable $NOTMUCH_CONFIG,  or
              if  that  variable  is unset, the value $HOME/.notmuch-config.  (These are the same
              defaults as the notmuch command itself.)
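
              The lookup order can be written out as a short shell sketch.  The function name
              resolve_config is hypothetical and only mirrors the order described above; how
              muchsync treats an empty (as opposed to unset) $NOTMUCH_CONFIG is not stated
              here, so the sketch follows the text literally and falls back only when the
              variable is unset.

```shell
#!/bin/sh
# Hypothetical helper: print the configuration file that would be used,
# given an optional explicit path (as passed to -C or --config).
resolve_config() {
    if [ -n "${1-}" ]; then
        # An explicit -C/--config argument takes precedence.
        echo "$1"
    else
        # ${VAR-default} substitutes the default only when VAR is unset.
        echo "${NOTMUCH_CONFIG-$HOME/.notmuch-config}"
    fi
}
```

              With NOTMUCH_CONFIG unset and no argument, resolve_config prints the path
              $HOME/.notmuch-config.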

       -F     Check for modified files.  Without this option, muchsync assumes that  files  in  a
              maildir are never edited.  -F disables certain optimizations so as to make muchsync
              at least check the timestamp on every file, which will detect modified files at the
              cost  of  a longer startup time.  If muchsync dies with the error “message received
              does not match hash,” you likely need to run it with the -F option.

              Note that if your software regularly modifies the contents  of  mail  files  (e.g.,
              because you are running offlineimap with “synclabels = yes”), then you will need to
              use -F each time you run muchsync.  Specify it as a server option (after the server
              name) if the editing happens server-side.

       -r /path/to/muchsync
              Specifies  the  path  to muchsync on the server.  Ordinarily, muchsync should be in
              the default PATH on the server so this  option  is  not  required.   However,  this
              option is useful if you have to install muchsync in a non-standard place or wish to
              test development versions of the code.

       -s ssh-cmd
              Specifies a command line to pass  to  /bin/sh  to  execute  a  command  on  another
              machine.   The  default  value  is  “ssh -CTaxq”.  Note that because this string is
              passed to the shell, special characters including spaces may need to be escaped.
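
              As an illustration of the quoting involved, the sketch below wraps muchsync so
              that ssh connects to a non-default port.  The helper name sync_via_port and the
              port number in the usage note are assumptions made for this example; the ssh
              flags are the defaults listed above plus ssh's standard -p option.

```shell
#!/bin/sh
# Hypothetical wrapper: run muchsync over ssh on a non-default port.
# sync_via_port PORT SERVER
sync_via_port() {
    # The whole ssh command line must reach -s as a single argument,
    # hence the double quotes around it.
    muchsync -s "ssh -CTaxq -p $1" "$2"
}
```

              For example, sync_via_port 2222 myserver runs muchsync with the arguments
              -s, "ssh -CTaxq -p 2222", and myserver.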

       -v     The -v option increases verbosity.  The  more  times  it  is  specified,  the  more
              verbose muchsync will become.

       --help Print a brief summary of muchsync’s command-line options.

       --init maildir
              This option clones an existing mailbox on a remote server into maildir on the local
              machine.  Neither maildir nor your notmuch configuration file (see --config  above)
              should exist when you run this command, as both will be created.  The configuration
              file is copied from the server (adjusted  to  reflect  the  local  maildir),  while
              maildir is created as a replica of the maildir you have on the server.

       --nonew
              Ordinarily,  muchsync begins by running “notmuch new”.  This option says not to run
              “notmuch new” before starting the muchsync operation.  It can be passed as either a
              client or a server option.  For example, the command “muchsync myserver --nonew”
              will run “notmuch new” locally but not on myserver.

       --noup, --noupload
              Transfer files from the server to the client, but not vice versa.

       --upbg Transfer files from the server to the client in the foreground.  Then fork into the
              background  to  upload any new files from the client to the server.  This option is
              useful when checking new mail, if you want to begin reading your mail as soon as it
              has been downloaded while the upload continues.

       --self Print  the  64-bit  replica  ID of the local maildir replica and exit.  Potentially
              useful in higher-level scripts, such as the emacs notmuch-poll-script variable  for
              identifying  on  which replica one is running, particularly if network file systems
              allow a replica to be accessed from multiple machines.

       --newid
              Muchsync requires every replica to have a unique 64-bit identifier.   If  you  ever
              copy  a  notmuch  database  to  another  machine, including the muchsync state, bad
              things will happen if both copies use muchsync, as they will  both  have  the  same
              identifier.  Hence, after making such a copy and before running muchsync to
              synchronize mail, run muchsync --newid to change  the  identifier  of  one  of  the
              copies.

       --version
              Print the muchsync version number.

EXAMPLES

       To initialize the muchsync database, you can run:

              muchsync -vv

       This  first  executes  “notmuch  new”,  then builds the initial muchsync database from the
       contents of your maildir  (the  directory  specified  as  database.path  in  your  notmuch
       configuration  file).   This command may take several minutes the first time it is run, as
       it must compute a content hash of every message in the database.  Note  that  you  do  not
       need to run this command, as muchsync will initialize the database the first time a client
       tries to synchronize anyway.

              muchsync --init ~/maildir myserver

       First run “notmuch new” on myserver,  then  create  a  directory  ~/maildir  containing  a
       replica  of  your  mailbox  on  myserver.   Note  that neither your configuration file (by
       default ~/.notmuch-config) nor ~/maildir should exist before running this command, as both
       will be created.

       To  create  a  notmuch-poll script that fetches mail from a remote server myserver, but on
       that server just runs notmuch new, do the following: First, run  muchsync  --self  on  the
       server  to  get the replica ID.  Then take the ID returned (e.g., 1968464194667562615) and
       embed it in a shell script as follows:

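              #!/bin/sh
              # Replica ID of the maildir this script is running against.
              self=$("$HOME/muchsync" --self) || exit 1
              if [ "$self" = 1968464194667562615 ]; then
                  # On the server itself: just index new mail.
                  exec notmuch new
              else
                  # On a client: download in the foreground, upload in the background.
                  exec "$HOME/muchsync" -r ./muchsync --upbg myserver
              fi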

       The path of such a script is a good candidate for the emacs notmuch-poll-script variable.

       Alternatively, to have the command notmuch new on a client automatically  fetch  new  mail
       from  server  myserver,  you  can  place the following in the file .notmuch/hooks/post-new
       under your mail directory:

              #!/bin/sh
              muchsync --nonew --upbg myserver

FILES

       The default notmuch configuration file is $HOME/.notmuch-config.

       muchsync  keeps  all  of  its  state  in  a  subdirectory  of  your  top  maildir   called
       .notmuch/muchsync.

SEE ALSO

       notmuch(1).

BUGS

       muchsync expects initially to create replicas from scratch.  If you have created a replica
       using another tool such as offlineimap and you try to use muchsync  to  synchronize  them,
       muchsync  will assume every file has an update conflict.  This is okay if the two replicas
       are identical; if they are not, it will result in artifacts such as files deleted in  only
       one replica reappearing.  Ideally muchsync needs an option like --clobber that makes a
       local replica identical to the remote one without touching the remote one, so that an  old
       version of a mail directory can be used as a disposable cache to bootstrap initialization.

       muchsync  never deletes directories.  If you want to remove a subdirectory completely, you
       must manually execute rmdir on all replicas.  Even if you manually delete a  subdirectory,
       it will live on in the notmuch database.

       To  synchronize deletions and re-creations properly, muchsync never deletes content hashes
       and their message IDs from its database, even  after  the  last  copy  of  a  message  has
       disappeared.  Such stale hashes should not consume an inordinate amount of disk space, but
       could conceivably pose a privacy risk if users believe  deleting  a  message  removes  all
       traces of it.

       Message tags are synchronized based on notmuch’s message-ID (usually the Message-ID header
       of a message), rather than based on message contents.  This is  slightly  strange  because
       very  different messages can have the same Message-ID header, meaning the user will likely
       only read one of many messages bearing the same Message-ID header.  It is conceivable that
       an  attacker  could suppress a message from a mailing list by sending another message with
       the same Message-ID.  This bug is in the design of notmuch, and hence not  something  that
       muchsync can work around.  muchsync itself does not assume Message-ID equivalence, relying
       instead on content hashes to synchronize link counts.   Hence,  any  tools  used  to  work
       around the problem should work on all replicas.

       Because  notmuch and Xapian do not keep any kind of modification time on database entries,
       every invocation of muchsync requires a complete scan of all tags in the  Xapian  database
       to  detect  any  changed tags.  Fortunately muchsync heavily optimizes the scan so that it
       should take well under a second for 100,000  mail  messages.   However,  this  means  that
       interfaces  such  as  those  used  by  notmuch-dump are not efficient enough (see the next
       paragraph).

       muchsync makes  certain  assumptions  about  the  structure  of  notmuch’s  private  types
       notmuch_message_t  and  notmuch_directory_t.   In  particular,  it assumes that the Xapian
       document ID is the second field of these data structures.  Sadly, there  is  no  efficient
       and  clean  way  to extract this information from the notmuch library interface.  muchsync
       also makes other assumptions about how tokens are named in  the  Xapian  database.   These
       assumptions  are  necessary  because  the  notmuch  library interface and the notmuch dump
       utility are too slow to support synchronization every time you check mail.

AUTHORS

       David Mazieres.

                                                                                      muchsync(1)