Provided by: public-inbox_1.6.1-2_all

NAME

       public-inbox-tuning - tuning public-inbox

DESCRIPTION

       public-inbox intends to support a wide variety of hardware.  While we strive to provide
       the best out-of-the-box performance possible, tuning knobs are an unfortunate necessity in
       some cases.

       1.  New inboxes: public-inbox-init -V2

       2.  Process spawning

       3.  Performance on rotational hard disk drives

       4.  Btrfs (and possibly other copy-on-write filesystems)

       5.  Performance on solid state drives

       6.  Read-only daemons

   New inboxes: public-inbox-init -V2
       If you're starting a new inbox (and not mirroring an existing one), the "-V2" format
       requires DBD::SQLite, but is orders of magnitude more scalable than the original "-V1"
       format.

   Process spawning
       Our optional use of Inline::C speeds up subprocess spawning from large daemon processes.

       To enable Inline::C, either set the "PERL_INLINE_DIRECTORY" environment variable to point
       to a writable directory, or create "~/.cache/public-inbox/inline-c" for any user(s)
       running public-inbox processes.
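
       As a minimal sketch of the two options above for a POSIX shell: the first command creates
       the per-user cache directory, the rest point "PERL_INLINE_DIRECTORY" at another writable
       directory.  The "$HOME/.local/share/inline-c" path is only an illustrative assumption; any
       writable directory should do.

```shell
# Option 1: the per-user cache directory named above
mkdir -p "$HOME/.cache/public-inbox/inline-c"

# Option 2: any other writable directory (this path is only an example)
PERL_INLINE_DIRECTORY="$HOME/.local/share/inline-c"
mkdir -p "$PERL_INLINE_DIRECTORY"
export PERL_INLINE_DIRECTORY
```

       With the second option, the variable must be present in the environment of every
       public-inbox process, including daemons started by an init system.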

       More (optional) Inline::C use will be introduced in the future to lower memory use and
       improve scalability.

   Performance on rotational hard disk drives
       Random I/O performance is poor on rotational HDDs.  Xapian indexing performance degrades
       significantly as DBs grow larger than available RAM.  Attempts to parallelize random I/O
       on HDDs lead to pathological slowdowns as inboxes grow.

       While "-V2" introduced Xapian shards as a parallelization mechanism for SSDs, enabling
       "publicInbox.indexSequentialShard" repurposes sharding as a mechanism to reduce the kernel
       page cache footprint when indexing on HDDs.

       Initializing a mirror with a high "--jobs" count to create more shards (in "-V2" inboxes)
       will keep each shard smaller and reduce its kernel page cache footprint.  Keep in mind
       excessive sharding imposes a performance penalty for read-only queries.

       Users with large amounts of RAM are advised to set a large value for
       "publicinbox.indexBatchSize" as documented in public-inbox-index(1).

       "dm-crypt" users on Linux 4.0+ are advised to try the "--perf-same_cpu_crypt" and
       "--perf-submit_from_crypt_cpus" switches of cryptsetup(8) to reduce I/O contention from
       kernel workqueue threads.
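
       The "publicInbox.indexSequentialShard" and "publicinbox.indexBatchSize" settings named in
       this section can be written with git-config(1).  The following is a sketch, assuming the
       config file lives at "~/.public-inbox/config" unless "PI_CONFIG" points elsewhere, and
       assuming the batch size accepts a size suffix as described in public-inbox-index(1).  The
       "500m" value is a placeholder, not a recommendation.

```shell
# PI_CONFIG, if set, overrides the default config file location
cfg="${PI_CONFIG:-$HOME/.public-inbox/config}"
mkdir -p "$(dirname "$cfg")"

# Index shards one at a time to limit page cache use on HDDs
git config -f "$cfg" publicinbox.indexSequentialShard true

# Example value only: size this to the RAM you can spare
git config -f "$cfg" publicinbox.indexBatchSize 500m
```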

   Btrfs (and possibly other copy-on-write filesystems)
       btrfs(5) performance degrades from fragmentation when using large databases and random
       writes.  The Xapian + SQLite indices used by public-inbox are no exception to that.

       public-inbox 1.6.0+ disables copy-on-write (CoW) on Xapian and SQLite indices on btrfs to
       achieve acceptable performance (even on SSD).  Disabling copy-on-write also disables
       checksumming, thus "raid1" (or higher) configurations may be corrupt after unsafe
       shutdowns.

       Fortunately, these SQLite and Xapian indices are designed to be recoverable from git if
       missing.

       Disabling CoW does not prevent all fragmentation.  Large values of
       "publicInbox.indexBatchSize" also limit fragmentation during the initial index.

       Avoid snapshotting subvolumes containing Xapian and/or SQLite indices.  Snapshots use CoW
       despite our efforts to disable it, resulting in fragmentation.

       filefrag(8) can be used to monitor fragmentation, and "btrfs filesystem defragment -fr
       $INBOX_DIR" may be necessary.
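
       One way to check fragmentation is to run filefrag(8) on each index file.  The sketch below
       assumes the indices can be found by the "*.sqlite3" and Xapian "*.glass" file name
       patterns; "list_index_files" is a hypothetical helper, not part of public-inbox.

```shell
# List SQLite and Xapian files under an inbox directory.
# Assumption: indices use "*.sqlite3" and Xapian's "*.glass" file names.
list_index_files() {
    find "$1" -type f \( -name '*.sqlite3' -o -name '*.glass' \)
}

# Report per-file extent counts; skipped if INBOX_DIR is unset
# or filefrag(8) is not installed.
if [ -n "${INBOX_DIR:-}" ] && command -v filefrag >/dev/null 2>&1; then
    list_index_files "$INBOX_DIR" | while IFS= read -r f; do
        filefrag "$f"
    done
fi
```

       Files reporting a large and growing number of extents are candidates for the defragment
       command above.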

       Large filesystems benefit significantly from the "space_cache=v2" mount option documented
       in btrfs(5).

       Older, non-CoW filesystems generally work well out-of-the-box for our Xapian and SQLite
       indices.

   Performance on solid state drives
       While SSD read performance is generally good, SSD write performance degrades as the drive
       ages and/or gets full.  Issuing "TRIM" commands via fstrim(8) or similar is required to
       sustain write performance.
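
       A sketch of a manual trim of the filesystem holding an inbox follows.  Both function names
       are hypothetical helpers, and the mount point lookup assumes mount points without
       whitespace.  fstrim(8) itself requires root.

```shell
# Print the mount point of the filesystem containing a path.
# df -P keeps each entry on one line; the mount point is the last field.
mount_point_of() {
    df -P "$1" | awk 'NR == 2 { print $NF }'
}

# Trim that filesystem and report how much was discarded (run as root).
trim_fs_of() {
    fstrim -v "$(mount_point_of "$1")"
}
```

       For example, "trim_fs_of "$INBOX_DIR"" run as root trims the filesystem containing that
       inbox.  Running fstrim(8) periodically from a scheduler is a common alternative to doing
       this by hand.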

       Users of the Flash-Friendly File System F2FS <https://en.wikipedia.org/wiki/F2FS> may
       benefit from optimizations found in SQLite 3.21.0+.  Benchmarks are greatly appreciated.

   Read-only daemons
       public-inbox-httpd(1), public-inbox-imapd(1), and public-inbox-nntpd(1) are all designed
       for C10K (or higher) levels of concurrency from a single process.  SMP systems may use
       "--worker-processes=NUM" as documented in public-inbox-daemon(8) for parallelism.

       The open file descriptor limit ("RLIMIT_NOFILE", "ulimit -n" in sh(1), "LimitNOFILE=" in
       systemd.exec(5)) may need to be raised to accommodate many concurrent clients.
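
       As a sketch for daemons started from a shell, the soft limit can be inspected and raised
       as far as the hard limit without extra privileges; raising the hard limit itself requires
       root.

```shell
# Show the current soft and hard limits on open file descriptors
ulimit -Sn
ulimit -Hn

# Raise the soft limit to the hard limit for this shell and its children
ulimit -Sn "$(ulimit -Hn)"
```

       Daemons managed by systemd take the limit from a "LimitNOFILE=" line in the unit file
       instead, since they do not inherit it from a login shell.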

       Transport Layer Security (IMAPS, NNTPS, or via STARTTLS) significantly increases memory
       use of client sockets; be sure to account for that in capacity planning.

CONTACT

       Feedback encouraged via plain-text mail to <mailto:meta@public-inbox.org>

       Information for *BSDs and non-traditional filesystems especially welcome.

       Our archives are hosted at <https://public-inbox.org/meta/>,
       <http://hjrcffqmbrq6wope.onion/meta/>, and other places.

COPYRIGHT

       Copyright 2020 all contributors <mailto:meta@public-inbox.org>

       License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>