Provided by: recollcmd_1.26.3-1build1_amd64

NAME

       recoll.conf - main personal configuration file for Recoll

DESCRIPTION

       This file defines the index configuration for the Recoll full-text search system.

       The      system-wide     configuration     file     is     normally     located     inside
       /usr/[local]/share/recoll/examples. Any parameter set in the common file may be overridden
       by setting it in the personal configuration file, by default: $HOME/.recoll/recoll.conf

       Please note that while I try to keep this manual page reasonably up to date, it will
       frequently lag the current state of the software. The best source of information about
       the configuration is the comments in the system-wide configuration file or the user
       manual, which you can access from the recoll GUI help menu or on the recoll web site.

       A short extract of the file might look as follows:

              # Space-separated list of directories to index.
              topdirs =  ~/docs /usr/share/doc

              [~/somedirectory-with-utf8-txt-files]
              defaultcharset = utf-8

       There are three kinds of lines:

              •      Comment or empty

              •      Parameter assignment

              •      Section definition

       Empty lines or lines beginning with # are ignored.

       Assignment lines are in the form 'name = value'.

       Section lines allow redefining a parameter for a directory subtree. Some of the parameters
       used for indexing are looked up hierarchically from the more to the less specific. Not all
       parameters can be meaningfully redefined; this is specified for each in the next section.

       The tilde character (~) is expanded  in  file  names  to  the  name  of  the  user's  home
       directory.

       Where  values  are  lists,  white space is used for separation, and elements with embedded
       spaces can be quoted with double-quotes.
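
       As a sketch of these rules, the following fragment combines a comment, a list value with
       a quoted element, and a section. The directory names are hypothetical; only the syntax is
       taken from the description above.

              # The second element contains a space, so it is quoted.
              topdirs = ~/docs "~/My Documents"

              # Parameters following a section line apply to that subtree only.
              [~/docs/legacy]
              defaultcharset = iso-8859-1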

OPTIONS

       topdirs = string
              Space-separated list of files or directories to recursively index. Defaults to ~
              (indexes $HOME). Symbolic links in the list will be followed, independently of the
              value of the followLinks variable.

       monitordirs = string
              Space-separated list of files or directories to monitor for updates.  When  running
              the  real-time  indexer,  this allows monitoring only a subset of the whole indexed
              area. The elements must be included in the tree defined by the 'topdirs' members.
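
              A minimal sketch, assuming a hypothetical ~/docs directory which should be the only
              part of the indexed area watched by the real-time indexer:

                     topdirs = ~
                     monitordirs = ~/docs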

       skippedNames = string
              Files and directories which should be  ignored.   White  space  separated  list  of
              wildcard  patterns  (simple  ones,  not  paths,  must contain no / ), which will be
              tested against file and directory names.  The list  in  the  default  configuration
              does  not exclude hidden directories (names beginning with a dot), which means that
              it may index quite a few things that you do not want. On the other hand, email user
              agents  like  Thunderbird  usually  store  messages  in hidden directories, and you
              probably  want  this  indexed.  One  possible  solution  is   to   have   ".*"   in
              "skippedNames",  and  add things like "~/.thunderbird" "~/.evolution" to "topdirs".
              Not  even  the  file  names  are  indexed  for  patterns  in  this  list,  see  the
              "noContentSuffixes"  variable  for  an  alternative approach which indexes the file
              names. Can be redefined for any subtree.

       skippedNames- = string
              List of patterns to remove from the default skippedNames list.

       skippedNames+ = string
              List of patterns to add to the default skippedNames list.
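
              For example, the approach suggested under skippedNames (skip hidden entries, but
              still index mail stores) might be written as follows. This is only a sketch: it
              assumes that the listed mail directories exist, and it uses skippedNames+ so that
              the default patterns are kept.

                     skippedNames+ = .*
                     topdirs = ~ ~/.thunderbird ~/.evolution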

       noContentSuffixes = string
              List of name endings (not necessarily dot-separated suffixes) for  which  we  don't
              try MIME type identification, and don't uncompress or index content. Only the names
              will be indexed. This complements the now obsoleted recoll_noindex  list  from  the
              mimemap  file,  which  will  go  away in a future release (the move from mimemap to
              recoll.conf allows editing the list  through  the  GUI).  This  is  different  from
              skippedNames  because  these  are name ending matches only (not wildcard patterns),
              and the file  name  itself  gets  indexed  normally.  This  can  be  redefined  for
              subdirectories.

       noContentSuffixes- = string
              List of name endings to remove from the default noContentSuffixes list.

       noContentSuffixes+ = string
              List of name endings to add to the default noContentSuffixes list.
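
              For instance, to index only the names of files with some additional endings (the
              endings shown are purely illustrative):

                     noContentSuffixes+ = .iso .bak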

       skippedPaths = string
              Absolute  paths we should not go into. Space-separated list of wildcard expressions
              for  absolute  filesystem  paths.  Must  be  defined  at  the  top  level  of   the
              configuration  file,  not  in  a subsection. Can contain files and directories. The
              database and configuration directories will automatically be added. The expressions
              are  matched  using  'fnmatch(3)'  with  the FNM_PATHNAME flag set by default. This
              means  that  '/'   characters   must   be   matched   explicitly.   You   can   set
              'skippedPathsFnmPathname'  to  0  to  disable the use of FNM_PATHNAME (meaning that
              '/*/dir3' will match '/dir1/dir2/dir3'). The default value contains the usual mount
              point  for  removable media to remind you that it is a bad idea to have Recoll work
              on these (esp. with the monitor: media gets indexed on mount, all data gets  erased
              on unmount). Explicitly adding '/media/xxx' to the 'topdirs' variable will override
              this.

       skippedPathsFnmPathname = bool
              Set to 0 to override use of FNM_PATHNAME for matching skipped paths.
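
              A sketch of how the two settings interact. The paths are hypothetical, and /media
              is listed on the assumption that it should stay excluded once the list is set
              explicitly.

                     skippedPaths = /media /home/*/tmp
                     # Default matching (FNM_PATHNAME set): '*' does not match '/', so the
                     # pattern above matches /home/joe/tmp but not /home/joe/src/tmp.
                     # Uncomment the next line to let '*' match '/' as well:
                     #skippedPathsFnmPathname = 0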

       nowalkfn = string
              File name which will cause its  parent  directory  to  be  skipped.  Any  directory
              containing  a  file  with  this  name  will  be  skipped  as  if it was part of the
              skippedPaths list. Ex: .recoll-noindex

       daemSkippedPaths = string
              skippedPaths equivalent specific to real time indexing. This enables  having  parts
              of  the  tree which are initially indexed but not monitored. If daemSkippedPaths is
              not set, the daemon uses skippedPaths.

       zipUseSkippedNames = bool
              Use skippedNames inside Zip archives. Fetched directly by the rclzip handler.  Skip
              the  patterns  defined  by  skippedNames  inside Zip archives. Can be redefined for
              subdirectories. See
              https://www.lesbonscomptes.com/recoll/faqsandhowtos/FilteringOutZipArchiveMembers.html

       zipSkippedNames = string
              Space-separated list of wildcard expressions  for  names  that  should  be  ignored
              inside   zip   archives.   This   is   used   directly   by  the  zip  handler.  If
              zipUseSkippedNames is not set, zipSkippedNames defines the patterns to  be  skipped
              inside  archives.  If zipUseSkippedNames is set, the two lists are concatenated and
              used. Can be redefined for subdirectories. See
              https://www.lesbonscomptes.com/recoll/faqsandhowtos/FilteringOutZipArchiveMembers.html
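
              A sketch, with a hypothetical pattern list, of skipping some archive members in
              addition to the patterns from skippedNames:

                     zipUseSkippedNames = 1
                     zipSkippedNames = *.class *.o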

       followLinks = bool
              Follow symbolic links during indexing. The default is to ignore symbolic  links  to
              avoid  multiple  indexing  of  linked files. No effort is made to avoid duplication
              when this option is set to true. This option can be set individually  for  each  of
              the  'topdirs' members by using sections. It can not be changed below the 'topdirs'
              level. Links in the 'topdirs' list itself are always followed.
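
              A sketch of enabling the option for a single 'topdirs' member by using a section
              (the directory names are hypothetical):

                     topdirs = ~/docs ~/shared

                     [~/shared]
                     followLinks = 1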

       indexedmimetypes = string
              Restrictive list of indexed mime  types.  Normally  not  set  (in  which  case  all
              supported  types are indexed). If it is set, only the types from the list will have
              their contents indexed. The names will be indexed anyway  if  indexallfilenames  is
              set  (default).  MIME  type names should be taken from the mimemap file (the values
              may be different from xdg-mime or file -i output in some cases). Can  be  redefined
              for subtrees.

       excludedmimetypes = string
              List of excluded MIME types. Lets you exclude some types from indexing. MIME type
              names should be taken from the mimemap file (the values may be different from
              xdg-mime or file -i output in some cases). Can be redefined for subtrees.
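
              For example, to exclude one type below a hypothetical directory (audio/mpeg is used
              as a sample type name; check the mimemap file for the exact names):

                     [~/music]
                     excludedmimetypes = audio/mpeg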

       nomd5types = string
              Don't  compute  md5  for these types. md5 checksums are used only for deduplicating
              results, and can be very expensive to compute on multimedia  or  other  big  files.
              This  list  lets  you turn off md5 computation for selected types. It is global (no
              redefinition for subtrees). At the moment, it  only  has  an  effect  for  external
              handlers  (exec  and execm). The file types can be specified by listing either MIME
              types (e.g. audio/mpeg) or handler names (e.g. rclaudio).

       compressedfilemaxkbs = int
              Size limit for compressed files.  We  need  to  decompress  these  in  a  temporary
              directory for identification, which can be wasteful in some cases. Limit the waste.
              Negative means no limit. 0 results in no processing of any compressed file. Default
              50 MB.

       textfilemaxmbs = int
              Size limit for text files. Mostly for skipping monster logs. Default 20 MB.

       indexallfilenames = bool
              Index the file names of unprocessed files. Index the names of files whose contents
              we do not index because of an excluded or unsupported MIME type.

       usesystemfilecommand = bool
              Use a system command for file MIME type guessing as a final step in file type
              identification. This is generally useful, but will usually cause the indexing of
              many bogus 'text' files. See 'systemfilecommand' for the command used.

       systemfilecommand = string
              Command used to guess MIME types if the internal methods fail. This should be a
              "file -i" workalike. The file path will be added as a last parameter to the
              command line. "xdg-mime" works better than the traditional "file" command, and is
              now the configured default (with a hard-coded fallback to "file").

       processwebqueue = bool
              Decide  if  we process the Web queue. The queue is a directory where the Recoll Web
              browser plugins create the copies of visited pages.

       textfilepagekbs = int
              Page size for text files. If this is set, text/plain files  will  be  divided  into
              documents  of  approximately  this size. Will reduce memory usage at index time and
              help with loading data in the preview window at  query  time.  Particularly  useful
              with  very  big  files, such as application or system logs. Also see textfilemaxmbs
              and compressedfilemaxkbs.
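
              The size limits could be combined as in the sketch below. The values are purely
              illustrative; note that the units differ between parameters (kilobytes or
              megabytes, as indicated by each name).

                     # Do not process compressed files larger than about 20 MB.
                     compressedfilemaxkbs = 20000
                     # Skip text files larger than 50 MB.
                     textfilemaxmbs = 50
                     # Split text/plain files into documents of about 1000 KB.
                     textfilepagekbs = 1000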

       membermaxkbs = int
              Size limit for archive members. This is passed to the filters in the environment as
              RECOLL_FILTER_MAXMEMBERKB.

       indexStripChars = bool
              Decide  if  we store character case and diacritics in the index. If we do, searches
              sensitive to case and diacritics can be performed, but the index  will  be  bigger,
              and  some  marginal weirdness may sometimes occur. The default is a stripped index.
              When  using  multiple  indexes  for  a  search,  this  parameter  must  be  defined
              identically for all. Changing the value implies an index reset.

       indexStoreDocText = bool
              Decide  if  we  store  the  documents'  text content in the index. Storing the text
              allows extracting snippets from it at query time, instead  of  building  them  from
              index position data.  Newer Xapian index formats have rendered our use of positions
              list unacceptably slow in some cases.  The  last  Xapian  index  format  with  good
              performance  for the old method is Chert, which is default for 1.2, still supported
              but not default in 1.4 and will be dropped in 1.6.  The  stored  document  text  is
              translated from its original format to UTF-8 plain text, but not stripped of upper-
              case, diacritics, or punctuation signs. Storing it  increases  the  index  size  by
              10-20%  typically,  but also allows for nicer snippets, so it may be worth enabling
              it even if not strictly needed for performance if you can afford  the  space.   The
              variable  only  has  an  effect  when  creating an index, meaning that the xapiandb
              directory must not exist yet. Its exact effect depends on the Xapian version.   For
              Xapian  1.4,  if  the  variable is set to 0, the Chert format will be used, and the
              text will not be stored. If the variable is 1, Glass will be  used,  and  the  text
              stored. For Xapian 1.2, and for versions 1.5 and newer, the index format is
              always the default, but the variable controls if the text is stored or not, and the
              abstract  generation  method. With Xapian 1.5 and later, and the variable set to 0,
              abstract generation may be very slow, but this setting may still be useful to  save
              space if you do not use abstract generation at all.
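
              A sketch of settings for an index which supports case- and diacritics-sensitive
              searches and stores the document text. Both values need to be in place when the
              index is created (or reset), as explained above.

                     indexStripChars = 0
                     indexStoreDocText = 1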

       nonumbers = bool
              Decides if terms will be generated for numbers. For example "123", "1.5e6" or
              "192.168.1.4" would not be indexed if nonumbers is set ("value123" would still be).
              Numbers are often quite interesting to search for, and this should probably not be
              set except for special situations, e.g. scientific documents with huge amounts of
              numbers in them, where setting nonumbers will reduce the index size. This can only
              be set for a whole index, not for a subtree.

       dehyphenate = bool
              Determines if we also index 'coworker' when the input is 'co-worker'. This is new
              in version 1.22, and on by default. Setting the variable to off allows restoring
              the previous behaviour.

       backslashasletter = bool
              Process backslash as a normal letter. This may make sense for people wanting to
              index TeX commands as such but is not of much general use.

       maxtermlength = int
              Maximum term length. Words longer than this will be discarded.  The default  is  40
              and  used  to be hard-coded, but it can now be adjusted. You need an index reset if
              you change the value.

       nocjk = bool
              Decides if specific East Asian (Chinese Korean Japanese) characters/word  splitting
              is  turned  off. This will save a small amount of CPU if you have no CJK documents.
              If your document base does  include  such  text  but  you  are  not  interested  in
              searching it, setting nocjk may be a significant time and space saver.

       cjkngramlen = int
              This  lets  you  adjust the size of n-grams used for indexing CJK text. The default
              value of 2 is probably appropriate in most cases. A value of  3  would  allow  more
              precision and efficiency on longer words, but the index will be approximately twice
              as large.

       indexstemminglanguages = string
              Languages for which to create stemming expansion data. Stemmer names can  be  found
              by executing 'recollindex -l', or this can also be set from a list in the GUI.

       defaultcharset = string
              Default  character set. This is used for files which do not contain a character set
              definition (e.g.: text/plain). Values found inside files, e.g. a 'charset'  tag  in
              HTML  documents, will override it. If this is not set, the default character set is
              the one defined by the NLS environment ($LC_ALL, $LC_CTYPE, $LANG),  or  ultimately
              iso-8859-1  (cp-1252 in fact).  If for some reason you want a general default which
              does not match your LANG and  is  not  8859-1,  use  this  variable.  This  can  be
              redefined for any sub-directory.

       unac_except_trans = string
              A list of characters, encoded in UTF-8, which should be handled specially when
              converting text to unaccented lowercase. For example, in Swedish, the letter a with
              diaeresis has full alphabet citizenship and should not be turned into an a. Each
              element in the space-separated list has the special character as first element and
              the translation following. The handling of both the lowercase and upper-case
              versions of a character should be specified, as membership in the list will turn
              off both standard accent and case processing. The value is global and affects both
              indexing and querying. Examples:

              Swedish:

                     unac_except_trans = ää Ää öö Öö üü Üü ßss œoe Œoe æae Æae ﬀff ﬁfi ﬂfl åå Åå

              German:

                     unac_except_trans = ää Ää öö Öö üü Üü ßss œoe Œoe æae Æae ﬀff ﬁfi ﬂfl

              In French, you probably want to decompose oe and ae, and nobody would type a German
              ß:

                     unac_except_trans = ßss œoe Œoe æae Æae ﬀff ﬁfi ﬂfl

              The default for all, until someone protests, follows. These decompositions are not
              performed by unac, but it is unlikely that someone would type the composed forms in
              a search.

                     unac_except_trans = ßss œoe Œoe æae Æae ﬀff ﬁfi ﬂfl

       maildefcharset = string
              Overrides  the  default  character  set for email messages which don't specify one.
              This is mainly useful for readpst (libpst) dumps, which are utf-8 but  do  not  say
              so.

       localfields = string
              Set fields on all files (usually of a specific fs area). The syntax is the usual
              one: name = value ; attr1 = val1 ; [...]. Here the value is empty, so an initial
              semi-colon is needed.
              This  is  useful,  e.g.,  for  setting  the rclaptg field for application selection
              inside mimeview.
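
              A sketch following the syntax described above, which sets rclaptg to a hypothetical
              value (myapp) for a hypothetical subtree:

                     [~/mail/archive]
                     localfields = ; rclaptg = myapp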

       testmodifusemtime = bool
              Use mtime instead of ctime to test if a file has been modified. The time is used in
              addition to the size, which is always used. Setting this can reduce re-indexing on
              systems where extended attributes are used (by some other application), but not
              indexed, because changing extended attributes only affects ctime. Notes:

              •      This may prevent detection of change in some marginal file rename cases (the
                     target would need to have the same size and mtime).

              •      You should probably also set noxattrfields to 1 in this case, except if you
                     still prefer to perform xattr indexing, for example if the local file update
                     pattern makes it of value (as in general, there is a risk for pure extended
                     attributes updates without file modification to go undetected).

              Perform a full index reset after changing this.

       noxattrfields = bool
              Disable  extended  attributes conversion to metadata fields. This probably needs to
              be set if testmodifusemtime is set.

       metadatacmds = string
              Define commands to gather external metadata, e.g. tmsu tags.  There can be  several
              entries,  separated  by  semi-colons,  each defining which field name the data goes
              into and the command to use. Don't forget the initial  semi-colon.  All  the  field
              names  must be different. You can use aliases in the "field" file if necessary.  As
              a not too pretty hack conceded  to  convenience,  any  field  name  beginning  with
              "rclmulti"  will  be taken as an indication that the command returns multiple field
              values inside a text blob formatted as a recoll configuration  file  ("fieldname  =
              fieldvalue" lines). The rclmultixx name will be ignored, and field names and values
              will be parsed from the data. Example:

                     metadatacmds = ; tags = tmsu tags %f; rclmulti1 = cmdOutputsConf %f

       cachedir = dfn
              Top  directory  for  Recoll  data.  Recoll  data  directories  are normally located
              relative    to    the    configuration    directory    (e.g.    ~/.recoll/xapiandb,
              ~/.recoll/mboxcache).  If  'cachedir'  is set, the directories are stored under the
              specified value instead (e.g. if cachedir is  ~/.cache/recoll,  the  default  dbdir
              would be ~/.cache/recoll/xapiandb).  This affects dbdir, webcachedir, mboxcachedir,
              aspellDicDir, which can still be individually specified to override cachedir.  Note
              that  if  you  have  multiple  configurations, each must have a different cachedir,
              there is no automatic computation of a subpath under cachedir.
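
              A sketch using the example value from the description. The comments show where the
              data directories would then be located by default.

                     cachedir = ~/.cache/recoll
                     # dbdir        -> ~/.cache/recoll/xapiandb
                     # mboxcachedir -> ~/.cache/recoll/mboxcache
                     # webcachedir  -> ~/.cache/recoll/webcache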

       maxfsoccuppc = int
              Maximum file system occupation  over  which  we  stop  indexing.  The  value  is  a
              percentage,  corresponding  to  what  the  "Capacity"  df  output column shows. The
              default value is 0, meaning no checking.

       dbdir = dfn
              Xapian database directory location. This will be created on first indexing. If  the
              value  is  not  an absolute path, it will be interpreted as relative to cachedir if
              set, or the configuration directory (-c argument or $RECOLL_CONFDIR).   If  nothing
              is specified, the default is then ~/.recoll/xapiandb/

       idxstatusfile = fn
              Name  of  the  scratch  file where the indexer process updates its status. Default:
              idxstatus.txt inside the configuration directory.

       mboxcachedir = dfn
              Directory location for storing mbox message offsets cache files. This  is  normally
              'mboxcache'  under  cachedir if set, or else under the configuration directory, but
              it may be useful to share a directory between different configurations.

       mboxcacheminmbs = int
              Minimum mbox file size over which we cache the offsets. There is really no sense in
              caching offsets for small files. The default is 5 MB.

       webcachedir = dfn
              Directory where we store the archived web pages. This is only used by the web
              history indexing code. Default: cachedir/webcache if cachedir is set, else
              $RECOLL_CONFDIR/webcache.

       webcachemaxmbs = int
              Maximum  size  in  MB  of  the  Web  archive.  This is only used by the web history
              indexing code.  Default: 40 MB.  Reducing the size will not physically truncate the
              file.

       webqueuedir = fn
              The path to the Web indexing queue. This used to be hard-coded in the old plugin as
              ~/.recollweb/ToIndex so there would be no need or possibility to change it, but the
              WebExtensions plugin now downloads the files to the user Downloads directory, and a
              script moves them to webqueuedir. The script reads this value from the config so it
              has become possible to change it.

       webdownloadsdir = fn
              The  path  to  browser  downloads  directory.  This is where the new browser add-on
              extension has to create the files. They are then moved by a script to webqueuedir.

       aspellDicDir = dfn
              Aspell   dictionary   storage   directory   location.   The    aspell    dictionary
              (aspdict.(lang).rws)  is  normally stored in the directory specified by cachedir if
              set, or under the configuration directory.

       filtersdir = dfn
              Directory location for executable input handlers. If RECOLL_FILTERSDIR  is  set  in
              the  environment,  we use it instead. Defaults to $prefix/share/recoll/filters. Can
              be redefined for subdirectories.

       iconsdir = dfn
              Directory location for icons. The only reason to change this would be if  you  want
              to    change    the   icons   displayed   in   the   result   list.   Defaults   to
              $prefix/share/recoll/images

       idxflushmb = int
              Threshold (megabytes of new data) where we flush from memory to disk index. Setting
              this  allows  some  control  over memory usage by the indexer process. A value of 0
              means no explicit flushing, which  lets  Xapian  perform  its  own  thing,  meaning
              flushing  every  $XAPIAN_FLUSH_THRESHOLD documents created, modified or deleted: as
              memory usage depends on average document size, not only document count, the  Xapian
              approach is not very useful, and you should let Recoll manage the flushes. The
              program compiled value is 0. The configured default value (from this file)  is  now
              50  MB,  and  should  be ok in many cases.  You can set it as low as 10 to conserve
              memory, but if you are looking for maximum speed, you may want to  experiment  with
              values  between  20  and  200.  In  my  experience,  values  beyond this are always
              counterproductive. If you find otherwise, please drop me a note.
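
              For example, to trade memory for speed with a value inside the range suggested
              above (the exact number is illustrative and should be tuned by experiment):

                     idxflushmb = 100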

       filtermaxseconds = int
              Maximum external filter execution time in seconds. Default 1200 (20 minutes). Set to
              0 for no limit. This is mainly to avoid infinite loops in postscript files (loop.ps).

       filtermaxmbytes = int
              Maximum  virtual  memory  space  for  filter  processes  (setrlimit(RLIMIT_AS)), in
              megabytes. Note that this includes any mapped libs (there is no reliable Linux  way
              to  limit the data space only), so we need to be a bit generous here. Anything over
              2000 will be ignored on 32-bit machines.

       thrQSizes = string
              Stage input queues configuration. There are three internal queues in  the  indexing
              pipeline  stages  (file  data  extraction,  terms  generation,  index update). This
              parameter defines the queue depths for each stage  (three  integer  values).  If  a
              value of -1 is given for a given stage, no queue is used, and the thread will go on
              performing the next stage. In practise, deep queues have not been shown to increase
              performance.  Default:  a  value  of  0 for the first queue tells Recoll to perform
              autoconfiguration based on the detected number of CPUs (no need for the  two  other
              values in this case).  Use thrQSizes = -1 -1 -1 to disable multithreading entirely.

       thrTCounts = string
              Number of threads used for each indexing stage. The three stages are: file data
              extraction, terms generation, index update. The use of the counts is also
              controlled  by some special values in thrQSizes: if the first queue depth is 0, all
              counts are ignored (autoconfigured); if a value of -1 is used for  a  queue  depth,
              the  corresponding  thread count is ignored. It makes no sense to use a value other
              than 1 for the last stage because updating the Xapian index is necessarily  single-
              threaded (and protected by a mutex).
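
              Two alternative sketches of an explicit thread configuration. The numbers are
              illustrative; only the last thread count (1) follows from the description above.

                     # Short queues, with 4 extraction threads, 2 term generation threads
                     # and 1 index update thread.
                     thrQSizes = 2 2 2
                     thrTCounts = 4 2 1

                     # Alternatively, disable multithreading entirely:
                     #thrQSizes = -1 -1 -1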

       loglevel = int
              Log  file  verbosity  1-6. A value of 2 will print only errors and warnings. 3 will
              print information like document updates, 4 is quite verbose and 6 very verbose.

       logfilename = fn
              Log file destination. Use 'stderr' (default) to write to the console.

       idxloglevel = int
              Override loglevel for the indexer.

       idxlogfilename = fn
              Override logfilename for the indexer.

       daemloglevel = int
              Override loglevel for the indexer in real time mode. The  default  is  to  use  the
              idx... values if set, else the log... values.

       daemlogfilename = fn
              Override  logfilename  for the indexer in real time mode. The default is to use the
              idx... values if set, else the log... values.
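
              A sketch of a logging setup where the general default is to print only errors and
              warnings to the console, while the indexer writes more detailed messages to a file
              (the file path is hypothetical):

                     loglevel = 2
                     logfilename = stderr
                     idxloglevel = 3
                     idxlogfilename = /tmp/recollindex.log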

       orgidxconfdir = dfn
              Original location of the configuration directory.  This  is  used  exclusively  for
              movable  datasets.  Locating  the configuration directory inside the directory tree
              makes it possible to provide automatic query time path translations once  the  data
              set has moved (for example, because it has been mounted on another location).

       curidxconfdir = dfn
              Current  location  of  the  configuration  directory.  Complement orgidxconfdir for
              movable datasets. This should be used  if  the  configuration  directory  has  been
              copied from the dataset to another location, either because the dataset is readonly
              and an r/w copy is desired, or for performance reasons. This records  the  original
              moved location before copy, to allow path translation computations.  For example if
              a dataset originally  indexed  as  '/home/me/mydata/config'  has  been  mounted  to
              '/media/me/mydata',   and   the   GUI  is  running  from  a  copied  configuration,
              orgidxconfdir would be '/home/me/mydata/config', and curidxconfdir (as set  in  the
              copied configuration) would be

       idxrundir = dfn
              Indexing  process  current  directory. The input handlers sometimes leave temporary
              files in the current directory, so it makes sense to have recollindex chdir to some
              temporary  directory.  If the value is empty, the current directory is not changed.
              If the value is (literal) tmp, we  use  the  temporary  directory  as  set  by  the
              environment (RECOLL_TMPDIR else TMPDIR else /tmp). If the value is an absolute path
              to a directory, we go there.

       checkneedretryindexscript = fn
              Script used to heuristically check  if  we  need  to  retry  indexing  files  which
              previously  failed.   The  default script checks the modified dates on /usr/bin and
              /usr/local/bin. A relative path will be looked up in the filters dirs, then in  the
              path. Use an absolute path to do otherwise.

       recollhelperpath = string
              Additional  places  to  search for helper executables. This is only used on Windows
              for now.

       idxabsmlen = int
              Length of abstracts we store while indexing. Recoll stores  an  abstract  for  each
              indexed  file.  The text can come from an actual 'abstract' section in the document
              or will just be the beginning of the document. It is stored in the index so that it
              can  be  displayed  inside the result lists without decoding the original file. The
              idxabsmlen parameter defines the size of the stored abstract. The default value  is
              250 bytes. The search interface gives you the choice to display this stored text or
              a synthetic abstract built by extracting text  around  the  search  terms.  If  you
              always  prefer  the synthetic abstract, you can reduce this value and save a little
              space.

       idxmetastoredlen = int
              Truncation length of stored metadata fields. This does  not  affect  indexing  (the
              whole  field  is processed anyway), just the amount of data stored in the index for
              the purpose of displaying fields inside result lists or previews. The default value
              is 150 bytes, which may be too low if you have custom fields.
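
              A possible combination of the two storage parameters, assuming that synthetic
              abstracts are always preferred and that long custom fields must be displayed. Both
              values are arbitrary examples, not recommendations:

                     # Store shorter abstracts to save a little index space
                     idxabsmlen = 100
                     # Keep more of each metadata field for display
                     idxmetastoredlen = 500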

       idxtexttruncatelen = int
              Truncation length for all document texts. Only index the beginning of documents.
              This is not recommended unless you are sure that the interesting keywords are at
              the top and you have severe disk space issues.

       aspellLanguage = string
              Language definitions to use when creating the aspell dictionary. The value must
              match a set of aspell language definition files. You can type "aspell dicts" to
              see a list. The default, if this is not set, is to use the NLS environment to guess
              the value.

       aspellAddCreateParam = string
              Additional option and parameter to the aspell dictionary creation command. Some
              aspell packages may need an additional option (e.g. on Debian Jessie:
              --local-data-dir=/usr/lib/aspell). See Debian bug 772415.
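
              For instance, a configuration that forces a given dictionary language and passes the
              Debian Jessie option could look like the sketch below. The language code "en" is an
              assumption; use a name reported by your own aspell installation:

                     # Assumed language code, check the list printed by aspell
                     aspellLanguage = en
                     aspellAddCreateParam = --local-data-dir=/usr/lib/aspell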

       aspellKeepStderr = bool
              Set this to have a look at aspell dictionary  creation  errors.  There  are  always
              many, so this is mostly for debugging.

       noaspell = bool
              Disable aspell use. The aspell dictionary generation takes time, and some
              combinations of aspell version, language, and local terms result in aspell
              crashing, so it sometimes makes sense to just disable the thing.

       monauxinterval = int
              Auxiliary  database  update  interval.  The  real  time  indexer  only  updates the
              auxiliary databases (stemdb, aspell) periodically, because it would be  too  costly
              to do it for every document change. The default period is one hour.

       monixinterval = int
              Minimum interval (seconds) between processings of the indexing queue. The real time
              indexer does not process each event when it comes in, but lets the queue
              accumulate, to diminish overhead and to aggregate multiple events affecting the
              same file. The default is 30 seconds.
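
              A sketch of how the two real time indexer intervals might be relaxed on a busy
              machine. The values are arbitrary, and the unit of monauxinterval is assumed here to
              be seconds, like monixinterval:

                     # Assumption: seconds. Update stemdb and aspell data every two hours
                     monauxinterval = 7200
                     # Process the indexing queue at most once per minute
                     monixinterval = 60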

       mondelaypatterns = string
              Timing parameters for the real time indexing. Definitions for files which get a
              longer delay before reindexing is allowed. This is for fast-changing files that
              should only be reindexed once in a while. The value is a list of
              wildcardPattern:seconds pairs. The patterns are matched with
              fnmatch(pattern, path, 0). You can quote entries containing white space with double
              quotes (quote the whole entry, not the pattern). The default is empty. Example:

                     mondelaypatterns = *.log:20 "*with spaces.*:30"

       monioniceclass = int
              ionice class for the real time indexing process, on platforms where this is
              supported. The default value is 3.

       monioniceclassdata = string
              ionice class parameter for the real time indexing process, on platforms where this
              is supported. The default is empty.
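
              As a hedged example, the fragment below relies on the class numbering documented in
              ionice(1), where 2 is the best-effort class and accepts a priority from 0 (highest)
              to 7 (lowest). It is only one possible choice:

                     # Best-effort class, lowest priority within that class
                     monioniceclass = 2
                     monioniceclassdata = 7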

       autodiacsens = bool
              Auto-trigger diacritics sensitivity (raw index only). If the index is not stripped,
              decide whether we automatically trigger diacritics sensitivity when the search term
              has accented characters (not in unac_except_trans). Otherwise you need to use the
              query language and the "D" modifier to specify diacritics sensitivity. Default is no.

       autocasesens = bool
              Auto-trigger case sensitivity (raw index only). If the index is not stripped (see
              indexStripChars), decide whether we automatically trigger character case sensitivity
              when the search term has upper-case characters in any but the first position.
              Otherwise you need to use the query language and the "C" modifier to specify
              character-case sensitivity. Default is yes.
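
              A minimal sketch that inverts both defaults on a raw (unstripped) index. Writing
              boolean values as 1 and 0 is an assumption made for this example:

                     # Accented search terms trigger diacritics sensitivity
                     autodiacsens = 1
                     # Upper-case characters no longer trigger case sensitivity
                     autocasesens = 0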

       maxTermExpand = int
              Maximum query expansion count for a single term (e.g. when using wildcards). This
              only affects queries, not indexing. We used to not limit this at all (except for
              filenames, where the limit was too low at 1000), but unlimited expansion is
              unreasonable with a big index. Default 10000.

       maxXapianClauses = int
              Maximum  number  of  clauses  we  add  to  a single Xapian query. This only affects
              queries, not indexing.  In  some  cases,  the  result  of  term  expansion  can  be
              multiplicative, and we want to avoid eating all the memory. Default 50000.
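
              If wildcard queries on a large index are being cut short, both limits can be raised
              together, at the cost of more memory per query. The values below are arbitrary
              examples, not tested recommendations:

                     maxTermExpand = 20000
                     maxXapianClauses = 100000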

       snippetMaxPosWalk = int
              Maximum number of positions we walk while populating a snippet for the result list.
              The default of 1,000,000 may be insufficient for very big documents; the
              consequence would be snippets with possibly meaning-altering missing words.

       pdfocr = bool
              Attempt  OCR  of  PDF files with no text content if both tesseract and pdftoppm are
              installed. The default is off because OCR is so very slow.

       pdfocrlang = string
              Language to assume for PDF OCR. This is very important for having a reasonable rate
              of  errors with tesseract. This can also be set through a configuration variable or
              directory-local parameters. See the rclpdf.py script.
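
              A possible setup for scanned English documents. The value "eng" is the tesseract
              code for English, and writing the boolean as 1 is an assumption made for this
              example:

                     pdfocr = 1
                     pdfocrlang = eng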

       pdfattach = bool
              Enable PDF attachment  extraction  by  executing  pdftk  (if  available).  This  is
              normally  disabled,  because  it  does slow down PDF indexing a bit even if not one
              attachment is ever found.

       pdfextrameta = string
              Extract text from selected XMP metadata tags. This is a space-separated list of
              qualified XMP tag names. Each element can also include a translation to a Recoll
              field name, separated by a '|' character. If the second element is absent, the tag
              name is used as the Recoll field name. You will also need to add specifications to
              the "fields" file to direct processing of the extracted data.
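
              The following line sketches the syntax. The bibtex: tag names are only illustrative
              and must be replaced by tags actually present in your PDF files. The first element
              maps a tag to the Recoll field "location", the other two keep the tag name as the
              field name:

                     pdfextrameta = bibtex:location|location bibtex:booktitle bibtex:pages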

       pdfextrametafix = fn
              Define name of XMP field editing script. This defines the name of a  script  to  be
              loaded  for  editing XMP field values. The script should define a 'MetaFixer' class
              with a metafix() method which will be called with the qualified tag name and  value
              of  each selected field, for editing or erasing. A new instance is created for each
              document, so that the object can keep state for, e.g. eliminating duplicate values.

       mhmboxquirks = string
              Enable thunderbird/mozilla-seamonkey mbox format quirks. Set this for the directory
              where the email mbox files are stored.
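
              Since the parameter is meant to be set per directory, it would normally appear inside
              a section. In the sketch below, both the directory name and the value "tbird" are
              assumptions to be checked against the comments in the system-wide configuration file:

                     [~/.thunderbird]
                     mhmboxquirks = tbird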

SEE ALSO

       recollindex(1) recoll(1)

                                         14 November 2012                          RECOLL.CONF(5)