Provided by: recoll_1.17.3-2_amd64

NAME

       recoll.conf - main personal configuration file for Recoll

DESCRIPTION

       This file defines the indexing configuration for the Recoll full-text search system.

       The system-wide configuration file is normally located inside /usr/[local]/share/recoll/examples. Any
       parameter set in the system-wide file may be overridden by setting it in the personal configuration
       file, which is by default: $HOME/.recoll/recoll.conf

       Please note that while we try to keep this manual page reasonably up to date, it will frequently lag
       behind the current state of the software. The best source of information about the configuration is
       the set of comments in the configuration file.

       A short extract of the file might look as follows:

              # Space-separated list of directories to index.
              topdirs =  ~/docs /usr/share/doc

              [~/somedirectory-with-utf8-txt-files]
              defaultcharset = utf-8

       There are three kinds of lines:

              •      Comment or empty

              •      Parameter assignment

              •      Section definition

       Empty lines and lines beginning with # are ignored.

       Assignment lines have the form 'name = value'.

       Section lines, which consist of a directory path enclosed in square brackets, allow redefining a
       parameter for a directory subtree. Some of the parameters used for indexing are looked up
       hierarchically, from the most specific to the least specific section. Not all parameters can be
       meaningfully redefined; this is specified for each one in the next section.

       The tilde character (~) is expanded in file names to the name of the user's home directory.

       Where values are lists, white space is used for separation, and elements  with  embedded  spaces  can  be
       quoted with double-quotes.
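
       As an illustration of list quoting and of a section override, a fragment along the following lines
       could be used. The directory names are hypothetical and only serve as an example:

              # The second element contains a space, so it is quoted.
              topdirs = ~/docs "/home/me/My Documents"

              # Redefines defaultcharset for this directory subtree only.
              [~/docs/legacy]
              defaultcharset = iso-8859-1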

OPTIONS

       topdirs = directories
              Specifies the list of directories to index (recursively).

       dbdir = directory
              The  name  of  the  Xapian  database  directory. It will be created if needed when the database is
              initialized. If this is not an absolute pathname, it will be taken relative to  the  configuration
              directory.

       skippedNames = patterns
              A  space-separated  list  of  patterns for names of files or directories that should be completely
              ignored. The list defined in the default file is:

              *~ #* bin CVS  Cache caughtspam  tmp

              The list can be redefined for subdirectories, but a change only actually takes effect for the
              top-level ones listed in topdirs.
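
              As a sketch, the default list could be repeated with one more pattern appended in the personal
              configuration file (the *.bak pattern is only an illustrative assumption):

                     skippedNames = *~ #* bin CVS Cache caughtspam tmp *.bak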

       skippedPaths = patterns
              A space-separated list of patterns for paths the indexer should not descend into. Together with
              topdirs, this allows pruning the indexed tree to your liking. daemSkippedPaths can be used to
              define a specific value for the real-time indexing monitor.
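
              For example, assuming some hypothetical locations that should be left out of the index, with
              one more excluded for the real-time monitor:

                     skippedPaths = /media ~/tmp/scratch
                     daemSkippedPaths = /media ~/tmp/scratch ~/downloads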

       followLinks = boolean
              Specifies whether the indexer should follow symbolic links while walking the file tree. The
              default is to ignore symbolic links, to avoid multiple indexing of linked files. No effort is
              made to avoid duplication when this option is set to true. This option can be set individually
              for each of the topdirs members by using sections. It cannot be changed below the topdirs level.
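
              A minimal sketch, assuming that /home/me/projects (a hypothetical path) is one of the topdirs
              members and that boolean values are written as 0 or 1:

                     topdirs = ~/docs /home/me/projects

                     # Follow symbolic links for this topdirs member only.
                     [/home/me/projects]
                     followLinks = 1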

       loglevel = value
              Verbosity level for recoll and recollindex. A value of 4 lists quite a  lot  of  debug/information
              messages.  3  lists  only  errors.   daemloglevel can be used to specify a different value for the
              real-time indexing daemon.

       logfilename = file
              Where the messages should go. 'stderr' can be used as a special value. daemlogfilename can be
              used to specify a different value for the real-time indexing daemon.
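
              For instance, the following fragment would send only errors to stderr for recoll and
              recollindex, while the real-time indexing daemon would write more detailed messages to a file
              (the file name is a hypothetical choice):

                     loglevel = 3
                     logfilename = stderr
                     daemloglevel = 4
                     daemlogfilename = /tmp/recoll-daemon.log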

       indexstemminglanguages = languages
              A  list  of languages for which the stem expansion databases will be built. See recollindex(1) for
              possible values.

       defaultcharset = charset
              The name of the character set used for files that do not contain a character set definition
              (e.g. plain text files). This can be redefined for any subdirectory.

       maxfsoccuppc = percentnumber
              Maximum  file  system occupation before we stop indexing. The value is a percentage, corresponding
              to what the "Capacity" df output column shows.  The default value is 0, meaning no checking.

       idxflushmb = megabytes
              Threshold (in megabytes of new text data) at which the index is flushed from memory to disk.
              Setting this can help control memory usage. A value of 0 means no explicit flushing, letting
              Xapian use its own default, which is to flush every 10000 documents (or XAPIAN_FLUSH_THRESHOLD),
              meaning that memory usage depends on average document size. The default value is 10.
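
              As an illustration, a larger threshold could be set on a machine with plenty of memory (the
              value 50 is an arbitrary assumption, not a recommendation):

                     # Flush to disk after about 50 megabytes of new text data.
                     idxflushmb = 50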

       filtersdir = directory
              A directory to search for the external filter scripts used to index some types of files. The value
              should  not  be changed, except if you want to modify one of the default scripts. The value can be
              redefined for any subdirectory.

       iconsdir = directory
              The name of the directory where recoll result list icons are stored. You can change  this  if  you
              want different images.

       guesscharset = boolean
              Try to guess the character set of files if no internal value is available (e.g. for plain text
              files). This does not work well in general, and should probably not be used.

       usesystemfilecommand = boolean
              Decides whether the file -i system command is used as a final step for determining the mime
              type of a file (the main procedure uses suffix associations as defined in the mimemap file).
              This can be useful for files with suffixless names, but it will also cause the indexing of many
              bogus "text" files.

       indexedmimetypes = list
              Recoll normally indexes any file which it knows how to read.  This  list  lets  you  restrict  the
              indexed  mime  types  to  what  you specify. If the variable is unspecified or the list empty (the
              default), all supported types are processed.
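
              For example, to restrict indexing to a few document types (the selection of types below is
              only an illustration):

                     indexedmimetypes = text/plain text/html application/pdf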

       compressedfilemaxkbs = value
              Size limit for compressed (.gz or .bz2) files. These  need  to  be  decompressed  in  a  temporary
              directory  for  identification, which can be very wasteful if 'uninteresting' big compressed files
              are present.  Negative means no limit, 0 means no processing of any compressed file.  Defaults  to
              -1.
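
              A sketch, assuming (as the parameter name suggests) that the value is expressed in kilobytes,
              and using an arbitrary limit:

                     # Skip compressed files larger than about 20000 kilobytes.
                     compressedfilemaxkbs = 20000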

       indexallfilenames = boolean
              Recoll indexes file names into a special section of the database to allow file name searches
              using wild cards. This parameter decides if file name indexing is performed only for files with
              mime types that would qualify them for full-text indexing, or for all files inside the selected
              subtrees, independent of mime type.

       idxabsmlen = value
              Recoll stores an abstract for each indexed file inside the database. The text  can  come  from  an
              actual  'abstract'  section  in  the document or will just be the beginning of the document. It is
              stored in the index so that it can be displayed inside  the  result  lists  without  decoding  the
              original file. The idxabsmlen parameter defines the size of the stored abstract. The default value
              is  250  bytes.   The  search  interface  gives  you  the  choice to display this stored text or a
              synthetic abstract built by extracting text around the search terms.  If  you  always  prefer  the
              synthetic abstract, you can reduce this value and save a little space.

       aspellLanguage = lang
              Language  definitions  to  use when creating the aspell dictionary.  The value must match a set of
              aspell language definition files. You can type "aspell config" to see where  these  are  installed
              (look  for  data-dir).  The  default  if  the  variable is not set is to use your desktop national
              language environment to guess the value.

       noaspell = boolean
              If this is set, the aspell dictionary generation is turned off. Useful for cases where  you  don't
              need the functionality or when it is unusable because aspell crashes during dictionary generation.

       nocjk = boolean
              If this is set to true, the specific East Asian (Chinese, Japanese, Korean) character/word
              splitting is turned off. This will save a small amount of CPU time if you have no CJK documents.
              If your document base does include such text but you are not interested in searching it, setting
              nocjk may be a significant time and space saver.

       cjkngramlen = value
              This lets you adjust the size of n-grams used for indexing CJK text. The default  value  of  2  is
              probably  appropriate  in  most  cases.  A value of 3 would allow more precision and efficiency on
              longer words, but the index will be approximately twice as large.
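
              As an example, the following fragment keeps CJK processing enabled and selects longer n-grams,
              at the cost of a larger index (writing the boolean as 0 is an assumption about the accepted
              syntax):

                     nocjk = 0
                     cjkngramlen = 3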

SEE ALSO

       recollindex(1) recoll(1)

                                                 8 January 2006                                   RECOLL.CONF(5)