Provided by: recoll_1.17.3-2_amd64

NAME

       recoll.conf - main personal configuration file for Recoll

DESCRIPTION

       This file defines the indexing configuration for the Recoll full-text search system.

       The system-wide configuration file is normally located inside
       /usr/[local]/share/recoll/examples. Any parameter set in the common file may be overridden
       by setting it in the personal configuration file, which is by default
       $HOME/.recoll/recoll.conf.

       Please note that, while we try to keep this manual page reasonably up to date, it will
       frequently lag behind the current state of the software. The best source of information
       about the configuration is the set of comments in the configuration file.

       A short extract of the file might look as follows:

              # Space-separated list of directories to index.
              topdirs =  ~/docs /usr/share/doc

              [~/somedirectory-with-utf8-txt-files]
              defaultcharset = utf-8

       There are three kinds of lines:

              ·      Comment or empty

              ·      Parameter assignment

              ·      Section definition

       Empty lines and lines beginning with # are ignored.

       Assignment lines are in the form 'name = value'.

       Section lines allow redefining a parameter for a directory subtree. Some of the parameters
       used for indexing are looked up hierarchically, from the more specific to the less
       specific. Not all parameters can be meaningfully redefined; this is specified for each
       parameter in the next section.

       The tilde character (~) is expanded  in  file  names  to  the  name  of  the  user's  home
       directory.

       Where  values  are  lists,  white space is used for separation, and elements with embedded
       spaces can be quoted with double-quotes.
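
       As an illustration of these rules, the fragment below combines comments, a list value
       with a quoted element, and a section. It is only a sketch: the directory names are
       hypothetical, and it assumes, as in the extract above, that global assignments are placed
       before the first section line.

              # Global assignment: a list with one element containing a space.
              topdirs = ~/docs "/home/me/My Documents" /usr/share/doc

              # Assignments after a section line apply to that subtree.
              [~/docs/legacy]
              defaultcharset = iso-8859-1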

OPTIONS

       topdirs = directories
              Specifies the list of directories to index (recursively).

       dbdir = directory
              The name of the Xapian database directory. It will be created if  needed  when  the
              database  is  initialized.  If  this  is not an absolute pathname, it will be taken
              relative to the configuration directory.
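
              For example, assuming the default configuration directory $HOME/.recoll, the
              relative value in this sketch would designate $HOME/.recoll/xapiandb (the directory
              name is only an illustration):

                     dbdir = xapiandb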

       skippedNames = patterns
              A space-separated list of patterns for names of files or directories that should be
              completely ignored. The list defined in the default file is:

              *~ #* bin CVS  Cache caughtspam  tmp

              The list can be redefined for subdirectories, but it is only actually changed for
              the top-level ones in topdirs.

       skippedPaths = patterns
              A space-separated list of patterns for paths the indexer should not  descend  into.
              Together with topdirs, this allows pruning the indexed tree as desired.
              daemSkippedPaths can be used to define a specific value for the real time  indexing
              monitor.
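
              A possible combination is sketched below. All paths are hypothetical; the second
              line assumes that the real-time monitor should skip one more directory than the
              batch indexer does.

                     topdirs = /home/me/docs
                     skippedPaths = /home/me/docs/archive /home/me/docs/mail/spam
                     daemSkippedPaths = /home/me/docs/archive /home/me/docs/mail/spam /home/me/docs/scratch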

       followLinks = boolean
              Specifies whether the indexer should follow symbolic links while walking the file
              tree. The default is to ignore symbolic links to avoid multiple indexing of linked
              files. No effort is made to avoid duplication when this option is set to true. This
              option can be set individually for each of the topdirs members by using sections. It
              cannot be changed below the topdirs level.
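
              The per-member setting could look as follows. This is a sketch with hypothetical
              directories, and it assumes that boolean values are written as 1 (true) and 0
              (false):

                     topdirs = ~/docs ~/projects

                     # Follow symbolic links under ~/projects only.
                     [~/projects]
                     followLinks = 1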

       loglevel = value
              Verbosity level for recoll and recollindex. A value of 4 lists quite a lot of
              debug/information messages; a value of 3 lists only errors. daemloglevel can be used
              to specify a different value for the real-time indexing daemon.

       logfilename = file
              Where the messages should go. 'stderr' can be used as a special value.
              daemlogfilename can be used to specify a different value for the real-time indexing
              daemon.
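
              As a sketch, the following would send errors from recoll and recollindex to
              stderr, while the real-time daemon logs more verbosely to its own file (the file
              name is a hypothetical choice):

                     loglevel = 3
                     logfilename = stderr
                     daemloglevel = 4
                     daemlogfilename = /tmp/recollindex-daemon.log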

       indexstemminglanguages = languages
              A  list  of  languages  for  which  the stem expansion databases will be built. See
              recollindex(1) for possible values.

       defaultcharset = charset
              The name of the character set used for files that do not contain a character set
              definition (i.e., plain text files). This can be redefined for any subdirectory.

       maxfsoccuppc = percentnumber
              Maximum  file system occupation before we stop indexing. The value is a percentage,
              corresponding to what the "Capacity" df output column shows.  The default value  is
              0, meaning no checking.
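
              For instance, to stop indexing once df reports the file system as 90% full (the
              threshold is an arbitrary illustration):

                     maxfsoccuppc = 90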

       idxflushmb = megabytes
              Threshold (in megabytes of new text data) at which the index data held in memory is
              flushed to disk. Setting this can help control memory usage. A value of 0 means no
              explicit flushing, letting Xapian use its own default, which is to flush every 10000
              documents (or XAPIAN_FLUSH_THRESHOLD), meaning that memory usage depends on the
              average document size. The default value is 10.
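
              For instance, a machine with plenty of memory might flush less often. The value
              below is only an illustration, not a recommendation:

                     # Flush to disk after about 50 megabytes of new text data.
                     idxflushmb = 50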

       filtersdir = directory
              A  directory  to search for the external filter scripts used to index some types of
              files. The value should not be changed, except if you want to  modify  one  of  the
              default scripts. The value can be redefined for any subdirectory.

       iconsdir = directory
              The name of the directory where recoll result list icons are stored. You can change
              this if you want different images.

       guesscharset = boolean
              Try to guess the character set of files if no internal value is available (i.e., for
              plain text files). This does not work well in general, and should probably not be
              used.

       usesystemfilecommand = boolean
              Decides whether the file -i system command is used as a final step for determining
              the mime type of a file (the main procedure uses suffix associations as defined in
              the mimemap file). This can be useful for files with suffixless names, but it will
              also cause the indexing of many bogus "text" files.

       indexedmimetypes = list
              Recoll  normally  indexes  any  file which it knows how to read. This list lets you
              restrict the indexed mime types to what you specify. If the variable is unspecified
              or the list empty (the default), all supported types are processed.
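
              A restricted list might look like this, assuming these three types are among those
              your installation knows how to read:

                     indexedmimetypes = text/plain text/html application/pdf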

       compressedfilemaxkbs = value
              Size  limit  for compressed (.gz or .bz2) files. These need to be decompressed in a
              temporary  directory  for  identification,  which   can   be   very   wasteful   if
              'uninteresting' big compressed files are present.  Negative means no limit, 0 means
              no processing of any compressed file. Defaults to -1.
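
              For example, assuming (as the parameter name suggests) that the value is expressed
              in kilobytes, the following would skip compressed files larger than about 20
              megabytes:

                     compressedfilemaxkbs = 20000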

       indexallfilenames = boolean
              Recoll indexes file names into a special section of the database to allow file name
              searches using wildcards. This parameter decides whether file name indexing is
              performed only for files with mime types that would qualify them for full-text
              indexing, or for all files inside the selected subtrees, independent of mime type.

       idxabsmlen = value
              Recoll stores an abstract for each indexed file inside the database. The  text  can
              come  from  an  actual  'abstract'  section  in  the  document  or will just be the
              beginning of the document. It is stored in the index so that it  can  be  displayed
              inside  the  result  lists  without  decoding  the  original  file.  The idxabsmlen
              parameter defines the size of the stored abstract. The default value is 250  bytes.
              The  search  interface  gives  you  the  choice  to  display  this stored text or a
              synthetic abstract built by extracting text around the search terms. If you  always
              prefer the synthetic abstract, you can reduce this value and save a little space.

       aspellLanguage = lang
              Language definitions to use when creating the aspell dictionary. The value must
              match a set of aspell language definition files. You can type "aspell config" to
              see where these are installed (look for data-dir). If the variable is not set, the
              default is to guess the value from your desktop national language environment.

       noaspell = boolean
              If this is set, the aspell dictionary generation is turned off.  Useful  for  cases
              where  you  don't  need  the  functionality  or  when it is unusable because aspell
              crashes during dictionary generation.
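
              The two aspell parameters could be used as sketched here. The "en" value assumes
              that English aspell definition files are installed, and the 1 assumes that boolean
              values are written as 1 and 0:

                     # Build the aspell dictionary from the English definitions.
                     aspellLanguage = en

                     # Alternatively, turn dictionary generation off altogether:
                     # noaspell = 1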

       nocjk = boolean
              If this is set to true, the specific East Asian (Chinese, Japanese, Korean)
              character/word splitting is turned off. This will save a small amount of CPU time if
              you have no CJK documents. If your document base does include such text but you are
              not interested in searching it, setting nocjk may be a significant time and space
              saver.

       cjkngramlen = value
              This  lets  you  adjust the size of n-grams used for indexing CJK text. The default
              value of 2 is probably appropriate in most cases. A value of  3  would  allow  more
              precision and efficiency on longer words, but the index will be approximately twice
              as large.
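
              The two CJK parameters could be set as in this sketch, which assumes that boolean
              values are written as 1 and 0:

                     # Trade index size for precision on longer CJK words.
                     cjkngramlen = 3

                     # Or, if CJK text never needs to be searched:
                     # nocjk = 1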

SEE ALSO

       recollindex(1), recoll(1)

                                          8 January 2006                           RECOLL.CONF(5)