Ubuntu Manpage: omindex - Index static website data via the filesystem

Provided by: xapian-omega_1.4.25-2_amd64

NAME

       omindex - Index static website data via the filesystem

SYNOPSIS

       omindex [OPTIONS] --db DATABASE [BASEDIR] DIRECTORY

DESCRIPTION

       omindex indexes static website data by walking the filesystem.

       DIRECTORY is the directory to start indexing from.

       BASEDIR is the directory corresponding to the base URL set with --url (default: DIRECTORY).
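
       As a rough illustration of how BASEDIR and the base URL relate, the sketch below maps a file
       path to a URL by replacing the BASEDIR prefix with the base URL. The paths, the base URL and
       the helper name path_to_url are hypothetical, and details of omindex's own handling (URL
       encoding, for example) are not reproduced here.

```shell
# Hypothetical values, chosen only for illustration.
basedir=/var/www/example
base_url=/

# Map a file under $basedir to a URL by swapping the directory prefix
# for the base URL.
path_to_url() {
    rel=${1#"$basedir"}     # strip the BASEDIR prefix
    rel=${rel#/}            # drop the leading slash, if any
    printf '%s/%s\n' "${base_url%/}" "$rel"
}

url=$(path_to_url /var/www/example/press/index.html)
echo "$url"
```

       Changing base_url to a value such as http://example.org/ yields absolute URLs with the same
       relative part.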

OPTIONS

       -d, --duplicates=ARG
              set duplicate handling: ARG can be 'ignore' or 'replace' (default: replace)

       -p, --no-delete
              skip the deletion of documents corresponding to deleted files
              (--preserve-nonduplicates is a deprecated alias for --no-delete)

       -e, --empty-docs=ARG
              how to handle documents we extract no text from: ARG can be 'index', 'warn' (issue a
              diagnostic and index), or 'skip' (default: warn)

       -D, --db=DATABASE
              path to database to use

       -U, --url=URL
              base URL that BASEDIR corresponds to (default: /)

       -M, --mime-type=EXT:TYPE
              assume  any  file  with  extension EXT has MIME Content-Type TYPE, instead of using
              libmagic (empty TYPE removes any existing  mapping  for  EXT;  other  special  TYPE
              values: 'ignore' and 'skip')

       -G, --mime-type-match=GLOB:TYPE
              assume  any  file  with  leaf  name  matching  shell wildcard pattern GLOB has MIME
              Content-Type TYPE (special TYPE values: 'ignore' and 'skip')
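
              A sketch of how -M and -G might be combined. The database path, directory,
              extensions and glob are hypothetical, and only the EXT:TYPE and GLOB:TYPE forms
              described above are used. The command is assembled and printed, not executed.

```shell
db=/var/lib/omega/data/default    # hypothetical database path
docroot=/var/www/example          # hypothetical DIRECTORY

# Treat *.log as plain text, map *.bak to the special TYPE 'ignore',
# and treat leaf names matching README* as plain text.
set -- omindex --db "$db" \
    -M log:text/plain \
    -M bak:ignore \
    -G 'README*:text/plain' \
    "$docroot"

cmd=$*
echo "$cmd"    # review the command, then run it once the paths are adjusted
```

              Quoting the GLOB:TYPE argument keeps the shell from expanding the wildcard before
              omindex sees it.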

       -F, --filter=M[,[T][,C]]:CMD
              process files with MIME Content-Type M using command CMD, which produces output (on
              stdout  or  in  a  temporary  file)  with format T (Content-Type or file extension;
              currently txt (default), html or svg) in character  encoding  C  (default:  UTF-8).
              E.g. -Fapplication/octet-stream:'strings -n8' or -Ftext/x-foo,,utf-16:'foo2utf16 %f
              %t'
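
              The two examples above can be spelled out as a full command line. In this sketch the
              database path and directory are hypothetical, foo2utf16 is the placeholder command
              from the text and not a real tool, and reading %f as the input file and %t as the
              temporary output file is an assumption based on the description above.

```shell
db=/var/lib/omega/data/default    # hypothetical database path
docroot=/var/www/example          # hypothetical DIRECTORY

# M = application/octet-stream; T and C left at their defaults (txt, UTF-8).
f1="application/octet-stream:strings -n8"
# M = text/x-foo; T omitted (default txt); C = utf-16.
f2="text/x-foo,,utf-16:foo2utf16 %f %t"

set -- omindex --db "$db" -F "$f1" -F "$f2" "$docroot"

# Show each argument separately, so the quoting of CMD is visible.
printf '<%s> ' "$@"
echo
```

              Keeping each M[,[T][,C]]:CMD value in its own quoted variable means a CMD that
              contains spaces stays a single argument.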

       --read-filters=FILE
              bulk-load --filter arguments from FILE, which should contain one such argument  per
              line  (e.g.   text/x-bar:bar2txt  --utf8).   Lines  starting  with # are treated as
              comments and ignored.
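
              A filter file of the kind described above could look like the following sketch. The
              bar2txt and foo2utf16 commands are hypothetical placeholders, and a temporary file
              stands in for whatever path would be used in practice. The script only writes the
              file and counts its non-comment lines.

```shell
filters=$(mktemp)    # stand-in for a fixed path of your choosing

# One --filter argument per line; lines starting with # are comments.
cat > "$filters" <<'EOF'
# Hypothetical converters, for illustration only
text/x-bar:bar2txt --utf8
text/x-foo,,utf-16:foo2utf16 %f %t
EOF

# Count the lines that are not comments, i.e. the filter definitions.
count=$(grep -vc '^#' "$filters")
echo "$count filter(s) in $filters"

# The file would then be passed as: omindex --read-filters="$filters" ...
```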

       -l, --depth-limit=LIMIT
              set recursion limit (0 = unlimited)

       -f, --follow
              follow symbolic links

       -i, --ignore-exclusions
              ignore meta robots tags and similar exclusions

       -S, --spelling
              index data for spelling correction

       -m, --max-size=N[SUFFIX]
              maximum size of file to index (in bytes or  with  a  suffix  of  'K'/'k',  'M'/'m',
              'G'/'g') (default: unlimited)
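
              To sanity-check a size argument before passing it to --max-size, a small helper can
              expand the suffix. This sketch assumes the suffixes are 1024-based multipliers,
              which this page does not state. to_bytes is a hypothetical helper and handles whole
              numbers only.

```shell
# Expand N[SUFFIX] into bytes, assuming K = 1024, M = 1024^2, G = 1024^3.
to_bytes() {
    num=${1%[KkMmGg]}       # numeric part, with any single suffix letter removed
    suffix=${1#"$num"}      # the suffix that was removed, possibly empty
    case $suffix in
        '')   echo "$num" ;;
        [Kk]) echo $((num * 1024)) ;;
        [Mm]) echo $((num * 1024 * 1024)) ;;
        [Gg]) echo $((num * 1024 * 1024 * 1024)) ;;
    esac
}

to_bytes 10M
to_bytes 512
```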

       --sample=SOURCE
              what to use for the stored sample of text for HTML documents - SOURCE can be 'body'
              or 'description' (default: 'body')

       -E, --sample-size=SIZE
              maximum  size  for  the  document  text  sample  (supports  the  same  formats   as
              --max-size).  (default: 512)

       -T, --title-size=SIZE
              maximum  size  for  the  document  title (supports the same formats as --max-size).
              (default: 128)

       -R, --retry-failed
              retry files which omindex failed to extract text from on a previous run

       --opendir-sleep=SECS
              sleep for SECS seconds before opening each directory - sleeping for 2 seconds seems
              to reliably work around problems with indexing files on Microsoft DFS shares.

       -C, --track-ctime
              track each file's ctime so we can detect changes to ownership or permissions.

       --date-terms
              accepted but ignored, for forward compatibility with Omega 1.5.x.

       --no-date-terms
              don't  index  D, M and Y prefixed terms to support date range filtering using terms
              (we now recommend using a value slot for this instead).

       -v, --verbose
              show more information about what is happening

       --overwrite
              create the database anew (the default is to update if the database already exists)

       -s, --stemmer=LANG
              set the stemming language (default: english).   Possible  values:  arabic  armenian
              basque  catalan  danish  dutch  earlyenglish  english finnish french german german2
              hungarian  indonesian  irish  italian  kraaij_pohlmann  lithuanian  lovins   nepali
              norwegian  porter  portuguese  romanian russian spanish swedish tamil turkish (pass
              'none' to disable stemming)
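
              Because a LANG value is easy to mistype, a wrapper script might check the name
              against the list above before invoking omindex. The helper name is_stemmer is
              hypothetical, and the list is copied from this page, so it may differ in other
              Omega versions.

```shell
stemmers="arabic armenian basque catalan danish dutch earlyenglish english finnish french german german2 hungarian indonesian irish italian kraaij_pohlmann lithuanian lovins nepali norwegian porter portuguese romanian russian spanish swedish tamil turkish none"

# Succeed if $1 is one of the names listed above (or 'none').
is_stemmer() {
    case " $stemmers " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

for lang in french klingon; do
    if is_stemmer "$lang"; then
        echo "$lang: listed, e.g. omindex --stemmer=$lang ..."
    else
        echo "$lang: not a listed stemmer"
    fi
done
```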

       -h, --help
              display this help and exit

       -V, --version
              output version information and exit

       Please report bugs at: https://xapian.org/bugs