Provided by: lttoolbox_3.5.1-2build2_amd64

NAME

     lt-proc — lexical processor for Apertium

SYNOPSIS

     lt-proc [-a | -b | -o | -c | -d | -e | -g | -n | -p | -s | -t | -v | -h | -z | -w] [-W] [-N N] [-L N]
             [-i icx_file] fst_file [input_file [output_file]]

DESCRIPTION

     lt-proc is the application responsible for providing the four lexical processing functionalities:

        morphological analyser (option -a)

        lexical transfer (option -b)

        morphological generator (option -g)

        post-generator (option -p)

     It accomplishes these tasks by reading binary files containing a compact and efficient representation of
     dictionaries (a class of finite-state transducers called augmented letter transducers).  These files are
     generated by lt-comp(1).
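
     As an illustration of the SYNOPSIS, the following Python sketch builds an lt-proc command line and
     pipes text through it.  The helper names and the dictionary file name are hypothetical, and the sketch
     assumes that lt-proc is on the PATH and reads standard input when no input_file is given.

```python
import subprocess

def build_lt_proc_command(mode_flag, fst_file, null_flush=False):
    """Build an argv list of the form: lt-proc MODE [-z] fst_file."""
    argv = ["lt-proc", mode_flag]
    if null_flush:
        argv.append("-z")  # flush output on the null character
    argv.append(fst_file)
    return argv

def run_lt_proc(mode_flag, fst_file, text):
    """Send text to lt-proc on stdin and return whatever it writes to stdout.

    Needs an installed lt-proc and an existing compiled dictionary,
    so it is not called in this sketch.
    """
    result = subprocess.run(
        build_lt_proc_command(mode_flag, fst_file),
        input=text, capture_output=True, text=True, check=True,
    )
    return result.stdout

# "es.automorf.bin" is a hypothetical file name, not one taken from this page.
print(build_lt_proc_command("-a", "es.automorf.bin"))
# ['lt-proc', '-a', 'es.automorf.bin']
```

     Keeping the argument list separate from the subprocess call makes it easy to inspect the command
     before running it.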

     Some characters (‘[’, ‘]’, ‘$’, ‘^’, ‘/’, ‘+’) are reserved for formatting and encapsulation, and must be
     escaped when they are to be used literally.  For instance, text enclosed in ‘[’...‘]’ is ignored, and
     each lexical unit is delimited as ‘^...$’.
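
     A minimal Python sketch of escaping these characters before text is handed to lt-proc.  It assumes
     that a backslash is the escape character and that the backslash itself also needs escaping; neither
     detail is stated on this page, and the function names are hypothetical.

```python
# The six special characters listed above.
RESERVED = "[]$^/+"

def escape(text):
    """Prefix every reserved character (and every backslash) with a backslash."""
    out = []
    for ch in text:
        if ch in RESERVED or ch == "\\":
            out.append("\\")  # assumption: backslash is the escape character
        out.append(ch)
    return "".join(out)

def unescape(text):
    """Undo escape(): keep the character that follows each backslash."""
    out = []
    chars = iter(text)
    for ch in chars:
        if ch == "\\":
            ch = next(chars, "")  # a lone trailing backslash is dropped
        out.append(ch)
    return "".join(out)

print(escape("3+4 [sic]"))
# 3\+4 \[sic\]
```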

OPTIONS

     -a, --analysis
             Tokenizes the text into surface forms (lexical units as they appear in texts) and delivers, for each
             surface form, one or more lexical forms consisting of lemma, lexical category and morphological
             inflection information.  Tokenization is not straightforward due to the existence, on the one hand,
             of contractions, and, on the other hand, of multi-word lexical units.  For contractions, the system
             reads in a single surface form and delivers the corresponding sequence of lexical forms.  Multi-
             word surface forms are analysed in a left-to-right, longest-match fashion.  Multi-word surface
             forms may be invariable (such as a multi-word preposition or conjunction) or inflected (for
             example, in es, “echaban de menos”, “they missed”, is a form of the imperfect indicative tense of
             the verb “echar de menos”, “to miss”).  Limited support for some kinds of discontinuous multi-word
             units is also available.  Analysis of single-word surface forms produces output like the following
             examples:

             “cantar” → “^cantar/cantar<vblex><inf>$” or “daba” →
             “^daba/dar<vblex><pii><p1><sg>/dar<vblex><pii><p3><sg>$”.
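
             The output format in these examples can be taken apart as in the Python sketch below.  The
             function name is hypothetical, and the sketch assumes that the unit contains no escaped
             special characters.

```python
import re

# A tag is any run of characters enclosed in angle brackets, e.g. <vblex>.
TAG = re.compile(r"<([^<>]+)>")

def parse_lexical_unit(unit):
    """Split '^surface/lemma<tag>.../lemma<tag>...$' into its parts.

    Returns (surface_form, [(lemma, [tags]), ...]).
    Assumption: no escaped '/', '^' or '$' occurs inside the unit.
    """
    if not (unit.startswith("^") and unit.endswith("$")):
        raise ValueError("expected a unit of the form ^...$")
    surface, *readings = unit[1:-1].split("/")
    analyses = []
    for reading in readings:
        lemma = reading.split("<", 1)[0]  # text before the first tag
        analyses.append((lemma, TAG.findall(reading)))
    return surface, analyses

surface, analyses = parse_lexical_unit(
    "^daba/dar<vblex><pii><p1><sg>/dar<vblex><pii><p3><sg>$"
)
print(surface)   # daba
print(analyses)  # [('dar', ['vblex', 'pii', 'p1', 'sg']), ('dar', ['vblex', 'pii', 'p3', 'sg'])]
```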

     -b, --bilingual
             Performs lexical transfer, attaching any trailing morphological symbols not specified in the
             dictionaries.  Like the analysis mode, it supports multiple target-language lexical forms for a
             given source-language lexical form.  It typically works on the output of
             apertium-pretransfer(1).

     -o, --surf-bilingual
             Like -b, but takes input from apertium-tagger(1) -p, which includes surface forms; if the lexical
             form is not found in the bilingual dictionary, it outputs the surface form of the word.

     -c, --case-sensitive
             Use the literal case of the incoming characters.

     -d, --debugged-gen
             Morphological generation (like -g) that keeps all debugging marks in the output.

     -e, --decompose-compounds
             Try to treat unknown words as compounds, and decompose them.

     -w, --dictionary-case
             Use the case information contained in the lexicon, instead of the surface case (only applied in
             analysis mode).

     -g, --generation
             Delivers a target-language surface form for each target-language lexical form, by suitably
             inflecting it.

     -n, --non-marked-gen
             Morphological generation (like -g) but without unknown word marks (asterisk ‘*’).

     -l, --tagged-gen
             Morphological generation (like -g) but retaining part-of-speech tags.

     -p, --post-generation
             Performs orthographical operations such as contractions and apostrophations.  The post-generator is
             usually dormant (just copies the input to the output) until a special alarm symbol contained in
             some target-language surface forms wakes it up to perform a particular string transformation if
             necessary; then it goes back to sleep.
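
             The dormant/awake behaviour can be sketched in Python as follows.  This is an illustration
             only, not the lt-proc implementation: the alarm symbol ‘~’, the rule table, and the choice to
             drop the alarm symbol when no rule applies are all assumptions.

```python
def post_generate(text, rules, alarm="~"):
    """Copy text through unchanged until the alarm symbol appears.

    rules maps a source string (as it appears right after the alarm
    symbol) to its replacement. Both the rules and the '~' alarm
    symbol are hypothetical choices made for this sketch.
    """
    out = []
    i = 0
    while i < len(text):
        if text[i] != alarm:
            out.append(text[i])  # dormant: copy input to output
            i += 1
            continue
        # Awake: look for the longest rule that matches right after the alarm.
        rest = text[i + 1:]
        match = max((src for src in rules if rest.startswith(src)),
                    key=len, default=None)
        if match is None:
            i += 1  # assumption: no rule applies, so only the alarm is dropped
        else:
            out.append(rules[match])
            i += 1 + len(match)
        # The loop then continues in the dormant state.
    return "".join(out)

rules = {"de el": "del", "a el": "al"}
print(post_generate("vengo ~de el mercado", rules))
# vengo del mercado
```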

     -s, --sao
             Processes input in the orthoepikon (formerly sao) annotation system format:
             http://orthoepikon.sf.net.

     -t, --transliteration
             Apply a transliteration dictionary.

     -i icx_file, --ignored-chars icx_file
             Ignores characters specified in the file icx_file.

     -z, --null-flush
             Flush output on the null character.

     -C, --careful-case
             Use the dictionary case if present; otherwise use the surface case.

     -N N, --analyses N
             Output no more than N analyses (if the transducer is weighted, the N best analyses).

     -L N, --weight-classes N
             Output no more than N best weight classes (where analyses with equal weight constitute a class).
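
             The difference between limiting analyses and limiting weight classes can be sketched in
             Python.  The function names and sample data are hypothetical, and the sketch assumes that a
             lower weight means a better analysis, which this page does not state.

```python
def n_best(analyses, n):
    """Keep at most n analyses, best first.

    analyses is a list of (analysis, weight) pairs.
    Assumption: a lower weight is better.
    """
    return sorted(analyses, key=lambda pair: pair[1])[:n]

def n_best_classes(analyses, n):
    """Keep every analysis whose weight is among the n best distinct weights.

    Analyses with equal weight form one class, so more than n
    analyses may be returned.
    """
    kept_weights = set(sorted({weight for _, weight in analyses})[:n])
    ranked = sorted(analyses, key=lambda pair: pair[1])
    return [pair for pair in ranked if pair[1] in kept_weights]

# Hypothetical analyses and weights, for illustration only.
sample = [("a", 1.0), ("b", 1.0), ("c", 2.5), ("d", 4.0)]
print(n_best(sample, 2))          # [('a', 1.0), ('b', 1.0)]
print(n_best_classes(sample, 2))  # [('a', 1.0), ('b', 1.0), ('c', 2.5)]
```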

     -W, --show-weights
             Print final analysis weights (if any).

     -v, --version
             Display the version number.

     -h, --help
             Display this help.

FILES

     fst_file
             The input compiled dictionary, as generated by lt-comp(1).

SEE ALSO

     apertium(1), apertium-tagger(1), lt-comp(1), lt-expand(1)

COPYRIGHT

     Copyright © 2005, 2006 Universitat d'Alacant / Universidad de Alicante.  This is free software.  You may
     redistribute copies of it under the terms of the GNU General Public License:
     https://www.gnu.org/licenses/gpl.html.

BUGS

     Many... lurking in the dark and waiting for you!