Ubuntu Manpage: lt-proc - This application is part of the lexical processing modules and tools ( lttoolbox )

Provided by: lttoolbox_3.1.0-1.1ubuntu1_amd64

NAME

       lt-proc - This application is part of the lexical processing modules and tools ( lttoolbox )

       This tool is part of the apertium machine translation architecture: http://apertium.sf.net.

SYNOPSIS

       lt-proc [ -a | -g | -n | -p | -s | -v | -h ] fst_file [input_file [output_file]]

       lt-proc [ --analysis | --generation | --non-marked-gen | --post-generation | --sao | --version | --help ]
       fst_file [input_file [output_file]]
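
       As a rough illustration of the synopsis above, the following Python sketch assembles an lt-proc
       argument list. The helper name build_command and the file names in the usage note are hypothetical;
       only the option letters and the argument order come from this page.

```python
# Sketch only: maps the four processing modes named in DESCRIPTION to the
# short options listed in SYNOPSIS. build_command is a hypothetical helper.
MODES = {
    "analysis": "-a",
    "generation": "-g",
    "non-marked-gen": "-n",
    "post-generation": "-p",
}


def build_command(mode, fst_file, input_file=None, output_file=None):
    """Return an argv list of the form: lt-proc OPTION fst_file [input_file [output_file]]."""
    if mode not in MODES:
        raise ValueError("unknown mode: %r" % (mode,))
    # The synopsis nests output_file inside input_file, so it cannot stand alone.
    if output_file is not None and input_file is None:
        raise ValueError("output_file can only be given together with input_file")
    argv = ["lt-proc", MODES[mode], fst_file]
    if input_file is not None:
        argv.append(input_file)
        if output_file is not None:
            argv.append(output_file)
    return argv
```

       Assuming lt-proc is installed and a compiled dictionary named es.bin exists (both assumptions), the
       list returned by build_command("analysis", "es.bin", "in.txt", "out.txt") could be passed to
       subprocess.run.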

DESCRIPTION

       lt-proc is the application responsible for providing the four lexical processing functionalities:

              • morphological analyser  ( option -a )

              • lexical transfer  ( option -n )

              • morphological generator  ( option -g )

              • post-generator  ( option -p )

       It  accomplishes these tasks by reading binary files containing a compact and efficient representation of
       dictionaries (a class of finite-state transducers called augmented letter transducers). These  files  are
       generated by lt-comp(1).

       Note that some characters (`[', `]', `$', `^', `/', `+') are special characters used for format and
       encapsulation. They should be escaped if they have to be used literally. For instance, text enclosed
       in `['...`]' is ignored, and a lexical unit is delimited as `^...$'.
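
       A minimal sketch of escaping the special characters listed above before text is fed to lt-proc.
       This page does not say which escape character is used; the backslash below is an assumption, and
       escape_reserved is a hypothetical helper, not part of lttoolbox.

```python
# The six special characters named in this page.
RESERVED = "[]$^/+"
# Assumption: a backslash is the escape character (not stated in this page).
ESCAPE = "\\"


def escape_reserved(text):
    """Prefix every special character with the escape character.

    The escape character itself is escaped too, so that the result stays
    unambiguous under the backslash assumption.
    """
    escaped = []
    for ch in text:
        if ch in RESERVED or ch == ESCAPE:
            escaped.append(ESCAPE + ch)
        else:
            escaped.append(ch)
    return "".join(escaped)
```

       For example, escape_reserved("a/b [c]") yields a\/b \[c\] under the backslash assumption.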

OPTIONS

       -a, --analysis
              Tokenizes the text in surface forms (lexical units as they appear in texts) and delivers, for each
              surface  form,  one  or more lexical forms consisting of lemma, lexical category and morphological
              inflection information. Tokenization is not straightforward due to the existence, on the one hand,
              of contractions, and, on the other hand, of multi-word lexical units. For contractions, the system
              reads in a single surface form and delivers the corresponding sequence of  lexical  forms.  Multi-
              word  surface  forms  are  analysed  in a left-to-right, longest-match fashion. Multi-word surface
              forms may be invariable (such as a  multi-word  preposition  or  conjunction)  or  inflected  (for
              example,  in es, "echaban de menos", "they missed", is a form of the imperfect indicative tense of
              the verb "echar de menos", "to miss"). Limited support for some kinds of discontinuous  multi-word
              units is also available. Single-word surface form analysis produces output like the following
              examples: "cantar" -> `^cantar/cantar<vblex><inf>$' and "cantaba" ->
              `^cantaba/cantar<vblex><pii><p1><sg>/cantar<vblex><pii><p3><sg>$'.
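
              The analysis output shown in the examples above can be split back into its parts. The
              following Python sketch is one way to do it; parse_lexical_unit is a hypothetical helper, and
              it assumes the unit contains no escaped special characters.

```python
import re


def parse_lexical_unit(unit):
    """Split '^surface/lemma<tag>.../lemma<tag>...$' into its parts.

    Returns (surface_form, [(lemma, [tag, ...]), ...]).
    Sketch only: escaped special characters are not handled.
    """
    if not (unit.startswith("^") and unit.endswith("$")):
        raise ValueError("expected a unit of the form ^...$")
    # The first field is the surface form; each later field is one lexical form.
    surface, *analyses = unit[1:-1].split("/")
    parsed = []
    for analysis in analyses:
        lemma = analysis.split("<", 1)[0]
        tags = re.findall(r"<([^<>]+)>", analysis)
        parsed.append((lemma, tags))
    return surface, parsed


surface, forms = parse_lexical_unit(
    "^cantaba/cantar<vblex><pii><p1><sg>/cantar<vblex><pii><p3><sg>$"
)
# surface == "cantaba"
# forms == [("cantar", ["vblex", "pii", "p1", "sg"]),
#           ("cantar", ["vblex", "pii", "p3", "sg"])]
```

              The two lexical forms returned for "cantaba" correspond to the two analyses separated by `/'
              in the output.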

       -g, --generation
              Delivers  a  target-language  surface  form  for  each  target-language  lexical form, by suitably
              inflecting it.

       -n, --non-marked-gen
              Morphological generation (like -g) but without unknown word marks (asterisk `*').

       -p, --post-generation
              Performs orthographical operations such as contractions and apostrophations. The post-generator is
              usually dormant (just copies the input to the output) until a special alarm  symbol  contained  in
              some  target-language  surface  forms wakes it up to perform a particular string transformation if
              necessary; then it goes back to sleep.
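
              The dormant and wake-up behaviour described above can be pictured with a toy model in Python.
              The alarm symbol `~' and the two contraction rules are assumptions made purely for
              illustration; lt-proc itself applies whatever rules are compiled into fst_file.

```python
# Assumption: '~' stands in for the alarm symbol; this page does not name it.
ALARM = "~"

# Toy rule table (assumption): text right after the alarm -> replacement.
RULES = {
    "de el": "del",
    "a el": "al",
}


def post_generate(text):
    """Copy text unchanged until an alarm symbol, then try one transformation."""
    out = []
    i = 0
    while i < len(text):
        if text[i] != ALARM:
            out.append(text[i])  # dormant: copy input to output
            i += 1
            continue
        rest = text[i + 1:]
        # Awake: try the longest rule first, matching only whole words.
        for pattern in sorted(RULES, key=len, reverse=True):
            end = len(pattern)
            if rest.startswith(pattern) and (end == len(rest) or not rest[end].isalnum()):
                out.append(RULES[pattern])
                i += 1 + end
                break
        else:
            i += 1  # no rule applies: drop the alarm and go back to sleep
    return "".join(out)
```

              In this toy model, post_generate("voy ~a el cine") returns "voy al cine", while text without
              the alarm symbol is copied through unchanged.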

       -s, --sao
              Input is processed in the orthoepikon (previously `sao') annotation system format:
              http://orthoepikon.sf.net.

       -v, --version
              Display the version number.

       -h, --help
              Display this help.

FILES

       fst_file   The input compiled dictionary.

BUGS

       Lots of...lurking in the dark and waiting for you!

AUTHOR

       (c) 2005,2006 Universitat d'Alacant / Universidad de Alicante. All rights reserved.

                                                   2006-03-23                                         lt-proc(1)
