Provided by: apertium_3.4.2~r68466-4_amd64

NAME

       apertium - This application is part of apertium

       This tool is part of the apertium machine translation architecture: http://apertium.sf.net.

SYNOPSIS

       apertium [-d datadir] [-f format] [-u] [-a] [-m memory.tmx] [-o direction] {language-pair}
                [infile [outfile]]

       apertium -l

DESCRIPTION

       apertium is the application that most people will use, as it simplifies the use of the apertium and
       lt-toolbox tools for machine translation.

       It does so by providing a single front-end to lt-toolbox (which contains all the lexical processing
       modules and tools) and apertium (which contains the rest of the engine).

       The modules behind the apertium machine translation architecture are, in order:
              • de-formatter: Separates the text to be translated from the format information.

              • morphological analyser: Tokenizes the text into surface forms and delivers one or more
              lexical forms (analyses) for each of them.

              • part-of-speech tagger: Chooses one lexical form for each ambiguous surface form (homograph).

              • lexical transfer module: Reads each source-language lexical form and  delivers  a  corresponding
              target-language lexical form.

              •  structural  transfer module: Detects fixed-length patterns of lexical forms (chunks or phrases)
              needing special processing due to grammatical divergences between the two languages  and  performs
              the corresponding transformations.

              •  morphological  generator:  Delivers  a  target-language  surface  form for each target-language
              lexical form, by suitably inflecting it.

              • post-generator: Performs orthographical operations such as contractions and apostrophations.

              • re-formatter: Restores  the  format  information  encapsulated  by  the  de-formatter  into  the
              translated  text and removes the encapsulation sequences used to protect certain characters in the
              source text.
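
       As a rough illustration of the ordering above, the following Python sketch chains eight placeholder
       stages so that each one consumes the output of the previous one. The names STAGE_NAMES, make_stage
       and run_pipeline are hypothetical, and the stage bodies pass the text through unchanged; this is an
       assumption-laden toy, not how the apertium modules are implemented.

```python
from functools import reduce
from typing import Callable, List

Stage = Callable[[str], str]

# Stage names follow the module list above, in the same order.
STAGE_NAMES = [
    "de-formatter",
    "morphological analyser",
    "part-of-speech tagger",
    "lexical transfer module",
    "structural transfer module",
    "morphological generator",
    "post-generator",
    "re-formatter",
]


def make_stage(name: str, trace: List[str]) -> Stage:
    """Build a placeholder stage (hypothetical) that only records that it ran."""
    def stage(text: str) -> str:
        trace.append(name)  # remember the order in which stages are applied
        return text         # a real module would transform the text here
    return stage


def run_pipeline(text: str, stages: List[Stage]) -> str:
    """Feed the output of each stage into the next one."""
    return reduce(lambda acc, stage: stage(acc), stages, text)


trace: List[str] = []
stages = [make_stage(name, trace) for name in STAGE_NAMES]
result = run_pipeline("Hola", stages)
```

       After the run, trace holds the stage names in the order listed above, and result is the unchanged
       input, since the placeholder stages do no linguistic work.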

OPTIONS

       -d datadir The directory holding the linguistic data.  By default it will use the  expected  installation
       path.

       language-pair The language pair: LANG1-LANG2 (for instance es-ca or ca-es).

       -f format Specifies the format of the input and output files. It can take one of these values:
              • txt (default value) Input and output files are in text format.

              •  html  Input  and output files are in "html" format. This "html" is the one accepted by the vast
              majority of web browsers.

              • html-noent Input and  output  files  are  in  "html"  format,  but  preserving  native  encoding
              characters rather than using HTML text entities.

              •  rtf  Input  and  output  files  are in "rtf" format. The accepted "rtf" is the one generated by
              Microsoft WordPad (C) and Microsoft Office (C) up to and including Office-97.

       -u Disable marking of unknown words with the '*' character.

       -a Enable marking of disambiguated words with the '=' character.

       -m memory.tmx Use a translation memory to recycle translations.

       -o direction The translation direction to use with the translation memory. By default, the
       direction given as language-pair is used instead.

       -l List the available translation directions and exit.

       direction Typically LANG1-LANG2, but see modes.xml in the language data.

FILES

       These are the two files that can be used with this command:

       infile Input file (stdin by default).

       outfile Output file (stdout by default).
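
EXAMPLES

       The following Python sketch shows one way to assemble an apertium command line from the options
       documented above and to run it over stdin/stdout. The helper names build_argv, translate and
       unknown_words are hypothetical and not part of apertium. The sketch assumes that apertium and the
       data for the requested language pair are installed, and that unknown words appear in the output as
       '*' immediately followed by the word.

```python
import re
import subprocess
from typing import List, Optional

# Values documented for the -f option.
FORMATS = ("txt", "html", "html-noent", "rtf")


def build_argv(pair: str,
               fmt: str = "txt",
               datadir: Optional[str] = None,
               mark_unknown: bool = True,
               mark_disambiguated: bool = False,
               infile: Optional[str] = None,
               outfile: Optional[str] = None) -> List[str]:
    """Build the argument list for apertium (hypothetical helper)."""
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    if outfile is not None and infile is None:
        raise ValueError("outfile can only be given together with infile")
    argv = ["apertium"]
    if datadir is not None:
        argv += ["-d", datadir]      # directory holding the linguistic data
    argv += ["-f", fmt]
    if not mark_unknown:
        argv.append("-u")            # do not mark unknown words with '*'
    if mark_disambiguated:
        argv.append("-a")            # mark disambiguated words with '='
    argv.append(pair)                # e.g. "es-ca"
    if infile is not None:
        argv.append(infile)
        if outfile is not None:
            argv.append(outfile)
    return argv


def translate(text: str, pair: str, fmt: str = "txt") -> str:
    """Translate text through stdin/stdout; needs apertium installed."""
    proc = subprocess.run(build_argv(pair, fmt=fmt), input=text,
                          capture_output=True, text=True, check=True)
    return proc.stdout


def unknown_words(output: str) -> List[str]:
    """Collect words marked as unknown, assuming the form '*word'."""
    return re.findall(r"\*(\w+)", output)
```

       For instance, build_argv("es-ca", fmt="html", mark_unknown=False) yields the argument list for
       apertium -f html -u es-ca, reading from stdin and writing to stdout.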

SEE ALSO

       lt-proc(1), lt-comp(1), lt-expand(1), apertium-tagger(1).

BUGS

       Lots of...lurking in the dark and waiting for you!

AUTHOR

       (c) 2005,2006 Universitat d'Alacant / Universidad de Alicante. All rights reserved.

                                                   2006-03-08                                        apertium(1)