Ubuntu Manpage: apertium — machine translation application platform

Provided by: apertium_3.7.2-2build2_amd64

NAME

       apertium — machine translation application platform

SYNOPSIS

       apertium [-aHu] [-d datadir] [-f format] [-m memory.tmx] [-o direction] language-pair [infile [outfile]]
       apertium -l

DESCRIPTION

       apertium is the program most users will run directly, because it simplifies the use of the apertium and
       lt-toolbox tools for machine translation.

       It provides a single front-end to lt-toolbox (which contains all the lexical processing modules and
       tools) and apertium (which contains the rest of the engine).

       The modules behind the apertium machine translation architecture are, in order:

       de-formatter
               Separates the text to be translated from the format information.

       morphological-analyser
               Tokenizes the text into surface forms.

       part-of-speech tagger
               Chooses one lexical form for each ambiguous surface form (homograph).

       lexical transfer module
               Reads each source-language lexical form and  delivers  a  corresponding  target-language  lexical
               form.

       structural transfer module
               Detects fixed-length patterns of lexical forms (chunks or phrases) needing special processing due
               to   grammatical   divergences   between   the  two  languages  and  performs  the  corresponding
               transformations.

       morphological generator
               Delivers a target-language surface form  for  each  target-language  lexical  form,  by  suitably
               inflecting it.

       post-generator
               Performs orthographical operations such as contractions and apostrophations.

       re-formatter
               Restores  the  format  information  encapsulated by the de-formatter into the translated text and
               removes the encapsulation sequences used to protect certain characters in the source text.
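
       The module order above can be pictured as a Unix pipeline. The following is a minimal sketch that only
       prints the stages in order. The stage names come from the list above; the command names paired with them
       (apertium-destxt, lt-proc, apertium-tagger, apertium-transfer, apertium-retxt) are assumptions about a
       typical installation. The exact programs, arguments and data files for a given pair are defined by its
       language data, and list_stages is a hypothetical helper, not part of apertium.

```shell
#!/bin/sh
# Print the pipeline stages in order, one "stage: command" pair per line.
# Stage names follow the module list; the command names are assumptions
# about a typical installation and may differ between language pairs.
list_stages() {
    cat <<'EOF'
de-formatter: apertium-destxt
morphological analyser: lt-proc
part-of-speech tagger: apertium-tagger
lexical transfer module: lt-proc -b
structural transfer module: apertium-transfer
morphological generator: lt-proc -g
post-generator: lt-proc -p
re-formatter: apertium-retxt
EOF
}

list_stages
```

       Each stage reads the output of the previous one, which is why the front-end can hide the whole chain
       behind a single command.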

OPTIONS

       -d datadir
               The directory holding the linguistic data.  By default it  will  use  the  expected  installation
               path.

       language-pair
               The language pair: LANG1-LANG2 (for instance “es-ca” or “ca-es”).

       -f format
               Specifies the format of the input and output files.  Accepted values:

               txt     (default value) Input and output files are in text format.

               html    Input and output files are in “html” format.  This “html” is the one accepted by the vast
                       majority of web browsers.

               html-noent
                        Input and output files are in “html” format, but characters in the native encoding are
                        preserved rather than replaced with HTML text entities.

               rtf     Input and output files are in “rtf” format.  The accepted “rtf” is the one  generated  by
                       Microsoft WordPad and Microsoft Office up to and including Office 97.

       -u      Disable marking of unknown words with the ‘*’ character.

       -H      Enable  header-detection  (only used in some language pairs; will lead to stray ‘❡’ characters in
               pairs that don't support it).

       -a      Enable marking of disambiguated words with the ‘=’ character.
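
       As an illustration of how these options combine, here is a minimal sketch. The run helper and the
       DRY_RUN variable are hypothetical conveniences, not part of apertium: with DRY_RUN=1 (the default here)
       each command is only printed, so the script can be read and tried without any language data installed.
       The file names and the data directory are placeholders, and the es-ca and ca-es pairs are assumed to be
       available.

```shell
#!/bin/sh
# DRY_RUN=1 (default) prints each command; DRY_RUN=0 actually executes it.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Translate a plain-text file from Spanish to Catalan (txt is the default format).
run apertium es-ca input.txt output.txt

# Translate an HTML file and do not mark unknown words with '*'.
run apertium -u -f html es-ca page.html page.ca.html

# Read linguistic data from a non-default directory (hypothetical path).
run apertium -d /opt/apertium-data ca-es input.txt output.txt
```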

FILES

       The following additional options and file arguments can be used with this command:

       -m memory.tmx
               Use a translation memory to recycle translations.

       -o direction
               Translation direction to use with the translation memory; by default, the direction given on
               the command line is used.

       -l      List the available translation directions and exit.

       direction
               Typically LANG1-LANG2, but see modes.xml in the language data.

       infile  Input file (stdin by default).

       outfile
               Output file (stdout by default).
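
       The file arguments and the translation-memory option can be sketched as follows. As before, run and
       DRY_RUN are hypothetical helpers that only print each command by default; memory.tmx and the other file
       names are placeholders, and the es-ca pair is assumed to be installed.

```shell
#!/bin/sh
# DRY_RUN=1 (default) prints each command; DRY_RUN=0 actually executes it.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# List the translation directions available in the installed language data.
run apertium -l

# Recycle translations from a translation memory while translating a file.
run apertium -m memory.tmx es-ca input.txt output.txt

# With no infile and outfile, text is read from stdin and written to stdout.
echo 'Hola' | run apertium es-ca
```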

COPYRIGHT

       Copyright  © 2005, 2006 Universitat d'Alacant / Universidad de Alicante.  This is free software.  You may
       redistribute   copies   of   it   under   the   terms   of    the    GNU    General    Public    License:
       https://www.gnu.org/licenses/gpl.html.

BUGS

       Many... lurking in the dark and waiting for you!

Apertium                                          March 8, 2006                                      APERTIUM(1)
