Provided by: apertium_3.4.2~r68466-4_amd64

NAME

       apertium - front-end to the apertium machine translation architecture

       This tool is part of the apertium machine translation architecture:
       http://apertium.sf.net.

SYNOPSIS

       apertium [-d datadir] [-f format] [-u] [-a] {language-pair} [infile [outfile]]

DESCRIPTION

       apertium is the application that most people will be using as it  simplifies  the  use  of
       apertium/lt-toolbox tools for machine translation purposes.

       This tool eases the use of lt-toolbox (which contains all the lexical processing
       modules and tools) and apertium (which contains the rest of the engine) by providing a
       single front-end to the end-user.

       The modules behind the apertium machine translation architecture are, in order:
              • de-formatter: Separates the text to be translated from the format information.

              • morphological analyser: Tokenizes the text into surface forms.

              • part-of-speech tagger: Chooses one of the lexical forms corresponding to an
              ambiguous surface form (homograph).

              •  lexical  transfer module: Reads each source-language lexical form and delivers a
              corresponding target-language lexical form.

              • structural transfer  module:  Detects  fixed-length  patterns  of  lexical  forms
              (chunks  or  phrases)  needing  special  processing  due to grammatical divergences
              between the two languages and performs the corresponding transformations.

              • morphological generator: Delivers a target-language surface form for each target-
              language lexical form, by suitably inflecting it.

              •  post-generator:  Performs  orthographical  operations  such  as contractions and
              apostrophations.

              • re-formatter: Restores the format information encapsulated  by  the  de-formatter
              into  the  translated  text and removes the encapsulation sequences used to protect
              certain characters in the source text.
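
       The module order above can be pictured as a shell pipeline. The following is only a
       dry-run sketch: it prints a plausible pipeline instead of executing it. lt-proc and
       apertium-tagger are listed under SEE ALSO; the programs apertium-destxt,
       apertium-transfer and apertium-retxt, the option letters, and every data file name are
       assumptions made for illustration. The real command line for an installed pair is
       defined by its modes.xml and usually differs in detail.

```shell
#!/bin/sh
# Dry run: builds and prints a stage-by-stage pipeline, nothing is executed.
# PAIR and all data file names below are hypothetical placeholders.
PAIR="es-ca"

pipeline="apertium-destxt"                                  # de-formatter (txt format)
pipeline="$pipeline | lt-proc $PAIR.automorf.bin"           # morphological analyser
pipeline="$pipeline | apertium-tagger -g $PAIR.prob"        # part-of-speech tagger
# Lexical and structural transfer, assumed here to be handled by one program
pipeline="$pipeline | apertium-transfer $PAIR.t1x $PAIR.t1x.bin $PAIR.autobil.bin"
pipeline="$pipeline | lt-proc -g $PAIR.autogen.bin"         # morphological generator
pipeline="$pipeline | lt-proc -p $PAIR.autopgen.bin"        # post-generator
pipeline="$pipeline | apertium-retxt"                       # re-formatter (txt format)

printf '%s\n' "$pipeline"
```

       The apertium front-end exists so that end-users do not have to assemble such a chain
       by hand.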

OPTIONS

       -d datadir The directory holding the linguistic data.  By default it will use the expected
       installation path.

       language-pair The language pair: LANG1-LANG2 (for instance es-ca or ca-es).

       -f format Specifies the format of the input and output files, which can take one of
       these values:
              • txt (default value) Input and output files are in text format.

              • html Input and output files are in "html" format. This "html" is the one accepted
              by the vast majority of web browsers.

              • html-noent Input and output files are in "html"  format,  but  preserving  native
              encoding characters rather than using HTML text entities.

              •  rtf  Input  and  output files are in "rtf" format. The accepted "rtf" is the one
              generated by Microsoft WordPad (C) and Microsoft Office (C)  up  to  and  including
              Office-97.

       -u Disable marking of unknown words with the '*' character.

       -a Enable marking of disambiguated words with the '=' character.
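
       A few command lines that combine the options above, following the order given in the
       SYNOPSIS. This is a dry-run sketch: the commands are stored in variables and printed,
       not executed, so it does not need a language pair to be installed. The file names
       page.html and page.ca.html and the directory ./data are hypothetical.

```shell
#!/bin/sh
# Dry run: the command lines are printed, not executed.
# page.html, page.ca.html and ./data are hypothetical names.
PAIR="es-ca"

plain="apertium $PAIR"                                    # txt format, stdin to stdout
html="apertium -f html -u $PAIR page.html page.ca.html"   # HTML in and out, no '*' on unknown words
custom="apertium -d ./data -f rtf -a $PAIR"               # data read from ./data, RTF, '=' marks enabled

printf '%s\n' "$plain" "$html" "$custom"
```

       To run one of them for real, paste the printed line into a shell on a system where
       the es-ca data is available.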

FILES

       -m memory.tmx Use a translation memory to recycle translations.

       -o direction The translation direction to use with the translation memory; by default,
       the 'direction' argument is used instead.

       -l Lists the available translation directions and exits.

       direction Typically LANG1-LANG2, but see modes.xml in the language data.

       These are the two files that can be used with this command:

       infile Input file (stdin by default).

       outfile Output file (stdout by default).
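
       The sketch below shows how the listing option, the translation memory option and the
       optional file arguments might be combined. It is a dry run: show is a hypothetical
       helper that only prints its arguments, and memory.tmx, in.txt and out.txt are
       hypothetical file names.

```shell
#!/bin/sh
# Dry run: each command is printed, not executed.
# show is a helper defined for this example only; it is not part of apertium.
show() { printf '%s\n' "$*"; }

listing=$(show apertium -l)                                   # list translation directions, then exit
with_tm=$(show apertium -m memory.tmx es-ca in.txt out.txt)   # recycle translations from a TMX memory
to_stdout=$(show apertium es-ca in.txt)                       # outfile omitted: result goes to stdout

printf '%s\n' "$listing" "$with_tm" "$to_stdout"
```

       Omitting both infile and outfile makes the command read stdin and write stdout, which
       is what allows it to be used as a filter in a pipe.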

SEE ALSO

       lt-proc(1), lt-comp(1), lt-expand(1), apertium-tagger(1).

BUGS

       Lots of...lurking in the dark and waiting for you!

AUTHOR

       (c) 2005,2006 Universitat d'Alacant / Universidad de Alicante. All rights reserved.

                                            2006-03-08                                apertium(1)