Provided by: apertium_3.4.0~r61013-5_amd64

NAME

       apertium-tagger - part-of-speech tagger and tagger trainer for apertium

       This tool is part of the apertium open-source machine translation architecture: http://www.apertium.org.

SYNOPSIS

       apertium-tagger --train|-t {n} DIC CRP TSX PROB [--debug|-d]

       apertium-tagger --supervised|-s {n} DIC CRP TSX PROB HTAG UNTAG [--debug|-d]

       apertium-tagger --retrain|-r {n} CRP PROB [--debug|-d]

       apertium-tagger --tagger|-g [--first|-f] PROB [--debug|-d] [INPUT [OUTPUT]]

DESCRIPTION

       apertium-tagger either trains the apertium part-of-speech tagger or tags text with it, depending on the
       options it is called with. It reads from the standard input only when the option --tagger (-g) is used.

OPTIONS

       -t {n}, --train {n}
              Initializes  parameters  through Kupiec's method (unsupervised), then performs n iterations of the
              Baum-Welch training algorithm (unsupervised).

       -s {n}, --supervised {n}
              Initializes parameters against a hand-tagged text  (supervised)  through  the  maximum  likelihood
              estimate method, then performs n iterations of the Baum-Welch training algorithm (unsupervised).

       -r {n}, --retrain {n}
              Retrains the model with n additional Baum-Welch iterations (unsupervised).

       -g, --tagger
              Tags input text by means of the Viterbi algorithm.

       -p, --show-superficial
              Prints the superficial form of the word alongside the lexical form in the output stream.

       -f, --first
              Used in conjunction with -g (--tagger). Makes the tagger output all lexical forms of each word,
              with the chosen one in first place (after the lemma).

       -d, --debug
              Print error (if any) or debug messages while operating.

       -m, --mark
              Mark disambiguated words.

       -h, --help
              Display a help message.
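
       As a rough illustration of the training options above, the following sketch assembles command lines
       that follow the SYNOPSIS forms. The file names (en.dic, en.crp, en.tsx, en.prob) and the iteration
       counts are hypothetical placeholders, not files shipped with apertium. The script only prints each
       command line so that it can be inspected before being run by hand.

```shell
#!/bin/sh
# Hypothetical placeholder file names; substitute your own data.
DIC=en.dic      # expanded dictionary
CRP=en.crp      # training corpus
TSX=en.tsx      # tagger specification
PROB=en.prob    # tagger data file written by training

# Unsupervised training: Kupiec initialization, then $1 Baum-Welch iterations.
train_cmd() {
    echo "apertium-tagger --train $1 $DIC $CRP $TSX $PROB"
}

# $1 further Baum-Welch iterations on an existing PROB file.
retrain_cmd() {
    echo "apertium-tagger --retrain $1 $CRP $PROB"
}

train_cmd 8
retrain_cmd 2
```

       Note that --retrain takes only CRP and PROB, since it refines a model that already exists.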

FILES

       These are the kinds of files used with each option:

       DIC    Fully expanded dictionary file

       CRP    Training text corpus file

       TSX    Tagger specification file, in XML format

       PROB   Tagger data file, built during training and used while tagging

       HTAG   Hand-tagged text corpus

       UNTAG  Untagged text corpus: the morphological analysis of the HTAG corpus, used jointly with HTAG
              by the -s option

       INPUT  Input file, stdin by default

       OUTPUT Output file, stdout by default
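
       The sketch below shows where these files might appear on the command line for supervised training
       and for tagging, following the SYNOPSIS forms. Every file name in it (en.dic, en.tagged,
       analysed.txt and so on) is a hypothetical placeholder, and the iteration count is arbitrary. The
       script only prints the command lines rather than running them.

```shell
#!/bin/sh
# All file names below are hypothetical placeholders.
PROB=en.prob

# Supervised training: HTAG (hand-tagged corpus) and UNTAG (its untagged
# morphological analysis) follow PROB on the command line.
supervised_cmd() {
    echo "apertium-tagger --supervised $1 en.dic en.crp en.tsx $PROB en.tagged en.untagged"
}

# Tagging: "$@" holds zero, one (INPUT) or two (INPUT OUTPUT) file names.
# With none given, the tagger reads stdin and writes stdout.
tag_cmd() {
    echo apertium-tagger --tagger "$PROB" "$@"
}

supervised_cmd 1
tag_cmd
tag_cmd analysed.txt tagged.txt
```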

SEE ALSO

       lt-proc(1), lt-comp(1), lt-expand(1), apertium(1).

BUGS

       Lots of...lurking in the dark and waiting for you!

AUTHOR

       Copyright (c) 2005, 2006 Universitat d'Alacant / Universidad de Alicante. This is free software. You
       may redistribute copies of it under the terms of the GNU General Public License
       <http://www.gnu.org/licenses/gpl.html>.

                                                   2006-08-30                                 apertium-tagger(1)