
Provided by: lttoolbox-dev_3.7.1-1build2_amd64

NAME

     lt-trim — compiled dictionary trimmer for Apertium

SYNOPSIS

     lt-trim analyser_binary bidix_binary trimmed_analyser_binary

DESCRIPTION

     lt-trim is the application responsible for trimming compiled dictionaries.  The analyses
     (right-side when compiling lr) of analyser_binary are trimmed to the input side of
     bidix_binary (left-side when compiling lr, right-side when compiling rl), such that only
     analyses which would pass through ‘lt-proc(1) -b bidix_binary’ are kept.
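
     The workflow described above can be sketched as follows.  This is only an illustration:
     the helper trim_pair and all file names are hypothetical, and it assumes both
     dictionaries are compiled with lt-comp(1) in the lr direction, so that the input side of
     the bidix is its left side.

```shell
# Hypothetical file names; substitute those of your own language pair.
MONODIX=apertium-xxx.xxx.dix
BIDIX=apertium-xxx-yyy.xxx-yyy.dix
PREFIX=xxx-yyy

# trim_pair MONODIX BIDIX PREFIX
# Compiles both dictionaries left-to-right, then trims the analyser
# against the bidix. Stops at the first step that fails.
trim_pair() {
    lt-comp lr "$1" "$3.automorf-untrimmed.bin" &&
    lt-comp lr "$2" "$3.autobil.bin" &&
    lt-trim "$3.automorf-untrimmed.bin" "$3.autobil.bin" "$3.automorf.bin"
}

# Run only when the tools and the source dictionaries are actually present.
if command -v lt-comp >/dev/null 2>&1 &&
   command -v lt-trim >/dev/null 2>&1 &&
   [ -f "$MONODIX" ] && [ -f "$BIDIX" ]; then
    trim_pair "$MONODIX" "$BIDIX" "$PREFIX"
fi
```

     If the pair translated in the other direction, the bidix would be compiled with rl
     instead, so that its right side becomes the input side.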

     Compound tags (“<compound-only-L>”, “<compound-R>”), join elements (“<j/>” in XML, “+” in
     the stream) and the group element (“<g/>” in XML, “#” in the stream) should all be handled
     correctly, including combinations of + followed by # in the monodix.

     Some minor caveats: If you have the capitalised lemma “Foo” in the monodix, but “foo” in the
     bidix, an analysis “^Foo<tag>$” would pass through bidix when doing lt-proc(1) -b, but will
     not make it through trimming.  Make sure your lemmas have the same capitalisation in the
     different dictionaries.  Also, you should not have literal ‘+’ or ‘#’ in your lemmas.  Since
     lt-comp(1) doesn't escape these, lt-trim cannot know that they are different from “<j/>” or
     “<g/>”, and you may get @-marked output this way.  You can analyse ‘+’ or ‘#’ by having the
     literal symbol in the “<l>” part and some other string (e.g., “plus”) in the “<r>”.

     You should not trim a generator unless you have a very simple translator pipeline, since the
     output of bidix seldom goes unchanged through transfer.

OPTIONS

     -s, --match-section
             A section with this name (id@type) in the analyser will only be trimmed against a
             section with the same id in the bidix. (The default is to trim all sections of the
             analyser against all sections of the bidix.) Using this option can sometimes speed
             up trimming considerably. For example, if you have some complicated regular
             expressions, try putting them in a

               <section id="regex" type="standard">

             in both .dix files and passing “regex@standard” to --match-section.

             This argument may be used multiple times to specify multiple sections that must
             match by name.
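
             A minimal sketch of such an invocation follows.  The helper trim_by_section
             and the file names are hypothetical, and placing the option before the
             positional arguments is an assumption based on the usual command-line
             convention.

```shell
# Hypothetical file names; substitute your own compiled dictionaries.
ANALYSER=xxx-yyy.automorf-untrimmed.bin
BIDIX_BIN=xxx-yyy.autobil.bin
TRIMMED=xxx-yyy.automorf.bin

# trim_by_section ANALYSER BIDIX TRIMMED
# Trims the analyser, matching the "regex" section of the analyser
# only against the section with the same id in the bidix.
trim_by_section() {
    lt-trim --match-section 'regex@standard' "$1" "$2" "$3"
}

# Run only when lt-trim and both compiled dictionaries are present.
if command -v lt-trim >/dev/null 2>&1 &&
   [ -f "$ANALYSER" ] && [ -f "$BIDIX_BIN" ]; then
    trim_by_section "$ANALYSER" "$BIDIX_BIN" "$TRIMMED"
fi
```

             To match further sections by name, repeat the option once per section, for
             example by adding a second --match-section with another id@type value.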

FILES

     analyser_binary
             The untrimmed analyser dictionary (a finite state transducer).

     bidix_binary
             The dictionary to use as trimmer (a finite state transducer).

     trimmed_analyser_binary
             The trimmed analyser dictionary (a finite state transducer).

SEE ALSO

     apertium(1), apertium-tagger(1), lt-comp(1), lt-expand(1), lt-print(1), lt-proc(1)

AUTHOR

     Copyright © 2005, 2006 Universitat d'Alacant / Universidad de Alicante.  This is free
     software.  You may redistribute copies of it under the terms of the GNU General Public
     License: https://www.gnu.org/licenses/gpl.html.

BUGS

     Many... lurking in the dark and waiting for you!