Provided by: lttoolbox-dev_3.5.4-1build1_amd64
NAME
lt-trim — compiled dictionary trimmer for Apertium
SYNOPSIS
lt-trim analyser_binary bidix_binary trimmed_analyser_binary
DESCRIPTION
lt-trim trims compiled dictionaries. The analyses of analyser_binary (the right side when compiled with lr) are trimmed to the input side of bidix_binary (the left side when compiled with lr, the right side when compiled with rl), so that only analyses which would pass through ‘lt-proc(1) -b bidix_binary’ are kept.

Warning: this program is experimental! It has been tested, but not yet deployed extensively.

Compound tags (“<compound-only-L>”, “<compound-R>”), join elements (“<j/>” in XML, “+” in the stream) and the group element (“<g/>” in XML, “#” in the stream) should all be handled correctly, including combinations of + followed by # in the monodix.

Some minor caveats:

If you have the capitalised lemma “Foo” in the monodix but “foo” in the bidix, the analysis “^Foo<tag>$” would pass through the bidix with lt-proc(1) -b, but it will not survive trimming. Make sure your lemmas have the same capitalisation in both dictionaries.

You should not have a literal ‘+’ or ‘#’ in your lemmas. Since lt-comp(1) doesn't escape these, lt-trim cannot tell them apart from “<j/>” or “<g/>”, and you may get @-marked output as a result. You can still analyse ‘+’ or ‘#’ by putting the literal symbol in the “<l>” part and some other string (e.g., “plus”) in the “<r>” part.

You should not trim a generator unless you have a very simple translator pipeline, since the output of the bidix seldom goes through transfer unchanged.
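The caveat about literal ‘+’ and ‘#’ can be checked roughly before compiling. The following is a minimal sketch, not part of lttoolbox: find_literal_join_chars is a hypothetical helper, and it assumes the dictionary source keeps one “<e>” entry per line. It ignores the “<l>” part (where literals are acceptable) and all tag attributes, then reports lines whose remaining text contains ‘+’ or ‘#’.

```shell
# Hypothetical helper, not shipped with lttoolbox.
# Assumption: one <e> entry per line in the .dix source.
# Prints "line-number: line" for entries whose text outside <l>...</l>
# contains a literal + or #.
find_literal_join_chars() {
    awk '{
        text = $0
        gsub(/<l>.*<\/l>/, "", text)   # drop the <l> part, where literals are fine
        gsub(/<[^>]*>/, "", text)      # drop remaining tags and their attributes
        if (index(text, "+") > 0 || index(text, "#") > 0)
            print NR ": " $0
    }' "$1"
}

# Usage (the file name is a placeholder):
#   find_literal_join_chars apertium-foo.foo.dix
```

Hits are only candidates for review: text outside entries, such as an alphabet definition or a comment, may legitimately contain these characters.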
FILES
analyser_binary            The untrimmed analyser dictionary (a finite state transducer).
bidix_binary               The dictionary to use as trimmer (a finite state transducer).
trimmed_analyser_binary    The trimmed analyser dictionary (a finite state transducer).
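How the three files fit together can be sketched as a small build script. This is an illustration under assumptions, not a prescribed workflow: the function names, the DRY_RUN switch and the output file names are hypothetical, and both dictionaries are compiled with lr on the assumption that the analyser's language is on the left side of the bidix.

```shell
# Hypothetical wrapper: with DRY_RUN=1 it prints each command instead of running it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# build_trimmed_analyser MONODIX BIDIX OUTDIR
# Compiles both dictionaries with lt-comp(1), then trims the analyser
# against the bidix. Stops at the first failing step.
build_trimmed_analyser() {
    monodix=$1
    bidix=$2
    outdir=$3
    run lt-comp lr "$monodix" "$outdir/analyser.bin" &&
    run lt-comp lr "$bidix" "$outdir/bidix.bin" &&
    run lt-trim "$outdir/analyser.bin" "$outdir/bidix.bin" "$outdir/trimmed.bin"
}

# Usage (paths are placeholders):
#   build_trimmed_analyser mono.dix bidix.dix build
```

If the analyser's language is on the right side of the bidix, the second lt-comp call would use rl instead, following the DESCRIPTION above.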
SEE ALSO
apertium(1), apertium-tagger(1), lt-comp(1), lt-expand(1), lt-print(1), lt-proc(1)
AUTHOR
Copyright © 2005, 2006 Universitat d'Alacant / Universidad de Alicante. This is free software. You may redistribute copies of it under the terms of the GNU General Public License: https://www.gnu.org/licenses/gpl.html.
BUGS
Many... lurking in the dark and waiting for you!