Provided by: apertium_3.6.1-1build2_amd64
NAME
apertium-deshtml — HTML format processor for Apertium
SYNOPSIS
apertium-deshtml [-hino] [input_file [output_file]]
DESCRIPTION
This tool is part of the Apertium open-source machine translation toolbox: https://www.apertium.org/. apertium-deshtml is an HTML format processor: it takes an HTML document as input and produces output suitable for processing with lt-proc(1), so data should be passed through it before being piped to lt-proc(1). HTML tags and other format information are enclosed in square brackets so that lt-proc(1) treats them as whitespace between words.
OPTIONS
-h, --help
    Display this help.
-i
    Makes the addition of trailing sentence terminator (‘.’) unconditional, often leading to duplicates.
-n
    Suppresses the addition of a trailing sentence terminator.
-o
    Inserts a "❡" (U+2761 CURVED STEM PARAGRAPH SIGN ORNAMENT) at the end of <h[1–6]> and <title> tags.
EXAMPLES
You could write the following to show how the word “gener” is analysed:
    echo "<b>gener</b>" | apertium-deshtml | lt-proc ca-es.automorf.bin
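To get a rough feel for the bracket enclosure described in DESCRIPTION without installing Apertium, the idea can be sketched with sed. This is only an illustration: enclose_tags is a hypothetical helper, not part of Apertium, and the sed substitution is a simplified stand-in, not the way apertium-deshtml is implemented. The real tool does more than this, for example adding the trailing sentence terminator described under OPTIONS, so its actual output will differ.

```shell
# Hypothetical helper, NOT part of Apertium: wraps every <...> tag read from
# standard input in square brackets. A simplified stand-in for the idea that
# format information is enclosed in brackets; real output will differ.
enclose_tags() {
    # '&' in the replacement stands for the whole matched tag.
    sed -E 's/<[^>]*>/[&]/g'
}

# The tags end up bracketed and the word between them is left untouched.
printf '%s\n' '<b>gener</b>' | enclose_tags
# prints: [<b>]gener[</b>]
```

Text without tags passes through the helper unchanged, which mirrors the intent that only format information is hidden from lt-proc(1).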
SEE ALSO
apertium(1), apertium-desrtf(1), apertium-destxt(1), lt-proc(1)
COPYRIGHT
Copyright © 2005, 2006 Universitat d'Alacant / Universidad de Alicante. This is free software. You may redistribute copies of it under the terms of the GNU General Public License: https://www.gnu.org/licenses/gpl.html.
BUGS
Many... lurking in the dark and waiting for you!