Provided by: html-xml-utils_7.7-1.1build2_amd64

NAME

       hxunent - replace HTML predefined character entities by UTF-8

SYNOPSIS

       hxunent [ -b ] [ -f ] [ file ]

DESCRIPTION

       The  hxunent  command  reads the file (or standard input) and copies it to standard output
       with &-entities replaced by their equivalent characters (encoded as UTF-8). E.g., &quot;
       is replaced by " and &lt; is replaced by <.
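
       The replacement described above can be approximated in a few lines of Python.  This is
       only a sketch of the idea, not the program's actual implementation: the function name
       unent is made up for illustration, and the entity table is assumed to be Python's
       html.entities.html5 mapping, which may differ from the set hxunent knows.

```python
import re
from html.entities import html5  # maps names such as "eacute;" to characters

# A named entity reference: "&", a name, and a terminating ";".
_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")

def unent(text):
    """Replace known named entities by their characters (hypothetical helper)."""
    def repl(match):
        # Unknown names are copied unchanged, as in the default mode.
        return html5.get(match.group(1) + ";", match.group(0))
    return _ENTITY.sub(repl, text)

print(unent("&quot;caf&eacute;&quot; &lt; &unknown;"))  # prints: "café" < &unknown;
```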

OPTIONS

       The following options are supported:

       -b        The  five  builtin  entities  of  XML  (&lt;  &gt;  &quot; &apos; &amp;) are not
                 replaced but copied unchanged. This is necessary if the output has to  be  valid
                 XML or SGML.

       -f        This  option  changes  how  unknown  entities  or  lone  ampersands are handled.
                 Normally they are copied unchanged, but this  option  tries  to  "fix"  them  by
                 replacing  ampersands  by  &amp;.  Often such stray ampersands are the result of
                 copy and paste of URLs into a document and then this option  indeed  fixes  them
                 and makes the document valid.
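
       The effect of the two options can be illustrated with a small Python model.  It is a
       sketch under assumptions, not hxunent's own code: the function unent and its parameters
       keep_builtin (modelling -b) and fix (modelling -f) are hypothetical names, the entity
       table is Python's html.entities.html5, and numeric character references are simply
       copied unchanged because this page does not say how hxunent treats them.

```python
import re
from html.entities import html5  # maps names such as "copy;" to characters

# The five builtin XML entities that -b leaves alone.
BUILTIN = {"lt", "gt", "quot", "apos", "amp"}

# Every ampersand, optionally followed by a name or numeric reference and ";".
TOKEN = re.compile(r"&(?:(#[0-9]+|#[xX][0-9A-Fa-f]+|[A-Za-z][A-Za-z0-9]*);)?")

def unent(text, keep_builtin=False, fix=False):
    """Model of the default, -b (keep_builtin) and -f (fix) behaviour."""
    def repl(match):
        name = match.group(1)
        if name is not None and name.startswith("#"):
            return match.group(0)          # assumption: numeric refs untouched
        if name is not None and name + ";" in html5:
            if keep_builtin and name in BUILTIN:
                return match.group(0)      # -b: copy builtin entities unchanged
            return html5[name + ";"]       # known entity: emit the character
        # Unknown entity or lone ampersand: copy, or escape the "&" with -f.
        return "&amp;" + match.group(0)[1:] if fix else match.group(0)
    return TOKEN.sub(repl, text)

print(unent("a &lt; b &copy; x", keep_builtin=True))  # prints: a &lt; b © x
print(unent("?a=1&b=2 &foo; &eacute;", fix=True))     # prints: ?a=1&amp;b=2 &amp;foo; é
```

       In this model, combining both switches keeps &amp; and the other builtin entities
       intact while still escaping stray ampersands.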

DIAGNOSTICS

       The program's exit value is 0 if all went well, otherwise:

       1         The input couldn't be read (file not found, file not readable...)

       2         Wrong command line arguments.
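
       A calling script can branch on these exit values.  The following Python sketch assumes
       hxunent is installed and on the PATH; the helper names describe_exit and run_hxunent
       are made up for illustration, and the wording of the messages is not taken from the
       program itself.

```python
import subprocess

def describe_exit(code):
    """Translate an exit value into a short message (hypothetical helper)."""
    if code == 0:
        return "ok"
    if code == 1:
        return "input could not be read"
    if code == 2:
        return "wrong command line arguments"
    return "unexpected exit status %d" % code

def run_hxunent(path, keep_builtin=False):
    """Run hxunent on a file; return (decoded output, status message)."""
    args = ["hxunent"]
    if keep_builtin:
        args.append("-b")  # leave the five builtin XML entities unchanged
    args.append(path)
    result = subprocess.run(args, capture_output=True)
    # The output is documented as UTF-8, so decode it accordingly.
    return result.stdout.decode("utf-8"), describe_exit(result.returncode)
```

       For example, run_hxunent("page.html") would return the converted text together with
       "ok" on success, where page.html is a placeholder file name.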

SEE ALSO

       asc2xml(1), xml2asc(1), UTF-8 (RFC 3629, which obsoletes RFC 2279)

BUGS

       The  program  assumes entities are as defined by HTML. It doesn't read a document's DTD to
       find the definitions actually in use in that document. With -f, it will even disable all
       entities that are not HTML entities, because their ampersands are replaced by &amp;.