Provided by: html-xml-utils_7.7-1_amd64

NAME

       hxunent - replace HTML predefined character entities by UTF-8

SYNOPSIS

       hxunent [ -b ] [ -f ] [ file ]

DESCRIPTION

       The hxunent command reads the file (or standard input) and copies it to standard output with &-entities
       replaced by their equivalent character (encoded as UTF-8). E.g., &quot; is replaced by " and &lt; is
       replaced by <.
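
       The replacement described above can be illustrated with a minimal Python sketch. It is not the
       program's actual implementation: the function name unent and the use of Python's html5 entity
       table are assumptions made for illustration, and only named references that end in a semicolon
       are handled.

```python
import re
from html.entities import html5  # maps names such as "eacute;" to their characters

# A named character reference: an ampersand, a name, and a closing semicolon.
ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def unent(text: str) -> str:
    """Replace known named entities; copy everything else unchanged (hypothetical helper)."""
    def repl(match):
        char = html5.get(match.group(1) + ";")
        # Unknown names are left exactly as they were found.
        return char if char is not None else match.group(0)

    return ENTITY.sub(repl, text)


result = unent("&quot;caf&eacute;&quot; &lt; &unknown; &")
# Encoding the result shows the UTF-8 bytes that would be written out.
utf8_bytes = unent("&eacute;").encode("utf-8")
```

       In this sketch the unknown reference &unknown; and the lone ampersand pass through unchanged,
       which corresponds to the default behaviour without -f.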

OPTIONS

       The following options are supported:

       -b        The  five  builtin  entities of XML (&lt; &gt; &quot; &apos; &amp;) are not replaced but copied
                 unchanged. This is necessary if the output has to be valid XML or SGML.

       -f        This option changes how unknown entities and lone ampersands are handled. Normally they are
                 copied unchanged, but this option tries to "fix" them by replacing ampersands by &amp;. Such
                 stray ampersands often result from pasting URLs into a document; in that case this option
                 fixes them and makes the document valid.
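
       The effect of the two options can be sketched in Python as follows. This is an illustration under
       stated assumptions, not the program's source: the names unent, keep_builtin and fix are
       hypothetical, the entity table is Python's html5 table, and -f is modelled as described above
       (the ampersand is escaped as &amp;).

```python
import re
from html.entities import html5

# The five built-in XML entities that -b leaves untouched.
BUILTIN = {"lt", "gt", "quot", "apos", "amp"}

# Either a complete named reference (&name;) or a lone ampersand.
TOKEN = re.compile(r"&([A-Za-z][A-Za-z0-9]*);|&")


def unent(text: str, keep_builtin: bool = False, fix: bool = False) -> str:
    """Hypothetical model of the -b (keep_builtin) and -f (fix) options."""
    def repl(match):
        name = match.group(1)
        if name is not None:
            if keep_builtin and name in BUILTIN:
                return match.group(0)      # -b: copy built-in entities unchanged
            char = html5.get(name + ";")
            if char is not None:
                return char                # known entity: replace by its character
        # Unknown entity or lone ampersand.
        if fix:
            return "&amp;" + match.group(0)[1:]  # -f: escape the ampersand
        return match.group(0)              # default: copy unchanged

    return TOKEN.sub(repl, text)


default_out = unent("a &lt; b &amp; c")
builtin_out = unent("a &lt; b &amp; c &eacute;", keep_builtin=True)
fixed_out = unent("x?a=1&b=2 &foo;", fix=True)
```

       With keep_builtin set, only &eacute; is replaced in the second call; with fix set, the stray
       ampersand in the URL and the unknown reference &foo; both have their ampersand escaped.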

DIAGNOSTICS

       The program's exit value is 0 if all went well, otherwise:

       1         The input couldn't be read (file not found, file not readable...)

       2         Wrong command line arguments.
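
       A calling script might map these exit values to messages. The following Python sketch assumes
       that hxunent is installed and on the PATH; the helper names describe_exit and run_hxunent are
       hypothetical.

```python
import subprocess

# Exit values as documented above.
EXIT_MEANINGS = {
    0: "success",
    1: "input could not be read",
    2: "wrong command line arguments",
}


def describe_exit(code: int) -> str:
    """Return a short description of an exit value (hypothetical helper)."""
    return EXIT_MEANINGS.get(code, f"unexpected exit status {code}")


def run_hxunent(path: str, keep_builtin: bool = False, fix: bool = False) -> str:
    """Run hxunent on a file and return its output, assuming the tool is installed."""
    cmd = ["hxunent"]
    if keep_builtin:
        cmd.append("-b")
    if fix:
        cmd.append("-f")
    cmd.append(path)
    proc = subprocess.run(cmd, capture_output=True)
    if proc.returncode != 0:
        raise RuntimeError(describe_exit(proc.returncode))
    # The program writes UTF-8, so decode the captured bytes accordingly.
    return proc.stdout.decode("utf-8")
```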

SEE ALSO

       asc2xml(1), xml2asc(1), UTF-8 (RFC 3629, which obsoletes RFC 2279)

BUGS

       The  program assumes entities are as defined by HTML. It doesn't read a document's DTD to find the actual
       definitions in use in a document.  With -f, it will even remove all entities that are not HTML entities.