hxextract
extract selected elements from an HTML or XML file
- Provided by: html-xml-utils (Version: 7.7-1.3)
hxextract [ -h | -? ] [ -x ] [ -s text ] [ -e text ] [ -b base ] element-or-class [ -c configfile | file-or-URL ]
hxextract outputs all elements with a certain name and/or class.
Input must be well-formed, since no HTML heuristics are applied.
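As a minimal sketch of the invocation shown in the synopsis, the following creates a small well-formed document and extracts its h2 elements. The file name sample.html and its contents are made up for illustration, and the guard around the call is only there so the script still runs on systems without html-xml-utils.

```shell
# Create a small, well-formed sample document (hypothetical file name and content).
cat > sample.html <<'EOF'
<html>
<body>
<h2>First heading</h2>
<p class="note">A paragraph with a class.</p>
<h2>Second heading</h2>
</body>
</html>
EOF

# Extract every h2 element, following the synopsis:
#   hxextract element-or-class file-or-URL
# Guarded so the script does not fail where the tool is missing.
if command -v hxextract >/dev/null 2>&1; then
  hxextract h2 sample.html
else
  echo "hxextract not found; install html-xml-utils to run this example" >&2
fi
```

Every tag in the sample is explicitly closed because, as noted above, no HTML heuristics are applied to the input.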
The following options are supported:
The following operands are supported:
To use a proxy to retrieve remote files, set the environment variables http_proxy and ftp_proxy. For example: http_proxy="http://localhost:8080/".
Remote files (specified with a URL) are currently only supported for HTTP. Password-protected files or files that depend on HTTP "cookies" are not handled. (You can use tools such as curl(1) or wget(1) to retrieve such files.)
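One way to follow that suggestion is to fetch the file with curl(1) first and then run hxextract on the local copy. The sketch below is an assumption-laden illustration, not part of html-xml-utils: the function name fetch_and_extract, the variables HX_USER and HX_PASS, and the cookie file cookies.txt are all hypothetical, and the fetched document must already be well-formed.

```shell
# Hypothetical helper: download a protected page with curl, then extract locally.
# Usage: fetch_and_extract URL ELEMENT
fetch_and_extract() {
  url=$1
  element=$2
  tmp=$(mktemp) || return 1

  # curl options used here:
  #   -f  fail on HTTP errors    -sS  quiet, but still report errors
  #   -u  HTTP credentials       -b   send cookies read from a file
  #   -o  write the response body to the temporary file
  curl -fsS -u "$HX_USER:$HX_PASS" -b cookies.txt -o "$tmp" "$url" &&
    hxextract "$element" "$tmp"
  status=$?

  # Always clean up the temporary copy, then report the combined status.
  rm -f "$tmp"
  return "$status"
}
```

A call might look like `HX_USER=me HX_PASS=secret fetch_and_extract "http://example.org/private.html" h2`, where the URL and credentials are placeholders.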