Provided by: tdom_0.9.0-1_amd64

NAME

       dom - Create an in-memory DOM tree from XML

SYNOPSIS

       package require tdom

       dom method ?arg arg ...?
_________________________________________________________________

DESCRIPTION

       This command creates DOM trees in memory. In the usual case, a string containing XML
       data is parsed and converted into a DOM tree. Other possible parse inputs are HTML and
       JSON. The method argument selects a specific subcommand.

       The valid methods are:

       dom parse ?options? ?data?
              Parses  the  XML  information  and builds up the DOM tree in memory providing a Tcl
              object command to this DOM document object. Example:

                     dom parse $xml doc
                     $doc documentElement root

              parses the XML in the variable xml, creates the DOM tree in memory, makes a
              reference to the document object, visible in Tcl as a document object command, and
              assigns this new object name to the variable doc. When doc gets freed, the DOM tree
              and the associated Tcl object commands (document and all node objects) are freed
              automatically.

                     set document [dom parse $xml]
                     set root     [$document documentElement]

              parses the XML in the variable xml, creates the DOM tree in memory, makes a
              reference to the document object, visible in Tcl as a document object command, and
              returns this new object name, which is then stored in document. To free the
              underlying DOM tree and the associated Tcl object commands (document + nodes +
              fragment nodes) the document object command has to be explicitly deleted by:

                     $document delete

              or

                     rename $document ""
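
              The two calling conventions can be compared side by side. The following is a
              minimal sketch; the sample XML and the helper proc countItems are made up for
              illustration.

```tcl
package require tdom

# Hypothetical sample input; any well-formed XML string works here.
set xml {<order id="42"><item>tea</item><item>coffee</item></order>}

# Form 1: the document command name is stored in the local variable "doc".
# The tree is released automatically when that variable goes away, here
# when the proc returns.
proc countItems {xml} {
    dom parse $xml doc
    $doc documentElement root
    return [llength [$root selectNodes item]]
}
set itemCount [countItems $xml]

# Form 2: the command name is returned and must be deleted explicitly.
set document [dom parse $xml]
set root     [$document documentElement]
set rootName [$root nodeName]
set orderId  [$root getAttribute id]
$document delete
```

              The first form suits short-lived trees inside a proc; the second form is the
              one to use when the document has to outlive the scope that created it.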

              The valid options are:

              -simple
                     If -simple is specified, a simple but fast parser is used (it does not
                     fully conform to the XML recommendation). That should double parsing and DOM
                     generation speed. The encoding of the data is not transformed inside the
                     parser. The simple parser does not respect any encoding information in the
                     XML declaration. It skips over the internal DTD subset and ignores any
                     information in it. Therefore it doesn't include defaulted attribute values
                     in the tree, even if the corresponding attribute declaration is in the
                     internal subset. It also doesn't expand internal or external entity
                     references other than the predefined entities and character references.

              -html  If -html is specified, a fast HTML parser is used, which tries to parse even
                     badly formed HTML into a DOM tree.
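
                     As an illustration, the sketch below feeds deliberately sloppy markup
                     (unclosed <p> and <br>) to the HTML parser. The input string is made up,
                     and the exact shape of the recovered tree may differ between tDOM versions.

```tcl
package require tdom

# Hypothetical input with an unclosed <p> and an unclosed <br>.
set html {<html><body><p class="intro">Hello<br>world</body></html>}

set doc  [dom parse -html $html]
set root [$doc documentElement]

set rootName   [$root nodeName]
set paragraphs [$doc selectNodes //p]
# getAttribute takes an optional default value as second argument.
set introClass [[lindex $paragraphs 0] getAttribute class ""]

$doc delete
```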

              -html5 This option is only available if tDOM was built with --enable-html5. Use the
                     featureinfo method to find out whether this feature is built in. If -html5
                     is specified, the gumbo HTML5 parser library
                     (https://github.com/google/gumbo-parser) is used to build the DOM tree. The
                     resulting tree is, as far as possible, XML namespace-aware. Since many
                     users do not want this and it is only a burden in many use cases, -html5
                     can be combined with -ignorexmlns, in which case no node or attribute in
                     the DOM tree is in an XML namespace. All tag and attribute names in the DOM
                     tree will be lower case, even for foreign elements not in the xhtml, svg or
                     mathml namespace. The DOM tree may include nodes that the parser inserted
                     because they are implied by the context (such as
                     <head>, <tbody>, etc.).
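
                     Because -html5 depends on a build option, a script may want to guard its
                     use. The following sketch does so with featureinfo; the input string and
                     the fallback message are assumptions made for illustration.

```tcl
package require tdom

# Hypothetical input without explicit <html>, <head> or <body> tags.
set html {<!DOCTYPE html><title>Demo</title><p>Hello}

if {[dom featureinfo html5]} {
    set doc [dom parse -html5 -ignorexmlns $html]
    # Collect the names of the children of the document element; nodes
    # implied by the context are expected to show up here.
    set names {}
    foreach node [[$doc documentElement] childNodes] {
        lappend names [$node nodeName]
    }
    $doc delete
} else {
    set names "html5 support not compiled in"
}
```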

              -json  If -json is specified, the data is  expected  to  be  a  valid  JSON  string
                     (according to RFC 7159). The command returns an ordinary DOM document with
                     the nesting of tokens inside the JSON data translated into tree hierarchy. If a JSON
                     array  value is itself an object or array then container element nodes named
                     (in a default build) arraycontainer or  objectcontainer,  respectively,  are
                     inserted  into  the  tree. The JSON serialization of this document (with the
                     domDoc method asJSON) is the same JSON information as the  data,  preserving
                     JSON datatypes, allowing non-unique member names of objects while preserving
                     their order and the full range of JSON string values. JSON datatype handling
                     is done with an additional property attached to the doc and tree nodes.
                     This property isn't contained in an XML serialization of  the  document.  If
                     you  need  to  store the JSON data represented by a document, store the JSON
                     serialization and parse it back  from  there.  Apart  from  this  JSON  type
                     information the returned doc command or handle is an ordinary DOM doc, which
                     may be investigated or modified with the full range  of  the  doc  and  node
                     methods.  Please  note  that the element node names and the text node values
                     within the tree may be outside  of  what  the  appropriate  XML  productions
                     allow.
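
                     A small round trip may make this clearer. The sketch below parses a
                     made-up JSON object, reads one member with XPath, and serializes the
                     tree both as XML (which drops the JSON type information) and as JSON.

```tcl
package require tdom

# Hypothetical sample data.
set json {{"name": "tDOM", "tags": ["xml", "json"], "stable": true, "count": 3}}

set doc [dom parse -json $json]

# Object members become element nodes and can be addressed with XPath.
set name [$doc selectNodes string(//name)]

# The XML serialization does not carry the JSON type information ...
set asXml  [$doc asXML]
# ... whereas the JSON serialization does.
set asJson [$doc asJSON]
$doc delete

# Parsing the JSON serialization again gives an equivalent document.
set doc2 [dom parse -json $asJson]
set roundTrip [$doc2 asJSON]
$doc2 delete
```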

              -jsonmaxnesting  integer
                     This option only has an effect if used together with the -json option. The
                     current implementation uses a recursive descent JSON parser. In order to
                     avoid using excess stack space, any JSON input that has more than a certain
                     number of levels of nesting is considered invalid. The default maximum
                     nesting is 2000. The option -jsonmaxnesting allows the user to adjust that.
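
                     The effect of the limit can be observed with generated input. This is a
                     sketch only: the depth of 10 and the limits 2 and 100 are arbitrary
                     values chosen to stay clear of the exact boundary.

```tcl
package require tdom

# Build an array nested ten levels deep: [[[[[[[[[[1]]]]]]]]]]
set depth 10
set json [string repeat "\[" $depth]1[string repeat "\]" $depth]

# A limit well below the actual depth makes the parser reject the input.
set rejected [catch {dom parse -json -jsonmaxnesting 2 $json} msg]

# A generous limit accepts the same input.
set accepted [expr {![catch {dom parse -json -jsonmaxnesting 100 $json} doc]}]
if {$accepted} {
    $doc delete
}
```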

              --     The  option  --  marks  the end of options.  While respected in general this
                     option is only needed in case of parsing JSON data, which may start  with  a
                     "-".

              -keepEmpties
                     If -keepEmpties is specified, text nodes which contain only whitespace
                     will be part of the resulting DOM tree. By default (-keepEmpties not
                     given) those whitespace-only text nodes are removed at parsing time.
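
                     The difference shows up in the child count of an indented document, as
                     in this short sketch with made-up input:

```tcl
package require tdom

# Two child elements, separated by whitespace-only text.
set xml "<a>\n  <b/>\n  <c/>\n</a>"

# Default: whitespace-only text nodes are dropped.
set doc [dom parse $xml]
set withoutEmpties [llength [[$doc documentElement] childNodes]]
$doc delete

# With -keepEmpties the three whitespace runs become text nodes.
set doc [dom parse -keepEmpties $xml]
set withEmpties [llength [[$doc documentElement] childNodes]]
$doc delete
```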

              -channel  <channel-ID>
                     If -channel <channel-ID> is specified, the input to be parsed is read from
                     the specified channel. The encoding setting of the channel (via fconfigure
                     -encoding) is respected, i.e. the data read from the channel is converted
                     to UTF-8 according to the encoding settings before it is parsed.
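
                     A minimal sketch of reading from a channel with a non-UTF-8 encoding
                     follows. The file name tdom-channel-demo.xml is hypothetical; the file
                     is created in the current directory and removed again.

```tcl
package require tdom

# Write a small sample file in ISO 8859-1.
set path [file join [pwd] tdom-channel-demo.xml]
set out [open $path w]
fconfigure $out -encoding iso8859-1
puts -nonewline $out "<greeting>Gr\u00fc\u00dfe</greeting>"
close $out

# Read it back through a channel configured with the same encoding.
set in [open $path r]
fconfigure $in -encoding iso8859-1
set doc [dom parse -channel $in]
close $in

set text [[$doc documentElement] text]
$doc delete
file delete $path
```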

              -baseurl  <baseURI>
                     If -baseurl <baseURI> is specified, the baseURI is used as the base URI of
                     the document. External entity references in the document are resolved
                     relative to this base URI. This base URI is also stored within the DOM tree.

              -feedbackAfter  <#bytes>
                     If -feedbackAfter <#bytes> is specified, the Tcl command given by
                     -feedbackcmd is evaluated at the first element start within the document
                     (or an external entity) that lies #bytes after the start of the document
                     or external entity, or #bytes after the last such call. For backward
                     compatibility, if no -feedbackcmd is given but there is a Tcl proc named
                     ::dom::domParseFeedback, this proc is used as -feedbackcmd. If there is no
                     such proc and -feedbackAfter is used, it is an error not to use
                     -feedbackcmd as well. If the called script raises an error, parsing is
                     aborted and the dom parse call returns an error, with the script's error
                     message as error message. If the called script returns -code break,
                     parsing is aborted and the dom parse call returns the empty string.

              -feedbackcmd  <script>
                     If -feedbackcmd <script> is specified, the script script is evaluated at the
                     first element start within the document (or an external entity) that lies
                     the number of bytes given by the -feedbackAfter option after the start of
                     the document or external entity, or after the last such call. If
                     -feedbackAfter isn't given, this option has no effect. If the called script
                     raises an error, parsing is aborted and the dom parse call returns an
                     error, with the script's error message as error message. If the called
                     script returns -code break, parsing is aborted and the dom parse call
                     returns the empty string.
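
                     The two feedback options work together, as in the following sketch. The
                     procs countFeedback and abortParse, the generated input and the
                     threshold of 200 bytes are assumptions made for illustration.

```tcl
package require tdom

# Hypothetical feedback procs: one counts its invocations, the other
# asks the parser to stop by returning -code break.
set ::feedbackCalls 0
proc countFeedback {} {
    incr ::feedbackCalls
}
proc abortParse {} {
    return -code break
}

# Generated input, large enough to cross the byte threshold several times.
set xml "<list>[string repeat {<item>x</item>} 100]</list>"

set doc [dom parse -feedbackAfter 200 -feedbackcmd countFeedback $xml]
$doc delete

# An aborted parse yields the empty string instead of a document.
set aborted [dom parse -feedbackAfter 200 -feedbackcmd abortParse $xml]
```

                     A typical use is a progress indicator or a cancel button in a GUI, where
                     the script checks a flag and returns -code break on request.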

              -externalentitycommand  <script>
                     If -externalentitycommand <script> is specified, the specified Tcl script is
                     called to resolve any external entities of the document. The command
                     actually evaluated consists of this script followed by three arguments: the
                     base URI, the system identifier of the entity and the public identifier of
                     the entity. The base URI and the public identifier may be the empty list.
                     The script has to return a Tcl list consisting of three elements. The first
                     element of this list signals how the external entity is returned to the
                     processor. Currently the two allowed types are "string" and "channel". The
                     second element of the list has to be the (absolute) base URI of the external
                     entity to be parsed. The third element of the list is the data: either the
                     content already read from the external entity, as a string, in the case of
                     type "string", or the name of a Tcl channel in the case of type "channel".
                     Note that if the script returns a Tcl channel, it will not be closed by the
                     processor. It must be closed separately if it is no longer needed.
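
                     A resolver does not have to touch the file system. The sketch below
                     serves entity content from an in-memory table; the proc resolveEntity,
                     the array entityStore and the memory:/// base URI are hypothetical names
                     chosen for this example.

```tcl
package require tdom

# Hypothetical in-memory store, keyed by system identifier.
array set ::entityStore {
    chapter1.xml {<chapter>First chapter</chapter>}
}

# The parser appends base URI, system identifier and public identifier.
proc resolveEntity {base systemId publicId} {
    if {![info exists ::entityStore($systemId)]} {
        error "unknown external entity '$systemId'"
    }
    # Return: type, absolute base URI of the entity, entity data.
    return [list string "memory:///$systemId" $::entityStore($systemId)]
}

set xml {<!DOCTYPE book [
  <!ENTITY chapter1 SYSTEM "chapter1.xml">
]>
<book>&chapter1;</book>}

set doc [dom parse -externalentitycommand resolveEntity $xml]
set chapterText [$doc selectNodes string(/book/chapter)]
$doc delete
```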

              -useForeignDTD  <boolean>
                     If <boolean> is true and the document does not have an external subset,  the
                     parser will call the -externalentitycommand script with empty values for the
                     systemId and publicID arguments. Please  note  that  if  the  document  also
                     doesn't   have   an   internal   subset,  the  -startdoctypedeclcommand  and
                     -enddoctypedeclcommand scripts, if set, are not called.  The  -useForeignDTD
                     respects

              -paramentityparsing  <always|never|notstandalone>
                     The -paramentityparsing option controls whether the parser tries to resolve
                     the external entities (including the external DTD subset) of the document
                     while building the DOM tree. -paramentityparsing requires an argument,
                     which must be either "always", "never", or "notstandalone". The value
                     "always" means that the parser tries to resolve (recursively) all external
                     entities of the XML source. This is the default in case -paramentityparsing
                     is omitted. The value "never" means that only the given XML source is
                     parsed and no external entity (including the external subset) will be
                     resolved and parsed. The value "notstandalone" means that all external
                     entities will be resolved and parsed, with the exception of documents that
                     explicitly state standalone="yes" in their XML declaration.

              -ignorexmlns
                     It is recommended that you only use this option with the -html5 option. If
                     this option is given, no node within the created DOM tree will be internally
                     marked as placed into an XML namespace, even if there is a default namespace
                     in scope for un-prefixed elements or even if the element has a defined
                     namespace prefix. One consequence is that XPath node expressions on such a
                     DOM tree don't work as expected. Prefixed element nodes can't be selected,
                     and element nodes without prefix will be seen by XPath expressions as if
                     they were not in any namespace (regardless of whether they should in fact be
                     in a default namespace).
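
                     The following sketch contrasts the two behaviours for an element in a
                     default namespace. It assumes that the option is also honoured when
                     parsing plain XML, which is not the recommended use; the namespace URI
                     is made up.

```tcl
package require tdom

set xml {<doc xmlns="urn:example:demo"><item/></doc>}

# Default: the elements are in the namespace, so an XPath name test
# without prefix does not match them.
set doc [dom parse $xml]
set nsAwareHits [llength [$doc selectNodes /doc/item]]
set rootNs      [[$doc documentElement] namespaceURI]
$doc delete

# With -ignorexmlns no node is marked as being in a namespace.
set doc [dom parse -ignorexmlns $xml]
set ignoredHits   [llength [$doc selectNodes /doc/item]]
set rootNsIgnored [[$doc documentElement] namespaceURI]
$doc delete
```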

       dom createDocument docElemName ?objVar?
              Creates a new DOM document object with one element node with node name docElemName.
              The objVar controls the memory handling as explained above.

       dom createDocumentNS uri docElemName ?objVar?
              Creates a new DOM document object with one element node with node name docElemName.
              Uri  gives the namespace of the document element to create. The objVar controls the
              memory handling as explained above.

       dom createDocumentNode ?objVar?
              Creates a new 'empty' DOM document object without any element node. objVar controls
              the memory handling as explained above.
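
              The three creation methods can be sketched as follows. Element names, the
              attribute and the namespace URI are made up for illustration.

```tcl
package require tdom

# A document with a document element named "order".
set doc  [dom createDocument order]
set root [$doc documentElement]
$root setAttribute id 42
set item [$doc createElement item]
$item appendChild [$doc createTextNode "tea"]
$root appendChild $item
set xml [$root asXML -indent none]

# A document whose document element is in a namespace.
set nsDoc [dom createDocumentNS urn:example:shop shop:order]
set nsUri [[$nsDoc documentElement] namespaceURI]

# An empty document; the document element is added afterwards.
set empty [dom createDocumentNode]
$empty appendChild [$empty createElement root]
set emptyRootName [[$empty documentElement] nodeName]

$doc delete
$nsDoc delete
$empty delete
```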

       dom setResultEncoding ?encodingName?
              This method is for backward compatibility with Tcl 8.0. If tDOM is built with any
              newer Tcl version this method does not have any effect. If encodingName is not given
              the  current  global  result  encoding  is  returned.  Otherwise  the global result
              encoding is set to encodingName.  All character data, attribute  values  etc.  will
              then  be converted from UTF-8, which is delivered from the Expat XML parser, to the
              given 8 bit encoding at XML/DOM parse time.  Valid  values  for  encodingName  are:
              utf-8, ascii, cp1250, cp1251, cp1252, cp1253, cp1254, cp1255, cp1256, cp437, cp850,
              en, iso8859-1, iso8859-2, iso8859-3, iso8859-4,  iso8859-5,  iso8859-6,  iso8859-7,
              iso8859-8, iso8859-9, koi8-r.

       dom  createNodeCmd  ?-returnNodeCmd? ?-tagName name? ?-jsonType jsonType? ?-namespace URI?
       (element|comment|text|cdata|pi)Node commandName
              This method creates Tcl commands, which in turn create tDOM nodes. Tcl commands
              created by this command are only available inside a script given to the domNode
              methods appendFromScript or insertBeforeFromScript. If a command created with
              createNodeCmd is invoked in any other context, it will raise an error. The created
              command commandName replaces any existing command or procedure with that  name.  If
              the  commandName  includes any namespace qualifiers, it is created in the specified
              namespace. The -tagName option is  only  allowed  for  the  elementNode  type.  The
              -jsonType option is only allowed for elementNode and textNode types.

              If such a command is invoked inside a script given as argument to the domNode
              method appendFromScript or insertBeforeFromScript, it creates a new node and
              appends this node at the end of the child list of the invoking element node. If
              the option -returnNodeCmd was given, the command returns the created node as Tcl
              command. If this option was omitted, the command returns nothing. Each command
              always creates the same type of node. Which type of node is created by the
              command is determined
              by  the  first  argument  to  the  createNodeCmd. The syntax of the created command
              depends on the type of the node it creates.

              If the command type to create is elementNode, the created command will create an
              element node when called. Without the -tagName option the tag name of the created
              node is commandName without namespace qualifiers. If the -tagName option was given,
              the elements created by the command will have this tag name. If the -jsonType
              option was given, the created element nodes will have the given JSON type. If the
              -namespace option is given, the created element node will be XML namespaced and in
              the namespace given by the option. The element name will be used literally as
              given either by the command name or the -tagName option, if that was given. An
              appropriate XML namespace declaration will be added automatically, to bind the
              prefix (if the element name has one) or the default namespace (if the element name
              has no prefix) to the namespace, if such a binding isn't in scope.

              The syntax of the created command is:

                     elementNodeCmd ?attributeName attributeValue ...? ?script?
                     elementNodeCmd ?-attributeName attributeValue ...? ?script?
                     elementNodeCmd name_value_list script

              The command syntax allows three different ways to specify the attributes of the
              resulting element. These can be specified with attributeName attributeValue
              argument pairs, in an "option style" way with -attributeName attributeValue
              argument pairs (the '-' character is only syntactic sugar and will be stripped
              off), or as a Tcl list with elements interpreted as attribute name and the
              corresponding attribute value. The attribute name elements in the list may have a
              leading '-' character, which will be stripped off.

              Every elementNodeCmd accepts an optional Tcl script as last argument.  This  script
              is  evaluated  as  recursive  appendFromScript  script with the node created by the
              elementNodeCmd as parent of all nodes created by the script.

              If the first argument of the method is textNode, the command  will  create  a  text
              node.  If  the -jsonType option was given then the created text node will have that
              JSON type. The syntax of the created command is:

                     textNodeCmd ?-disableOutputEscaping? data

              If the optional flag -disableOutputEscaping is given, the escaping of the ampersand
              character  (&)  and  the  left  angle  bracket (<) inside the data is disabled. You
              should use this flag carefully.

              If the first argument of the method is commentNode or cdataNode, the command will
              create a comment node or CDATA section node. The syntax of the created command is:

                     nodeCmd data

              If the first argument of the method is piNode, the command will create a processing
              instruction node. The syntax of the created command is:

                     piNodeCmd target data
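
              Putting the pieces together, the sketch below creates a few node commands and
              uses them inside appendFromScript. The command names item, remark, t and c are
              arbitrary choices for this example.

```tcl
package require tdom

# Node creating commands; they are only usable inside the script passed
# to appendFromScript or insertBeforeFromScript.
dom createNodeCmd elementNode item
dom createNodeCmd -tagName note elementNode remark
dom createNodeCmd textNode t
dom createNodeCmd commentNode c

set doc  [dom createDocument list]
set root [$doc documentElement]

$root appendFromScript {
    c "built with appendFromScript"
    # Attribute given as a plain name/value pair, followed by a script.
    item id 1 { t "first" }
    # Attribute given in option style; the leading '-' is stripped.
    item -id 2 {
        t "second"
        # The command is called "remark" but creates <note> elements.
        remark { t "tag name set with -tagName" }
    }
}

set ids {}
foreach node [$root selectNodes item] {
    lappend ids [$node getAttribute id]
}
set noteCount [llength [$root selectNodes item/note]]
set xml [$root asXML]
$doc delete
```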

       dom setStoreLineColumn ?boolean?
              If switched on, the DOM nodes will contain line and column position information for
              the  original  XML  document  after  parsing.  The default is not to store line and
              column position information.
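
              Once switched on, the stored positions can be read back with the domNode
              methods getLine and getColumn, as in this minimal sketch with made-up input:

```tcl
package require tdom

dom setStoreLineColumn 1
set doc [dom parse "<a>\n  <b/>\n</a>"]

# The <b/> element starts on the second line of the input.
set b      [lindex [$doc selectNodes /a/b] 0]
set line   [$b getLine]
set column [$b getColumn]

$doc delete
# Switch the flag off again, since it applies to every later parse.
dom setStoreLineColumn 0
```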

       dom setNameCheck ?boolean?
              If NameCheck is true, every method which expects an XML Name, a fully qualified
              name or a processing instruction target will check whether the given string is
              valid according to its production rule. For commands created with the
              createNodeCmd method to be used in the context of appendFromScript, the status of
              the flag at creation time decides. If NameCheck is true at creation time, the
              command will check its arguments, otherwise not. The setNameCheck method sets this
              flag. It returns the current NameCheck flag state. The default state for NameCheck
              is true.

       dom setTextCheck ?boolean?
              If TextCheck is true, every command which expects XML Chars, a comment, a CDATA
              section value or a processing instruction value will check whether the given
              string is valid according to its production rule. For commands created with the
              createNodeCmd method to be used in the context of appendFromScript, the status of
              the flag at creation time decides. If TextCheck is true at creation time, the
              command will check its arguments, otherwise not. The setTextCheck method sets this
              flag. It returns the current TextCheck flag state. The default state for TextCheck
              is true.
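
              The effect of both flags can be seen when creating nodes by hand. The
              following is a sketch; the invalid name "1st" and the invalid comment text are
              made-up inputs.

```tcl
package require tdom

set doc [dom createDocument root]

# With NameCheck on (the default), an invalid element name is rejected.
dom setNameCheck 1
set nameRejected [catch {$doc createElement "1st"} msg]

# With NameCheck off, the same call goes through unchecked.
dom setNameCheck 0
set nameAccepted [expr {![catch {$doc createElement "1st"} node]}]
dom setNameCheck 1

# With TextCheck on (the default), "--" inside a comment is rejected.
dom setTextCheck 1
set commentRejected [catch {$doc createComment "a -- b"} msg]

$doc delete
```

              Switching the checks off trades safety for speed and can lead to documents
              that do not serialize to well-formed XML.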

       dom setObjectCommands ?(automatic|token|command)?
              Controls whether documents and nodes are created as Tcl commands or as tokens to
              be used with the domNode and domDoc commands. If the mode is 'automatic', then
              methods invoked on Tcl commands will create Tcl commands and methods invoked on
              doc or node tokens will create tokens. If the mode is 'command', Tcl commands will
              always be created. If the mode is 'token', tokens will always be created. The
              method returns the current mode. This method is an experimental interface.

       dom isName name
              Returns  1  if  name  is  a valid XML Name according to production 5 of the XML 1.0
              recommendation. This means that name is a valid  XML  element  or  attribute  name.
              Otherwise it returns 0.

       dom isPIName name
              Returns  1  if  name  is  a  valid  XML  processing instruction target according to
              production 17 of the XML 1.0 recommendation. Otherwise it returns 0.

       dom isNCName name
              Returns 1 if name is a valid NCName according to production 4 of the
              Namespaces in XML recommendation. Otherwise it returns 0.

       dom isQName name
              Returns 1 if name is a valid QName according to production 6 of the
              Namespaces in XML recommendation. Otherwise it returns 0.

       dom isCharData string
              Returns 1 if every character in string is a valid XML Char according to  production
              2 of the XML 1.0 recommendation. Otherwise it returns 0.

       dom isBMPCharData string
              Returns 1 if every character in string is a valid XML Char with a Unicode code
              point within the Basic Multilingual Plane (that is, every character within
              the string is at most 3 bytes long in UTF-8). Otherwise it returns 0.

       dom isComment string
              Returns  1  if  string is a valid comment according to production 15 of the XML 1.0
              recommendation. Otherwise it returns 0.

       dom isCDATA string
              Returns  1  if  string  is  valid  according  to  production  20  of  the  XML  1.0
              recommendation. Otherwise it returns 0.

       dom isPIValue string
              Returns  1  if  string  is  valid  according  to  production  16  of  the  XML  1.0
              recommendation. Otherwise it returns 0.
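
              The checking methods above can be tried out on a few made-up strings, as in
              this short sketch:

```tcl
package require tdom

set results {}
foreach {method value} {
    isName     chapter
    isName     1chapter
    isNCName   xsl:template
    isQName    xsl:template
    isPIName   xml
    isComment  {a -- b}
    isCDATA    {a ]]> b}
    isPIValue  {a ?> b}
} {
    # Each call returns 1 for a valid string and 0 otherwise.
    lappend results [dom $method $value]
}
```

              Of these inputs, only "chapter" as Name and "xsl:template" as QName are
              expected to pass; a name starting with a digit, an NCName containing a colon,
              the reserved PI target "xml", and values containing "--", "]]>" or "?>" are
              not valid for their respective productions.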

       dom featureinfo feature
              This method provides information  about  the  used  build  options  and  the  expat
              version. The valid values for the feature argument are:

              expatversion
                     Returns the version of the underlying expat library as a string, something
                     like "expat_2.1.0". This is what the expat API function XML_ExpatVersion()
                     returns.

              expatmajorversion
                     Returns the major version of the underlying expat library as an integer.

              expatminorversion
                     Returns the minor version of the underlying expat library as an integer.

              expatmicroversion
                     Returns the micro version of the underlying expat library as an integer.

              dtd    Returns as a boolean whether tDOM was built with --enable-dtd.

              ns     Returns as a boolean whether tDOM was built with --enable-ns.

              unknown
                     Returns as a boolean whether tDOM was built with --enable-unknown.

              tdomalloc
                     Returns as a boolean whether tDOM was built with --enable-tdomalloc.

              lessns Returns as a boolean whether tDOM was built with --enable-lessns.

              TCL_UTF_MAX
                     Returns the TCL_UTF_MAX value of the Tcl core tDOM was built with, as an
                     integer.

              html5  Returns as a boolean whether tDOM was built with --enable-html5.
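
              A script can collect this information into a short build report. The sketch
              below only uses the feature names listed above; the report format is an
              arbitrary choice.

```tcl
package require tdom

set major [dom featureinfo expatmajorversion]
set minor [dom featureinfo expatminorversion]
set micro [dom featureinfo expatmicroversion]

set report [format "expat %d.%d.%d (%s)" \
    $major $minor $micro [dom featureinfo expatversion]]

# Append one line per build option.
foreach feature {dtd ns unknown tdomalloc lessns html5} {
    append report "\n$feature: [dom featureinfo $feature]"
}
append report "\nTCL_UTF_MAX: [dom featureinfo TCL_UTF_MAX]"
```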

KEYWORDS

       XML, DOM, document, node, parsing