Provided by: tcllib_1.21+dfsg-1_all bug

NAME

       treeql - Query tree objects

SYNOPSIS

       package require Tcl  8.2

       package require snit

       package require struct::list

       package require struct::set

       package require treeql  ?1.3.1?

       treeql objectname -tree tree ?-query query? ?-nodes nodes? ?args...?

       qo query args...

       qo result

       qo discard

_________________________________________________________________________________________________

DESCRIPTION

       This  package  provides  objects  which  can  be  used to query and transform tree objects
       following the API of tree objects created by the package struct::tree.

       The tree query and manipulation language used here,  TreeQL,  is  inspired  by  Cost  (See
       section References for more information).

       treeql,  the package, is a fairly thin query facility over tree-structured data types.  It
       implements an ordered set of nodes (really  a  list)  which  are  generated  and  filtered
       through the application of TreeQL operators to each node in turn.

API

   TREEQL CLASS API
       The  command treeql is a snit::type which implements the Treeql Query Language. This means
       that it follows the API for class commands as specified by the package snit.  Its  general
       syntax is

       treeql objectname -tree tree ?-query query? ?-nodes nodes? ?args...?
              The command creates a new tree query object and returns the fully qualified name of
              the object command as its result.  The API the returned  command  is  following  is
              described in the section TreeQL OBJECT API

              Each  query  object is associated with a single tree object. This is the object all
              queries will be run against.

              If the option -nodes was specified then its argument is treated as a list of nodes.
              This list is used to initialize the node set. It defaults to the empty list.

              If  the  option  -query  was  specified then its argument will be interpreted as an
              object, the parent query of this query. It  defaults  to  the  object  itself.  All
              queries will be interpreted in the environment of this object.

              Any  arguments coming after the options are treated as a query and run immediately,
              after the node set has been initialized. This uses the same syntax for the query as
              the method query.

              The  operations of the TreeQL available for this are explained in the section about
              The Tree Query Language. This section also explains the term node set used above.

   TREEQL OBJECT API
       As treeql has been implemented in snit all the standard methods of snit-based classes  are
       available  to  the  user  and therefore not listed here. Please read the documentation for
       snit for what they are and what functionality they provide

       The methods provided by the package treeql itself are listed and explained below.

       qo query args...
              This  method  interprets  its  arguments  as  a  series  of  TreeQL  operators  and
              interpretes  them from the left to right (i.e. first to last).  Note that the first
              operator uses the node set currently known to the object to  perform  its  actions.
              In  other words, the node set is not cleared, or modified in other ways, before the
              query is run. This allows the user to run several queries one after the  other  and
              have  each  use  the  results of the last. Any initialization has to be done by any
              query itself, using TreeQL operators.  The result of the method  is  the  node  set
              after the last operator of the query has been executed.

              Note  that uncaught errors will leave the node set of the object in an intermediate
              state, per the TreeQL operators which were executed successfully before  the  error
              occurred.

              The above means in detail that:

              [1]    The  first  argument  is  interpreted  as  the name of a query operator, the
                     number of arguments required by that operator is then determined, and  taken
                     from the immediately following arguments.

                     Because of this operators cannot have optional arguments, all arguments have
                     to be present as defined.  Failure to do this will, at  least,  confuse  the
                     query interpreter, but more likely cause errors.

              [2]    The  operator  is  applied to the current node set, yielding a new node set,
                     and/or manipulating the tree object the query object is connected to.

              [3]    The arguments used (i.e. operator name and arguments) are removed  from  the
                     list  of  method arguments, and then the whole process is repeated from step
                     [1], until the list of arguments is empty or an error occurred.

                  # q is the query object.

                  q query root children get data

                  # The above query
                  # - Resets the node set to the root node - root
                  # - Adds the children of root to the set - children
                  # - Replaces the node set with the       - get data
                  #   values for the attribute 'data',
                  #   for all nodes in the set which
                  #   have such an attribute.
                  # - And returns this information.

                  # Below we can see the same query, but rewritten
                  # to show the structure as it is seen by the query
                  # interpreter.

                  q query \
                    root \
                    children \
                    get data

       The operators of the TreeQL language available for this are explained in the section about
       The Tree Query Language. This section also explains the term node set used above.

       qo result
              This method returns a list containing the current node set.

       qo discard
              This  method  returns  the current node set (like method result), but also destroys
              the query object (qo).  This is useful  when  constructing  and  using  sub-queries
              (%AUTO% objects immediately destroyed after use).

THE TREE QUERY LANGUAGE

       This  and the following sections specify the Tree Query Language used by the query objects
       of this package in detail.

       First we explain the general concepts  underneath  the  language  which  are  required  to
       comprehend  it.  This  is  followed  by  the  specifications  for  all the available query
       operators. They fall into eight categories, and each category has its own section.

       [1]    TreeQL Concepts

       [2]    Structural generators

       [3]    Attribute Filters

       [4]    Attribute Mutators

       [5]    Attribute String Accessors

       [6]    Sub-queries

       [7]    Node Set Operators

       [8]    Node Set Iterators

       [9]    Typed node support

   TREEQL CONCEPTS
       The main concept which has to be understood is that of the node set.   Each  query  object
       maintains  exactly  one  such  node  set,  and  essentially all operators use it and input
       argument and for their result.  This structure simply contains the handles  of  all  nodes
       which  are  currently  of  interest  to  the query object.  To name it a set is a bit of a
       misnomer, because

       [1]    A node (handle) can occur in the structure more than once, and

       [2]    the order of nodes in the structure is important as  well.   Whenever  an  operator
              processes  all  nodes  in the node set it will do so in the order they occur in the
              structure.

       Regarding the possible multiple occurrence of a node, consider a node set  containing  two
       nodes  A  and  B,  both having node P as their immediate parent. Application of the TreeQL
       operator "parent" will then add P to the new node set twice, once per node it  was  parent
       of. I.e. the new node set will then be {P P}.

   STRUCTURAL GENERATORS
       All tree-structural operators locate nodes in the tree based on a structural relation ship
       to the nodes currently in the set and then replace the current node set with  the  set  of
       nodes found Nodes which fulfill such a relationship multiple times are added to the result
       as often as they fulfill the relationship.

       It is important to note that the found nodes are collected  in  a  separate  storage  area
       while  processing  the node set, and are added to (or replacing) the current node set only
       after the current node set has been processed completely.  In other words, the  new  nodes
       are not processed by the operator as well and do not affect the iteration.

       When  describing  an operator the variable N will be used to refer to any node in the node
       set.

       ancestors
              Replaces the current node set with the ancestors for all nodes N in the  node  set,
              should N have a parent. In other words, nodes without a parent do not contribute to
              the new node set. In other words, uses all nodes on the path from node N  to  root,
              in this order (root last), for all nodes N in the node set. This includes the root,
              but not the node itself.

       rootpath
              Replaces the current node set with the ancestors for all nodes N in the  node  set,
              should N have a parent. In other words, nodes without a parent do not contribute to
              the new node set.  In contrast to the operator ancestors the  nodes  are  added  in
              reverse order however, i.e. the root node first.

       parent Replaces  the  current  node  set with the parent of node N, for all nodes N in the
              node set, should N have a parent. In other words, nodes without  a  parent  do  not
              contribute to the new node set.

       children
              Replaces  the current node set with the immediate children of node N, for all nodes
              N in the node set, should N have children. In other words, nodes  without  children
              do not contribute to the new node set.

       left   Replaces the current node set with the previous/left sibling for all nodes N in the
              node set, should N have siblings to the left. In other words,  nodes  without  left
              siblings do not contribute to the new node set.

       right  Replaces  the  current  node set with the next/right sibling for all nodes N in the
              node set, should N have siblings to the right. In other words, nodes without  right
              siblings do not contribute to the new node set.

       prev   Replaces  the  current  node set with all previous/left siblings of node N, for all
              nodes N in the node set, should N have siblings to the left. In other words,  nodes
              without  left  siblings are ignored. The left sibling adjacent to the node is added
              first, and the leftmost sibling last (reverse tree order).

       esib   Replaces the current node set with all previous/left siblings of node  N,  for  all
              nodes  N in the node set, should N have siblings to the left. In other words, nodes
              without left siblings are ignored. The leftmost sibling is  added  first,  and  the
              left sibling adjacent to the node last (tree order).

              The method name is a shorthand for Earlier SIBling.

       next   Replaces the current node set with all next/right siblings of node N, for all nodes
              N in the node set, should N have siblings to  the  right.  In  other  words,  nodes
              without  right  siblings  do  not contribute to the new node set. The right sibling
              adjacent to the node is added first, and the rightmost sibling last (tree order).

       root   Replaces the current node set with a node set containing a single node, the root of
              the tree.

       tree   Replaces  the  current  node  set with a node set containing all nodes found in the
              tree. The nodes are added in pre-order (parent first,  then  children,  the  latter
              from left to right, first to last).

       descendants
              Replaces  the current node set with the nodes in all subtrees rooted at node N, for
              all nodes N in the node set, should N have children. In other words, nodes  without
              children do not contribute to the new node set.

              This  is  like  the operator children, but covers the children of children as well,
              i.e. all the proper descendants. "Rooted at N" means that N itself is not added  to
              the new set, which is also implied by proper descendants.

       subtree
              Like operator descendants, but includes the node N. In other words:

              Replaces  the  current  node  set  with the nodes of the subtree of node N, for all
              nodes N in the node set, should N have children.  In  other  words,  nodes  without
              children  do  not  contribute  to  the  new node set. I.e this is like the operator
              children, but covers the children of children, etc. as well. "Of N"  means  that  N
              itself is added to the new set.

       forward
              Replaces  the  current  node set with the nodes in the subtrees rooted at the right
              siblings of node N, for all nodes N in the node set, should N have right  siblings,
              and  they  children. In other words, nodes without right siblings, and them without
              children are ignored.

              This is equivalent to the operator sequence

              next descendants

       later  This is an alias for the operator forward.

       backward
              Replaces the current node set with the nodes in the flattened previous subtrees, in
              reverse tree order.

              This is nearly equivalent to the operator sequence

              prev descendants

              The only difference is that this uses the nodes in reverse order.

       earlier
              Replaces the current node set with the nodes in the flattened previous subtrees, in
              tree order.

              This is equivalent to the operator sequence

              prev subtree

   ATTRIBUTE FILTERS
       These operators filter the node  set  by  reference  to  attributes  of  nodes  and  their
       properties.  Filter  means that all nodes not fulfilling the criteria are removed from the
       node set. In other words, the node set is replaced by the  set  of  nodes  fulfilling  the
       filter criteria.

       hasatt attr
              Reduces the node set to nodes which have an attribute named attr.

       withatt attr value
              Reduces  the  node  set  to nodes which have an attribute named attr, and where the
              value of that attribute is equal to  value  (The  "=="  operator  is  string  equal
              -nocase).

       withatt! attr val
              This  is  the  same  as  withatt,  but  all  nodes in the node set have to have the
              attribute, and the "==" operator is string equal, i.e. no  -nocase.   The  operator
              will fail with an error if they don't have the attribute.

       attof attr vals
              Reduces  the  node  set to nodes which which have an attribute named attr and where
              the value of that attribute is contained in the list  vals  of  legal  values.  The
              contained-in  operator  used  here does glob matching (using the attribute value as
              pattern) and ignores the case of the attribute value, but not for the  elements  of
              vals.

       attmatch attr match
              Same  as  withatt,  but string match is used as the "==" operator, and match is the
              pattern checked for.

              Note that match is a interpreted as a partial argument list for string match.  This
              means  that  it  is  interpreted  as a list containing the pattern, and the pattern
              element can be preceded by options understand by string match, like -nocase.   This
              is  especially  important  should  the pattern contain spaces. It has to be wrapped
              into a list for correct interpretation by this operator

   ATTRIBUTE MUTATORS
       These operators change node attributes within the underlying tree.  In  other  words,  all
       these operators have side effects.

       set attr val
              Sets  the  attribute  attr  to the value val, for all nodes N in the node set.  The
              operator will fail if a node does not have an attribute named attr. The  tree  will
              be left in a partially modified state.

       unset attr
              Unsets the attribute attr, for all nodes N in the node set.  The operator will fail
              if a node does not have an attribute named  attr.  The  tree  will  be  left  in  a
              partially modified state.

   ATTRIBUTE STRING ACCESSORS
       These  operators  retrieve  the  values  of  node attributes from the underlying tree. The
       collected results are stored in the node set, but are not actually nodes.

       In other words, they redefine the semantics of the node set stored by the query object  to
       contain non-node data after their completion.

       The  query  interpreter  will  terminate  after  it  has  finished processing one of these
       operators, silently discarding any later query elements.  It  also  means  that  our  talk
       about  maintenance of a node set is not quite true. It is a node set while the interpreter
       is processing commands, but can be left as an attribute value set  at  the  end  of  query
       processing.

       string op attr
              Applies  the string operator op to the attribute named attr, for all nodes N in the
              node set, collects the results of that application and places them  into  the  node
              set.

              The operator will fail if a node does not have an attribute named attr.

              The  argument  op  is  interpreted as partial argument list for the builtin command
              string.  Its first word has to be any of the  sub-commands  understood  by  string.
              This  has  to  be followed by all arguments required for the subcommand, except the
              last.  that last argument is supplied by the attribute value.

       get pattern
              For all nodes N in the node set it  determines  all  their  attributes  with  names
              matching the glob pattern, then the values of these attributes, at last it replaces
              the node set with the list of these attribute values.

       attlist
              This is a convenience definition for the operator getvals *.  In  other  words,  it
              replaces  the  node  set with a list of the attribute values for all attributes for
              all nodes N in the node set.

       attrs glob
              Replaces the current node set with a list of attribute lists,  one  attribute  list
              per for all nodes N in the node set.

       attval attname
              Reduces  the current node set with the operator hasatt, and then replaces it with a
              list containing the values of the attribute named attname for all nodes  N  in  the
              node set.

   SUB-QUERIES
       Sub-queries  yield node sets which are then used to augment, reduce or replace the current
       node set.

       andq query
              Replaces the node set with the set-intersection of the node set  generated  by  the
              sub-query query and itself.

              The  execution  of  the sub-query uses the current node set as its own initial node
              set.

       orq query
              Replaces the node set with the set-union of the node set generated by the sub-query
              query and itself. Duplicate nodes are removed.

              The  execution  of  the sub-query uses the current node set as its own initial node
              set.

       notq query
              Replaces the node set with the set of nodes generated by the sub-query query  which
              are  also  not  in the current node set. In other word the set difference of itself
              and the node set generated by the sub-query.

              The execution of the sub-query uses the current node set as its  own  initial  node
              set.

   NODE SET OPERATORS
       These operators change the node set directly, without referring to the tree.

       unique Removes  duplicate  nodes  from the node set, preserving order. In other words, the
              earliest occurrence of a node  handle  is  preserved,  every  other  occurrence  is
              removed.

       select Replaces  the  current node set with a node set containing only the first node from
              the current node set

       transform query var body
              First it interprets the sub-query query, using the current node set as its  initial
              node  set.   Then  it iterates over the result of that query, binding the handle of
              each node to the variable named  in  var,  and  executing  the  script  body.   The
              collected  results  of  these  executions  is  made the new node set, replacing the
              current one.

              The script body is executed in the context of the caller.

       map var body
              Iterates over the current node set, binding the handle of each node to the variable
              named  in  var,  and  executing  the  script  body.  The collected results of these
              executions is made the new node set, replacing the current one.

              The script body is executed in the context of the caller.

       quote val
              Appends the literal value val to the current node set.

       replace val
              Replaces the current node set with the literal list value val.

   NODE SET ITERATORS
       foreach query var body
              Interprets the sub-query query, then performs the equivalent of  operator  over  on
              the  nodes  in  the  node  set  created  by that query. The current node set is not
              changed, except through side effects from the script body.

              The script body is executed in the context of the caller.

       with query body
              Interprets the query, then runs the script body on the node set  generated  by  the
              query.  At  last it restores the current node set as it was before the execution of
              the query.

              The script body is executed in the context of the caller.

       over var body
              Executes the script body for each node in the node set, with the variable named  by
              var  bound  to  the  name  of the current node.  The script body is executed in the
              context of the caller.

              This is like the builtin foreach, with the node set as the source of  the  list  to
              iterate over.

              The results of executing the body are ignored.

       delete Deletes all the nodes contained in the current node set from the tree.

   TYPED NODE SUPPORT
       These  filters  and  accessors  assume the existence of an attribute called @type, and are
       short-hand forms useful for cost-like tree query, html tree editing, and so on.

       nodetype
              Returns the node type of nodes.  Attribute string accessor. This is equivalent to

              get @type

       oftype t
              Reduces the node set to nodes whose type is equal to t, with letter case ignored.

       nottype t
              Reduces the node set to nodes whose type is  not  equal  to  t,  with  letter  case
              ignored.

       oftypes attrs
              Reduces  set  to  nodes  whose  @type is an element in the list attrs of types. The
              value of @type is used as a glob pattern, and letter case is relevant.

EXAMPLES

       ... TODO ...

REFERENCES

       [1]    COST [http://wiki.tcl.tk/COST] on the Tcler's Wiki.

       [2]    TreeQL [http://wiki.tcl.tk/treeql] on the Tcler's Wiki. Discuss this package there.

BUGS, IDEAS, FEEDBACK

       This document, and the package it describes,  will  undoubtedly  contain  bugs  and  other
       problems.    Please   report   such   in  the  category  treeql  of  the  Tcllib  Trackers
       [http://core.tcl.tk/tcllib/reportlist].  Please also report any ideas for enhancements you
       may have for either package and/or documentation.

       When proposing code changes, please provide unified diffs, i.e the output of diff -u.

       Note further that attachments are strongly preferred over inlined patches. Attachments can
       be made by going to the Edit form of the ticket immediately after its creation,  and  then
       using the left-most button in the secondary navigation bar.

KEYWORDS

       Cost, DOM, TreeQL, XPath, XSLT, structured queries, tree, tree query language

CATEGORY

       Data structures

COPYRIGHT

       Copyright (c) 2004 Colin McCormack <coldstore@users.sourceforge.net>
       Copyright (c) 2004 Andreas Kupries <andreas_kupries@users.sourceforge.net>