Provided by: tcllib_1.21+dfsg-1_all

NAME

       doctools::toc::parse - Parsing text in doctoc format

SYNOPSIS

       package require doctools::toc::parse  ?0.1?

       package require Tcl  8.4

       package require doctools::toc::structure

       package require doctools::msgcat

       package require doctools::tcl::parse

       package require fileutil

       package require logger

       package require snit

       package require struct::list

       package require struct::stack

       ::doctools::toc::parse text text

       ::doctools::toc::parse file path

       ::doctools::toc::parse includes

       ::doctools::toc::parse include add path

       ::doctools::toc::parse include remove path

       ::doctools::toc::parse include clear

       ::doctools::toc::parse vars

       ::doctools::toc::parse var set name value

       ::doctools::toc::parse var unset name

       ::doctools::toc::parse var clear ?pattern?

_________________________________________________________________________________________________

DESCRIPTION

       This  package  provides  commands  to parse text written in the doctoc markup language and
       convert it into the canonical serialization of the table of contents encoded in the  text.
       See the section ToC serialization format for the specification of that format.

       This  is  an  internal  package of doctools, for use by the higher level packages handling
       doctoc documents.

API

       ::doctools::toc::parse text text
              The command takes the string contained in text and parses it under  the  assumption
              that  it  contains a document written using the doctoc markup language. An error is
              thrown if this assumption is found to be false.  The  format  of  these  errors  is
              described in section Parse errors.

              When  successful  the  command  returns the canonical serialization of the table of
              contents which was encoded in the text.  See the section ToC  serialization  format
              for specification of that format.

       ::doctools::toc::parse file path
              The  same as text, except that the text to parse is read from the file specified by
              path.
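
       As an illustration, a minimal sketch of calling the text method. It assumes that Tcllib is
       installed  and  that  Tcl  8.5  or later is used (for the dict command). The doctoc input,
       including the file name intro.man, is a made-up example and not part of this package.

```tcl
package require doctools::toc::parse

# Hypothetical doctoc input: one table of contents with a single item.
set doctoc [join {
    {[toc_begin {Table of Contents} {My Docs}]}
    {[item intro.man Introduction {A first look}]}
    {[toc_end]}
} \n]

# On success the result is the canonical serialization, a nested dictionary
# whose only top-level key is doctools::toc.
set toc   [doctools::toc::parse text $doctoc]
set items [dict get $toc doctools::toc items]

# Each element of items is a two-element list: item type, item description.
foreach item $items {
    foreach {type data} $item break
    puts "$type: [dict get $data id] ([dict get $data label])"
}
```

       The file method would be used the same way, passing a path instead of the text itself.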

       ::doctools::toc::parse includes
              This method returns the current list of search paths used when looking for  include
              files.

       ::doctools::toc::parse include add path
              This method adds the path to the list of paths searched when looking for an include
              file. The call is ignored if the path is already in the list of paths.  The  method
              returns the empty string as its result.

       ::doctools::toc::parse include remove path
              This  method  removes  the path from the list of paths searched when looking for an
              include file. The call is ignored if the path is  not  contained  in  the  list  of
              paths. The method returns the empty string as its result.

       ::doctools::toc::parse include clear
              This method clears the list of search paths for include files.
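
       The include path methods above might be used as in the following sketch. It  assumes  that
       Tcllib is installed; the directory name is hypothetical and does not need to exist for the
       list handling shown here.

```tcl
package require doctools::toc::parse

# Start from a known state.
doctools::toc::parse include clear

# Hypothetical directory holding include files.
set dir /tmp/doc/includes

doctools::toc::parse include add $dir
doctools::toc::parse include add $dir   ;# ignored, the path is already listed

set before [doctools::toc::parse includes]

doctools::toc::parse include remove $dir
set after [doctools::toc::parse includes]

puts "paths before removal: [llength $before], after removal: [llength $after]"
```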

       ::doctools::toc::parse vars
              This method returns a dictionary containing the current set of predefined variables
              known to the vset markup command during processing.

       ::doctools::toc::parse var set name value
              This method adds the variable name to the set of predefined variables known to  the
              vset markup command during processing, and gives it the specified value. The method
              returns the empty string as its result.

       ::doctools::toc::parse var unset name
              This method removes the variable name from the set of predefined variables known to
              the  vset  markup command during processing. The method returns the empty string as
              its result.

       ::doctools::toc::parse var clear ?pattern?
              This method removes all variables matching the pattern from the set  of  predefined
              variables  known  to  the vset markup command during processing. The method returns
              the empty string as its result.

              The pattern matching is done with string match. The default pattern, used when none
              is specified, is *.
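
       A short sketch of managing predefined variables with the methods above.  It  assumes  that
       Tcllib is installed and Tcl 8.5 or later is used; the variable names PROJECT  and  VERSION
       are made up for the example.

```tcl
package require doctools::toc::parse

# Remove all predefined variables (the default pattern is *).
doctools::toc::parse var clear

# Hypothetical variables made available to the vset markup command.
doctools::toc::parse var set PROJECT Example
doctools::toc::parse var set VERSION 1.0

set vars [doctools::toc::parse vars]

# Remove only the variables matching a string match pattern.
doctools::toc::parse var clear VER*
set remaining [doctools::toc::parse vars]

puts "defined: [lsort [dict keys $vars]]"
puts "after clearing VER*: [lsort [dict keys $remaining]]"
```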

PARSE ERRORS

       The  format  of the parse error messages thrown when encountering violations of the doctoc
       markup syntax is human readable and not intended for processing by machines. As such it is
       not documented.

       However,  the  errorCode attached to the message is machine-readable and has the following
       format:

       [1]    The error code will be a list, each element describing a single error found in  the
              input. The list has at least one element, possibly more.

       [2]    Each  error  element  will  be a list containing six strings describing an error in
              detail. The strings will be

              [1]    The path of the file the error occurred in. This may be empty.

              [2]    The range of the token the error was found at. This range is  a  two-element
                     list  containing  the  offset  of the first and last character in the range,
                     counted from the beginning of the input (file).  Offsets  are  counted  from
                     zero.

              [3]    The  line the first character after the error is on.  Lines are counted from
                     one.

              [4]    The column the first character after the error is at.  Columns  are  counted
                     from zero.

              [5]    The  message  code  of  the  error.  This  value  can be used as argument to
                     msgcat::mc  to  obtain  a  localized  error  message,  assuming   that   the
                     application  had a suitable call of doctools::msgcat::init to initialize the
                     necessary message catalogs (See package doctools::msgcat).

              [6]    A list of details for the error, like the markup command  involved.  In  the
                     case  of  message code doctoc/include/syntax this value is the set of errors
                     found in the included file, using the format described here.
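
       The structure described above can be walked with plain Tcl list commands.   The  following
       is  only  a  sketch:  the  procedure  name  describeParseErrors,  the  sample data and the
       message code doctoc/example/code are hypothetical, and lassign requires Tcl 8.5 or later.

```tcl
# Turn a list of error elements (six strings each, as described above)
# into a list of readable lines. Nested errors from included files are
# reported recursively with extra indentation.
proc describeParseErrors {errors {indent ""}} {
    set lines {}
    foreach err $errors {
        lassign $err path range line column code details
        lassign $range first last
        if {$path eq ""} { set path "<input>" }
        lappend lines "${indent}$path:$line:$column: $code (chars $first-$last)"
        if {$code eq "doctoc/include/syntax"} {
            # For this message code the details are themselves a list of errors.
            foreach l [describeParseErrors $details "$indent  "] {
                lappend lines $l
            }
        }
    }
    return $lines
}

# Hand-written sample value in the documented shape (not real parser output).
set sample {
    {main.toc {10 19} 2 4 doctoc/example/code {item}}
}
set report [describeParseErrors $sample]
puts [join $report \n]
```

       In an application the list would presumably come from the errorCode of a failed  call,  for
       example  by  wrapping  ::doctools::toc::parse  text  in  catch  and  then  reading  the
       global variable errorCode.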

[DOCTOC] NOTATION OF TABLES OF CONTENTS

       The doctoc format for tables of contents, also called the doctoc markup language,  is  too
       large to be covered in a single section.  The interested reader should  start  with  the
       document

       [1]    doctoc language introduction

       and then proceed from there to the formal specifications, i.e. the documents

       [1]    doctoc language syntax and

       [2]    doctoc language command reference.

       to get a thorough understanding of the language.

TOC SERIALIZATION FORMAT

       Here we specify the format used by  the  doctools  v2  packages  to  serialize  tables  of
       contents as immutable values for transport, comparison, etc.

       We  distinguish  between  regular and canonical serializations.  While a table of contents
       may have more than one regular serialization, exactly one of them will be canonical.

       regular serialization

              [1]    The serialization of any table of contents is a nested Tcl dictionary.

              [2]    This dictionary holds a single key, doctools::toc, and its value. This value
                     holds the contents of the table of contents.

              [3]    The contents of the table of contents are a Tcl dictionary holding the title
                     of the table of contents, a label, and its elements. The relevant  keys  and
                     their values are

                     title  The value is a string containing the title of the table of contents.

                     label  The value is a string containing a label for the table of contents.

                     items  The  value  is  a  Tcl list holding the elements of the table, in the
                            order they are to be shown.

                            Each element is a Tcl list holding the type  of  the  item,  and  its
                            description,  in this order. An alternative description would be that
                            it is a Tcl dictionary holding a single key, the item type, mapped to
                            the item description.

                            The two legal item types and their descriptions are

                            reference
                                   This  item  describes a single entry in the table of contents,
                                   referencing a single document.  To this end its value is a Tcl
                                   dictionary  containing  an  id  for the referenced document, a
                                   label,  and  a  longer  textual  description  which   can   be
                                   associated with the entry.  The relevant keys and their values
                                   are

                                   id     The value is a string containing the id of the document
                                          associated with the entry.

                                   label  The  value  is  a  string  containing  a label for this
                                          entry. This string also identifies the  entry,  and  no
                                          two   entries   (references   and   divisions)  in  the
                                          containing list are allowed to have the same label.

                                   desc   The value is a string containing a  longer  description
                                          for this entry.

                            division
                                   This  item  describes  a  group  of  entries  in  the table of
                                   contents, inducing a hierarchy of entries.  To  this  end  its
                                   value is a Tcl dictionary containing a label for the group, an
                                   optional id to a document for the whole group, and the list of
                                   entries in the group.  The relevant keys and their values are

                                   id     The value is a string containing the id of the document
                                          associated with the whole group. This key is optional.

                                   label  The value is a string containing a label for the group.
                                          This  string  also  identifies  the  entry,  and no two
                                          entries (references and divisions)  in  the  containing
                                          list are allowed to have the same label.

                                   items  The  value  is  a  Tcl list holding the elements of the
                                          group, in the order they are to be  shown.   This  list
                                          has  the  same  structure  as the value for the keyword
                                          items used to describe the whole table of contents, see
                                          above.  This  closes  the  recursive  definition of the
                                          structure, with divisions  holding  the  same  type  of
                                          elements  as  the  whole  table  of contents, including
                                          other divisions.
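
       To make the nesting concrete, here is a hand-written regular serialization  and  a  small
       recursive  walker.   This  is  a  sketch  only:  the  titles,  ids  and labels are made up,
       the procedure name tocLabels is hypothetical, and dict and lassign require Tcl  8.5  or
       later.

```tcl
# A regular (not necessarily canonical) serialization with one reference
# and one division, which in turn holds another reference.
set toc {doctools::toc {
    title {My Docs}
    label {Table of Contents}
    items {
        {reference {id intro.man label Introduction desc {A first look}}}
        {division {label Guides items {
            {reference {id guide.man label {User Guide} desc {How to use it}}}
        }}}
    }
}}

# Collect the labels of all entries, indented by nesting depth.
proc tocLabels {items {depth 0}} {
    set out {}
    foreach item $items {
        lassign $item type data
        lappend out "[string repeat {  } $depth][dict get $data label]"
        if {$type eq "division"} {
            # Divisions hold the same kind of list as the top-level items key.
            foreach l [tocLabels [dict get $data items] [expr {$depth + 1}]] {
                lappend out $l
            }
        }
    }
    return $out
}

set labels [tocLabels [dict get $toc doctools::toc items]]
puts [join $labels \n]
```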

       canonical serialization
              The canonical serialization of a table of contents has  the  format  specified  in
              the  previous  item,  and  then additionally satisfies the constraints below, which
              make it unique among all the possible serializations of this table of contents.

              [1]    The keys found in all the nested Tcl dictionaries are  sorted  in  ascending
                     dictionary  order,  as  generated by Tcl's builtin command lsort -increasing
                     -dict.
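
       The key ordering constraint can be illustrated for a  single  dictionary  as  follows.  The
       procedure name sortDictKeys is hypothetical, the sketch handles only one level of  nesting,
       and it is not the code this package uses internally. Tcl 8.5 or later is assumed.

```tcl
# Rebuild a dictionary with its keys in ascending dictionary order.
proc sortDictKeys {d} {
    set out {}
    foreach key [lsort -increasing -dictionary [dict keys $d]] {
        lappend out $key [dict get $d $key]
    }
    return $out
}

# Description of a reference item with keys in arbitrary order.
set ref    {label Introduction id intro.man desc {A first look}}
set sorted [sortDictKeys $ref]
puts $sorted
```

       A full canonicalization would have to apply the same ordering to every  nested  dictionary,
       including those inside the items lists.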

BUGS, IDEAS, FEEDBACK

       This document, and the package it describes,  will  undoubtedly  contain  bugs  and  other
       problems.    Please   report  such  in  the  category  doctools  of  the  Tcllib  Trackers
       [http://core.tcl.tk/tcllib/reportlist].  Please also report any ideas for enhancements you
       may have for either package and/or documentation.

       When proposing code changes, please provide unified diffs, i.e. the output of diff -u.

       Note further that attachments are strongly preferred over inlined patches. Attachments can
       be made by going to the Edit form of the ticket immediately after its creation,  and  then
       using the left-most button in the secondary navigation bar.

KEYWORDS

       doctoc, doctools, lexer, parser

CATEGORY

       Documentation tools

COPYRIGHT

       Copyright (c) 2009 Andreas Kupries <andreas_kupries@users.sourceforge.net>