Provided by: tcllib_1.21+dfsg-1_all

NAME

       doctools::toc::parse - Parsing text in doctoc format

SYNOPSIS

       package require doctools::toc::parse  ?0.1?

       package require Tcl  8.4

       package require doctools::toc::structure

       package require doctools::msgcat

       package require doctools::tcl::parse

       package require fileutil

       package require logger

       package require snit

       package require struct::list

       package require struct::stack

       ::doctools::toc::parse text text

       ::doctools::toc::parse file path

       ::doctools::toc::parse includes

       ::doctools::toc::parse include add path

       ::doctools::toc::parse include remove path

       ::doctools::toc::parse include clear

       ::doctools::toc::parse vars

       ::doctools::toc::parse var set name value

       ::doctools::toc::parse var unset name

       ::doctools::toc::parse var clear ?pattern?

________________________________________________________________________________________________________________

DESCRIPTION

       This package provides commands to parse text written in the doctoc markup language and convert it into
       the canonical serialization of the table of contents encoded in the text. See the section ToC
       serialization format for the specification of that format.

       This is an internal package of doctools, for use by the higher level packages handling doctoc documents.

API

       ::doctools::toc::parse text text
              The command takes the string contained in text and parses it under the assumption that it contains
              a document written using the doctoc markup language. An error is  thrown  if  this  assumption  is
              found to be false. The format of these errors is described in section Parse errors.

              When successful the command returns the canonical serialization of the table of contents which was
              encoded in the text.  See the section ToC serialization format for specification of that format.

       ::doctools::toc::parse file path
              The same as text, except that the text to parse is read from the file specified by path.

       ::doctools::toc::parse includes
              This method returns the current list of search paths used when looking for include files.

       ::doctools::toc::parse include add path
              This method adds the path to the list of paths searched when looking for an include file. The call
              is ignored if the path is already in the list of paths. The method returns the empty string as its
              result.

       ::doctools::toc::parse include remove path
              This method removes the path from the list of paths searched when looking for an include file. The
              call  is  ignored  if the path is not contained in the list of paths. The method returns the empty
              string as its result.

       ::doctools::toc::parse include clear
              This method clears the list of search paths for include files.

       ::doctools::toc::parse vars
              This method returns a dictionary containing the current set of predefined variables known  to  the
              vset markup command during processing.

       ::doctools::toc::parse var set name value
              This  method  adds  the  variable name to the set of predefined variables known to the vset markup
              command during processing, and gives it the specified value. The method returns the  empty  string
              as its result.

       ::doctools::toc::parse var unset name
              This  method  removes  the  variable  name  from the set of predefined variables known to the vset
              markup command during processing. The method returns the empty string as its result.

       ::doctools::toc::parse var clear ?pattern?
              This method removes all variables matching the pattern from the set of predefined variables  known
              to the vset markup command during processing. The method returns the empty string as its result.

              The pattern matching is done with string match. The default pattern, used when none is
              specified, is *.
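
       The methods above can be combined as in the following minimal sketch. It assumes tcllib is installed
       and a Tcl version that provides dict (8.5 or later). The search path, the variable names, and the
       doctoc input are hypothetical values chosen for illustration only.

```tcl
package require doctools::toc::parse

# Hypothetical include search path and predefined variables.
::doctools::toc::parse include add /tmp/doc-includes
::doctools::toc::parse var set project Example
::doctools::toc::parse var set project_version 1.0

# A small doctoc document, assembled line by line.
set doctoc [join {
    {[toc_begin Example {Table of Contents}]}
    {[item intro.man Introduction {A short introduction}]}
    {[toc_end]}
} \n]

if {[catch {::doctools::toc::parse text $doctoc} result options]} {
    # The structure of the error code is described in section Parse errors.
    puts stderr "parse failed: [dict get $options -errorcode]"
} else {
    # On success the result is the canonical serialization.
    puts $result
}

# Remove only the variables whose names match the glob pattern,
# then reset the include search paths.
::doctools::toc::parse var clear project*
::doctools::toc::parse include clear
```

       The parser state (search paths and variables) is shared by all later calls, which is why the sketch
       clears what it set once it is done.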

PARSE ERRORS

       The format of the parse error messages thrown when encountering violations of the doctoc markup syntax is
       human readable and not intended for processing by machines. As such it is not documented.

       However, the errorCode attached to the message is machine-readable and has the following format:

       [1]    The error code will be a list, each element describing a single error found in the input. The list
              has at least one element, possibly more.

       [2]    Each error element will be a list containing six  strings  describing  an  error  in  detail.  The
              strings will be

              [1]    The path of the file the error occurred in. This may be empty.

              [2]    The  range of the token the error was found at. This range is a two-element list containing
                     the offset of the first and last character in the range, counted from the beginning of  the
                     input (file). Offsets are counted from zero.

              [3]    The line the first character after the error is on.  Lines are counted from one.

              [4]    The column the first character after the error is at.  Columns are counted from zero.

              [5]    The message code of the error. This value can be used as argument to msgcat::mc to obtain a
                     localized  error  message,  assuming  that  the  application  had  a   suitable   call   of
                     doctools::msgcat::init   to   initialize   the  necessary  message  catalogs  (See  package
                     doctools::msgcat).

              [6]    A list of details for the error, like the markup command involved. In the case  of  message
                     code  doctoc/include/syntax  this  value  is  the set of errors found in the included file,
                     using the format described here.
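
       A sketch of how such an error code could be walked is shown below. The procedure name
       formatParseErrors and the message code doctoc/example/code in the sample data are hypothetical; only
       doctoc/include/syntax is taken from the description above. The code uses lassign and therefore
       assumes Tcl 8.5 or later.

```tcl
# Turn a doctoc parse error code into a list of "path:line:column: code" lines.
# Errors nested under doctoc/include/syntax are formatted recursively and indented.
proc formatParseErrors {errorCode} {
    set lines {}
    foreach err $errorCode {
        lassign $err path range line column code details
        if {$path eq ""} { set path "<input>" }
        if {$code eq "doctoc/include/syntax"} {
            lappend lines [format "%s:%s:%s: %s" $path $line $column $code]
            foreach nested [formatParseErrors $details] {
                lappend lines "    $nested"
            }
        } else {
            lappend lines [format "%s:%s:%s: %s (%s)" $path $line $column $code $details]
        }
    }
    return $lines
}

# Hand-written sample in the documented six-element shape (illustrative values only).
set sample {
    {main.toc {10 25} 2 0 doctoc/include/syntax {
        {part.inc {0 4} 1 5 doctoc/example/code {item}}
    }}
}
puts [join [formatParseErrors $sample] \n]
```

       In application code the list would typically be taken from the -errorcode entry of the options
       dictionary filled in by catch.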

[DOCTOC] NOTATION OF TABLES OF CONTENTS

       The doctoc format for tables of contents, also called the doctoc markup language, is too large to be
       covered in a single section.  The interested reader should start with the document

       [1]    doctoc language introduction

       and then proceed from there to the formal specifications, i.e. the documents

       [1]    doctoc language syntax and

       [2]    doctoc language command reference.

       to get a thorough understanding of the language.

TOC SERIALIZATION FORMAT

       Here  we specify the format used by the doctools v2 packages to serialize tables of contents as immutable
       values for transport, comparison, etc.

       We distinguish between regular and canonical serializations.  While a table of contents may have more
       than one regular serialization, exactly one of them is canonical.

       regular serialization

              [1]    The serialization of any table of contents is a nested Tcl dictionary.

              [2]    This  dictionary  holds  a  single  key, doctools::toc, and its value. This value holds the
                     contents of the table of contents.

              [3]    The contents of the table of contents are a Tcl dictionary holding the title of  the  table
                     of contents, a label, and its elements. The relevant keys and their values are

                     title  The value is a string containing the title of the table of contents.

                     label  The value is a string containing a label for the table of contents.

                     items  The  value is a Tcl list holding the elements of the table, in the order they are to
                            be shown.

                            Each element is a Tcl list holding the type of the item,  and  its  description,  in
                            this  order. An alternative description would be that it is a Tcl dictionary holding
                            a single key, the item type, mapped to the item description.

                            The two legal item types and their descriptions are

                            reference
                                   This item describes a single entry in the table of  contents,  referencing  a
                                   single  document.  To this end its value is a Tcl dictionary containing an id
                                   for the referenced document, a label, and a longer textual description  which
                                   can be associated with the entry.  The relevant keys and their values are

                                   id     The  value  is  a  string containing the id of the document associated
                                          with the entry.

                                   label  The value is a string containing a label for this entry.  This  string
                                          also  identifies  the  entry,  and  no  two  entries  (references  and
                                          divisions) in the containing list are allowed to have the same label.

                                   desc   The value is a string containing a longer description for this entry.

                            division
                                   This item describes a group of entries in the table of contents,  inducing  a
                                   hierarchy of entries.  To this end its value is a Tcl dictionary containing a
                                   label for the group, an optional id to a document for the  whole  group,  and
                                   the list of entries in the group.  The relevant keys and their values are

                                   id     The  value  is  a  string containing the id of the document associated
                                          with the whole group. This key is optional.

                                   label  The value is a string containing a label for the  group.  This  string
                                          also  identifies  the  entry,  and  no  two  entries  (references  and
                                          divisions) in the containing list are allowed to have the same label.

                                   items  The value is a Tcl list holding the elements  of  the  group,  in  the
                                          order  they  are to be shown.  This list has the same structure as the
                                          value for the keyword items  used  to  describe  the  whole  table  of
                                          contents,  see  above.  This  closes  the  recursive definition of the
                                          structure, with divisions holding the same type  of  elements  as  the
                                          whole table of contents, including other divisions.
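
       As an illustration of the structure described above, the following sketch builds a small regular
       serialization by hand and walks it recursively. All titles, labels, and ids are made up, the
       procedure name tocLabels is hypothetical, and Tcl 8.5 or later is assumed for dict and lassign.

```tcl
# A regular (not canonical) serialization: the keys are not sorted.
set toc {
    doctools::toc {
        title {Table of Contents}
        label Example
        items {
            {reference {id intro.man label Introduction desc {A short introduction}}}
            {division {
                label Guides
                items {
                    {reference {id guide.man label {User Guide} desc {How to use it}}}
                }
            }}
        }
    }
}

# Collect the labels of all entries, indenting by two spaces per nesting level.
proc tocLabels {items {depth 0}} {
    set out {}
    foreach item $items {
        lassign $item type desc
        lappend out "[string repeat {  } $depth][dict get $desc label]"
        if {$type eq "division"} {
            # Divisions hold a list with the same structure as the top-level items.
            foreach nested [tocLabels [dict get $desc items] [expr {$depth + 1}]] {
                lappend out $nested
            }
        }
    }
    return $out
}

set labels [tocLabels [dict get $toc doctools::toc items]]
puts [join $labels \n]
```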

       canonical serialization
              The  canonical  serialization  of  a table of contents has the format as specified in the previous
              item, and then additionally satisfies the constraints below, which make it unique  among  all  the
              possible serializations of this table of contents.

              [1]    The keys found in all the nested Tcl dictionaries are sorted in ascending dictionary order,
                     as generated by Tcl's builtin command lsort -increasing -dict.
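
       The key-ordering constraint can be sketched as follows. This is an illustration of the rule stated
       above, not the implementation used by the doctools packages, and the procedure names sortKeys,
       canonItems, and canonToc are hypothetical. Tcl 8.5 or later is assumed.

```tcl
# Rebuild a dictionary with its keys in ascending dictionary order.
proc sortKeys {d} {
    set out {}
    foreach k [lsort -increasing -dict [dict keys $d]] {
        lappend out $k [dict get $d $k]
    }
    return $out
}

# Sort the keys of every item description, descending into divisions.
# The order of the items themselves is left untouched.
proc canonItems {items} {
    set out {}
    foreach item $items {
        lassign $item type desc
        if {$type eq "division"} {
            dict set desc items [canonItems [dict get $desc items]]
        }
        lappend out [list $type [sortKeys $desc]]
    }
    return $out
}

# Apply the key ordering to a whole regular serialization.
proc canonToc {toc} {
    set body [dict get $toc doctools::toc]
    dict set body items [canonItems [dict get $body items]]
    return [list doctools::toc [sortKeys $body]]
}

puts [canonToc {doctools::toc {title T label L items {{reference {label A id a desc d}}}}}]
```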

BUGS, IDEAS, FEEDBACK

       This document, and the package it describes, will undoubtedly contain bugs and  other  problems.   Please
       report  such  in  the  category  doctools  of the Tcllib Trackers [http://core.tcl.tk/tcllib/reportlist].
       Please also report any ideas for enhancements you may have for either package and/or documentation.

       When proposing code changes, please provide unified diffs, i.e. the output of diff -u.

       Note further that attachments are strongly preferred over inlined patches. Attachments  can  be  made  by
       going  to the Edit form of the ticket immediately after its creation, and then using the left-most button
       in the secondary navigation bar.

KEYWORDS

       doctoc, doctools, lexer, parser

CATEGORY

       Documentation tools

COPYRIGHT

       Copyright (c) 2009 Andreas Kupries <andreas_kupries@users.sourceforge.net>