Provided by: tcllib_1.21+dfsg-1_all 

NAME
doctools::toc::parse - Parsing text in doctoc format
SYNOPSIS
package require doctools::toc::parse ?0.1?
package require Tcl 8.4
package require doctools::toc::structure
package require doctools::msgcat
package require doctools::tcl::parse
package require fileutil
package require logger
package require snit
package require struct::list
package require struct::stack
::doctools::toc::parse text text
::doctools::toc::parse file path
::doctools::toc::parse includes
::doctools::toc::parse include add path
::doctools::toc::parse include remove path
::doctools::toc::parse include clear
::doctools::toc::parse vars
::doctools::toc::parse var set name value
::doctools::toc::parse var unset name
::doctools::toc::parse var clear ?pattern?
________________________________________________________________________________________________________________
DESCRIPTION
This package provides commands to parse text written in the doctoc markup language and convert it into
the canonical serialization of the table of contents encoded in the text. See the section ToC
serialization format for the specification of that format.
This is an internal package of doctools, for use by the higher level packages handling doctoc documents.
API
::doctools::toc::parse text text
The command takes the string contained in text and parses it under the assumption that it contains
a document written using the doctoc markup language. An error is thrown if this assumption is
found to be false. The format of these errors is described in section Parse errors.
When successful the command returns the canonical serialization of the table of contents which was
encoded in the text. See the section ToC serialization format for specification of that format.
::doctools::toc::parse file path
The same as text, except that the text to parse is read from the file specified by path.
::doctools::toc::parse includes
This method returns the current list of search paths used when looking for include files.
::doctools::toc::parse include add path
This method adds the path to the list of paths searched when looking for an include file. The call
is ignored if the path is already in the list of paths. The method returns the empty string as its
result.
::doctools::toc::parse include remove path
This method removes the path from the list of paths searched when looking for an include file. The
call is ignored if the path is not contained in the list of paths. The method returns the empty
string as its result.
::doctools::toc::parse include clear
This method clears the list of search paths for include files.
::doctools::toc::parse vars
This method returns a dictionary containing the current set of predefined variables known to the
vset markup command during processing.
::doctools::toc::parse var set name value
This method adds the variable name to the set of predefined variables known to the vset markup
command during processing, and gives it the specified value. The method returns the empty string
as its result.
::doctools::toc::parse var unset name
This method removes the variable name from the set of predefined variables known to the vset
markup command during processing. The method returns the empty string as its result.
::doctools::toc::parse var clear ?pattern?
This method removes all variables matching the pattern from the set of predefined variables known
to the vset markup command during processing. The method returns the empty string as its result.
The pattern matching is done with string match. The default pattern, used when none is
specified, is *.
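The commands above can be combined as in the following minimal sketch. It assumes that Tcllib is installed and that Tcl 8.5 or later is used (for dict). The document text, file ids, labels, and the variable name project are made up for illustration.

```tcl
package require doctools::toc::parse

# A small doctoc document. All ids and labels are hypothetical.
set doctoc {[toc_begin Demo {Demo table of contents}]
[item intro.man Introduction {Getting started}]
[division_start Reference]
[item api.man API {Command reference}]
[division_end]
[toc_end]}

# Start from a known configuration: no include paths, no variables.
::doctools::toc::parse include clear
::doctools::toc::parse var clear

# Search the current directory for include files and predefine
# one variable for the vset markup command.
::doctools::toc::parse include add [pwd]
::doctools::toc::parse var set project Demo

# Parse the text. The result is the canonical serialization,
# a dictionary with the single key doctools::toc.
set toc      [::doctools::toc::parse text $doctoc]
set contents [dict get $toc doctools::toc]

puts "title: [dict get $contents title]"
puts "items: [llength [dict get $contents items]]"
puts "vars : [::doctools::toc::parse vars]"
```

Parsing a file works the same way, with ::doctools::toc::parse file and a path in place of the text.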
PARSE ERRORS
The format of the parse error messages thrown when encountering violations of the doctoc markup syntax is
human readable and not intended for processing by machines. As such it is not documented.
However, the errorCode attached to the message is machine-readable and has the following format:
[1] The error code will be a list, each element describing a single error found in the input. The list
has at least one element, possibly more.
[2] Each error element will be a list containing six strings describing an error in detail. The
strings will be
[1] The path of the file the error occurred in. This may be empty.
[2] The range of the token the error was found at. This range is a two-element list containing
the offset of the first and last character in the range, counted from the beginning of the
input (file). Offsets are counted from zero.
[3] The line the first character after the error is on. Lines are counted from one.
[4] The column the first character after the error is at. Columns are counted from zero.
[5] The message code of the error. This value can be used as argument to msgcat::mc to obtain a
localized error message, assuming that the application had a suitable call of
doctools::msgcat::init to initialize the necessary message catalogs (see package
doctools::msgcat).
[6] A list of details for the error, like the markup command involved. In the case of message
code doctoc/include/syntax this value is the set of errors found in the included file,
using the format described here.
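As an illustration of the six-element structure described above, the following sketch turns such an error list into one line per error. It assumes Tcl 8.5 or later (for lassign). The proc name formatParseErrors, the sample values, and the message code doctoc/hypothetical/code are invented for this example and are not produced by the package.

```tcl
# Hypothetical sample shaped like the errorCode described above:
# each element is {path range line column msgcode details}.
set sampleErrors {
    {toc.man {12 19} 2 0 doctoc/hypothetical/code {item}}
}

# Format each error element as "path:line:column: code (chars a-b) details".
proc formatParseErrors {errors} {
    set lines {}
    foreach err $errors {
        lassign $err path range line column code details
        lassign $range first last
        # The path may be empty, e.g. when parsing a string.
        if {$path eq ""} { set path <input> }
        lappend lines "$path:$line:$column: $code (chars $first-$last) $details"
    }
    return [join $lines \n]
}

puts [formatParseErrors $sampleErrors]
```

In an application the list would come from the error options of a failed parse, for example via catch with an options variable and dict get on its -errorcode entry.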
[DOCTOC] NOTATION OF TABLES OF CONTENTS
The doctoc format for tables of contents, also called the doctoc markup language, is too large to be
covered in a single section. The interested reader should start with the document
[1] doctoc language introduction
and then proceed from there to the formal specifications, i.e. the documents
[1] doctoc language syntax and
[2] doctoc language command reference.
to get a thorough understanding of the language.
TOC SERIALIZATION FORMAT
Here we specify the format used by the doctools v2 packages to serialize tables of contents as immutable
values for transport, comparison, etc.
We distinguish between regular and canonical serializations. While a table of contents may have more
than one regular serialization, exactly one of them will be canonical.
regular serialization
[1] The serialization of any table of contents is a nested Tcl dictionary.
[2] This dictionary holds a single key, doctools::toc, and its value. This value holds the
contents of the table of contents.
[3] The contents of the table of contents are a Tcl dictionary holding the title of the table
of contents, a label, and its elements. The relevant keys and their values are
title The value is a string containing the title of the table of contents.
label The value is a string containing a label for the table of contents.
items The value is a Tcl list holding the elements of the table, in the order they are to
be shown.
Each element is a Tcl list holding the type of the item, and its description, in
this order. An alternative description would be that it is a Tcl dictionary holding
a single key, the item type, mapped to the item description.
The two legal item types and their descriptions are
reference
This item describes a single entry in the table of contents, referencing a
single document. To this end its value is a Tcl dictionary containing an id
for the referenced document, a label, and a longer textual description which
can be associated with the entry. The relevant keys and their values are
id The value is a string containing the id of the document associated
with the entry.
label The value is a string containing a label for this entry. This string
also identifies the entry, and no two entries (references and
divisions) in the containing list are allowed to have the same label.
desc The value is a string containing a longer description for this entry.
division
This item describes a group of entries in the table of contents, inducing a
hierarchy of entries. To this end its value is a Tcl dictionary containing a
label for the group, an optional id to a document for the whole group, and
the list of entries in the group. The relevant keys and their values are
id The value is a string containing the id of the document associated
with the whole group. This key is optional.
label The value is a string containing a label for the group. This string
also identifies the entry, and no two entries (references and
divisions) in the containing list are allowed to have the same label.
items The value is a Tcl list holding the elements of the group, in the
order they are to be shown. This list has the same structure as the
value for the keyword items used to describe the whole table of
contents, see above. This closes the recursive definition of the
structure, with divisions holding the same type of elements as the
whole table of contents, including other divisions.
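The nested structure described above can be built and traversed with plain Tcl dict and list commands, as in this sketch. It assumes Tcl 8.5 or later. The ids, labels, and the proc name walkToc are hypothetical and not part of the package.

```tcl
# One reference and one division holding a nested reference.
set ref [dict create id intro.man label Introduction desc {Getting started}]
set sub [dict create id api.man label API desc {Command reference}]
set div [dict create label Reference items [list [list reference $sub]]]

# A regular serialization: a single key, doctools::toc, whose value
# holds title, label, and the list of items.
set toc [dict create doctools::toc [dict create \
    title {Demo table of contents} \
    label Demo \
    items [list [list reference $ref] [list division $div]]]]

# Collect the labels of all entries, indented by nesting depth.
proc walkToc {items {depth 0}} {
    set out {}
    foreach item $items {
        lassign $item type data
        lappend out "[string repeat {  } $depth][dict get $data label]"
        if {$type eq "division"} {
            # Divisions hold the same kind of item list, so recurse.
            foreach line [walkToc [dict get $data items] [expr {$depth + 1}]] {
                lappend out $line
            }
        }
    }
    return $out
}

puts [join [walkToc [dict get $toc doctools::toc items]] \n]
```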
canonical serialization
The canonical serialization of a table of contents has the format specified in the previous
item, and then additionally satisfies the constraints below, which make it unique among all the
possible serializations of this table of contents.
[1] The keys found in all the nested Tcl dictionaries are sorted in ascending dictionary order,
as generated by Tcl's builtin command lsort -increasing -dict.
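The key ordering constraint can be illustrated with a hand-written sketch that rebuilds a regular serialization with sorted keys. This is not the package's own implementation. It assumes Tcl 8.5 or later, and the proc names sortKeys, canonItems, and canonToc are invented for the example.

```tcl
# Rebuild a dictionary with its keys in ascending dictionary order.
proc sortKeys {d} {
    set out {}
    foreach k [lsort -increasing -dict [dict keys $d]] {
        lappend out $k [dict get $d $k]
    }
    return $out
}

# Sort the keys of every item description, recursing into divisions.
proc canonItems {items} {
    set out {}
    foreach item $items {
        lassign $item type data
        if {$type eq "division"} {
            dict set data items [canonItems [dict get $data items]]
        }
        lappend out [list $type [sortKeys $data]]
    }
    return $out
}

# Sort the keys of the whole table of contents.
proc canonToc {toc} {
    set contents [dict get $toc doctools::toc]
    dict set contents items [canonItems [dict get $contents items]]
    return [list doctools::toc [sortKeys $contents]]
}

# A regular serialization with keys in arbitrary order (hypothetical data).
set regular {doctools::toc {title T label L items {{reference {label A id a.man desc D}}}}}
puts [canonToc $regular]
```

For the sample input the keys come out as desc, id, label inside the reference and items, label, title at the top level.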
BUGS, IDEAS, FEEDBACK
This document, and the package it describes, will undoubtedly contain bugs and other problems. Please
report such in the category doctools of the Tcllib Trackers [http://core.tcl.tk/tcllib/reportlist].
Please also report any ideas for enhancements you may have for either package and/or documentation.
When proposing code changes, please provide unified diffs, i.e. the output of diff -u.
Note further that attachments are strongly preferred over inlined patches. Attachments can be made by
going to the Edit form of the ticket immediately after its creation, and then using the left-most button
in the secondary navigation bar.
KEYWORDS
doctoc, doctools, lexer, parser
CATEGORY
Documentation tools
COPYRIGHT
Copyright (c) 2009 Andreas Kupries <andreas_kupries@users.sourceforge.net>
tcllib 1 doctools::toc::parse(3tcl)