Provided by: tcllib_1.15-dfsg-2_all bug

NAME

       page - Parser Generator

SYNOPSIS

       page ?options...? ?input ?output??

_________________________________________________________________

DESCRIPTION

       The application described by this document, page, is actually not just a parser generator,
       as the name implies, but a generic tool for the execution of arbitrary transformations  on
       texts.

       Its  genericity  comes  through  the use of plugins for reading, transforming, and writing
       data, and the predefined set of plugins provided  by  Tcllib  is  for  the  generation  of
       memoizing  recursive  descent  parsers  (aka  packrat parsers) from grammar specifications
       (Parsing Expression Grammars).

       page is written on top of the package page::pluginmgr, wrapping its functionality  into  a
       command  line  based  application.  All  the  other  page::*  packages  are  plugin and/or
       supporting packages for the generation of parsers. The parsers themselves are based on the
       packages grammar::peg, grammar::peg::interp, and grammar::mengine.

   COMMAND LINE
       page ?options...? ?input ?output??
              This  is  general  form for calling page. The application will read the contents of
              the file input, process them under the control of the specified options,  and  then
              write the result to the file output.

              If  input  is the string - the data to process will be read from stdin instead of a
              file. Analogously the result will be written to stdout instead of a file if  output
              is  the string -. A missing output or input specification causes the application to
              assume -.

              The detailed specifications of the  recognized  options  are  provided  in  section
              OPTIONS.

              path input (in)
                     This  argument  specifies  the  path  to  the  file  to  be processed by the
                     application, or -. The last value causes the application to  read  the  text
                     from  stdin.  Otherwise it has to exist, and be readable. If the argument is
                     missing - is assumed.

              path output (in)
                     This argument specifies where to write the generated text.  It  can  be  the
                     path  to  a  file,  or -. The last value causes the application to write the
                     generated documented to stdout.

                     If the file output does not exist then [file dirname $output] has  to  exist
                     and  must be a writable directory, as the application will create the fileto
                     write to.

                     If the argument is missing - is assumed.

   OPERATION
   OPTIONS
       This section describes all the options available to the user of the  application.  Options
       are  always processed in order. I.e. of both --help and --version are specified the option
       encountered first has precedence.

       Unknown options specified before any of the options -rd, -wr, or -tr will cause processing
       to abort with an error. Unknown options coming in between these options, or after the last
       of them are assumed to always take a single argument and  are  associated  with  the  last
       plugin option coming before them. They will be checked after all the relevant plugins, and
       thus the options they understand, are known. I.e. such unknown options cause error if  and
       only  if  the plugin option they are associated with does not understand them, and was not
       superceded by a plugin option coming after.

       Default options are used if and only if the command line did not contain  any  options  at
       all.  They  will set the application up as a PEG-based parser generator. The exact list of
       options is

              -c peg

       And now the recognized options and their arguments, if they have any:

       --help

       -h

       -?     When one of these options is found on the command line all arguments coming  before
              or  after  are  ignored.  The  application  will  print  a short description of the
              recognized options and exit.

       --version

       -V     When one of these options is found on the command line all arguments coming  before
              or after are ignored. The application will print its own revision and exit.

       -P     This  option  signals the application to activate visual feedback while reading the
              input.

       -T     This option signals the application to collect statistics while reading  the  input
              and to print them after reading has completed, before processing started.

       -D     This  option  signals the application to activate logging in the Safe base, for the
              debugging of problems with plugins.

       -r parser

       -rd parser

       --reader parser
              These options specify the plugin the application has to use for reading the  input.
              If the options are used multiple times the last one will be used.

       -w generator

       -wr generator

       --writer generator
              These  options  specify  the  plugin  the application has to use for generating and
              writing the final output. If the options are used multiple times the last one  will
              be used.

       -t process

       -tr process

       --transform process
              These  options  specify  a  plugin  to run on the input. In contrast to readers and
              writers each use will not supersede previous uses, but add each chosen plugin to  a
              list  of transformations, either at the front, or the end, per the last seen use of
              either option -p or -a. The initial default is to append the new transformations.

       -a

       --append
              These options signal the application that all following transformations  should  be
              added at the end of the list of transformations.

       -p

       --prepend
              These  options  signal the application that all following transformations should be
              added at the beginning of the list of transformations.

       --reset
              This option signals the application to clear the list of transformations.  This  is
              necessary to wipe out the default transformations used.

       -c file

       --configuration file
              This option causes the application to load a configuration file and/or plugin. This
              is a plugin which in essence provides a pre-defined  set  of  commandline  options.
              They  are  processed  exactly as if they have been specified in place of the option
              and its arguments. This means that unknown options found at the  beginning  of  the
              configuration  file  are  associated  with the last plugin, even if that plugin was
              specified before the configuration file itself. Conversely, unknown options  coming
              after the configuration file can be associated with a plugin specified in the file.

              If  the  argument is a file which cannot be loaded as a plugin the application will
              assume that its contents are a list of options and their  arguments,  separated  by
              space,  tabs,  and newlines. Options and argumentes containing spaces can be quoted
              via double-quotes (") and quotes ('). The quote character can be  specified  within
              in a quoted string by doubling it. Newlines in a quoted string are accepted as is.

   PLUGINS
       page   makes   use   of  four  different  types  of  plugins,  namely:  readers,  writers,
       transformations, and configurations. Here we provide only a basic introduction on  how  to
       use  them from page. The exact APIs provided to and expected from the plugins can be found
       in the documentation for page::pluginmgr, for those who wish to write their own plugins.

       Plugins are specified as arguments to the options -r, -w, -t,  -c,  and  their  equivalent
       longer forms. See the section OPTIONS for reference.

       Each  such argument will be first treated as the name of a file and this file is loaded as
       the plugin. If however there is no file with that name, then it will  be  translated  into
       the  name  of  a  package,  and  this package is then loaded. For each type of plugins the
       package management searches not only the regular paths, but a set application-  and  type-
       specific paths as well. Please see the section PLUGIN LOCATIONS for a listing of all paths
       and their sources.

       -c name
              Configurations.   The   name   of   the   package   for   the   plugin   name    is
              "page::config::name".

              We have one predefined plugin:

              peg    It  sets  the  application  up  as  a  parser  generator  accepting  parsing
                     expression grammars  and  writing  a  packrat  parser  in  Tcl.  The  actual
                     arguments it specifies are:

                     --reset
                     --append
                     --reader    peg
                     --transform reach
                     --transform use
                     --writer    me

       -r name
              Readers. The name of the package for the plugin name is "page::reader::name".

              We have five predefined plugins:

              peg    Interprets  the  input as a parsing expression grammar (PEG) and generates a
                     tree representation for it. Both the syntax of PEGs and the structure of the
                     tree representation are explained in their own manpages.

              hb     Interprets  the  input  as Tcl code as generated by the writer plugin hb and
                     generates its tree representation.

              ser    Interprets the input as the serialization of a  PEG,  as  generated  by  the
                     writer plugin ser, using the package grammar::peg.

              lemon  Interprets  the  input  as  a grammar specification as understood by Richard
                     Hipp's LEMON parser generator and generates a tree  representation  for  it.
                     Both  the  input  syntax  and  the  structure of the tree representation are
                     explained in their own manpages.

              treeser
                     Interprets the input as the serialization of a struct::tree. It is validated
                     as  such,  but nothing else. It is not assumed to be the tree representation
                     of a grammar.

       -w name
              Writers. The name of the package for the plugin name is "page::writer::name".

              We have eight predefined plugins:

              identity
                     Simply writes the incoming data as it is, without making any  changes.  This
                     is good for inspecting the raw result of a reader or transformation.

              null   Generates nothing, and ignores the incoming data structure.

              tree   Assumes  that the incoming data structure is a struct::tree and generates an
                     indented textual representation of all nodes, their parental  relationships,
                     and their attribute information.

              peg    Assumes  that  the incoming data structure is a tree representation of a PEG
                     or other other grammar and writes it out as a  PEG.  The  result  is  nicely
                     formatted  and  partially simplified (strings as sequences of characters). A
                     pretty printer in essence, but can  also  be  used  to  obtain  a  canonical
                     representation of the input grammar.

              tpc    Assumes  that  the incoming data structure is a tree representation of a PEG
                     or other other grammar and writes out Tcl  code  defining  a  package  which
                     defines  a grammar::peg object containing the grammar when it is loaded into
                     an interpreter.

              hb     This is like the writer plugin tpc, but it writes only the statements  which
                     define  stat  expression  and  grammar  rules.  The code making the result a
                     package is left out.

              ser    Assumes that the incoming data structure is a tree representation of  a  PEG
                     or  other other grammar, transforms it internally into a grammar::peg object
                     and writes out its serialization.

              me     Assumes that the incoming data structure is a tree representation of  a  PEG
                     or  other  other  grammar  and  writes out Tcl code defining a package which
                     implements a memoizing recursive descent parser based on  the  match  engine
                     (ME) provided by the package grammar::mengine.

       -t name
              Transformers.    The    name    of   the   package   for   the   plugin   name   is
              "page::transform::name".

              We have two predefined plugins:

              reach  Assumes that the incoming data structure is a tree representation of  a  PEG
                     or  other  other  grammar. It determines which nonterminal symbols and rules
                     are reachable from start-symbol/expression. All  nonterminal  symbols  which
                     were not reached are removed.

              use    Assumes  that  the incoming data structure is a tree representation of a PEG
                     or other other grammar. It determines which nonterminal  symbols  and  rules
                     are  able  to  generate a finite sequences of terminal symbols (in the sense
                     for a Context Free Grammar). All nonterminal symbols which were  not  deemed
                     useful in this sense are removed.

   PLUGIN LOCATIONS
       The application-specific paths searched by page either are, or come from:

       [1]    The directory            "~/.page/plugin"

       [2]    The environment variable PAGE_PLUGINS

       [3]    The registry entry       HKEY_LOCAL_MACHINE\SOFTWARE\PAGE\PLUGINS

       [4]    The registry entry       HKEY_CURRENT_USER\SOFTWARE\PAGE\PLUGINS

       The type-specific paths searched by page either are, or come from:

       [1]    The directory            "~/.page/plugin/<TYPE>"

       [2]    The environment variable PAGE_<TYPE>_PLUGINS

       [3]    The registry entry       HKEY_LOCAL_MACHINE\SOFTWARE\PAGE\<TYPE>\PLUGINS

       [4]    The registry entry       HKEY_CURRENT_USER\SOFTWARE\PAGE\<TYPE>\PLUGINS

       Where the placeholder <TYPE> is always one of the values below, properly capitalized.

       [1]    reader

       [2]    writer

       [3]    transform

       [4]    config

       The  registry  entries  are specific to the Windows(tm) platform, all other platforms will
       ignore them.

       The contents of both environment variables and registry entries are interpreted as a  list
       of paths, with the elements separated by either colon (Unix), or semicolon (Windows).

BUGS, IDEAS, FEEDBACK

       This  document,  and the application it describes, will undoubtedly contain bugs and other
       problems.   Please  report  such  in  the  category  page  of  the  Tcllib   SF   Trackers
       [http://sourceforge.net/tracker/?group_id=12883].    Please  also  report  any  ideas  for
       enhancements you may have for either application and/or documentation.

SEE ALSO

       page::pluginmgr

KEYWORDS

       parser generator, text processing

CATEGORY

       Page Parser Generator

COPYRIGHT

       Copyright (c) 2005 Andreas Kupries <andreas_kupries@users.sourceforge.net>