Provided by: tcllib_1.21+dfsg-1_all bug

NAME

       grammar::me::tcl - Virtual machine implementation I for parsing token streams

SYNOPSIS

       package require Tcl  8.4

       package require grammar::me::tcl  ?0.1?

       ::grammar::me::tcl cmd ...

       ::grammar::me::tcl init nextcmd ?tokmap?

       ::grammar::me::tcl lc location

       ::grammar::me::tcl tok from ?to?

       ::grammar::me::tcl tokens

       ::grammar::me::tcl sv

       ::grammar::me::tcl ast

       ::grammar::me::tcl astall

       ::grammar::me::tcl ctok

       ::grammar::me::tcl nc

       ::grammar::me::tcl next

       ::grammar::me::tcl ord

       ::grammar::me::tcl::ict_advance message

       ::grammar::me::tcl::ict_match_token tok message

       ::grammar::me::tcl::ict_match_tokrange tokbegin tokend message

       ::grammar::me::tcl::ict_match_tokclass code message

       ::grammar::me::tcl::inc_restore nt

       ::grammar::me::tcl::inc_save nt startlocation

       ::grammar::me::tcl::iok_ok

       ::grammar::me::tcl::iok_fail

       ::grammar::me::tcl::iok_negate

       ::grammar::me::tcl::icl_get

       ::grammar::me::tcl::icl_rewind oldlocation

       ::grammar::me::tcl::ier_get

       ::grammar::me::tcl::ier_clear

       ::grammar::me::tcl::ier_nonterminal message location

       ::grammar::me::tcl::ier_merge olderror

       ::grammar::me::tcl::isv_clear

       ::grammar::me::tcl::isv_terminal

       ::grammar::me::tcl::isv_nonterminal_leaf nt startlocation

       ::grammar::me::tcl::isv_nonterminal_range nt startlocation

       ::grammar::me::tcl::isv_nonterminal_reduce nt startlocation ?marker?

       ::grammar::me::tcl::ias_push

       ::grammar::me::tcl::ias_mark

       ::grammar::me::tcl::ias_pop2mark marker

_________________________________________________________________________________________________

DESCRIPTION

       This package provides an implementation of the ME virtual machine.  Please go and read the
       document grammar::me_intro first if you do not know what a ME virtual machine is.

       This implementation is tied very strongly to Tcl. All the stacks in the machine state  are
       handled  through  the  Tcl  stack,  all  control  flow is handled by Tcl commands, and the
       remaining machine instructions  are  directly  mapped  to  Tcl  commands.  Especially  the
       matching  of  nonterminal  symbols  is  handled  by  Tcl  procedures  as well, essentially
       extending the machine implementation with custom instructions.

       Further on the implementation handles only a single machine which is uninteruptible during
       execution  and  hardwired  for pull operation. I.e.  it explicitly requests each new token
       through a callback, pulling them into its state.

       A related package is grammar::peg::interp which provides a generic  interpreter  /  parser
       for  parsing  expression grammars (PEGs), implemented on top of this implementation of the
       ME virtual machine.

API

       The commands documented in this section do not implement any of the instructions of the ME
       virtual machine. They provide the facilities for the initialization of the machine and the
       retrieval of important information.

       ::grammar::me::tcl cmd ...
              This is an ensemble command  providing  access  to  the  commands  listed  in  this
              section. See the methods themselves for detailed specifications.

       ::grammar::me::tcl init nextcmd ?tokmap?
              This command (re)initializes the machine. It returns the empty string. This command
              has to be invoked before any other command of this package.

              The command prefix nextcmd represents the input stream of characters and is invoked
              by  the  machine  whenever  the  a  new  character from the stream is required. The
              instruction for handling this is ict_advance.  The callback has  to  return  either
              the empty list, or a list of 4 elements containing the token, its lexeme attribute,
              and its location as line number and column index, in this order.  The empty list is
              the  signal that the end of the input stream has been reached. The lexeme attribute
              is stored in the terminal cache, but otherwise not used by the machine.

              The optional dictionary tokmap maps from tokens to integer numbers. If present  the
              numbers   impose   an   order   on  the  tokens,  which  is  subsequently  used  by
              ict_match_tokrange to determine if a token is in the specified range or not. If  no
              token  map  is  specified  the  lexicographic  order of th token names will be used
              instead. This choice is especially asensible when using characters as tokens.

       ::grammar::me::tcl lc location
              This command converts the location of a token given as offset in the  input  stream
              into  the  associated  line number and column index. The result of the command is a
              2-element list containing the two values, in the order mentioned  in  the  previous
              sentence.   This  allows higher levels to convert the location information found in
              the error status and the generated AST into more human readable data.

              Note that the command is not able to convert locations which have not been  reached
              by the machine yet. In other words, if the machine has read 7 tokens the command is
              able to convert the offsets 0 to 6, but nothing beyond that. This also  shows  that
              it is not possible to convert offsets which refer to locations before the beginning
              of the stream.

              After a call of init the state used for the conversion is cleared,  making  further
              conversions impossible until the machine has read tokens again.

       ::grammar::me::tcl tok from ?to?
              This command returns a Tcl list containing the part of the input stream between the
              locations from and to (both inclusive). If to is not specified it will  default  to
              the value of from.

              Each  element  of  the  returned  list  is  a list of four elements, the token, its
              associated lexeme, line number, and column index, in this order.  In  other  words,
              each  element has the same structure as the result of the nextcmd callback given to
              ::grammar::me::tcl::init

              This  command  places  the  same  restrictions  on  its   location   arguments   as
              ::grammar::me::tcl::lc.

       ::grammar::me::tcl tokens
              This  command  returns  the  number  of  tokens  currently  known to the ME virtual
              machine.

       ::grammar::me::tcl sv
              This command returns the current semantic value SV stored in the machine.  This  is
              an  abstract  syntax tree as specified in the document grammar::me_ast, section AST
              VALUES.

       ::grammar::me::tcl ast
              This method returns the abstract syntax tree currently at the top of the AST  stack
              of  the  ME  virtual  machine.  This is an abstract syntax tree as specified in the
              document grammar::me_ast, section AST VALUES.

       ::grammar::me::tcl astall
              This method returns the whole stack of abstract syntax trees currently known to the
              ME virtual machine. Each element of the returned list is an abstract syntax tree as
              specified in the document grammar::me_ast, section AST  VALUES.   The  top  of  the
              stack resides at the end of the list.

       ::grammar::me::tcl ctok
              This method returns the current token considered by the ME virtual machine.

       ::grammar::me::tcl nc
              This  method  returns the contents of the nonterminal cache as a dictionary mapping
              from "symbol,location" to match information.

       ::grammar::me::tcl next
              This method returns the next token callback as specified during  initialization  of
              the ME virtual machine.

       ::grammar::me::tcl ord
              This   method   returns   a  dictionary  containing  the  tokmap  specified  during
              initialization of the ME virtual  machine.   ::grammar::me::tcl::ok  This  variable
              contains  the  current  match  status  OK.  It is provided as variable instead of a
              command because that makes access to this information  faster,  and  the  speed  of
              access  is considered very important here as this information is used constantly to
              determine the control flow.

MACHINE STATE

       Please go and read the document grammar::me_vm first for a specification of the  basic  ME
       virtual machine and its state.

       This  implementation  manages  the state described in that document, except for the stacks
       minus the AST stack. In other words, location stack, error stack, return  stack,  and  ast
       marker  stack  are  implicitly managed through standard Tcl scoping, i.e. Tcl variables in
       procedures, outside of this implementation.

MACHINE INSTRUCTIONS

       Please go and read the document grammar::me_vm first for a specification of the  basic  ME
       virtual machine and its instruction set.

       This   implementation   maps   all   instructions   to   Tcl  commands  in  the  namespace
       "::grammar::me::tcl", except for the  stack  related  commands,  nonterminal  symbols  and
       control  flow.   Here  we  simply  list  the  commands  and explain the differences to the
       specified instructions, if there are any.  For  their  semantics  see  the  aforementioned
       specification.  The  machine  commands  are  not  reachable  through  the ensemble command
       ::grammar::me::tcl.

       ::grammar::me::tcl::ict_advance message
              No changes.

       ::grammar::me::tcl::ict_match_token tok message
              No changes.

       ::grammar::me::tcl::ict_match_tokrange tokbegin tokend message
              If, and only if a token map was specified during initialization then the  arguments
              are  the  numeric  representations of the smallest and largest tokens in the range.
              Otherwise they are the relevant tokens themselves and lexicographic  comparison  is
              used.

       ::grammar::me::tcl::ict_match_tokclass code message
              No changes.

       ::grammar::me::tcl::inc_restore nt
              Instead  of  taking  a branchlabel the command returns a boolean value.  The result
              will be true if and only if cached information was found. The caller has to perform
              the appropriate branching.

       ::grammar::me::tcl::inc_save nt startlocation
              The  command  takes  the start location as additional argument, as it is managed on
              the Tcl stack, and not in the machine state.

       icf_ntcall branchlabel

       icf_ntreturn
              These  two  instructions  are  not  mapped  to  commands.  They  are  control  flow
              instructions and handled in Tcl.

       ::grammar::me::tcl::iok_ok
              No changes.

       ::grammar::me::tcl::iok_fail
              No changes.

       ::grammar::me::tcl::iok_negate
              No changes.

       icf_jalways branchlabel

       icf_jok branchlabel

       icf_jfail branchlabel

       icf_halt
              These  four  instructions  are  not  mapped  to  commands.  They  are  control flow
              instructions and handled in Tcl.

       ::grammar::me::tcl::icl_get
              This command returns the current location CL in the input.  It replaces icl_push.

       ::grammar::me::tcl::icl_rewind oldlocation
              The command takes the location as argument as it comes from the Tcl stack, not  the
              machine state.

       icl_pop
              Not mapped, the stacks are not managed by the package.

       ::grammar::me::tcl::ier_get
              This command returns the current error state ER.  It replaces ier_push.

       ::grammar::me::tcl::ier_clear
              No changes.

       ::grammar::me::tcl::ier_nonterminal message location
              The  command takes the location as argument as it comes from the Tcl stack, not the
              machine state.

       ::grammar::me::tcl::ier_merge olderror
              The command takes the second error state to merge as argument as it comes from  the
              Tcl stack, not the machine state.

       ::grammar::me::tcl::isv_clear
              No changes.

       ::grammar::me::tcl::isv_terminal
              No changes.

       ::grammar::me::tcl::isv_nonterminal_leaf nt startlocation
              The  command  takes  the start location as argument as it comes from the Tcl stack,
              not the machine state.

       ::grammar::me::tcl::isv_nonterminal_range nt startlocation
              The command takes the start location as argument as it comes from  the  Tcl  stack,
              not the machine state.

       ::grammar::me::tcl::isv_nonterminal_reduce nt startlocation ?marker?
              The  command  takes  start location and marker as argument as it comes from the Tcl
              stack, not the machine state.

       ::grammar::me::tcl::ias_push
              No changes.

       ::grammar::me::tcl::ias_mark
              This command returns a marker for the current state of the AST stack AS. The marker
              stack is not managed by the machine.

       ::grammar::me::tcl::ias_pop2mark marker
              The  command  takes  the marker as argument as it comes from the Tcl stack, not the
              machine state. It replaces ias_mpop.

BUGS, IDEAS, FEEDBACK

       This document, and the package it describes,  will  undoubtedly  contain  bugs  and  other
       problems.   Please  report  such  in  the  category  grammar_me  of  the  Tcllib  Trackers
       [http://core.tcl.tk/tcllib/reportlist].  Please also report any ideas for enhancements you
       may have for either package and/or documentation.

       When proposing code changes, please provide unified diffs, i.e the output of diff -u.

       Note further that attachments are strongly preferred over inlined patches. Attachments can
       be made by going to the Edit form of the ticket immediately after its creation,  and  then
       using the left-most button in the secondary navigation bar.

KEYWORDS

       grammar, parsing, virtual machine

CATEGORY

       Grammars and finite automata

COPYRIGHT

       Copyright (c) 2005 Andreas Kupries <andreas_kupries@users.sourceforge.net>