Provided by: bisonc++_1.6.1-1_i386
 

NAME

        bisonc++ - Generate a C++ parser class and parsing function
 

SYNOPSIS

        bisonc++ [OPTIONS] grammar-file
 

DESCRIPTION

        The  program bisonc++ is based on previous work on bison by Alain Coet‐
        meur (coetmeur@icdc.fr), who created in the  early  ’90s  a  C++  class
        encapsulating  the  yyparse()  function  as  generated by the GNU-bison
        parser generator.
 
        Initial versions of bisonc++ (up to version 0.92) wrapped Alain’s  pro‐
        gram  in  a program offering a more modern user-interface, removing all
        old-style (C) %define directives  from  bison++’s  input  specification
        file  (see  below for an in-depth discussion of the differences between
        bison++ and bisonc++). Starting with version 0.98, bisonc++ is compiled
        from a complete rebuild of the parser generator, closely following the
        description of Aho, Sethi and Ullman’s Dragon Book. Moreover,  starting
        with  version  0.98 bisonc++ is now a C++ program, rather than a C pro‐
        gram generating C++ code.
 
        Bisonc++ expands  the  concepts  initially  implemented  in  bison  and
        bison++,  offering  a  cleaner setup of the generated parser class. The
        parser class is  derived  from  a  base-class,  mainly  containing  the
        parser’s  token-  and  type-definitions as well as several member func‐
        tions which should not be (re)defined by the programmer.
 
        Most of these base-class members might also be defined directly in  the
        parser  class, but were defined in the parser’s base-class. This design
        results in a very lean parser class, declaring only  members  that  are
        actually  defined by the programmer or that must be defined by bisonc++
        itself (e.g., the member function parse()  as  well  as  those  support
        functions requiring access to facilities that are only available in the
        parser class itself, rather than in the parser’s base class).
 
        Moreover, this design does not require the use of virtual members:  the
        members which are not involved in the actual parsing process may always
        be (re)implemented directly by the programmer. Thus there is no need to
        apply or define virtual member functions.
 
        In  fact,  there are only two public members in the parser class gener‐
        ated by bisonc++: setDebug() (see below) and parse(). Remaining members
        are  private,  and  those that can be redefined by the programmer using
        bisonc++ usually receive initial, very simple default in-line implemen‐
        tations.  The  (partial)  exception to this rule is the member function
        lex(), producing the next lexical token. For lex() either a standardized
        interface or a mere declaration is offered (requiring the programmer
        to provide a tailor-made implementation for lex()).
 
        To enforce a primitive namespace, bison used a  well-known  naming-con‐
        vention:  all  its  public symbols started with yy or YY.  Bison++ fol‐
        lowed bison in this respect, even  though  a  class  by  itself  offers
        enough protection of its identifiers. The present author feels that
        these yy and YY conventions are outdated, and consequently bisonc++
        does not generate any symbols, defined in either the parser (base)
        class or in the parser function, starting with yy or YY. Instead,
        all  data  members  have names, following a suggestion by Lakos (2001),
        starting with d_, and all static data members have names starting  with
        s_.   This   convention   was  not  introduced  to  enforce  identifier
        protection, but to clarify the storage type of variables. Other (local)
        symbols  lack specific prefixes. Furthermore, bisonc++ allows its users
        to define the parser class in  a  particular  namespace  of  their  own
        choice.
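 
        The naming convention described above can be illustrated with a small
        hand-written class. This is only a sketch: the namespace Calc, the
        class Counter and all of its members are hypothetical names chosen
        for illustration; none of them is produced by bisonc++.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical namespace, standing in for one selected by the user.
namespace Calc
{
    class Counter
    {
        std::size_t d_count;            // d_: ordinary (per-object) data member
        static std::size_t s_objects;   // s_: static data member, shared by all objects

        public:
            Counter()
            :
                d_count(0)
            {
                ++s_objects;
            }

            void increment()
            {
                std::size_t next = d_count + 1;     // local symbol: no prefix
                assert(next > d_count);             // guards against wrap-around
                d_count = next;
            }

            std::size_t count() const
            {
                return d_count;
            }

            static std::size_t objects()
            {
                return s_objects;
            }
    };

    std::size_t Counter::s_objects = 0;
}
```

        The prefixes only document the storage type of a variable; the
        protection of the identifiers comes from the class and the namespace.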
 
        Bisonc++ should be used as follows:
 
        o      As  usual,  a grammar must be defined. Using bisonc++ this is no
               different, and the reader is referred to  bison’s  documentation
               for details about specifying and decorating grammars.
 
        o      The  number  and function of the various %define declarations as
               used by bison++, however, is  greatly  modified.  Actually,  all
               %define  declarations are replaced by their (former) first argu‐
               ments. Furthermore, ‘macro-style’  declarations  are  no  longer
               supported  or  required.  Finally, all directives use lower-case
               characters only and do not contain  underscore  characters  (but
               sometimes  hyphens).  E.g.,  %define  DEBUG  is  now declared as
               %debug; %define LSP_NEEDED is now declared as %lsp-needed  (note
               the hyphen).
 
        o      As  noted,  no  ‘macro  style’ %define declarations are required
               anymore. Instead, the normal practice of defining class  members
               in source files and declaring them in class header files can
               be adhered to using bisonc++.  Basically, bisonc++  concentrates
               on its main tasks: the definition of an initial parser class and
               the implementation of its parsing function int parse(),  leaving
               all  other parts of the parser class’ definition to the program‐
               mer.
 
        o      Having specified the grammar and (usually) some directives,
               bisonc++ is able to generate files defining the parser class and
               the implementation of the member function parse() and  its  sup‐
               port functions. See the next section for details about the vari‐
               ous files that may be written by bisonc++.
 
        o      All members (except for the member parse() and its support
               functions) must be implemented by the programmer. Of course,
               additional member functions  should  also  be  declared  in  the
               parser  class’  header.   At the very least the member int lex()
               must be implemented (although a standardized implementation  can
               also  be  generated  by bisonc++). The member lex() is called by
               parse() (support functions) to obtain the next available  token.
               The  member function void error(char const *msg) may also be re-
               implemented by the programmer, but a basic  in-line  implementa‐
               tion  is  provided  by  default.  The member function error() is
               called when parse() detects (syntactical) errors.
 
        o      The parser can now be used in a program. A very  simple  example
               would be:
 
                   int main()
                   {
                       Parser parser;
                       return parser.parse();
                   }
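 
        The division of labour between parse(), lex() and error() may be
        sketched as follows. The class below is a hand-written stand-in and
        is not generated by bisonc++: the class name MiniParser, its token
        vector, the member errors() and the tiny hand-coded parse() are all
        assumptions made for this illustration. The sketch also assumes the
        bison convention that lex() returns 0 at the end of the input and
        that parse() returns 0 on success and 1 on failure.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <utility>
#include <vector>

// Hand-written stand-in for a parser class; NOT generated by bisonc++.
class MiniParser
{
    std::vector<int> d_tokens;      // hypothetical token source
    std::size_t d_next;             // index of the next token to return
    std::size_t d_errors;           // number of calls to error()

    public:
        enum Tokens { NUMBER = 257 };

        explicit MiniParser(std::vector<int> tokens)
        :
            d_tokens(std::move(tokens)),
            d_next(0),
            d_errors(0)
        {}

        // Accepts NUMBER ('+' NUMBER)*. Returns 0 on success, 1 on error.
        int parse()
        {
            int token = lex();
            while (token == NUMBER)
            {
                token = lex();
                if (token == 0)         // end of input after a NUMBER
                    return 0;
                if (token != '+')
                    break;
                token = lex();
            }
            error("syntax error");
            return 1;
        }

        std::size_t errors() const
        {
            return d_errors;
        }

    private:
        // Supplied by the programmer: delivers the next token, 0 at the end.
        int lex()
        {
            assert(d_next <= d_tokens.size());
            return d_next < d_tokens.size() ? d_tokens[d_next++] : 0;
        }

        // May be reimplemented by the programmer: called on syntax errors.
        void error(char const *msg)
        {
            ++d_errors;
            std::cerr << msg << '\n';
        }
};
```

        Constructing a MiniParser from the tokens NUMBER, '+', NUMBER and
        calling its parse() member succeeds; dropping the final NUMBER makes
        parse() call error() once and report failure.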
 
        Bisonc++ may create the following files:
 
        o      A  file  containing  the  implementation  of the member function
               parse() and its support functions. The member parse() is a  pub‐
               lic  member  that  can  be  called  to  parse  a  token-sequence
               according to a specified LALR(1) grammar. The implementations
               of these members are by default written to the file parse.cc.
               There should be no need for the programmer to alter the contents
               of  this  file,  as  its contents change whenever the grammar is
               modified. Hence it is rewritten by  default.  The  option  --no-
               parse-member  may  be  specified to prevent this file from being
               (re)written.  In  normal  circumstances,  however,  this  option
               should be avoided.
 
        o      A file containing an initial setup of the parser class, contain‐
               ing the declaration of the public  member  parse()  and  of  its
               (private)  support  members.  The  members  error()  and print()
               receive default in-line implementations which may be altered  by
               the  programmer. The member lex() may receive a standard in-line
               implementation (see below), or it will merely  be  declared,  in
               which  case the programmer must provide an implementation.  Fur‐
               thermore, new members may be added to the parser class as  well.
               By  default  this file will only be created if not yet existing,
               using the filename <parser-class>.h (where <parser-class> is the
               name of the defined parser class). The option
               --force-class-header may be used to (re)write this file, even if
               already existing.
 
        o      A  file containing the parser class’ base class. This base class
               should not be modified by  the  programmer.  It  contains  types
               defined by bisonc++, as well as several (protected) data members
               and member functions, which should not be redefined by the  pro‐
               grammer. All symbolic parser terminal tokens are defined in this
               class, so it escalates these definitions  in  a  separate  class
               (cf. Lakos (2001)), thus preventing circular dependencies
               between the lexical scanner and the parser  (circular  dependen‐
               cies  occur  in  situations where the parser needs access to the
               lexical scanner class to define a lexical scanner as one of  its
               data members, whereas the lexical scanner, in turn, needs access
               to the parser class to know about the grammar’s symbolic  termi‐
               nal  tokens.  Escalation is a way out of such circular dependen‐
               cies). By  default  this  file  will  be  (re)written  any  time
               bisonc++ is called, using the filename <parser-class>base.h. The
               option --no-baseclass-header may be  specified  to  prevent  the
               base class header file from being (re)written. In normal circum‐
               stances, however, this option should be avoided.
 
        o      A file containing an implementation  header.  An  implementation
               header  may be included by source files implementing the various
               member functions of a class.  The  implementation  header  first
               includes  its  associated  class  header  file,  followed by any
               directives (formerly defined in the %{header ... %}  section  of
               the bison++ parser specification file) that are required for the
               proper compilation of these member functions. The implementation
               header  is included by the file defining parse(). By default the
               implementation header will  be  created  if  not  yet  existing,
               receiving  the  filename <parser-class>.ih.  The option --force-
               implementation-header may be used to (re)write this  file,  even
               if already existing.
 
        o      A verbose description of the generated parser. This file is
               comparable to the verbose output file originally generated by
               bison++. It is generated when the option --verbose or -V is pro‐
               vided. When generated, it will use the  filename  <grammar>.out‐
               put,  where  <grammar>  is  the  name of the file containing the
               grammar definition.
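 
        The escalation of the token definitions into the base class may be
        sketched as follows. All three classes below are hand-written
        stand-ins for illustration: the token names NUMBER and IDENT, the
        string-based Scanner and the member numbers() (a placeholder for
        parse()) are assumptions and do not reflect the code bisonc++
        actually writes. The point of the sketch is the dependency order:
        Scanner needs ParserBase only, while Parser needs both.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>

// Stand-in for the generated base class: only the token definitions live here.
class ParserBase
{
    public:
        enum Tokens { NUMBER = 257, IDENT };
};

// The scanner needs ParserBase only; it never sees the Parser class.
class Scanner
{
    std::string d_input;
    std::size_t d_pos;

    public:
        explicit Scanner(std::string input)
        :
            d_input(std::move(input)),
            d_pos(0)
        {}

        // Returns NUMBER, IDENT, a plain character, or 0 at end of input.
        int yylex()
        {
            assert(d_pos <= d_input.size());
            while (d_pos < d_input.size() && d_input[d_pos] == ' ')
                ++d_pos;
            if (d_pos == d_input.size())
                return 0;

            unsigned char first = byte(d_pos);
            if (std::isdigit(first))
            {
                while (d_pos < d_input.size() && std::isdigit(byte(d_pos)))
                    ++d_pos;
                return ParserBase::NUMBER;
            }
            if (std::isalpha(first))
            {
                while (d_pos < d_input.size() && std::isalpha(byte(d_pos)))
                    ++d_pos;
                return ParserBase::IDENT;
            }
            ++d_pos;
            return first;
        }

    private:
        unsigned char byte(std::size_t idx) const
        {
            return static_cast<unsigned char>(d_input[idx]);
        }
};

// The parser inherits the tokens and has a scanner as a data member.
class Parser: public ParserBase
{
    Scanner d_scanner;

    public:
        explicit Parser(std::string input)
        :
            d_scanner(std::move(input))
        {}

        // Placeholder for parse(): counts the NUMBER tokens in the input.
        int numbers()
        {
            int count = 0;
            for (int token = lex(); token != 0; token = lex())
            {
                if (token == NUMBER)
                    ++count;
            }
            return count;
        }

    private:
        int lex()
        {
            return d_scanner.yylex();
        }
};
```

        Since neither ParserBase nor Scanner mentions Parser, the three
        classes can be declared in this order without forward declarations
        or mutual includes.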
 

OPTIONS

        If available, single letter options are listed between parentheses fol‐
        lowing  their  associated  long-option  variants. Single letter options
        require arguments if their associated long options require arguments as
        well.
 
        o      --baseclass-preinclude=header (-H)
               Use  header  as  the  pathname  to  the  file preincluded in the
               parser’s base-class header. This option is useful in  situations
               where the base class header file refers to types which might not
               yet be known. E.g., with %union a std::string * field  might  be
               used.  Since the class std::string might not yet be known to the
               compiler once it processes the base class header file we need  a
               way  to  inform  the compiler about these classes and types. The
               suggested procedure is to use a pre-include header file  declar‐
               ing  the required types. By default header will be surrounded by
               double quotes (using, e.g., #include "header").  When the  argu‐
               ment is surrounded by pointed brackets #include <header> will be
               included. In the latter case, quotes might be required to escape
               interpretation by the shell (e.g., using -H '<header>').
 
        o      --baseclass-header=header (-b)
               Use  header  as the pathname of the file containing the parser’s
               base class. This class  defines,  e.g.,  the  parser’s  symbolic
               tokens. Defaults to the name of the parser class plus the suffix
               base.h. It is generated, unless otherwise indicated  (see  --no-
               baseclass-header and --dont-rewrite-baseclass-header below).
 
        o      --baseclass-skeleton=skeleton (-B)
               Use skeleton as the pathname of the file containing the skeleton
               of  the  parser’s  base  class.   Its   filename   defaults   to
               bisonc++base.h.
 
        o      --class-header=header (-c)
               Use  header  as  the  pathname of the file containing the parser
               class. Defaults to the name of the parser class plus the  suffix
               .h.
 
        o      --class-skeleton=skeleton (-C)
               Use skeleton as the pathname of the file containing the skeleton
               of the parser class. Its filename defaults  to  bisonc++.h.  The
               environment variable BISON_SIMPLE_H is not inspected anymore.
 
        o      --construction
               This  option  may  be  specified to write details about the con‐
               struction of the parsing tables to the standard  output  stream.
               This  information  is  primarily useful for developers, and aug‐
               ments the information written  to  the  verbose  grammar  output
               file, produced by the --verbose option.
 
        o      --debug
               Provide  parse()  and its support functions with debugging code,
               showing the  actual  parsing  process  on  the  standard  output
               stream.  When  included,  the  debugging  output  is  active  by
               default, but its activity may be controlled using the
               setDebug(bool on-off) member. Note that no #ifdef DEBUG macros are
               used anymore. By rerunning bisonc++ without the  --debug  option
               an  equivalent  parser is generated not containing the debugging
               code.
 
        o      --filenames=filename (-f)
               Specify a filename to use for all files  produced  by  bisonc++.
               Specific options overriding particular filenames are also
               available (which then, in turn, override the name specified by this
               option).
 
        o      --force-class-header
               By default the generated class header is not overwritten once it
               has been created. This option can be used to force the (re)writ‐
               ing of the file containing the parser’s class.
 
        o      --force-implementation-header
               By  default the generated implementation header is not overwrit‐
               ten once it has been created. This option can be used  to  force
               the (re)writing of the implementation header file.
 
        o      --help (-h)
               Write  basic usage information to the standard output stream and
               terminate.
 
        o      --implementation-header=header (-i)
               Use header as the pathname of the file containing the  implemen‐
               tation  header.  Defaults  to  the  name of the generated parser
               class plus the suffix .ih. The implementation header should con‐
               tain  all directives and declarations only used by the implemen‐
               tations of the parser’s member functions. It is the only  header
               file  that  is  included by the source file containing parse()’s
               implementation. User-defined implementations of other class
               members may use the same convention, thus concentrating all
               directives and declarations that are required for the compilation of
               other  source  files belonging to the parser class in one header
               file.
 
        o      --implementation-skeleton=skeleton (-I)
               Use skeleton as the pathname of the file containing the skeleton
               of   the   implementation   header.  Its  filename  defaults  to
               bisonc++.ih.
 
        o      --lines (-l)
               Put #line preprocessor directives in  the  file  containing  the
               parser’s parse() function. By including this option the compiler
               and debuggers will associate errors with lines in  your  grammar
               specification  file, rather than with the source file containing
               the parse() function itself.
 
        o      --no-lines
               Do not put #line preprocessor directives in the file  containing
               the  parser’s  parse() function. This option is primarily useful
               in combination with  the  %lines  directive,  to  suppress  that
               directive. It also overrides option --lines, though.
 
        o      --namespace=namespace (-n)
               Define the parser base class, the parser class and the parser
               implementations in the namespace namespace. By default no
               namespace is defined. If this option is used the implementation
               header will contain a commented out using namespace  declaration
               for the requested namespace.
 
        o      --no-baseclass-header
               Do  not  write the file containing the parser class’ base class,
               even if that file doesn’t yet exist. By default  the  file  con‐
               taining  the  parser’s  base  class  is  (re)written  each  time
               bisonc++ is called. Note that this  option  should  normally  be
               avoided,  as the base class defines the symbolic terminal tokens
               that are returned by the lexical  scanner.  By  suppressing  the
               construction  of  this  file  any modification in these terminal
               tokens will not be communicated to the lexical scanner.
 
        o      --no-parse-member
               Do not write the file containing the parser’s predefined  parser
               member  functions,  even  if  that  file  doesn’t  yet exist. By
               default the file containing the parser’s parse() member function
               is  (re)written  each  time  bisonc++  is called. Note that this
               option should normally be avoided, as this file contains parsing
               tables which are altered whenever the grammar definition is mod‐
               ified.
 
        o      --parsefun-source=source (-p)
               Define source as the name of  the  source  file  containing  the
               parser member function parse(). Defaults to parse.cc.
 
        o      --parsefun-skeleton=skeleton (-P)
               Use  skeleton as the pathname of the file containing the parsing
               member   function’s   skeleton.   Its   filename   defaults   to
               bisonc++.cc.   The  environment  variable  BISON_SIMPLE  is  not
               inspected anymore.
 
        o      --scanner=header (-s)
               Use header as the pathname to the file defining a class Scanner,
               offering  a member int yylex() producing the next token from the
               input stream to be analyzed by the parser generated by bisonc++.
               When  this  option is used the parser’s member int lex() will be
               predefined as
 
                   int lex()
                   {
                       return d_scanner.yylex();
                   }
 
               and an object  Scanner  d_scanner  will  be  composed  into  the
               parser.  The  d_scanner  object  will  be  constructed using its
               default constructor. If another  constructor  is  required,  the
               parser  class  may  be provided with an appropriate (overloaded)
               parser constructor after having constructed the  default  parser
               class  header  file  using  bisonc++.  By default header will be
               surrounded by double quotes (using,  e.g.,  #include  "header").
               When  the  argument  is  surrounded by pointed brackets #include
               <header> will be included. In the latter case, quotes  might  be
               required  to  escape interpretation by the shell (e.g., using -s
               '<header>').
 
        o      --show-filenames
               Write the names of the files that are generated to the  standard
               error stream.
 
        o      --usage
               Write  basic usage information to the standard output stream and
               terminate.
 
        o      --verbose (-V)
               Write a file  containing  verbose  descriptions  of  the  parser
               states  and  what  is  done for each type of look-ahead token in
               that state.  This file also describes all conflicts detected  in
               the  grammar,  both  those  resolved  by operator precedence and
               those that remain unresolved.  By default it will  not  be  cre‐
               ated, but if requested it will receive the filename <parse>.out‐
               put, where <parse> is the filename (without the  .cc  extension)
               of the file containing parse()’s implementation.
 
        o      --version (-v)
               Display bisonc++’s version number and terminate.
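 
        The remark made with the --scanner option about adding an overloaded
        parser constructor may be sketched as follows. Both classes are
        hand-written stand-ins: the character-based Scanner, the member
        Parser::countTokens() (a placeholder for parse()) and the free
        function countTokens() are assumptions made for this illustration,
        and the header written by bisonc++ will look different. Only the
        composed Scanner d_scanner member and the body of lex() follow the
        description given above.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical scanner: each non-blank character is a token, 0 means EOF.
class Scanner
{
    std::istream *d_in;

    public:
        Scanner()                           // default: read standard input
        :
            d_in(&std::cin)
        {}

        explicit Scanner(std::istream &in)  // alternative: read any stream
        :
            d_in(&in)
        {}

        int yylex()
        {
            assert(d_in != 0);
            char ch;
            return (*d_in >> ch) ? static_cast<unsigned char>(ch) : 0;
        }
};

class Parser
{
    Scanner d_scanner;                      // composed into the parser

    public:
        Parser()                            // d_scanner is default-constructed
        {}

        explicit Parser(std::istream &in)   // overloaded constructor, added by hand
        :
            d_scanner(in)
        {}

        // Placeholder for parse(): merely counts the tokens lex() delivers.
        int countTokens()
        {
            int count = 0;
            while (lex() != 0)
                ++count;
            return count;
        }

    private:
        int lex()
        {
            return d_scanner.yylex();
        }
};

// Hypothetical helper: runs the parser on the characters of a string.
int countTokens(std::string const &text)
{
    std::istringstream in(text);
    Parser parser(in);
    return parser.countTokens();
}
```

        With this sketch countTokens("a + b") visits three tokens. The
        default constructor remains available for parsers reading the
        standard input stream.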
 

DIRECTIVES

        The  following  directives  can  be  used in the initial section of the
        grammar specification file. When command-line  options  for  directives
        exist,  they overrule the corresponding directives given in the grammar
        specification file.
 
        o      %baseclass-header header
               Defines the pathname of the file containing  the  parser’s  base
               class. This directive is overridden by the --baseclass-header or
               -b command-line options.
 
        o      %baseclass-preinclude header
               Use header as the pathname  to  the  file  pre-included  in  the
               parser’s base-class header. See the description of the
               --baseclass-preinclude option for details. As with that option’s
               argument, header will (by default)
               be surrounded by double quotes.  However, when the  argument  is
               surrounded   by  pointed  brackets  #include  <header>  will  be
               included.
 
        o      %class-header header
               Defines the pathname of the file containing  the  parser  class.
               This  directive  is  overridden by the --class-header or -c com‐
               mand-line options.
 
        o      %class-name parser-class-name
               Declares the name of this parser. This  directive  replaces  the
               %name  declaration  previously  used  by bison++. It defines the
               name of the C++  class  that  will  be  generated.  Contrary  to
               bison++’s  %name declaration, %class-name may appear anywhere in
               the first section of the grammar specification file. However, it
               may  be  defined  only  once. If no %class-name is specified the
               default class name Parser will be used.
 
        o      %debug
               Provide parse() and its support functions with  debugging  code,
               showing  the  actual  parsing  process  on  the  standard output
               stream.  When  included,  the  debugging  output  is  active  by
               default, but its activity may be controlled using the
               setDebug(bool on-off) member. Note that no #ifdef DEBUG macros are
               used  anymore.  By rerunning bisonc++ without the --debug option
               an equivalent parser is generated not containing  the  debugging
               code.
 
        o      %error-verbose
               (to  do)  if defined the parser stack is dumped when an error is
               detected by the parse() member function.
 
        o      %expect number
               If defined the parser will not report encountered shift/reduce
               and reduce/reduce conflicts if the number of detected conflicts
               equals the number following %expect. Conflicts are mentioned in the
               .output file and the number of encountered conflicts is shown on
               the standard output if the actual number of  conflicts  deviates
               from number.
 
        o      %filenames header
               Defines the generic name of all generated files, unless overrid‐
               den by specific names.  This  directive  is  overridden  by  the
               --filenames or -f command-line options.
 
        o      %implementation-header header
               Defines  the  pathname of the file containing the implementation
               header.  This directive is overridden by  the  --implementation-
               header or -i command-line options.
 
        o      %lines
               Put  #line  preprocessor  directives  in the file containing the
               parser’s parse() function. It acts identically to the -l command
               line option, and is suppressed by the --no-lines option.
 
        o      %lsp-needed
               Defining this causes bisonc++ to include code into the generated
               parser using the standard location  stack.   The  token-location
               type  defaults  to the following struct, defined in the parser’s
               base class when this directive is specified:
 
                   struct LTYPE
                   {
                       int timestamp;
                       int first_line;
                       int first_column;
                       int last_line;
                       int last_column;
                       char *text;
                   };
 
        o      %ltype typename
               Specifies a user-defined token  location  type.   If  %ltype  is
               used,  typename  should be the name of an alternate (predefined)
               type (e.g., size_t/*unsigned*/). It should  not  be  used  if  a
               %locationstruct specification is defined (see below). Within the
               parser class, this type will be available as the  type  ‘LTYPE’.
               All  text  on the line following %ltype is used for the typename
               specification. It should therefore not contain  comment  or  any
               other  characters  that  are not part of the actual type defini‐
               tion.
 
        o      %namespace namespace
               Define the parser class in the namespace namespace.  By  default
               no namespace is defined. If this option is used the
               implementation header will contain a commented out using
               namespace declaration for the requested namespace. This directive
               is overridden by the --namespace command-line option.
 
        o      %negative-dollar-indices
               Do not generate warnings when zero- or  negative  dollar-indices
               are  used  in the grammar’s action blocks. Zero or negative dol‐
               lar-indices are commonly used to implement inherited attributes,
               and should normally be avoided. When used, they can be specified
               like $-1 or $<type>-1, where type is a %union field-name.
 
        o      %parsefun-source source
               Defines the pathname of the file containing  the  parser  member
               parse(). This directive is overridden by the --parsefun-source or
               -p command-line options.
 
        o      %scanner header
               Use header as the pathname  to  the  file  pre-included  in  the
               parser’s  class  header.  See  the  description of the --scanner
               option for details about this option. As with that option’s
               argument, header will (by default) be surrounded by double
               quotes. However, when the argument is surrounded by pointed
               brackets, #include <header> will be included.
               Note that using this directive implies the definition of a  com‐
               posed  Scanner  d_scanner data member into the generated parser,
               as well as a predefined  int  lex()  member,  returning  d_scan‐
               ner.yylex().  If this is inappropriate, a user defined implemen‐
               tation of int lex() must be provided.
 
        o      %stype typename
               The type of the semantic value  of  tokens.   The  specification
               typename  should  be  the  name  of  an unstructured type (e.g.,
               size_t/*unsigned*/). By default it is int. See YYSTYPE in bison.
               It  should  not  be  used  if a %union specification is defined.
               Within the parser class, this type will be available as the type
               ‘STYPE’.  All  text on the line following %stype is used for the
               typename specification. It should therefore not contain  comment
               or  any  other  characters  that are not part of the actual type
               definition.
 
        o      %locationstruct struct-definition
               Defines the organization of the location-struct data type LTYPE.
               This  struct  should  be  specified  analogously  to the way the
               parser’s stacktype is defined  using  %union  (see  below).  The
               location struct is named LTYPE. If neither %locationstruct nor
               %ltype is specified, the aforementioned default struct is used.
 
        o      %left terminal ...
               Defines the names of symbolic terminal  tokens  that  should  be
               treated  as  left-associative.  I.e.,  in case of a shift/reduce
               conflict, a reduction will be preferred over a shift.  Sequences
               of %left, %nonassoc, %right and %token directives may be used to
               define the precedence of operators. In  expressions,  the  first
               used  directive  will  have the lowest precedence, the last used
               the highest.
 
        o      %nonassoc terminal ...
               Defines the names of symbolic terminal  tokens  that  should  be
               treated as non-associative. I.e., in case of a shift/reduce con‐
               flict, a reduction will be preferred over a shift.  Sequences of
               %left,  %nonassoc,  %right  and %token directives may be used to
               define the precedence of operators. In  expressions,  the  first
               used  directive  will  have the lowest precedence, the last used
               the highest.
 
        o      %prec token
               Overrules the defined precedence of an operator for a  particu‐
               lar grammatical rule. Well known is the construction
 
                   expression:
                       '-' expression %prec UMINUS
                       {
                           ...
                       }
 
               Here,  the  default  priority and precedence of the ‘-’ token as
               the subtraction operator is overruled by the precedence and pri‐
               ority of the UMINUS token, which is commonly defined as
 
                   %right UMINUS
 
               (see below) following, e.g., the ’*’ and ’/’ operators.
 
        o      %right terminal ...
               Defines  the  names  of  symbolic terminal tokens that should be
               treated as right-associative. I.e., in case  of  a  shift/reduce
               conflict, a shift will be preferred over a reduction.  Sequences
               of %left, %nonassoc, %right and %token directives may be used to
               define  the  precedence  of operators. In expressions, the first
               used directive will have the lowest precedence,  the  last  used
               the highest.
 
        o      %start non-terminal
               The  non-terminal  non-terminal  should be used as the grammar’s
               start-symbol. If omitted, the first  grammatical  rule  will  be
               used  as  the grammar’s starting rule. All syntactically correct
               sentences must be derivable from this starting rule.
 
        o      %token terminal ...
               Defines the names of symbolic  terminal  tokens.   Sequences  of
               %left,  %nonassoc,  %right  and %token directives may be used to
               define the precedence of operators. In  expressions,  the  first
               used  directive  will  have the lowest precedence, the last used
               the highest.
 
        o      %type <type> non-terminal ...
               In combination with %union: associate the semantical value of  a
               non-terminal  symbol  with  a  union field defined by the %union
               directive.
 
        o      %union union-definition
               Acts identically to the bison and bison++ declaration: as  with
               bison, a union is generated for the semantic type. The union is
               named STYPE. If no %union is declared, a simple stack-type  may
               be defined using the %stype directive. If no %stype directive is
               used, the default stacktype (int) is used.
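 
        The effect of the associativity and precedence directives can be tried
        out in plain C++. The following is a minimal sketch, not the algorithm
        used by bisonc++: it uses hand-written precedence climbing rather than
        LALR(1) tables. The names OpInfo, opInfo and evaluate, as well as the
        right-associative '^' operator, are assumptions made for this
        illustration only. Operands are unsigned integers; blanks, parentheses
        and division by zero are not handled.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical precedence table mirroring this directive order:
//     %left  '+' '-'      (first used: lowest precedence)
//     %left  '*' '/'
//     %right '^'          (last used: highest precedence)
struct OpInfo
{
    int  prec;              // a higher value binds tighter
    bool rightAssoc;
};

inline bool opInfo(char op, OpInfo &info)
{
    switch (op)
    {
        case '+': case '-': info = {1, false}; return true;
        case '*': case '/': info = {2, false}; return true;
        case '^':           info = {3, true};  return true;
    }
    return false;
}

// Evaluates digits and binary operators, starting at text[pos].
inline long evaluate(std::string const &text, std::size_t &pos, int minPrec = 1)
{
    long lhs = 0;
    while (pos < text.size() &&
           std::isdigit(static_cast<unsigned char>(text[pos])))
        lhs = lhs * 10 + (text[pos++] - '0');

    OpInfo info;
    while (pos < text.size() && opInfo(text[pos], info) && info.prec >= minPrec)
    {
        char op = text[pos++];
        // left-associative:  the right operand only takes tighter operators
        // right-associative: the right operand may take the same operator
        long rhs = evaluate(text, pos,
                            info.rightAssoc ? info.prec : info.prec + 1);
        switch (op)
        {
            case '+': lhs += rhs; break;
            case '-': lhs -= rhs; break;
            case '*': lhs *= rhs; break;
            case '/': lhs /= rhs; break;
            case '^':
            {
                long result = 1;
                for (long idx = 0; idx < rhs; ++idx)
                    result *= lhs;
                lhs = result;
            }
            break;
        }
    }
    return lhs;
}

inline long evaluate(std::string const &text)
{
    std::size_t pos = 0;
    return evaluate(text, pos);
}
```

        With this assumed table, evaluate("8-3-2") groups as (8-3)-2, whereas
        evaluate("2^3^2") groups as 2^(3^2).
 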
        The following public members can be used by users of the parser classes
        generated by bisonc++ (‘Parser Class’:: prefixes are silently implied):
 
        o      LTYPE:
               The parser’s location type (user-definable). Available only when
               either %lsp-needed, %ltype or %locationstruct has been declared.
 
        o      STYPE:
               The parser’s stack-type (user-definable), defaults to int.
 
        o      Tokens:
               The enumeration type of all the symbolic tokens defined  in  the
               grammar  file  (i.e., bisonc++’s input file). The scanner should
               be prepared to return these symbolic tokens. Note that, since the
               symbolic tokens are defined in the parser’s class and not in the
               scanner’s class, the lexical scanner must  prefix  the  parser’s
               class  name  to the symbolic token names when they are returned.
               E.g., return Parser::IDENT should be  used  rather  than  return
               IDENT.
 
        o      int parse():
               The  parser’s parsing member function. It returns 0 when parsing
               has completed successfully, 1 if errors were  encountered  while
               parsing the input.
 
        o      void setDebug(bool mode):
               This member can be used to activate or deactivate the debug-code
               compiled into the parsing function. It is available but  has  no
               effect if no debug code has been compiled into the parsing func‐
               tion. When debugging code has been  compiled  into  the  parsing
               function,  it is active by default, but debug-code is suppressed
               by calling setDebug(false).
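 
        The need for the class-name prefix can be shown in a few lines. This is
        a minimal sketch under stated assumptions: the struct Parser below is
        only a stand-in for a generated parser class, the token names and
        values are invented, and classify is a hypothetical hand-written
        function playing the role of the lexical scanner.

```cpp
#include <cctype>
#include <string>

// Stand-in for a generated parser class; only the token enum is shown.
// Token names and values are assumptions made for this sketch.
struct Parser
{
    enum Tokens
    {
        IDENT = 257,
        NUMBER
    };
};

// Hypothetical classifier playing the role of the scanner. It is not a
// member of Parser, so symbolic tokens need the Parser:: prefix.
inline int classify(std::string const &word)
{
    if (word.empty())
        return 0;                       // zero: end of input

    unsigned char first = static_cast<unsigned char>(word[0]);

    if (std::isalpha(first))
        return Parser::IDENT;           // symbolic token
    if (std::isdigit(first))
        return Parser::NUMBER;          // symbolic token
    return first;                       // plain character token
}
```
 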
        The following enumerations and types can be used by members  of  parser
        classes  generated  by bisonc++. When prefixed by Base:: they are actu‐
        ally protected members inherited from the parser’s base class.
 
        o      Base::ErrorRecovery:
               This enumeration defines two values:
 
                   DEFAULT_RECOVERY_MODE,
                   UNEXPECTED_TOKEN
 
               DEFAULT_RECOVERY_MODE consists of terminating the  parsing  pro‐
               cess. UNEXPECTED_TOKEN activates the recovery procedure whenever
               an error is encountered.  The  recovery  procedure  consists  of
               looking  for the first state on the state-stack having an error-
               production, and then skipping subsequent tokens until  (in  that
               state)  a token is retrieved which may follow the error terminal
               token in that production rule. If this error recovery  procedure
               fails  (i.e.,  if no acceptable token is ever encountered) error
               recovery falls back to the default  recovery  mode,  terminating
               the parsing process.
 
        o      Base::Return:
               This enumeration defines two values:
 
                   PARSE_ACCEPT = 0,
                   PARSE_ABORT = 1
 
               (which are of course the parse() function’s return values).
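 
        The UNEXPECTED_TOKEN recovery procedure described above can be
        sketched as follows. This is a simplified model and not the generated
        code: the State struct, the recover function and the representation
        of the input as a std::deque are all assumptions, and `the first
        state on the state-stack' is read here as the state nearest to the
        top of the stack.

```cpp
#include <deque>
#include <set>
#include <vector>

// Hypothetical, simplified model of a grammar state.
struct State
{
    bool          hasErrorItem;     // the state has an error-production
    std::set<int> followsError;     // tokens that may follow `error' there
};

// Returns true if parsing can continue. Returning false means: fall back
// to the default recovery mode, i.e., terminate the parsing process.
inline bool recover(std::vector<State const *> &stateStack,
                    std::deque<int> &input)
{
    // 1. pop states until one having an error-production is on top
    while (!stateStack.empty() && !stateStack.back()->hasErrorItem)
        stateStack.pop_back();

    if (stateStack.empty())
        return false;

    // 2. skip tokens until one is acceptable after `error' in that state
    State const &state = *stateStack.back();
    while (!input.empty())
    {
        if (state.followsError.count(input.front()))
            return true;            // input.front() is the resuming token
        input.pop_front();          // discard the unusable token
    }
    return false;                   // no acceptable token was ever seen
}
```
 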
        The  following private members can be used by members of parser classes
        generated by bisonc++. When prefixed by Base:: they are  actually  pro‐
        tected members inherited from the parser’s base class.
 
        o      Base::ParserBase():
               The  default base-class constructor. Can be ignored in practical
               situations.
 
        o      void Base::ABORT() const throw(Return):
               This member can be called from any member function (called  from
               any  of  the parser’s action blocks) to indicate a failure while
               parsing thus terminating the  parsing  function  with  an  error
               value  1.  Note that this offers a marked extension and improve‐
               ment of the macro YYABORT defined by  bison++  in  that  YYABORT
               could not be called from outside of the parsing member function.
 
        o      void Base::ACCEPT() const throw(Return):
               This member can be called from any member function (called  from
               any  of the parser’s action blocks) to indicate successful pars‐
               ing and thus terminating the parsing function.  Note  that  this
               offers  a marked extension and improvement of the macro YYACCEPT
               defined by bison++ in that YYACCEPT could  not  be  called  from
               outside of the parsing member function.
 
        o      void Base::checkEOF():
               Used internally by the parsing function. Not to be called other‐
               wise.
 
        o      void Base::clearin():
               This member replaces  bison(++)’s  macro  yyclearin  and  causes
               bisonc++ to request another token from its lex() member, even if
               the current token has not yet been processed.  It  is  a  useful
               member  when  the  parser  should be reset to its initial state,
               e.g., between successive calls of parse(). In this situation the
               scanner  will  probably be reloaded with new information too (in
               the context of a flex-generated scanner by,  e.g.,  calling  the
               scanner’s yyrestart() member).
 
        o      bool Base::debug() const:
               This member returns the current value of the debug variable.
 
        o      void Base::ERROR() const throw(ErrorRecovery):
               This  member can be called from any member function (called from
               any of the parser’s action blocks) to  generate  an  error,  and
               thus  initiate  the parser’s error recovery code. Note that this
               offers a marked extension and improvement of the  macro  YYERROR
               defined by bison++ in that YYERROR could not be called from out‐
               side of the parsing member function.
 
        o      void error(char const *msg):
               This member may be redefined in the parser  class.  Its  default
               (inline)  implementation  is  to  write  a simple message to the
               standard error stream. It is called when a syntactical error  is
               encountered.
 
        o      void errorRecovery():
               Used  internally  by  the  parsing  function.  Not  to be called
               otherwise.
 
        o      void executeAction():
               Used internally by the parsing function. Not to be called other‐
               wise.
 
        o      int lex():
               This  member  may be pre-implemented using the scanner option or
               directive (see above) or it must be implemented by the  program‐
               mer. It interfaces to the lexical scanner, and should return the
               next token produced by the lexical scanner, either  as  a  plain
               character  or  as  one  of  the  symbolic  tokens defined in the
               Parser::Tokens enumeration. Zero or negative  token  values  are
               interpreted as ‘end of input’.
 
        o      int lookup():
               Used internally by the parsing function. Not to be called other‐
               wise. See also below, section BUGS.
 
        o      void nextToken():
               Used internally by the parsing function. Not to be called other‐
               wise. See also below, section BUGS.
 
        o      void Base::pop():
               Used internally by the parsing function. Not to be called other‐
               wise.
 
        o      void print():
               This member can be redefined in the parser class to print infor‐
               mation  about  the  parser’s  state.  It is called by the parser
               immediately after retrieving a token from lex(). As it is a mem‐
               ber  function it has access to all the parser’s members, in par‐
               ticular d_token, the current token value and d_loc, the  current
               token location information (if %lsp-needed, %ltype or
               %locationstruct has been specified).
 
        o      void Base::push():
               Used internally by the parsing function. Not to be called other‐
               wise.
 
        o      void Base::reduce():
               Used internally by the parsing function. Not to be called other‐
               wise.
 
        o      void Base::top():
               Used internally by the parsing function. Not to be called other‐
               wise.
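 
        The prototypes of ABORT() and ACCEPT() suggest that they end parsing by
        throwing a Return value, which is why they can be called from any
        member function and not only from parse() itself. The following is a
        minimal sketch of that mechanism, assuming that parse() catches the
        thrown value. MiniParser, its token vector and its action() member are
        hypothetical. The throw(Return) specifications are omitted because
        dynamic exception specifications were removed from the language in
        C++17.

```cpp
#include <vector>

// Simplified stand-in for a generated parser. All names except ACCEPT,
// ABORT, parse, PARSE_ACCEPT and PARSE_ABORT are hypothetical.
class MiniParser
{
    public:
        enum Return
        {
            PARSE_ACCEPT = 0,
            PARSE_ABORT  = 1
        };

        explicit MiniParser(std::vector<int> const &tokens)
        :
            d_tokens(tokens)
        {}

        int parse();

    private:
        std::vector<int> d_tokens;

        void ACCEPT() const { throw PARSE_ACCEPT; }
        void ABORT() const  { throw PARSE_ABORT; }

        void action(int token) const;   // plays the role of an action block
};

// Called from the parsing loop, but not part of parse() itself.
inline void MiniParser::action(int token) const
{
    if (token == 'q')
        ACCEPT();                       // stop parsing: success
    if (token < 0)
        ABORT();                        // stop parsing: failure
}

inline int MiniParser::parse()
try
{
    for (int token: d_tokens)
        action(token);
    return PARSE_ACCEPT;                // input exhausted
}
catch (Return value)
{
    return value;                       // thrown by ACCEPT() or ABORT()
}
```
 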
        The  following private members can be used by members of parser classes
        generated by bisonc++. All data members are actually protected  members
        inherited from the parser’s base class.
 
        o      bool d_debug:
               When the debug option has been specified, this variable (true by
               default) determines whether debug information is  actually  dis‐
               played.
 
        o      LTYPE d_loc:
               The location type value associated with a terminal token. It can
               be used by, e.g., lexical scanners to pass location  information
               of  a  matched  token  to the parser in parallel with a returned
               token. It is available only when %lsp-needed, %ltype or
               %locationstruct has been defined.
               Lexical  scanners  may be offered the facility to assign a value
               to this variable in parallel with a returned token. In order  to
               allow  a  scanner  access  to  d_loc,  d_loc’s address should be
               passed to the scanner. This can be  realized,  for  example,  by
               defining  a  member void setLoc(LTYPE *) in the lexical scanner,
               which is then called from the parser’s constructor as follows:
 
                           d_scanner.setLoc(&d_loc);
 
               Subsequently, the lexical scanner may  assign  a  value  to  the
               parser’s  d_loc  variable  through  the  pointer to d_loc stored
               inside the lexical scanner.
 
        o      LTYPE *d_lsp:
               The location stack pointer. Used internally by the  parser.  Not
               to be used otherwise.
 
        o      STYPE d_val:
               The  semantic  value of a returned token or non-terminal symbol.
               With non-terminal tokens it is  assigned  a  value  through  the
               action  rule’s  symbol  $$.  Lexical scanners may be offered the
               facility to assign a semantic value to this variable in parallel
               with  a  returned  token.  In order to allow a scanner access to
               d_val, d_val’s address should be passed to the scanner. This can
               be   realized,   for   example,   by   defining  a  member  void
               setSval(STYPE *) in the lexical scanner, which  is  then  called
               from the parser’s constructor as follows:
 
                           d_scanner.setSval(&d_val);
 
               Subsequently,  the  lexical  scanner  may  assign a value to the
               parser’s d_val variable through  the  pointer  to  d_val  stored
               inside the lexical scanner.
 
        o      STYPE *d_vsp:
               The semantic value stack pointer. Used internally by the parser.
               Not to be used otherwise.
 
        o      size_t/*unsigned*/ d_nErrors:
               The number of errors counted by parse(). It  is  initialized  by
               the  parser’s  base  class  initializer,  and  is  updated while
               parse() executes. When parse()  has  returned  it  contains  the
               total number of errors counted by parse().
 
        o      int d_state:
               The  current parsing state. Used internally by the parsing func‐
               tion. Not to be used otherwise.
 
        o      int d_token:
               The current token used internally by the parser. The parser  may
               modify  the  token value retrieved via lex(), so d_token may not
               be the value of the last token actually retrieved by lex().
 
        o      static PI s_productionInfo:
               Used internally by the parsing function. Not to be  used  other‐
               wise.
 
        o      static SR s_<nr>[]:
               Here,  <nr>  is  a  numerical value representing a state number.
               Used internally by the parsing function. Not to be  used  other‐
               wise.
 
        o      static SR *s_state[]:
               Used  internally  by the parsing function. Not to be used other‐
               wise.
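 
        The pointer-passing scheme described for d_val can be sketched as
        follows. This is a minimal illustration and not generated code: the
        Scanner and Parser classes below are hypothetical stand-ins, the
        lex(text) and lexAndGetValue members are invented for the example, and
        257 is merely an assumed value for a NUMBER token.

```cpp
#include <cstdlib>
#include <string>

typedef int STYPE;                      // the default semantic value type

// Hypothetical scanner: setSval() and lex() are illustrative members only.
class Scanner
{
    STYPE *d_sval = nullptr;            // points at the parser's d_val

    public:
        void setSval(STYPE *sval)
        {
            d_sval = sval;
        }

        // Pretends to match `text' as a number; returns a token value.
        int lex(std::string const &text)
        {
            if (d_sval)
                *d_sval = std::atoi(text.c_str());  // the semantic value
            return 257;                 // assumed value of a NUMBER token
        }
};

// Hypothetical parser holding the scanner and the semantic value.
class Parser
{
    Scanner d_scanner;
    STYPE   d_val = 0;

    public:
        Parser()
        {
            d_scanner.setSval(&d_val);  // hand d_val's address to the scanner
        }

        // Not copyable: a copy's scanner would still point at the original.
        Parser(Parser const &) = delete;
        Parser &operator=(Parser const &) = delete;

        STYPE lexAndGetValue(std::string const &text)
        {
            d_scanner.lex(text);
            return d_val;               // was assigned by the scanner
        }
};
```
 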
        In the file defining the parse() function the following types and vari‐
        ables  are defined in the anonymous namespace. These are mentioned here
        for the sake of completeness, and are not normally accessible to  other
        parts of the parser.
 
        o      ReservedTokens:
               This  enumeration  defines  some token values used internally by
               the parsing functions. They are:
 
                   _UNDETERMINED_ = -2,
                   _EOF_          = -1,
                   _error_        = 256,
 
               These tokens are used by the parser to determine whether another
               token  should be requested from the lexical scanner, and to han‐
               dle error-conditions.
 
        o      SR (Shift-Reduce Info):
               This struct provides the shift/reduce information for the  vari‐
               ous  grammatical  states. SR values are collected in arrays, one
               array per grammatical state. These arrays, named s_<nr>,  where
               <nr> is a state number, are defined in the anonymous  namespace
               as well. The SR elements consist of two unions, defining  fields
               that  are  applicable  to, respectively, the first, intermediate
               and the last array elements.
               The first element of each array consists of (1st field) a
               StateType and (2nd field) the index of the last array element;
               intermediate elements consist of (1st field) a symbol value and (2nd
               field)  (if negative) the production rule number reducing to the
               indicated symbol value or (if positive) the next state when  the
               symbol  given  in  the  1st field is the current token; the last
               element of each array consists of (1st field) a placeholder  for
               the  current token and (2nd field) the (negative) rule number to
               reduce to by default or the (positive) number of an  error-state
               to  go to when an erroneous token has been retrieved. If the 2nd
               field is zero, no error or default action has been  defined  for
               the state, and error-recovery is attempted.
 
        o      StateType:
               This enumeration defines the type of the various grammar-states.
               They are:
 
                   NORMAL,
                   HAS_ERROR_ITEM,
                   IS_ERROR_STATE,
 
               HAS_ERROR_ITEM is used for a state having at  least  one  error-
               production.  IS_ERROR_STATE is used for a state from which error
               recovery is attempted. So, while  in  these  states  tokens  are
               retrieved  until a token from where parsing may continue is seen
               by the parser. All other states are NORMAL states.
 
        o      PI (Production Info):
               This struct provides information about production rules. It  has
               two  fields:  d_nonTerm is the identification number of the pro‐
               duction’s non-terminal, d_size represents the number of elements
               of the production rule.
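 
        The SR array layout described above lends itself to a sentinel search.
        The following is a sketch of one plausible reading of `a placeholder
        for the current token'; it is not taken from the generated parse()
        function. The SR struct is simplified to two plain int fields instead
        of unions, and the table s_7 with its states and rule numbers is
        invented for the illustration.

```cpp
// Simplified stand-in for the SR struct: plain ints instead of unions.
struct SR
{
    int first;      // state type, symbol, or placeholder for the token
    int second;     // last index, action, or default action
};

enum StateType { NORMAL, HAS_ERROR_ITEM, IS_ERROR_STATE };

// Hypothetical table for one state:
//   on '+' shift to state 4, on ')' reduce by rule 2,
//   otherwise reduce by rule 3 (the default action).
SR s_7[] =
{
    {NORMAL, 3},    // first element: state type, index of the last element
    {'+',    4},    // intermediate: positive -> next state
    {')',   -2},    // intermediate: negative -> rule number
    {0,     -3},    // last element: placeholder, default action
};

// Sentinel search: store the token in the placeholder so the loop always
// terminates, then return the action of the element where it was found.
inline int lookup(SR *state, int token)
{
    SR *last = state + state[0].second;
    last->first = token;

    SR *elem = state + 1;
    while (elem->first != token)
        ++elem;

    return elem->second;    // the default action if elem == last
}
```
 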
        All  DECLARATIONS  and  DEFINE  symbols not listed above but defined in
        bison++ are obsolete with bisonc++. In particular, there is no %header{
        ...  %}  section  anymore.  Also,  all DEFINE symbols related to member
        functions are now obsolete. There is no need for these symbols  anymore
        as  they  can  simply  be declared in the class header file and defined
        elsewhere.
 

EXAMPLE

        Using a fairly worn-out example, we’ll construct  a  simple  calculator
        below.  The  basic  operators  as  well  as  parentheses can be used to
        specify expressions, and each expression should be terminated by a new‐
        line. The program terminates when a q is entered. Empty lines result in
        a mere prompt.
 
        First an associated grammar is constructed. When a syntactical error is
        encountered all tokens are skipped until the next newline and a simple
        message is printed using the default error() function.  It  is  assumed
        that  no  semantic  errors occur (in particular, no divisions by zero).
        The grammar is decorated with actions performed when the  corresponding
        grammatical production rule is recognized. The grammar itself is rather
        standard and straightforward, but note the first part of the specifica‐
        tion  file, containing various other directives, among which the
        %scanner directive, resulting in a composed d_scanner object as well as an
        implementation  of  the  member  function int lex(). In this example, a
        common Scanner class construction strategy was used: the class  Scanner
        was  derived  from  the  class  yyFlexLexer generated by flex++(1). The
        actual process of constructing a class using flex++(1)  is  beyond  the
        scope of this man-page, but flex++(1)’s specification file is mentioned
        below, to further complete the example. Here is bisonc++’s input file:
        %filenames parser
        %scanner ../scanner/scanner.h
 
                                        // lowest precedence
        %token  NUMBER                  // integral numbers
                EOLN                    // newline
 
        %left   '+' '-'
        %left   '*' '/'
        %right  UNARY
                                        // highest precedence
 
        %%
 
        expressions:
            expressions
            evaluate
        |
            prompt
        ;
 
        evaluate:
            alternative
            prompt
        ;
 
        prompt:
            {
                prompt();
            }
        ;
 
        alternative:
            expression
            EOLN
            {
                cout << $1 << endl;
            }
        |
            'q'
            done
        |
            EOLN
        |
            error
            EOLN
        ;
 
        done:
            {
                cout << "Done.\n";
                ACCEPT();
            }
        ;
 
        expression:
            expression
            '+'
            expression
            {
                $$ = $1 + $3;
            }
        |
            expression
            '-'
            expression
            {
                $$ = $1 - $3;
            }
        |
            expression
            '*'
            expression
            {
                $$ = $1 * $3;
            }
        |
            expression
            '/'
            expression
            {
                $$ = $1 / $3;
            }
        |
            '-'
            expression      %prec UNARY
            {
                $$ = -$2;
            }
        |
            '+'
            expression      %prec UNARY
            {
                $$ = $2;
            }
        |
            '('
            expression
            ')'
            {
                $$ = $2;
            }
        |
            NUMBER
            {
                $$ = atoi(d_scanner.YYText());
            }
        ;
 
        Next, bisonc++ processes this file. In the process, bisonc++  generates
        the following files from its skeletons:
 
        o      The parser’s base class, which is not modified by the programmer
               at all:
               #ifndef ParserBase_h_included
               #define ParserBase_h_included
 
               #include <vector>
               #include <iostream>
 
               namespace // anonymous
               {
                   struct PI;
               }
 
               class ParserBase
               {
                   public:
               // $insert tokens
 
                   // Symbolic tokens:
                   enum Tokens
                   {
                       NUMBER = 257,
                       EOLN,
                       UNARY,
                   };
 
               // $insert STYPE
                   typedef int STYPE;
 
                   private:
                       int d_stackIdx;
                       std::vector<size_t>   d_stateStack;
                       std::vector<STYPE>    d_valueStack;
 
                   protected:
                       enum Return
                       {
                           PARSE_ACCEPT = 0,   // values used as parse()’s return values
                           PARSE_ABORT  = 1
                       };
                       enum ErrorRecovery
                       {
                           DEFAULT_RECOVERY_MODE,
                           UNEXPECTED_TOKEN,
                       };
                       bool        d_debug;
                       size_t    d_nErrors;
                       int         d_token;
                       int         d_nextToken;
                       size_t    d_state;
                       STYPE      *d_vsp;
                       STYPE       d_val;
 
                       ParserBase();
 
                       void ABORT() const throw(Return);
                       void ACCEPT() const throw(Return);
                       void ERROR() const throw(ErrorRecovery);
                       void clearin();
                       bool debug() const;
                       void pop(size_t count = 1);
                       void push(size_t nextState);
                       void reduce(PI const &productionInfo);
                       size_t top() const;
 
                   public:
                       void setDebug(bool mode);
               };
 
               inline bool ParserBase::debug() const
               {
                   return d_debug;
               }
 
               inline void ParserBase::setDebug(bool mode)
               {
                   d_debug = mode;
               }
 
               // As a convenience, when including ParserBase.h its symbols are available as
               // symbols in the class Parser, too.
               #define Parser ParserBase
 
               #endif
 
        o      The parser class parser.h itself. In the  grammar  specification
               various  member  functions are used (e.g., done() and prompt()).
               These functions are so small that they can very well  be  imple‐
               mented inline. Note that done() calls ACCEPT() to terminate fur‐
               ther parsing. ACCEPT() and related members (e.g.,  ABORT())  can
               be  called  from any member called by parse(). As a consequence,
               action blocks could contain mere  function  calls,  rather  than
               several  statements,  thus minimizing the need to rerun bisonc++
               when an action is modified.
 
               Once bisonc++ had created parser.h it  was  augmented  with  the
               required  additional  members,  resulting in the following final
               version:
               #ifndef Parser_h_included
               #define Parser_h_included
 
               // $insert baseclass
               #include "parserbase.h"
               // $insert scanner.h
               #include "../scanner/scanner.h"
 
               #undef Parser
               class Parser: public ParserBase
               {
                   // $insert scannerobject
                   Scanner d_scanner;
 
                   public:
                       int parse();
 
                   private:
                       void error(char const *msg);    // called on (syntax) errors
                       int lex();                      // returns the next token from the
                                                       // lexical scanner.
                       void print();                   // use, e.g., d_token, d_loc
 
                       void prompt();
                       void done();
 
                   // support functions for parse():
                       void executeAction(int ruleNr);
                       void errorRecovery();
                       int lookup(bool recovery);
                       void nextToken();
               };
 
               inline void Parser::error(char const *msg)
               {
                   std::cerr << msg << std::endl;
               }
 
               // $insert lex
               inline int Parser::lex()
               {
                   return d_scanner.yylex();
               }
 
               inline void Parser::print()      // use d_token, d_loc
               {}
 
               inline void Parser::prompt()
               {
                   std::cout << "? " << std::flush;
               }
 
               inline void Parser::done()
               {
                   std::cout << "Done\n";
                   ACCEPT();
               }
 
               #endif
 
        o      To complete the example, the following lexical scanner  specifi‐
               cation was used:
               %{
                   #define _SKIP_YYFLEXLEXER_
                   #include "scanner.ih"
 
                   #include "../parser/parser.h"
               %}
 
               %option yyclass="Scanner" outfile="yylex.cc"
               %option c++ 8bit warn noyywrap yylineno
 
               %%
 
               [ \t]+                          // skip white space
 
               \n                              return Parser::EOLN;
 
               [0-9]+                          return Parser::NUMBER;
 
               .                               return yytext[0];
 
               %%
 
        o      Since  no  member  functions  other than parse() were defined in
               separate source files, only parse()  includes  parser.ih.  Since
               cout  and endl are used in the grammar’s actions, a using names‐
               pace std or comparable statement is required. This was  effectu‐
               ated from parser.ih. Here is the implementation header declaring
               the standard namespace:
               // include this file in the sources of the class Calculator,
               // and add any includes etc. that are only needed for
               // the compilation of these sources.
 
               // include the file defining the parser class:
               #include "parser.h"
 
               // UN-comment if you don’t want to prefix std::
               // for every symbol defined in the std. namespace:
 
               using namespace std;
 
               The implementation of the parsing  member  function  parse()  is
               basically  irrelevant,  since  it  should not be modified by the
               programmer. It was written on the file parse.cc.
 
        o      Finally, here is the program offering our simple calculator:
               #include "parser/parser.h"
 
               int main()
               {
                   Parser calculator;
                   return calculator.parse();
               }
 
        Note here that although the  file  parserbase.h,  defining  the  parser
        class’s base class, rather than the header file parser.h  defining  the
        parser class, is included, the lexical scanner may simply return tokens
        of the class Calculator (e.g., Calculator::NUMBER rather than
        CalculatorBase::NUMBER). In fact, using a simple #define - #undef pair,
        generated by bisonc++ at the end of the base class header file and just
        before the definition of the parser class  itself,  respectively,  it
        is possible to assume in the lexical scanner that all symbols  defined
        in the parser’s base class are actually defined in  the  parser  class
        itself. It should be noted that this feature can only be used to access
        the base class’s enums and types. The actual parser class is not
        available by the time the lexical scanner is defined, thus avoiding
        circular class dependencies.
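 
        The #define - #undef pair can be demonstrated in a single translation
        unit. This is a minimal sketch using the class names Parser and
        ParserBase from the example; the function numberToken is a
        hypothetical stand-in for scanner code, and the three parts that are
        normally spread over separate header files are simply placed one after
        another.

```cpp
// Stand-in for the generated base class header (parserbase.h).
class ParserBase
{
    public:
        enum Tokens
        {
            NUMBER = 257,
            EOLN
        };
};
#define Parser ParserBase       // as at the end of the base class header

// Stand-in for scanner code: the parser class itself is not yet defined,
// but Parser::NUMBER already compiles (it is really ParserBase::NUMBER).
inline int numberToken()
{
    return Parser::NUMBER;
}

// Stand-in for parser.h: drop the alias, then define the real class.
#undef Parser
class Parser: public ParserBase
{
};
```

        Because Parser inherits the enumeration, the value returned by
        numberToken() is also the value of Parser::NUMBER once the parser
        class has been defined.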
 

FILES

        o      bisonc++base.h: skeleton of the parser’s base class;
 
        o      bisonc++.h: skeleton of the parser class;
 
        o      bisonc++.ih: skeleton of the implementation header;
 
        o      bisonc++.cc: skeleton of the member parse().
 

SEE ALSO

        bison(1), bison++(1), bison.info (using texinfo), flex++(1)
 
        Lakos, J. (2001) Large Scale C++ Software Design, Addison Wesley.
        Aho, A.V., Sethi, R., Ullman, J.D. (1986) Compilers, Addison Wesley.
 

BUGS

        The Semantic- and Pure parsers, mentioned in bison++(1) are not  imple‐
        mented  in bisonc++(1). According to bison++(1) the semantic parser was
        not available in bison++ either, while the pure parser was deemed  ‘not
        very useful’.
 
        The  member  function  void lookup (< 1.00) was replaced by int lookup.
        When regenerating parsers created by early versions of  bisonc++  (ver‐
        sions  before  version 1.00), lookup’s prototype should be corrected by
        hand, since bisonc++ will not by  itself  rewrite  the  parser  class’s
        header file.
        Bisonc++  was  based on bison++, originally developed by Alain Coetmeur
        (coetmeur@icdc.fr), R&D department (RDT), Informatique-CDC, France, who
        based his work on bison, GNU version 1.21.
 
        Bisonc++  version  0.98  and  beyond is a complete rewrite of an LALR-1
        parser  generator,  closely  following  the  construction  process   as
        described  in  Aho, Sethi and Ullman’s (1986) book Compilers (i.e., the
        Dragon book).  It uses the same  grammar  specification  as  bison  and
        bison++,  and  it  uses  practically the same options and directives as
        bisonc++ versions earlier than 0.98. Variables, declarations and macros
        that  are  obsolete  were  removed.  Since bisonc++ is a completely new
        program, it will most likely contain bugs. Please report  bugs  to  the
        author:
 

AUTHOR

        Frank B. Brokken (f.b.brokken@rug.nl).