Provided by: pretzel_2.0n-2-0.3_amd64

NAME

       pretzel - the universal prettyprinter generator

SYNOPSIS

       pretzel [-qtgdh] [-o outfile] fileprefix

       pretzel [-qtgdh] [-o outfile] file1 file2

DESCRIPTION

       Pretzel  is  a  program that generates a prettyprinter module from a formal description of
       the way a certain language should be prettyprinted.  A  prettyprinter  is  a  function  or
       program  that rearranges source code to enhance its readability.  Prettyprinters generated
       by pretzel output LaTeX source code that can be used within your own documents.  Note that
       pretzel produces modules, not programs!

       You  have  to  provide  two  input files to pretzel that specify the way given source code
       should be prettyprinted. These two files are called the formatted token file (suffix  .ft)
       and the formatted grammar file (suffix .fg).

       From  this  input,  pretzel  generates  two  things:  a  valid flex(1) file that forms the
       prettyprinting scanner and a valid bison(1) input file that  can  be  used  to  build  the
       prettyprinting  parser  (which  is  the  actual  prettyprinter).   There is a shell script
       pretzel-it that facilitates using pretzel (see pretzel-it(1)).  This man page is only meant
       as a quick reference to pretzel usage.  Look into the main documentation of pretzel if you
       are new to all this.

   Invoking pretzel
       Invoking pretzel can take two forms: either invoke it specifying only the common prefix of
       the two input files, or specify both files separately on the command line. If you specify
       both files, the formatted token file comes first.

   Examples
       Say your input files are called foo.ft and foo.fg.  Then you can say

              pretzel foo

       to invoke pretzel properly. If your files are called foo.ft and bar.fg, then you would
       have to say

              pretzel foo.ft bar.fg

       to do the job.

OPTIONS

       Pretzel recognizes the following options:

              -q     Run quietly.

              -t     Process formatted token file only.

              -g     Process  formatted  grammar  file  only  (options  -t  and  -g  are mutually
                     exclusive).

              -d     Print debug information to the screen.

              -h     Print full usage message.

              -o name
                     Use name as prefix of the generated output files.

THE INPUT FILES

       This section summarizes the format of the input files and the  format  command  primitives
       that pretzel supports.

   The formatted token file
       The  formatted  token  file  contains a list of token definitions with their corresponding
       "prettyprinted" form. The prettyprinted form of a token will be called an attribute  or  a
       translation.

       The general outline of the formatted token file is

              declarations

              %%

              token definitions

       Normally,  the  declarations  part is empty. You can put a general description of the file
       here (as a C comment) and redefinitions of the default interface go here as well.

       The token definitions section of the formatted token  file  contains  a  series  of  token
       definitions of the form:

              pattern token attribute

       The pattern must be a valid regular expression (as understood by flex(1)) and must be
       unindented. The token specifies the symbolic name of the token for the pattern and begins
       at the first non-whitespace character after the pattern. The token name must be a legal
       name for an identifier in Pascal notation and must be all in upper case.  (Underscores are
       allowed but not at the beginning of a word.)
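
       The naming rule above can be checked mechanically. The following C++ sketch is not part
       of pretzel; the function name is_valid_token_name is hypothetical. It assumes that digits
       are allowed after the first character and that the restriction on the beginning of a word
       means the name must start with an upper-case letter.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper, not part of pretzel: checks a token name against the
// rule stated above. Assumptions: the first character must be an upper-case
// letter; later characters may be upper-case letters, digits or underscores.
bool is_valid_token_name(const std::string& name) {
    if (name.empty()) return false;
    if (!std::isupper(static_cast<unsigned char>(name[0]))) return false;
    for (char c : name) {
        unsigned char u = static_cast<unsigned char>(c);
        if (!(std::isupper(u) || std::isdigit(u) || c == '_')) return false;
    }
    return true;
}

// Usage example built from token names that appear in this page.
void check_token_names() {
    assert(is_valid_token_name("PROC_INTRO"));
    assert(is_valid_token_name("DIGIT"));
    assert(!is_valid_token_name("_ID"));   // leading underscore
    assert(!is_valid_token_name("Id"));    // contains a lower-case letter
}
```

       Such a check is only a convenience for catching typing mistakes early; pretzel itself
       remains the authority on which names it accepts.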

       The attribute for this token, that is its prettyprinted form, consists of all text
       between the two curly braces { and }.  Attributes can be either simple strings
       (surrounded by double quotes), format commands (see below), your own C++ code (enclosed in
       square brackets [ and ], see below) or a combination of these joined together by an
       optional + sign. Attribute definitions can cover several lines and the starting { needn't
       stand on the same line as the token definition; however, subsequent lines must be indented
       with at least one blank or one tab.

       If  you define strings as part of an attribute definition, you have to specify them in a C
       kind of fashion, i.e. you can insert newlines and tabs with \n and \t.  But if you want to
       insert  a  backslash  into a string, you mustn't forget to put two backslashes \\ into the
       input file. This is especially noteworthy if you are using TeX as typesetter.
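
       The doubling rule is the same as for C and C++ string literals, so it can be illustrated
       with an ordinary C++ fragment. The sketch below is not pretzel code and its function names
       are hypothetical; it merely shows that a string typed as "{\\it " denotes five characters,
       of which exactly one is a backslash.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical illustration, not pretzel code: the string as it would be
// typed inside an attribute definition, using C escape rules.
std::string italic_prefix() {
    return "{\\it ";   // two backslashes in the source, one in the result
}

void check_escaping() {
    const std::string s = italic_prefix();
    assert(s.size() == 5);             // '{', '\', 'i', 't', ' '
    assert(s[1] == '\\');              // a single backslash character
    assert(std::strlen("\\{") == 2);   // "\\{" is a backslash followed by '{'
}
```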

       If the definition of the attribute is omitted, pretzel creates an attribute for this
       pattern by default. The default attribute consists of the string containing the text
       matched by the corresponding pattern.

       You may also refer to the matched text yourself by using the sequence **.  Thus

              "foo"       BAR

              "foo"       BAR     { ** }

              "foo"       BAR     { "foo" }

       all have the same meaning.

       You can use a | sign as a token name; this signals that the current regular expression has
       the same token name (and also the same attribute) as the token specified in the following
       line (empty lines are ignored). An attribute definition after a | is illegal.  However,
       you may specify regular expressions with neither a token name nor an attribute to give a
       default rule or to eat up whitespace.

       The declarations and the token definitions must be separated by a line containing only the
       two characters %%.

   Examples
       The following examples are all legal token definitions:

              [0-9]           DIGIT

              "{"             OPEN           { "\\{" indent force }

              [a-z][a-z0-9]*  ID             { "{\\it " ** "}" }

              "function"      |

              "procedure"     PROC_INTRO     { big_force + ** }

              [\t\ \n]        |

              .

   The formatted grammar file
       In  the formatted grammar file the user encodes the general prettyprinting grammar for the
       programming language. This is done by specifying a context free grammar  of  the  language
       and by adding information about the creation of new attributes in every rule.  Its general
       outline looks like this:

              token declarations

              %%

              grammar rules

       The token declarations section may be empty and the separator between the two parts of the
       file %% must appear unindented on a single line by itself.

       The  grammar  rules  section  contains the collection of rules of the context free grammar
       that can be accompanied by an attribute definition.  A rule is specified  by  stating  the
       resulting token, a colon and then the series of tokens which will be reduced by this rule.
       The rule is ended by a semicolon. A block definition in Pascal for example might look like
       this:

                block : BEGIN stmt_list END ;

       Following the token list on the right side of the colon can be an attribute definition;
       this definition states how the translation of the produced symbol is obtained from the
       tokens on the right side of the rule.

       An attribute definition is enclosed in curly braces { and } and can again consist of
       strings (in double quotes), format commands or C++ code (enclosed in square brackets [
       and ], see below) joined together by an optional +.  But here you can also refer to the
       attributes of the tokens on the right side of the rule. This is done in a slightly awkward
       notation with a number preceded by a dollar sign $. The numbers refer to the order of
       appearance of the symbols on the right side of the rule.  So $1 refers to the first token
       of the rule, $2 to the second, and so on.

       Again  attribute  definitions  are  allowed  to  span  several  lines  and strings must be
       specified in C manner.

       The attribute definition may be omitted. If this is so, pretzel will by default  form  the
       attribute  of  the  produced symbol from the simple concatenation of the attributes on the
       right side of the rule.  Of course you may also have empty  right  sides  of  a  rule  (to
       produce  things  out  of nothing) or simply concatenate two or more rules resulting in the
       same symbol with a |.

       For every terminal token that appears in the grammar  rules  a  special  line  has  to  be
       written into the declarations section of the file. These definitions are of the form

              %token tokenname

       It is very important not to forget this.

   Examples
       For  example,  here  again  is  the  possible definition of a block in Pascal, now with an
       example attribute definition:

                block : BEGIN stmt_list END   { $1 $2 force $3 } ;

       The attribute of a block will therefore  consist  of  the  attributes  of  the  BEGIN  and
       stmt_list  tokens,  joined  together  with  a force command and the translation of the END
       token.

       These two lines mean the same:

              stmt : block SEMI ;

              stmt : block SEMI       { $1 $2 } ;

       These are legal rules too:

              stmt_list   :                      { force }
                          | stmt_list stmt SEMI  { $1 $2 $3 force };

   Comments and Code
       There is a very simple way of putting comments into  the  formatted  token  and  formatted
       grammar files. This is done in a C++ kind of manner by preceding the comment with a double
       slash //.  All characters between this sign and  the  end  of  the  line  are  ignored  by
       pretzel.

       In both files you can put additional C/C++ code before and after the definitions/grammar
       sections.  If you want to insert code at the end of your file, you have to put a second %%
       on a line by itself and put the code after it. C/C++ code before the definitions/rules
       section has to be enclosed in a %{ and %} pair. Inserting extra code is interesting for
       people who want to access it from within the attribute definitions.

   Code within attribute definitions
       From version 2.0 onwards, pretzel allows you to insert C++ code into attribute
       definitions. This is how pretzel expects you to write code inside your input files:

       Code fragments are enclosed in square brackets [ and ]. Any square brackets that appear
       within the code itself must be escaped with a backslash. There can be blocks of code
       before and after the attribute definition, which are called starting code and ending
       code. Only one starting and one ending code block is allowed. Both are optional, but if
       you want to specify either of them, you need an attribute definition. Starting code is
       executed before the attribute of the new token is built; ending code is executed after
       building the attribute and before returning to the calling function (in the scanner).

       Code parts within attribute definitions must return a pointer to an Attribute class object
       (see file attr/attr.nw in the pretzel distribution for details).  Within the formatted
       token file, the matched text is visible to you in the form of a char* yytext variable. The
       symbolic names of the tokens are available under the same names that pretzel gives them.
       Starting code, code within attribute definitions and ending code are all optional, but
       at any place where they are allowed, only one bracketed code fragment may be placed.
       Here's an example from the formatted grammar file:

              id : ID  { [lookup($1) ? create("{\\bf ") :

                            create("{\\it ")] $1 "}" };

       This example shows how to format an identifier depending on whether  it  is  in  a  lookup
       table or not. Identifiers could be installed in the table for example like this:

              typedef : TYPEDEF_LIKE INT_LIKE ID

                         [ install($3); ]

                         { $1 $2 "{\\bf " $3 "}" };

       More  examples  can be found in the Pretzelbook. Common routines to escape identifiers, to
       build and manage lookup tables, to convert to and  from  Attribute*  or  to  output  debug
       information  can  be  found in the files belonging to the C prettyprinter in the directory
       languages/cee of the pretzel distribution.

   The set of format commands
       Here's a list of the format commands supported by pretzel and their meaning:
       null   empty command.
       indent indents the next line a little more.
       outdent
              takes back the last indentation (de-indent).
       force  forces a line break.
       break_space
              denotes a possible space for a line break.
       opt1...opt9
              denotes an optional line break with the continuation line indented a little with
              respect to the normal starting position.
       backup denotes a small backspace.
       big_force
              forces a line break and inserts a little extra space.
       no_indent
              causes the current line to be output flushleft.
       cancel obliterates any break_space, opt, force or big_force command that immediately
              precedes or follows it and also cancels any backup command that follows it.

       For a complete reference on how to write pretzel input, look into the Pretzelbook, which
       is included in the pretzel distribution.

   Format command preprocessing
       The format commands are preprocessed according to the following two rules:

       1. A sequence of consecutive break_space, force, and/or big_force commands is replaced
          by a single command (the maximum of the given ones).

       2. The cancel command cancels any break_space, opt, force or big_force command that
          immediately precedes or follows it and also cancels any backup command that follows
          it.
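
       The two rules can be sketched as a small pass over a command sequence. The following C++
       code is an illustration only and is not taken from the pretzel sources. The Cmd type and
       the preprocess function are hypothetical, and several points are assumptions: the maximum
       is taken in the order break_space < force < big_force, rule 1 is applied before rule 2, a
       cancel removes the whole adjacent run of cancellable commands, and the cancel itself is
       dropped from the result.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of a format command stream; not pretzel's own types.
enum class Cmd { Text, BreakSpace, Force, BigForce, Opt, Backup, Cancel };

// Assumed ordering for "the maximum": break_space < force < big_force.
static int break_rank(Cmd c) {
    switch (c) {
        case Cmd::BreakSpace: return 1;
        case Cmd::Force:      return 2;
        case Cmd::BigForce:   return 3;
        default:              return 0;   // not a mergeable break command
    }
}

// Commands that a cancel removes when they are adjacent to it.
static bool cancellable_before(Cmd c) { return break_rank(c) > 0 || c == Cmd::Opt; }
static bool cancellable_after(Cmd c)  { return cancellable_before(c) || c == Cmd::Backup; }

std::vector<Cmd> preprocess(const std::vector<Cmd>& in) {
    // Rule 1: collapse each run of break commands into its strongest member.
    std::vector<Cmd> merged;
    for (Cmd c : in) {
        if (break_rank(c) > 0 && !merged.empty() && break_rank(merged.back()) > 0) {
            if (break_rank(c) > break_rank(merged.back())) merged.back() = c;
        } else {
            merged.push_back(c);
        }
    }
    // Rule 2: a cancel removes the adjacent cancellable commands on both sides.
    std::vector<Cmd> out;
    for (std::size_t i = 0; i < merged.size(); ++i) {
        if (merged[i] != Cmd::Cancel) { out.push_back(merged[i]); continue; }
        while (!out.empty() && cancellable_before(out.back())) out.pop_back();
        while (i + 1 < merged.size() && cancellable_after(merged[i + 1])) ++i;
        // The cancel itself is not copied to the output (assumption).
    }
    return out;
}

// Usage example: one case per rule.
void check_preprocess() {
    using V = std::vector<Cmd>;
    assert(preprocess(V{Cmd::Text, Cmd::BreakSpace, Cmd::BigForce, Cmd::Force, Cmd::Text})
           == (V{Cmd::Text, Cmd::BigForce, Cmd::Text}));
    assert(preprocess(V{Cmd::Text, Cmd::Force, Cmd::Cancel, Cmd::Backup, Cmd::Text})
           == (V{Cmd::Text, Cmd::Text}));
}
```

       Note that a backup command in front of a cancel is kept in this sketch, since the rule
       only mentions backup commands that follow the cancel.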

THE OUTPUT FILES

       If pretzel runs without error, you will obtain the definition of a C++ prettyprinter class
       in the form of two files. The first file is a valid bison(1) file from which the actual
       prettyprinting parser class can be obtained. The second file (generated from the formatted
       token file) can be processed with the flex(1) scanner generator to form the prettyprinting
       scanner class used by the parser.

   The bison file
       The generated bison file contains the definitions for a prettyprinting parser  class  that
       is  a subclass of the following abstract base class (contained in the file Pparse.h within
       the pretzel include directory):

              #include <iostream>
              #include "attr.h"
              #include "output.h"

              class Pparse {
              public:
                     Pparse() {};
                     ~Pparse() {};

                     virtual int prettyprint(istream*, ostream*) = 0;
                     virtual int prettyprint(istream*, Output*) = 0;
              };

       The prettyprinter generated by pretzel will be a subclass of the following form:

              #include "Pparse.h" // include abstract base class

              class PPARSE_NAME : public Pparse {
              public:
                     PPARSE_NAME();
                     ~PPARSE_NAME();

                     int prettyprint(istream*, ostream*);
                     int prettyprint(istream*, Output*);

                     void debug_on();
                     void debug_off();
              };

       The name of the class may be changed by redefining the preprocessor macro PPARSE_NAME
       within the formatted grammar file. The actual prettyprinting function is prettyprint,
       which reads text from an input stream (i.e. a C++ istream object) and writes the results
       to an output stream (i.e. a C++ ostream object, see ios(3C++)).  The second overloaded
       version of prettyprint takes an Output object (see the file output/output.nw and the
       Pretzelbook in the pretzel distribution for details) and uses this to output the
       prettyprinted code. The debug functions can be used to turn debugging output to cerr on
       and off.
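
       Client code only needs the interface of the abstract base class. The following C++ sketch
       shows one way of calling prettyprint with string streams. It is not part of pretzel: the
       Pparse class below is a stand-in that mirrors the declaration shown above (without the
       Output overload) so that the example compiles without the pretzel headers, EchoPrinter
       is a hypothetical placeholder for a generated class, and treating a return value of 0 as
       success is an assumption.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

// Stand-in for the abstract base class shown above, so that the sketch
// compiles without the pretzel headers. The Output overload is omitted.
class Pparse {
public:
    virtual ~Pparse() {}
    virtual int prettyprint(std::istream*, std::ostream*) = 0;
};

// Hypothetical placeholder for a generated prettyprinter: it copies its input
// line by line and returns 0. A real generated class would parse the input
// and emit LaTeX instead.
class EchoPrinter : public Pparse {
public:
    int prettyprint(std::istream* in, std::ostream* out) override {
        std::string line;
        while (std::getline(*in, line)) *out << line << '\n';
        return 0;
    }
};

// Runs any prettyprinter on a string and returns what it wrote.
std::string run_prettyprinter(Pparse& pp, const std::string& source) {
    std::istringstream in(source);
    std::ostringstream out;
    int status = pp.prettyprint(&in, &out);
    assert(status == 0);   // 0 meaning success is an assumption of this sketch
    return out.str();
}
```

       With a real generated class, the same helper could be called with an object of that class
       in place of EchoPrinter.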

   The flex file
       The prettyprinting parser class relies on the service of a prettyprinting scanner that can
       be produced using the second pretzel file. It contains a complete definition of a scanner
       subclass of this abstract base class (see file Pscan.h in the pretzel include directory):

              #include <iostream>
              #include "attr.h"

              class Pscan {
              public:
                     Pscan(istream*) {};
                     ~Pscan() {};

                     virtual int scan(Attribute**) = 0;
              };

       The scanner must be initialized with a C++ istream pointer from which it takes its input.
       A call to the actual scan function returns an integer (the token code of the token just
       scanned or 0 on end-of-file) plus an attribute, passed by reference, containing the
       contents of the token (see file attr/attr.nw from the pretzel distribution).
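
       The calling convention of scan can be sketched as a loop that runs until the scanner
       returns 0. The following C++ code is an illustration only. Attribute and Pscan are
       simplified stand-ins for the real pretzel classes, WordScanner is a hypothetical toy
       scanner, and the assumption that the caller owns the returned attribute is made for this
       sketch and is not documented here.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Simplified stand-in for pretzel's Attribute class (see attr/attr.nw for the real one).
struct Attribute {
    std::string text;
};

// Stand-in for the abstract base class shown above.
class Pscan {
public:
    explicit Pscan(std::istream*) {}
    virtual ~Pscan() {}
    virtual int scan(Attribute**) = 0;
};

// Hypothetical toy scanner: every whitespace-separated word is a token with
// code 1 and its own text as attribute.
class WordScanner : public Pscan {
public:
    explicit WordScanner(std::istream* in) : Pscan(in), in_(in) {}
    int scan(Attribute** attr) override {
        std::string word;
        if (!(*in_ >> word)) return 0;   // 0 signals end-of-file
        *attr = new Attribute{word};     // caller takes ownership (assumption)
        return 1;
    }
private:
    std::istream* in_;
};

// Drives any Pscan until it reports end-of-file and collects the token texts.
std::vector<std::string> scan_all(Pscan& scanner) {
    std::vector<std::string> texts;
    Attribute* attr = nullptr;
    while (scanner.scan(&attr) != 0) {
        assert(attr != nullptr);         // a token must come with an attribute
        texts.push_back(attr->text);
        delete attr;
        attr = nullptr;
    }
    return texts;
}
```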

       The produced prettyprinting scanner class is a subclass and looks like this:

              #include "Pscan.h" // include abstract base class

              class PSCAN_NAME : public Pscan {
              public:
                     PSCAN_NAME(istream*);
                     ~PSCAN_NAME();

                     int scan(Attribute**);
              };

       The name of the scanner can be changed within the formatted token file by redefining the
       PSCAN_NAME macro within the declarations section. The scanner class expects to find token
       definitions common to the scanner and the parser in a file called ptokdefs.h and will try
       to include this file. You either have to provide this file yourself or use the -d option
       of Bison to create one that fits a formatted grammar (see bison(1)).  You may change the
       name of the file that the scanner expects by redefining the PTOKDEFS_NAME macro in the
       declarations section of the formatted token file.  Common header files for the abstract
       base classes and the default subclasses reside in the pretzel include directory.

FILES

       /usr/lib/pretzel/libpretzel.a pretzel runtime library.
       /usr/include/pretzel          directory for runtime library include files (pretzel include
                                     directory).
       /usr/local/lib/pretzel/include/Pscan.h
       /usr/include/pretzel/Pparse.h headers for abstract base classes.
       /usr/include/pretzel/Ppscan.h
       /usr/include/pretzel/Ppparse.h
                                     default headers for generated subclasses.
       /usr/lib/texmf/tex/latex/pretzel/pretzel-latex.sty
                                     LaTeX style to typeset pretzel output.

SEE ALSO

       pretzel-it(1), flex(1), bison(1)
       The  PretzelBook, second edition - ultimate source of information, included in the pretzel
       distribution.
       The Pretzel homepage on the WWW at
       http://www.iti.informatik.tu-darmstadt.de/~gaertner/pretzel

AUTHOR

       Felix Gaertner, email: fcg@acm.org

                                          June 11, 1998                                pretzel(1)