Provided by: pretzel_2.0n-2-0.3_amd64

NAME

       pretzel - the universal prettyprinter generator

SYNOPSIS

       pretzel [-qtgdh] [-o outfile] fileprefix

       pretzel [-qtgdh] [-o outfile] file1 file2

DESCRIPTION

       Pretzel is a program that generates a prettyprinter module from a formal description of the way a certain
       language should be prettyprinted.  A prettyprinter is a function or program that rearranges  source  code
       to  enhance  its  readability.   Prettyprinters generated by pretzel output LaTeX source code that can be
       used within your own documents.  Note that pretzel produces modules, not programs!

       You have to provide two input files to  pretzel  that  specify  the  way  given  source  code  should  be
       prettyprinted. These two files are called the formatted token file (suffix .ft) and the formatted grammar
       file (suffix .fg).

       From this input, pretzel generates two things: a valid flex(1) file that forms the prettyprinting scanner
       and  a valid bison(1) input file that can be used to build the prettyprinting parser (which is the actual
       prettyprinter).  There is a shell script pretzel-it that facilitates using pretzel (see pretzel-it(1)).
       This  man  page is only meant as a quick reference to pretzel usage.  Look into the main documentation of
       pretzel if you are new to all this.

   Invoking pretzel
       Invoking pretzel can take two forms: either invoke it specifying only the common prefix of the two input
       files, or specify both files separately on the command line. If you specify both files, the formatted
       token file comes first.

   Examples
       Say your input files are called foo.ft and foo.fg.  Then you can say

              pretzel foo

       to invoke pretzel properly. If your files are called foo.ft and bar.fg then you would have to say

              pretzel foo.ft bar.fg

       to do the job.
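
       The prefix form can be pictured with a small C++ sketch. The helper input_files is hypothetical (it is
       not part of pretzel); it only illustrates how a common prefix maps to the two file names, with the
       formatted token file first.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical helper, not part of pretzel: maps a common prefix to the
// two input file names, formatted token file (.ft) first, then the
// formatted grammar file (.fg).
std::pair<std::string, std::string> input_files(const std::string& prefix) {
    assert(!prefix.empty());  // a prefix is required in this sketch
    return {prefix + ".ft", prefix + ".fg"};
}
```

       With the prefix foo this yields foo.ft and foo.fg, which matches the first example above. Files that do
       not share a prefix still have to be named separately on the command line.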

OPTIONS

       Pretzel recognizes the following options:

              -q     Run quietly.

              -t     Process formatted token file only.

              -g     Process formatted grammar file only (options -t and -g are mutually exclusive).

              -d     Print debug information to the screen.

              -h     Print full usage message.

              -o name
                     Use name as prefix of the generated output files.

THE INPUT FILES

       This section summarizes the format of the input files and the  format  command  primitives  that  pretzel
       supports.

   The formatted token file
       The  formatted  token  file contains a list of token definitions with their corresponding "prettyprinted"
       form. The prettyprinted form of a token will be called an attribute or a translation.

       The general outline of the formatted token file is

              declarations

              %%

              token definitions

       Normally, the declarations part is empty. You can put a general description of the  file  here  (as  a  C
       comment) and redefinitions of the default interface go here as well.

       The  token  definitions section of the formatted token file contains a series of token definitions of the
       form:

              pattern token attribute

       The pattern must be a valid regular expression (in terms of flex(1)) and must be  unindented.  The  token
       specifies the symbolic name of the token for the pattern and begins at the first non-whitespace character
       after the pattern. The token name must be a legal name for an identifier in Pascal notation and  must  be
       all in upper case.  (Underscores are allowed, but not at the beginning of a word.)

       The attribute for this token, that is, its prettyprinted form, consists of all text between the two
       curly braces { and }.  Attributes can be simple strings (surrounded by double quotes), format
       commands (see below), your own C++ code (enclosed in square brackets [ and ], see below) or a combination
       of these joined together by an optional + sign. Attribute definitions can cover several lines and the
       starting { need not stand on the same line as the token definition; however, subsequent lines must be
       indented with at least one blank or one tab.

       If you define strings as part of an attribute definition, you have  to  specify  them  in  a  C  kind  of
       fashion,  i.e.  you  can  insert newlines and tabs with \n and \t.  But if you want to insert a backslash
       into a string, you mustn't forget to put two backslashes \\ into  the  input  file.  This  is  especially
       noteworthy if you are using TeX as typesetter.
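
       As an illustration of the escaping rule, the following C++ sketch converts raw text (for example a TeX
       fragment) into the body of a C-style string. The function name escape_for_attribute is hypothetical, and
       escaping the double quote is an assumption based on ordinary C string syntax; the text above only
       mentions \n, \t and \\.

```cpp
#include <string>

// Hypothetical helper, not part of pretzel: rewrites raw text so that it
// can be pasted between double quotes in an attribute definition.
std::string escape_for_attribute(const std::string& raw) {
    std::string out;
    for (char c : raw) {
        switch (c) {
            case '\\': out += "\\\\"; break;  // one backslash becomes two
            case '\n': out += "\\n";  break;  // newline becomes \n
            case '\t': out += "\\t";  break;  // tab becomes \t
            case '"':  out += "\\\""; break;  // assumption: C-style quote escape
            default:   out += c;
        }
    }
    return out;
}
```

       Applied to the five characters {\it followed by a blank, the sketch produces {\\it followed by a blank,
       which is the form used in the token file examples of this page.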

       If the definition of the attribute is omitted, pretzel creates an attribute for this pattern by default.
       The default attribute consists of the string containing the text matched by the corresponding pattern.

       You can also refer to the matched text yourself by using the sequence **.  Thus

              "foo"       BAR

              "foo"       BAR     { ** }

              "foo"       BAR     { "foo" }

       all have the same meaning.

       You can use a | sign as a token name; this signals that the current regular expression has the same token
       name  (and  also  the  same  attribute)  as  the  token  specified in the following line (empty lines are
       ignored). An attribute definition behind a | is illegal.  However, you may specify regular expressions
       with neither a token name nor an attribute to give a default rule or to eat up whitespace.

       The declarations and the token definitions must be separated by a line containing only the two characters
       %%.

   Examples
       The following examples are all legal token definitions:

              [0-9]           DIGIT

              "{"             OPEN           { "\\{" indent force }

              [a-z][a-z0-9]*  ID             { "{\\it " ** "}" }

              "function"      |

              "procedure"     PROC_INTRO     { big_force + ** }

              [\t\ \n]        |

              .

   The formatted grammar file
       In the formatted grammar file the user encodes the general prettyprinting  grammar  for  the  programming
       language.  This  is  done  by specifying a context free grammar of the language and by adding information
       about the creation of new attributes in every rule.  Its general outline looks like this:

              token declarations

              %%

              grammar rules

       The token declarations section may be empty and the separator between the two parts of the file  %%  must
       appear unindented on a single line by itself.

       The  grammar  rules  section  contains  the  collection  of rules of the context free grammar that can be
       accompanied by an attribute definition.  A rule is specified by stating the resulting token, a colon  and
       then  the  series of tokens which will be reduced by this rule. The rule is ended by a semicolon. A block
       definition in Pascal for example might look like this:

                block : BEGIN stmt_list END ;

       An attribute definition can follow the token list on the right side of the colon; this definition
       states how the translation of the produced symbol is obtained from the tokens on the right side of the
       rule.

       An attribute definition is enclosed in curly braces { and } and can again consist of strings (in
       double quotes), format commands or C code (enclosed in square brackets [ and ], see below) joined
       together by an optional +.  But here you can also refer to the attributes of the tokens on the right side
       of the rule. This is done with a number preceded by a dollar sign ($). The numbers refer to the order of
       appearance of the symbols on the right side of the rule. So $1 refers to the first token of the rule,
       $2 to the second, and so on.

       Again, attribute definitions are allowed to span several lines and strings must be specified in C manner.

       The attribute definition may be omitted. If this is so, pretzel will by default form the attribute of the
       produced symbol from the simple concatenation of the attributes on the right side of the rule.  Of course
       you  may  also  have empty right sides of a rule (to produce things out of nothing) or simply concatenate
       two or more rules resulting in the same symbol with a |.
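
       The default behaviour and the $n notation can be modelled in a few lines of C++. This is only a sketch:
       plain strings stand in for pretzel's Attribute class, and the names Attr, default_attribute and dollar
       are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for pretzel's Attribute class (assumption: plain text suffices
// to illustrate the idea).
using Attr = std::string;

// Models the default rule: the attribute of the produced symbol is the
// concatenation of the right-side attributes, in order of appearance.
Attr default_attribute(const std::vector<Attr>& rhs) {
    Attr result;
    for (const Attr& a : rhs) result += a;
    return result;
}

// Models $n: a 1-based reference to the n-th symbol on the right side.
const Attr& dollar(const std::vector<Attr>& rhs, std::size_t n) {
    assert(n >= 1 && n <= rhs.size());
    return rhs[n - 1];
}
```

       In this model, a rule without an attribute definition behaves like one whose definition lists $1, $2 and
       so on in order.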

       For every terminal token that appears in the grammar rules a special line has  to  be  written  into  the
       declarations section of the file. These definitions are of the form

              %token tokenname

       It is very important not to forget this.

   Examples
       For  example,  here  again is the possible definition of a block in Pascal, now with an example attribute
       definition:

                block : BEGIN stmt_list END   { $1 $2 force $3 } ;

       The attribute of a block will therefore consist of the attributes of  the  BEGIN  and  stmt_list  tokens,
       joined together with a force command and the translation of the END token.

       These two lines mean the same:

              stmt : block SEMI ;

              stmt : block SEMI       { $1 $2 } ;

       These are legal rules too:

              stmt_list   :                      { force }
                          | stmt_list stmt SEMI  { $1 $2 $3 force };

   Comments and Code
       There is a very simple way of putting comments into the formatted token and formatted grammar files. This
       is done in a C++ kind of manner by preceding the comment with a double slash //.  All characters  between
       this sign and the end of the line are ignored by pretzel.

       In both files you can put additional C/C++ code before and after the definitions/grammar sections.  If
       you want to insert code at the end of your file, you have to put a second %% on a line by itself and put
       the code after it. C/C++ code before the definitions/rules section has to be enclosed in a %{, %} pair.
       Inserting extra code is useful if you want to access it from within the attribute definitions.

   Code within attribute definitions
       From version 2.0 onwards, pretzel allows you to insert C++ code into attribute definitions. This is how
       pretzel expects you to write code inside your pretzel input files:

       Code fragments are enclosed in square brackets. Any square brackets that appear within the code
       must be escaped with a backslash. There can be blocks of code before and after the attribute definition,
       which are called starting code and ending code.  Only one starting or ending code block is allowed.  Both
       are optional, but if you want to specify either of them, you need an attribute definition. Starting
       code is executed before the attribute of the new token is built; ending code is executed after building
       the attribute and before returning to the calling function (in the scanner).

       Code  parts  within  attribute  definitions  must return a pointer to an Attribute class object (see file
       attr/attr.nw in the pretzel distribution for details).  Within the formatted token file, the matched text
       is  visible  to you in form of a char* yytext variable. The symbolic names of the tokens are available by
       the same name that pretzel gives them.  Starting code, code within attribute definitions and ending code
       are all optional. But at any place where they are allowed, only one bracketed code block may be placed.
       Here's an example from the formatted grammar file:

              id : ID  { [lookup($1) ? create("{\\bf ") :

                            create("{\\it ")] $1 "}" };

       This example shows how to format an identifier depending on whether it is  in  a  lookup  table  or  not.
       Identifiers could be installed in the table for example like this:

              typedef : TYPEDEF_LIKE INT_LIKE ID

                         [ install($3); ]

                         { $1 $2 "{\\bf " $3 "}" };

       More examples can be found in the Pretzelbook. Common routines to escape identifiers, to build and manage
       lookup tables, to convert to and from Attribute* or to output debug information can be found in the files
       belonging to the C prettyprinter in the directory languages/cee of the pretzel distribution.

   The set of format commands
       Here's a list of the format commands supported by pretzel and their meaning:
       null   empty command.
       indent indents the next line a little more.
       outdent
              takes back the last indentation (de-indent).
       force  forces a line break.
       break_space
              denotes a possible space for a line break.
       opt1...opt9
              denotes an optional line break with the continuation line indented a little with respect to the
              normal starting position.
       backup denotes a small backspace.
       big_force
              forces a line break and inserts a little extra space.
       no_indent
              causes the current line to be output flushleft.
       cancel obliterates any break_space, opt, force or big_force command that immediately precedes or follows
              it and also cancels any backup command that follows it.

       For a complete reference on how to write pretzel input, look into the Pretzelbook which is included in
       the pretzel distribution.

   Format command preprocessing
       The format commands are preprocessed according to the following two rules:

       1. A sequence of consecutive break_space, force, and/or big_force commands is replaced by a single
          command (the maximum of the given ones).

       2. The cancel command cancels any break_space, opt, force or big_force command that immediately
          precedes or follows it and also cancels any backup command that follows it.
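
       The two rules can be sketched in C++ as follows. This is not pretzel's implementation. The sketch rests
       on several assumptions: the ranking break_space < force < big_force for "the maximum", rule 1 being
       applied before rule 2, the cancel command itself staying in the sequence, and all opt1...opt9 variants
       being modelled by a single opt value. All names (Cmd, collapse_breaks, apply_cancel, preprocess) are
       hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Stand-in command codes. The enumerator order encodes the assumed
// ranking break_space < force < big_force used by rule 1.
enum class Cmd { text, opt, backup, cancel, break_space, force, big_force };

bool is_break(Cmd c) {
    return c == Cmd::break_space || c == Cmd::force || c == Cmd::big_force;
}

// Rule 1: collapse each run of break_space/force/big_force commands into
// a single command, the maximum of the run.
std::vector<Cmd> collapse_breaks(const std::vector<Cmd>& in) {
    std::vector<Cmd> out;
    for (Cmd c : in) {
        if (is_break(c) && !out.empty() && is_break(out.back()))
            out.back() = std::max(out.back(), c);
        else
            out.push_back(c);
    }
    return out;
}

// Rule 2: a cancel removes a break_space, opt, force or big_force that
// directly precedes or follows it, and a backup that directly follows it.
std::vector<Cmd> apply_cancel(const std::vector<Cmd>& in) {
    std::vector<bool> drop(in.size(), false);
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] != Cmd::cancel) continue;
        if (i > 0 && (is_break(in[i - 1]) || in[i - 1] == Cmd::opt))
            drop[i - 1] = true;
        if (i + 1 < in.size() &&
            (is_break(in[i + 1]) || in[i + 1] == Cmd::opt ||
             in[i + 1] == Cmd::backup))
            drop[i + 1] = true;
    }
    std::vector<Cmd> out;
    for (std::size_t i = 0; i < in.size(); ++i)
        if (!drop[i]) out.push_back(in[i]);
    return out;
}

// Assumed order of application: rule 1 first, then rule 2.
std::vector<Cmd> preprocess(const std::vector<Cmd>& in) {
    return apply_cancel(collapse_breaks(in));
}
```

       In this model the sequence break_space, big_force, force collapses to a single big_force, and a force
       directly before a cancel disappears together with a backup directly after it.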

THE OUTPUT FILES

       If pretzel runs without error, you will obtain the definition of a C++ prettyprinter class in form of two
       files.  The  first file is a valid bison(1) file from which the actual prettyprinting parser class can be
       obtained. The second file (generated from the formatted token file) can be  processed  with  the  flex(1)
       scanner generator to form the prettyprinting scanner class used by the parser.

   The bison file
       The generated bison file contains the definitions for a prettyprinting parser class that is a subclass of
       the following abstract base class (contained in the file Pparse.h within the pretzel include directory):

              #include<iostream>

              #include"attr.h"

              #include"output.h"

              class Pparse {

              public:
                     Pparse() {};

                     ~Pparse() {};

                     virtual int prettyprint(istream*, ostream*) = 0;

                     virtual int prettyprint(istream*, Output*) = 0;
              };

       The prettyprinter generated by pretzel will be a subclass of the following form:

              #include "Pparse.h" // include abstract base class

              class PPARSE_NAME : public Pparse {

              public:
                     PPARSE_NAME();

                     ~PPARSE_NAME();

                     int prettyprint(istream*, ostream*);

                     int prettyprint(istream*, Output*);

                     void debug_on();

                     void debug_off();
              };

       The name of the class may be  changed  by  redefining  the  preprocessor  macro  PPARSE_NAME  within  the
       formatted  grammar  file. The actual prettyprinting function is prettyprint that reads text from an input
       stream (i.e. a C++ istream object) and outputs the results to  an  output  stream  (i.e.  a  C++  ostream
       object,  see  ios(3C++)).   The  second overloaded version of prettyprint takes an Output object (see the
       file output/output.nw and the Pretzelbook in the pretzel distribution  for  details)  and  uses  this  to
       output  the  prettyprinted  code. The debug functions can be used to turn debugging output to cerr on and
       off.
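
       Code that uses a prettyprinter only needs the abstract interface. The following self-contained C++
       sketch illustrates this with stand-ins: the Pparse declaration here is a simplified copy of the base
       class shown above (with a virtual destructor and std:: qualified stream types added), Output is only
       forward declared, and EchoPrinter and prettyprint_string are hypothetical names. A class generated by
       pretzel would take the place of EchoPrinter. The meaning of the int return value is not described on
       this page; returning 0 for success is an assumption.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

class Output;  // stand-in forward declaration; the real class comes from output.h

// Stand-in for the abstract base class normally provided by Pparse.h.
class Pparse {
public:
    virtual ~Pparse() {}
    virtual int prettyprint(std::istream*, std::ostream*) = 0;
    virtual int prettyprint(std::istream*, Output*) = 0;
};

// Hypothetical toy subclass: copies its input to the output unchanged.
class EchoPrinter : public Pparse {
public:
    int prettyprint(std::istream* in, std::ostream* out) override {
        *out << in->rdbuf();
        return 0;  // assumption: 0 means success
    }
    int prettyprint(std::istream*, Output*) override {
        return 1;  // the Output variant is not supported by this toy class
    }
};

// Caller code depends only on the abstract interface.
std::string prettyprint_string(Pparse& pp, const std::string& source) {
    std::istringstream in(source);
    std::ostringstream out;
    pp.prettyprint(&in, &out);
    return out.str();
}
```

       Because prettyprint_string takes a Pparse reference, the same caller works with any generated
       prettyprinter class, whatever name PPARSE_NAME was given.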

   The flex file
       The prettyprinting parser class relies on the service of a prettyprinting scanner that  can  be  produced
       using the second pretzel file. It contains a complete definition of a scanner subclass of this abstract
       base class (see file Pscan.h in the pretzel include directory):

              #include<iostream>

              #include"attr.h"

              class Pscan {

              public:
                     Pscan(istream*) {};

                     ~Pscan() {};

                     virtual int scan(Attribute**) = 0;
              };

       The scanner must be initialized with a C++ istream pointer from which it takes its input. A call to the
       actual scan function returns an integer (the token code of the token just scanned or 0 on end-of-file)
       and passes back, through its pointer argument, an attribute containing the contents of the token (see
       file attr/attr.nw from the pretzel distribution).
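
       The calling convention of scan can be illustrated with a self-contained C++ sketch. Everything in it is
       a stand-in: Attribute is reduced to a struct holding text, Pscan is a simplified copy of the base class
       shown above, and WordScanner and collect_tokens are hypothetical. Who owns the returned Attribute is not
       stated on this page; the sketch assumes the caller does.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Stand-in for pretzel's Attribute class (assumption: it just wraps text).
struct Attribute { std::string text; };

// Stand-in for the abstract base class normally provided by Pscan.h.
class Pscan {
public:
    explicit Pscan(std::istream*) {}
    virtual ~Pscan() {}
    virtual int scan(Attribute**) = 0;
};

// Hypothetical toy scanner: every whitespace-separated word is a token
// with token code 1.
class WordScanner : public Pscan {
public:
    explicit WordScanner(std::istream* in) : Pscan(in), in_(in) {}
    int scan(Attribute** attr) override {
        std::string word;
        if (!(*in_ >> word)) return 0;  // 0 signals end-of-file
        *attr = new Attribute{word};    // assumption: caller takes ownership
        return 1;
    }
private:
    std::istream* in_;
};

// Calls scan until it returns 0 and collects the token texts.
std::vector<std::string> collect_tokens(Pscan& scanner) {
    std::vector<std::string> texts;
    Attribute* attr = nullptr;
    while (scanner.scan(&attr) != 0) {
        assert(attr != nullptr);  // a nonzero token code comes with an attribute
        texts.push_back(attr->text);
        delete attr;
        attr = nullptr;
    }
    return texts;
}
```

       The loop in collect_tokens mirrors what a parser does with the scanner: it asks for one token at a time
       and stops when the token code 0 reports end-of-file.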

       The produced prettyprinting scanner class is a subclass and looks like this:

              #include "Pscan.h" // include abstract base class

              class PSCAN_NAME : public Pscan {

              public:
                     PSCAN_NAME(istream*);

                     ~PSCAN_NAME();

                     int scan(Attribute**);
              };

       The name of the scanner can be changed within the formatted token file by redefining the PSCAN_NAME macro
       within the declarations section. The scanner class expects  to  find  token  definitions  common  to  the
       scanner  and the parser in a file called ptokdefs.h and will try to include this file. You either have to
       provide this file yourself or use the -d option of Bison to create one that fits a formatted grammar (see
       bison(1)).   You may change the name of the file that the scanner expects by redefining the PTOKDEFS_NAME
       macro in the declarations section of the formatted token file.  Common header files for the abstract base
       classes and the default subclasses reside in the pretzel include directory.

FILES

       /usr/lib/pretzel/libpretzel.a pretzel runtime library.
       /usr/include/pretzel          directory for runtime library include files (pretzel include directory).
       /usr/local/lib/pretzel/include/Pscan.h
       /usr/include/pretzel/Pparse.h headers for abstract base classes.
       /usr/include/pretzel/Ppscan.h
       /usr/include/pretzel/Ppparse.h
                                     default headers for generated subclasses.
       /usr/lib/texmf/tex/latex/pretzel/pretzel-latex.sty
                                     LaTeX style to typeset pretzel output.

SEE ALSO

       pretzel-it(1), flex(1), bison(1)
       The PretzelBook, second edition - ultimate source of information, included in the pretzel distribution.
       The Pretzel homepage on the WWW at http://www.iti.informatik.tu-darmstadt.de/~gaertner/pretzel

AUTHOR

       Felix Gaertner, email: fcg@acm.org

                                                  June 11, 1998                                       pretzel(1)