Ubuntu Manpage: antlr - ANother Tool for Language Recognition

NAME

       antlr - ANother Tool for Language Recognition

SYNTAX

       antlr [options] grammar_files

DESCRIPTION

       Antlr  converts  an  extended form of context-free grammar into a set of C functions which
       directly implement an efficient form  of  deterministic  recursive-descent  LL(k)  parser.
       Context-free  grammars  may  be  augmented with predicates to allow semantics to influence
       parsing; this allows a form of context-sensitive parsing.  Selective backtracking is  also
       available  to  handle  non-LL(k)  and  even non-LALR(k) constructs.  Antlr also produces a
       definition of a lexer which can be automatically converted into C  code  for  a  DFA-based
       lexer  by  dlg.   Hence,  antlr  serves  a function much like that of yacc, however, it is
       notably more flexible and is more  integrated  with  a  lexer  generator  (antlr  directly
       generates dlg code, whereas yacc and lex are given independent descriptions).  Unlike yacc
       which accepts LALR(1) grammars, antlr accepts LL(k) grammars in an extended BNF notation —
       which eliminates the need for precedence rules.

       Like  yacc  grammars,  antlr  grammars  can  use automatically-maintained symbol attribute
       values referenced as dollar variables.  Further, because antlr generates top-down parsers,
       arbitrary  values  may  be  inherited from parent rules (passed like function parameters).
       Antlr also has a mechanism for creating and manipulating abstract-syntax-trees.

       There are various other niceties in antlr, including the ability  to  spread  one  grammar
       over  multiple files or even multiple grammars in a single file, the ability to generate a
       version of the grammar with actions stripped out (for documentation  purposes),  and  lots
       more.

OPTIONS

-ck n Use up to n symbols of lookahead when using compressed (linear approximation)
lookahead. This type of lookahead is very cheap to compute and is attempted before
full LL(k) lookahead, which is of exponential complexity in the worst case. In
general, the compressed lookahead can be much deeper (e.g, -ck 10) than the full
lookahead (which usually must be less than 4).

-CC Generate C++ output from both ANTLR and DLG.

-cr Generate a cross-reference for all rules. For each rule, print a list of all other
rules that reference it.

-e1 Ambiguities/errors shown in low detail (default).

-e2 Ambiguities/errors shown in more detail.

-e3 Ambiguities/errors shown in excruciating detail.

-fe file
Rename err.c to file.

-fh file
Rename stdpccts.h header (turns on -gh) to file.

-fl file
Rename lexical output, parser.dlg, to file.

-fm file
Rename file with lexical mode definitions, mode.h, to file.

-fr file
Rename file which remaps globally visible symbols, remap.h, to file.

-ft file
Rename tokens.h to file.

-ga Generate ANSI-compatible code (default case). This has not been rigorously tested
to be ANSI XJ11 C compliant, but it is close. The normal output of antlr is
currently compilable under both K&R, ANSI C, and C++—this option does nothing
because antlr generates a bunch of #ifdef's to do the right thing depending on the
language.

-gc Indicates that antlr should generate no C code, i.e., only perform analysis on the
grammar.

-gd C code is inserted in each of the antlr generated parsing functions to provide for
user-defined handling of a detailed parse trace. The inserted code consists of
calls to the user-supplied macros or functions called zzTRACEIN and zzTRACEOUT.
The only argument is a char * pointing to a C-style string which is the grammar
rule recognized by the current parsing function. If no definition is given for the
trace functions, upon rule entry and exit, a message will be printed indicating
that a particular rule as been entered or exited.

-ge Generate an error class for each non-terminal.

-gh Generate stdpccts.h for non-ANTLR-generated files to include. This file contains
all defines needed to describe the type of parser generated by antlr (e.g. how much
lookahead is used and whether or not trees are constructed) and contains the header
action specified by the user.

-gk Generate parsers that delay lookahead fetches until needed. Without this option,
antlr generates parsers which always have k tokens of lookahead available.

-gl Generate line info about grammar actions in C parser of the form # line "file"
which makes error messages from the C/C++ compiler make more sense as they will
point into the grammar file not the resulting C file. Debugging is easier as well,
because you will step through the grammar not C file.

-gs Do not generate sets for token expression lists; instead generate a ||-separated
sequence of LA(1)==token_number. The default is to generate sets.

-gt Generate code for Abstract-Syntax Trees.

-gx Do not create the lexical analyzer files (dlg-related). This option should be
given when the user wishes to provide a customized lexical analyzer. It may also
be used in make scripts to cause only the parser to be rebuilt when a change not
affecting the lexical structure is made to the input grammars.

-k n Set k of LL(k) to n; i.e. set tokens of look-ahead (default==1).

-o dir Directory where output files should go (default="."). This is very nice for
keeping the source directory clear of ANTLR and DLG spawn.

-p The complete grammar, collected from all input grammar files and stripped of all
comments and embedded actions, is listed to stdout. This is intended to aid in
viewing the entire grammar as a whole and to eliminate the need to keep actions
concisely stated so that the grammar is easier to read. Hence, it is preferable to
embed even complex actions directly in the grammar, rather than to call them as
subroutines, since the subroutine call overhead will be saved.

-pa This option is the same as -p except that the output is annotated with the first
sets determined from grammar analysis.

-prc on
Turn on the computation and hoisting of predicate context.

-prc off
Turn off the computation and hoisting of predicate context. This option makes 1.10
behave like the 1.06 release with option -pr on. Context computation is off by
default.

-rl n Limit the maximum number of tree nodes used by grammar analysis to n.
Occasionally, antlr is unable to analyze a grammar submitted by the user. This
rare situation can only occur when the grammar is large and the amount of lookahead
is greater than one. A nonlinear analysis algorithm is used by PCCTS to handle the
general case of LL(k) parsing. The average complexity of analysis, however, is
near linear due to some fancy footwork in the implementation which reduces the
number of calls to the full LL(k) algorithm. An error message will be displayed,
if this limit is reached, which indicates the grammar construct being analyzed when
antlr hit a non-linearity. Use this option if antlr seems to go out to lunch and
your disk start thrashing; try n=10000 to start. Once the offending construct has
been identified, try to remove the ambiguity that antlr was trying to overcome with
large lookahead analysis. The introduction of (...)? backtracking blocks
eliminates some of these problems — antlr does not analyze alternatives that begin
with (...)? (it simply backtracks, if necessary, at run time).

-w1 Set low warning level. Do not warn if semantic predicates and/or (...)? blocks are
assumed to cover ambiguous alternatives.

-w2 Ambiguous parsing decisions yield warnings even if semantic predicates or (...)?
blocks are used. Warn if predicate context computed and semantic predicates
incompletely disambiguate alternative productions.

- Read grammar from standard input and generate stdin.c as the parser file.

SPECIAL CONSIDERATIONS

       Antlr  works...   we  think.   There  is no implicit guarantee of anything.  We reserve no
       legal rights to the software known as the Purdue Compiler Construction Tool Set (PCCTS)  —
       PCCTS  is  in  the public domain.  An individual or company may do whatever they wish with
       source code distributed  with  PCCTS  or  the  code  generated  by  PCCTS,  including  the
       incorporation  of  PCCTS,  or its output, into commercial software.  We encourage users to
       develop software with PCCTS.  However, we do ask that credit is given to us for developing
       PCCTS.   By  "credit",  we  mean  that if you incorporate our source code into one of your
       programs (commercial product, research project, or otherwise) that  you  acknowledge  this
       fact  somewhere  in the documentation, research report, etc...  If you like PCCTS and have
       developed a nice tool with the output, please mention that you developed it  using  PCCTS.
       As  long as these guidelines are followed, we expect to continue enhancing this system and
       expect to make other tools available as they are completed.

FILES