Ubuntu Manpage: Marpa::R2::Progress - Progress reports on your parse

Provided by: libmarpa-r2-perl_2.086000~dfsg-5build1_amd64

NAME

       Marpa::R2::Progress - Progress reports on your parse

About this document

       This document describes the progress reports for Marpa's SLIF interface.  These allow an
       application to know exactly where it is in the parse at any point.  For parse locations of
       the user's choosing, progress reports list all the rules in play, and indicate the
       location at which the rule started, and how far into the rule parsing has progressed.

       Progress reports are extremely useful in debugging grammars and the detailed example in
       this document is a debugging situation.  Readers specifically interested in debugging a
       grammar should read the document on tracing problems before reading this document.

Introduction to Earley items

To read the "show_progress" output, it is important to have a basic idea of what Earley
items are, and of what the information in them means. Everything that the user needs to
know is explained in this section.

Dotted rules
Marpa is based on Jay Earley's algorithm for parsing. The idea behind Earley's algorithm
is that you can parse by building a table of rules and where you are in those rules.
"Where" means two things: location in the rule relative to the rule's symbols, and
location relative to the parse's input stream.

Let's look at an example of a rule in a context-free grammar. Here's the rule for
assignment from the Perl distribution's "perly.y":

    termbinop -> term ASSIGNOP term

"ASSIGNOP" is "perly.y"'s internal name for the assignment operator. In plain Perl terms,
this is the "=" character.

In parsing this rule, we can be at any of four possible locations. One location is at the
beginning, before all of the symbols. The other three locations are immediately after
each of the rule's three symbols.

Within a rule, position relative to the symbols of the rule is traditionally indicated
with a dot. In fact, the symbol-relative rule position is very often called the dot
location. Taken as a pair, a rule and a dot location are called a dotted rule.

Here's our rule with a dot location indicated:

    termbinop -> . term ASSIGNOP term

The dot location in this dotted rule is at the beginning. A dot location at the beginning
of a dotted rule means that we have not recognized any symbols in the rule yet. All we
are doing is predicting that the rule will occur. A dotted rule with the dot before all
of its symbols is called a prediction or a predicted rule.

Here's another dotted rule:

    termbinop -> term . ASSIGNOP term

In this dotted rule, we are saying we have seen a "term", but have not yet recognized an
"ASSIGNOP".

There's another special kind of dotted rule, a completion. A completion (also called a
completed rule) is a dotted rule with the dot after all of the symbols. Here is the
completion for the rule that we have been using as an example:

    termbinop -> term ASSIGNOP term .

A completion indicates that a rule has been fully recognized.
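
As a rough illustration of the terms above, a dotted rule can be modelled as a rule plus
a count of the RHS symbols already recognized. This is only a sketch, not Marpa's
internal representation, and the helper names "show_dotted_rule" and "dotted_rule_type"
are invented for this example. The dot is printed as ".", the way progress reports
print it.

```perl
use strict;
use warnings;

# A rule is a LHS symbol plus a list of RHS symbols.
my %rule = ( lhs => 'termbinop', rhs => [qw(term ASSIGNOP term)] );

# Render a dotted rule; $dot is the number of RHS symbols already recognized.
sub show_dotted_rule {
    my ( $rule, $dot ) = @_;
    my @rhs = @{ $rule->{rhs} };
    splice @rhs, $dot, 0, q{.};
    return join q{ }, $rule->{lhs}, q{->}, @rhs;
}

# Classify a dotted rule by where its dot is.
sub dotted_rule_type {
    my ( $rule, $dot ) = @_;
    return 'prediction' if $dot == 0;
    return 'completion' if $dot == scalar @{ $rule->{rhs} };
    return 'in progress';
}

# A rule with three RHS symbols has four dot locations: 0, 1, 2 and 3.
for my $dot ( 0 .. scalar @{ $rule{rhs} } ) {
    printf "%-12s %s\n", dotted_rule_type( \%rule, $dot ),
        show_dotted_rule( \%rule, $dot );
}
```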

Earley items
The dotted rules contain all but one piece of the information that Marpa needs to track.
The missing piece is the second of the two "wheres": where in the input stream. To
associate input stream location and dotted rules, Marpa uses what are now called Earley
items.

A convenient way to think of an Earley item is as a triple, or 3-tuple, consisting of
dotted rule, origin and current location. The origin is the location in the input stream
where the dotted rule starts. The current location (also called the dot location) is the
location in the input stream which corresponds to the dot position.

In Marpa terms, G1 location is location in terms of the G1 subgrammar's Earley sets. When
the term "location" is used in this document, it means G1 location unless otherwise
indicated.

A user often finds it much more convenient to think in terms of line and column position
in the input stream, instead of G1 location. Every G1 location corresponds to a range of
positions in the input stream. When the term "position" is used in this document, it
means input stream position, unless otherwise indicated.

Two noteworthy consequences follow from the way in which origin and current G1 location
are defined. First, if a dotted rule is a prediction, then origin and current location
will always be the same. Second, the input stream location where a rule ends is not
tracked unless the dotted rule is a completion. In other cases, an Earley item does not
tell us if a rule will ever be completed, much less at which location.
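
The two consequences just described can be made concrete with a small model of an Earley
item. The sketch below is an assumption-laden illustration, not Marpa code: the hash
layout and the names "make_earley_item" and "rule_end" are hypothetical.

```perl
use strict;
use warnings;

# An Earley item as a triple: dotted rule, origin, current location.
# The dotted rule is itself a pair: a rule and a dot position.
sub make_earley_item {
    my ( $rule, $dot, $origin, $current ) = @_;

    # A prediction has recognized nothing, so it cannot span any input.
    die "prediction must have origin == current location\n"
        if $dot == 0 and $origin != $current;
    return { rule => $rule, dot => $dot, origin => $origin, current => $current };
}

# The end of a rule is known only when the item is a completion.
sub rule_end {
    my ($item) = @_;
    my $rhs_length = scalar @{ $item->{rule}{rhs} };
    return $item->{dot} == $rhs_length ? $item->{current} : undef;
}

my $rule      = { lhs => 'termbinop', rhs => [qw(term ASSIGNOP term)] };
my $predicted = make_earley_item( $rule, 0, 2, 2 );
my $partial   = make_earley_item( $rule, 2, 2, 4 );
my $completed = make_earley_item( $rule, 3, 2, 5 );

for my $item ( $predicted, $partial, $completed ) {
    my $end = rule_end($item);
    printf "dot=%d origin=%d current=%d end=%s\n",
        @{$item}{qw(dot origin current)}, defined $end ? $end : 'unknown';
}
```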

The problem

       For this example of debugging, I have taken a very simple prototype of a string expression
       calculator and deliberately introduced a problem.  I've commented out one of the correct
       rules:

           # <numeric assignment> ::= variable '=' <numeric expression>

       and replaced it with an altered one:

           <numeric assignment> ::= variable '=' expression

       For those readers who like to look ahead (and I encourage you to be one of those readers)
       all of the code and outputs for this example are collected in the "Appendix".

       This altered rule contains a mistake of the kind that is easy to make in actual practice.
       (In this case, an unlucky choice of naming conventions may have contributed.)  The altered
       version will cause problems.  In what follows, we'll pretend we don't already know where
       the problem is, and that in desk-checking the grammar our eye does not spot the mistake,
       so that we need to use the Marpa diagnostics and tracing facilities to "discover" it.

The example

       The example we will use is a prototype string calculator.  It's extremely simple, to make
       the example easy to follow.  But it can be seen as a realistic example, if it is thought
       of as a very early stage in the incremental development of something useful.

           :default ::= action => ::array bless => ::lhs
           :start ::= statements
           statements ::= statement *
           statement ::= assignment | <numeric assignment>
           assignment ::= 'set' variable 'to' expression

           # This is a deliberate error in the grammar
           # The next line should be:
           # <numeric assignment> ::= variable '=' <numeric expression>
           # I have changed the <numeric expression>  to <expression> which
           # will cause problems.
           <numeric assignment> ::= variable '=' expression

           expression ::=
                  variable | string
               || 'string' '(' <numeric expression> ')'
               || expression '+' expression
           <numeric expression> ::=
                  variable | number
               || <numeric expression> '+' <numeric expression>
               || <numeric expression> '*' <numeric expression>
           variable ~ [\w]+
           number ~ [\d]+
           string ~ ['] <string contents> [']
           <string contents> ~ [^'\x{0A}\x{0B}\x{0C}\x{0D}\x{0085}\x{2028}\x{2029}]+
           :discard ~ whitespace
           whitespace ~ [\s]+

       At this stage of developing our string calculator, we have assignment, variables,
       constants, concatenation and conversion of numerics.  For numerics, we have assignment,
       variables, constants, multiplication and addition.

       Since string expressions and variables are the "default", we decide to make the symbol
       names for numeric assignment and expressions explicit in the grammar:
       "<numeric expression>" and "<numeric assignment>".  But since strings are the default, we
       decide to call our string expressions simply "<expression>", and to call our string
       assignments simply "<assignment>".  This seems like a good idea, but it is also likely to
       cause confusion.  For the sake of our example we will pretend that it did.

The error message

       If we try the following input,

           my $test_input = 'a = 8675309 + 42 * 711';

       we will get this error message,

           Error in SLIF parse: No lexemes accepted at line 1, column 18
             Rejected lexeme #0: Lexer "L0"; '*'; value="*"; length = 1
           * String before error: a = 8675309 + 42\s
           * The error was at line 1, column 18, and at character 0x002a '*', ...
           * here: * 711

       The error message indicates that Marpa rejected the "*" operator.

The value of the parse

In debugging this issue, we'll look at the value of the parse first. The parse value
differs from the other debugging aids we'll discuss. Every other debugging tool we will
describe is always available, no matter how badly the parse failed. But if you have a
problem parsing, you often won't get a parse value.

Our luck holds. Here's a dump of the parse value at the point of failure. It's a nice
way to see what Marpa thinks the parse was so far.

    \bless( [
        bless( [
            bless( [
                'a',
                '=',
                bless( [
                    bless( [
                        '8675309'
                    ], 'My_Nodes::expression' ),
                    '+',
                    bless( [
                        '42'
                    ], 'My_Nodes::expression' )
                ], 'My_Nodes::expression' )
            ], 'My_Nodes::numeric_assignment' )
        ], 'My_Nodes::statement' )
    ], 'My_Nodes::statements' );

If we were perceptive, we might spot the error here. Our parse is not quite right, and
that shows up in the outer "My_Nodes::expression" -- it should be
"My_Nodes::numeric_expression". We'll assume that we don't notice this.

In fact, in the following, we'll pretend we haven't seen the dump of the parse value. We
can't always get a parse value, so we don't want to rely on it.

Output from trace_terminals()
You can rely on getting the output from "trace_terminals", and it is a good next place to
check. Typically, you will be interested in the last tokens to be accepted. Sometimes
that information alone is enough to make it clear where the problem is.

The full "trace_terminals" output for this example is in the Appendix. We see that the
recognizer accepts the input as far as the multiplication sign ("*"), which it rejects.
In Marpa, a lexeme is "acceptable" if it fits the grammar and the input so far. A lexeme
is rejected if it is not acceptable.

The last two lines of the "trace_terminals" output are:

Lexer "L0" discarded lexeme L1c17: whitespace
Lexer "L0" rejected lexeme L1c18: '*'; value="*"

A note in passing: Marpa shows the input string position of the tokens it accepts,
discards and rejects. "<whitespace>" is supposed to be discarded and that was what
happened at line 1, column 17. But the '*' that was next in the input was rejected, and
that was not supposed to happen.

Output from show_progress()
Marpa's most powerful tool for debugging grammars is its progress report, which shows the
Earley items being worked on. In the Appendix, progress reports for the entire parse are
shown. Our example in this document is a very small one, so that producing progress
reports for the entire parse is a reasonable thing to do in this case. If a parse is at
all large, you will usually need to be selective.

The progress report that is usually of most interest is the one for the Earley set that
you were working on when the error occurred. This is called the current location. In our
example the current location is G1 location 5. By default, "show_progress" prints out
only the progress reports for the current location.

Here are the progress reports for the current location, location 5, from our example.

P0 @0-5 L1c1-16 statements -> . statement *
F0 @0-5 L1c1-16 statements -> statement * .
P1 @5-5 L1c15-16 statement -> . assignment
P2 @5-5 L1c15-16 statement -> . <numeric assignment>
F2 @0-5 L1c1-16 statement -> <numeric assignment> .
P3 @5-5 L1c15-16 assignment -> . 'set' variable 'to' expression
P4 @5-5 L1c15-16 <numeric assignment> -> . variable '=' expression
F4 @0-5 L1c1-16 <numeric assignment> -> variable '=' expression .
F5 @2-5 L1c3-16 expression -> expression .
F7 @4-5 L1c13-16 expression -> expression .
F8 @4-5 L1c13-16 expression -> variable .
R11:1 @2-5 L1c3-16 expression -> expression . '+' expression
F11 @2-5 L1c3-16 expression -> expression '+' expression .
F19 @0-5 L1c1-16 :start -> statements .

Progress report lines
F19 @0-5 L1c1-16 :start -> statements .

The last field of each progress report line shows, in fully expanded form, the dotted rule
we were working on. Prefixed to the dotted rule are three fields. In the example just
above they are "F19 @0-5 L1c1-16". The "F19" says that this is a completed or final
rule, and that it is rule number 19. The rule number is a convenient way to refer to a
rule and is used when displaying the whole rule would take too much space.

The "@0-5" describes the G1 locations of the dotted rule in the parse. In its simplest
form, the location field is two G1 location numbers, separated by a hyphen. The first G1
location number is the origin, the place where Marpa first started recognizing the rule.
The last G1 location number is the dot location, the G1 location of the dot in a dotted
rule. "@0-5" says that this rule began at G1 location 0, and that the dot is at G1
location 5.

Following the G1 location is the range of positions in the input string: "L1c1-16".
This indicates that the origin of the dotted rule is at line 1, column 1, and that its dot
position is after line 1, column 16.

The current location is also just after line 1, column 16, and at G1 location 5, and this
is no coincidence. Whenever we are displaying the progress report for a G1 location, all
the progress report lines will have their dot location at that G1 location.

As an aside, notice that the left hand side symbol is ":start". That is the start pseudo-
symbol. The presence of a completed start rule in our progress report indicates that if
our input had ended at location 5, it would be a valid sentence in the language of our
grammar. (And it is because the input at G1 location 5 was a valid sentence of the
grammar, that we were able to look at the value of the parse at location 5 for debugging
purposes.)

Let's look at another progress report line:

R11:2 @2-4 L1c3-13 expression -> expression '+' . expression

Here the "R11:2" indicates that this is rule number 11 (the "R" stands for rule
number) and that its dot position is after the second symbol on the right hand side.
Symbol positions are numbered using the ordinal of the symbol just before the position.
Symbols are numbered starting with 1, and symbol position 2 is the position immediately
after symbol 2.

Predicted rules also appear in progress reports:

P2 @3-3 L1c5-11 statement -> . <numeric assignment>

Here the "P" in the summary field means "predicted". Notice that in the predicted rule,
the origin is the same as the dot location. This will always be the case with predicted
rules.
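
When a report has to be examined by a script rather than by eye, the fields described
above can be pulled apart with a regular expression. The following is a minimal sketch
that handles only the simple single-origin form of a line; "parse_progress_line" and
its hash keys are names invented for this example, and the line format is assumed from
the examples shown in this document.

```perl
use strict;
use warnings;

# Split one simple progress report line into its fields.
# Handles only the single-origin form, e.g. "R11:2 @2-4 L1c3-13 ...".
sub parse_progress_line {
    my ($line) = @_;
    my ( $tag, $rule_id, $dot, $origin, $current, $span, $dotted_rule ) =
        $line =~ m{
            \A \s* ([PRF]) (\d+) (?: : (\d+) )?   # summary field
            \s+ \@ (\d+) - (\d+)                  # G1 origin and dot location
            \s+ (L \S+)                           # input string range
            \s+ (.*?) \s* \z                      # the dotted rule itself
        }xms
        or return;
    my %type = ( P => 'predicted', R => 'in progress', F => 'completed' );
    return {
        type         => $type{$tag},
        rule_id      => $rule_id,
        dot_position => $dot,        # defined only for "R" lines
        origin       => $origin,
        current      => $current,
        span         => $span,
        dotted_rule  => $dotted_rule,
    };
}

my $item = parse_progress_line(
    q{R11:2 @2-4 L1c3-13 expression -> expression '+' . expression});
printf "rule %d, %s, dot after symbol %d, G1 %d-%d, input %s\n",
    @{$item}{qw(rule_id type dot_position origin current span)};
```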

OK! Now to find the bug
If we look again at the progress reports at location 5, the location where things went
wrong, we see that we have completed rules for "<expression>", "<numeric assignment>",
"<statement>", "<statements>", as expected. We also see two Earley items that show that
we are in the process of building another "<expression>", and that it is expecting a '+'
symbol.

What we want to know is, why is the recognizer not expecting a '*' symbol? Looking
back at the grammar, we see that only one rule uses the '*' symbol. Here it is as part
of a prioritized rule in the DSL:

<numeric expression> ::=
variable | number
|| <numeric expression> '+' <numeric expression>
|| <numeric expression> '*' <numeric expression>

Here it is from the "show_rules()" listing:

G1 R18 <numeric expression> ::= <numeric expression> '*' <numeric expression>

It's rule 18 in subgrammar G1, and for convenience we will call it R18. The next step is
to look at the Earley items for this rule. But there is a problem. We don't find any.

Next, we ask ourselves, what is the earliest place R18 should be appearing? The answer is
that there should be a prediction of R18 at location 0. So we look at the predictions at
location 0.

P0 @0-0 L0c0 statements -> . statement *
P1 @0-0 L0c0 statement -> . assignment
P2 @0-0 L0c0 statement -> . <numeric assignment>
P3 @0-0 L0c0 assignment -> . 'set' variable 'to' expression
P4 @0-0 L0c0 <numeric assignment> -> . variable '=' expression
P19 @0-0 L0c0 :start -> . statements

No R18 predicted at G1 location 0. Next we look through the entire progress report,
at all G1 locations, to see if R18 is predicted anywhere. No R18. Not anywhere.

The LHS of R18 is "<numeric expression>". We look in the progress report for dotted rules
where "<numeric expression>" is expected -- that is, dotted rules where "<numeric
expression>" is the post-dot symbol. There are none.
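
For a report of any size, this kind of search is easier to do mechanically. The sketch
below scans the text of progress report lines for dotted rules in which a given symbol
is the post-dot symbol. It works on the printed report only and assumes the line format
shown in this document; "post_dot_symbol" and "lines_expecting" are hypothetical helper
names, not Marpa methods.

```perl
use strict;
use warnings;

# Return the symbol just after the dot in a progress report line,
# or undef if the dot is last (a completion) or the line has no rule.
sub post_dot_symbol {
    my ($line) = @_;
    my ($rhs) = $line =~ m{ -> \s+ (.*) \z }xms or return;

    # Tokens are <bracketed names>, 'quoted literals', or bare words.
    my @tokens = $rhs =~ m{ ( < [^>]+ > | ' [^']* ' | \S+ ) }gxms;
    for my $ix ( 0 .. $#tokens - 1 ) {
        return $tokens[ $ix + 1 ] if $tokens[$ix] eq q{.};
    }
    return;
}

# Lines of the report in which $symbol is expected next.
sub lines_expecting {
    my ( $symbol, @lines ) = @_;
    return grep { ( post_dot_symbol($_) // q{} ) eq $symbol } @lines;
}

# Three lines copied from the progress report in the Appendix.
my @report = (
    q{R4:2 @0-2 L1c1-3 <numeric assignment> -> variable '=' . expression},
    q{P10 @2-2 L1c3 expression -> . 'string' '(' <numeric expression> ')'},
    q{R11:2 @2-4 L1c3-13 expression -> expression '+' . expression},
);
print "expecting <numeric expression>: $_\n"
    for lines_expecting( '<numeric expression>', @report );
print "expecting expression: $_\n" for lines_expecting( 'expression', @report );
```

In P10 "<numeric expression>" is present but is not the post-dot symbol, so the first
search finds nothing, which matches what we saw by eye.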

Next we look for places in the progress reports where "<numeric expression>" occurs at
all, whether post-dot or not. In the progress reports, "<numeric expression>" occurs in
only two dotted rule instances. Here they are:

P10 @2-2 L1c3 expression -> . 'string' '(' <numeric expression> ')'

P10 @4-4 L1c13 expression -> . 'string' '(' <numeric expression> ')'

In both cases these are predictions of a string operator, the operator we plan to use for
converting numerics to strings. They are just predictions, predictions which go no
further because there is no 'string' operator in our input. That's fine, but why no
other, more relevant, occurrences of "<numeric expression>"?

We look back at the grammar. Aside from the rule for the 'string' operator, "<numeric
expression>" occurs on a RHS in two places. One is in the prioritized rule which defines
"<numeric expression>".

<numeric expression> ::=
variable | number
|| <numeric expression> '+' <numeric expression>
|| <numeric expression> '*' <numeric expression>

This rule will never put "<numeric expression>" into the Earley items unless there is a
"<numeric expression>" already there. But that is not its job. This rule is just fine
and does not need fixing.

That leaves one rule to look at.

<numeric assignment> ::= variable '=' expression

This rule is one that should lead to the prediction of a new "<numeric expression>" in our
example. And now we see our problem. This rule is never leading to the prediction of a
new "<numeric expression>", because there is no "<numeric expression>" on its RHS, or for
that matter anywhere else in it. On the RHS, where we wrote "<expression>", we should
have written "<numeric expression>". Change that and the problem is fixed.

Complications

We have finished our main example. This section discusses some aspects of debugging which
did not arise in the example, and which might be unexpected.

Empty rules
When a symbol is nulled in your parse, "show_progress" shows only the nulled symbol. It
does not show the symbol's expansion into rules, or any of its nulled child symbols. This
reduces clutter, and usually one does not notice the missing nulled rules and symbols.
Not showing these seems to be the intuitive way to treat them.

Input string ranges
G1 locations run in a monotonic sequence, starting with 0. G1 locations never run
backwards, they are never visited twice, and they leave no gaps.

Input string positions, on the other hand, can do all of these things. An application is
allowed to jump around in the input. An input string position may be encountered more
than once. It is quite possible to write your application so that it encounters, for
example, line 42 before line 7. And your application does not have to visit line 42 on
its way from line 41 to line 43. For that matter, an application does not ever have to
visit any position in its input.

How does Marpa deal with this when reporting input string ranges? Marpa always reports
the minimum range that includes all the input string positions visited in the dotted rule.
The range is always reported in increasing numeric order, even when the position at the
end of the range was visited before the input string position at the beginning of the
range. And, if necessary to include all visited input string positions, the range may
include input string positions which were not visited.

Most applications move forward continuously in the input string, and if yours is one of
them, you don't have to worry about these issues. But if you do unusual things when
reading the input, it helps to be aware of how input string ranges are reported by Marpa
when tracing and debugging.
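
The reporting rule described above amounts to taking the smallest and largest visited
positions, whatever the visit order. The following sketch illustrates that idea only;
it is not Marpa's implementation, the visited positions are made up, and the names
"cmp_position" and "covering_range" are hypothetical.

```perl
use strict;
use warnings;

# Compare two [line, column] positions.
sub cmp_position {
    my ( $p, $q ) = @_;
    return $p->[0] <=> $q->[0] || $p->[1] <=> $q->[1];
}

# Smallest range covering every visited position, in increasing order,
# regardless of the order in which the positions were visited.
sub covering_range {
    my @sorted = sort { cmp_position( $a, $b ) } @_;
    return ( $sorted[0], $sorted[-1] );
}

# Hypothetical visit order: line 42 is read before line 7.
my @visited = ( [ 42, 1 ], [ 7, 5 ], [ 7, 30 ], [ 41, 12 ] );
my ( $first, $last ) = covering_range(@visited);
printf "L%dc%d-L%dc%d\n", @{$first}, @{$last};
```

The range runs from line 7 to line 42 even though line 42 was visited first, and it
covers lines 8 through 40, which were never visited at all.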

Multiple instances of dotted rules
It does not happen in our main example for this document, but a dotted rule can appear in
the same Earley set more than once. In fact, this happens frequently. When it does
happen, the lines in the progress report will look like these

F11 x13 @0...40-41 L1c1-L2c40 <plain assignment> -> 'x' '=' expression .

F1 x20 @0...38-41 L1c1-L2c40 expression -> assignment .

F6 x12 @0...38-41 L1c1-L2c40 assignment -> <plain assignment> .

These are some of the progress report lines for an indirect right recursion, one that
recurses from a "<plain assignment>" symbol to an "<expression>" symbol, and then to an
"<assignment>" symbol, before completing the recursion by returning to a "<plain
assignment>".

In each of the three lines, notice that a new field appears second. This second field is
variously "x13", "x20" or "x12". These are counts, indicating the number of
instances of that dotted rule at the dotted rule's G1 dot location. Every dotted rule
instance will have the same G1 dot location, but the instances may have many different
origins -- hundreds or even more. In each of the three report lines above, the G1 dot
location is 41.

Note that when parsing, Marpa handles the long series of Earley items generated by right
recursions very efficiently. It uses a technique invented by Joop Leo to memoize and
eliminate them. When a progress report is requested at a G1 location, the Leo-memoization
is unfolded, and the full list of Earley items is reported.

Each instance may have its own span in the input string, and the input string range will
include them all. When there are many instances of a dotted rule at a single location,
the origins in the location field are shown as a range, with the earliest separated from
the most recent by a "...". For example, above, where the first four fields were "F6
x12 @0...38-41 L1c1-L2c40", that tells us that the dotted rule is rule 6, which has 12
instances. All 12 instances have their dot location at G1 location 41, but their origins
are in the range from G1 location 0 to G1 location 38.

The last of these four fields in "F6 x12 @0...38-41 L1c1-L2c40" is an input string range.
"L1c1-L2c40" says that input string positions visited by the 12 instances start at
line 1, column 1, and end at line 2, column 40. The reported input string range will be
the shortest range that includes all of the input string positions visited by any of the
dotted rule instances.

If there are only a few origins, Marpa may explicitly list them all. In the following
example, there are only 2 instances of this rule, both with a dot location of 41. Their
origins are at G1 locations 8 and 18. The range of input string positions is from line 1,
column 17 to line 2, column 40.

F2 x2 @8,18-41 L1c17-L2c40 assignment -> <divide assignment> .

Access to the "raw" progress report information

This section deals with the "progress()" recognizer method, which allows access to the raw
progress report information. This method is not needed for typical debugging and tracing
situations. It is intended for applications which want to leverage Marpa's "situational
awareness" in innovative ways.

progress()
my $report0 = $slr->progress(0);

my $latest_report = $slr->progress();

Given the G1 location (Earley set ID) as its argument, the "progress()" recognizer method
returns a reference to an array of "report items". The G1 location may be given as a
negative number. An argument of -X will be interpreted as G1 location N-X+1, where N is
the latest Earley set. This means that an argument of -1 indicates the latest Earley set,
an argument of -2 indicates the Earley set just before the latest one, etc.
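
The arithmetic for negative locations can be checked with a few lines of Perl. This is
a sketch of the convention as described here (-1 is the latest Earley set, -2 the one
before it), not Marpa's own code; "absolute_g1_location" is a hypothetical helper name.

```perl
use strict;
use warnings;

# Convert a possibly negative G1 location argument to an absolute one.
# $latest is the ID of the latest Earley set (N in the text).
sub absolute_g1_location {
    my ( $location, $latest ) = @_;
    return $location if $location >= 0;
    return $latest + $location + 1;    # an argument of -X gives N - X + 1
}

# With the latest Earley set at 5, as in the example parse:
printf "%3d => %d\n", $_, absolute_g1_location( $_, 5 ) for 0, 3, -1, -2, -6;
```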

Each report item is a triple: an array of three elements. The three elements are, in
order, rule ID, dot position, and origin. The data returned by the two displays above, as
well as the data for the other G1 locations in our example, are shown below.

The rule ID is the same number that Marpa uses to identify rules in tracing and debugging
output. Given a rule ID, an application can expand it into its LHS and RHS symbols using
the SLIF grammar's "rule_expand()" method. Given a symbol ID, its name and other
information can be found using other SLIF grammar methods.

Dot position is -1 for completions, and 0 for predictions. Where the report item is not
for a completion or a prediction, dot position is N, where N is the number of RHS symbols
successfully recognized at the G1 location of the progress report.

Origin is the G1 location (Earley set ID) at which the rule application reported by the
report item began. For a prediction, origin will always be the same as the G1 location of
the progress report.
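
To relate the raw triples to the printed reports, the sketch below turns a report item
into the summary and location fields used by "show_progress". The items are written by
hand to match report lines shown earlier for G1 location 5; in an application they
would come from "progress()". The function name "describe_report_item" is hypothetical.

```perl
use strict;
use warnings;

# Describe one raw report item [rule ID, dot position, origin]
# for the progress report at G1 location $current.
sub describe_report_item {
    my ( $item, $current ) = @_;
    my ( $rule_id, $dot_position, $origin ) = @{$item};
    my $summary =
          $dot_position < 0  ? "F${rule_id}"                    # completion
        : $dot_position == 0 ? "P${rule_id}"                    # prediction
        :                      "R${rule_id}:${dot_position}";   # in progress
    return sprintf '%s @%d-%d', $summary, $origin, $current;
}

# Hand-written items for G1 location 5 (an assumption for illustration).
my @items = ( [ 19, -1, 0 ], [ 11, 1, 2 ], [ 2, 0, 5 ] );
print describe_report_item( $_, 5 ), "\n" for @items;
```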

Progress reports and efficiency
When progress reports are used for production parsing, instead of just for debugging and
tracing, efficiency considerations become significant. Progress reports themselves are
implemented in optimized C, and that logic is very fast. However, the use of progress
reports usually implies considerable post-processing in Perl. It is almost always
possible to use Marpa's named events instead of progress reports, and solutions using
named events are usually better targeted, simpler and faster.

If you do decide to use progress reports in an application, you should be aware of the
efficiency considerations when there are right recursions in the grammar. For most
purposes, Marpa optimizes right recursions, so that they run in linear time. However, to
create a progress report every potential right recursion must be fully unfolded, and at
each G1 location the number of these grows linearly with the length of the recursion. If
you are creating progress reports for more than a limited number of G1 locations, this
means processing that can be quadratic in the length of the recursion. When a right
recursion is lengthy, the impact on speed can be very serious.
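
To see where the quadratic figure comes from, consider a deliberately simplified cost
model, which is an assumption made for illustration and not a measurement of Marpa:
suppose the report at G1 location k of a right recursion contains k unfolded
completions. Requesting a report at every location then costs 1 + 2 + ... + n items.

```perl
use strict;
use warnings;

# Simplified cost model, not a measurement: assume the progress report at
# G1 location $k of a right recursion contains $k unfolded completions.
sub total_report_items {
    my ($length) = @_;
    my $total = 0;
    $total += $_ for 1 .. $length;
    return $total;
}

printf "recursion length %5d: %9d report items over all locations\n",
    $_, total_report_items($_)
    for 10, 100, 1000;
```

Under this model a tenfold increase in the length of the recursion costs roughly a
hundredfold increase in report items.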

If lengthy right recursions are being expanded, this will be evident from the progress
report itself, which will contain one report item for every completion in the
right-recursive chain of completions. Note that the efficiency consideration just
mentioned for following
right recursions is never an issue for left recursions. Left recursions only produce at
most two report items per G1 location and are extremely fast to process. It is also not
an issue for Marpa's sequence rules, because sequence rules are implemented internally as
left recursions.

Appendix

       Below are the code, the trace outputs and the progress report for the example used in this
       document.

   Code
           use English qw( -no_match_vars );    # provides $EVAL_ERROR
           use Marpa::R2;

           my $slif_debug_source = <<'END_OF_SOURCE';
           :default ::= action => ::array bless => ::lhs
           :start ::= statements
           statements ::= statement *
           statement ::= assignment | <numeric assignment>
           assignment ::= 'set' variable 'to' expression

           # This is a deliberate error in the grammar
           # The next line should be:
           # <numeric assignment> ::= variable '=' <numeric expression>
           # I have changed the <numeric expression>  to <expression> which
           # will cause problems.
           <numeric assignment> ::= variable '=' expression

           expression ::=
                  variable | string
               || 'string' '(' <numeric expression> ')'
               || expression '+' expression
           <numeric expression> ::=
                  variable | number
               || <numeric expression> '+' <numeric expression>
               || <numeric expression> '*' <numeric expression>
           variable ~ [\w]+
           number ~ [\d]+
           string ~ ['] <string contents> [']
           <string contents> ~ [^'\x{0A}\x{0B}\x{0C}\x{0D}\x{0085}\x{2028}\x{2029}]+
           :discard ~ whitespace
           whitespace ~ [\s]+
           END_OF_SOURCE

           my $slg = Marpa::R2::Scanless::G->new(
               {
               bless_package => 'My_Nodes',
               source => \$slif_debug_source,
           });

           my $slr = Marpa::R2::Scanless::R->new(
               { grammar => $slg,
               trace_terminals => 1,
               trace_values => 1,
               } );

           my $test_input = 'a = 8675309 + 42 * 711' ;
           # Capture the parse error, if any, without a conditional "my" declaration
           my $eval_error;
           if ( not eval { $slr->read( \$test_input ); 1 } ) {
               $eval_error = $EVAL_ERROR;
           }

           my $progress_report = $slr->show_progress( 0, -1 );

   Error message
           Error in SLIF parse: No lexemes accepted at line 1, column 18
             Rejected lexeme #0: Lexer "L0"; '*'; value="*"; length = 1
           * String before error: a = 8675309 + 42\s
           * The error was at line 1, column 18, and at character 0x002a '*', ...
           * here: * 711

   Parse value at error location
       Note that when there is a parse error, there will not always be a parse value.  But
       sometimes the parse is "successful" enough, in a technical sense, to produce a value, and
       in those cases examining the value can be helpful in determining what the parser thinks it
       has seen so far.

           my $value_ref = $slr->value();
           my $expected_output = \bless( [
                            bless( [
                                     bless( [
                                              'a',
                                              '=',
                                              bless( [
                                                       bless( [
                                                                '8675309'
                                                              ], 'My_Nodes::expression' ),
                                                       '+',
                                                       bless( [
                                                                '42'
                                                              ], 'My_Nodes::expression' )
                                                     ], 'My_Nodes::expression' )
                                            ], 'My_Nodes::numeric_assignment' )
                                   ], 'My_Nodes::statement' )
                      ], 'My_Nodes::statements' );

   Trace output
           Setting trace_terminals option
           Setting trace_values option
           Lexer "L0" accepted lexeme L1c1: variable; value="a"
           Lexer "L0" discarded lexeme L1c2: whitespace
           Lexer "L0" accepted lexeme L1c3: '='; value="="
           Lexer "L0" discarded lexeme L1c4: whitespace
           Lexer "L0" rejected lexeme L1c5-11: number; value="8675309"
           Lexer "L0" accepted lexeme L1c5-11: variable; value="8675309"
           Lexer "L0" discarded lexeme L1c12: whitespace
           Lexer "L0" rejected lexeme L1c13: '+'; value="+"
           Lexer "L0" accepted lexeme L1c13: '+'; value="+"
           Lexer "L0" discarded lexeme L1c14: whitespace
           Lexer "L0" rejected lexeme L1c15-16: number; value="42"
           Lexer "L0" accepted lexeme L1c15-16: variable; value="42"
           Lexer "L0" discarded lexeme L1c17: whitespace
           Lexer "L0" rejected lexeme L1c18: '*'; value="*"

   show_progress() output
         P0 @0-0 L0c0 statements -> . statement *
         P1 @0-0 L0c0 statement -> . assignment
         P2 @0-0 L0c0 statement -> . <numeric assignment>
         P3 @0-0 L0c0 assignment -> . 'set' variable 'to' expression
         P4 @0-0 L0c0 <numeric assignment> -> . variable '=' expression
         P19 @0-0 L0c0 :start -> . statements
         R4:1 @0-1 L1c1 <numeric assignment> -> variable . '=' expression
         R4:2 @0-2 L1c1-3 <numeric assignment> -> variable '=' . expression
         P5 @2-2 L1c3 expression -> . expression
         P6 @2-2 L1c3 expression -> . expression
         P7 @2-2 L1c3 expression -> . expression
         P8 @2-2 L1c3 expression -> . variable
         P9 @2-2 L1c3 expression -> . string
         P10 @2-2 L1c3 expression -> . 'string' '(' <numeric expression> ')'
         P11 @2-2 L1c3 expression -> . expression '+' expression
         P0 @0-3 L1c1-11 statements -> . statement *
         F0 @0-3 L1c1-11 statements -> statement * .
         P1 @3-3 L1c5-11 statement -> . assignment
         P2 @3-3 L1c5-11 statement -> . <numeric assignment>
         F2 @0-3 L1c1-11 statement -> <numeric assignment> .
         P3 @3-3 L1c5-11 assignment -> . 'set' variable 'to' expression
         P4 @3-3 L1c5-11 <numeric assignment> -> . variable '=' expression
         F4 @0-3 L1c1-11 <numeric assignment> -> variable '=' expression .
         F5 @2-3 L1c3-11 expression -> expression .
         F6 @2-3 L1c3-11 expression -> expression .
         F7 @2-3 L1c3-11 expression -> expression .
         F8 @2-3 L1c3-11 expression -> variable .
         R11:1 @2-3 L1c3-11 expression -> expression . '+' expression
         F19 @0-3 L1c1-11 :start -> statements .
         P7 @4-4 L1c13 expression -> . expression
         P8 @4-4 L1c13 expression -> . variable
         P9 @4-4 L1c13 expression -> . string
         P10 @4-4 L1c13 expression -> . 'string' '(' <numeric expression> ')'
         R11:2 @2-4 L1c3-13 expression -> expression '+' . expression
         P0 @0-5 L1c1-16 statements -> . statement *
         F0 @0-5 L1c1-16 statements -> statement * .
         P1 @5-5 L1c15-16 statement -> . assignment
         P2 @5-5 L1c15-16 statement -> . <numeric assignment>
         F2 @0-5 L1c1-16 statement -> <numeric assignment> .
         P3 @5-5 L1c15-16 assignment -> . 'set' variable 'to' expression
         P4 @5-5 L1c15-16 <numeric assignment> -> . variable '=' expression
         F4 @0-5 L1c1-16 <numeric assignment> -> variable '=' expression .
         F5 @2-5 L1c3-16 expression -> expression .
         F7 @4-5 L1c13-16 expression -> expression .
         F8 @4-5 L1c13-16 expression -> variable .
         R11:1 @2-5 L1c3-16 expression -> expression . '+' expression
         F11 @2-5 L1c3-16 expression -> expression '+' expression .
         F19 @0-5 L1c1-16 :start -> statements .

   show_rules() output
       This is the "show_rules()" output, for both the G1 and the L0 (lexer) rules, at verbosity
       level 3.  In ordinary work, you'd use verbosity level 1 (the default), but the more
       verbose output is included here to illustrate the example.

         G1 Rules:
         G1 R0 statements ::= statement *
           Symbol IDs: <16> ::= <17>
           Internal symbols: <statements> ::= <statement>
         G1 R1 statement ::= assignment
           Symbol IDs: <17> ::= <18>
           Internal symbols: <statement> ::= <assignment>
         G1 R2 statement ::= <numeric assignment>
           Symbol IDs: <17> ::= <19>
           Internal symbols: <statement> ::= <numeric assignment>
         G1 R3 assignment ::= 'set' variable 'to' expression
           Symbol IDs: <18> ::= <1> <20> <2> <21>
           Internal symbols: <assignment> ::= <[Lex-0]> <variable> <[Lex-1]> <expression>
         G1 R4 <numeric assignment> ::= variable '=' <numeric expression>
           Symbol IDs: <19> ::= <20> <3> <22>
           Internal symbols: <numeric assignment> ::= <variable> <[Lex-2]> <numeric expression>
         G1 R5 expression ::= expression
           Internal rule top priority rule for <expression>
           Symbol IDs: <21> ::= <10>
           Internal symbols: <expression> ::= <expression[0]>
         G1 R6 expression ::= expression
           Internal rule for symbol <expression> priority transition from 0 to 1
           Symbol IDs: <10> ::= <11>
           Internal symbols: <expression[0]> ::= <expression[1]>
         G1 R7 expression ::= expression
           Internal rule for symbol <expression> priority transition from 1 to 2
           Symbol IDs: <11> ::= <12>
           Internal symbols: <expression[1]> ::= <expression[2]>
         G1 R8 expression ::= variable
           Symbol IDs: <12> ::= <20>
           Internal symbols: <expression[2]> ::= <variable>
         G1 R9 expression ::= string
           Symbol IDs: <12> ::= <23>
           Internal symbols: <expression[2]> ::= <string>
         G1 R10 expression ::= 'string' '(' <numeric expression> ')'
           Symbol IDs: <11> ::= <4> <5> <22> <6>
           Internal symbols: <expression[1]> ::= <[Lex-3]> <[Lex-4]> <numeric expression> <[Lex-5]>
         G1 R11 expression ::= expression '+' expression
           Symbol IDs: <10> ::= <10> <7> <11>
           Internal symbols: <expression[0]> ::= <expression[0]> <[Lex-6]> <expression[1]>
         G1 R12 <numeric expression> ::= <numeric expression>
           Internal rule top priority rule for <numeric expression>
           Symbol IDs: <22> ::= <13>
           Internal symbols: <numeric expression> ::= <numeric expression[0]>
         G1 R13 <numeric expression> ::= <numeric expression>
           Internal rule for symbol <numeric expression> priority transition from 0 to 1
           Symbol IDs: <13> ::= <14>
           Internal symbols: <numeric expression[0]> ::= <numeric expression[1]>
         G1 R14 <numeric expression> ::= <numeric expression>
           Internal rule for symbol <numeric expression> priority transition from 1 to 2
           Symbol IDs: <14> ::= <15>
           Internal symbols: <numeric expression[1]> ::= <numeric expression[2]>
         G1 R15 <numeric expression> ::= variable
           Symbol IDs: <15> ::= <20>
           Internal symbols: <numeric expression[2]> ::= <variable>
         G1 R16 <numeric expression> ::= number
           Symbol IDs: <15> ::= <24>
           Internal symbols: <numeric expression[2]> ::= <number>
         G1 R17 <numeric expression> ::= <numeric expression> '+' <numeric expression>
           Symbol IDs: <14> ::= <14> <8> <15>
           Internal symbols: <numeric expression[1]> ::= <numeric expression[1]> <[Lex-7]> <numeric expression[2]>
         G1 R18 <numeric expression> ::= <numeric expression> '*' <numeric expression>
           Symbol IDs: <13> ::= <13> <9> <14>
           Internal symbols: <numeric expression[0]> ::= <numeric expression[0]> <[Lex-8]> <numeric expression[1]>
         G1 R19 :start ::= statements
           Symbol IDs: <0> ::= <16>
           Internal symbols: <[:start]> ::= <statements>
         Lex (L0) Rules:
         L0 R0 'set' ::= [s] [e] [t]
           Internal rule for single-quoted string 'set'
           Symbol IDs: <2> ::= <27> <21> <28>
           Internal symbols: <[Lex-0]> ::= <[[s]]> <[[e]]> <[[t]]>
         L0 R1 'to' ::= [t] [o]
           Internal rule for single-quoted string 'to'
           Symbol IDs: <3> ::= <28> <25>
           Internal symbols: <[Lex-1]> ::= <[[t]]> <[[o]]>
         L0 R2 '=' ::= [\=]
           Internal rule for single-quoted string '='
           Symbol IDs: <4> ::= <16>
           Internal symbols: <[Lex-2]> ::= <[[\=]]>
         L0 R3 'string' ::= [s] [t] [r] [i] [n] [g]
           Internal rule for single-quoted string 'string'
           Symbol IDs: <5> ::= <27> <28> <26> <23> <24> <22>
           Internal symbols: <[Lex-3]> ::= <[[s]]> <[[t]]> <[[r]]> <[[i]]> <[[n]]> <[[g]]>
         L0 R4 '(' ::= [\(]
           Internal rule for single-quoted string '('
           Symbol IDs: <6> ::= <12>
           Internal symbols: <[Lex-4]> ::= <[[\(]]>
         L0 R5 ')' ::= [\)]
           Internal rule for single-quoted string ')'
           Symbol IDs: <7> ::= <13>
           Internal symbols: <[Lex-5]> ::= <[[\)]]>
         L0 R6 '+' ::= [\+]
           Internal rule for single-quoted string '+'
           Symbol IDs: <8> ::= <15>
           Internal symbols: <[Lex-6]> ::= <[[\+]]>
         L0 R7 '+' ::= [\+]
           Internal rule for single-quoted string '+'
           Symbol IDs: <9> ::= <15>
           Internal symbols: <[Lex-7]> ::= <[[\+]]>
         L0 R8 '*' ::= [\*]
           Internal rule for single-quoted string '*'
           Symbol IDs: <10> ::= <14>
           Internal symbols: <[Lex-8]> ::= <[[\*]]>
         L0 R9 variable ::= [\w] +
           Symbol IDs: <29> ::= <19>
           Internal symbols: <variable> ::= <[[\w]]>
         L0 R10 number ::= [\d] +
           Symbol IDs: <30> ::= <17>
           Internal symbols: <number> ::= <[[\d]]>
         L0 R11 string ::= ['] <string contents> [']
           Symbol IDs: <31> ::= <11> <32> <11>
           Internal symbols: <string> ::= <[[']]> <string contents> <[[']]>
         L0 R12 <string contents> ::= [^'\x{0A}\x{0B}\x{0C}\x{0D}\x{0085}\x{2028}\x{2029}] +
           Symbol IDs: <32> ::= <20>
           Internal symbols: <string contents> ::= <[[^'\x{0A}\x{0B}\x{0C}\x{0D}\x{0085}\x{2028}\x{2029}]]>
         L0 R13 :discard ::= whitespace
           Discard rule for <whitespace>
           Symbol IDs: <0> ::= <33>
           Internal symbols: <[:discard]> ::= <whitespace>
         L0 R14 whitespace ::= [\s] +
           Symbol IDs: <33> ::= <18>
           Internal symbols: <whitespace> ::= <[[\s]]>
         L0 R15 :start_lex ::= :discard
           Internal lexical start rule for <[:discard]>
           Symbol IDs: <1> ::= <0>
           Internal symbols: <[:start_lex]> ::= <[:discard]>
         L0 R16 :start_lex ::= 'set'
           Internal lexical start rule for <[Lex-0]>
           Symbol IDs: <1> ::= <2>
           Internal symbols: <[:start_lex]> ::= <[Lex-0]>
         L0 R17 :start_lex ::= 'to'
           Internal lexical start rule for <[Lex-1]>
           Symbol IDs: <1> ::= <3>
           Internal symbols: <[:start_lex]> ::= <[Lex-1]>
         L0 R18 :start_lex ::= '='
           Internal lexical start rule for <[Lex-2]>
           Symbol IDs: <1> ::= <4>
           Internal symbols: <[:start_lex]> ::= <[Lex-2]>
         L0 R19 :start_lex ::= 'string'
           Internal lexical start rule for <[Lex-3]>
           Symbol IDs: <1> ::= <5>
           Internal symbols: <[:start_lex]> ::= <[Lex-3]>
         L0 R20 :start_lex ::= '('
           Internal lexical start rule for <[Lex-4]>
           Symbol IDs: <1> ::= <6>
           Internal symbols: <[:start_lex]> ::= <[Lex-4]>
         L0 R21 :start_lex ::= ')'
           Internal lexical start rule for <[Lex-5]>
           Symbol IDs: <1> ::= <7>
           Internal symbols: <[:start_lex]> ::= <[Lex-5]>
         L0 R22 :start_lex ::= '+'
           Internal lexical start rule for <[Lex-6]>
           Symbol IDs: <1> ::= <8>
           Internal symbols: <[:start_lex]> ::= <[Lex-6]>
         L0 R23 :start_lex ::= '+'
           Internal lexical start rule for <[Lex-7]>
           Symbol IDs: <1> ::= <9>
           Internal symbols: <[:start_lex]> ::= <[Lex-7]>
         L0 R24 :start_lex ::= '*'
           Internal lexical start rule for <[Lex-8]>
           Symbol IDs: <1> ::= <10>
           Internal symbols: <[:start_lex]> ::= <[Lex-8]>
         L0 R25 :start_lex ::= number
           Internal lexical start rule for <number>
           Symbol IDs: <1> ::= <30>
           Internal symbols: <[:start_lex]> ::= <number>
         L0 R26 :start_lex ::= string
           Internal lexical start rule for <string>
           Symbol IDs: <1> ::= <31>
           Internal symbols: <[:start_lex]> ::= <string>
         L0 R27 :start_lex ::= variable
           Internal lexical start rule for <variable>
           Symbol IDs: <1> ::= <29>
           Internal symbols: <[:start_lex]> ::= <variable>

   show_symbols() output
           G1 Symbols:
           G1 S0 :start -- Internal G1 start symbol
             Internal name: <[:start]>
           G1 S1 'set' -- Internal lexical symbol for "'set'"
             /* terminal */
             Internal name: <[Lex-0]>
             SLIF name: 'set'
           G1 S2 'to' -- Internal lexical symbol for "'to'"
             /* terminal */
             Internal name: <[Lex-1]>
             SLIF name: 'to'
           G1 S3 '=' -- Internal lexical symbol for "'='"
             /* terminal */
             Internal name: <[Lex-2]>
             SLIF name: '='
           G1 S4 'string' -- Internal lexical symbol for "'string'"
             /* terminal */
             Internal name: <[Lex-3]>
             SLIF name: 'string'
           G1 S5 '(' -- Internal lexical symbol for "'('"
             /* terminal */
             Internal name: <[Lex-4]>
             SLIF name: '('
           G1 S6 ')' -- Internal lexical symbol for "')'"
             /* terminal */
             Internal name: <[Lex-5]>
             SLIF name: ')'
           G1 S7 '+' -- Internal lexical symbol for "'+'"
             /* terminal */
             Internal name: <[Lex-6]>
             SLIF name: '+'
           G1 S8 '+' -- Internal lexical symbol for "'+'"
             /* terminal */
             Internal name: <[Lex-7]>
             SLIF name: '+'
           G1 S9 '*' -- Internal lexical symbol for "'*'"
             /* terminal */
             Internal name: <[Lex-8]>
             SLIF name: '*'
           G1 S10 expression -- <expression> at priority 0
             Internal name: <expression[0]>
             SLIF name: expression
           G1 S11 expression -- <expression> at priority 1
             Internal name: <expression[1]>
             SLIF name: expression
           G1 S12 expression -- <expression> at priority 2
             Internal name: <expression[2]>
             SLIF name: expression
           G1 S13 <numeric expression> -- <numeric expression> at priority 0
             Internal name: <numeric expression[0]>
             SLIF name: numeric expression
           G1 S14 <numeric expression> -- <numeric expression> at priority 1
             Internal name: <numeric expression[1]>
             SLIF name: numeric expression
           G1 S15 <numeric expression> -- <numeric expression> at priority 2
             Internal name: <numeric expression[2]>
             SLIF name: numeric expression
           G1 S16 statements
             Internal name: <statements>
           G1 S17 statement
             Internal name: <statement>
           G1 S18 assignment
             Internal name: <assignment>
           G1 S19 <numeric assignment>
             Internal name: <numeric assignment>
           G1 S20 variable
             /* terminal */
             Internal name: <variable>
           G1 S21 expression
             Internal name: <expression>
           G1 S22 <numeric expression>
             Internal name: <numeric expression>
           G1 S23 string
             /* terminal */
             Internal name: <string>
           G1 S24 number
             /* terminal */
             Internal name: <number>

   progress() outputs
       This section contains samples of the output of the "progress()" method -- the progress
       reports in their "raw" format.  The output is shown in Data::Dumper format, with
       "Data::Dumper::Indent" set to 0 and "Data::Dumper::Terse" set to 1.

       The "Data::Dumper" output from "progress()" at G1 location 0:

           [[0,0,0],[1,0,0],[2,0,0],[3,0,0],[4,0,0],[19,0,0]]
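
       The one-line format of the sample above can be reproduced with core Perl alone.  The
       following is a minimal sketch, not code from the Marpa distribution: the $report
       variable is a hand-written stand-in for the return value of "progress()", and
       "dump_report" is a hypothetical helper name.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical stand-in for what progress() returns: a reference to an
# array of triples, copied here from the G1 location 1 sample.
my $report = [ [ 4, 1, 0 ] ];

# Format a report in the style used for the samples in this section.
sub dump_report {
    my ($items) = @_;
    local $Data::Dumper::Indent = 0;    # one line, no added whitespace
    local $Data::Dumper::Terse  = 1;    # no '$VAR1 = ' prefix
    return Dumper($items);
}

print dump_report($report), "\n";
```

       Using "local" keeps the two settings from leaking into other uses of "Data::Dumper" in
       the same program.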

       The "Data::Dumper" output from "progress()" at G1 location 1:

           [[4,1,0]]

       The "Data::Dumper" output from "progress()" at G1 location 2:

           [[5,0,2],[6,0,2],[7,0,2],[8,0,2],[9,0,2],[10,0,2],[11,0,2],[4,2,0]]

       By default, "progress()" reports on the latest Earley set.  Here is that default output:

           [[0,-1,0],[2,-1,0],[4,-1,0],[5,-1,2],[7,-1,4],[8,-1,4],[11,-1,2],[19,-1,0],[0,0,0],[1,0,5],[2,0,5],[3,0,5],[4,0,5],[11,1,2]]
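
       Each raw triple can be matched against a line of the "show_progress()" listing earlier
       in this section.  The sketch below assumes the element order is rule ID, dot position,
       origin, with a dot position of -1 marking a completed rule, which is what the two
       listings suggest when compared.  The "describe_item" helper is hypothetical, and the
       location 5 is read off the "@0-5" entries of the "show_progress()" output rather than
       queried from a recognizer.

```perl
use strict;
use warnings;

# Turn one raw triple into a label in the style of show_progress(),
# for example "F0 @0-5", "P1 @5-5" or "R11:1 @2-5".
# $location is the G1 location at which the report was taken.
sub describe_item {
    my ( $item, $location ) = @_;
    my ( $rule_id, $dot, $origin ) = @{$item};
    my $type =
          $dot < 0  ? sprintf( 'F%d', $rule_id )             # completed rule
        : $dot == 0 ? sprintf( 'P%d', $rule_id )             # predicted rule
        :             sprintf( 'R%d:%d', $rule_id, $dot );   # rule in progress
    return sprintf '%s @%d-%d', $type, $origin, $location;
}

# The triples from the sample for the latest Earley set.
my @latest = (
    [ 0,  -1, 0 ], [ 2, -1, 0 ], [ 4,  -1, 0 ], [ 5,  -1, 2 ],
    [ 7,  -1, 4 ], [ 8, -1, 4 ], [ 11, -1, 2 ], [ 19, -1, 0 ],
    [ 0,  0,  0 ], [ 1, 0,  5 ], [ 2,  0,  5 ], [ 3,  0,  5 ],
    [ 4,  0,  5 ], [ 11, 1, 2 ],
);

print describe_item( $_, 5 ), "\n" for @latest;
```

       The labels omit the line and column ranges and the dotted rule text, which the raw
       triples do not carry.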

Copyright and License

         Copyright 2014 Jeffrey Kegler
         This file is part of Marpa::R2.  Marpa::R2 is free software: you can
         redistribute it and/or modify it under the terms of the GNU Lesser
         General Public License as published by the Free Software Foundation,
         either version 3 of the License, or (at your option) any later version.

         Marpa::R2 is distributed in the hope that it will be useful,
         but WITHOUT ANY WARRANTY; without even the implied warranty of
         MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
         Lesser General Public License for more details.

         You should have received a copy of the GNU Lesser
         General Public License along with Marpa::R2.  If not, see
         http://www.gnu.org/licenses/.