Provided by: libmarpa-r2-perl_2.086000~dfsg-8build5_amd64

Name

       Marpa::R2::Scanless::R - Scanless interface recognizers

Synopsis

           my $recce = Marpa::R2::Scanless::R->new( { grammar => $grammar } );
           my $self = bless { grammar => $grammar }, 'My_Actions';
           $self->{recce} = $recce;

           if ( not defined eval { $recce->read($p_input_string); 1 }
               )
           {
               ## Add last expression found, and rethrow
               my $eval_error = $EVAL_ERROR;
               chomp $eval_error;
               die $self->show_last_expression(), "\n", $eval_error, "\n";
           } ## end if ( not defined eval { $event_count = $recce->read...})

           my $value_ref = $recce->value( $self );
           if ( not defined $value_ref ) {
               die $self->show_last_expression(), "\n",
                   "No parse was found, after reading the entire input\n";
           }

           package My_Actions;
           sub do_parens    { shift; return $_[1] }
           sub do_add       { shift; return $_[0] + $_[2] }
           sub do_subtract  { shift; return $_[0] - $_[2] }
           sub do_multiply  { shift; return $_[0] * $_[2] }
           sub do_divide    { shift; return $_[0] / $_[2] }
           sub do_pow       { shift; return $_[0]**$_[2] }
           sub do_first_arg { shift; return shift; }
           sub do_script    { shift; return join q{ }, @_ }

About this document

       This page is the reference document for the recognizer objects of Marpa's SLIF (Scanless
       interface).

Internal and external scanning

       The Scanless interface is so-called because it does not require the application to supply
       a scanner (lexer).  The SLIF contains its own lexer, one whose use is integrated into its
       syntax.  In this document, use of the SLIF's internal scanner is called internal scanning.

       The SLIF allows applications that find it useful to do their own scanning.  When an
       application bypasses the SLIF's internal scanner and does its own scanning, this document
       calls it external scanning.  An application can use external scanning to supplement
       internal scanning, or to replace the SLIF's internal scanner entirely.

Locations

   Input stream locations
       An input stream location is the offset of a codepoint in the input stream.  When the input
       stream is being treated as a string, input stream location corresponds to Perl's pos()
       location.  In this document, the word "location" refers to location in the input stream
       unless otherwise specified.

   Negative locations
       Several methods allow locations and lengths to be specified as negative numbers.  A
       negative location is a location counted from the end, so that -1 means the location before
       the last character of the string, -2 the location before the second to last character of
       the string, etc.  A negative length indicates a distance to a location counted from the
       end.  A length of -1 indicates the distance to the end of the string, -2 indicates the
       distance to the location just before the last character of the string, etc.

       For example, suppose that we are dealing with input stream locations.  The span ("0, -1")
       is the entire input stream.  The span ("-1, -1") is the last character of the input
       stream.  The span ("-2, -1") is the last two characters of the input stream.  The span
       ("-2, 1") is the second to last character of the input stream.
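
       The arithmetic behind these examples can be sketched in plain Perl.  The resolve_span()
       helper below is hypothetical and is not part of Marpa::R2; it only illustrates how a
       possibly negative start and length could be turned into absolute values, following the
       rules stated above.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Marpa::R2: converts a possibly
# negative (start, length) pair into absolute, non-negative values.
sub resolve_span {
    my ( $input_length, $start, $length ) = @_;

    # A negative location counts back from the end of the input.
    $start += $input_length if $start < 0;

    # A negative length names an end location counted from the end:
    # -1 is the end of the input, -2 is just before the last character.
    my $end = $length < 0 ? $input_length + $length + 1 : $start + $length;

    return ( $start, $end - $start );
}

# Show which part of a sample string each span from the text selects.
my $input = 'abcdef';
for my $span ( [ 0, -1 ], [ -1, -1 ], [ -2, -1 ], [ -2, 1 ] ) {
    my ( $start, $length ) = resolve_span( length($input), @{$span} );
    printf "(%d, %d) selects '%s'\n", @{$span},
        substr( $input, $start, $length );
}
```

       With the six-character sample string, the four spans select the whole string, its last
       character, its last two characters, and its second to last character, respectively.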

   G1 locations
       In addition to input stream location, the SLIF also tracks G1 location.  G1 location
       starts at zero, and increases by exactly one as each lexeme is read.  G1 location is
       usually not the same as input stream location.  There is also a concept of G1 length,
       which is simply length calculated in terms of G1 locations.

       G1 location can be ignored most of the time, but it does become relevant to a small degree
       when dealing with ambiguous terminals, and to a greater degree when tracing the G1
       grammar.  (For those more familiar with Marpa's internals, the G1 location is the G1
       Earley set index.)

   Current location
       The SLIF tracks the current location in the input stream, more usually simply called the
       current location.  Locations are zero-based, so that location 0 is the start of the input
       stream.  A location is said to point to the character after it, if there is such a
       character.  For example, location 0 points to the first character of the input stream,
       unless the stream is of zero length, in which case there is no first character.

       A current location equal to the length of the input stream indicates EOS (end of stream).
       In a zero length stream, location 0 is EOS.  The EOS location never points to a character.

       In the SLIF, when the current input stream location moves, it does not necessarily advance
       -- it can skip forward, or can be positioned to an earlier location.  The application can
       skip sections of the input stream.  The application is also free to revisit spans of the
       input stream as often as it wants.

       Here are the guarantees:

       •   Initially, the current location is 0.

       •   The current location will never be negative.

       •   The current location will never be greater than EOS.

   Literals and G1 spans
       Often it is useful to find the literal substring of the input which corresponds to a span
       of G1 locations.  If an application reads the input monotonically within the G1 span, this
       presents no complications.

       "Monotonically" here means that, for the G1 span "$g1_start, $g1_length", the application
       reads the G1 locations in sequence and one-by-one, starting at $g1_start and ending at
       "$g1_start+$g1_length".

       Reading the input monotonically is the default, and by far the most common case.  But
       Marpa applications are free to skip forward in the stream, to jump backward, to reread the
       same input multiple times, and so on.  It is entirely possible that the final input stream
       location of a G1 span will be before the start of the G1 span.

       In precise terms, the substring returned for a G1 span "$g1_start, $g1_length" is
       determined as follows: The string will start at the first input stream location in the
       span for G1 location "$g1_start+1".  The end of the string will be at the last input
       stream location in the span for G1 location "$g1_start+$g1_length".  When an application
       moves backward in the input, the end of the string, as calculated above, may be before the
       start of the string.  When the end of a string is before its start, the substring returned
       will be the zero-length string.
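
       The rule above can be illustrated with a small, self-contained sketch.  It does not call
       Marpa; the @span_of_g1 table and the literal_for_g1_span() helper are hypothetical
       stand-ins for the input stream spans that a recognizer would associate with each G1
       location, and the sketch assumes a G1 length of at least one.

```perl
use strict;
use warnings;

my $input = 'abcdefgh';

# Hypothetical data: the input stream span (start, length) for each G1
# location.  G1 location 0 is always (0, 0).  G1 location 3 was read
# after the application jumped backward in the input.
my @span_of_g1 = ( [ 0, 0 ], [ 0, 3 ], [ 3, 2 ], [ 0, 2 ] );

# Sketch of the documented rule; assumes $g1_length >= 1.
sub literal_for_g1_span {
    my ( $g1_start, $g1_length ) = @_;

    # Start: first input location of the span for G1 location $g1_start+1.
    my ($first) = @{ $span_of_g1[ $g1_start + 1 ] };

    # End: last input location of the span for $g1_start+$g1_length.
    my ( $last_start, $last_length ) =
        @{ $span_of_g1[ $g1_start + $g1_length ] };
    my $end = $last_start + $last_length;

    # An end before the start yields the zero-length string.
    return q{} if $end < $first;
    return substr( $input, $first, $end - $first );
}

printf "G1 span (0, 2): '%s'\n", literal_for_g1_span( 0, 2 );
printf "G1 span (1, 2): '%s'\n", literal_for_g1_span( 1, 2 );
```

       In this made-up data the first G1 span was read monotonically and yields the text it
       covers, while the second ends before it starts and so yields the empty string.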

       Applications which do not read monotonically, but which also want to associate spans of G1
       locations with the input stream, may need to reassemble the input based on their own
       ideas.  The "literal()" method can assist in this process.

How internal scanning works

       The SLIF always starts scanning using the read() method.  Pedantically, this means
       scanning always begins with a phase of internal scanning.  But that first phase may be of
       zero length, and after that, internal scanning does not have to be resumed.

       Internal scanning can be resumed with the resume() method.  Both the read() and resume()
       methods require the application to specify a span in the input stream.  The read() method
       sets the input stream, and that input stream is the one used by all resume() method calls
       for that recognizer.

       In what follows, the term "internal scanning method" refers to either the read() or the
       resume() method.  After an internal scanning method, the current location will indicate
       how far in the input stream the internal scanning method actually read.  If the internal
       scanning method paused before EOS, the current location will be the one at which it
       paused.  If the internal scanning method pauses at EOS, the current location will be EOS.
       The return value of the read() and the resume() method is the current location.

   EOS
       The location of EOS depends on the $start and $length arguments to the last internal
       scanning method, and on the length of the input string.

       •   If the $length argument of the last internal scanning method was non-negative, EOS
           will be at "$start+$length".

       •   If the $length argument was negative, EOS will be at "$length + 1 + length
           $input_string".

       •   The default length for the internal scanning methods is always -1, so that the default
           EOS is always at "length $input_string", the end of the input string.

   Pauses in internal scanning
       When the read() or the resume() method pauses, one or more of the following has occurred.

       •   A named event

           One or more named events may have triggered.  Named events are created by named event
           statements.  They can also be created by lexeme pseudo-rules.  Named events may be
           queried using the events() method.

       •   An unnamed lexeme pause event

           A lexeme pause that is not a named event may have triggered.  Lexeme pauses are
           created by lexeme pseudo-rules.  Applications can always name lexeme pause events,
           using the event adverb, and are strongly encouraged to do so.  If all lexeme pauses
           are named, the check for unnamed events can be omitted.  The presence or absence of an
           unnamed lexeme pause event may be checked for using the lexeme_pause() method.

       •   EOS

           EOS may have been reached.  This may be checked for by comparing the current location
           with the expected EOS.
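
       The EOS check can be sketched as follows.  The expected_eos() and at_eos() helpers are
       hypothetical, not Marpa::R2 methods; they restate the EOS rules given above and assume
       that the start location has already been resolved to a non-negative value.  In a real
       application the current location would be the value returned by read() or resume().

```perl
use strict;
use warnings;

# Hypothetical helper: the expected EOS for the $start and $length
# passed to the last internal scanning method.  Assumes $start is
# already a non-negative location.
sub expected_eos {
    my ( $input_length, $start, $length ) = @_;
    $length = -1 if not defined $length;    # documented default length
    return $length >= 0
        ? $start + $length
        : $length + 1 + $input_length;
}

# Hypothetical helper: true if scanning stopped at the expected EOS.
sub at_eos {
    my ( $current_location, $eos ) = @_;
    return $current_location == $eos;
}

my $input_length = 20;
my $eos = expected_eos( $input_length, 0 );    # default length of -1

# 12 and 20 stand in for values returned by read() or resume().
for my $current_location ( 12, 20 ) {
    printf "location %d: %s\n", $current_location,
        at_eos( $current_location, $eos )
        ? 'reached EOS'
        : 'paused before EOS';
}
```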

The input stream

       For error messages and other purposes, even external lexemes are required to correspond to
       a span of the input stream.  An external scanner must set up a relationship to the input
       stream, even if that relationship is completely artificial.

       One way to do this is to put an artificial preamble in front of the input stream.  For
       example, the first 7 characters of the input stream could be a preamble containing the
       characters "NO TEXT".  This preamble could be immediately followed by what is seen as
       the text from a more natural point of view.  In this case, the initial call to the read()
       method could take the form "$slr->read($input_string, 7)".  Lexemes corresponding to the
       artificial preamble would be read using a method call similar to
       "$slr->lexeme_read($symbol_name, 0, 7, $value)".

Constructor

           my $recce = Marpa::R2::Scanless::R->new( { grammar => $grammar } );

       The new() method is the constructor for SLIF recognizers.  The new() constructor accepts a
       hash of named arguments.  The "grammar" named argument is required.  All other named
       arguments are optional.

       The following named arguments are allowed:

   end
       Most users will want to ignore this argument.  It is an advanced argument, mainly for use
       in testing.  The "end" named argument specifies the parse end, as a G1 location.  The
       default is for the parse to end where the input did, so that the parse returned is of the
       entire input.  The "end" named argument is not allowed once a parse series has begun.

   grammar
       The "new" method is required to have a "grammar" named argument.  Its value must be an
       SLIF grammar object.

   max_parses
       If non-zero, causes a fatal error when that number of parse results is exceeded.
       "max_parses" is useful to limit CPU usage and output length when testing and debugging.
       Stable and production applications may prefer to count the number of parses, and take a
       less Draconian response when the count is exceeded.

       The value must be an integer.  If it is zero, there will be no limit on the number of
       parse results returned.  The default is for there to be no limit.

   ranking_method
       The value must be a string: one of "none", "rule", or "high_rule_only".  When the value
       is "none", Marpa returns the parse results in arbitrary order.  This is the default.  The
       "ranking_method" named argument is not allowed once evaluation has begun.

       The "rule" and "high_rule_only" ranking methods allow the user to control the order in
       which parse results are returned by the "value" method, and to exclude some parse results
       from the parse series.  For details, see the document on parse order.

   semantics_package
       Sets the semantics package for the recognizer.  The setting of this argument takes
       precedence over any package implied by the blessing of the per-parse arguments to the SLIF
       recognizer's value() method.  The semantics package is used when resolving action names to
       fully qualified Perl names.  For more details on the SLIF semantics, see the document on
       SLIF semantics.

   too_many_earley_items
       The "too_many_earley_items" argument is optional, and very few applications will need it.
       If specified, it sets the Earley item warning threshold to a value other than its default.
       If an Earley set becomes larger than the Earley item warning threshold, a recognizer event
       is generated, and a warning is printed to the trace file handle.

       Marpa parses from any BNF, and can handle grammars and inputs which produce very large
       Earley sets.  But parsing that involves very large Earley sets can be slow.  Large Earley
       sets are something most applications can, and will wish to, avoid.

       By default, Marpa calculates an Earley item warning threshold for the G1 recognizer based
       on the size of the G1 grammar, and for each L0 recognizer based on the size of the L0
       grammar.  The default thresholds will never be less than 100.  The default is the result
       of considerable experience and almost all users will be happy with it.

       If the Earley item warning threshold is changed from its default, the change applies to
       both L0 and G1 -- currently there is no way to set them separately.  If the Earley item
       warning threshold is set to 0, no recognizer event is generated, and warnings about large
       Earley sets are turned off.  An Earley item threshold warning almost always indicates a
       serious issue, and turning these warnings off will rarely be what an application wants.

   trace_terminals
       If non-zero, traces the lexemes -- those tokens passed from the L0 parser to the G1
       parser.  This named argument is the best way to follow what the L0 parser is doing, and it
       is also very helpful for tracing the G1 parser.

   trace_values
       The trace_values named argument is a numeric trace level.  If the numeric trace level is
       1, Marpa prints tracing information as values are computed in the evaluation stack.  A
       trace level of 0 turns value tracing off, which is the default. Traces are written to the
       trace file handle.

   trace_file_handle
       The value is a file handle.  Trace output and warning messages go to the trace file
       handle.  By default, the trace file handle is inherited from the grammar.

Basic mutators

   read()
           $recce->read($p_input_string);

           $recce->read( \$string, 0, 0 );

       Given a pointer to an input stream, read() parses it according to the grammar.  Only a
       single call to read() is allowed for a scanless recognizer.

       read() recognizes optional second and third arguments.  The second argument is a location
       in the input stream at which internal scanning will start.  The third argument is the
       length of the section of the input stream to be scanned before pausing.  The default start
       location is zero.  The default length is -1.  Negative locations and lengths have the
       standard interpretation, as described above.

       Start location and length can both be zero.  This pauses internal scanning immediately and
       can be used to hand complete control of scanning over to an external scanner.

       Completion named events can occur during the read() method.  When a named event occurs,
       the read() method pauses.  Named events can be queried using the Scanless recognizer's
       events() method.  The read() method also pauses as specified with the Scanless DSL's pause
       adverb.

       On failure, throws an exception.  The call is considered successful if it ended because a
       parse was found, or because internal scanning was paused.  On success, read() returns the
       location in the input stream at which internal scanning ended.  This value may be zero.

   series_restart()
           $slr->series_restart( { end => $i } );

       The series_restart() method ends the current parse series, and starts another.  It allows,
       as optional arguments, hashes of named arguments for the SLIF recognizer.  These named
       arguments can be any of those allowed by the set() method.

       series_restart() resets all the named arguments to their defaults.  An application that
       wants a non-default named argument to have effect in each of its parse series must
       respecify it at the beginning of each parse series.  series_restart() is particularly
       useful for the "end" and "semantics_package" named arguments, which cannot be changed once
       a parse series is underway.  To change their values, an application must start a new parse
       series.

   set()
           $slr->set( { max_parses => 42 } );

       This method allows the named arguments to be changed after an SLIF recognizer is created.
       Currently, the arguments that may be changed are "end", "max_parses", "semantics_package"
       and "trace_file_handle".

   value()
           my $value_ref = $recce->value( $self );

       The "value" method call evaluates the next parse tree in the parse series, and returns a
       reference to the parse result for that parse tree.  If there are no more parse trees, the
       "value" method returns "undef".

       Because Marpa parses ambiguous grammars, every parse is a series of zero or more parse
       trees.  This series of zero or more parse trees is called a parse series.  There are zero
       parse trees if there was no valid parse of the input according to the grammar.

       The value() method allows one, optional argument.  This argument can be a Perl scalar of
       any kind, but the most useful possibilities are references (blessed or unblessed) to
       hashes or arrays.  If provided, the argument of the value() method explicitly specifies the
       per-parse argument for the parse tree.  The per-parse argument will be the first argument
       of all Perl semantics closures, and can be used to share data within the tree, when that
       data does not conveniently fit into the bottom-up flow of parse tree evaluation.  Symbol
       tables are one example of the kind of data which parses often require, but which it is not
       convenient to accumulate bottom-up.

       If the "semantics_package" named argument of the SLIF recognizer was not specified, Marpa
       will use the package into which the per-parse argument was blessed as the semantics
       package -- the package in which to look for the parse's Perl semantic closures.  In this
       case, Marpa will regard the per-parse arguments of all calls in the same parse series as
       the source of the semantics package, and it will require that the calls be consistent --
       each call must have a per-parse argument, and that per-parse argument must be blessed
       into the semantics package.

Mutators for external scanning

   activate()
               $slr->activate($_, 0) for @events;

       The activate() method allows the recognizer to deactivate and reactivate named events.
       Named events allow the recognizer to stop for external scanning at conveniently defined
       locations.  Named events can be defined for the prediction and completion of non-zero-
       length symbols, and nulled events can be defined to trigger when zero-length symbols are
       recognized.

       The activate() method takes two arguments.  The first is the name of an event, and the
       second (optional) argument is 0 or 1.  If the second argument is 0, the event is
       deactivated.  If it is 1, the event is reactivated.  An argument of 1 is the default, but,
       since an SLIF recognizer always starts with all defined events activated, 0 will probably
       be more common as the second argument to activate().

       Location 0 events are triggered in the SLIF recognizer's constructor, before the
       activate() method can be called.  This means that currently there is no way to deactivate
       location zero events.

       The overhead imposed by events can be reduced by using the activate() method.  But making
       many calls to the activate() method purely for efficiency purposes will be
       counterproductive.  Also, deactivated events still impose some overhead, so if an event is
       never used it should be commented out in the SLIF DSL.

   lexeme_alternative()
                   if ( not defined $recce->lexeme_alternative($token_name) ) {
                       die
                           qq{Parser rejected token "$long_name" at position $start_of_lexeme, before "},
                           substr( $string, $start_of_lexeme, 40 ), q{"};
                   }

       The lexeme_alternative() method allows an external scanner to read ambiguous tokens.  Most
       applications will prefer the simpler lexeme_read().

       lexeme_alternative() takes one or two arguments.  The first argument, which is required,
       is the name of a symbol to be read at the current location.  The second argument, which is
       optional, is the value of the symbol.  The value argument is interpreted as described for
       lexeme_read().

       Any number of tokens may be read using lexeme_alternative() without advancing the current
       location.  This allows an application to use ambiguous tokens.  To complete reading at a
       G1 location, and advance the current G1 location to the next G1 location, use the
       lexeme_complete() method.

       On success, returns a non-negative number.  Returns "undef" if the token was rejected.
       Failures are thrown as exceptions.

   lexeme_complete()
                   next TOKEN
                       if $recce->lexeme_complete( $start_of_lexeme,
                               ( length $lexeme ) );

       The lexeme_complete() method allows an external scanner to read ambiguous tokens.  Most
       applications will prefer the simpler lexeme_read().

       The lexeme_complete() method requires two arguments, an input stream start location and a
       length.  These are interpreted as described for the corresponding second and third
       arguments to lexeme_read().  The lexeme_complete() method completes the reading of
       alternative tokens at the current G1 location, and advances the current G1 location by
       one.  Current location in the input stream is moved to the location after the new lexeme,
       as indicated by the arguments.

       Completion named events can occur during the lexeme_complete() method.  Named events can
       be queried using the Scanless recognizer's events() method.

       Return value: On success, lexeme_complete() returns the new current location.  This will
       never be location zero, because a successful call of lexeme_complete() always advances the
       location.  On unthrown failure, lexeme_complete() returns 0.

   lexeme_read()
           $re->lexeme_read( 'lstring', $start, $length, $value ) // die;

       The lexeme_read() method reads a single, unambiguous, lexeme.  It takes four arguments,
       only the first of which is required.  The first argument is the lexeme's symbol name.  The
       second and third arguments specify the span in the input stream to be associated with the
       lexeme.  The last argument indicates its value.

       The second and third arguments are, respectively, the start and length of a span in the
       input stream.  The start defaults to the current location.  If the pause span is defined,
       and the start of the pause lexeme is the same as the current location, length defaults to
       the length of the pause span.  Otherwise length defaults to -1.

       Negative values are allowed and are interpreted as described above.  This span will be
       treated as the section of the input stream that corresponds to the tokens read at the
       current location.  This correspondence may be artificial, but a span must always be
       specified.

       The fourth argument specifies the value of the lexeme.  If the value argument is omitted,
       the token's value will be a string containing the corresponding substring of the input
       stream.  Omitting the value argument does not have the same effect as passing an explicit
       Perl "undef".  If the value argument is an explicit Perl "undef", the value of the lexeme
       will be a Perl "undef".

           $slr->lexeme_read($symbol, $start, $length, $value)

       is the equivalent of

           $slr->lexeme_alternative($symbol, $value)
           $slr->lexeme_complete($start, $length)

       Current location in the input stream is moved to the place where read() paused or, if it
       never pauses, to "$start+$length".  Current G1 location is advanced by one.

       Completion named events can occur during the lexeme_read() method.  Named events can be
       queried using the Scanless recognizer's events() method.

       Return value: On success, lexeme_read() returns the new current location.  This will never
       be location zero, because lexemes cannot be zero length.  If the token was rejected,
       returns a Perl "undef".  On other unthrown failure, returns 0.
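
       The three kinds of return value can be told apart as in the sketch below.  The
       describe_lexeme_read_result() helper is hypothetical and not part of Marpa::R2; it takes
       a value as returned by lexeme_read() and classifies it according to the description
       above.  Note that the "// die" idiom shown earlier catches only the "undef" case, since 0
       is a defined value.

```perl
use strict;
use warnings;

# Hypothetical helper: interprets a value returned by lexeme_read().
sub describe_lexeme_read_result {
    my ($result) = @_;

    # A Perl undef means the token was rejected.
    return 'token rejected' if not defined $result;

    # Zero signals some other unthrown failure.
    return 'unthrown failure' if $result == 0;

    # Anything else is the new current location.
    return "advanced to location $result";
}

# undef, 0 and 17 stand in for values returned by lexeme_read().
for my $result ( undef, 0, 17 ) {
    print describe_lexeme_read_result($result), "\n";
}
```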

   resume()
           my $re = Marpa::R2::Scanless::R->new(
               {   grammar           => $parser->{grammar},
                   semantics_package => 'MarpaX::JSON::Actions'
               }
           );
           my $length = length $string;
           for (
               my $pos = $re->read( \$string );
               $pos < $length;
               $pos = $re->resume()
               )
           {
               my ( $start, $length ) = $re->pause_span();
               my $value = substr $string, $start + 1, $length - 2;
               $value = decode_string($value) if -1 != index $value, '\\';
               $re->lexeme_read( 'lstring', $start, $length, $value ) // die;
           } ## end for ( my $pos = $re->read( \$string ); $pos < $length...)
           my $per_parse_arg = bless {}, 'MarpaX::JSON::Actions';
           my $value_ref = $re->value($per_parse_arg);
           return ${$value_ref};

       The resume() method takes two arguments, a start location and a length.  The default start
       location is the current location.  The default length is -1.  Negative arguments are
       interpreted as described above.

       The resume() method resumes the SLIF's internal scanning, as described above.

       Completion named events can occur during the resume() method.  When a named event occurs,
       the resume() method pauses.  Named events can be queried using the Scanless recognizer's
       events() method.  The resume() method also pauses as specified with the Scanless DSL's
       pause adverb.

       On success, resume() moves the current location to where it paused, or to the EOS.  The
       return value is the new current location.  On unthrown failure, resume() returns a Perl
       "undef".

Accessors

   ambiguity_metric()
           my $ambiguity_metric = $slr->ambiguity_metric();

       Returns 1 if there is an unambiguous parse, and 2 or greater if there is an ambiguous
       parse.  Returns 0 if called before parsing.  Returns 0 or less than zero on other unthrown
       failure.

   current_g1_location()
           my $current_g1_location = $slr->current_g1_location();

       Returns the current G1 location.

   events()
               EVENT:
               for my $event ( @{ $slr->events() } ) {
                   my ($name) = @{$event};
                   push @actual_events, $name;
               }

       The events() method takes no arguments, and returns a reference to an array of event
       descriptors.  The array is empty if there were no events.

       Each named event descriptor is a reference to an array of one, and potentially more,
       elements.  The first element of every named event descriptor is a string containing the
       name of the event, and this is typically the only element.  In certain cases, there could
       be other elements of a named event descriptor, which will be as described for the type of
       named event.  Named events are described in the SLIF DSL.

       Events occur during the Scanless recognizer's read(), resume(), lexeme_complete(), and
       lexeme_read() methods.  Any subsequent call to an SLIF recognizer mutator may clear the
       list of triggered events.  The assumption is that an application interested in events will
       call the events() method almost as soon as control is returned to it.

       Named events are returned in order by type.  Completion events are first.  They are
       followed by the nulled events.  These are in turn followed by prediction events.  Within
       each type, the order of events is arbitrary.

       Applications may find it convenient to turn specific events off, temporarily or
       permanently.  Events may be activated or deactivated with the SLIF recognizer's activate()
       method.

   exhausted()
           my $exhausted_status = $slr->exhausted();

       The exhausted() method returns a Perl true if parsing in an SLIF recognizer is exhausted, and
       a Perl false otherwise. Parsing is exhausted when the recognizer will not accept any
       further input.

       An attempt to read input into an exhausted parser causes an exception to be thrown.  The
       exception is all that most applications require, but this method allows the recognizer's
       exhaustion status to be discovered directly.

   g1_location_to_span()
               my ( $span_start, $span_length ) =
                   $slr->g1_location_to_span($g1_location);

       G1 locations do not correspond to a single input stream location, but to a span of them.
       The g1_location_to_span() method returns an array of two elements, representing a span in
       the input stream.  The first element of the array is the input stream location where the
       span starts.  The second element of the array is the length of the span.  As a special
       case, the input stream span for G1 location 0 is always (0,0).

       Sometimes it is convenient to think of G1 location as corresponding to a single input
       stream location.  When this is the case, what is usually intended is the last input stream
       location of the span.  The last input stream location of the span will always be
       "$span_start+$span_length".

   input_length()
           my $input_length = $slr->input_length();

       The input_length() method accepts no arguments, and returns the length of the input
       stream.

   last_completed()
           sub show_last_expression {
               my ($self) = @_;
               my $recce = $self->{recce};
               my ( $g1_start, $g1_length ) = $recce->last_completed('Expression');
               return 'No expression was successfully parsed' if not defined $g1_start;
               my $last_expression = $recce->substring( $g1_start, $g1_length );
               return "Last expression successfully parsed was: $last_expression";
           } ## end sub show_last_expression

           my ( $g1_start, $g1_length ) = $recce->last_completed('Expression');

       Given the name of a symbol, returns the start G1 location and the length in G1 locations
       of the most recent match.  If there was more than one most recent match, it returns the
       longest.  If there was no match, returns the empty array in array context and a Perl false
       in scalar context.

   line_column()
           my ( $start, $span_length ) = $re->pause_span();
           my ( $line,  $column )      = $re->line_column($start);

       The line_column() method accepts one, optional, argument: a location in the input stream.
       The location defaults to the current location.  line_column() returns the corresponding
       line and column position, as a 2-element array.  The first element of the array is the
       line position, and the second element is the column position.

       Numbering of lines and columns is 1-based, following UNIX editor tradition.  Except at
       EOF, the line and column will be that of an actual character.  At EOF the line number will
       be that of the last line, and the column number will be that of the last column plus one.
       Applications which want to treat EOF as a special case can test for it using the pos()
       method and the input_length() method.
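
       One way to write such an EOF test is sketched below.  The at_eof() helper is a
       hypothetical name, and Demo_Recce is a stand-in that only mimics the pos() and
       input_length() methods so that the sketch runs without a grammar.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a SLIF recognizer; it only records a current
# location and an input length.
{
    package Demo_Recce;
    sub new          { my ( $class, %args ) = @_; return bless {%args}, $class }
    sub pos          { my ($self) = @_; return $self->{pos} }
    sub input_length { my ($self) = @_; return $self->{input_length} }
}

# Hypothetical helper: true when the current location is at (or past) the
# end of the input stream.
sub at_eof {
    my ($slr) = @_;
    return $slr->pos() >= $slr->input_length() ? 1 : 0;
}
```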

       A line is considered to end with any newline sequence as defined in the Unicode
       Specification 4.0.0, Section 5.8.  Specifically, a line ends with one of the following:

       •   a LF (line feed, U+000A);

       •   a CR (carriage return, U+000D), when it is not followed by a LF;

       •   a CRLF sequence (U+000D,U+000A);

       •   a NEL (next line, U+0085);

       •   a VT (vertical tab, U+000B);

       •   a FF (form feed, U+000C);

       •   a LS (line separator, U+2028) or

       •   a PS (paragraph separator, U+2029).
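
       The numbering rules above can be approximated in plain Perl.  The sketch below is not
       Marpa's implementation; line_column_table() is a hypothetical helper, and two details
       are assumptions of the sketch: a CR that is followed by a LF is counted as an ordinary
       column on the same line, and empty input reports (1, 1) at EOF.

```perl
use strict;
use warnings;

# Single code points that end a line: LF, VT, FF, NEL, LS and PS.
# CR is handled separately, because it ends a line only when no LF follows.
my %LINE_END = map { $_ => 1 } ( 0x0A, 0x0B, 0x0C, 0x85, 0x2028, 0x2029 );

# Hypothetical helper, not part of Marpa::R2.  Returns a reference to an
# array holding one [line, column] pair per input offset, followed by one
# final pair for EOF.  Lines and columns are 1-based.
sub line_column_table {
    my ($input) = @_;
    my @chars = split //, $input;
    my @table;
    my ( $line, $column ) = ( 1, 1 );
    for my $i ( 0 .. $#chars ) {
        push @table, [ $line, $column ];
        my $code     = ord $chars[$i];
        my $cr_alone = $code == 0x0D
            && ( $i == $#chars || ord( $chars[ $i + 1 ] ) != 0x0A );
        if ( $LINE_END{$code} || $cr_alone ) {
            $line++;
            $column = 1;
        }
        else {
            $column++;
        }
    }

    # At EOF: the last line, and the last column plus one.
    # Reporting (1, 1) for empty input is an assumption of this sketch.
    push @table, ( @table ? [ $table[-1][0], $table[-1][1] + 1 ] : [ 1, 1 ] );
    return \@table;
}
```

       For the input "ab\r\ncd", the entry for offset 4 (the letter c) is line 2, column 1,
       because the CRLF pair at offsets 2 and 3 counts as a single line ending.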

   literal()
           my $literal_string = $re->literal( $start, $span_length );

       The literal() method accepts two arguments, the start location and length of a span in the
       input stream.  It returns the substring of the input stream corresponding to that span.

   pause_lexeme()
          my $lexeme = $re->pause_lexeme();

       The pause_lexeme() method accepts no arguments, and returns the name of the lexeme which
       caused the most recent pause.  The pause lexeme is initially undefined and it is reset to
       undefined at the beginning of each call to the read() or resume() methods.

       More than one lexeme may cause a pause.  When this is the case, all the causal lexemes
       will be acceptable to the G1 grammar, and all the causal lexemes will have the same lexeme
       priority.  When more than one lexeme causes a pause, the choice of pause lexeme is
       arbitrary.  Applications may not rely on a particular choice, or on that choice being
       repeated, even when the choice is made in similar or identical circumstances.

       Not every pause is caused by a lexeme.  A pause often occurs because of the length
       argument of an internal scanning method.  When the most recent pause was not caused by a
       lexeme, the pause lexeme is undefined.  pause_lexeme() returns a Perl "undef" when the
       pause lexeme is undefined.

   pause_span()
           my ( $start, $length ) = $re->pause_span();

       The pause_span() method accepts no arguments, and returns the "pause span" as a 2-element
       array.  The "pause span" is the start location and length of the lexeme which caused the
       most recent pause.  The pause span is initially undefined and it is reset to undefined at
       the beginning of each call to the read() or resume() methods.

       A pause is not always caused by a lexeme -- internal scanning may be paused because of the
       length argument of an internal scanning method.  When the most recent pause was not caused
       by a lexeme, no span can be associated with it, and the pause span is undefined.
       pause_span() returns a Perl "undef" if the pause span is undefined.
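
       Because the pause span may be undefined, callers usually check it before using it.  The
       sketch below combines pause_span() with literal() to fetch the text of the pausing
       lexeme.  The paused_text() helper and the Demo_Recce stand-in are hypothetical; the
       stand-in mimics only the two methods used.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a SLIF recognizer, so the sketch runs on its own.
{
    package Demo_Recce;
    sub new { my ( $class, %args ) = @_; return bless {%args}, $class }

    # Returns (start, length), or undef when no lexeme caused the pause.
    sub pause_span {
        my ($self) = @_;
        return defined( $self->{span} ) ? @{ $self->{span} } : undef;
    }

    sub literal {
        my ( $self, $start, $length ) = @_;
        return substr $self->{input}, $start, $length;
    }
}

# Hypothetical helper: the text of the lexeme that caused the most recent
# pause, or undef (in scalar context) if the pause had another cause.
sub paused_text {
    my ($slr) = @_;
    my ( $start, $length ) = $slr->pause_span();
    return if not defined $start;
    return $slr->literal( $start, $length );
}
```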

   pos()
           my $pos = $slr->pos();

       The pos() method accepts no arguments, and returns the current input stream location.

   progress()
           my $progress_output = $slr->progress();

       Returns an array that describes the progress of a parse at a location.  With no argument,
       progress() reports progress at the current location.  If a G1 location is given as its
       argument, progress() reports progress at that G1 location.  The G1 location may be
       negative.  An argument of -X will be interpreted as location N-X+1, where N is the current
       G1 location.  In other words, an argument of -1 indicates the current G1 location, an
       argument of -2 indicates the G1 location just before the current one, etc.
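
       The arithmetic for negative locations can be sketched as follows, consistent with -1
       meaning the current G1 location.  resolve_g1_location() is a hypothetical helper written
       for illustration, not a Marpa::R2 method.

```perl
use strict;
use warnings;

# Hypothetical helper: convert a possibly negative location argument into an
# absolute G1 location.  $argument is the value as passed (for example -2),
# and $current is the current G1 location.
sub resolve_g1_location {
    my ( $argument, $current ) = @_;
    return $argument >= 0 ? $argument : $current + $argument + 1;
}
```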

       The progress reports returned by the progress() method identify rules by their G1 rule ID.
       G1 rule IDs can be converted to a list of the rule's symbols using the rule() method of
       the SLIF grammar.  Details on progress reports can be found in their own document.

   show_progress()
           my $show_progress_output = $slr->show_progress();

       Shows the progress of the G1 parse.  For a description of its output, see
       Marpa::R2::Progress.

       With no arguments, the string contains reports for the current location.  If locations are
       specified as arguments to show_progress(), they need to be G1 locations.

       With a single integer argument N, the string contains reports for G1 location N.  With two
       numeric arguments, N and M, the arguments are interpreted as the start and end points of a
       range of G1 locations and the returned string contains reports for all locations in the
       range.

       If an argument is negative, -N, it indicates the Nth location counting backward from the
       furthest location of the parse.  For example, if 42 were the furthest G1 location, -1 would
       be G1 location 42 and -2 would be location 41.  Thus the method call
       "$recce->show_progress(-3, -1)" returns reports for the last three G1 locations of the
       parse, and the method call "$recce->show_progress(0, -1)" returns progress reports for the
       entire parse.
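
       To make the range convention concrete, the sketch below lists the G1 locations that a
       pair of arguments selects.  g1_locations_in_range() is a hypothetical helper, and the
       furthest location is passed in explicitly rather than taken from a recognizer.

```perl
use strict;
use warnings;

# Hypothetical helper: expand a (start, end) pair, either of which may be
# negative, into the list of G1 locations it covers.  $furthest is the
# furthest G1 location of the parse.  Call it in list context.
sub g1_locations_in_range {
    my ( $start, $end, $furthest ) = @_;
    for my $location ( $start, $end ) {
        $location += $furthest + 1 if $location < 0;
    }
    return ( $start .. $end );
}
```

       With a furthest location of 42, the arguments (-3, -1) select locations 40, 41 and 42.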

       Locations are G1 locations instead of string offsets, for two reasons.  First, G1 parse
       state is only defined at the start of parsing, and at the end of a non-discarded lexeme.
       Therefore many string offsets will not have a G1 parse state.  Second, SLIF recognizers
       using external scanning are allowed to rescan the same string repeatedly.  Therefore, a
       single string offset may have many G1 parse states.

   substring()
           my $last_expression = $recce->substring( $g1_start, $g1_length );

       Given a G1 span -- that is, a G1 start location and a length in G1 locations -- the
       substring() method returns a substring of the input stream.  A G1 length of zero will
       produce the zero-length string.

       The substring of the input stream is determined on the assumption that the application
       reads the input monotonically.  When this is not the case, the substring is determined as
       described above.

   terminals_expected()
           my @terminals_expected = @{$slr->terminals_expected()};

       Returns a reference to a list of strings, where the strings are the names of the lexemes
       acceptable at the current location.  The presence of a lexeme in this list means that
       lexeme will be acceptable in the next call of the resume() method.

       This is highly useful for Ruby Slippers parsing.  A more fine-tuned approach is to
       identify the lexemes of interest and create "predicted symbol" events for them.
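
       A Ruby Slippers check often amounts to asking whether one particular lexeme is currently
       acceptable.  The sketch below shows that test.  The is_expected() helper, the Demo_Recce
       stand-in and the lexeme names are hypothetical, chosen only so that the sketch runs
       without a grammar.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a SLIF recognizer: terminals_expected() returns
# a reference to a fixed list of lexeme names.
{
    package Demo_Recce;
    sub new {
        my ( $class, @names ) = @_;
        return bless { names => [@names] }, $class;
    }
    sub terminals_expected { my ($self) = @_; return $self->{names} }
}

# Hypothetical helper: 1 if the named lexeme is acceptable at the current
# location, 0 otherwise.
sub is_expected {
    my ( $slr, $lexeme_name ) = @_;
    my %expected = map { $_ => 1 } @{ $slr->terminals_expected() };
    return $expected{$lexeme_name} ? 1 : 0;
}
```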

Discouraged methods

       Methods in this section continue to be supported, but their use is discouraged in favor of
       other, better solutions.  New applications should avoid using discouraged methods.

   event()
           my $event = $slr->event($event_ix);

       Use of this method is discouraged in favor of the more efficient events() method.  The
       event() method requires one argument, an event index.  It returns a descriptor of the
       named event with that index, or a Perl "undef" if there is no such event.  For more
       details on events, see the description of the events() method.

   last_completed_range()
       Use of this method is discouraged in favor of "last_completed()".  Given the name of a
       symbol, last_completed_range() returns the G1 start and G1 end locations of the most
       recent match.  If there was more than one most recent match, last_completed_range()
       returns the longest.  If there was no match, last_completed_range() returns the empty
       array in array context and a Perl false in scalar context.

   range_to_string()
       Use of this method is discouraged in favor of "substring()".  Given a G1 start and a G1
       end location, range_to_string() returns the substring of the input stream that is between
       the two.  The range_to_string() method assumes that the application reads forward smoothly
       in the input stream, while reading the sequence of G1 locations.  When that is not the
       case, range_to_string() behaves in much the same way as described above for "substring()".

Copyright and License

         Copyright 2014 Jeffrey Kegler
         This file is part of Marpa::R2.  Marpa::R2 is free software: you can
         redistribute it and/or modify it under the terms of the GNU Lesser
         General Public License as published by the Free Software Foundation,
         either version 3 of the License, or (at your option) any later version.

         Marpa::R2 is distributed in the hope that it will be useful,
         but WITHOUT ANY WARRANTY; without even the implied warranty of
         MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
         Lesser General Public License for more details.

         You should have received a copy of the GNU Lesser
         General Public License along with Marpa::R2.  If not, see
         http://www.gnu.org/licenses/.