Provided by: libstatistics-r-io-perl_1.0002-2_all

NAME

       Statistics::R::IO::Parser - Functions for parsing R data files

VERSION

       version 1.0002

SYNOPSIS

           use feature 'say';
           use Statistics::R::IO::ParserState;
           use Statistics::R::IO::Parser;

           my $state = Statistics::R::IO::ParserState->new(
               data => 'file.rds'
           );
           say $state->at;
           say $state->next->at;

DESCRIPTION

       You shouldn't create instances of this class; it exists mainly to handle deserialization
       of R data files by the "IO" classes.

FUNCTIONS

       This library is inspired by monadic parser frameworks from the Haskell world, like Packrat
       <http://bford.info/packrat/> or Parsec <http://hackage.haskell.org/package/parsec>. What
       this means is that parsers are constructed by combining simpler parsers.

       The library offers a selection of basic parsers and combinators.  Each of these is a
       function (think of it as a factory) that returns another function (the actual parser).
       The parser receives the current parsing state (Statistics::R::IO::ParserState) as its
       argument and returns a two-element array reference (called "a pair" in the following
       text for brevity), with the result of the parser in the first element and the new
       parser state in the second. If the parser fails, for example because the current state
       is at "a" where a number is expected, it returns "undef" to signal failure.

       The descriptions of individual functions below use a shorthand because the above mechanism
       is implied. Thus, when "any_char" is described as "parses any character", it really means
       that calling "any_char" will return a function that when called with the current state
       will return "a pair of the character...", etc.
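
       The factory/parser/pair mechanism can be sketched in plain Perl as follows. This is
       only an illustration under stated assumptions: the "st_*" helpers are a hypothetical
       stand-in for Statistics::R::IO::ParserState, and "p_any_char" is a hypothetical name,
       not the module's own code.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Statistics::R::IO::ParserState (not the real
# class): an immutable hash of the input characters and a position.
sub st_new  { my $str = shift; return { data => [ split //, $str ], pos => 0 } }
sub st_at   { my $s = shift; return $s->{data}[ $s->{pos} ] }
sub st_eof  { my $s = shift; return $s->{pos} >= @{ $s->{data} } }
sub st_next { my $s = shift; return { %$s, pos => $s->{pos} + 1 } }

# Factory: calling p_any_char() returns the actual parser, a closure that
# takes a state and returns [ result, new_state ], or undef on failure.
sub p_any_char {
    return sub {
        my $state = shift;
        return undef if st_eof($state);
        return [ st_at($state), st_next($state) ];
    };
}

my $parser = p_any_char();               # build the parser
my $pair   = $parser->( st_new('ab') );  # apply it to a state
my ($char, $rest) = @$pair;
print "parsed '$char', now at position $rest->{pos}\n";
```

       The starting state is never modified; the parser hands back a fresh state, which is
       what lets combinators retry or chain parsers freely.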

   CHARACTER PARSERS
       any_char
           Parses any character, returning a pair of the character at the current State's
           position and the new state, advanced by one from the starting state. If the state is
           at the end ("$state->eof" is true), returns undef to signal failure.

       char $c
           Parses the given character $c, returning a pair of the character at the current
           State's position if it is equal to $c and the new state, advanced by one from the
           starting state. If the state is at the end ("$state->eof" is true) or the character at
           the current position is not $c, returns undef to signal failure.

       string $s
           Parses the given string $s, returning a pair of the sequence of characters starting at
           the current State's position if it is equal to $s and the new state, advanced by
           "length($s)" from the starting state. If the state is at the end ("$state->eof" is
           true) or the string starting at the current position is not $s, returns undef to
           signal failure.
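
           A minimal sketch of how "char"-like and "string"-like parsers could behave. The
           "st_*" helpers and the "p_" names are hypothetical stand-ins, and returning the
           matched text as a single string is an assumption of this sketch.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the parser state (not the real class).
sub st_new  { my $str = shift; return { data => [ split //, $str ], pos => 0 } }
sub st_at   { my $s = shift; return $s->{data}[ $s->{pos} ] }
sub st_eof  { my $s = shift; return $s->{pos} >= @{ $s->{data} } }
sub st_next { my $s = shift; return { %$s, pos => $s->{pos} + 1 } }

# Succeeds only if the current character equals $c.
sub p_char {
    my $c = shift;
    return sub {
        my $state = shift;
        return undef if st_eof($state) || st_at($state) ne $c;
        return [ $c, st_next($state) ];
    };
}

# Succeeds only if the input continues with every character of $s.
sub p_string {
    my $s = shift;
    return sub {
        my $state = shift;
        for my $c ( split //, $s ) {
            my $pair = p_char($c)->($state) or return undef;
            $state = $pair->[1];
        }
        return [ $s, $state ];
    };
}

my $pair = p_string('RDX')->( st_new("RDX2\n") );
print "matched '$pair->[0]', next position $pair->[1]{pos}\n";
```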

   NUMBER PARSERS
       endianness [$end]
           When the $end argument is given, this function sets the byte order used by parsers in
           the module to be little- (when $end is "<") or big-endian ($end is ">"). This function
           changes the module's state and remains in effect until the next change.

           When called with no arguments, "endianness" returns the current byte order in effect.
           The starting byte order is big-endian.
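
           To see why the byte order matters, the same four bytes can be decoded both ways
           with Perl's built-in "unpack". The helper name "decode_uint32" is hypothetical
           and only illustrates the "<" and ">" convention described above.

```perl
use strict;
use warnings;

# Hypothetical helper: decode four bytes as an unsigned 32-bit integer.
# '>' selects big-endian (unpack template 'N'), '<' little-endian ('V').
sub decode_uint32 {
    my ($bytes, $order) = @_;
    return unpack( $order eq '<' ? 'V' : 'N', $bytes );
}

my $bytes = "\x00\x00\x01\x00";
printf "big-endian: %d, little-endian: %d\n",
    decode_uint32($bytes, '>'), decode_uint32($bytes, '<');
# prints "big-endian: 256, little-endian: 65536"
```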

       any_uint8, any_uint16, any_uint24, any_uint32
           Parses an 8-, 16-, 24-, or 32-bit unsigned integer, returning a pair of the integer
           starting at the current State's position and the new state, advanced by 1, 2, 3, or 4
           bytes from the starting state, depending on the parser. The integer value is
           determined by the current value of "endianness". If there are not enough elements left
           in the data from the current position, returns undef to signal failure.

       uint8 $n, uint16 $n, uint24 $n, uint32 $n
           Parses the specified 8-, 16-, 24-, or 32-bit unsigned integer $n, returning a pair of
           the integer at the current State's position if it is equal to $n and the new state.
           The new state is advanced by 1, 2, 3, or 4 bytes from the starting state, depending on
           the parser. The integer value is determined by the current value of "endianness". If
           there are not enough elements left in the data from the current position or the
           integer at the current position is not $n, returns undef to signal failure.

       any_int8, any_int16, any_int24, any_int32
           Parses an 8-, 16-, 24-, or 32-bit signed integer, returning a pair of the integer
           starting at the current State's position and the new state, advanced by 1, 2, 3, or 4
           bytes from the starting state, depending on the parser. The integer value is
           determined by the current value of "endianness". If there are not enough elements left
           in the data from the current position, returns undef to signal failure.

       int8 $n, int16 $n, int24 $n, int32 $n
           Parses the specified 8-, 16-, 24-, or 32-bit signed integer $n, returning a pair of
           the integer at the current State's position if it is equal to $n and the new state.
           The new state is advanced by 1, 2, 3, or 4 bytes from the starting state, depending on
           the parser. The integer value is determined by the current value of "endianness". If
           there are not enough elements left in the data from the current position or the
           integer at the current position is not $n, returns undef to signal failure.
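
           The difference between the signed and unsigned variants, including the less
           common 24-bit width, can be sketched as below. Both helper names are
           hypothetical, and the code shows the arithmetic only, not how the module decodes
           integers internally.

```perl
use strict;
use warnings;

# Decode 1 to 4 bytes as an unsigned integer in the given byte order
# ('>' big-endian, '<' little-endian).
sub decode_uint {
    my ($bytes, $order) = @_;
    my @octets = unpack 'C*', $bytes;
    @octets = reverse @octets if $order eq '<';   # normalise to big-endian
    my $value = 0;
    $value = $value * 256 + $_ for @octets;
    return $value;
}

# Reinterpret the same bits as a two's-complement signed integer.
sub decode_int {
    my ($bytes, $order) = @_;
    my $bits  = 8 * length($bytes);
    my $value = decode_uint($bytes, $order);
    return $value >= 2 ** ($bits - 1) ? $value - 2 ** $bits : $value;
}

printf "%d %d\n", decode_uint("\xff\xff\xfe", '>'), decode_int("\xff\xff\xfe", '>');
# prints "16777214 -2"
```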

       any_real32, any_real64
           Parses a 32- or 64-bit real number, returning a pair of the number starting at the
           current State's position and the new state, advanced by 4 or 8 bytes from the starting
           state, depending on the parser. The real value is determined by the current value of
           "endianness". If there are not enough elements left in the data from the current
           position, returns undef to signal failure.

       any_int32_na, any_real64_na
           Parses a 32-bit signed integer or 64-bit real number, respectively, but recognizing
           R-style missing values (NAs): INT_MIN for integers and a special NaN bit pattern for
           reals. Returns a pair of the number value ("undef" if an NA) and the new state,
           advanced by 4 or 8 bytes from the starting state, depending on the parser. If there
           are not enough elements left in the data from the current position, returns undef to
           signal failure.
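
           A sketch of the NA mapping for big-endian input. The helper names are
           hypothetical. The 64-bit pattern used here is the one R uses for NA_real_ (a NaN
           whose low word is 1954); comparing the full eight bytes is an assumption of this
           sketch, and the module may test for it differently.

```perl
use strict;
use warnings;

# 32-bit signed integer, big-endian; INT_MIN stands for NA and maps to undef.
sub decode_int32_na {
    my $bytes = shift;
    my $value = unpack 'l>', $bytes;
    return $value == -2**31 ? undef : $value;
}

# 64-bit real, big-endian; R's NA_real_ bit pattern maps to undef.
sub decode_real64_na {
    my $bytes = shift;
    return undef if $bytes eq "\x7f\xf0\x00\x00\x00\x00\x07\xa2";
    return unpack 'd>', $bytes;
}

for my $raw ( "\x00\x00\x00\x2a", "\x80\x00\x00\x00" ) {
    my $v = decode_int32_na($raw);
    print( defined($v) ? "$v\n" : "NA\n" );
}
```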

   SEQUENCING
       seq $p1, ...
           This combinator applies parsers $p1, ... in sequence, using the returned parse state
           of $p1 as the input parse state to $p2, etc.  Returns a pair of the concatenation of
           all the parsers' results and the parsing state returned by the final parser. If any of
           the parsers returns undef, "seq" will return it immediately without attempting to
           apply any further parsers.
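
           The threading of state through "seq" can be sketched like this. All names are
           hypothetical stand-ins, and collecting the results in an array reference is this
           sketch's reading of "concatenation".

```perl
use strict;
use warnings;

# Hypothetical stand-in for the parser state (not the real class).
sub st_new  { my $str = shift; return { data => [ split //, $str ], pos => 0 } }
sub st_at   { my $s = shift; return $s->{data}[ $s->{pos} ] }
sub st_eof  { my $s = shift; return $s->{pos} >= @{ $s->{data} } }
sub st_next { my $s = shift; return { %$s, pos => $s->{pos} + 1 } }

sub p_char {
    my $c = shift;
    return sub {
        my $state = shift;
        return undef if st_eof($state) || st_at($state) ne $c;
        return [ $c, st_next($state) ];
    };
}

# Apply each parser to the state left behind by the previous one.
sub p_seq {
    my @parsers = @_;
    return sub {
        my $state = shift;
        my @results;
        for my $p (@parsers) {
            my $pair = $p->($state) or return undef;   # stop at first failure
            push @results, $pair->[0];
            $state = $pair->[1];
        }
        return [ \@results, $state ];
    };
}

my $pair = p_seq( p_char('a'), p_char('b') )->( st_new('abc') );
print join( ',', @{ $pair->[0] } ), " at $pair->[1]{pos}\n";   # prints "a,b at 2"
```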

       many_till $p, $end
           This combinator applies a parser $p until parser $end succeeds.  It does this by
           alternating applications of $end and $p; once $end succeeds, the function returns the
           concatenation of results of preceding applications of $p. (Thus, if $end succeeds
           immediately, the 'result' is an empty list.) Otherwise, $p is applied and must
           succeed, and the procedure repeats. Returns a pair of the concatenation of all the
           $p's results and the parsing state returned by the final parser. If any application
           of $p returns undef, "many_till" will return it immediately.
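
           A sketch of the alternation between $end and $p, here reading characters up to a
           NUL terminator. The names are hypothetical, and leaving the terminator
           unconsumed is a choice made in this sketch rather than documented behaviour.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the parser state (not the real class).
sub st_new  { my $str = shift; return { data => [ split //, $str ], pos => 0 } }
sub st_at   { my $s = shift; return $s->{data}[ $s->{pos} ] }
sub st_eof  { my $s = shift; return $s->{pos} >= @{ $s->{data} } }
sub st_next { my $s = shift; return { %$s, pos => $s->{pos} + 1 } }

sub p_any_char {
    return sub {
        my $state = shift;
        return undef if st_eof($state);
        return [ st_at($state), st_next($state) ];
    };
}

sub p_char {
    my $c = shift;
    return sub {
        my $state = shift;
        return undef if st_eof($state) || st_at($state) ne $c;
        return [ $c, st_next($state) ];
    };
}

# Try $end first; while it fails, $p must succeed and its result is kept.
sub p_many_till {
    my ($p, $end) = @_;
    return sub {
        my $state = shift;
        my @results;
        until ( $end->($state) ) {
            my $pair = $p->($state) or return undef;
            push @results, $pair->[0];
            $state = $pair->[1];
        }
        return [ \@results, $state ];
    };
}

my $pair = p_many_till( p_any_char(), p_char("\0") )->( st_new("hi\0x") );
print join( '', @{ $pair->[0] } ), "\n";   # prints "hi"
```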

       count $n, $p
           This combinator applies the parser $p exactly $n times in sequence, threading the
           parse state through each call.  Returns a pair of the concatenation of all the
           parsers' results and the parsing state returned by the final application. If any
           application of $p returns undef, "count" will return it immediately without attempting
           any more applications.

       with_count [$num_p = any_uint32], $p
           This combinator first applies parser $num_p to get the number of times that $p should
           be applied in sequence. If only one argument is given, "any_uint32" is used as the
           default value of $num_p.  (So "with_count" works by getting a number $n by applying
           $num_p and then calling "count $n, $p".) Returns a pair of the concatenation of all
           the parsers' results and the parsing state returned by the final application. If the
           initial application of $num_p or any application of $p returns undef, "with_count"
           will return it immediately without attempting any more applications.
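
           A sketch of "count" and "with_count" for length-prefixed data. The names are
           hypothetical, and for brevity the default count parser here reads a single byte,
           whereas the documented default is "any_uint32".

```perl
use strict;
use warnings;

# Hypothetical stand-in for the parser state (not the real class).
sub st_new  { my $str = shift; return { data => [ split //, $str ], pos => 0 } }
sub st_at   { my $s = shift; return $s->{data}[ $s->{pos} ] }
sub st_eof  { my $s = shift; return $s->{pos} >= @{ $s->{data} } }
sub st_next { my $s = shift; return { %$s, pos => $s->{pos} + 1 } }

sub p_any_char {
    return sub {
        my $state = shift;
        return undef if st_eof($state);
        return [ st_at($state), st_next($state) ];
    };
}

# One byte read as an unsigned integer (stand-in for the count parser).
sub p_any_uint8 {
    return sub {
        my $state = shift;
        return undef if st_eof($state);
        return [ ord( st_at($state) ), st_next($state) ];
    };
}

# Apply $p exactly $n times, threading the state through each call.
sub p_count {
    my ($n, $p) = @_;
    return sub {
        my $state = shift;
        my @results;
        for ( 1 .. $n ) {
            my $pair = $p->($state) or return undef;
            push @results, $pair->[0];
            $state = $pair->[1];
        }
        return [ \@results, $state ];
    };
}

# Read the count with $num_p, then delegate to p_count.
sub p_with_count {
    my $p     = pop;
    my $num_p = @_ ? shift : p_any_uint8();
    return sub {
        my $state = shift;
        my $pair  = $num_p->($state) or return undef;
        return p_count( $pair->[0], $p )->( $pair->[1] );
    };
}

my $pair = p_with_count( p_any_char() )->( st_new("\x02abz") );
print join( ',', @{ $pair->[0] } ), "\n";   # prints "a,b"
```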

       choose $p1, ...
           This combinator applies parsers $p1, ... in sequence until one of them succeeds, at
           which point it immediately returns that parser's result.  If all of the parsers fail,
           "choose" fails and returns undef.
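
           A sketch of "choose" with hypothetical names. Each alternative is tried against
           the same starting state, which is safe because states are not modified in
           place.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the parser state (not the real class).
sub st_new  { my $str = shift; return { data => [ split //, $str ], pos => 0 } }
sub st_at   { my $s = shift; return $s->{data}[ $s->{pos} ] }
sub st_eof  { my $s = shift; return $s->{pos} >= @{ $s->{data} } }
sub st_next { my $s = shift; return { %$s, pos => $s->{pos} + 1 } }

sub p_char {
    my $c = shift;
    return sub {
        my $state = shift;
        return undef if st_eof($state) || st_at($state) ne $c;
        return [ $c, st_next($state) ];
    };
}

# Return the pair from the first alternative that succeeds.
sub p_choose {
    my @parsers = @_;
    return sub {
        my $state = shift;
        for my $p (@parsers) {
            my $pair = $p->($state);
            return $pair if $pair;
        }
        return undef;
    };
}

my $format = p_choose( p_char('X'), p_char('A'), p_char('B') );
my $pair   = $format->( st_new('A1') );
print "format code: $pair->[0]\n";   # prints "format code: A"
```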

   COMBINATORS
       bind $p1, $f
           This combinator applies parser $p1 and, if it succeeds, calls function $f using the
           first element of $p1's result as the argument. The call to $f needs to return a
           parser, which "bind" applies to the parsing state after $p1's application.

           The "bind" combinator is an essential building block for most combinators described so
           far. For instance, "with_count" can be written as:

               bind($num_p,
                    sub {
                        my $n = shift;
                        count $n, $p;
                    })

       mreturn $value
           Returns a parser that when applied returns $value without changing the parsing state.

       error $message
           Returns a parser that when applied croaks with the $message and the current parsing
           state.
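
           How "bind" and "mreturn" fit together can be sketched as follows, using a parser
           that accepts any character only when it is immediately repeated. All names,
           including "p_doubled", are hypothetical ("bind" itself is also a Perl built-in,
           hence the "p_" prefix in this sketch).

```perl
use strict;
use warnings;

# Hypothetical stand-in for the parser state (not the real class).
sub st_new  { my $str = shift; return { data => [ split //, $str ], pos => 0 } }
sub st_at   { my $s = shift; return $s->{data}[ $s->{pos} ] }
sub st_eof  { my $s = shift; return $s->{pos} >= @{ $s->{data} } }
sub st_next { my $s = shift; return { %$s, pos => $s->{pos} + 1 } }

sub p_any_char {
    return sub {
        my $state = shift;
        return undef if st_eof($state);
        return [ st_at($state), st_next($state) ];
    };
}

sub p_char {
    my $c = shift;
    return sub {
        my $state = shift;
        return undef if st_eof($state) || st_at($state) ne $c;
        return [ $c, st_next($state) ];
    };
}

# Run $p, feed its result to $f, and run the parser that $f returns
# on the state left behind by $p.
sub p_bind {
    my ($p, $f) = @_;
    return sub {
        my $state = shift;
        my $pair  = $p->($state) or return undef;
        return $f->( $pair->[0] )->( $pair->[1] );
    };
}

# Always succeeds with $value and leaves the state untouched.
sub p_return {
    my $value = shift;
    return sub { my $state = shift; return [ $value, $state ] };
}

# The second parser depends on what the first one read.
sub p_doubled {
    return p_bind( p_any_char(), sub {
        my $c = shift;
        return p_bind( p_char($c), sub { p_return("$c$c") } );
    } );
}

my $pair = p_doubled()->( st_new('aab') );
print "$pair->[0] at $pair->[1]{pos}\n";   # prints "aa at 2"
```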

   SINGLETONS
       These functions are an interface to ParserState's singleton-related functions,
       "add_singleton" in ParserState and "get_singleton" in ParserState. They exist because
       certain types of objects in R data files, for instance environments, have to exist as
       unique instances, and any subsequent objects that include them refer to them by a
       "reference id".

       add_singleton $singleton
           Adds the $singleton to the current parsing state.  Returns a pair of $singleton and
           the new parsing state.

       get_singleton $ref_id
           Retrieves from the current parse state the singleton identified by $ref_id, returning
           a pair of the singleton and the (unchanged) state.

       reserve_singleton $p
           Preallocates a space for a singleton before running a given parser, and then assigns
           the parser's value to the singleton. Returns a pair of the singleton and the new parse
           state.
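
           A sketch of how singletons could travel with the parsing state. The state layout,
           the "p_" names, and the zero-based reference ids are all assumptions made for
           illustration; "reserve_singleton" is left out.

```perl
use strict;
use warnings;

# Hypothetical state: input characters, a position, and the singletons
# registered so far.
sub st_new {
    my $str = shift;
    return { data => [ split //, $str ], pos => 0, singletons => [] };
}

# Register $singleton in a copy of the state; the old state is untouched.
sub p_add_singleton {
    my $singleton = shift;
    return sub {
        my $state = shift;
        my $new   = { %$state, singletons => [ @{ $state->{singletons} }, $singleton ] };
        return [ $singleton, $new ];
    };
}

# Look up a previously registered singleton; the state is returned unchanged.
sub p_get_singleton {
    my $ref_id = shift;
    return sub {
        my $state = shift;
        return [ $state->{singletons}[$ref_id], $state ];
    };
}

my $added = p_add_singleton('global environment')->( st_new('') );
my $found = p_get_singleton(0)->( $added->[1] );
print "singleton 0: $found->[0]\n";   # prints "singleton 0: global environment"
```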

BUGS AND LIMITATIONS

       Instances of this class are intended to be immutable. Please do not try to change their
       value or attributes.

       There are no known bugs in this module. Please see Statistics::R::IO for bug reporting.

SUPPORT

       See Statistics::R::IO for support and contact information.

AUTHOR

       Davor Cubranic <cubranic@stat.ubc.ca>

COPYRIGHT AND LICENSE

       This software is Copyright (c) 2017 by University of British Columbia.

       This is free software, licensed under:

         The GNU General Public License, Version 3, June 2007