Provided by: re2c_0.16-1_amd64

NAME

       re2c - convert regular expressions to C/C++ code

SYNOPSIS

       re2c [OPTIONS] FILE

DESCRIPTION

       re2c  is a lexer generator for C/C++. It finds regular expression specifications inside of
       C/C++ comments and replaces them  with  a  hard-coded  DFA.  The  user  must  supply  some
       interface code in order to control and customize the generated DFA.
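
       As a minimal sketch of what such a specification might look like (the file name lex.re
       and the function name lex are illustrative assumptions, not part of re2c), the input
       below could be processed with re2c -o lex.c lex.re. It is re2c input rather than plain
       C: the special comment block is what re2c replaces with the generated DFA.

```c
/* lex.re: hypothetical input file for `re2c -o lex.c lex.re`. */

/* Returns 1 if the NUL-terminated string starts with a decimal number, else 0. */
static int lex(const char *YYCURSOR)
{
    /*!re2c
        re2c:define:YYCTYPE = char;
        re2c:yyfill:enable  = 0;

        dec = [1-9][0-9]*;

        dec { return 1; }
        *   { return 0; }
    */
}
```

       Here the interface code is just the YYCURSOR parameter plus two inplace configurations.
       Refilling is disabled on the assumption that the terminating NUL stops the scanner.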

OPTIONS

       -? -h --help
              Show a short help message.

       -b --bit-vectors
              Implies  -s.  Use bit vectors as well in the attempt to coax better code out of the
              compiler. Most useful for specifications with more than a few  keywords  (e.g.  for
              most programming languages).

       -c --conditions
              Enable (f)lex-like condition support.

       -d --debug-output
              Creates  a  parser  that  dumps information about the current position and in which
              state the parser is while parsing the input. This is useful to debug parser  issues
              and  states.  If  you  use  this  switch you need to define a macro YYDEBUG that is
              called like a function with two parameters: void YYDEBUG (int state, char current).
              The  first parameter receives the state or -1 and the second parameter receives the
              input at the current cursor.

       -D --emit-dot
              Emit Graphviz dot data. It can then be processed with e.g. dot  -Tpng  input.dot  >
              output.png. Please note that scanners with many states may crash dot.

       -e --ecb
              Generate  a  parser  that  supports  EBCDIC.  The  generated code can deal with any
              character up to 0xFF. In this mode re2c assumes that  input  character  size  is  1
              byte. This switch is incompatible with -w, -x, -u and -8.

       -f --storable-state
              Generate a scanner with support for storable state.

       -F --flex-syntax
              Partial support for flex syntax. When this flag is active, named definitions
              must be surrounded by curly braces and can be defined without an equal sign and the
              terminating semicolon. Instead, names are treated as direct double-quoted strings.

       -g --computed-gotos
              Generate a scanner that utilizes GCC's computed goto feature. That is, re2c
              generates jump tables whenever a decision is of a certain complexity (e.g. a lot of
              if conditions are otherwise necessary). This is only usable with GCC and produces
              output that cannot be compiled with any other compiler. Note that this implies -b
              and that the complexity threshold can be configured using the inplace configuration
              cgoto:threshold.

       -i --no-debug-info
              Do not output #line information. This is useful when you want to use a CMS tool with
              the re2c output, which you might want if you do not require your users to have re2c
              themselves when building from your source.

       -o OUTPUT --output=OUTPUT
              Specify the OUTPUT file.

       -r --reusable
              Allows reuse of scanner definitions with /*!use:re2c */ after /*!rules:re2c */.  In
              this  mode  no  /*!re2c  */ block and exactly one /*!rules:re2c */ must be present.
              The rules are being saved and used by every  /*!use:re2c  */  block  that  follows.
              These   blocks   can   contain  inplace  configurations,  especially  re2c:flags:e,
              re2c:flags:w, re2c:flags:x, re2c:flags:u and re2c:flags:8.  That way it is possible
              to  create the same scanner multiple times for different character types, different
              input mechanisms or different output mechanisms.  The  /*!use:re2c  */  blocks  can
              also  contain  additional  rules  that  will  be  appended  to  the set of rules in
              /*!rules:re2c */.

       -s --nested-ifs
              Generate nested ifs for some switches. Many compilers need this assist to  generate
              better code.

       -t HEADER --type-header=HEADER
              Create  a  HEADER  file  that contains types for the (f)lex-like condition support.
              This can only be activated when -c is in use.

       -u --unicode
              Generate a parser that supports UTF-32. The generated code can deal with any  valid
              Unicode  character  up  to 0x10FFFF. In this mode re2c assumes that input character
              size is 4 bytes. This switch is incompatible with -e, -w, -x and -8.  This  implies
              -s.

       -v --version
              Show version information.

       -V --vernum
              Show the version as a number XXYYZZ.

       -w --wide-chars
              Generate  a  parser that supports UCS-2. The generated code can deal with any valid
              Unicode character up to 0xFFFF.  In this mode re2c  assumes  that  input  character
              size  is  2 bytes. This switch is incompatible with -e, -x, -u and -8. This implies
              -s.

       -x --utf-16
              Generate a parser that supports UTF-16. The generated code can deal with any  valid
              Unicode  character  up  to 0x10FFFF. In this mode re2c assumes that input character
              size is 2 bytes. This switch is incompatible with -e, -w, -u and -8.  This  implies
              -s.

       -8 --utf-8
              Generate  a  parser that supports UTF-8. The generated code can deal with any valid
              Unicode character up to 0x10FFFF. In this mode re2c assumes  that  input  character
              size is 1 byte. This switch is incompatible with -e, -w, -x and -u.

       --case-insensitive
              All  strings are case insensitive, so all "-expressions are treated in the same way
              '-expressions are.

       --case-inverted
              Invert the meaning of single and double quoted strings.  With  this  switch  single
              quotes are case sensitive and double quotes are case insensitive.

       --no-generation-date
              Suppress date output in the generated file.

       --no-version
              Suppress version output in the generated file.

       --encoding-policy POLICY
              Specify how re2c must treat Unicode surrogates. POLICY can be one of the following:
              fail (abort with error when surrogate encountered), substitute (silently substitute
              surrogate  with  error  code point 0xFFFD), ignore (treat surrogates as normal code
              points). By default re2c ignores surrogates (for backward  compatibility).  Unicode
              standard  says  that  standalone  surrogates are invalid code points, but different
              libraries and programs treat them differently.

       --input INPUT
              Specify re2c input API. INPUT can be one of the following: default, custom.

       -S --skeleton
              Instead  of  embedding  re2c-generated  code  into   C/C++   source,   generate   a
              self-contained   program  for  the  same  DFA.  Most  useful  for  correctness  and
              performance testing.

       --empty-class POLICY
              What to do if the user supplies an empty character class. POLICY can be one of the
              following: match-empty (match empty input: pretty illogical, but this is the
              default for backwards compatibility reasons), match-none (fail to match on any
              input), error (compilation error). Note that there are various ways to construct
              an empty class, e.g.: [], [^\x00-\xFF], [\x00-\xFF]\[\x00-\xFF].

       --dfa-minimization <table | moore>
              Internal algorithm used by re2c to minimize DFA (defaults to  moore).   Both  table
              filling  and  Moore's  algorithms  should  produce  identical  DFA  (up  to  states
              relabelling).  Table filling algorithm is much simpler and slower; it serves  as  a
              reference implementation.

       -1 --single-pass
              Deprecated and does nothing (single pass is by default now).

       -W     Turn on all warnings.

       -Werror
              Turn warnings into errors. Note that this option alone doesn't turn on any
              warnings; it only affects those warnings that have been turned on so far or will be
              turned on later.

       -W<warning>
              Turn on individual warning.

       -Wno-<warning>
              Turn off individual warning.

       -Werror-<warning>
              Turn on individual warning and treat it as error (this implies -W<warning>).

       -Wno-error-<warning>
              Don't  treat  this  particular  warning as error. This doesn't turn off the warning
              itself.

       -Wcondition-order
              Warn if the generated program makes implicit assumptions about condition numbering.
              One  should  use  either  -t,  --type-header option or /*!types:re2c*/ directive to
              generate mapping of condition names to  numbers  and  use  autogenerated  condition
              names.

       -Wempty-character-class
              Warn  if regular expression contains empty character class. From the rational point
              of view trying to match empty character class makes  no  sense:  it  should  always
              fail.  However,  for  backwards  compatibility  reasons re2c allows empty character
              class and treats it as empty string. Use --empty-class  option  to  change  default
              behaviour.

       -Wmatch-empty-string
              Warn if regular expression in a rule is nullable (matches empty string). If DFA
              runs in a loop and empty match is unintentional (input position is not advanced
              manually), lexer may get stuck in an infinite loop.

       -Wswapped-range
              Warn if range lower bound is greater than upper bound. Default re2c behaviour is to
              silently swap range bounds.

       -Wundefined-control-flow
              Warn if some input strings cause  undefined  control  flow  in  lexer  (the  faulty
              patterns  are  reported).  This is the most dangerous and common mistake. It can be
              easily fixed by adding default rule * (this rule has the lowest  priority,  matches
              any code unit and consumes exactly one code unit).

       -Wuseless-escape
              Warn if a symbol is escaped when it shouldn't be.  By default re2c silently ignores
              escape, but this may as well indicate a typo or an error in escape sequence.
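
       The -d option described earlier in this section expects the user to provide YYDEBUG. A
       minimal sketch in plain C follows; the helper name trace_state and the counter
       trace_calls are hypothetical and exist only for this illustration.

```c
#include <stdio.h>

/* Hypothetical helper (not part of re2c): counts calls and prints one line per DFA state. */
static unsigned long trace_calls = 0;

static void trace_state(int state, char current)
{
    ++trace_calls;
    /* Cast through unsigned char so that bytes >= 0x80 print as two hex digits. */
    fprintf(stderr, "state=%d input=0x%02x\n", state, (unsigned)(unsigned char)current);
}

/* Code generated with -d "calls" YYDEBUG(state, current); forward it to the helper. */
#define YYDEBUG(state, current) trace_state((state), (char)(current))
```

       The macro must be visible before the generated scanner code, for instance by defining
       it above the re2c block in the input file.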

INTERFACE CODE

       The user must supply interface code either in the form of C/C++ code  (macros,  functions,
       variables,  etc.) or in the form of INPLACE CONFIGURATIONS.  Which symbols must be defined
       and which are optional depends on a particular use case.

       YYCONDTYPE
              In -c mode you can use -t to generate a file that contains the enumeration used  as
              conditions. Each of the values refers to a condition of a rule set.

       YYCTXMARKER
              l-value  of type YYCTYPE *.  The generated code saves trailing context backtracking
              information in YYCTXMARKER. The user only needs to define this macro if  a  scanner
              specification uses trailing context in one or more of its regular expressions.

       YYCTYPE
              Type  used  to  hold an input symbol (code unit). Usually char or unsigned char for
              ASCII, EBCDIC and UTF-8, unsigned short for UTF-16 or UCS-2 and  unsigned  int  for
              UTF-32.

       YYCURSOR
              l-value  of  type  YYCTYPE * that points to the current input symbol. The generated
              code advances YYCURSOR as symbols are matched. On entry,  YYCURSOR  is  assumed  to
              point  to the first character of the current token. On exit, YYCURSOR will point to
              the first character of the following token.

       YYDEBUG (state, current)
              This is only needed if the -d flag was specified. It allows one to easily debug the
              generated  parser  by calling a user defined function for every state. The function
              should have the following signature: void YYDEBUG (int state,  char  current).  The
              first  parameter  receives  the  state  or -1 and the second parameter receives the
              input at the current cursor.

       YYFILL (n)
              The generated code "calls" YYFILL (n) when the buffer needs (re)filling: at least
              n  additional  characters  should  be  provided. YYFILL (n) should adjust YYCURSOR,
              YYLIMIT, YYMARKER and YYCTXMARKER as needed.  Note  that  for  typical  programming
              languages  n will be the length of the longest keyword plus one. The user can place
              a comment of the form /*!max:re2c*/ to insert YYMAXFILL definition that is  set  to
              the maximum length value.

       YYGETCONDITION ()
              This  define  is  used to get the condition prior to entering the scanner code when
              using -c switch. The value must be initialized with a value  from  the  enumeration
              YYCONDTYPE type.

       YYGETSTATE ()
              The  user  only  needs  to  define this macro if the -f flag was specified. In that
              case, the generated code "calls" YYGETSTATE () at the very beginning of the scanner
              in order to obtain the saved state. YYGETSTATE () must return a signed integer. The
              value must be either -1, indicating that the scanner is entered for the first time,
              or a value previously saved by YYSETSTATE (s). In the second case, the scanner will
              resume operations right after where the last YYFILL (n) was called.

       YYLIMIT
              Expression of type YYCTYPE * that marks the end of the buffer (YYLIMIT[-1] is the
              last character in the buffer). The generated code repeatedly compares YYCURSOR to
              YYLIMIT to determine when the buffer needs (re)filling.

       YYMARKER
              l-value of type YYCTYPE *.  The generated code saves  backtracking  information  in
              YYMARKER. Some easy scanners might not use this.

       YYMAXFILL
              This will be automatically defined by /*!max:re2c*/ blocks as explained above.

       YYSETCONDITION (c)
              This  define  is  used to set the condition in transition rules. This is only being
              used when -c is active and transition rules are being used.

       YYSETSTATE (s)
              The user only needs to define this macro if the -f flag was specified. In that
              case, the generated code "calls" YYSETSTATE just before calling YYFILL (n). The
              parameter to YYSETSTATE is a signed integer that uniquely identifies the specific
              instance of YYFILL (n) that is about to be called. Should the user wish to save the
              state of the scanner and have YYFILL (n) return to the caller, all they have to do
              is store that unique identifier in a variable. Later, when the scanner is called
              again, it will call YYGETSTATE () and resume execution right where it left off. The
              generated code will contain both YYSETSTATE (s) and YYGETSTATE even if YYFILL (n)
              is disabled.
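
       The buffer handling that YYFILL (n) is expected to perform can be sketched in plain C as
       follows. This is one possible approach under stated assumptions, not the only way to
       implement it: the names input_t, input_init, input_fill and BUFSIZE are hypothetical,
       the input comes from a memory block standing in for a file, and YYMAXFILL is hard-coded
       here although it would normally be produced by a /*!max:re2c*/ comment.

```c
#include <stddef.h>
#include <string.h>

enum { BUFSIZE = 16, YYMAXFILL = 4 };   /* assumed sizes, chosen small for illustration */

typedef struct {
    char buf[BUFSIZE + YYMAXFILL];      /* extra room for padding at end of input */
    char *cur;                          /* YYCURSOR */
    char *mar;                          /* YYMARKER */
    char *tok;                          /* start of current token, must be <= cur */
    char *lim;                          /* YYLIMIT */
    const char *src;                    /* unread input (stand-in for a file) */
    size_t src_left;                    /* bytes remaining in src */
    int eof;                            /* set once the source is exhausted */
} input_t;

static void input_init(input_t *in, const char *src, size_t len)
{
    /* Start with an "empty" buffer: all pointers at the end, so the first fill loads data. */
    in->cur = in->mar = in->tok = in->lim = in->buf + BUFSIZE;
    in->src = src;
    in->src_left = len;
    in->eof = 0;
}

/* Returns 1 after making room and loading more input, 0 if that is not possible. */
static int input_fill(input_t *in, size_t need)
{
    size_t shift, n;

    if (in->eof) return 0;
    shift = (size_t)(in->tok - in->buf);    /* bytes before the token can be discarded */
    if (shift < need) return 0;             /* token too long for this buffer */

    /* Move the unfinished token to the buffer start and adjust all pointers. */
    memmove(in->buf, in->tok, (size_t)(in->lim - in->tok));
    in->mar = (in->mar >= in->tok) ? in->mar - shift : in->buf;
    in->cur -= shift;
    in->lim -= shift;
    in->tok -= shift;

    /* Load as much new input as fits (a real scanner might call fread here). */
    n = in->src_left < shift ? in->src_left : shift;
    memcpy(in->lim, in->src, n);
    in->src += n;
    in->src_left -= n;
    in->lim += n;

    /* Source exhausted: pad with YYMAXFILL zeros so the scanner cannot overrun. */
    if (in->lim < in->buf + BUFSIZE) {
        in->eof = 1;
        memset(in->lim, 0, YYMAXFILL);
        in->lim += YYMAXFILL;
    }
    return 1;
}
```

       A specification could then route YYFILL to this helper, for example with a definition
       along the lines of #define YYFILL(n) if (!input_fill(in, n)) return -1, where the error
       value -1 is again only an assumption.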

SYNTAX

       Code for re2c consists of a set of RULES, NAMED DEFINITIONS and INPLACE CONFIGURATIONS.

   RULES
       Rules consist of a regular expression (see REGULAR EXPRESSIONS)  along  with  a  block  of
       C/C++  code  that is to be executed when the associated regular expression is matched. You
       can either start the code with an opening curly brace or the sequence :=. When the code
       starts with a curly brace, re2c counts the brace depth and stops looking for code
       automatically. Otherwise curly braces are not allowed and re2c stops looking for  code  at
       the  first  line  that  does  not begin with whitespace. If two or more rules overlap, the
       first rule is preferred.
          regular-expression { C/C++ code }

          regular-expression := C/C++ code

       There is one special rule: default rule *
          * { C/C++ code }

          * := C/C++ code

       Note that default rule * differs from [^]: default rule has the lowest  priority,  matches
       any  code  unit  (either  valid  or  invalid) and always consumes one character; while [^]
       matches any valid code point (not code unit) and can consume multiple code units. In fact,
       when  variable-length  encoding is used, * is the only possible way to match invalid input
       character (see ENCODINGS for details).
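
       The difference between [^] and the default rule can be illustrated with a small sketch
       intended for re2c -8 (UTF-8). The function name count_code_points is a hypothetical
       choice, and the input is assumed to be NUL-terminated. This is re2c input, not plain C.

```c
/* Hypothetical input for `re2c -8`: counts code points, returns -1 on malformed input. */
static long count_code_points(const unsigned char *YYCURSOR)
{
    const unsigned char *YYMARKER;   /* used when a multi-byte sequence has to be abandoned */
    long n = 0;

    for (;;) {
    /*!re2c
        re2c:define:YYCTYPE = "unsigned char";
        re2c:yyfill:enable  = 0;

        "\x00" { return n; }
        [^]    { ++n; continue; }
        *      { return -1; }
    */
    }
}
```

       In this sketch [^] consumes one whole code point (one to four code units), while *
       consumes exactly one code unit and is only reached for input that no other rule
       matches.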

       If -c is active then each regular expression is preceded by a list of comma-separated
       condition names. Besides normal naming rules there are two special cases: <*> (such rules
       are merged to all conditions) and <> (such a rule cannot have an associated regular
       expression; its code is merged to all actions). Non-empty rules may furthermore specify
       the new condition. In that case re2c will generate the necessary code to change the
       condition automatically. Rules can use :=> as a shortcut to automatically generate code
       that not only sets the new condition state but also continues execution with the new
       state. A shortcut rule should not be used in a loop where there is code between the start
       of the loop and the re2c block unless re2c:cond:goto is changed to continue. If code is
       necessary before all rules (though not simple jumps) you can do so by using <!>
       pseudo-rules.
          <condition-list> regular-expression { C/C++ code }

          <condition-list> regular-expression := C/C++ code

          <condition-list> * { C/C++ code }

          <condition-list> * := C/C++ code

          <condition-list> regular-expression => condition { C/C++ code }

          <condition-list> regular-expression => condition := C/C++ code

          <condition-list> * => condition { C/C++ code }

          <condition-list> * => condition := C/C++ code

          <condition-list> regular-expression :=> condition

          <*> regular-expression { C/C++ code }

          <*> regular-expression := C/C++ code

          <*> * { C/C++ code }

          <*> * := C/C++ code

          <*> regular-expression => condition { C/C++ code }

          <*> regular-expression => condition := C/C++ code

          <*> * => condition { C/C++ code }

          <*> * => condition := C/C++ code

          <*> regular-expression :=> condition

          <> { C/C++ code }

          <> := C/C++ code

          <> => condition { C/C++ code }

          <> => condition := C/C++ code

          <> :=> condition

          <! condition-list> { C/C++ code }

          <! condition-list> := C/C++ code

          <!> { C/C++ code }

          <!> := C/C++ code
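
       A few of the rule forms above can be combined into a short sketch for re2c -c. The file
       name strip.re, the function name strip_comments and the condition names init and comment
       are assumptions made for this illustration. The enum values yycinit and yyccomment
       follow from the default condition enum prefix yyc. This is re2c input, not plain C.

```c
/* strip.re: hypothetical input for `re2c -c -o strip.c strip.re`. */
#include <stdio.h>

/*!types:re2c*/

/* Copies a NUL-terminated string to stdout, skipping C-style comments. */
static void strip_comments(const char *YYCURSOR)
{
    const char *YYMARKER;            /* may remain unused by a scanner this simple */
    int cond = yycinit;              /* start in condition "init" */
#define YYGETCONDITION()  cond
#define YYSETCONDITION(c) cond = (c)

    for (;;) {
    /*!re2c
        re2c:define:YYCTYPE = char;
        re2c:yyfill:enable  = 0;

        <*>       "\x00"        { return; }
        <init>    "/*" :=> comment
        <init>    *             { putchar(YYCURSOR[-1]); continue; }
        <comment> "*/" => init  { continue; }
        <comment> *             { continue; }
    */
    }
}
```

       The :=> rule jumps straight into the comment condition, whereas the => rule sets the
       new condition and then runs its own code. The shortcut is acceptable here because no
       code sits between the start of the loop and the re2c block.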

   NAMED DEFINITIONS
       Named definitions are of the form:
          name = regular-expression;

       If -F is active, then named definitions are also of the form:
          name { regular-expression }

   INPLACE CONFIGURATIONS
       re2c:condprefix = yyc;
              Allows one to specify the prefix used for condition labels. That is  this  text  is
              prepended to any condition label in the generated output file.

       re2c:condenumprefix = yyc;
              Allows  one  to  specify the prefix used for condition values. That is this text is
              prepended to any condition enum value in the generated output file.

       re2c:cond:divider = /* *********************************** */ ;
              Allows one to customize the divider for condition blocks. You can use @@ to put the
              name of the condition or customize the placeholder using re2c:cond:divider@cond.

       re2c:cond:divider@cond = @@;
              Specifies  the  placeholder  that  will  be  replaced  with  the  condition name in
              re2c:cond:divider.

       re2c:cond:goto = goto @@; ;
              Allows one to customize the condition goto statements used with  :=>  style  rules.
              You can use @@ to put the name of the condition or customize the placeholder using
              re2c:cond:goto@cond. You can also change this to continue;, which would  allow  you
              to continue with the next loop cycle including any code between loop start and re2c
              block.

       re2c:cond:goto@cond = @@;
              Specifies the placeholder that will be replaced with the condition label in
              re2c:cond:goto.

       re2c:indent:top = 0;
              Specifies the minimum indentation level to use. Requires a numeric value greater
              than or equal to zero.

       re2c:indent:string = \t ;
              Specifies the string to use for indentation. Requires a string that should  contain
              only whitespace unless you need this for external tools. The easiest way to specify
              spaces is to enclose them in single or double quotes. If you do not want any
              indentation at all you can simply set this to "".

       re2c:yych:conversion = 0;
              When  this  setting  is non zero, then re2c automatically generates conversion code
              whenever  yych  gets  read.  In  this  case  the  type  must   be   defined   using
              re2c:define:YYCTYPE.

       re2c:yych:emit = 1;
              Generation of yych can be suppressed by setting this to 0.

       re2c:yybm:hex = 0;
              If  set to zero then a decimal table is being used else a hexadecimal table will be
              generated.

       re2c:yyfill:enable = 1;
              Set this to zero to suppress generation of YYFILL (n). When using this, be sure to
              verify that the generated scanner does not read past the end of the input. Allowing
              this behavior might introduce severe security issues into your programs.

       re2c:yyfill:check = 1;
              This can be set to 0 to suppress output of the precondition using YYCURSOR and
              YYLIMIT, which becomes useful when YYLIMIT + YYMAXFILL is always accessible.

       re2c:define:YYFILL = YYFILL ;
              Substitution for YYFILL. Note that by default re2c generates argument in braces and
              semicolon after YYFILL. If you need to make YYFILL an  arbitrary  statement  rather
              than    a    call,    set    re2c:define:YYFILL:naked    to    non-zero   and   use
              re2c:define:YYFILL@len to denote formal parameter inside of YYFILL body.

       re2c:define:YYFILL@len = @@ ;
              Any occurrence of this text inside of YYFILL  will  be  replaced  with  the  actual
              argument.

       re2c:yyfill:parameter = 1;
              Controls argument in braces after YYFILL. If zero, argument is omitted. If
              non-zero, argument is generated unless re2c:define:YYFILL:naked is set to non-zero.

       re2c:define:YYFILL:naked = 0;
              Controls argument in braces and semicolon after YYFILL. If non-zero, both argument
              and semicolon are omitted. If zero, argument is generated unless
              re2c:yyfill:parameter is set to zero, and semicolon is generated unconditionally.

       re2c:startlabel = 0;
              If set to a non zero integer then the start label of the next scanner  blocks  will
              be  generated even if not used by the scanner itself. Otherwise the normal yy0 like
              start label is only being generated if needed. If set to a text value then a  label
              with  that  text  will be generated regardless of whether the normal start label is
              being used or not. This setting is being reset to 0 after a start  label  has  been
              generated.

       re2c:labelprefix = yy ;
              Allows one to change the prefix of numbered labels. The default is yy and can be
              set to any string that is a valid label.

       re2c:state:abort = 0;
              When not zero and switch -f is active then the  YYGETSTATE  block  will  contain  a
              default case that aborts and a -1 case is used for initialization.

       re2c:state:nextlabel = 0;
              Used  when  -f  is  active to control whether the YYGETSTATE block is followed by a
              yyNext: label line.  Instead of using yyNext you can usually also use configuration
              startlabel  to  force  a  specific  start  label  or default to yy0 as start label.
              Instead of using a dedicated label it is often better to  separate  the  YYGETSTATE
              code from the actual scanner code by placing a /*!getstate:re2c*/ comment.

       re2c:cgoto:threshold = 9;
              When -g is active this value specifies the complexity threshold that triggers
              generation of jump tables rather than nested ifs and decision bitfields. The
              threshold is compared against a calculated estimate of the number of ifs needed,
              where every used bitmap divides the threshold by 2.

       re2c:yych:conversion = 0;
              When the input uses signed characters and -s or -b  switches  are  in  effect  re2c
              allows  one  to  automatically  convert to the unsigned character type that is then
              necessary for its internal single character. When this setting is zero or an  empty
              string  the conversion is disabled. Using a non zero number the conversion is taken
              from YYCTYPE. If that is given by an inplace  configuration  that  value  is  being
              used.  Otherwise  it  will  be  (YYCTYPE)  and changes to that configuration are no
              longer possible. When this setting is a string the braces must  be  specified.  Now
              assuming  your  input is a char * buffer and you are using above mentioned switches
              you can set YYCTYPE to unsigned char and this setting  to  either  1  or  (unsigned
              char).

       re2c:define:YYCONDTYPE = YYCONDTYPE ;
              Enumeration used for condition support with -c mode.

       re2c:define:YYCTXMARKER = YYCTXMARKER ;
              Allows one to overwrite the define YYCTXMARKER and thus avoid it by setting the
              value to the actual code needed.

       re2c:define:YYCTYPE = YYCTYPE ;
              Allows one to overwrite the define YYCTYPE and thus avoid it by setting the
              value to the actual code needed.

       re2c:define:YYCURSOR = YYCURSOR ;
              Allows one to overwrite the define YYCURSOR and thus avoid it by setting the
              value to the actual code needed.

       re2c:define:YYDEBUG = YYDEBUG ;
              Allows one to overwrite the define YYDEBUG and thus avoid it by setting the
              value to the actual code needed.

       re2c:define:YYGETCONDITION = YYGETCONDITION ;
              Substitution  for  YYGETCONDITION. Note that by default re2c generates braces after
              YYGETCONDITION. Set re2c:define:YYGETCONDITION:naked to non-zero to omit braces.

       re2c:define:YYGETCONDITION:naked = 0;
              Controls braces after YYGETCONDITION. If zero, braces are generated. If non-zero,
              braces are omitted.

       re2c:define:YYSETCONDITION = YYSETCONDITION ;
              Substitution  for  YYSETCONDITION.  Note that by default re2c generates argument in
              braces and semicolon after YYSETCONDITION. If you need to  make  YYSETCONDITION  an
              arbitrary  statement  rather  than  a call, set re2c:define:YYSETCONDITION:naked to
              non-zero and use re2c:define:YYSETCONDITION@cond to denote formal parameter  inside
              of YYSETCONDITION body.

       re2c:define:YYSETCONDITION@cond = @@ ;
              Any  occurrence  of  this  text  inside of YYSETCONDITION will be replaced with the
              actual argument.

       re2c:define:YYSETCONDITION:naked = 0;
              Controls argument in braces and semicolon after YYSETCONDITION. If zero, both
              argument and semicolon are generated. If non-zero, both argument and semicolon are
              omitted.

       re2c:define:YYGETSTATE = YYGETSTATE ;
              Substitution for YYGETSTATE. Note that  by  default  re2c  generates  braces  after
              YYGETSTATE. Set re2c:define:YYGETSTATE:naked to non-zero to omit braces.

       re2c:define:YYGETSTATE:naked = 0;
              Controls braces after YYGETSTATE. If zero, braces are generated. If non-zero, braces
              are omitted.

       re2c:define:YYSETSTATE = YYSETSTATE ;
              Substitution for YYSETSTATE. Note that by default re2c generates argument in braces
              and  semicolon  after  YYSETSTATE.  If  you  need  to  make YYSETSTATE an arbitrary
              statement rather than a call, set re2c:define:YYSETSTATE:naked to non-zero and  use
              re2c:define:YYSETSTATE@state to denote formal parameter inside of YYSETSTATE body.

       re2c:define:YYSETSTATE@state = @@ ;
              Any  occurrence  of this text inside of YYSETSTATE will be replaced with the actual
              argument.

       re2c:define:YYSETSTATE:naked = 0;
              Controls argument in braces and semicolon after YYSETSTATE. If zero, both argument
              and semicolon are generated. If non-zero, both argument and semicolon are omitted.

       re2c:define:YYLIMIT = YYLIMIT ;
              Allows one to overwrite the define YYLIMIT and thus avoid it by setting the
              value to the actual code needed.

       re2c:define:YYMARKER = YYMARKER ;
              Allows one to overwrite the define YYMARKER and thus avoid it by setting the
              value to the actual code needed.

       re2c:label:yyFillLabel = yyFillLabel ;
              Allows one to overwrite the name of the label yyFillLabel.

       re2c:label:yyNext = yyNext ;
              Allows one to overwrite the name of the label yyNext.

       re2c:variable:yyaccept = yyaccept;
              Allows one to overwrite the name of the variable yyaccept.

       re2c:variable:yybm = yybm ;
              Allows one to overwrite the name of the variable yybm.

       re2c:variable:yych = yych ;
              Allows one to overwrite the name of the variable yych.

       re2c:variable:yyctable = yyctable ;
              When  both  -c  and -g are active then re2c uses this variable to generate a static
              jump table for YYGETCONDITION.

       re2c:variable:yystable = yystable ;
              Deprecated.

       re2c:variable:yytarget = yytarget ;
              Allows one to overwrite the name of the variable yytarget.
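
       As a sketch of how several of these configurations might be combined, the input below
       replaces the interface macros with direct expressions on a scanner state structure. The
       structure layout, the function names lex and refill, and the return values are all
       assumptions made for this illustration. This is re2c input, not plain C.

```c
/* Hypothetical scanner state and refill function, not provided by re2c. */
typedef struct { unsigned char *cur, *mar, *lim; } input_t;
int refill(input_t *in, unsigned long need);   /* assumed to return 0 when input runs out */

static int lex(input_t *in)
{
    /*!re2c
        re2c:define:YYCTYPE      = "unsigned char";
        re2c:define:YYCURSOR     = in->cur;
        re2c:define:YYMARKER     = in->mar;
        re2c:define:YYLIMIT      = in->lim;
        re2c:define:YYFILL       = "if (!refill(in, @@)) return -1;";
        re2c:define:YYFILL:naked = 1;
        re2c:indent:string       = "    ";

        [0-9]+ { return 1; }
        *      { return 0; }
    */
}
```

       With re2c:define:YYFILL:naked set to non-zero, the YYFILL text is emitted as an
       arbitrary statement and @@ is replaced by the required number of characters.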

   REGULAR EXPRESSIONS
       "foo"  literal string "foo". ANSI-C escape sequences can be used.

       'foo'  literal string "foo" (characters [a-zA-Z] treated case-insensitively). ANSI-C escape
              sequences can be used.

       [xyz]  character class; in this case, regular expression matches either x, y, or z.

       [abj-oZ]
              character class with a range in it; matches a, b, any letter from j through o or Z.

       [^class]
              inverted character class.

       r \ s  match  any  r  which  isn't  s.  r  and  s must be regular expressions which can be
              expressed as character classes.

       r*     zero or more occurrences of r.

       r+     one or more occurrences of r.

       r?     optional r.

       (r)    r; parentheses are used to override precedence.

       r s    r followed by s (concatenation).

       r | s  either r or s (alternative).

       r / s  r but only if it is followed by s. Note that s is not part  of  the  matched  text.
              This  type of regular expression is called "trailing context". Trailing context can
              only be the end of a rule and not part of a named definition.

       r{n}   matches r exactly n times.

       r{n,}  matches r at least n times.

       r{n,m} matches r at least n times, but not more than m times.

       .      match any character except newline.

       name   matches the named definition specified by name, but only if -F is off. If -F is
              active then this behaves as if it were enclosed in double quotes and matches the
              string "name".

       Character  classes  and  string  literals  may  contain  octal  or  hexadecimal  character
       definitions  and the following set of escape sequences: \a, \b, \f, \n, \r, \t, \v, \\. An
       octal character is defined by a backslash followed by its three octal digits (e.g.  \377).
       Hexadecimal characters from 0 to 0xFF are defined by a backslash, a lowercase x and two
       hexadecimal digits (e.g. \x12). Hexadecimal characters from 0x100 to 0xFFFF are defined by
       a backslash, a lowercase u or an uppercase X and four hexadecimal digits (e.g. \u1234).
       Hexadecimal characters from 0x10000 to 0xFFFFFFFF are defined by a backslash, an uppercase
       U and eight hexadecimal digits (e.g. \U12345678).

       The only portable "any" rule is the default rule *.

SCANNER WITH STORABLE STATES

       When  the -f flag is specified, re2c generates a scanner that can store its current state,
       return to the caller, and later resume operations exactly where it left off.

       The default operation of re2c is a "pull" model, where the scanner asks  for  extra  input
       whenever it needs it. However, this mode of operation assumes that the scanner is the
       "owner" of the parsing loop, and that may not always be convenient.

       Typically, if there is a preprocessor ahead of the scanner in the stream, or for that
       matter any other procedural source of data, the scanner cannot "ask" for more data unless
       scanner and source live in separate threads.

       The -f flag is useful for just this situation: it lets users design scanners that work  in
       a  "push"  model,  i.e.  where data is fed to the scanner chunk by chunk. When the scanner
       runs out of data to consume, it just stores its state and returns to the caller. When more
       input data is fed to the scanner, it resumes operations exactly where it left off.

       Changes needed compared to the "pull" model:

       • The user has to supply the macros YYSETSTATE (state) and YYGETSTATE ().

       • The -f option inhibits declaration of yych and yyaccept, so the user has to declare
         them and also to save and restore them. In the example
         examples/push_model/push.re these are declared as fields of the (C++) class of which the
         scanner is a method, so they do not need to be saved/restored explicitly. For C they
         could e.g. be made macros that select fields from a structure passed in as a parameter.
         Alternatively, they could be declared as local variables, saved by YYFILL (n) when it
         decides to return and restored at entry to the function. Also, it could be more
         efficient to save the state from YYFILL (n) because YYSETSTATE (state) is called
         unconditionally. YYFILL (n), however, does not get state as a parameter, so
         YYSETSTATE (state) would have to store state in a local variable.

       • Modify YYFILL (n) to return (from the function calling it) if more input is needed.

       • Modify caller to recognise if more input is needed and respond appropriately.

       • The generated code will contain a switch block that is used to restore the last state
         by jumping behind the corresponding YYFILL (n) call. This code is automatically
         generated in the epilog of the first /*!re2c */ block. It is possible to trigger
         generation of the YYGETSTATE () block earlier by placing a /*!getstate:re2c*/ comment.
         This is especially useful when the scanner code should be wrapped inside a loop.

       Please see examples/push_model/push.re for an example of a "push" model scanner. The
       generated code can be tweaked using inplace configurations state:abort and state:nextlabel.
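
       The store-and-resume idea can be sketched without re2c at all. The code below is a
       hand-written stand-in for generated code, not re2c output: it sums runs of decimal digits
       that may be split across chunks. The struct scanner, its fields, and the functions
       scanner_init, scanner_feed and scan are hypothetical names chosen for this illustration;
       only YYGETSTATE and YYSETSTATE are interface names from this manual. Returning when the
       chunk is exhausted plays the role of a YYFILL (n) that returns to the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scanner context; the field names are illustrative only. */
struct scanner {
    int state;            /* stored state; -1 means "not started yet" */
    const char *cur;      /* next unread character of the current chunk */
    const char *lim;      /* one past the last character of the current chunk */
    unsigned long value;  /* number being scanned (kept across calls) */
    unsigned long sum;    /* sum of all completed numbers */
    int count;            /* how many numbers were completed */
};

/* The two macros the user supplies in the push model. */
#define YYGETSTATE()   (s->state)
#define YYSETSTATE(x)  (s->state = (x))

static void scanner_init(struct scanner *s)
{
    assert(s != NULL);
    s->state = -1;
    s->cur = s->lim = NULL;
    s->value = 0;
    s->sum = 0;
    s->count = 0;
}

/* Hand the scanner its next chunk; it must stay valid until scan() returns. */
static void scanner_feed(struct scanner *s, const char *chunk, size_t len)
{
    s->cur = chunk;
    s->lim = chunk + len;
}

/* Consume the current chunk, then return with the state stored in *s. */
static void scan(struct scanner *s)
{
    /* Restore the last state by jumping back to where scan() returned. */
    switch (YYGETSTATE()) {
    case 1:  goto in_number;
    default: goto between;          /* -1 (fresh start) or 0 */
    }

between:
    YYSETSTATE(0);
    if (s->cur == s->lim) return;   /* out of data: stands in for YYFILL(n) */
    if (*s->cur >= '0' && *s->cur <= '9') {
        s->value = 0;
        goto in_number;
    }
    ++s->cur;                       /* skip a non-digit */
    goto between;

in_number:
    YYSETSTATE(1);
    if (s->cur == s->lim) return;   /* out of data in the middle of a number */
    if (*s->cur >= '0' && *s->cur <= '9') {
        s->value = s->value * 10 + (unsigned long)(*s->cur - '0');
        ++s->cur;
        goto in_number;
    }
    s->sum += s->value;             /* a non-digit ends the number */
    ++s->count;
    goto between;
}
```

       A caller would alternate scanner_feed and scan for each chunk. In this sketch a number is
       only counted once a following non-digit has been seen, so a number split as "12" and "3 "
       is reported as 123.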

SCANNER WITH CONDITION SUPPORT

       You can precede regular expressions with a list of condition names when using the -c
       switch. In this case re2c generates a scanner block for each condition, and each of the
       generated blocks has its own precondition. The precondition is given by the interface
       define YYGETCONDITION() and must be of type YYCONDTYPE.

       There are two special rule types. First, the rules of the condition <*> are merged into all
       conditions (note that they have lower priority than the other rules of that condition).
       Second, the empty condition list allows one to provide a code block that does not have a
       scanner part, meaning that it does not allow any regular expression. The condition value
       referring to this special block is always the one with the enumeration value 0. This way
       the code of this special rule can be used to initialize a scanner. These rules are in no
       way necessary, but sometimes it is helpful to have a dedicated uninitialized condition
       state.

       Non-empty rules allow one to specify the new condition, which makes them transition rules.
       Besides generating calls for the define YYSETCONDITION, no other special code is
       generated.

       There is another kind of special rule that allows one to prepend code to the code blocks of
       all rules of a certain set of conditions, or to the code blocks of all rules. This can be
       helpful when some operation is common among rules. For instance, this can be used to store
       the length of the scanned string. These special setup rules start with an exclamation mark
       followed by either a list of conditions <! condition, ... > or a star <!*>. When re2c
       generates the code for a rule whose state does not have a setup rule and a starred setup
       rule is present, then that code will be used as setup code.
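
       The control flow described in this section can be pictured with a small hand-written
       dispatcher. The sketch below is not re2c output. The condition names COND_INIT, COND_CODE
       and COND_COMMENT, the struct lexer and the function lex are hypothetical; only
       YYGETCONDITION, YYSETCONDITION and YYCONDTYPE are interface names from this manual. Each
       case stands in for one per-condition scanner block, the case with value 0 plays the role of
       the empty condition list, and the rules_run increment plays the role of setup code shared
       by all rules.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical condition names; value 0 is the "uninitialized" condition. */
enum YYCONDTYPE { COND_INIT = 0, COND_CODE, COND_COMMENT };

struct lexer {
    enum YYCONDTYPE cond;   /* current condition */
    size_t rules_run;       /* bumped by the setup code shared by all rules */
    size_t code_chars;      /* characters seen outside of '#' comments */
};

#define YYGETCONDITION()   (lx->cond)
#define YYSETCONDITION(c)  (lx->cond = (c))

/* Scan a NUL-terminated string, treating '#' up to end of line as a comment. */
static void lex(struct lexer *lx, const char *p)
{
    assert(lx != NULL && p != NULL);
    for (;;) {
        switch (YYGETCONDITION()) {
        case COND_INIT:
            /* No regular expression here: only initialization code. */
            lx->rules_run = 0;
            lx->code_chars = 0;
            YYSETCONDITION(COND_CODE);
            break;
        case COND_CODE:
            if (*p == '\0') return;
            ++lx->rules_run;                    /* setup code */
            if (*p == '#')
                YYSETCONDITION(COND_COMMENT);   /* transition rule */
            else
                ++lx->code_chars;
            ++p;
            break;
        case COND_COMMENT:
            if (*p == '\0') return;
            ++lx->rules_run;                    /* setup code */
            if (*p == '\n')
                YYSETCONDITION(COND_CODE);      /* transition rule */
            ++p;
            break;
        }
    }
}
```

       Starting with cond set to COND_INIT lets the first call initialize the counters before any
       input is examined, which is the use of the value 0 condition suggested above.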

ENCODINGS

       re2c  supports  the  following encodings: ASCII (default), EBCDIC (-e), UCS-2 (-w), UTF-16
       (-x), UTF-32 (-u) and UTF-8 (-8).  See also inplace configuration re2c:flags.

       The following concepts should be clarified when talking about encodings. A code point is an
       abstract number that represents a single encoding symbol. A code unit is the smallest unit
       of memory used in the encoded text (it corresponds to one character in the input
       stream). One or more code units can be needed to represent a single code point, depending
       on the encoding. In a fixed-length encoding, each code point is represented with an equal
       number of code units. In a variable-length encoding, different code points can be
       represented with different numbers of code units.

       ASCII  is a fixed-length encoding. Its code space includes 0x100 code points,  from  0  to
              0xFF.  One  code  point is represented with exactly one 1-byte code unit, which has
              the same value as the code point. Size of YYCTYPE must be 1 byte.

       EBCDIC is a fixed-length encoding. Its code space includes 0x100 code points,  from  0  to
              0xFF.  One  code  point is represented with exactly one 1-byte code unit, which has
              the same value as the code point. Size of YYCTYPE must be 1 byte.

       UCS-2  is a fixed-length encoding. Its code space includes 0x10000 code points, from 0  to
              0xFFFF.  One code point is represented with exactly one 2-byte code unit, which has
              the same value as the code point. Size of YYCTYPE must be 2 bytes.

       UTF-16 is a variable-length encoding. Its code space includes  all  Unicode  code  points,
              from  0  to  0xD7FF and from 0xE000 to 0x10FFFF. One code point is represented with
              one or two 2-byte code units. Size of YYCTYPE must be 2 bytes.

       UTF-32 is a fixed-length encoding. Its code space includes all Unicode code points, from 0
              to  0xD7FF  and from 0xE000 to 0x10FFFF. One code point is represented with exactly
              one 4-byte code unit. Size of YYCTYPE must be 4 bytes.

       UTF-8  is a variable-length encoding. Its code space includes  all  Unicode  code  points,
              from 0 to 0xD7FF and from 0xE000 to 0x10FFFF. One code point is represented with a
              sequence of one, two, three or four 1-byte code units. Size of YYCTYPE must be 1
              byte.

       In Unicode, values in the range 0xD800 to 0xDFFF (surrogates) are not valid Unicode code
       points. Any encoded sequence of code units that would map to Unicode code points in the
       range 0xD800-0xDFFF is ill-formed. The user can control how re2c treats such ill-formed
       sequences with the --encoding-policy <policy> flag (see OPTIONS for full explanation).
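
       To make the difference between code points and code units concrete, the sketch below
       encodes one code point as UTF-8 and reports how many code units UTF-16 would need for it.
       The function names utf8_encode and utf16_length are hypothetical and are not part of re2c.
       They follow the standard UTF-8 and UTF-16 layouts and reject surrogates and values above
       0x10FFFF by returning 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode one code point as UTF-8. Returns the number of 1-byte code units
 * written (1 to 4), or 0 if cp is a surrogate or lies above 0x10FFFF. */
static size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    assert(out != NULL);
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0;   /* surrogate: not a code point */
    if (cp <= 0x7F) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp <= 0x7FF) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp <= 0xFFFF) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                                     /* beyond the Unicode range */
}

/* Number of 2-byte code units UTF-16 needs for cp, or 0 if cp is invalid. */
static size_t utf16_length(uint32_t cp)
{
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0;
    if (cp <= 0xFFFF) return 1;
    if (cp <= 0x10FFFF) return 2;                 /* needs a surrogate pair */
    return 0;
}
```

       For example, code point 0x20AC takes three code units in UTF-8 (0xE2 0x82 0xAC) but a
       single code unit in UTF-16, while 0x1F600 takes four code units in UTF-8 and two in
       UTF-16.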

       For some encodings, there are code units that never occur in a valid encoded stream (e.g.
       the 0xFF byte in UTF-8). If the generated scanner must check for invalid input, the only
       reliable way to do so is to use the default rule *. Note that the full range rule [^] won't
       catch invalid code units when a variable-length encoding is used ([^] means "all valid code
       points", while the default rule * means "all possible code units").

GENERIC INPUT API

       re2c  usually  operates  on  input  using  pointer-like  primitives  YYCURSOR,   YYMARKER,
       YYCTXMARKER and YYLIMIT.

       The generic input API (enabled with the --input custom switch) allows one to customize
       input operations. In this mode, re2c will express all operations on input in terms of the
       following primitives:

                           ┌────────────────┬──────────────────────────────────┐
                           │YYPEEK ()       │ get current input character      │
                           ├────────────────┼──────────────────────────────────┤
                           │YYSKIP ()       │ advance to the next character    │
                           ├────────────────┼──────────────────────────────────┤
                           │YYBACKUP ()     │ backup current input position    │
                           ├────────────────┼──────────────────────────────────┤
                           │YYBACKUPCTX ()  │ backup  current  input  position │
                           │                │ for trailing context             │
                           ├────────────────┼──────────────────────────────────┤
                           │YYRESTORE ()    │ restore current input position   │
                           ├────────────────┼──────────────────────────────────┤
                           │YYRESTORECTX () │ restore current  input  position │
                           │                │ for trailing context             │
                           ├────────────────┼──────────────────────────────────┤
                           │YYLESSTHAN (n)  │ check   if  less  than  n  input │
                           │                │ characters are left              │
                           └────────────────┴──────────────────────────────────┘
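
       As a rough illustration of how these primitives fit together, the sketch below defines
       them over an index-based buffer and uses them in two hand-written matchers. The matchers
       are stand-ins written for this example, not code generated by re2c, and the struct input
       with its fields and the functions match_a_or_abc and match_digits_before_px are
       hypothetical. The first matcher shows YYBACKUP and YYRESTORE falling back to a shorter
       match, the second shows YYBACKUPCTX and YYRESTORECTX for trailing context.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical input source: an array plus indices instead of raw pointers. */
struct input {
    const char *buf;
    size_t len;
    size_t pos;    /* current position */
    size_t mark;   /* saved by YYBACKUP */
    size_t ctx;    /* saved by YYBACKUPCTX */
};

/* The primitives, written against a variable named in that must be in scope. */
#define YYPEEK()        (in->buf[in->pos])
#define YYSKIP()        (++in->pos)
#define YYBACKUP()      (in->mark = in->pos)
#define YYRESTORE()     (in->pos = in->mark)
#define YYBACKUPCTX()   (in->ctx = in->pos)
#define YYRESTORECTX()  (in->pos = in->ctx)
#define YYLESSTHAN(n)   (in->len - in->pos < (size_t)(n))

/* Rules "abc" and "a": returns 2 for "abc", 1 for "a", 0 for no match. */
static int match_a_or_abc(struct input *in)
{
    assert(in != NULL);
    if (YYLESSTHAN(1) || YYPEEK() != 'a') return 0;
    YYSKIP();
    YYBACKUP();                      /* "a" already matches: remember its end */
    if (YYLESSTHAN(1) || YYPEEK() != 'b') goto short_match;
    YYSKIP();
    if (YYLESSTHAN(1) || YYPEEK() != 'c') goto short_match;
    YYSKIP();
    return 2;
short_match:
    YYRESTORE();                     /* fall back to the end of "a" */
    return 1;
}

/* Rule [0-9]+ / "px": returns the number of digits matched, or 0. */
static size_t match_digits_before_px(struct input *in)
{
    size_t start;
    assert(in != NULL);
    start = in->pos;
    while (!YYLESSTHAN(1) && YYPEEK() >= '0' && YYPEEK() <= '9') YYSKIP();
    if (in->pos == start) return 0;
    YYBACKUPCTX();                   /* the match itself ends here */
    if (YYLESSTHAN(2)) goto fail;
    if (YYPEEK() != 'p') goto fail;
    YYSKIP();
    if (YYPEEK() != 'x') goto fail;
    YYRESTORECTX();                  /* context seen: leave "px" unconsumed */
    return in->pos - start;
fail:
    in->pos = start;
    return 0;
}
```

       Because every access goes through the macros, the same matchers would work over a
       different input source once the macros are redefined.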

       A couple of useful links that provide some examples:

       1. http://skvadrik.github.io/aleph_null/posts/re2c/2015-01-13-input_model.html

       2. http://skvadrik.github.io/aleph_null/posts/re2c/2015-01-15-input_model_custom.html

SEE ALSO

       You can find more information about re2c  on  the  website:  http://re2c.org.   See  also:
       flex(1), lex(1), quex (http://quex.sourceforge.net).

AUTHORS

       Peter Bumbulis   peter@csg.uwaterloo.ca

       Brian Young      bayoung@acm.org

       Dan Nuffer       nuffer@users.sourceforge.net

       Marcus Boerger   helly@users.sourceforge.net

       Hartmut Kaiser   hkaiser@users.sourceforge.net

       Emmanuel Mogenet mgix@mgix.com

       Ulya Trofimovich skvadrik@gmail.com

VERSION INFORMATION

       This manpage describes re2c version 0.16, package date 21 Jan 2016.

                                                                                          RE2C(1)