Provided by: re2c_1.0.1-1_amd64

NAME

       re2c - convert regular expressions to C/C++ code

SYNOPSIS

       re2c [OPTIONS] FILE

DESCRIPTION

       re2c  is a lexer generator for C/C++. It finds regular expression specifications inside of C/C++ comments
       and replaces them with a hard-coded DFA. The user must supply some interface code in order to control and
       customize the generated DFA.

OPTIONS

       -? -h --help
              Show a short help screen.

       -b --bit-vectors
              Implies -s. Use bit vectors as well to try to coax better code out of the  compiler.  Most  useful
              for specifications with more than a few keywords (e.g., for most programming languages).

       -c --conditions
              Used for (f)lex-like condition support.

       -d --debug-output
              Creates a parser that dumps information about the current position and the state the parser is in.
              This  is useful for debugging parser issues and states. If you use this switch, you need to define
              a YYDEBUG macro, which will be called like a function  with  two  parameters:  void  YYDEBUG  (int
              state,  char  current).   The  first  parameter  receives the state or -1 and the second parameter
              receives the input at the current cursor.

       -D --emit-dot
              Emit Graphviz dot data, which can then be processed with e.g., dot -Tpng input.dot  >  output.png.
              Please note that scanners with many states may crash dot.

       -e --ecb
              Generate a parser that supports EBCDIC. The generated code can deal with any character up to 0xFF.
              In this mode, re2c assumes an input character size of 1 byte. This switch is incompatible with -w,
              -x, -u, and -8.

       -f --storable-state
              Generate a scanner with support for storable state.

       -F --flex-syntax
              Partial support for flex syntax. When this flag is active, named definitions must be surrounded by
              curly  braces  and  can  be defined without an equal sign and the terminating semicolon.  Instead,
              names are treated as direct double quoted strings.

       -g --computed-gotos
              Generate a scanner that utilizes GCC's computed-goto feature. That is, re2c generates jump  tables
              whenever  a  decision  is  of  certain complexity (e.g., a lot of if conditions would be otherwise
              necessary). This is only usable with compilers that support this feature.  Note that this  implies
              -b  and  that  the  complexity  threshold  can  be  configured  using  the cgoto:threshold inplace
              configuration.

       -i --no-debug-info
              Do not output #line information. This is useful when you want to use a CMS tool with re2c's output.
              You  might  want  to  do  this  if  you do not want to impose re2c as a build requirement for your
              source.

       -o OUTPUT --output=OUTPUT
              Specify the OUTPUT file.

       -r --reusable
              Allows reuse of scanner definitions with /*!use:re2c */ after /*!rules:re2c */.  In this mode,  no
              /*!re2c  */  block and exactly one /*!rules:re2c */ must be present.  The rules are saved and used
              by every /*!use:re2c */ block that follows.  These  blocks  can  contain  inplace  configurations,
              especially  re2c:flags:e, re2c:flags:w, re2c:flags:x, re2c:flags:u, and re2c:flags:8.  That way it
              is possible to create the same scanner multiple times for  different  character  types,  different
              input  mechanisms,  or  different  output  mechanisms.  The /*!use:re2c */ blocks can also contain
              additional rules that will be appended to the set of rules in /*!rules:re2c */.

       -s --nested-ifs
              Generate nested ifs for some switches. Many compilers need this assist to generate better code.

       -t HEADER --type-header=HEADER
              Create a HEADER file that contains types for the (f)lex-like condition support. This can  only  be
              activated when -c is in use.

       -T --tags
              Enable submatch extraction with tags.

       -P --posix-captures
              Enable submatch extraction with POSIX-style capturing groups.

       -u --unicode
              Generate  a  parser  that  supports  UTF-32.  The  generated  code can deal with any valid Unicode
              character up to 0x10FFFF. In this mode, re2c assumes an input character  size  of  4  bytes.  This
              switch is incompatible with -e, -w, -x, and -8. This implies -s.

       -v --version
              Show version information.

       -V --vernum
              Show the version as a number in the MMmmpp (Major, minor, patch) format.

       -w --wide-chars
              Generate  a  parser  that  supports  UCS-2.  The  generated  code  can deal with any valid Unicode
              character up to 0xFFFF.  In this mode, re2c assumes an input  character  size  of  2  bytes.  This
              switch is incompatible with -e, -x, -u, and -8. This implies -s.

       -x --utf-16
              Generate  a  parser  that  supports  UTF-16.  The  generated  code can deal with any valid Unicode
              character up to 0x10FFFF. In this mode, re2c assumes an input character  size  of  2  bytes.  This
              switch is incompatible with -e, -w, -u, and -8. This implies -s.

       -8 --utf-8
              Generate  a  parser  that  supports  UTF-8.  The  generated  code  can deal with any valid Unicode
              character up to 0x10FFFF. In this mode, re2c assumes an input  character  size  of  1  byte.  This
              switch is incompatible with -e, -w, -x, and -u.

       --case-insensitive
              Makes   all  strings  case  insensitive.  This  makes  "-quoted  expressions  behave  as  '-quoted
              expressions.

       --case-inverted
              Invert the meaning of single and double quoted strings. With this switch, single quotes  are  case
              sensitive and double quotes are case insensitive.

       --no-generation-date
              Suppress date output in the generated file.

       --no-lookahead
              Use  TDFA(0)  instead  of  TDFA(1).   This  option only has effect with --tags or --posix-captures
              options.

       --no-optimize-tags
              Suppress optimization of tag variables (mostly used for debugging).

       --no-version
              Suppress version output in the generated file.

       --encoding-policy POLICY
              Specify how re2c must treat Unicode surrogates. POLICY can be one of the  following:  fail  (abort
              with  an  error when a surrogate is encountered), substitute (silently replace surrogates with the
              error code point 0xFFFD), ignore (treat surrogates  as  normal  code  points).  By  default,  re2c
              ignores  surrogates  (for  backward  compatibility).  The  Unicode  standard  says that standalone
              surrogates are invalid code points, but different libraries and programs treat them differently.

       --input INPUT
              Specify re2c's input API. INPUT can be either default or custom.

       -S --skeleton
              Instead of embedding re2c-generated code into C/C++ source, generate a self-contained program  for
              the same DFA. Most useful for correctness and performance testing.

       --empty-class POLICY
              What  to  do  if  the  user  uses  an  empty  character class. POLICY can be one of the following:
              match-empty (match  empty  input:  pretty  illogical,  but  this  is  the  default  for  backwards
              compatibility  reasons),  match-none (fail to match on any input), error (compilation error). Note
              that  there  are  various  ways  to  construct   an   empty   class,   e.g.,   [],   [^\x00-\xFF],
              [\x00-\xFF][\x00-\xFF].

       --dfa-minimization <table | moore>
              The  internal  algorithm  used  by  re2c  to minimize the DFA (defaults to moore).  Both the table
              filling algorithm and the Moore algorithm should produce the same DFA (up to  states  relabeling).
              The table filling algorithm is much simpler and slower; it serves as a reference implementation.

       --eager-skip
              This  option  controls  when  the  generated  lexer  advances  to  the next input symbol (that is,
              increments YYCURSOR or invokes YYSKIP).  By default this happens  after  transition  to  the  next
              state,  but --eager-skip option allows one to override default behavior and advance input position
              immediately after reading input symbol.  This option is implied by --no-lookahead.

       --dump-nfa
              Generate .dot representation of NFA and dump it on stderr.

       --dump-dfa-raw
              Generate .dot representation of DFA under construction and dump it on stderr.

       --dump-dfa-det
              Generate .dot representation of DFA immediately after determinization and dump it on stderr.

       --dump-dfa-tagopt
              Generate .dot representation of DFA after tag optimizations and dump it on stderr.

       --dump-dfa-min
              Generate .dot representation of DFA after minimization and dump it on stderr.

       --dump-adfa
              Generate .dot representation of DFA after tunneling and dump it on stderr.

       -1 --single-pass
              Deprecated. Does nothing (single pass is the default now).

       -W     Turn on all warnings.

       -Werror
              Turn warnings into errors. Note that this option alone doesn't  turn  on  any  warnings;  it  only
              affects those warnings that have been turned on so far or will be turned on later.

       -W<warning>
              Turn on a warning.

       -Wno-<warning>
              Turn off a warning.

       -Werror-<warning>
              Turn on a warning and treat it as an error (this implies -W<warning>).

       -Wno-error-<warning>
              Don't treat this particular warning as an error. This doesn't turn off the warning itself.

       -Wcondition-order
              Warn if the generated program makes implicit assumptions about condition numbering. You should use
              either  the  -t,  --type-header  option  or the /*!types:re2c*/ directive to generate a mapping of
              condition names to numbers and then use the autogenerated condition names.

       -Wempty-character-class
              Warn if a regular expression contains an empty character class. Rationally,  trying  to  match  an
              empty  character class makes no sense: it should always fail. However, for backwards compatibility
              reasons, re2c  allows  empty  character  classes  and  treats  them  as  empty  strings.  Use  the
              --empty-class option to change the default behavior.

       -Wmatch-empty-string
              Warn if a regular expression in a rule is nullable (matches an empty string). If the DFA runs in a
              loop and an empty match is unintentional (the input position is not advanced manually), the lexer
              may get stuck in an infinite loop.

       -Wswapped-range
              Warn if the lower bound of a range is greater than its upper bound. The  default  behavior  is  to
              silently swap the range bounds.

       -Wundefined-control-flow
              Warn  if  some  input  strings  cause undefined control flow in the lexer (the faulty patterns are
              reported). This is the most dangerous and most common mistake. It can be easily  fixed  by  adding
              the  default  rule  (*)  (this  rule  has the lowest priority, matches any code unit, and consumes
              exactly one code unit).

       -Wunreachable-rules
              Warn about rules that are shadowed by other rules and will never match.

       -Wuseless-escape
              Warn if a symbol is escaped when it shouldn't be.  By default, re2c silently ignores such escapes,
              but this may as well indicate a typo or error in the escape sequence.

       -Wnondeterministic-tags
              Warn if a tag has n-th degree of nondeterminism, where n is greater than 1.
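
       As an illustration of the -d option described above, the following is a minimal sketch of a YYDEBUG
       definition. The helper debug_trace and the counter debug_calls are hypothetical names chosen for this
       example; only the macro name YYDEBUG and its two parameters come from this manual.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical counter, present only so the effect can be observed. */
static int debug_calls = 0;

/* Hypothetical helper. state is the DFA state number or -1,
 * current is the input symbol at the cursor. */
static void debug_trace(int state, char current)
{
    assert(state >= -1);
    ++debug_calls;
    fprintf(stderr, "state=%d char=0x%02x\n", state, (unsigned char)current);
}

/* The generated code invokes YYDEBUG like a function with two arguments. */
#define YYDEBUG(state, current) debug_trace((state), (current))
```

       Writing the trace to stderr keeps it separate from any output the scanner itself produces.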

INTERFACE CODE

       The user must supply interface code either in the form of C/C++ code (macros, functions, variables, etc.)
       or in the form of INPLACE CONFIGURATIONS.  Which symbols must be defined and which are  optional  depends
       on the particular use case.
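
       The following is a minimal sketch of interface code for a buffered scanner, assuming the pointer-based
       symbols YYCTYPE, YYCURSOR, YYMARKER, YYLIMIT and YYFILL (n) described in this section. The struct
       scanner, its fields, BUF_SIZE, and the helpers scanner_init, scanner_fill and ensure are hypothetical
       names invented for the example. An in-memory string stands in for a file, and ensure imitates the
       check that generated code is expected to perform before it reads n symbols.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 16                 /* hypothetical, deliberately tiny */

typedef unsigned char YYCTYPE;

/* Hypothetical scanner state; field names are illustrative only. */
struct scanner {
    const char *src;                /* in-memory source standing in for a file */
    size_t src_len, src_pos;
    YYCTYPE buf[BUF_SIZE];
    YYCTYPE *cur, *mar, *lim, *tok; /* cursor, marker, limit, token start */
};

static void scanner_init(struct scanner *s, const char *src)
{
    s->src = src;
    s->src_len = strlen(src);
    s->src_pos = 0;
    s->cur = s->mar = s->lim = s->tok = s->buf;   /* buffer starts empty */
}

/* Tries to make at least `need` symbols available at s->cur.
 * Returns 0 on success, -1 if that is not possible. */
static int scanner_fill(struct scanner *s, size_t need)
{
    size_t shift, used, room, n;

    assert(s->tok <= s->cur && s->cur <= s->lim);
    shift = (size_t)(s->tok - s->buf);            /* already consumed prefix */
    used = (size_t)(s->lim - s->tok);             /* bytes that must be kept */
    memmove(s->buf, s->tok, used);
    s->cur -= shift; s->mar -= shift; s->lim -= shift; s->tok -= shift;

    room = BUF_SIZE - used;
    n = s->src_len - s->src_pos;
    if (n > room) n = room;
    memcpy(s->lim, s->src + s->src_pos, n);
    s->src_pos += n;
    s->lim += n;
    return (size_t)(s->lim - s->cur) >= need ? 0 : -1;
}

#define YYCURSOR  s->cur
#define YYMARKER  s->mar
#define YYLIMIT   s->lim
#define YYFILL(n) do { if (scanner_fill(s, (n)) != 0) return -1; } while (0)

/* Hand-written stand-in for the refill check in generated code. */
static int ensure(struct scanner *s, size_t n)
{
    if ((size_t)(YYLIMIT - YYCURSOR) < n) YYFILL(n);
    return 0;
}
```

       Shifting the unconsumed data to the start of the buffer and adjusting all pointers by the same amount
       is one common way to satisfy the requirement that YYFILL (n) keeps YYCURSOR, YYLIMIT and YYMARKER
       consistent; other buffering strategies are equally possible.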

       YYBACKUP ()
              Backup current input position (used only with generic API).

       YYBACKUPCTX ()
              Backup current input position for trailing context (used only with generic API).

       YYCONDTYPE
              In  -c  mode,  you can use -t to generate a file that contains the enumeration used as conditions.
              Each of the values refers to a condition of a rule set.

       YYCTXMARKER
              l-value of type YYCTYPE *.  The generated code saves trailing context backtracking information  in
              YYCTXMARKER.  The  user  only  needs to define this macro if a scanner specification uses trailing
              context in one or more of its regular expressions.

       YYCTYPE
              Type used to hold an input symbol (code unit). Usually char or unsigned char for ASCII, EBCDIC  or
              UTF-8, or unsigned short for UTF-16 or UCS-2, or unsigned int for UTF-32.

       YYCURSOR
              l-value of type YYCTYPE * that points to the current input symbol.  The  generated  code  advances
              YYCURSOR  as symbols are matched. On entry, YYCURSOR is assumed to point to the first character of
              the current token. On exit, YYCURSOR will point to the first character of the following token.

       YYDEBUG (state, current)
              This is only needed if the -d flag was specified. It allows easy debugging of the generated parser
              by calling a user defined function for  every  state.  The  function  should  have  the  following
              signature:  void  YYDEBUG  (int state, char current). The first parameter receives the state or -1
              and the second parameter receives the input at the current cursor.

       YYFILL (n)
              The generated code "calls" YYFILL (n) when the buffer needs (re)filling: at least n additional
              characters  should  be  provided.  YYFILL  (n)  should  adjust  YYCURSOR,  YYLIMIT,  YYMARKER, and
              YYCTXMARKER as needed. Note that for typical programming languages n will be  the  length  of  the
              longest  keyword  plus  one.  The  user  can place a comment of the form /*!max:re2c*/ to insert a
              YYMAXFILL define set to the maximum length value.

       YYGETCONDITION ()
              This define is used to get the condition prior to entering the scanner  code  when  using  the  -c
              switch. The value must be initialized with a value from the YYCONDTYPE enumeration type.

       YYGETSTATE ()
              The user only needs to define this macro if the -f flag was specified. In that case, the generated
              code  "calls"  YYGETSTATE  ()  at  the  very beginning of the scanner in order to obtain the saved
              state. YYGETSTATE () must return a signed integer. The value must be either  -1,  indicating  that
              the  scanner  is entered for the first time, or a value previously saved by YYSETSTATE (s). In the
              second case, the scanner will resume operations right after where the last YYFILL (n) was called.

       YYLESSTHAN (n)
              Check if less than n input characters are left (used only with generic API).

       YYLIMIT
              An expression of type YYCTYPE * that marks the end of the buffer (YYLIMIT[-1] is the last character
              in the buffer). The generated code repeatedly compares YYCURSOR to YYLIMIT to determine  when  the
              buffer needs (re)filling.

       YYMARKER
              An l-value of type YYCTYPE *.  The generated code saves backtracking information in YYMARKER. Some
              simple scanners might not use this.

       YYMTAGP (t)
              Append current input position to the history of tag t.

       YYMTAGN (t)
              Append default value to the history of tag t.

       YYMAXFILL
              This will be automatically defined by /*!max:re2c*/ blocks as explained above.

       YYMAXNMATCH
              This will be automatically defined by /*!maxnmatch:re2c*/.

       YYPEEK ()
              Get current input character (used only with generic API).

       YYRESTORE ()
              Restore input position (used only with generic API).

       YYRESTORECTX ()
              Restore input position from the value of trailing context (used only with generic API).

       YYRESTORETAG (t)
              Restore input position from the value of tag t (used only with generic API).

       YYSETCONDITION (c)
              This  define  is used to set the condition in transition rules. This is only being used when -c is
              active and transition rules are being used.

       YYSETSTATE (s)
              The user only needs to define this macro if the -f flag was specified. In that case, the generated
              code "calls" YYSETSTATE just before calling YYFILL (n). The parameter to YYSETSTATE  is  a  signed
              integer  that  uniquely identifies the specific instance of YYFILL (n) that is about to be called.
              Should the user wish to save the state of the scanner and have YYFILL (n) return  to  the  caller,
              all they have to do is store that unique identifier in a variable. Later, when the scanner is called
              again, it will call YYGETSTATE () and resume execution right where it left off. The generated code
              will contain both YYSETSTATE (s) and YYGETSTATE even if YYFILL (n) is disabled.

       YYSKIP ()
              Advance input position to the next character (used only with generic API).

       YYSTAGP (t)
              Save current input position to tag t (used only with generic API).

       YYSTAGN (t)
              Save default value to tag t (used only with generic API).
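
       The primitives marked "used only with generic API" can be defined over any input abstraction. The
       sketch below assumes an offset-based struct input (a hypothetical type) and a variable named in that is
       in scope wherever the macros are used. The function skip_digits is hand-written and only imitates how
       generated code might combine the primitives; it is not re2c output.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical input abstraction that tracks offsets instead of pointers. */
struct input {
    const char *data;
    size_t len;
    size_t pos;     /* current input position */
    size_t saved;   /* position stored by YYBACKUP */
};

#define YYPEEK()      (in->pos < in->len ? in->data[in->pos] : '\0')
#define YYSKIP()      (++in->pos)
#define YYBACKUP()    (in->saved = in->pos)
#define YYRESTORE()   (in->pos = in->saved)
#define YYLESSTHAN(n) (in->len - in->pos < (size_t)(n))

/* Consumes a run of decimal digits and returns how many were consumed. */
static size_t skip_digits(struct input *in)
{
    size_t start;

    assert(in != NULL);
    start = in->pos;
    while (!YYLESSTHAN(1) && YYPEEK() >= '0' && YYPEEK() <= '9')
        YYSKIP();
    return in->pos - start;
}
```

       Because the macros never expose a pointer, the same scanner specification could in principle be paired
       with a different struct input, for example one that reads from a stream.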

SYNTAX

       Code for re2c consists of a set of RULES, NAMED DEFINITIONS, CODE and INPLACE CONFIGURATIONS.

   RULES
       Each rule consists of a regular expression (see REGULAR EXPRESSIONS) accompanied by a block of C/C++
       code  which is to be executed when the associated regular expression is matched. You can either start the
       code with an opening curly brace or the sequence :=. If you use an opening curly brace, re2c  will  count
       brace  depth  and  stop  looking for code automatically. Otherwise, curly braces are not allowed and re2c
       stops looking for code at the first line that does not begin  with  whitespace.  If  two  or  more  rules
       overlap, the first rule is preferred.

       There is one special rule that can be used instead of a regular expression: the default rule *. Note that
       the default rule * differs from [^]: the default rule has the lowest  priority,  matches  any  code  unit
       (either valid or invalid) and always consumes exactly one character.  [^], on the other hand, matches any
       valid  code  point  (not  the  same  as a code unit) and can consume multiple code units. In fact, when a
       variable-length encoding is used, * is the only possible way to match an invalid input character.
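
       The difference between a code unit and a code point can be made concrete for UTF-8. The sketch below
       is not taken from re2c; the function names are hypothetical and the validation is strict UTF-8, whereas
       re2c's handling of surrogates depends on --encoding-policy. The function utf8_code_point_len returns
       how many code units one valid code point occupies, which is what a rule such as [^] would consume, and
       returns 0 for input that only the default rule * could match, one code unit at a time.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if b is a UTF-8 continuation byte (10xxxxxx). */
static int is_cont(unsigned char b)
{
    return (b & 0xC0) == 0x80;
}

/* Returns the number of code units (bytes) that form one valid UTF-8
 * encoded code point at p, given n available bytes, or 0 if the input
 * does not start with a valid sequence. */
static size_t utf8_code_point_len(const unsigned char *p, size_t n)
{
    assert(p != NULL);
    if (n == 0) return 0;
    if (p[0] < 0x80) return 1;
    if (p[0] >= 0xC2 && p[0] <= 0xDF)
        return (n >= 2 && is_cont(p[1])) ? 2 : 0;
    if (p[0] >= 0xE0 && p[0] <= 0xEF) {
        if (n < 3 || !is_cont(p[1]) || !is_cont(p[2])) return 0;
        if (p[0] == 0xE0 && p[1] < 0xA0) return 0;   /* overlong form */
        if (p[0] == 0xED && p[1] > 0x9F) return 0;   /* surrogate range */
        return 3;
    }
    if (p[0] >= 0xF0 && p[0] <= 0xF4) {
        if (n < 4 || !is_cont(p[1]) || !is_cont(p[2]) || !is_cont(p[3]))
            return 0;
        if (p[0] == 0xF0 && p[1] < 0x90) return 0;   /* overlong form */
        if (p[0] == 0xF4 && p[1] > 0x8F) return 0;   /* above U+10FFFF */
        return 4;
    }
    return 0;                                        /* invalid lead byte */
}
```

       For the two-byte sequence C3 A9 the function reports 2 code units, while for a stray FF byte it reports
       0, which corresponds to the case where only * can make progress.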

       In general, all rules have the form:
          regular-expression-or-* code

       If -c is active, then each regular expression is preceded by a list of comma-separated  condition  names.
       Besides  the  normal  naming  rules,  there  are  two  special  cases: <*> (these rules are merged to all
       conditions) and <> (these rules cannot have an associated regular expression; their code is merged to all
       actions). Non-empty rules may furthermore specify the new condition. In that case, re2c will generate the
       necessary code to change the condition automatically. Rules can use :=> as a  shortcut  to  automatically
       generate code that not only sets the new condition state but also continues execution with the new state.
       A  shortcut  rule  should not be used in a loop where there is code between the start of the loop and the
       re2c block unless re2c:cond:goto is changed to continue. If some code is needed before all rules  (though
       not before simple jumps),  you can insert it with <!> pseudo-rules.
          <condition-list-or-*> regular-expression-or-* code

          <condition-list-or-*> regular-expression-or-* => condition code

          <condition-list-or-*> regular-expression-or-* :=> condition

          <> code

          <> => condition code

          <> :=> condition

          <!condition-list> code

          <!> code
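
       The condition support relies on user-supplied YYGETCONDITION () and YYSETCONDITION (c). The sketch
       below assumes two hypothetical conditions named INIT and COMMENT, the default yyc enum prefix, and a
       hypothetical struct lexer referenced through a variable named lx. The function on_token is hand-written
       and only imitates the effect of two rules of the form <INIT> "<!--" => COMMENT and
       <COMMENT> "-->" => INIT; it is not re2c output.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical condition names; a header produced with -t is expected to
 * define a comparable enumeration. */
enum YYCONDTYPE { yycINIT, yycCOMMENT };

/* Hypothetical lexer state holding the current condition. */
struct lexer {
    enum YYCONDTYPE cond;
};

#define YYGETCONDITION()  (lx->cond)
#define YYSETCONDITION(c) (lx->cond = (c))

/* Switches condition when the token that opens or closes a comment is seen. */
static void on_token(struct lexer *lx, const char *tok)
{
    assert(lx != NULL && tok != NULL);
    switch (YYGETCONDITION()) {
    case yycINIT:
        if (strcmp(tok, "<!--") == 0) YYSETCONDITION(yycCOMMENT);
        break;
    case yycCOMMENT:
        if (strcmp(tok, "-->") == 0) YYSETCONDITION(yycINIT);
        break;
    }
}
```

       Using the enumeration names rather than literal numbers avoids the implicit assumptions about
       condition numbering that -Wcondition-order warns about.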

   NAMED DEFINITIONS
       Named definitions are of the form:
          name = regular-expression;

       If -F is active, then named definitions are also of the form:
          name { regular-expression }

   INPLACE CONFIGURATIONS
       re2c:cgoto:threshold = 9;
              When  -g  is active, this value specifies the complexity threshold that triggers the generation of
              jump tables rather than nested ifs and decision bitfields. The threshold  is  compared  against  a
              calculated estimation of ifs needed where every used bitmap divides the threshold by 2.

       re2c:cond:divider = '/* *********************************** */';
              Allows  one  to  customize the divider for condition blocks. You can use @@ to put the name of the
              condition or customize the placeholder using re2c:cond:divider@cond.

       re2c:cond:divider@cond = @@;
              Specifies the placeholder that will be replaced with the condition name in re2c:cond:divider.

       re2c:condenumprefix = yyc;
              Allows one to specify the prefix used for condition values. That is, the text to be  prepended  to
              condition enum values in the generated output file.

       re2c:cond:goto@cond = @@;
              Specifies the placeholder that will be replaced with the condition label in re2c:cond:goto.

       re2c:cond:goto = 'goto @@;';
              Allows one to customize the condition goto statements used with :=> style rules. You can use @@ to
              put the name of the condition or customize the placeholder using re2c:cond:goto@cond. You can also
              change this to continue;, which would allow you to continue with the next loop cycle including any
              code between your loop start and your re2c block.

       re2c:condprefix = yyc;
              Allows  one  to specify the prefix used for condition labels. That is, the text to be prepended to
              condition labels in the generated output file.

       re2c:define:YYBACKUPCTX = 'YYBACKUPCTX';
              Replaces YYBACKUPCTX identifier with the specified string.

       re2c:define:YYBACKUP = 'YYBACKUP';
              Replaces YYBACKUP identifier with the specified string.

       re2c:define:YYCONDTYPE = 'YYCONDTYPE';
              Enumeration used for condition support with -c mode.

       re2c:define:YYCTXMARKER = 'YYCTXMARKER';
              Replaces the YYCTXMARKER placeholder with the specified identifier.

       re2c:define:YYCTYPE = 'YYCTYPE';
              Replaces the YYCTYPE placeholder with the specified type.

       re2c:define:YYCURSOR = 'YYCURSOR';
              Replaces the YYCURSOR placeholder with the specified identifier.

       re2c:define:YYDEBUG = 'YYDEBUG';
              Replaces the YYDEBUG placeholder with the specified identifier.

       re2c:define:YYFILL@len = '@@';
              Any occurrence of this text inside of a YYFILL call will be replaced with the actual argument.

       re2c:define:YYFILL:naked = 0;
              Controls the argument in the parentheses after YYFILL and the following semicolon. If zero, the
              argument is generated unless re2c:yyfill:parameter is set to zero, and the semicolon is generated
              unconditionally. If non-zero, both the argument and the semicolon are omitted.

       re2c:define:YYFILL = 'YYFILL';
              Define a substitution for YYFILL. Note that by default, re2c generates an argument in  parentheses
              and  a  semicolon  after  YYFILL.  If you need to make YYFILL an arbitrary statement rather than a
              call, set re2c:define:YYFILL:naked to a non-zero value and use  re2c:define:YYFILL@len  to  set  a
              placeholder for the formal parameter inside of your YYFILL body.

       re2c:define:YYGETCONDITION:naked = 0;
              Controls the parentheses after YYGETCONDITION. If zero, the parentheses are generated. If non-zero,
              the parentheses are omitted.

       re2c:define:YYGETCONDITION = 'YYGETCONDITION';
              Substitution  for  YYGETCONDITION.  Note  that  by  default,  re2c  generates  parentheses   after
              YYGETCONDITION. Set re2c:define:YYGETCONDITION:naked to non-zero to omit the parentheses.

       re2c:define:YYGETSTATE:naked = 0;
              Controls the parentheses that follow YYGETSTATE. If zero, the parentheses are generated. If
              non-zero, they are omitted.

       re2c:define:YYGETSTATE = 'YYGETSTATE';
              Substitution for YYGETSTATE. Note that by default, re2c generates  parentheses  after  YYGETSTATE.
              Set re2c:define:YYGETSTATE:naked to non-zero to omit the parentheses.

       re2c:define:YYLESSTHAN = 'YYLESSTHAN';
              Replaces YYLESSTHAN identifier with the specified string.

       re2c:define:YYLIMIT = 'YYLIMIT';
              Replaces the YYLIMIT placeholder with the specified identifier.

       re2c:define:YYMARKER = 'YYMARKER';
              Replaces the YYMARKER placeholder with the specified identifier.

       re2c:define:YYMTAGN = 'YYMTAGN';
              Replaces YYMTAGN identifier with the specified string.

       re2c:define:YYMTAGP = 'YYMTAGP';
              Replaces YYMTAGP identifier with the specified string.

       re2c:define:YYPEEK = 'YYPEEK';
              Replaces YYPEEK identifier with the specified string.

       re2c:define:YYRESTORECTX = 'YYRESTORECTX';
              Replaces YYRESTORECTX identifier with the specified string.

       re2c:define:YYRESTORE = 'YYRESTORE';
              Replaces YYRESTORE identifier with the specified string.

       re2c:define:YYRESTORETAG = 'YYRESTORETAG';
              Replaces YYRESTORETAG identifier with the specified string.

       re2c:define:YYSETCONDITION@cond = '@@';
              Any occurrence of this text inside of YYSETCONDITION will be replaced with the actual argument.

       re2c:define:YYSETCONDITION:naked = 0;
              Controls the argument in parentheses and the semicolon after YYSETCONDITION. If zero, both the
              argument and the semicolon are generated. If non-zero, both the argument and the semicolon are
              omitted.

       re2c:define:YYSETCONDITION = 'YYSETCONDITION';
              Substitution  for  YYSETCONDITION. Note that by default, re2c generates an argument in parentheses
              followed by semicolon after YYSETCONDITION. If  you  need  to  make  YYSETCONDITION  an  arbitrary
              statement   rather   than  a  call,  set  re2c:define:YYSETCONDITION:naked  to  non-zero  and  use
              re2c:define:YYSETCONDITION@cond to denote the formal parameter inside of the YYSETCONDITION body.

       re2c:define:YYSETSTATE:naked = 0;
              Controls the argument in parentheses and the semicolon after YYSETSTATE. If zero, both the argument
              and the semicolon are generated. If non-zero, both the argument and the semicolon are omitted.

       re2c:define:YYSETSTATE@state = '@@';
              Any occurrence of this text inside of YYSETSTATE will be replaced with the actual argument.

       re2c:define:YYSETSTATE = 'YYSETSTATE';
              Substitution  for  YYSETSTATE.  Note  that  by  default, re2c generates an argument in parentheses
              followed by a semicolon after YYSETSTATE. If you need to make YYSETSTATE  an  arbitrary  statement
              rather    than    a    call,    set    re2c:define:YYSETSTATE:naked    to    non-zero    and   use
              re2c:define:YYSETSTATE@state to denote the formal parameter inside of your YYSETSTATE body.

       re2c:define:YYSKIP = 'YYSKIP';
              Replaces YYSKIP identifier with the specified string.

       re2c:define:YYSTAGN = 'YYSTAGN';
              Replaces YYSTAGN identifier with the specified string.

       re2c:define:YYSTAGP = 'YYSTAGP';
              Replaces YYSTAGP identifier with the specified string.

       re2c:flags:8 or re2c:flags:utf-8
              Same as -8 --utf-8 command-line option.

       re2c:flags:b or re2c:flags:bit-vectors
              Same as -b --bit-vectors command-line option.

       re2c:flags:case-insensitive = 0;
              Same as --case-insensitive command-line option.

       re2c:flags:case-inverted = 0;
              Same as --case-inverted command-line option.

       re2c:flags:d or re2c:flags:debug-output
              Same as -d --debug-output command-line option.

       re2c:flags:dfa-minimization = 'moore';
              Same as --dfa-minimization command-line option.

       re2c:flags:eager-skip = 0;
              Same as --eager-skip command-line option.

       re2c:flags:e or re2c:flags:ecb
              Same as -e --ecb command-line option.

       re2c:flags:empty-class = 'match-empty';
              Same as --empty-class command-line option.

       re2c:flags:encoding-policy = 'ignore';
              Same as --encoding-policy command-line option.

       re2c:flags:g or re2c:flags:computed-gotos
              Same as -g --computed-gotos command-line option.

       re2c:flags:i or re2c:flags:no-debug-info
              Same as -i --no-debug-info command-line option.

       re2c:flags:input = 'default';
              Same as --input command-line option.

       re2c:flags:lookahead = 1;
              Same as inverted --no-lookahead command-line option.

       re2c:flags:optimize-tags = 1;
              Same as inverted --no-optimize-tags command-line option.

       re2c:flags:P or re2c:flags:posix-captures
              Same as -P --posix-captures command-line option.

       re2c:flags:s or re2c:flags:nested-ifs
              Same as -s --nested-ifs command-line option.

       re2c:flags:T or re2c:flags:tags
              Same as -T --tags command-line option.

       re2c:flags:u or re2c:flags:unicode
              Same as -u --unicode command-line option.

       re2c:flags:w or re2c:flags:wide-chars
              Same as -w --wide-chars command-line option.

       re2c:flags:x or re2c:flags:utf-16
              Same as -x --utf-16 command-line option.

       re2c:indent:string = '\t';
              Specifies the string to use for indentation. Requires a string that should contain only whitespace
              unless you need something else for external tools. The easiest way to specify spaces is to enclose
              them in single or double quotes.  If you do  not want any indentation at all, you can  simply  set
              this to ''.

       re2c:indent:top = 0;
              Specifies the minimum amount of indentation to use. Requires a numeric value greater than or equal
              to zero.

       re2c:labelprefix = 'yy';
              Allows one to change the prefix of numbered labels. The default is yy. Can be set to any string
              that is valid in a label name.

       re2c:label:yyFillLabel = 'yyFillLabel';
              Overrides the name of the yyFillLabel label.

       re2c:label:yyNext = 'yyNext';
              Overrides the name of the yyNext label.

       re2c:startlabel = 0;
              If set to a non-zero integer, then the start label of the next scanner block will be generated
              even  if  it  isn't used by the scanner itself. Otherwise, the normal yy0-like start label is only
              generated if needed. If set to a text value, then  a  label  with  that  text  will  be  generated
              regardless  of  whether  the normal start label is used or not. This setting is reset to 0 after a
              start label has been generated.

       re2c:state:abort = 0;
              When not zero and the -f switch is active, then the YYGETSTATE block will contain a  default  case
              that aborts and a -1 case will be used for initialization.

       re2c:state:nextlabel = 0;
              Used  when  -f  is  active  to control whether the YYGETSTATE block is followed by a yyNext: label
              line.  Instead of using yyNext, you can usually also  use  configuration  startlabel  to  force  a
              specific start label or default to yy0 as a start label. Instead of using a dedicated label, it is
              often  better  to  separate  the  YYGETSTATE  code  from  the  actual  scanner  code  by placing a
              /*!getstate:re2c*/ comment.

       re2c:tags:expression = '@@';
              Allows one to customize the way re2c addresses tag variables: by default it emits  expressions  of
              the  form  yyt<N>,  but  this  might  be  inconvenient if tag variables are defined as fields in a
              struct,  or  for  any  other   reason   require   special   accessors.    For   example,   setting
              re2c:tags:expression = p->@@ will result in p->yyt<N>.

       re2c:tags:prefix = 'yyt';
              Allows one to override the prefix of tag variables.

       re2c:variable:yyaccept = 'yyaccept';
              Overrides the name of the yyaccept variable.

       re2c:variable:yybm = 'yybm';
              Overrides the name of the yybm variable.

       re2c:variable:yych = 'yych';
              Overrides the name of the yych variable.

       re2c:variable:yyctable = 'yyctable';
              When  both  -c  and -g are active, re2c will use this variable to generate a static jump table for
              YYGETCONDITION.

       re2c:variable:yystable = 'yystable';
              Deprecated.

       re2c:variable:yytarget = 'yytarget';
              Overrides the name of the yytarget variable.

       re2c:yybm:hex = 0;
              If set to zero, a decimal table will be used. Otherwise, a hexadecimal table will be generated.

       re2c:yych:conversion = 0;
              When this setting is non-zero, re2c automatically generates conversion code whenever yych gets
              read. In this case, the type must be defined using re2c:define:YYCTYPE.

       re2c:yych:emit = 1;
              Set this to zero to suppress the generation of yych.

       re2c:yyfill:check = 1;
              This can be set to 0 to suppress the generation of YYCURSOR and YYLIMIT based precondition
              checks. This option is useful when YYLIMIT + YYMAXFILL is always accessible.

       re2c:yyfill:enable = 1;
              Set this to zero to suppress the generation of YYFILL (n). When using this, be sure to verify that
              the generated scanner does not read beyond the available input, as allowing  such  behavior  might
              introduce severe security issues to your programs.

       re2c:yyfill:parameter = 1;
              Controls the argument in the parentheses that follow YYFILL. If zero, the argument is omitted.  If
              non-zero, the argument is generated unless re2c:define:YYFILL:naked is set to non-zero.

   REGULAR EXPRESSIONS
       "foo"  literal string "foo". ANSI-C escape sequences can be used.

       'foo'  literal  string  "foo" (case insensitive for characters [a-zA-Z]).  ANSI-C escape sequences can be
              used.

       [xyz]  character class; in this case, the regular expression matches x, y, or z.

       [abj-oZ]
              character class with a range in it; matches a, b, any letter from j through o, or Z.

       [^class]
              inverted character class.

       r \ s  match any r which isn't s. r and s must be regular expressions which can be expressed as character
              classes.

       r*     zero or more occurrences of r.

       r+     one or more occurrences of r.

       r?     optional r.

       (r)    r; parentheses are used to override precedence.

       r s    r followed by s (concatenation).

       r | s  r or s (alternative).

       r / s  r but only if it is followed by s. Note that s is not part of  the  matched  text.  This  type  of
              regular expression is called "trailing context". Trailing context can only be at the end of a rule
              and cannot be part of a named definition.

       r{n}   matches r exactly n times.

       r{n,}  matches r at least n times.

       r{n,m} matches r at least n times, but not more than m times.

       .      match any character except newline.

       name   matches a named definition as specified by name only if -F is off. If -F is active then this
              behaves as if it were enclosed in double quotes and matches the string "name".

       @stag  save input position at which @stag matches in a variable named stag

       #mtag  save all input positions at which #mtag matches in a variable named mtag (multiple  positions  are
              possible if #mtag is enclosed in a repetition subexpression that matches several times)

       Character  classes  and  string  literals  may contain octal or hexadecimal character definitions and the
       following set of escape sequences: \a, \b, \f, \n, \r, \t, \v, \\. An octal character  is  defined  by  a
       backslash  followed  by  its  three octal digits (e.g., \377).  Hexadecimal characters from 0 to 0xFF are
       defined by a backslash, a lower case x and two hexadecimal digits (e.g.,  \x12).  Hexadecimal  characters
       from 0x100 to 0xFFFF are defined by a backslash, a lower case \u or an upper case \X, and four
       hexadecimal digits (e.g., \u1234).  Hexadecimal characters from 0x10000 to 0xFFFFFFFF are defined by a
       backslash, an upper case \U, and eight hexadecimal digits (e.g., \U12345678).
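
       The numeric escape forms above can be illustrated with a small decoder. This is only a sketch in plain
       C, not code taken from re2c: the helper names hex_digit and decode_numeric_escape are hypothetical, and
       the function does not check that an octal value fits in one byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Value of one hexadecimal digit, or -1 if c is not a hexadecimal digit. */
static int hex_digit(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Hypothetical helper: decode one numeric escape at the start of the
 * NUL-terminated string s. Returns the number of characters consumed,
 * or 0 if s does not start with one of the forms
 * \ooo, \xHH, \uHHHH, \XHHHH, \UHHHHHHHH. */
static size_t decode_numeric_escape(const char *s, uint32_t *out)
{
    size_t digits, i;
    uint32_t v = 0;

    assert(s != NULL && out != NULL);
    if (s[0] != '\\') return 0;

    if (s[1] == 'x') digits = 2;
    else if (s[1] == 'u' || s[1] == 'X') digits = 4;
    else if (s[1] == 'U') digits = 8;
    else {
        /* Octal form: a backslash followed by exactly three octal digits. */
        for (i = 1; i <= 3; i++) {
            if (s[i] < '0' || s[i] > '7') return 0;
            v = v * 8 + (uint32_t)(s[i] - '0');
        }
        *out = v;
        return 4;
    }

    /* Hexadecimal forms: the digits start after the two-character prefix. */
    for (i = 0; i < digits; i++) {
        int d = hex_digit(s[2 + i]);
        if (d < 0) return 0;
        v = v * 16 + (uint32_t)d;
    }
    *out = v;
    return 2 + digits;
}
```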

       The only portable "any" rule is the default rule, *.

SUBMATCH EXTRACTION

       re2c supports two kinds of submatch extraction.

       The  first  option  is  -P  --posix-captures:  it enables POSIX-compliant capturing groups.  In this mode
       parentheses in regular expressions denote the beginning and  the  end  of  capturing  groups;  the  whole
       regular  expression  is  group  number  zero.   The number of groups for the matching rule is stored in a
       variable yynmatch, and submatch results are stored in the yypmatch array.  Both yynmatch and yypmatch
       should be defined by the user; note that the size of yypmatch must be at least [yynmatch * 2].  re2c
       provides a directive /*!maxnmatch:re2c*/ that defines a constant YYMAXNMATCH: the maximal value of
       yynmatch among all rules.  Note that re2c implements POSIX-compliant disambiguation: each subexpression
       matches as long as possible, and subexpressions that start earlier in the regular expression have
       priority over those starting later.
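
       As a rough illustration of how user code might read the results, the sketch below computes the length
       of one capturing group. It rests on assumptions that this page does not state: that group i occupies
       the slots yypmatch[2 * i] (start) and yypmatch[2 * i + 1] (end), that the slots hold pointers into the
       input, and that an unmatched group holds NULL. The helper group_length is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of re2c: length of capturing group i.
 * Assumes the start/end pair layout described in the lead-in. Returns 0
 * for a group that does not exist or did not take part in the match. */
static size_t group_length(const char **yypmatch, size_t yynmatch, size_t i)
{
    const char *start, *end;

    assert(yypmatch != NULL);
    if (i >= yynmatch) return 0;

    start = yypmatch[2 * i];
    end = yypmatch[2 * i + 1];
    if (start == NULL || end == NULL) return 0;   /* assumed "unmatched" marker */
    return (size_t)(end - start);
}
```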

       The second option is -T --tags.  With this option one can use standalone tags of the form @stag and #mtag
       instead of capturing parentheses, where stag and mtag are arbitrary user-defined names.  Tags can be used
       anywhere inside of a regular expression; semantically they are just position markers.  Tags of  the  form
       @stag  are  called  s-tags:  they  denote a single submatch value (the last input position where this tag
       matched).  Tags of the form #mtag are called m-tags: they denote  multiple  submatch  values  (the  whole
       history  of  repetitions  of  this  tag).   All  tags should be defined by the user as variables with the
       corresponding names.  With standalone tags re2c uses leftmost greedy disambiguation:  submatch  positions
       correspond to the leftmost matching path through the regular expression.

       With  both  --posix-captures and --tags options re2c generates a number of tag variables that are used by
       the lexer to track multiple possible versions of each tag  (multiple  versions  are  caused  by  possible
       ambiguity  of  submatch).   When  a  rule  matches,  ambiguity  is resolved and all tags of this rule (or
       capturing parentheses, which are also implemented as tags) are initialized with the values of appropriate
       tag variables.  Note that there is no one-to-one correspondence between tag variables and tags: the  same
       tag variable may be reused for different tags, and one tag may require multiple tag variables to hold all
       its  ambiguous  versions.   The  exact  number  of  tag  variables is unknown to the user; this number is
       determined by re2c.  However, tag variables should be defined by the user, because it might be  necessary
       to  update  them  in  YYFILL  and  store  them between invocations of lexer with --storable-state option.
       Therefore re2c provides directives /*!stags:re2c ... */ and /*!mtags:re2c ... */  that  can  be  used  to
       declare, initialize and manipulate tag variables.

       S-tags must support the following operations:

       • save input position to s-tag: t = YYCURSOR with default API, or user-defined operation YYSTAGP (t) with
         generic API

       • save  default  value  to  s-tag:  t = NULL with default API, or user-defined operation YYSTAGN (t) with
         generic API

       • copy one s-tag to another: t1 = t2
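
       The three s-tag operations can be seen in a hand-written matcher. The sketch below is not generated by
       re2c; it only imitates what a lexer might do for a rule shaped like [a-z]+ "=" @v [0-9]+ with the
       default API. The function match_pair and the local tag variable t are hypothetical names.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hand-written stand-in for a generated lexer. On success returns the
 * position just past the match and stores the start of the digits in *v.
 * On failure returns NULL and leaves *v set to NULL. */
static const char *match_pair(const char *YYCURSOR, const char **v)
{
    const char *t;

    assert(YYCURSOR != NULL && v != NULL);
    *v = NULL;                       /* save default value to s-tag: t = NULL */

    if (!islower((unsigned char)*YYCURSOR)) return NULL;
    while (islower((unsigned char)*YYCURSOR)) ++YYCURSOR;

    if (*YYCURSOR != '=') return NULL;
    ++YYCURSOR;

    t = YYCURSOR;                    /* save input position to s-tag: t = YYCURSOR */

    if (!isdigit((unsigned char)*YYCURSOR)) return NULL;
    while (isdigit((unsigned char)*YYCURSOR)) ++YYCURSOR;

    *v = t;                          /* copy one s-tag to another: t1 = t2 */
    return YYCURSOR;
}
```

       The final copy mirrors the step in which, once a rule matches, the tag itself is initialized from a
       tag variable.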

       M-tags must support the following operations:

       • append input position to m-tag: user-defined operation YYMTAGP (t) with both default and generic API

       • append default value to m-tag: user-defined operation YYMTAGN (t) with both default and generic API

       • copy one m-tag to another: t1 = t2

       S-tags can be implemented as scalar values (pointers or offsets).  M-tags need a more complex
       representation, as they need to store a sequence of tag values.  The most naive and inefficient
       representation of an m-tag is a list (array, vector) of tag values; a more efficient representation is
       to store all m-tags in a prefix tree represented as an array of nodes (v, p), where v is the tag value
       and p is a pointer to the parent node.
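
       A minimal sketch of the prefix-tree representation follows. It assumes a fixed-capacity node array,
       tag values stored as offsets of type long, and the parent "pointer" stored as an array index; the names
       mtag_tree, mtag_append and mtag_history are hypothetical. An m-tag is then just the index of its last
       node, so copying one m-tag to another is a plain integer assignment.

```c
#include <assert.h>
#include <stddef.h>

enum { MTAG_ROOT = -1, MTAG_CAPACITY = 64 };

/* One tree node: v is the tag value, p is the index of the parent node. */
struct mtag_node { long v; int p; };

struct mtag_tree {
    struct mtag_node nodes[MTAG_CAPACITY];
    int count;                       /* number of nodes in use */
};

/* Append value v to the history ending at `tag`; returns the new m-tag. */
static int mtag_append(struct mtag_tree *tree, int tag, long v)
{
    assert(tree != NULL && tree->count < MTAG_CAPACITY);
    tree->nodes[tree->count].v = v;
    tree->nodes[tree->count].p = tag;
    return tree->count++;
}

/* Write the history of `tag` into out, oldest value first.
 * Returns the number of values written. */
static size_t mtag_history(const struct mtag_tree *tree, int tag,
                           long *out, size_t cap)
{
    size_t n = 0, k;
    int i;

    for (i = tag; i != MTAG_ROOT; i = tree->nodes[i].p) n++;
    assert(n <= cap);

    k = n;                           /* walk to the root, filling from the back */
    for (i = tag; i != MTAG_ROOT; i = tree->nodes[i].p)
        out[--k] = tree->nodes[i].v;
    return n;
}
```

       Two m-tags that share a common history share the corresponding nodes, which is what makes this layout
       cheaper than keeping a separate list per tag.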

       For further details see the http://re2c.org/examples/examples.html page on the website or the
       re2c/examples/ subdirectory of the re2c distribution.

SCANNER WITH STORABLE STATES

       When  the  -f flag is specified, re2c generates a scanner that can store its current state, return to its
       caller, and later resume operations exactly where it left off.

       The default mode of operation in re2c is a "pull" model, where the scanner asks for extra input  whenever
       it needs it. However, this mode of operation assumes that the scanner is the "owner" of the parsing loop,
       and that may not always be convenient.

       Typically,  if  there is a preprocessor ahead of the scanner in the stream, or for that matter, any other
       procedural source of data, the scanner cannot "ask" for more data unless both the scanner and the  source
       live in separate threads.

       The  -f  flag  is  useful  exactly for situations like that: it lets users design scanners that work in a
       "push" model, i.e., a model where data is fed to the scanner chunk by chunk. When the scanner runs out of
       data to consume, it stores its state and returns to the caller. When  more  input  data  is  fed  to  the
       scanner, it resumes operations exactly where it left off.

       Changes needed compared to the "pull" model:

       • The user has to supply macros named YYGETSTATE () and YYSETSTATE (state).

       • The  -f  option inhibits declaration of yych and yyaccept, so the user has to declare them and save and
         restore them where required.  In the examples/push_model/push.re example, these are declared as  fields
         of a (C++) class of which the scanner is a method, so they do not need to be saved/restored explicitly.
         For  C,  they could, e.g., be made macros that select fields from a structure passed in as a parameter.
         Alternatively, they could be declared as local variables, saved with YYFILL  (n)  when  it  decides  to
         return and restored upon entering the function. Also, it could be more efficient to save the state from
         YYFILL (n) because YYSETSTATE (state) is called unconditionally.  However, YYFILL (n) does not get
         state as a parameter, so YYSETSTATE (state) would have to store state in a local variable for
         YYFILL (n) to use.

       • Modify YYFILL (n) to return (from the function calling it) if more input is needed.

       • Modify the caller to recognize if more input is needed and respond appropriately.

       • The generated code will contain a switch block that is used to restore the last state by jumping behind
         the  corresponding  YYFILL  (n) call. This code is automatically generated in the epilogue of the first
         /*!re2c */ block. It is possible to trigger generation of the YYGETSTATE () block earlier by placing  a
         /*!getstate:re2c*/  comment. This is especially useful when the scanner code should be wrapped inside a
         loop.

       Please see examples/push_model/push.re for an example of a "push" model scanner. The generated  code  can
       be tweaked with inplace configurations state:abort and state:nextlabel.
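
       The idea behind the "push" model can be sketched without re2c at all. The code below is hand-written
       and is not what re2c generates: it counts runs of digits in input that arrives chunk by chunk, saves a
       resume point when the chunk is exhausted, and jumps back to that point on the next call. The struct
       scanner, scan and feed names are hypothetical; the switch plays the role of the YYGETSTATE () block,
       and each assignment to state plays the role of YYSETSTATE (state) followed by a returning YYFILL (n).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical scanner object; the field names are illustrative only. */
struct scanner {
    int state;          /* -1 before the first call, else the saved resume point */
    const char *cur;    /* next unread character of the current chunk */
    const char *lim;    /* one past the last character of the current chunk */
    int numbers;        /* how many runs of digits have been started so far */
};

/* Consume the current chunk and return when more input is needed. */
static void scan(struct scanner *s)
{
    assert(s != NULL);
    switch (s->state) {             /* resume behind the point that ran dry */
    case 0: goto fill0;
    case 1: goto fill1;
    default: break;                 /* -1: start from the beginning */
    }
    for (;;) {
fill0:  /* between numbers */
        if (s->cur == s->lim) { s->state = 0; return; }
        if (!isdigit((unsigned char)*s->cur)) { s->cur++; continue; }
        s->numbers++;               /* first digit of a new number */
        s->cur++;
        for (;;) {
fill1:      /* inside a number */
            if (s->cur == s->lim) { s->state = 1; return; }
            if (!isdigit((unsigned char)*s->cur)) break;
            s->cur++;
        }
    }
}

/* Caller side: push one NUL-terminated chunk into the scanner. */
static void feed(struct scanner *s, const char *chunk)
{
    s->cur = chunk;
    s->lim = chunk + strlen(chunk);
    scan(s);
}
```

       Because this toy scanner always consumes a whole chunk before returning, it never needs to keep unread
       input from one chunk to the next; a real scanner usually has to preserve part of its buffer as well.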

SCANNER WITH CONDITION SUPPORT

       You  can  precede  regular expressions with a list of condition names when using the -c switch. re2c will
       then generate a scanner block for each condition, and each of the generated  blocks  will  have  its  own
       precondition. The precondition is given by the interface define YYGETCONDITION() and must be of type
       YYCONDTYPE.

       There are two special rule types. First, the rules of the condition <*> are merged into all conditions
       (note  that  they  have  a  lower  priority  than  other  rules of that condition). And second, the empty
       condition list allows one to provide a code block that does not have a scanner part, meaning it does  not
       allow any regular expressions. The condition value referring to this special block is always the one with
       the  enumeration  value 0. This way the code of this special rule can be used to initialize a scanner. It
       is in no way necessary to have these rules: but sometimes it is helpful to have a dedicated uninitialized
       condition state.

       Non-empty rules allow one to specify the new condition, which makes them transition rules. Besides
       generating calls for the YYSETCONDITION define, no other special code is generated.

       There is another kind of special rule that allows one to prepend code to any code block of all rules of a
       certain  set of conditions or to all code blocks of all rules. This can be helpful when some operation is
       common among rules. For instance, this can be used to store the  length  of  the  scanned  string.  These
       special  setup rules start with an exclamation mark followed by either a list of conditions <! condition,
       ... > or a star <!*>. When re2c generates the code for a rule whose state does not have a setup rule  and
       a starred setup rule is present, the starred setup code will be used as setup code.
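
       The way conditions and transition rules interact can be pictured with a hand-written dispatcher. This
       is a sketch only, not re2c output: it has two conditions, counts the characters that lie outside
       /* ... */ comments, and switches condition through macros named after the interface defines. The
       enumerator names yycnormal and yyccomment, the struct lexer and the function count_code_chars are
       assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

enum YYCONDTYPE { yycnormal, yyccomment };

/* Hypothetical lexer state; lx is the name the macros below expect. */
struct lexer {
    enum YYCONDTYPE cond;   /* current condition */
    const char *cur;        /* next unread character */
    size_t count;           /* characters seen outside comments */
};

#define YYGETCONDITION()   (lx->cond)
#define YYSETCONDITION(c)  (lx->cond = (c))

/* Count the characters of a NUL-terminated string that are not part of
 * a comment. Each case is one "scanner block" with its own rules. */
static size_t count_code_chars(const char *input)
{
    struct lexer state;
    struct lexer *lx = &state;

    assert(input != NULL);
    state.cond = yycnormal;
    state.cur = input;
    state.count = 0;

    while (*lx->cur != '\0') {
        switch (YYGETCONDITION()) {
        case yycnormal:
            if (lx->cur[0] == '/' && lx->cur[1] == '*') {
                lx->cur += 2;
                YYSETCONDITION(yyccomment);     /* transition rule */
            } else {
                lx->cur++;
                lx->count++;
            }
            break;
        case yyccomment:
            if (lx->cur[0] == '*' && lx->cur[1] == '/') {
                lx->cur += 2;
                YYSETCONDITION(yycnormal);      /* transition rule */
            } else {
                lx->cur++;
            }
            break;
        }
    }
    return lx->count;
}
```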

ENCODINGS

       re2c supports the following encodings: ASCII (default), EBCDIC (-e), UCS-2 (-w), UTF-16 (-x), UTF-32 (-u)
       and UTF-8 (-8).  See also inplace configuration re2c:flags.

       The  following  concepts  should  be clarified when talking about encodings.  A code point is an abstract
       number that represents a single symbol.  A code unit is the smallest unit of memory, which is used in the
       encoded text (it corresponds to one character in the input stream). One or more code units may be  needed
       to  represent a single code point, depending on the encoding. In a fixed-length encoding, each code point
       is represented with an equal number of code units. In variable-length encodings, different code points
       can be represented with different numbers of code units.

       • ASCII  is  a  fixed-length  encoding. Its code space includes 0x100 code points, from 0 to 0xFF. A code
         point is represented with exactly one 1-byte code unit, which has the same value as the code point. The
         size of YYCTYPE must be 1 byte.

       • EBCDIC is a fixed-length encoding. Its code space includes 0x100 code points, from 0 to  0xFF.  A  code
         point is represented with exactly one 1-byte code unit, which has the same value as the code point. The
         size of YYCTYPE must be 1 byte.

       • UCS-2  is  a  fixed-length encoding. Its code space includes 0x10000 code points, from 0 to 0xFFFF. One
         code point is represented with exactly one 2-byte code unit, which has  the  same  value  as  the  code
         point. The size of YYCTYPE must be 2 bytes.

       • UTF-16 is a variable-length encoding. Its code space includes all Unicode code points, from 0 to 0xD7FF
         and  from 0xE000 to 0x10FFFF. One code point is represented with one or two 2-byte code units. The size
         of YYCTYPE must be 2 bytes.

       • UTF-32 is a fixed-length encoding. Its code space includes all Unicode code points, from  0  to  0xD7FF
         and  from 0xE000 to 0x10FFFF. One code point is represented with exactly one 4-byte code unit. The size
         of YYCTYPE must be 4 bytes.

       • UTF-8 is a variable-length encoding. Its code space includes all Unicode code points, from 0 to  0xD7FF
         and  from 0xE000 to 0x10FFFF. One code point is represented with a sequence of one, two, three, or four
         1-byte code units. The size of YYCTYPE must be 1 byte.
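
       The difference between code points and code units can be made concrete with a small encoder. The sketch
       below is independent of re2c; the function names utf8_encode and utf16_code_units are hypothetical.
       Both functions return 0 for values that are not valid Unicode code points (surrogates and values above
       0x10FFFF).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode code point cp as UTF-8 into out. Returns the number of 1-byte
 * code units written (1 to 4), or 0 if cp is not a valid code point. */
static int utf8_encode(uint32_t cp, unsigned char out[4])
{
    assert(out != NULL);
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0;      /* surrogates */
    if (cp <= 0x7F) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp <= 0x7FF) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp <= 0xFFFF) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                                        /* beyond the code space */
}

/* Number of 2-byte code units UTF-16 needs for cp, or 0 if cp is invalid. */
static int utf16_code_units(uint32_t cp)
{
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0;
    if (cp <= 0xFFFF) return 1;
    if (cp <= 0x10FFFF) return 2;                    /* surrogate pair */
    return 0;
}
```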

       In Unicode, values from range 0xD800 to 0xDFFF (surrogates)  are  not  valid  Unicode  code  points.  Any
       encoded sequence of code units that would map to Unicode code points in the range 0xD800-0xDFFF is
       ill-formed. The user can control how re2c treats such ill-formed  sequences  with  the  --encoding-policy
       <policy> switch.

       For  some  encodings, there are code units that never occur in a valid encoded stream (e.g., 0xFF byte in
       UTF-8). If the generated scanner must check for invalid input, the only correct way to do so  is  to  use
       the  default  rule  (*).  Note  that  the  full  range  rule  ([^]) won't catch invalid code units when a
       variable-length encoding is used ([^] means "any valid code point", whereas the default  rule  (*)  means
       "any possible code unit").

GENERIC INPUT API

       re2c usually operates on input with pointer-like primitives YYCURSOR, YYMARKER, YYCTXMARKER, and YYLIMIT.

       The  generic  input  API (enabled with the --input custom switch) allows customizing input operations. In
       this mode, re2c will express all operations on input in terms of the following primitives:
                              ┌──────────────────┬───────────────────────────────────────┐
                              │ YYPEEK ()        │ get current input character           │
                              ├──────────────────┼───────────────────────────────────────┤
                              │ YYSKIP ()        │ advance to next character             │
                              ├──────────────────┼───────────────────────────────────────┤
                              │ YYBACKUP ()      │ backup current input position         │
                              ├──────────────────┼───────────────────────────────────────┤
                              │ YYBACKUPCTX ()   │ backup  current  input  position  for │
                              │                  │ trailing context                      │
                              ├──────────────────┼───────────────────────────────────────┤
                              │ YYSTAGP (t)      │ save current input position to tag t  │
                              ├──────────────────┼───────────────────────────────────────┤
                              │ YYSTAGN (t)      │ save default value to tag t           │
                              ├──────────────────┼───────────────────────────────────────┤
                              │ YYMTAGP (t)      │ append  input position to the history │
                              │                  │ of tag t                              │
                              ├──────────────────┼───────────────────────────────────────┤
                              │ YYMTAGN (t)      │ append default value to  the  history │
                              │                  │ of tag t                              │
                              ├──────────────────┼───────────────────────────────────────┤
                              │ YYRESTORE ()     │ restore current input position        │
                              ├──────────────────┼───────────────────────────────────────┤
                              │ YYRESTORECTX ()  │ restore  current  input  position for │
                              │                  │ trailing context                      │
                              ├──────────────────┼───────────────────────────────────────┤
                              │ YYRESTORETAG (t) │ restore current input  position  from │
                              │                  │ tag t                                 │
                              ├──────────────────┼───────────────────────────────────────┤
                              │ YYLESSTHAN (n)   │ check if less than n input characters │
                              │                  │ are left                              │
                              └──────────────────┴───────────────────────────────────────┘
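
       As an illustration, a few of these primitives can be defined over a buffer with an index instead of
       pointers. The sketch below is an assumption-laden example, not re2c output: the struct input, the
       variable name in that the macros rely on, and the hand-written function lex (which stands in for
       generated code matching either "abc" or "a") are all hypothetical. The buffer is assumed to be
       NUL-terminated so that YYPEEK () is safe at the end of input.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical input object used by the macro definitions below. */
struct input {
    const char *buf;    /* NUL-terminated input */
    size_t len;         /* number of characters before the NUL */
    size_t pos;         /* current input position */
    size_t mark;        /* position saved by YYBACKUP () */
};

#define YYPEEK()       (in->buf[in->pos])
#define YYSKIP()       ((void)++in->pos)
#define YYBACKUP()     ((void)(in->mark = in->pos))
#define YYRESTORE()    ((void)(in->pos = in->mark))
#define YYLESSTHAN(n)  ((in->len - in->pos) < (size_t)(n))

/* Hand-written stand-in for a generated lexer with the rules "abc" and
 * "a". Returns 2 for "abc", 1 for "a", 0 if neither matches. */
static int lex(struct input *in)
{
    assert(in != NULL);
    if (YYLESSTHAN(1)) return 0;        /* no input left */
    if (YYPEEK() != 'a') return 0;
    YYSKIP();
    YYBACKUP();                         /* "a" matches; remember this position */
    if (YYPEEK() == 'b') {
        YYSKIP();
        if (YYPEEK() == 'c') {
            YYSKIP();
            return 2;                   /* the longer rule matched */
        }
    }
    YYRESTORE();                        /* fall back to the shorter match */
    return 1;
}
```

       Only the macro bodies would change if the input lived in a file, a rope, or some other structure; the
       matching logic is expressed purely in terms of the primitives.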

       A couple of useful links that provide some examples:

       1. http://skvadrik.github.io/aleph_null/posts/re2c/2015-01-13-input_model.html

       2. http://skvadrik.github.io/aleph_null/posts/re2c/2015-01-15-input_model_custom.html

SEE ALSO

       You can find more information about re2c at: http://re2c.org.  See also: flex(1), lex(1), quex
       (http://quex.sourceforge.net).

AUTHORS

       Peter Bumbulis   peter@csg.uwaterloo.ca

       Brian Young      bayoung@acm.org

       Dan Nuffer       nuffer@users.sourceforge.net

       Marcus Boerger   helly@users.sourceforge.net

       Hartmut Kaiser   hkaiser@users.sourceforge.net

       Emmanuel Mogenet mgix@mgix.com

       Ulya Trofimovich skvadrik@gmail.com

VERSION INFORMATION

       This manpage describes re2c version 1.0.1, package date 11 Aug 2017.

                                                                                                         RE2C(1)