lunar (1) ragel.1.gz

Provided by: ragel_6.10-4_amd64 bug

NAME

       ragel - compile regular languages into executable state machines

SYNOPSIS

       ragel [options] file

DESCRIPTION

       Ragel  compiles  executable  finite  state  machines  from  regular  languages.  Ragel can
       generate C, C++, Objective-C, D, Go, or Java code.  Ragel  state  machines  can  not  only
       recognize  byte  sequences as regular expression machines do, but can also execute code at
       arbitrary points in the recognition of a regular language.  User code  is  embedded  using
       inline operators that do not disrupt the regular language syntax.

       The  core  language  consists  of  standard  regular  expression operators, such as union,
       concatenation and kleene star, accompanied  by  action  embedding  operators.  Ragel  also
       provides  operators  that  let  you control any non-determinism that you create, construct
       scanners using the longest match paradigm, and build state machines using  the  statechart
       model.  It  is  also possible to influence the execution of a state machine from inside an
       embedded action by jumping or calling to other  parts  of  the  machine  and  reprocessing
       input.

       Ragel  provides  a  very  flexibile  interface to the host language that attempts to place
       minimal  restrictions  on  how  the  generated  code  is  used  and  integrated  into  the
       application. The generated code has no dependencies.

OPTIONS

       -h, -H, -?, --help
              Display help and exit.

       -v     Print version information and exit.

       -o  file
              Write  output  to  file.  If  -o  is  not  given,  a default file name is chosen by
              replacing the file extenstion of the input file. For source files ending in .rh the
              suffix .h is used. For all other source files a suffix based on the output language
              is used (.c, .cpp, .m, etc.). If -o is not given for Graphviz output the  generated
              dot file is written to standard output.

       -s     Print some statistics on standard error.

       --error-format=gnu
              Print error messages using the format "file:line:column:" (default)

       --error-format=msvc
              Print error messages using the format "file(line,column):"

       -d     Do not remove duplicate actions from action lists.

       -I  dir
              Add dir to the list of directories to search for included and imported files

       -n     Do not perform state minimization.

       -m     Perform minimization once, at the end of the state machine compilation.

       -l     Minimize  after nearly every operation. Lists of like operations such as unions are
              minimized once at the end. This is the default minimization option.

       -e     Minimize after every operation.

       -x     Compile the state machines and emit an XML representation of the host data and  the
              machines.

       -V     Generate a dot file for Graphviz.

       -p     Display printable characters on labels.

       -S <spec>
              FSM specification to output.

       -M <machine>
              Machine definition/instantiation to output.

       -C     The  host  language  is C, C++, Obj-C or Obj-C++. This is the default host language
              option.

       -D     The host language is D.

       -J     The host language is Java.

       -Z     The host language is Go.

       -R     The host language is Ruby.

       -L     Inhibit writing of #line directives.

       -T0    (C/D/Java/Ruby/C#/Go) Generate a table driven FSM. This is the default code  style.
              The  table driven FSM represents the state machine as static data. There are tables
              of states, transitions, indicies and actions. The current  state  is  stored  in  a
              variable.   The  execution  is  a  loop that looks that given the current state and
              current character to process looks up the transition to take using a binary search,
              executes  any  actions  and moves to the target state. In general, the table driven
              FSM produces a smaller binary and requires a less expensive host  language  compile
              but results in slower running code. The table driven FSM is suitable for any FSM.

       -T1    (C/D/Ruby/C#/Go)  Generate  a  faster table driven FSM by expanding action lists in
              the action execute code.

       -F0    (C/D/Ruby/C#/Go) Generate a flat table driven FSM. Transitions are  represented  as
              an  array indexed by the current alphabet character. This eliminates the need for a
              binary search to locate transitions and produces faster code, however  it  is  only
              suitable for small alphabets.

       -F1    (C/D/Ruby/C#/Go)  Generate a faster flat table driven FSM by expanding action lists
              in the action execute code.

       -G0    (C/D/C#/Go) Generate a goto driven FSM. The goto driven FSM  represents  the  state
              machine  as a series of goto statements. While in the machine, the current state is
              stored by the processor's instruction pointer. The execution  is  a  flat  function
              where  control  is passed from state to state using gotos. In general, the goto FSM
              produces faster code but results in a larger  binary  and  a  more  expensive  host
              language compile.

       -G1    (C/D/C#/Go)  Generate  a  faster  goto  driven FSM by expanding action lists in the
              action execute code.

       -G2    (C/D/Go) Generate a really fast goto driven FSM by embedding action  lists  in  the
              state machine control code.

       -P<N>  (C/D) N-Way Split really fast goto-driven FSM.

RAGEL INPUT

       NOTE:  This  is a very brief description of Ragel input. Ragel is described in more detail
       in the user guide available from the homepage (see below).

       Ragel  normally  passes  input  files  straight  to  the  output.  When  it  sees  an  FSM
       specification that contains machine instantiations it stops to generate the state machine.
       If there are write statements (such as "write exec") then ragel  emits  the  corresponding
       code.  There  can  be  any number of FSM specifications in an input file. A multi-line FSM
       specification starts with '%%{' and ends with  '}%%'.  A  single  line  FSM  specification
       starts with %% and ends at the first newline.

FSM STATEMENTS

       Machine Name:
              Set the the name of the machine. If given, it must be the first statement.

       Alphabet Type:
              Set the data type of the alphabet.

       GetKey:
              Specify how to retrieve the alphabet character from the element type.

       Include:
              Include  a machine of same name as the current or of a different name in either the
              current file or some other file.

       Action Definition:
              Define an action that can be invoked by the FSM.

       Fsm Definition, Instantiation and Longest Match Instantiation:
              Used to build FSMs. Syntax description in next few sections.

       Access:
              Specify how to access the persistent state machine variables.

       Write: Write some component of the machine.

       Variable:
              Override the default variable names (p, pe, cs, act, etc).

BASIC MACHINES

       The basic machines are the base operands of the regular language expressions.

       'hello'
              Concat literal. Produces a concatenation of the characters in the string.  Supports
              escape  sequences with '\'.  The result will have a start state and a transition to
              a new state for each character in the string. The last state in the  sequence  will
              be made final. To make the string case-insensitive, append an 'i' to the string, as
              in 'cmd'i.

       "hello"
              Identical to single quote version.

       [hello]
              Or literal. Produces a union of characters.  Supports character  ranges  with  '-',
              negating  the sense of the union with an initial '^' and escape sequences with '\'.
              The result will have two states with a transition between them for  each  character
              or range.

       NOTE:  '', "", and [] produce null FSMs. Null machines have one state that is both a start
       state and a final state and match the zero length string. A null machine  may  be  created
       with the null builtin machine.

       integer
              Makes a two state machine with one transition on the given integer number.

       hex    Makes a two state machine with one transition on the given hexidecimal number.

       /simple_regex/
              A  simple  regular  expression.  Supports the notation '.', '*' and '[]', character
              ranges with '-', negating the sense of an OR expression with and  initial  '^'  and
              escape  sequences with '\'. Also supports one trailing flag: i. Use it to produce a
              case-insensitive regular expression, as in /GET/i.

       lit .. lit
              Specifies a range. The allowable upper and lower  bounds  are  concat  literals  of
              length  one and number machines.  For example, 0x10..0x20,  0..63, and 'a'..'z' are
              valid ranges.

       variable_name
              References the machine definition assigned to the variable name given.

       builtin_machine
              There are several builtin machines available. They are all two state  machines  for
              the purpose of matching common classes of characters. They are:

              any    Any character in the alphabet.

              ascii  Ascii characters 0..127.

              extend Ascii  extended characters. This is the range -128..127 for signed alphabets
                     and the range 0..255 for unsigned alphabets.

              alpha  Alphabetic characters /[A-Za-z]/.

              digit  Digits /[0-9]/.

              alnum  Alpha numerics /[0-9A-Za-z]/.

              lower  Lowercase characters /[a-z]/.

              upper  Uppercase characters /[A-Z]/.

              xdigit Hexidecimal digits /[0-9A-Fa-f]/.

              cntrl  Control characters 0..31.

              graph  Graphical characters /[!-~]/.

              print  Printable characters /[ -~]/.

              punct  Punctuation.   Graphical   characters   that    are    not    alpha-numerics
                     /[!-/:-@\[-`{-~]/.

              space  Whitespace /[\t\v\f\n\r ]/.

              null   Zero length string. Equivalent to '', "" and [].

              empty  Empty set. Matches nothing.

BRIEF OPERATOR REFERENCE

       Operators are grouped by precedence, group 1 being the lowest and group 6 the highest.

       GROUP 1:

       expr , expr
              Join machines together without drawing any transitions, setting up a start state or
              any final states. Start state must be explicitly specified with the "start"  label.
              Final  states  may  be  specified with the an epsilon transitions to the implicitly
              created "final" state.

       GROUP 2:

       expr | expr
              Produces a machine that matches any string in machine one or machine two.

       expr & expr
              Produces a machine that matches any string that is in both machine one and  machine
              two.

       expr - expr
              Produces  a  machine  that  matches  any  string  that is in machine one but not in
              machine two.

       expr -- expr
              Strong Subtraction. Matches any string in machine one that does not have any string
              in machine two as a substring.

       GROUP 3:

       expr . expr
              Produces  a machine that matches all the strings in machine one followed by all the
              strings in machine two.

       expr :> expr
              Entry-Guarded Concatenation: terminates machine one upon entry to machine two.

       expr :>> expr
              Finish-Guarded Concatenation: terminates machine one when machine two finishes.

       expr <: expr
              Left-Guarded Concatenation: gives a higher priority to machine one.

       NOTE: Concatenation is the default operator. Two machines  next  to  each  other  with  no
       operator between them results in the concatenation operation.

       GROUP 4:

       label: expr
              Attaches  a  label  to an expression. Labels can be used by epsilon transitions and
              fgoto and fcall statements in actions. Also note that the referencing of a  machine
              definition causes the implicit creation of label by the same name.

       GROUP 5:

       expr -> label
              Draws  an epsilon transition to the state defined by label. Label must be a name in
              the current scope. Epsilon  transitions  are  resolved  when  comma  operators  are
              evaluated    and    at    the    root   of   the   expression   tree   of   machine
              assignment/instantiation.

       GROUP 6: Actions

       An action may be a name predefined with an action statement or may be  specified  directly
       with '{' and '}' in the expression.

       expr > action
              Embeds action into starting transitions.

       expr @ action
              Embeds action into transitions that go into a final state.

       expr $ action
              Embeds action into all transitions. Does not include pending out transitions.

       expr % action
              Embeds action into pending out transitions from final states.

       GROUP 6: EOF Actions

       When a machine's finish routine is called the current state's EOF actions are executed.

       expr >/ action
              Embed an EOF action into the start state.

       expr </ action
              Embed an EOF action into all states except the start state.

       expr $/ action
              Embed an EOF action into all states.

       expr %/ action
              Embed an EOF action into final states.

       expr @/ action
              Embed an EOF action into all states that are not final.

       expr <>/ action
              Embed  an  EOF action into all states that are not the start state and that are not
              final (middle states).

       GROUP 6: Global Error Actions

       Global error actions are stored in states until the final state  machine  has  been  fully
       constructed.  They  are  then  transferred  to  error  transitions, giving the effect of a
       default action.

       expr >! action
              Embed a global error action into the start state.

       expr <! action
              Embed a global error action into all states except the start state.

       expr $! action
              Embed a global error action into all states.

       expr %! action
              Embed a global error action into the final states.

       expr @! action
              Embed a global error action into all states which are not final.

       expr <>! action
              Embed a global error action into all states which are not the start state  and  are
              not final (middle states).

       GROUP 6: Local Error Actions

       Local  error  actions  are  stored in states until the named machine is fully constructed.
       They are then transferred to error transitions, giving the effect of a default action  for
       a  section  of  the  total  machine.  Note that the name may be omitted, in which case the
       action will be transferred to error actions upon construction of the current machine.

       expr >^ action
              Embed a local error action into the start state.

       expr <^ action
              Embed a local error action into all states except the start state.

       expr $^ action
              Embed a local error action into all states.

       expr %^ action
              Embed a local error action into the final states.

       expr @^ action
              Embed a local error action into all states which are not final.

       expr <>^ action
              Embed a local error action into all states which are not the start  state  and  are
              not final (middle states).

       GROUP 6: To-State Actions

       To  state  actions  are  stored  in  states and executed any time the machine moves into a
       state. This includes regular transitions, and transfers of control  such  as  fgoto.  Note
       that   setting   the   current   state  from  outside  the  machine  (for  example  during
       initialization) does not count as a transition into a state.

       expr >~ action
              Embed a to-state action action into the start state.

       expr <~ action
              Embed a to-state action into all states except the start state.

       expr $~ action
              Embed a to-state action into all states.

       expr %~ action
              Embed a to-state action into the final states.

       expr @~ action
              Embed a to-state action into all states which are not final.

       expr <>~ action
              Embed a to-state action into all states which are not the start state and  are  not
              final (middle states).

       GROUP 6: From-State Actions

       From  state actions are executed whenever a state takes a transition on a character.  This
       includes the error transition and a transition to self.

       expr >* action
              Embed a from-state action into the start state.

       expr <* action
              Embed a from-state action into every state except the start state.

       expr $* action
              Embed a from-state action into all states.

       expr %* action
              Embed a from-state action into the final states.

       expr @* action
              Embed a from-state action into all states which are not final.

       expr <>* action
              Embed a from-state action into all states which are not the start state and are not
              final (middle states).

       GROUP 6: Priority Assignment

       Priorities  are assigned to names within transitions. Only priorities on the same name are
       allowed to interact. In the first form of priorities the name defaults to the name of  the
       machine  definition  the  priority  is  assigned  in.   Transitions  do  not  have default
       priorities.

       expr > int
              Assigns the priority int in all transitions leaving the start state.

       expr @ int
              Assigns the priority int in all transitions that go into a final state.

       expr $ int
              Assigns the priority int in all existing transitions.

       expr % int
              Assigns the priority int in all pending out transitions.

       A second form of priority assignment allows the programmer to specify the  name  to  which
       the priority is assigned, allowing interactions to cross machine definition boundaries.

       expr > (name,int)
              Assigns the priority int to name in all transitions leaving the start state.

       expr @ (name, int)
              Assigns the priority int to name in all transitions that go into a final state.

       expr $ (name, int)
              Assigns the priority int to name in all existing transitions.

       expr % (name, int)
              Assigns the priority int to name in all pending out transitions.

       GROUP 7:

       expr * Produces  the  kleene  star  of  a machine. Matches zero or more repetitions of the
              machine.

       expr **
              Longest-Match Kleene Star. This version of kleene star puts a  higher  priority  on
              staying  in  the  machine  over wrapping around and starting over. This operator is
              equivalent to ( ( expr ) $0 %1 )*.

       expr ? Produces a machine that accepts the machine given or the null string. This operator
              is equivalent to  ( expr | '' ).

       expr + Produces  the  machine  concatenated  with the kleen star of itself. Matches one or
              more repetitions of the machine.  This operator is equivalent to ( expr . expr* ).

       expr {n}
              Produces a machine that matches exactly n repetitions of expr.

       expr {,n}
              Produces a machine that matches anywhere from zero to n repetitions of expr.

       expr {n,}
              Produces a machine that matches n or more repetitions of expr.

       expr {n,m}
              Produces a machine that matches n to m repetitions of expr.

       GROUP 8:

       ! expr Produces a machine that matches any string not matched by the given machine.   This
              operator is equivalent to ( *extend - expr ).

       ^ expr Character-Level  Negation.  Matches  any single character not matched by the single
              character machine expr.

       GROUP 9:

       ( expr )
              Forces precedence on operators.

VALUES AVAILABLE IN CODE BLOCKS

       fc     The current character. Equivalent to *p.

       fpc    A pointer to the current character. Equivalent to p.

       fcurs  An integer value representing the current state.

       ftargs An integer value representing the target state.

       fentry(<label>)
              An integer value representing the entry point <label>.

STATEMENTS AVAILABLE IN CODE BLOCKS

       fhold; Do not advance over the current character. Equivalent to --p;.

       fexec <expr>;
              Sets the current character to something else. Equivalent to p = (<expr>)-1;

       fgoto <label>;
              Jump to the machine defined by <label>.

       fgoto *<expr>;
              Jump to the entry point given by <expr>. The expression must evaluate to an integer
              value representing a state.

       fnext <label>;
              Set  the  next state to be the entry point defined by <label>.  The fnext statement
              does not immediately jump to the specified state. Any  action  code  following  the
              statement is executed.

       fnext *<expr>;
              Set  the  next  state  to  be  the entry point given by <expr>. The expression must
              evaluate to an integer value representing a state.

       fcall <label>;
              Call the machine defined by <label>. The next fret will jump to the target  of  the
              transition on which the action is invoked.

       fcall *<expr>;
              Call  the entry point given by <expr>. The next fret will jump to the target of the
              transition on which the action is invoked.

       fret;  Return to the target state of the transition on which the last fcall was made.

       fbreak;
              Save the current state and immediately break out of the machine.

CREDITS

       Ragel  was  written  by  Adrian  Thurston  <thurston@complang.org>.   Objective-C   output
       contributed  by Erich Ocean. D output contributed by Alan West. Ruby output contributed by
       Victor Hugo Borja. C Sharp code generation contributed by Daniel  Tang.  Contributions  to
       Java code generation by Colin Fleming.  Go code generation contributed by Justine Tunney.

SEE ALSO

       re2c(1), flex(1)

       Homepage: http://www.complang.org/ragel/