Ubuntu Manpage: Embperl::Syntax - base class for defining custom syntaxes

Provided by: libembperl-perl_2.5.0-17_amd64

NAME

       Embperl::Syntax - base class for defining custom syntaxes

SYNOPSIS

DESCRIPTION

       Embperl::Syntax provides a base class from which all custom syntaxes should be derived.
       Currently Embperl comes with the following derived syntaxes:

       EmbperlHTML
           all the HTML tags that Embperl recognizes by default

       EmbperlBlocks
           all the [ ] blocks that Embperl supports

       Embperl
           The default syntax; it is derived from "EmbperlHTML" and "EmbperlBlocks"

       ASP <%  %> and <%=  %>, see perldoc Embperl::Syntax::ASP

       SSI Server Side Includes, see perldoc Embperl::Syntax::SSI

       Perl
           File contains pure Perl (similar to Apache::Registry), but can be used inside
           EmbperlObject

       Text
           File contains only text; no action is taken on the text

       Mail
           Defines the <mail:send> tag for sending mail. This is an example of a taglib, which
           could serve as a base for writing your own taglib to extend the number of available tags

       POD Parses POD out of any file and creates an XML tree similar to pod2xml, which can be
           formatted by XSLT afterwards.

       You can choose which syntax is used inside your page with the "EMBPERL_SYNTAX"
       configuration directive, the "syntax" parameter to "Execute", or the "[$ syntax $]"
       metacommand.

       You can also specify multiple syntaxes e.g.

           PerlSetEnv EMBPERL_SYNTAX "Embperl SSI"

           Execute ({inputfile => '*', syntax => 'Embperl ASP'}) ;

       The syntax metacommand allows you to switch the syntax or to add or subtract syntaxes e.g.

           [$ syntax + Mail $]

       will add the Mail taglib so the <mail:send> tag is available after this line.

           [$ syntax - Mail $]

       now the <mail:send> tag is unknown again

           [$ syntax SSI $]

       now you can only use SSI commands inside your page.
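
       The add, subtract and replace behaviour of the three metacommand forms above can be
       modelled in a few lines of plain Perl. This is only a sketch of the observable effect, not
       Embperl's implementation: the helper name apply_syntax_request is invented for this
       illustration, and it assumes that a request without a leading + or - replaces the whole
       list of active syntaxes.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Embperl: returns the list of active
# syntaxes after applying one request such as "+ Mail", "- Mail" or "SSI".
sub apply_syntax_request {
    my ($request, @active) = @_;

    # Optional leading "+" or "-", then one or more syntax names.
    my ($op, $rest) = $request =~ /^\s*([+-]?)\s*(.*?)\s*$/;
    my @names = split ' ', $rest;

    if ($op eq '+') {                      # add names that are not active yet
        my %seen = map { $_ => 1 } @active;
        return (@active, grep { !$seen{$_}++ } @names);
    }
    if ($op eq '-') {                      # drop the named syntaxes
        my %drop = map { $_ => 1 } @names;
        return grep { !$drop{$_} } @active;
    }
    return @names;                         # no prefix: replace the whole list
}

# Replay the three metacommands from the text, starting with the default syntax.
my @active = ('Embperl');
for my $request ('+ Mail', '- Mail', 'SSI') {
    @active = apply_syntax_request($request, @active);
    print "[\$ syntax $request \$] -> @active\n";
}
```

       Keeping the model as a pure function from (request, old list) to new list mirrors the
       ($name, $oldname) argument pair of GetSyntax described under Methods.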

Defining your own Syntax

       If you want to define your own syntax, you have to derive a new class from one of the
       existing ones and extend it with new tags/functionality. The best approach is to look at
       the syntax classes that come with Embperl (inside the directory Embperl/Syntax/).

       For example, if you want to add new HTML tags, derive from Embperl::Syntax::HTML; if you
       want to add new metacommands, derive from Embperl::Syntax::EmbperlBlocks.

       Some of the classes define additional methods to easily add new tags. See the respective
       POD file for the methods that are available for a certain class.

       Embperl::Syntax defines the basic methods to create a syntax:

Methods

   Embperl::Syntax -> new  /  $self -> new
       Create a new syntax class. This method should only be called inside a constructor of a
       derived class.

   $self -> AddToRoot ($elements)
       This adds a new element to the root of the parser tree. $elements must be a hashref. See
       Embperl::Syntax::ASP for an example.

   $self -> AddInitCode ($compiletimecode, $initcode, $termcode, $procinfo)
       This lets you add Perl code that is always executed at the beginning of a document
       ($initcode), at the end of the document ($termcode), or at compile time
       ($compiletimecode). The three strings must be valid Perl code. See
       Embperl::Syntax::SSI for an example. $procinfo is a hashref that can consist of additional
       processor info (see below) for the document.

   $self -> GetRoot
       Returns the root of the parser tree.

   Embperl::Syntax::GetSyntax ($name, $oldname)
       Returns a syntax object which is built from the syntaxes named in $name. If $oldname is
       given, $name can start with a "+" or "-" to add or subtract a syntax. This is normally
       only needed by Embperl itself or to implement a syntax switch statement (see
       Embperl::Syntax::SSI for an example).

   $self -> CloneHash ($old, $replace)
       Clones the hash given as a hashref in $old, optionally replaces the tags given in the
       hashref $replace, and returns a hashref to the new hash.
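
       The copy-then-override idea behind CloneHash can be illustrated in plain Perl. The
       function below is a stand-in written for this example, not the Embperl method; it assumes
       that a shallow copy of the top level is enough, and the real method may treat nested
       structures differently. All token names and contents are invented.

```perl
use strict;
use warnings;

# Plain-Perl stand-in for the copy-then-override idea; NOT the Embperl method.
# Assumption: only the top level of the hash is copied.
sub clone_hash_sketch {
    my ($old, $replace) = @_;
    my %new = (%$old, %{ $replace || {} });   # later pairs win over earlier ones
    return \%new;
}

# Invented token table with two entries.
my %base = (
    'Tag A' => { text => '<a', end => '>' },
    'Tag B' => { text => '<b', end => '>' },
);

# Derive a variant in which only 'Tag B' is replaced.
my $derived = clone_hash_sketch(\%base, { 'Tag B' => { text => '<b', end => '/>' } });

print "$_ ends with $derived->{$_}{end}\n" for sort keys %$derived;
```

       The original hash is left untouched at the top level, so a derived syntax can change one
       entry without affecting the table it started from.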

Syntax Structure and Parameter

       Internally, the syntax object builds a data structure which serves as the base for the
       parser. This structure consists of a list of tokens and options; option names start with
       a dash:

   Tokens
       '-lsearch' => 1
           Do a linear search instead of a binary search. This is necessary if the tokens can't
           be clearly separated.

       '-defnodetype' => ntypText,
           Defines the default type for text nodes. Without any specification the type is CDATA,
           which means no escaping takes place. With "ntypText" all special characters are
           escaped.

       '-rootnode'
           Name of a root node that is always inserted.

       <name> => \%tokendescription
           All items which do not start with a dash are treated as names. The name of a token
           is only descriptive and is used in error messages. The item must contain a hashref
           which describes the token.
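
       A token table of this shape is an ordinary Perl hash. The sketch below uses the <% and
       <%= delimiters mentioned for the ASP syntax, but the token names are invented and the
       table is not copied from any Embperl module. '-lsearch' is set on the assumption that two
       start texts sharing a prefix count as tokens that cannot be clearly separated.

```perl
use strict;
use warnings;

# Illustrative token table. Token names are made up for this example; only
# the key conventions (leading dash for options) follow the text above.
my %table = (
    '-lsearch'       => 1,                             # option: leading dash
    'Percent block'  => { text => '<%',  end => '%>' },
    'Percent output' => { text => '<%=', end => '%>' },
);

# Separate options from named tokens by looking at the first character.
my (%options, %tokens);
for my $key (keys %table) {
    if ($key =~ /^-/) { $options{$key} = $table{$key} }
    else              { $tokens{$key}  = $table{$key} }
}

print 'options: ', join(', ', sort keys %options), "\n";
print 'tokens:  ', join(', ', sort keys %tokens),  "\n";
```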

   Tokendescription
       Each token can have the following members:

       'text' => '<'
           Start text

       'end'  => '>'
           End text

       'matchall'
           When set to 1, a new token starts at the next character; when set to -1, a new token
           starts at the next character, but only if it is the first token inside another one.

       'nodename'
           Text that should be output when the node is stringified. Defaults to text.  If the
           first character is a ':' you can specify the surrounding delimiters for this tag with
           :<start>:<end>:<text>:<endtag>. Example:  ':{:}:NAME' .  If the nodename starts with a
           '!' a unique internal id is generated, so two or more nodenames with the same text can
           have different meanings in different contexts.

       'contains'   => 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_0123456789'
           Token consists of the following characters. Either "text" and "end" or "contains"
           can be specified.

           NOTE: If an item that only specifies contains but no text should be compiled, you must
           specify a nodename.

       'unescape' => 1
           If "optRawInput" isn't set, unescape the data inside the node.

       'nodetype'   => ntypEndTag
           Type of the node

       'cdatatype'  => ntypAttrValue
           Type of nodes for data (which is not matched by 'inside' definitions) inside this
           node. Set to zero to not generate any nodes for text inside of this node, other than
           those that are matched by an 'inside' definition.

       'endtag'
           Name of the tag that marks the end of a block. This is used by the parser to track
           correct nesting.

       'follow' => \%tokenlist
           Hashref that specifies one or more tokens that must follow this token.

       'inside' => \%tokenlist
           Hashref that specifies one or more tokens that could occur inside a node that is
           started with this token.

       exitinside
           When the token is found, the parser stops searching in the current level and continues
           with the tokens that are defined in the hash from which the current one was "called"
           via inside.

       donteat
           Set to 1 to not eat the start text, so it will be matched again by any tokens set
           under "inside". Set to 2 to not eat the end text. Set to 3 for both.

       'procinfo' =>
           Processor info. Hashref with information on how to process this token.
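
       Because 'inside' and 'follow' hold further token lists, a token description is a
       recursive structure. The sketch below builds a hypothetical description for a "{{ name }}"
       placeholder and lists every token by its nesting path. All token names, the delimiters
       and the helper token_paths are invented for the illustration; only the member keys come
       from the list above.

```perl
use strict;
use warnings;

# Hypothetical token description; names and delimiters are invented.
my %tokens = (
    'Placeholder' => {
        text   => '{{',
        end    => '}}',
        inside => {
            'Identifier' => {
                # Same character set as in the 'contains' entry above.
                contains => join('', 'a' .. 'z', 'A' .. 'Z', '_', 0 .. 9),
                nodename => 'identifier',   # needed: 'contains' without 'text'
            },
        },
    },
);

# Walk 'follow' and 'inside' recursively and return one path per token.
sub token_paths {
    my ($list, $prefix) = @_;
    my @names = grep { !/^-/ } keys %$list;     # skip options such as -lsearch
    my @paths;
    for my $name (sort @names) {
        my $desc = $list->{$name};
        my $path = defined $prefix ? "$prefix/$name" : $name;
        push @paths, $path;
        for my $member (qw(follow inside)) {
            next unless ref($desc->{$member}) eq 'HASH';
            push @paths, token_paths($desc->{$member}, $path);
        }
    }
    return @paths;
}

print "$_\n" for token_paths(\%tokens);
```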

   Processor info
       The processor info gives information on how to compile this token to valid code that can
       be executed later on by the processor. There could be information for multiple processors.
       At the moment only the embperl processor is defined. Normally you need not worry about
       different processors, because the syntax object knows internally that all procinfo is for
       the embperl processor. procinfo is a parameter to many methods; it is a hashref and can
       take the following items:

       perlcode => <string> or <arrayref>
           Code to generate. You can also specify an arrayref of strings.  The first string which
           contains matching attributes is used.  The following special strings are replaced:

           %#<N>%
               Text of childnode number <N> (starting with zero)

           %><N>%
               Text of sibling node number <N> .  0 gives the current node, > 0 gives the Nth
               next node, < 0 gives the Nth previous node.

           %&<attr>%
               Value of attribute <attr>.

           %^<stackname>%
               Stringvalue of given stack

           %?<stackname>%
               Set if stackvalue was used

           %$n%
               Source Dom Tree, Index of current node.

           %$t%
               Source Dom Tree

           %$x%
               Index of current node

           %$l%
               Index of last node

           %$c%
               Sets the current node Index, if not already done

           %$q%
               Index of source Dom Tree

           %$p%
               Number of current checkpoint

           %%  Gives a single %

           All of the above special values (except those starting with $) allow the following
           modifiers:

           %<X>*<N>%
               Attribute/Child etc. must exist.

           %<X>!<N>%
               Attribute/Child etc. must not exist.

           %<X>=<N>:<value1>|<value2>|<value3>%
               Attribute/Child etc. must have the value = <value1> or <value2> etc.

           %<X>~<N>:<value1>|<value2>|<value3>%
               Attribute/Child etc. must contain the substring <value1> or <value2> etc.  and a
               non alphanum character must follow the substring.

           Writing a minus sign (-) after * ! = or ~ will cause the child/attribute not to be
           included, but the condition is still evaluated. Writing an ' will cause the value to
           be quoted.

       perlcodeend => <string>
           Code to generate at the end of the block.

       compiletimeperlcode => <string> or <arrayref>
           Code that is executed at compile time. You can also specify an arrayref of strings.
           The first string which contains matching attributes is used.  The same special strings
           are replaced as in "perlcode".

           $_[0] contains the Embperl request object. The method "Code" can be used to get or set
           the perl code that should be generated by this node.

           If the code begins with #!-, all newlines are removed from the code. This is mainly
           useful to keep all code on the same line, so the line number in error reporting
           matches the line in the source.

       compiletimeperlcodeend => <string>
           Code that is executed at compile time, but at the end of the tag.  The same special
           strings are replaced as in "perlcode".

           $_[0] contains the Embperl request object. The method "Code" can be used to get or set
           the perl code that should be generated by this node.

           If the code begins with #!-, all newlines are removed from the code. This is mainly
           useful to keep all code on the same line, so the line number in error reporting
           matches the line in the source.

       perlcoderemove => 0/1
           Remove perlcode if perlcodeend condition is not met.

       removenode => <removelevel>
           Remove node after compiling. <removelevel> can be one of the following values; values
           can be added together to combine them:

           1   Remove this node only.

           2   Remove next node if it consists of only white space and optKeepSpaces isn't set.

           4   Replace next node with one space if next node consists only of white space and
               optKeepSpaces isn't set.

           8   Set this node to ignore for output.

           16  Remove all child nodes.

           32  Set all child nodes to ignore for output.

           64  Calculate attribute values of this node also for nodes that are set to ignore for
               output (only makes sense if 8 is also set).

       removespaces => <removeflags>
           Remove spaces before or after tag.

           1   Remove all white space before tag

           2   Remove all white space after tag

           4   Remove spaces and tabs before tag

           8   Remove spaces and tabs after tag

           16  Remove all spaces and tabs but one before tag

           32  Remove all white space after text inside of tag

           64  Remove spaces and tabs after text inside of tag

       mayjump => 0/1
           If set, tells the compiler that this code may jump to another program location.  (e.g.
           if, while, goto etc.).  Could also be a condition as described under perlcode.

       compilechilds => 0/1
           Compile child nodes. Default: 1

       stackname => <name>
           Name of stack for "push", "stackmatch"

       stackname2 => <name>
           Name of stack for "push2"

       push => <value>
           Push value on the stack whose name is given with "stackname". Value could include the
           same special values as "perlcode".

       push2 => <value>
           Push value on the stack whose name is given with "stackname2". Value could include the
           same special values as "perlcode".

       stackmatch => <value>
           Check if the value on the stack whose name is given with "stackname" is the same as
           the given value. If not, give an error message about tag mismatch. Value could include
           the same special values as "perlcode".

       switchcodetype => <1/2>
           1 means put the following code into normal code, which is executed every time the page
           is requested.

           2 means put the following code into code which is executed directly after
           compilation.  This is mainly for defining subs, using modules etc.

       addflags
       cdatatype
       forcetype
       insidemustexist
       matchall
       exitinside
       addfirstchild
       starttag
       endtag
       parsetimeperlcode
       contains
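
       To make the "perlcode" placeholders more concrete, the sketch below defines a hypothetical
       procinfo hash and a toy expander for three of the special strings: %#<N>%, %&<attr>% and
       %%. It is a simplified model written for this page, not Embperl's compiler; modifiers,
       stacks and the %$...% values are not handled, and greet() and expand_perlcode() are
       invented names.

```perl
use strict;
use warnings;

# Hypothetical procinfo for one token; greet() is an invented function name.
my %procinfo = (
    perlcode      => 'print greet(%&name%, %#0%), "%%" ;',
    removenode    => 1,    # remove this node after compiling
    compilechilds => 0,    # do not compile child nodes
);

# Toy model of placeholder replacement (not Embperl's implementation).
sub expand_perlcode {
    my ($template, $children, $attrs) = @_;
    $template =~ s{
        % (?: (%)               # a doubled percent sign gives a single one
            | \# (\d+) %        # text of child node number N
            | &  ([\w:-]+) %    # value of the named attribute
          )
    }{
        defined $1 ? '%'
      : defined $2 ? (defined $children->[$2] ? $children->[$2] : '')
      :              (defined $attrs->{$3}    ? $attrs->{$3}    : '')
    }gex;
    return $template;
}

# Child node 0 holds the text '$title'; attribute "name" holds '"World"'.
my $code = expand_perlcode($procinfo{perlcode}, ['$title'], { name => '"World"' });
print "$code\n";
```

       Missing children or attributes expand to an empty string in this model; Embperl instead
       lets the * and ! modifiers decide whether a given code string applies at all.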