Provided by: libtacacs+1-dev_4.0.4.19-11build1_all bug


       regcomp, regexec, regsub, regerror - regular expression handler


       #include <regexp.h>

       regexp *regcomp(exp)
       char *exp;

       int regexec(prog, string)
       regexp *prog;
       char *string;

       regsub(prog, source, dest)
       regexp *prog;
       char *source;
       char *dest;

       char *msg;


       These functions implement egrep(1)-style regular expressions and supporting facilities.

       Regcomp  compiles  a  regular  expression  into  a structure of type regexp, and returns a
       pointer to it.  The space has been allocated using malloc(3) and may be released by free.

       Regexec matches a NUL-terminated string against the compiled regular expression  in  prog.
       It  returns 1 for success and 0 for failure, and adjusts the contents of prog's startp and
       endp (see below) accordingly.

       The members of a regexp structure include at  least  the  following  (not  necessarily  in

              char *startp[NSUBEXP];
              char *endp[NSUBEXP];

       where  NSUBEXP  is defined (as 10) in the header file.  Once a successful regexec has been
       done using the regexp, each startp-endp pair describes one substring  within  the  string,
       with  the startp pointing to the first character of the substring and the endp pointing to
       the first character following the substring.  The 0th substring is the substring of string
       that  matched  the whole regular expression.  The others are those substrings that matched
       parenthesized expressions within the regular expression,  with  parenthesized  expressions
       numbered in left-to-right order of their opening parentheses.

       Regsub  copies  source  to dest, making substitutions according to the most recent regexec
       performed using prog.  Each instance of  `&'  in  source  is  replaced  by  the  substring
       indicated  by  startp[0]  and  endp[0].   Each  instance  of  `\n', where n is a digit, is
       replaced by the substring indicated by startp[n] and endp[n].  To get  a  literal  `&'  or
       `\n'  into dest, prefix it with `\'; to get a literal `\' preceding `&' or `\n', prefix it
       with another `\'.

       Regerror is called whenever an error is detected in  regcomp,  regexec,  or  regsub.   The
       default  regerror  writes  the  string  msg,  with  a suitable indicator of origin, on the
       standard error output and invokes exit(2).  Regerror can be replaced by the user if  other
       actions are desirable.


       A regular expression is zero or more branches, separated by `|'.  It matches anything that
       matches one of the branches.

       A branch is zero or more pieces, concatenated.  It matches a match for the first, followed
       by a match for the second, etc.

       A piece is an atom possibly followed by `*', `+', or `?'.  An atom followed by `*' matches
       a sequence of 0 or more matches of the atom.  An atom followed by `+' matches  a  sequence
       of 1 or more matches of the atom.  An atom followed by `?' matches a match of the atom, or
       the null string.

       An atom is a  regular  expression  in  parentheses  (matching  a  match  for  the  regular
       expression),  a range (see below), `.'  (matching any single character), `^' (matching the
       null string at the beginning of the input string), `$' (matching the null  string  at  the
       end  of the input string), a `\' followed by a single character (matching that character),
       or a single character with no other significance (matching that character).

       A range is a sequence of characters enclosed in `[]'.   It  normally  matches  any  single
       character  from  the  sequence.   If  the  sequence begins with `^', it matches any single
       character not from the rest of the sequence.   If  two  characters  in  the  sequence  are
       separated  by  `-',  this  is shorthand for the full list of ASCII characters between them
       (e.g. `[0-9]' matches any decimal digit).  To include a literal `]' in the sequence,  make
       it  the first character (following a possible `^').  To include a literal `-', make it the
       first or last character.


       If a regular expression could match two different parts of the input string, it will match
       the  one  which  begins  earliest.  If both begin in the same place    but match different
       lengths, or match the same length in different ways, life gets messier, as follows.

       In general, the possibilities in a list of branches are considered in left-to-right order,
       the  possibilities  for  `*', `+', and `?' are considered longest-first, nested constructs
       are considered from the outermost in, and concatenated constructs are considered leftmost-
       first.  The match that will be chosen is the one that uses the earliest possibility in the
       first choice that has to be made.  If there is more than one choice, the next will be made
       in  the  same  manner  (earliest possibility) subject to the decision on the first choice.
       And so forth.

       For example, `(ab|a)b*c' could match `abc' in one  of  two  ways.   The  first  choice  is
       between  `ab' and `a'; since `ab' is earlier, and does lead to a successful overall match,
       it is chosen.  Since the `b'  is  already  spoken  for,  the  `b*'  must  match  its  last
       possibility—the empty string—since it must respect the earlier choice.

       In  the  particular case where no `|'s are present and there is only one `*', `+', or `?',
       the net effect is that the longest possible match will be  chosen.   So  `ab*',  presented
       with  `xabbbby',  will match `abbbb'.  Note that if `ab*' is tried against `xabyabbbz', it
       will match `ab' just after `x', due to the begins-earliest rule.  (In effect, the decision
       on  where to start the match is the first choice to be made, hence subsequent choices must
       respect it even if this leads them to less-preferred alternatives.)


       egrep(1), expr(1)


       Regcomp returns NULL for a  failure  (regerror  permitting),  where  failures  are  syntax
       errors,  exceeding  implementation  limits,  or  applying  `+'  or  `*' to a possibly-null


       Both code and manual page were written at U of T.  They are intended to be compatible with
       the Bell V8 regexp(3), but are not derived from Bell code.


       Empty branches and empty regular expressions are not portable to V8.

       The  restriction  against applying `*' or `+' to a possibly-null operand is an artifact of
       the simplistic implementation.

       Does not support egrep's  newline-separated  branches;  neither  does  the  V8  regexp(3),

       Due  to  emphasis  on  compactness and simplicity, it's not strikingly fast.  It does give
       special attention to handling simple cases quickly.

                                           2 April 1986                                 REGEXP(3)