Provided by: libregexp-common-perl_2017060201-2_all

NAME

       Regexp::Common::comment -- provide regexes for comments.

SYNOPSIS

           use Regexp::Common qw /comment/;

           while (<>) {
               /$RE{comment}{C}/       and  print "Contains a C comment\n";
               /$RE{comment}{C++}/     and  print "Contains a C++ comment\n";
               /$RE{comment}{PHP}/     and  print "Contains a PHP comment\n";
               /$RE{comment}{Java}/    and  print "Contains a Java comment\n";
               /$RE{comment}{Perl}/    and  print "Contains a Perl comment\n";
               /$RE{comment}{awk}/     and  print "Contains an awk comment\n";
               /$RE{comment}{HTML}/    and  print "Contains an HTML comment\n";
           }

           use Regexp::Common qw /comment RE_comment_HTML/;

           while (<>) {
               $_ =~ RE_comment_HTML() and  print "Contains an HTML comment\n";
           }

DESCRIPTION

       Please consult the manual of Regexp::Common for a general description of how this
       interface works.

       Do not use this module directly, but load it via Regexp::Common.

       This module provides regular expressions for comments in various languages.

   THE LANGUAGES
       Below, the comments of each of the languages are described.  The patterns are available as
       $RE{comment}{LANG}, for each language LANG. Some languages have variants; the entry for
       such a language describes how to get the patterns for its variants.  Unless mentioned
       otherwise, "{-keep}" sets $1, $2, $3 and $4 to the entire comment, the opening marker, the
       content of the comment, and the closing marker (for many languages, the latter is a
       newline) respectively.
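
       As a minimal sketch of the default "{-keep}" behaviour described above, the following
       matches a C comment and collects the four captures in list context. The input string
       $source and the variable names are made up for illustration; they are not part of the
       module.

```perl
use strict;
use warnings;
use Regexp::Common qw /comment/;

# Hypothetical input: one line of C source with a trailing comment.
my $source = "int n = 0; /* loop counter */\n";

# {-keep} turns on the capturing groups of the pattern.
my $c_comment = $RE{comment}{C}{-keep};

# In list context, a successful match returns $1 .. $4.
my ($whole, $open, $content, $close) = $source =~ /$c_comment/;

print "comment: [$whole]\n";    # the entire comment
print "open:    [$open]\n";     # the opening marker
print "content: [$content]\n";  # the text between the markers
print "close:   [$close]\n";    # the closing marker
```

       Storing the pattern in a variable first keeps the "{-keep}" subscript out of the regex
       itself, which avoids any doubt about how it interpolates.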

       ABC Comments in ABC start with a backslash ("\"), and last till the end of the line.  See
           <http://homepages.cwi.nl/%7Esteven/abc/>.

       Ada Comments in Ada start with "--", and last till the end of the line.

       Advisor
           Advisor is a language used by the HP product glance. Comments for this language start
           with either "#" or "//", and last till the end of the line.

       Advsys
           Comments for the Advsys language start with ";" and last till the end of the line. See
           also <http://www.wurb.com/if/devsys/12>.

       Alan
           Alan comments start with "--", and last till the end of the line.  See also
           <http://w1.132.telia.com/~u13207378/alan/manual/alanTOC.html>.

       Algol 60
           Comments in the Algol 60 language start with the keyword "comment", and end with a
           ";". See <http://www.masswerk.at/algol60/report.htm>.

       Algol 68
           In Algol 68, comments are either delimited by "#", or by one of the keywords "co" or
           "comment". The keywords should not be part of another word. See
           <http://westein.arb-phys.uni-dortmund.de/~wb/a68s.txt>.  With "{-keep}", only $1 will
           be set, returning the entire comment.

       ALPACA
           The ALPACA language has comments starting with "/*" and ending with "*/".

       awk The awk programming language uses comments that start with "#" and end at the end of
           the line.

       B   The B language has comments starting with "/*" and ending with "*/".

       BASIC
           There are various forms of BASIC around. Currently, we only support the variant
           supported by mvEnterprise, whose pattern is available as
           $RE{comment}{BASIC}{mvEnterprise}. Comments in this language start with a "!", a "*"
           or the keyword "REM", and last till the end of the line. See
           <http://www.rainingdata.com/products/beta/docs/mve/50/ReferenceManual/Basic.pdf>.

       Beatnik
           The esoteric language Beatnik only uses words consisting of letters.  Words are scored
           according to the rules of Scrabble. Words scoring less than 5 points, or 18 points or
           more, are considered comments (although the compiler might mock you if you score
           less than 5 points).  Regardless of whether "{-keep}" is used, $1 will be set to the
           entire comment. This pattern requires perl 5.8.0 or newer.

       beta-Juliet
           The beta-Juliet programming language has comments that start with "//" and that
           continue till the end of the line. See also
           <http://www.catseye.mb.ca/esoteric/b-juliet/index.html>.

       Befunge-98
           The esoteric language Befunge-98 uses comments that start and end with a ";". See
           <http://www.catseye.mb.ca/esoteric/befunge/98/spec98.html>.

       BML BML, or Better Markup Language, is an HTML templating language that uses comments
           starting with "<?c_", and ending with "c_?>".  See
           <http://www.livejournal.com/doc/server/bml.index.html>.

       Brainfuck
           The minimal language Brainfuck uses only eight characters, "<", ">", "[", "]", "+",
           "-", "." and ",".  Any other characters are considered comments. With "{-keep}", $1 is
           set to the entire comment.

       C   The C language has comments starting with "/*" and ending with "*/".

       C-- The C-- language has comments starting with "/*" and ending with "*/".  See
           <http://cs.uas.arizona.edu/classes/453/programs/C--Spec.html>.

       C++ The C++ language has two forms of comments: comments that start with "//" and last
           till the end of the line, and comments that start with "/*" and end with "*/". If
           "{-keep}" is used, only $1 will be set, and set to the entire comment.

       C#  The C# language has two forms of comments: comments that start with "//" and last till
           the end of the line, and comments that start with "/*" and end with "*/". If
           "{-keep}" is used, only $1 will be set, and set to the entire comment.  See
           <http://msdn.microsoft.com/library/default.asp?url=/library/en-us/csspec/html/vclrfcsharpspec_C.asp>.

       Caml
           Comments in Caml start with "(*", end with "*)", and can be nested.  See
           <http://www.cs.caltech.edu/courses/cs134/cs134b/book.pdf> and
           <http://pauillac.inria.fr/caml/index-eng.html>.

       Cg  The Cg language has two forms of comments: comments that start with "//" and last till
           the end of the line, and comments that start with "/*" and end with "*/". If
           "{-keep}" is used, only $1 will be set, and set to the entire comment.  See
           <http://developer.nvidia.com/attach/3722>.

       CLU In "CLU", a comment starts with a percent sign ("%"), and ends with the next newline.
           See <ftp://ftp.lcs.mit.edu:/pub/pclu/CLU-syntax.ps> and
           <http://www.pmg.lcs.mit.edu/CLU.html>.

       COBOL
           Traditionally, comments in COBOL are indicated by an asterisk in the seventh column.
           This is what the pattern matches. Modern compilers may be more lenient, though. See
           <http://www.csis.ul.ie/cobol/Course/COBOLIntro.htm>, and
           <http://www.csis.ul.ie/cobol/default.htm>.

       CQL Comments in the chess query language (CQL) start with a semicolon (";") and last till
           the end of the line. See <http://www.rbnn.com/cql/>.

       Crystal Report
           The formula editor in Crystal Reports uses comments that start with "//", and end with
           the end of the line.

       Dylan
           There are two types of comments in Dylan. They either start with "//", or are nested
           comments, delimited with "/*" and "*/".  Under "{-keep}", only $1 will be set,
           returning the entire comment.  This pattern requires perl 5.6.0 or newer.

       ECMAScript
           The ECMAScript language has two forms of comments: comments that start with "//" and
           last till the end of the line, and comments that start with "/*" and end with "*/".
           If "{-keep}" is used, only $1 will be set, and set to the entire comment. JavaScript
           is Netscape's implementation of ECMAScript. See
           <http://www.ecma-international.org/publications/files/ecma-st/Ecma-262.pdf>, and
           <http://www.ecma-international.org/publications/standards/Ecma-262.htm>.

       Eiffel
           Eiffel comments start with "--", and last till the end of the line.

       False
           In False, comments start with "{" and end with "}".  See
           <http://wouter.fov120.com/false/false.txt>.

       FPL The FPL language has two forms of comments: comments that start with "//" and last
           till the end of the line, and comments that start with "/*" and end with "*/". If
           "{-keep}" is used, only $1 will be set, and set to the entire comment.

       Forth
           Comments in Forth start with "\", and end with the end of the line.  See also
           <http://docs.sun.com/sb/doc/806-1377-10>.

       Fortran
           There are two forms of Fortran. Free form Fortran has comments that start with "!"
           and end at the end of the line.  The pattern for this is given by
           $RE{comment}{Fortran}. Fixed form Fortran, which has been obsoleted, has comments that
           start with "C", "c" or "*" in the first column, or with "!" anywhere except the sixth
           column.  The pattern for this is given by $RE{comment}{Fortran}{fixed}.

           See also <http://www.cray.com/craydoc/manuals/007-3692-005/html-007-3692-005/>.

       Funge-98
           The esoteric language Funge-98 uses comments that start and end with a ";".

       fvwm2
           Configuration files for fvwm2 have comments starting with a "#" and lasting the rest
           of the line.

       Haifu
           Haifu, an esoteric language using haikus, has comments starting and ending with a ",".
           See <http://www.dangermouse.net/esoteric/haifu.html>.

       Haskell
           There are two types of comments in Haskell. They either start with at least two
           dashes, or are nested comments, delimited with "{-" and "-}".  Under "{-keep}", only
           $1 will be set, returning the entire comment.  This pattern requires perl 5.6.0 or
           newer.

       HTML
           In HTML, comments only appear inside a comment declaration.  A comment declaration
           starts with a "<!", and ends with a ">". Inside this declaration, we have zero or more
           comments.  Comments start with "--" and end with "--", and are optionally followed by
           whitespace. The pattern $RE{comment}{HTML} recognizes those comment declarations (and
           hence more than a comment).  Note that this is not the same as something that starts
           with "<!--" and ends with "-->", because the following will be matched completely:

               <!--  First  Comment   --
                 --> Second Comment <!--
                 --  Third  Comment   -->

           Do not be fooled by what your favourite browser thinks is an HTML comment.

           If "{-keep}" is used, the following are returned:

           $1  captures the entire comment declaration.

           $2  captures the MDO (markup declaration open), "<!".

           $3  captures the content between the MDO and the MDC.

           $4  captures the (last) comment, without the surrounding dashes.

           $5  captures the MDC (markup declaration close), ">".
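
           To illustrate the five captures listed above, here is a small sketch. The input
           string $html is a made-up example holding one comment declaration that contains two
           comments; the variable names are likewise only illustrative.

```perl
use strict;
use warnings;
use Regexp::Common qw /comment/;

# Hypothetical input: a single comment declaration holding two comments.
my $html = "<p>text</p><!-- first -- -- second -->";

my $declaration = $RE{comment}{HTML}{-keep};

# In list context, a successful match returns $1 .. $5.
my ($whole, $mdo, $between, $last, $mdc) = $html =~ /$declaration/;

print "declaration:  [$whole]\n";    # entire comment declaration
print "MDO:          [$mdo]\n";      # markup declaration open
print "between:      [$between]\n";  # everything between MDO and MDC
print "last comment: [$last]\n";     # last comment, without its dashes
print "MDC:          [$mdc]\n";      # markup declaration close
```

           Note that $4 only holds the last comment of the declaration; earlier comments are
           available only as part of $3.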

       Hugo
           There are two types of comments in Hugo. They either start with "!" (which cannot be
           followed by a "\"), or are nested comments, delimited with "!\" and "\!".  Under
           "{-keep}", only $1 will be set, returning the entire comment.  This pattern requires
           perl 5.6.0 or newer.

       Icon
           Icon has comments that start with "#" and end at the next newline.  See
           <http://www.toolsofcomputing.com/IconHandbook/IconHandbook.pdf>,
           <http://www.cs.arizona.edu/icon/index.htm>, and
           <http://burks.bton.ac.uk/burks/language/icon/index.htm>.

       ILLGOL
           The esoteric language ILLGOL uses comments starting with "NB" and lasting till the end
           of the line.  See <http://www.catseye.mb.ca/esoteric/illgol/index.html>.

       INTERCAL
           Comments in INTERCAL are single line comments. They start with one of the keywords
           "NOT" or "N'T", and can optionally be preceded by the keywords "DO" and "PLEASE". If
           both keywords are used, "PLEASE" precedes "DO". Keywords are separated by whitespace.

       J   The language J uses comments that start with "NB.", and that last till the end of the
           line. See <http://www.jsoftware.com/books/help/primer/contents.htm>, and
           <http://www.jsoftware.com/>.

       Java
           The Java language has two forms of comments: comments that start with "//" and last
           till the end of the line, and comments that start with "/*" and end with "*/". If
           "{-keep}" is used, only $1 will be set, and set to the entire comment.

       JavaDoc
           The Javadoc documentation syntax is demarcated with a subset of ordinary Java comments
           to separate it from code.  Comments start with "/**" and end with "*/".  If "{-keep}" is
           used, only $1 will be set, and set to the entire comment. See
           <http://www.oracle.com/technetwork/java/javase/documentation/index-137868.html#format>.

       JavaScript
           The JavaScript language has two forms of comments: comments that start with "//" and
           last till the end of the line, and comments that start with "/*" and end with "*/".
           If "{-keep}" is used, only $1 will be set, and set to the entire comment. JavaScript
           is Netscape's implementation of ECMAScript.  See
           <http://www.mozilla.org/js/language/E262-3.pdf>, and
           <http://www.mozilla.org/js/language/>.

       LaTeX
           The documentation language LaTeX uses comments starting with "%" and ending at the end
           of the line.

       Lisp
           Comments in Lisp start with a semi-colon (";") and last till the end of the line.

       LPC The LPC language has comments starting with "/*" and ending with "*/".

       LOGO
           Comments for the language LOGO start with ";", and last till the end of the line.

       lua Comments for the lua language start with "--", and last till the end of the line. See
           also <http://www.lua.org/manual/manual.html>.

       M, MUMPS
           In "M" (aka "MUMPS"), comments start with a semi-colon, and last till the end of a
           line. The language specification requires the semi-colon to be preceded by one or more
           linestart characters.  Those characters default to a space, but that's configurable.
           This requirement, that the comment be preceded by linestart characters, is not tested
           for. See <ftp://ftp.intersys.com/pub/openm/ism/ism64docs.zip>,
           <http://mtechnology.intersys.com/mproducts/openm/index.html>, and
           <http://mcenter.com/mtrc/index.html>.

       m4  By default, the preprocessor language m4 uses single line comments that start with a
           "#" and continue to the end of the line, including the newline. The pattern
           $RE{comment}{m4} matches such comments.  In m4, it is possible to change the starting
           token though.  See <http://wolfram.schneider.org/bsd/7thEdManVol2/m4/m4.pdf>,
           <http://www.cs.stir.ac.uk/~kjt/research/pdf/expl-m4.pdf>, and
           <http://www.gnu.org/software/m4/manual/>.

       Modula-2
           In "Modula-2", comments start with "(*", and end with "*)". Comments may be nested.
           See <http://www.modula2.org/>.

       Modula-3
           In "Modula-3", comments start with "(*", and end with "*)". Comments may be nested.
           See <http://www.m3.org/>.

       mutt
           Configuration files for mutt have comments starting with a "#" and lasting the rest of
           the line.

       Nickle
           The Nickle language has one line comments starting with "#" (like Perl), or multiline
           comments delimited by "/*" and "*/" (like C). Under "{-keep}", only $1 will be set. See
           also <http://www.nickle.org>.

       Oberon
           Comments in Oberon start with "(*" and end with "*)".  See
           <http://www.oberon.ethz.ch/oreport.html>.

       Pascal
           There are many implementations of Pascal. This module provides patterns for comments
           of several implementations.

           $RE{comment}{Pascal}
               This is the pattern that recognizes comments according to the Pascal ISO standard.
               This standard says that comments start with either "{", or "(*", and end with "}"
               or "*)". This means that "{*)" and "(*}" are considered to be comments. Many
               Pascal applications don't allow this.  See
               <http://www.pascal-central.com/docs/iso10206.txt>.

           $RE{comment}{Pascal}{Alice}
               The Alice Pascal compiler accepts comments that start with "{" and end with "}".
               Comments are not allowed to contain newlines.  See
               <http://www.templetons.com/brad/alice/language/>.

           $RE{comment}{Pascal}{Delphi}, $RE{comment}{Pascal}{Free} and $RE{comment}{Pascal}{GPC}
               The Delphi Pascal, Free Pascal and the GNU Pascal Compiler implementations of
               Pascal all have comments that either start with "//" and last till the end of the
               line, are delimited with "{" and "}" or are delimited with "(*" and "*)". Patterns
               for those comments are given by $RE{comment}{Pascal}{Delphi},
               $RE{comment}{Pascal}{Free} and $RE{comment}{Pascal}{GPC} respectively. These
               patterns only set $1 when "{-keep}" is used, which will then include the entire
               comment.

               See <http://info.borland.com/techpubs/delphi5/oplg/>,
               <http://www.freepascal.org/docs-html/ref/ref.html> and
               <http://www.gnu-pascal.de/gpc/>.

           $RE{comment}{Pascal}{Workshop}
               The Workshop Pascal compiler, from Sun Microsystems, allows comments that are
               delimited with either "{" and "}", delimited with "(*" and "*)", delimited with
               "/*" and "*/", or starting and ending with a double quote ("""). When "{-keep}"
               is used, only $1 is set, and returns the entire comment.

               See <http://docs.sun.com/db/doc/802-5762>.

       PEARL
           Comments in PEARL start with a "!" and last till the end of the line, or start with
           "/*" and end with "*/". With "{-keep}", $1 will be set to the entire comment.

       PHP Comments in PHP start with either "#" or "//" and last till the end of the line, or
           are delimited by "/*" and "*/". With "{-keep}", $1 will be set to the entire comment.

       PL/B
           In PL/B, comments start with either "." or ";", and end with the next newline. See
           <http://www.mmcctech.com/pl-b/plb-0010.htm>.

       PL/I
           The PL/I language has comments starting with "/*" and ending with "*/".

       PL/SQL
           In PL/SQL, comments either start with "--" and run till the end of the line, or start
           with "/*" and end with "*/".

       Perl
           Perl uses comments that start with a "#", and continue till the end of the line.

       Portia
           The Portia programming language has comments that start with "//", and last till the
           end of the line.

       Python
           Python uses comments that start with a "#", and continue till the end of the line.

       Q-BAL
           Comments in the Q-BAL language start with "`" (a backtick), and continue till the end
           of the line.

       QML In "QML", comments start with "#" and last till the end of the line.  See
           <http://www.questionmark.com/uk/qml/overview.doc>.

       R   The statistical language R uses comments that start with a "#" and end with the
           following newline. See <http://www.r-project.org/>.

       REBOL
           Comments for the REBOL language start with ";" and last till the end of the line.

       Ruby
           Comments in Ruby start with "#" and last till the end of the line.

       Scheme
           Scheme comments start with ";", and last till the end of the line.  See
           <http://schemers.org/>.

       shell
           Comments in various shells start with a "#" and end at the end of the line.

       Shelta
           The esoteric language Shelta uses comments that start and end with a ";". See
           <http://www.catseye.mb.ca/esoteric/shelta/index.html>.

       SLIDE
           The SLIDE language has two forms of comments. First there is the line comment, which
           starts with a "#" and includes the rest of the line (just like Perl). Second, there is
           the multiline, nested comment, which is delimited by "(*" and "*)". Under "{-keep}",
           only $1 is set, and is set to the entire comment. See
           <http://www.cs.berkeley.edu/~ug/slide/docs/slide/spec/spec_frame_intro.shtml>.

       slrn
           Configuration files for slrn have comments starting with a "%" and lasting the rest of
           the line.

       Smalltalk
           Smalltalk uses comments that start and end with a double quote, """.

       SMITH
           Comments in the SMITH language start with ";", and last till the end of the line.

       Squeak
           In the Smalltalk variant Squeak, comments start and end with """. Double quotes can
           appear inside comments by doubling them.

       SQL Standard SQL uses comments starting with two or more dashes, and ending at the end of
           the line.

           MySQL does not follow the standard. Instead, it allows comments that start with a "#"
           or "-- " (that's two dashes and a space) ending with the following newline, and
           comments starting with "/*", and ending with the next ";" or "*/" that isn't inside
           single or double quotes. A pattern for this is returned by $RE{comment}{SQL}{MySQL}.
           With "{-keep}", only $1 will be set, and it returns the entire comment.

       Tcl In Tcl, comments start with "#" and continue till the end of the line.

       TeX The documentation language TeX uses comments starting with "%" and ending at the end
           of the line.

       troff
           The document formatting language troff uses comments starting with "\"", and
           continuing till the end of the line.

       Ubercode
           The Windows programming language Ubercode uses comments that start with "//" and
           continue to the end of the line. See <http://www.ubercode.com>.

       vi  In configuration files for the editor vi, one can use comments starting with """, and
           ending at the end of the line.

       *W  In the language *W, comments start with "||", and end with "!!".

       zonefile
           Comments in DNS zonefiles start with ";", and continue till the end of the line.

       ZZT-OOP
           The in-game language ZZT-OOP uses comments that start with a "'" character, and end at
           the following newline. See <http://dave2.rocketjump.org/rad/zzthelp/lang.html>.
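
       Variant patterns, such as the Pascal and SQL ones listed above, are reached through one
       extra key after the language name. The sketch below collects every comment that the
       Delphi variant recognises in a made-up Pascal fragment. The string $pascal and the
       array @found are hypothetical names chosen for this example.

```perl
use strict;
use warnings;
use Regexp::Common qw /comment/;

# Hypothetical Pascal fragment using all three Delphi comment styles.
my $pascal = "x := 1; // set x\n"
           . "y := 2; { set y }\n"
           . "z := 3; (* set z *)\n";

# The variant is selected with one extra key after the language name.
my $delphi = $RE{comment}{Pascal}{Delphi}{-keep};

# With {-keep}, $1 holds the entire comment for this variant.
my @found;
while ($pascal =~ /$delphi/g) {
    push @found, $1;
}

print scalar(@found), " comments found\n";
print "[$_]\n" for @found;
```

       The same approach should apply to the other variants named in this manual, for example
       $RE{comment}{SQL}{MySQL} or $RE{comment}{BASIC}{mvEnterprise}.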

REFERENCES

       [Go 90]
           Charles F. Goldfarb: The SGML Handbook. Oxford: Oxford University Press. 1990. ISBN
           0-19-853737-9. Ch. 10.3, pp 390-391.

SEE ALSO

       Regexp::Common for a general description of how to use this interface.

AUTHOR

       Damian Conway (damian@conway.org)

MAINTENANCE

       This package is maintained by Abigail (regexp-common@abigail.be).

BUGS AND IRRITATIONS

       Bound to be plenty.

       For a start, there are many common regexes missing.  Send them in to
       regexp-common@abigail.be.

LICENSE and COPYRIGHT

       This software is Copyright (c) 2001 - 2017, Damian Conway and Abigail.

       This module is free software, and may be used under any of the following licenses:

        1) The Perl Artistic License.     See the file COPYRIGHT.AL.
        2) The Perl Artistic License 2.0. See the file COPYRIGHT.AL2.
        3) The BSD License.               See the file COPYRIGHT.BSD.
        4) The MIT License.               See the file COPYRIGHT.MIT.