Provided by: tcllib_2.0+dfsg-2_all

NAME

       docstrip - Docstrip style source code extraction

SYNOPSIS

       package require Tcl 8.5 9

       package require docstrip ?1.3?

       docstrip::extract text terminals ?option value ...?

       docstrip::sourcefrom filename terminals ?option value ...?

________________________________________________________________________________________________________________

DESCRIPTION

       Docstrip  is  a tool created to support a brand of Literate Programming. It is most common in the (La)TeX
       community, where it is being used for pretty much everything from the LaTeX core and  up,  but  there  is
       nothing about docstrip which prevents using it for other types of software.

       In  short, the basic principle of literate programming is that program source should primarily be written
       and structured to suit the developers (and advanced users who want to peek "under the hood"), not to suit
       the  whims  of  a compiler or corresponding source code consumer.  This means literate sources often need
       some kind of "translation" to an illiterate form that dumb software can  understand.   The  docstrip  Tcl
       package handles this translation.

       Even for those who do not wholeheartedly subscribe to the philosophy behind literate programming,
       docstrip can bring greater clarity, in particular to:

       •      programs employing non-obvious mathematics

       •      projects where separate pieces of code,  perhaps  in  different  languages,  need  to  be  closely
              coordinated.

       The  first  is  by providing access to much more powerful typographical features for source code comments
       than are possible in plain text.  The second is because all the separate pieces of code can be kept  next
       to each other in the same source file.

       The  way  it  works is that the programmer edits directly only one or several "master" source code files,
       from which docstrip generates the more traditional "source" files compilers or the like would expect. The
       master  sources  typically  contain a large amount of documentation of the code, sometimes even in places
       where the code consumers would not  allow  any  comments.  The  etymology  of  "docstrip"  is  that  this
       documentation  was  stripped  away  (although  "code extraction" might be a better description, as it has
       always been a matter of copying selected pieces of the master source rather than deleting text from  it).
       The  docstrip  Tcl  package  contains  a  reimplementation of the basic extraction functionality from the
       docstrip program, and thus makes it possible for a Tcl interpreter  to  read  and  interpret  the  master
       source files directly.

       Readers  who  are  not  previously  familiar with docstrip but want to know more about it may consult the
       following sources.

       [1]    The tclldoc package and class, http://ctan.org/tex-archive/macros/latex/contrib/tclldoc/.

       [2]    The DocStrip utility, http://ctan.org/tex-archive/macros/latex/base/docstrip.dtx.

       [3]    The doc and shortvrb Packages, http://ctan.org/tex-archive/macros/latex/base/doc.dtx.

       [4]    Chapter 14 of The LaTeX Companion (second edition), Addison-Wesley, 2004; ISBN 0-201-36299-6.

FILE FORMAT

       The basic units docstrip operates on are the lines of a master source file. Extraction consists of
       selecting  some of these lines to be copied from input text to output text. The basic distinction is that
       between code lines (which are copied and do not begin with a percent character) and comment lines  (which
       begin with a percent character and are not copied).

                 docstrip::extract [join {
                   {% comment}
                   {% more comment !"#$%&/(}
                   {some command}
                   { % blah $blah "Not a comment."}
                   {% abc; this is comment}
                   {# def; this is code}
                   {ghi}
                   {% jkl}
                 } \n] {}

       returns the same sequence of lines as

                 join {
                   {some command}
                   { % blah $blah "Not a comment."}
                   {# def; this is code}
                   {ghi} ""
                 } \n

       It  does  not  matter  to docstrip what format is used for the documentation in the comment lines, but in
       order to do better than plain text comments, one typically uses some markup language. Most commonly LaTeX
       is used, as it is a well-established standard and also provides the best support for mathematical
       formulae, but the docstrip::util package also gives some support for doctools-like markup.

       Besides the basic code and comment lines, there are also guard lines, which begin with the two characters
       '%<', and meta-comment lines, which begin with the two characters '%%'. Within guard lines there is
       furthermore the distinction between verbatim guard lines, which begin  with  '%<<',  and  ordinary  guard
       lines, where the '%<' is not followed by another '<'. The last category is by far the most common.
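       The prefix rules above can be sketched as a small classifier. This is only an illustration: the
       helper name classifyLine is hypothetical and not part of the docstrip package, and it judges each
       line by its prefix alone, ignoring context such as an open verbatim block.

```tcl
# Hypothetical helper (not part of docstrip): name the category that a single
# master-source line falls into, judged by its prefix alone.
proc classifyLine {line} {
    # Order matters: the more specific prefixes must be tested first.
    switch -glob -- $line {
        {%<<*}  {return verbatim-guard}
        {%<*}   {return guard}
        {%%*}   {return metacomment}
        {%*}    {return comment}
        default {return code}
    }
}

foreach line {{%<*foo>} {%<<END} {%% notice} {% text} { % code} {set x 1}} {
    puts [format "%-12s -> %s" $line [classifyLine $line]]
}
```

       Note that a line such as " % code" is a code line, because the percent character is not the first
       character of the line.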

       An ordinary guard line makes extraction of the code line(s) it guards conditional on the value of a
       boolean expression; the guarded code lines are included only if the expression evaluates to true.
       The syntax of an ordinary guard line is one of

                  '%' '<' STARSLASH EXPRESSION '>'
                  '%' '<' PLUSMINUS EXPRESSION '>' CODE

       where

                  STARSLASH  ::=  '*' | '/'
                  PLUSMINUS  ::=  | '+' | '-'
                  EXPRESSION ::= SECONDARY | SECONDARY ',' EXPRESSION
                               | SECONDARY '|' EXPRESSION
                  SECONDARY  ::= PRIMARY | PRIMARY '&' SECONDARY
                  PRIMARY    ::= TERMINAL | '!' PRIMARY | '(' EXPRESSION ')'
                  CODE       ::= { any character except end-of-line }

       Comma and vertical bar both denote 'or'. Ampersand denotes 'and'. Exclamation mark denotes 'not'. A
       TERMINAL can be any nonempty string of characters not containing '>', '&', '|', comma, '(', or ')'.
       The docstrip manual is more restrictive and only guarantees proper operation for strings of letters,
       but even the LaTeX core sources make heavy use of digits in TERMINALs. The second argument of
       docstrip::extract is the list of those TERMINALs that should count as having the value 'true'; all
       other TERMINALs count as being 'false' when guard expressions are evaluated.
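       To make the grammar concrete, the following is a minimal recursive-descent sketch of how such an
       EXPRESSION could be evaluated. It is not the code used by the docstrip package; the procedure names
       (evalGuard, guardTokens, parseExpression, parseSecondary, parsePrimary) are hypothetical, and the
       sketch assumes that '!' is always an operator and never part of a TERMINAL.

```tcl
# Hypothetical helpers (not part of docstrip): evaluate a guard EXPRESSION
# against the list of TERMINALs that count as true.
proc guardTokens {expression} {
    # Operators are single characters; any other run of characters is a TERMINAL.
    return [regexp -all -inline {[,|&!()]|[^,|&!()]+} $expression]
}

proc evalGuard {expression terminals} {
    set tokens [guardTokens $expression]
    set pos 0
    set value [parseExpression]
    if {$pos != [llength $tokens]} {
        error "unexpected token \"[lindex $tokens $pos]\""
    }
    return $value
}

# EXPRESSION ::= SECONDARY | SECONDARY ',' EXPRESSION | SECONDARY '|' EXPRESSION
proc parseExpression {} {
    upvar 1 tokens tokens pos pos terminals terminals
    set value [parseSecondary]
    if {[lindex $tokens $pos] in {, |}} {
        incr pos
        set rest [parseExpression]
        set value [expr {$value || $rest}]
    }
    return $value
}

# SECONDARY ::= PRIMARY | PRIMARY '&' SECONDARY
proc parseSecondary {} {
    upvar 1 tokens tokens pos pos terminals terminals
    set value [parsePrimary]
    if {[lindex $tokens $pos] eq "&"} {
        incr pos
        set rest [parseSecondary]
        set value [expr {$value && $rest}]
    }
    return $value
}

# PRIMARY ::= TERMINAL | '!' PRIMARY | '(' EXPRESSION ')'
proc parsePrimary {} {
    upvar 1 tokens tokens pos pos terminals terminals
    set token [lindex $tokens $pos]
    incr pos
    switch -- $token {
        ! {return [expr {![parsePrimary]}]}
        ( {
            set value [parseExpression]
            if {[lindex $tokens $pos] ne ")"} {error "missing )"}
            incr pos
            return $value
        }
        "" - , - | - & - ) {error "TERMINAL expected"}
        default {return [expr {$token in $terminals}]}
    }
}

puts [evalGuard {foo&!bar} {foo}]        ;# 1: foo is true and bar is false
puts [evalGuard {foo,bar&baz} {bar}]     ;# 0: '&' binds tighter than ','
puts [evalGuard {(foo|bar)&baz} {foo baz}] ;# 1
```

       The sketch mirrors the grammar one production per procedure, which is why 'and' binds more tightly
       than either form of 'or'.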

       In the case of a '%<*EXPRESSION>' guard, the lines guarded are all lines up to the next  '%</EXPRESSION>'
       guard  with  the  same EXPRESSION (compared as strings). The blocks of code delimited by such '*' and '/'
       guard lines must be properly nested.

                 set text [join {
                    {begin}
                    {%<*foo>}
                    {1}
                    {%<*bar>}
                    {2}
                    {%</bar>}
                    {%<*!bar>}
                    {3}
                    {%</!bar>}
                    {4}
                    {%</foo>}
                    {5}
                    {%<*bar>}
                    {6}
                    {%</bar>}
                    {end}
                 } \n]
                 set res [docstrip::extract $text foo]
                 append res [docstrip::extract $text {foo bar}]
                 append res [docstrip::extract $text bar]

       sets $res to the result of

                 join {
                    {begin}
                    {1}
                    {3}
                    {4}
                    {5}
                    {end}
                    {begin}
                    {1}
                    {2}
                    {4}
                    {5}
                    {6}
                    {end}
                    {begin}
                    {5}
                    {6}
                    {end} ""
                 } \n

       In guard lines without a '*', '/', '+', or '-' modifier after the '%<', the guard applies only to the
       CODE  following  the '>' on that single line. A '+' modifier is equivalent to no modifier. A '-' modifier
       is like the case with no modifier, but the expression is implicitly negated, i.e., the CODE  of  a  '%<-'
       guard line is only included if the expression evaluates to false.

       Metacomment lines are "comment lines which should not be stripped away": they are extracted like code
       lines, and are sometimes used for copyright notices and similar material. The '%%' prefix is however
       not kept, but replaced by the current -metaprefix, which is customarily set to some "comment until end
       of line" character (or character sequence) of the language of the code being extracted.

                 set text [join {
                    {begin}
                    {%<foo> foo}
                    {%<+foo>plusfoo}
                    {%<-foo>minusfoo}
                    {middle}
                    {%% some metacomment}
                    {%<*foo>}
                    {%%another metacomment}
                    {%</foo>}
                    {end}
                 } \n]
                 set res [docstrip::extract $text foo -metaprefix {# }]
                 append res [docstrip::extract $text bar -metaprefix {#}]

       sets $res to the result of

                 join {
                    {begin}
                    { foo}
                    {plusfoo}
                    {middle}
                    {#  some metacomment}
                    {# another metacomment}
                    {end}
                    {begin}
                    {minusfoo}
                    {middle}
                    {# some metacomment}
                    {end} ""
                 } \n

       Verbatim guards can be used to force code line interpretation of a block of lines even if  some  of  them
       happen  to  look like any other type of lines to docstrip. A verbatim guard has the form '%<<END-TAG' and
       the verbatim block is terminated by the first line that is exactly '%END-TAG'.

                 set text [join {
                    {begin}
                    {%<*myblock>}
                    {some stupid()}
                    {   #computer<program>}
                    {%<<QQQ-98765}
                    {% These three lines are copied verbatim (including percents}
                    {%% even if -metaprefix is something different than %%).}
                    {%</myblock>}
                    {%QQQ-98765}
                    {   using*strange@programming<language>}
                    {%</myblock>}
                    {end}
                 } \n]
                 set res [docstrip::extract $text myblock -metaprefix {# }]
                 append res [docstrip::extract $text {}]

       sets $res to the result of

                 join {
                    {begin}
                    {some stupid()}
                    {   #computer<program>}
                    {% These three lines are copied verbatim (including percents}
                    {%% even if -metaprefix is something different than %%).}
                    {%</myblock>}
                    {   using*strange@programming<language>}
                    {end}
                    {begin}
                    {end} ""
                 } \n

       The processing of verbatim guards takes place also inside blocks of lines which due to some  outer  block
       guard will not be copied.

       The  final  piece of docstrip syntax is that extraction stops at a line that is exactly "\endinput"; this
       is often used to avoid copying random whitespace at the end of a file. In  the  unlikely  case  that  one
       wants such a code line, one can protect it with a verbatim guard.
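       A short sketch of both behaviours follows, assuming the docstrip package from tcllib is installed.
       The variable names and the end tag STOP are chosen only for this illustration.

```tcl
package require docstrip

# Extraction stops at the line that is exactly "\endinput".
set text [join {
    {first}
    {\endinput}
    {never extracted}
} \n]
set plain [docstrip::extract $text {}]

# Inside a verbatim block the same line is treated as an ordinary code line.
set text [join {
    {%<<STOP}
    {\endinput}
    {%STOP}
    {last}
} \n]
set guarded [docstrip::extract $text {}]

puts -nonewline $plain
puts -nonewline $guarded
```

       In the first call only the line before "\endinput" should be returned; in the second, the
       "\endinput" line itself should be copied and extraction should continue with the line after the
       verbatim block.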

COMMANDS

       The package defines two commands.

       docstrip::extract text terminals ?option value ...?
              The  extract  command docstrips the text and returns the extracted lines of code, as a string with
              each line terminated with a newline. The terminals argument is the list of those guard expression terminals
              which should evaluate to true.  The available options are:

              -annotate lines
                     Requests  the  specified number of lines of annotation to follow each extracted line in the
                     result. Defaults to 0. Annotation lines are mostly useful when the extracted lines  are  to
                     undergo  some  further transformation. A first annotation line is a list of three elements:
                     line type, prefix removed in extraction, and prefix inserted in extraction. The  line  type
                     is one of: 'V' (verbatim), 'M' (metacomment), '+' (+ or no modifier guard line), '-' (-
                     modifier guard line), '.' (normal line). A  second  annotation  line  is  the  source  line
                     number.  A third annotation line is the current stack of block guards. Requesting more than
                     three lines of annotation is currently not supported.

              -metaprefix string
                     The string by which the '%%' prefix of a metacomment line will  be  replaced.  Defaults  to
                     '%%'. For Tcl code this would typically be '#'.

              -onerror keyword
                     Controls what will be done when a format error in the text being processed is detected. The
                     settings are:

                     ignore Just ignore the error; continue as if nothing happened.

                     puts   Write an error message to stderr, then continue processing.

                     throw  Throw an error. The -errorcode is set to a list whose  first  element  is  DOCSTRIP,
                            second  element is the type of error, and third element is the line number where the
                            error is detected. This is the default.

              -trimlines boolean
                     Controls whether spaces at the end of a line should be trimmed  away  before  the  line  is
                     processed. Defaults to true.

              It should be remarked that the terminals are often called "options" in the context of the docstrip
              program, since these specify which optional code fragments should be included.
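              The following sketch shows one way the -annotate and -onerror options might be used together
              with extract. It assumes the docstrip package from tcllib is installed, and it assumes that
              a block end whose expression does not match the open block counts as a format error; the
              variable names are chosen only for this illustration.

```tcl
package require docstrip

set text [join {
    {%<*foo>}
    {set x 1}
    {%</foo>}
    {set y 2}
} \n]

# With -annotate 3, every extracted line is followed by three annotation lines:
# the {type removed-prefix inserted-prefix} list, the source line number, and
# the current stack of block guards.
set annotated [docstrip::extract $text {foo} -annotate 3]
foreach {code info lineno guards} [lrange [split $annotated \n] 0 end-1] {
    puts "line $lineno: $code (type [lindex $info 0], guards: $guards)"
}

# With the default -onerror throw, a format error is raised as a Tcl error
# whose -errorcode is {DOCSTRIP type lineno}.
set broken [join {
    {%<*foo>}
    {code}
    {%</bar>}
} \n]
if {[catch {docstrip::extract $broken {foo}} msg opts]} {
    lassign [dict get $opts -errorcode] tag type where
    puts "docstrip error of type $type at line $where: $msg"
}

# The same text is processed without raising an error when errors are ignored.
set lenient [docstrip::extract $broken {foo} -onerror ignore]
```

              Splitting the annotated result into groups of four lines keeps each code line together with
              its annotations, which is convenient when the extracted code is to be transformed further.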

       docstrip::sourcefrom filename terminals ?option value ...?
              The sourcefrom command is a docstripping emulation of source. It opens the  file  filename,  reads
              it,  closes  it, docstrips the contents as specified by the terminals, and evaluates the result in
              the local context of the caller, during which time the info script value will be the filename. The
              options  are  passed  on  to  fconfigure  to  configure the file before its contents are read. The
              -metaprefix is set to '#', all other extract options have their default values.
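       As an illustration of sourcefrom, the sketch below writes a tiny master source file, evaluates only
       the code guarded by one terminal, and removes the file again. It assumes the docstrip package from
       tcllib is installed; the file name docstrip_demo.dtx, the terminals pkg and test, and the procedure
       greet are all hypothetical.

```tcl
package require docstrip

# Write a small master source file (hypothetical name) in the current directory.
set path [file join [pwd] docstrip_demo.dtx]
set f [open $path w]
fconfigure $f -encoding utf-8
puts $f {% Documentation lines are skipped during extraction.}
puts $f {%<*pkg>}
puts $f {proc greet {name} {return "Hello, $name"}}
puts $f {%</pkg>}
puts $f {%<*test>}
puts $f {error "test code should not be evaluated"}
puts $f {%</test>}
close $f

# Only the block guarded by 'pkg' is evaluated; the extra option is handed
# to fconfigure before the file is read.
docstrip::sourcefrom $path {pkg} -encoding utf-8
puts [greet world]

file delete $path
```

       Because 'test' is not among the terminals, the block that would raise an error is never extracted,
       so only the definition of greet is evaluated.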

DOCUMENT STRUCTURE

       The file format (as described above) determines whether a master source code file can be processed
       correctly by docstrip, but the usefulness of the format also depends to no small extent on the
       code and comment lines together constituting a well-formed document.

       For a document format that does not require  any  non-Tcl  software,  see  the  ddt2man  command  in  the
       docstrip::util  package.  It  is suggested that files employing that document format are given the suffix
       ".ddt", to distinguish them from the more traditional LaTeX-based ".dtx" files.

       Master source files with ".dtx" extension are usually set up so that they  can  be  typeset  directly  by
       latex without any support from other files. This is achieved by beginning the file with the lines

                 % \iffalse
                 %<*driver>
                 \documentclass{tclldoc}
                 \begin{document}
                 \DocInput{filename.dtx}
                 \end{document}
                 %</driver>
                 % \fi

       or  some variation thereof. The trick is that the file gets read twice.  With normal LaTeX reading rules,
       the first two lines are comments and therefore ignored. The third line  is  the  document  preamble,  the
       fourth  line  begins the document body, and the sixth line ends the document, so LaTeX stops there — non-
       comments below that point in the file are never subjected to the normal LaTeX reading rules. Before that,
       however,  the  \DocInput command on the fifth line is processed, and that does two things: it changes the
       interpretation of '%' from "comment" to "ignored", and it inputs  the  file  specified  in  the  argument
       (which  is  normally  the  name  of the file the command is in).  It is this second time that the file is
       being read that the comments and code in it are typeset.

       The function of the \iffalse ... \fi is to skip lines two to seven on this second time through;  this  is
       similar  to  the  "if  0  {  ...  }"  idiom for block comments in Tcl code, and it is needed here because
       (amongst other things) the \documentclass command may only be executed once. The function of the <driver>
       guards  is  to prevent this short piece of LaTeX code from being extracted by docstrip.  The total effect
       is that the file can function both as a LaTeX document and as a docstrip master source code file.
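       The docstrip side of this arrangement can be sketched as follows, assuming the docstrip package from
       tcllib is installed. The terminal pkg and the procedure hello are hypothetical additions to the
       driver lines shown above.

```tcl
package require docstrip

set master [join {
    {% \iffalse}
    {%<*driver>}
    {\documentclass{tclldoc}}
    {\begin{document}}
    {\DocInput{filename.dtx}}
    {\end{document}}
    {%</driver>}
    {% \fi}
    {% Documentation typeset by LaTeX.}
    {%<*pkg>}
    {proc hello {} {return "hello"}}
    {%</pkg>}
} \n]

# Without the 'driver' terminal, the LaTeX driver is not extracted.
set code [docstrip::extract $master {pkg}]
# With it, only the driver lines are extracted.
set driver [docstrip::extract $master {driver}]

puts -nonewline $code
puts -nonewline $driver
```

       Extracting with the terminal 'driver' instead is a way to obtain the LaTeX driver as a separate
       file, should one be wanted.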

       It is not necessary to use the tclldoc document class, but that does provide a number  of  features  that
       are  convenient for ".dtx" files containing Tcl code. More information on this matter can be found in the
       references above.

SEE ALSO

       docstrip_util

KEYWORDS

       .dtx, LaTeX, docstrip, documentation, literate programming, source

CATEGORY

       Documentation tools

COPYRIGHT

       Copyright (c) 2003–2010 Lars Hellström <Lars dot Hellstrom at residenset dot net>