#### NAME

       bibclean - prettyprint and syntax check BibTeX and Scribe bibliography data base files



#### SYNOPSIS

       bibclean [ -author ] [ -error-log filename ] [ -help ] [ -? ] [ -init-file filename ]
[ -long-field fieldname ] [ -max-width nnn ] [ -[no-]align-equals ]
[ -[no-]check-values ] [ -[no-]delete-empty-values ] [ -[no-]file-position ]
[ -[no-]fix-font-changes ] [ -[no-]fix-initials ] [ -[no-]fix-names ]
[ -[no-]German-style ] [ -[no-]keep-linebreaks ] [ -[no-]keep-parbreaks ]
[ -[no-]keep-preamble-spaces ] [ -[no-]keep-spaces ] [ -[no-]keep-string-spaces ]
[ -[no-]parbreaks ] [ -[no-]prettyprint ] [ -[no-]print-patterns ]
[ -[no-]read-init-files ] [ -[no-]remove-OPT-prefixes ] [ -[no-]scribe ]
[ -[no-]trace-file-opening ] [ -[no-]warnings ] [ -version ]
( <infile | bibfile1 bibfile2 bibfile3 ... ) >outfile

All options can be abbreviated to a unique leading prefix.

An explicit file name of "-" represents standard input; it is assumed if no input files
are specified.



#### DESCRIPTION

       bibclean  prettyprints  input  BibTeX  files  to  stdout, and checks the brace balance and
bibliography entry syntax as well.  It can be used to detect problems in BibTeX files that
sometimes  confuse  even  BibTeX  itself,  and  importantly,  can be used to normalize the
appearance of collections of BibTeX files.

Here is a summary of the formatting actions:

·  BibTeX items are formatted into a consistent structure with one field  =  "value"  pair
per line, and the initial @ and trailing right brace in column 1.

·  Tabs  are  expanded  into  blank strings; their use is discouraged because they inhibit
portability, and can suffer corruption in electronic mail.

·  Long string values are split at a blank and continued onto the next line  with  leading
indentation.

·  A single blank line separates adjacent bibliography entries.

·  Text outside BibTeX entries is passed through verbatim.

·  Outer parentheses around entries are converted to braces.

·  Personal names in author and editor field values are normalized to the form
"P. D. Q. Bach", from "P.D.Q. Bach" and "Bach, P.D.Q.".

·  Hyphen sequences in page numbers are converted to en-dashes.

·  Month values are converted to standard BibTeX string abbreviations.

·  In titles, sequences of upper-case characters at brace level zero are braced to protect
them from being converted to lower-case letters by some bibliography styles.

·  CODEN,  ISBN  (International  Standard  Book  Number)  and ISSN (International Standard
Serial Number) entry values are examined to verify the checksums of each listed number,
and correct ISBN hyphenation is automatically supplied.
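
The checksum tests mentioned in the last item can be illustrated with a minimal Python sketch. It uses the standard ISBN-10 and ISSN weighted-sum-modulo-11 rules; the function names are hypothetical, CODEN checking is not covered, and nothing here is taken from bibclean's own source code.

```python
DIGITS = "0123456789"


def _weighted_sum_is_valid(chars, length):
    """Return True if the weighted digit sum is divisible by 11.

    Weights run from `length` down to 1; a final 'X' stands for the value 10.
    """
    if len(chars) != length:
        return False
    total = 0
    for position, char in enumerate(chars):
        if char in DIGITS:
            value = int(char)
        elif char == "X" and position == length - 1:
            value = 10  # 'X' is only legal as the check character
        else:
            return False
        total += (length - position) * value
    return total % 11 == 0


def isbn10_is_valid(isbn):
    """Check a 10-character ISBN, ignoring hyphens and spaces."""
    chars = [c for c in isbn.upper() if c not in "- "]
    return _weighted_sum_is_valid(chars, 10)


def issn_is_valid(issn):
    """Check an 8-character ISSN, ignoring hyphens and spaces."""
    chars = [c for c in issn.upper() if c not in "- "]
    return _weighted_sum_is_valid(chars, 8)
```

A value that fails such a test is most often the result of a mistyped or transposed digit, which is why a checksum catches errors that a simple pattern cannot.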

The  standardized  format  of  the output of bibclean facilitates the later application of
simple filters, such as bibcheck(1), bibdup(1),  bibextract(1),  bibindex(1),  bibjoin(1),
biblabel(1), biblook(1), biborder(1), bibsort(1), citefind(1), and citetags(1), to process
the text, and also is the one expected by the GNU Emacs BibTeX support functions.



#### OPTIONS

       Command-line switches may be abbreviated to a unique leading prefix, and  letter  case  is
not  significant.  All options are parsed before any input bibliography files are read, no
matter what their order on the command line.  Options that correspond to a yes/no  setting
of  a  flag  have a form with a prefix "no-" to set the flag to no.  For such options, the
last setting determines the flag value used.  This is significant when  options  are  also
specified in initialization files (see the INITIALIZATION FILES manual section).

The  leading  hyphen  that  distinguishes  an  option  from a filename may be doubled, for
compatibility with GNU and POSIX conventions.  Thus, -author and --author are equivalent.

To avoid confusion with options, if a filename begins with a hyphen, it must be  disguised
by a leading absolute or relative directory path, e.g., /tmp/-foo.bib or ./-foo.bib.

-author   Display  an author credit on the standard error unit, stderr, and then exit with
a success return code.  Sometimes an executable program is  separated  from  its
documentation and source code; this option provides a way to recover from that.

-error-log filename
Redirect  stderr to the indicated file, which will then contain all of the error
and warning messages.  This option is  provided  for  those  systems  that  have
difficulty redirecting stderr.

-help or -?
Display  a  help  message on stderr, giving a usage description, similar to this
section of the manual pages, and then exit with a success return code.

-init-file filename
Provide an explicit value pattern initialization file.   It  will  be  processed
after  any system-wide and job-wide initialization files, and may override them.
It in turn may be overridden by a subsequent file-specific initialization  file.
For further details, see the INITIALIZATION FILES manual section.

-long-field fieldname
Suppress warnings that fields named fieldname have lengths exceeding the standard
BibTeX limits.  NB! This is a Debian-specific extension!

-max-width nnn
bibclean normally limits output  line  widths  to  72  characters,  and  in  the
interests  of  consistency,  that  value  should  not be changed.  Occasionally,
special-purpose applications may require different maximum line widths, so  this
option  provides  that  capability.  The number following the option name can be
specified in decimal, octal (starting with 0),  or  hexadecimal  (starting  with
0x).  A zero or negative value is interpreted to mean unlimited, so -max-width 0
can be used to ensure that each field/value pair appears on a single line.

When -no-prettyprint requests bibclean to act as a lexical analyzer, the default
line width is unlimited, unless overridden by this option.

When  bibclean  is  prettyprinting,  line wrapping will be done only at a space.
Consequently, a long non-blank character  sequence  may  result  in  the  output
exceeding the requested line width.

When  bibclean is lexing, line wrapping is done by inserting a backslash-newline
pair when the specified maximum is reached, so no line length will  ever  exceed
the maximum.
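
As an illustration of the number forms accepted by -max-width, here is a minimal Python sketch. The function name and the UNLIMITED sentinel are hypothetical, and the parsing simply follows the description above (leading 0 for octal, leading 0x for hexadecimal, zero or negative meaning unlimited); it is not bibclean's actual code.

```python
import sys

# Hypothetical sentinel standing in for "no limit on line width".
UNLIMITED = sys.maxsize


def parse_max_width(text):
    """Parse a width given in decimal, octal (leading 0), or hexadecimal (leading 0x)."""
    digits = text.strip()
    sign = 1
    if digits.startswith(("+", "-")):
        sign = -1 if digits[0] == "-" else 1
        digits = digits[1:]
    if digits.lower().startswith("0x"):
        value = int(digits[2:], 16)
    elif digits.startswith("0") and len(digits) > 1:
        value = int(digits[1:], 8)
    else:
        value = int(digits, 10)
    value *= sign
    # Zero and negative widths are interpreted as "unlimited".
    return value if value > 0 else UNLIMITED
```

Under these assumptions, 72, 0110, and 0x48 all denote the default width of 72 characters.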

-[no-]align-equals
With  the  positive  form, align the equals sign in key/value assignments at the
same column, separated by a single space from the value string.  Otherwise,  the
equals sign follows the key, separated by a single space.  Default: no.

-[no-]check-values
With the positive form, apply heuristic pattern matching to field values in
order to detect possible errors (e.g., year = "192" instead of year = "1992"),
and issue warnings when unexpected patterns are found.

This  checking is usually beneficial, but if it produces too many bogus warnings
for a particular bibliography file, you can disable it with the negative form of
this option.  Default: yes.

-[no-]delete-empty-values
With  the  positive form, remove all field/value pairs for which the value is an
empty string.  This is helpful in cleaning up bibliographies generated from text
editor  templates.  Compare this option with -[no-]remove-OPT-prefixes described
below.  Default: no.

-[no-]file-position
With the positive form, give detailed file position information in  warning  and
error messages.  Default: no.

-[no-]fix-font-changes
With  the positive form, supply an additional brace level around font changes in
titles to protect against downcasing by some BibTeX styles.  Font  changes  that
already have more than one level of braces are not modified.

For example, if a title contains the Latin phrase {\em Dictyostelium Discoideum}
or {\em {D}ictyostelium {D}iscoideum}, then downcasing will incorrectly  convert
the  phrase to lower-case letters.  Most BibTeX users are surprised that bracing
the initial letters does not prevent the downcase action.  The correct coding is
{{\em  Dictyostelium  Discoideum}}.   However,  there  are also legitimate cases
where an extra level of bracing wrongly protects from downcasing.  Consequently,
bibclean  will  normally  not supply an extra level of braces, but if you have a
bibliography where the extra braces are routinely  missing,  you  can  use  this
option to supply them.

If  you  think  that  you  need this option, it is strongly recommended that you
apply bibclean to your bibliography file  with  and  without  -fix-font-changes,
then  compare  the  two  output  files to ensure that extra braces are not being
supplied in titles where they should not be present.  You will  have  to  decide
which  of  the  two output files is the better choice, then repair the incorrect
title bracing by hand.

Since font changes in titles are uncommon, except for cases of  the  type  which
this  option is designed to correct, it should do more good than harm.  Default:
no.

-[no-]fix-initials
With the positive form, insert a space after a period following author initials.
Default: yes.

-[no-]fix-names
With the positive form, reorder author and editor name lists to remove commas at
brace level zero, placing first names or initials before last  names.   Default:
yes.

-[no-]German-style
With  the  positive  form,  interpret  quote  characters ["] inside braced value
strings at brace level 1 according to the conventions  of  the  TeX  style  file
german.sty, which overloads quote to simplify input and representation of German
umlaut  accents,  sharp-s  (es-zet),  ligature  separators,  invisible  hyphens,
raised/lowered quotes, French guillemets, and discretionary hyphens.  Recognized
character combinations will be braced to prevent BibTeX  from  interpreting  the
quote as a string delimiter.

Quoted  strings  receive  no special handling from this option, and since German
nouns in titles must anyway be protected from the downcasing operation  of  most
BibTeX  bibliography  styles, German value strings that use the overloaded quote
character can always be entered in the form "{...}", without the need to specify
this option at all.

Default: no.

-[no-]keep-linebreaks
Normally, line breaks inside value strings are collapsed into a single space, so
that long value strings can later be  broken  to  provide  lines  of  reasonable
length.

With the positive form, linebreaks are preserved in value strings.  If
-max-width is set to zero, this preserves the original line breaks.  Spacing outside
value  strings  remains  under  bibclean's  control, and is not affected by this
option.

Default: no.

-[no-]keep-parbreaks
With the positive form, preserve paragraph breaks (either  formfeeds,  or  lines
containing  only  spaces)  in  value  strings.   Normally,  paragraph breaks are
collapsed into a single space.  Spacing  outside  value  strings  remains  under
bibclean's control, and is not affected by this option.  Default: no.

-[no-]keep-preamble-spaces
With  the  positive  form,  preserve  all  whitespace in @Preamble{...} entries.
Default: no.

-[no-]keep-spaces
With the positive  form,  preserve  all  spaces  in  value  strings.   Normally,
multiple  spaces  are  collapsed  into  a single space.  This option can be used
together with -keep-linebreaks, -keep-parbreaks, and -max-width  0  to  preserve
the  form  of  value  strings  while  still providing syntax and value checking.
Spacing outside value strings remains  under  bibclean's  control,  and  is  not
affected by this option.  Default: no.

-[no-]keep-string-spaces
With  the  positive  form,  preserve  all  whitespace  in  @String{...} entries.
Default: no.

-[no-]parbreaks
With the negative form,  a  paragraph  break  (either  a  formfeed,  or  a  line
containing   only  spaces)  is  not  permitted  in  value  strings,  or  between
field/value pairs.  This may be useful to quickly trap runaway  strings  arising
from mismatched delimiters.  Default: yes.

-[no-]prettyprint
Normally,  bibclean  functions  as  a prettyprinter.  However, with the negative
form of this option, it acts as a lexical analyzer instead, producing  a  stream
of lexical tokens.  See the LEXICAL ANALYSIS manual section for further details.
Default: yes.

-[no-]print-patterns
With the positive form, print the value patterns read from initialization  files
as  they  are  added  to  internal tables.  Use this option to check newly-added
patterns, or to see what patterns are being used.

These patterns are the ones that will be used  in  checking  value  strings  for
valid syntax, and all of them are specified in initialization files, rather than
hard-coded into the program.  For further details, see the INITIALIZATION  FILES
manual section.  Default: no.

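-[no-]read-init-files
With the negative form, suppress loading of system-, user-, and file-specific
initialization files.  Initializations will come only from those files
explicitly given by -init-file filename options.  Default: yes.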

-[no-]remove-OPT-prefixes
With the positive form, remove the "OPT" prefix from each field name where the
corresponding value is not an empty string.  The prefix "OPT" must be entirely
in upper-case to be recognized.

This  option  is  for  bibliographies  generated  with the help of the GNU Emacs
BibTeX  editing  support,  which  generates  templates  with   optional   fields
identified by the "OPT" prefix.  Although the function M-x bibtex-remove-OPT
normally bound to the keystrokes C-c C-o does the job, users often forget,  with
the  result that BibTeX does not recognize the field name, and ignores the value
string.  Compare this option  with  -[no-]delete-empty-values  described  above.
Default: no.

-[no-]scribe
With  the  positive  form, accept input syntax conforming to the Scribe document
system.  The output will be converted to conform  to  BibTeX  syntax.   See  the
SCRIBE BIBLIOGRAPHY FORMAT manual section for further details.  Default: no.

-[no-]trace-file-opening
With  the  positive  form,  record  in the error log file the names of all files
which  bibclean  attempts  to  open.   Use  this  option   to   identify   where
initialization files are located.  Default: no.

-[no-]warnings
With  the  positive  form, allow all warning messages.  The negative form is not
recommended since it may mask problems that should be repaired.  Default: yes.

-version  Display the program version number on stderr,  and  then  exit  with  a  success
return  code.  This will also include an indication of who compiled the program,
the host name on which it was compiled, the time of compilation, and the type of
string-value  matching  code selected, when that information is available to the
compiler.



#### ERROR RECOVERY AND WARNINGS

       When bibclean detects an error, it issues an error message  to  both  stderr  and  stdout.
That  way,  the  user  is  clearly notified, and the output bibliography also contains the
message at the point of error.

Error messages begin with a distinctive pair  of  queries,  ??,  beginning  in  column  1,
followed  by  the  input  file  name  and  line  number.  If the -file-position option was
specified, they also contain the input and output positions of the  current  file,  entry,
and  value.   Each position includes the file byte number, the line number, and the column
number.  In the event of a runaway string argument, the entry and value  positions  should
precisely  pinpoint the erroneous bibliography entry, and the file positions will indicate
where it was detected, which may be rather later in the files.

Warning messages identify possible problems, and are therefore sent only  to  stderr,  and
not  to  stdout,  so  they  never  appear  in  the  output file.  They are identified by a
distinctive pair of percents, %%, beginning in column 1, and as with error  messages,  may
be followed by file position messages if the -file-position option was specified.

For  convenience,  the  first  line  of  each  error and warning message sent to stderr is
formatted according to the expectations of the GNU  Emacs  next-error  command.   You  can
invoke  bibclean  with  the  Emacs  M-x  compile<RET>bibclean  filename.bib  >filename.new
command, then use the next-error command, normally bound to C-x ` (that's a grave, or
back, accent), to move to the location of the error in the input file.

If  error  messages  are  ignored,  and  left  in  the output bibliography file, they will
precipitate an error when the bibliography is next processed with BibTeX.

After issuing an error message, bibclean then  resynchronizes  its  input  by  copying  it
verbatim  to  stdout  until  a new bibliography entry is recognized on a line in which the
first non-blank character is an at-sign (@).  This ensures that nothing is lost  from  the
input  file(s),  allowing  corrections to be made in either the input or the output files.
However, if bibclean detects an internal error in its data structures, it  will  terminate
abruptly  without  further  input  or  output  processing; this kind of error should never
happen, and if it does, it should be reported immediately to the author  of  the  program.
Errors  in  initialization files, and running out of dynamic memory, will also immediately
terminate bibclean.



#### INITIALIZATION FILES

       bibclean can be compiled with one of three different types of pattern matching; the choice
is made by the installer at compile time:

·  The original version uses explicit hand-coded tests of value-string syntax.

·  The   second  version  uses  regular-expression  pattern-matching  host  library
routines together with  regular-expression  patterns  that  come  entirely  from
initialization files.

·  The  third  version uses special patterns that come entirely from initialization
files.

This Debianized version of bibclean uses the third version.  However, command-line options
can also be specified in initialization files, no matter which pattern matching choice was
selected.

When bibclean starts, it searches  for  initialization  files,  using  the  first  one  of
$(HOME)/.bibcleanrc, /usr/share/bibcleanrc, and /etc/bibcleanrc that exists.  Afterwards,
it reads the first .bibcleanrc found in the BIBINPUTS search path.  The name .bibcleanrc
can be changed at run time through a setting of the environment variable BIBCLEANINI.  If
the name starts with a dot, it will be stripped when looking in /usr/share and /etc.

Then, when command-line arguments are processed, any additional files specified by
-init-file filename options are also processed.

Finally, immediately before each named bibliography file is processed, an attempt is made
to process an initialization file with the same name, but with the extension changed to
.ini.  The default extension can be changed by a setting of the environment variable
BIBCLEANEXT.

This scheme permits system-wide, user-wide, session-wide, and file-specific initialization
files to be supported.

When input is taken from stdin, there is no file-specific initialization.

For precise control, the -no-read-init-files option suppresses all initialization files
except those explicitly named by -init-file filename options, either on the command line,
or in requested initialization files.

Recursive execution of initialization files with nested -init-file options is permitted;
if the recursion is circular, bibclean will finally get a non-fatal initialization file
open failure after opening too many files.  This terminates further initialization file
processing.  As the recursion unwinds, the files are all closed, then execution proceeds
normally.

An initialization file may contain empty lines, comments from percent to end of line (just
like TeX), option switches, and field/pattern or field/pattern/message assignments.
Leading and trailing spaces are ignored.
This is best illustrated by a short example:

    % This is a small bibclean initialization file
    -init-file /u/math/bib/.bibcleanrc  %% departmental patterns
    chapter = "\"D\""                   %% 23
    pages   = "\"D--D\""                %% 23--27
    volume  = "\"D \\an\\d D\""         %% 11 and 12
    year    = \
            "\"dddd, dddd, dddd\"" \
            "Multiple years specified." %% 1989, 1990, 1991
    -no-fix-names                       %% do not modify author/editor lists

Long logical lines can be split into multiple physical lines by breaking at a
backslash-newline pair; the backslash-newline pair is discarded.  This processing happens
while characters are being read, before any further interpretation of the input stream.

Each logical line must contain a complete option (and its value, if any), or a complete
field/pattern pair, or a field/pattern/message triple.

Comments are stripped during the parsing of the field, pattern, and message values.  The
comment start symbol is not recognized inside quoted strings, so it can be freely used in
such strings.

Comments on logical lines that were input as multiple physical lines via the
backslash-newline convention must appear on the last physical line; otherwise, the
remaining physical lines will become part of the comment.

Pattern strings must be enclosed in quotation marks; within such strings, a backslash
starts an escape mechanism that is commonly used in UNIX software.  The recognized escape
sequences are:

    \a      alarm bell (octal 007)
    \b      backspace (octal 010)
    \f      formfeed (octal 014)
    \n      newline (octal 012)
    \r      carriage return (octal 015)
    \t      horizontal tab (octal 011)
    \v      vertical tab (octal 013)
    \ooo    character number octal ooo (e.g., \012 is linefeed).  Up to 3 octal
            digits may be used.
    \0xhh   character number hexadecimal hh (e.g., \0x0a is linefeed).  xhh may be
            in either letter case.  Any number of hexadecimal digits may be used.

Backslash followed by any other character produces just that character.
Thus, \% gets a literal percent into a string (preventing its interpretation as a
comment), \" produces a quotation mark, and \\ produces a single backslash.

An ASCII NUL (\0) in a string will terminate it; this is a feature of the C programming
language in which bibclean is implemented.

Field/pattern pairs can be separated by arbitrary space, and optionally, either an equals
sign or colon functioning as an assignment operator.  Thus, the following are equivalent:

    pages="\"D--D\""
    pages:"\"D--D\""
    pages "\"D--D\""
    pages = "\"D--D\""
    pages : "\"D--D\""
    pages   "\"D--D\""

Each field name can have an arbitrary number of patterns associated with it; however, they
must be specified in separate field/pattern assignments.

An empty pattern string causes previously-loaded patterns for that field name to be
forgotten.  This feature permits an initialization file to completely discard patterns
from earlier initialization files.

Patterns for value strings are represented in a tiny special-purpose language that is both
convenient and suitable for bibliography value-string syntax checking.  While not as
powerful as the language of regular-expression patterns, its parsing can be portably
implemented in less than 3% of the code in a widely-used regular-expression parser (the
GNU regexp package).

The patterns are represented by the following special characters:

    <space>  one or more spaces
    a        exactly one letter
    A        one or more letters
    d        exactly one digit
    D        one or more digits
    r        exactly one Roman numeral
    R        one or more Roman numerals (i.e., a Roman number)
    w        exactly one word (one or more letters and digits)
    W        one or more space-separated words, beginning and ending with a word
    .        one "special" character, one of the characters
             <space>!#()*+,-./:;?[]~, a subset of punctuation characters that are
             typically used in string values
    :        one or more "special" characters
    X        one or more "special"-separated words, beginning and ending with a word
    \x       exactly one x (x is any character), possibly with an escape sequence
             interpretation given earlier
    x        exactly the character x (x is anything but one of these pattern
             characters: aAdDrRwW.:<space>\)

The X pattern character is very powerful, but generally inadvisable, since it will match
almost anything likely to be found in a BibTeX value string.  The reason for providing
pattern matching on the value strings is to uncover possible errors, not mask them.

There is no provision for specifying ranges or repetitions of characters, but this can
usually be done with separate patterns.  It is a good idea to accompany the pattern with a
comment showing the kind of thing it is expected to match.  Here is a portion of an
initialization file giving a few of the patterns used to match number value strings:

    number = "\"D\""          %% 23
    number = "\"A AD\""       %% PN LPS5001
    number = "\"A D(D)\""     %% RJ 34(49)
    number = "\"A D\""        %% XNSS 288811
    number = "\"A D\\.D\""    %% Version 3.20
    number = "\"A-A-D-D\""    %% UMIAC-TR-89-11
    number = "\"A-A-D\""      %% CS-TR-2189
    number = "\"A-A-D\\.D\""  %% CS-TR-21.7

For a bibliography that contains only article entries, this list should probably be
reduced to just the first pattern, so that anything other than a digit string fails the
pattern-match test.  This is easily done by keeping bibliography-specific patterns in a
corresponding file with extension .ini, since that file is read automatically.  You should
be sure to use empty pattern strings in this pattern file to discard patterns from earlier
initialization files.

The value strings passed to the pattern matcher contain surrounding quotes, so the
patterns should also.
However, you could use a pattern specification like "\"D" to match an initial digit string
followed by anything else; the omission of the final quotation mark \" in the pattern
allows the match to succeed without checking that the next character in the value string
is a quotation mark.

Because the value strings are intended to be processed by TeX, the pattern matching
ignores braces, and TeX control sequences, together with any space following those control
sequences.  Spaces around braces are preserved.  This convention allows the pattern
fragment A-AD-D to match the value string TN-K\slash 27-70, because the value is implicitly
collapsed to TN-K27-70 during the matching operation.

bibclean's normal action when a string value fails to match any of the corresponding
patterns is to issue a warning message something like this:

    Unexpected value in ``year = "192"''.

In most cases, that is sufficient to alert the user to a problem.  In some cases, however,
it may be desirable to associate a different message with a particular pattern.  This can
be done by supplying a message string following the pattern string.  Format items %%
(single percent), %e (entry name), %f (field name), %k (citation key), and %v (string
value) are available to get current values expanded in the messages.  Here is an example:

    chapter = "\"D:D\"" "Colon found in ``%f = %v''" %% 23:2

To be consistent with other messages output by bibclean, the message string should not end
with punctuation.

If you wish to make the message an error, rather than just a warning, begin it with a
query (?), like this:

    chapter = "\"D:D\"" "?Colon found in ``%f = %v''" %% 23:2

The query will not be included in the output message.

Escape sequences are supported in message strings, just as they are in pattern strings.
You can use this to advantage for fancy things, such as terminal display mode control.
If you rewrite the previous example as

    chapter = "\"D:D\"" \
            "?\033[7mColon found in ``%f = %v''\033[0m" %% 23:2

the error message will appear in inverse video on display screens that support ANSI
terminal control sequences.  Such practice is not normally recommended, since it may have
undesirable effects on some output devices.  Nevertheless, you may find it useful for
restricted applications.

For some types of bibliography fields, bibclean contains special-purpose code to
supplement or replace the pattern matching:

·  CODEN, ISBN and ISSN field values are handled this way because their validation
requires evaluation of checksums that cannot be expressed by simple patterns; no patterns
are even used in these three cases.

·  chapter, number, pages, and volume values are checked only by pattern matching.

·  month values are first checked against the standard BibTeX month abbreviations, and
only if no match is found are patterns then used.

·  year values are first checked against patterns, then if no match is found, the year
numbers are found and converted to integer values for testing against reasonable bounds.

Values for other fields are checked only against patterns.  You can provide patterns for
any field you like, even ones bibclean does not already know about.  New ones are simply
added to an internal table that is searched for each string to be validated.

The special field, key, represents the bibliographic citation key.  It can be given
patterns, like any other field.  Here is an initialization file pattern assignment that
will match an author name, a colon, an alphabetic string, and a two-digit year:

    key = "A:Add" %% Knuth:TB86

Notice that no quotation marks are included in the pattern, because the citation keys are
not quoted.  You can use such patterns to help enforce uniform naming conventions for
citation keys, which is increasingly important as your bibliography data base grows.

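
The pattern language described in this section can be approximated by translating each pattern into an ordinary regular expression. The Python sketch below is only an illustration under stated assumptions: the names are hypothetical, it takes the pattern as it looks after the initialization-file escapes have been processed, it assumes Roman numerals may be in either letter case and that words in X may be separated by a run of special characters, and it does not implement the brace and TeX control sequence skipping described above. bibclean's own matcher is not regular-expression based.

```python
import re

SPECIAL = "[" + re.escape(" !#()*+,-./:;?[]~") + "]"
WORD = "[A-Za-z0-9]+"
ROMAN = "[IVXLCDMivxlcdm]"  # assumption: either letter case is accepted

TRANSLATION = {
    " ": " +",                          # one or more spaces
    "a": "[A-Za-z]",                    # exactly one letter
    "A": "[A-Za-z]+",                   # one or more letters
    "d": "[0-9]",                       # exactly one digit
    "D": "[0-9]+",                      # one or more digits
    "r": ROMAN,                         # exactly one Roman numeral
    "R": ROMAN + "+",                   # one or more Roman numerals
    "w": WORD,                          # exactly one word
    "W": f"{WORD}(?: +{WORD})*",        # space-separated words
    ".": SPECIAL,                       # one special character
    ":": SPECIAL + "+",                 # one or more special characters
    "X": f"{WORD}(?:{SPECIAL}+{WORD})*",  # special-separated words
}


def pattern_to_regex(pattern):
    """Translate a bibclean-style value pattern into a compiled regex."""
    parts = []
    i = 0
    while i < len(pattern):
        char = pattern[i]
        if char == "\\" and i + 1 < len(pattern):
            # \x matches exactly the character x
            parts.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        parts.append(TRANSLATION.get(char, re.escape(char)))
        i += 1
    return re.compile("".join(parts))


def value_matches(pattern, value):
    """Prefix match: text after the end of the pattern is not examined."""
    return pattern_to_regex(pattern).match(value) is not None
```

Because the match is anchored only at the start, the pattern "D (with no closing quotation mark) accepts "192a", while "D" rejects it, which mirrors the behavior described for omitted final quotation marks.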
#### LEXICAL ANALYSIS

When -no-prettyprint is specified, bibclean acts as a lexical analyzer instead of a
prettyprinter, producing output in lines of the form

    <token-number><tab><token-name><tab>"<token-value>"

Each output line contains a single complete token, identified by a small integer number
for use by a computer program, a token type name for human readers, and a string value in
quotes.  Special characters in the token value string are represented with ANSI/ISO
Standard C escape sequences, so all characters other than NUL are representable, and
multi-line values can be represented in a single line.

Here are the token numbers and token type names that can appear in the output when
-no-prettyprint is specified:

     0  UNKNOWN      5  ENTRY       10  KEY         15  RBRACE
     1  ABBREV       6  EQUALS      11  LBRACE      16  SHARP
     2  AT           7  FIELD       12  LITERAL     17  SPACE
     3  COMMA        8  INCLUDE     13  NEWLINE     18  STRING
     4  COMMENT      9  INLINE      14  PREAMBLE    19  VALUE

Programs that parse such output should also be prepared for lines beginning with the
warning prefix, %%, or the error prefix, ??, and for ANSI/ISO Standard C line number
directives of the form

    # line 273 "texbook1.bib"

which record the line number and file name of the current input file.

If a -max-width nnn command-line option was specified, long output lines will be wrapped
at a backslash-newline pair, and consequently, software that processes the lexical token
stream should be prepared to collapse such wrapped lines back into single lines.

As an example of the use of -no-prettyprint, the UNIX command pipeline

    bibclean -no-prettyprint mylib.bib | \
        awk '$2 == "KEY" {print $3}' | \
        sed -e 's/"//g' | \
        sort

will extract a sorted list of all citation keys in the file mylib.bib.

A certain amount of processing will have been done on the tokens.  In particular,
delimiters equivalent to braces will have been replaced by braces, and braced strings will
have become quoted strings.

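
A program that consumes the token stream has to collapse wrapped lines and skip message and line-directive lines, as noted above. The following Python sketch shows one way to do that and to extract the sorted citation keys; the function names are hypothetical, the C escape sequences inside token values are left undecoded, and the sample text in the usage note is hand-written in the documented format rather than captured bibclean output.

```python
def join_wrapped_lines(text):
    """Undo backslash-newline wrapping, then split into logical lines."""
    return text.replace("\\\n", "").splitlines()


def parse_tokens(text):
    """Return (number, name, value) triples, skipping non-token lines."""
    tokens = []
    for line in join_wrapped_lines(text):
        # Skip warnings (%%), errors (??), and '# line' directives.
        if line.startswith(("%%", "??", "#")):
            continue
        fields = line.split("\t", 2)
        if len(fields) != 3 or not fields[0].isdigit():
            continue
        number, name, quoted = fields
        # Strip the surrounding quotation marks from the token value.
        tokens.append((int(number), name, quoted[1:-1]))
    return tokens


def citation_keys(text):
    """Sorted list of the values of all KEY tokens."""
    return sorted(value for _, name, value in parse_tokens(text) if name == "KEY")
```

For example, with a hand-written sample stream:

```python
sample = (
    '# line 1 "mylib.bib"\n'
    '2\tAT\t"@"\n'
    '5\tENTRY\t"Book"\n'
    '11\tLBRACE\t"{"\n'
    '10\tKEY\t"Knuth:TB86"\n'
    '%% a warning line\n'
    '10\tKEY\t"Bach:\\\nPDQ91"\n'
)
print(citation_keys(sample))  # ['Bach:PDQ91', 'Knuth:TB86']
```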
The LITERAL token type is used for arbitrary text that bibclean does not examine further,
such as the contents of a @Preamble{...} or a @Comment{...}.

The UNKNOWN token type should never appear in the output stream.  It is used internally to
initialize token type variables.



#### SCRIBE BIBLIOGRAPHY FORMAT

bibclean's support for the Scribe bibliography format is based on the syntax description
in the Scribe Introductory User's Manual, 3rd Edition, May 1980.  Scribe was originally
developed by Brian Reid at Carnegie-Mellon University, and is now marketed by Unilogic,
Ltd.

The BibTeX bibliography format was strongly influenced by Scribe, and indeed, with care,
it is possible to share bibliography files between the two systems.  Nevertheless, there
are some differences, so here is a summary of features of the Scribe bibliography file
format:

(1)  Letter case is not significant in field names and entry names, but case is preserved
in value strings.

(2)  In field/value pairs, the field and value may be separated by one of three
characters: =, /, or space.  Space may optionally surround these separators.

(3)  Value delimiters are any of these seven pairs: { } [ ] ( ) < > ' ' " " ` '

(4)  Value delimiters may not be nested, even though with the first four delimiter pairs,
nested balanced delimiters would be unambiguous.

(5)  Delimiters can be omitted around values that contain only letters, digits, sharp (#),
ampersand (&), period (.), and percent (%).

(6)  Outside of delimited values, a literal at-sign (@) is represented by doubled at-signs
(@@).

(7)  Bibliography entries begin with @name, as for BibTeX, but any of the seven Scribe
value delimiter pairs may be used to surround the values in field/value pairs.  As in (4),
nested delimiters are forbidden.

(8)  Arbitrary space may separate entry names from the following delimiters.

(9)  @Comment is a special command whose delimited value is discarded.  As in (4), nested
delimiters are forbidden.

(10) The special form @Begin{comment} ... @End{comment} permits encapsulating arbitrary
text containing any characters or delimiters, other than "@End{comment}".  Any of the
seven delimiter pairs may be used around the word "comment" following the "@Begin" or
"@End"; the delimiters in the two cases need not be the same, and consequently,
"@Begin{comment}"/"@End{comment}" pairs may not be nested.

(11) The key field is required in each bibliography entry.

(12) A backslashed quote in a string will be assumed to be a TeX accent, and braced
appropriately.  While such accents do not conform to Scribe syntax, Scribe-format
bibliographies have been found that appear to be intended for TeX processing.

Because of this loose syntax, bibclean's normal error detection heuristics are less
effective, and consequently, Scribe mode input is not the default; it must be explicitly
requested.



#### ENVIRONMENT VARIABLES

BIBCLEANEXT
File extension of bibliography-specific initialization files.  Default: .ini.

BIBCLEANINI
Name of bibclean initialization files.  Default: .bibcleanrc.

BIBINPUTS
Search path for bibclean and BibTeX input files.  This is a colon-separated list of
directories that are searched in order from first to last.  It is not an error for a
specified directory to not exist.



#### FILES

*.bib
BibTeX and Scribe bibliography data base files.

*.ini
File-specific initialization files.

/usr/share/bibcleanrc, /etc/bibcleanrc
System-wide initialization files.

.bibcleanrc
User-specific initialization files.



#### SEE ALSO

bibcheck(1), bibdup(1), bibextract(1), bibindex(1), bibjoin(1), biblabel(1), biblex(1),
biblook(1), biborder(1), bibparse(1), bibsort(1), bibtex(1), bibunlex(1), citefind(1),
citesub(1), citetags(1), latex(1), scribe(1), tex(1).



#### AUTHOR

    Nelson H. F. Beebe
    Center for Scientific Computing
    University of Utah
    Department of Mathematics, 322 INSCC
    155 S 1400 E RM 233
    Salt Lake City, UT 84112-0090
    USA
    Tel: +1 801 581 5254
    FAX: +1 801 585 1640, +1 801 581 4148
    Email: beebe@math.utah.edu, beebe@acm.org, beebe@ieee.org (Internet)
    URL: http://www.math.utah.edu/~beebe

This Debianization of bibclean was done by Henning Makholm <henning@makholm.net>, and
differs from the upstream source in where it looks for the system-wide initialization file
(vanilla bibclean expects to find it in $PATH), and has also been patched to ignore the
built-in BibTeX field-length limit for abstract fields.