Provided by: groff_1.23.0-5_amd64

Name

       roff - concepts and history of roff typesetting

Description

       The  term  roff denotes a family of document formatting systems known by names like troff,
       nroff, and ditroff.  A roff system consists of  an  interpreter  for  an  extensible  text
       formatting  language  and  a  set of programs for preparing output for various devices and
       file formats.  Unix-like operating systems often distribute a  roff  system.   The  manual
       pages  on  Unix  systems  (“man  pages”)  and  bestselling  books on software engineering,
       including Brian Kernighan and Dennis Ritchie's The C Programming Language and  W.  Richard
       Stevens's  Advanced  Programming  in  the  Unix  Environment,  have been written using roff
       systems.  GNU roff (groff) is arguably the most widespread roff implementation.

       Below  we  present  typographical  concepts  that  form  the  background   of   all   roff
       implementations,  narrate the development history of some roff systems, detail the command
       pipeline managed by groff(1), survey the formatting language,  suggest  tips  for  editing
       roff input, and recommend further reading materials.

Concepts

       roff  input  files  contain  text interspersed with instructions to control the formatter.
       Even in the absence of such instructions, a roff formatter still processes  its  input  in
       several  ways,  by  filling, hyphenating, breaking, and adjusting it, and supplementing it
       with inter-sentence  space.   These  processes  are  basic  to  typesetting,  and  can  be
       controlled at the input document's discretion.

       When  a  device-independent  roff  formatter  starts  up, it obtains information about the
       device for  which  it  is  preparing  output  from  the  latter's  description  file  (see
       groff_font(5)).   An  essential  property  is  the length of the output line, such as “6.5
       inches”.

       The formatter interprets plain text files employing the Unix line-ending  convention.   It
       reads  input  a  character  at a time, collecting words as it goes, and fits as many words
       together on an output line as it can—this is known as filling.  To a roff system,  a  word
       is  any sequence of one or more characters that aren't spaces or newlines.  The exceptions
       separate words.

       A roff formatter attempts to detect boundaries between sentences, and supplies  additional
       inter-sentence  space  between  them.  It flags certain characters (normally “!”, “?”, and
       “.”) as potentially ending a sentence.  When the formatter encounters one of these end-of-
       sentence  characters  at  the  end  of  an  input  line, or one of them is followed by two
       (unescaped) spaces on the same input line, it appends an inter-word space followed  by  an
       inter-sentence  space  in  the output.  The dummy character escape sequence \& can be used
       after an end-of-sentence character to defeat end-of-sentence detection on  a  per-instance
       basis.  Normally, the occurrence of a visible non-end-of-sentence character (as opposed to
       a space or tab) immediately after an end-of-sentence character cancels  detection  of  the
       end  of  a  sentence.   However,  several  characters  are treated transparently after the
       occurrence of an end-of-sentence character.  That is,  a  roff  does  not  cancel  end-of-
       sentence detection when it processes them.  This is because such characters are often used
       as footnote markers or to close quotations and parentheticals.  The default set is  ",  ',
       ),  ],  *,  \[dg],  \[dd],  \[rq],  and  \[cq].   The  last  four  are examples of special
       characters, escape sequences whose purpose is to obtain glyphs that are not  easily  typed
       at the keyboard, or which have special meaning to the formatter (like \).
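
       As a brief illustration of the rules above, the following input fragment shows end-of-
       sentence detection being triggered and then suppressed.  It is only a sketch; the
       sentences are invented for this example.

```roff
.\" A period at the end of an input line is taken to end a sentence,
.\" so inter-sentence space follows "here." in the output.
The first sentence ends here.
The second one starts on a new input line.
.\" The abbreviation below also falls at the end of an input line.
.\" The dummy character after the period defeats sentence detection,
.\" so only an ordinary inter-word space follows "al." in the output.
See the report by Kernighan et al.\&
for a fuller account.
```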

       When  an  output  line is nearly full, it is uncommon for the next word collected from the
       input to exactly fill it—typically, there is room left over only  for  part  of  the  next
       word.   The  process  of splitting a word so that it appears partially on one line (with a
       hyphen to indicate to the reader that the word has been broken) with its remainder on  the
       next  is  hyphenation.   Hyphenation  points  can be manually specified; groff also uses a
       hyphenation algorithm and language-specific pattern files to decide  which  words  can  be
       hyphenated  and  where.  Hyphenation does not always occur even when the hyphenation rules
       for a word allow it; it  can  be  disabled,  and  when  not  disabled  there  are  several
       parameters that can prevent it in certain circumstances.

       Once an output line is full, the next word (or remainder of a hyphenated one) is placed on
       a different output line; this is called a break.  In this document and in roff discussions
       generally,  a  “break”  if  not  further  qualified always refers to the termination of an
       output line.  When the formatter is filling text, it introduces  breaks  automatically  to
       keep  output lines from exceeding the configured line length.  After an automatic break, a
       roff formatter adjusts the line if applicable (see below), and then resumes collecting and
       filling text on the next output line.

       Sometimes,  a  line  cannot  be  broken  automatically.  This usually does not happen with
       natural language text unless the output line length has been manipulated to  be  extremely
       short,  but it can with specialized text like program source code.  groff provides a means
       of telling the formatter where the line may be broken without hyphens.  This is done  with
       the non-printing break point escape sequence \:.
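
       A minimal sketch of the break point escape sequence follows.  The path name is
       hypothetical and serves only as an example of text that contains no spaces or
       hyphenation points.

```roff
.\" Each \: marks a place where the formatter may break the line.
.\" No hyphen is printed if a break occurs there, and nothing is
.\" printed at all if the line is not broken at that point.
The macro file is installed as
/usr/\:local/\:share/\:example/\:tmac/\:example.tmac on this system.
```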

       There are several ways to cause a break at a predictable location.  A blank input line not
       only causes a break, but by default it also outputs a one-line vertical space (effectively
       a  blank  output line).  Macro packages may discourage or disable this “blank line method”
       of paragraphing in favor of their own macros.  A line that begins with one or more  spaces
       causes  a  break.   The  spaces are output at the beginning of the next line without being
       adjusted (see below).  Again, macro  packages  may  provide  other  methods  of  producing
       indented paragraphs.  Trailing spaces on text lines (see below) are discarded.  The end of
       input causes a break.

       After the formatter performs an automatic break, it may then  adjust  the  line,  widening
       inter-word spaces until the text reaches the right margin.  Extra spaces between words are
       preserved.  Leading and trailing spaces are handled as noted above.  Text can  be  aligned
       to the left or right margin only, or centered, using requests.
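
       The sketch below shows how alignment might be selected with the “ad” request, which
       takes a mode argument in groff.  The “br” requests are added only to keep the sample
       sentences on separate output lines.

```roff
.\" Adjust to both margins; this is normally the default mode.
.ad b
Lines broken automatically in this mode are widened to both margins.
.br
.\" Align to the left margin only, leaving the right edge ragged.
.ad l
This text is filled but set flush left.
.br
.\" Align to the right margin only.
.ad r
This text is filled but set flush right.
.br
.\" Center each output line between the margins.
.ad c
This text is filled and centered.
.br
.\" Return to adjusting both margins.
.ad b
```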

       A  roff  formatter translates horizontal tab characters, also called simply “tabs”, in the
       input into movements to the next tab stop.  These tab stops are by default  located  every
       half  inch measured from the current position on the input line.  With them, simple tables
       can be made.  However, this method can be deceptive, as the appearance (and width) of  the
       text  in  an editor and the results from the formatter can vary greatly, particularly when
       proportional typefaces are used.  A tab character does not cause  a  break  and  therefore
       does  not  interrupt  filling.   The formatter provides facilities for sophisticated table
       composition; there are many details to track when using the “tab”  and  “field”  low-level
       features, so most users turn to the tbl(1) preprocessor to lay out tables.

   Requests and macros
       A  request is an instruction to the formatter that occurs after a control character, which
       is recognized at the beginning of an input line.  The regular control character is  a  dot
       “.”.   Its  counterpart,  the  no-break  control  character,  a  neutral  apostrophe  “'”,
       suppresses the break implied by some requests.  These characters were chosen because it is
       uncommon  for  lines  of  text  in natural languages to begin with them.  If you require a
       formatted period or apostrophe (closing single quotation  mark)  where  the  formatter  is
       expecting  a  control  character,  prefix  the  dot  or  neutral apostrophe with the dummy
       character escape sequence, “\&”.

       An input line beginning with a control character is called a control line.  Every line  of
       input that is not a control line is a text line.
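
       The distinction can be sketched with a short input fragment.  The file name in the last
       line is merely an example of text that happens to begin with a dot.

```roff
.\" A control line: the control character followed by the "sp"
.\" request, which spaces down by one line.
.sp
This is a text line.
.\" The next text line must start with a printed dot, so the dummy
.\" character is placed first; the formatter then sees no control
.\" character at the beginning of the line.
\&.profile is a file that some shells read at login.
```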

       Requests  often  take  arguments, words (separated from the request name and each other by
       spaces) that specify details of the action the formatter is expected  to  perform.   If  a
       request  is meaningless without arguments, it is typically ignored.  Of key importance are
       the requests that define macros.  Macros are invoked like requests, enabling  the  request
       repertoire to be extended or overridden.

       A  macro  can  be thought of as an abbreviation you can define for a collection of control
       and text lines.  When the macro is called by giving its name after a control character, it
       is  replaced  with  what  it  stands  for.  The process of textual replacement is known as
       interpolation.  Interpolations are handled as  soon  as  they  are  recognized,  and  once
       performed,  a  roff formatter scans the replacement for further requests, macro calls, and
       escape sequences.

       In roff systems, the “de” request defines a macro.
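
       As an illustration, the following sketch defines and calls a small macro.  The name “HD”
       and its behavior are hypothetical, chosen for this example rather than taken from any
       macro package.

```roff
.\" Define a macro HD that sets its first argument as a bold heading.
.\" The body ends at the line containing only "..".
.de HD
.  sp
.  ft B
\\$1
.  ft
.  sp
..
.\" Call the macro.  The quoted argument is interpolated where the
.\" definition refers to the first parameter.
.HD "Concepts"
Body text follows the heading.
```

       The doubled backslash in the definition delays interpolation of the parameter until the
       macro is called.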

   Page geometry
       roff systems format text under certain assumptions about the size of the output medium, or
       page.   For  the  formatter to correctly break a line it is filling, it must know the line
       length, which it derives from the page width.  For it to decide whether to write an output
       line  to  the  current  page  or wait until the next one, it must know the page length.  A
       device's resolution converts practical units like inches or centimeters to basic units,  a
       convenient  length measure for the output device or file format.  The formatter and output
       driver use basic units to reckon page measurements.  The device description  file  defines
       its resolution and page dimensions (see groff_font(5)).
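
       The conversion to basic units can be observed with a register, as in the sketch below.
       The register name “wd” is hypothetical, and the resolutions mentioned in the comments are
       assumptions made for the sake of the arithmetic; the actual figure comes from the
       selected device's description file.

```roff
.\" Store 6.5 inches in a register; the value is kept in basic units.
.nr wd 6.5i
.\" Write the stored value to the standard error stream.
.tm Line length in basic units: \n[wd]
.\" If the device description declares a resolution of 72000 units
.\" per inch, the value is 6.5 x 72000 = 468000.  With a resolution
.\" of 240 units per inch, it is 6.5 x 240 = 1560.
```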

       A  page  is  a  two-dimensional  structure  upon which a roff system imposes a rectangular
       coordinate system with its upper left corner as the  origin.   Coordinate  values  are  in
       basic units and increase down and to the right.  Useful ones are therefore always positive
       and within numeric ranges corresponding to the page boundaries.

       While the formatter (and, later, output driver) is processing a page, it  keeps  track  of
       its  drawing position, which is the location at which the next glyph will be written, from
       which the next motion will  be  measured,  or  where  a  geometric  object  will  commence
       rendering.   Notionally,  glyphs are drawn from the text baseline upward and to the right.
       (groff does not yet support right-to-left scripts.)   The  text  baseline  is  a  (usually
       invisible)  line  upon  which  the  glyphs  of  a typeface are aligned.  A glyph therefore
       “starts” at its bottom-left corner.  If drawn at the origin, a typical letter glyph  would
       lie  partially  or  wholly  off  the  page,  depending on whether, like “g”, it features a
       descender below the baseline.

       Such a situation is nearly always undesirable.  It  is  furthermore  conventional  not  to
       write or draw at the extreme edges of the page.  Therefore the initial drawing position of
       a roff formatter is not at the origin, but below and to the right of it.   This  rightward
       shift  from  the  left edge is known as the page offset.  (groff's terminal output devices
       have page offsets of zero.)  The downward shift leaves room for a text output line.

       Text is arranged on a one-dimensional lattice of text baselines from the top to the bottom
       of  the  page.   Vertical  spacing  is  the  distance  between  adjacent  text  baselines.
       Typographic tradition sets this quantity to 120% of the type size.  The  initial  vertical
       drawing  position  is  one unit of vertical spacing below the page top.  Typographers term
       this unit a vee.
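
       A minimal sketch of this convention for a typesetting device follows; terminal devices
       generally cannot change the type size.

```roff
.\" Select a type size of 10 points.
.ps 10
.\" Set the vertical spacing to 12 points, which is 120% of 10.
.vs 12p
This text is set in ten-point type on twelve-point baselines.
.\" Space down by one vee, that is, one unit of vertical spacing.
.sp 1v
```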

       Vertical spacing has an impact  on  page-breaking  decisions.   Generally,  when  a  break
       occurs,  the formatter moves the drawing position to the next text baseline automatically.
       If the formatter were already writing to the  last  line  that  would  fit  on  the  page,
       advancing  by  one  vee  would place the next text baseline off the page.  Rather than let
       that happen, roff formatters instruct the output driver to eject the  page,  start  a  new
       one,  and  again  set  the  drawing position to one vee below the page top; this is a page
       break.

       When the last line of input text corresponds to the last output  line  that  fits  on  the
       page,  the  break caused by the end of input will also break the page, producing a useless
       blank one.  Macro packages keep users from having to confront this difficulty  by  setting
       “traps”;  moreover, all but the simplest page layouts tend to have headers and footers, or
       at least bear vertical margins larger than one vee.

   Other language elements
       Escape sequences start with the escape character, a backslash \, and are  followed  by  at
       least one additional character.  They can appear anywhere in the input.

       With  requests, the escape and control characters can be changed; further, escape sequence
       recognition can be turned off and back on.
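
       The following sketch shows requests that groff provides for these purposes.  Note that no
       comments appear between “eo” and “ec”, since a comment is itself introduced by an escape
       sequence, nor between the two “cc” requests, since a leading dot is not a control
       character there.

```roff
.\" Turn escape sequence recognition off, set a line in which the
.\" backslash is ordinary text, then turn recognition back on.
.eo
The sequence \n[x] is not interpolated here.
.ec
.\" Make "#" the regular control character, use it once, and restore
.\" the default dot by calling "cc" without an argument.
.cc #
#sp
#cc
```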

       Strings store character sequences.  In groff, they can be parameterized as macros can.

       Registers store numerical values, including measurements.  The  latter  are  generally  in
       basic units; scaling units can be appended to numeric expressions to clarify their meaning
       when stored or interpolated.  Some read-only predefined registers interpolate text.
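
       A short sketch of both storage types follows, using groff's bracketed-name syntax.  The
       names “proj”, “count”, and “gap” are hypothetical.

```roff
.\" Define a string and a register.
.ds proj groff
.nr count 3
.\" Interpolate the string with \* and the register with \n.
This document mentions \*[proj] \n[count] times.
.\" Store a measurement; it is converted to basic units.
.nr gap 0.5i
.\" Interpolate it with the scaling unit "u" so that the number is
.\" read back as basic units rather than in the request's default unit.
.ti \n[gap]u
This line is indented by half an inch.
```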

       Fonts are identified either by a name or by a mounting position (a  non-negative  number).
       Four styles are available on all devices.  R is “roman”: normal, upright text.  B is bold,
       an upright typeface with a heavier weight.  I  is  italic,  a  face  that  is  oblique  on
       typesetter output devices and usually underlined instead on terminal devices.  BI is bold-
       italic, combining both of the foregoing style variations.  Typesetting devices group these
       four  styles  into  families  of text fonts; they also typically offer one or more special
       fonts that provide unstyled glyphs; see groff_char(7).
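
       The four styles can be selected within a text line by the font escape sequence or for
       subsequent text by the “ft” request, as this sketch suggests.

```roff
.\" \f selects a font by style name; \fP returns to the previous one.
This word is \fBbold\fP, this one is \fIitalic\fP,
and this one is \f[BI]bold-italic\fP.
.\" The "ft" request changes the font for the text that follows.
.ft B
This sentence is set in bold.
.ft R
This sentence is set in roman again.
```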

       groff supports named colors for glyph rendering and drawing of geometric objects.   Stroke
       and fill colors are distinct; the stroke color is used for glyphs.

       Glyphs  are  visual representation forms of characters.  In groff, the distinction between
       those two elements is not always obvious (and a full discussion is beyond our scope).   In
       brief, “A” is a character when we consider it in the abstract: to make it a glyph, we must
       select a typeface with which to render it, and determine its type  size  and  color.   The
       formatting  process  turns input characters into output glyphs.  A few characters commonly
       seen on keyboards are treated specially by the roff language and may not look  correct  in
       output  if  used  unthinkingly;  they  are  the  (double)  quotation mark ("), the neutral
       apostrophe ('), the minus sign (-), the backslash (\), the caret or circumflex accent (^),
       the  grave  accent  (`),  and  the  tilde (~).  All of these and more can be produced with
       special character escape sequences; see groff_char(7).
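
       For instance, the characters listed above might be written as follows in groff; the
       special character names are those documented in groff_char(7).

```roff
.\" Disable filling so that each input line is output on its own.
.nf
quotation mark: \[dq]
neutral apostrophe: \[aq]
minus sign: \-
backslash: \[rs]
circumflex accent: \[ha]
grave accent: \[ga]
tilde: \[ti]
.\" Resume filling.
.fi
```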

       groff offers streams, identifiers for  writable  files,  but  for  security  reasons  this
       feature is disabled by default.

       A  further  few  language  elements  arise  as  page layouts become more sophisticated and
       demanding.  Environments collect formatting parameters like line length and  typeface.   A
       diversion  stores  formatted  output for later use.  A trap is a condition on the input or
       output, tested automatically by the formatter, that is associated with a macro, calling it
       when that condition is fulfilled.

       Footnote  support  often  exercises  all  three  of  the  foregoing  features.   A  simple
       implementation might work as follows.  A pair of macros is defined: one starts a  footnote
       and  the  other  ends  it.   The  author  calls the first macro where a footnote marker is
       desired.  The macro establishes a diversion so that the footnote text is collected at  the
       place  in the body text where its corresponding marker appears.  An environment is created
       for the footnote so that it is set in a smaller type size.  The footnote text is formatted
       in  the  diversion  using that environment, but it does not yet appear in the output.  The
       document author calls the footnote end macro, which returns to  the  previous  environment
       and  ends  the diversion.  Later, after much more body text in the document, a trap, set a
       small distance above the page bottom, is sprung.  The macro called by  the  trap  draws  a
       line across the page and emits the stored diversion.  Thus, the footnote is rendered.
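
       The procedure just described might be sketched as follows.  This is a deliberately
       simplified illustration, not the implementation used by any macro package: the names
       “FS”, “FE”, “FT”, “FN”, and “fn” are hypothetical, and the sketch assumes that the
       footnotes on a page fit in the space below the trap.

```roff
.\" Register fn records whether a footnote is pending.
.nr fn 0
.\" FS starts a footnote: switch to environment 1, select a smaller
.\" type size, and append formatted text to the diversion FN.
.de FS
.  nr fn 1
.  ev 1
.  ps 8
.  vs 10p
.  da FN
..
.\" FE ends a footnote: flush the last line into the diversion, end
.\" the diversion, and return to the previous environment.
.de FE
.  br
.  di
.  ev
..
.\" FT is the trap macro.  If a footnote is pending, it draws a rule
.\" as long as the line length and emits the diversion without
.\" refilling it.  It then starts a new page without breaking the
.\" partially collected line of body text.
.de FT
.  if \\n[fn] \{\
.    ev 1
.    nf
\l'\\n[.l]u'
.    FN
.    fi
.    ev
.    rm FN
.    nr fn 0
.  \}
'  bp
..
.\" Plant the trap one inch above the page bottom.
.wh -1i FT
.\" Sample use in body text.
Body text bearing a footnote marker.*
.FS
* This is the text of the footnote.
.FE
Body text continues here.
```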

History

       Computer-driven  document  formatting  dates  back  to  the  1960s.   The  roff  system is
       intimately connected with Unix, but its origins lie with  the  earlier  operating  systems
       CTSS, GECOS, and Multics.

   The predecessor—RUNOFF
       roff's  ancestor  RUNOFF  was  written in the MAD language by Jerry Saltzer to prepare his
       Ph.D. thesis on the Compatible Time Sharing System (CTSS), a project of the  Massachusetts
       Institute  of  Technology  (MIT).   This  program is referred to in full capitals, both to
       distinguish it from its many descendants, and because bits were expensive in  those  days;
       five-  and  six-bit  character  encodings  were  still in widespread usage, and mixed-case
       alphabetics in file names seen as a  luxury.   RUNOFF  introduced  a  syntax  of  inlining
       formatting  directives  amid document text, by beginning a line with a period (an unlikely
       occurrence in human-readable material) followed by a “control word”.  Control  words  with
       obvious  meaning  like  “.line length n” were supported as well as an abbreviation system;
       the latter came to overwhelm the former in popular usage  and  later  derivatives  of  the
       program.  A sample of control words from a RUNOFF manual of December 1966
       ⟨http://web.mit.edu/Saltzer/www/publications/ctss/AH.9.01.html⟩ was documented as follows
       (with the parameter notation slightly altered).  The abbreviations will be familiar to
       roff veterans.

                                     Abbreviation   Control word
                                              .ad   .adjust
                                              .bp   .begin page
                                              .br   .break
                                              .ce   .center
                                              .in   .indent n
                                              .ll   .line length n
                                              .nf   .nofill
                                              .pl   .paper length n
                                              .sp   .space [n]

       In 1965, MIT's Project MAC teamed with Bell Telephone Laboratories  and  General  Electric
       (GE)  to  inaugurate  the Multics ⟨http://www.multicians.org⟩ project.  After a few years,
       Bell Labs discontinued its participation in Multics, famously prompting the development of
       Unix.   Meanwhile,  Saltzer's RUNOFF proved influential, seeing many ports and derivations
       elsewhere.

       In 1969, Doug McIlroy wrote one such reimplementation,  adding  extensions,  in  the  BCPL
       language  for a GE 645 running GECOS at the Bell Labs location in Murray Hill, New Jersey.
       In its manual, the control commands were termed “requests”, their  two-letter  names  were
       canonical,  and the control character was configurable with a .cc request.  Other familiar
       requests emerged at this time:  no-adjust  (.na),  need  (.ne),  page  offset  (.po),  tab
       configuration  (.ta,  though  it  worked  differently),  temporary indent (.ti), character
       translation (.tr), and automatic underlining (.ul; on RUNOFF  you  had  to  backspace  and
       underscore  in the input yourself).  .fi to enable filling of output lines got the name it
       retains to this day.  McIlroy's program also featured a heuristic system for automatically
       placing  hyphenation  points,  designed  and  implemented by Molly Wagner.  It furthermore
       introduced numeric variables, termed registers.  By 1971, this program had been ported  to
       Multics  and was known as roff, a name McIlroy attributes to Bob Morris, to distinguish it
       from CTSS RUNOFF.

   Unix and roff
       McIlroy's  roff  was  one  of  the  first  Unix  programs.   In  Ritchie's  term,  it  was
       “transliterated” from BCPL to DEC PDP-7 assembly language for the fledgling Unix operating
       system.  Automatic hyphenation was managed with .hc and .hy requests, line spacing control
       was  generalized  with  the  .ls  request, and what later roffs would call diversions were
       available via “footnote” requests.  This roff indirectly funded operating systems research
       at  Murray  Hill;  AT&T prepared patent applications to the U.S. government with it.  This
       arrangement enabled the group to acquire a PDP-11; roff promptly proved equal to the  task
       of  formatting  the  manual  for  what  would  become known as “First Edition Unix”, dated
       November 1971.

       Output from all of the foregoing programs was limited to line printers and paper terminals
       such  as  the  IBM  2741  (based  on  the  Selectric line of typewriters) and the Teletype
       Corporation Model 37.  Proportionally spaced type was unavailable.

   New roff and Typesetter roff
       The first years of Unix were spent in rapid evolution.  The  practicalities  of  preparing
       standardized  documents  like  patent  applications (and Unix manual pages), combined with
       McIlroy's enthusiasm for macro languages, perhaps created an irresistible pressure to make
       roff  extensible.   Joe  Ossanna's  nroff, literally a “new roff”, was the outlet for this
       pressure.  By the time of Unix Version 3 (February  1973)—and  still  in  PDP-11  assembly
       language—it  sported  a  swath  of  features  now  considered  essential  to roff systems:
       definition of macros (.de), diversion of text thither (.di), and  removal  thereof  (.rm);
       trap  planting (.wh; “when”) and relocation (.ch; “change”); conditional processing (.if);
       and environments (.ev).  Incremental improvements included assignment  of  the  next  page
       number  (.pn);  no-space  mode (.ns) and restoration of vertical spacing (.rs); the saving
       (.sv) and output (.os) of vertical space; specification of replacement characters for tabs
       (.tc)  and leaders (.lc); configuration of the no-break control character (.c2); shorthand
       to disable automatic hyphenation (.nh); a condensation of what were formerly six different
       requests  for  configuration  of page “titles” (headers and footers) into one (.tl) with a
       length controlled separately from the line length (.lt); automatic line  numbering  (.nm);
       interactive input (.rd), which necessitated buffer-flushing (.fl), and was made convenient
       with early program cessation (.ex); source file inclusion in its modern form (.so;  though
       RUNOFF  had an “.append” control word for a similar purpose) and early advance to the next
       file argument (.nx); ignorable content (.ig); and programmable abort (.ab).

       Third Edition Unix also brought the  pipe(2)  system  call,  the  explosive  growth  of  a
       componentized system based around it, and a “filter model” that remains perceptible today.
       Equally importantly, the Bell Labs site in Murray Hill acquired a  Graphic  Systems  C/A/T
       phototypesetter,  and  with  it came the necessity of expanding the capabilities of a roff
       system to cope with a variety  of  proportionally  spaced  typefaces  at  multiple  sizes.
       Ossanna  wrote  a  parallel  implementation  of nroff for the C/A/T, dubbing it troff (for
       “typesetter roff”).  Unfortunately,  surviving  documentation  does  not  illustrate  what
       requests  were implemented at this time for C/A/T support; the troff(1) man page in Fourth
       Edition Unix (November 1973) does not feature a request list, unlike nroff(1).  Apart from
       typesetter-driven  features, Unix Version 4 roffs added string definitions (.ds); made the
       escape character configurable (.ec); and enabled the user  to  write  diagnostics  to  the
       standard  error  stream  (.tm).  Around 1974, empowered with multiple type sizes, italics,
       and a symbol font specially commissioned by Bell Labs from Graphic Systems, Kernighan  and
       Lorinda  Cherry  implemented eqn for typesetting mathematics.  In the same year, for Fifth
       Edition Unix, Ossanna combined and reimplemented the two roffs in C, using that language's
       preprocessor to generate both from a single source tree.

       Ossanna documented the syntax of the input language to the nroff and troff programs in the
       “Troff User's Manual”, first published in 1976, with further revisions as late as 1992  by
       Kernighan.   (The  original  version  was  entitled “Nroff/Troff User's Manual”, which may
       partially explain why roff practitioners have tended to refer to it by its  AT&T  document
       identifier,  “CSTR #54”.)  Its final revision serves as the de facto specification of AT&T
       troff, and all subsequent implementors of roff systems have worked in its shadow.

       A small and simple set of roff macros  was  first  used  for  the  manual  pages  of  Unix
       Version  4  and  persisted  for  two  further  releases, but the first macro package to be
       formally described and installed was ms by Michael Lesk in Version 6.   He  also  wrote  a
       manual,  “Typing Documents on the Unix System”, describing ms and basic nroff/troff usage,
       updating it as the package accrued features.  Sixth Edition additionally saw the debut  of
       the tbl preprocessor for formatting tables, also by Lesk.

       For  Unix  Version 7 (January 1979), McIlroy designed, implemented, and documented the man
       macro package, introducing most of the macros described in groff_man(7) today, and  edited
       volume 1 of the Version 7 manual using it.  Documents composed using ms featured in volume
       2, edited by Kernighan.

       Meanwhile, troff proved popular even at Unix sites that lacked a C/A/T device.  Tom Ferrin
       of  the University of California at San Francisco combined it with Allen Hershey's popular
       vector fonts to produce vtroff, which translated troff's output to  the  command  language
       used by Versatec and Benson-Varian plotters.

       Ossanna had passed away unexpectedly in 1977, and after the release of Version 7, with the
       C/A/T typesetter becoming supplanted by  alternative  devices  such  as  the  Mergenthaler
       Linotron  202,  Kernighan  undertook  a  revision  and  rewrite of troff to generalize its
       design.  To implement  this  revised  architecture,  he  developed  the  font  and  device
       description  file  formats and the page description language that remain in use today.  He
       described these novelties in the article “A Typesetter-independent TROFF”, last revised in
       1982, and like the troff manual itself, it is widely known by a shorthand, “CSTR #97”.

       Kernighan's  innovations  prepared troff well for the introduction of the Adobe PostScript
       language in 1984 and a vibrant market in laser printers with built-in interpreters for it.
       An  output  driver for PostScript, dpost, was swiftly developed.  However, AT&T's software
       licensing practices  kept  Ossanna's  troff,  with  its  tight  coupling  to  the  C/A/T's
       capabilities, in parallel distribution with device-independent troff throughout the 1980s.
       Today, however, all  actively  maintained  troffs  follow  Kernighan's  device-independent
       design.

   groff—a free roff from GNU
       The  most  important free roff project historically has been groff, the GNU implementation
       of troff, developed by James  Clark  starting  in  1989  and  distributed  under  copyleft
       ⟨http://www.gnu.org/copyleft⟩  licenses,  ensuring  to all the availability of source code
       and the freedom to modify and redistribute it, properties unprecedented in roff systems to
       that  point.   groff  rapidly  attracted contributors, and has served as a replacement for
       almost all applications of  AT&T  troff  (exceptions  include  mv,  a  macro  package  for
       preparation  of viewgraphs and slides, and the ideal preprocessor, which produces diagrams
       from mathematical  constraints).   Beyond  that,  it  has  added  numerous  features;  see
       groff_diff(7).   Since  its inception and for at least the following three decades, it has
       been used by practically all GNU/Linux and BSD operating systems.

       groff continues to be developed, is available for almost all operating systems  in  common
       use (along with several obscure ones), and is free.  These factors make groff the de facto
       roff standard today.

   Other free roffs
       In 2007, Caldera/SCO and Sun Microsystems, having acquired  rights  to  AT&T  Documenter's
       Workbench  (DWB)  troff (a descendant of the Bell Labs code), released it under a free but
       GPL-incompatible license.  This  implementation  ⟨https://github.com/n-t-roff/DWB3.3⟩  was
       made portable to modern POSIX systems, and adopted and enhanced first by Gunnar Ritter and
       then Carsten Kunze to produce Heirloom Doctools troff
       ⟨https://github.com/n-t-roff/heirloom-doctools⟩.

       In  July 2013, Ali Gholami Rudi announced neatroff ⟨https://github.com/aligrudi/neatroff⟩,
       a permissively licensed new implementation.

       Another descendant of DWB troff is part of Plan 9 from User Space
       ⟨https://9fans.github.io/plan9port/⟩.  Since 2021, this troff has been available under
       permissive terms.

Using roff
       When  you  read  a  man  page,  often  a  roff  is  the  program  rendering it.  Some roff
       implementations provide wrapper programs that make it easy to use the roff system from the
       shell's  command  line.  These can be specific to a macro package, like mmroff(1), or more
       general.  groff(1) provides command-line options sparing the user  from  constructing  the
       long,  order-dependent  pipelines  familiar  to  AT&T  troff  users.  Further, a heuristic
       program, grog(1), is available to infer from a document's contents which  groff  arguments
       should be used to process it.

   The roff pipeline
       A typical roff document is prepared by running one or more preprocessors in series, followed
       by a formatter program and then an output driver (or “device postprocessor”).  Commonly,
       these  programs are structured into a pipeline; that is, each is run in sequence such that
       the output of one is taken as the input to the next,  without  passing  through  secondary
       storage.  (On non-Unix systems, pipelines may have to be simulated with temporary files.)

              $ preproc1 < input-file | preproc2 | ... | troff [option] ... \
                  | output-driver

       Once  all  preprocessors have run, they deliver pure roff language input to the formatter,
       which in turn generates a document in a page description language that is then interpreted
       by a postprocessor for viewing, printing, or further processing.

       Each  program  interprets  input in a language that is independent of the others; some are
       purely descriptive, as with tbl(1) and roff output, and  some  permit  the  definition  of
       macros,  as  with  eqn(1)  and  roff  input.  Most roff input files employ the macros of a
       document formatting package, intermixed with instructions for one or  more  preprocessors,
       and  seasoned  with  escape sequences and requests from the roff language.  Some documents
       are simpler still, since their formatting packages discourage direct use of roff requests;
       man  pages  are a prominent example.  Many features of the roff language are seldom needed
       by users; only authors of macro packages require a substantial command of them.

   Preprocessors
       A roff preprocessor is a program that, directly or ultimately, generates output in the
       roff language.  Typically, each preprocessor defines a language of its own and translates
       input in that language into input for roff or another preprocessor.  As an example of the
       latter, chem produces pic input.  Preprocessors must consequently be run in an appropriate
       order; groff(1) handles this automatically for all preprocessors supplied by the GNU roff
       system.

       Portions of the document written in preprocessor languages are usually bracketed by tokens
       that look like roff macro calls.  roff preprocessor programs transform only the regions of
       the document intended for them.  When a preprocessor language is used by a  document,  its
       corresponding  program  must  process  it  before  the  input is seen by the formatter, or
       incorrect rendering is almost guaranteed.
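
       As a minimal sketch of such bracketing, the fragment below embeds a tbl(1) region between
       the .TS and .TE tokens.  The table contents are invented for illustration, and the
       fragment assumes that tbl runs before the formatter, for instance through the -t option
       of groff(1).

```roff
.\" Process with tbl first, e.g.: groff -t -T utf8 file.roff
Text before the table is ordinary roff input.
.TS
tab(@) box;
l n.
Item@Count
apples@3
pears@12
.TE
Text after the table is ordinary roff input again.
```

       Only the lines from .TS to .TE are rewritten by tbl; the surrounding text lines pass
       through to the formatter unchanged.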

       GNU roff provides several preprocessors, including eqn, grn, pic, tbl, refer, and  soelim.
       See groff(1) for a complete list.  Other preprocessors for roff systems are known.

              dformat   depicts data structures;
              grap      constructs statistical charts; and
              ideal     draws diagrams using a constraint-based language.

   Formatter programs
       A  roff  formatter transforms roff language input into a single file in a page description
       language, described in groff_out(5), intended for processing by a selected  device.   This
       page  description  language  is specialized in its parameters, but not its syntax, for the
       selected  device;  the  format  is  device-independent,  but  not  device-agnostic.    The
       parameters  the  formatter  uses  to  arrange  the  document are stored in device and font
       description files; see groff_font(5).

       AT&T Unix had two formatters—nroff for terminals, and troff for typesetters.   Often,  the
       name  troff is used loosely to refer to both.  When generalizing thus, groff documentation
       prefers the term “roff”.  In GNU roff, the formatter program is always troff(1).

   Devices and output drivers
       To a roff system, a device is a hardware interface like a printer,  a  text  or  graphical
       terminal,  or a standardized file format that unrelated software can interpret.  An output
       driver is a program that parses the output of troff and produces instructions specific  to
       the  device  or file format it supports.  An output driver might support multiple devices,
       particularly if they are similar.

       The names of the devices and their driver programs are  not  standardized.   Technological
       fashions  evolve;  the  devices  used  for  document preparation when AT&T troff was first
       written in the 1970s are no longer used in production environments.   Device  capabilities
       have tended to increase, improving resolution and font repertoire, and adding color output
       and hyperlinking.  Further, to reduce file size and processing  time,  AT&T  troff's  page
       description  language  placed  low  limits  on  the magnitudes of some quantities it could
       represent.  Its PostScript output driver, dpost(1), had a  resolution  of  720  units  per
       inch; groff's grops(1) uses 72,000.
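
       A document can observe the device resolution through scaling units.  In the sketch
       below, the register name u is an arbitrary choice; the i scaling unit converts one inch
       to the basic units of the selected output device, so with groff's ps device the
       interpolated value should be 72000.

```roff
.\" Store the number of basic units in one inch.
.nr u 1i
One inch is \n[u] basic units on this device.
```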

roff programming
       Documents using roff are normal text files interleaved with roff formatting elements.  The
       roff language is  powerful  enough  to  support  arbitrary  computation  and  it  supplies
       facilities  that encourage extension.  The primary such facility is macro definition; with
       this feature, macro  packages  have  been  developed  that  are  tailored  for  particular
       applications.
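
       A minimal sketch of macro definition follows.  The macro name NT and its effect are
       hypothetical, chosen only for illustration; de, sp, and ft are standard requests.

```roff
.\" Define a macro that adds vertical space and a bold label.
.de NT
.  sp
.  ft B
Note:
.  ft
..
.\" Call the macro like a request, on a control line.
.NT
Preprocessors must run before the formatter.
```

       The ft request without an argument returns to the previously selected font, so the text
       after the label is set in whatever font was in use before the call.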

   Macro packages
       Macro  packages  can  have a much smaller vocabulary than roff itself; this trait combined
       with their domain-specific nature can make them easy to acquire  and  master.   The  macro
       definitions  of  a  package  are  typically kept in a file called name.tmac (historically,
       tmac.name).  Find details on the naming and placement of macro packages in groff_tmac(5).

       A macro package anticipated for use in a document can be declared to the formatter by  the
       command-line option -m; see troff(1).  It can alternatively be specified within a document
       using the mso request of the groff language; see groff(7).
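
       For example, a document might load a package itself rather than rely on the command
       line.  This sketch assumes that the ms package is installed under the file name s.tmac,
       as groff does; LP is that package's macro for starting an unindented paragraph.

```roff
.\" Roughly what the command-line option "-ms" arranges.
.mso s.tmac
.LP
This paragraph is formatted by the ms package.
```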

       Well-known macro packages include man for traditional man pages  and  mdoc  for  BSD-style
       manual  pages.   Macro  packages  for  typesetting books, articles, and letters include ms
       (from “manuscript macros”), me (named by a system administrator from the first name of its
       creator,  Eric  Allman), mm (from “memorandum macros”), and mom, a punningly named package
       exercising many groff extensions.  See groff_tmac(5) for more.

   The roff formatting language
       The roff language provides requests, escape sequences, macro definition facilities, string
       variables,  registers for storage of numbers or dimensions, and control of execution flow.
       The theoretically minded will observe that a roff is  not  a  mere  markup  language,  but
       Turing-complete.   It  has  storage  (registers),  it can perform tests (as in conditional
       expressions like “(\n[i] >= 1)”), its “if” and related requests alter the flow of control,
       and macro definition permits unbounded recursion.
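
       These ingredients can be combined in a few lines.  The following sketch uses a
       hypothetical macro, countdown, and a register, i, both named only for illustration; it
       tests the register, alters it, and calls itself until the test fails.  The long macro
       name relies on a GNU extension.

```roff
.\" Format the value of register i, decrement it, and recurse
.\" while the value is at least 1.
.de countdown
.  if (\\n[i] >= 1) \{\
\\n[i]
.    nr i -1
.    countdown
.  \}
..
.nr i 3
.countdown
```

       With the register initialized to 3, the formatter should set the text “3 2 1”.  The
       doubled backslashes defer interpolation of the register until the macro is called.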

       Requests  and  escape  sequences  are instructions, predefined parts of the language, that
       perform formatting operations, interpolate stored material, or otherwise change the  state
       of  the parser.  The user can define their own request-like elements by composing together
       text, requests, and escape sequences ad libitum.  A document  writer  will  not  (usually)
       note  any  difference  in  usage  for requests or macros; both are found on control lines.
       However, there is a  distinction;  requests  take  either  a  fixed  number  of  arguments
       (sometimes  zero),  silently  ignoring  any excess, or consume the rest of the input line,
       whereas macros can take a variable number of arguments.  Since arguments are separated  by
       spaces,  macros  require  a  means of embedding a space in an argument; in other words, of
       quoting it.  This then demands a mechanism of embedding the quoting character  itself,  in
       case  it  is needed literally in a macro argument.  AT&T troff had complex rules involving
       the placement and repetition of the double quote to achieve both aims.   groff  cuts  this
       knot  by  supporting  a  special  character  escape sequence for the neutral double quote,
       “\[dq]”, which never performs quoting in the typesetting language, but is simply a  glyph,
       ‘"’.
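
       The difference between quoting and the literal glyph can be sketched as follows.  The
       macro name show is hypothetical; \$1 and \$2 interpolate the first and second macro
       arguments.

```roff
.\" A hypothetical macro that reports its first two arguments.
.de show
First: \\$1; second: \\$2.
..
.\" Double quotes group "two words" into one argument;
.\" \[dq] yields literal quote glyphs without quoting anything.
.show "two words" \[dq]quoted\[dq]
```

       The call should format as: First: two words; second: "quoted".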

       Escape  sequences  start  with a backslash, “\”.  They can appear almost anywhere, even in
       the midst of text on a line, and implement various features, including  the  insertion  of
       special  characters  with “\(xx” or “\[xxx]”, break suppression at input line endings with
       “\c”, font changes with “\f”, type size changes with “\s”, in-line comments with “\"”, and
       many others.
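
       A short sketch exercising several of these escape sequences in running text follows;
       the sentences themselves are invented for illustration.

```roff
.\" Special characters, a font change, and a type size change.
The \fIroff\fP language offers glyphs like \(bu and \[co],
and can set \s+2larger\s0 type.
.\" \c joins the next input line without an intervening space.
This word is writ\c
ten across two input lines.
```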

       Strings  store text.  They are populated with the ds request and interpolated using the \*
       escape sequence.

       Registers store numbers and measurements.  A register can be set with the request  nr  and
       its value can be retrieved by the escape sequence \n.
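
       Both facilities appear in the following sketch.  The string name title and the register
       name count are arbitrary choices; names longer than two characters and the bracketed
       interpolation syntax are GNU extensions.

```roff
.\" Define a string and a register, then interpolate both.
.ds title Concepts of roff
.nr count 3
\*[title] covers \n[count] topics.
```

       The text line should format as “Concepts of roff covers 3 topics.”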

File naming conventions

       The  structure  or  content of a file name, beyond its location in the file system, is not
       significant to roff tools.  roff documents employing “full-service”  macro  packages  (see
       groff_tmac(5))  tend  to  be named with a suffix identifying the package; we thus see file
       names ending in .man, .ms, .me, .mm, and .mom, for instance.  When  installed,  man  pages
       tend  to  be  named with the manual's section number as the suffix.  For example, the file
       name for this document is roff.7.  Practice for “raw” roff documents is  less  consistent;
       they are sometimes seen with a .t suffix.

Input conventions

       Since  troff fills text automatically, it is common practice in the roff language to avoid
       visual composition of text in input files: the esthetic appeal of the formatted output  is
       what  matters.   Therefore, roff input should be arranged such that it is easy for authors
       and maintainers to compose and  develop  the  document,  understand  the  syntax  of  roff
       requests,  macro  calls,  and preprocessor languages used, and predict the behavior of the
       formatter.  Several traditions have accrued in service of these goals.

       • Follow sentence endings in the input with newlines to ease  their  recognition.   It  is
         frequently  convenient  to  end text lines after colons and semicolons as well, as these
         typically precede independent clauses.  Consider doing so after commas; they often occur
         in  lists  that  become easy to scan when itemized by line, or constitute supplements to
         the sentence that are added, deleted, or  updated  to  clarify  it.   Parenthetical  and
         quoted phrases are also good candidates for placement on text lines by themselves.

       • Set your text editor's line length to 72 characters or fewer; see the subsections below.
         This limit, combined with the previous item of advice, makes  it  less  common  that  an
         input  line  will  wrap in your text editor, and thus will help you perceive excessively
         long constructions in your text.  Recall that natural languages originate in speech, not
         writing,  and  that  punctuation  is correlated with pauses for breathing and changes in
         prosody.

       • Use \& after “!”, “?”, and “.” if they are followed by space, tab, or newline characters
         and don't end a sentence.

       • In  filled  text lines, use \& before “.” and “'” if they are preceded by space, so that
         reflowing the input doesn't turn them into control lines.

       • Do not use spaces to perform indentation or align columns of a  table.   Leading  spaces
         are reliable when text is not being filled.

       • Comment  your document.  It is never too soon to apply comments to record information of
         use to future document maintainers (including your future self).  The \" escape sequence
         causes troff to ignore the remainder of the input line.

       • Use  the empty request—a control character followed immediately by a newline—to visually
         manage separation of material in input files.  Many of the groff project's own documents
         use  an  empty  request between sentences, after macro definitions, and where a break is
         expected, and two empty requests between paragraphs or other  requests  or  macro  calls
         that will introduce vertical space into the document.  You can combine the empty request
         with the comment escape sequence to include whole-line comments in  your  document,  and
         even “comment out” sections of it.
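
       The two suggestions about \& can be sketched as follows; the sentences are invented for
       illustration.

```roff
.\" "etc." ends an input line but not the sentence.
Preprocessors include tbl, eqn, pic, etc.\&
and run before the formatter.
.
.\" Keep a leading dot from starting a control line.
Shell users often edit
\&.profile
to set environment variables.
```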

       An  example  sufficiently  long  to  illustrate  most of the above suggestions in practice
       follows.  An arrow → indicates a tab character.

              .\"   nroff this_file.roff | less
              .\"   groff -T ps this_file.roff > this_file.ps
              →The theory of relativity is intimately connected with
              the theory of space and time.
              .
              I shall therefore begin with a brief investigation of
              the origin of our ideas of space and time,
              although in doing so I know that I introduce a
              controversial subject.  \" remainder of paragraph elided
              .
              .

              →The experiences of an individual appear to us arranged
              in a series of events;
              in this series the single events which we remember
              appear to be ordered according to the criterion of
              \[lq]earlier\[rq] and \[lq]later\[rq], \" punct swapped
              which cannot be analysed further.
              .
              There exists,
              therefore,
              for the individual,
              an I-time,
              or subjective time.
              .
              This itself is not measurable.
              .
              I can,
              indeed,
              associate numbers with the events,
              in such a way that the greater number is associated with
              the later event than with an earlier one;
              but the nature of this association may be quite
              arbitrary.
              .
              This association I can define by means of a clock by
              comparing the order of events furnished by the clock
              with the order of a given series of events.
              .
              We understand by a clock something which provides a
              series of events which can be counted,
              and which has other properties of which we shall speak
              later.
              .\" Albert Einstein, _The Meaning of Relativity_, 1922

   Editing with Emacs
       Official GNU doctrine holds that the best program for editing a roff  document  is  Emacs;
       see  emacs(1).   It  provides  an  nroff major mode that is suitable for all kinds of roff
       dialects.  This mode can be activated by the following methods.

       When editing a file within Emacs the mode can be changed by typing “M-x nroff-mode”, where
       M-x  means  to  hold down the meta key (often labelled “Alt”) while pressing and releasing
       the “x” key.

       It is also possible to have the mode automatically selected when a  roff  file  is  loaded
       into the editor.

       • The  most  general  method is to include file-local variables at the end of the file; we
         can also configure the fill column this way.

                .\" Local Variables:
                .\" fill-column: 72
                .\" mode: nroff
                .\" End:

       • Certain file name extensions, such as those commonly used  by  man  pages,  trigger  the
         automatic activation of the nroff mode.

       • Technically, having the sequence

                .\" -*- nroff -*-

         in  the  first  line of a file will cause Emacs to enter the nroff major mode when it is
         loaded into the buffer.  Unfortunately, some implementations of the man(1)  program  are
         confused by this practice, so we discourage it.

   Editing with Vim
       Other  editors  provide  support for roff-style files too, such as vim(1), an extension of
       the vi(1) program.  Vim's highlighting can be made to recognize roff files by setting  the
       filetype  option  in  a  Vim modeline.  For this feature to work, your copy of vim must be
       built with support for, and configured to enable, several features; consult  the  editor's
       online  help  topics  “auto-setting”, “filetype”, and “syntax”.  Then put the following at
       the end of your roff files, after any Emacs configuration:

                     .\" vim: set filetype=groff textwidth=72:

       Replace “groff” in the above with “nroff” if you want highlighting that does not recognize
       many  of  the  GNU  extensions to roff, such as request, register, and string names longer
       than two characters.

Authors

       This document was written by Bernd Warken ⟨groff-bernd.warken-72@web.de⟩  and  G.  Branden
       Robinson ⟨g.branden.robinson@gmail.com⟩.

See also

       Much  roff  documentation is available.  The Bell Labs papers describing AT&T troff remain
       available, and groff is documented comprehensively.

   Internet sites
       Unix  Text  Processing  ⟨https://github.com/larrykollar/Unix-Text-Processing⟩,   by   Dale
       Dougherty and Tim O'Reilly, 1987, Hayden Books.  This well-regarded text brings the reader
       from a state of no knowledge of Unix or  text  editing  (if  necessary)  to  sophisticated
       computer-aided  typesetting.   It  has  been  placed  under a free software license by its
       authors and updated by a team of groff contributors and enthusiasts.

       “History  of  Unix  Manpages”  ⟨http://manpages.bsd.lv/history.html⟩,  an  online  article
       maintained  by the mdocml project, provides an overview of roff development from Saltzer's
       RUNOFF to 2008, with links to original documentation and recollections of the authors  and
       their contemporaries.

       troff.org  ⟨http://www.troff.org/⟩,  Ralph Corderoy's troff site, provides an overview and
       pointers to much historical roff information.

       Multicians ⟨http://www.multicians.org/⟩, a site by Multics enthusiasts, contains a lot  of
       information  on  the  MIT  projects  CTSS  and Multics, including RUNOFF; it is especially
       useful for its glossary and the many links to historical documents.

       The Unix Archive ⟨http://www.tuhs.org/Archive/⟩, curated by  the  Unix  Heritage  Society,
       provides the source code and some binaries of historical Unices (including the source code
       of some versions of troff and its documentation) contributed by their copyright holders.

       Jerry Saltzer's home page  ⟨http://web.mit.edu/Saltzer/www/publications/pubs.html⟩  stores
       some documents using the original RUNOFF formatting language.

       groff ⟨http://www.gnu.org/software/groff⟩, GNU roff's web site, provides convenient access
       to groff's source code repository, bug tracker, and mailing lists (including archives  and
       the subscription interface).

   Historical roff documentation
       Many  AT&T troff documents are available online, and can be found at Ralph Corderoy's site
       (see above) or via Internet search.

       Of foremost significance are two mentioned in  section  “History”  above,  describing  the
       language and its device-independent implementation, respectively.

       “Troff  User's  Manual”  by Joseph F. Ossanna, 1976 (revised by Brian W. Kernighan, 1992),
       AT&T Bell Laboratories Computing Science Technical Report No. 54.

       “A Typesetter-independent TROFF” by Brian  W.  Kernighan,  1982,  AT&T  Bell  Laboratories
       Computing Science Technical Report No. 97.

       You  can obtain many relevant Bell Labs papers in PDF from Bernd Warken's “roff classical”
       GitHub repository ⟨https://github.com/bwarken/roff_classical.git⟩.

   Manual pages
       As a system of multiple components, a roff system potentially has  many  man  pages,  each
       describing an aspect of it.  Unfortunately, there is no consistent naming scheme for these
       pages among the different roff implementations.

       For GNU roff, the groff(1) man page enumerates all man pages distributed with the  system,
       and individual pages frequently refer to external resources as well as manuals distributed
       with groff on a variety of topics.

       With other roffs, you are on your own, but troff(1) might be a good starting point.