Provided by: liblowdown-dev_0.10.0-1_amd64

NAME

     lowdown — simple markdown translator library

LIBRARY

     library “liblowdown”

SYNOPSIS

     #include <sys/queue.h>
     #include <stdio.h>
     #include <lowdown.h>

     struct lowdown_metadata
     struct lowdown_node
     struct lowdown_opts

DESCRIPTION

     This library parses lowdown(5) into various output formats.

     The library consists first of a high-level interface: lowdown_buf(3), lowdown_buf_diff(3),
     lowdown_file(3), and lowdown_file_diff(3).

     The high-level functions interface with low-level functions that perform parsing and formatting.  These
     consist of lowdown_doc_new(3), lowdown_doc_parse(3), and lowdown_doc_free(3) for parsing lowdown(5)
     documents into an abstract syntax tree.

     The front-end functions for freeing, allocation, and rendering are as follows.

        HTML5:
         lowdown_html_free(3)
         lowdown_html_new(3)
         lowdown_html_rndr(3)

        gemini:
         lowdown_gemini_free(3)
         lowdown_gemini_new(3)
         lowdown_gemini_rndr(3)

        LaTeX:
         lowdown_latex_free(3)
         lowdown_latex_new(3)
         lowdown_latex_rndr(3)

        OpenDocument:
         lowdown_odt_free(3)
         lowdown_odt_new(3)
         lowdown_odt_rndr(3)

        roff:
         lowdown_nroff_free(3)
         lowdown_nroff_new(3)
         lowdown_nroff_rndr(3)

        UTF-8 ANSI terminal:
         lowdown_term_free(3)
         lowdown_term_new(3)
         lowdown_term_rndr(3)

        debugging:
         lowdown_tree_rndr(3)

     To compile and link, use pkg-config(1):

     % cc `pkg-config --cflags lowdown` -c -o sample.o sample.c
     % cc -o sample sample.o `pkg-config --libs lowdown`

   Pledge Promises
     The lowdown library is built to operate in security-sensitive environments, such as those using pledge(2)
     on OpenBSD.  The only promise required is stdio for lowdown_file_diff(3) and lowdown_file(3): both require
     access to the stream for reading input.

   Types
     All lowdown functions use one or more of the following structures.

     The struct lowdown_opts structure manages features.  It has the following fields:

           unsigned int feat
                   Features used during the parse.  This bit-field may have the following bits OR'd:

                   LOWDOWN_ATTRS
                           Parse PHP extra link and image attributes.
                   LOWDOWN_AUTOLINK
                           Parse http, https, ftp, mailto, and relative links or link fragments.
                   LOWDOWN_COMMONMARK
                           Tighten input parsing to the CommonMark specification.  This also uses the first
                           ordered list value instead of starting all lists at one.  This feature is
                           experimental and incomplete.
                   LOWDOWN_DEFLIST
                           Parse PHP extra definition lists.  This is currently constrained to single-key lists.
                   LOWDOWN_FENCED
                           Parse GFM fenced (language-specific) code blocks.
                   LOWDOWN_FOOTNOTES
                           Parse MMD style footnotes.  This only supports the referenced footnote style, not the
                           "inline" style.
                   LOWDOWN_HILITE
                           Parse highlight sequences.  This is disabled by default because such sequences may be
                           erroneously interpreted as section headers.
                   LOWDOWN_IMG_EXT
                           Deprecated.  Use LOWDOWN_ATTRS instead.
                   LOWDOWN_MATH
                           Parse mathematics equations.
                   LOWDOWN_METADATA
                           Parse in-document MMD metadata.  For the first paragraph to count as meta-data, the
                           first line must have a colon in it.
                   LOWDOWN_NOCODEIND
                           Do not parse indented content as code blocks.
                   LOWDOWN_NOINTEM
                           Do not parse emphasis within words.
                   LOWDOWN_STRIKE
                           Parse strikethrough sequences.
                   LOWDOWN_SUPER
                           Parse super-scripts.  This accepts foo^bar, which puts the parts following the caret
                           until whitespace in superscripts; or foo^(bar), which puts only the parts in
                           parentheses.
                   LOWDOWN_TABLES
                           Parse GFM tables.
                   LOWDOWN_TASKLIST
                           Parse GFM task list items.

                   The default value is zero (none).

           unsigned int oflags
                   Features used by the output generators.  This bit-field may have the following enabled.  Note
                   that bits are by definition specific to an output type.

                   For LOWDOWN_HTML:

                   LOWDOWN_HTML_ESCAPE
                           If LOWDOWN_HTML_SKIP_HTML has not been set, escapes in-document HTML so that it is
                           rendered as opaque text.
                   LOWDOWN_HTML_HARD_WRAP
                           Retain line-breaks within paragraphs.
                   LOWDOWN_HTML_HEAD_IDS
                           Have an identifier written with each header element consisting of an HTML-escaped
                           version of the header contents.
                   LOWDOWN_HTML_OWASP
                           When escaping text, be extra paranoid in following the OWASP suggestions for which
                           characters to escape.
                   LOWDOWN_HTML_NUM_ENT
                           Convert, when possible, HTML entities to their numeric form.  If not set, the
                           entities are used as given in the input.
                   LOWDOWN_HTML_SKIP_HTML
                           Do not render in-document HTML at all.

                   For LOWDOWN_GEMINI, there are several flags for controlling link placement.  By default,
                   links (images, autolinks, and links) are queued when specified in-line then emitted in a
                   block sequence after the nearest block element.

                   LOWDOWN_GEMINI_LINK_END
                           Emit the queue of links at the end of the document instead of after the nearest block
                           element.
                   LOWDOWN_GEMINI_LINK_IN
                           Render all links within the flow of text.  This will cause breakage with nested
                           links, such as images within links, links in blockquotes, etc.  It should not be used
                           except in carefully crafted documents.
                   LOWDOWN_GEMINI_LINK_NOREF
                           Do not format link labels.  Takes precedence over LOWDOWN_GEMINI_LINK_ROMAN.
                   LOWDOWN_GEMINI_LINK_ROMAN
                           When formatting link labels, use lower-case Roman numerals instead of the default
                           lowercase hexavigesimal (i.e., “a”, “b”, ..., “aa”, “ab”, ...).
                   LOWDOWN_GEMINI_METADATA
                           Print metadata as the canonicalised key followed by a colon then the value, each on
                           one line (newlines replaced by spaces).  The metadata block is terminated by a double
                           newline.  If there is no metadata, this does nothing.

                   There may only be one of LOWDOWN_GEMINI_LINK_END or LOWDOWN_GEMINI_LINK_IN.  If both are
                   specified, the latter is unset.
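
     The default "hexavigesimal" label sequence named under LOWDOWN_GEMINI_LINK_ROMAN can be read as
     bijective base-26 counting.  The following is a minimal sketch of that numbering, assuming labels are
     numbered from one.  The helper demo_link_label is hypothetical and not part of liblowdown, whose own
     implementation may differ.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Write the bijective base-26 label for a 1-based index into buf:
 * 1 -> "a", 26 -> "z", 27 -> "aa", 28 -> "ab".
 * Returns 0 on success, or -1 if n is zero or buf is too small.
 * Hypothetical helper for illustration; not a liblowdown function.
 */
int
demo_link_label(size_t n, char *buf, size_t bufsz)
{
	char	 tmp[32];
	size_t	 len = 0, i;

	assert(buf != NULL);
	if (n == 0)
		return -1;

	/* Digits come out least-significant first. */
	while (n > 0 && len < sizeof(tmp)) {
		n--;	/* shift to 0..25: this system has no zero digit */
		tmp[len++] = (char)('a' + n % 26);
		n /= 26;
	}
	if (n > 0 || len + 1 > bufsz)
		return -1;

	/* Reverse into the caller's buffer and terminate. */
	for (i = 0; i < len; i++)
		buf[i] = tmp[len - 1 - i];
	buf[len] = '\0';
	return 0;
}
```

     With this sketch, indices 1, 2, 26, 27, and 28 give “a”, “b”, “z”, “aa”, and “ab”.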

                   For LOWDOWN_FODT:

                   LOWDOWN_ODT_SKIP_HTML
                           Do not render in-document HTML at all.  Text within HTML elements remains.

                   For LOWDOWN_LATEX:

                   LOWDOWN_LATEX_NUMBERED
                           Use the default numbering scheme for sections, subsections, etc.  If not specified,
                           these are inhibited.
                   LOWDOWN_LATEX_SKIP_HTML
                           Do not render in-document HTML at all.  Text within HTML elements remains.

                   And for LOWDOWN_MAN and LOWDOWN_NROFF:

                   LOWDOWN_NROFF_GROFF
                           Use GNU extensions (i.e., for groff(1)) when rendering output.  The groff arguments
                           must include -mpdfmark for formatting links with LOWDOWN_MAN or -mspdf instead of -ms
                           for LOWDOWN_NROFF.  Applies to the LOWDOWN_MAN and LOWDOWN_NROFF output types.
                   LOWDOWN_NROFF_NUMBERED
                           Use numbered sections if LOWDOWN_NROFF_GROFF is not specified.  Only applies to the
                           LOWDOWN_NROFF output type.
                   LOWDOWN_NROFF_SKIP_HTML
                           Do not render in-document HTML at all.  Text within HTML elements remains.
                   LOWDOWN_NROFF_SHORTLINK
                           Render link URLs in short form.  Applies to images, autolinks, and regular links.
                           Only in LOWDOWN_MAN or when LOWDOWN_NROFF_GROFF is not specified.
                   LOWDOWN_NROFF_NOLINK
                           Don't show links at all if they have embedded text.  Applies to images and regular
                           links.  Only in LOWDOWN_MAN or when LOWDOWN_NROFF_GROFF is not specified.

                   For LOWDOWN_TERM:

                   LOWDOWN_TERM_NOANSI
                           Don't apply ANSI style codes at all.  This implies LOWDOWN_TERM_NOCOLOUR.
                   LOWDOWN_TERM_NOCOLOUR
                           Don't apply ANSI colour codes.  This will still show underline, bold, etc.  This
                           should not be used in difference mode, as the output will make no sense.
                   LOWDOWN_TERM_NOLINK
                           Don't show links at all.  Applies to images and regular links: autolinks are still
                           shown.  This may be combined with LOWDOWN_TERM_SHORTLINK to also shorten autolinks.
                   LOWDOWN_TERM_SHORTLINK
                           Render link URLs in short form.  Applies to images, autolinks, and regular links.
                           This may be combined with LOWDOWN_TERM_NOLINK to only show shortened autolinks.

                   For any mode, you may specify:

                   LOWDOWN_SMARTY
                           Don't use smart typography formatting.
                   LOWDOWN_STANDALONE
                           Emit a full document instead of a document fragment.  This envelope is largely
                           populated from metadata if LOWDOWN_METADATA was provided as an option or as given in
                           meta or metaovr.

           size_t maxdepth
                   The maximum parse depth before the parser exits.  Most documents will have a parse depth in
                   the single digits.

           size_t cols
                   For LOWDOWN_TERM, the "soft limit" for width of terminal output not including margins.  If
                   zero, 80 shall be used.

           size_t hmargin
                   For LOWDOWN_TERM, the left margin (space characters).

           size_t vmargin
                   For LOWDOWN_TERM, the top/bottom margin (newlines).

           enum lowdown_type type
                   May be set to LOWDOWN_HTML for HTML5 output, LOWDOWN_LATEX for LaTeX, LOWDOWN_MAN for -man
                   macros, LOWDOWN_FODT for “flat” OpenDocument, LOWDOWN_TERM for ANSI-compatible UTF-8 terminal
                   output, LOWDOWN_GEMINI for the Gemini format, or LOWDOWN_NROFF for -ms macros.  The
                   LOWDOWN_TREE type causes a debug tree to be written.

                   Both LOWDOWN_MAN and LOWDOWN_NROFF will use troff tables, which usually require the tbl(1)
                   preprocessor.

           char **meta
                   An array of metadata key-value pairs or NULL.  Each pair must appear as if provided on one
                   line (or multiple lines) of the input, including the terminating newline character.  If not
                   consisting of a valid pair (e.g., no newline, no colon), then it is ignored.  When processed,
                   these values are overridden by those in the document (if LOWDOWN_METADATA is specified) or by
                   those in metaovr.

           size_t metasz
                   Number of pairs in meta.

           char **metaovr
                   See meta.  The difference is that metaovr is applied after meta and in-document metadata, so
                   it overrides prior values.

           size_t metaovrsz
                   Number of pairs in metaovr.
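
     The form expected of each meta and metaovr entry can be checked before the array is handed to the
     library.  The following is a minimal sketch covering only the two conditions named above (a colon
     and a terminating newline).  The helper demo_meta_ok is hypothetical and not part of liblowdown; the
     library's own validation may be stricter.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if s has the shape of a metadata pair as described for
 * the meta field: it contains a colon and ends with a newline.
 * Return 0 otherwise.  Hypothetical helper for illustration only.
 */
int
demo_meta_ok(const char *s)
{
	size_t	 len;

	assert(s != NULL);
	len = strlen(s);
	if (len == 0 || s[len - 1] != '\n')
		return 0;	/* no terminating newline */
	if (strchr(s, ':') == NULL)
		return 0;	/* no colon separating key and value */
	return 1;
}
```

     An array of strings that pass this check, such as "title: My Document\n" and "author: A. Writer\n",
     would then be assigned to meta with metasz set to 2, or to metaovr with metaovrsz set likewise.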

     Another common structure is struct lowdown_metadata, which is used to hold parsed (and output-formatted)
     metadata keys and values if LOWDOWN_METADATA was provided as an input bit.  This structure consists of the
     following fields:

           char *key
                   The metadata key in its lowercase, canonical form.

           char *value
                   The metadata value as rendered in the current output format.  This may be an empty string.

     The abstract syntax tree is encoded in struct lowdown_node, which consists of the following.

           enum lowdown_rndrt type
                   The node type.  (Described below.)

           size_t id
                   An identifier unique within the document.  This can be used as a table index since the number
                   is assigned from a monotonically increasing point during the parse.

           struct lowdown_node *parent
                   The parent of the node, or NULL at the root.

           enum lowdown_chng chng
                   Change tracking: whether this node was inserted (LOWDOWN_CHNG_INSERT), deleted
                   (LOWDOWN_CHNG_DELETE), or neither (LOWDOWN_CHNG_NONE).

           struct lowdown_nodeq children
                   A possibly-empty list of child nodes.

           <anon union>
                   An anonymous union of type-specific structures.  See below for a description of each one.
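
     The parent, children, and id fields suggest a simple recursive walk using the sys/queue.h list
     macros.  The following is a sketch only: struct demo_node is a hypothetical stand-in that mirrors a
     few of the fields above so that the example compiles without the library, and the list entry name
     entries is an assumption.  Real code would include lowdown.h and operate on struct lowdown_node.

```c
#include <sys/queue.h>
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in mirroring a few fields of struct lowdown_node. */
struct demo_node {
	size_t			 id;		/* unique within the document */
	struct demo_node	*parent;	/* NULL at the root */
	TAILQ_HEAD(demo_nodeq, demo_node) children;	/* possibly empty */
	TAILQ_ENTRY(demo_node)	 entries;	/* linkage in parent's list */
};

/* Initialise a node and, if parent is non-NULL, append it as a child. */
void
demo_node_init(struct demo_node *n, size_t id, struct demo_node *parent)
{
	assert(n != NULL);
	n->id = id;
	n->parent = parent;
	TAILQ_INIT(&n->children);
	if (parent != NULL)
		TAILQ_INSERT_TAIL(&parent->children, n, entries);
}

/*
 * Depth-first walk that records each node's depth in a table indexed
 * by node id.  Returns the number of nodes visited.
 */
size_t
demo_walk(const struct demo_node *n, size_t depth, size_t *depths,
    size_t tablesz)
{
	const struct demo_node	*child;
	size_t			 count = 1;

	assert(n != NULL && depths != NULL);
	assert(n->id < tablesz);
	depths[n->id] = depth;
	TAILQ_FOREACH(child, &n->children, entries)
		count += demo_walk(child, depth + 1, depths, tablesz);
	return count;
}
```

     Because identifiers are assigned from a monotonically increasing point, a table sized to the node
     count can hold per-node data, as the depths array does here.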

     The nodes may be one of the following types, with default rendering in HTML5 to illustrate functionality.

           LOWDOWN_BLOCKCODE
                   A block-level (and possibly language-specific) snippet of code.  Described by the <pre><code>
                   elements.

           LOWDOWN_BLOCKHTML
                   A block-level snippet of HTML.  This is simply opaque HTML content.  (Only if configured
                   during parse.)

           LOWDOWN_BLOCKQUOTE
                   A block-level quotation.  Described by the <blockquote> element.

           LOWDOWN_CODESPAN
                   A snippet of code.  Described by the <code> element.

           LOWDOWN_DOC_FOOTER
                   Closes out the document opened in LOWDOWN_DOC_HEADER.

           LOWDOWN_DOC_HEADER
                   A header with data gathered from document metadata (if configured).  Described by the <head>
                   element.  (Only if configured during parse.)

           LOWDOWN_DOUBLE_EMPHASIS
                   Bold (or otherwise notable) content.  Described by the <strong> element.

           LOWDOWN_EMPHASIS
                   Italic (or otherwise notable) content.  Described by the <em> element.

           LOWDOWN_ENTITY
                   An HTML entity, which may either be named or numeric.

           LOWDOWN_FOOTNOTE_DEF
                   A footnote within a LOWDOWN_FOOTNOTES_BLOCK node.  Described by the <li id="fnXX"> element.
                   (Only if configured during parse.)

           LOWDOWN_FOOTNOTE_REF
                   A reference to a LOWDOWN_FOOTNOTE_DEF.  Described by the <sup><a> elements.  (Only if
                   configured during parse.)

           LOWDOWN_FOOTNOTES_BLOCK
                   A block of footnotes.  Described by the <div class="footnotes"><hr /><ol> elements.  (Only if
                   configured during parse.)

           LOWDOWN_HEADER
                   A block-level header.  Described (in the HTML case) by one of <h1> through <h6>.

           LOWDOWN_HIGHLIGHT
                   Marked text.  Described by the <mark> element.  (Only if configured during parse.)

           LOWDOWN_HRULE
                   A horizontal line.  Described by <hr>.

           LOWDOWN_IMAGE
                   An image.  Described by the <img> element.

           LOWDOWN_LINEBREAK
                   A hard line-break within a block context.  Described by the <br> element.

           LOWDOWN_LINK
                   A link to external media.  Described by the <a> element.

           LOWDOWN_LINK_AUTO
                   Like LOWDOWN_LINK, except inferred from text content.  Described by the <a> element.  (Only
                   if configured during parse.)

           LOWDOWN_LIST
                   A block-level list enclosure.  Described by <ul> or <ol>.

           LOWDOWN_LISTITEM
                   A block-level list item, always appearing within a LOWDOWN_LIST.  Described by <li>.

           LOWDOWN_MATH_BLOCK
                   A block (or inline) of mathematical text in LaTeX format.  Described within \[xx\] or \(xx\).
                   This is usually (in HTML) externally handled by a JavaScript renderer.  (Only if configured
                   during parse.)

           LOWDOWN_META
                   Meta-data keys and values.  (Only if configured during parse.)  These are described by
                   elements in the <head> element.

           LOWDOWN_NORMAL_TEXT
                   Normal text content.

           LOWDOWN_PARAGRAPH
                   A block-level paragraph.  Described by the <p> element.

           LOWDOWN_RAW_HTML
                   An inline of raw HTML.  (Only if configured during parse.)

           LOWDOWN_ROOT
                   The root of the document.  This is always the topmost node, and the only node where the
                   parent field is NULL.

           LOWDOWN_STRIKETHROUGH
                   Content struck through.  Described by the <del> element.  (Only if configured during parse.)

           LOWDOWN_SUPERSCRIPT
                   A superscript.  Described by the <sup> element.  (Only if configured during parse.)

           LOWDOWN_TABLE_BLOCK
                   A table block.  Described by <table>.  (Only if configured during parse.)

           LOWDOWN_TABLE_BODY
                   A table body section.  Described by <tbody>.  Parent is always LOWDOWN_TABLE_BLOCK.  (Only if
                   configured during parse.)

           LOWDOWN_TABLE_CELL
                   A table cell.  Described by <td> or <th> if in the header.  Parent is always
                   LOWDOWN_TABLE_ROW.  (Only if configured during parse.)

           LOWDOWN_TABLE_HEADER
                   A table header section.  Described by <thead>.  Parent is always LOWDOWN_TABLE_BLOCK.  (Only
                   if configured during parse.)

           LOWDOWN_TABLE_ROW
                   A table row.  Described by <tr>.  Parent is always LOWDOWN_TABLE_HEADER or
                   LOWDOWN_TABLE_BODY.  (Only if configured during parse.)

           LOWDOWN_TRIPLE_EMPHASIS
                   Combination of LOWDOWN_EMPHASIS and LOWDOWN_DOUBLE_EMPHASIS.

     The following anonymous union structures correspond to certain nodes.  Note that all buffers may be zero-
     length.

           rndr_autolink
                   For LOWDOWN_LINK_AUTO, the link address as link and the link type type, which may be one of
                   HALINK_EMAIL for e-mail links and HALINK_NORMAL otherwise.  Any buffer may be empty-sized.

           rndr_blockcode
                   For LOWDOWN_BLOCKCODE, the opaque text of the block and the optional lang of the code
                   language.

           rndr_blockhtml
                   For LOWDOWN_BLOCKHTML, the opaque HTML text.

           rndr_codespan
                   The opaque text of the contents.

           rndr_definition
                   For LOWDOWN_DEFINITION, containing flags that may be HLIST_FL_BLOCK if the definition list
                   should be interpreted as containing block elements.

           rndr_entity
                   For LOWDOWN_ENTITY, the entity text.

           rndr_footnote_def
                   For LOWDOWN_FOOTNOTE_DEF, the footnote number num (starting at one).  This matches a single
                   LOWDOWN_FOOTNOTE_REF similarly numbered.  The key is its original in-document reference key.

           rndr_footnote_ref
                   For a LOWDOWN_FOOTNOTE_REF reference to a LOWDOWN_FOOTNOTE_DEF, the footnote number num
                   (starting at one).  The def is the content parsed as children to the matching
                   LOWDOWN_FOOTNOTE_DEF.  The key is its original in-document reference key.

           rndr_header
                   For LOWDOWN_HEADER, the level of the header starting at zero.  This value is relative to the
                   metadata base header level, defaulting to one (the top-level header).

           rndr_image
                   For LOWDOWN_IMAGE, the image address link, the image title title, dimensions NxN (width by
                   height) in dims, and alternate text alt.  CSS in-line style for width and height may be given
                   in attr_width and/or attr_height, and a space-separated list of classes may be in attr_cls
                   and a single identifier may be in attr_id.

           rndr_link
                   Like rndr_autolink, but without a type and further defining an optional link title title,
                   optional space-separated class list attr_cls, and optional single identifier attr_id.

           rndr_list
                   For LOWDOWN_LIST, consists of a bitfield flags that may be set to HLIST_FL_ORDERED for an
                   ordered list and HLIST_FL_UNORDERED for an unordered one.  If HLIST_FL_BLOCK is set, the list
                   should be output as if items were separate blocks.  The start value for HLIST_FL_ORDERED is
                   the starting list item position, which is one by default and never zero.

           rndr_listitem
                   For LOWDOWN_LISTITEM, consists of a bitfield flags that may be set to HLIST_FL_ORDERED for an
                   ordered list, HLIST_FL_UNORDERED for an unordered list, HLIST_FL_DEF for definition list
                   data, HLIST_FL_CHECKED or HLIST_FL_UNCHECKED for an unordered “task” list element, and/or
                   HLIST_FL_BLOCK for list item output as if containing block elements.  The HLIST_FL_BLOCK
                   should not be used: use the parent list (or definition list) flags for this.  The num is the
                   index in a HLIST_FL_ORDERED list.  It is monotonically increasing with each item in the list,
                   starting at the start variable given in struct rndr_list.

           rndr_math
                   For LOWDOWN_MATH_BLOCK, the mode of display displaymode: if 1, in-line math; if 2, multi-line.

           rndr_meta
                   Each LOWDOWN_META key-value pair is represented.  The keys are lower-case without spaces or
                   non-ASCII characters.  If provided, enclosed nodes may consist only of LOWDOWN_NORMAL_TEXT
                   and LOWDOWN_ENTITY.

           rndr_normal_text
                   The basic text content for LOWDOWN_NORMAL_TEXT.

           rndr_paragraph
                   For LOWDOWN_PARAGRAPH, specifies how many lines the paragraph has in the input file and beoln,
                   set to non-zero if the paragraph ends with an empty line instead of a breaking block element.

           rndr_raw_html
                   For LOWDOWN_RAW_HTML, the opaque HTML text.

           rndr_table
                   For LOWDOWN_TABLE_BLOCK, the number of columns in each row or header row.  The number of
                   columns in rndr_table, rndr_table_header, and rndr_table_cell are the same.

           rndr_table_cell
                   For LOWDOWN_TABLE_CELL, the current col column number out of columns.  See rndr_table_header
                   for a description of the bits in flags.  The number of columns in rndr_table,
                   rndr_table_header, and rndr_table_cell are the same.

           rndr_table_header
                   For LOWDOWN_TABLE_HEADER, the number of columns in each row and the per-column flags, which
                   may be bits of HTBL_FL_ALIGN_LEFT, HTBL_FL_ALIGN_RIGHT, or HTBL_FL_ALIGN_CENTER when masked
                   with HTBL_FL_ALIGNMASK; or HTBL_FL_HEADER.  The number of columns in rndr_table,
                   rndr_table_header, and rndr_table_cell are the same.
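
     The per-column flags combine an alignment, read through HTBL_FL_ALIGNMASK, with the separate
     HTBL_FL_HEADER bit.  The following is a sketch of that test.  The DEMO_FL_* constants are
     hypothetical stand-ins whose numeric values are assumptions chosen for illustration; real code
     would include lowdown.h and use the HTBL_FL_* definitions instead.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-ins for the HTBL_FL_* constants.  The values are
 * assumptions made for this sketch, not taken from lowdown.h.
 */
enum demo_tbl_fl {
	DEMO_FL_ALIGN_LEFT	= 1,
	DEMO_FL_ALIGN_RIGHT	= 2,
	DEMO_FL_ALIGN_CENTER	= 3,
	DEMO_FL_ALIGNMASK	= 3,
	DEMO_FL_HEADER		= 4
};

/* Map a cell's flags to an alignment keyword, or NULL if none is set. */
const char *
demo_cell_align(unsigned int flags)
{
	/* Mask first so that the header bit does not affect the test. */
	switch (flags & DEMO_FL_ALIGNMASK) {
	case DEMO_FL_ALIGN_LEFT:
		return "left";
	case DEMO_FL_ALIGN_RIGHT:
		return "right";
	case DEMO_FL_ALIGN_CENTER:
		return "center";
	default:
		return NULL;
	}
}

/* Return 1 if the header bit is set in flags, 0 otherwise. */
int
demo_cell_is_header(unsigned int flags)
{
	return (flags & DEMO_FL_HEADER) != 0;
}
```

     Comparing the masked value, rather than testing single alignment bits, keeps the check valid even
     when one alignment constant shares bits with another, as the centre value does in this sketch.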

SEE ALSO

     lowdown(1), lowdown_buf(3), lowdown_buf_diff(3), lowdown_diff(3), lowdown_doc_free(3), lowdown_doc_new(3),
     lowdown_doc_parse(3), lowdown_file(3), lowdown_file_diff(3), lowdown_gemini_free(3), lowdown_gemini_new(3),
     lowdown_gemini_rndr(3), lowdown_html_free(3), lowdown_html_new(3), lowdown_html_rndr(3),
     lowdown_latex_free(3), lowdown_latex_new(3), lowdown_latex_rndr(3), lowdown_metaq_free(3),
     lowdown_nroff_free(3), lowdown_nroff_new(3), lowdown_nroff_rndr(3), lowdown_odt_free(3),
     lowdown_odt_new(3), lowdown_odt_rndr(3), lowdown_term_free(3), lowdown_term_new(3), lowdown_term_rndr(3),
     lowdown_tree_rndr(3), lowdown(5)

AUTHORS

     lowdown was forked from hoedown (https://github.com/hoedown/hoedown) by Kristaps Dzonsons,
     kristaps@bsd.lv.  It has been considerably modified since.