Ubuntu Manpage: markdown - markdown documents parsing


Provided by: libsoldout1-dev_1.3-6_amd64

NAME

       markdown - markdown documents parsing

SYNOPSIS

           #include <markdown.h>

           void markdown ( struct buf *ob,
                           struct buf *ib,
                           const struct mkd_renderer *rndr);

DESCRIPTION

       `markdown()` is the only exported function in libsoldout and starts the parsing process of a markdown
       document. You can find more information about the markdown language on John Gruber's website:

       http://daringfireball.net/projects/markdown/

       Libsoldout only performs the parsing of markdown input; the construction of the output is left to a
       *renderer*, which is a set of callback functions called when markdown elements are encountered. Pointers
       to these functions are gathered into a `struct mkd_renderer` along with some renderer-related data. I
       think the struct declaration is pretty self-explanatory:

       struct mkd_renderer {
           /* document level callbacks */
           void (*prolog)(struct buf *ob, void *opaque);
           void (*epilog)(struct buf *ob, void *opaque);

           /* block level callbacks - NULL skips the block */
           void (*blockcode)(struct buf *ob, struct buf *text, void *opaque);
           void (*blockquote)(struct buf *ob, struct buf *text, void *opaque);
           void (*blockhtml)(struct buf *ob, struct buf *text, void *opaque);
           void (*header)(struct buf *ob, struct buf *text,
                               int level, void *opaque);
           void (*hrule)(struct buf *ob, void *opaque);
           void (*list)(struct buf *ob, struct buf *text, int flags,
                        void *opaque);
           void (*listitem)(struct buf *ob, struct buf *text,
                               int flags, void *opaque);
           void (*paragraph)(struct buf *ob, struct buf *text, void *opaque);
           void (*table)(struct buf *ob, struct buf *head_row,
                               struct buf *rows, void *opaque);
           void (*table_cell)(struct buf *ob, struct buf *text, int flags,
                                   void *opaque);
           void (*table_row)(struct buf *ob, struct buf *cells, int flags,
                                   void *opaque);

           /* span level callbacks - NULL or return 0
              prints the span verbatim */
           int (*autolink)(struct buf *ob, struct buf *link,
                           enum mkd_autolink type, void *opaque);
           int (*codespan)(struct buf *ob, struct buf *text, void *opaque);
           int (*double_emphasis)(struct buf *ob, struct buf *text,
                               char c, void *opaque);
           int (*emphasis)(struct buf *ob, struct buf *text,
                           char c, void *opaque);
           int (*image)(struct buf *ob, struct buf *link, struct buf *title,
                               struct buf *alt, void *opaque);
           int (*linebreak)(struct buf *ob, void *opaque);
           int (*link)(struct buf *ob, struct buf *link, struct buf *title,
                           struct buf *content, void *opaque);
           int (*raw_html_tag)(struct buf *ob, struct buf *tag, void *opaque);
           int (*triple_emphasis)(struct buf *ob, struct buf *text,
                               char c, void *opaque);

           /* low level callbacks - NULL copies input directly
              into the output */
           void (*entity)(struct buf *ob, struct buf *entity, void *opaque);
           void (*normal_text)(struct buf *ob, struct buf *text,
                               void *opaque);

           /* renderer data */
           int max_work_stack; /* prevent arbitrarily deep recursion */
           const char *emph_chars; /* chars that trigger emphasis rendering */
           void *opaque; /* opaque data sent to every rendering callback */
       };

       The first argument of a renderer function is always the output buffer, where the function is supposed to
       write its output. It's not necessarily the output buffer given to `markdown()`, because in some cases
       rendering into a temporary buffer is needed.

       The last argument of a renderer function is always an opaque pointer, which is equal to the `opaque`
       member of `struct mkd_renderer`. The name "opaque" might not be well-chosen, but it means a pointer
       *opaque for the parser, **not** for the renderer*. It means that my parser blindly passes around a
       pointer to data that only you know about, in case you need to store an internal state or whatever. I
       have not found anything to put in this pointer in my example renderers, so it is set to NULL in the
       structure and never looked at in the callbacks.
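
       As an illustration of the `opaque` pointer, here is a minimal sketch of a header-style callback that
       numbers headers with a counter reached through `opaque`. To stay self-contained it does not use
       libsoldout at all: `struct outbuf`, `out_puts()`, `struct doc_state` and `count_header()` are
       hypothetical names, and the text is passed as a plain C string instead of a `struct buf`. Only the
       argument order (output buffer first, `opaque` last) follows the declaration above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical fixed-size stand-in for libsoldout's struct buf. */
struct outbuf {
    char data[256];
    size_t size;
};

/* Append a C string, truncating instead of overflowing. */
void out_puts(struct outbuf *ob, const char *s)
{
    size_t n = strlen(s);
    if (ob->size + n >= sizeof ob->data)
        n = sizeof ob->data - 1 - ob->size;
    memcpy(ob->data + ob->size, s, n);
    ob->size += n;
    ob->data[ob->size] = '\0';
}

/* Renderer-private state: the parser would never look inside it. */
struct doc_state {
    int headers_seen;
};

/* Header-style callback: output buffer first, opaque pointer last. */
void count_header(struct outbuf *ob, const char *text, int level, void *opaque)
{
    struct doc_state *st = opaque;
    char line[128];

    assert(st != NULL);
    st->headers_seen += 1;
    snprintf(line, sizeof line, "<h%d id=\"sec%d\">%s</h%d>\n",
             level, st->headers_seen, text, level);
    out_puts(ob, line);
}
```

       With the real library, the address of such a state structure would be stored in the `opaque` member of
       your `struct mkd_renderer` before calling `markdown()`.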

       `emph_chars` is a zero-terminated string which contains the set of characters that trigger emphasis. In
       regular markdown, emphasis is only triggered by '_' and '*', but in some extensions it might be useful
       to add other characters to this list. For example in my extension to handle `<ins>` and `<del>` spans,
       delimited respectively by "++" and "--", I have added '+' and '-' to `emph_chars`. The character that
       triggered the emphasis is then passed to `emphasis`, `double_emphasis` and `triple_emphasis` through the
       parameter `c`.

       Function pointers in `struct mkd_renderer` can be NULL, but this has a different meaning depending on
       whether the callback is block-level or span-level. A null block-level callback will make the
       corresponding block disappear from the output, as if the callback were an empty function. A null
       span-level callback will cause the corresponding element to be treated as normal characters, copied
       verbatim to the output.

       So for example, to disable links and images (e.g. because you consider them dangerous), just put a null
       pointer in `rndr.link` and `rndr.image` and the bracketed stuff will be present as-is in the output,
       whereas a null pointer in `header` will remove all header-looking blocks. If you want an otherwise
       standard markdown-to-XHTML conversion, you can take the example `mkd_xhtml` struct, copy it into your
       own `struct mkd_renderer` and then assign NULL to the `link` and `image` members.

       Moreover, span-level callbacks return an integer, which tells whether the renderer agrees to render the
       item (non-zero return value) or whether it should be copied verbatim (zero return value). This allows
       you to only accept some specific inputs. For example, my extension for `<ins>` and `<del>` spans
       requires *exactly* two '-' or '+' as delimiters, so when `emphasis` and `triple_emphasis` are called
       with '-' or '+', they return 0.
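
       The accept-or-refuse convention can be sketched as follows. This is not the code of the bundled
       renderers: `struct outbuf`, `out_puts()`, `ins_del_double()` and `ins_del_single()` are hypothetical
       names, the text is a plain C string rather than a `struct buf`, and refusing every character other than
       '+' and '-' in the double-delimiter callback is a simplification made for the example.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical fixed-size stand-in for libsoldout's struct buf. */
struct outbuf {
    char data[256];
    size_t size;
};

/* Append a C string, truncating instead of overflowing. */
void out_puts(struct outbuf *ob, const char *s)
{
    size_t n = strlen(s);
    if (ob->size + n >= sizeof ob->data)
        n = sizeof ob->data - 1 - ob->size;
    memcpy(ob->data + ob->size, s, n);
    ob->size += n;
    ob->data[ob->size] = '\0';
}

/* Double-delimiter callback: renders "++x++" and "--x--", returns 0 otherwise
   so that the span would be copied verbatim. */
int ins_del_double(struct outbuf *ob, const char *text, char c, void *opaque)
{
    const char *tag;

    (void)opaque;                 /* no renderer state needed here */
    assert(ob != NULL);
    if (c == '+')
        tag = "ins";
    else if (c == '-')
        tag = "del";
    else
        return 0;                 /* refuse: nothing is written */
    out_puts(ob, "<");  out_puts(ob, tag); out_puts(ob, ">");
    out_puts(ob, text);
    out_puts(ob, "</"); out_puts(ob, tag); out_puts(ob, ">");
    return 1;
}

/* Single-delimiter callback: '+' and '-' are refused, so "+x+" stays as typed. */
int ins_del_single(struct outbuf *ob, const char *text, char c, void *opaque)
{
    (void)opaque;
    if (c == '+' || c == '-')
        return 0;
    out_puts(ob, "<em>");
    out_puts(ob, text);
    out_puts(ob, "</em>");
    return 1;
}
```

       The important point is that a refusing callback returns 0 without having written anything to the output
       buffer.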

       Special care should be taken when writing `autolink`, `link` and `image` callbacks, because the arguments
       `link`, `title` and `alt` are unsanitized data taken directly from the input file. It is up to the
       renderer to escape whatever needs escaping to prevent bad things from happening. To help you write
       renderers, the function `lus_attr_escape()` escapes all problematic characters in (X)HTML: `'<'`, `'>'`,
       `'&'` and `'"'`.
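
       To make the escaping requirement concrete, here is a standalone sketch that escapes the four characters
       listed above. It is not the implementation of `lus_attr_escape()`: the name `escape_attr()`, its
       string-based signature and its error convention are assumptions made for this example.

```c
#include <assert.h>
#include <string.h>

/* Escape '<', '>', '&' and '"' from src into dst (capacity cap, NUL included).
   Returns the escaped length, or (size_t)-1 if dst is too small. */
size_t escape_attr(char *dst, size_t cap, const char *src)
{
    size_t used = 0;

    assert(dst != NULL && src != NULL);
    for (; *src; src++) {
        char one[2] = { *src, '\0' };   /* unescaped character as a string */
        const char *rep;
        size_t n;

        switch (*src) {
        case '<': rep = "&lt;";   break;
        case '>': rep = "&gt;";   break;
        case '&': rep = "&amp;";  break;
        case '"': rep = "&quot;"; break;
        default:  rep = one;      break;
        }
        n = strlen(rep);
        if (used + n + 1 > cap)         /* keep room for the final NUL */
            return (size_t)-1;
        memcpy(dst + used, rep, n);
        used += n;
    }
    if (cap == 0)
        return (size_t)-1;
    dst[used] = '\0';
    return used;
}
```

       A `link` or `image` callback would run the URL, title and alt text through such a function before
       writing them inside attribute quotes.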

       The `normal_text` callback should also perform whatever escaping is needed for the output to look like
       the input data.

PHP-MARKDOWN-LIKE TABLES

       Tables are one of the few extensions that are quite difficult and/or hacky to implement using a vanilla
       Markdown parser and a renderer. Thus support for them has been built into the parser, using dedicated
       callbacks:

         - `table_cell`, which is called with the span-level contents of the cell;
         - `table_row`, which is called with data returned by `table_cell`;
         - `table`, which is called with data returned by `table_row`.

       The input format to describe tables is taken from PHP-Markdown, and looks like this:

           header 1    | header 2      | header 3      | header 4
           ------------|:-------------:|--------------:|:--------------
           first line  |   centered    | right-aligned | left-aligned
           second line |   centered    |:   centered  :| left-aligned
           third line  |: left-aligned | right-aligned | right-aligned :
           column separators | don't need | to be | aligned in the source
           | extra separators | are allowed | at both ends | of the line |
           | correct number of cells per row is not enforced |
           | pipe characters can be embedded in cell text by escaping them: \| |

       Each  row  of  the  input  text  is  a  single row in the output, except the header rule, which is purely
       syntactic.

       Each cell in a row is delimited by a pipe (`|`) character. Optionally, a pipe character can also be
       present at the beginning and/or at the end of the line. Column separators don't have to be aligned in
       the input, but aligning them makes the input more readable.

       There is no check of the "squareness" of the table: `table_cell` is called once for each cell provided in
       the input, so the number of calls can differ from one row to the next. If the output *has* to have a
       given number of cells per row, it's up to the renderer to enforce it, using state transmitted through
       the `opaque` pointer.
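
       One way a renderer might enforce a fixed number of cells per row is sketched below. It assumes that
       the cells of a row are rendered before the row itself, as the callback list above suggests. All names
       (`struct outbuf`, `out_puts()`, `struct table_state`, `sq_cell()`, `sq_row()`) are hypothetical, plain
       C strings replace `struct buf`, and the `flags` argument is left out for brevity.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical fixed-size stand-in for libsoldout's struct buf. */
struct outbuf {
    char data[256];
    size_t size;
};

/* Append a C string, truncating instead of overflowing. */
void out_puts(struct outbuf *ob, const char *s)
{
    size_t n = strlen(s);
    if (ob->size + n >= sizeof ob->data)
        n = sizeof ob->data - 1 - ob->size;
    memcpy(ob->data + ob->size, s, n);
    ob->size += n;
    ob->data[ob->size] = '\0';
}

/* State shared by the two callbacks through the opaque pointer. */
struct table_state {
    int expected;   /* number of cells every row must have */
    int seen;       /* cells emitted so far in the current row */
};

/* Cell callback: surplus cells are silently dropped. */
void sq_cell(struct outbuf *ob, const char *text, void *opaque)
{
    struct table_state *st = opaque;

    assert(st != NULL);
    if (st->seen >= st->expected)
        return;
    st->seen++;
    out_puts(ob, "<td>");
    out_puts(ob, text);
    out_puts(ob, "</td>");
}

/* Row callback: pads short rows with empty cells, then resets the counter. */
void sq_row(struct outbuf *ob, const char *cells, void *opaque)
{
    struct table_state *st = opaque;

    assert(st != NULL);
    out_puts(ob, "<tr>");
    out_puts(ob, cells);
    for (; st->seen < st->expected; st->seen++)
        out_puts(ob, "<td></td>");
    out_puts(ob, "</tr>\n");
    st->seen = 0;
}
```

       Dropping surplus cells and padding short rows is only one possible policy; a stricter renderer could
       just as well report an error through the same state structure.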

       The header rule is a line containing only horizontal blanks (space and tab), dashes (`-`), colons (`:`)
       and separators (`|`). Moreover, it *must* be the second line of the table. When such a header rule is
       detected, the first line of the table is considered a header, and is passed as the `head_row` argument
       to the `table` callback. In addition, `table_row` and `table_cell` are called for that specific row with
       the `MKD_CELL_HEAD` flag.
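
       The character-set test described above can be sketched as a small predicate. The function name
       `is_header_rule()` is hypothetical, and requiring at least one dash is an assumption of this sketch,
       not something stated about the actual parser.

```c
#include <assert.h>
#include <stddef.h>

/* Return 1 if line contains only blanks, dashes, colons and pipes,
   with at least one dash (assumption), 0 otherwise.
   The scan stops at the end of the string or at the first newline. */
int is_header_rule(const char *line)
{
    int dashes = 0;

    assert(line != NULL);
    for (; *line && *line != '\n'; line++) {
        switch (*line) {
        case '-':
            dashes++;
            break;
        case ' ': case '\t': case ':': case '|':
            break;
        default:
            return 0;     /* any other character disqualifies the line */
        }
    }
    return dashes > 0;
}
```

       A caller would apply this test to the second line of a candidate table only, since a header rule
       anywhere else is not recognized as such.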

       Alignment  is  defined  on  a per-cell basis, and specified by a colon (`:`) at the very beginning of the
       input span (i.e. directly after the `|` separator, or as the first character on the line) and/or  at  the
       very  end  of it (i.e.  directly before the separator, or as the last character on the line). A cell with
       such a leading colon only is left-aligned (`MKD_CELL_ALIGN_LEFT`), one with  a  trailing  colon  only  is
       right-aligned (`MKD_CELL_ALIGN_RIGHT`), and one with both is centered (`MKD_CELL_ALIGN_CENTER`).
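
       The colon rules can be summarized in a few lines of code. The sketch below uses its own hypothetical
       `ALIGN_*` constants rather than the `MKD_CELL_ALIGN_*` flags, whose numeric values are not given here,
       and the function name `cell_alignment()` is likewise an assumption. Treating a cell that consists of a
       single colon as left-aligned is a choice made for the example.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical alignment codes; CENTER is LEFT and RIGHT combined. */
enum { ALIGN_NONE = 0, ALIGN_LEFT = 1, ALIGN_RIGHT = 2, ALIGN_CENTER = 3 };

/* Inspect the raw cell span, i.e. the text between two separators. */
int cell_alignment(const char *cell)
{
    size_t len;
    int align = ALIGN_NONE;

    assert(cell != NULL);
    len = strlen(cell);
    if (len > 0 && cell[0] == ':')
        align |= ALIGN_LEFT;          /* leading colon */
    if (len > 1 && cell[len - 1] == ':')
        align |= ALIGN_RIGHT;         /* trailing colon, distinct from the first */
    return align;
}
```

       Note that the colon has to be the very first or very last character of the span: a colon separated from
       the `|` by a space does not count.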

       A column-wise default alignment can be specified with the same syntax on the header rule.

RENDERER EXAMPLES

       While  libsoldout  is  designed to perform only the parsing of markdown files, and to let you provide the
       renderer callbacks, a few renderers have been included, both to illustrate how to write a set of renderer
       functions and to allow anybody who does not need special extensions to use libsoldout without hassle.

       All the examples provided here come in two flavors, `_html` producing HTML code (self-closing tags are
       rendered like this: `<hr>`), and `_xhtml` producing XHTML code (self-closing tags like `<hr />`).

STANDARD MARKDOWN RENDERER

       `mkd_html` and `mkd_xhtml` implement standard Markdown to (X)HTML translation without any extension.

DISCOUNT-ISH

       `discount_html` and `discount_xhtml` implement on top of the standard markdown *some* of  the  extensions
       found in Discount.

       Actually, none of the Discount extensions that are not provided here can be easily implemented in
       libsoldout without touching the parsing code, hence they do not strictly belong to the renderer realm.
       However some (maybe all, not sure about tables) of these extensions can be implemented fairly easily
       with libsoldout by using both a dedicated renderer and some preprocessing to make the extension look
       like something closer to the original markdown syntax.

       Here is a list of all extensions included in these renderers:

        - image size specification, by appending " =(width)x(height)" to
          the link,
        - pseudo-protocols in links:
           * abbr:_description_ for `<abbr title="`_description_`">...</abbr>`
           * class:_name_ for `<span class="`_name_`">...</span>`
           * id:_name_ for `<a id="`_name_`">...</a>`
           * raw:_text_ for verbatim unprocessed _text_ inclusion
        - class blocks: blockquotes beginning with %_class_% will be
           rendered as a `div` of the given class(es).
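
       Recognizing the pseudo-protocols listed above amounts to a prefix match on the link target. The sketch
       below is not taken from the `discount_*` renderers: `enum pseudo` and `classify_link()` are hypothetical
       names, and the link is a plain C string rather than a `struct buf`.

```c
#include <assert.h>
#include <string.h>

enum pseudo { PSEUDO_NONE, PSEUDO_ABBR, PSEUDO_CLASS, PSEUDO_ID, PSEUDO_RAW };

/* Classify a link target and point *arg just past the prefix.
   For an ordinary link, *arg is the whole link. */
enum pseudo classify_link(const char *link, const char **arg)
{
    static const struct {
        const char *prefix;
        enum pseudo kind;
    } table[] = {
        { "abbr:",  PSEUDO_ABBR  },
        { "class:", PSEUDO_CLASS },
        { "id:",    PSEUDO_ID    },
        { "raw:",   PSEUDO_RAW   },
    };
    size_t i;

    assert(link != NULL && arg != NULL);
    for (i = 0; i < sizeof table / sizeof table[0]; i++) {
        size_t n = strlen(table[i].prefix);
        if (strncmp(link, table[i].prefix, n) == 0) {
            *arg = link + n;
            return table[i].kind;
        }
    }
    *arg = link;
    return PSEUDO_NONE;
}
```

       A `link` callback built on this would emit `<abbr>`, `<span>` or `<a id>` depending on the returned
       kind, and fall back to a regular `<a href>` for `PSEUDO_NONE`.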

NATASHA'S OWN EXTENSIONS

       `nat_html` and `nat_xhtml` implement, on top of the Discount extensions, some things that I need in
       order to losslessly convert my existing HTML into extended markdown.

       Here is a list of these extensions:

        - id attribute for headers, using the syntax _id_#_Header text_
        - class attribute for paragraphs, by putting class name(s) between
          parentheses at the very beginning of the paragraph
        - `<ins>` and `<del>` spans, using respectively `++` and `--` as
          delimiters (with emphasis-like restrictions, i.e. an opening
          delimiter cannot be followed by a whitespace, and a closing
          delimiter cannot be preceded by a whitespace).
        - plain `<span>` without attribute, using emphasis-like delimiter `|`

       Here is an example using all of them:

           ###atx_id#ID was chosen to look nice in atx-style headers ###

           setext_id#Though it will also work in setext-style headers
           ----------------------------------------------------------

           Here is a paragraph with --deleted-- and ++inserted++ text.

           I use CSS rules to render poetry and other verses, using a plain
           `<span>` for each verse, and enclosing each group of verses in
           a `<p class="verse">`. Here is what it would look like:

           (verse)|And on the pedestal these words appear:|
           |"My name is Ozymandias, king of kings:|
           |Look on my works, ye Mighty, and despair!"|
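
       As a rough illustration of the header id syntax shown above, the sketch below splits header text of the
       form _id_#_Header text_ at the first `#`. It assumes the leading `#` marks of an atx header have already
       been removed, and that an id may not contain blanks; both points, like the name `split_header_id()`, are
       assumptions of this example and not a description of the `nat_*` renderers.

```c
#include <assert.h>
#include <string.h>

/* Copy the id prefix of "id#Header text" into id (capacity idcap) and return
   a pointer to the header text. When there is no usable id, id is set to ""
   and the original text is returned unchanged. */
const char *split_header_id(const char *text, char *id, size_t idcap)
{
    size_t n;

    assert(text != NULL && id != NULL && idcap > 0);
    id[0] = '\0';
    n = strcspn(text, "# \t");    /* length of the run before '#' or a blank */
    if (n == 0 || text[n] != '#' || n >= idcap)
        return text;
    memcpy(id, text, n);
    id[n] = '\0';
    return text + n + 1;          /* skip the '#' separator */
}
```

       A header callback could then emit the `id` attribute only when the returned id is not empty.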

COPYRIGHT

         Copyright © 2009 Natasha Porté <natbsd@instinctive.eu>
