Provided by: djvulibre-bin_3.5.28-2build3_amd64

NAME

       csepdjvu - DjVu encoder for separated data files.

SYNOPSIS

       csepdjvu  [options] [sepfiles]... outputdjvufile

DESCRIPTION

       This  program  creates  a  DjVuDocument  file  outputdjvufile  from  separated  data files
       sepfiles.  It can read separated data from the standard input when  given  a  single  dash
       instead  of  the  separated  data file names.  This feature is intended for pre-processing
       programs that push separated data into csepdjvu via a pipe.

       Each separated data file represents one or more page images.  When the  program  arguments
       specify  multiple  pages,  all  the  pages  are  encoded and saved as a bundled multi-page
       document.  When the program arguments specify a single page, the page is encoded and saved
       as a single page file.

OPTIONS

       -d n   Specify  the  resolution information encoded into the output file expressed in dots
              per inch. The resolution information  encoded  in  DjVu  files  determines  how the
              decoder  scales  the  image  on a particular display.  Meaningful resolutions range
              from 25 to 6000.  The default value is 300 dpi.

       -q n,...,n

       -q n+...+n
              Specify the encoding quality of the IW44  encoded  background  layer.   The  option
              argument  contains  several  integers (one per chunk) separated by either commas or
              pluses.  This option is similar to option -slice of program c44.  Please  refer  to
              the  c44(1)  man page for additional details.  The default quality specification is
              -q 72,83,93,103.

              This option does not apply to uniformly white backgrounds that were not specified by
              the  separated  data but are called for by the DjVu specification.  Such background
              images always come at the lowest possible resolution and with  a  standard  quality
              setting that ensures the color uniformity.

       -t     Program  csepdjvu  interprets certain comments in the separated file to construct a
              hidden text layer in the DjVu file. This layer records the location  of  each  word
              for highlighting purposes.  This option reduces the file size by simply recording the
              location of each line.

       -v     Display a brief message describing each page.

       -vv    Display extensive informational messages during encoding.
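
       The options above can be assembled programmatically.  The following Python sketch is  only
       an illustration: the helper names build_command and encode_from_bytes and the file name in
       the usage note are hypothetical, and the range check on -d merely  mirrors  the  25..6000
       range quoted above.  The second helper pipes separated data to csepdjvu through a single
       dash argument.

```python
import subprocess


def build_command(output, sepfiles=("-",), dpi=None, quality=None,
                  lines_only=False, verbosity=0):
    """Build an argument list for csepdjvu (hypothetical helper)."""
    args = ["csepdjvu"]
    if dpi is not None:
        # Sanity check based on the "meaningful" range stated for -d.
        if not 25 <= dpi <= 6000:
            raise ValueError("resolution should be between 25 and 6000 dpi")
        args += ["-d", str(dpi)]
    if quality:
        # One integer per IW44 chunk, joined with commas.
        args += ["-q", ",".join(str(q) for q in quality)]
    if lines_only:
        args.append("-t")  # record line locations instead of word locations
    if verbosity == 1:
        args.append("-v")
    elif verbosity >= 2:
        args.append("-vv")
    args += list(sepfiles)
    args.append(output)
    return args


def encode_from_bytes(sep_data, output, **options):
    """Pipe separated data into csepdjvu via standard input."""
    command = build_command(output, ("-",), **options)
    subprocess.run(command, input=sep_data, check=True)
```

       For instance, build_command("out.djvu", dpi=300, quality=[72, 83, 93, 103]) yields  the
       argument list for "csepdjvu -d 300 -q 72,83,93,103 - out.djvu".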

SEPARATED DATA FILE FORMAT

       Each separated data file contains a concatenation of one or more  separated  page  images.
       Each page is logically represented by a foreground image with a transparent color and by a
       background image visible through the transparent pixels.  The data for each separated page
       image is the concatenation of the following data blocks:

       *  A  foreground  image  encoded  using  either the "Color RLE format" or the "Bitonal RLE
          format".  These formats are described later in this section.

       *  An optional background image encoded as a "Portable Pixmap" ( PPM ).  This  well  known
          format  is  summarized later in this section.  The absence of a background image simply
          indicates that a uniformly white background should be assumed.

       *  An arbitrary number of comment lines starting with character "#" and  terminated  by  a
          linefeed  character.  Comment  lines whose first word starts with a capital letter have
          special meanings documented later in this document.
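
       The block layout listed above can be sketched in Python as follows.  The  function  name
       assemble_page is hypothetical, and the foreground and background arguments are assumed to
       be already encoded in the formats described in this section.

```python
def assemble_page(foreground, background=None, comments=()):
    """Concatenate the data blocks of one separated page image.

    foreground: bytes in Color RLE or Bitonal RLE format
    background: optional bytes in PPM format (None means uniformly white)
    comments:   iterable of comment lines, each starting with "#"
    """
    out = bytearray(foreground)
    if background is not None:
        out += background
    for line in comments:
        if not line.startswith("#"):
            raise ValueError("comment lines must start with '#'")
        # Each comment line is terminated by a single linefeed.
        out += line.rstrip("\n").encode("utf-8") + b"\n"
    return bytes(out)


# A 1x1 bitonal page holding a single transparent pixel, with a page title.
page = assemble_page(b"R4 1 1\n\x01", None, ["# P (Cover)"])
```

       A multi-page separated file is then the concatenation of several such pages.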

       The dimensions (width and height) of the background image must be obtained by rounding  up
       the  quotient  of  the  foreground image dimensions by an integer reduction factor ranging
       from 1 to 12.  Assume, for instance, that the width of the  foreground  is  2507  and  the
       reduction factor is 3.  The width of the background image will be the  integer  quotient
       (2507+2)/3, that is, 836.
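
       The rounding rule above amounts to a ceiling division.  A minimal Python sketch, in which
       the function name background_size and the foreground height of 3300 are illustrative
       assumptions:

```python
def background_size(fg_width, fg_height, factor):
    """Return the (width, height) a background image must have."""
    if not 1 <= factor <= 12:
        raise ValueError("reduction factor must range from 1 to 12")
    # (n + factor - 1) // factor rounds the quotient n / factor up.
    return ((fg_width + factor - 1) // factor,
            (fg_height + factor - 1) // factor)


print(background_size(2507, 3300, 3))  # prints (836, 1100)
```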

   Color RLE format
       The Color RLE format is a simple run-length  encoding  scheme  for  color  images  with  a
       limited  number  of distinct colors.  The data always begin with a text header composed of
       the two characters "R6", the number of columns, the number of  rows,  and  the  number  of
       color  palette entries.  All numbers are expressed in decimal ASCII.  These four items are
       separated by blank characters (space, tab, carriage return, or  linefeed)  or  by  comment
       lines  introduced  by character "#".  The last number is followed by exactly one character
       which usually is a linefeed character.

       The header is followed by the color palette containing three bytes per color  entry.   The
       bytes represent the red, green, and blue components of the color.

       The palette is followed by a collection of four-byte integers (most  significant  byte
       first) representing runs of pixels with an identical color.  The twelve upper bits of each
       integer indicate the index of the run color in the palette.  The twenty  lower  bits
       of the integer indicate the run length.  Color indices greater than  0xff0  are  reserved.
       Color  index 0xfff is used for transparent runs.  Each row is represented by a sequence of
       runs whose lengths add up to the image width.  Rows are encoded starting with the top  row
       and progressing toward the bottom row.
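
       As an illustration of this layout, the following Python sketch writes a Color RLE  image.
       The function names and the (index, length) run representation are assumptions made for the
       example.  It assumes each 32-bit run is stored most significant byte first.

```python
import struct

TRANSPARENT = 0xFFF  # color index used for transparent runs


def pack_run(color_index, length):
    """Pack one run: 12-bit palette index above a 20-bit run length."""
    if not 0 <= color_index <= 0xFFF:
        raise ValueError("color index must fit in 12 bits")
    if not 0 <= length < (1 << 20):
        raise ValueError("run length must fit in 20 bits")
    return struct.pack(">I", (color_index << 20) | length)


def encode_color_rle(width, height, palette, rows):
    """Encode rows given as lists of (color_index, run_length) pairs.

    palette is a list of (red, green, blue) byte triples.
    """
    if len(rows) != height:
        raise ValueError("expected one list of runs per row")
    out = bytearray(f"R6 {width} {height} {len(palette)}\n".encode("ascii"))
    for red, green, blue in palette:
        out += bytes((red, green, blue))
    for row in rows:  # top row first
        if sum(length for _, length in row) != width:
            raise ValueError("run lengths must add up to the image width")
        for color_index, length in row:
            out += pack_run(color_index, length)
    return bytes(out)


# One row of four pixels: one red pixel followed by three transparent ones.
data = encode_color_rle(4, 1, [(255, 0, 0)], [[(0, 1), (TRANSPARENT, 3)]])
```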

   Bitonal RLE format
       The  Bitonal  RLE  format  is a simple run-length encoding scheme for bitonal images.  The
       data always begin with a text header composed of the two characters "R4",  the  number  of
       columns, and the number of rows.  All numbers are expressed in decimal ASCII.  These three
       items are separated by blank characters (space, tab, carriage return, or linefeed)  or  by
       comment  lines  introduced  by  character "#".  The last number is followed by exactly one
       character which usually is a linefeed character.

       The rest of the file encodes a sequence of numbers representing the lengths of alternating
       runs  of  transparent  and black pixels.  Lines are encoded starting with the top line and
       progressing toward the bottom line.  Each line starts with a white (transparent) run.  The decoder knows
       that  a  line  is  finished  when the sum of the run lengths for that line is equal to the
       number of columns in the image.  Numbers in range 0 to 191 are  represented  by  a  single
       byte  in  range 0x00 to 0xbf.  Numbers in range 192 to 16383 are represented by a two byte
       sequence: the first byte, in range 0xc0 to 0xff, encodes the six most significant bits  of
       the  number,  the  second byte encodes the remaining eight bits of the number. This scheme
       allows for runs of length zero, which are useful when a line starts with  a  black  pixel,
       and when a very long run (whose length exceeds 16383) must be split into smaller runs.
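
       The number encoding and the use of zero-length runs can be sketched in Python as follows.
       The function names and the 0/1 pixel representation (0 for transparent, 1 for black)  are
       assumptions made for the example.

```python
MAX_RUN = 16383  # largest run length that fits in the two-byte form


def encode_number(n):
    """Encode one run length as one or two bytes."""
    if 0 <= n <= 191:
        return bytes([n])
    if n <= MAX_RUN:
        # First byte in 0xc0..0xff carries the six most significant bits.
        return bytes([0xC0 + (n >> 8), n & 0xFF])
    raise ValueError("run length out of range")


def encode_run(length):
    """Encode a run of any length, splitting runs longer than MAX_RUN."""
    out = bytearray()
    while length > MAX_RUN:
        out += encode_number(MAX_RUN)
        out += encode_number(0)  # zero-length run of the other color
        length -= MAX_RUN
    out += encode_number(length)
    return bytes(out)


def encode_bitonal_row(pixels):
    """Encode one line of pixels (0 = transparent, 1 = black)."""
    out = bytearray()
    current, length = 0, 0  # every line starts with a transparent run
    for pixel in pixels:
        if pixel == current:
            length += 1
        else:
            out += encode_run(length)
            current, length = pixel, 1
    out += encode_run(length)
    return bytes(out)


def encode_bitonal_rle(width, height, rows):
    """Encode a whole image given as a list of pixel rows, top row first."""
    if len(rows) != height or any(len(row) != width for row in rows):
        raise ValueError("rows do not match the stated dimensions")
    header = f"R4 {width} {height}\n".encode("ascii")
    return header + b"".join(encode_bitonal_row(row) for row in rows)
```

       A line that starts with a black pixel therefore begins with the byte 0x00, which  encodes
       the empty transparent run.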

   Portable Pixmap (PPM) format
       The  Portable  Pixmap  format is a well known format for representing color images.  Check
       the ppm(5) man page for complete information.

       The data always begin with a text header composed of the two characters "P6",  the  number
       of  columns, the number of rows, and the maximal value of a color component (usually 255).
       All numbers are expressed in decimal ASCII.  These  four  items  are  separated  by  blank
       characters  (space,  tab,  carriage return, or linefeed) or by comment lines introduced by
       character "#".  The last number is followed by exactly one character which  usually  is  a
       linefeed character.

       The  rest  of  the  file encodes all the pixels.  Each pixel is represented by three bytes
       representing the red, green, and blue components of the pixel.  Pixels are  ordered  left
       to right, top to bottom.
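
       A background image in this format can be produced with a few lines  of  Python.   In  this
       sketch the function name encode_ppm is hypothetical and the maximal component value is
       fixed at 255.

```python
def encode_ppm(width, height, pixels):
    """Encode a binary PPM ("P6") image.

    pixels is a list of (red, green, blue) triples ordered left to right,
    top to bottom, with every component in the range 0..255.
    """
    if len(pixels) != width * height:
        raise ValueError("expected width * height pixels")
    out = bytearray(f"P6 {width} {height} 255\n".encode("ascii"))
    for red, green, blue in pixels:
        out += bytes((red, green, blue))
    return bytes(out)


# A 2x1 image: one white pixel followed by one blue pixel.
ppm = encode_ppm(2, 1, [(255, 255, 255), (0, 0, 255)])
```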

   Comments in separated files
       Each  page is followed by an arbitrary number of comment lines starting with character "#"
       and terminated by a linefeed character.  Certain comment lines have special  meanings.  In
       the following constructs, all the strings are UTF-8 encoded and represented in the style of
       PostScript strings,  that  is,  surrounded  with  parentheses  and  using  C-style  escape
       sequences introduced by a backslash.

       *  # T px:py dx:dy wxh+x+y (string)
          Such  a comment line indicates that the piece of text string must be associated with an
          area of size wxh at position x,y relative  to  the  lower  left  corner  of  the  page.
          Integers px and py represent the position of the current point on the  text  baseline
          before the text was drawn. The drawing operation then moves the current point by dx and
          dy pixels.  When such comments are present, csepdjvu  produces  a  hidden  text  layer
          for the corresponding pages.

       *  # L wxh+x+y (url)
          Such a comment line indicates that a hyperlink to URL url should be associated with an
          area of size wxh at position x,y.  When such comments are present,  csepdjvu  produces
          pages with an annotation chunk containing the specified hyperlinks.

       *  # B count (string) (#pageno)
          Such a comment line provides outline information for the document.   An  outline  entry
          entitled  string  is  associated with page pageno.  Integer count indicates how many of
          the following outline entries must be attached to  the  current  entry  as  subentries.
          When such comments are present in the first page, csepdjvu produces a navigation chunk
          with the specified outline.

       *  # P (string)
          Such a comment line provides a title string for the current page.
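
       The four comment forms above can be generated with small Python helpers.  This is  only  a
       sketch: the function names are hypothetical, and the escaping shown (backslash,
       parentheses, and linefeed) is an assumption about which characters need an escape
       sequence.  Other characters are left as they are and should be written out as UTF-8.

```python
def ps_string(text):
    """Quote text in the style of a PostScript string (assumed escapes)."""
    escaped = (text.replace("\\", "\\\\")  # escape backslashes first
                   .replace("(", "\\(")
                   .replace(")", "\\)")
                   .replace("\n", "\\n"))  # a raw linefeed would end the comment
    return "(" + escaped + ")"


def text_comment(px, py, dx, dy, w, h, x, y, text):
    """Build a "# T" line attaching text to the area wxh+x+y."""
    return f"# T {px}:{py} {dx}:{dy} {w}x{h}+{x}+{y} {ps_string(text)}\n"


def link_comment(w, h, x, y, url):
    """Build a "# L" line attaching a hyperlink to the area wxh+x+y."""
    return f"# L {w}x{h}+{x}+{y} {ps_string(url)}\n"


def outline_comment(count, title, pageno):
    """Build a "# B" outline entry followed by count subentries."""
    return f"# B {count} {ps_string(title)} (#{pageno})\n"


def page_title_comment(title):
    """Build a "# P" line giving the title of the current page."""
    return f"# P {ps_string(title)}\n"
```

       For instance, text_comment(100, 200, 45, 0, 45, 12, 100, 198, "Hello")  returns  the  line
       "# T 100:200 45:0 45x12+100+198 (Hello)" followed by a linefeed.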

CREDITS

       This program was initially written by Léon Bottou  <leonb@users.sourceforge.net>  and  was
       improved by Bill Riemers <docbill@sourceforge.net> and many others.

SEE ALSO

       djvu(1), ppm(5), c44(1)