Provided by: djvulibre-bin_3.5.27.1-5ubuntu0.1_amd64

NAME

       csepdjvu - DjVu encoder for separated data files.

SYNOPSIS

       csepdjvu  [options] [sepfiles]... outputdjvufile

DESCRIPTION

       This  program creates a DjVuDocument file outputdjvufile from separated data files sepfiles.  It can read
       separated data from the standard input when given a single dash instead of the separated data file names.
       This feature is intended for pre-processing programs that push separated data into csepdjvu via a pipe.

       Each separated data file represents one or more page images.  When the program arguments specify multiple
       pages, all the pages are encoded and saved as a bundled multi-page document.  When the program  arguments
       specify a single page, the page is encoded and saved as a single page file.

OPTIONS

       -d n   Specify the resolution, in dots per inch, recorded in the output file.  The resolution
              recorded in a DjVu file determines how the decoder scales the image on a particular
              display.  Meaningful resolutions range from 25 to 6000.  The default value is 300 dpi.

       -q n,...,n

       -q n+...+n
              Specify the encoding quality of the IW44 encoded background layer.  The option argument contains
              several integers (one per chunk) separated by either commas or pluses.  This option is similar to
              option  -slice  of  program c44.  Please refer to the c44(1) man page for additional details.  The
              default quality specification is -q 72,83,93,103.

              This option does not apply to uniformly white backgrounds that were not specified by the separated
              data but are called for by the DjVu specification.  Such background images always come at the
              lowest possible resolution and with a standard quality setting that ensures color uniformity.

       -t     Program csepdjvu interprets certain comments in the separated file to construct a hidden text
              layer in the DjVu file.  By default, this layer records the location of each word for
              highlighting purposes.  This option reduces the file size by recording only the location of
              each line.

       -v     Display a brief message describing each page.

       -vv    Display extensive informational messages during encoding.

SEPARATED DATA FILE FORMAT

       Each separated data file contains a concatenation of one or more separated page  images.   Each  page  is
       logically  represented  by  a foreground image with a transparent color and by a background image visible
       through the transparent pixels.  The data for each separated page  image  is  the  concatenation  of  the
       following data blocks:

       *  A  foreground  image  encoded  using either the "Color RLE format" or the "Bitonal RLE format".  These
          formats are described later in this section.

       *  An optional background image encoded as a "Portable Pixmap" (PPM).  This well-known format is
          summarized later in this section.  The absence of a background image simply indicates that a uniformly
          white background should be assumed.

       *  An arbitrary number of comment lines  starting  with  character  "#"  and  terminated  by  a  linefeed
          character.  Comment  lines  whose  first  word  starts  with  a  capital  letter have special meanings
          documented later in this document.

       The dimensions (width and height) of the background image must be obtained by dividing the foreground
       image dimensions by an integer reduction factor ranging from 1 to 12 and rounding the quotient up.
       Assume, for instance, that the width of the foreground is 2507 and the reduction factor is 3.  The
       width of the background image will be the integer quotient (2507+2)/3, that is, 836.
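
       The rounding rule above can be sketched in a few lines of Python.  The function name
       background_size is hypothetical and only illustrates the arithmetic; it is not part of csepdjvu.

```python
def background_size(fg_width, fg_height, reduction):
    """Return (width, height) of the background image for a reduction factor."""
    if not 1 <= reduction <= 12:
        raise ValueError("reduction factor must be between 1 and 12")
    # Integer ceiling division: (n + r - 1) // r rounds the quotient up.
    return ((fg_width + reduction - 1) // reduction,
            (fg_height + reduction - 1) // reduction)


# The width matches the example in the text: (2507 + 2) // 3 == 836.
print(background_size(2507, 3300, 3))
```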

   Color RLE format
       The  Color  RLE  format  is a simple run-length encoding scheme for color images with a limited number of
       distinct colors.  The data always begin with a text header composed  of  the  two  characters  "R6",  the
       number  of  columns,  the  number  of  rows,  and  the  number of color palette entries.  All numbers are
       expressed in decimal ASCII.  These four items are separated by blank  characters  (space,  tab,  carriage
       return,  or  linefeed)  or  by comment lines introduced by character "#".  The last number is followed by
       exactly one character which usually is a linefeed character.

       The header is followed by the color palette containing three bytes per color entry.  The bytes  represent
       the red, green, and blue components of the color.

       The palette is followed by a collection of four-byte integers (most significant byte first) representing
       runs of pixels with an identical color.  The twelve upper bits of this integer indicate the index of the
       run color in the palette.  The twenty lower bits of the integer indicate the run length.  Color
       indices greater than 0xff0 are reserved.  Color index 0xfff is used for transparent runs.  Each row is
       represented  by  a  sequence  of runs whose lengths add up to the image width.  Rows are encoded starting
       with the top row and progressing toward the bottom row.
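
       As an illustration, the layout described above could be produced as follows.  This is a minimal
       sketch, not the encoder used by any existing tool: the function name encode_color_rle and its
       argument layout are assumptions made for the example.

```python
import struct

TRANSPARENT = 0xFFF        # palette index used for transparent runs
MAX_INDEX = 0xFF0          # indices above this value are reserved
MAX_RUN = (1 << 20) - 1    # largest length that fits in the twenty lower bits


def encode_color_rle(width, height, palette, rows):
    """Build Color RLE data (hypothetical helper).

    palette: list of (red, green, blue) tuples, one per color entry
    rows:    list of rows, top row first; each row is a list of
             (color_index, run_length) pairs
    """
    if len(rows) != height:
        raise ValueError("expected one list of runs per row")
    # Text header: "R6", columns, rows, palette size, then one linefeed.
    out = bytearray(b"R6 %d %d %d\n" % (width, height, len(palette)))
    for red, green, blue in palette:
        out += bytes((red, green, blue))
    for row in rows:
        if sum(length for _, length in row) != width:
            raise ValueError("run lengths must add up to the image width")
        for index, length in row:
            if index != TRANSPARENT and not (
                    0 <= index <= MAX_INDEX and index < len(palette)):
                raise ValueError("invalid color index")
            if not 0 <= length <= MAX_RUN:
                raise ValueError("run length does not fit in 20 bits")
            # Twelve upper bits: color index.  Twenty lower bits: run length.
            out += struct.pack(">I", (index << 20) | length)
    return bytes(out)
```

       For a one-row image, encode_color_rle(4, 1, [(255, 0, 0)], [[(0, 3), (TRANSPARENT, 1)]]) yields
       three red pixels followed by one transparent pixel.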

   Bitonal RLE format
       The Bitonal RLE format is a simple run-length encoding scheme for bitonal images.  The data always  begin
       with  a  text  header composed of the two characters "R4", the number of columns, and the number of rows.
       All numbers are expressed in decimal ASCII.  These three items are separated by blank characters  (space,
       tab,  carriage  return, or linefeed) or by comment lines introduced by character "#".  The last number is
       followed by exactly one character which usually is a linefeed character.

       The rest of the file encodes a sequence of numbers representing the lengths of alternating runs of
       transparent and black pixels.  Lines are encoded starting with the top line and progressing toward the
       bottom line.  Each line starts with a transparent run.  The decoder knows that a line is finished when
       the sum of the run lengths for that line is equal to the number of columns in the image.  Numbers in range 0 to
       191 are represented by a single byte in  range  0x00  to  0xbf.   Numbers  in  range  192  to  16383  are
       represented  by  a  two  byte  sequence:  the  first  byte,  in  range 0xc0 to 0xff, encodes the six most
       significant bits of the number, the second byte encodes the remaining eight  bits  of  the  number.  This
       scheme allows for runs of length zero, which are useful when a line starts with a black pixel, and when a
       very long run (whose length exceeds 16383) must be split into smaller runs.
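
       The number encoding and the use of zero-length runs can be sketched as follows.  The helper names
       encode_run and encode_bitonal_row are hypothetical; pixels are given as 0 (transparent) or
       1 (black).

```python
MAX_RUN = 16383  # largest run length that fits in the two-byte form


def encode_run(length):
    """Encode one run length, splitting runs longer than 16383."""
    out = bytearray()
    while length > MAX_RUN:
        # Emit a maximal run, then a zero-length run of the other color
        # so that the next number continues the same color.
        out += b"\xff\xff\x00"
        length -= MAX_RUN
    if length < 192:
        out.append(length)                # single byte, 0x00 to 0xbf
    else:
        out.append(0xC0 | (length >> 8))  # six most significant bits
        out.append(length & 0xFF)         # remaining eight bits
    return bytes(out)


def encode_bitonal_row(pixels):
    """Encode one line given as a sequence of 0 (transparent) / 1 (black)."""
    out = bytearray()
    current, run = 0, 0  # every line starts with a transparent run
    for pixel in pixels:
        if pixel == current:
            run += 1
        else:
            out += encode_run(run)
            current, run = pixel, 1
    out += encode_run(run)
    return bytes(out)


# A line starting with a black pixel begins with a zero-length run.
print(encode_bitonal_row([1, 1, 0]).hex())
```

       A complete file would consist of the text header, for example b"R4 %d %d\n" % (columns, rows),
       followed by the encoded lines from top to bottom.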

   Portable Pixmap (PPM) format
       The Portable Pixmap format is a well-known format for representing color images.  Check the ppm(5) man
       page for complete information.

       The data always begin with a text header composed of the two characters "P6", the number of columns, the
       number of rows, and the maximal value of a color component (usually 255).  All numbers are expressed in
       decimal ASCII.  These four items are separated by blank characters (space, tab, carriage return, or
       linefeed) or by comment lines introduced by character "#".  The last number is followed by exactly one
       character which usually is a linefeed character.

       The rest of the file encodes all the pixels.  Each pixel is represented by three bytes representing the
       red, green, and blue components of the pixel.  Pixels are ordered left to right, top to bottom.
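
       A background image in this format could be generated as shown in the sketch below.  The function
       name encode_ppm is hypothetical, and the example assumes a maximal component value of 255, so that
       each component fits in one byte.

```python
def encode_ppm(width, height, pixels):
    """Build binary PPM data from a flat list of (red, green, blue) tuples.

    Pixels are listed left to right, top to bottom.
    """
    if len(pixels) != width * height:
        raise ValueError("expected width * height pixels")
    # Text header: "P6", columns, rows, maximal value, then one linefeed.
    out = bytearray(b"P6 %d %d 255\n" % (width, height))
    for red, green, blue in pixels:
        out += bytes((red, green, blue))
    return bytes(out)


# A 2x1 image: one white pixel followed by one black pixel.
print(encode_ppm(2, 1, [(255, 255, 255), (0, 0, 0)]))
```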

   Comments in separated files
       Each page is followed by an arbitrary number of comment lines starting with character "#" and  terminated
       by  a  linefeed  character.   Comment  lines  whose  first word starts with a capital letter have special
       meanings. The following constructs are currently defined:

       *  # T px:py dx:dy wxh+x+y (string)
          This construct indicates that the piece of text string must be associated with an area of size wxh at
          position x,y relative to the lower left corner of the page.  The string is UTF-8 encoded.  Special
          characters can be escaped as in PostScript using the backslash character.  Integers px and py
          represent the position of the current point on the text baseline before the text was drawn.  The
          drawing operation then moves the current point by dx and dy pixels.  When such comments are present,
          csepdjvu produces a hidden text layer for the corresponding pages.

       *  # L wxh+x+y (url)
          This construct indicates that a hyperlink to URL url should be associated with an area of size wxh at
          position x,y.  When such comments are present, csepdjvu produces pages with an annotation chunk
          containing the specified hyperlinks.

       *  # B count (string) (#pageno)
          This construct provides outline information for the document.  An outline entry entitled string is
          associated with page pageno.  Integer count indicates how many of the following outline entries must
          be attached to the current entry as subentries.  When such comments are present in the first page,
          csepdjvu produces a navigation chunk with the specified outline.

       *  # P (string)
          This construct provides the title string for the current page.
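
       A pre-processing program might format the "# T" construct as in the following sketch.  The helper
       names ps_escape and text_comment are hypothetical, and the set of escaped characters (backslash,
       parentheses, and linefeed) is an assumption based on common PostScript string syntax rather than
       on a list given in this page.

```python
def ps_escape(text):
    """Escape characters that are special inside a PostScript string."""
    text = text.replace("\\", "\\\\")                   # backslash first
    text = text.replace("(", "\\(").replace(")", "\\)")
    return text.replace("\n", "\\n")                    # keep comment on one line


def text_comment(px, py, dx, dy, w, h, x, y, text):
    """Return a '# T px:py dx:dy wxh+x+y (string)' line as UTF-8 bytes."""
    line = "# T %d:%d %d:%d %dx%d+%d+%d (%s)\n" % (
        px, py, dx, dy, w, h, x, y, ps_escape(text))
    return line.encode("utf-8")


print(text_comment(100, 200, 50, 0, 50, 12, 100, 198, "f(x)"))
```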

CREDITS

       This program was initially written by Léon Bottou <leonb@users.sourceforge.net> and was improved by  Bill
       Riemers <docbill@sourceforge.net> and many others.

SEE ALSO

       djvu(1), ppm(5), c44(1)