Provided by: djvulibre-bin_3.5.27.1-14ubuntu0.1_amd64

NAME

       DjVu - DjVu and DjVuLibre.

INTRODUCTION

       Although  the  Internet  has given us a worldwide infrastructure on which to build the universal library,
       much of the world's knowledge, history, and literature is still trapped on paper in the basements of the
       world's  traditional  libraries. Many libraries and content owners are in the process of digitizing their
       collections.  While many such efforts involve the painstaking process of converting  paper  documents  to
       computer-friendly form, such as SGML-based formats, the high cost of such conversions limits their
       extent. Scanning documents and distributing the resulting images electronically is not only considerably
       cheaper, but also more faithful to the original document because it preserves its visual aspect.

       Despite  the quickly improving speed of network connections and computers, the number of scanned document
       images accessible on the Web today is relatively small. There are several reasons for this.

       The first reason is the relatively high cost of scanning anything other than unbound sheets in black and
       white.  This  problem  is  slowly going away with the appearance of fast and low-cost color scanners with
       sheet feeders.

       The second reason is that long-established image compression  standards  and  file  formats  have  proved
       inadequate for distributing scanned documents at high resolution, particularly color documents.  Not only
       are the file sizes and download times impractical, the decoding and rendering times are also prohibitive.
       A typical magazine page scanned in color at 100 dpi in JPEG would occupy 100 KB to 200 KB, but
       the text would be hardly readable: insufficient for screen viewing and totally unacceptable for printing.
       The  same page at 300 dpi would have sufficient quality for viewing and printing, but the file size would
       be 300 KB to 1000 KB at best, which is impractical for remote access. Another major problem is that a
       fully decoded 300 dpi color image of a letter-size page occupies 24 MB of memory and easily causes disk
       swapping.
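
       The 24 MB figure can be checked with simple arithmetic. The sketch below assumes a US letter page
       (8.5 x 11 inches) and 3 bytes per pixel (24-bit RGB); neither value is stated in the text above, and
       the function name is chosen for illustration only.

```python
def decoded_size_bytes(width_in, height_in, dpi, bytes_per_pixel=3):
    """Size in bytes of an uncompressed raster for a page of the given physical size."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bytes_per_pixel


# Assumed page geometry: US letter, 24-bit RGB.
size = decoded_size_bytes(8.5, 11, 300)
print(f"{size} bytes = {size / 2**20:.1f} MiB")  # 25245000 bytes = 24.1 MiB
```

       Under these assumptions the page is 2550 x 3300 pixels, which is consistent with the figure quoted
       above.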

       The third reason is that digital documents are more than just a collection  of  individual  page  images.
       Pages in a scanned document have a natural serial order. Special provision must be made to ensure that
       page flipping is instantaneous and effortless so as to maintain a good user experience. Even more
       important, most existing document formats force users to download the entire document before
       displaying a chosen page.  However, users often want to jump to individual pages of the document  without
       waiting  for  the entire document to download.  Efficient browsing requires efficient random page access,
       fast sequential page flipping, and quick rendering. This can be achieved with a combination  of  advanced
       compression,  pre-fetching,  pre-decoding,  caching, and progressive rendering. DjVu decomposes each page
       into multiple components (text, backgrounds, images, libraries of common shapes...)  that may  be  shared
       by  several  pages  and  downloaded  on demand.  All these requirements call for a very sophisticated but
       parsimonious control mechanism to handle on-demand  downloading,  pre-fetching,  decoding,  caching,  and
       progressive  rendering  of  the  page images.  What is being considered here is not just a document image
       compression technique, but a whole platform for document delivery.

       DjVu is an image compression technique, a  document  format,  and  a  software  platform  for  delivering
       document images over the Internet that fulfills the above requirements.

DJVU IMAGE COMPRESSION

       The DjVu image compression is based on three technologies:

   DjVuPhoto
       DjVuPhoto,  also  known  as  IW44,  is  a  wavelet-based continuous-tone image compression technique with
       progressive decoding/rendering. It is best used for encoding photographic images in color or in shades
       of gray. Images are typically half the size of their JPEG equivalents for the same distortion.

   DjVuBitonal
       DjVuBitonal, also known as JB2, is a bitonal image compression technique that takes advantage of
       repetitions of nearly identical shapes on the page (such as characters) to efficiently compress text
       images. It is best used to compress black and white images representing text and simple drawings. A
       typical 300 dpi page in DjVuBitonal occupies 5 to 25 KB (3 to 8 times better than TIFF-G4 or PDF).

   DjVuDocument
       DjVuDocument is a compression technique specifically designed for color digital document images
       containing both pictures and text, such as a page of a magazine. DjVuDocument represents images as
       separately compressed layers. The foreground layer is usually compressed with DjVuBitonal and contains
       the  text  and  drawings.   The  background  layer  is usually compressed with DjVuPhoto and contains the
       background texture and the pictures at lower resolution.

DJVU DOCUMENT DELIVERY PLATFORM

       The DjVu technology is designed from the ground up to support the efficient delivery of digital documents
       over  the  Internet.   It  provides  various  ways to deal with multi-page documents, and various ways to
       enrich the content with hyper-links, meta-data, searchable text, etc.

   MIME types
       The DjVu format has an official MIME type of image/vnd.djvu, which is the preferred content type to be
       given by HTTP servers for DjVu files. Unofficial MIME types used historically are image/x.djvu and
       image/x-djvu, which may still be encountered. Ideally, clients should be configured to handle all three.
       (For web server configuration help, see
       http://www.djvuzone.org/support/tutorial/chapter-authoring1.html.)
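
       As an illustration, a Python-based server or client could register and recognize these types with the
       standard mimetypes module. This is a minimal sketch: the helper names are hypothetical, and the file
       extensions .djvu and .djv are assumptions that do not come from this page.

```python
import mimetypes

# The official type, plus the two historical types named above.
DJVU_OFFICIAL = "image/vnd.djvu"
DJVU_LEGACY = ("image/x.djvu", "image/x-djvu")


def register_djvu_types():
    """Map the assumed DjVu file extensions to the official MIME type."""
    for ext in (".djvu", ".djv"):
        mimetypes.add_type(DJVU_OFFICIAL, ext)


def is_djvu_content_type(content_type):
    """Client-side check accepting the official and the historical types."""
    base = content_type.split(";", 1)[0].strip().lower()
    return base == DJVU_OFFICIAL or base in DJVU_LEGACY


register_djvu_types()
print(mimetypes.guess_type("book.djvu")[0])
```

       Serving only the official type while accepting all three on the client side matches the
       recommendation above.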

   Bundled multi-page documents
       A bundled multi-page DjVu document uses a single file to represent the entire document. This single file
       contains  all the pages as well as ancillary information (e.g. the page directory, data shared by several
       pages, thumbnails, etc.).  Using a single file format is very convenient for  storing  documents  or  for
       sending email attachments.

       When  you  type  the  URL  of a multi-page document, the DjVu browser plugin starts downloading the whole
       file, but displays the first page as soon as it is available.  You  can  immediately  navigate  to  other
       pages  using  the DjVu toolbar.  Suppose however that the document is stored on a remote web server.  You
       can easily access the first page and see that this is not the document you wanted. Although you will
       never display the other pages, the browser is transferring data for these pages and is wasting the
       bandwidth of your server (and the bandwidth of the Internet too).  You could also see the summary of  the
       document on the first page and jump to page 100.  But page 100 cannot be displayed until data for pages 1
       to 99 has been received.  You may have to wait for the  transmission  of  unnecessary  page  data.   This
       second problem (the unnecessary wait) can be solved using the "byte serving" option of the HTTP/1.1
       protocol. This option has to be supported by the web server, the proxies, the caches, and the browser.
       Byte serving however does not solve the first problem (the waste of bandwidth).
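
       Byte serving relies on the HTTP/1.1 Range request header. The sketch below only builds such a request
       with the Python standard library and does not send it. The URL and the byte offsets are hypothetical,
       and the helper name is chosen for illustration.

```python
import urllib.request


def range_request(url, first_byte, last_byte):
    """Build a GET request asking the server for an inclusive byte range."""
    if first_byte < 0 or last_byte < first_byte:
        raise ValueError("invalid byte range")
    req = urllib.request.Request(url)
    req.add_header("Range", f"bytes={first_byte}-{last_byte}")
    return req


# Hypothetical URL and offsets, for illustration only.
req = range_request("http://example.org/book.djvu", 1_000_000, 1_049_999)
print(req.get_header("Range"))  # bytes=1000000-1049999
```

       A server that supports byte serving answers such a request with status 206 (Partial Content). A server
       that does not support it typically answers with status 200 and the whole file.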

   Indirect multi-page documents
       Indirect multi-page DjVu documents solve both problems.  An indirect multi-page DjVu document is composed
       of several files.  The main file is named the index file.  You can browse a document using the URL of the
       index  file,  just like you do with a bundled multi-page document.  The index file however is very small.
       It simply contains the document directory and the URLs of secondary files containing the page data.  When
       you browse an indirect multi-page document, the browser only accesses data for the pages you are viewing.
       This can be done at a reasonable speed because the browser maintains a cache of pages and sometimes  pre-
       fetches  a  few  pages  ahead  of  the current page.  This model uses the web serving bandwidth much more
       effectively.  It also eliminates unnecessary delays when jumping ahead to pages  located  anywhere  in  a
       long document.
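
       The caching and pre-fetching behavior described above can be sketched as a small least-recently-used
       cache. This is a toy model, not the code of any DjVu viewer: the class name, the capacity, the
       lookahead, and the synchronous pre-fetch are all assumptions made for illustration.

```python
from collections import OrderedDict


class PageCache:
    """Toy LRU cache that pre-fetches a few pages ahead of the requested one."""

    def __init__(self, fetch_page, page_count, capacity=8, lookahead=2):
        self.fetch_page = fetch_page  # callable: page number -> page data
        self.page_count = page_count
        self.capacity = capacity
        self.lookahead = lookahead
        self.pages = OrderedDict()    # page number -> data, oldest first

    def _load(self, n):
        if n in self.pages:
            self.pages.move_to_end(n)       # mark as recently used
            return self.pages[n]
        data = self.fetch_page(n)           # one transfer per missing page
        self.pages[n] = data
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)  # evict the least recently used page
        return data

    def get(self, n):
        """Return page n (1-based), then warm the cache for the following pages."""
        data = self._load(n)
        last = min(n + self.lookahead, self.page_count)
        for ahead in range(n + 1, last + 1):
            self._load(ahead)
        return data


fetched = []


def fake_fetch(n):
    """Stand-in for downloading one secondary file of an indirect document."""
    fetched.append(n)
    return f"page-{n}"


cache = PageCache(fake_fetch, page_count=200)
cache.get(100)
print(fetched)  # [100, 101, 102]
```

       Jumping to page 100 transfers only pages 100 to 102 in this model; pages 1 to 99 are never
       requested. A real viewer would presumably pre-fetch in the background rather than synchronously.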

   Annotations
       Every  DjVu image optionally includes so-called annotation chunks.  The annotation chunk is often used to
       define hyper-links to other document pages or to arbitrary web pages.  Annotation chunks can also be used
       for  other  purposes  such  as setting the initial viewing mode of a page, defining highlighted zones, or
       storing arbitrary meta-data about the page or the document.

   Hidden text
       Every DjVu image optionally includes a hidden text layer that associates graphical features with the
       corresponding text. The hidden text layer is usually generated by running Optical Character
       Recognition (OCR) software. This textual information provides for indexing DjVu documents and copying/pasting
       text from DjVu page images.

   Thumbnails
       DjVu documents sometimes contain pre-computed page thumbnails.

   Outline
       DjVu  documents sometimes contain a navigation chunk containing an outline, that is, a hierarchical table
       of contents with pointers to the corresponding document pages.

DJVUZONE AND DJVULIBRE

       The DjVu technology was initially created by a few researchers at AT&T Labs between 1995 and 1999.
       Lizardtech, Inc. (http://www.lizardtech.com) then obtained a commercial license from AT&T and continued
       the development. They now have a variety of solutions for producing and distributing documents using the
       DjVu technology.

       The DjVuZone web site (http://www.djvuzone.org) is managed by the few AT&T Labs researchers who created
       the DjVu technology in the first place.  We promote the  DjVu  technology  by  providing  an  independent
       source of information about DjVu.

       Understanding  how  little  room there is for a proprietary document format, Lizardtech released the DjVu
       Reference Library under the GNU General Public License in December 2000. This library entirely defines
       the compression format and the elementary codecs. Six months later, Lizardtech released an updated DjVu
       Reference Library as well as the source code of the Unix viewer.

       These two releases form the basis of our initial DjVuLibre software.  We modified  the  build  system  to
       comply with the expectations of the open source community.  Various bugs and portability issues have been
       fixed.  We also tried to make it simpler to use and install, while preserving the essential structure  of
       the Lizardtech releases.

       The DjVuLibre software contains the following components:

       bzz(1) A  general  purpose  compression  command  line  program.   Many internal DjVu data structures are
              compressed using this technique.

       c44(1) A DjVuPhoto command line encoder. This  state-of-the-art  wavelet  compressor  produces  DjVuPhoto
              images from PPM or JPEG images.

       cjb2(1)
              A  DjVuBitonal  command  line  encoder. This soft-pattern-matching compressor produces DjVuBitonal
              images from PBM images.  It can encode images without loss, or introduce small changes in order to
              improve  the  compression  ratio.   The  lossless  encoding  mode  is competitive with that of the
              Lizardtech commercial encoders.

       cpaldjvu(1)
              A DjVuDocument command line encoder for images with few colors.  This encoder is  well  suited  to
              compressing images with a small number of distinct colors (e.g. screen-shots).  The dominant color
              is encoded by the background layer.  The other colors are encoded by the foreground layer.

       csepdjvu(1)
              A DjVuDocument command line encoder for separated images.  This encoder takes  a  file  containing
              pre-segmented foreground and background images and produces a DjVuDocument image.

       ddjvu(1)
              A  command  line  decoder  for  DjVu  images.   This program produces a PNM image representing any
              segment of any page of a DjVu document at any resolution.

       djview(1)
              A stand-alone viewer for DjVu images.  This sophisticated  viewer  displays  DjVu  documents.   It
              implements document navigation as well as fast zooming and panning.

       nsdejavu(1)
              A web browser plugin for viewing DjVu images.  This small plugin allows for viewing DjVu documents
              from web browsers.  It internally uses djview to perform the actual work.

       djvups(1)
              A command line tool for converting DjVu documents into PostScript.

       djvm(1)
              A command line tool for manipulating bundled multi-page DjVu documents.   This  program  is  often
              used to collect individual pages and produce a bundled document.

       djvmcvt(1)
              A command line tool for converting bundled documents to indirect documents and conversely.

       djvused(1)
              A powerful command line tool for manipulating multi-page documents, creating or editing annotation
              chunks, creating or editing hidden text layers, pre-computing thumbnail images, and more...

       djvutxt(1)
              A command line tool to extract the hidden text from DjVu documents.

       djvudump(1)
              A command line tool for inspecting DjVu files and displaying their internal structure.

       djvuextract(1)
              A command line tool for disassembling DjVu image files.

       djvumake(1)
              A command line tool for assembling DjVu image files.

       djvuserve(1)
              A CGI program for generating indirect multi-page DjVu documents on the fly.

       djvutoxml(1), djvuxmlparser(1)
              Command line tools to edit DjVu metadata as XML files.
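
       Several of these tools are commonly chained. The sketch below builds, without running them, the command
       lines for encoding bitonal pages with cjb2(1), bundling them with djvm(1), converting the result to an
       indirect document with djvmcvt(1), and inspecting it with djvudump(1). The file names and the helper
       name are hypothetical, and the argument order shown should be checked against each tool's own manual
       page.

```python
import shlex


def build_commands(pbm_pages, bundled="book.djvu",
                   index_dir="book_indirect", index_name="index.djvu"):
    """Return argv lists for encoding, bundling, converting, and inspecting."""
    commands = []
    page_files = []
    for pbm in pbm_pages:
        djvu = pbm.rsplit(".", 1)[0] + ".djvu"
        commands.append(["cjb2", pbm, djvu])               # PBM page -> DjVuBitonal
        page_files.append(djvu)
    commands.append(["djvm", "-c", bundled, *page_files])  # create bundled document
    commands.append(["djvmcvt", "-i", bundled, index_dir, index_name])  # to indirect
    commands.append(["djvudump", bundled])                 # show internal structure
    return commands


for argv in build_commands(["p1.pbm", "p2.pbm"]):
    print(shlex.join(argv))
```

       Each list can be passed to subprocess.run(argv, check=True) once the input files exist and the
       DjVuLibre tools are installed.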

DJVU ENCODERS AND ANY2DJVU

       DjVuLibre comes with a variety of specialized encoders: c44(1) for photographic images, cjb2(1) for
       bitonal images, and cpaldjvu(1) for images with few distinct colors. Although these encoders perform
       well in their specialized domains, they cannot handle complex tasks involving segmentation and
       multi-page encoding.

       The Lizardtech commercial products (see http://www.lizardtech.com/solutions/document) can perform these
       complex encoding tasks.

       Another solution is provided by the compression server at http://any2djvu.djvuzone.org. This machine
       uses pre-Lizardtech prototype encoders from AT&T Labs and performs almost as well as the commercial
       Lizardtech encoders.  Please note that the Any2DjVu compression server  comes  with  no  guarantee,  that
       nothing  is  done  to  ensure  that  your  documents will remain confidential, and that there is only one
       computer working for the whole planet.

CREDITS

       Numerous people have contributed to the DjVu source code during the last five  years.   Please  submit  a
       sourceforge bug report to update the following list.

          Yoshua  Bengio, Léon Bottou, Chakradhar Chandaluri, Regis M. Chaplin, Ming Chen, Parag Deshmukh, Royce
          Edwards, Andrew Erofeev, Praveen Guduru, Patrick Haffner, Paul G. Howard, Orlando Keise, Yann Le  Cun,
          Artem  Mikheev,  Florin  Nicsa,  Joseph M. Orost, Steven Pigeon, Bill Riemers, Patrice Simard, Jeffery
          Triggs, Luc Vincent, Pascal Vincent.