Provided by: libqdbm-dev_1.8.78-6.1ubuntu2_amd64 bug

NAME

       QDBM - quick database manager

OVERVIEW

       QDBM is a library of routines for managing a database.  The database is a simple data file
       containing records, each is a pair of a key and a value.  Every key and  value  is  serial
       bytes  with  variable  length.  Both binary data and character string can be used as a key
       and a value.  There is neither concept  of  data  tables  nor  data  types.   Records  are
       organized in hash table or B+ tree.

       As  for  database  of  hash  table,  each  key  must be unique within a database, so it is
       impossible to store two or more records with a key overlaps.  The following access methods
       are  provided  to the database: storing a record with a key and a value, deleting a record
       by a key, retrieving a record by a key.  Moreover,  traversal  access  to  every  key  are
       provided,  although  the  order is arbitrary.  These access methods are similar to ones of
       DBM (or its followers: NDBM and GDBM) library defined in the UNIX standard.   QDBM  is  an
       alternative for DBM because of its higher performance.

       As  for  database  of  B+  tree,  records whose keys are duplicated can be stored.  Access
       methods of storing, deleting, and retrieving are provided as with  the  database  of  hash
       table.   Records  are  stored  in order by a comparing function assigned by a user.  It is
       possible to access  each  record  with  the  cursor  in  ascending  or  descending  order.
       According  to  this  mechanism,  forward  matching search for strings and range search for
       integers are realized.  Moreover, transaction is available in database of B+ tree.

EFFECTIVE IMPLEMENTATION OF HASH DATABASE

       QDBM is developed referring to GDBM for the purpose of the following three points:  higher
       processing  speed,  smaller  size  of  a  database  file, and simpler API.  They have been
       achieved.  Moreover, the following three restrictions of traditional DBM:  a  process  can
       handle  only  one  database,  the size of a key and a value is bounded, a database file is
       sparse, are cleared.

       QDBM uses hash algorithm to retrieve records.  If a bucket array has sufficient number  of
       elements,  the  time  complexity  of  retrieval  is  `O(1)'.   That  is, time required for
       retrieving a record is constant, regardless of the scale of a database.  It  is  also  the
       same  about  storing  and  deleting.   Collision  of  hash  values  is managed by separate
       chaining.  Data structure of the chains is binary search tree.  Even if a bucket array has
       unusually scarce elements, the time complexity of retrieval is `O(log n)'.

       QDBM attains improvement in retrieval by loading RAM with the whole of a bucket array.  If
       a bucket array is on RAM, it is possible to access a region of a target  record  by  about
       one path of file operations.  A bucket array saved in a file is not read into RAM with the
       `read' call but directly mapped to RAM with the `mmap' call.  Therefore, preparation  time
       on  connecting  to  a database is very short, and two or more processes can share the same
       memory map.

       If the number of elements of a bucket array is about  half  of  records  stored  within  a
       database, although it depends on characteristic of the input, the probability of collision
       of hash values is about 56.7% (36.8% if the same, 21.3% if twice,  11.5%  if  four  times,
       6.0%  if  eight  times).  In such case, it is possible to retrieve a record by two or less
       paths of file operations.  If it is made into a performance index, in order  to  handle  a
       database containing one million of records, a bucket array with half a million of elements
       is needed.  The size of each element is  4  bytes.   That  is,  if  2M  bytes  of  RAM  is
       available, a database containing one million records can be handled.

       QDBM  provides  two  modes  to connect to a database: `reader' and `writer'.  A reader can
       perform retrieving but neither storing nor deleting.  A  writer  can  perform  all  access
       methods.   Exclusion  control between processes is performed when connecting to a database
       by file locking.  While a writer is connected to a database, neither readers  nor  writers
       can  be  connected.   While  a  reader  is  connected  to a database, other readers can be
       connect, but writers can not.  According to this mechanism, data consistency is guaranteed
       with simultaneous connections in multitasking environment.

       Traditional  DBM provides two modes of the storing operations: `insert' and `replace'.  In
       the case a key overlaps an existing record, the insert  mode  keeps  the  existing  value,
       while  the  replace  mode  transposes  it  to the specified value.  In addition to the two
       modes, QDBM provides `concatenate' mode.  In the mode, the specified value is concatenated
       at the end of the existing value and stored.  This feature is useful when adding a element
       to a value as an array.  Moreover, although DBM has a method to fetch out a value  from  a
       database only by reading the whole of a region of a record, QDBM has a method to fetch out
       a part of a region of a value.  When a value is treated as an array, this feature is  also
       useful.

       Generally  speaking,  while  succession  of  updating,  fragmentation of available regions
       occurs, and the size of a  database  grows  rapidly.   QDBM  deal  with  this  problem  by
       coalescence  of  dispensable regions and reuse of them, and featuring of optimization of a
       database.  When overwriting a record with a value whose size is greater than the  existing
       one,  it  is  necessary to remove the region to another position of the file.  Because the
       time complexity of the operation depends on the size of the region of a record,  extending
       values  successively  is  inefficient.  However, QDBM deal with this problem by alignment.
       If increment can be put in padding, it is not necessary to remove the region.

       As for many file systems, it is impossible to handle a file whose size is more  than  2GB.
       To deal with this problem, QDBM provides a directory database containing multiple database
       files.  Due to this feature, it is possible to handle a database whose total size is up to
       1TB  in  theory.   Moreover, because database files can be deployed on multiple disks, the
       speed of updating operations can be improved  as  with  RAID-0  (striping).   It  is  also
       possible for the database files to deploy on multiple file servers using NFS and so on.

USEFUL IMPLEMENTATION OF B+ TREE DATABASE

       Although  B+  tree  database  is slower than hash database, it features ordering access to
       each record.  The order can be assigned by users.  Records  of  B+  tree  are  sorted  and
       arranged  in  logical  pages.   Sparse index organized in B tree that is multiway balanced
       tree are maintained for each page.  Thus, the time complexity of retrieval and  so  on  is
       `O(log  n)'.  Cursor is provided to access each record in order.  The cursor can jump to a
       position specified by a key and can step forward or backward from  the  current  position.
       Because  each  page  is  arranged  as  double linked list, the time complexity of stepping
       cursor is `O(1)'.

       B+ tree database is implemented, based on above hash database.  Because each  page  of  B+
       tree  is  stored  as each record of hash database, B+ tree database inherits efficiency of
       storage management of hash database.  Because the header of each  record  is  smaller  and
       alignment  of  each  page is calculated statistically, in most cases, the size of database
       file is cut by half compared to one of hash database.  Although operation  of  many  pages
       are  required  to update B+ tree, QDBM expedites the process by caching pages and reducing
       file operations.  In most cases, because whole of the sparse index is cached on memory, it
       is possible to retrieve a record by one or less path of file operations.

       B+  tree  database  features  transaction mechanism.  It is possible to commit a series of
       operations between the beginning and the end of the transaction in a lump, or to abort the
       transaction and perform rollback to the state before the transaction.  Even if the process
       of an application is crushed while the transaction, the database file is not broken.

       In case that QDBM is built with ZLIB, LZO, or BZIP2 enabled, a  lossless  data-compression
       library,  the content of each page of B+ tree is compressed and stored in a file.  Because
       each record in a page has similar patterns, high efficiency of compression is expected due
       to  the  Lempel-Ziv  algorithm  and  the  like.  In case handling text data, the size of a
       database is reduced to about 25%.  If the scale of a database is large and disk I/O is the
       bottleneck, featuring compression makes the processing speed improved to a large extent.

SIMPLE BUT VARIOUS INTERFACES

       QDBM  provides  very  simple  APIs.   You  can perform database I/O as usual file I/O with
       `FILE' pointer defined in ANSI C.  In the basic API of  QDBM,  entity  of  a  database  is
       recorded  as  one  file.  In the extended API, entity of a database is recorded as several
       files in one directory.  Because the two APIs are very similar with each other, porting an
       application from one to the other is easy.

       APIs  which  are  compatible  with NDBM and GDBM are also provided.  As there are a lot of
       applications using NDBM or GDBM, it is easy to port them onto QDBM.  In most cases, it  is
       completed  only  by replacement of header including (#include) and re-compiling.  However,
       QDBM can not handle database files made by the original NDBM or GDBM.

       In order to handle records on memory easily, the utility API is provided.   It  implements
       memory  allocating  functions,  sorting functions, extensible datum, array list, hash map,
       and so on.  Using them, you can handle records in C language cheaply  as  in  such  script
       languages as Perl or Ruby.

       B+ tree database is used with the advanced API.  The advanced API is implemented using the
       basic API and the utility API.  Because the advanced API is also similar to the basic  API
       and the extended API, it is easy to learn how to use it.

       In  order  to  handle  an  inverted  index  which is used by full-text search systems, the
       inverted API is provided.  If it is easy to handle an  inverted  index  of  documents,  an
       application  can  focus  on text processing and natural language processing.  Because this
       API does not depend on character codes nor  languages,  it  is  possible  to  implement  a
       full-text search system which can respond to various requests from users.

       Along  with  APIs for C, QDBM provides APIs for C++, Java, Perl, and Ruby.  APIs for C are
       composed of seven kinds: the basic API, the extended API,  the  NDBM-compatible  API,  the
       GDBM-compatible  API,  the  utility  API, the advanced API, and the inverted API.  Command
       line interfaces corresponding to  each  API  are  also  provided.   They  are  useful  for
       prototyping,  testing,  debugging,  and so on.  The C++ API encapsulates database handling
       functions of the basic API, the extended API, and the advanced API with class mechanism of
       C++.   The  Java  API  has native methods calling the basic API, the extended API, and the
       advanced API with Java Native Interface.  The Perl API has methods calling the basic  API,
       the  extended API, and the advanced API with XS language.  The Ruby API has method calling
       the basic API, the extended API, and the advanced API as modules of Ruby.   Moreover,  CGI
       scripts for administration of databases and full-text search are provided.

WIDE PORTABILITY

       QDBM  is  implemented being based on syntax of ANSI C (C89) and using only APIs defined in
       ANSI C or POSIX.  Thus, QDBM works on most UNIX and its compatible OSs.   As  for  C  API,
       checking  operations have been done at least on Linux 2.2, Linux 2.4, FreeBSD 4.8, FreeBSD
       5.0, SunOS 5.7, SunOS 5.8, SunOS 5.9, HP-UX 11.00, Cygwin 1.3.10, Mac OS X 10.2, and  RISC
       OS 5.03.  Although a database file created by QDBM depends on byte order of the processor,
       to do with it, utilities to dump data in format which is independent to  byte  orders  are
       provided.

BUILDING

       For  building  a  program  using  QDBM,  the  program should be linked with a library file
       `libqdbm.a' or `libqdbm.so'.  For example, the following  command  is  executed  to  build
       `sample' from `sample.c'.

              gcc -I/usr/local/include -o sample sample.c -L/usr/local/lib -lqdbm

AUTHOR

       QDBM  is  written  by  Mikio  Hirabayashi.   You  can  contact  the  author  by  e-mail to
       <mikio@fallabs.com>.  Any suggestion or bug report is welcome to the author.

COPYRIGHT

       Copyright(c) 2000-2003 Mikio Hirabayashi

       QDBM is free software; you can redistribute it and/or modify it under the terms of the GNU
       Lesser General Public License as published by the Free Software Foundation; either version
       2 of the License, or any later version.

       QDBM is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY;  without
       even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
       GNU Lesser General Public License for more details.

       You should have received a copy of the GNU Lesser General Public License along with  QDBM;
       if  not,  write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston,
       MA 02111-1307 USA.

SEE ALSO

       depot(3), curia(3), relic(3), hovel(3), cabin(3), villa(3), odeum(3), ndbm(3), gdbm(3)