Provided by: tinycdb_0.78_amd64 bug

NAME

       cdb - Constant DataBase file format

DESCRIPTION

       A  cdb  database  is  a  single  file  used  to  map `keys' to `values', having records of
       (key,value) pairs.  File consists of 3 parts: toc (table  of  contents),  data  and  index
       (hash tables).

       Toc  has  fixed  length of 2048 bytes, containing 256 pointers to hash tables inside index
       sections.  Every pointer consists of position of a hash table in bytes from the  beginning
       of  a  file,  and  a  size of a hash table in entries, both are 4-bytes (32 bits) unsigned
       integers in little-endian form.  Hash table length may  have  zero  length,  meaning  that
       corresponding hash table is empty.

       Right  after  toc  section,  data  section  follows without any alingment.  It consists of
       series of records, each is a key length, value (data) length, key and value.   Again,  key
       and  value length are 4-byte unsigned integers.  Each next record follows previous without
       any special alignment.

       After data section, index (hash tables) section  follows.   It  should  be  looked  to  in
       conjunction  with  toc  section,  where  each  of  max 256 hash tables are defined.  Index
       section consists of series of hash tables, with starting position and  length  defined  in
       toc section.  Every hash table is a sequence of records each holds two numbers: key's hash
       value and record position inside data section (bytes from the beginning of a file to first
       byte  of  key  length  starting data record).  If record position is zero, then this is an
       empty hash table slot, pointed to nowhere.

       CDB hash function is
         hv = ((hv << 5) + hv) ^ c
       for every single c byte of a key, starting with hv = 5381.

       Toc section indexed by (hv % 256), i.e. hash value modulo 256 (number of  entries  in  toc
       section).

       In  order  to  find  a  record,  one  should: first, compute the hash value (hv) of a key.
       Second, look to hash table number hv modulo 256.  If it is empty, then there  is  no  such
       key  exists.   If  it  is  not  empty,  then  third, loop by slots inside that hash table,
       starting from slot with number hv divided by 256 modulo length of that table,  or  ((hv  /
       256)  % htlen), searching for this hv in hash table.  Stop search on empty slot (if record
       position is zero) or when all slots was probed (note cyclic search, jumping  from  end  to
       beginning of a table).  When hash value in question is found in hash table, look to key of
       corresponding record, comparing it with key in question.  If them of the same  length  and
       equals  to  each  other, then record is found, overwise, repeat with next hash table slot.
       Note that there may be several records with the same key.

SEE ALSO

       cdb(1), cdb(3).

AUTHOR

       The tinycdb package written by Michael Tokarev <mjt@corpit.ru>, based on ideas and  shares
       file format with original cdb library by Dan Bernstein.

LICENSE

       Public domain.

                                            Apr, 2005                                      cdb(5)