Provided by: inn2-dev_2.5.2+20110413-1build1_amd64

NAME

       dbzinit,  dbzfresh,  dbzagain,  dbzclose, dbzexists, dbzfetch, dbzstore, dbzsync, dbzsize,
       dbzgetoptions, dbzsetoptions, dbzdebug - database routines

SYNOPSIS

       #include <inn/dbz.h>

       bool dbzinit(const char *base)

       bool dbzclose(void)

       bool dbzfresh(const char *base, long size)

       bool dbzagain(const char *base, const char *oldbase)

       bool dbzexists(const HASH key)

       off_t dbzfetch(const HASH key)
       bool dbzfetch(const HASH key, void *ivalue)

       DBZSTORE_RESULT dbzstore(const HASH key, off_t offset)
       DBZSTORE_RESULT dbzstore(const HASH key, void *ivalue)

       bool dbzsync(void)

       long dbzsize(long nentries)

       void dbzgetoptions(dbzoptions *opt)

       void dbzsetoptions(const dbzoptions opt)

DESCRIPTION

       These functions provide an indexing system for rapid random access to  a  text  file  (the
       base file).

       Dbz  stores offsets into the base text file for rapid retrieval.  All retrievals are keyed
       on a hash value that is generated by the HashMessageID() function.

       Dbzinit opens a database, an index into the base file base, consisting of the files
       base.dir, base.index, and base.hash, which must already exist.  (If the database is new,
       they should be zero-length files.)  Subsequent accesses go to that database until
       dbzclose is called to close it.

       The behavior of dbzfetch and dbzstore depends on whether INN was configured with
       --enable-tagged-hash.  With tagged hash enabled, dbzfetch searches the database for the
       specified key and returns the corresponding value, if any, and dbzstore stores the
       key-value pair in the database.  Without it, dbzfetch returns true and sets the content
       of ivalue, and dbzstore stores the content of ivalue.  Dbzstore will fail unless the
       database files are writable.  Dbzexists checks whether the given hash exists in the
       database.  Dbz is optimized for this operation, and it may be significantly faster than
       dbzfetch.

       Dbzfresh is a variant of dbzinit for creating  a  new  database  with  more  control  over
       details.

       Dbzfresh's  size parameter specifies the size of the first hash table within the database,
       in key-value pairs.  Performance will be best if the number of key-value pairs  stored  in
       the database does not exceed about 2/3 of size.  (The dbzsize function, given the expected
       number of key-value pairs, will suggest  a  database  size  that  meets  these  criteria.)
       Assuming that an fseek offset is 4 bytes, the .index file will be 4 * size bytes and the
       .hash file will be DBZ_INTERNAL_HASH_SIZE * size bytes until the number of key-value
       pairs exceeds about 80% of size; the .dir file is tiny and roughly constant in size.
       (Nothing awful will happen if the database grows beyond 100% of size, but accesses will
       slow down quite a bit and the .index and .hash files will grow somewhat.)
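
       The sizing arithmetic above can be sketched in C.  This is only an illustration of the
       2/3 rule and the file-size formulas: suggest_size, index_bytes, and hash_bytes are
       hypothetical helpers, not part of dbz, and the real dbzsize may well choose a different
       value.  The per-slot byte counts are passed in as parameters rather than assumed.

```c
#include <assert.h>

/* Smallest table size such that nentries is at most about 2/3 of it.
 * Hypothetical stand-in for dbzsize(); not the library's algorithm. */
static long suggest_size(long nentries)
{
    if (nentries <= 0)
        return 0;
    return (nentries * 3 + 1) / 2;   /* ceil(nentries * 3 / 2) */
}

/* Expected .index size: one offset of offset_bytes per table slot. */
static long index_bytes(long size, long offset_bytes)
{
    assert(size >= 0 && offset_bytes > 0);
    return size * offset_bytes;
}

/* Expected .hash size; hash_bytes_per_slot plays the role of
 * DBZ_INTERNAL_HASH_SIZE, whose value is not assumed here. */
static long hash_bytes(long size, long hash_bytes_per_slot)
{
    assert(size >= 0 && hash_bytes_per_slot > 0);
    return size * hash_bytes_per_slot;
}
```

       For example, 1000 expected pairs give a suggested size of 1500 under this rule, and
       with 4-byte offsets the .index file would then be 6000 bytes.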

       Dbz  stores  up to DBZ_INTERNAL_HASH_SIZE bytes of the message-id's hash in the .hash file
       to confirm a hit.  This eliminates the need to read the base file  to  handle  collisions.
       This replaces the tagmask feature in previous dbz releases.
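
       The idea of confirming a hit from stored hash bytes can be sketched as follows.  This
       is a simplified illustration, not dbz's code: the 16-byte full hash, the value 6 used in
       place of DBZ_INTERNAL_HASH_SIZE, the choice of the leading bytes, and the name
       confirms_hit are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define FULL_HASH_SIZE   16  /* assumed size of a full key hash */
#define STORED_HASH_SIZE 6   /* hypothetical stand-in for DBZ_INTERNAL_HASH_SIZE */

/* True if the bytes kept for a table slot match the leading bytes of the
 * key's hash.  No read of the base file is needed to decide this. */
static bool confirms_hit(const unsigned char stored[STORED_HASH_SIZE],
                         const unsigned char key[FULL_HASH_SIZE])
{
    assert(stored != NULL && key != NULL);
    return memcmp(stored, key, STORED_HASH_SIZE) == 0;
}
```

       A slot whose stored bytes differ from the key is treated as a collision and the search
       moves on; only a match counts as a hit.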

       A size of ``0'' given to dbzfresh is synonymous with the local default; the normal default
       is suitable for tables of 5,000,000 key-value pairs.  Calling dbzinit(name) with the empty
       name is equivalent to calling dbzfresh(name, 0).

       When  databases  are  regenerated  periodically,  as  in  news, it is simplest to pick the
       parameters for a new database based on the old one.  This also permits some memory of past
       sizes  of  the  old  database, so that a new database size can be chosen to cover expected
       fluctuations.  Dbzagain is a variant of dbzinit for creating  a  new  database  as  a  new
       generation  of  an  old database.  The database files for oldbase must exist.  Dbzagain is
       equivalent to calling dbzfresh with a size equal to the result of applying dbzsize to  the
       largest number of entries in the oldbase database and its previous 10 generations.
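
       The size selection described for dbzagain can be sketched as taking the maximum of the
       remembered entry counts and sizing for it.  The names size_for and next_generation_size
       are hypothetical, size_for is a simple 2/3-rule stand-in for dbzsize, and how dbz
       actually records past usage is not shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for dbzsize(): keeps usage at or below about 2/3. */
static long size_for(long nentries)
{
    return nentries <= 0 ? 0 : (nentries * 3 + 1) / 2;
}

/* used[] holds the entry counts remembered for the old database and its
 * previous generations; returns the size a new generation would be given. */
static long next_generation_size(const long *used, size_t ngen)
{
    long largest = 0;
    size_t i;

    assert(used != NULL || ngen == 0);
    for (i = 0; i < ngen; i++)
        if (used[i] > largest)
            largest = used[i];
    return size_for(largest);
}
```

       Sizing for the largest recent count rather than the latest one is what lets the new
       database absorb expected fluctuations.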

       When  many  accesses  are  being  done by the same program, dbz is massively faster if its
       first hash table is in memory.  If the  ``pag_incore''  flag  is  set  to  INCORE_MEM,  an
       attempt  is  made to read the table in when the database is opened, and dbzclose writes it
       out to disk again (if it was read successfully and has been modified).  Dbzsetoptions can
       be used to set the pag_incore and exists_incore flags to a new value, which should be
       ``INCORE_NO'', ``INCORE_MEM'', or ``INCORE_MMAP'', for the .hash and .index files
       separately; this does not affect the status of a database that has already been opened.
       The default is ``INCORE_NO'' for the .index file and ``INCORE_MMAP'' for the .hash file.
       The attempt to read the table in may fail due to memory shortage; in this case dbz fails
       with an error.  Stores to an in-memory database are not (in general) written out to the
       file until dbzclose or dbzsync, so if robustness in the presence of crashes or concurrent
       accesses is crucial, in-memory databases should probably be avoided or the writethrough
       option should be set to ``true''.
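
       The trade-off between a plain in-memory table and write-through can be illustrated with
       a toy model.  Nothing here is dbz code: toy_table, toy_open, toy_store, and toy_sync are
       hypothetical, and the file is simulated by a second array so that the difference between
       what is in memory and what would survive a crash is easy to see.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SLOTS 8

typedef struct {
    long mem[SLOTS];    /* in-memory copy; every store updates this first */
    long disk[SLOTS];   /* stands in for the file contents */
    bool writethrough;  /* copy every store to "disk" immediately */
    bool dirty;         /* mem holds changes not yet on "disk" */
} toy_table;

static void toy_open(toy_table *t, bool writethrough)
{
    memset(t, 0, sizeof *t);
    t->writethrough = writethrough;
}

static void toy_store(toy_table *t, size_t slot, long value)
{
    assert(slot < SLOTS);
    t->mem[slot] = value;
    if (t->writethrough)
        t->disk[slot] = value;   /* safe against a crash right away */
    else
        t->dirty = true;         /* lost if the process dies before a sync */
}

/* Plays the role of dbzsync or dbzclose: flush the table if it changed. */
static void toy_sync(toy_table *t)
{
    if (t->dirty) {
        memcpy(t->disk, t->mem, sizeof t->disk);
        t->dirty = false;
    }
}
```

       Without write-through, a store is visible only in mem until toy_sync runs; with it, each
       store costs an extra write but disk never lags behind.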

       If the nonblock option is ``true'', then writes to the .hash and .index files will be done
       using non-blocking I/O.  This can be significantly faster if your platform  supports  non-
       blocking I/O with files.

       Dbzsync causes all buffers etc. to be flushed out to the files.  It is typically used as a
       precaution against crashes or concurrent accesses when a dbz-using process will be running
       for  a  long  time.   It  is  a  somewhat expensive operation, especially for an in-memory
       database.

       Concurrent reading of databases is  fairly  safe,  but  there  is  no  (inter)locking,  so
       concurrent updating is not.

       An open database occupies three stdio streams and two file descriptors.  Apart from
       stdio buffers, memory consumption is negligible except for in-memory databases.

SEE ALSO

       dbm(3), history(5), libinn(3)

DIAGNOSTICS

       Functions returning bool values return ``true'' for success and ``false'' for failure.
       Functions returning off_t values return -1 for failure.  Dbzinit attempts to have errno
       set plausibly on return, but otherwise this is not guaranteed.  An errno of EDOM from
       dbzinit indicates that the database did not appear to be in dbz format.

       If DBZTEST is defined at compile time, a main() function will be included.  It runs
       performance and integrity tests.

HISTORY

       The  original  dbz  was  written  by  Jon  Zeeff  (zeeff@b-tech.ann-arbor.mi.us).    Later
       contributions  by  David  Butler  and  Mark  Moraes.   Extensive reworking, including this
       documentation, by Henry Spencer (henry@zoo.toronto.edu) as part of  the  C  News  project.
       MD5 code borrowed from RSA.  Extensive reworking to remove backwards compatibility and to
       add hashes into dbz files by Clayton O'Neill (coneill@oneill.net).

BUGS

       Unlike dbm, dbz will refuse to dbzstore with a key already in the database.  The  user  is
       responsible for avoiding this.

       The  RFC5322  case  mapper  implements only a first approximation to the hideously-complex
       RFC5322 case rules.

       Dbz no longer tries to be call-compatible with dbm in any way.

                                            6 Sep 1997                                     DBZ(3)