Provided by: libqdbm-dev_1.8.78-13build2_amd64

NAME

       Odeum - the inverted API of QDBM

SYNOPSIS

       #include <depot.h>
       #include <cabin.h>
       #include <odeum.h>
       #include <stdlib.h>

       typedef struct { int id; int score; } ODPAIR;

       ODEUM *odopen(const char *name, int omode);

       int odclose(ODEUM *odeum);

       int odput(ODEUM *odeum, const ODDOC *doc, int wmax, int over);

       int odout(ODEUM *odeum, const char *uri);

       int odoutbyid(ODEUM *odeum, int id);

       ODDOC *odget(ODEUM *odeum, const char *uri);

       ODDOC *odgetbyid(ODEUM *odeum, int id);

       int odgetidbyuri(ODEUM *odeum, const char *uri);

       int odcheck(ODEUM *odeum, int id);

       ODPAIR *odsearch(ODEUM *odeum, const char *word, int max, int *np);

       int odsearchdnum(ODEUM *odeum, const char *word);

       int oditerinit(ODEUM *odeum);

       ODDOC *oditernext(ODEUM *odeum);

       int odsync(ODEUM *odeum);

       int odoptimize(ODEUM *odeum);

       char *odname(ODEUM *odeum);

       double odfsiz(ODEUM *odeum);

       int odbnum(ODEUM *odeum);

       int odbusenum(ODEUM *odeum);

       int oddnum(ODEUM *odeum);

       int odwnum(ODEUM *odeum);

       int odwritable(ODEUM *odeum);

       int odfatalerror(ODEUM *odeum);

       int odinode(ODEUM *odeum);

       time_t odmtime(ODEUM *odeum);

       int odmerge(const char *name, const CBLIST *elemnames);

       int odremove(const char *name);

       ODDOC *oddocopen(const char *uri);

       void oddocclose(ODDOC *doc);

       void oddocaddattr(ODDOC *doc, const char *name, const char *value);

       void oddocaddword(ODDOC *doc, const char *normal, const char *asis);

       int oddocid(const ODDOC *doc);

       const char *oddocuri(const ODDOC *doc);

       const char *oddocgetattr(const ODDOC *doc, const char *name);

       const CBLIST *oddocnwords(const ODDOC *doc);

       const CBLIST *oddocawords(const ODDOC *doc);

       CBMAP *oddocscores(const ODDOC *doc, int max, ODEUM *odeum);

       CBLIST *odbreaktext(const char *text);

       char *odnormalizeword(const char *asis);

       ODPAIR *odpairsand(ODPAIR *apairs, int anum, ODPAIR *bpairs, int bnum, int *np);

       ODPAIR *odpairsor(ODPAIR *apairs, int anum, ODPAIR *bpairs, int bnum, int *np);

       ODPAIR *odpairsnotand(ODPAIR *apairs, int anum, ODPAIR *bpairs, int bnum, int *np);

       void odpairssort(ODPAIR *pairs, int pnum);

       double odlogarithm(double x);

       double odvectorcosine(const int *avec, const int *bvec, int vnum);

       void odsettuning(int ibnum, int idnum, int cbnum, int csiz);

       void odanalyzetext(ODEUM *odeum, const char *text, CBLIST *awords, CBLIST *nwords);

       void odsetcharclass(ODEUM *odeum, const char *spacechars, const char *delimchars,
                           const char *gluechars);

       ODPAIR *odquery(ODEUM *odeum, const char *query, int *np, CBLIST *errors);

DESCRIPTION

       Odeum is the API that handles an inverted index.  An inverted index is a data structure
       that maps each word extracted from a collection of documents to the list of documents
       containing that word.  It makes a full-text search system easy to implement.  Odeum
       provides an abstract data structure consisting of the words and attributes of a
       document.  It is used when an application stores a document in a database and when it
       retrieves documents from a database.
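
       As a rough illustration of the idea only (not of Odeum's on-disk format), a toy
       in-memory inverted index might look like the sketch below.  The names `Pair',
       `Posting', `Index', `index_add' and `index_search' are hypothetical and are not part of
       QDBM; `Pair' merely mirrors the layout of `ODPAIR', and the fixed capacities are an
       assumption made to keep the example short.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum { MAXWORDS = 16, MAXPAIRS = 8 };  /* fixed capacities keep the sketch short */

/* Hypothetical stand-in mirroring the layout of ODPAIR. */
typedef struct { int id; int score; } Pair;

/* One entry of the toy index: a word and the documents that contain it. */
typedef struct {
  const char *word;      /* normalized word (not copied; the caller keeps it alive) */
  Pair pairs[MAXPAIRS];
  int pnum;
} Posting;

typedef struct { Posting posts[MAXWORDS]; int wnum; } Index;

/* Record that document `id' contains `word' with the given score.
   Returns 1 on success, 0 if one of the fixed-size tables is full. */
int index_add(Index *ix, const char *word, int id, int score){
  int i;
  assert(ix != NULL && word != NULL);
  for(i = 0; i < ix->wnum; i++){
    if(strcmp(ix->posts[i].word, word) == 0) break;
  }
  if(i == ix->wnum){
    if(ix->wnum >= MAXWORDS) return 0;
    ix->posts[i].word = word;
    ix->posts[i].pnum = 0;
    ix->wnum++;
  }
  if(ix->posts[i].pnum >= MAXPAIRS) return 0;
  ix->posts[i].pairs[ix->posts[i].pnum].id = id;
  ix->posts[i].pairs[ix->posts[i].pnum].score = score;
  ix->posts[i].pnum++;
  return 1;
}

/* Look up `word' and return a malloc'ed copy of its pairs; the count is
   stored in *np.  An unknown word yields an empty array, not NULL. */
Pair *index_search(const Index *ix, const char *word, int *np){
  Pair *res;
  int i;
  assert(ix != NULL && word != NULL && np != NULL);
  *np = 0;
  for(i = 0; i < ix->wnum; i++){
    if(strcmp(ix->posts[i].word, word) == 0){
      res = malloc(sizeof(Pair) * ix->posts[i].pnum + 1);
      if(!res) return NULL;
      memcpy(res, ix->posts[i].pairs, sizeof(Pair) * ix->posts[i].pnum);
      *np = ix->posts[i].pnum;
      return res;
    }
  }
  return malloc(1);
}
```

       The caller zero-initializes an `Index', adds (word, document ID, score) triples, and
       releases each search result with `free', which follows the same ownership convention
       that this page describes for `odsearch'.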

       Odeum does not provide methods to extract the text from the original data of a document;
       applications must implement that themselves.  Although Odeum provides utilities to
       extract words from a text, they are oriented to languages such as English, whose words
       are separated by space characters.  If an application handles languages such as
       Japanese, which need morphological analysis or N-gram analysis, or if it performs more
       sophisticated natural-language analysis such as stemming, it can use its own analysis
       method.  The result of a search is expressed as an array of structures, each composed of
       the ID number of a document and its score.  To search with two or more words, Odeum
       provides utilities for set operations.

       Odeum is implemented on top of Curia, Cabin, and Villa.  Odeum creates a database under a
       directory name, and several databases of Curia and Villa are placed in that directory.
       For example, `casket/docs', `casket/index', and `casket/rdocs' are created when the
       database directory is named `casket'.  `docs' is a database directory of Curia.  The key
       of each record is the ID number of a document, and the value holds attributes such as
       the URI.  `index' is a database directory of Curia.  The key of each record is the
       normalized form of a word, and the value is an array whose elements are pairs of the ID
       number of a document including the word and its score.  `rdocs' is a database file of
       Villa.  The key of each record is the URI of a document, and the value is its ID number.

       To use Odeum, include `depot.h', `cabin.h', `odeum.h' and `stdlib.h' in the source
       files.  Usually, the following lines appear near the beginning of a source file.

              #include <depot.h>
              #include <cabin.h>
              #include <odeum.h>
              #include <stdlib.h>

       A pointer to `ODEUM' is used as a database handle.  A database handle is opened with the
       function `odopen' and closed with `odclose'.  You should not refer directly to any member
       of the handle.  If a fatal error occurs in a database, every access method via the handle
       except `odclose' stops working and returns an error status.  Although a process may use
       multiple database handles at the same time, it should not use more than one handle of
       the same database at once.

       A pointer to `ODDOC' is used as a document handle.  A document handle is opened  with  the
       function  `oddocopen'  and closed with `oddocclose'.  You should not refer directly to any
       member of the handle.  A  document  consists  of  attributes  and  words.   Each  word  is
       expressed as a pair of a normalized form and an appearance form.

       Odeum also assigns the error code to the external variable `dpecode'.  The function
       `dperrmsg' is used in order to get the message of the error code.

       Structures of `ODPAIR' type are used in order to handle the results of a search.

       typedef struct { int id; int score; } ODPAIR;
              `id' specifies the ID number of a document.  `score' specifies the score calculated
              from the number of occurrences of the searched words in the document.

       The function `odopen' is used in order to get a database handle.

       ODEUM *odopen(const char *name, int omode);
              `name' specifies the name of a database directory.  `omode' specifies the
              connection mode: `OD_OWRITER' as a writer, `OD_OREADER' as a reader.  If the mode
              is `OD_OWRITER', the following may be added by bitwise or: `OD_OCREAT', which
              creates a new database if none exists, and `OD_OTRUNC', which creates a new
              database even if one already exists.  Either `OD_OREADER' or `OD_OWRITER' may also
              be combined by bitwise or with `OD_ONOLCK', which opens the database directory
              without file locking, or `OD_OLCKNB', which performs locking without blocking.
              The return value is the database handle, or `NULL' if the call is not successful.
              While connected as a writer, the handle holds an exclusive lock on the database
              directory.  While connected as a reader, it holds a shared lock on the database
              directory.  The thread blocks until the lock is acquired.  If `OD_ONOLCK' is used,
              the application is responsible for exclusion control.

       The function `odclose' is used in order to close a database handle.

       int odclose(ODEUM *odeum);
              `odeum' specifies a database handle.  If successful, the return value is true,
              else, it is false.  Because the region of a closed handle is released, the handle
              can no longer be used.  Updates to a database are guaranteed to be written when
              the handle is closed.  If a writer opens a database but does not close it
              properly, the database will be broken.

       The function `odput' is used in order to store a document.

       int odput(ODEUM *odeum, const ODDOC *doc, int wmax, int over);
              `odeum' specifies a database handle connected  as  a  writer.   `doc'  specifies  a
              document  handle.   `wmax'  specifies  the  max number of words to be stored in the
              document database.  If it is negative, the number is unlimited.   `over'  specifies
              whether  the data of the duplicated document is overwritten or not.  If it is false
              and the URI of the document is duplicated, the function returns as  an  error.   If
              successful, the return value is true, else, it is false.

       The function `odout' is used in order to delete a document specified by a URI.

       int odout(ODEUM *odeum, const char *uri);
              `odeum'  specifies  a  database  handle connected as a writer.  `uri' specifies the
              string of the URI of a document.  If successful, the return value is true, else, it
              is false.  False is returned when no document corresponds to the specified URI.

       The function `odoutbyid' is used in order to delete a document specified by an ID number.

       int odoutbyid(ODEUM *odeum, int id);
              `odeum'  specifies  a database handle connected as a writer.  `id' specifies the ID
              number of a document.  If successful, the return value is true, else, it is  false.
              False is returned when no document corresponds to the specified ID number.

       The function `odget' is used in order to retrieve a document specified by a URI.

       ODDOC *odget(ODEUM *odeum, const char *uri);
              `odeum'  specifies  a  database handle.  `uri' specifies the string of the URI of a
              document.  If successful, the return value  is  the  handle  of  the  corresponding
              document,  else,  it is `NULL'.  `NULL' is returned when no document corresponds to
              the specified URI.  Because the handle of the  return  value  is  opened  with  the
              function `oddocopen', it should be closed with the function `oddocclose'.

       The function `odgetbyid' is used in order to retrieve a document by an ID number.

       ODDOC *odgetbyid(ODEUM *odeum, int id);
              `odeum'  specifies  a database handle.  `id' specifies the ID number of a document.
              If successful, the return value is the handle of the corresponding document,  else,
              it  is `NULL'.  `NULL' is returned when no document corresponds to the specified ID
              number.  Because the handle of  the  return  value  is  opened  with  the  function
              `oddocopen', it should be closed with the function `oddocclose'.

       The  function `odgetidbyuri' is used in order to retrieve the ID of the document specified
       by a URI.

       int odgetidbyuri(ODEUM *odeum, const char *uri);
              `odeum' specifies a database handle.  `uri' specifies the string of the URI of a
              document.   If successful, the return value is the ID number of the document, else,
              it is -1.  -1 is returned when no document corresponds to the specified URI.

       The function `odcheck' is used in order to check whether the document specified by  an  ID
       number exists.

       int odcheck(ODEUM *odeum, int id);
              `odeum'  specifies  a database handle.  `id' specifies the ID number of a document.
              The return value is true if the document exists, else, it is false.

       The function `odsearch' is used in order  to  search  the  inverted  index  for  documents
       including a particular word.

       ODPAIR *odsearch(ODEUM *odeum, const char *word, int max, int *np);
              `odeum' specifies a database handle.  `word' specifies the word to search for.
              `max' specifies the maximum number of documents to be retrieved.  `np' specifies
              the pointer to a variable to which the number of elements of the return value is
              assigned.  If successful, the return value is the pointer to an array, else, it is
              `NULL'.  Each element of the array is a pair of the ID number and the score of a
              document, and the elements are sorted in descending order of their scores.  If no
              document corresponds to the specified word, it is not an error; a dummy array is
              returned.  Because the region of the return value is allocated with the `malloc'
              call, it should be released with the `free' call when it is no longer in use.
              Note that an element of the returned array may refer to a deleted document.
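
              Because a result may still list deleted documents, an application may want to
              drop stale entries before presenting them.  The sketch below shows one way to
              compact such an array in place.  `Pair', `ExistsFn', `pairs_filter_live' and
              `exists_in_list' are hypothetical names, not part of QDBM; in a real program the
              callback would presumably wrap `odcheck', while here it consults a
              zero-terminated list of live IDs so that the example stands alone.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int id; int score; } Pair;  /* hypothetical mirror of ODPAIR */

/* Callback answering "does document `id' still exist?". */
typedef int (*ExistsFn)(int id, void *ctx);

/* Compact `pairs' in place, keeping only live documents and preserving
   their order.  Returns the new number of elements. */
int pairs_filter_live(Pair *pairs, int pnum, ExistsFn exists, void *ctx){
  int i, kept = 0;
  assert(pairs != NULL && pnum >= 0 && exists != NULL);
  for(i = 0; i < pnum; i++){
    if(exists(pairs[i].id, ctx)) pairs[kept++] = pairs[i];
  }
  return kept;
}

/* Stand-in existence check for this sketch: `ctx' points to a
   zero-terminated list of live document IDs. */
int exists_in_list(int id, void *ctx){
  const int *live = ctx;
  for(; *live != 0; live++){
    if(*live == id) return 1;
  }
  return 0;
}
```

              Since the filter keeps the relative order of the surviving elements, an array
              that was sorted by score stays sorted.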

       The function `odsearchdnum' is used in order to get the number of documents including a
       word.

       int odsearchdnum(ODEUM *odeum, const char *word);
              `odeum'  specifies  a  database  handle.   `word'  specifies  a searching word.  If
              successful, the return value is the number of documents including the  word,  else,
              it is -1.  Because this function does not read the entity of the inverted index, it
              is faster than `odsearch'.

       The function `oditerinit' is used in order  to  initialize  the  iterator  of  a  database
       handle.

       int oditerinit(ODEUM *odeum);
              `odeum'  specifies  a  database  handle.   If successful, the return value is true,
              else, it is false.  The iterator is used in order to access every  document  stored
              in a database.

       The function `oditernext' is used in order to get the next document of the iterator.

       ODDOC *oditernext(ODEUM *odeum);
              `odeum' specifies a database handle.  If successful, the return value is the handle
              of the next document, else, it is `NULL'.  `NULL' is returned when no document
              remains to be fetched from the iterator.  It is possible to access every document
              by calling this function repeatedly.  However, complete traversal is not assured
              if the database is updated during the iteration.  Besides, the order of this
              traversal is arbitrary, so it is not assured that the order of storing matches
              the order of traversal.  Because the handle of the return value is opened with
              the function `oddocopen', it should be closed with the function `oddocclose'.

       The function `odsync' is used in order to synchronize updated contents with the files and
       the devices.

       int odsync(ODEUM *odeum);
              `odeum'  specifies  a  database  handle  connected as a writer.  If successful, the
              return value is true, else, it is false.  This  function  is  useful  when  another
              process uses the connected database directory.

       The function `odoptimize' is used in order to optimize a database.

       int odoptimize(ODEUM *odeum);
              `odeum'  specifies  a  database  handle  connected as a writer.  If successful, the
              return value is true, else, it is false.  Elements of the deleted documents in  the
              inverted index are purged.

       The function `odname' is used in order to get the name of a database.

       char *odname(ODEUM *odeum);
              `odeum'  specifies  a  database  handle.   If  successful,  the return value is the
              pointer to the region of the name of the database, else, it is `NULL'.  Because the
              region  of  the  return  value  is  allocated  with the `malloc' call, it should be
              released with the `free' call if it is no longer in use.

       The function `odfsiz' is used in order to get the total size of database files.

       double odfsiz(ODEUM *odeum);
              `odeum' specifies a database handle.  If successful, the return value is the  total
              size of the database files, else, it is -1.0.

       The  function  `odbnum'  is  used  in order to get the total number of the elements of the
       bucket arrays in the inverted index.

       int odbnum(ODEUM *odeum);
              `odeum' specifies a database handle.  If successful, the return value is the  total
              number of the elements of the bucket arrays, else, it is -1.

       The  function `odbusenum' is used in order to get the total number of the used elements of
       the bucket arrays in the inverted index.

       int odbusenum(ODEUM *odeum);
              `odeum' specifies a database handle.  If successful, the return value is the  total
              number of the used elements of the bucket arrays, else, it is -1.

       The  function  `oddnum'  is  used  in order to get the number of the documents stored in a
       database.

       int oddnum(ODEUM *odeum);
              `odeum' specifies a database handle.  If successful, the return value is the number
              of the documents stored in the database, else, it is -1.

       The  function  `odwnum'  is  used  in  order  to  get  the number of the words stored in a
       database.

       int odwnum(ODEUM *odeum);
              `odeum' specifies a database handle.  If successful, the return value is the number
              of the words stored in the database, else, it is -1.  Because of the I/O buffer,
              the return value may be less than the actual number.

       The function `odwritable' is used in order to check whether a database handle is a  writer
       or not.

       int odwritable(ODEUM *odeum);
              `odeum'  specifies  a database handle.  The return value is true if the handle is a
              writer, false if not.

       The function `odfatalerror' is used in order to check whether a database has a fatal error
       or not.

       int odfatalerror(ODEUM *odeum);
              `odeum'  specifies a database handle.  The return value is true if the database has
              a fatal error, false if not.

       The function `odinode' is used in order to get the inode number of a database directory.

       int odinode(ODEUM *odeum);
              `odeum' specifies a database handle.  The return value is the inode number  of  the
              database directory.

       The function `odmtime' is used in order to get the last modified time of a database.

       time_t odmtime(ODEUM *odeum);
              `odeum' specifies a database handle.  The return value is the last modified time of
              the database.

       The function `odmerge' is used in order to merge multiple database directories.

       int odmerge(const char *name, const CBLIST *elemnames);
              `name' specifies the name of a database directory to create.  `elemnames' specifies
              a list of names of element databases.  If successful, the return value is true,
              else, it is false.  If two or more documents with the same URI come in, the first
              one is adopted and the others are ignored.

       The function `odremove' is used in order to remove a database directory.

       int odremove(const char *name);
              `name' specifies the name of a database directory.  If successful, the return value
              is true, else, it is false.  A database directory can contain databases of other
              APIs of QDBM; they are also removed by this function.

       The function `oddocopen' is used in order to get a document handle.

       ODDOC *oddocopen(const char *uri);
              `uri' specifies the URI of a document.  The return value is a document handle.  The
              ID number of a new document is not defined.  It is defined  when  the  document  is
              stored in a database.

       The function `oddocclose' is used in order to close a document handle.

       void oddocclose(ODDOC *doc);
              `doc'  specifies  a  document  handle.   Because  the  region of a closed handle is
              released, it becomes impossible to use the handle.

       The function `oddocaddattr' is used in order to add an attribute to a document.

       void oddocaddattr(ODDOC *doc, const char *name, const char *value);
              `doc' specifies a document handle.  `name' specifies the string of the name  of  an
              attribute.  `value' specifies the string of the value of the attribute.

       The function `oddocaddword' is used in order to add a word to a document.

       void oddocaddword(ODDOC *doc, const char *normal, const char *asis);
              `doc' specifies a document handle.  `normal' specifies the string of the normalized
              form of a word.  Normalized forms are treated as keys of the  inverted  index.   If
              the  normalized form of a word is an empty string, the word is not reflected in the
              inverted index.  `asis' specifies the string of the appearance form  of  the  word.
              Appearance forms are used after the document is retrieved by an application.

       The function `oddocid' is used in order to get the ID number of a document.

       int oddocid(const ODDOC *doc);
              `doc'  specifies  a  document  handle.   The  return  value  is  the ID number of a
              document.

       The function `oddocuri' is used in order to get the URI of a document.

       const char *oddocuri(const ODDOC *doc);
              `doc' specifies a document handle.  The return value is the string of the URI of  a
              document.

       The  function  `oddocgetattr'  is  used  in  order  to  get the value of an attribute of a
       document.

       const char *oddocgetattr(const ODDOC *doc, const char *name);
              `doc' specifies a document handle.  `name' specifies the string of the name  of  an
              attribute.  The return value is the string of the value of the attribute, or `NULL'
              if no attribute corresponds.

       The function `oddocnwords' is used in order to get the list handle containing the words
       of a document in normalized form.

       const CBLIST *oddocnwords(const ODDOC *doc);
              `doc' specifies a document handle.  The return value is the list handle containing
              words in normalized form.

       The function `oddocawords' is used in order to get the list handle containing the words
       of a document in appearance form.

       const CBLIST *oddocawords(const ODDOC *doc);
              `doc' specifies a document handle.  The return value is the list handle containing
              words in appearance form.

       The function `oddocscores' is used in order to get the map handle containing keywords in
       normalized form and their scores.

       CBMAP *oddocscores(const ODDOC *doc, int max, ODEUM *odeum);
              `doc' specifies a document handle.  `max' specifies the maximum number of keywords
              to get.  `odeum' specifies a database handle with which the IDF for weighting is
              calculated.  If it is `NULL', it is not used.  The return value is the map handle
              containing keywords and their scores.  Scores are expressed as decimal strings.
              Because the handle of the return value is opened with the function `cbmapopen', it
              should be closed with the function `cbmapclose' when it is no longer in use.

       The function `odbreaktext' is used in order to break a text into words in appearance form.

       CBLIST *odbreaktext(const char *text);
              `text' specifies the string of a text.  The return value is the list handle
              containing words in appearance form.  Words are separated by space characters and
              by delimiters such as periods and commas.  Because the handle of the return value
              is opened with the function `cblistopen', it should be closed with the function
              `cblistclose' when it is no longer in use.

       The function `odnormalizeword' is used in order to make the normalized form of a word.

       char *odnormalizeword(const char *asis);
              `asis' specifies the string of the appearance form of a word.  The return value is
              the string of the normalized form of the word.  ASCII letters are converted to
              lower case.  Words composed only of delimiters are treated as empty strings.
              Because the region of the return value is allocated with the `malloc' call, it
              should be released with the `free' call when it is no longer in use.
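
              The normalization rules stated above can be imitated with a few lines of
              standard C.  This is only a sketch: `normalize_word' and the `DELIMS' set are
              assumptions made for illustration, and the delimiter set that Odeum actually
              uses may differ.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Assumed delimiter set for this sketch only; Odeum's real set may differ. */
static const char *DELIMS = ".,:;!?\"'()[]{}<>&|";

/* Return a malloc'ed normalized copy of `asis': ASCII letters are lowered,
   and a word made only of delimiters becomes the empty string. */
char *normalize_word(const char *asis){
  size_t i, len;
  int only_delims = 1;
  char *res;
  assert(asis != NULL);
  len = strlen(asis);
  res = malloc(len + 1);
  if(!res) return NULL;
  for(i = 0; i < len; i++){
    unsigned char c = (unsigned char)asis[i];
    if(!strchr(DELIMS, c)) only_delims = 0;
    /* bytes outside ASCII are copied unchanged */
    res[i] = (c < 0x80) ? (char)tolower(c) : (char)c;
  }
  res[len] = '\0';
  if(only_delims) res[0] = '\0';
  return res;
}
```

              As with `odnormalizeword', the caller owns the returned string and releases it
              with `free'.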

       The function `odpairsand' is used in order to get the  common  elements  of  two  sets  of
       documents.

       ODPAIR *odpairsand(ODPAIR *apairs, int anum, ODPAIR *bpairs, int bnum, int *np);
              `apairs'  specifies the pointer to the former document array.  `anum' specifies the
              number of the elements of  the  former  document  array.   `bpairs'  specifies  the
              pointer  to the latter document array.  `bnum' specifies the number of the elements
              of the latter document array.  `np' specifies the pointer to a  variable  to  which
              the  number  of  the elements of the return value is assigned.  The return value is
              the pointer to a new document array whose elements commonly belong to the specified
              two  sets.   Elements  of the array are sorted in descending order of their scores.
              Because the region of the return value is allocated  with  the  `malloc'  call,  it
              should be released with the `free' call if it is no longer in use.

       The function `odpairsor' is used in order to get the union of two sets of documents.

       ODPAIR *odpairsor(ODPAIR *apairs, int anum, ODPAIR *bpairs, int bnum, int *np);
              `apairs' specifies the pointer to the former document array.  `anum' specifies  the
              number  of  the  elements  of  the  former  document array.  `bpairs' specifies the
              pointer to the latter document array.  `bnum' specifies the number of the  elements
              of  the  latter  document array.  `np' specifies the pointer to a variable to which
              the number of the elements of the return value is assigned.  The  return  value  is
              the  pointer to a new document array whose elements belong to both or either of the
              specified two sets.  Elements of the array are sorted in descending order of  their
              scores.   Because  the  region  of  the return value is allocated with the `malloc'
              call, it should be released with the `free' call if it is no longer in use.

       The function `odpairsnotand' is used in order to get the difference set of documents.

       ODPAIR *odpairsnotand(ODPAIR *apairs, int anum, ODPAIR *bpairs, int bnum, int *np);
              `apairs' specifies the pointer to the former document array.  `anum' specifies the
              number of the elements of the former document array.  `bpairs' specifies the
              pointer to the latter document array.  `bnum' specifies the number of the elements
              of the latter document array.  `np' specifies the pointer to a variable to which
              the number of the elements of the return value is assigned.  The return value is
              the pointer to a new document array whose elements belong to the former set but
              not to the latter set.  Elements of the array are sorted in descending order of
              their scores.  Because the region of the return value is allocated with the
              `malloc' call, it should be released with the `free' call when it is no longer in
              use.
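
              The three set operations described above can be pictured with the following
              sketch.  It is not the library's implementation: `Pair' and `pairs_combine' are
              hypothetical, the linear scans are chosen for brevity rather than speed, and
              summing the scores of a document found in both sets is an assumption, since this
              page does not say how scores are combined.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { int id; int score; } Pair;  /* hypothetical mirror of ODPAIR */

/* Return the score of `id' in `pairs', or -1 if it is absent.
   The sketch assumes that real scores are never negative. */
static int find_score(const Pair *pairs, int num, int id){
  int i;
  for(i = 0; i < num; i++){
    if(pairs[i].id == id) return pairs[i].score;
  }
  return -1;
}

/* qsort comparator: higher score first. */
static int cmp_desc(const void *a, const void *b){
  int as = ((const Pair *)a)->score, bs = ((const Pair *)b)->score;
  return (as < bs) - (as > bs);
}

/* mode 'a': intersection, 'o': union, 'n': elements of A that are not in B.
   Scores of documents found in both sets are summed (an assumption). */
Pair *pairs_combine(const Pair *ap, int anum, const Pair *bp, int bnum,
                    int mode, int *np){
  Pair *res;
  int i, s, rnum = 0;
  assert(np != NULL && anum >= 0 && bnum >= 0);
  res = malloc(sizeof(Pair) * (size_t)(anum + bnum) + 1);
  if(!res) return NULL;
  for(i = 0; i < anum; i++){
    s = find_score(bp, bnum, ap[i].id);
    if(mode == 'a' && s < 0) continue;    /* AND keeps only shared IDs */
    if(mode == 'n' && s >= 0) continue;   /* NOTAND drops shared IDs */
    res[rnum].id = ap[i].id;
    res[rnum].score = ap[i].score + (s >= 0 ? s : 0);
    rnum++;
  }
  if(mode == 'o'){                        /* OR also adds IDs found only in B */
    for(i = 0; i < bnum; i++){
      if(find_score(ap, anum, bp[i].id) >= 0) continue;
      res[rnum++] = bp[i];
    }
  }
  qsort(res, (size_t)rnum, sizeof(Pair), cmp_desc);
  *np = rnum;
  return res;
}
```

              Like the functions it imitates, the sketch returns a newly allocated array sorted
              in descending order of score, stores its length through `np', and leaves the
              input arrays untouched.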

       The function `odpairssort' is used in order to sort a set of documents in descending order
       of scores.

       void odpairssort(ODPAIR *pairs, int pnum);
              `pairs' specifies the pointer to a document array.  `pnum' specifies the number  of
              the elements of the document array.

       The function `odlogarithm' is used in order to get the natural logarithm of a number.

       double odlogarithm(double x);
              `x'  specifies  a number.  The return value is the natural logarithm of the number.
              If the number is equal to or less than 1.0, the return value is 0.0.  This function
              is useful when an application calculates the IDF of search results.

       The  function  `odvectorcosine'  is  used  in  order to get the cosine of the angle of two
       vectors.

       double odvectorcosine(const int *avec, const int *bvec, int vnum);
              `avec' specifies the pointer to one array of numbers.  `bvec' specifies the pointer
              to  the  other  array  of numbers.  `vnum' specifies the number of elements of each
              array.  The return value is the cosine of the angle of two vectors.  This  function
              is useful when an application calculates similarity of documents.

       The function `odsettuning' is used in order to set the global tuning parameters.

       void odsettuning(int ibnum, int idnum, int cbnum, int csiz);
              `ibnum' specifies the number of buckets for inverted indexes.  `idnum' specifies
              the division number of the inverted index.  `cbnum' specifies the number of
              buckets for dirty buffers.  `csiz' specifies the maximum number of bytes of memory
              to use for dirty buffers.  The default setting is equivalent to
              `odsettuning(32749, 7, 262139, 8388608)'.  This function should be called before
              opening a handle.

       The function `odanalyzetext' is used in order to break a text into words and store their
       appearance forms and normalized forms into lists.

       void odanalyzetext(ODEUM *odeum, const char *text, CBLIST *awords, CBLIST *nwords);
              `odeum' specifies a database handle.  `text' specifies the string of a text.
              `awords' specifies a list handle into which appearance forms are stored.  `nwords'
              specifies a list handle into which normalized forms are stored.  If it is `NULL',
              it is ignored.  Words are separated by space characters and by delimiters such as
              periods and commas.

       The function `odsetcharclass' is used in order to set the classes of  characters  used  by
       `odanalyzetext'.

       void odsetcharclass(ODEUM *odeum, const char *spacechars, const char *delimchars,
                           const char *gluechars);
              `odeum' specifies a database handle.  `spacechars' specifies a string containing
              space characters.  `delimchars' specifies a string containing delimiter
              characters.  `gluechars' specifies a string containing glue characters.

       The function `odquery' is used in order to query a database using a  small  boolean  query
       language.

       ODPAIR *odquery(ODEUM *odeum, const char *query, int *np, CBLIST *errors);
              `odeum' specifies a database handle.  `query' specifies the text of the query.
              `np' specifies the pointer to a variable to which the number of the elements of the
              return value is assigned.  `errors' specifies a list handle into which error
              messages are stored.  If it is `NULL', it is ignored.  If successful, the return
              value is the pointer to an array, else, it is `NULL'.  Each element of the array is
              a pair of the ID number and the score of a document, and the elements are sorted
              in descending order of their scores.  If no document corresponds to the specified
              condition, it is not an error; a dummy array is returned.  Because the region of
              the return value is allocated with the `malloc' call, it should be released with
              the `free' call when it is no longer in use.  Note that an element of the returned
              array may refer to a deleted document.

       If  QDBM  was built with POSIX thread enabled, the global variable `dpecode' is treated as
       thread specific data, and functions of Odeum  are  reentrant.   In  that  case,  they  are
       thread-safe  as  long  as  a  handle  is  not accessed by threads at the same time, on the
       assumption that `errno', `malloc', and so on are thread-safe.

       If QDBM was built with ZLIB enabled, records in the database for document attributes are
       compressed.  In that case, the size of the database is reduced to 30% or less.  Thus, you
       should enable ZLIB if you use Odeum.  A database of Odeum created without ZLIB enabled
       cannot be used in an environment with ZLIB enabled, and vice versa.  If LZO was enabled
       instead of ZLIB, LZO is used for compression.

       The query language of the function `odquery' is a basic language following this grammar:

              expr ::= subexpr ( op subexpr )*
              subexpr ::= WORD
              subexpr ::= LPAREN expr RPAREN

       Operators are "&" (AND), "|" (OR), and "!" (NOTAND).  You can use parentheses to group
       sub-expressions together in order to change the order of operations.  The given query is
       broken up using the function `odanalyzetext', so if you want  to  specify  different  text
       breaking  rules,  then  make  sure that you at least set "&", "|", "!", "(", and ")" to be
       delimiter characters.  Consecutive words are treated as having an  implicit  "&"  operator
       between them, so "zed shaw" is actually "zed & shaw".
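
       To make the grammar concrete, the sketch below evaluates such queries over a toy index
       in which every word maps to a bitmask of document IDs.  It is an illustration, not the
       parser used by `odquery': the functions `lookup' and `eval_query' and the sample words
       are made up, and the strict left-to-right evaluation without operator precedence is an
       assumption drawn from the flat form of the grammar.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Toy "index": each known word maps to a bitmask of document IDs 0..31.
   The words and masks are made up for this sketch. */
static unsigned lookup(const char *w, size_t len){
  static const struct { const char *word; unsigned docs; } table[] = {
    { "zed", 0x3u },    /* documents 0 and 1 */
    { "shaw", 0x6u },   /* documents 1 and 2 */
    { "qdbm", 0x8u }    /* document 3 */
  };
  size_t i;
  for(i = 0; i < sizeof(table) / sizeof(table[0]); i++){
    if(strlen(table[i].word) == len && memcmp(table[i].word, w, len) == 0)
      return table[i].docs;
  }
  return 0;
}

static const char *skip_spaces(const char *p){
  while(*p && isspace((unsigned char)*p)) p++;
  return p;
}

static unsigned parse_expr(const char **pp);

/* subexpr ::= WORD | LPAREN expr RPAREN */
static unsigned parse_subexpr(const char **pp){
  const char *p = skip_spaces(*pp), *start;
  unsigned docs;
  if(*p == '('){
    p++;
    docs = parse_expr(&p);
    p = skip_spaces(p);
    if(*p == ')') p++;           /* a missing ")" is tolerated in this sketch */
    *pp = p;
    return docs;
  }
  start = p;
  while(*p && !isspace((unsigned char)*p) && !strchr("&|!()", *p)) p++;
  *pp = p;
  return lookup(start, (size_t)(p - start));
}

/* expr ::= subexpr ( op subexpr )*, evaluated left to right. */
static unsigned parse_expr(const char **pp){
  unsigned left = parse_subexpr(pp), right;
  for(;;){
    const char *p = skip_spaces(*pp);
    char op = '&';               /* implicit AND between adjacent words */
    if(*p == '\0' || *p == ')') break;
    if(*p == '&' || *p == '|' || *p == '!') op = *p++;
    *pp = p;
    right = parse_subexpr(pp);
    if(op == '&') left &= right;
    else if(op == '|') left |= right;
    else left &= ~right;         /* "!" is NOTAND: in left but not in right */
  }
  return left;
}

/* Evaluate a query and return the bitmask of matching document IDs. */
unsigned eval_query(const char *query){
  assert(query != NULL);
  return parse_expr(&query);
}
```

       With this toy table, "zed shaw" and "zed & shaw" select the same documents, and
       "(zed | shaw) ! zed" selects the documents that contain "shaw" but not "zed".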

       The encoding of the query text should be the same as the encoding of the target
       documents.  Moreover, each space character, delimiter character, and glue character
       should be a single byte.

SEE ALSO

       qdbm(3), depot(3), curia(3), relic(3), hovel(3), cabin(3), villa(3), ndbm(3), gdbm(3)