bionic (3) wnsearch.3WN.gz

Provided by: wordnet-dev_3.0-35_amd64 bug

NAME

       findtheinfo,   findtheinfo_ds,  is_defined,  in_wn,  index_lookup,  parse_index,  getindex,  read_synset,
       parse_synset, free_syns, free_synset, free_index, traceptrs_ds, do_trace - functions  for  searching  the
       WordNet database

SYNOPSIS

       #include "wn.h"

       char *findtheinfo(char *searchstr, int pos, int ptr_type, int sense_num);

       SynsetPtr findtheinfo_ds(char *searchstr, int pos, int ptr_type, int sense_num );

       unsigned int is_defined(char *searchstr, int pos);

       unsigned int in_wn(char *searchstr, int pos);

       IndexPtr index_lookup(char *searchstr, int pos);

       IndexPtr parse_index(long offset, int dabase, char *line);

       IndexPtr getindex(char *searchstr, int pos);

       SynsetPtr read_synset(int pos, long synset_offset, char *searchstr);

       SynsetPtr parse_synset(FILE *fp, int pos, char *searchstr);

       void free_syns(SynsetPtr synptr);

       void free_synset(SynsetPtr synptr);

       void free_index(IndexPtr idx);

       SynsetPtr traceptrs_ds(SynsetPtr synptr, int ptr_type, int pos, int depth);

       char *do_trace(SynsetPtr synptr, int ptr_type, int pos, int depth);

DESCRIPTION

       These  functions  are  used  for  searching  the  WordNet  database.   They  generally  fall into several
       categories: functions for reading and parsing index file  entries;  functions  for  reading  and  parsing
       synsets  in  data  files;  functions  for  tracing  pointers and hierarchies; functions for freeing space
       occupied by data structures allocated with malloc(3).

       In the following function descriptions, pos is one of the following:

              1    NOUN
              2    VERB
              3    ADJECTIVE
              4    ADVERB

       findtheinfo() is the primary search algorithm for  use  with  database  interface  applications.   Search
       results  are  automatically formatted, and a pointer to the text buffer is returned.  All searches listed
       in WNHOME/include/wn.h can be done by findtheinfo().  findtheinfo_ds() can be used to perform most of the
       searches,  with results returned in a linked list data structure.  This is for use with applications that
       need to analyze the search results rather than just display them.

       Both functions are passed the same arguments: searchstr is the word or collocation  to  search  for;  pos
       indicates the syntactic category to search in; ptr_type is one of the valid search types for searchstr in
       pos.  (Available searches can be obtained by calling is_defined() described below.)  sense_num should  be
       ALLSENSES if the search is to be done on all senses of searchstr in pos, or a positive integer indicating
       which sense to search.

       findtheinfo_ds() returns a linked list data structures representing synsets.  Senses are  linked  through
       the  nextss  field  of  a Synset data structure.  For each sense, synsets that match the search specified
       with ptr_type are linked through the  ptrlist  field.   See  Synset  Navigation  ,  below,  for  detailed
       information on the linked lists returned.

       is_defined()  sets  a  bit  for  each  search  type  that  is valid for searchstr in pos, and returns the
       resulting unsigned integer.   Each  bit  number  corresponds  to  a  pointer  type  constant  defined  in
       WNHOME/include/wn.h.   For  example,  if bit 2 is set, the HYPERPTR search is valid for searchstr.  There
       are 29 possible searches.

       in_wn() is used to find the syntactic categories in the WordNet database that contain one or more  senses
       of  searchstr.   If  pos  is  ALL_POS, all syntactic categories are checked.  Otherwise, only the part of
       speech passed is checked.  An unsigned integer is returned with a bit set corresponding to each syntactic
       category  containing searchstr.  The bit number matches the number for the part of speech.  0 is returned
       if searchstr is not present in pos.

       index_lookup() finds searchstr in the index file for pos and returns a pointer to the parsed entry in  an
       Index  data  structure.   searchstr must exactly match the form of the word (lower case only, hyphens and
       underscores in the same places) in the index file.  NULL is returned if a match is not found.

       parse_index() parses an entry from an index file and returns a pointer to the parsed entry  in  an  Index
       data  structure.   Passed the byte offset and syntactic category, it reads the index entry at the desired
       location in the corresponding file.  If passed line, line contains an index file entry and  the  database
       index  file is not consulted.  However, offset and dbase should still be passed so the information can be
       stored in the Index structure.

       getindex() is a "smart" search for searchstr in the index file  corresponding  to  pos.   It  applies  to
       searchstr  an algorithm that replaces underscores with hyphens, hyphens with underscores, removes hyphens
       and underscores, and removes periods in an attempt to find a form of the string that is  an  exact  match
       for an entry in the index file corresponding to pos.  index_lookup() is called on each transformed string
       until a match is found or all the different strings have been tried.  It returns a pointer to the  parsed
       Index data structure for searchstr, or NULL if a match is not found.

       read_synset()  is  used  to  read a synset from a byte offset in a data file.  It performs an fseek(3) to
       synset_offset in the data file corresponding to pos, and calls  parse_synset()  to  read  and  parse  the
       synset.  A pointer to the Synset data structure containing the parsed synset is returned.

       parse_synset()  reads the synset at the current offset in the file indicated by fp.  pos is the syntactic
       category, and searchstr, if not NULL, indicates the word in the synset that the caller is interested  in.
       An attempt is made to match searchstr to one of the words in the synset.  If an exact match is found, the
       whichword field in the Synset structure is set to that word's number in the synset  (beginning  to  count
       from 1).

       free_syns()  is used to free a linked list of Synset structures allocated by findtheinfo_ds().  synptr is
       a pointer to the list to free.

       free_synset() frees the Synset structure pointed to by synptr.

       free_index() frees the Index structure pointed to by idx.

       traceptrs_ds() is a recursive search algorithm that traces pointers matching ptr_type starting  with  the
       synset  pointed  to  by  synptr.   Setting depth to 1 when traceptrs_ds() is called indicates a recursive
       search; 0 indicates a non-recursive call.  synptr points to the data structure representing the synset to
       search for a pointer of type ptr_type.  When a pointer type match is found, the synset pointed to is read
       is linked onto the nextss chain.  Levels of the tree generated by a recursive search are linked  via  the
       ptrlist  field  structure until NULL is found, indicating the top (or bottom) of the tree.  This function
       is usually called from findtheinfo_ds() for each sense of the word.  See Synset Navigation ,  below,  for
       detailed information on the linked lists returned.

       do_trace()  performs  the search indicated by ptr_type on synset synptr in syntactic category pos.  depth
       is defined as above.  do_trace() returns the search results formatted in a text buffer.

   Synset Navigation
       Since the Synset structure is used to represent the synsets  for  both  word  senses  and  pointers,  the
       ptrlist  and  nextss fields have different meanings depending on whether the structure is a word sense or
       pointer.  This can make navigation through the lists returned by findtheinfo_ds() confusing.

       Navigation through the returned list involves the following:

       Following the nextss chain from the synset returned moves through the various senses of searchstr.   NULL
       indicates that end of the chain of senses.

       Following  the  ptrlist  chain  from  a Synset structure representing a sense traces the hierarchy of the
       search results for that sense.  Subsequent links in the ptrlist chain indicate  the  next  level  (up  or
       down,  depending  on  the search) in the hierarchy.  NULL indicates the end of the chain of search result
       synsets.

       If a synset pointed to by ptrlist has a value in the nextss field, it represents another pointer  of  the
       same  type at that level in the hierarchy.  For example, some noun synsets have two hypernyms.  Following
       this nextss pointer, and then the ptrlist chain from the Synset structure  pointed  to,  traces  another,
       parallel, hierarchy, until the end is indicated by NULL on that ptrlist chain.  So, a synset representing
       a pointer (versus a sense of searchstr) having a non-NULL value in nextss has  another  chain  of  search
       results linked through the ptrlist chain of the synset pointed to by nextss.

       If  searchstr  contains more than one base form in WordNet (as in the noun axes, which has base forms axe
       and axis), synsets representing the search results for each base form are  linked  through  the  nextform
       pointer of the Synset structure.

   WordNet Searches
       There is no extensive description of what each search type is or the results returned.  Using the WordNet
       interface, examining the source code, and reading wndb(5WN) are the  best  ways  to  see  what  types  of
       searches are available and the data returned for each.

       Listed  below are the valid searches that can be passed as ptr_type to findtheinfo().  Passing a negative
       value (when applicable) causes a recursive, hierarchical search by setting depth to 1 when traceptrs() is
       called.

                   ┌─────────────────┬───────┬─────────┬────────────────────────────────────────────┐
                   │ptr_typeValuePointerSearch                                     │
                   │                 │       │ Symbol  │                                            │
                   ├─────────────────┼───────┼─────────┼────────────────────────────────────────────┤
                   │ANTPTR           │   1   │    !    │ Antonyms                                   │
                   │HYPERPTR         │   2   │    @    │ Hypernyms                                  │
                   │HYPOPTR          │   3   │    ∼    │ Hyponyms                                   │
                   │ENTAILPTR        │   4   │    *    │ Entailment                                 │
                   │SIMPTR           │   5   │    &    │ Similar                                    │
                   │ISMEMBERPTR      │   6   │   #m    │ Member meronym                             │
                   │ISSTUFFPTR       │   7   │   #s    │ Substance meronym                          │
                   │ISPARTPTR        │   8   │   #p    │ Part meronym                               │
                   │HASMEMBERPTR     │   9   │   %m    │ Member holonym                             │
                   │HASSTUFFPTR      │  10   │   %s    │ Substance holonym                          │
                   │HASPARTPTR       │  11   │   %p    │ Part holonym                               │
                   │MERONYM          │  12   │    %    │ All meronyms                               │
                   │HOLONYM          │  13   │    #    │ All holonyms                               │
                   │CAUSETO          │  14   │    >    │ Cause                                      │
                   │PPLPTR           │  15   │    <    │ Participle of verb                         │
                   │SEEALSOPTR       │  16   │    ^    │ Also see                                   │
                   │PERTPTR          │  17   │    \    │ Pertains to noun or derived from adjective │
                   │ATTRIBUTE        │  18   │   \=    │ Attribute                                  │
                   │VERBGROUP        │  19   │    $    │ Verb group                                 │
                   │DERIVATION       │  20   │    +    │ Derivationally related form                │
                   │CLASSIFICATION   │  21   │    ;    │ Domain of synset                           │
                   │CLASS            │  22   │    -    │ Member of this domain                      │
                   │SYNS             │  23   │   n/a   │ Find synonyms                              │
                   │FREQ             │  24   │   n/a   │ Polysemy                                   │
                   │FRAMES           │  25   │   n/a   │ Verb example sentences and generic frames  │
                   │COORDS           │  26   │   n/a   │ Noun coordinates                           │
                   │RELATIVES        │  27   │   n/a   │ Group related senses                       │
                   │HMERONYM         │  28   │   n/a   │ Hierarchical meronym search                │
                   │HHOLONYM         │  29   │   n/a   │ Hierarchical holonym search                │
                   │WNGREP           │  30   │   n/a   │ Find keywords by substring                 │
                   │OVERVIEW         │  31   │   n/a   │ Show all synsets for word                  │
                   │CLASSIF_CATEGORY │  32   │   ;c    │ Show domain topic                          │
                   │CLASSIF_USAGE    │  33   │   ;u    │ Show domain usage                          │
                   │CLASSIF_REGIONAL │  34   │   ;r    │ Show domain region                         │
                   │CLASS_CATEGORY   │  35   │   -c    │ Show domain terms for topic                │
                   │CLASS_USAGE      │  36   │   -u    │ Show domain terms for usage                │
                   │CLASS_REGIONAL   │  37   │   -r    │ Show domain terms for region               │
                   │INSTANCE         │  38   │   @i    │ Instance of                                │
                   │INSTANCES        │  39   │   ∼i    │ Show instances                             │
                   └─────────────────┴───────┴─────────┴────────────────────────────────────────────┘
       findtheinfo_ds() cannot perform the following searches:

              SEEALSOPTR
              PERTPTR
              VERBGROUP
              FREQ
              FRAMES
              RELATIVES
              WNGREP
              OVERVIEW

NOTES

       Applications  that  use WordNet and/or the morphological functions must call wninit() at the start of the
       program.  See wnutil(3WN) for more information.

       In all function calls, searchstr may be either a word or a collocation formed by joining individual words
       with underscore characters (_).

       The  SearchResults  structure defines fields in the wnresults global variable that are set by the various
       search functions.  This is a way to get additional information, such as the number  of  senses  the  word
       has, from the search functions.  The searchds field is set by findtheinfo_ds().

       The pos passed to traceptrs_ds() is not used.

SEE ALSO

       wn(1WN), wnb(1WN), wnintro(3WN), binsrch(3WN), malloc(3), morph(3WN), wnutil(3WN), wnintro(5WN).

WARNINGS

       parse_synset()  must  find  an  exact  match between the searchstr passed and a word in the synset to set
       whichword.  No attempt is made to translate hyphens and underscores, as is done in getindex().

       The WordNet database and exception list files must be opened with  wninit  prior  to  using  any  of  the
       searching functions.

       A large search may cause findtheinfo() to run out of buffer space.  The maximum buffer size is determined
       by computer platform.  If the buffer size is exceeded the following message  is  printed  in  the  output
       buffer: "Search too large.  Narrow search and try again...".

       Passing an invalid pos will probably result in a core dump.