Provided by: link-grammar_5.12.0~dfsg-2_amd64

NAME

       link-parser - parse natural language sentences using Link Grammar

SYNOPSIS

       link-parser --help
       link-parser --version
       link-parser [language|dict_location] [--quiet] [-<special_"!"_command>...]

DESCRIPTION

       link-parser  is  the  command-line  wrapper  to  the link-grammar natural language parsing
       library.  This library  will  parse  sentences  written  in  English,  Russian  and  other
       languages,  generating  linkage trees showing relationships between the subject, the verb,
       and various adjectives, adverbs, etc. in the sentence.

EXAMPLE

       link-parser

       Starts the parser interactive shell.  Enter any sentence to parse:

       linkparser> Reading a man page is informative.

       Found 18 linkages (18 had no P.P. violations)
               Linkage 1, cost vector = (UNUSED=0 DIS= 0.00 LEN=16)

           +------------------------Xp------------------------+
           +--------------->WV-------------->+                |
           |         +----------Ss*g---------+                |
           |         +--------Os-------+     |                |
           |         |     +---Ds**x---+     |                |
           +--->Wd---+     +-PHc+---A--+     +---Pa---+       |
           |         |     |    |      |     |        |       |
       LEFT-WALL reading.g a man.ij page.n is.v informative.a .

BACKGROUND

       The link-parser command-line tool is useful for general exploration and use,  although  it
       is  presumed  that,  for  the  parsing  of large quantities of text, a custom application,
       making use of the link-grammar library, will be written.  Several  such  applications  are
       described  on  the  Link  Grammar web page (see SEE ALSO below); these include the AbiWord
       grammar checker, and the RelEx semantic relation extractor.

       The theory of Link Grammar is explained in many academic papers.  In the first  of  these,
       Daniel  Sleator  and  Davy  Temperley,  "Parsing  English with a Link Grammar" (1991), the
       authors defined a new formal grammatical system called a "link  grammar".  A  sequence  of
       words is in the language of a link grammar if there is a way to draw "links" between words
       in such a way that the local requirements of each word are satisfied,  the  links  do  not
       cross,  and  the  words  form  a  consistent  connected graph. The authors encoded English
       grammar into such a system, and wrote link-parser to parse English using this grammar.
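
       The no-crossing condition can be illustrated with a short shell sketch.  It is not how
       link-parser itself checks linkages; the function name links_cross and the "i-j" notation
       for a link between word positions i and j are assumptions made for this illustration.
       The call at the end uses the links read off the linkage shown in EXAMPLE, counting
       LEFT-WALL as position 0.

```shell
# Hypothetical helper: succeed (status 0) if any two links cross.
# Each argument is a link "i-j" between word positions i < j.
# Two links (a,b) and (c,d) cross when a < c < b < d.
links_cross() {
    for first in "$@"; do
        a=${first%-*}; b=${first#*-}
        for second in "$@"; do
            c=${second%-*}; d=${second#*-}
            if [ "$a" -lt "$c" ] && [ "$c" -lt "$b" ] && [ "$b" -lt "$d" ]; then
                return 0
            fi
        done
    done
    return 1
}

# Links of the EXAMPLE linkage: Xp WV Ss*g Os Ds**x Wd PHc A Pa.
links_cross 0-7 0-5 1-5 1-4 2-4 0-1 2-3 3-4 5-6 || echo "no crossing links"
```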

       The engine that performs the parsing  is  separate  from  the  dictionaries  describing  a
       language.   Currently, the most fully developed, complete dictionaries are for the English
       and Russian languages, although experimental, incomplete dictionaries exist for German and
       eight other languages.

OVERVIEW

       link-parser,  when  invoked  manually,  starts an interactive shell, taking control of the
       terminal.  Any lines beginning with an exclamation mark  are  assumed  to  be  a  "special
       command";  these  are  described  below.   The  command  !help will provide more info; the
       command !variables will print  all  of  the  special  commands.   These  are  also  called
       "variables",  as  almost  all  commands  have  a  value  associated with them: the command
       typically enables or disables some function, or alters some multi-valued setting.

       All other input is treated as a single, English-language sentence; it is parsed,  and  the
       result of the parse is printed.  The variables control what is printed: by default, an
       ASCII-graphics linkage is printed, although PostScript output is also possible.  The
       printing  of  the  constituent tree can also be enabled. Other output controls include the
       printing of disjuncts and complete link data.

       In order to analyze sentences, link-parser depends on  a  link-grammar  dictionary.   This
       contains  lists  of  words and associated metadata about their grammatical properties.  An
       English language dictionary is provided by default.  If other  language  dictionaries  are
       installed in the default search locations, these may be explicitly specified by means of a
       2-letter ISO language code.  For example:

           link-parser de

       will start the parser shell with the German dictionary.

       Alternatively, the dictionary location can be specified explicitly with either an absolute
       or a relative file path.  For example:

           link-parser /usr/share/link-grammar/en

       will  run  link-parser  using  the  English  dictionary  located  in  the  typical install
       directory.

       link-parser can also be used in a non-interactive mode ("batch mode") via the -batch
       option (a special command; see below).  For example:

           cat thesis.txt | link-parser -batch

       will read lines from the file thesis.txt, processing each one as a complete sentence. For
       sentences that don't have a full parse, it will print

           +++++ error N

       (where N is a number) to the standard output.

       Alternatively, an input file may be specified with the !file filename special command,
       described below.

       Note that using "batch mode" disables the usual ASCII-graphics linkage printing. The input
       sentences also don't appear by default on stdout.  These features may  be  re-enabled  via
       special commands; special commands may be interspersed with the input.

       Instead of specifying -batch on the command line, !batch can be specified in the input
       file itself.

       For more details, use !help batch in link-parser's interactive shell.
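
       As a rough sketch of how the batch-mode output might be post-processed, the following
       shell function counts the "+++++ error N" lines described above.  The function name
       count_parse_errors is hypothetical, and the pattern assumes that each error marker
       starts its own output line, possibly after leading whitespace.

```shell
# Hypothetical helper: count the "+++++ error N" lines in batch-mode
# output read from standard input.
# Assumption: each marker begins a line, optionally preceded by whitespace.
count_parse_errors() {
    awk '/^[ \t]*\+\+\+\+\+ error/ { n++ } END { print n + 0 }'
}

# Possible usage, given an installed link-parser and a file thesis.txt:
#   link-parser -batch < thesis.txt | count_parse_errors
```

       The count is only as reliable as the assumption about the marker format; sentences in
       the input that themselves begin with "+++++ error" would be miscounted if echoed.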

OPTIONS

       --help Print usage and exit.

       --version
              Print program version and configuration details, and exit.

       --quiet
              Suppress the version messages on startup.

   Special ! commands
       The special "!" commands can be specified as command-line options, or within the
       interactive shell itself by prefixing them with "!" at the start of a line.  The full
       option name does not need to be used; only enough letters to make the option  unique  must
       be specified.

       When specified as a command-line option, a special command is preceded by "-" instead of
       "!".

       Boolean variables may be toggled simply by giving the !varname, for example, !batch.
       Setting other variables requires using an equals sign: !varname=value, for example,
       !width=100.
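
       The correspondence between the two spellings can be sketched as a small shell function
       that rewrites interactive-shell commands into command-line options.  The name
       bang_to_options is hypothetical; the sketch only swaps the leading "!" for "-" and does
       not check that the command name is one link-parser knows.

```shell
# Hypothetical helper: rewrite "!name" or "!name=value" as "-name" or
# "-name=value", one option per output line.
bang_to_options() {
    for cmd in "$@"; do
        case "$cmd" in
            '!'?*) printf '%s\n' "-${cmd#?}" ;;
            *)     printf 'not a special command: %s\n' "$cmd" >&2
                   return 1 ;;
        esac
    done
}

# Possible usage: show the options matching two interactive commands.
bang_to_options '!batch' '!width=100'
```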

       The !help command prints general help. When issued from the interactive shell, it can take
       an argument, usually a special command.  The !variables command prints all of the current
       variable settings.  The !file command reads input  from  its  argument  file.   The  !file
       command is not a variable; it cannot be set.  It can be used repeatedly.

       The !exit command instructs link-parser to exit.

       The  exclamation  mark  "!"  is  also  a  special  command  by itself, used to inspect the
       dictionary entry for any given word (optionally terminated  by  a  subscript).   Thus  two
       exclamation  marks are needed before such a word when doing so from the interactive shell.
       The wildcard character "*" can be specified as the last character of the word in order  to
       find multiple matches.

       Default values of the special commands below are shown in parentheses. Most of them are
       the defaults of the link-grammar library.
       Boolean default values are shown as on (1) or off (0).

       !bad (off)
              Enable display of bad linkages.

       !batch (off)
              Enable batch mode.

       !constituents (0)
              Generate constituent output. Its value may be:

              0      Disabled

              1      Treebank-style constituent tree

              2      Flat, bracketed tree [A like [B this B] A]

              3      Flat, treebank-style tree (A like (B this))

       !cost-max (2.7)
              Largest cost to be considered.

       !dialect (no value)
              Use the specified (comma-separated) names.
              They modify the  disjunct  cost  of  dictionary  words  whose  expressions  contain
              symbolic cost specifications.

       !disjuncts (off)
              Enable display of the disjuncts used.

       !echo (off)
              Echo input sentence.

       !graphics (on)
              Enable  graphical  display  of  linkage.  For each linkage, the sentence is printed
              along with a graphical representation of its linkage above it.

              The following notations are used for words in the sentence:

              [word] A word with no linkage.

              word[?].x
                     An unknown word whose POS category x has been found by the parser.

              word[!]
                     An unknown word whose link-grammar dictionary entry has been assigned by a
                     RegEx.  (Use !morphology=1 to see that dictionary entry.)

              word[~]
                     The unknown word in this position has been replaced, through a spell guess,
                     by this word, which is found in the link-grammar dictionary.

              word[&]
                     This word is a part of an unknown word which has been found  to  consist  of
                     two or more words that are in the link-grammar dictionary.

              word.POS
                     This word was found in the dictionary as word.POS.

              word.#CORRECTION
                     This word is probably a typo; it was linked as the alternative word CORRECTION.

              For dictionaries that support morphology (enable with !morphology=1):

              word=  A prefix morpheme

              =word  A suffix morpheme

              word.= A stem
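
              The notations above can be told apart mechanically.  The following is a minimal
              shell sketch, not part of link-parser: the function name classify_token and the
              category labels it prints are hypothetical, and it treats any single-character
              token (such as the final period) as plain.

```shell
# Hypothetical helper: name the notation used by one token taken from
# the sentence line of a printed linkage.
classify_token() {
    case "$1" in
        ?)        echo "plain" ;;                 # single character, e.g. "."
        \[*\])    echo "no-linkage" ;;            # [word]
        *'[?]'*)  echo "unknown-guessed-pos" ;;   # word[?].x
        *'[!]'*)  echo "unknown-regex" ;;         # word[!]
        *'[~]'*)  echo "spell-guess" ;;           # word[~]
        *'[&]'*)  echo "run-on-part" ;;           # word[&]
        *.\#*)    echo "typo-correction" ;;       # word.#CORRECTION
        *.=)      echo "stem" ;;                  # word.=
        *=)       echo "prefix" ;;                # word=
        =*)       echo "suffix" ;;                # =word
        *.*)      echo "dictionary-subscript" ;;  # word.POS
        *)        echo "plain" ;;
    esac
}
```

              The order of the patterns matters: the stem form word.= must be tested before
              the prefix form word=, and the bracketed markers before the generic word.POS
              form.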

       !islands-ok (on)
              Use null-linked islands.

       !limit (1000)
              Limit the maximum linkages processed.

       !links (off)
              Enable display of complete link data.

       !null (on)
              Allow null links.

       !morphology (off)
              Display word morphology.  When a word matches a RegEx, show the matching dictionary
              entry.

       !panic (on)
              Use "panic mode" if a parse cannot be quickly found.
              The command !panic_variables prints the special variables that  are  used  only  in
              "panic mode".

       !postscript (off)
              Generate PostScript output.

       !short (16)
              Maximum length of short links.

       !spell (7)
              If zero, no spell and run-on corrections of unknown words are performed.
              Else,  use up to this many spell-guesses per unknown word. In that case, the number
              of run-on corrections (word split) of unknown words is not limited.

       !timeout (30)
              Abort parsing after this many seconds.

       !use-sat (off)
              Use Boolean SAT-based parser.

       !verbosity (1)
              Level of detail in output. Some useful values:

              0      No prompt, minimal library messages

              1      Normal verbosity

              2      Show times of the parsing steps

              3      Display some more information messages

              4      Display data file search and locale setup

              5-9    Tokenizer and parser debugging

              10-19  Dictionary debugging

              101    Print all the dictionary connectors, along with their length limit

       !walls (off)
              Display wall words.

       !width (16381)(*)
              The width of the display.
              * When writing to a terminal, this value is set from its width.

       !wordgraph (0)
              Display the wordgraph (word-split graph).

              0      Disabled

              1      Default display

              2      Display parent tokens as subgraphs

              3      Use esoteric display flags as set by !test=wg:FLAGS

FILES

       The following files are per-language, where LL is the 2-letter ISO language code.

       LL/4.0.dict
              The Link Grammar dictionary.

       LL/4.0.affix
              Values of entities used in tokenization.

       LL/4.0.regex
              Regular expressions (see regex(7)) that are used to match tokens not found  in  the
              dictionary.

       LL/4.0.dialect
              Dialect component definitions.

       LL/4.0.knowledge
              Post-processing definitions.

       LL/4.0.constituent-knowledge
              Definitions for producing a constituent tree.

       command-help-LL.txt or command-help-LL-CC.txt
              Help  text  for  the  !help  topic  special "!" command.  If several such files are
              provided, the desired one can be selected by e.g. the LANGUAGE environment variable
              if  it is set to LL or LL-CC (default is en). Currently only command-help-en.txt is
              provided.

       The directory search order for these files is:
              • ./data/../../data/
              •  A custom data directory, as set by the API call dictionary_set_data_dir().
              •  Installation-dependent system data directory (*)

              * This location is displayed as  DICTIONARY_DIR  when  the  --version  argument  is
              provided  to link-parser on the command line.  On Windows it may be relative to the
              location of the link-grammar library DLL; in  that  case  the  actual  location  is
              displayed as "System data directory" when link-parser is invoked with -verbosity=4.
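
       When a dictionary directory is given by path, it can be handy to see which of the
       per-language files listed above it contains.  The following shell sketch reports the
       absent ones.  The function name missing_dict_files is hypothetical, and the sketch does
       not assume that every listed file is required for parsing; it only reports what is not
       there.

```shell
# Hypothetical helper: print the per-language data files named in FILES
# that do not exist in the dictionary directory given as argument.
missing_dict_files() {
    dict_dir=$1
    for name in dict affix regex dialect knowledge constituent-knowledge; do
        [ -e "$dict_dir/4.0.$name" ] || printf '%s\n' "4.0.$name"
    done
}

# Possible usage, with the install directory used earlier in this page:
#   missing_dict_files /usr/share/link-grammar/en
```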

SEE ALSO

       Information on the link-grammar shared-library API and the link types used in the parse is
       available at the AbiWord website ⟨http://www.abisource.com/projects/link-grammar/⟩.

       Peer-reviewed papers explaining Link Grammar can be found at the original CMU site
       ⟨http://www.link.cs.cmu.edu/link/papers⟩.

       The  source  code  of  link-parser  and  the  link-grammar  library  is  located at GitHub
       ⟨https://github.com/opencog/link-grammar⟩.

       The mailing list for Link Grammar discussion is at the link-grammar Google group
       ⟨http://groups.google.com/group/link-grammar?hl=en⟩.

AUTHOR

       link-parser    and   the   link-grammar   library   were   written   by   Daniel   Sleator
       <sleator@cs.cmu.edu>, Davy Temperley <dtemp@theory.esm.rochester.edu>, and  John  Lafferty
       <lafferty@cs.cmu.edu>.

       This  manual page was written by Ken Bloom <kbloom@gmail.com>, for the Debian project, and
       updated by Linas Vepstas <linasvepstas@gmail.com>.