Ubuntu Manpage: hspell - Hebrew spellchecker

NAME

       hspell - Hebrew spellchecker

SYNOPSIS

       hspell [ -acDhHilnsvV ] [file...]

DESCRIPTION

       hspell tries to find incorrectly spelled Hebrew words in its input files.

       Like the traditional Unix spell(1), hspell outputs the sorted list of incorrect words, and
       does not have a more friendly interface for making corrections for  you.  However,  unlike
       spell(1),  hspell  can  suggest  possible  corrections  for  some  spelling  errors.  Such
       suggestions can be enabled with the -c (correct) and -n (notes) options.

       Hspell currently expects ISO-8859-8-encoded input  files.  Non-Hebrew  characters  in  the
       input  files are ignored, allowing the easy spellchecking of Hebrew-English texts, as well
       as HTML or TeX files.  If files using  a  different  encoding  (e.g.,  UTF-8)  are  to  be
       checked, they must be converted first to ISO-8859-8 (e.g., see iconv(1), recode(1)).

       The  output  will  also  be in ISO-8859-8 encoding, in so-called "logical order", so it is
       normally useful to pipe it to bidiv(1) before viewing, as in:

              hspell -c filename | bidiv | less

       If no input file is given, hspell reads from its standard input.
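
       The conversion described above can be wrapped in a small shell function. The following
       is a minimal sketch and not part of hspell: the function name hspell_utf8 and the HSPELL
       override variable are hypothetical, and iconv(1) is assumed to support both encodings.
       It converts UTF-8 input to ISO-8859-8, runs hspell with whatever options are given, and
       converts the report back to UTF-8.

```shell
# hspell_utf8 [hspell options] < utf8-text
# Hypothetical wrapper, not shipped with hspell.
# HSPELL may name a different program to run (an assumption made for testing).
hspell_utf8() {
    # -c makes iconv drop characters that ISO-8859-8 cannot represent
    # (for example niqqud) instead of stopping with an error.
    iconv -c -f UTF-8 -t ISO-8859-8 |
        "${HSPELL:-hspell}" "$@" |
        iconv -f ISO-8859-8 -t UTF-8
}
```

       With hspell installed, a call such as "hspell_utf8 -c < filename | less" would then show
       the report in UTF-8.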

OPTIONS

       -v     If the -v option is given, hspell prints emacs-oriented version information and
              exits.

       -vv    Repetition of the -v option causes hspell to also show some information on which
              optional features were enabled at compile time.

       -V     With the -V option, hspell prints true and human-oriented version information and
              exits.

       -c     If the -c option is given, hspell will suggest corrections for misspelled words,
              whenever it can find such corrections. The correction mechanism in this release is
              especially good at finding corrections for incorrect niqqud-less spellings, with
              missing or extra 'immot-qri'a.

       -n     The -n option will give some longer "notes" about certain spelling errors,
              explaining why these are indeed errors (or in what cases using this word is in fact
              correct). It is recommended to combine the two options, as -cn, for maximal
              correction help from hspell.

       -l     The -l (linguistic information) option will explain for each correct word why it
              was recognized (show the basic noun, verb, etc., that this inflection relates to,
              and its tense, gender, associated Kinnuy, or other relevant information).

              If Hspell was built without morphological analysis support, this option will only
              show the correct splits of the given word into prefix + word, as the full
              information incurs a 4-fold increase in the installation size.

              Giving the -c option in addition to -l results in special behavior. In that case
              hspell suggests "corrections" to every word (regardless of whether it is in the
              dictionary), and shows the linguistic information on all those words. This can be
              useful for a reader application, which may also want to be able to understand
              misspellings and their possible meanings.

       -s     Normally, the words deemed spelling mistakes are shown in alphabetical order. The
              -s option orders them by severity, i.e., the errors that appear most frequently in
              the document are shown first. This option is most useful for people who are
              helping to build hspell's word list and are looking for common correct words that
              hspell does not know yet.

       -a     With the -a option, hspell tries to emulate (as little as possible of) ispell's
              pipe interface. This allows Lyx, Emacs, Geresh and KDE to use hspell as an external
              spell-checker.

       -i     This option has an effect only when used together with the -a option. Normally,
              hspell -a only checks the spelling of Hebrew words. If the given file also contains
              non-Hebrew words (such as English words), these are simply ignored. Adding the -i
              option tells hspell to pass the non-Hebrew words to ispell(1), and return its
              answer as an answer from hspell. This makes it convenient to spell-check mixed
              Hebrew-English documents.

              Running hspell with the program name hspell-i also enables the -i option. This is a
              useful trick when an application expects just the name of a spell-checking program,
              and adds only the "-a" option (without giving the user an option to also add "-i").
              The multispell script supplied with hspell serves a similar purpose, with more
              control over encodings and which spell-checker to run for non-Hebrew words.

       -H     By default, Hspell does not allow the He Ha-sh'ela prefix. This is because this
              prefix is not normally used in modern Hebrew, and generates many false negatives
              (errors, like He followed by a possessed noun, are thought to be correct). The -H
              option nevertheless tells Hspell to allow this prefix.

       -D base
              Load the word lists from the given base pathname, rather than from the compiled-in
              default path. This is mostly used for testing Hspell, when the dictionaries have
              been compiled in the current directory and hspell is run as "hspell -Dhebrew.wgz".

       -d, -B, -m, -T, -C, -S, -P, -p, -w, and -W
              These options are passed to hspell by lyx or other applications, thinking they are
              talking to ispell. These options are cordially ignored.
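
       Programs that drive hspell through the -a option read its replies line by line. The
       sketch below assumes that those replies follow the usual ispell -a conventions: a line
       starting with & names a misspelled word and lists suggestions after a colon, and a line
       starting with # names a misspelled word for which there are no suggestions. This page
       only promises a minimal emulation, so the exact set of replies is an assumption, and the
       function name summarize_replies is hypothetical.

```shell
# summarize_replies: read ispell -a style replies on stdin and print one
# line per misspelled word. Hypothetical helper, not part of hspell.
summarize_replies() {
    awk '
        # "& word count offset: guess, guess, ..." -> word with its suggestions
        $1 == "&" { word = $2; sub(/^[^:]*: */, ""); print word ": " $0; next }
        # "# word offset" -> word without suggestions
        $1 == "#" { print $2 ": (no suggestions)" }
    '
}
```

       With hspell installed, something like "hspell -a < filename | summarize_replies" could
       then list the misspelled words, with input and output still in ISO-8859-8.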

SPELLING STANDARD

       Hspell was designed to be 100%  and  strictly  compliant  with  the  official  niqqud-less
       spelling  rules  ("Ha-ktiv Khasar Ha-niqqud", colloquially known as "Ktiv Male") published
       by the Academy of the Hebrew Language.

       This is both an advantage and a disadvantage, depending on your viewpoint. It is an
       advantage because it encourages a correct and consistent spelling style throughout your
       writing. It is a disadvantage because a few of the Academy's official spelling decisions
       are relatively unknown to the general public.

       Users of Hspell (and all Hebrew writers, for that matter) are encouraged to read the
       Academy's official niqqud-less spelling rules (which are printed at the end of most
       modern Hebrew dictionaries; an abridged version is available at
       http://hebrew-academy.huji.ac.il/decision4.html). Users are also encouraged to refer to
       Hebrew dictionaries which use the niqqud-less spelling (such as Millon Ha-hove, Rav
       Milim, and the new Even Shoshan).

       Hspell's distribution (and Web  site)  also  include  a  document,  niqqudless.odt,  which
       explains  Hspell's  spelling  standard in detail (in Hebrew). It explains both the overall
       principles, and why specific words are spelled the way they are.

       A future release may include an option for alternative spelling standards.

BEHIND THE SCENES

       The hspell program itself is mostly a simple (but efficient) program that checks input
       words against a long list of valid words. The real "brains" behind it are the word lists
       (dictionary) provided by the Hspell project.
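
       The idea of checking input words against a list of valid words can be illustrated with
       a toy pipeline in the style of the traditional spell(1). This is only a sketch under
       stated assumptions, not hspell's actual algorithm: the function name check_words is
       hypothetical, it handles ASCII English words for readability, and it does not model
       Hebrew prefixes or hspell's dictionary format.

```shell
# check_words WORDLIST [FILE...]
# Print, in sorted order, every word of the input that is not in WORDLIST
# (one lowercase word per line). Hypothetical helper, not part of hspell.
check_words() {
    wordlist=$1
    shift
    sorted_list=$(mktemp) || return 1
    # comm(1) needs both inputs sorted the same way, hence LC_ALL=C throughout.
    LC_ALL=C sort -u "$wordlist" > "$sorted_list"
    cat "$@" |
        LC_ALL=C tr -cs 'A-Za-z' '\n' |   # one word per line
        LC_ALL=C tr 'A-Z' 'a-z' |         # fold to lowercase
        sed '/^$/d' |                     # drop a possible empty first line
        LC_ALL=C sort -u |
        LC_ALL=C comm -23 - "$sorted_list" # keep words missing from the list
    rm -f "$sorted_list"
}
```

       All of the linguistic knowledge lives in the word list; the checker itself only needs
       sorting and comparison, which mirrors the division of labor described above.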

       In order for this dictionary to be completely free of other people's copyright
       restrictions, the Hspell project is a clean-room implementation, not based on pre-existing
       word lists or spell checkers, or on copying of printed dictionaries.

       The word list is also not based on automatic scanning of available Hebrew documents (such
       as online newspapers), because there is no way to guarantee that such a list will be
       correct, complete, or consistent in its spelling standard.

       Instead, our idea was to write programs which know how to correctly inflect Hebrew nouns
       and conjugate Hebrew verbs. The input to these programs is a list of noun stems and verb
       roots, plus hints needed for the correct inflection when these cannot be figured out
       automatically. Most of the effort that went into the Hspell project went into building
       these input files. Then, "word list generators" (written in Perl, and also part of the
       Hspell project) create the complete inflected word list that will be used by the
       spellchecking program, hspell. This generation process is only done once, when building
       hspell from source.

       These lists, before and after inflection, may be useful for much more than spellchecking.
       Morphological analysis (which hspell provides with the -l option) is one example. For more
       ideas, see the Hspell project's Web site, at http://ivrix.org.il/projects/spell-checker.

FILES

       ~/.hspell_words, ./hspell_words
              These  files, if they exist, should contain a list of Hebrew words that hspell will
              also accept as correct words.

              Note that only these words exactly will be added -  they  are  not  inflected,  and
              prefixes are not automatically allowed.

       /usr/local/share/hspell/*
              The standard Hebrew word lists used by hspell.
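
       Adding a word to one of the personal word lists above from a UTF-8 terminal might look
       as follows. This is a sketch under assumptions that this page does not state explicitly:
       that the files hold one word per line and use the same ISO-8859-8 encoding as hspell's
       input. The function name add_hspell_word is hypothetical.

```shell
# add_hspell_word WORD [FILE]
# Append WORD (given in UTF-8) to FILE, by default ~/.hspell_words.
# Assumes one word per line in ISO-8859-8; hypothetical helper.
add_hspell_word() {
    word=$1
    file=${2:-"$HOME/.hspell_words"}
    printf '%s\n' "$word" | iconv -f UTF-8 -t ISO-8859-8 >> "$file"
}
```

       Because the words are not inflected and prefixes are not added automatically, each
       inflected or prefixed form that should be accepted needs its own line.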

EXIT STATUS

       Currently always 0.

VERSION

       The version of hspell described by this manual page is 1.2.

COPYRIGHT

       Copyright  (C)  2000-2012,  Nadav  Har'El  <nyh@math.technion.ac.il>  and  Dan  Kenigsberg
       <danken@cs.technion.ac.il>.

       Hspell is free software, released under the  GNU  Affero  General  Public  License  (AGPL)
       version  3.   Note that not only the programs in the distribution, but also the dictionary
       files and the generated word lists, are licensed under the AGPL.  There is no warranty  of
       any kind.

       See the LICENSE file for more information and the exact license terms.

       The latest version of this software can be found at http://hspell.ivrix.org.il/

ACKNOWLEDGMENTS

       The hspell utility and the linguistic databases behind it (collectively called "the Hspell
       project") were created by Nadav Har'El <nyh@math.technion.ac.il>  and  by  Dan  Kenigsberg
       <danken@cs.technion.ac.il>.

       Although  we  wrote all of Hspell's code ourselves, we are truly indebted to the old-style
       "open source" pioneers - people who wrote books  instead  of  hiding  their  knowledge  in
       proprietary  software. For the correct noun inflections, Dr. Shaul Barkali's "The Complete
       Noun Book" has been a great help. Prof. Uzzi Ornan's booklet  "Verb  Conjugation  in  Flow
       Charts"  has  been  instrumental  in the implementation of verb conjugation, and Barkali's
       "The Complete Verb Book" was used too.

       During our work we have extensively used a number of Hebrew dictionaries,  including  Even
       Shoshan, Millon Ha-hove and Rav-Milim, to ensure the correctness of certain words. Various
       Hebrew newspapers and books, both printed and online, were used for  inspiration  and  for
       finding words we still do not recognize.

       We  wish  to  thank  Cilla  Tuviana  and  Dr.  Zvi  Har'El  for their assistance with some
       grammatical questions.

       Several other people helped us in various releases, with suggestions, fixes or  patches  -
       they are listed in the WHATSNEW file in the distribution.

BUGS

       This manual page is in English.

       For  GUI-lovers,  hspell's  user  interface  is  an abomination. However, as more and more
       applications learn to interface with hspell, and as Hspell's  data  becomes  available  in
       multi-lingual  spellcheckers  (such  as  aspell  and  hunspell), this will no longer be an
       issue. See http://hspell.ivrix.org.il/ for instructions on how to use Hspell in a  variety
       of applications.

       hspell's  being  limited  to  the  ISO-8859-8  encoding, and not recognizing UTF-8 or even
       CP1255 (including niqqud), is an anachronism today.