Ubuntu Manpage: hspell - Hebrew spellchecker

NAME

       hspell - Hebrew spellchecker

SYNOPSIS

       hspell [ -acDhHilnsvV ] [file...]

DESCRIPTION

       hspell tries to find incorrectly spelled Hebrew words in its input files.

       Like  the traditional Unix spell(1), hspell outputs the sorted list of incorrect words, and does not have
       a more friendly interface for making corrections for you. However, unlike spell(1),  hspell  can  suggest
       possible  corrections for some spelling errors. Such suggestions can be enabled with the -c (correct) and
       -n (notes) options.

       Hspell currently expects ISO-8859-8-encoded input files. Non-Hebrew characters in  the  input  files  are
       ignored, allowing the easy spellchecking of Hebrew-English texts, as well as HTML or TeX files.  If files
       using  a  different  encoding (e.g., UTF-8) are to be checked, they must be converted first to ISO-8859-8
       (e.g., see iconv(1), recode(1)).

       The output will also be in ISO-8859-8 encoding, in so-called "logical order", so it is normally useful to
       pipe it to bidiv(1) before viewing, as in:

              hspell -c filename | bidiv | less

       If no input file is given, hspell reads from its standard input.

OPTIONS

-v If the -v option is given, hspell prints emacs-oriented version information and exits.

-vv Repetition of the -v option causes hspell to also show some information on which optional features
were enabled at compile time.

-V With the -V option, hspell prints true and human-oriented version information and exits.

-c If the -c option is given, hspell will suggest corrections for misspelled words, whenever it can
find such corrections. The correction mechanism in this release is especially good at finding
corrections for incorrect niqqud-less spellings, with missing or extra 'immot-qri'a.

-n The -n option will give some longer "notes" about certain spelling errors, explaining why these
are indeed errors (or in what cases using this word is in fact correct). It is recommend to
combine the two options, -cn for maximal correction help from hspell.

-l The -l (linguistic information) option will explain for each correct word why it was recognized
(show the basic noun, verb, etc., that this inflection relates to, and its tense, gender,
associated Kinnuy, or other relevant information)

If Hspell was built without morphological analysis support, this option will only show the correct
splits of the given word into prefix + word, as the full information incurs a 4-fold increase in
the installation size.

Giving the -c option in addition to -l results in special behavior. In that case hspell suggests
"corrections" to every word (regardless if they are in the dictionary or not), and shows the
linguistic information on all those words. This can be useful for a reader application, which may
also want to be able to understand misspellings and their possible meanings.

-s Normally, the words deemed spelling mistakes are shown in alphabetical order. The -s option
orders them by severity, i.e., the errors that most frequently appear in the document are shown
first. This option is most useful for people helping to build hspell's word list, and are looking
for common correct words that hspell does not know yet.

-a With the -a option, hspell tries to emulate (as little as possible of) ispell's pipe interface.
This allows Lyx, Emacs, Geresh and KDE to use hspell as an external spell-checker.

-i This option only has any effect when used together with the -a option. Normally, hspell -a only
checks the spelling of Hebrew words. If the given file also contains non-Hebrew words (such as
English words), these are simply ignored. Adding the -i option tells hspell to pass the non-Hebrew
words to ispell(1), and return its answer as an answer from hspell. This allows conveniently
spell-checking mixed Hebrew-English documents.

Running hspell with the program name hspell-i also enables the -i option. This is a useful trick
when an application expects just the name of a spell-checking program, and adds only the "-a"
option (without giving the user an option to also add "-i"). The multispell script supplied with
hspell serves a similar purpose, with more control over encodings and which spell-checker to run
for non-Hebrew words.

-H By default, Hspell does not allow the He Ha-sh'ela prefix. This is because this prefix is not
normally used in modern Hebrew, and generates many false-negatives (errors, like He followed by a
possessed noun, are thought to be correct). The -H option nevertheless tells Hspell to allow this
prefix.

-D base
Load the word lists from the given base pathname, rather than from the compiled-in default path.
This is mostly used for testing Hspell, when the dictionaries have been compiled in the current
directory and hspell is run as "hspell -Dhebrew.wgz".

-d, -B, -m, -T, -C, -S, -P, -p, -w, and -W
These options are passed to hspell by lyx or other applications, thinking they are talking to
ispell. These options are cordially ignored.

SPELLING STANDARD

       Hspell was designed to be 100% and strictly compliant with the official niqqud-less spelling rules  ("Ha-
       ktiv  Khasar  Ha-niqqud",  colloquially  known  as  "Ktiv  Male")  published by the Academy of the Hebrew
       Language.

       This is both an advantage and a disadvantage, depending on your viewpoint.  It's an advantage because  it
       encourages a correct and consistent spelling style throughout your writing. It is a disadvantage, because
       a few of the Academia's official spelling decisions are relatively unknown to the general public.

       Users  of Hspell (and all Hebrew writers, for that matter) are encouraged to read the Academia's official
       niqqud-less spelling rules (which are printed at the end of  most  modern  Hebrew  dictionaries,  and  an
       abridged  version  is  available  in  http://hebrew-academy.huji.ac.il/decision4.html).  Users  are  also
       encouraged to refer to Hebrew dictionaries which use the niqqud-less spelling (such  as  Millon  Ha-hove,
       Rav Milim, and the new Even Shoshan).

       Hspell's  distribution  (and  Web  site) also include a document, niqqudless.odt, which explains Hspell's
       spelling standard in detail (in Hebrew). It explains both the overall principles, and why specific  words
       are spelled the way they are.

       A future release may include an option for alternative spelling standards.

BEHIND THE SCENES

The hspell program itself is mostly a simple (but efficient) program that checks input words against a
long list of valid words. The real "brains" behind it are the word lists (dictionary) provided by the
Hspell project.

In order for this dictionary to be completely free of other people's copyright restrictions, the Hspell
project is a clean-room implementation, not based on pre-existing word lists or spell checkers, or on
copying of printed dictionaries.

The word list is also not based on automatic scanning of available Hebrew documents (such as online
newspapers), because there is no way to guarantee that such a list will be correct, complete, or
consistent in its spelling standard.

Instead, our idea was to write programs which know how to correctly inflect Hebrew nouns and conjugate
Hebrew verbs. The input to these programs is a list of noun stems and verb roots, plus hints needed for
the correct inflection when these cannot be figured out automatically. Most of the effort that went into
the Hspell project went into building these input files. Then, "word list generators" (written in Perl,
and are also part of the Hspell project) create the complete inflected word list that will be used by the
spellchecking program, hspell. This generation process is only done once, when building hspell from
source.

These lists, before and after inflection, may be useful for much more than spellchecking. Morphological
analysis (which hspell provides with the -l option) is one example. For more ideas, see Hspell project's
Web site, at http://ivrix.org.il/projects/spell-checker.

FILES

       ~/.hspell_words, ./hspell_words
              These files, if they exist, should contain a list of Hebrew words that hspell will also accept  as
              correct words.

              Note  that  only  these words exactly will be added - they are not inflected, and prefixes are not
              automatically allowed.

       /usr/local/share/hspell/*
              The standard Hebrew word lists used by hspell.

EXIT STATUS

       Currently always 0.

VERSION

       The version of hspell described by this manual page is 1.2.

COPYRIGHT

       Copyright   (C)    2000-2012,    Nadav    Har'El    <nyh@math.technion.ac.il>    and    Dan    Kenigsberg
       <danken@cs.technion.ac.il>.

       Hspell  is  free  software,  released under the GNU Affero General Public License (AGPL) version 3.  Note
       that not only the programs in the distribution, but also the dictionary  files  and  the  generated  word
       lists, are licensed under the AGPL.  There is no warranty of any kind.

       See the LICENSE file for more information and the exact license terms.

       The latest version of this software can be found in http://hspell.ivrix.org.il/

ACKNOWLEDGMENTS

       The hspell utility and the linguistic databases behind it (collectively called "the Hspell project") were
       created by Nadav Har'El <nyh@math.technion.ac.il> and by Dan Kenigsberg <danken@cs.technion.ac.il>.

       Although  we  wrote  all of Hspell's code ourselves, we are truly indebted to the old-style "open source"
       pioneers - people who wrote books instead of hiding their knowledge  in  proprietary  software.  For  the
       correct  noun inflections, Dr. Shaul Barkali's "The Complete Noun Book" has been a great help. Prof. Uzzi
       Ornan's booklet "Verb Conjugation in Flow Charts" has been instrumental in  the  implementation  of  verb
       conjugation, and Barkali's "The Complete Verb Book" was used too.

       During  our work we have extensively used a number of Hebrew dictionaries, including Even Shoshan, Millon
       Ha-hove and Rav-Milim, to ensure the correctness of certain words. Various Hebrew newspapers  and  books,
       both printed and online, were used for inspiration and for finding words we still do not recognize.

       We wish to thank Cilla Tuviana and Dr. Zvi Har'El for their assistance with some grammatical questions.

       Several  other people helped us in various releases, with suggestions, fixes or patches - they are listed
       in the WHATSNEW file in the distribution.

BUGS

       This manual page is in English.

       For GUI-lovers, hspell's user interface is an abomination. However, as more and more  applications  learn
       to  interface with hspell, and as Hspell's data becomes available in multi-lingual spellcheckers (such as
       aspell and hunspell), this will no longer be an issue. See http://hspell.ivrix.org.il/  for  instructions
       on how to use Hspell in a variety of applications.

       hspell's  being  limited  to the ISO-8859-8 encoding, and not recognizing UTF-8 or even CP1255 (including
       niqqud), is an anachronism today.

Hspell 1.2                                      28 February 2012                                       hspell(1)