Plucene::Analysis::LetterTokenizer

Letter tokenizer

        # isa Plucene::Analysis::CharTokenizer

This is the letter tokenizer class, which divides text at non-letters.

Note: this does a decent job for most European languages, but does a terrible job for some Asian languages, where words are not separated by spaces.
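
As a rough illustration of what dividing text at non-letters means, the following standalone sketch collects each maximal run of letters as one token. It does not use Plucene at all: the helper name letter_tokens and the choice of the Unicode property \p{L} to define a letter are assumptions made for this example, and the module's own character test may differ.

```perl
use strict;
use warnings;
use utf8;    # the source below contains a non-ASCII string literal

# Hypothetical helper, not part of Plucene: return every maximal run of
# letters in $text. Anything that is not a letter (digits, punctuation,
# whitespace) acts as a separator and is discarded.
sub letter_tokens {
    my ($text) = @_;
    return $text =~ /(\p{L}+)/g;
}

# Apostrophes and digits split or drop tokens.
my @english = letter_tokens("Don't panic: it's 42 o'clock!");
print join("|", @english), "\n";    # prints Don|t|panic|it|s|o|clock

# Text written without spaces between words has no separators to split on,
# so the whole phrase comes back as a single token.
my @japanese = letter_tokens("東京都に住む");
print scalar(@japanese), "\n";      # prints 1
```

The second call shows the limitation described in the note: with no non-letter characters between words, a letter-based split cannot find word boundaries.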