Provided by: liblingua-stem-perl_0.84-1_all


       Lingua::Stem::EnBroken - Porter's stemming algorithm for 'generic' English


           use Lingua::Stem::EnBroken;
           my $stems = Lingua::Stem::EnBroken::stem({ -words      => $word_list_reference,
                                                      -locale     => 'en',
                                                      -exceptions => $exceptions_hash,
                                                    });


       This routine MIS-applies the Porter Stemming Algorithm to its parameters, returning the
       stemmed words. It is an intentionally broken version of Lingua::Stem::En for people
       needing backwards compatibility with Lingua::Stem 0.30 and Lingua::Stem 0.40. Do not use
       it if you aren't one of those people.

       It is derived from the C program "stemmer.c" as found in freewais and elsewhere, which
       contains these notes:

          Purpose:    Implementation of the Porter stemming algorithm documented
                      in: Porter, M.F., "An Algorithm For Suffix Stripping,"
                      Program 14 (3), July 1980, pp. 130-137.
          Provenance: Written by B. Frakes and C. Cox, 1986.

       I have re-interpreted areas that use Frakes and Cox's "WordSize" function. My version may
       misbehave on short words starting with "y", but I can't think of any examples.

       The step numbers correspond to Frakes and Cox, and are probably in Porter's article (which
       I've not seen).  Porter's algorithm still has rough spots (e.g. current/currency, -ings
       words), which I've not attempted to cure, although I have added support for the British
       -ise suffix.


        2003.09.28 -  Documentation fix

        2000.09.14 -  Forked from the Lingua::Stem::En module to provide
                      a backward-compatibly broken version for people who need
                      behavior consistent with 0.30 and 0.40 more than accurate
                      stemming.


       stem({ -words => \@words, -locale => 'en', -exceptions => \%exceptions });
           Stems a list of passed words using the rules of US English. Returns an anonymous array
           reference to the stemmed words.


             my $stemmed_words = Lingua::Stem::EnBroken::stem({ -words      => \@words,
                                                                -locale     => 'en',
                                                                -exceptions => \%exceptions,
                                                              });
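
           A fuller sketch of the call above, with concrete inputs. The word list and the
           exception entry are illustrative only. It assumes, as in Lingua::Stem, that the
           exceptions hash maps a lowercase word to the stem to return for it, and that the
           returned array holds one stem per input word in the same order.

```perl
use strict;
use warnings;
use Lingua::Stem::EnBroken;

# Illustrative input; any list of words will do.
my @words = qw(Connection connected connecting mice);

# Hypothetical exception entry: assumed to map a lowercase word
# to the stem that should be returned for it.
my %exceptions = ( mice => 'mouse' );

my $stemmed_words = Lingua::Stem::EnBroken::stem({
    -words      => \@words,
    -locale     => 'en',
    -exceptions => \%exceptions,
});

# Pair each input word with its stem by position.
for my $i (0 .. $#words) {
    printf "%-12s => %s\n", $words[$i], $stemmed_words->[$i];
}
```

           The exact stems printed depend on this module's deliberately preserved 0.30/0.40
           behavior, so they may differ from what Lingua::Stem::En returns for the same words.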

       stem_caching({ -level => 0|1|2 });
           Sets the level of stem caching.

           '0' means 'no caching'. This is the default level.

           '1' means 'cache per run'. This caches stemming results during a single
               call to 'stem'.

           '2' means 'cache indefinitely'. This caches stemming results until
               either the process exits or the 'clear_stem_cache' method is called.

       clear_stem_cache;
           Clears the cache of stemmed words.
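
           The caching calls can be combined as in the following sketch. It assumes the
           functions are called by their fully qualified names, as 'stem' is in the examples
           above, and that '-locale' and '-exceptions' may be omitted; the word batches are
           illustrative only.

```perl
use strict;
use warnings;
use Lingua::Stem::EnBroken;

# Two illustrative batches that share a word, so a persistent
# cache has something to reuse on the second call.
my @batches = ( [qw(running runs ran)], [qw(running jumps)] );

# Level 2: keep cached stems across calls to stem().
Lingua::Stem::EnBroken::stem_caching({ -level => 2 });

my @results;
for my $batch (@batches) {
    push @results, Lingua::Stem::EnBroken::stem({ -words => $batch });
}

# Release the cached stems and return to the default (no caching).
Lingua::Stem::EnBroken::clear_stem_cache();
Lingua::Stem::EnBroken::stem_caching({ -level => 0 });

print join(' ', @$_), "\n" for @results;
```

           Level 2 trades memory for speed in a long-running process; clearing the cache when
           a large job finishes keeps that memory from being held until exit.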


       This code is almost entirely derived from the Porter 2.1 module written by Jim Richardson.




         Jim Richardson, University of Sydney

         Integration in Lingua::Stem by
         Benjamin Franz, FreeRun Technologies


       Jim Richardson, University of Sydney
       Benjamin Franz, FreeRun Technologies

       This code is freely available under the same terms as Perl.