Ubuntu Manpage: Plucene::Index::Writer

NAME

       Plucene::Index::Writer - write an index.

SYNOPSIS

               my $writer = Plucene::Index::Writer->new($path, $analyser, $create);

               $writer->add_document($doc);
               $writer->add_indexes(@dirs);

               $writer->optimize; # called before close

               my $doc_count = $writer->doc_count;

               my $mergefactor = $writer->mergefactor;

               $writer->set_mergefactor($value);

DESCRIPTION

       This is the writer class. It creates a new index or opens an existing one, adds documents
       to it, and merges index segments.

       If an index will not have more documents added for a while and optimal search performance
       is desired, then the "optimize" method should be called before the index is closed.

METHODS

   new
               my $writer = Plucene::Index::Writer->new($path, $analyser, $create);

       This will create a new Plucene::Index::Writer object.

       The third argument to the constructor determines whether a new index is created, or
       whether an existing index is opened for the addition of new documents.
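
       As a minimal sketch of the two modes, the helpers below create the index when it is
       missing and otherwise open it for adding documents. The names index_exists and
       open_writer are hypothetical, Plucene::Analysis::SimpleAnalyzer is only one possible
       analyser, and the check for a file named "segments" is an assumption about the on-disk
       layout rather than something this page documents.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: guess whether $path already holds an index.
# Assumption: an existing index directory contains a file named "segments".
sub index_exists {
    my ($path) = @_;
    my $segments = File::Spec->catfile($path, 'segments');
    return -e $segments ? 1 : 0;
}

# Hypothetical helper: create the index if it is missing, otherwise open
# the existing index so that new documents can be added to it.
sub open_writer {
    my ($path) = @_;
    require Plucene::Index::Writer;
    require Plucene::Analysis::SimpleAnalyzer;

    my $analyser = Plucene::Analysis::SimpleAnalyzer->new;
    my $create   = index_exists($path) ? 0 : 1;    # third constructor argument
    return Plucene::Index::Writer->new($path, $analyser, $create);
}
```

       A caller would then write my $writer = open_writer("my_index"); and add documents without
       needing to know whether the index existed beforehand.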

   mergefactor / set_mergefactor
               my $mergefactor = $writer->mergefactor;

               $writer->set_mergefactor($value);

       Get / set the mergefactor, the number of segments that may accumulate in the index before
       add_document triggers a merge.

   doc_count
               my $doc_count = $writer->doc_count;

       Returns the number of documents in the index.

   add_document
               $writer->add_document($doc);

       Adds a document to the index. After the document has been added, a merge takes place if
       there are more than $Plucene::Index::Writer::mergefactor segments in the index. This
       threshold defaults to 10, but can be set (see set_mergefactor) to whatever value is
       optimal for your application.
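
       The following sketch shows one way to build documents and add them while tuning the
       merge threshold. The helper name index_texts and the field names "title" and "content"
       are illustrative assumptions; Plucene::Document and Plucene::Document::Field come from
       the Plucene distribution.

```perl
use strict;
use warnings;

# Hypothetical helper: index a hash of { title => body } pairs and
# return the number of documents in the index afterwards.
sub index_texts {
    my ($writer, $texts, $mergefactor) = @_;
    require Plucene::Document;
    require Plucene::Document::Field;

    # A larger mergefactor lets more segments accumulate before a merge.
    $writer->set_mergefactor($mergefactor) if defined $mergefactor;

    for my $title (sort keys %$texts) {
        my $doc = Plucene::Document->new;
        # Field names are illustrative, not required by Plucene.
        $doc->add(Plucene::Document::Field->Keyword(title => $title));
        $doc->add(Plucene::Document::Field->Text(content => $texts->{$title}));
        $writer->add_document($doc);
    }
    return $writer->doc_count;
}
```

       The best mergefactor depends on the application, so it is worth timing a
       representative batch with a few different values.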

   optimize
               $writer->optimize;

       Merges all segments together into a single segment, optimizing the index for search. This
       should be the last method called on a writer, as it invalidates the writer object.
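
       A small sketch of that ordering: add the final batch of documents first, then optimize,
       then stop using the writer. The helper name finish_index is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: add a final batch of documents, then optimize.
# After this returns, the caller should discard the writer.
sub finish_index {
    my ($writer, @docs) = @_;
    $writer->add_document($_) for @docs;
    $writer->optimize;    # last call: the writer is invalid afterwards
    return;
}
```

       A caller would typically follow finish_index($writer, @docs) with undef $writer so that
       the invalidated object cannot be reused by accident.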

   add_indexes
               $writer->add_indexes(@dirs);

       Merges all segments from an array of indexes into this index.

       This may be used to parallelize batch indexing.  A large document collection can be broken
       into sub-collections.  Each sub-collection can be indexed in parallel, on a different
       thread, process or machine.  The complete index can then be created by merging sub-
       collection indexes with this method.

       After this completes, the index is optimized.
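
       The batch workflow described above might be sketched as follows. The helper names
       partition and merge_indexes are hypothetical, and the sketch assumes that add_indexes
       accepts the directory paths of the sub-collection indexes. Building each sub-index (in
       a separate thread, process or machine) is left to the caller.

```perl
use strict;
use warnings;

# Hypothetical helper: deal @items round-robin into $n sub-collections.
sub partition {
    my ($n, @items) = @_;
    my @parts = map { [] } 1 .. $n;
    push @{ $parts[ $_ % $n ] }, $items[$_] for 0 .. $#items;
    return @parts;
}

# Hypothetical helper: merge already-built sub-indexes into a new index.
# Assumption: add_indexes takes the directory paths of the sub-indexes.
sub merge_indexes {
    my ($main_path, $analyser, @sub_paths) = @_;
    require Plucene::Index::Writer;

    my $writer = Plucene::Index::Writer->new($main_path, $analyser, 1);
    $writer->add_indexes(@sub_paths);    # the merged index is optimized
    undef $writer;                       # done with this writer
    return;
}
```

       Round-robin partitioning is only one choice; any split that gives each worker a
       similar amount of text would serve the same purpose.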