Provided by: libkinosearch1-perl_1.00-1build3_amd64
NAME
KinoSearch1::InvIndexer - build inverted indexes
SYNOPSIS
    use KinoSearch1::InvIndexer;
    use KinoSearch1::Analysis::PolyAnalyzer;

    my $analyzer
        = KinoSearch1::Analysis::PolyAnalyzer->new( language => 'en' );

    my $invindexer = KinoSearch1::InvIndexer->new(
        invindex => '/path/to/invindex',
        create   => 1,
        analyzer => $analyzer,
    );

    $invindexer->spec_field(
        name  => 'title',
        boost => 3,
    );
    $invindexer->spec_field( name => 'bodytext' );

    while ( my ( $title, $bodytext ) = each %source_documents ) {
        my $doc = $invindexer->new_doc($title);

        $doc->set_value( title    => $title );
        $doc->set_value( bodytext => $bodytext );

        $invindexer->add_doc($doc);
    }

    $invindexer->finish;
DESCRIPTION
The InvIndexer class is KinoSearch1's primary tool for creating and modifying inverted indexes, which may be searched using KinoSearch1::Searcher.
METHODS
new

    my $invindexer = KinoSearch1::InvIndexer->new(
        invindex => '/path/to/invindex',    # required
        create   => 1,                      # default: 0
        analyzer => $analyzer,              # default: no-op Analyzer
    );

Create an InvIndexer object.

• invindex - can be either a filepath, or an InvIndex subclass such as KinoSearch1::Store::FSInvIndex or KinoSearch1::Store::RAMInvIndex.

• create - create a new invindex, clobbering an existing one if necessary.

• analyzer - an object which subclasses KinoSearch1::Analysis::Analyzer, such as a PolyAnalyzer.

spec_field

    $invindexer->spec_field(
        name       => 'url',    # required
        boost      => 1,        # default: 1
        analyzer   => undef,    # default: analyzer spec'd in new()
        indexed    => 0,        # default: 1
        analyzed   => 0,        # default: 1
        stored     => 1,        # default: 1
        compressed => 0,        # default: 0
        vectorized => 0,        # default: 1
    );

Define a field.

• name - the field's name.

• boost - A multiplier which determines how much a field contributes to a document's score.

• analyzer - By default, all indexed fields are analyzed using the analyzer that was supplied to new(). Supplying an alternate for a given field overrides the primary analyzer.

• indexed - index the field, so that it can be searched later.

• analyzed - analyze the field, using the relevant Analyzer. Fields such as "category" or "product_number" might be indexed but not analyzed.

• stored - store the field, so that it can be retrieved when the document turns up in a search.

• compressed - compress the stored field, using the zlib compression algorithm.

• vectorized - store the field's "term vectors", which are required by KinoSearch1::Highlight::Highlighter for excerpt selection and search term highlighting.

new_doc

    my $doc = $invindexer->new_doc;

Spawn an empty KinoSearch1::Document::Doc object, primed to accept values for the fields spec'd by spec_field.

add_doc

    $invindexer->add_doc($doc);

Add a document to the invindex.
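The spec_field, new_doc and add_doc calls described above are typically used together. The following is a minimal sketch of that flow, not part of KinoSearch1 itself: the helper index_records, the @field_specs layout, and the field names are all hypothetical. Only spec_field, new_doc, set_value and add_doc, with the parameters shown on this page, are taken from the documentation.

```perl
use strict;
use warnings;

# Hypothetical field layout; the field names are illustrative only.
my @field_specs = (
    # Full-text field that counts three times as much toward the score.
    { name => 'title', boost => 3 },
    # Full-text field using all defaults.
    { name => 'bodytext' },
    # Exact-match key: indexed, but not analyzed and without term vectors.
    { name => 'category', analyzed => 0, vectorized => 0 },
    # Stored for retrieval only: not searchable.
    { name => 'url', indexed => 0, analyzed => 0, vectorized => 0 },
);

# Hypothetical helper: define the fields, then add one document per record.
# $invindexer is any object offering spec_field, new_doc and add_doc as
# described above. Each record is a hash ref whose keys are expected to be
# names of spec'd fields.
sub index_records {
    my ( $invindexer, $specs, $records ) = @_;

    # Define every field before any document is created.
    $invindexer->spec_field(%$_) for @$specs;

    my $added = 0;
    for my $record (@$records) {
        my $doc = $invindexer->new_doc;
        $doc->set_value( $_ => $record->{$_} ) for sort keys %$record;
        $invindexer->add_doc($doc);
        $added++;
    }
    return $added;
}
```

With a real InvIndexer this might be called as index_records( $invindexer, \@field_specs, \@records ), followed by $invindexer->finish. Keeping the field layout in one data structure is only a convenience; calling spec_field directly for each field is equivalent.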

add_invindexes

    my $invindexer = KinoSearch1::InvIndexer->new(
        invindex => $invindex,
        analyzer => $analyzer,
    );
    $invindexer->add_invindexes( $another_invindex, $yet_another_invindex );
    $invindexer->finish;

Absorb existing invindexes into this one. May only be called once per InvIndexer. add_invindexes() and add_doc() cannot be called on the same InvIndexer.

delete_docs_by_term

    my $term = KinoSearch1::Index::Term->new( 'id', $unique_id );
    $invindexer->delete_docs_by_term($term);

Mark any document which contains the supplied term as deleted, so that it will be excluded from search results. For more info, see Deletions in KinoSearch1::Docs::FileFormat.

finish

    $invindexer->finish(
        optimize => 1,    # default: 0
    );

Finish the invindex. Invalidates the InvIndexer. Takes one hash-style parameter.

• optimize - If optimize is set to 1, the invindex will be collapsed to its most compact form, which will yield the fastest queries.
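One possible use of delete_docs_by_term is replacing an outdated document with a new version. The sketch below rests on assumptions this page does not confirm: that delete_docs_by_term and add_doc may be combined in one InvIndexer session (only add_invindexes and add_doc are stated to be mutually exclusive), and that the hypothetical 'id' field is indexed and holds a unique value. The helper name replace_doc is also hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: mark the documents matching $term as deleted, then
# add one replacement document built from %values.
# $term is expected to be a KinoSearch1::Index::Term, created for example
# with KinoSearch1::Index::Term->new( 'id', $unique_id ).
sub replace_doc {
    my ( $invindexer, $term, %values ) = @_;

    # Mark the old version(s) as deleted so searches no longer return them.
    $invindexer->delete_docs_by_term($term);

    # Build the replacement from the supplied field values and add it.
    my $doc = $invindexer->new_doc;
    $doc->set_value( $_ => $values{$_} ) for sort keys %values;
    $invindexer->add_doc($doc);

    return $doc;
}
```

After all replacements, the session would be closed with $invindexer->finish, optionally passing optimize => 1 as described above.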
COPYRIGHT
Copyright 2005-2010 Marvin Humphrey
LICENSE, DISCLAIMER, BUGS, etc.
See KinoSearch1 version 1.00.