Provided by: libbloom-filter-perl_1.2-5_all bug

NAME

       Bloom::Filter - Sample Perl Bloom filter implementation

DESCRIPTION

       A Bloom filter is a probabilistic algorithm for doing existence tests in less memory than a full list of
       keys would require.  The tradeoff to using Bloom filters is a certain configurable risk of false
       positives.  This module implements a simple Bloom filter with configurable capacity and false positive
       rate. Bloom filters were first described in a 1970 paper by Burton Bloom, see
       <http://portal.acm.org/citation.cfm?id=362692&dl=ACM&coll=portal>.

SYNOPSIS

               use Bloom::Filter

               my $bf = Bloom::Filter->new( capacity => 10, error_rate => .001 );

               $bf->add( @keys );

               while ( <> ) {
                       chomp;
                       print "Found $_\n" if $bf->check( $_ );
               }

CONSTRUCTORS

       new %PARAMS
           Create a brand new instance.  Allowable params are "error_rate", "capacity".

       init
           Calculates the best number of hash functions and optimum filter length, creates some random salts,
           and generates a blank bit vector.  Called automatically by constructor.

ACCESSORS

       capacity
           Returns the total capacity of the Bloom filter

       error_rate
           Returns the configured maximum error rate

       length
           Returns the length of the Bloom filter in bits

       key_count
           Returns the number of items currently stored in the filter

       on_bits
           Returns the number of 'on' bits in the filter

       salts
           Returns the list of salts used to create the hash functions

PUBLIC METHODS

       add @KEYS
           Adds the list of keys to the filter.   Will fail, return "undef" and complain if the number of keys
           in the filter exceeds the configured capacity.

       check @KEYS
           Checks the provided key list against the Bloom filter, and returns a list of equivalent length, with
           true or false values depending on whether there was a match.

INTERNAL METHODS

       _calculate_shortest_filter_length CAPACITY ERR_RATE
           Given a desired error rate and maximum capacity, returns the optimum combination of vector length (in
           bits) and number of hash functions to use in building the filter, where "optimum" means shortest
           vector length.

       _get_cells KEY
           Given a key, hashes it using the list of salts and returns an array of cell indexes corresponding to
           the key.

AUTHOR

       Originally written by Maciej Ceglowski <maciej@ceglowski.com>.  Currently maintained by Grzegorz
       Rożniecki <xaerxess@gmail.com>.

CONTRIBUTORS

       Dmitriy Ryaboy <dmitriy.ryaboy@ask.com> (big speedup in February 2007, thanks!)

COPYRIGHT AND LICENSE

       (c) 2004 Maciej Ceglowski

       This is free software, distributed under version 2 of the GNU Public License (GPL).