Provided by: libdbm-deep-perl_2.0011-1_all

NAME

       DBM::Deep::Cookbook - Cookbook for DBM::Deep

DESCRIPTION

       This is the Cookbook for DBM::Deep. It contains useful tips and tricks, plus some examples of how to do
       common tasks.

RECIPES

   Unicode data
       If possible, it is highly recommended that you upgrade your database to version 2 (using the
       utils/upgrade_db.pl script in the CPAN distribution), in order to use Unicode.

       If your databases are still shared by perl installations with older DBM::Deep versions, you can use
       filters to encode strings on the fly:

         my $db = DBM::Deep->new( ... );
         my $encode_sub = sub { my $s = shift; utf8::encode($s); $s };
         my $decode_sub = sub { my $s = shift; utf8::decode($s); $s };
         $db->set_filter( 'store_value' => $encode_sub );
         $db->set_filter( 'fetch_value' => $decode_sub );
         $db->set_filter( 'store_key' => $encode_sub );
         $db->set_filter( 'fetch_key' => $decode_sub );

       A previous version of this cookbook recommended using "binmode $db->_fh, ":utf8"", but that is not a good
       idea, as it could easily corrupt the database.
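
       To see what the two filters in this recipe do to a string, here is a minimal sketch that runs them
       outside of any database. The sample text is an assumption chosen for illustration; the filter bodies are
       the ones shown above.

```perl
use strict;
use warnings;

# Same filter bodies as in the recipe above.
my $encode_sub = sub { my $s = shift; utf8::encode($s); $s };
my $decode_sub = sub { my $s = shift; utf8::decode($s); $s };

# Illustrative sample: 6 characters, two of them outside ASCII.
my $text  = "caf\x{e9} \x{263A}";
my $bytes = $encode_sub->($text);    # UTF-8 octets, what the store filters hand to the file
my $back  = $decode_sub->($bytes);   # characters again, what the fetch filters hand back

# The encoded form is longer (9 octets for 6 characters) and the round trip is lossless.
printf "characters: %d, stored bytes: %d, round trip ok: %s\n",
    length($text), length($bytes), ( $back eq $text ? "yes" : "no" );
```

       Because the store filters emit plain octets, a database written this way contains no wide characters,
       which is what lets older DBM::Deep versions keep reading it.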

   Real-time Encryption Example
       NOTE: This is just an example of how to write a filter. This most definitely should NOT be taken as a
       proper way to write a filter that does encryption. (Furthermore, it fails to take Unicode into account.)

       Here is a working example that uses the Crypt::Blowfish module to do real-time encryption / decryption of
       keys & values with DBM::Deep Filters.  Please visit
       <http://search.cpan.org/search?module=Crypt::Blowfish> for more on Crypt::Blowfish. You'll also need the
       Crypt::CBC module.

         use DBM::Deep;
         use Crypt::Blowfish;
         use Crypt::CBC;

         my $cipher = Crypt::CBC->new({
             'key'             => 'my secret key',
             'cipher'          => 'Blowfish',
             'iv'              => '$KJh#(}q',
             'regenerate_key'  => 0,
             'padding'         => 'space',
             'prepend_iv'      => 0
         });

         my $db = DBM::Deep->new(
             file => "foo-encrypt.db",
             filter_store_key => \&my_encrypt,
             filter_store_value => \&my_encrypt,
             filter_fetch_key => \&my_decrypt,
             filter_fetch_value => \&my_decrypt,
         );

         $db->{key1} = "value1";
         $db->{key2} = "value2";
         print "key1: " . $db->{key1} . "\n";
         print "key2: " . $db->{key2} . "\n";

         undef $db;
         exit;

         sub my_encrypt {
             return $cipher->encrypt( $_[0] );
         }
         sub my_decrypt {
             return $cipher->decrypt( $_[0] );
         }

   Real-time Compression Example
       Here is a working example that uses the Compress::Zlib module to do real-time compression / decompression
       of keys & values with DBM::Deep Filters.  Please visit
       <http://search.cpan.org/search?module=Compress::Zlib> for more on Compress::Zlib.

         use DBM::Deep;
         use Compress::Zlib;

         my $db = DBM::Deep->new(
             file => "foo-compress.db",
             filter_store_key => \&my_compress,
             filter_store_value => \&my_compress,
             filter_fetch_key => \&my_decompress,
             filter_fetch_value => \&my_decompress,
         );

         $db->{key1} = "value1";
         $db->{key2} = "value2";
         print "key1: " . $db->{key1} . "\n";
         print "key2: " . $db->{key2} . "\n";

         undef $db;
         exit;

         sub my_compress {
             my $s = shift;
             utf8::encode($s);
             return Compress::Zlib::memGzip( $s ) ;
         }
         sub my_decompress {
             my $s = Compress::Zlib::memGunzip( shift ) ;
             utf8::decode($s);
             return $s;
         }

       Note: Filtering of keys only applies to hashes. Array "keys" are actually numerical index numbers, and
       are not filtered.

Custom Digest Algorithm

       By default, DBM::Deep uses the Message Digest 5 (MD5) algorithm for hashing keys. However, you can
       override this and use another algorithm (such as SHA-256), or even write your own. Note that DBM::Deep
       currently expects zero collisions, so your algorithm has to be perfect, so to speak. Collision detection
       may be introduced in a later version.

       To specify a custom digest algorithm, pass two extra parameters to new(): 'digest', a reference to a
       subroutine that computes the hash, and 'hash_size', the length of the algorithm's hashes in bytes. Here
       is a working example that uses a 256-bit hash from the Digest::SHA256 module. Please see
       <http://search.cpan.org/search?module=Digest::SHA256> for more information.

       The value passed to your digest function will be encoded as UTF-8 if the database is in version 2 format
       or higher.

         use DBM::Deep;
         use Digest::SHA256;

         my $context = Digest::SHA256::new(256);

         my $db = DBM::Deep->new(
             file => "foo-sha.db",
             digest => \&my_digest,
             hash_size => 32,
         );

         $db->{key1} = "value1";
         $db->{key2} = "value2";
         print "key1: " . $db->{key1} . "\n";
         print "key2: " . $db->{key2} . "\n";

         undef $db;
         exit;

         sub my_digest {
             return substr( $context->hash($_[0]), 0, 32 );
         }

       Note: Your returned digest strings must be EXACTLY the number of bytes you specify in the hash_size
       parameter (in this case 32). Undefined behavior will occur otherwise.

       Note: If you do choose to use a custom digest algorithm, you must set it every time you access this file.
       Otherwise, the default (MD5) will be used.
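
       As an alternative sketch, assuming the Digest::SHA module (bundled with Perl since 5.10) is acceptable
       in place of Digest::SHA256, the same idea can be written as follows. The temporary directory and the
       open_db() helper are hypothetical names introduced only for this illustration; the helper exists so that
       every open of the file passes the same digest settings, as the note above requires.

```perl
use strict;
use warnings;
use DBM::Deep;
use Digest::SHA qw(sha256);
use File::Temp qw(tempdir);

# Scratch location for the example file (removed when the program exits).
my $dir  = tempdir( CLEANUP => 1 );
my $file = "$dir/foo-sha.db";

# sha256() returns the raw 32-byte digest, which matches hash_size below.
sub my_digest { return sha256( $_[0] ) }

# Hypothetical helper: always open the file with the same digest settings.
sub open_db {
    return DBM::Deep->new(
        file      => $file,
        digest    => \&my_digest,
        hash_size => 32,
    );
}

my $db = open_db();
$db->{key1} = "value1";
undef $db;

# Reopen with the identical digest and hash_size before reading.
$db = open_db();
print "key1: " . $db->{key1} . "\n";
```

       Using the binary sha256() rather than the hex or base64 variants keeps the digest at exactly 32 bytes;
       sha256_hex() would return 64 characters and would need hash_size => 64 instead.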

PERFORMANCE

       Because DBM::Deep is a concurrent datastore, every change is flushed to disk immediately and every read
       goes to disk. This means that DBM::Deep functions at the speed of disk (generally 10-20ms) vs. the speed
       of RAM (generally 50-70ns). Taken at face value, those latencies make it at least 150,000-200,000x
       slower than the comparable in-memory data structure in Perl.

       There are several techniques you can use to speed up how DBM::Deep functions.

       •   Put it on a ramdisk

           The easiest and quickest way to make DBM::Deep run faster is to create a ramdisk and locate the
           DBM::Deep file there. Doing this as an option may become a feature of DBM::Deep, assuming there
           is a good ramdisk wrapper on CPAN.

       •   Work at the tightest level possible

           It is much faster to assign the level of your db that you are working with to an intermediate
           variable than to look it up again on every access. Thus:

             # BAD
             while ( my ($k, $v) = each %{$db->{foo}{bar}{baz}} ) {
               ...
             }

             # GOOD
             my $x = $db->{foo}{bar}{baz};
             while ( my ($k, $v) = each %$x ) {
               ...
             }

       •   Make your file as tight as possible

           If you know that you are not going to use more than 64KB (65,536 bytes) in your database, consider
           using the "pack_size => 'small'" option. This will instruct DBM::Deep to use 16-bit addresses,
           meaning that the seek times will be shorter.
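
       To measure the effect of working at the tightest level on your own data, a rough sketch using the core
       Benchmark module could look like the following. The file location, the sample data, and the
       sum_relookup() and sum_cached() names are assumptions made for illustration, and the numbers will vary
       with the disk and operating system cache, so no expected output is shown.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use DBM::Deep;
use File::Temp qw(tempdir);

# Scratch database in a temporary directory (removed on exit).
my $dir = tempdir( CLEANUP => 1 );
my $db  = DBM::Deep->new( file => "$dir/bench.db" );

# Illustrative data: 50 small values stored three levels down.
$db->{foo} = { bar => { baz => { map { ( "k$_" => $_ ) } 1 .. 50 } } };

# Walks the full path from the root on every access.
sub sum_relookup {
    my $sum = 0;
    $sum += $db->{foo}{bar}{baz}{"k$_"} for 1 .. 50;
    return $sum;
}

# Resolves the path once, then reads through the intermediate variable.
sub sum_cached {
    my $x   = $db->{foo}{bar}{baz};
    my $sum = 0;
    $sum += $x->{"k$_"} for 1 .. 50;
    return $sum;
}

# Run each variant for at least one CPU second and print a comparison table.
cmpthese( -1, { relookup => \&sum_relookup, cached => \&sum_cached } );
```

       Both subroutines return the same sum, so any difference in the table comes from how the path is
       resolved rather than from the work done on the values.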
