Ubuntu Manpage: DBD::SQLite::Cookbook - The DBD::SQLite Cookbook

name
description
aggregate functions
fts fulltext indexing
support
to do
author
copyright

Provided by: libdbd-sqlite3-perl_1.40-3_amd64

NAME

       DBD::SQLite::Cookbook - The DBD::SQLite Cookbook

DESCRIPTION

       This is the DBD::SQLite cookbook.

       It is intended to provide a place to keep a variety of functions and formals for use in callback APIs in
       DBD::SQLite.

AGGREGATE FUNCTIONS

   Variance
       This is a simple aggregate function which returns a variance. It is adapted from an example
       implementation in pysqlite.

         package variance;

         sub new { bless [], shift; }

         sub step {
             my ( $self, $value ) = @_;

             push @$self, $value;
         }

         sub finalize {
             my $self = $_[0];

             my $n = @$self;

             # Variance is NULL unless there is more than one row
             return undef unless $n || $n == 1;

             my $mu = 0;
             foreach my $v ( @$self ) {
                 $mu += $v;
             }
             $mu /= $n;

             my $sigma = 0;
             foreach my $v ( @$self ) {
                 $sigma += ($v - $mu)**2;
             }
             $sigma = $sigma / ($n - 1);

             return $sigma;
         }

         # NOTE: If you use an older DBI (< 1.608),
         # use $dbh->func(..., "create_aggregate") instead.
         $dbh->sqlite_create_aggregate( "variance", 1, 'variance' );

       The function can then be used as:

         SELECT group_name, variance(score)
         FROM results
         GROUP BY group_name;

   Variance (Memory Efficient)
       A more efficient variance function, optimized for memory usage at the expense of precision:

         package variance2;

         sub new { bless {sum => 0, count=>0, hash=> {} }, shift; }

         sub step {
             my ( $self, $value ) = @_;
             my $hash = $self->{hash};

             # by truncating and hashing, we can comsume many more data points
             $value = int($value); # change depending on need for precision
                                   # use sprintf for arbitrary fp precision
             if (exists $hash->{$value}) {
                 $hash->{$value}++;
             } else {
                 $hash->{$value} = 1;
             }
             $self->{sum} += $value;
             $self->{count}++;
         }

         sub finalize {
             my $self = $_[0];

             # Variance is NULL unless there is more than one row
             return undef unless $self->{count} > 1;

             # calculate avg
             my $mu = $self->{sum} / $self->{count};

             my $sigma = 0;
             while (my ($h, $v) = each %{$self->{hash}}) {
                 $sigma += (($h - $mu)**2) * $v;
             }
             $sigma = $sigma / ($self->{count} - 1);

             return $sigma;
         }

       The function can then be used as:

         SELECT group_name, variance2(score)
         FROM results
         GROUP BY group_name;

   Variance (Highly Scalable)
       A third variable implementation, designed for arbitrarily large data sets:

         package variance3;

         sub new { bless {mu=>0, count=>0, S=>0}, shift; }

         sub step {
             my ( $self, $value ) = @_;
             $self->{count}++;
             my $delta = $value - $self->{mu};
             $self->{mu} += $delta/$self->{count};
             $self->{S} += $delta*($value - $self->{mu});
         }

         sub finalize {
             my $self = $_[0];
             return $self->{S} / ($self->{count} - 1);
         }

       The function can then be used as:

         SELECT group_name, variance3(score)
         FROM results
         GROUP BY group_name;

FTS fulltext indexing

   Sparing database disk space
       As explained in <http://www.sqlite.org/fts3.html#fts4_options>, several options are available to specify
       how SQLite should store indexed documents.

       One strategy is to use SQLite only for the fulltext index and metadata, and keep the full documents
       outside of SQLite; to do so, use the "content=""" option. For example, the following SQL creates an FTS4
       table with three columns - "a", "b", and "c":

          CREATE VIRTUAL TABLE t1 USING fts4(content="", a, b, c);

       Data can be inserted into such an FTS4 table using an INSERT statements. However, unlike ordinary FTS4
       tables, the user must supply an explicit integer docid value. For example:

         -- This statement is Ok:
         INSERT INTO t1(docid, a, b, c) VALUES(1, 'a b c', 'd e f', 'g h i');

         -- This statement causes an error, as no docid value has been provided:
         INSERT INTO t1(a, b, c) VALUES('j k l', 'm n o', 'p q r');

       Of course your application will need an algorithm for finding the external resource corresponding to any
       docid stored within SQLite. Furthermore, SQLite "offsets()" and "snippet()" functions cannot be used, so
       if such functionality is needed, it has to be directly programmed within the Perl application.

SUPPORT

       Bugs should be reported via the CPAN bug tracker at

       <http://rt.cpan.org/NoAuth/ReportBug.html?Queue=DBD-SQLite>

TO DO

       •   Add more and varied cookbook recipes, until we have enough to turn them into a separate CPAN
           distribution.

       •   Create a series of tests scripts that validate the cookbook recipes.

AUTHOR

       Adam Kennedy <adamk@cpan.org>

       Laurent Dami <dami@cpan.org>

COPYRIGHT

       Copyright 2009 - 2012 Adam Kennedy.

       This program is free software; you can redistribute it and/or modify it under the same terms as Perl
       itself.

       The full text of the license can be found in the LICENSE file included with this module.