oracular (3) Bio::DB::SeqFeature::Store::berkeleydb.3pm.gz

Provided by: libbio-db-seqfeature-perl_1.7.4-2_all

NAME

       Bio::DB::SeqFeature::Store::berkeleydb -- Storage and retrieval of sequence annotation
       data in Berkeleydb files

SYNOPSIS

         use Bio::DB::SeqFeature::Store;

         # Create a database from the feature files located in /home/fly4.3 and store
         # the database index in the same directory:
         my $db = Bio::DB::SeqFeature::Store->new( -adaptor => 'berkeleydb',
                                                   -dir     => '/home/fly4.3');

         # Create a database that will monitor the files in /home/fly4.3, but store
         # the indexes in /var/databases/fly4.3
         $db    = Bio::DB::SeqFeature::Store->new( -adaptor => 'berkeleydb',
                                                   -dir     => '/home/fly4.3',
                                                   -dsn     => '/var/databases/fly4.3');

         # Create a feature database from scratch
         $db    = Bio::DB::SeqFeature::Store->new( -adaptor => 'berkeleydb',
                                                   -dsn     => '/var/databases/fly4.3',
                                                   -create  => 1);

         # get a feature from somewhere
         my $feature = Bio::SeqFeature::Generic->new(...);

         # store it
         $db->store($feature) or die "Couldn't store!";

         # storing the feature sets its primary_id to the ID it was assigned
         # in the database...
         my $id = $feature->primary_id;

         # get the feature back out
         my $f  = $db->fetch($id);

         # change the feature and update it
         $f->start(100);
         $db->update($f) or die "Couldn't update!";

         # use the GFF3 loader to do a bulk write:
         my $loader = Bio::DB::SeqFeature::Store::GFF3Loader->new(-store   => $db,
                                                                  -verbose => 0,
                                                                  -fast    => 1);
         $loader->load('/home/fly4.3/dmel-all.gff');

         # searching...
         # ...by id
         my @features = $db->fetch_many(@list_of_ids);

         # ...by name
         @features = $db->get_features_by_name('ZK909');

         # ...by alias
         @features = $db->get_features_by_alias('sma-3');

         # ...by type
         @features = $db->get_features_by_type('gene');

         # ...by location
         @features = $db->get_features_by_location(-seq_id=>'Chr1',-start=>4000,-end=>600000);

         # ...by attribute
         @features = $db->get_features_by_attribute({description => 'protein kinase'});

         # ...by the GFF "Note" field
         my @result_list = $db->search_notes('kinase');

         # ...by arbitrary combinations of selectors
         @features = $db->features(-name => $name,
                                   -type => $types,
                                   -seq_id => $seqid,
                                   -start  => $start,
                                   -end    => $end,
                                   -attributes => $attributes);

         # ...using an iterator
         my $iterator = $db->get_seq_stream(-name => $name,
                                            -type => $types,
                                            -seq_id => $seqid,
                                            -start  => $start,
                                            -end    => $end,
                                            -attributes => $attributes);

         while (my $feature = $iterator->next_seq) {
           # do something with the feature
         }

         # ...limiting the search to a particular region
         my $segment  = $db->segment('Chr1',5000=>6000);
         @features    = $segment->features(-type=>['mRNA','match']);

         # what feature types are defined in the database?
         my @types    = $db->types;

         # getting & storing sequence information
         $db->insert_sequence('Chr1','GATCCCCCGGGATTCCAAAA...');
         # Warning: fetch_sequence() returns a string, and not a PrimarySeq object
         my $sequence = $db->fetch_sequence('Chr1',5000=>6000);

         # create a new feature in the database
         $feature = $db->new_feature(-primary_tag => 'mRNA',
                                     -seq_id      => 'chr3',
                                     -start       => 10000,
                                     -end         => 11000);

DESCRIPTION

       Bio::DB::SeqFeature::Store::berkeleydb is the Berkeleydb adaptor for
       Bio::DB::SeqFeature::Store. You will not create it directly, but instead use
       Bio::DB::SeqFeature::Store->new() to do so.

       See Bio::DB::SeqFeature::Store for complete usage instructions.

   Using the berkeleydb adaptor
       The Berkeley database consists of a series of Berkeleydb index files, and a couple of
       special purpose indexes. You can create the index files from scratch by creating a new
       database and calling new_feature() repeatedly, you can create the database and then bulk
       populate it using the GFF3 loader, or you can monitor a directory of preexisting GFF3 and
       FASTA files and rebuild the indexes whenever one or more of the files changes. The last
       mode is probably the most convenient. Note that the indexer will only pay attention to
       files that end with .gff3, .wig and .fa.
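
       The filename rule above can be sketched in plain Perl. This is only an illustration of
       the documented rule, not the adaptor's own code: indexable_files() is a hypothetical
       helper, the file names are made up, and the adaptor's real matching may differ (for
       example in case sensitivity).

```perl
use strict;
use warnings;

# Hypothetical helper (not part of BioPerl): keep only the names that end
# with one of the suffixes the documentation lists (.gff3, .wig, .fa).
sub indexable_files {
    my @names = @_;
    return grep { /\.(?:gff3|wig|fa)\z/ } @names;
}

# Made-up directory listing used purely for illustration.
my @candidates = ('dmel-all.gff3', 'dmel-all.gff', 'chr1.fa',
                  'coverage.wig',  'README.txt',   'chr2.fasta');

my @seen = indexable_files(@candidates);
print "$_\n" for @seen;
```

       Under this reading, files such as dmel-all.gff or chr2.fasta would have to be renamed
       before the monitoring mode notices them, or loaded explicitly with the GFF3 loader.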

       The new() constructor
           The new() constructor method accepts all the arguments recognized by
           Bio::DB::SeqFeature::Store, and a few additional ones.

           Standard arguments:

            Name               Value
            ----               -----

            -adaptor           The name of the Adaptor class (default DBI::mysql)

            -serializer        The name of the serializer class (default Storable)

            -index_subfeatures Whether or not to make subfeatures searchable
                               (default true)

            -cache             Activate LRU caching feature -- size of cache

            -compress          Compresses features before storing them in database
                               using Compress::Zlib

           Adaptor-specific arguments

            Name               Value
            ----               -----

            -dsn               Where the index files are stored

            -dir               Where the source (GFF3, FASTA) files are stored

            -autoindex         An alias for -dir.

            -write             Pass true to open the index files for writing.

            -create            Pass true to create the index files if they don't exist
                               (implies -write=>1)

            -locking           Use advisory locking to avoid one process trying to read
                               from the database while another is updating it (may not
                               work properly over NFS).

            -temp              Pass true to create temporary index files that will
                               be deleted once the script exits.

            -verbose           Pass true to report autoindexing operations on STDERR
                               (default true).

           Examples:

           To create an empty database which will be populated using calls to store() or
           new_feature(), or which will be bulk-loaded using the GFF3 loader:

             $db     = Bio::DB::SeqFeature::Store->new( -adaptor => 'berkeleydb',
                                                        -dsn     => '/var/databases/fly4.3',
                                                        -create  => 1);

           To open a preexisting database in read-only mode:

             $db     = Bio::DB::SeqFeature::Store->new( -adaptor => 'berkeleydb',
                                                        -dsn     => '/var/databases/fly4.3');

           To open a preexisting database in read/write (update) mode:

             $db     = Bio::DB::SeqFeature::Store->new( -adaptor => 'berkeleydb',
                                                        -dsn     => '/var/databases/fly4.3',
                                                        -write   => 1);

           To monitor a set of GFF3 and FASTA files located in a directory and create/update the
           database indexes as needed (the indexes will be stored in a new subdirectory named
           "indexes"):

             $db     = Bio::DB::SeqFeature::Store->new( -adaptor => 'berkeleydb',
                                                        -dir     => '/var/databases/fly4.3');

           As above, but store the source files and index files in separate directories:

             $db     = Bio::DB::SeqFeature::Store->new( -adaptor => 'berkeleydb',
                                                        -dsn     => '/var/databases/fly4.3',
                                                        -dir     => '/home/gff3_files/fly4.3');

           To be indexed, files must end with one of .gff3 (GFF3 format), .fa (FASTA format) or
           .wig (WIG format).

           -autoindex is an alias for -dir.

           You should specify -locking in a multiuser environment, including the case in which
           the database is being used by a web server at the same time another user might be
           updating it.
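
           The -temp argument is not shown in the examples above. A minimal sketch of a scratch
           database, assuming BioPerl and DB_File are installed and that -temp on its own is
           enough to create writable index files in a temporary directory (an assumption
           about the adaptor; the feature values are the ones used in the SYNOPSIS):

```perl
use strict;
use warnings;
use Bio::DB::SeqFeature::Store;

# Scratch database: the index files are expected to live in a temporary
# directory and to be deleted once the script exits (-temp).
# -create is passed explicitly so the sketch does not depend on -temp
# implying it.
my $db = Bio::DB::SeqFeature::Store->new(
    -adaptor => 'berkeleydb',
    -temp    => 1,
    -create  => 1,
);

# Create and store a feature in one step.
my $feature = $db->new_feature(
    -primary_tag => 'mRNA',
    -seq_id      => 'chr3',
    -start       => 10000,
    -end         => 11000,
);

# Fetch it back by the primary ID assigned by the database.
my $copy = $db->fetch($feature->primary_id);
printf "%s %s:%d..%d\n",
    $copy->primary_tag, $copy->seq_id, $copy->start, $copy->end;
```

           A database of this kind is convenient for tests and one-off scripts, since nothing
           needs to be cleaned up afterwards.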

       See Bio::DB::SeqFeature::Store for all the access methods supported by this adaptor. The
       various methods for storing and updating features and sequences into the database are
       supported, but no locking is performed by default. If two processes try to update the
       same database simultaneously, the database will likely become corrupted; see the
       -locking argument described above.
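
       Where the adaptor's -locking argument is not suitable, scripts that update the database
       can also serialize themselves with an advisory lock of their own. The following is a
       sketch using only core Perl modules; with_exclusive_lock() and the lock file name are
       hypothetical and not part of BioPerl, and it protects the database only if every writer
       uses the same lock file.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempdir);

# Hypothetical helper: run $code while holding an exclusive advisory lock
# on $lockfile. Other processes calling this helper with the same file
# block until the lock is released.
sub with_exclusive_lock {
    my ($lockfile, $code) = @_;
    open(my $fh, '>>', $lockfile) or die "Cannot open $lockfile: $!";
    flock($fh, LOCK_EX)           or die "Cannot lock $lockfile: $!";
    my @result = eval { $code->() };
    my $err    = $@;
    close($fh);                   # closing the handle releases the lock
    die $err if $err;
    return @result;
}

# Demonstration with a throwaway lock file; a real script would keep the
# lock file next to the index files and call $db->store(...) inside the block.
my $dir      = tempdir(CLEANUP => 1);
my $lockfile = "$dir/update.lock";

my ($ok) = with_exclusive_lock($lockfile, sub {
    # ... update the database here ...
    return 1;
});
print "update finished\n" if $ok;
```

       As with any flock-based scheme, this may not work properly over NFS.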

BUGS

       This is an early version, so there are certainly some bugs. Please use the BioPerl bug
       tracking system to report bugs.

SEE ALSO

       bioperl, Bio::DB::SeqFeature, Bio::DB::SeqFeature::Store,
       Bio::DB::SeqFeature::Store::GFF3Loader, Bio::DB::SeqFeature::Segment,
       Bio::DB::SeqFeature::Store::memory, Bio::DB::SeqFeature::Store::DBI::mysql

AUTHOR

       Lincoln Stein <lstein@cshl.org>.

       Copyright (c) 2006 Cold Spring Harbor Laboratory.

       This library is free software; you can redistribute it and/or modify it under the same
       terms as Perl itself.