Provided by: libbio-asn1-entrezgene-perl_1.730-3_all

NAME

       Bio::ASN1::EntrezGene::Indexer - Indexes NCBI Sequence files.

VERSION

       version 1.73

SYNOPSIS

         use Bio::ASN1::EntrezGene::Indexer;

         # creating & using the index is just a few lines
         my $inx = Bio::ASN1::EntrezGene::Indexer->new(
           -filename => 'entrezgene.idx',
           -write_flag => 'WRITE'); # needed for make_index call, but if opening
                                    # existing index file, don't set write flag!
         $inx->make_index('Homo_sapiens', 'Mus_musculus', 'Rattus_norvegicus');
         my $seq = $inx->fetch(10); # Bio::Seq obj for Entrez Gene #10
         # alternatively, if one prefers just a data structure instead of objects
         $seq = $inx->fetch_hash(10); # a hash produced by Bio::ASN1::EntrezGene
                                   # that contains all data in the Entrez Gene record

         # in case you wonder, input files such as 'Homo_sapiens' can be downloaded
         # from the NCBI Entrez Gene FTP site, DATA/ASN/Mammalia directory

DESCRIPTION

       Bio::ASN1::EntrezGene::Indexer is a Perl indexer for NCBI Entrez Gene genome databases. It
       processes ASN.1-formatted Entrez Gene files and stores the file position of each record
       in a way compliant with the Bioperl standard (in fact, it is a subclass of Bioperl's index
       objects).
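
       The idea of storing a file position per record can be illustrated with a minimal,
       self-contained sketch. It is not the module's implementation: the two toy records, the
       simplified record layout, and the helper names build_offset_index and read_record_at are
       assumptions made for illustration, and the index is kept in memory rather than in an
       index file.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Two toy records in a simplified, assumed layout (not real NCBI data).
my $data = <<'END';
Entrezgene ::= {
  track-info {
    geneid 10,
    status live
  }
}
Entrezgene ::= {
  track-info {
    geneid 12,
    status live
  }
}
END

my ($out, $path) = tempfile(UNLINK => 1);
print {$out} $data;
close $out or die "close failed: $!";

# Build an in-memory index: gene id => byte offset where its record starts.
sub build_offset_index {
    my ($file) = @_;
    open my $fh, '<', $file or die "Cannot open $file: $!";
    my (%offset_of, $record_start);
    while (1) {
        my $pos  = tell $fh;      # position before the next line is read
        my $line = <$fh>;
        last unless defined $line;
        if ($line =~ /^Entrezgene ::= \{/) {
            $record_start = $pos;
        }
        elsif (defined $record_start && $line =~ /^\s*geneid\s+(\d+)/) {
            $offset_of{$1} = $record_start;
            undef $record_start;  # ignore later geneid fields in this record
        }
    }
    close $fh;
    return \%offset_of;
}

# Jump straight to one record without scanning the file from the top.
sub read_record_at {
    my ($file, $offset) = @_;
    open my $fh, '<', $file or die "Cannot open $file: $!";
    seek $fh, $offset, 0 or die "seek failed: $!";
    my $record = <$fh>;           # the "Entrezgene ::= {" line itself
    while (defined(my $line = <$fh>)) {
        last if $line =~ /^Entrezgene ::= \{/;   # next record begins
        $record .= $line;
    }
    close $fh;
    return $record;
}

my $index  = build_offset_index($path);
my $record = read_record_at($path, $index->{12});
print $record;
```

       Only the gene id and the offset are extracted while scanning, which is why an indexer of
       this kind can stay fast: full parsing is deferred until a record is actually fetched.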

       Note that this module does not parse records, because it needs to run fast and grabs only
       the gene ids.  To parse records, use Bio::ASN1::EntrezGene, or better yet, use
       Bio::SeqIO with format 'entrezgene'.

       It takes this module (version 1.07) 21 seconds to index the human genome Entrez Gene file
       (Apr. 5/2005 download) on one 2.4 GHz Intel Xeon processor.

METHODS

   fetch
         Parameters: $geneid - id for the Entrez Gene record to be retrieved
         Example:    my $seq = $indexer->fetch(10); # get Entrez Gene #10
         Function:   fetch the data for the given Entrez Gene id.
         Returns:    A Bio::Seq object produced by Bio::SeqIO::entrezgene
         Notes:      One needs to have Bio::SeqIO::entrezgene installed before
                       calling this function!

   fetch_hash
         Parameters: $geneid - id for the Entrez Gene record to be retrieved
         Example:    my $hash = $indexer->fetch_hash(10); # get Entrez Gene #10
         Function:   fetch a hash produced by Bio::ASN1::EntrezGene for the given
                       Entrez Gene id.
         Returns:    A data structure containing all data items from the Entrez
                       Gene record.
         Notes:      Alternative to fetch()
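
       Because fetch() depends on Bio::SeqIO::entrezgene while fetch_hash() does not, a caller
       may want to choose between them at run time. The following is a minimal sketch;
       have_entrezgene_seqio and fetch_gene are hypothetical helpers, not part of this module,
       and $inx is assumed to be an already opened indexer object.

```perl
use strict;
use warnings;

# True if Bio::SeqIO::entrezgene can be loaded in this Perl installation.
sub have_entrezgene_seqio {
    return eval { require Bio::SeqIO::entrezgene; 1 } ? 1 : 0;
}

# Hypothetical helper: prefer fetch(), fall back to fetch_hash().
# $inx is assumed to be a Bio::ASN1::EntrezGene::Indexer opened on an
# existing index file.
sub fetch_gene {
    my ($inx, $gene_id) = @_;
    return have_entrezgene_seqio()
        ? $inx->fetch($gene_id)         # Bio::Seq object
        : $inx->fetch_hash($gene_id);   # plain data structure
}
```

       A call such as fetch_gene($inx, 10) then returns either kind of result, so the calling
       code has to be prepared to handle both.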

INTERNAL METHODS

   _version
   _type_stamp
   _index_file
   _file_format
   _file_handle
         Title   : _file_handle
         Usage   : $fh = $index->_file_handle( INT )
         Function: Returns an open filehandle for the file
                   index INT.  On opening a new filehandle it
                   caches it in the @{$index->_filehandle} array.
                   If the requested filehandle is already open,
                   it simply returns it from the array.
         Example : $first_file_indexed = $index->_file_handle( 0 );
         Returns : ref to a filehandle
         Args    : INT
         Notes   : This function is copied from Bio::Index::Abstract. Once that module
                     changes its file handle code, as done here, to work with perl 5.005_03,
                     this sub will be removed from this module
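
       The caching behaviour described for _file_handle can be sketched as follows. This is an
       illustration of the technique only, not the code of this module or of
       Bio::Index::Abstract: the %index hash, its files and handles keys, and the file_handle
       function are hypothetical stand-ins for the index object.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create one empty temporary file to play the role of an indexed file.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close $tmp_fh or die "close failed: $!";

# Hypothetical stand-in for the index object: the indexed file names
# plus an array caching one open handle per file number.
my %index = (files => [$path], handles => []);

sub file_handle {
    my ($idx, $i) = @_;

    # Reuse the cached handle when file number $i was opened before.
    return $idx->{handles}[$i] if defined $idx->{handles}[$i];

    my $name = $idx->{files}[$i];
    defined $name or die "No file with index $i";
    open my $fh, '<', $name or die "Cannot open $name: $!";

    # Cache the new handle and return it.
    return $idx->{handles}[$i] = $fh;
}

my $first  = file_handle(\%index, 0);
my $second = file_handle(\%index, 0);
print "handle was reused\n" if $first == $second;
```

       Keeping handles open avoids reopening the same data file for every fetch, at the cost
       of one open file descriptor per indexed file.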

PREREQUISITE

       Bio::ASN1::EntrezGene, Bioperl version that contains Stefan Kirov's entrezgene.pm and all
       dependencies therein.

INSTALLATION

       Same as Bio::ASN1::EntrezGene

SEE ALSO

       For details on the various parsers I wrote for Entrez Gene and example scripts that use
       or benchmark the modules, please see <http://sourceforge.net/projects/egparser/>.
       Those other parsers are included in the V1.05 download.

CITATION

       Liu, Mingyi, and Andrei Grigoriev. "Fast parsers for Entrez Gene."  Bioinformatics 21, no.
       14 (2005): 3189-3190.

OPERATING SYSTEMS SUPPORTED

       Any OS that Perl & Bioperl run on.

FEEDBACK

   Mailing lists
       User feedback is an integral part of the evolution of this and other Bioperl modules. Send
       your comments and suggestions preferably to the Bioperl mailing list.  Your participation
       is much appreciated.

         bioperl-l@bioperl.org              - General discussion
         http://bioperl.org/Support.html    - About the mailing lists

   Support
       Please direct usage questions or support issues to the mailing list, bioperl-l@bioperl.org,
       rather than to the module maintainer directly. Many experienced and responsive experts will
       be able to look at the problem and quickly address it. Please include a thorough description
       of the problem with code and data examples if at all possible.

   Reporting bugs
       Report bugs to the Bioperl bug tracking system to help us keep track of the bugs and their
       resolution. Bug reports can be submitted via the web:

         https://github.com/bioperl/bio-asn1-entrezgene/issues

AUTHOR

       Dr. Mingyi Liu <mingyiliu@gmail.com>

       This software is copyright (c) 2005 by Mingyi Liu, 2005 by GPC Biotech AG, and 2005 by
       Altana Research Institute.

       This software is available under the same terms as the perl 5 programming language system
       itself.