bionic (3) Bio::ASN1::EntrezGene::Indexer.3pm.gz

Provided by: libbio-asn1-entrezgene-perl_1.720-1_all

NAME

       Bio::ASN1::EntrezGene::Indexer - Indexes NCBI Sequence files.

VERSION

       version 1.72

SYNOPSIS

         use Bio::ASN1::EntrezGene::Indexer;

         # creating & using the index is just a few lines
         my $inx = Bio::ASN1::EntrezGene::Indexer->new(
           -filename => 'entrezgene.idx',
           -write_flag => 'WRITE'); # needed for make_index call, but if opening
                                    # existing index file, don't set write flag!
         $inx->make_index('Homo_sapiens', 'Mus_musculus', 'Rattus_norvegicus');
         my $seq = $inx->fetch(10); # Bio::Seq obj for Entrez Gene #10
         # alternatively, if one prefers just a data structure instead of objects
         $seq = $inx->fetch_hash(10); # a hash produced by Bio::ASN1::EntrezGene
                                   # that contains all data in the Entrez Gene record

         # the input files (e.g. 'Homo_sapiens') can be downloaded from the
         # NCBI Entrez Gene FTP site, DATA/ASN/Mammalia directory

DESCRIPTION

       Bio::ASN1::EntrezGene::Indexer is a Perl indexer for NCBI Entrez Gene genome databases. It scans files of
       ASN.1-formatted Entrez Gene records and stores the file position of each record in a way compliant with
       the Bioperl standard (in fact, it is a subclass of Bioperl's index objects).
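
       The idea of storing a file position per record can be illustrated with a minimal, self-contained
       sketch. It is not this module's implementation: the helper names build_offset_index and read_record
       are hypothetical, the record layout is a heavily simplified assumption, and a plain hash stands in
       for the on-disk index file.

```perl
use strict;
use warnings;

# Toy stand-in for an Entrez Gene ASN.1 file (hypothetical, heavily simplified).
my $data = <<'END';
Entrezgene ::= {
  track-info {
    geneid 10,
    status live
  }
}
Entrezgene ::= {
  track-info {
    geneid 12,
    status live
  }
}
END

# Pass 1: remember the byte offset at which each record starts,
# keyed by the first gene id seen inside that record.
sub build_offset_index {
    my ($fh) = @_;
    my (%offset_of, $record_start);
    while (1) {
        my $pos  = tell $fh;          # offset of the line about to be read
        my $line = <$fh>;
        last unless defined $line;
        if ($line =~ /^Entrezgene ::= \{/) {
            $record_start = $pos;
        }
        elsif (defined $record_start && $line =~ /\bgeneid\s+(\d+)/) {
            $offset_of{$1} = $record_start;
            undef $record_start;      # only the first geneid of a record counts
        }
    }
    return \%offset_of;
}

# Pass 2: seek to a stored offset and read lines until the next record starts.
sub read_record {
    my ($fh, $offset) = @_;
    seek $fh, $offset, 0 or die "seek failed: $!";
    my $record = <$fh>;
    while (defined(my $line = <$fh>)) {
        last if $line =~ /^Entrezgene ::= \{/;
        $record .= $line;
    }
    return $record;
}

open my $fh, '<', \$data or die "cannot open in-memory file: $!";
my $index  = build_offset_index($fh);
my $record = read_record($fh, $index->{12});
print $record;                        # raw text of the record for gene id 12
```

       Because only the offsets are kept, the indexing pass never has to parse a record; the raw text is
       read back only when a particular gene id is requested.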

       Note that this module does not parse records, because it needs to run fast and grab only the gene ids.
       To parse records, use Bio::ASN1::EntrezGene or, better yet, Bio::SeqIO with format 'entrezgene'.

       It takes this module (version 1.07) 21 seconds to index the human genome Entrez Gene file (Apr. 5/2005
       download) on one 2.4 GHz Intel Xeon processor.

METHODS

   fetch
         Parameters: $geneid - id for the Entrez Gene record to be retrieved
         Example:    my $seq = $indexer->fetch(10); # get Entrez Gene #10
         Function:   fetch the data for the given Entrez Gene id.
         Returns:    A Bio::Seq object produced by Bio::SeqIO::entrezgene
         Notes:      One needs to have Bio::SeqIO::entrezgene installed before
                       calling this function!

   fetch_hash
         Parameters: $geneid - id for the Entrez Gene record to be retrieved
         Example:    my $hash = $indexer->fetch_hash(10); # get Entrez Gene #10
         Function:   fetch a hash produced by Bio::ASN1::EntrezGene for given Entrez
                       Gene id.
         Returns:    A data structure containing all data items from the Entrez
                       Gene record.
         Notes:      Alternative to fetch()

INTERNAL METHODS

   _version
   _type_stamp
   _index_file
   _file_format
   _file_handle
         Title   : _file_handle
         Usage   : $fh = $index->_file_handle( INT )
         Function: Returns an open filehandle for the file
                   index INT.  On opening a new filehandle it
                   caches it in the @{$index->_filehandle} array.
                   If the requested filehandle is already open,
                   it simply returns it from the array.
         Example : $first_file_indexed = $index->_file_handle( 0 );
         Returns : ref to a filehandle
         Args    : INT
         Notes   : This function is copied from Bio::Index::Abstract. Once that module
                     changes file handle code like I do below to fit perl 5.005_03, this
                     sub would be removed from this module
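
       The caching behaviour described for _file_handle can be sketched in isolation as follows. This is
       an illustration under stated assumptions, not the code copied from Bio::Index::Abstract: the
       function file_handle, the array @handle_cache, and the temporary files are all hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Two throwaway files stand in for the indexed data files (hypothetical).
my @files;
for my $text ("first\n", "second\n") {
    my ($tmp_fh, $name) = tempfile(UNLINK => 1);
    print {$tmp_fh} $text;
    close $tmp_fh or die "close failed: $!";
    push @files, $name;
}

my @handle_cache;   # slot $i holds the open handle for $files[$i]

# Return a cached handle if one exists; otherwise open, cache, and return it.
sub file_handle {
    my ($i) = @_;
    return $handle_cache[$i] if defined $handle_cache[$i];
    open my $fh, '<', $files[$i] or die "cannot open $files[$i]: $!";
    $handle_cache[$i] = $fh;
    return $fh;
}

my $first  = file_handle(0);
my $again  = file_handle(0);   # served from the cache, no second open()
my $second = file_handle(1);
print "same handle reused\n" if $first == $again;
```

       Reusing one handle per file avoids reopening the data file on every lookup.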

PREREQUISITE

       Bio::ASN1::EntrezGene, Bioperl version that contains Stefan Kirov's entrezgene.pm and all dependencies
       therein.

INSTALLATION

       Same as Bio::ASN1::EntrezGene

SEE ALSO

       For details on the various parsers I generated for Entrez Gene, and for example scripts that use or
       benchmark the modules, please see <http://sourceforge.net/projects/egparser/>.  Those other parsers
       etc. are included in the V1.05 download.

CITATION

       Liu, Mingyi, and Andrei Grigoriev. "Fast parsers for Entrez Gene."  Bioinformatics 21, no. 14 (2005):
       3189-3190.

OPERATING SYSTEMS SUPPORTED

       Any OS that Perl & Bioperl run on.

FEEDBACK

   Mailing lists
       User feedback is an integral part of the evolution of this and other Bioperl modules. Send your comments
       and suggestions preferably to the Bioperl mailing list.  Your participation is much appreciated.

         bioperl-l@bioperl.org                  - General discussion
         http://bioperl.org/wiki/Mailing_lists  - About the mailing lists

   Support
       Please direct usage questions or support issues to the mailing list:

         bioperl-l@bioperl.org

       rather than to the module maintainer directly. Many experienced and responsive experts will be able to
       look at the problem and quickly address it. Please include a thorough description of the problem with
       code and data examples if at all possible.

   Reporting bugs
       Report bugs to the Bioperl bug tracking system to help us keep track of the bugs and their resolution.
       Bug reports can be submitted via the web:

         https://redmine.open-bio.org/projects/bioperl/

AUTHOR

       Dr. Mingyi Liu <mingyiliu@gmail.com>

       This software is copyright (c) 2005 by Mingyi Liu, 2005 by GPC Biotech AG, and 2005 by Altana Research
       Institute.

       This software is available under the same terms as the perl 5 programming language system itself.