Provided by: libbio-db-biofetch-perl_1.7.3-4_all

NAME

       Bio::DB::BioFetch - Database object interface to BioFetch retrieval

SYNOPSIS

        use Bio::DB::BioFetch;

        $bf = Bio::DB::BioFetch->new();

        $seq = $bf->get_Seq_by_id('HSFOS');  # EMBL or SWALL ID

        # change formats, storage procedures
        $bf = Bio::DB::BioFetch->new(-format        => 'fasta',
                                    -retrievaltype => 'tempfile',
                                    -db            => 'EMBL');

        $stream = $bf->get_Stream_by_id(['HSFOS','J00231']);
        while (my $s = $stream->next_seq) {
           print $s->seq,"\n";
        }
        # get a RefSeq entry
        $bf->db('refseq');
        eval {
            $seq = $bf->get_Seq_by_version('NM_006732.1'); # RefSeq VERSION
        };
        print "accession is ", $seq->accession_number, "\n" unless $@;

DESCRIPTION

       Bio::DB::BioFetch is a best-effort method for fetching sequence entries.  It queries the
       Web-based dbfetch server located at the EBI
       (http://www.ebi.ac.uk/Tools/dbfetch/dbfetch) to retrieve sequences from the EMBL or GenBank
       sequence repositories.

       This module implements the entire Bio::DB::RandomAccessI interface, plus the
       get_Stream_by_id() and get_Stream_by_acc() methods that are found in the
       Bio::DB::SwissProt interface.

FEEDBACK

   Mailing Lists
       User feedback is an integral part of the evolution of this and other Bioperl modules. Send
       your comments and suggestions preferably to one of the Bioperl mailing lists.  Your
       participation is much appreciated.

         bioperl-l@bioperl.org                  - General discussion
         http://bioperl.org/wiki/Mailing_lists  - About the mailing lists

   Support
       Please direct usage questions or support issues to the mailing list:

       bioperl-l@bioperl.org

       rather than to the module maintainer directly. Many experienced and responsive experts will
       be able to look at the problem and quickly address it. Please include a thorough
       description of the problem with code and data examples if at all possible.

   Reporting Bugs
       Report bugs to the Bioperl bug tracking system to help us keep track of the bugs and their
       resolution.  Bug reports can be submitted via the web:

         https://github.com/bioperl/bioperl-live/issues

AUTHOR - Lincoln Stein

       Email Lincoln Stein  <lstein@cshl.org>

       Also thanks to Heikki Lehvaslaiho <heikki-at-bioperl-dot-org> for the BioFetch server and
       interface specification.

APPENDIX

       The rest of the documentation details each of the object methods. Internal methods are
       usually preceded by an underscore (_).

   new
        Title   : new
        Usage   : $bf = Bio::DB::BioFetch->new(@args)
        Function: Construct a new Bio::DB::BioFetch object
        Returns : a Bio::DB::BioFetch object
        Args    : see below
        Throws  :

       @args are standard -name=>value options as listed in the following table. If you do not
       provide any options, the module assumes reasonable defaults.

         Option         Value                            Default
         ------         -----                            -------

         -baseaddress   location of dbfetch server       http://www.ebi.ac.uk/Tools/dbfetch/dbfetch
         -retrievaltype "tempfile" or "io_string"        io_string
         -format        "embl", "fasta", "swissprot",    embl
                         or "genbank"
         -db            "embl", "genbank" or "swissprot" embl
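
       As a minimal sketch of the option table above, the call below writes out every documented
       default explicitly, so it should behave like calling new() with no arguments. Loading the
       module inside eval is only an illustrative choice: it turns a missing installation into a
       warning instead of a compile-time failure.

```perl
use strict;
use warnings;

# The documented defaults from the option table, written out explicitly.
my %options = (
    -baseaddress   => 'http://www.ebi.ac.uk/Tools/dbfetch/dbfetch',
    -retrievaltype => 'io_string',
    -format        => 'embl',
    -db            => 'embl',
);

# Load the module at run time so that a missing installation is
# reported as a warning rather than aborting the script.
my $bf = eval {
    require Bio::DB::BioFetch;
    Bio::DB::BioFetch->new(%options);
};

if ($bf) {
    # db() is the documented get/set accessor for the database name.
    print "database: ", $bf->db, "\n";
}
else {
    warn "could not construct Bio::DB::BioFetch: $@";
}
```

       Overriding a single key, for example -format => 'fasta', leaves the other defaults in
       place.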

   new_from_registry
        Title   : new_from_registry
        Usage   : $biofetch = $db->new_from_registry(%config)
        Function: Creates a BioFetch object from the registry config hash
        Returns : itself
        Args    : A configuration hash (see Registry.pm)
        Throws  :

   get_Seq_by_id
        Title   : get_Seq_by_id
        Usage   : $seq = $db->get_Seq_by_id('ROA1_HUMAN')
        Function: Gets a Bio::Seq object by its name
        Returns : a Bio::Seq object
        Args    : the id (as a string) of a sequence
        Throws  : "id does not exist" exception
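
       Because a failed lookup is reported as an exception, callers usually wrap the call in eval.
       The sketch below shows one way to do that. fetch_or_undef() is a hypothetical helper, not
       part of BioPerl, and it works with any object that provides get_Seq_by_id().

```perl
use strict;
use warnings;

# fetch_or_undef() is a hypothetical helper, not part of BioPerl.
# It returns whatever get_Seq_by_id() returns, or undef if the
# lookup threw an exception.
sub fetch_or_undef {
    my ($db, $id) = @_;

    # Trap the exception so that one bad id does not end the program.
    my $seq = eval { $db->get_Seq_by_id($id) };
    if (my $err = $@) {
        warn "lookup of '$id' failed: $err";
        return;
    }
    return $seq;
}
```

       With a database handle $bf created by new(), this would be called as
       fetch_or_undef($bf, 'HSFOS').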

   get_Seq_by_acc
        Title   : get_Seq_by_acc
        Usage   : $seq = $db->get_Seq_by_acc('X77802');
        Function: Gets a Bio::Seq object by accession number
        Returns : A Bio::Seq object
        Args    : accession number (as a string)
        Throws  : "acc does not exist" exception

   get_Seq_by_gi
        Title   : get_Seq_by_gi
        Usage   : $seq = $db->get_Seq_by_gi('405830');
        Function: Gets a Bio::Seq object by gi number
        Returns : A Bio::Seq object
        Args    : gi number (as a string)
        Throws  : "gi does not exist" exception

   get_Seq_by_version
        Title   : get_Seq_by_version
        Usage   : $seq = $db->get_Seq_by_version('X77802.1');
        Function: Gets a Bio::Seq object by sequence version
        Returns : A Bio::Seq object
        Args    : accession.version (as a string)
        Throws  : "acc.version does not exist" exception
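
       The argument combines an accession and a version number in a single string. A caller may
       want to check that shape before sending a request; the sketch below does so with a regular
       expression. split_version() is a hypothetical helper, not part of BioPerl.

```perl
use strict;
use warnings;

# split_version() is a hypothetical helper, not part of BioPerl.
# It splits 'X77802.1' into ('X77802', '1') and returns an empty
# list when the string has no trailing ".<digits>" part.
sub split_version {
    my ($string) = @_;
    return unless defined $string;
    if ($string =~ /\A(.+)\.(\d+)\z/) {
        return ($1, $2);
    }
    return;
}

my ($acc, $version) = split_version('X77802.1');
print "accession=$acc version=$version\n";
```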

   get_Stream_by_id
         Title   : get_Stream_by_id
         Usage   : $stream = $db->get_Stream_by_id( [$uid1, $uid2] );
         Function: Gets a series of Seq objects by unique identifiers
         Returns : a Bio::SeqIO stream object
         Args    : $ref : a reference to an array of unique identifiers for
                          the desired sequence entries

   get_Stream_by_gi
         Title   : get_Stream_by_gi
         Usage   : $stream = $db->get_Stream_by_gi([$gi1, $gi2]);
         Function: Gets a series of Seq objects by gi numbers
         Returns : a Bio::SeqIO stream object
         Args    : $ref : a reference to an array of gi numbers for
                          the desired sequence entries
         Note    : For GenBank, this just calls the same code for get_Stream_by_id()

   get_Stream_by_batch
         Title   : get_Stream_by_batch
         Usage   : $stream = $db->get_Stream_by_batch($ref);
         Function: Get a series of Seq objects by their IDs
         Example :
         Returns : a Bio::SeqIO stream object
         Args    : $ref : an array reference containing a list of unique
                   ids/accession numbers.

       In some of the Bio::DB::* modules, get_Stream_by_id() is called get_Stream_by_batch().
       Since there seems to be no consensus, this is provided as an alias.
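
       All of the get_Stream_by_* methods return a stream that is drained with next_seq(). As a
       sketch of a typical consumer, the hypothetical helper below (not part of BioPerl) collects
       the length of every entry in a stream, keyed by display_id. It only assumes that the stream
       offers next_seq() and that each entry offers display_id() and length().

```perl
use strict;
use warnings;

# collect_lengths() is a hypothetical helper, not part of BioPerl.
# It drains any object that offers next_seq() and returns a hash
# mapping each entry's display_id to its sequence length.
sub collect_lengths {
    my ($stream) = @_;
    my %length_of;
    while (my $seq = $stream->next_seq) {
        $length_of{ $seq->display_id } = $seq->length;
    }
    return %length_of;
}
```

       With a database handle $bf, the stream could come from
       $bf->get_Stream_by_batch(['HSFOS', 'J00231']) or, equivalently, from get_Stream_by_id().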

The remainder of these methods are for internal use

   get_request
        Title   : get_request
        Usage   : my $request = $self->get_request(%qualifiers)
        Function: builds the HTTP request for a query
        Returns : an HTTP::Request object
        Args    : %qualifiers = a hash of qualifiers (ids, format, etc)

   default_format
        Title   : default_format
        Usage   : $format = $self->default_format
        Function: return the default format
        Returns : a string
        Args    :

   default_db
        Title   : default_db
        Usage   : $db = $self->default_db
        Function: return the default database
        Returns : a string
        Args    :

   db
        Title   : db
        Usage   : $db = $self->db([$db])
        Function: get/set the database
        Returns : a string
        Args    : new database

   postprocess_data
        Title   : postprocess_data
        Usage   : $self->postprocess_data ( 'type' => 'string',
                                            'location' => \$datastr);
        Function: process downloaded data before loading into a Bio::SeqIO
        Returns : void
        Args    : hash with two keys - 'type' can be 'string' or 'file'
                                     - 'location' either file location or string
                                        reference containing data
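
       To illustrate the two calling forms, the sketch below is a stand-alone function that
       accepts the same 'type' and 'location' keys and handles both cases. Stripping carriage
       returns is only an example task and strip_carriage_returns() is a hypothetical name; this
       is not what the module's own postprocess_data() does.

```perl
use strict;
use warnings;

# strip_carriage_returns() is a hypothetical stand-in that accepts the
# same two keys as postprocess_data(): 'type' and 'location'.
sub strip_carriage_returns {
    my (%args) = @_;

    if ($args{type} eq 'string') {
        # 'location' is a reference to the data; edit it in place.
        ${ $args{location} } =~ s/\r//g;
    }
    elsif ($args{type} eq 'file') {
        # 'location' is a file name; read, clean and rewrite the file.
        open my $in, '<', $args{location}
            or die "cannot read $args{location}: $!";
        my $data = do { local $/; <$in> };
        close $in;
        $data = '' unless defined $data;
        $data =~ s/\r//g;
        open my $out, '>', $args{location}
            or die "cannot write $args{location}: $!";
        print {$out} $data;
        close $out or die "cannot close $args{location}: $!";
    }
    else {
        die "unknown type '$args{type}'\n";
    }
    return;
}

my $datastr = "line one\r\nline two\r\n";
strip_carriage_returns('type' => 'string', 'location' => \$datastr);
```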

   request_format
        Title   : request_format
        Usage   : my ($req_format, $ioformat) = $self->request_format;
                  $self->request_format("genbank");
                  $self->request_format("fasta");
        Function: Get/Set sequence format retrieval. The get-form will normally not
                  be used outside of this and derived modules.
        Returns : Array of two strings, the first representing the format for
                  retrieval, and the second specifying the corresponding SeqIO format.
        Args    : $format = sequence format
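
       Two strings are returned because the name used when requesting data and the name of the
       Bio::SeqIO parser need not be identical; Bio::SeqIO, for instance, calls its SwissProt
       parser 'swiss'. The sketch below mimics that return shape. The mapping table and
       format_pair() are assumptions made for illustration, not the module's internal table.

```perl
use strict;
use warnings;

# Illustrative mapping from a retrieval format name to a Bio::SeqIO
# format name. This table is an assumption made for the example; it is
# not copied from the module's internals.
my %seqio_format_for = (
    embl      => 'embl',
    fasta     => 'fasta',
    genbank   => 'genbank',
    swissprot => 'swiss',
);

# format_pair() is a hypothetical helper that mimics the documented
# return shape: (retrieval format, SeqIO format).
sub format_pair {
    my ($format) = @_;
    my $key = lc $format;
    die "unsupported format '$format'\n"
        unless exists $seqio_format_for{$key};
    return ($key, $seqio_format_for{$key});
}

my ($req_format, $ioformat) = format_pair('swissprot');
print "retrieval=$req_format seqio=$ioformat\n";
```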

   Bio::DB::WebDBSeqI methods
       Overrides of WebDBSeqI methods that help newcomers retrieve sequences.  The EMBL database
       is all too often passed RefSeq accessions; these overrides redirect such calls. See
       Bio::DB::RefSeq.

   get_Stream_by_acc
         Title   : get_Stream_by_acc
         Usage   : $stream = $db->get_Stream_by_acc([$acc1, $acc2]);
         Function: Gets a series of Seq objects by accession numbers
         Returns : a Bio::SeqIO stream object
         Args    : $ref : a reference to an array of accession numbers for
                          the desired sequence entries

   _check_id
         Title   : _check_id
         Usage   :
         Function: Throw on whole chromosome NCBI sequences not in sequence databases
                   and redirect RefSeq accession requests sent to EMBL.
         Returns :
         Args    : $id(s), $string
         Throws  : if accession number indicates whole chromosome NCBI sequence
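
       The decision described above can be sketched as a small classifier. classify_id() is a
       hypothetical helper, and the prefix rules are assumptions made for illustration: 'NT_'
       stands for a sequence that cannot be retrieved, and any other 'N?_' prefix stands for a
       RefSeq accession. The module's actual patterns may differ.

```perl
use strict;
use warnings;

# classify_id() is a hypothetical helper. The prefix rules below are
# assumptions for illustration, not the module's actual patterns.
#   'reject'   : the id should cause an exception
#   'redirect' : a RefSeq-style id was sent to a non-RefSeq database
#   'ok'       : the request can go ahead unchanged
sub classify_id {
    my ($id, $db) = @_;
    return 'reject'   if $id =~ /\ANT_/;
    return 'redirect' if $id =~ /\AN[A-Z]_/ && lc($db) ne 'refseq';
    return 'ok';
}

# 'NT_example' is a placeholder, not a real accession.
for my $case (['HSFOS', 'embl'], ['NM_006732', 'embl'],
              ['NM_006732', 'refseq'], ['NT_example', 'embl']) {
    my ($id, $db) = @$case;
    printf "%-12s %-8s %s\n", $id, $db, classify_id($id, $db);
}
```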