oracular (3) Bio::DB::WebDBSeqI.3pm.gz

Provided by: libbio-perl-perl_1.7.8-1_all

NAME

       Bio::DB::WebDBSeqI - Object Interface to generalize Web Databases for retrieving sequences

SYNOPSIS

          # get a WebDBSeqI object somehow
          # assuming it is a nucleotide db
          my $seq = $db->get_Seq_by_id('ROA1_HUMAN');

DESCRIPTION

       Provides a core set of functionality for connecting to a web-based database to retrieve
       sequences.

       Users wishing to add another web-based sequence database will need to extend this class
       (see Bio::DB::SwissProt or Bio::DB::NCBIHelper for examples) and implement the get_request
       method, which returns an HTTP::Request for the specified uids (accessions, ids, etc.,
       depending on what query types the database accepts).
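
       The URL-building part of a get_request implementation can be sketched as follows. This is
       a minimal, standalone illustration and not code from any BioPerl subclass: the package
       name My::ToyDB, the method get_request_url, the base URL and the "format"/"id" query
       parameters are all hypothetical. It assumes the qualifiers arrive as -uids and -format; a
       real subclass would inherit from Bio::DB::WebDBSeqI and wrap the resulting URL in an
       HTTP::Request.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a subclass. A real one would inherit from
# Bio::DB::WebDBSeqI and return HTTP::Request->new(GET => $url).
package My::ToyDB;

sub new {
    my ($class, %args) = @_;
    return bless { base => $args{-base} }, $class;
}

# Mirrors the documented url_base_address getter (read-only here).
sub url_base_address { return $_[0]{base} }

# Percent-encode everything outside the RFC 3986 unreserved set.
sub _escape {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9\-._~])/sprintf('%%%02X', ord $1)/ge;
    return $text;
}

# Build the request URL from the qualifiers (-uids, -format).
# -uids may be a single id or a reference to an array of ids.
sub get_request_url {
    my ($self, %qualifiers) = @_;
    my $uids = $qualifiers{-uids};
    $uids = [$uids] unless ref($uids) eq 'ARRAY';
    my $format = defined $qualifiers{-format} ? $qualifiers{-format} : 'fasta';
    return sprintf '%s?format=%s&id=%s',
        $self->url_base_address,
        _escape($format),
        join(',', map { _escape($_) } @$uids);
}

package main;

# The base URL is a placeholder, not a real service.
my $db = My::ToyDB->new(-base => 'https://db.example.org/fetch');
print $db->get_request_url(-uids => ['ROA1_HUMAN', 'X77802.1'], -format => 'fasta'), "\n";
```

       Keeping URL construction in one method means the identifier escaping and the choice of
       format live in a single place that each database-specific subclass can override.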

FEEDBACK

   Mailing Lists
       User feedback is an integral part of the evolution of this and other Bioperl modules. Send
       your comments and suggestions preferably to one of the Bioperl mailing lists. Your
       participation is much appreciated.

         bioperl-l@bioperl.org                  - General discussion
         http://bioperl.org/wiki/Mailing_lists  - About the mailing lists

   Support
       Please direct usage questions or support issues to the mailing list:

       bioperl-l@bioperl.org

       rather than to the module maintainer directly. Many experienced and responsive experts will
       be able to look at the problem and quickly address it. Please include a thorough description
       of the problem with code and data examples if at all possible.

   Reporting Bugs
       Report bugs to the Bioperl bug tracking system to help us keep track of the bugs and their
       resolution.  Bug reports can be submitted via the web.

         https://github.com/bioperl/bioperl-live/issues

AUTHOR - Jason Stajich

       Email < jason@bioperl.org >

APPENDIX

       The rest of the documentation details each of the object methods. Internal methods are
       usually prefixed with an underscore (_).

   get_Seq_by_id
        Title   : get_Seq_by_id
        Usage   : $seq = $db->get_Seq_by_id('ROA1_HUMAN')
        Function: Gets a Bio::Seq object by its name
        Returns : a Bio::Seq object
        Args    : the id (as a string) of a sequence
        Throws  : "id does not exist" exception

   get_Seq_by_acc
        Title   : get_Seq_by_acc
        Usage   : $seq = $db->get_Seq_by_acc('X77802');
        Function: Gets a Bio::Seq object by accession number
        Returns : A Bio::Seq object
        Args    : accession number (as a string)
        Throws  : "acc does not exist" exception

   get_Seq_by_gi
        Title   : get_Seq_by_gi
        Usage   : $seq = $db->get_Seq_by_gi('405830');
        Function: Gets a Bio::Seq object by gi number
        Returns : A Bio::Seq object
        Args    : gi number (as a string)
        Throws  : "gi does not exist" exception

   get_Seq_by_version
        Title   : get_Seq_by_version
        Usage   : $seq = $db->get_Seq_by_version('X77802.1');
        Function: Gets a Bio::Seq object by sequence version
        Returns : A Bio::Seq object
        Args    : accession.version (as a string)
        Throws  : "acc.version does not exist" exception

   get_request
        Title   : get_request
        Usage   : my $request = $self->get_request(%qualifiers)
        Function: Returns an HTTP::Request object for the given qualifiers
        Returns : an HTTP::Request object
        Args    : %qualifiers = a hash of qualifiers (ids, format, etc.)

   get_Stream_by_id
         Title   : get_Stream_by_id
         Usage   : $stream = $db->get_Stream_by_id( [$uid1, $uid2] );
         Function: Gets a series of Seq objects by unique identifiers
         Returns : a Bio::SeqIO stream object
         Args    : $ref : a reference to an array of unique identifiers for
                          the desired sequence entries

   get_Stream_by_acc
         Title   : get_Stream_by_acc
         Usage   : $stream = $db->get_Stream_by_acc([$acc1, $acc2]);
         Function: Gets a series of Seq objects by accession numbers
         Returns : a Bio::SeqIO stream object
         Args    : $ref : a reference to an array of accession numbers for
                          the desired sequence entries
         Note    : For GenBank, this just calls the same code for get_Stream_by_id()

   get_Stream_by_gi
         Title   : get_Stream_by_gi
         Usage   : $stream = $db->get_Stream_by_gi([$gi1, $gi2]);
         Function: Gets a series of Seq objects by gi numbers
         Returns : a Bio::SeqIO stream object
         Args    : $ref : a reference to an array of gi numbers for
                          the desired sequence entries
         Note    : For GenBank, this just calls the same code for get_Stream_by_id()

   get_Stream_by_version
         Title   : get_Stream_by_version
         Usage   : $stream = $db->get_Stream_by_version([$version1, $version2]);
         Function: Gets a series of Seq objects by accession.versions
         Returns : a Bio::SeqIO stream object
         Args    : $ref : a reference to an array of accession.version strings for
                          the desired sequence entries
         Note    : For GenBank, this is implemented in NCBIHelper

   get_Stream_by_query
         Title   : get_Stream_by_query
         Usage   : $stream = $db->get_Stream_by_query($query);
         Function: Gets a series of Seq objects by way of a query string or object
         Returns : a Bio::SeqIO stream object
         Args    : $query :   A string that uses the appropriate query language
                   for the database or a Bio::DB::QueryI object.  It is suggested
                   that you create the Bio::DB::Query object first and interrogate
                   it for the entry count before you fetch a potentially large stream.

   default_format
        Title   : default_format
        Usage   : my $format = $self->default_format
        Function: Returns default sequence format for this module
        Returns : string
        Args    : none

   request_format
        Title   : request_format
        Usage   : my ($req_format, $ioformat) = $self->request_format;
                  $self->request_format("genbank");
                  $self->request_format("fasta");
        Function: Get/Set sequence format retrieval. The get-form will normally not
                  be used outside of this and derived modules.
        Returns : Array of two strings, the first representing the format for
                  retrieval, and the second specifying the corresponding SeqIO format.
        Args    : $format = sequence format

   get_seq_stream
        Title   : get_seq_stream
        Usage   : my $seqio = $self->get_seq_stream(%qualifiers)
        Function: builds a URL and queries a web db
        Returns : a Bio::SeqIO stream capable of producing sequences
        Args    : %qualifiers = a hash of qualifiers that the implementing class
                  will process to make a URL suitable for web querying

   url_base_address
        Title   : url_base_address
        Usage   : my $address = $self->url_base_address or
                  $self->url_base_address($address)
        Function: Get/Set the base URL for the Web Database
        Returns : Base URL for the Web Database
        Args    : $address - URL for the WebDatabase

   proxy
        Title   : proxy
        Usage   : $httpproxy = $db->proxy('http')  or
                  $db->proxy(['http','ftp'], 'http://myproxy' )
        Function: Get/Set the proxy to use for the given protocol(s)
        Returns : a string indicating the proxy
        Args    : $protocol : an array ref of the protocol(s) to set/get
                  $proxyurl : url of the proxy to use for the specified protocol
                  $username : username (if proxy requires authentication)
                  $password : password (if proxy requires authentication)

   authentication
        Title   : authentication
        Usage   : $db->authentication($user,$pass)
        Function: Get/Set authentication credentials
        Returns : Array of user/pass
        Args    : Array of user/pass

   retrieval_type
        Title   : retrieval_type
        Usage   : $self->retrieval_type($type);
                  my $type = $self->retrieval_type
        Function: Get/Set the retrieval type (pipeline, io_string or tempfile)
        Returns : string representing retrieval type
        Args    : $value - the value to store

       This setting affects how the data stream from the remote web server is processed and
       passed to the Bio::SeqIO layer. Three types of retrieval types are currently allowed:

          pipeline  Perform a fork in an attempt to begin streaming
                    while the data is still downloading from the remote
                    server.  Disk, memory and speed efficient, but will
                    not work on Windows or MacOS 9 platforms.

          io_string Store downloaded database entry(s) in memory.  Can be
                    problematic for batch downloads because entire set
                    of entries must fit in memory.  All entries must be
                    downloaded before processing can begin.

          tempfile  Store downloaded database entry(s) in a temporary file.
                    All entries must be downloaded before processing can
                    begin.

       The default is pipeline, with automatic fallback to io_string if pipelining is not
       available.
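
       The fallback rule above can be sketched in a few lines. This is an illustration only, not
       the module's own code: the helper name effective_retrieval_type is hypothetical, and
       checking $Config{d_fork} is an assumed way of deciding whether fork is available (the
       module may detect this differently).

```perl
use strict;
use warnings;
use Config;    # core module; $Config{d_fork} is set where fork() exists

my %ALLOWED = map { $_ => 1 } qw(pipeline io_string tempfile);

# Hypothetical helper: return the retrieval type that would actually be
# used. $can_fork is optional and defaults to what this perl reports.
sub effective_retrieval_type {
    my ($requested, $can_fork) = @_;
    $requested = 'pipeline' unless defined $requested;    # documented default
    $can_fork  = ($Config{d_fork} ? 1 : 0) unless defined $can_fork;

    die "unknown retrieval type '$requested'\n" unless $ALLOWED{$requested};

    # pipeline needs fork; without it, fall back to in-memory retrieval
    return 'io_string' if $requested eq 'pipeline' && !$can_fork;
    return $requested;
}

print effective_retrieval_type('pipeline', 0), "\n";
print effective_retrieval_type('tempfile', 0), "\n";
```

       Only pipeline is subject to the fallback; io_string and tempfile are returned unchanged
       because neither depends on fork.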

   url_params
        Title   : url_params
        Usage   : my $params = $self->url_params or
                  $self->url_params($params)
        Function: Get/Set the URL parameters for the Web Database
        Returns : url parameters for Web Database
        Args    : $params - parameters to be appended to the URL for the WebDatabase

   ua
        Title   : ua
        Usage   : my $ua = $self->ua or
                  $self->ua($ua)
        Function: Get/Set a LWP::UserAgent for use
        Returns : reference to LWP::UserAgent Object
        Args    : $ua - must be a LWP::UserAgent

   postprocess_data
        Title   : postprocess_data
        Usage   : $self->postprocess_data ( 'type' => 'string',
                                            'location' => \$datastr);
        Function: process downloaded data before loading into a Bio::SeqIO
        Returns : void
        Args    : hash with two keys - 'type' can be 'string' or 'file'
                                     - 'location' either file location or string
                                                  reference containing data
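
       A hook that follows this calling convention might look like the sketch below. It is a
       standalone illustration, not taken from any BioPerl subclass: the function name
       strip_leading_blank_lines and the choice of cleanup (removing leading blank lines) are
       hypothetical. It assumes 'location' is a scalar reference for type 'string' and a file
       path for type 'file', as described above.

```perl
use strict;
use warnings;

# Hypothetical hook: strip leading blank lines from downloaded data,
# modifying the string or file in place.
sub strip_leading_blank_lines {
    my (%args) = @_;
    my $blank = qr/\A(?:[ \t]*\r?\n)+/;

    if ($args{type} eq 'string') {
        # 'location' is a reference to the string holding the data
        ${ $args{location} } =~ s/$blank//;
    }
    elsif ($args{type} eq 'file') {
        # 'location' is the path of the file holding the data
        open my $in, '<', $args{location}
            or die "cannot read $args{location}: $!";
        my $data = do { local $/; <$in> };
        close $in;
        $data = '' unless defined $data;
        $data =~ s/$blank//;
        open my $out, '>', $args{location}
            or die "cannot write $args{location}: $!";
        print {$out} $data;
        close $out or die "cannot close $args{location}: $!";
    }
    else {
        die "unknown type '$args{type}'\n";
    }
    return;
}

my $datastr = "\n\n>seq1\nACGT\n";
strip_leading_blank_lines('type' => 'string', 'location' => \$datastr);
print $datastr;
```

       Because the data is changed in place and nothing is returned, the caller can hand the
       same string or file to Bio::SeqIO afterwards.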

   delay
        Title   : delay
        Usage   : $secs = $self->delay([$secs])
        Function: get/set number of seconds to delay between fetches
        Returns : number of seconds to delay
        Args    : new value

       NOTE: the default is to use the value specified by delay_policy().  This can be overridden
       by calling this method, or by passing the -delay argument to new().

   delay_policy
        Title   : delay_policy
        Usage   : $secs = $self->delay_policy
        Function: return number of seconds to delay between calls to remote db
        Returns : number of seconds to delay
        Args    : none

       NOTE: The default delay policy is 0s.  Override in subclasses to implement delays.  The
       timer has only second resolution, so the delay will actually be +/- 1s.

   _sleep
        Title   : _sleep
        Usage   : $self->_sleep
        Function: sleep for a number of seconds indicated by the delay policy
        Returns : none
        Args    : none

       NOTE: This method keeps track of the last time it was called and only imposes a sleep if
       it was called more recently than the delay_policy() allows.
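
       The behaviour described in this note can be sketched as follows. This is a minimal
       illustration of the idea, not the module's implementation: seconds_to_wait and throttle
       are hypothetical names, and the delay is passed in directly instead of being read from
       delay() or delay_policy().

```perl
use strict;
use warnings;

# Hypothetical helper: given the time of the previous call, the current
# time and the required delay (all in seconds), return how long the
# caller still has to wait.
sub seconds_to_wait {
    my ($last_call, $now, $delay) = @_;
    return 0 unless defined $last_call;    # first call: no wait needed
    my $elapsed = $now - $last_call;
    return $elapsed >= $delay ? 0 : $delay - $elapsed;
}

# Time of the previous throttle() call; undef until the first call.
my $last_invocation;

# Sleep only if the previous call was more recent than $delay allows,
# then record the time of this call. Returns the seconds slept.
sub throttle {
    my ($delay) = @_;
    my $wait = seconds_to_wait($last_invocation, time(), $delay);
    sleep $wait if $wait > 0;
    $last_invocation = time();
    return $wait;
}

throttle(0);    # first call never sleeps
```

       Separating the arithmetic from the sleep keeps the timing rule easy to check without
       actually waiting. Since time() has one-second resolution, the real pause is only
       accurate to about a second, as noted for delay_policy.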

   mod_perl_api
        Title   : mod_perl_api
        Usage   : $version = $self->mod_perl_api
        Function: Returns API version of mod_perl being used based on set env. variables
        Returns : mod_perl API version; if mod_perl isn't loaded, returns 0
        Args    : none
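
       One way to derive such a version number from the environment is sketched below. This is
       an assumption-laden illustration, not the module's code: it assumes the MOD_PERL and
       MOD_PERL_API_VERSION environment variables set by mod_perl, and the function name
       mod_perl_api_from_env is hypothetical. The environment is passed in as a hash reference
       so the logic can be exercised outside a web server.

```perl
use strict;
use warnings;

# Hypothetical helper: return 0 if mod_perl does not appear to be loaded,
# 2 if the API version variable reports 2 or higher, and 1 otherwise.
# Assumes the MOD_PERL and MOD_PERL_API_VERSION environment variables.
sub mod_perl_api_from_env {
    my ($env) = @_;
    $env ||= \%ENV;                      # default to the real environment
    return 0 unless $env->{MOD_PERL};    # mod_perl not loaded
    my $api = $env->{MOD_PERL_API_VERSION};
    return (defined $api && $api >= 2) ? 2 : 1;
}

print mod_perl_api_from_env({}), "\n";
print mod_perl_api_from_env({ MOD_PERL => 'mod_perl/2.0.13', MOD_PERL_API_VERSION => 2 }), "\n";
```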