Provided by: libbio-perl-run-perl_1.7.3-9_all

NAME

       Bio::DB::SoapEUtilities - Interface to the NCBI Entrez web service *BETA*

SYNOPSIS

        use Bio::DB::SoapEUtilities;

        # factory construction

        my $fac = Bio::DB::SoapEUtilities->new();

        # executing a utility call

        # get an iterable adaptor
        my $links = $fac->elink(
                      -dbfrom => 'protein',
                      -db => 'taxonomy',
                      -id => \@protein_ids )->run(-auto_adapt => 1);

        # get a Bio::DB::SoapEUtilities::Result object
        my $result = $fac->esearch(
                      -db => 'gene',
                      -term => 'sonic and human')->run;

        # get the raw XML message
        my $xml = $fac->efetch(
                    -db => 'gene',
                    -id => \@gids )->run( -raw_xml => 1 );

        # change parameters
        my $new_result = $fac->efetch(
                          -db => 'gene',
                          -id => \@more_gids)->run;
        # reset parameters
        $fac->efetch->reset_parameters( -db => 'nucleotide',
                                        -id => $nucid );
        $result = $fac->efetch->run;

        # parsing and iterating the results

        $count = $result->count;
        @ids = $result->ids;

        while ( my $linkset = $links->next_linkset ) {
           @submitted = $linkset->submitted_ids;
        }

        ($taxid) = $links->id_map($submitted_prot_id);
        $species_io = $fac->efetch( -db => 'taxonomy',
                                    -id => $taxid )->run( -auto_adapt => 1);
        $species = $species_io->next_species;
        $linnaeus = $species->binomial;

DESCRIPTION

       This module allows the user to query the NCBI Entrez database via its SOAP (Simple Object
       Access Protocol) web service (described at
       <http://eutils.ncbi.nlm.nih.gov/entrez/eutils/soap/v2.0/DOC/esoap_help.html>).  The basic
       tools ("einfo, esearch, elink, efetch, espell, epost") are available as methods off a
       "SoapEUtilities" factory object. Parameters for each tool can be queried, set and reset
       for each method through the Bio::ParameterBaseI standard calls ("available_parameters(),
       set_parameters(), get_parameters(), reset_parameters()"). Returned data can be retrieved,
       accessed and parsed in several ways, according to user preference. Adaptors and object
       iterators are available for "efetch", "egquery", "elink", and "esummary" results.

USAGE

       The "SoapEU" system has been designed to be as easy (few includes, available parameter
       facilities, reasonable defaults, intuitive aliases, built-in pipelines) or as complex
       (accessors for underlying low-level objects, all parameters accessible, custom hooks for
       builder objects, facilities for providing local copies of WSDLs) as the user requires or
       desires. (To the extent that it does not succeed in either direction, it is up to the user
       to report to the mailing list ("FEEDBACK")!)

   Factory
       To begin, make a factory:

        my $fac = Bio::DB::SoapEUtilities->new();

       From the factory, utilities are called, parameters are set, and results or adaptors are
       retrieved.

       If you have your own copy of the WSDL, use

        my $fac = Bio::DB::SoapEUtilities->new( -wsdl_file => $my_wsdl );

       otherwise, the correct one will be obtained over the network (by Bio::DB::ESoap and
       friends).
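
       A minimal sketch of choosing between a local WSDL and the network default. The helper
       name "wsdl_args" and the example path are hypothetical; only the "-wsdl_file" argument
       comes from the text above.

```perl
use strict;
use warnings;

# Hypothetical helper: return constructor arguments for a local WSDL
# copy if the file is readable, otherwise an empty list so that the
# factory falls back to fetching the WSDL over the network.
sub wsdl_args {
    my ($path) = @_;
    return ( defined($path) && -r $path ) ? ( -wsdl_file => $path ) : ();
}

my @args = wsdl_args('/no/such/dir/eutils.wsdl');    # hypothetical path
printf "constructor args: %d\n", scalar @args;

# With the module installed, the factory would then be built as:
#   require Bio::DB::SoapEUtilities;
#   my $fac = Bio::DB::SoapEUtilities->new(@args);
```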

   Utilities and parameters
       To run any of the standard NCBI EUtilities ("einfo, esearch, esummary, efetch, elink,
       egquery, epost, espell"), call the desired utility from the factory.  To use a utility,
       you must set its parameters and run it to get a result.  TMTOWTDI:

        # verbose
        my $fetch = $fac->efetch();
        $fetch->set_parameters( -db => 'gene', -id => [828392, 790]);
        my $result = $fetch->run;

        # compact
        my $result = $fac->efetch(-db =>'gene',-id => [828392,790])->run;

        # change ids
        $fac->efetch->set_parameters( -id => 470338 );
        $result = $fac->efetch->run;

        # another util
        $result = $fac->esearch(-db => 'protein', -term => 'BRCA and human')->run;

        # the utilities are kept separate
        %search_params = $fac->esearch->get_parameters;
        %fetch_params = $fac->efetch->get_parameters;
        $search_params{db}; # is 'protein'
        $fetch_params{db}; # is 'gene'

       The factory is Bio::ParameterBaseI compliant: that means you can find out what you can set
       with

        @available_search = $fac->esearch->available_parameters;
        @available_egquery = $fac->egquery->available_parameters;

       For more information on parameters, see
       <http://www.ncbi.nlm.nih.gov/entrez/query/static/eutils_help.html>.
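
       One way to catch misspelled parameter names before a call, sketched under assumptions:
       the helper "unknown_params" is hypothetical, and the hard-coded list merely stands in for
       whatever "available_parameters()" reports (its exact return format, with or without
       leading dashes, is not specified here, so the helper strips dashes and ignores case).

```perl
use strict;
use warnings;

# Hypothetical helper: given the names reported by available_parameters()
# and a list of named arguments, return the argument names that the
# utility does not advertise. Leading dashes are stripped on both sides
# and the comparison ignores case.
sub unknown_params {
    my ($available, %args) = @_;
    my %known;
    for my $name (@$available) {
        ( my $n = lc $name ) =~ s/^-//;
        $known{$n} = 1;
    }
    my @unknown = grep {
        ( my $n = lc $_ ) =~ s/^-//;
        !$known{$n};
    } keys %args;
    return sort @unknown;
}

# Stand-in for: my @available = $fac->esearch->available_parameters;
my @available = qw(db term usehistory WebEnv QueryKey RetMax);
my @bad = unknown_params( \@available, -db => 'gene', -trem => 'sonic' );
print "unrecognized: @bad\n";
```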

   Results
       The "intermediate" object for "SoapEU" query results is the
       Bio::DB::SoapEUtilities::Result. This is a BioPerly parsing of the SOAP message sent by
       NCBI when a query is "run()". This can be very useful on its own, but most users will
       likely want to proceed directly to "Adaptors", which take a "Result" and turn it into more
       intuitive/familiar BioPerl objects. Go there if the following details are too gory.

       Results can be highly- or lowly-parsed, depending on the parameters passed to the factory
       "run()" method. To get the raw XML message with no parsing, do

        my $xml = $fac->$util->run(-raw_xml => 1); # $xml is a scalar string

       To retrieve a Bio::DB::SoapEUtilities::Result object with limited parsing, but with
       accessors to the SOAP::SOM message (provided by SOAP::Lite), do

        my $result = $fac->$util->run(-no_parse => 1);
        my $som = $result->som;
        my $method_hash = $som->method; # etc...

       To retrieve a "Result" object with message elements parsed into accessors, including
       "count()" and "ids()", run without arguments:

        my $result = $fac->esearch->run();
        my $count = $result->count;
        my @Count = $result->Count; # counts for each member of
                                    # the translation stack
        my @ids = $result->IdList_Id; # from automatic message parsing
        @ids = $result->ids; # a convenient alias

       See Bio::DB::SoapEUtilities::Result for more, even gorier details.
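
       The three parsing levels above can be kept straight with a small lookup table. This is
       only a sketch: the helper "run_args" and the level names ("raw", "som", "parsed") are
       hypothetical; the "run()" arguments themselves are the ones described in this section.

```perl
use strict;
use warnings;

# Hypothetical lookup: descriptive level name => run() arguments.
my %RUN_ARGS = (
    raw    => [ -raw_xml  => 1 ],    # unparsed XML string
    som    => [ -no_parse => 1 ],    # Result with SOAP::SOM access only
    parsed => [],                    # Result with parsed accessors
);

sub run_args {
    my ($level) = @_;
    my $args = $RUN_ARGS{$level}
        or die "unknown parse level '$level'\n";
    return @$args;
}

my %args = run_args('raw');
print join( ', ', map { "$_ => $args{$_}" } sort keys %args ), "\n";

# With a factory in hand:
#   my $result = $fac->esearch->run( run_args('som') );
```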

   Adaptors
       Adaptors convert EUtility "Result"s into convenient objects, via a handle that usually
       provides an iterator, in the spirit of Bio::SeqIO. These are probably more useful than the
       "Result" to the typical user, and so you can retrieve them automatically by setting the
       "run()" parameter "-auto_adapt => 1".

       In general, retrieve an adaptor like so:

        $adp = $fac->$util->run( -auto_adapt => 1 );
        # iterate...
        while ( my $obj = $adp->next_obj ) {
           # do stuff with $obj
        }

       The adaptor itself occasionally possesses useful methods besides the iterator. The method
       "next_obj" always works, but a natural alias is also always available:

        $seqio = $fac->esearch->run( -auto_adapt => 1 );
        while ( my $seq = $seqio->next_seq ) {
           # do stuff with $seq
        }

       In the above example, "-auto_adapt => 1" also instructs the factory to perform an "efetch"
       based on the ids returned by the "esearch" (if any), so that the adaptor returned iterates
       over Bio::SeqI objects.
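
       Because "next_obj" is available on every adaptor, a generic collector can be written
       once. The following is a sketch, not part of the module: "collect_objects" and the
       "My::MockAdaptor" stand-in class are hypothetical, and the mock exists only so the
       example runs without a network connection.

```perl
use strict;
use warnings;

# Hypothetical helper: drain any adaptor through its generic next_obj()
# iterator, optionally stopping after $limit objects.
sub collect_objects {
    my ($adaptor, $limit) = @_;
    my @objs;
    while ( my $obj = $adaptor->next_obj ) {
        push @objs, $obj;
        last if defined($limit) && @objs >= $limit;
    }
    return @objs;
}

# Stand-in adaptor; a real one would come from
#   $fac->$util->run( -auto_adapt => 1 )
package My::MockAdaptor {
    sub new {
        my ($class, @items) = @_;
        return bless { items => [@items] }, $class;
    }
    sub next_obj { my ($self) = @_; return shift @{ $self->{items} } }
}

my @first_two = collect_objects( My::MockAdaptor->new(qw(a b c)), 2 );
print scalar(@first_two), " objects collected\n";
```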

       Here is a rundown of the different adaptor flavors:

       •   "efetch", Fetch Adaptors, and BioPerl object iterators

           The "FetchAdaptor" creates bona fide BioPerl objects. Currently, there are
           FetchAdaptor subclasses for sequence data (both Genbank and FASTA rettypes) and
           taxonomy data. The choice of FetchAdaptor is based on information in the result
           message, and should be transparent to the user.

            $seqio = $fac->efetch( -db =>'nucleotide',
                                   -id => \@ids,
                                   -rettype => 'gb' )->run( -auto_adapt => 1 );
            while (my $seq = $seqio->next_seq) {
               my $taxio = $fac->efetch(
                   -db => 'taxonomy',
                   -id => $seq->species->ncbi_taxid )->run(-auto_adapt => 1);
               my $tax = $taxio->next_species;
               unless ( $tax->TaxId == $seq->species->ncbi_taxid ) {
                 print "more work for MAJ"
               }
            }

           See the pod for the FetchAdaptor subclasses (e.g.,
           Bio::DB::SoapEUtilities::FetchAdaptor::seq) for more detail.

       •   "elink", the Link adaptor, and the "linkset" iterator

           The "LinkAdaptor" manages LinkSets. In "SoapEU", an "elink" call always preserves the
           correspondence between submitted and retrieved ids. The mapping between these can be
           accessed from the adaptor object directly as "id_map()":

            my $links = $fac->elink( -db => 'protein',
                                     -dbfrom => 'nucleotide',
                                     -id => \@nucids )->run( -auto_adapt => 1 );

            # maybe more than one associated id...
            my @prot_0 = $links->id_map( $nucids[0] );

           Or iterate over the linksets:

            while ( my $ls = $links->next_linkset ) {
               @ids = $ls->ids;
               @submitted_ids = $ls->submitted_ids;
               # etc.
            }

       •   "esummary", the DocSum adaptor, and the "docsum" iterator

           The "DocSumAdaptor" manages docsums, the "esummary" return type.  The objects returned
           by iterating with a "DocSumAdaptor" have accessors that let you obtain field
           information directly. Docsums contain lots of easy-to-forget fields; use
           "item_names()" to remind yourself.

            my $docs = $fac->esummary( -db => 'taxonomy',
                                       -id => 527031 )->run(-auto_adapt=>1);
            # iterate over docsums
            while (my $d = $docs->next_docsum) {
               @available_items = $d->item_names;
               # any available item can be called as an accessor
               # from the docsum object...watch your case...
               $sci_name = $d->ScientificName;
               $taxid = $d->TaxId;
            }

       •   "egquery", the GQuery adaptor, and the "query" iterator

           The "GQueryAdaptor" manages global query items returned by calls to "egquery", which
           identifies all NCBI databases containing hits for your query term. The databases
           actually containing hits can be retrieved directly from the adaptor with
           "found_in_dbs":

            my $queries = $fac->egquery(
                -term => 'BRCA and human'
               )->run(-auto_adapt=>1);
            my @dbs = $queries->found_in_dbs;

           Retrieve the global query info returned for any database with "query_by_db":

            my $prot_q = $queries->query_by_db('protein');
            if ($prot_q->count) {
               #do something
            }

           Or iterate as usual:

            while ( my $q = $queries->next_query ) {
               if ($q->status eq 'Ok') {
                 # do sth
               }
            }
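
       As a companion to the "elink" item above, the "id_map()" correspondence can be gathered
       into an ordinary hash for later lookup. This is a sketch under assumptions: "link_table"
       and the "My::MockLinks" stand-in are hypothetical, the ids are made up, and the only
       adaptor method relied on is "id_map()" as described above.

```perl
use strict;
use warnings;

# Hypothetical helper: build a hash mapping each submitted id to an
# array ref of the ids linked to it, using the adaptor's id_map().
sub link_table {
    my ($links, @submitted) = @_;
    my %table;
    for my $id (@submitted) {
        # id_map() may return more than one associated id
        $table{$id} = [ $links->id_map($id) ];
    }
    return %table;
}

# Stand-in for the adaptor returned by
#   $fac->elink(...)->run( -auto_adapt => 1 )
package My::MockLinks {
    sub new {
        my ($class, %links) = @_;
        return bless { links => {%links} }, $class;
    }
    sub id_map {
        my ($self, $id) = @_;
        return @{ $self->{links}{$id} || [] };
    }
}

my $links = My::MockLinks->new( 100 => [ 7, 8 ], 200 => [9] );    # made-up ids
my %table = link_table( $links, 100, 200, 300 );
for my $id ( sort { $a <=> $b } keys %table ) {
    printf "%s -> %s\n", $id, join( ',', @{ $table{$id} } ) || '(none)';
}
```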

   Web environments and query keys
       To make large or complex requests for data, or to share queries, it may be helpful to use
       the NCBI WebEnv system to manage your queries. Each EUtility accepts the following
       parameters:

        -usehistory
        -WebEnv
        -QueryKey

       for this purpose. These store the details of your queries serverside.

       "SoapEU" attempts to make using these relatively straightforward. Use "Result" objects to
       obtain the correct parameters, and don't forget "-usehistory":

        my $result1 = $fac->esearch(
            -term => 'BRCA and human',
            -db => 'nucleotide',
            -usehistory => 1 )->run( -no_parse=>1 );

        my $result = $fac->esearch(
            -term => 'AND early onset',
            -QueryKey => $result1->query_key,
            -WebEnv => $result1->webenv )->run( -no_parse => 1 );

        $result = $fac->esearch(
           -db => 'protein',
           -term => 'sonic',
           -usehistory => 1 )->run( -no_parse => 1 );

        # later (but not more than 8 hours later) that day...

        $result = $fac->esearch(
           -WebEnv => $result->webenv,
           -QueryKey => $result->query_key,
           -RetMax => 800 # get 'em all
           )->run; # note we're parsing the result...
        @all_ids = $result->ids;
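
       The hand-off of history parameters can be wrapped in one place. A sketch only:
       "history_params" and the "My::MockHistResult" stand-in are hypothetical, and the WebEnv
       string is made up; the "webenv" and "query_key" accessors and the "-WebEnv"/"-QueryKey"
       parameters are the ones used above.

```perl
use strict;
use warnings;

# Hypothetical helper: pull the history parameters out of a Result so
# they can be passed straight to a follow-up utility call.
sub history_params {
    my ($result) = @_;
    my $webenv = $result->webenv;
    my $key    = $result->query_key;
    die "result carries no history; was -usehistory set?\n"
        unless defined($webenv) && defined($key);
    return ( -WebEnv => $webenv, -QueryKey => $key );
}

# Stand-in for a Bio::DB::SoapEUtilities::Result
package My::MockHistResult {
    sub new       { my ($class, %f) = @_; return bless { %f }, $class }
    sub webenv    { return $_[0]{webenv} }
    sub query_key { return $_[0]{query_key} }
}

my $result = My::MockHistResult->new( webenv => 'NCID_example', query_key => 1 );
my %hist = history_params($result);
print "$_ => $hist{$_}\n" for sort keys %hist;

# With a real factory:
#   my $more = $fac->esearch( history_params($result), -RetMax => 800 )->run;
```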

   Error checking
       Two kinds of errors can ensue on an Entrez SOAP run. One is a SOAP fault, and the other is
       an error sent in a non-faulted SOAP message from the server. The distinction is probably
       systematic, and I would welcome an explanation of it. To check for result errors, try
       something like:

        unless ( $result = $fac->$util->run ) {
           die $fac->errstr; # this will catch a SOAP fault
        }
        # a valid result object was returned, but it may carry an error
        if ($result->count == 0) {
           warn "No hits returned";
           if ($result->ERROR) {
             warn "Entrez error : ".$result->ERROR;
           }
        }

       Error handling will be improved in the package eventually.
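
       Until then, both failure modes can be folded into one helper. This is a sketch rather
       than module code: "checked_run" and the two mock classes are hypothetical, the error
       text is made up, and unlike the snippet above the helper checks "ERROR" whether or not
       the hit count is zero.

```perl
use strict;
use warnings;

# Hypothetical helper: run a utility and return ( $result, $error ).
# $error is undef when nothing went wrong.
sub checked_run {
    my ($fac, $util, @run_args) = @_;
    my $result = $fac->$util->run(@run_args);

    # False return value: SOAP fault, reason in errstr()
    return ( undef, 'SOAP fault: ' . ( $fac->errstr // 'unknown' ) )
        unless $result;

    # Valid Result object that may still carry a server-side error
    if ( my $err = $result->ERROR ) {
        return ( $result, "Entrez error: $err" );
    }
    return ( $result, undef );
}

# Stand-ins so the sketch runs offline
package My::MockErrResult {
    sub new   { my ($class, %f) = @_; return bless { %f }, $class }
    sub ERROR { return $_[0]{error} }
}
package My::MockFactory {
    sub new     { my ($class, %f) = @_; return bless { %f }, $class }
    sub esearch { return $_[0] }    # a utility call returns the factory
    sub run     { return $_[0]{result} }
    sub errstr  { return $_[0]{errstr} }
}

my $fac = My::MockFactory->new(
    result => My::MockErrResult->new( error => 'Empty term' ),    # made-up text
);
my ( $result, $err ) = checked_run( $fac, 'esearch' );
print defined($err) ? "$err\n" : "ok\n";
```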

SEE ALSO

       Bio::DB::EUtilities, Bio::DB::SoapEUtilities::Result, Bio::DB::ESoap.

FEEDBACK

   Mailing Lists
       User feedback is an integral part of the evolution of this and other Bioperl modules. Send
       your comments and suggestions preferably to the Bioperl mailing list.  Your participation
       is much appreciated.

         bioperl-l@bioperl.org                  - General discussion
         http://bioperl.org/wiki/Mailing_lists  - About the mailing lists

   Support
       Please direct usage questions or support issues to the mailing list:

       bioperl-l@bioperl.org

       rather than to the module maintainer directly. Many experienced and responsive experts will
       be able to look at the problem and quickly address it. Please include a thorough description
       of the problem with code and data examples if at all possible.

   Reporting Bugs
       Report bugs to the Bioperl bug tracking system to help us keep track of the bugs and their
       resolution. Bug reports can be submitted via the web:

         http://redmine.open-bio.org/projects/bioperl/

AUTHOR - Mark A. Jensen

       Email maj -at- fortinbras -dot- us

APPENDIX

       The rest of the documentation details each of the object methods.  Internal methods are
       usually preceded with a _

   new
        Title   : new
        Usage   : my $eutil = Bio::DB::SoapEUtilities->new();
        Function: Builds a new Bio::DB::SoapEUtilities object
        Returns : an instance of Bio::DB::SoapEUtilities
        Args    : -wsdl_file => $my_wsdl [optional]

   run()
        Title   : run
        Usage   : $fac->$eutility->run(@args)
        Function: Execute the EUtility
        Returns : on success, a Bio::DB::SoapEUtilities::Result object (or an
                  adaptor or raw XML string, per the args below); false on
                  fault or error (reason in errstr(), for more detail check
                  the SOAP message in last_result() )
        Args    : named params appropriate to utility
                  -auto_adapt => boolean ( return an iterator over results as
                                           appropriate to util if true)
                  -raw_xml => boolean ( return raw xml result; no processing )
                  Bio::DB::SoapEUtilities::Result constructor parms

   Useful Accessors
   response_message()
        Title   : response_message
        Aliases : last_response, last_result
        Usage   : $som = $fac->response_message
        Function: get the last response message
        Returns : a SOAP::SOM object
        Args    : none

   webenv()
        Title   : webenv
        Usage   :
        Function: contains WebEnv key referencing the session
                  (set after run() )
        Returns : scalar
        Args    : none

   errstr()
        Title   : errstr
        Usage   : $fac->errstr
        Function: get the last error, if any
        Example :
        Returns : value of errstr (a scalar)
        Args    : none

   Bio::ParameterBaseI compliance
   available_parameters()
        Title   : available_parameters
        Usage   : @names = $fac->$util->available_parameters;
        Function: get available request parameters for calling
                  utility
        Returns :
        Args    : -util => $desired_utility [optional, default is
                  caller utility]

   set_parameters()
        Title   : set_parameters
        Usage   : $fac->$util->set_parameters( %named_args );
        Function:
        Returns : none
        Args    : -util => $desired_utility [optional, default is
                   caller utility],
                  named utility arguments

   get_parameters()
        Title   : get_parameters
        Usage   : %params = $fac->$util->get_parameters;
        Function:
        Returns : array of named parameters
        Args    : utility (scalar string) [optional]
                  (default is caller utility)

   reset_parameters()
        Title   : reset_parameters
        Usage   : $fac->$util->reset_parameters( %named_args );
        Function:
        Returns : none
        Args    : -util => $desired_utility [optional, default is
                   caller utility],
                  named utility arguments

   parameters_changed()
        Title   : parameters_changed
        Usage   :
        Function:
        Returns : boolean
        Args    : utility (scalar string) [optional]
                  (default is caller utility)

   _soap_facs()
        Title   : _soap_facs
        Usage   : $self->_soap_facs($util, $fac)
        Function: caches Bio::DB::ESoap factories for the
                  eutils in use by this instance
        Example :
        Returns : Bio::DB::ESoap object
        Args    : $eutility, [optional on set] $esoap_factory_object

   _caller_util()
        Title   : _caller_util
        Usage   : $self->_caller_util($newval)
        Function: the utility requested off the main SoapEUtilities
                  object
        Example :
        Returns : value of _caller_util (a scalar string, a valid eutility)
        Args    : on set, new value (a scalar string [optional])