Provided by: libboulder-perl_1.30-5.1_all

NAME

       Boulder::LocusLink - Fetch LocusLink data records as parsed Boulder Stones

SYNOPSIS

         # parse a file of LocusLink records
         $ll = new Boulder::LocusLink(-accessor=>'File',
                                    -param => '/home/data/LocusLink/LL_tmpl');
         while (my $s = $ll->get) {
           print $s->Identifier;
           print $s->Gene;
         }

         # parse flatfile records yourself
         open (LL,"/home/data/LocusLink/LL_tmpl") or die "Cannot open LL_tmpl: $!";
         local $/ = "*RECORD*";
         while (<LL>) {
            my $s = Boulder::LocusLink->parse($_);
            # etc.
         }

DESCRIPTION

       Boulder::LocusLink provides retrieval and parsing services for NCBI LocusLink records.  It
       returns LocusLink entries in Stone format, allowing easy access to the various fields and
       values.  Boulder::LocusLink is a descendant of Boulder::Stream, and provides a stream-like
       interface to a series of Stone objects.

       Access to LocusLink is provided by a single accessor, which gives access to a local
       LocusLink database.  When you create a new Boulder::LocusLink stream, you provide the
       accessor, along with accessor-specific parameters that control what entries to fetch.  The
       only accessor is:

       File
         This provides access to local LocusLink entries by reading from a flat file (typically
         the Hs.dat file downloadable from NCBI's FTP site).  The stream will return a Stone
         corresponding to each of the entries in the file, starting from the top of the file and
         working downward.  The parameter is the path to the local file.

       It is also possible to parse a single LocusLink entry from a text string stored in a
       scalar variable, returning a Stone object.
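
       As a rough sketch of how a flat file's text might be cut into single-entry strings
       before parsing, the helper below reads from an in-memory filehandle with the input
       record separator set to "*RECORD*", the value used in the SYNOPSIS.  The name
       split_records and the sample text are hypothetical placeholders, not real LocusLink
       data.  Each chunk keeps its trailing separator, as in the SYNOPSIS loop.

```perl
use strict;
use warnings;

# Hypothetical helper: split flat-file text into per-record chunks.
# Assumption: "*RECORD*" delimits records, as in the SYNOPSIS; pass a
# different separator as the second argument if your file differs.
sub split_records {
    my ($text, $separator) = @_;
    $separator = '*RECORD*' unless defined $separator;

    # Read the string through an in-memory filehandle.
    open(my $fh, '<', \$text) or die "Cannot open in-memory handle: $!";
    local $/ = $separator;

    my @records;
    while (defined(my $chunk = <$fh>)) {
        # Ignore chunks that hold nothing but the separator or whitespace.
        (my $body = $chunk) =~ s/\Q$separator\E\z//;
        next unless $body =~ /\S/;
        push @records, $chunk;    # chunk is kept exactly as read
    }
    close $fh;
    return @records;
}

# Placeholder text standing in for the contents of a flat file.
my @records = split_records("placeholder entry 1\n*RECORD*placeholder entry 2\n");
print scalar(@records), " chunk(s)\n";
```

       With Boulder installed, each chunk could then be passed to
       Boulder::LocusLink->parse(), as in the SYNOPSIS.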

   Boulder::LocusLink methods
       This section lists the public methods that the Boulder::LocusLink class makes available.

       new()
              # Local fetch via File
              $ll=new Boulder::LocusLink(-accessor  =>  'File',
                                       -param     =>  '/data/LocusLink/Hs.dat');

           The new() method creates a new Boulder::LocusLink stream on the accessor provided.
           The only possible accessor is File.  If successful, the method returns the stream
           object.  Otherwise it returns undef.

           new() takes the following arguments:

                   -accessor       Name of the accessor to use
                   -param          Parameters to pass to the accessor

           Specify the accessor to use with the -accessor argument.  If not specified, it
           defaults to File.

           -param is an accessor-specific argument.  The only possibility is:

           For File, the -param argument must point to a string-valued scalar, which will be
           interpreted as the path to the file to read LocusLink entries from.
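
           Because new() is documented to return undef on failure, callers may want to check
           the result before reading from the stream.  The wrapper below is one possible
           sketch.  The name open_locuslink_stream is hypothetical.  The eval is a precaution
           that assumes new() might raise an exception instead of returning undef, which the
           description above does not state.

```perl
use strict;
use warnings;

# Hypothetical wrapper: return a stream on $path, or undef if none can be made.
# The eval covers two cases: Boulder::LocusLink is not installed, or new()
# dies rather than returning undef (an assumption, not documented behaviour).
sub open_locuslink_stream {
    my ($path) = @_;
    my $stream = eval {
        require Boulder::LocusLink;
        Boulder::LocusLink->new(-accessor => 'File', -param => $path);
    };
    warn "No LocusLink stream for $path: ", ($@ || "new() returned undef"), "\n"
        unless defined $stream;
    return $stream;
}

# Path taken from the new() example above.
my $ll = open_locuslink_stream('/data/LocusLink/Hs.dat');
print defined $ll ? "stream opened\n" : "stream unavailable\n";
```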

       get()
           The get() method is inherited from Boulder::Stream, and simply returns the next parsed
           LocusLink Stone, or undef if there is nothing more to fetch.  It has the same
           semantics as the parent class, including the ability to restrict access to certain
           top-level tags.
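
           The usual way to consume the stream is to call get() until it returns undef.  The
           sketch below collects the Identifier of every entry.  So that it runs without
           Boulder installed, it uses two hypothetical stand-in classes (MockStone and
           MockStream) that mimic only the behaviour described here.  With the real module,
           a Boulder::LocusLink object would be passed instead.  The identifier values are
           placeholders.

```perl
use strict;
use warnings;

# Collect the Identifier of every entry on a stream.  $stream is assumed to be
# any object whose get() returns the next Stone, or undef when nothing is left.
sub collect_identifiers {
    my ($stream) = @_;
    my @ids;
    while (defined(my $stone = $stream->get)) {
        push @ids, scalar $stone->Identifier;
    }
    return @ids;
}

# Hypothetical stand-ins, used only so this sketch is self-contained.
package MockStone {
    sub new        { my ($class, %tags) = @_; return bless { %tags }, $class }
    sub Identifier { return $_[0]{Identifier} }
}
package MockStream {
    sub new { my ($class, @stones) = @_; return bless { queue => [@stones] }, $class }
    sub get { return shift @{ $_[0]{queue} } }    # undef once the queue is empty
}

package main;

my $stream = MockStream->new(map { MockStone->new(Identifier => $_) } 1, 2);
print join(',', collect_identifiers($stream)), "\n";
```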

       put()
           The put() method is inherited from the parent Boulder::Stream class, and will write
           the passed Stone to standard output in Boulder format.  This means that it is
           currently not possible to write a Boulder::LocusLink object back into LocusLink
           flatfile form.

OUTPUT TAGS

       The tags returned by the parsing operation are taken from the names shown in the flat file
       Hs.dat, since no better description of them has yet been provided by the database
       producer.

   Top-Level Tags
       These are tags that appear at the top level of the parsed LocusLink entry.

       Identifier
           The LocusLink identifier of this entry.  Identifier is a single-value tag.

           Example:

                 my $identifierNo = $s->Identifier;

       Current_locusid
           If a locus has been merged with another, Current_locusid contains the previous
           LOCUSID line.  (This is a bit confusing, and the tag might better be called
           "previous_locusid", but this is how it is defined in the NCBI README file.)

           Example:

                 my $prevlocusid=$s->Current_locusid;

       Organism
           Source species, based on NCBI's Taxonomy.

           Example:

                 my $theorganism=$s->Organism;

       Status
           Type of reference sequence record.  If "PROVISIONAL", the record was generated
           automatically from an existing GenBank record and information stored in the
           LocusLink database, without curation.  If "REVIEWED", the record was generated from
           the most representative complete GenBank sequence, or a merge of GenBank sequences,
           and from information stored in the LocusLink database.

           Example:

                 my $thestatus=$s->Status;

       LocAss
           A complex record made up of the following fields:

                   LOCUS_STRING
                   NM              The value in the LOCUS field of the RefSeq record
                   NP              The RefSeq accession number for an mRNA record
                   PRODUCT         The name of the product of this transcript
                   TRANSVAR        A variant-specific description
                   ASSEMBLY        The GenBank accession used to assemble the RefSeq record

           Example:

                 my $theprod=$s->LocAss->Product;

       AccProt
           A complex record made up of the following fields:

                   ACCNUM          Nucleotide sequence accession number
                   TYPE            e=EST, m=mRNA, g=Genomic
                   PROT            Set of PID values for the coding region or regions
                                   annotated on the nucleotide record

           In PROT, the first value is the PID (an integer or null), followed by either MMDB
           or na, separated from the PID by a |.  If MMDB is present, it indicates that
           structure data are available for a protein related to the protein referenced by
           the PID.

           Example:

                 my $theprot=$s->AccProt->Prot;

       OFFICIAL_SYMBOL
           The symbol used for gene reports, validated by the appropriate nomenclature
           committee.

       PREFERRED_SYMBOL
           Interim symbol used for display.

       OFFICIAL_GENE_NAME
           The gene description used for gene reports, validated by the appropriate
           nomenclature committee.  If the symbol is official, the gene name will be official.
           No records will have both official and interim nomenclature.

       PREFERRED_GENE_NAME
           Interim gene name used for display.

       PREFERRED_PRODUCT
           The name of the product used in the RefSeq record.

       ALIAS_SYMBOL
           Other symbols associated with this gene.

       ALIAS_PROT
           Other protein names associated with this gene.

       PhenoTable
           A complex record made up of Phenotype and Phenotype_ID.

       SUmmary

       Unigene

       Omim

       Chr

       Map

       STS

       ECNUM

       ButTable
           A complex record made up of BUTTON and LINK.

       DBTable
           A complex record made up of DB_DESCR and DB_LINK.

       PMID
           A subset of the publications associated with this locus, given as a comma-separated
           list of PubMed unique identifiers.

SEE ALSO

       Boulder, Boulder::Blast, Boulder::Genbank

AUTHOR

       Lincoln Stein <lstein@cshl.org>.  Luca I.G. Toldo <luca.toldo@merck.de>

       Copyright (c) 1997 Lincoln D. Stein
       Copyright (c) 1999 Luca I.G. Toldo

       This library is free software; you can redistribute it and/or modify it under the same
       terms as Perl itself.  See DISCLAIMER.txt for disclaimers of warranty.