Provided by: libbio-scf-perl_1.03-7build2_amd64

NAME

       Bio::SCF - Perl extension for reading and writing SCF sequence files

SYNOPSIS

       use Bio::SCF;

       # tied interface
       tie %hash,'Bio::SCF','my_scf_file.scf';

       my $sequence_length            = $hash{bases_length};
       my $chromatogram_sample_length = $hash{samples_length};
       my $third_base                 = $hash{bases}[2];
       my $quality_score              = $hash{$third_base}[2];
       my $sample_A_at_time_1400      = $hash{samples}{A}[1400];

       # change the third base and write out new file
       $hash{bases}[2] = 'C';
       tied(%hash)->write('new.scf');

       # object-oriented interface
       my $scf                        = Bio::SCF->new('my_scf_file.scf');
       my $sequence_length            = $scf->bases_length;
       my $chromatogram_sample_length = $scf->samples_length;
       my $third_base                 = $scf->bases(2);
       my $quality_score              = $scf->score(2);
       my $sample_A_at_time_1400      = $scf->sample('A',1400);

       # change the third base and write out new file
       $scf->bases(2,'C');
       $scf->write('new.scf');

DESCRIPTION

       This module provides a Perl interface to SCF DNA sequencing files. It offers both a
       tied-hash and an object-oriented interface. It can read fields from SCF files and has
       a limited ability to modify them and write them back.

   Tied Methods
       $obj = tie %hash,'Bio::SCF',$filename_or_handle
           Tie the Bio::SCF module to a filename or filehandle. If successful, tie() will return
           the object.

       $value = $hash{'key'}
           Fetch a field from the SCF file. Valid keys are as follows:

             Key             Value
             ---             -----

             bases_length    Number of called bases in the sequence (read-only)

             samples_length  Number of samples in the file (read-only)

             version         SCF version (read-only)

             code_set        Code set used to code bases (read-only)

             comments        Structured comments (read-only)

             bases           Array reference to a list of the base calls

             index           Array reference to a list of the sample position
                               for each of the base calls (e.g. the position of
                               the base calling peak)

             A               An array reference that can be used to determine the
                               probability that the base in position $i is an "A".

             G               An array reference that can be used to determine the
                               probability that the base in position $i is a "G".

             C               An array reference that can be used to determine the
                               probability that the base in position $i is a "C".

             T               An array reference that can be used to determine the
                               probability that the base in position $i is a "T".

             samples         A hash reference with keys "A", "C", "G" and "T". The
                               value for each key is an array reference to the list
                               of intensity values for that channel.

           To get the length of the called sequence:              $scf{bases_length}

           To get the value of the called sequence at position 3: $scf{bases}[3]

           To get the sample position at which base 3 was called: $scf{index}[3]

           To get the value of the "C" curve under base 3:
           $scf{samples}{C}[$scf{index}[3]]

           To get the probability that base 3 is a "C":           $scf{C}[3]

           To print out the chromatogram as a four-column list:

               my $samples = $scf{samples};
               for (my $i = 0; $i<$scf{samples_length}; $i++) {
                  # parenthesize join() so "\n" is not joined with a trailing tab
                  print join("\t",$samples->{C}[$i],$samples->{G}[$i],
                                  $samples->{A}[$i],$samples->{T}[$i]),"\n";
               }
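
           The lookups above can be combined into one pass over the called bases. The
           following is a minimal sketch, not part of Bio::SCF: peak_heights() is a
           hypothetical helper, and %demo is invented stand-in data laid out like the tied
           hash (same keys as in the table above), so the code runs without an SCF file.

```perl
use strict;
use warnings;

# Return one record per called base: the base, its sample position, and the
# intensity of each channel at that position. $scf is a reference to a hash
# laid out like the tied hash described above.
sub peak_heights {
    my ($scf) = @_;
    my @rows;
    for my $i (0 .. $scf->{bases_length} - 1) {
        my $pos = $scf->{index}[$i];
        push @rows, {
            base => $scf->{bases}[$i],
            pos  => $pos,
            map { ($_ => $scf->{samples}{$_}[$pos]) } qw(A C G T),
        };
    }
    return @rows;
}

# Stand-in data (invented for illustration), not read from a real file.
my %demo = (
    bases_length   => 2,
    samples_length => 6,
    bases          => [qw(A C)],
    index          => [1, 4],
    samples        => {
        A => [0, 90, 10, 0,  5,  0],
        C => [0,  5,  0, 10, 80, 10],
        G => [0,  3,  0, 0,  2,  0],
        T => [0,  1,  0, 0,  4,  0],
    },
);

for my $row (peak_heights(\%demo)) {
    print join("\t", @{$row}{qw(base pos A C G T)}), "\n";
}
```

           With a real file, the same helper should accept a reference to the tied hash,
           for example peak_heights(\%scf), since it only uses the documented keys.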

       $scf{bases}[$index] = $new_value
           The base call probability scores, base call values, base call positions, and sample
           values are all read/write, so that you can change them:

              $samples->{C}[500] = 0;
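
           As an illustration of in-place edits, the sketch below replaces weakly supported
           base calls with "N". Everything here is an assumption made for the example:
           mask_low_confidence() is a hypothetical helper, the threshold of 20 is arbitrary
           (this page does not state the scale of the scores), "N" is assumed to be an
           acceptable base code, and %demo is invented stand-in data with the same keys as
           the tied hash.

```perl
use strict;
use warnings;

# Replace base calls whose score for the called base is below $min_score
# with 'N'. $min_score is a caller-chosen, hypothetical threshold.
# Returns the number of calls that were changed.
sub mask_low_confidence {
    my ($scf, $min_score) = @_;
    my $changed = 0;
    for my $i (0 .. $scf->{bases_length} - 1) {
        my $base = uc $scf->{bases}[$i];
        next unless $base =~ /^[ACGT]$/;   # only A, C, G and T have score arrays
        next if $scf->{$base}[$i] >= $min_score;
        $scf->{bases}[$i] = 'N';
        $changed++;
    }
    return $changed;
}

# Stand-in data (invented for illustration), not read from a real file.
my %demo = (
    bases_length => 4,
    bases        => [qw(A C G T)],
    A            => [40, 0,  0,  0],
    C            => [0,  8,  0,  0],
    G            => [0,  0,  35, 0],
    T            => [0,  0,  0,  12],
);

my $n = mask_low_confidence(\%demo, 20);
print "$n calls masked: @{ $demo{bases} }\n";
```

           With a tied hash, the modified calls would then be saved with
           tied(%scf)->write($path), as in the SYNOPSIS.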

       each %scf
           Will return keys and values for the tied object.

       delete $scf{$key}
       %scf = ()
           These operations are not supported and will raise a run-time error.

   Object Methods
       $scf = Bio::SCF->new($scf_file_or_filehandle)
           Create a new Bio::SCF object. The single argument is the name of a file or a
           previously-opened filehandle. If successful, new() returns the Bio::SCF object.

       $length = $scf->bases_length
           Return the length of the called sequence.

       $samples = $scf->samples_length
           Return the length of the list of chromatogram samples in the file. There are four
           sample series, one for each base.

       $sample_size = $scf->sample_size
           Returns the size of each sample (bytes).

       $code_set = $scf->code_set
           Return the code set used for base calling.

       $base = $scf->base($base_no [,$new_base])
           Get the base call at the indicated position. If a new value is provided, the base
           call is changed to that value.

       $index = $scf->index($base_no [,$new_index])
           Translates the indicated base position into the sample index for that called base.
           Here is how to fetch the intensity values at base number 5:

             my $sample_index = $scf->index(5);
             my ($g,$a,$t,$c) = map { $scf->sample($_,$sample_index) } qw(G A T C);

           If you provide a new value for the sample index, it will be updated.

       $base_score = $scf->base_score($base,$base_no [,$new_score])
           Get the probability that the indicated base occurs at position $base_no. Here is how
           to fetch the probabilities for the four bases at base position 5:

             my ($g,$a,$t,$c) = map { $scf->base_score($_,5) } qw(G A T C);

           If you provide a new value for the base probability score, it will be updated.

       $score = $scf->score($base_no)
           Get the quality score for the called base at the indicated position.

       $intensity = $scf->sample($base,$sample_index [,$new_value])
           Get the intensity value for the channel corresponding to the indicated base at the
           indicated sample index. You may update the intensity by providing a new value.

       $scf->write('file_path')
           Write the updated SCF file to the indicated file path.

       $scf->fwrite($file_handle)
           Write the updated SCF file to the indicated filehandle. The file must previously have
           been opened for writing. The filehandle is actually reopened in append mode, so you
           can call fwrite() multiple times and intersperse your own record separators.

EXAMPLES

       Reading information from a preexisting file:

          tie %scf, 'Bio::SCF', "data.scf";
          print "Base calls:\n";
          for ( my $i=0; $i<$scf{bases_length}; $i++ ){
             print "$scf{bases}[$i] ";
          }
          print "\n";

          print "Intensity values for the A curve\n";
          for ( my $i=0; $i<$scf{samples_length}; $i++ ){
             print "$scf{samples}{A}[$i] ";
          }
          print "\n";
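
       The base-call loop above can also be written as a small function that returns the
       called sequence as line-wrapped text. This is a sketch under stated assumptions:
       called_sequence() is a hypothetical helper, the default width of 60 is arbitrary, and
       %demo is invented stand-in data using the same keys as the tied hash.

```perl
use strict;
use warnings;

# Join the base calls into one string, wrapped at $width characters per line.
# $scf is a reference to a hash laid out like the tied hash.
sub called_sequence {
    my ($scf, $width) = @_;
    $width ||= 60;
    my $seq = join '', map { $scf->{bases}[$_] } 0 .. $scf->{bases_length} - 1;
    $seq =~ s/(.{1,$width})/$1\n/g;   # add a newline after every $width characters
    return $seq;
}

# Stand-in data (invented for illustration), not read from a real file.
my %demo = (
    bases_length => 6,
    bases        => [qw(A C G T A C)],
);

print called_sequence(\%demo, 4);
```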

       Another example, where we set all bases to "A", indexes to 10 and write the file back:

          my $obj = tie %scf,'Bio::SCF','data.scf';
          for (0 .. $scf{bases_length}-1){
             $scf{bases}[$_] = "A";
             $obj->index($_, 10);
          }
          $obj->write('data.scf');

AUTHOR

       Dmitri Priimak, priimak@cshl.org (1999)

       with some cleanups by Lincoln Stein, lstein@cshl.edu (2006)

       This package and its accompanying libraries are free software; you can redistribute it
       and/or modify it under the terms of the GPL (either version 1, or at your option, any
       later version) or the Artistic License 2.0.  Refer to LICENSE for the full license text.
       In addition, please see DISCLAIMER for disclaimers of warranty.

SEE ALSO

       perl(1).