Provided by: bioperl_1.6.924-3_all

NAME

       bp_process_gadfly.pl - Massage Gadfly/FlyBase GFF files into a version suitable for the
       Generic Genome Browser

SYNOPSIS

         % bp_process_gadfly.pl na_whole-genome_genomic_dmel_RELEASE3.FASTA whole-genome_annotation-feature-region_dmel_RELEASE3.GFF > fly.gff

DESCRIPTION

       This script massages the RELEASE 3 FlyBase/Gadfly GFF files located at
       http://www.fruitfly.org/sequence/release3download.shtml into the version of the GFF
       format that the Generic Genome Browser expects.

       To use this script, download the whole genome FASTA file and save it to disk.  (The
       downloaded file will be called something like
       "na_whole-genome_genomic_dmel_RELEASE3.FASTA", but the link on the HTML page doesn't give
       the filename.)  Do the same for the whole genome GFF annotation file (the saved file will
       be called something like "whole-genome_annotation-feature-region_dmel_RELEASE3.GFF").  If
       you wish, you can download the ZIP-compressed versions of these files.

       Next, run this script on the two files, giving the name of the downloaded FASTA file
       first, followed by the GFF file:

        % bp_process_gadfly.pl na_whole-genome_genomic_dmel_RELEASE3.FASTA whole-genome_annotation-feature-region_dmel_RELEASE3.GFF > fly.gff

       The fly.gff file and the FASTA file can now be loaded into a Bio::DB::GFF database
       using the following command:

         % bulk_load_gff.pl -d fly -fasta na_whole-genome_genomic_dmel_RELEASE3.FASTA fly.gff

       (Where "fly" is the name of the database.  Change it as appropriate.  The database must
       already exist and be writable by you!)
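
       As a rough sketch of the two steps above, the following shell functions check that both
       downloaded files are present and non-empty before running the conversion and the load.
       The function names check_inputs and run_pipeline are illustrative and are not part of
       BioPerl; the commands and options are the ones shown in the examples above.

```shell
# Hypothetical helper: succeed only if every named file exists and is non-empty.
check_inputs() {
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty file: $f" >&2
            return 1
        fi
    done
}

# Hypothetical wrapper around the two commands shown above.
# Usage: run_pipeline FASTA_FILE GFF_FILE DATABASE
run_pipeline() {
    check_inputs "$1" "$2" || return 1
    # Step 1: convert the RELEASE3 files and write the result to fly.gff.
    bp_process_gadfly.pl "$1" "$2" > fly.gff || return 1
    # Step 2: load fly.gff and the FASTA file into the existing database.
    bulk_load_gff.pl -d "$3" -fasta "$1" fly.gff
}
```

       With these definitions, the whole procedure might be run as "run_pipeline
       na_whole-genome_genomic_dmel_RELEASE3.FASTA
       whole-genome_annotation-feature-region_dmel_RELEASE3.GFF fly".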

       The resulting database will have the following feature types (represented as
       "method:source"):

         Component:arm              A chromosome arm
         Component:scaffold         A chromosome scaffold (accession #)
         Component:gap              A gap in the assembly
         clone:clonelocator         A BAC clone
         gene:gadfly                A gene accession number
         transcript:gadfly          A transcript accession number
         translation:gadfly         A translation
         codon:gadfly               Significance unknown
         exon:gadfly                An exon
         symbol:gadfly              A classical gene symbol
         similarity:blastn          A BLASTN hit
         similarity:blastx          A BLASTX hit
         similarity:sim4            EST->genome using SIM4
         similarity:groupest        EST->genome using GROUPEST
         similarity:repeatmasker    A repeat
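
       One way to check which of these feature types a converted file actually contains is to
       tally the "method:source" pairs directly from the GFF text. The sketch below assumes the
       usual tab-separated GFF layout, with the source in column 2 and the method in column 3;
       the function name count_feature_types is illustrative.

```shell
# Hypothetical helper: print each "method:source" pair found in a GFF file
# together with the number of lines carrying it, sorted by pair name.
count_feature_types() {
    awk -F '\t' '
        /^#/ || NF < 8 { next }            # skip comment lines and short lines
        { count[$3 ":" $2]++ }             # column 3 = method, column 2 = source
        END { for (t in count) printf "%s\t%d\n", t, count[t] }
    ' "$1" | sort
}
```

       For example, "count_feature_types fly.gff" prints one line per feature type, which can
       be compared against the list above.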

       IMPORTANT NOTE: This script will *only* work with the RELEASE3 gadfly files and will not
       work with earlier releases.

SEE ALSO

       Bio::DB::GFF, bulk_load_gff.pl, load_gff.pl

AUTHOR

       Lincoln Stein, lstein@cshl.org

       Copyright (c) 2002 Cold Spring Harbor Laboratory

       This library is free software; you can redistribute it and/or modify it under the same
       terms as Perl itself.  See DISCLAIMER.txt for disclaimers of warranty.