Provided by: phast_1.4+dfsg-1_amd64

NAME

       eval_predictions - Compares predicted genes with "real" (or annotated) genes.

SYNOPSIS

       eval_predictions -r <real_fname_list> -p <pred_fname_list> -l <seq_len_list> [OPTIONS]

DESCRIPTION

       Compares  predicted  genes  with  "real"  (or  annotated) genes.  Reports standard measures of prediction
       quality.  The following measures are reported:

              - nucleotide sensitivity (Sn)

              - nucleotide specificity (Sp)

              - approximate correlation (AC)

              - correlation coefficient (CC)

              - exon sensitivity (ESn)

              - exon specificity (ESp)

              - proportion of real exons correctly predicted (CRa)

              - proportion of real exons partially predicted (PCa)

              - proportion of real exons with overlapping predictions (OLa)

              - missed exons (ME)

              - proportion of predicted exons that are correct (CRp)

              - proportion of predicted exons that are partially correct (PCp)

              - proportion of predicted exons that overlap real ones (OLp)

              - wrong exons (WE)

       All quantities are computed as described in "Evaluation of Gene-Finding Programs on Mammalian Sequences,"
       by  Rogic  et al. (Genome Research 11:817-832).  Note that CRa + PCa + OLa + ME = 1 and CRp + PCp + OLp +
       WE = 1.  Note also that each set (predicted  and  real)  should  consist  of  non-overlapping  groups  of
       features (see 'refeature').
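
       The nucleotide-level measures can be illustrated with a short sketch.  It is not taken from the
       eval_predictions source; it uses the commonly cited definitions of Sn, Sp, AC, and CC from the
       gene-finding evaluation literature, and the function name and arguments are hypothetical.  It assumes
       per-base counts of true positives, false positives, and false negatives are already available and that
       no denominator is zero.  True negatives are derived from the sequence length.

```python
from math import sqrt


def nucleotide_stats(tp, fp, fn, seq_len):
    """Nucleotide-level measures from per-base counts (illustrative only).

    tp: coding bases predicted as coding
    fp: noncoding bases predicted as coding
    fn: coding bases predicted as noncoding
    seq_len: total sequence length, used to derive true negatives
    """
    tn = seq_len - tp - fp - fn

    sn = tp / (tp + fn)  # sensitivity: fraction of real coding bases found
    sp = tp / (tp + fp)  # "specificity" in the gene-finding sense (precision)

    # Approximate correlation: average of four conditional probabilities,
    # rescaled from [0, 1] to [-1, 1].
    acp = (tp / (tp + fn) + tp / (tp + fp) + tn / (tn + fp) + tn / (tn + fn)) / 4
    ac = (acp - 0.5) * 2

    # Correlation coefficient (Matthews correlation).
    cc = (tp * tn - fn * fp) / sqrt((tp + fn) * (tn + fp) * (tp + fp) * (tn + fn))

    return {"Sn": sn, "Sp": sp, "AC": ac, "CC": cc}


stats = nucleotide_stats(tp=80, fp=20, fn=20, seq_len=1000)
```

       With these counts, Sn and Sp are both 0.8, and AC and CC both come out to 7/9 (about 0.778).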

OPTIONS

       -r <real_fname_list>

              (required) List of names of files defining real genes (GFF).

       -p <pred_fname_list>

              (required)  List  of  names  of files defining predicted genes (GFF).  Must correspond in order to
              <real_fname_list>.

       -l <seq_len_list>

              (required) List of lengths of sequences.  Needed to compute certain nucleotide-level
              statistics.

       -f <feat_list>

              List of names of all features denoting exon regions.  By default, equal to the single name
              "CDS".

       -d <fname_prefix>

              Dump full coords of correct, partially correct, wrong, missed, and overlapping exons to a set
              of files having the specified file name prefix.

       -n <nbases>

              Also report stats on "nearly correct" exons, that is, incorrect exons whose boundaries are
              within <nbases> of being correct.  Columns will be labeled "NCa" and "NCp".

       -h

              Print this help message.

       NOTE: be sure stop codons are included in CDSs in both the predicted and real sets, or in neither set.
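
       The effect of the -n option and of the stop codon note can be sketched as a boundary comparison.  This
       is a minimal illustration, not the tool's implementation: the function name, the inclusive (start, end)
       coordinate convention, and the inclusive treatment of the <nbases> threshold are all assumptions.

```python
def classify_exon(pred, real, nbases=0):
    """Compare one predicted exon to one real exon (illustrative only).

    pred, real: (start, end) tuples with inclusive coordinates.
    nbases: tolerance for "nearly correct"; 0 disables the category.
    """
    start_diff = abs(pred[0] - real[0])
    end_diff = abs(pred[1] - real[1])

    if start_diff == 0 and end_diff == 0:
        return "correct"
    # Assumption: both boundaries must lie within nbases of the real ones.
    if nbases > 0 and start_diff <= nbases and end_diff <= nbases:
        return "nearly correct"
    return "incorrect"


# A real CDS that includes the stop codon versus a prediction that omits it:
# the end coordinates differ by exactly 3 bases.
real_exon = (100, 203)
pred_exon = (100, 200)

strict = classify_exon(pred_exon, real_exon)
tolerant = classify_exon(pred_exon, real_exon, nbases=3)
```

       Here strict is "incorrect" and tolerant is "nearly correct", which is why an otherwise perfect
       prediction can be counted as wrong when only one of the two sets includes stop codons.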