Ubuntu Manpage: phastOdds - Compute log-odds scores based on two phylogenetic models or phylo-HMMs,

NAME

       phastOdds - Compute log-odds scores based on two phylogenetic models or phylo-HMMs,

DESCRIPTION

       Compute  log-odds  scores  based  on  two phylogenetic models or phylo-HMMs, one for features of interest
       (e.g., coding exons, conserved regions) and one for background.  Will either (1) compute a score for each
       feature  in an input set, and output the same set of features with scores; or (2) output a separate score
       for each position in fixed-step WIG format (http://genome.ucsc.edu/goldenPath/help/wiggle.html);  or  (3)
       compute  scores in a sliding window of designated size, and output a three-column file, with the index of
       the center of each window followed by the score for that window on the positive strand,  then  the  score
       for  that  window  on the negative strand.  The default is to assume a reference sequence alignment, with
       the reference sequence appearing first; feature coordinates are assumed to be defined with respect to the
       reference sequence (see --refidx).

SYNOPSIS

       phastOdds   [OPTIONS]   --background-mods   <bmods>   [--background-hmm  <bhmm>]  --feature-mods  <fmods>
       [--feature-hmm <fhmm>] ( --features <feats> | --window <size> ) <alignment>

       Arguments <bmods> and <fmods> should be comma-delimited lists of phylogenetic models in .mod  format  (as
       produced  by  phyloFit),  <feats> may be in GFF, BED, or genepred format, and <alignment> may be in FASTA
       format or an alternative format specified by --msa-format.  HMM files should be in  the  format  used  by
       exoniphy.

EXAMPLE

       (See below for more details on options)

       1. Compute conservation scores for features in a GFF file, based on a

       model  for conserved sites (conserved.mod) vs. a model of neutral evolution (neutral.mod).  (These models
       may be estimated with phyloFit or phastCons.)

              phastOdds  --background-mods  neutral.mod  --feature-mods  conserved.mod  --features  features.gff
              alignment.fa > scores.gff

       Features  could  alternatively  be  specified in BED or genepred format (format will be auto-recognized).
       The program can be made to produce BED-formatted output with --output-bed.

       2. Compute conservation scores in a sliding window of size 100.

              phastOdds --background-mods neutral.mod --feature-mods conserved.mod --window 100  alignment.fa  >
              scores.dat

       (Window  is  advanced  one  site at a time.  Window boundaries are defined in the coordinate frame of the
       multiple alignment, but center coordinates are converted to the frame of the reference sequence  as  they
       are output.)

       3. Compute a "coding potential" score for features in a BED file, based on a phylo-HMM for coding regions
       versus a phylo-HMM for noncoding DNA, with states for conserved and nonconserved sequences.

              phastOdds   --background-mods   codon1.mod,codon2.mod,codon3.mod    --background-hmm    coding.hmm
              --feature-mods    neutral.mod,conserved-noncoding.mod   --feature-hmm   noncoding.hmm   --features
              features.bed --output-bed alignment.fa > scores.bed

OPTIONS

       --background-mods, -b <backgd_mods>

              (Required) Comma-delimited list of  tree  model  (*.mod)  files  for  background.   If  used  with
              --background-hmm, order of models must correspond to order of states in HMM.

       --background-hmm, -B <backgd.hmm>

       HMM for background.
              If there is only one backgound tree

              model, a trivial (single-state) HMM will be assumed.

       --feature-mods,  -f <feat_mods> (Required) Comma-delimited list of tree model (*.mod) files for features.
              If used with --feature-hmm, order of models must correspond to order of states in HMM.

       --feature-hmm, -F <feat.hmm> HMM for features.  If there is only one tree model for features,  a  trivial
              (single-state) HMM will be assumed.

       --features, -g <feats.gff>

              (Required unless -w or -y) File defining features to be scored (GFF, BED, or genepred).

       --window, -w <size> (Can be used instead of -g or -y) Compute scores in a sliding window of the specified
              size.

       --base-by-base, -y

              (Can be used instead of -g or -y) Output base-by-base scores,  in  the  coordinate  frame  of  the
              reference sequence (or of the sequence specified by --refidx).  Output is in fixed-step WIG format
              (http://genome.ucsc.edu/goldenPath/help/wiggle.html).   This  option  can  only   be   used   with
              individual phylogenetic models, not with sets of models and a (nontrivial) HMM.

       --window-wig, -W <size>

              (Can  be  used  instead of -g or -y) Like --window but outputs scores in fixed-step WIG format, as
              with --base-by-base.  Scores for the positive strand only are output.

       --msa-format, -i <type>

       Input format for alignment.
              May be FASTA, PHYLIP, MPM, SS, or

              MAF (default is to guess format from file contents).

       --refidx, -r <ref_seq> Index of reference sequence for coordinates.  Use 0  to  indicate  the  coordinate
              system of the alignment as a whole.  Default is 1, for first sequence.

       --output-bed, -d

              (For use with -g) Generate output in bed format rather than GFF.

       --verbose, -v Verbose mode.  Print messages to stderr describing what the program is doing.

       --help, -h

              Print this help message.