Provided by: phast_1.5+dfsg-1_amd64

NAME

       phastBias - Identify regions of the alignment which are affected by gBGC

SYNOPSIS

       The  alignment file can be in any of several file formats (see --msa-format).  The neutral
       model must be in the .mod format produced by the phyloFit program.  The  foreground_branch
       should  identify  a  branch  of  the tree (internal branches can be named with tree_doctor
       --name-ancestors).
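
       The three inputs named above can be assembled into a command line as in the following
       sketch.  It assumes that the positional arguments come after the options, in the order in
       which they are described here (alignment, neutral model, foreground branch); the helper
       name and the file and branch names are hypothetical.

```python
def build_phastbias_command(alignment, neutral_mod, foreground_branch, options=None):
    """Return an argv list for phastBias.

    Assumption: options come first, then the three positional arguments in
    the order alignment, neutral model, foreground branch.
    """
    argv = ["phastBias"]
    for flag, value in (options or {}).items():
        argv.append(flag)
        if value is not None:  # flags without a value pass None
            argv.append(str(value))
    argv += [alignment, neutral_mod, foreground_branch]
    return argv


# Hypothetical file and branch names, for illustration only.
cmd = build_phastbias_command(
    "alignment.fa",
    "neutral.mod",
    "human",
    options={"--bgc": 3, "--output-tracts": "tracts.gff"},
)
print(" ".join(cmd))
```

       The resulting list can be handed to subprocess.run with stdout redirected to a file,
       since the posterior scores are written to stdout.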

DESCRIPTION

       Identify regions of the alignment which are affected by gBGC, indicated by  a  cluster  of
       weak-to-strong (A/T -> G/C) substitutions amidst a deficit of strong-to-weak substitutions
       on a particular branch of the tree.  The regions are identified by a phylo-HMM  with  four
       states: neutral, conserved, neutral with gBGC, and conserved with gBGC.

   OUTPUT:
       phastBias  produces  a wig file with scores for every position in the alignment indicating
       the probability of being in one of the gBGC states.  It can also produce  gBGC  tracts  by
       thresholding  this  probability  at 0.5, or a matrix of probabilities for all four states.
       See OUTPUT OPTIONS below.
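
       As an illustration of how the per-position scores might be consumed, the sketch below
       reads fixed-step wiggle text into (chrom, position, value) records.  It follows the
       general fixedStep layout (a declaration line followed by one value per line); the
       chromosome name and values in the example are made up, and the exact header written by
       phastBias is not specified here.

```python
def parse_fixed_step_wig(lines):
    """Yield (chrom, position, value) tuples from fixedStep wiggle lines."""
    chrom, pos, step = None, None, 1
    for line in lines:
        line = line.strip()
        # Skip blank lines and optional track/browser/comment lines.
        if not line or line.startswith(("track", "browser", "#")):
            continue
        if line.startswith("fixedStep"):
            # Declaration line, e.g. "fixedStep chrom=chr1 start=101 step=1".
            fields = dict(f.split("=", 1) for f in line.split()[1:])
            chrom = fields["chrom"]
            pos = int(fields["start"])
            step = int(fields.get("step", 1))
            continue
        if pos is None:
            raise ValueError("data line found before a fixedStep declaration")
        yield chrom, pos, float(line)
        pos += step


# Made-up example input.
example = ["fixedStep chrom=chr1 start=101 step=1", "0.02", "0.75", "0.90"]
records = list(parse_fixed_step_wig(example))
for record in records:
    print(record)
```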

OPTIONS

   GENERAL OPTIONS:
       --help,-h
              Print this help message.

   TUNING PARAMETER OPTIONS:

   gBGC PARAMETERS:

       --bgc <B>
              The B parameter describes the strength of gBGC.  It must be > 0.  Too low a value
              may yield false positives, as the gBGC model becomes indistinguishable from the
              non-gBGC model.  Default: 3

       --estimate-bgc <0|1>
              Use "--estimate-bgc 1" to estimate B by maximum likelihood.  Default: 0

       --bgc-exp-length <length>
              Set the prior expected length of gBGC tracts.  This is equivalent to 1/alpha in
              the parametrization defined by Capra et al., where alpha is the rate out of gBGC
              states.  Default: 1000

       --estimate-bgc-exp-length <0|1>
              Use "--estimate-bgc-exp-length 1" to estimate this parameter by an
              expectation-maximization algorithm.  Default: 0

       --bgc-target-coverage <coverage>

              Set the prior for gBGC tract coverage (as a fraction between 0 and 1).

              This is represented in the model as beta/(alpha+beta), where beta is the rate  into
              the gBGC state, and alpha is the rate out of the gBGC state.

              Default: 0.01
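
       The two relations given for --bgc-exp-length and --bgc-target-coverage (length = 1/alpha,
       coverage = beta/(alpha+beta)) can be inverted to recover the transition rates.  The
       following is a small worked sketch of that arithmetic, not code taken from phastBias;
       the function name is hypothetical.

```python
def transition_rates(exp_length, target_coverage):
    """Return (alpha, beta) implied by an expected length and a target coverage.

    alpha = 1 / exp_length                       (rate out of gBGC states)
    coverage = beta / (alpha + beta)
      => beta = alpha * coverage / (1 - coverage)  (rate into gBGC states)
    """
    if exp_length <= 0:
        raise ValueError("exp_length must be positive")
    if not 0 < target_coverage < 1:
        raise ValueError("target_coverage must lie strictly between 0 and 1")
    alpha = 1.0 / exp_length
    beta = alpha * target_coverage / (1.0 - target_coverage)
    return alpha, beta


# Using the documented defaults: expected length 1000, target coverage 0.01.
alpha, beta = transition_rates(1000, 0.01)
print(alpha, beta)
print(beta / (alpha + beta))  # recovers the target coverage up to rounding
```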

       --estimate-bgc-target-coverage <0|1>
              Use "--estimate-bgc-target-coverage 0" to hold this parameter constant.
              Default: 1 (This is the only parameter estimated by default.)

   CONSERVATION PARAMETERS:
       Note: tuning these parameters with phastBias is not recommended.  Instead, use phastCons
       to determine the best values for rho and the transition rates into and out of conserved
       elements.  See phastCons --help and the phastCons HOWTO (available online) to learn
       about tuning these parameters.

       --rho <rho>
              Set the scaling factor for branch lengths in conserved states.  Rho should be
              between 0 and 1.  Default: 0.31

       --cons-exp-length <length>
              Set the prior expected length of conserved elements.  This parameter is held
              constant; to tune it, use the phastCons program under a non-gBGC model (see the
              --expected-length option in phastCons).  Default: 45

       --cons-target-coverage <cov>

              Set the prior for coverage of conserved elements (as a fraction between 0  and  1).
              Like  the --cons-exp-length above, this parameter is also held constant, but can be
              tuned with phastCons (see phastCons --transitions).  Default: 0.3

   OTHER PARAMETERS:
       --scale <scale>
              Set an overall scaling factor for the branch lengths in all states.  Default: 1

       --estimate-scale <0|1>
              Rescale the branches in all states by a scaling factor determined by maximum
              likelihood (initialized by --scale above).  Default: 0

       --eqfreqs-from-msa <0|1>

              Reset equilibrium frequencies of A,C,G,T based on frequencies observed in the
              alignment.  Otherwise the frequencies from the input model are left unchanged.
              Default: 1

   OUTPUT OPTIONS:
       --output-tracts <file.gff>

              Print a GFF file identifying all regions with posterior probability of being  in  a
              gBGC state > 0.5.
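
       The thresholding rule described for this option can be sketched as follows: consecutive
       positions whose posterior probability exceeds 0.5 are merged into one tract.  This is an
       illustration of the rule only, not the program's own implementation; the function name
       and the 1-based inclusive coordinate convention are assumptions.

```python
def call_tracts(probs, start=1, threshold=0.5):
    """Return (start, end) intervals, 1-based inclusive, where prob > threshold."""
    tracts = []
    tract_start = None
    for offset, p in enumerate(probs):
        pos = start + offset
        if p > threshold:
            if tract_start is None:
                tract_start = pos  # open a new tract
        elif tract_start is not None:
            tracts.append((tract_start, pos - 1))  # close the open tract
            tract_start = None
    if tract_start is not None:
        # The last tract runs to the end of the scored region.
        tracts.append((tract_start, start + len(probs) - 1))
    return tracts


# Made-up posterior probabilities for five consecutive positions.
tracts = call_tracts([0.1, 0.6, 0.9, 0.4, 0.7])
print(tracts)
```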

       --posteriors <none|wig|full>

              Use this option to control posterior probability output, which is written to
              stdout.  "none" suppresses the output; "wig" outputs a standard fixed-step wiggle
              file giving the probability that each base is assigned to a gBGC state; "full"
              outputs a table with five columns: the coordinate (1-based, relative to the first
              sequence in the alignment), followed by the probabilities of each of the four
              states (neutral, conserved, gBGC neutral, gBGC conserved).

              Default: wig
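
       A table in the "full" format could be read back roughly as shown below, with the
       combined gBGC probability taken as the sum of the two gBGC columns.  The sketch assumes
       whitespace-separated columns and skips any line that does not parse as one integer
       followed by four numbers (for example a header, if one is present); the function names
       and the sample rows are hypothetical.

```python
def parse_full_posteriors(lines):
    """Return rows of (coord, neutral, conserved, gbgc_neutral, gbgc_conserved)."""
    rows = []
    for line in lines:
        fields = line.split()
        if len(fields) != 5:
            continue  # blank or malformed line
        try:
            coord = int(fields[0])
            probs = tuple(float(x) for x in fields[1:])
        except ValueError:
            continue  # e.g. a header line with column names
        rows.append((coord,) + probs)
    return rows


def gbgc_probability(row):
    """Probability of being in either gBGC state: the last two columns summed."""
    _, _neutral, _conserved, gbgc_neutral, gbgc_conserved = row
    return gbgc_neutral + gbgc_conserved


# Made-up sample rows, for illustration only.
sample = [
    "#coord neutral conserved gBGC_neutral gBGC_conserved",
    "1 0.60 0.30 0.07 0.03",
    "2 0.20 0.10 0.50 0.20",
]
rows = parse_full_posteriors(sample)
for row in rows:
    print(row[0], gbgc_probability(row))
```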

       --output-mods <output_root>

              Print   out  the  tree  models  for  all  four  states  to  <output_root>.cons.mod,
              <output_root>.neutral.mod,             <output_root>.gBGC_cons.mod,             and
              <output_root>.gBGC_neutral.mod.
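
       For scripts that post-process these models, the four file names can be derived from the
       output root as in this brief sketch (the helper name and the example root are
       hypothetical):

```python
def model_paths(output_root):
    """Map each state label to the .mod file name documented for --output-mods."""
    suffixes = ("cons", "neutral", "gBGC_cons", "gBGC_neutral")
    return {suffix: f"{output_root}.{suffix}.mod" for suffix in suffixes}


paths = model_paths("run1")
for state, path in paths.items():
    print(state, path)
```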

       --informative-fn,-i <file.gff>

              Print  a  GFF  containing  regions of the alignment which are informative for gBGC.
              Note: only works properly if foreground branch is a single branch (not a  group  of
              branches).

       --informative-only,-o

              (To be used with --informative-fn). Print the informative regions, then quit.

SEE ALSO

       Capra  JA,  Hubisz MJ, Kostka D, Pollard KS, Siepel A: A Model-Based Analysis of GC-Biased
       Gene Conversion in the Human and Chimpanzee Genomes.  (Manuscript in submission).