Provided by: phast_1.6+dfsg-3_amd64

NAME

       phyloBoot - Generate simulated alignment data by parametric or nonparametric bootstrapping

DESCRIPTION

       Generate  simulated  alignment  data  by parametric or nonparametric bootstrapping, and/or
       estimate errors in phylogenetic model parameters.  When estimating errors  in  parameters,
       the  tree  topology  is  not  inferred  --  estimated  errors are conditional on the given
       topology.

       If a model is  given  in  the  form  of  a  .mod  file  (<model_fname>),  then  parametric
       bootstrapping  is  performed  -- i.e., synthetic data sets are drawn from the distribution
       defined by the model.  Otherwise, the input file is assumed to be  a  multiple  alignment,
       and  non-parametric bootstrapping is performed -- i.e., sites are drawn (with replacement)
       from the empirical distribution defined by the given alignment.
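
       The non-parametric case can be illustrated with a minimal sketch.  It is not
       phyloBoot's implementation: the file names, the fixed seed, and the two-column
       "name sequence" layout are assumptions chosen for illustration, and that layout is
       not a format phyloBoot reads.  The sketch draws column indices with replacement and
       rebuilds every row from the same draws, so each resampled site remains an intact
       alignment column.

```shell
# Toy alignment: one "name sequence" pair per line (illustrative layout only).
cat > toy_aln.txt <<'EOF'
human ACGTACGTAC
chimp ACGTACGTAT
mouse ACGAACGTTC
EOF

# Draw ncols column indices with replacement, then rebuild every row from
# the same indices so that each sampled site keeps its original column.
awk -v seed=42 '
    { name[NR] = $1; seq[NR] = $2 }
    END {
        srand(seed)                       # fixed seed: repeatable draws
        ncols = length(seq[1])
        for (j = 1; j <= ncols; j++)
            pick[j] = int(rand() * ncols) + 1
        for (i = 1; i <= NR; i++) {
            out = ""
            for (j = 1; j <= ncols; j++)
                out = out substr(seq[i], pick[j], 1)
            print name[i], out
        }
    }' toy_aln.txt > toy_boot.txt

cat toy_boot.txt
```

       The resampled file has the same number of rows and sites as the input; only the
       choice of columns differs between replicates.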

       The default behavior is to produce simulated alignments,  estimate  model  parameters  for
       each  one,  and then write a table to stdout with a row for each parameter and columns for
       the mean, standard deviation (approximate standard error), median, minimum, and maximum of
       estimated values, plus the boundaries of 95% and 90% confidence intervals.
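
       As a rough illustration of what several of these columns summarize, the sketch below
       computes the mean, sample standard deviation, median, minimum, and maximum for one
       parameter over a handful of replicates.  The values and the file name are made up,
       the output layout is not phyloBoot's, and the confidence-interval columns are left
       out because how phyloBoot derives them is not described here.

```shell
# Made-up estimates of one parameter across 10 hypothetical replicates.
printf '%s\n' 0.12 0.15 0.11 0.14 0.13 0.16 0.12 0.13 0.15 0.14 > estimates.txt

# Sort numerically so that the median, minimum, and maximum can be read by index.
stats=$(LC_ALL=C sort -n estimates.txt | LC_ALL=C awk '
    { v[NR] = $1; sum += $1; sumsq += $1 * $1 }
    END {
        n = NR
        mean = sum / n
        sd = sqrt((sumsq - n * mean * mean) / (n - 1))   # sample standard deviation
        if (n % 2) median = v[(n + 1) / 2]
        else       median = (v[n / 2] + v[n / 2 + 1]) / 2
        printf "mean=%.4f sd=%.4f median=%.4f min=%.4f max=%.4f\n",
               mean, sd, median, v[1], v[n]
    }')

echo "$stats"
```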

       The --alignments-only option, however, allows the parameter estimation step to be bypassed
       entirely, and the program to be used simply to generate simulated data  sets.   See  usage
       for phyloFit for additional details on tree-building options.

EXAMPLE

       (See below for more details on options)

       1. Estimation of parameter errors by parametric bootstrapping.

              phyloBoot --nreps 500 --nsites 10000 mymodel.mod > par_errors

       2. Estimation of parameter errors by nonparametric bootstrapping.

              phyloBoot --nreps 500 --nsites 10000 --tree "((human,chimp),(mouse,rat))" \
                  myalignment.fa > nonpar_errors

       3. Parametric generation of simulated data.

              phyloBoot mymodel.mod --alignments-only pardata --nreps 500 --nsites 10000

       4. Nonparametric generation of simulated data.

              phyloBoot myalignment.fa --alignments-only nonpardata --nreps 500 --nsites 10000

OPTIONS

   bootstrapping options
       --nsites, -L <number>

              Number of sites in sampled alignments.  If an alignment is given (non-parametric
              case), default is number of sites in alignment; otherwise default is 1000.

       --nreps, -n <number> Number of replicates.  Default is 100.

       --msa-format, -i FASTA|PHYLIP|MPM|MAF|SS

              (Non-parametric case only) Alignment format.  Default is to guess format from
              file contents.

       --alignments-only,  -a <fname_root> Generate alignments and write them to files with given
              filename root, but do not estimate parameters.

       --dump-mods, -d <fname_root>

              Dump .mod files for individual estimated models (one for each replicate).

       --dump-samples, -m <fname_root>

              Dump  simulated  alignments  to  files  with  given  filename  root.   Similar   to
              --alignments-only but does not disable parameter estimation.

       --dump-format, -o FASTA|PHYLIP|MPM|SS

              (For  use with --alignments-only or --dump-samples) File format to use when dumping
              raw alignments.  Default FASTA.

       --read-mods, -R <fname_list> Read estimated models from list of filenames instead of
              generating alignments and estimating parameters.  fname_list can be a
              comma-delimited list of files, or, if preceded by a '*', the name of a file
              containing the file names (one per line).  Can be used to compute statistics for
              replicates that have been processed separately (see --alignments-only).  When
              this option is used, the primary argument to the program
              (<model_fname>|<msa_fname>) will be ignored.
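
       A minimal sketch of the list-file form follows.  The helper write_mod_list, the file
       names (rep*.mod, modlist.txt, combined_stats), and the choice to still pass
       myalignment.fa as the ignored primary argument are assumptions for illustration.
       The '*' prefix is quoted so that the shell does not treat it as a glob.

```shell
# write_mod_list ROOT OUT: write every existing ROOT*.mod path to OUT,
# one per line (hypothetical helper, not part of PHAST).
write_mod_list() {
    root=$1
    out=$2
    : > "$out"
    for f in "$root"*.mod; do
        if [ -e "$f" ]; then
            printf '%s\n' "$f" >> "$out"
        fi
    done
}

# Collect models that were estimated separately (assumed to be named rep*.mod).
write_mod_list rep modlist.txt

# Run only when phyloBoot is installed and the list is non-empty.
# The quoted leading '*' marks modlist.txt as a file of file names.
if command -v phyloBoot >/dev/null 2>&1 && [ -s modlist.txt ]; then
    phyloBoot --read-mods "*modlist.txt" myalignment.fa > combined_stats
fi
```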

       --output-average,  -A  <fname>  Output  a tree model representing the average of all input
              models to the specified file.

       --quiet, -q

              Proceed quietly.

       --help, -h Print this help message.

   tree-building options
       --tree, -t <tree_fname>|<tree_string>  (Required  if  non-parametric  and  more  than  two
              species) Name of file or literal string defining tree topology.

       --subst-mod, -s JC69|F81|HKY85|REV|SSREV|UNREST|R2|R2S|U2|U2S|R3|R3S|U3|U3S

              (default REV) Nucleotide substitution model.

       --nrates, -k <nratecats>

              (default 1) Number of rate categories to use.  Specifying a value greater than
              one causes the discrete gamma model for rate variation to be used.

       --EM, -E Use EM rather than the BFGS quasi-Newton algorithm for parameter estimation.

       --precision, -p HIGH|MED|LOW

              (default HIGH) Level of precision to use in estimating model parameters.

       --init-model, -M <mod_fname>

              Initialize optimization procedure with specified tree model.

       --init-random, -r

              Initialize parameters randomly.

       --scale, -P <rho> Scale input tree by factor rho before doing parametric simulations.

       --subtree, -S <node>

              For use with --subtree-scale and/or --subtree-switch.  Define subtree including
              all children of named node, including branch leading up to node.

       --subtree-scale, -l <lambda> Scale subtree defined with --subtree option by factor lambda.

       --subtree-switch, -w <prob>

              With given probability, randomly switch branches in tree from subtree to  supertree
              and  vice versa.  Randomization is performed independently for each branch in every
              column of simulated data.

       --scale-file, -F <file>

              (For use with --subtree in parametric mode) Instead of using --subtree-scale or
              --scale, read in a tab-delimited file with three columns: numSite, scale,
              subtree_scale.  For each row in the file, phyloBoot will simulate the given
              number of sites with those scaling factors and then move on to the next row, so
              that the total number of sites is the sum of the first column.
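
       A scale file of this shape could be prepared as sketched below.  The file names, the
       node name mouse-rat, the scaling values, and the absence of a header row are
       assumptions for illustration; the node name in particular depends on the tree in the
       model file.

```shell
# Three tab-separated columns per row: number of sites, scale, subtree_scale.
# Assumption: the file carries no header row.
printf '5000\t1.0\t1.0\n'  > scales.tsv
printf '3000\t1.0\t2.5\n' >> scales.tsv
printf '2000\t0.5\t1.0\n' >> scales.tsv

# The total number of simulated sites is the sum of the first column.
total_sites=$(awk -F '\t' '{ n += $1 } END { print n }' scales.tsv)
echo "total sites: $total_sites"

# Run only when phyloBoot and the (hypothetical) model file are available.
if command -v phyloBoot >/dev/null 2>&1 && [ -f mymodel.mod ]; then
    phyloBoot mymodel.mod --subtree mouse-rat --scale-file scales.tsv \
        --alignments-only scaled_data --nreps 10
fi
```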