Provided by: mptp_0.2.2-2_amd64


       mptp — single-locus species delimitation


       Maximum-likelihood species delimitation:
              mptp --ml (--single | --multi) --tree_file newickfile --output_file outputfile [options]

       Species delimitation with support values:
              mptp --mcmc positive integer (--single | --multi) (--mcmc_startnull | --mcmc_startrandom |
              --mcmc_startml) --mcmc_log positive integer --tree_file newickfile --output_file outputfile


       The species is one of the fundamental units of comparison in virtually all subfields of biology, from
       systematics to anatomy, development, ecology, evolution, genetics and molecular biology. The aim of mptp
       is to offer an open source tool to infer species boundaries on a given phylogenetic tree based on the
       Poisson Tree Process (PTP) and the Multiple Poisson Tree Process (mPTP) models.

       mptp offers two methods for inferring species delimitation. First, a maximum-likelihood based method that
       uses a dynamic programming approach to infer an ML estimate. Second, an MCMC approach that samples the
       space of possible delimitations and provides the user with support values on the tree clades. Both
       approaches are available in two flavours: the PTP and the mPTP model. The PTP model is selected with
       the --single switch and the mPTP model with the --multi switch.

       The input for mptp is a newick file that contains one phylogenetic tree whose branch lengths express the
       expected number of substitutions per alignment site.

       mptp parses a large number of command-line options. For easier navigation, options are grouped into the
       sections below.

       General options:

              --help   Display help text and exit.

              --version
                       Output version information and exit.

              --quiet  Suppress all output to stdout except for warnings and fatal error messages.

              --tree_file filename
                       Input newick file that contains a phylogenetic tree. Can be rooted or unrooted.

              --output_file filename
                       Specifies the prefix used for generating output files. For maximum-likelihood species
                       delimitation two files are created: filename.txt, which contains the actual
                       delimitation, and filename.svg, which contains an SVG figure of the computed delimitation.
                       For MCMC analyses, a file filename.txt is created that contains the newick tree with
                       support values.

              --outgroup comma-separated list of taxa
                       All computations for species delimitation are carried out on rooted trees. This option is
                       used only (and is required) when an unrooted tree is specified with the --tree_file
                       option. mptp roots the unrooted tree by splitting the branch leading to the most recent
                       common ancestor (MRCA) of the comma-separated list of taxa into two branches of equal
                       size and introducing a new node (the root of the new rooted tree) that connects these two
                       branches.

              --outgroup_crop
                       Crops taxa specified with the --outgroup option from the tree.

              --min_br real
                       Any branches in the input tree whose length is smaller than or equal to real are excluded
                       (ignored) from the computations. In addition, for MCMC analyses, subtrees that consist
                       exclusively of branches of length smaller than or equal to real are excluded from the
                       proposals (support values for those clades are set to 0). (default: 0.0001)

              --precision positive integer
                       Specifies the precision of the decimal part of floating point numbers on output (default:

              --minbr_auto filename
                       Automatically detects the minimum branch length from the p-distances of the FASTA file
                       filename.

              --tree_show
                       Show an ASCII version of the processed input tree (i.e., after it has been rooted at the
                       outgroup and the outgroup has, if requested, been cropped).
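
              The branch-length filter described under --min_br can be pictured with the following minimal
              Python sketch. It is an illustration only, not mptp's implementation; the function name
              usable_branches and the flat list of branch lengths are assumptions made for the example.

```python
def usable_branches(branch_lengths, min_br=0.0001):
    """Keep only branches strictly longer than min_br.

    Branches with length smaller than or equal to min_br are dropped,
    mirroring the textual description of --min_br (hypothetical helper).
    """
    return [length for length in branch_lengths if length > min_br]


lengths = [0.0, 0.00005, 0.0001, 0.0002, 0.013]
print(usable_branches(lengths))
```

              Note that a branch whose length equals the threshold exactly is excluded as well.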

       Maximum-likelihood estimations:

              Estimating the maximum-likelihood delimitation is triggered by the switch --ml combined with
              --single (the PTP model) or --multi (the mPTP model). Note that these two methods affect how
              the option --output_file behaves and can be controlled using the --min_br option. Both methods
              require a rooted phylogenetic tree; however, an unrooted tree may be specified in conjunction with
              the option --outgroup. In this case, mptp roots it at that outgroup (see General options,
              --outgroup for more info). Note that both methods output an SVG depiction of the ML delimitation.
              See SVG Output for more information on adjusting and fine-tuning the SVG output.

              Both methods discard branches whose length is smaller than or equal to the value specified using
              the --min_br option. The PTP model then attempts to find a connected subgraph of the rooted tree
              that (a) contains the root and (b) maximizes the sum of the likelihoods of fitting the edges of
              that subgraph to one exponential distribution and the remaining edges to another exponential
              distribution. By likelihood we mean the sum, over the edges assigned to a distribution, of the
              logarithm of the exponential probability density at each edge length, with the rate parameter
              defined as the reciprocal of the average edge length in that distribution.
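
              As an illustration of the score described above, the following Python sketch evaluates a given
              split of edges into a speciation class and a coalescent class. It is a sketch under stated
              assumptions, not mptp's code: the function names are hypothetical, the score is taken to be
              the log-likelihood, and the search over subgraphs is not shown.

```python
import math


def exp_loglik(lengths):
    """Log-likelihood of edge lengths under one exponential distribution.

    The rate is the reciprocal of the average edge length, so the sum of
    log(rate * exp(-rate * x)) over all edges is n*log(rate) - rate*total.
    """
    n = len(lengths)
    total = sum(lengths)
    if n == 0 or total <= 0.0:
        return 0.0  # assumption: an empty class contributes nothing
    rate = n / total
    return n * math.log(rate) - rate * total


def ptp_score(speciation_edges, coalescent_edges):
    """Score of one candidate delimitation: one distribution per class."""
    return exp_loglik(speciation_edges) + exp_loglik(coalescent_edges)


speciation = [0.12, 0.09, 0.15]
coalescent = [0.002, 0.001, 0.003, 0.002]
print(ptp_score(speciation, coalescent) > exp_loglik(speciation + coalescent))
```

              With long speciation edges and short coalescent edges, fitting two distributions scores higher
              than fitting a single one to all edges.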

              --ml --single
                       Triggers the algorithm for computing an ML estimate of the delimitation using the PTP
                       model.

              --ml --multi
                       Triggers the algorithm for computing an ML estimate of the delimitation using the mPTP
                       model.

              --pvalue  real
                       Only used with the PTP model (specified with --single). Sets the p-value for performing a
                       likelihood ratio test. Note that there is no likelihood ratio test for the mPTP model, so
                       the test is not performed when --multi is used. (default: 0.001)
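
              A likelihood ratio test of the kind controlled by --pvalue can be sketched in Python as follows.
              This is a generic illustration rather than mptp's implementation: the function names are
              hypothetical, and the use of a chi-square distribution with one degree of freedom is an
              assumption made for the example.

```python
import math


def lrt_pvalue(null_loglik, alt_loglik):
    """p-value of a likelihood ratio test (assumed chi-square, 1 d.f.)."""
    statistic = max(0.0, 2.0 * (alt_loglik - null_loglik))
    # Survival function of chi-square with 1 d.f.: P(X > s) = erfc(sqrt(s/2))
    return math.erfc(math.sqrt(statistic / 2.0))


def delimitation_is_significant(null_loglik, alt_loglik, pvalue=0.001):
    """True when the delimitation fits significantly better than the null model."""
    return lrt_pvalue(null_loglik, alt_loglik) < pvalue


print(delimitation_is_significant(13.6, 24.2))
```

              Here the null model is a single exponential distribution for all edges, and the alternative is
              the two-distribution fit of the candidate delimitation.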

       MCMC method:

              The MCMC method is triggered with the --mcmc switch combined with either --single (the PTP  model)
              or --multi (the mPTP model).

              --mcmc  positive integer --single
                       Triggers  the  algorithm  for  computing support values by taking the specified number of
                       MCMC samples (delimitations) using the PTP model.

              --mcmc  positive integer --multi
                       Triggers the algorithm for computing support values by taking  the  specified  number  of
                       MCMC samples (delimitations) using the mPTP model.

              --mcmc_sample  positive integer
                       Sample only every n-th MCMC step.

              --mcmc_log  positive integer
                       Log the scores (log-likelihood) for each MCMC sample in a file and create an SVG plot.

              --mcmc_burnin  positive integer
                       Ignore all MCMC samples generated before the specified step. (default: 1)

              --mcmc_runs  positive integer
                       Perform multiple MCMC runs. If more than 1 run is specified, mptp generates one seed
                       for each run from the seed provided with the --seed switch. Output files are
                       generated for each run. (default: 1)

              --mcmc_credible  real
                       Specify the probability (0.0 to 1.0) for which to generate the credible interval, i.e.,
                       the probability that the true number of species falls within the credible interval given
                       the observed data. (default: 0.95)

              --mcmc_startnull
                       Start MCMC sampling from the null-model.

              --mcmc_startrandom
                       Start MCMC sampling from a random delimitation.

              --mcmc_startml
                       Start MCMC sampling from the ML delimitation.

              --seed  positive integer
                       Specifies  the  seed for the pseudo-random number generator. (default: randomly generated
                       based on system time)
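
              The interplay of --mcmc_sample, --mcmc_burnin and --mcmc_credible can be illustrated with the
              Python sketch below. It is not mptp's code: the helper names are hypothetical, and the central
              (equal-tailed) interval is only one possible way of building a credible interval, chosen here
              as an assumption.

```python
def kept_steps(total_steps, sample_every, burnin=1):
    """Steps whose samples are kept: every sample_every-th step,
    ignoring all steps before the burn-in step."""
    return [step for step in range(1, total_steps + 1)
            if step % sample_every == 0 and step >= burnin]


def credible_interval(species_counts, probability=0.95):
    """Central interval of the sampled number of species (assumed method):
    trim the same number of samples from both tails of the sorted list."""
    ordered = sorted(species_counts)
    n = len(ordered)
    tail = int(n * (1.0 - probability) / 2.0)  # samples trimmed per tail
    return ordered[tail], ordered[n - 1 - tail]


print(kept_steps(20, 5, burnin=8))
print(credible_interval(list(range(1, 101)), probability=0.5))
```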

       SVG Output:

              The ML method generates one SVG file that visualizes the processed input tree (i.e., after it has
              been rooted at the outgroup and the outgroup has, if requested, been cropped) and marks the
              subtrees corresponding to coalescent
              processes (the detected species groups) with red color, while the speciation  process  is  colored

              The MCMC method generates one SVG file per run visualizing the processed tree, and indicates the
              support value for each node, i.e., the fraction of MCMC samples (delimitations) in which the
              particular node was part of the speciation process. A value of 1 means it was always in the
              speciation process while a value of 0 means it was always in a coalescent process. The tree
              branches are colored according to the support values of descendant nodes; a support value of 0
              is colored red, 1 black, and values in between are gradients of the two colors. Only
              support values above 0.5 are shown to avoid packed numbers in dense branching events. In addition,
              if --mcmc_log is specified, an additional SVG plot of the log-likelihoods of the sampled
              delimitations is created.
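
              The mapping from support values to branch colors and labels can be sketched in Python as
              follows. This is an illustration only: the function names are hypothetical, and the linear
              interpolation between red and black is an assumption about how the gradient is formed.

```python
def node_support(in_speciation):
    """Fraction of samples in which the node was in the speciation process.
    in_speciation is a list of booleans, one per MCMC sample."""
    return sum(1 for flag in in_speciation if flag) / len(in_speciation)


def support_color(support):
    """Hex color on a linear gradient: red for support 0, black for 1."""
    if not 0.0 <= support <= 1.0:
        raise ValueError("support must lie between 0 and 1")
    red = round(255 * (1.0 - support))
    return "#{:02x}0000".format(red)


def support_label(support, threshold=0.5):
    """Label drawn next to a node; empty unless support exceeds threshold."""
    return "{:.2f}".format(support) if support > threshold else ""


support = node_support([True, True, True, False])
print(support, support_color(support), support_label(support))
```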

              --svg_width  positive integer
                       Sets the total width (including margins) of the SVG in pixels. (default: 1920)

              --svg_fontsize  positive integer
                       Size of font in SVG image. (default: 12)

              --svg_tipspacing  positive integer
                       Vertical space in pixels between taxa in SVG tree. (default: 20)

              --svg_legend_ratio  real
                       Ratio  (value  between  0.0 and 1.0) of total tree length to be displayed as legend line.
                       (default: 0.1)

              --svg_nolegend
                       Hide legend.

              --svg_marginleft  positive integer
                       Left margin in pixels. (default: 20)

              --svg_marginright  positive integer
                       Right margin in pixels. (default: 20)

              --svg_margintop  positive integer
                       Top margin in pixels. (default: 20)

              --svg_marginbottom  positive integer
                       Bottom margin in pixels. (default: 20)

              --svg_inner_radius  positive integer
                       Radius of inner nodes in pixels. (default: 0)


       Compute the maximum likelihood estimate using the mPTP model, discarding all branches with length
       smaller than or equal to 0.0001:

              mptp --ml --multi --min_br 0.0001 --tree_file newick.txt --output_file out

       Run an MCMC analysis of 100 million steps with the mPTP model that logs every millionth step,
       ignores the first 2 million steps and discards all branches with lengths smaller than or equal to
       0.0001. Use 777 as seed. The chain will start from the ML delimitation (default).

              mptp --mcmc 100000000 --multi --min_br 0.0001 --tree_file newick.txt --output_file out --mcmc_log
              1000000 --mcmc_burnin 2000000 --seed 777

       Perform an MCMC analysis of 5 runs, each of 100 million steps with the mPTP model, log every millionth
       step, ignore the first 2 million steps, and detect the minimum branch length from the FASTA file
       alignment.fa that contains the alignment. Use 777 as seed. Start each run from a random
       delimitation.

              mptp --mcmc 100000000 --multi --mcmc_runs 5 --mcmc_log 1000000 --minbr_auto alignment.fa
              --tree_file newick.txt --output_file out --mcmc_burnin 2000000 --seed 777 --mcmc_startrandom
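
       When mptp is driven from a script, a command line like the ones above can be assembled
       programmatically. The Python sketch below only builds and prints the argument list; the helper name
       mcmc_command is hypothetical, and only options documented in this manual are used.

```python
import shlex


def mcmc_command(tree_file, output_prefix, steps, log_every, burnin, seed,
                 runs=1, start="--mcmc_startml"):
    """Build the argument list for an mPTP MCMC analysis (hypothetical helper)."""
    return ["mptp", "--mcmc", str(steps), "--multi", start,
            "--mcmc_runs", str(runs),
            "--mcmc_log", str(log_every),
            "--mcmc_burnin", str(burnin),
            "--seed", str(seed),
            "--tree_file", tree_file,
            "--output_file", output_prefix]


cmd = mcmc_command("newick.txt", "out", 100000000, 1000000, 2000000, 777,
                   runs=5, start="--mcmc_startrandom")
print(" ".join(shlex.quote(arg) for arg in cmd))
# To actually run it (requires mptp to be installed):
#   import subprocess; subprocess.run(cmd, check=True)
```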


       Implementation by Tomas Flouri, Sarah Lutteropp and  Paschalia  Kapli.  Additional  PTP  and  mPTP  model
       authors include Kassian Kobert, Jiajie Zhang, Pavlos Pavlidis, and Alexandros Stamatakis.


       Submit  suggestions and bug-reports at <>, or e-mail Tomas Flouri


       Source code and binaries are available at <>.

       Copyright (C) 2015-2017, Tomas Flouri, Sarah Lutteropp, Paschalia Kapli

       All rights reserved.

       Contact: Tomas Flouri <>, Scientific Computing, Heidelberg Institute for
       Theoretical Studies, 69118 Heidelberg, Germany

       This software is licensed under the terms of the GNU Affero General Public License version 3.

       GNU Affero General Public License version 3

       This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero
       General Public License as published by the Free Software Foundation, either version 3 of the License,  or
       (at your option) any later version.

       This  program  is  distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even
       the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU Affero  General
       Public License for more details.

       You  should  have  received  a copy of the GNU Affero General Public License along with this program.  If
       not, see <>.


       New features and important modifications of mptp (short-lived or minor bug-fix releases may not be
       listed):

              v0.1.0 released June 27th, 2016
                     First public release.

              v0.1.1 released July 15th, 2016
                     Bug fix (now LRT test is not printed in output file when using --multi)

              v0.2.0 released September 27th, 2016
                     Fixed floating point exception error when constructing random trees, caused by dividing
                     by zero. Changed allocation from malloc to calloc, as it caused uninitialized variables
                     when converting unrooted trees to rooted when using the MCMC method. Fixed sample size for
                     the AIC with a correction for finite sample sizes.

              v0.2.1 released October 18th, 2016
                     Updated ASV to consider  only  coalescent  roots  of  ML  delimitation.  Removed  assertion
                     stopping mptp when using random starting delimitations for the MCMC method.

              v0.2.2 released January 31st, 2017
                     Fixed  regular  expressions  to  allow  scientific notation for branch lengths when parsing
                     trees.  Improved the accuracy of ASV  score  by  also  taking  into  account  tips  forming
                     coalescent roots.  Fixed memory leaks that occur when parsing incorrectly formatted trees.