Provided by: phybin_0.3-3_amd64


       phybin - binning/clustering newick trees by topology


       phybin [OPTION...] files or directories...


       PhyBin  takes Newick tree files as input.  Paths of Newick files can be passed directly on
       the command line.  Or, if directories are provided, all files in those directories will be
       read.   Taxa  are  named  based on the files containing them.  If a file contains multiple
       trees, all are read by phybin, and the taxa name then includes  a  suffix  indicating  the
       position in the file:

              e.g. FILENAME_0, FILENAME_1, etc.
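
       The naming rule above can be sketched in a few lines of Python.  This is only an
       illustration, not phybin's own code: the helper name_trees is hypothetical, and it
       assumes that the full file name (extension included) is used as the base name, that a
       single-tree file gets no suffix, and that no ';' appears inside quoted labels.

```python
from pathlib import Path


def name_trees(path_text: str, newick_text: str) -> dict[str, str]:
    """Map FILENAME or FILENAME_i to each Newick tree found in one file.

    Hypothetical helper: the exact base name phybin uses is an assumption.
    """
    base = Path(path_text).name
    # Every Newick tree ends with ';', so split on it and drop empty pieces.
    trees = [piece.strip() + ";" for piece in newick_text.split(";") if piece.strip()]
    if len(trees) == 1:
        return {base: trees[0]}
    # Several trees in one file: suffix each name with its position.
    return {f"{base}_{i}": tree for i, tree in enumerate(trees)}


names = name_trees("genes/abc.tr", "((A,B),C);\n((A,C),B);\n")
print(list(names))
```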

       When  clustering  trees,  PhyBin  computes  a complete all-to-all Robinson-Foulds distance
       matrix.  If a threshold distance (tree edit  distance)  is  given,  then  a  flat  set  of
       clusters  will  be  produced  in  files   Otherwise  it  produces a full

       Binning mode provides an especially quick-and-dirty form of clustering.  When running with
       the  --bin  option,  only  exactly  equal  trees  are  put  in  the  same  cluster.   Tree
       pre-processing still applies, however: for example collapsing short branches.
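
       As a rough illustration of binning, the Python sketch below groups trees whose
       topologies are exactly equal once child order is ignored.  It is not phybin's
       implementation: trees are modelled as nested tuples of leaf names, they are treated as
       rooted, and the names normalize and bin_trees are hypothetical.

```python
from collections import defaultdict


def normalize(tree):
    """Canonical form of an unordered tree (leaves are str, inner nodes tuples)."""
    if isinstance(tree, str):
        return tree
    # Sort children by their repr so that child order no longer matters.
    return tuple(sorted((normalize(child) for child in tree), key=repr))


def bin_trees(named_trees):
    """Group tree names by identical normalized topology, largest bin first."""
    bins = defaultdict(list)
    for name, tree in named_trees.items():
        bins[normalize(tree)].append(name)
    return sorted(bins.values(), key=len, reverse=True)


trees = {
    "t_0": (("A", "B"), "C"),
    "t_1": ("C", ("B", "A")),   # same topology as t_0, children reordered
    "t_2": (("A", "C"), "B"),
}
print(bin_trees(trees))
```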

       * Currently phybin ignores input trees with the wrong number of taxa.

       * If given a directory as input, phybin will assume all contained files are Newick trees.


       -v       --verbose
              print WARNINGS and other information (recommended at first)

       -V       --version
              show version number

       -o DIR   --output=DIR
              set directory to contain all output files (default "./phybin_out/")

              run internal unit tests

   Clustering Options
       --bin  Use simple binning, the cheapest form of 'clustering'

              Use single-linkage clustering (nearest neighbor)

              Use complete-linkage clustering (furthest neighbor)

              Use Unweighted Pair Group Method (average linkage) - DEFAULT mode

              Combine all clusters separated by DIST or less.  Report a flat  list  of  clusters.
              Irrespective   of   whether   this   is   activated,   a   hierarchical  clustering
              (dendogram.pdf) is produced.
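
       The relationship between the three linkage modes and the DIST threshold can be sketched
       as a naive agglomerative loop in Python.  This is a textbook version written for
       illustration, not phybin's code; the function flat_clusters and its parameters are
       hypothetical, and the input is assumed to be a symmetric distance matrix.

```python
from itertools import combinations
from statistics import mean

# How the distance between two clusters is derived from member distances.
LINKAGE = {"single": min, "complete": max, "average": mean}


def flat_clusters(dist, threshold, linkage="average"):
    """Merge the closest clusters until they are more than `threshold` apart."""
    combine = LINKAGE[linkage]
    clusters = [[i] for i in range(len(dist))]

    def gap(a, b):
        return combine([dist[i][j] for i in a for j in b])

    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda pair: gap(*pair))
        if gap(a, b) > threshold:
            break                      # nothing left within the threshold
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(sorted(a + b))
    return sorted(clusters)


dist = [
    [0, 2, 6, 8],
    [2, 0, 6, 8],
    [6, 6, 0, 4],
    [8, 8, 4, 0],
]
print(flat_clusters(dist, 4))
print(flat_clusters(dist, 6, linkage="single"))
```

       With a threshold of 6 the single-linkage run merges everything into one cluster, while
       complete linkage keeps two clusters, because the farthest pair is 8 apart.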

   Select Robinson-Foulds (symmetric difference) distance algorithm:
              (default) use a variant of the HashRF algorithm for the distance matrix

              use a slower, modified RF metric that tolerates missing taxa
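
       For readers unfamiliar with the metric, the Python sketch below computes the plain
       Robinson-Foulds (symmetric difference) distance between two trees over the same taxa.
       It follows the textbook definition, not the HashRF variant or the tolerant metric
       mentioned above; trees are modelled as nested tuples of leaf names, and the function
       names are hypothetical.

```python
def bipartitions(tree):
    """Non-trivial splits induced by the internal branches of a tree."""
    clades = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(leaves(child) for child in node))
        clades.add(below)
        return below

    taxa = leaves(tree)
    anchor = min(taxa)
    splits = set()
    for side in clades:
        # Describe each split by the side that does not contain the anchor
        # taxon, so that the position of the root does not matter.
        side = taxa - side if anchor in side else side
        if 1 < len(side) < len(taxa) - 1:
            splits.add(side)
    return splits


def rf_distance(a, b):
    """Number of splits present in exactly one of the two trees."""
    return len(bipartitions(a) ^ bipartitions(b))


t1 = (("A", "B"), ("C", "D"))
t2 = (("A", "C"), ("B", "D"))
t3 = ("A", ("B", ("C", "D")))   # t1 with a different root
print(rf_distance(t1, t2), rf_distance(t1, t3))
```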

       -g       --graphbins
              use graphviz to produce .dot and .pdf output files

       -d       --drawbins
              like -g, but open GUI windows to show each bin's tree

       -w       --view
              for convenience, "view mode" simply displays input Newick files without binning

              Print (textual) tree topology inside the nodes of the dendrogram

              Highlight nodes in the tree-of-trees (dendrogram) consistent with the given tree
              file.  Multiple highlights are permitted and use different colors.

              Show  the  consensus  trees  for  interior nodes in the dendrogram, rather than just

   Tree pre-processing
              Prune trees to only TAXA before doing anything else.   Space  and  comma  separated
              lists of taxa are allowed.  Use quotes.

       -b LEN   --minbranchlen=LEN
              collapse branches less than LEN

              collapse branches with bootstrap values less than INT
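
       Collapsing a short branch means contracting it so that its children attach directly to
       its parent.  The Python sketch below shows one way this could work for a length
       threshold; it is an assumption about the general technique, not phybin's code.  The
       Node class and collapse_short are hypothetical, leaf branches are never collapsed, and
       the lengths of the re-attached children are left unchanged.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str = ""
    length: float = 0.0
    children: list["Node"] = field(default_factory=list)


def collapse_short(node: Node, min_len: float) -> Node:
    """Return a copy in which internal branches shorter than min_len are contracted."""
    kept = []
    for child in node.children:
        child = collapse_short(child, min_len)      # work bottom-up
        if child.children and child.length < min_len:
            kept.extend(child.children)             # splice grandchildren in
        else:
            kept.append(child)
    return Node(node.label, node.length, kept)


root = Node(children=[
    Node(length=0.001, children=[Node("A", 0.2), Node("B", 0.3)]),
    Node("C", 0.4),
])
flat = collapse_short(root, 0.01)
print([child.label for child in flat.children])
```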

   Extracting taxa names
       -p NUM   --nameprefix=NUM
              Leaf  names  in  the  input  Newick  trees can be gene names, not taxa.  Then it is
              typical to extract taxa names from genes.  This option extracts  a  prefix  of  NUM
              characters to serve as the taxa name.

       -s STR   --namesep=STR
              An  alternative  to  --nameprefix,  STR provides a set of delimiter characters, for
              example '-' or '0123456789'.  The taxa name is then  a  variable-length  prefix  of
              each gene name up to but not including any character in STR.

       -m FILE  --namemap=FILE
              Even  once  prefixes  are  extracted  it  may be necessary to use a lookup table to
              compute taxa names, e.g. if multiple genes/plasmids map onto one taxon.  This option
              specifies  a text file with find/replace entries of the form "<string> <taxaname>",
              which are applied AFTER -s and -p.
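
       The three name-extraction steps can be pictured with the Python sketch below.  It is
       illustrative only: taxon_name and parse_name_map are hypothetical helpers, the gene
       names are made up, and the lookup table is assumed to match whole extracted names
       rather than substrings.

```python
def parse_name_map(text):
    """Read "<string> <taxaname>" lines into a lookup table."""
    table = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            table[parts[0]] = parts[1]
    return table


def taxon_name(gene, prefix_len=None, separators=None, name_map=None):
    """Derive a taxon name from a gene name, mimicking -p, -s and -m."""
    name = gene
    if separators is not None:
        # Like -s: keep everything before the first delimiter character.
        cut = next((i for i, ch in enumerate(name) if ch in separators), len(name))
        name = name[:cut]
    if prefix_len is not None:
        # Like -p: keep a fixed-length prefix.
        name = name[:prefix_len]
    if name_map:
        # Like -m: the lookup is applied after the prefix has been extracted.
        name = name_map.get(name, name)
    return name


table = parse_name_map("pBS1 Bsub\n")
print(taxon_name("ECOLI-gene12", separators="-"))
print(taxon_name("Bsub1234", separators="0123456789"))
print(taxon_name("pBS1-orf3", separators="-", name_map=table))
```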

   Utility Modes
              print a Robinson-Foulds distance matrix for the input trees

              for convenience, print the set difference between cluster*.txt files

              simply print out a concise form of each input tree

              simply print out a concise and NORMALIZED form of each input tree

              print a strict consensus tree for the inputs, then exit

              print a list of tree names that match any --highlight argument


       This manpage was written by Andreas Tille for the Debian distribution and can be used  for
       any other usage of the program.