Provided by: phybin_0.3-3build3_amd64

NAME

       phybin - binning/clustering newick trees by topology

SYNOPSIS

       phybin [OPTION...] files or directories...

DESCRIPTION

       PhyBin takes Newick tree files as input.  Paths of Newick files can be passed directly on the command
       line; if directories are provided, all files in those directories are read.  Taxa are named based on
       the files containing them.  If a file contains multiple trees, phybin reads all of them, and the name
       then includes a suffix indicating the position in the file:

              e.g. FILENAME_0, FILENAME_1, etc.
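
       The naming scheme above can be illustrated with a minimal Python sketch.  It is not PhyBin's own code:
       the helper name_trees is hypothetical, it assumes that trees in a file are separated by the ';' that
       terminates each Newick tree, and it assumes that a file holding a single tree gets no suffix.  Quoted
       labels containing ';' are not handled.

```python
def name_trees(filename, text):
    """Return (name, newick) pairs for every tree found in `text`."""
    # Each Newick tree ends with ';', so split on it and drop empty chunks.
    trees = [chunk.strip() + ";" for chunk in text.split(";") if chunk.strip()]
    if len(trees) == 1:
        # Assumption: a single tree is named after the file, without a suffix.
        return [(filename, trees[0])]
    # Several trees: append the zero-based position, as in FILENAME_0, FILENAME_1.
    return [(f"{filename}_{i}", tree) for i, tree in enumerate(trees)]
```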

       When clustering trees, PhyBin computes a complete all-to-all Robinson-Foulds distance matrix.  If a
       threshold distance (tree edit distance) is given, a flat set of clusters is written to files named
       clusterXX_YY.tr.  Otherwise it produces a full dendrogram.
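
       For readers unfamiliar with the Robinson-Foulds (symmetric difference) distance, the following Python
       sketch shows the idea on a small scale.  It is an illustration only, not the HashRF variant that PhyBin
       uses: trees are assumed to be given as nested tuples of leaf names over the same set of taxa, and the
       function names (clades, bipartitions, rf_distance, rf_matrix) are hypothetical.

```python
def clades(tree):
    """Return (leaf set of `tree`, list of leaf sets of every subtree)."""
    if isinstance(tree, str):
        leaves = frozenset([tree])
        return leaves, [leaves]
    leaves, found = frozenset(), []
    for child in tree:
        child_leaves, child_found = clades(child)
        leaves |= child_leaves
        found += child_found
    found.append(leaves)
    return leaves, found


def bipartitions(tree):
    """Return the non-trivial leaf bipartitions induced by the tree's edges."""
    all_leaves, found = clades(tree)
    splits = set()
    for clade in found:
        other = all_leaves - clade
        # A split with fewer than two leaves on one side is shared by all trees.
        if len(clade) < 2 or len(other) < 2:
            continue
        # Store the split as an unordered pair so that rooting does not matter.
        splits.add(frozenset([clade, other]))
    return splits


def rf_distance(t1, t2):
    """Count the bipartitions present in exactly one of the two trees."""
    return len(bipartitions(t1) ^ bipartitions(t2))


def rf_matrix(trees):
    """All-to-all distance matrix, computed naively pair by pair."""
    return [[rf_distance(a, b) for b in trees] for a in trees]
```

       With these definitions, rf_distance((("A", "B"), ("C", "D")), (("A", "C"), ("B", "D"))) evaluates to
       2, because each tree has one bipartition that the other lacks.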

       Binning mode provides an especially quick-and-dirty form of clustering.  When running with the --bin
       option, only exactly equal trees are put in the same cluster.  Tree pre-processing, such as collapsing
       short branches, still applies.
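
       Binning can be pictured as grouping trees under a canonical form, as in the Python sketch below.  This
       is a simplified illustration under stated assumptions: trees are nested tuples of leaf names, they are
       compared as rooted trees after sorting the children of every node, and the helpers canonical and
       bin_trees are hypothetical.  PhyBin applies its own normalization, which is not reproduced here.

```python
from collections import defaultdict


def canonical(tree):
    """Sort children recursively so that sibling order does not matter."""
    if isinstance(tree, str):
        return tree
    return tuple(sorted((canonical(child) for child in tree), key=repr))


def bin_trees(named_trees):
    """Group tree names whose canonical forms are exactly equal."""
    bins = defaultdict(list)
    for name, tree in named_trees:
        bins[canonical(tree)].append(name)
    # Largest bins first; ties keep their first-seen order.
    return sorted(bins.values(), key=len, reverse=True)
```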

   USAGE NOTES:
       * Currently phybin ignores input trees with the wrong number of taxa.

       * If given a directory as input, phybin assumes that all files it contains are Newick trees.

OPTIONS

       -v       --verbose
              print WARNINGS and other information (recommended at first)

       -V       --version
              show version number

       -o DIR   --output=DIR
              set directory to contain all output files (default "./phybin_out/")

       --selftest
              run internal unit tests

   Clustering Options
       --bin  Use simple binning, the cheapest form of 'clustering'

       --single
              Use single-linkage clustering (nearest neighbor)

       --complete
              Use complete-linkage clustering (furthest neighbor)

       --UPGMA
              Use Unweighted Pair Group Method (average linkage) - DEFAULT mode

       --editdist=DIST
              Combine all clusters separated by DIST or less.  Report a flat list of clusters.  Irrespective  of
              whether this is activated, a hierarchical clustering (dendogram.pdf) is produced.
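
       The effect of a distance threshold can be sketched in Python as follows.  The sketch assumes
       single-linkage semantics (two trees end up in the same cluster whenever a chain of pairwise distances
       of at most DIST connects them); the result under --complete or --UPGMA can differ.  The function
       flat_clusters is hypothetical and operates on a precomputed square distance matrix.

```python
def flat_clusters(dist, threshold):
    """Group item indices connected by pairwise distances <= threshold."""
    parent = list(range(len(dist)))

    def find(i):
        # Follow parent links to the representative, shortening the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(dist)):
        for j in range(i + 1, len(dist)):
            if dist[i][j] <= threshold:
                parent[find(i)] = find(j)  # merge the two clusters

    clusters = {}
    for i in range(len(dist)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())
```

       For the matrix [[0, 2, 6], [2, 0, 4], [6, 4, 0]], a threshold of 2 yields the clusters [[0, 1], [2]],
       while a threshold of 4 merges all three items.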

   Select Robinson-Foulds (symmetric difference) distance algorithm:
       --hashrf
              (default) use a variant of the HashRF algorithm for the distance matrix

       --tolerant
              use a slower, modified RF metric that tolerates missing taxa

   Visualization
       -g       --graphbins
              use graphviz to produce .dot and .pdf output files

       -d       --drawbins
              like -g, but open GUI windows to show each bin's tree

       -w       --view
              for convenience, "view mode" simply displays input Newick files without binning

       --showtrees
              Print (textual) tree topology inside the nodes of the dendrogram

       --highlight=FILE
              Highlight nodes in the tree-of-trees (dendrogram) consistent with the given tree file.  Multiple
              highlights are permitted and use different colors.

       --interior
              Show the consensus trees for interior nodes in the dendrogram, rather than just points.

   Tree pre-processing
       --prune=TAXA
              Prune trees to only TAXA before doing anything else.  Space- and comma-separated lists of taxa
              are allowed.  Use quotes.

       -b LEN   --minbranchlen=LEN
              collapse branches shorter than LEN

       --minbootstrap=INT
              collapse branches with bootstrap values less than INT
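
       Collapsing a short branch means removing an internal edge and attaching its children directly to the
       parent node.  The Python sketch below illustrates this for --minbranchlen under several assumptions
       that are not taken from PhyBin: a node is a pair (content, branch_length), where content is either a
       leaf name or a list of child nodes; only internal branches are collapsed; and the lengths of the
       re-attached children are left unchanged.  The function collapse_short is hypothetical.

```python
def collapse_short(node, min_len):
    """Return a copy of `node` with short internal branches collapsed."""
    content, length = node
    if isinstance(content, str):
        return node  # leaves are kept as they are
    new_children = []
    for child in content:
        c_content, c_len = collapse_short(child, min_len)
        if not isinstance(c_content, str) and c_len < min_len:
            # Internal branch is too short: splice its children into this node.
            new_children.extend(c_content)
        else:
            new_children.append((c_content, c_len))
    return (new_children, length)
```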

   Extracting taxa names
       -p NUM   --nameprefix=NUM
              Leaf names in the input Newick trees may be gene names rather than taxa names, in which case
              taxa names are typically extracted from the gene names.  This option extracts a prefix of NUM
              characters to serve as the taxa name.

       -s STR   --namesep=STR
              An alternative to --nameprefix, STR provides a set of delimiter characters, for example '-' or
              '0123456789'.  The taxa name is then a variable-length prefix of each gene name up to but not
              including any character in STR.

       -m FILE  --namemap=FILE
              Even once prefixes are extracted it may be necessary to use a lookup table to compute taxa names,
              e.g. if multiple genes/plasmids map onto one taxon.  This option specifies a text file with
              find/replace entries of the form "<string> <taxaname>", which are applied AFTER -s and -p.
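
       The three name-extraction options can be summarized by the Python sketch below.  It is an illustration
       rather than PhyBin's implementation: the function taxon_name and its parameters are hypothetical, the
       order in which -s and -p are applied when both are given is an assumption, and name-map entries are
       assumed to match the whole extracted prefix.

```python
def taxon_name(leaf, prefix_len=None, separators=None, name_map=None):
    """Derive a taxon name from a leaf (gene) name."""
    name = leaf
    if separators is not None:
        # Like --namesep: keep everything before the first delimiter character.
        for i, ch in enumerate(name):
            if ch in separators:
                name = name[:i]
                break
    if prefix_len is not None:
        # Like --nameprefix: keep a fixed-length prefix.
        name = name[:prefix_len]
    if name_map:
        # Like --namemap: the lookup runs after the prefix has been extracted.
        name = name_map.get(name, name)
    return name
```

       For example, taxon_name("ECOLI123", separators="0123456789") returns "ECOLI", and
       taxon_name("ABCDEF", prefix_len=3) returns "ABC".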

   Utility Modes
       --rfdist
              print a Robinson-Foulds distance matrix for the input trees

       --setdiff
              for convenience, print the set difference between cluster*.txt files

       --print
              simply print out a concise form of each input tree

       --printnorms
              simply print out a concise and NORMALIZED form of each input tree

       --consensus
              print a strict consensus tree for the inputs, then exit

       --matching
              print a list of tree names that match any --highlight argument

AUTHOR

       This manpage was written by Andreas Tille for the Debian distribution and may be used for any other
       purpose related to the program.