Provided by: mlpack-bin_2.0.1-1_amd64

NAME

       mlpack_hoeffding_tree - Hoeffding trees

SYNOPSIS

       mlpack_hoeffding_tree [-h] [-v] [-b] [-B int] [-c double] [-i] [-m string] [-l string] [-n int] [-I int] [-N string] [-o int] [-M string] [-s int] [-p string] [-P string] [-T string] [-L string] [-t string] [-V]

DESCRIPTION

       This program implements Hoeffding trees, a form of streaming decision tree best suited to
       large (or streaming) datasets. This program supports both categorical and numeric data
       stored in the ARFF format. Given an input dataset, this program can train the tree with
       numerous training options and save the model to a file. The program can also use a newly
       trained model, or a model loaded from a file, to predict classes for a given test set.

       The training file and associated labels are specified with the --training_file (-t) and
       --labels_file (-l) options, respectively. The training file must be in ARFF format.
       Training may be performed in batch mode (like a typical decision tree algorithm) by
       specifying the --batch_mode (-b) option. Batch mode generally produces better trees, but
       it costs more memory and runtime, so it may not be the best choice for large datasets.
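
       As a minimal sketch of the two training modes described above, the following shell
       snippet builds either a streaming or a batch training command. The file names are
       hypothetical placeholders, and run_or_show is an illustrative helper (not part of
       mlpack) that prints the command instead of running it when the program is not installed.

```shell
# Hypothetical input files; substitute your own ARFF data and labels.
train_file="train.arff"
labels_file="train_labels.txt"

# Set to 1 to train in batch mode instead of the default streaming mode.
use_batch=0

# Illustrative helper: run the command if the program is installed,
# otherwise print the command line that would be executed.
run_or_show() {
  if command -v "$1" >/dev/null 2>&1; then
    "$@"
  else
    printf 'would run: %s\n' "$*"
  fi
}

# Build the argument list one piece at a time.
set -- mlpack_hoeffding_tree \
  --training_file "$train_file" \
  --labels_file "$labels_file" \
  --verbose
if [ "$use_batch" -eq 1 ]; then
  set -- "$@" --batch_mode
fi

run_or_show "$@"
```

       With use_batch=1 the same command gains the --batch_mode flag; everything else is
       unchanged.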

       When  a  model  is  trained,  it  may be saved to a file with the --output_model_file (-M)
       option. A model may be  loaded  from  file  for  further  training  or  testing  with  the
       --input_model_file (-m) option.
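
       The save-and-reload workflow above might look like the following sketch. The data file
       names and the model file name (including its extension) are assumptions chosen for
       illustration, and run_or_show is a hypothetical helper that only prints the command when
       the program is not installed.

```shell
# Hypothetical model file; the name and extension are assumptions.
model_file="hoeffding_tree.xml"

# Illustrative helper: run the command if the program is installed,
# otherwise print the command line that would be executed.
run_or_show() {
  if command -v "$1" >/dev/null 2>&1; then
    "$@"
  else
    printf 'would run: %s\n' "$*"
  fi
}

# Step 1: train on a first chunk of data and save the resulting tree.
run_or_show mlpack_hoeffding_tree \
  --training_file "chunk1.arff" \
  --labels_file "chunk1_labels.txt" \
  --output_model_file "$model_file"

# Step 2: load the saved tree, continue training on a second chunk,
# and write the updated tree back to the same file.
run_or_show mlpack_hoeffding_tree \
  --input_model_file "$model_file" \
  --training_file "chunk2.arff" \
  --labels_file "chunk2_labels.txt" \
  --output_model_file "$model_file"
```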

       A test file may be specified with the --test_file (-T) option, and if performance numbers
       are desired for that test set, labels may be specified with the --test_labels_file (-L)
       option. Predictions for each test point will be stored in the file specified by the
       --predictions_file (-p) option, and probabilities for each prediction will be stored in
       the file specified by the --probabilities_file (-P) option.
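
       A prediction run with a previously saved model could be sketched as follows. All file
       names are hypothetical, and run_or_show is an illustrative helper (not part of mlpack)
       that prints the command when the program is not installed.

```shell
# Hypothetical file names; substitute your own.
model_file="hoeffding_tree.xml"
test_file="test.arff"
test_labels="test_labels.txt"
predictions="predictions.csv"
probabilities="probabilities.csv"

# Illustrative helper: run the command if the program is installed,
# otherwise print the command line that would be executed.
run_or_show() {
  if command -v "$1" >/dev/null 2>&1; then
    "$@"
  else
    printf 'would run: %s\n' "$*"
  fi
}

# Load the model, classify the test set, and write both the predicted
# labels and the per-prediction probabilities. Supplying the test labels
# requests performance numbers for the test set.
run_or_show mlpack_hoeffding_tree \
  --input_model_file "$model_file" \
  --test_file "$test_file" \
  --test_labels_file "$test_labels" \
  --predictions_file "$predictions" \
  --probabilities_file "$probabilities"
```

       Omitting --test_labels_file still produces predictions; only the performance numbers
       depend on it.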

OPTIONS

       --batch_mode (-b)
              If true, samples will be considered in batch instead of as a stream. This generally
              results in better trees but at the cost of memory usage and runtime.

       --bins (-B) [int]
              If the 'domingos' split strategy is used, this specifies the  number  of  bins  for
              each numeric split. Default value 10.

       --confidence (-c) [double]
              Confidence before splitting (between 0 and 1).  Default value 0.95.

       --help (-h)
              Default help info.

       --info [string]
              Get help on a specific module or option.  Default value ''.

       --info_gain (-i)
              If set, information gain is used instead of Gini impurity for calculating Hoeffding
              bounds.

       --input_model_file (-m) [string]
              File to load trained tree from. Default value ''.

       --labels_file (-l) [string]
              Labels for training dataset. Default value ''.

       --max_samples (-n) [int]
              Maximum number of samples before splitting.  Default value 5000.

       --min_samples (-I) [int]
              Minimum number of samples before splitting.  Default value 100.

       --numeric_split_strategy (-N) [string]
              The splitting strategy to use for numeric features: 'domingos' or 'binary'. Default
              value 'binary'.

       --observations_before_binning (-o) [int]
              If the 'domingos' split strategy is used, this specifies the number of samples
              observed before binning is performed. Default value 100.

       --output_model_file (-M) [string]
              File to save trained tree to. Default value ''.

       --passes (-s) [int]
              Number of passes to take over the dataset.  Default value 1.

       --predictions_file (-p) [string]
              File to output label predictions for test data into. Default value ''.

       --probabilities_file (-P) [string]
              In addition to predicting labels, provide prediction probabilities in this file.
              Default value ''.

       --test_file (-T) [string]
              File of testing data. Default value ''.

       --test_labels_file (-L) [string]
              Labels of test data. Default value ''.

       --training_file (-t) [string]
              Training dataset file. Default value ''.

       --verbose (-v)
              Display informational messages and the full list of parameters and  timers  at  the
              end of execution.

       --version (-V)
              Display the version of mlpack.
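
       To show how several of the options above combine, here is a sketch of a tuned training
       command. The option values are arbitrary illustrations, not recommendations; the file
       names are hypothetical; and run_or_show is an illustrative helper that prints the
       command when the program is not installed.

```shell
# Hypothetical input and output files.
train_file="train.arff"
labels_file="train_labels.txt"
model_file="hoeffding_tree.xml"

# Illustrative helper: run the command if the program is installed,
# otherwise print the command line that would be executed.
run_or_show() {
  if command -v "$1" >/dev/null 2>&1; then
    "$@"
  else
    printf 'would run: %s\n' "$*"
  fi
}

# Use the 'domingos' numeric split with 20 bins, observe 200 samples
# before binning, require 0.99 confidence before splitting, use
# information gain instead of Gini impurity, and make two passes over
# the dataset. All values here are arbitrary examples.
run_or_show mlpack_hoeffding_tree \
  --training_file "$train_file" \
  --labels_file "$labels_file" \
  --numeric_split_strategy domingos \
  --bins 20 \
  --observations_before_binning 200 \
  --confidence 0.99 \
  --min_samples 200 \
  --max_samples 10000 \
  --info_gain \
  --passes 2 \
  --output_model_file "$model_file"
```

       The --bins and --observations_before_binning options apply only when the 'domingos'
       strategy is selected, so they are omitted when using the default 'binary' strategy.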

ADDITIONAL INFORMATION

       For further information, including relevant papers, citations, and theory, consult the
       documentation found at http://www.mlpack.org or included with your distribution of mlpack.

                                                                         mlpack_hoeffding_tree(1)