Provided by: mlpack-bin_2.2.5-1build1_amd64

NAME

       mlpack_hoeffding_tree - hoeffding trees

SYNOPSIS

        mlpack_hoeffding_tree [-h] [-v]

DESCRIPTION

       This  program  implements  Hoeffding  trees,  a form of streaming decision tree suited best for large (or
       streaming) datasets. This program supports both categorical and numeric data stored in the  ARFF  format.
       Given  an  input dataset, this program is able to train the tree with numerous training options, and save
       the model to a file. The program is also able to use a trained model or a model from  file  in  order  to
       predict classes for a given test set.

       The training file and associated labels are specified with the --training_file and --labels_file options,
       respectively. The training file must be in ARFF format. The training may be performed in batch mode (like
       a  typical  decision  tree algorithm) by specifying the --batch_mode option, but this may not be the best
       option for large datasets.

       When a model is trained, it may be saved to a file with the --output_model_file (-M) option. A model  may
       be loaded from file for further training or testing with the --input_model_file (-m) option.
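       As an illustration of the training options described above, the following is a minimal sketch of a
       streaming training run that saves its model. The file names (train.arff, train_labels.csv, tree.xml)
       are hypothetical, and the model and label file formats are assumptions; only the option flags come
       from this page.

```shell
# Hypothetical file names; substitute your own data and model paths.
train=train.arff
labels=train_labels.csv
model=tree.xml

# Streaming training (no --batch_mode): information gain (-i), two passes
# over the data (-s 2), confidence 0.99 (-c), model saved with -M, verbose (-v).
cmd="mlpack_hoeffding_tree -t $train -l $labels -i -s 2 -c 0.99 -M $model -v"

# Show the command; run it only if the program is installed and the data exists.
echo "$cmd"
if command -v mlpack_hoeffding_tree >/dev/null 2>&1 && [ -f "$train" ]; then
    $cmd
fi
```

       Adding -b to the command would switch the same run to batch mode.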

       A test file may be specified with the --test_file (-T) option, and if performance numbers are desired for
       that test set, labels may be specified with the --test_labels_file (-L) option. Predictions for each test
       point will be stored in the file specified by --predictions_file (-p), and probabilities for each
       prediction will be stored in the file specified by the --probabilities_file (-P) option.
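       The testing workflow can be sketched in the same way. This example assumes a model was previously
       saved to a file; all file names are hypothetical, and the output file formats are an assumption.

```shell
# Hypothetical file names; substitute your own.
model=tree.xml
test_data=test.arff
test_labels=test_labels.csv

# Load a saved model (-m), classify the test set (-T), report performance
# against known labels (-L), and write predictions (-p) and probabilities (-P).
cmd="mlpack_hoeffding_tree -m $model -T $test_data -L $test_labels -p predictions.csv -P probabilities.csv"

# Show the command; run it only if the program is installed and the model exists.
echo "$cmd"
if command -v mlpack_hoeffding_tree >/dev/null 2>&1 && [ -f "$model" ]; then
    $cmd
fi
```

       The -L option can be dropped when no test labels are available; predictions are still written.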

OPTIONAL INPUT OPTIONS

       --batch_mode (-b)
              If true, samples will be considered in batch instead of as a stream.  This  generally  results  in
              better trees but at the cost of memory usage and runtime.

       --bins (-B) [int]
              If  the  'domingos'  split  strategy  is  used, this specifies the number of bins for each numeric
              split. Default value 10.

       --confidence (-c) [double]
              Confidence before splitting (between 0 and 1).  Default value 0.95.

       --help (-h)
              Default help info.

       --info [string]
              Get help on a specific module or option.  Default value ''.

       --info_gain (-i)
              If set, information gain is used instead  of  Gini  impurity  for  calculating  Hoeffding  bounds.

       --input_model_file (-m) [string]
              File to load trained tree from. Default value ''.

       --labels_file (-l) [string]
              Labels for training dataset. Default value ''.

       --max_samples (-n) [int]
              Maximum number of samples before splitting.  Default value 5000.

       --min_samples (-I) [int]
              Minimum  number  of  samples  before splitting.  Default value 100.

       --numeric_split_strategy (-N) [string]
              The splitting strategy to use for numeric features: 'domingos' or 'binary'. Default value
              'binary'.

       --observations_before_binning (-o) [int]
              If the 'domingos' split strategy is used, this specifies the number of samples observed before
              binning is performed. Default value 100.

       --passes (-s) [int]
              Number of passes to take over the dataset.  Default value 1.

       --test_file (-T) [string]
              File of testing data. Default value ''.

       --test_labels_file (-L) [string]
              Labels of test data. Default value ''.

       --training_file (-t) [string]
              Training dataset file. Default value ''.

       --verbose (-v)
              Display informational messages and the full list of parameters and timers at the end of execution.

       --version (-V)
              Display the version of mlpack.
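       To give a feel for how --confidence interacts with the number of samples seen, the sketch below
       evaluates the textbook Hoeffding bound, epsilon = sqrt(R^2 * ln(1 / (1 - confidence)) / (2 * n)).
       The helper name hoeffding_eps and the range value R are illustrative assumptions; this is not
       necessarily the exact computation performed by the program.

```shell
# hoeffding_eps N CONFIDENCE RANGE
# Prints the textbook Hoeffding bound for N samples at the given confidence.
# RANGE is the range of the split criterion (assumed to be 1 here; for
# information gain it is commonly taken as log2 of the number of classes).
hoeffding_eps() {
    awk -v n="$1" -v conf="$2" -v r="$3" \
        'BEGIN { printf "%.4f\n", sqrt(r * r * log(1 / (1 - conf)) / (2 * n)) }'
}

# Bound at the default --min_samples (100) and --max_samples (5000),
# using the default --confidence of 0.95.
hoeffding_eps 100 0.95 1
hoeffding_eps 5000 0.95 1
```

       The bound shrinks as more samples are observed, so a higher --confidence generally means more samples
       are needed before the best split can be separated from the runner-up.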

OPTIONAL OUTPUT OPTIONS

       --output_model_file (-M) [string]
              File to save trained tree to. Default value ''.

       --predictions_file (-p) [string]
              File to output label predictions for test data into. Default value ''.

       --probabilities_file (-P) [string]
              In addition to predicting labels, provide prediction probabilities in this file. Default value
              ''.

ADDITIONAL INFORMATION

       For further information, including relevant papers, citations, and theory, consult the documentation
       found at http://www.mlpack.org or included with your distribution of mlpack.

                                                                         mlpack_hoeffding_tree(16 November 2017)