Provided by: mlpack-bin_3.4.2-7ubuntu1_amd64

NAME

       mlpack_decision_stump - decision stump

SYNOPSIS

        mlpack_decision_stump [-b int] [-m unknown] [-l string] [-T string] [-t string] [-V bool] [-M unknown] [-p string] [-h -v]

DESCRIPTION

       This  program  implements  a  decision  stump,  which is a single-level decision tree. The
       decision stump will split on one dimension of the input data, and will split into multiple
       buckets.  The  dimension  and  bins are selected by maximizing the information gain of the
       split. Optionally, the minimum number of training points in each bin can be specified with
       the '--bucket_size (-b)' parameter.

       The  decision  stump is parameterized by a splitting dimension and a vector of values that
       denote the splitting values of each bin.

       This program enables several applications: a decision stump may be trained or loaded, and
       then that decision stump may be used to classify a given set of test points. The decision
       stump may also be saved to a file for later usage.

       To train a decision stump, training data should be passed with the '--training_file  (-t)'
       parameter,  and  their corresponding labels should be passed with the '--labels_file (-l)'
       option. Optionally, if '--labels_file (-l)' is not specified, the labels are assumed to be
       the  last  dimension  of the training dataset. The '--bucket_size (-b)' parameter controls
       the minimum number of training points in each decision stump bucket.
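
       As an illustration, a training run might look like the sketch below. The file  names
       and the toy data are hypothetical, and the CSV layout (one point per line, one label
       per line) is an assumption about the input format rather than something stated here.
       The mlpack call is skipped when the program is not installed.

```shell
# Hypothetical working files; replace with your own data.
workdir=$(mktemp -d)
train="$workdir/train.csv"
labels="$workdir/labels.csv"
model="$workdir/stump.bin"

# Toy training set (invented for illustration): six points, two features each.
printf '%s\n' '1.0,5.1' '1.2,4.9' '0.9,5.3' '3.1,5.0' '3.0,4.8' '3.3,5.2' > "$train"

# One class label per training point, in the same order as the points.
printf '%s\n' 0 0 0 1 1 1 > "$labels"

# Train with at least 2 points per bucket and save the resulting stump.
if command -v mlpack_decision_stump >/dev/null 2>&1; then
  mlpack_decision_stump \
    --training_file "$train" \
    --labels_file "$labels" \
    --bucket_size 2 \
    --output_model_file "$model" \
    --verbose
else
  echo "mlpack_decision_stump not found; skipping training" >&2
fi
```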

       For classifying a test set, a decision stump may be loaded  with  the  '--input_model_file
       (-m)'  parameter  (useful for the situation where a stump has already been trained), and a
       test set may be specified with the '--test_file (-T)' parameter. The predicted labels  can
       be saved with the '--predictions_file (-p)' output parameter.
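
       A classification run with a previously saved stump could be sketched as follows. The
       model path, the MODEL environment variable, and the test points are hypothetical; the
       sketch assumes the model was written by an earlier training run and that the test set
       has the same feature columns as the training data.

```shell
# Hypothetical path to a stump saved earlier with --output_model_file.
model="${MODEL:-stump.bin}"

workdir=$(mktemp -d)
test_set="$workdir/test.csv"
predictions="$workdir/predictions.csv"

# Toy test set (invented for illustration): two points, two features each.
printf '%s\n' '1.1,5.0' '3.2,4.9' > "$test_set"

# Load the saved stump, classify the test points, and print the predictions.
if command -v mlpack_decision_stump >/dev/null 2>&1 && [ -f "$model" ]; then
  mlpack_decision_stump \
    --input_model_file "$model" \
    --test_file "$test_set" \
    --predictions_file "$predictions"
  cat "$predictions"
else
  echo "mlpack_decision_stump or model file not available; skipping" >&2
fi
```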

       Because  decision  stumps are trained in batch, retraining does not make sense and thus it
       is not possible  to  pass  both  '--training_file  (-t)'  and  '--input_model_file  (-m)';
       instead, simply build a new decision stump with the training data.

       After  training,  a decision stump can be saved with the '--output_model_file (-M)' output
       parameter. That stump may later be  re-used  in  subsequent  calls  to  this  program  (or
       others).
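
       The steps described above can also be combined in a single call: train, classify,  and
       save the model. In this sketch the labels are stored as the last value of each training
       point, so no labels file is passed. The file names and data are hypothetical, and  the
       CSV layout (one point per line) is an assumption.

```shell
workdir=$(mktemp -d)
train="$workdir/train_with_labels.csv"
test_set="$workdir/test.csv"
predictions="$workdir/predictions.csv"
model="$workdir/stump.bin"

# Toy training set (invented): two features followed by the class label.
printf '%s\n' '1.0,5.1,0' '1.2,4.9,0' '0.9,5.3,0' \
              '3.1,5.0,1' '3.0,4.8,1' '3.3,5.2,1' > "$train"

# Toy test set: features only, no label.
printf '%s\n' '1.1,5.0' '3.2,4.9' > "$test_set"

# Short option names as listed in this page: -t, -T, -b, -p, -M.
if command -v mlpack_decision_stump >/dev/null 2>&1; then
  mlpack_decision_stump -t "$train" -T "$test_set" -b 2 -p "$predictions" -M "$model"
else
  echo "mlpack_decision_stump not found; skipping" >&2
fi
```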

OPTIONAL INPUT OPTIONS

       --bucket_size (-b) [int]
              The  minimum number of training points in each decision stump bucket. Default value
              6.

       --help (-h) [bool]
              Default help info.

       --info [string]
              Print help on a specific option. Default value ''.

       --input_model_file (-m) [unknown]
              Decision stump model to load.

       --labels_file (-l) [string]
              Labels for the training set. If not specified, the labels are  assumed  to  be  the
              last dimension of the training data.

       --test_file (-T) [string]
              A dataset to calculate predictions for.

       --training_file (-t) [string]
              The dataset to train on.

       --verbose (-v) [bool]
              Display  informational  messages  and the full list of parameters and timers at the
              end of execution.

       --version (-V) [bool]
              Display the version of mlpack.

OPTIONAL OUTPUT OPTIONS

       --output_model_file (-M) [unknown]
              Output decision stump model to save.

       --predictions_file (-p) [string]
              The output matrix that will hold the predicted labels for the test set.

ADDITIONAL INFORMATION

       For further information, including relevant papers, citations,  and  theory,  consult  the
       documentation found at http://www.mlpack.org or included with your distribution of mlpack.