Provided by: mlpack-bin_3.0.4-1_amd64

NAME

       mlpack_random_forest - random forests

SYNOPSIS

        mlpack_random_forest [-m unknown] [-l string] [-n int] [-N int] [-a bool] [-T string] [-L string] [-t string] [-V bool] [-M unknown] [-p string] [-P string] [-h -v]

DESCRIPTION

       This  program  is an implementation of the standard random forest classification algorithm
       by Leo Breiman. A random forest can be trained and saved for later use, or a random forest
       may be loaded and predictions or class probabilities for points may be generated.

       The  training  set and associated labels are specified with the '--training_file (-t)' and
       '--labels_file (-l)' parameters, respectively. The labels  should  be  in  the  range  [0,
       num_classes  -  1].  Optionally,  if '--labels_file (-l)' is not specified, the labels are
       assumed to be the last dimension of the training dataset.
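
       As a rough illustration of the label convention above, the following Python sketch splits
       the last column of a CSV-style dataset off as labels and checks that they are non-negative
       integers. It assumes that each row holds one point, so that the "last dimension" is the last
       column, and it infers num_classes as the largest label plus one. The helper names
       split_labels and check_labels and the sample data are hypothetical and not part of mlpack.

```python
import csv
import io


def split_labels(rows):
    """Split each row into (features, label), taking the label from the last column."""
    features, labels = [], []
    for row in rows:
        if not row:
            continue  # skip blank lines
        features.append([float(v) for v in row[:-1]])
        labels.append(int(float(row[-1])))
    return features, labels


def check_labels(labels):
    """Return the inferred num_classes, or raise ValueError for invalid labels.

    Assumption: num_classes is taken to be max(labels) + 1, so only the lower
    bound of the range [0, num_classes - 1] can actually be violated here.
    """
    if not labels:
        raise ValueError("no labels found")
    if min(labels) < 0:
        raise ValueError("labels must be non-negative")
    return max(labels) + 1


# Hypothetical three-point dataset with the label stored in the last column.
sample = "1.0,2.0,0\n3.5,0.5,1\n2.2,1.1,0\n"
features, labels = split_labels(csv.reader(io.StringIO(sample)))
num_classes = check_labels(labels)
print(labels, num_classes)
```

       To check a real file, the io.StringIO object could be replaced with an open file handle.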

       When a model is trained, the '--output_model_file (-M)' output parameter may be used to
       save the trained model. A model may be loaded for predictions with the '--input_model_file
       (-m)' parameter. The '--input_model_file (-m)' parameter may not be specified when the
       '--training_file (-t)' parameter is specified. The '--minimum_leaf_size (-n)' parameter
       specifies the minimum number of training points that must fall into each leaf for it to be
       split. The '--num_trees (-N)' parameter controls the number of trees in the random forest.
       If '--print_training_accuracy (-a)' is specified, the calculated accuracy on the training
       set will be printed.
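
       The parameter rules in this paragraph can be sketched as a small Python helper that
       assembles an argument list for the program. This is only an illustration: the function
       build_command is hypothetical, it uses only the long option names documented on this page,
       and the only rule it enforces is that '--input_model_file' and '--training_file' are not
       given together.

```python
def build_command(training_file=None, labels_file=None, input_model_file=None,
                  output_model_file=None, minimum_leaf_size=None, num_trees=None,
                  print_training_accuracy=False):
    """Build an argument list for mlpack_random_forest from documented options."""
    # Documented constraint: a model may not be loaded while also training one.
    if training_file is not None and input_model_file is not None:
        raise ValueError(
            "--input_model_file may not be specified with --training_file")

    argv = ["mlpack_random_forest"]
    options = [
        ("--training_file", training_file),
        ("--labels_file", labels_file),
        ("--input_model_file", input_model_file),
        ("--minimum_leaf_size", minimum_leaf_size),
        ("--num_trees", num_trees),
        ("--output_model_file", output_model_file),
    ]
    for flag, value in options:
        if value is not None:
            argv += [flag, str(value)]
    if print_training_accuracy:
        argv.append("--print_training_accuracy")  # boolean flag, takes no value
    return argv


train_cmd = build_command(training_file="data.csv", labels_file="labels.csv",
                          minimum_leaf_size=20, num_trees=10,
                          output_model_file="rf_model.bin",
                          print_training_accuracy=True)
print(" ".join(train_cmd))
```

       The resulting list could be passed to subprocess.run(train_cmd, check=True) on a system
       where mlpack_random_forest is installed.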

       Test  data  may  be  specified  with  the '--test_file (-T)' parameter, and if performance
       measures are desired for that test set, labels for the test points may be  specified  with
       the  '--test_labels_file (-L)' parameter. Predictions for each test point may be saved via
       the '--predictions_file (-p)' output parameter. Class probabilities for each prediction may
       be saved with the '--probabilities_file (-P)' output parameter.

       For example, to train a random forest with a minimum leaf size of 20 using 10 trees on the
       dataset contained in 'data.csv' with labels 'labels.csv', saving the output random forest
       to 'rf_model.bin' and printing the training accuracy, one could call

       $ mlpack_random_forest --training_file data.csv --labels_file labels.csv --minimum_leaf_size 20
       --num_trees 10 --output_model_file rf_model.bin --print_training_accuracy

       Then, to use that model to classify points in 'test_set.csv' and print the test accuracy
       given the labels 'test_labels.csv', while saving the predictions for each point to
       'predictions.csv', one could call

       $ mlpack_random_forest --input_model_file rf_model.bin --test_file test_set.csv
       --test_labels_file test_labels.csv --predictions_file predictions.csv
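
       Once predictions have been saved, the accuracy can also be recomputed outside the program.
       The Python sketch below is one way to do that. It assumes the predictions and the test
       labels are plain integer values separated by commas and/or newlines, which may not match
       the exact layout mlpack writes. The functions read_labels and accuracy and the sample values
       are hypothetical.

```python
import io


def read_labels(stream):
    """Read integer labels separated by commas and/or newlines."""
    tokens = stream.read().replace(",", "\n").split()
    return [int(float(t)) for t in tokens]


def accuracy(predicted, actual):
    """Return the fraction of predictions that match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("prediction and label counts differ")
    if not actual:
        raise ValueError("no labels given")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical contents standing in for predictions.csv and test_labels.csv.
predicted = read_labels(io.StringIO("0\n1\n1\n0\n"))
actual = read_labels(io.StringIO("0\n1\n0\n0\n"))
print(accuracy(predicted, actual))
```

       With real files, each io.StringIO object could be replaced with open('predictions.csv') or
       open('test_labels.csv').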

OPTIONAL INPUT OPTIONS

       --help (-h) [bool]
              Default help info.

       --info [string]
              Get help on a specific module or option.  Default value ''.

       --input_model_file (-m) [unknown]
              Pre-trained random forest to use for classification. Default value ''.

       --labels_file (-l) [string]
              Labels for training dataset. Default value ''.

       --minimum_leaf_size (-n) [int]
              Minimum number of points in each leaf node.  Default value 20.

       --num_trees (-N) [int]
              Number of trees in the random forest. Default value 10.

       --print_training_accuracy (-a) [bool]
              If set, then the accuracy of the model on the training set will be printed
              (verbose must also be specified).

       --test_file (-T) [string]
              Test dataset to produce predictions for.  Default value ''.

       --test_labels_file (-L) [string]
              Test dataset labels, if accuracy calculation is desired. Default value ''.

       --training_file (-t) [string]
              Training dataset. Default value ''.

       --verbose (-v) [bool]
              Display informational messages and the full list of parameters and  timers  at  the
              end of execution.

       --version (-V) [bool]
              Display the version of mlpack.

OPTIONAL OUTPUT OPTIONS

       --output_model_file (-M) [unknown]
              Model to save trained random forest to. Default value ''.

       --predictions_file (-p) [string]
              Predicted classes for each point in the test set. Default value ''.

       --probabilities_file (-P) [string]
              Predicted class probabilities for each point in the test set. Default value ''.

ADDITIONAL INFORMATION

       For  further  information,  including  relevant papers, citations, and theory, consult the
       documentation found at http://www.mlpack.org or included with your distribution of mlpack.