Provided by: mlpack-bin_3.0.4-1_amd64


       mlpack_random_forest - random forests


        mlpack_random_forest [-m unknown] [-l string] [-n int] [-N int] [-a bool] [-T string] [-L string] [-t string] [-V bool] [-M unknown] [-p string] [-P string] [-h -v]


       This  program  is an implementation of the standard random forest classification algorithm
       by Leo Breiman. A random forest can be trained and saved for later use, or a random forest
       may be loaded and predictions or class probabilities for points may be generated.

       The  training  set and associated labels are specified with the '--training_file (-t)' and
       '--labels_file (-l)' parameters, respectively. The labels  should  be  in  the  range  [0,
       num_classes  -  1].  Optionally,  if '--labels_file (-l)' is not specified, the labels are
       assumed to be the last dimension of the training dataset.
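
       When the labels are stored as the last column of a combined CSV file and separate files
       are preferred, the two can be split apart with standard tools. The following is a minimal
       sketch, not part of mlpack: the file name 'combined.csv' and its contents are hypothetical
       stand-ins, and the range check assumes that every class appears at least once in the
       training labels.

```shell
# Hypothetical stand-in dataset: two feature columns, label in the last column.
printf '%s\n' '1.0,2.0,0' '1.5,1.8,1' '3.1,0.2,2' '2.9,0.4,1' > combined.csv

# Features: strip the final comma-separated field from every row.
sed 's/,[^,]*$//' combined.csv > data.csv

# Labels: keep only the final comma-separated field from every row.
sed 's/.*,//' combined.csv > labels.csv

# Sanity check: labels should cover [0, num_classes - 1].
# Assumes every class occurs at least once in labels.csv.
num_classes=$(sort -u labels.csv | wc -l | tr -d ' ')
min_label=$(sort -n labels.csv | head -n 1)
max_label=$(sort -n labels.csv | tail -n 1)

if [ "$min_label" -eq 0 ] && [ "$max_label" -eq $((num_classes - 1)) ]; then
  echo "labels are in [0, $((num_classes - 1))]"
else
  echo "labels are not in the range [0, num_classes - 1]" >&2
fi
```

       The resulting 'data.csv' and 'labels.csv' can then be passed to '--training_file (-t)' and
       '--labels_file (-l)'.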

       When a model is trained, the '--output_model_file (-M)' output parameter may be used to
       save the trained model. A model may be loaded for predictions with the '--input_model_file
       (-m)' parameter. The '--input_model_file (-m)' parameter may not be specified when the
       '--training_file (-t)' parameter is specified. The '--minimum_leaf_size (-n)' parameter
       specifies the minimum number of training points that must fall into each leaf for it to be
       split. The '--num_trees (-N)' parameter controls the number of trees in the random forest.
       If '--print_training_accuracy (-a)' is specified, the calculated accuracy on the training
       set will be printed.

       Test data may be specified with the '--test_file (-T)' parameter, and if performance
       measures are desired for that test set, labels for the test points may be specified with
       the '--test_labels_file (-L)' parameter. Predictions for each test point may be saved via
       the '--predictions_file (-p)' output parameter. Class probabilities for each prediction may
       be saved with the '--probabilities_file (-P)' output parameter.
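
       As an illustration of how the two outputs relate, the most probable class for each point
       can be recovered from a saved probabilities file. This is a sketch under an assumption
       that should be verified against your own output: each row of 'probabilities.csv' is taken
       to hold one test point, with column i (counting from 0) holding the probability of class
       i. The file contents below are hypothetical.

```shell
# Hypothetical stand-in for a saved probabilities file:
# one row per test point, one column per class (assumed layout).
printf '%s\n' '0.7,0.2,0.1' '0.1,0.3,0.6' '0.4,0.5,0.1' > probabilities.csv

# For each row, find the column with the largest value and print its
# zero-based index, which is the most probable class under the assumed layout.
awk -F, '{
  best = 1
  for (i = 2; i <= NF; i++)
    if ($i + 0 > $best + 0) best = i
  print best - 1
}' probabilities.csv > argmax_classes.csv

cat argmax_classes.csv
```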

       For example, to train a random forest with a minimum leaf size of 20 using 10 trees on the
       dataset contained in 'data.csv' with labels 'labels.csv', saving the output random forest
       to 'rf_model.bin' and printing the training accuracy, one could call

       $ mlpack_random_forest --training_file data.csv --labels_file labels.csv --minimum_leaf_size 20
       --num_trees 10 --output_model_file rf_model.bin --print_training_accuracy

       Then, to use that model to classify points in 'test_set.csv' and print the test accuracy
       given the labels 'test_labels.csv', while saving the predictions for each point to
       'predictions.csv', one could call

       $ mlpack_random_forest --input_model_file rf_model.bin --test_file test_set.csv
       --test_labels_file test_labels.csv --predictions_file predictions.csv
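
       Once predictions have been saved, the accuracy can also be recomputed outside the program.
       The following is a minimal sketch, assuming that 'predictions.csv' and 'test_labels.csv'
       each contain one class value per line in the same point order; the file contents shown are
       hypothetical stand-ins. Values are compared numerically so that '1' and '1.0' count as the
       same class.

```shell
# Hypothetical stand-in files: one class value per line, same point order.
printf '%s\n' 0 1 2 1 > test_labels.csv
printf '%s\n' 0 1 1 1 > predictions.csv

# Pair each prediction with its true label, count numeric matches,
# and report the share of correct predictions as a percentage.
accuracy=$(paste -d, predictions.csv test_labels.csv |
  awk -F, '$1 + 0 == $2 + 0 { correct++ }
           END { if (NR > 0) printf "%.2f", 100 * correct / NR }')

echo "test accuracy: ${accuracy}%"
```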


       --help (-h) [bool]
              Default help info.

       --info [string]
              Get help on a specific module or option.  Default value ''.

       --input_model_file (-m) [unknown]
              Pre-trained random forest to use for classification. Default value ''.

       --labels_file (-l) [string]
              Labels for training dataset. Default value ''.

       --minimum_leaf_size (-n) [int]
              Minimum number of points in each leaf node.  Default value 20.

       --num_trees (-N) [int]
              Number of trees in the random forest. Default value 10.

       --print_training_accuracy (-a) [bool]
              If set, then the accuracy of the model on the training set will be printed
              (verbose must also be specified).

       --test_file (-T) [string]
              Test dataset to produce predictions for.  Default value ''.

       --test_labels_file (-L) [string]
              Test dataset labels, if accuracy calculation is desired. Default value ''.

       --training_file (-t) [string]
              Training dataset. Default value ''.

       --verbose (-v) [bool]
              Display informational messages and the full list of parameters and  timers  at  the
              end of execution.

       --version (-V) [bool]
              Display the version of mlpack.


       --output_model_file (-M) [unknown]
              Model to save trained random forest to. Default value ''.

       --predictions_file (-p) [string]
              Predicted classes for each point in the test set. Default value ''.

       --probabilities_file (-P) [string]
              Predicted class probabilities for each point in the test set. Default value ''.


       For further information, including relevant papers, citations, and theory, consult the
       documentation on the mlpack website or included with your distribution of mlpack.