Ubuntu Manpage: mlpack_logistic_regression - l2-regularized logistic regression and prediction


bionic (1) mlpack_logistic_regression.1.gz

Provided by: mlpack-bin_2.2.5-1build1_amd64

NAME

       mlpack_logistic_regression - l2-regularized logistic regression and prediction

SYNOPSIS

        mlpack_logistic_regression [-h] [-v]

DESCRIPTION

       An  implementation  of  L2-regularized  logistic  regression  using  either  the  L-BFGS optimizer or SGD
       (stochastic gradient descent). This solves the regression problem

         y = 1 / (1 + e^-(X * b))

       where y takes values 0 or 1.
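
       As an illustration of the formula above, the following Python sketch evaluates the logistic function
       for a single point. The names logistic and predict_probability are hypothetical and not part of
       mlpack, and intercept handling is left out; this shows the arithmetic only, not the program's
       internals.

```python
import math


def logistic(z):
    """Numerically stable logistic function 1 / (1 + e^-z)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, rewrite the expression so exp() never overflows.
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_probability(x, b):
    """Return the modelled probability that the response for point x is 1.

    x is a list of predictors and b a list of coefficients of equal length.
    """
    z = sum(xi * bi for xi, bi in zip(x, b))
    return logistic(z)


# X * b = 1.0 * 0.5 + 2.0 * (-0.25) = 0, so the result is 0.5.
print(predict_probability([1.0, 2.0], [0.5, -0.25]))
```

       The value returned lies between 0 and 1 and can be read as the estimated probability that y is 1.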

       This program can load a logistic regression model from a file (-m), train a logistic regression model
       on given training data (-t), or do both at once. In addition, it can classify a test dataset (-T) and
       save the classification results to the given output file (-o). The logistic regression model itself
       may be saved to a file specified with the -M option.
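
       A train-then-predict workflow might be assembled as in the following Python sketch. It only builds
       and prints the command lines, using the long option names from the option lists on this page; it does
       not run the program. The file names (train.csv, test.csv, model.xml, predictions.csv) and the helper
       functions are hypothetical.

```python
import shlex

PROGRAM = "mlpack_logistic_regression"


def train_command(training_file, model_file):
    """Arguments that train on training_file and save the model to model_file."""
    return [PROGRAM,
            "--training_file", training_file,
            "--output_model_file", model_file]


def predict_command(model_file, test_file, output_file):
    """Arguments that load a saved model and classify the points in test_file."""
    return [PROGRAM,
            "--input_model_file", model_file,
            "--test_file", test_file,
            "--output_file", output_file]


# Hypothetical file names; substitute real paths before running the commands.
print(shlex.join(train_command("train.csv", "model.xml")))
print(shlex.join(predict_command("model.xml", "test.csv", "predictions.csv")))
```

       The printed lines can be pasted into a shell, or the argument lists can be passed to
       subprocess.run() on a system where mlpack is installed.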

       The training data given with the -t option should have class labels as its last dimension (so, if the
       training data is in CSV format, labels should be the last column). Alternatively, the -l
       (--labels_file) option may be used to specify a separate file of labels.
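
       To make the expected CSV layout concrete, the following Python sketch splits rows whose last column
       is the label into predictors and labels. The function name and the sample data are hypothetical; the
       sketch describes the file layout, not how the program itself parses files.

```python
import csv
import io


def split_predictors_and_labels(csv_text):
    """Split CSV rows into predictor rows and 0/1 labels taken from the last column."""
    predictors, labels = [], []
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip blank lines
        values = [float(v) for v in row]
        if values[-1] not in (0.0, 1.0):
            raise ValueError("labels must be 0 or 1, got %r" % row[-1])
        predictors.append(values[:-1])
        labels.append(int(values[-1]))
    return predictors, labels


# Two points with two predictors each; the third column holds the label.
sample = "0.5,1.2,0\n1.5,0.3,1\n"
X, y = split_predictors_and_labels(sample)
print(X)  # [[0.5, 1.2], [1.5, 0.3]]
print(y)  # [0, 1]
```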

       When a model is being trained, there are many options. L2 regularization (to prevent overfitting) can be
       specified with the -L (--lambda) option, and the optimizer used to train the model can be specified with
       the --optimizer option. Available optimizers are 'sgd' (stochastic gradient descent), 'lbfgs' (the
       L-BFGS optimizer), and 'minibatch-sgd' (mini-batch stochastic gradient descent). There are also various
       parameters for the optimizer: the --max_iterations parameter specifies the maximum number of allowed
       iterations, and the --tolerance (-e) parameter specifies the tolerance for convergence. For the SGD and
       mini-batch SGD optimizers, the --step_size parameter controls the step size taken at each iteration by
       the optimizer. The batch size for mini-batch SGD is controlled with the --batch_size (-b) parameter. If
       the objective function for your data is oscillating between Inf and 0, the step size is probably too
       large. There are more parameters for the optimizers, but the C++ interface must be used to access these.
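
       The roles of the step size and the L2 regularization parameter can be seen in the following Python
       sketch of a single SGD update. It uses a textbook formulation, assuming a per-point objective of the
       negative log-likelihood plus (lam / 2) * ||b||^2; the exact objective and scaling used by mlpack are
       not stated on this page, and the names sgd_step and lam are hypothetical.

```python
import math


def logistic(z):
    """Numerically stable logistic function 1 / (1 + e^-z)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def sgd_step(b, x, y, step_size=0.01, lam=0.0):
    """Return the coefficients after one SGD update on the single point (x, y).

    Assumed gradient for coefficient i: (p - y) * x[i] + lam * b[i],
    where p is the logistic function evaluated at the dot product of x and b.
    """
    p = logistic(sum(xi * bi for xi, bi in zip(x, b)))
    return [bi - step_size * ((p - y) * xi + lam * bi)
            for xi, bi in zip(x, b)]


# Starting from zero coefficients, p = 0.5, so the update moves b toward x.
b = sgd_step([0.0, 0.0], x=[1.0, 2.0], y=1, step_size=0.1)
print(b)  # approximately [0.05, 0.1]
```

       A larger step_size scales every update proportionally, which is why an overly large value can make
       the objective oscillate; a positive lam pulls each coefficient toward zero on every step.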

       For SGD, an iteration refers to a single point, and for mini-batch SGD, an iteration refers to  a  single
       batch.  So  to take a single pass over the dataset with SGD, --max_iterations should be set to the number
       of points in the dataset.
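
       The iteration counts described above can be computed as in the following Python sketch. The function
       is hypothetical, and the mini-batch case assumes that a final partial batch counts as one iteration,
       which this page does not state.

```python
import math


def iterations_for_passes(num_points, passes=1, batch_size=None):
    """Return a --max_iterations value covering the given number of passes.

    With batch_size=None (plain SGD) one iteration processes one point;
    otherwise one iteration processes one batch of up to batch_size points.
    """
    if num_points <= 0 or passes <= 0:
        raise ValueError("num_points and passes must be positive")
    if batch_size is None:
        return num_points * passes
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return math.ceil(num_points / batch_size) * passes


print(iterations_for_passes(1000))                           # 1000
print(iterations_for_passes(1000, batch_size=50))            # 20
print(iterations_for_passes(1010, passes=3, batch_size=50))  # 63
```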

       Optionally, the model can be used to predict the responses for another matrix of data points, if
       --test_file is specified. The --test_file option can be specified without --training_file, so long as
       an existing logistic regression model is given with --input_model_file. The output predictions from
       the logistic regression model are stored in the file given with --output_file.

       This implementation of logistic regression does not support the general multi-class case but instead only
       the two-class case. Any responses must be either 0 or 1.
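
       The rule applied by the --decision_boundary option (a value of the logistic function below the
       boundary gives class 0, otherwise class 1) can be sketched in Python as follows. The function name
       and the sample probabilities are hypothetical.

```python
def classify(probabilities, decision_boundary=0.5):
    """Map logistic function values to class 0 or 1 using the decision boundary."""
    return [0 if p < decision_boundary else 1 for p in probabilities]


# A value exactly on the boundary is not less than it, so it maps to class 1.
print(classify([0.1, 0.5, 0.9]))        # [0, 1, 1]
print(classify([0.1, 0.5, 0.9], 0.75))  # [0, 0, 1]
```

       Raising the boundary makes class 1 predictions more conservative; lowering it has the opposite
       effect.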

OPTIONAL INPUT OPTIONS

       --batch_size (-b) [int]
              Batch size for mini-batch SGD. Default value 50.

       --decision_boundary (-d) [double]
              Decision boundary for prediction; if the logistic function for a point is less than the
              boundary, the class is taken to be 0; otherwise, the class is 1. Default value 0.5.

       --help (-h)
              Default help info.

       --info [string]
              Get help on a specific module or option. Default value ''.

       --input_model_file (-m) [string]
              File containing existing model (parameters). Default value ''.

       --labels_file (-l) [string]
              A file containing labels (0 or 1) for the points in the training set (y). Default value ''.

       --lambda (-L) [double]
              L2-regularization parameter for training.  Default value 0.

       --max_iterations (-n) [int]
              Maximum iterations for optimizer (0 indicates no limit). Default value 10000.

       --optimizer (-O) [string]
              Optimizer to use for training ('lbfgs' or 'sgd'). Default value 'lbfgs'.

       --step_size (-s) [double]
              Step size for SGD and mini-batch SGD optimizers.  Default value 0.01.

       --test_file (-T) [string]
              File containing test dataset. Default value ''.

       --tolerance (-e) [double]
              Convergence tolerance for optimizer. Default value 1e-10.

       --training_file (-t) [string]
              A file containing the training set (the matrix of predictors, X). Default value ''.

       --verbose (-v)
              Display informational messages and the full list of parameters and timers at the end of execution.

       --version (-V)
              Display the version of mlpack.

OPTIONAL OUTPUT OPTIONS

       --output_file (-o) [string]
              If --test_file is specified, this file is where the predictions for the test set will be saved.
              Default value ''.

       --output_model_file (-M) [string]
              File to save trained logistic regression model to. Default value ''.

       --output_probabilities_file (-p) [string]
              If --test_file is specified, this file is where the class probabilities for the test set will
              be saved. Default value ''.

ADDITIONAL INFORMATION

       For further information, including relevant papers, citations, and theory, consult the documentation
       found at http://www.mlpack.org or included with your distribution of mlpack.

                                                                    mlpack_logistic_regression(16 November 2017)