
NAME

       mlpack_nca - neighborhood components analysis (nca)

SYNOPSIS

        mlpack_nca -i string [-A double] [-b int] [-l string] [-L bool] [-n int] [-T int] [-M double] [-m double] [-N bool] [-B int] [-O string] [-s int] [-a double] [-t double] [-V bool] [-w double] [-o string] [-h -v]

DESCRIPTION

       This  program  implements  Neighborhood  Components Analysis, both a linear dimensionality
       reduction technique and a distance learning technique. The  method  seeks  to  improve  k-
       nearest-neighbor  classification  on  a  dataset  by scaling the dimensions. The method is
       nonparametric and does not require a value of k. It works by using stochastic ("soft")
       neighbor assignments and applying gradient-based optimization to the accuracy of those
       neighbor assignments.
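
       The stochastic neighbor assignment described above can be illustrated with a small
       sketch. It assumes the usual NCA formulation, in which point i picks point j as its
       neighbor with probability proportional to exp(-||x_i - x_j||^2), and the objective is
       the expected number of points whose chosen neighbor shares their label. The function
       names are illustrative and are not part of mlpack; the points are assumed to be
       already transformed by the current distance matrix.

```python
import math

def soft_neighbor_probs(points, i):
    """Return p_ij for every j: a softmax over negative squared distances from point i."""
    weights = []
    for j, q in enumerate(points):
        if j == i:
            weights.append(0.0)  # a point is never its own neighbor
        else:
            d2 = sum((a - b) ** 2 for a, b in zip(points[i], q))
            weights.append(math.exp(-d2))
    total = sum(weights)
    return [w / total for w in weights]

def nca_objective(points, labels):
    """Expected number of points whose stochastic neighbor has the same label."""
    score = 0.0
    for i in range(len(points)):
        probs = soft_neighbor_probs(points, i)
        score += sum(p for j, p in enumerate(probs) if labels[j] == labels[i])
    return score

# Two tight, well-separated classes (illustrative data, not from mlpack).
points = [[0.0, 0.0], [0.1, 0.0], [2.0, 0.0], [2.1, 0.0]]
labels = [0, 0, 1, 1]
objective = nca_objective(points, labels)
```

       With four points the objective can be at most 4; it approaches that bound as points
       of the same class move closer together relative to points of other classes.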

       To work, this algorithm needs labeled data. The labels can be given as the last row of
       the input dataset (specified with '--input_file (-i)'), or alternatively as a separate
       matrix (specified with '--labels_file (-l)').
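
       As a sketch of the first layout, the snippet below separates a label row from a
       matrix held in memory. It assumes the matrix is stored with one column per point, so
       that the last row carries one label per point; split_labels is a hypothetical helper,
       not an mlpack function.

```python
def split_labels(matrix):
    """Split a matrix (list of rows, one column per point) into features and labels."""
    if len(matrix) < 2:
        raise ValueError("need at least one feature row plus a label row")
    features = matrix[:-1]                  # every row except the last
    labels = [int(v) for v in matrix[-1]]   # last row: one label per point
    return features, labels

matrix = [
    [1.0, 1.2, 5.0, 5.3],  # feature 1
    [0.5, 0.4, 2.0, 2.2],  # feature 2
    [0.0, 0.0, 1.0, 1.0],  # labels
]
features, labels = split_labels(matrix)
```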

       This implementation  of  NCA  uses  stochastic  gradient  descent,  mini-batch  stochastic
       gradient  descent,  or  the  L-BFGS  optimizer.  These  optimizers do not guarantee global
       convergence for a nonconvex objective function (NCA's objective function is nonconvex), so
       the final results could depend on the random seed or other optimizer parameters.

       Stochastic  gradient  descent, specified by the value 'sgd' for the parameter '--optimizer
       (-O)', depends primarily on three parameters: the step size (specified  with  '--step_size
       (-a)'),  the  batch  size  (specified with '--batch_size (-b)'), and the maximum number of
       iterations (specified with '--max_iterations (-n)'). In addition,  a  normalized  starting
       point  can  be  used by specifying the '--normalize (-N)' parameter, which is necessary if
       many warnings of the form 'Denominator of p_i is 0!' are given. Tuning the step  size  can
       be a tedious affair. In general, the step size is too large if the objective is not mostly
       uniformly decreasing, or if zero-valued denominator warnings are being  issued.  The  step
       size  is  too  small  if  the  objective  is changing very slowly. Setting the termination
       condition can be done easily once a good step size parameter is found; either increase the
       maximum  iterations  to a large number and allow SGD to find a minimum, or set the maximum
       iterations to 0 (allowing  infinite  iterations)  and  set  the  tolerance  (specified  by
       '--tolerance (-t)') to define the maximum allowed difference between objectives for SGD to
       terminate. Be careful: relying on the tolerance instead of a maximum iteration count can
       take a very long time, and SGD may never converge due to the properties of the optimizer.
       Note that a single iteration of SGD refers to a single point, so to  take  a  single  pass
       over  the  dataset,  set  the  value of the '--max_iterations (-n)' parameter equal to the
       number of points in the dataset.
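
       The zero-denominator warnings mentioned above can be reproduced in a few lines. The
       sketch assumes the denominator for point i is the sum over all other points of
       exp(-squared distance), which underflows to exactly 0 in double precision when points
       are far apart. The scale_by_range helper is a hypothetical stand-in for a normalized
       starting point; it is not claimed to be what '--normalize (-N)' does internally.

```python
import math

def denominators(points):
    """Per-point softmax denominator: sum over other points of exp(-squared distance)."""
    result = []
    for i, p in enumerate(points):
        total = 0.0
        for j, q in enumerate(points):
            if j != i:
                total += math.exp(-sum((a - b) ** 2 for a, b in zip(p, q)))
        result.append(total)
    return result

def scale_by_range(points):
    """Hypothetical normalization: divide each dimension by its range (max - min)."""
    dims = len(points[0])
    ranges = []
    for d in range(dims):
        column = [p[d] for p in points]
        span = max(column) - min(column)
        ranges.append(span if span > 0 else 1.0)  # avoid dividing by zero
    return [[p[d] / ranges[d] for d in range(dims)] for p in points]

raw = [[0.0, 0.0], [100.0, 50.0], [200.0, -50.0]]    # points far apart
raw_denominators = denominators(raw)                  # every entry underflows to 0.0
scaled_denominators = denominators(scale_by_range(raw))  # every entry is positive
```

       A step size that is too large can push the learned matrix toward the same failure by
       stretching distances, which is consistent with the advice above to reduce the step
       size when these warnings appear.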

       The L-BFGS optimizer, specified by the value 'lbfgs' for the parameter '--optimizer (-O)',
       uses  a  back-tracking  line  search  algorithm  to  minimize  a  function.  The following
       parameters are used by L-BFGS: '--num_basis (-B)' (specifies the number of  memory  points
       used  by  L-BFGS),  '--max_iterations  (-n)',  '--armijo_constant  (-A)',  '--wolfe (-w)',
       '--tolerance (-t)' (the optimization is terminated when the gradient norm  is  below  this
       value),  '--max_line_search_trials  (-T)', '--min_step (-m)', and '--max_step (-M)' (which
       both refer to the line search routine). For more details on the L-BFGS optimizer,  consult
       either  the  mlpack  L-BFGS  documentation  (in  lbfgs.hpp)  or  the vast set of published
       literature on L-BFGS.
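
       To show how the line search parameters fit together, here is a minimal textbook
       back-tracking line search. It checks only the Armijo sufficient-decrease condition and
       omits the Wolfe curvature check, so it is a simplified illustration rather than the
       mlpack routine. The shrink factor and the function name are assumptions; the default
       values mirror those listed on this page.

```python
def backtracking_line_search(f, grad, x, direction, step=1.0,
                             armijo_constant=1e-4, max_trials=50,
                             min_step=1e-20, max_step=1e20, shrink=0.5):
    """Shrink the step until the Armijo sufficient-decrease condition holds."""
    fx = f(x)
    # Directional derivative along the search direction; must be negative.
    slope = sum(g * d for g, d in zip(grad(x), direction))
    if slope >= 0:
        raise ValueError("direction is not a descent direction")
    for _ in range(max_trials):
        if step < min_step or step > max_step:
            break
        candidate = [xi + step * di for xi, di in zip(x, direction)]
        # Armijo condition: f(x + step*d) <= f(x) + c * step * slope
        if f(candidate) <= fx + armijo_constant * step * slope:
            return step, candidate
        step *= shrink
    raise RuntimeError("line search failed to find an acceptable step")

# Usage on f(x) = ||x||^2 with the steepest-descent direction.
f = lambda x: sum(v * v for v in x)
grad = lambda x: [2.0 * v for v in x]
x0 = [3.0, -4.0]
step, x1 = backtracking_line_search(f, grad, x0, [-g for g in grad(x0)])
```

       In this example the full step overshoots to a point with the same objective value, so
       the search halves the step once and lands on the minimizer.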

       By default, the SGD optimizer is used.
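
       The two optimizer configurations can be sketched as argument lists for the program.
       The helper below only builds the list and does not run anything; nca_command, the file
       names, and the dataset size are hypothetical, the option names are the long names
       documented on this page, and boolean options are assumed to be passed as bare flags.

```python
def nca_command(input_file, output_file, optimizer="sgd", **options):
    """Build an argv list for mlpack_nca from long option names."""
    argv = ["mlpack_nca", "--input_file", input_file,
            "--output_file", output_file, "--optimizer", optimizer]
    for name, value in options.items():
        if value is True:
            argv.append("--" + name)          # assumed: boolean options are bare flags
        else:
            argv += ["--" + name, str(value)]
    return argv

n_points = 1500  # hypothetical dataset size

# SGD: one pass over the data, since one iteration visits one point.
sgd_one_pass = nca_command("data.csv", "distance.csv",
                           step_size=0.01, max_iterations=n_points, normalize=True)

# L-BFGS: memory size, iteration cap, and gradient-norm tolerance.
lbfgs = nca_command("data.csv", "distance.csv", optimizer="lbfgs",
                    num_basis=5, max_iterations=1000, tolerance=1e-7)
```

       Either list could then be handed to subprocess.run from the Python standard library
       on a system where mlpack_nca is installed.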

REQUIRED INPUT OPTIONS

       --input_file (-i) [string]
              Input dataset to run NCA on.

OPTIONAL INPUT OPTIONS

       --armijo_constant (-A) [double]
              Armijo constant for L-BFGS. Default value 0.0001.

       --batch_size (-b) [int]
              Batch size for mini-batch SGD. Default value 50.

       --help (-h) [bool]
              Default help info.

       --info [string]
              Print help on a specific option. Default value ''.

       --labels_file (-l) [string]
              Labels for input dataset.

       --linear_scan (-L) [bool]
              Don't shuffle the order in which data points are visited for SGD or mini-batch SGD.

       --max_iterations (-n) [int]
              Maximum number of iterations for SGD or L-BFGS  (0  indicates  no  limit).  Default
              value 500000.

       --max_line_search_trials (-T) [int]
              Maximum number of line search trials for L-BFGS. Default value 50.

       --max_step (-M) [double]
              Maximum step of line search for L-BFGS. Default value 1e+20.

       --min_step (-m) [double]
              Minimum step of line search for L-BFGS. Default value 1e-20.

       --normalize (-N) [bool]
              Use a normalized starting point for optimization. This is useful when points are
              far apart, or when SGD is returning NaN.

       --num_basis (-B) [int]
              Number of memory points to be stored for L-BFGS. Default value 5.

       --optimizer (-O) [string]
              Optimizer to use; 'sgd' or 'lbfgs'. Default value 'sgd'.

       --seed (-s) [int]
              Random seed. If 0, 'std::time(NULL)' is used.  Default value 0.

       --step_size (-a) [double]
              Step size for stochastic gradient descent (alpha). Default value 0.01.

       --tolerance (-t) [double]
              Maximum tolerance for termination of SGD or L-BFGS. Default value 1e-07.

       --verbose (-v) [bool]
              Display informational messages and the full list of parameters and  timers  at  the
              end of execution.

       --version (-V) [bool]
              Display the version of mlpack.

       --wolfe (-w) [double]
              Wolfe condition parameter for L-BFGS. Default value 0.9.

OPTIONAL OUTPUT OPTIONS

       --output_file (-o) [string]
              Output file for the learned distance matrix.
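
       One way to use the result is sketched below. It assumes the learned matrix acts as a
       linear transformation, so that the learned distance between two points is the
       Euclidean distance between their transformed versions. The matrix values and function
       names are illustrative only.

```python
import math

def transform(matrix, point):
    """Multiply a matrix (list of rows) by a point (list of coordinates)."""
    return [sum(a * x for a, x in zip(row, point)) for row in matrix]

def learned_distance(matrix, x, y):
    """Euclidean distance between x and y after applying the learned matrix."""
    tx, ty = transform(matrix, x), transform(matrix, y)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(tx, ty)))

# Illustrative matrix that stretches the first dimension and shrinks the second.
learned = [[2.0, 0.0], [0.0, 0.1]]
d = learned_distance(learned, [0.0, 0.0], [1.0, 10.0])
```

       Here the second dimension contributes far less to the distance than it would under the
       plain Euclidean metric, which is the kind of per-dimension scaling NCA aims to learn.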

ADDITIONAL INFORMATION

       For  further  information,  including  relevant papers, citations, and theory, consult the
       documentation found at http://www.mlpack.org or included with your distribution of mlpack.