Ubuntu Manpage: mlpack_gmm_train - gaussian mixture model (gmm) training

Provided by: mlpack-bin_2.2.5-1build1_amd64

NAME

       mlpack_gmm_train - gaussian mixture model (gmm) training

SYNOPSIS

        mlpack_gmm_train [-h] [-v]

DESCRIPTION

       This program estimates the parameters of a Gaussian mixture model (GMM), using the EM algorithm to find
       the maximum likelihood estimate. The model may be saved to a file, which will contain information about
       each Gaussian.
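
       The EM procedure described above can be illustrated with a minimal one-dimensional sketch. It is not
       mlpack's implementation: the function names, the caller-supplied initial parameters, and the stopping
       rule on the change in log-likelihood are assumptions made for illustration, with defaults chosen to
       mirror --max_iterations and --tolerance.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(data, means, variances, weights, max_iterations=250, tolerance=1e-10):
    """Fit a 1-D GMM by EM.

    Returns (weights, means, variances, log_likelihood), where the
    log-likelihood is the value computed in the final E-step.
    Unlike the command-line option, max_iterations=0 is not treated
    as "run until convergence" in this sketch.
    """
    means, variances, weights = list(means), list(variances), list(weights)
    k = len(means)
    previous = -math.inf
    log_likelihood = -math.inf
    for _ in range(max_iterations):
        # E-step: responsibility of each Gaussian for each point.
        resp = []
        log_likelihood = 0.0
        for x in data:
            joint = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(joint)
            log_likelihood += math.log(total)
            resp.append([p / total for p in joint])
        # Stop once the log-likelihood changes by less than the tolerance.
        if abs(log_likelihood - previous) < tolerance:
            break
        previous = log_likelihood
        # M-step: re-estimate weight, mean, and variance of each Gaussian.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            variances[j] = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
    return weights, means, variances, log_likelihood


# Synthetic data: two well-separated clusters.
rng = random.Random(42)
data = [rng.gauss(-5.0, 1.0) for _ in range(200)] + [rng.gauss(5.0, 1.0) for _ in range(200)]
weights, means, variances, loglik = em_gmm_1d(data, [-1.0, 1.0], [1.0, 1.0], [0.5, 0.5])
print(weights, means, variances, loglik)
```

       The sketch takes starting means, variances, and weights from the caller, so it does not show how the
       program chooses its own initial values.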

       If GMM training fails with an error indicating that a covariance matrix could not be inverted, make  sure
       that the --no_force_positive flag is not specified.  Alternatively, adding a small amount of Gaussian noise
       (using  the  --noise  parameter) to the entire dataset may help prevent Gaussians with zero variance in a
       particular dimension, which is usually the cause of non-invertible covariance matrices.
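
       The effect of adding noise can be sketched as follows. This assumes the noise is drawn independently
       for every value; the helper names and the seed argument are hypothetical, and how the program applies
       --noise internally is not described here. Because --noise is a variance, the sketch converts it to a
       standard deviation before sampling.

```python
import math
import random


def add_gaussian_noise(points, variance, seed=1):
    """Return a copy of points with zero-mean Gaussian noise of the given variance added."""
    rng = random.Random(seed)
    sigma = math.sqrt(variance)  # random.gauss expects a standard deviation
    return [[value + rng.gauss(0.0, sigma) for value in point] for point in points]


def column_variances(points):
    """Population variance of each dimension."""
    n = len(points)
    dims = len(points[0])
    means = [sum(p[d] for p in points) / n for d in range(dims)]
    return [sum((p[d] - means[d]) ** 2 for p in points) / n for d in range(dims)]


# The second dimension is constant, so its variance is exactly zero.
data = [[float(i), 3.0] for i in range(50)]
noisy = add_gaussian_noise(data, variance=0.01)
print(column_variances(data))
print(column_variances(noisy))
```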

       The 'no_force_positive' flag, if set, skips the checks performed after each iteration of the EM
       algorithm to ensure that the covariance matrices are positive definite. Specifying the flag can reduce
       runtime, but may also produce non-positive definite covariance matrices, which will cause the program
       to crash.
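
       What "positive definite" means for a covariance matrix can be checked with a generic test such as the
       one below, which attempts a Cholesky factorization. This is a sketch of a standard technique, not the
       check that mlpack performs; the function name is hypothetical.

```python
import math


def is_positive_definite(matrix):
    """Return True if a symmetric matrix admits a Cholesky factorization."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                pivot = matrix[i][i] - partial
                if pivot <= 0.0:
                    # A non-positive pivot means the matrix is not positive definite.
                    return False
                lower[i][j] = math.sqrt(pivot)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return True


well_conditioned = [[2.0, 0.5], [0.5, 1.0]]
zero_variance = [[1.0, 0.0], [0.0, 0.0]]  # no spread in the second dimension
print(is_positive_definite(well_conditioned))
print(is_positive_definite(zero_variance))
```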

       Optionally, multiple trials may be performed by specifying the --trials option. The model with the
       greatest log-likelihood will be kept.
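
       The selection rule for multiple trials can be sketched as below. The toy_fit function is a hypothetical
       stand-in for one full training run (it fits a single Gaussian with a randomly chosen mean and a fixed
       variance); only the "keep the greatest log-likelihood" step reflects the behaviour described above.

```python
import math
import random


def toy_fit(data, rng):
    """Stand-in for one training run: a random start gives a model and its log-likelihood."""
    mean = rng.choice(data)  # random initialisation, may differ in each trial
    variance = 1.0
    loglik = sum(
        -0.5 * math.log(2.0 * math.pi * variance) - 0.5 * (x - mean) ** 2 / variance
        for x in data
    )
    return {"mean": mean, "variance": variance}, loglik


def train_with_trials(data, trials, seed=1):
    """Run `trials` independent fits and keep the one with the greatest log-likelihood."""
    rng = random.Random(seed)
    results = [toy_fit(data, rng) for _ in range(trials)]
    best_model, best_loglik = max(results, key=lambda result: result[1])
    return best_model, best_loglik, [loglik for _, loglik in results]


data = [0.1, 0.4, 0.5, 0.7, 2.5]
model, loglik, all_logliks = train_with_trials(data, trials=5)
print(model, loglik)
```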

REQUIRED INPUT OPTIONS

       --gaussians (-g) [int]
              Number of Gaussians in the GMM.

       --input_file (-i) [string]
              File containing the data on which the model will be fit.

OPTIONAL INPUT OPTIONS

       --help (-h)
              Default help info.

       --info [string]
              Get help on a specific module or option.  Default value ''.

       --input_model_file (-m) [string]
              File containing initial input GMM model.  Default value ''.

       --max_iterations (-n) [int]
              Maximum number of iterations of EM algorithm (passing 0 will run until convergence). Default value
              250.

       --no_force_positive (-P)
              Do not force the covariance matrices to be positive definite.

       --noise (-N) [double]
              Variance of zero-mean Gaussian noise to add to data. Default value 0.

       --percentage (-p) [double]
              If using --refined_start, specify the percentage of the dataset used for each sampling (should  be
              between 0.0 and 1.0). Default value 0.02.

       --refined_start (-r)
              During  the  initialization,  use  refined  initial  positions for k-means clustering (Bradley and
              Fayyad, 1998).

       --samplings (-S) [int]
              If using --refined_start, specify the number of samplings used for initial points.  Default  value
              100.

       --seed (-s) [int]
              Random seed. If 0, 'std::time(NULL)' is used.  Default value 0.

       --tolerance (-T) [double]
              Tolerance for convergence of EM. Default value 1e-10.

       --trials (-t) [int]
              Number of trials to perform in training GMM.  Default value 1.

       --verbose (-v)
              Display informational messages and the full list of parameters and timers at the end of execution.

       --version (-V)
              Display the version of mlpack.

OPTIONAL OUTPUT OPTIONS

       --output_model_file (-M) [string]
              File to save trained GMM model to. Default value ''.
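
       A typical invocation can be assembled from the options listed above. The sketch below builds the
       argument list in Python using only option names from this page; the helper names and the file names
       "data.csv" and "gmm.xml" are placeholders, and the file formats the program accepts are not stated
       here.

```python
import shlex
import shutil
import subprocess


def build_gmm_train_command(input_file, gaussians, output_model_file=None,
                            trials=1, noise=0.0, max_iterations=250, verbose=False):
    """Assemble an argument list using only options documented in this page."""
    command = [
        "mlpack_gmm_train",
        "--input_file", input_file,
        "--gaussians", str(gaussians),
        "--trials", str(trials),
        "--noise", str(noise),
        "--max_iterations", str(max_iterations),
    ]
    if output_model_file:
        command += ["--output_model_file", output_model_file]
    if verbose:
        command.append("--verbose")
    return command


def run_if_available(command):
    """Run the command only when the executable is on PATH; return True if it was run."""
    if shutil.which(command[0]) is None:
        return False
    subprocess.run(command, check=True)
    return True


# Placeholder file names; substitute real paths before running.
command = build_gmm_train_command(
    "data.csv", gaussians=3, output_model_file="gmm.xml", trials=5, verbose=True
)
print(shlex.join(command))
```

       Calling run_if_available(command) would start training if mlpack_gmm_train is installed and the input
       file exists.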

ADDITIONAL INFORMATION

       For further information, including relevant papers, citations, and theory, consult the documentation
       found at http://www.mlpack.org or included with your distribution of mlpack.

                                                                              mlpack_gmm_train(16 November 2017)