Ubuntu Manpage: mlpack_gmm_train - gaussian mixture model (gmm) training

Provided by: mlpack-bin_2.2.5-1build1_amd64

NAME

       mlpack_gmm_train - gaussian mixture model (gmm) training

SYNOPSIS

        mlpack_gmm_train [-h] [-v]

DESCRIPTION

       This program computes a parametric estimate of a Gaussian mixture model (GMM), using the EM
       algorithm to find the maximum likelihood estimate. The model may be saved to a file, which
       will contain information about each Gaussian.
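
       As an illustration of the EM procedure described above, the following is a minimal
       sketch in Python, not mlpack's implementation. It is one-dimensional and uses random data
       points as initial means; both are assumptions of the sketch. The function name em_fit and
       its helpers are hypothetical. The parameters max_iterations and tolerance mirror the
       options of the same names, with 0 iterations meaning "run until convergence".

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def log_likelihood(data, weights, means, variances):
    """Total log-likelihood of the data under the mixture."""
    total = 0.0
    for x in data:
        total += math.log(
            sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances))
        )
    return total


def em_fit(data, gaussians, max_iterations=250, tolerance=1e-10, seed=1):
    """Fit a 1-D GMM with EM; max_iterations=0 runs until convergence."""
    rng = random.Random(seed)
    n = len(data)
    mean_all = sum(data) / n
    var_all = sum((x - mean_all) ** 2 for x in data) / n
    weights = [1.0 / gaussians] * gaussians
    means = rng.sample(data, gaussians)  # assumption: random data points as initial means
    variances = [var_all] * gaussians
    history = [log_likelihood(data, weights, means, variances)]
    iteration = 0
    while max_iterations == 0 or iteration < max_iterations:
        iteration += 1
        # E-step: responsibility of each Gaussian for each point.
        resp = []
        for x in data:
            p = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(p)
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weight, mean and variance of each Gaussian.
        for k in range(gaussians):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            spread = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(spread, 1e-6)  # floor guards against a zero variance
        history.append(log_likelihood(data, weights, means, variances))
        # Stop once the log-likelihood changes by less than the tolerance.
        if abs(history[-1] - history[-2]) < tolerance:
            break
    return weights, means, variances, history
```

       The returned history holds the log-likelihood after each iteration; under EM it should
       not decrease from one iteration to the next.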

       If GMM training fails with an error indicating that a covariance matrix could not be
       inverted, make sure that the --no_force_positive flag is not specified. Alternatively,
       adding a small amount of Gaussian noise (using the --noise parameter) to the entire
       dataset may help prevent Gaussians with zero variance in a particular dimension, which is
       usually the cause of non-invertible covariance matrices.
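
       To see why added noise helps, the following Python sketch (an illustration only, with
       hypothetical helper names variance and add_noise) builds a tiny dataset whose second
       dimension is constant, then adds zero-mean Gaussian noise of a chosen variance to every
       entry, in the spirit of the --noise parameter.

```python
import math
import random


def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def add_noise(rows, noise_variance, seed=1):
    """Add zero-mean Gaussian noise with the given variance to every entry."""
    rng = random.Random(seed)
    sigma = math.sqrt(noise_variance)  # random.gauss expects a standard deviation
    return [[x + rng.gauss(0.0, sigma) for x in row] for row in rows]


# The second dimension is constant, so its variance is exactly zero.
rows = [[0.1, 5.0], [0.4, 5.0], [0.9, 5.0], [1.3, 5.0]]
noisy = add_noise(rows, noise_variance=1e-4)

print(variance([r[1] for r in rows]))   # prints 0.0
print(variance([r[1] for r in noisy]))  # small but positive
```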

       The --no_force_positive flag, if set, skips the checks performed after each iteration of
       the EM algorithm that ensure the covariance matrices are positive definite. Specifying the
       flag can reduce runtime, but may also produce covariance matrices that are not positive
       definite, which will cause the program to crash.
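
       This page does not describe how mlpack performs the positive-definiteness check. One
       generic way to test the property, shown here as a hedged Python sketch with a
       hypothetical function name, is to attempt a Cholesky factorization of the symmetric
       matrix and report failure as soon as a pivot is zero or negative.

```python
import math


def is_positive_definite(matrix):
    """Return True if a symmetric matrix admits a Cholesky factorization."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                pivot = matrix[i][i] - partial
                if pivot <= 0.0:
                    return False  # zero or negative pivot: not positive definite
                lower[i][j] = math.sqrt(pivot)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return True


print(is_positive_definite([[2.0, 1.0], [1.0, 2.0]]))  # prints True
print(is_positive_definite([[1.0, 0.0], [0.0, 0.0]]))  # prints False (zero variance)
```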

       Optionally, multiple trials may be performed by specifying the --trials option. The model
       with the greatest log-likelihood will be kept.
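
       The selection rule for multiple trials can be sketched in Python as follows. This is an
       illustration only: best_of_trials and the train_once callable are hypothetical names,
       and train_once is assumed to return a (model, log_likelihood) pair.

```python
import random


def best_of_trials(train_once, trials, seed=1):
    """Run train_once(rng) `trials` times and keep the highest log-likelihood.

    train_once is a hypothetical callable returning (model, log_likelihood).
    """
    rng = random.Random(seed)
    best_model, best_ll = None, float("-inf")
    for _ in range(trials):
        model, ll = train_once(rng)
        if ll > best_ll:
            best_model, best_ll = model, ll
    return best_model, best_ll
```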

REQUIRED INPUT OPTIONS

       --gaussians (-g) [int]
              Number of Gaussians in the GMM.

       --input_file (-i) [string]
              File containing the data on which the model will be fit.

OPTIONAL INPUT OPTIONS

       --help (-h)
              Default help info.

       --info [string]
              Get help on a specific module or option.  Default value ''.

       --input_model_file (-m) [string]
              File containing initial input GMM model.  Default value ''.

       --max_iterations (-n) [int]
              Maximum  number  of  iterations  of  EM  algorithm  (passing  0  will   run   until
              convergence). Default value 250.

       --no_force_positive (-P)
              Do not force the covariance matrices to be positive definite.

       --noise (-N) [double]
              Variance of zero-mean Gaussian noise to add to data. Default value 0.

       --percentage (-p) [double]
              If  using  --refined_start,  specify  the  percentage  of the dataset used for each
              sampling (should be between 0.0 and 1.0). Default value 0.02.

       --refined_start (-r)
              During the initialization, use refined initial  positions  for  k-means  clustering
              (Bradley and Fayyad, 1998).

       --samplings (-S) [int]
              If  using --refined_start, specify the number of samplings used for initial points.
              Default value 100.

       --seed (-s) [int]
              Random seed. If 0, 'std::time(NULL)' is used.  Default value 0.

       --tolerance (-T) [double]
              Tolerance for convergence of EM. Default value 1e-10.

       --trials (-t) [int]
              Number of trials to perform in training GMM.  Default value 1.

       --verbose (-v)
              Display informational messages and the full list of parameters and  timers  at  the
              end of execution.

       --version (-V)
              Display the version of mlpack.

OPTIONAL OUTPUT OPTIONS

       --output_model_file (-M) [string]
              File to save trained GMM model to. Default value ''.
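
       As a usage illustration, the Python sketch below assembles a command line from options
       listed on this page and can run it with the standard subprocess module. The helper names
       and the file names data.csv and gmm.xml are hypothetical; this page does not say which
       file formats are accepted.

```python
import subprocess


def gmm_train_command(input_file, gaussians, output_model_file, trials=1, noise=0.0):
    """Build an argument list using only options documented on this page."""
    command = [
        "mlpack_gmm_train",
        "--input_file", input_file,
        "--gaussians", str(gaussians),
        "--output_model_file", output_model_file,
        "--trials", str(trials),
    ]
    if noise > 0.0:
        command += ["--noise", str(noise)]
    return command


def run_gmm_train(command):
    """Run the command; raises CalledProcessError on a non-zero exit status."""
    return subprocess.run(command, check=True)


# Hypothetical file names, for illustration only.
print(" ".join(gmm_train_command("data.csv", 3, "gmm.xml", trials=5)))
```

       Passing the arguments as a list avoids shell quoting problems with file names that
       contain spaces.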

ADDITIONAL INFORMATION

       For further information, including relevant papers, citations, and theory, consult the
       documentation found at http://www.mlpack.org or included with your distribution of mlpack.

                                                               mlpack_gmm_train(16 November 2017)