Provided by: mlpack-bin_4.5.0-1_amd64

NAME

       mlpack_kernel_pca - kernel principal components analysis

SYNOPSIS

        mlpack_kernel_pca -i unknown -k string [-b double] [-c bool] [-D double] [-S double] [-d int] [-n bool] [-O double] [-s string] [-V bool] [-o unknown] [-h -v]

DESCRIPTION

       This program performs Kernel Principal Components Analysis (KPCA) on the specified dataset
       with the specified kernel.  This  will  transform  the  data  onto  the  kernel  principal
       components,  and  optionally  reduce  the  dimensionality by ignoring the kernel principal
       components with the smallest eigenvalues.
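
       The dimensionality reduction step described above can be sketched as follows. This is
       a minimal illustration, not mlpack's implementation: it assumes the eigenvalues and
       their matching components have already been computed, and the function name
       keep_top_components is hypothetical. A value of 0 keeps every component, mirroring the
       documented behavior of '--new_dimensionality (-d)'.

```python
def keep_top_components(eigenvalues, components, new_dimensionality=0):
    """Order eigenpairs by decreasing eigenvalue and keep the leading ones.

    eigenvalues[i] is assumed to belong to components[i].
    new_dimensionality == 0 keeps all components (no reduction).
    """
    # Indices of the eigenpairs, largest eigenvalue first.
    order = sorted(range(len(eigenvalues)),
                   key=lambda i: eigenvalues[i], reverse=True)
    if new_dimensionality > 0:
        # Drop the components with the smallest eigenvalues.
        order = order[:new_dimensionality]
    return [eigenvalues[i] for i in order], [components[i] for i in order]
```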

       For the case where a linear kernel is used, this reduces to regular PCA.

       The kernels that are supported are listed below:

              •  'linear': the standard linear dot product (same as normal PCA): `K(x, y) = x^T
                 y`

              •  'gaussian': a Gaussian kernel; requires bandwidth: `K(x, y) = exp(-(|| x - y ||
                 ^ 2) / (2 * (bandwidth ^ 2)))`

              •  'polynomial': polynomial kernel; requires offset and degree: `K(x, y) = (x^T y +
                 offset) ^ degree`

              •  'hyptan': hyperbolic tangent kernel; requires scale and offset: `K(x, y) =
                 tanh(scale * (x^T y) + offset)`

              •  'laplacian': Laplacian kernel; requires bandwidth: `K(x, y) = exp(-(|| x - y ||)
                 / bandwidth)`

              •  'epanechnikov': Epanechnikov kernel; requires bandwidth: `K(x, y) = max(0, 1 -
                 || x - y ||^2 / bandwidth^2)`

              •  'cosine': cosine distance: `K(x, y) = 1 - (x^T y) / (|| x || * || y ||)`
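
       For reference, the kernel formulas listed above can be written out directly. The
       following is a plain Python sketch of those formulas as stated in this page, not
       mlpack's code; the function names and the helpers dot, norm, and dist are
       illustrative, and the default arguments mirror the documented option defaults.

```python
import math

def dot(x, y):
    """Dot product x^T y of two equal-length sequences."""
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    """Euclidean norm || x ||."""
    return math.sqrt(dot(x, x))

def dist(x, y):
    """Euclidean distance || x - y ||."""
    return norm([a - b for a, b in zip(x, y)])

def linear(x, y):
    return dot(x, y)

def gaussian(x, y, bandwidth=1.0):
    return math.exp(-dist(x, y) ** 2 / (2.0 * bandwidth ** 2))

def polynomial(x, y, offset=0.0, degree=1.0):
    return (dot(x, y) + offset) ** degree

def hyptan(x, y, scale=1.0, offset=0.0):
    return math.tanh(scale * dot(x, y) + offset)

def laplacian(x, y, bandwidth=1.0):
    return math.exp(-dist(x, y) / bandwidth)

def epanechnikov(x, y, bandwidth=1.0):
    return max(0.0, 1.0 - dist(x, y) ** 2 / bandwidth ** 2)

def cosine(x, y):
    # As written in the list above; undefined if either vector is all zeros.
    return 1.0 - dot(x, y) / (norm(x) * norm(y))
```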

       The parameters for each of the kernels should be specified with the options '--bandwidth
       (-b)', '--kernel_scale (-S)', '--offset (-O)', or '--degree (-D)' (or a combination of
       those parameters).

       Optionally, the Nystroem method ("Using the Nystroem method to speed up kernel machines",
       2001) can be used to calculate the kernel matrix by specifying the '--nystroem_method
       (-n)' parameter. This approach uses a subset of the data as a basis to reconstruct
       the kernel matrix; the '--sampling (-s)' parameter selects the sampling scheme, which
       must be one of 'kmeans', 'random', or 'ordered'.
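
       The idea behind the Nystroem approximation can be sketched in a few lines. This is an
       illustration under stated assumptions, not mlpack's implementation: it takes the first
       `rank` points as the basis (one plausible reading of an 'ordered' scheme), uses a
       Gaussian kernel, and approximates the kernel matrix as C * W^-1 * C^T, where C holds
       kernel values between all points and the basis and W holds kernel values among the
       basis points. The names nystroem, invert, matmul, and rank are hypothetical.

```python
import math

def gaussian(x, y, bandwidth=1.0):
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2.0 * bandwidth ** 2))

def invert(m):
    """Invert a small square matrix (Gauss-Jordan, partial pivoting)."""
    n = len(m)
    # Augment m with the identity matrix: [m | I].
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    # The right half now holds the inverse.
    return [row[n:] for row in aug]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def nystroem(points, rank, kernel=gaussian):
    """Approximate the n x n kernel matrix from `rank` basis points."""
    basis = points[:rank]  # assumption: basis is the first `rank` points
    c = [[kernel(p, q) for q in basis] for p in points]  # n x rank
    w = [[kernel(p, q) for q in basis] for p in basis]   # rank x rank
    c_t = [list(col) for col in zip(*c)]                 # rank x n
    return matmul(matmul(c, invert(w)), c_t)
```

       Rows and columns belonging to the basis points are reproduced exactly; the remaining
       entries are approximations whose quality depends on how well the basis covers the data.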

       For example, the following command will perform KPCA on the dataset 'input.csv' using the
       Gaussian kernel and save the transformed data to 'transformed.csv':

       $ mlpack_kernel_pca --input_file input.csv --kernel gaussian --output_file transformed.csv

REQUIRED INPUT OPTIONS

       --input_file (-i) [unknown]
              Input dataset to perform KPCA on.

       --kernel (-k) [string]
              The kernel to use; see the above documentation for the list of usable kernels.

OPTIONAL INPUT OPTIONS

       --bandwidth (-b) [double]
              Bandwidth, for 'gaussian', 'laplacian', and 'epanechnikov' kernels. Default value 1.

       --center (-c) [bool]
              If set, the transformed data will be centered about the origin.

       --degree (-D) [double]
              Degree of polynomial, for 'polynomial' kernel.  Default value 1.

       --help (-h) [bool]
              Default help info.

       --info [string]
              Print help on a specific option. Default value ''.

       --kernel_scale (-S) [double]
              Scale, for 'hyptan' kernel. Default value 1.

       --new_dimensionality (-d) [int]
              If  not  0,  reduce  the  dimensionality  of  the  output  dataset  by ignoring the
              dimensions with the smallest eigenvalues. Default value 0.

       --nystroem_method (-n) [bool]
              If set, the Nystroem method will be used.

       --offset (-O) [double]
              Offset, for 'hyptan' and 'polynomial' kernels.  Default value 0.

       --sampling (-s) [string]
              Sampling scheme to use for the Nystroem method: 'kmeans', 'random', 'ordered'.
              Default value 'kmeans'.

       --verbose (-v) [bool]
              Display  informational  messages  and the full list of parameters and timers at the
              end of execution.

       --version (-V) [bool]
              Display the version of mlpack.

OPTIONAL OUTPUT OPTIONS

       --output_file (-o) [unknown]
              Matrix to save modified dataset to.

ADDITIONAL INFORMATION

       For further information, including relevant papers, citations,  and  theory,  consult  the
       documentation found at http://www.mlpack.org or included with your distribution of mlpack.