Provided by: mlpack-bin_2.2.5-1build1_amd64

NAME

       mlpack_kernel_pca - kernel principal components analysis

SYNOPSIS

        mlpack_kernel_pca [-h] [-v]

DESCRIPTION

       This program performs Kernel Principal Components Analysis (KPCA) on the specified dataset
       with the specified kernel.  This  will  transform  the  data  onto  the  kernel  principal
       components,  and  optionally  reduce  the  dimensionality by ignoring the kernel principal
       components with the smallest eigenvalues.
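
       The dimensionality reduction described above amounts to ranking the components by
       eigenvalue and keeping the largest ones. The following is a minimal Python sketch,
       assuming the eigenvalues and components have already been computed. The helper name
       keep_top_components and the sample values are illustrative assumptions, not part of
       mlpack.

```python
def keep_top_components(eigenvalues, components, new_dimensionality=0):
    """Keep the components with the largest eigenvalues.

    A value of 0 keeps every component, mirroring the documented default
    of --new_dimensionality. Hypothetical helper, not mlpack code.
    """
    # Pair each eigenvalue with its component and sort, largest first.
    ranked = sorted(zip(eigenvalues, components),
                    key=lambda pair: pair[0], reverse=True)
    if new_dimensionality > 0:
        ranked = ranked[:new_dimensionality]
    return [component for _, component in ranked]


# Illustrative values only; real components would be vectors.
eigenvalues = [0.5, 2.0, 1.25]
components = ["c0", "c1", "c2"]
top_two = keep_top_components(eigenvalues, components, new_dimensionality=2)
```

       Here top_two holds the two components whose eigenvalues are 2.0 and 1.25, in that
       order.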

       For the case where a linear kernel is used, this reduces to regular PCA.

       For example, the following will perform KPCA on the 'input.csv' file using the gaussian
       kernel and store the transformed data in the 'transformed.csv' file.

       $ mlpack_kernel_pca -i input.csv -k gaussian -o transformed.csv

       The kernels that are supported are listed below:

              •  'linear': the standard linear dot product (same as normal PCA): K(x, y) = x^T y

              •  'gaussian': a Gaussian kernel; requires bandwidth: K(x, y) = exp(-(|| x - y || ^
                 2) / (2 * (bandwidth ^ 2)))

              •  'polynomial': polynomial kernel; requires offset and degree: K(x, y) = (x^T y +
                 offset) ^ degree

              •  'hyptan': hyperbolic tangent kernel; requires scale and offset: K(x, y) =
                 tanh(scale * (x^T y) + offset)

              •  'laplacian': Laplacian kernel; requires bandwidth: K(x, y) = exp(-(|| x - y ||)
                 / bandwidth)

              •  'epanechnikov': Epanechnikov kernel; requires bandwidth: K(x, y) = max(0, 1 - ||
                 x - y ||^2 / bandwidth^2)

              •  'cosine': cosine distance: K(x, y) = 1 - (x^T y) / (|| x || * || y ||)
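
       The formulas in the list above can be transcribed directly into code. The sketch below
       does so in plain Python for points given as sequences of floats. It follows the
       formulas as written in this page; the function names are illustrative, and mlpack's
       own implementation may differ in detail.

```python
import math


def dot(x, y):
    """Dot product x^T y."""
    return sum(a * b for a, b in zip(x, y))


def distance(x, y):
    """Euclidean distance || x - y ||."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def linear(x, y):
    return dot(x, y)


def gaussian(x, y, bandwidth=1.0):
    return math.exp(-distance(x, y) ** 2 / (2.0 * bandwidth ** 2))


def polynomial(x, y, offset=0.0, degree=1.0):
    return (dot(x, y) + offset) ** degree


def hyptan(x, y, scale=1.0, offset=0.0):
    return math.tanh(scale * dot(x, y) + offset)


def laplacian(x, y, bandwidth=1.0):
    return math.exp(-distance(x, y) / bandwidth)


def epanechnikov(x, y, bandwidth=1.0):
    return max(0.0, 1.0 - distance(x, y) ** 2 / bandwidth ** 2)


def cosine(x, y):
    # Transcribed as listed above: 1 - (x^T y) / (|| x || * || y ||).
    return 1.0 - dot(x, y) / (math.sqrt(dot(x, x)) * math.sqrt(dot(y, y)))
```

       The default arguments mirror the documented defaults of --bandwidth, --kernel_scale,
       --offset, and --degree.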

       The parameters for each of the kernels should be specified with the  options  --bandwidth,
       --kernel_scale, --offset, or --degree (or a combination of those options).
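
       To make the pairing of kernels and options concrete, the sketch below builds an
       argument list for the program and checks that the parameters named in the kernel list
       are supplied. The helper build_command is hypothetical. It is stricter than the
       program itself, which falls back to the documented default values when an option is
       omitted.

```python
# Parameters each kernel requires, taken from the kernel list in this page.
REQUIRED_PARAMETERS = {
    "linear": [],
    "gaussian": ["bandwidth"],
    "polynomial": ["offset", "degree"],
    "hyptan": ["kernel_scale", "offset"],
    "laplacian": ["bandwidth"],
    "epanechnikov": ["bandwidth"],
    "cosine": [],
}


def build_command(input_file, kernel, output_file, **parameters):
    """Return an argument list for mlpack_kernel_pca (hypothetical helper)."""
    if kernel not in REQUIRED_PARAMETERS:
        raise ValueError("unsupported kernel: " + kernel)
    missing = [name for name in REQUIRED_PARAMETERS[kernel]
               if name not in parameters]
    if missing:
        raise ValueError("kernel '%s' needs: %s" % (kernel, ", ".join(missing)))
    command = ["mlpack_kernel_pca", "--input_file", input_file,
               "--kernel", kernel, "--output_file", output_file]
    # Append the kernel parameters as long options, in a stable order.
    for name, value in sorted(parameters.items()):
        command += ["--" + name, str(value)]
    return command


command = build_command("input.csv", "polynomial", "transformed.csv",
                        offset=1.0, degree=2.0)
```

       If the program is installed, the resulting list could be passed to
       subprocess.run(command, check=True) from the Python standard library.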

       Optionally, the Nyström method ("Using the Nystroem method to speed up kernel machines",
       2001) can be used to calculate the kernel matrix by specifying the --nystroem_method (-n)
       option. This approach uses a subset of the data as a basis to reconstruct the kernel
       matrix. The sampling scheme is chosen with the --sampling parameter, which accepts one
       of the following: kmeans, random, ordered.
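
       The idea of reconstructing the kernel matrix from a subset of the data can be
       sketched as follows. With C the kernel values between all points and the basis
       points, and W the kernel values among the basis points, the approximation is
       C * inverse(W) * C^T. This is a plain Python illustration under stated assumptions,
       not mlpack's implementation: the basis is simply the first two points (one plausible
       reading of the 'ordered' scheme), W is assumed to be invertible, and all function
       names are illustrative.

```python
import math


def gaussian(x, y, bandwidth=1.0):
    squared = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared / (2.0 * bandwidth ** 2))


def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting; assumes a non-singular matrix."""
    n = len(matrix)
    # Augment the matrix with the identity.
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def nystroem(points, basis, kernel):
    """Approximate the full kernel matrix as C * inverse(W) * C^T."""
    n, m = len(points), len(basis)
    c = [[kernel(p, b) for b in basis] for p in points]  # n x m
    w = [[kernel(a, b) for b in basis] for a in basis]   # m x m
    w_inv = invert(w)
    cw = [[sum(c[i][k] * w_inv[k][j] for k in range(m)) for j in range(m)]
          for i in range(n)]
    return [[sum(cw[i][k] * c[j][k] for k in range(m)) for j in range(n)]
            for i in range(n)]


# Illustrative data only.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 1.0)]
basis = points[:2]  # assumed 'ordered'-style choice: the first points
approx = nystroem(points, basis, gaussian)
exact = [[gaussian(p, q) for q in points] for p in points]
```

       The rows and columns of approx that belong to the basis points match exact up to
       rounding; the remaining entries are approximations whose quality depends on how well
       the basis represents the data.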

REQUIRED INPUT OPTIONS

       --input_file (-i) [string]
              Input dataset to perform KPCA on.

       --kernel (-k) [string]
              The kernel to use; see the above documentation for the list of usable kernels.

OPTIONAL INPUT OPTIONS

       --bandwidth (-b) [double]
              Bandwidth, for 'gaussian' and 'laplacian' kernels. Default value 1.

       --center (-c)
              If set, the transformed data will be centered about the origin.

       --degree (-D) [double]
              Degree of polynomial, for 'polynomial' kernel.  Default value 1.

       --help (-h)
              Default help info.

       --info [string]
              Get help on a specific module or option.  Default value ''.

       --kernel_scale (-S) [double]
              Scale, for 'hyptan' kernel. Default value 1.

       --new_dimensionality (-d) [int]
              If not 0, reduce the dimensionality of the output dataset by ignoring the
              dimensions with the smallest eigenvalues. Default value 0.

       --nystroem_method (-n)
              If set, the Nyström method will be used.

       --offset (-O) [double]
              Offset, for 'hyptan' and 'polynomial' kernels.  Default value 0.

       --sampling (-s) [string]
              Sampling scheme to use for the Nyström method: 'kmeans', 'random', 'ordered'.
              Default value 'kmeans'.

       --verbose (-v)
              Display informational messages and the full list of parameters and  timers  at  the
              end of execution.

       --version (-V)
              Display the version of mlpack.

OPTIONAL OUTPUT OPTIONS

       --output_file (-o) [string]
              File to save modified dataset to. Default value ''.

ADDITIONAL INFORMATION

       For further information, including relevant papers, citations, and theory, consult the
       documentation found at http://www.mlpack.org or included with your distribution of
       mlpack.

                                                              mlpack_kernel_pca(16 November 2017)