Provided by: libstatistics-pca-perl_0.0.1-2_all

NAME

       Statistics::PCA - A simple Perl implementation of Principal Component Analysis.

VERSION

       This document describes Statistics::PCA version 0.0.1

SYNOPSIS

           use Statistics::PCA;

           # Create new Statistics::PCA object.
           my $pca = Statistics::PCA->new;

           #                  Var1    Var2    Var3    Var4...
           my @Obs1 = (qw/    32      26      51      12    /);
           my @Obs2 = (qw/    17      13      34      35    /);
           my @Obs3 = (qw/    10      94      83      45    /);
           my @Obs4 = (qw/    3       72      72      67    /);
           my @Obs5 = (qw/    10      63      35      34    /);

           # Load data. Data is loaded as a LIST-of-LISTS (LoL) pointed to by a named argument 'data'. Requires argument for format (see METHODS).
           $pca->load_data ( { format => 'table', data => [ \@Obs1, \@Obs2, \@Obs3, \@Obs4, \@Obs5 ], } ) ;

           # Perform the PCA analysis. Takes optional argument 'eigen' (see METHODS).
           #$pca->pca( { eigen => 'C' } );
           $pca->pca();

           # Access results. The return value of this method is context-dependent (see METHODS). To print a report to STDOUT call in VOID-context.
           $pca->results();

DESCRIPTION

       Principal component analysis (PCA) transforms higher-dimensional data consisting of a
       number of possibly correlated variables into a smaller number of uncorrelated variables
       termed principal components (PCs). PCs are ranked by the amount of variability they
       account for: the first PC accounts for the most, the second for the next most, and so
       on. This implementation mean-centers the data, computes the covariance matrix, and then
       calculates its eigenvalue decomposition using either the Math::MatrixReal or
       Math::Cephes::Matrix module (see METHODS). See
       http://en.wikipedia.org/wiki/Principal_component_analysis for more details.
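
       The preprocessing described above can be sketched in plain Perl. This is a minimal
       illustration, not the module's internal code: the helper names mean_center and
       covariance are hypothetical, and the n - 1 (sample) denominator is an assumption, since
       this document does not say which denominator the module uses.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Rows are observations, columns are variables ('table' layout).
my @data = (
    [ 32, 26, 51, 12 ],
    [ 17, 13, 34, 35 ],
    [ 10, 94, 83, 45 ],
    [  3, 72, 72, 67 ],
    [ 10, 63, 35, 34 ],
);

# Hypothetical helper: subtract each column's mean from that column.
sub mean_center {
    my ($rows) = @_;
    my $n = @$rows;
    my $p = @{ $rows->[0] };
    my @means = map { my $j = $_; sum( map { $_->[$j] } @$rows ) / $n } 0 .. $p - 1;
    my @centered = map { my $r = $_; [ map { $r->[$_] - $means[$_] } 0 .. $p - 1 ] } @$rows;
    return ( \@centered, \@means );
}

# Hypothetical helper: covariance matrix of already-centered data.
# The n - 1 denominator is an assumption (sample covariance).
sub covariance {
    my ($centered) = @_;
    my $n = @$centered;
    my $p = @{ $centered->[0] };
    my @cov;
    for my $i ( 0 .. $p - 1 ) {
        for my $j ( 0 .. $p - 1 ) {
            $cov[$i][$j] = sum( map { $_->[$i] * $_->[$j] } @$centered ) / ( $n - 1 );
        }
    }
    return \@cov;
}

my ( $centered, $means ) = mean_center( \@data );
my $cov = covariance($centered);
printf "Var1 mean: %.1f, Var1 variance: %.1f\n", $means->[0], $cov->[0][0];
```

       The eigenvectors of the resulting symmetric matrix give the PC directions and its
       eigenvalues give the variance along each direction.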

METHODS

   new
       Create a new Statistics::PCA object.

           my $pca = Statistics::PCA->new;

   load_data
       Loads data into the object. Data is passed as a reference to a LoL within an anonymous
       hash using the named argument 'data'. The layout of the data is specified by the
       obligatory named argument 'format', which takes the value 'table' or 'variable'. In
       'table' format, rows correspond to observations and columns correspond to variables.
       Thus to enter the following table of data:

                   Var1    Var2    Var3    Var4

           Obs1    32      26      51      12
           Obs2    17      13      34      35
           Obs3    10      94      83      45
           Obs4    3       72      72      67
           Obs5    10      63      35      34 ...

       The data is passed as a LoL with each nested ARRAY reference corresponding to a
       row of observations in the data table and the 'format' argument value 'table' as follows:

           #                       Var1    Var2    Var3    Var4 ...
           my $data  =   [
                           [qw/    32      26      51      12    /],     # Obs1
                           [qw/    17      13      34      35    /],     # Obs2
                           [qw/    10      94      83      45    /],     # Obs3
                           [qw/    3       72      72      67    /],     # Obs4
                           [qw/    10      63      35      34    /],     # Obs5 ...
                       ];

           $pca->load_data ( { format => 'table', data => $data, } );

       Alternatively you may enter the data in a variable-centric fashion where each nested ARRAY
       reference corresponds to a single variable within the data (i.e. the transpose of the
       above table-fashion). To pass the above data in this fashion use the 'format' argument
       with value 'variable' as follows:

           #                           Obs1    Obs2    Obs3    Obs4    Obs5 ...
           my $transpose = [
                               [qw/    32      17      10      3       10    /],   # Var1
                               [qw/    26      13      94      72      63    /],   # Var2
                               [qw/    51      34      83      72      35    /],   # Var3
                               [qw/    12      35      45      67      34    /],   # Var4 ...
                           ];

           $pca->load_data ( { format => 'variable', data => $transpose, } ) ;
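
       If the data already exists in one layout, the other can be derived by transposing the
       LoL. The sketch below uses a hypothetical helper, transpose_lol, which is not part of
       Statistics::PCA, and assumes every row has the same number of elements.

```perl
use strict;
use warnings;

# Hypothetical helper: transpose a rectangular list-of-lists.
sub transpose_lol {
    my ($rows) = @_;
    my @cols;
    for my $i ( 0 .. $#$rows ) {
        for my $j ( 0 .. $#{ $rows->[$i] } ) {
            $cols[$j][$i] = $rows->[$i][$j];
        }
    }
    return \@cols;
}

#                Var1 Var2 Var3 Var4
my $data = [
    [qw/ 32   26   51   12 /],    # Obs1
    [qw/ 17   13   34   35 /],    # Obs2
    [qw/ 10   94   83   45 /],    # Obs3
    [qw/ 3    72   72   67 /],    # Obs4
    [qw/ 10   63   35   34 /],    # Obs5
];

my $transpose = transpose_lol($data);
print "Var1 across observations: @{ $transpose->[0] }\n";

# With Statistics::PCA installed, both calls describe the same data set:
# $pca->load_data( { format => 'table',    data => $data } );
# $pca->load_data( { format => 'variable', data => $transpose } );
```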

   pca
       Performs the PCA. This method takes the optional named argument 'eigen', which
       takes the value 'M' or 'C' to calculate the eigenvalue decomposition using the
       Math::MatrixReal or Math::Cephes::Matrix module respectively (defaults to 'M' without
       argument).

           $pca->pca();
           $pca->pca( { eigen => 'M' } );
           $pca->pca( { eigen => 'C' } );
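
       One way to choose the 'eigen' value at run time is to test which backend module can be
       loaded. This is only a sketch: pick_eigen_backend is a hypothetical helper, and because
       the module lists both backends as dependencies, the check is mainly useful for
       diagnosing an incomplete installation.

```perl
use strict;
use warnings;

# Hypothetical helper: return 'M' if Math::MatrixReal loads, 'C' if
# Math::Cephes::Matrix loads, or undef if neither is available.
sub pick_eigen_backend {
    return 'M' if eval { require Math::MatrixReal;     1 };
    return 'C' if eval { require Math::Cephes::Matrix; 1 };
    return;
}

my $backend = pick_eigen_backend();
print defined $backend
    ? "Using eigen => '$backend'\n"
    : "No eigenvalue backend found\n";

# With a loaded Statistics::PCA object in $pca:
# $pca->pca( { eigen => $backend } ) if defined $backend;
```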

   results
       Used to access the results of the PCA analysis. This method is context-dependent: its
       return value depends on whether it is called in VOID or LIST context and on the
       argument it is passed. In VOID context it prints a formatted table of the computed
       results to STDOUT.

           $pca->results;

       In LIST context this method takes an obligatory argument that determines its return
       values. To return an ordered list (ordered by PC ranking) of the proportions of total
       variance of each PC pass 'proportion' to the method.

           my @list = $pca->results('proportion');
           print qq{\nOrdered list of individual proportions of variance: @list};

       To return an ordered list of the cumulative variance of the PCs pass argument
       'cumulative'.

           @list = $pca->results('cumulative');
           print qq{\nOrdered list of cumulative variance of the PCs: @list};

       To return an ordered list of the individual standard deviations of the PCs pass argument
       'stdev'.

           @list = $pca->results('stdev');
           print qq{\nOrdered list of individual standard deviations of the PCs: @list};

       To return an ordered list of the individual eigenvalues of the PCs pass argument
       'eigenvalue'.

           @list = $pca->results('eigenvalue');
           print qq{\nOrdered list of individual eigenvalues of the PCs: @list};
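
       The four quantities above are closely related. The sketch below shows the standard PCA
       definitions, assuming that the proportion is each eigenvalue divided by the sum of all
       eigenvalues and that the standard deviation is the square root of the eigenvalue. The
       helper summarise_eigenvalues is hypothetical, and the module may scale its output
       differently (for example as percentages).

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical helper: derive per-PC summaries from eigenvalues that are
# already sorted in descending order (that is, by PC rank).
sub summarise_eigenvalues {
    my (@eigenvalues) = @_;
    my $total   = sum(@eigenvalues);
    my $running = 0;
    my @summary;
    for my $ev (@eigenvalues) {
        my $proportion = $ev / $total;
        $running += $proportion;
        push @summary, {
            eigenvalue => $ev,
            stdev      => sqrt($ev),
            proportion => $proportion,
            cumulative => $running,
        };
    }
    return @summary;
}

# Illustrative eigenvalues only, not output from the module.
my @summary = summarise_eigenvalues( 4, 3, 2, 1 );
printf "PC%d: stdev=%.3f proportion=%.2f cumulative=%.2f\n",
    $_ + 1, @{ $summary[$_] }{qw/stdev proportion cumulative/}
    for 0 .. $#summary;
```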

       To return an ordered list of ARRAY references containing the eigenvectors of the PCs pass
       argument 'eigenvector'.

           # Returns an ordered list of array references containing the eigenvectors for the components
           @list = $pca->results('eigenvector');
           use Data::Dumper;
           print Dumper \@list;

       To return an ordered list of ARRAY references containing more detailed information about
       each PC use the 'full' argument. Each nested ARRAY reference consists of an ordered list
       of: PC rank, PC stdev, PC proportion of variance, PC cumulative_variance, PC eigenvalue
       and a further nested ARRAY reference containing the PC eigenvector.

           @list = $pca->results('full');
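           for my $i (@list) {
               # Each record: rank, stdev, proportion, cumulative variance, eigenvalue, eigenvector.
               print qq{\nPC rank: $i->[0]}
                     . qq{\nPC stdev: $i->[1]}
                     . qq{\nPC proportion of variance: $i->[2]}
                     . qq{\nPC cumulative variance: $i->[3]}
                     . qq{\nPC eigenvalue: $i->[4]}
                     . qq{\nPC eigenvector: @{ $i->[5] }};
           }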

       To return an ordered LoL of the transformed data for each of the PCs pass 'transformed' to
       the method.

           @list = $pca->results('transformed');
           print qq{\nThe transformed data for 'the' principal component (first PC): @{$list[0]} };
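
       In standard PCA the transformed data for a PC is the projection of each mean-centered
       observation onto that PC's eigenvector. The sketch below illustrates this with a
       hypothetical helper, project, and toy values; it assumes the module follows the same
       definition, and the sign or scaling of its output may differ.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical helper: project each mean-centered observation onto one
# eigenvector (a dot product per row), giving one score per observation.
sub project {
    my ( $centered, $eigenvector ) = @_;
    return map {
        my $row = $_;
        sum( map { $row->[$_] * $eigenvector->[$_] } 0 .. $#$eigenvector );
    } @$centered;
}

# Two-variable toy data, already mean centered; illustrative values only.
my @centered = ( [ -2, -2 ], [ 0, 0 ], [ 2, 2 ] );

# Unit vector along the diagonal, the direction of greatest variance here.
my @pc1 = ( 1 / sqrt(2), 1 / sqrt(2) );

my @scores = project( \@centered, \@pc1 );
printf "%.4f\n", $_ for @scores;
```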

DEPENDENCIES

       'version'                   => '0',
       'Carp'                      => '1.08',
       'Math::Cephes::Matrix'      => '0.47',
       'Math::Cephes'              => '0.47',
       'List::Util'                => '1.19',
       'Math::MatrixReal'          => '2.05',
       'Text::SimpleTable'         => '2.0',
       'Contextual::Return'        => '0.2.1',

AUTHOR

       Daniel S. T. Hughes  "<dsth@cpan.org>"

LICENCE AND COPYRIGHT

       Copyright (c) 2009, Daniel S. T. Hughes "<dsth@cantab.net>". All rights reserved.

       This module is free software; you can redistribute it and/or modify it under the same
       terms as Perl itself. See perlartistic.

DISCLAIMER OF WARRANTY

       Because this software is licensed free of charge, there is no warranty for the software,
       to the extent permitted by applicable law. Except when otherwise stated in writing the
       copyright holders and/or other parties provide the software "as is" without warranty of
       any kind, either expressed or implied, including, but not limited to, the implied
       warranties of merchantability and fitness for a particular purpose. The entire risk as to
       the quality and performance of the software is with you. Should the software prove
       defective, you assume the cost of all necessary servicing, repair, or correction.

       In no event unless required by applicable law or agreed to in writing will any copyright
       holder, or any other party who may modify and/or redistribute the software as permitted by
       the above licence, be liable to you for damages, including any general, special,
       incidental, or consequential damages arising out of the use or inability to use the
       software (including but not limited to loss of data or data being rendered inaccurate or
       losses sustained by you or third parties or a failure of the software to operate with any
       other software), even if such holder or other party has been advised of the possibility of
       such damages.