Provided by: libcatmandu-stat-perl_0.13-2_all

NAME

       Catmandu::Stat - Catmandu modules for working with statistical data

SYNOPSIS

           # Calculate statistics on the availability of the ISBN fields in the dataset
           cat data.json | catmandu convert JSON to Stat --fields isbn

           # Preprocess data and calculate statistics
           catmandu convert MARC to Stat --fix 'marc_map(020a,isbn)' --fields isbn < data.mrc

           # Or in fix files

           # Calculate the mean of foo. E.g. foo => [1,2,3,4]
           stat_mean(foo)  # foo => '2.5'

           # Calculate the median of foo. E.g. foo => [1,2,3,4]
           stat_median(foo)  # foo => '2.5'

           # Calculate the standard deviation of foo. E.g. foo => [1,2,3,4]
           stat_stddev(foo)  # foo => '1.12'

           # Calculate the variance of foo. E.g. foo => [1,2,3,4]
           stat_variance(foo)  # foo => '1.25'
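
       The values shown for [1,2,3,4] match the population formulas (dividing by N rather
       than N-1). The Python sketch below is not part of Catmandu; it only re-derives the
       numbers above with the standard "statistics" module, on the assumption that
       stat_variance and stat_stddev use the population definitions, as the example output
       suggests.

```python
import statistics

# Same input as in the fix examples above.
foo = [1, 2, 3, 4]

mean = statistics.mean(foo)            # arithmetic mean
median = statistics.median(foo)        # average of the two middle values
variance = statistics.pvariance(foo)   # population variance (divide by N)
stddev = statistics.pstdev(foo)        # population standard deviation

# For comparison only: the sample variance (divide by N-1) gives a different value.
sample_variance = statistics.variance(foo)

print(mean, median, variance, round(stddev, 2))
print(round(sample_variance, 2))
```

       The sample variance differs from '1.25', which is why the population definition is
       assumed here.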

MODULES

       •   Catmandu::Exporter::Stat

       •   Catmandu::Fix::stat_mean

       •   Catmandu::Fix::stat_median

       •   Catmandu::Fix::stat_stddev

       •   Catmandu::Fix::stat_variance

EXAMPLES

       The Catmandu::Stat distribution includes a CSV file of crimes reported in Sacramento in
       January 2006, "t/SacramentocrimeJanuary2006.csv", which is also available at
       http://samplecsvs.s3.amazonaws.com/SacramentocrimeJanuary2006.csv

       To view statistics on the fields available in this file, type:

           $ catmandu convert CSV to Stat < t/SacramentocrimeJanuary2006.csv

           | name          | count | zeros | zeros% | min | max | mean | variance | stdev | uniq~ | uniq% | entropy   |
           |---------------|-------|-------|--------|-----|-----|------|----------|-------|-------|-------|-----------|
           | #             | 7584  |       |        |     |     |      |          |       |       |       |           |
           | address       | 7584  | 0     | 0.0    | 1   | 1   | 1    | 0.0      | 0.0   | 5425  | 71.5  | 12.4/12.4 |
           | beat          | 7584  | 0     | 0.0    | 1   | 1   | 1    | 0.0      | 0.0   | 20    | 0.3   | 4.3/12.9  |
           | cdatetime     | 7584  | 0     | 0.0    | 1   | 1   | 1    | 0.0      | 0.0   | 5071  | 66.9  | 12.3/12.3 |
           | crimedescr    | 7584  | 0     | 0.0    | 1   | 1   | 1    | 0.0      | 0.0   | 305   | 4.0   | 5.6/12.6  |
           | district      | 7584  | 0     | 0.0    | 1   | 1   | 1    | 0.0      | 0.0   | 6     | 0.1   | 2.6/12.9  |
           | grid          | 7584  | 0     | 0.0    | 1   | 1   | 1    | 0.0      | 0.0   | 537   | 7.1   | 7.8/9.9   |
           | latitude      | 7584  | 0     | 0.0    | 1   | 1   | 1    | 0.0      | 0.0   | 5288  | 69.7  | 12.4/12.4 |
           | longitude     | 7584  | 0     | 0.0    | 1   | 1   | 1    | 0.0      | 0.0   | 5295  | 69.8  | 12.4/12.4 |
           | ucr_ncic_code | 7584  | 0     | 0.0    | 1   | 1   | 1    | 0.0      | 0.0   | 88    | 1.2   | 4.1/12.9  |

       The file has 7584 rows, and all the fields from "address" to "ucr_ncic_code" contain
       values.  Each field has only one value per row (no arrays are available in the CSV
       file). The "uniq~" column reports about 5425 unique addresses in the CSV file. The
       "district" field has the lowest entropy (2.6): its 6 distinct values are shared among
       many rows.
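
       As a rough illustration of why a field with few, widely shared values has low entropy,
       the following Python sketch computes the textbook Shannon entropy of a list of values.
       It is a generic example, not the computation used by Catmandu::Exporter::Stat; the
       function name "shannon_entropy" and the sample lists are hypothetical.

```python
from collections import Counter
from math import log2


def shannon_entropy(values):
    """Return the Shannon entropy, in bits, of a list of hashable values."""
    counts = Counter(values)
    total = len(values)
    # Each distinct value with count c contributes p * log2(1/p), where p = c / total.
    return sum((c / total) * log2(total / c) for c in counts.values())


# Hypothetical columns: one with values shared among rows, one with all values unique.
few_shared = ["1", "1", "2", "2"]
all_unique = ["a", "b", "c", "d"]

print(shannon_entropy(few_shared))   # 1.0
print(shannon_entropy(all_unique))   # 2.0
```

       A column in which every row holds the same value has an entropy of 0 bits, and a
       column in which every row is distinct reaches the maximum for that number of rows.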

SEE ALSO

       Catmandu, Catmandu::Breaker

AUTHOR

       Patrick Hochstenbach, "<patrick.hochstenbach at ugent.be>"

LICENSE AND COPYRIGHT

       This program is free software; you can redistribute it and/or modify it under the terms of
       either: the GNU General Public License as published by the Free Software Foundation; or
       the Artistic License.

       See http://dev.perl.org/licenses/ for more information.