Provided by: python-mvpa2_2.4.1-1_all

NAME

       pymvpa2-crossval - cross-validation of a learner's performance

SYNOPSIS

       pymvpa2  crossval  [--version] [-h] -i DATASET [DATASET ...] --learner LEARNER [--learner-
       space  LEARNER_SPACE]  --partitioner  PARTITIONER  [--errorfx  ERRORFX]   [--avg-datafold-
       results]        [--balance-training        BALANCE_TRAINING]       [--sampling-repetitions
       SAMPLING_REPETITIONS] [--permutations PERMUTATIONS] [--prob-tail {left,right}]  -o  OUTPUT
       [--hdf5-compression TYPE]

DESCRIPTION

       Cross-validation of a learner's performance

       A learner is repeatedly trained and tested on partitions of an input dataset that are
       generated by a configurable partitioning scheme. Each partition usually consists of a
       training portion and a testing portion. The learner is trained on the training portion
       of the dataset, and its generalization is then tested by comparing its predictions for
       the testing portion with the actual targets.
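
       The procedure described above can be sketched in plain Python. This is only an
       illustration under stated assumptions, not PyMVPA's implementation: the nearest-mean
       "learner", the function names, and the toy data are all hypothetical, and a
       leave-one-chunk-out scheme stands in for the configurable partitioner.

```python
def train_nearest_mean(samples, targets):
    """Hypothetical toy learner: remember the mean feature value per target."""
    sums = {}
    for x, t in zip(samples, targets):
        total, count = sums.get(t, (0.0, 0))
        sums[t] = (total + x, count + 1)
    means = {t: total / count for t, (total, count) in sums.items()}
    # The returned function predicts the target whose mean is closest.
    return lambda x: min(means, key=lambda t: abs(x - means[t]))


def cross_validate(samples, targets, chunks):
    """Leave-one-chunk-out cross-validation; returns one error per fold."""
    errors = []
    for test_chunk in sorted(set(chunks)):
        train = [i for i, c in enumerate(chunks) if c != test_chunk]
        test = [i for i, c in enumerate(chunks) if c == test_chunk]
        # Train on the training portion only.
        predict = train_nearest_mean([samples[i] for i in train],
                                     [targets[i] for i in train])
        # Test by comparing predictions with the held-out targets.
        wrong = sum(predict(samples[i]) != targets[i] for i in test)
        errors.append(wrong / len(test))
    return errors


# Toy data: one feature value per sample, two targets, three chunks.
samples = [0.1, 0.9, 0.2, 1.1, 0.0, 1.0]
targets = ["a", "b", "a", "b", "a", "b"]
chunks = [0, 0, 1, 1, 2, 2]
fold_errors = cross_validate(samples, targets, chunks)
```

       Each entry of fold_errors is the fraction of misclassified samples in one data fold.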

       A summary of the learner's performance is written to STDOUT. Depending on the
       particular setup of the cross-validation analysis, either the learner's raw predictions
       or summary statistics are returned in an output dataset.

       If Monte-Carlo permutation testing is enabled (see --permutations), a second output
       dataset with the corresponding p-values is stored as well (filename suffix '_nullprob').

OPTIONS

       --version
              show program's version and license information and exit

       -h, --help, --help-np
              show  this  help message and exit. --help-np forcefully disables the use of a pager
              for displaying the help.

       -i DATASET [DATASET ...], --input DATASET [DATASET ...]
              path(s) to one or more PyMVPA dataset files. All datasets will  be  merged  into  a
              single dataset (vstack'ed) in order of specification. In some cases this option may
              need to be specified more than once if multiple, but separate, input  datasets  are
              required.

   Options for cross-validation setup:
       --learner LEARNER
              select a learner (trainable node) via its description in the learner warehouse (see
              'info' command for a listing), a colon-separated list of capabilities, or by a file
              path to a Python script that creates a classifier instance (advanced).

       --learner-space LEARNER_SPACE
              name  of  a  sample attribute that defines the model to be learned by a learner. By
              default this is an attribute named 'targets'.

       --partitioner PARTITIONER
              select a data folding scheme. Supported arguments are: 'half' for split-half
              partitioning, 'oddeven' for partitioning into odd and even chunks, 'group-X' where
              X can be any positive integer for partitioning into X groups, 'n-X' where X can be
              any positive integer for leave-X-chunks-out partitioning. By default, partitioners
              operate on dataset chunks that are defined by a 'chunks' sample attribute. The name
              of the "chunking" attribute can be changed by appending a colon and the name of the
              attribute (e.g. 'oddeven:run'). Alternatively, the argument can be a file path to a
              Python script that creates a custom partitioner instance (advanced).
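
              The folding schemes named above can be illustrated with a small sketch. It is
              not PyMVPA's partitioner code: the function names and the (train, test) return
              layout are assumptions made for illustration, and chunk labels are assumed to
              be integers.

```python
from itertools import combinations


def oddeven_folds(chunks):
    """Two folds: train on even chunks and test on odd ones, then the reverse."""
    unique = sorted(set(chunks))
    odd = [c for c in unique if c % 2 == 1]
    even = [c for c in unique if c % 2 == 0]
    return [(even, odd), (odd, even)]


def leave_n_out_folds(chunks, n):
    """One fold per combination of n chunks held out for testing."""
    unique = sorted(set(chunks))
    folds = []
    for test in combinations(unique, n):
        train = [c for c in unique if c not in test]
        folds.append((train, list(test)))
    return folds


# Toy 'chunks' sample attribute: two samples in each of four chunks.
chunks = [0, 0, 1, 1, 2, 2, 3, 3]
oddeven = oddeven_folds(chunks)
leave_one = leave_n_out_folds(chunks, 1)
leave_two = leave_n_out_folds(chunks, 2)
```

              With four chunks, leaving one chunk out yields four folds, while leaving two
              chunks out yields six, one for each pair of chunks.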

       --errorfx ERRORFX
              error  function  to  be  applied  to  the   targets   and   predictions   of   each
              cross-validation  data  fold.  This  can  either be a name of any error function in
              PyMVPA's mvpa2.misc.errorfx module, or a file path to a Python script that  creates
              a custom error function (advanced).
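
              As a rough illustration of what an error function computes, the sketch below
              defines two of them in plain Python. The function names and the argument order
              (predictions first, then targets) are assumptions, and the sketch does not
              show how a script must expose the function to this command.

```python
import math


def mean_mismatch(predictions, targets):
    """Fraction of predictions that differ from their targets."""
    return sum(p != t for p, t in zip(predictions, targets)) / len(targets)


def rms_error(predictions, targets):
    """Root-mean-square difference, for numeric targets."""
    squared = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return math.sqrt(sum(squared) / len(squared))


# One data fold with categorical targets and one with numeric targets.
fold_mismatch = mean_mismatch(["a", "b", "a", "a"], ["a", "b", "b", "a"])
fold_rms = rms_error([1.0, 3.0], [1.0, 1.0])
```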

       --avg-datafold-results
              average result values across the data folds generated by the partitioner, for
              example, to compute the mean prediction error across all folds of a
              cross-validation procedure.

       --balance-training BALANCE_TRAINING
              If enabled, training samples are balanced within each data fold. If the keyword
              'equal' is given as argument, an equal number of random samples is chosen for each
              unique target value. The number of samples per category is then determined by the
              category with the fewest samples in the respective training set. An integer
              argument will cause a corresponding number of samples per category to be randomly
              selected. A floating point number argument (interval [0,1]) indicates what
              fraction of the available samples shall be selected.
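
              The three argument types can be sketched as follows. This is a hypothetical
              helper, not PyMVPA's balancing code. In particular, two choices here are
              assumptions: a fractional amount is interpreted relative to the smallest
              category, and an integer amount is capped at the size of each category.

```python
import random


def balance(targets, amount="equal", seed=0):
    """Return sorted indices of a randomly balanced subset of samples."""
    rng = random.Random(seed)
    by_target = {}
    for index, t in enumerate(targets):
        by_target.setdefault(t, []).append(index)
    smallest = min(len(idx) for idx in by_target.values())
    selected = []
    for idx in by_target.values():
        if amount == "equal":
            n = smallest  # limited by the category with the fewest samples
        elif isinstance(amount, int):
            n = min(amount, len(idx))  # assumption: cap at category size
        else:
            n = int(round(amount * smallest))  # assumption: fraction of smallest
        selected.extend(rng.sample(idx, n))
    return sorted(selected)


# Toy training set: five samples of 'a' and three of 'b'.
targets = ["a"] * 5 + ["b"] * 3
equal_pick = balance(targets, "equal")
two_each = balance(targets, 2)
```

              Repeating the call with different seeds corresponds to drawing several random
              selections per data fold (see --sampling-repetitions).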

       --sampling-repetitions SAMPLING_REPETITIONS
              If training set balancing is enabled, the number of times random sample selection
              is performed for each data fold. Default: 1

       --permutations PERMUTATIONS
              Number of Monte-Carlo permutation runs to be computed for estimating an H0
              distribution for all cross-validation results. Enabling this option will make
              reports of corresponding p-values available in the result summary and output.

       --prob-tail {left,right}
              which tail of the probability distribution to report p-values from when evaluating
              permutation test results. For example, a cross-validation computing mean prediction
              error could report a left-tail p-value for a single-sided test.
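
              The relation between permutation runs and tail selection can be sketched as
              below. This is a simplified illustration, not PyMVPA's estimator: the function
              names are hypothetical, the p-value is a plain fraction without any
              small-sample correction, and the null distribution is built by shuffling
              targets against fixed predictions instead of retraining a learner.

```python
import random


def mean_mismatch(predictions, targets):
    """Fraction of predictions that differ from their targets."""
    return sum(p != t for p, t in zip(predictions, targets)) / len(targets)


def permutation_pvalue(observed, null_values, tail="left"):
    """Fraction of null values at least as extreme as the observed value."""
    if tail == "left":
        extreme = sum(v <= observed for v in null_values)
    else:
        extreme = sum(v >= observed for v in null_values)
    return extreme / len(null_values)


rng = random.Random(0)
targets = ["a", "b"] * 10
predictions = list(targets)  # toy case: every prediction is correct
observed = mean_mismatch(predictions, targets)

# Estimate an H0 distribution of the error from 200 permutation runs.
null_errors = []
for _ in range(200):
    shuffled = targets[:]
    rng.shuffle(shuffled)
    null_errors.append(mean_mismatch(predictions, shuffled))

p_left = permutation_pvalue(observed, null_errors, tail="left")
p_right = permutation_pvalue(observed, null_errors, tail="right")
```

              For an error measure, small values indicate good performance, so the left tail
              is the relevant one. For an accuracy measure, the right tail would be used
              instead.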

   Output options:
       -o OUTPUT, --output OUTPUT
              output  filename ('.hdf5' extension is added automatically if necessary). NOTE: The
              output format is suitable for data exchange between PyMVPA  commands,  but  is  not
              recommended  for  long-term  storage  or  exchange as its specific content may vary
              depending on the  actual  software  environment.  For  long-term  storage  consider
              conversion into other data formats (see 'dump' command).

       --hdf5-compression TYPE
              compression  type  for  HDF5  storage. Available values depend on the specific HDF5
              installation. Typical values are: 'gzip', 'lzf', 'szip', or integers from  1  to  9
              indicating gzip compression levels.

AUTHOR

       Written by Michael Hanke & Yaroslav Halchenko, and numerous other contributors.

COPYRIGHT

       Copyright © 2006-2015 PyMVPA developers

       Permission  is  hereby  granted,  free  of  charge, to any person obtaining a copy of this
       software and associated documentation files (the "Software"),  to  deal  in  the  Software
       without  restriction, including without limitation the rights to use, copy, modify, merge,
       publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons
       to whom the Software is furnished to do so, subject to the following conditions:

       The  above  copyright notice and this permission notice shall be included in all copies or
       substantial portions of the Software.

       THE SOFTWARE IS PROVIDED "AS IS", WITHOUT  WARRANTY  OF  ANY  KIND,  EXPRESS  OR  IMPLIED,
       INCLUDING  BUT  NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR
       PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE  LIABLE
       FOR  ANY  CLAIM,  DAMAGES  OR  OTHER  LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
       OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR  THE  USE  OR  OTHER
       DEALINGS IN THE SOFTWARE.