Provided by: python-mvpa2_2.4.1-1_all

NAME

       pymvpa2-preproc -  apply preprocessing steps to a PyMVPA dataset

SYNOPSIS

       pymvpa2 preproc [--version] [-h] -i DATASET [DATASET ...] [--chunks CHUNKS_ATTR]
       [--strip-invariant-features] [--poly-detrend DEG] [--detrend-chunks CHUNKS_ATTR]
       [--detrend-coords COORDS_ATTR] [--detrend-regrs ATTR [ATTR ...]]
       [--filter-passband FREQ [FREQ ...]] [--filter-stopband FREQ [FREQ ...]]
       [--sampling-rate FREQ] [--filter-passloss dB] [--filter-stopattenuation dB] [--zscore]
       [--zscore-chunks CHUNKS_ATTR] [--zscore-params PARAM PARAM] -o OUTPUT
       [--hdf5-compression TYPE]

DESCRIPTION

       Preprocess a PyMVPA dataset.

       This command can apply a number of preprocessing steps to a dataset. Currently supported are

       1. Polynomial de-trending

       2. Spectral filtering

       3. Feature-wise Z-scoring

       All  preprocessing  steps are applied in the above order. If a different order is required, preprocessing
       has to be split into two separate command calls.

       POLYNOMIAL DE-TRENDING

       This type of de-trending can be used to regress out arbitrary signals. In addition to polynomials of
       any degree, arbitrary timecourses stored as sample attributes in a dataset can be used as confound
       regressors. Unlike the implementation of spectral filtering, this detrending functionality is also
       applicable to sparse-sampled data with potentially irregular inter-sample intervals.
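
       The idea can be sketched outside of PyMVPA with NumPy. The block below is an illustration under
       stated assumptions, not the PyMVPA implementation: the function name `detrend` and the example
       arrays `times`, `motion`, and `data` are hypothetical. It builds a Legendre polynomial basis over
       the (possibly irregular) sample coordinates, appends one confound regressor, and keeps the
       least-squares residuals of every feature.

```python
import numpy as np
from numpy.polynomial import legendre


def detrend(samples, coords, degree=1, regressors=None):
    """Remove Legendre polynomials up to `degree` (plus optional confound
    regressors) from every feature by ordinary least squares."""
    samples = np.asarray(samples, dtype=float)
    coords = np.asarray(coords, dtype=float)
    # Map the sample coordinates onto [-1, 1], the natural domain of the
    # Legendre polynomials. The coordinates need not be evenly spaced.
    x = 2.0 * (coords - coords.min()) / (coords.max() - coords.min()) - 1.0
    # One design-matrix column per polynomial order 0..degree.
    design = legendre.legvander(x, degree)
    if regressors is not None:
        design = np.column_stack([design, np.asarray(regressors, dtype=float)])
    # Fit all features at once and keep only the residuals.
    beta, *_ = np.linalg.lstsq(design, samples, rcond=None)
    return samples - design @ beta


# Hypothetical data: irregular acquisition times and one motion regressor.
times = np.array([0.0, 2.0, 4.5, 6.0, 9.0, 10.0, 13.0, 14.5])
motion = np.array([0.1, -0.2, 0.4, 0.0, -0.3, 0.2, 0.5, -0.1])
signal = np.array([1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0])
# Two features, each with a baseline, a linear drift, and a confound share.
data = np.column_stack([
    signal + 5.0 + 0.3 * times + 2.0 * motion,
    2.0 * signal - 1.0 - 0.1 * times + 0.5 * motion,
])
cleaned = detrend(data, times, degree=1, regressors=motion)
```

       After the fit, each feature in `cleaned` has zero mean and is uncorrelated with both the linear
       trend and the confound regressor, which is the defining property of least-squares residuals.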

       SPECTRAL FILTERING

       Several options are provided that are used to construct a Butterworth low-, high-, or band-pass
       filter. It is advised to inspect the filtered data carefully, as inappropriate filter settings can
       lead to unintended side effects. Only datasets with a fixed sampling rate are supported. The
       sampling rate must be provided.

OPTIONS

       --version
              show program's version and license information and exit

       -h, --help, --help-np
              show  this  help message and exit. --help-np forcefully disables the use of a pager for displaying
              the help.

       -i DATASET [DATASET ...], --input DATASET [DATASET ...]
              path(s) to one or more PyMVPA dataset files. All datasets will be merged  into  a  single  dataset
              (vstack'ed)  in  order  of  specification. In some cases this option may need to be specified more
              than once if multiple, but separate, input datasets are required.

   Common options for all preprocessing:
       --chunks CHUNKS_ATTR
              shortcut option to enable uniform chunk-wise processing for all relevant preprocessing steps
              (see --zscore-chunks, --detrend-chunks). This global setting can be overridden by
              additionally specifying the corresponding individual "chunk" options.

       --strip-invariant-features
              After all pre-processing steps are done, strip all invariant features from the dataset.
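
              As a rough illustration of what "invariant" means here, the following NumPy sketch drops
              every feature (column) whose value is identical in all samples. The function name
              `strip_invariant_features` and the example array are hypothetical, and the exact criterion
              used by PyMVPA may differ.

```python
import numpy as np


def strip_invariant_features(samples):
    """Return the samples without constant columns, plus the boolean mask
    of the features that were kept."""
    samples = np.asarray(samples)
    # A feature varies if any sample differs from the first sample.
    varies = (samples != samples[0]).any(axis=0)
    return samples[:, varies], varies


# Hypothetical 3-sample, 3-feature dataset; only the first feature varies.
data = np.array([[1.0, 0.0, 3.0],
                 [2.0, 0.0, 3.0],
                 [4.0, 0.0, 3.0]])
stripped, kept = strip_invariant_features(data)
```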

   Options for data detrending:
       --poly-detrend DEG
              Order of the Legendre polynomial to remove from the data. This will remove every polynomial
              up to and including the provided value. For example, 3 will remove 0th, 1st, 2nd, and 3rd
              order polynomials from the data. N.B.: The 0th polynomial is the baseline shift, the 1st is
              the linear trend. If you specify a single int and the `chunks_attr` parameter is not None
              (see --detrend-chunks), then this value is used for each chunk. You can also specify a
              different polyord value for each chunk by providing a list or ndarray of polyord values with
              the length equal to the number of chunks. Constraints: value must be convertible to type
              'int'. [Default: 1]

       --detrend-chunks CHUNKS_ATTR
              If None, the whole dataset is detrended at once. Otherwise, the given samples attribute
              (given by its name) is used to define chunks of the dataset that are processed individually.
              In that case, all the samples within a chunk should be in contiguous order and the chunks
              should be sorted in order from low to high, unless the dataset provides information about
              the coordinate of each sample in the space that should be spanned by the polynomials (see
              --detrend-coords). Constraints: value must be `None`, or value must be a string.
              [Default: None]

       --detrend-coords COORDS_ATTR
              name  of  a samples attribute that is added to the preprocessed dataset storing the coordinates of
              each sample in the space spanned by the polynomials. If an  attribute  of  such  name  is  already
              present  in  the  dataset its values are interpreted as sample coordinates in the space spanned by
              the polynomials.  This can be used to detrend datasets with irregular sample spacing.

       --detrend-regrs ATTR [ATTR ...]
              List of sample attribute names that should be used as additional regressors. An example use  would
              be  to  regress  out  motion  parameters.  Constraints:  value  must  be  `None`, or value must be
              convertible to list(str).  [Default: None]

   Options for spectral filtering:
       --filter-passband FREQ [FREQ ...]
              critical frequencies of a Butterworth filter's pass band. Critical frequencies need to match
              the unit of the specified sampling rate (see: --sampling-rate). For a band-pass filter, the
              low and high frequency cutoffs need to be specified (in this order). For low- and high-pass
              filters, a single cutoff frequency must be provided. The type of filter (low/high-pass) is
              determined from the relation to the stop band frequency (--filter-stopband).

       --filter-stopband FREQ [FREQ ...]
              Analogous setting to --filter-passband for specifying the filter's stop band.

       --sampling-rate FREQ
              sampling rate of the dataset. All frequency specifications need to match the unit of the  sampling
              rate.

       --filter-passloss dB
              maximum loss in the passband (dB). Default: 1 dB

       --filter-stopattenuation dB
              minimum attenuation in the stopband (dB). Default: 30 dB
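
              How these five settings interact can be sketched with the textbook design equations for a
              single-cutoff Butterworth filter. This is an illustration only, not a statement of how
              pymvpa2 constructs its filter: the function `butterworth_spec` and the example frequencies
              are hypothetical. The sketch infers the filter type from the relation between pass band and
              stop band, and computes the lowest order that keeps the pass-band loss and stop-band
              attenuation within the requested limits (1 dB and 30 dB by default, as above).

```python
import math


def butterworth_spec(passband, stopband, sampling_rate,
                     passloss=1.0, stopattenuation=30.0):
    """Infer the filter type and the minimum Butterworth order for a
    low- or high-pass specification (one cutoff per band)."""
    nyquist = sampling_rate / 2.0
    if not (0 < passband < nyquist and 0 < stopband < nyquist):
        raise ValueError("critical frequencies must lie between 0 and Nyquist")
    if passband == stopband:
        raise ValueError("pass band and stop band must differ")
    # A pass band below the stop band keeps the low frequencies.
    kind = "lowpass" if passband < stopband else "highpass"
    # Pre-warp the digital frequencies (bilinear transform).
    warped_pass = math.tan(math.pi * passband / sampling_rate)
    warped_stop = math.tan(math.pi * stopband / sampling_rate)
    if kind == "lowpass":
        ratio = warped_stop / warped_pass
    else:
        ratio = warped_pass / warped_stop
    # Ratio of the allowed power deviations in stop band and pass band.
    selectivity = ((10 ** (0.1 * stopattenuation) - 1)
                   / (10 ** (0.1 * passloss) - 1))
    order = math.ceil(math.log10(selectivity) / (2 * math.log10(ratio)))
    return kind, order


# Hypothetical example: data sampled at 0.5 Hz (one sample every 2 s).
highpass = butterworth_spec(passband=0.01, stopband=0.005, sampling_rate=0.5)
lowpass = butterworth_spec(passband=0.1, stopband=0.2, sampling_rate=0.5)
```

              Note that all frequencies share the unit of the sampling rate, and that moving the stop
              band closer to the pass band raises the required order. SciPy's `scipy.signal.buttord`
              offers a library version of this kind of order selection, including band-pass designs.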

   Options for data normalization:
       --zscore
              perform feature normalization by Z-scoring.

       --zscore-chunks CHUNKS_ATTR
              name   of  a  dataset  sample  attribute  defining  chunks  of  samples  that  shall  be  Z-scored
              independently. By default no chunk-wise normalization is done.

       --zscore-params PARAM PARAM
              define a fixed parameter set (mean, std) for Z-scoring, instead of computing the parameters
              from the actual data.
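
              The three normalization options can be pictured with the following NumPy sketch. It is a
              simplified illustration, not PyMVPA's code: the function `zscore` and its arguments are
              hypothetical, and the use of the population standard deviation is an assumption.

```python
import numpy as np


def zscore(samples, chunks=None, params=None):
    """Z-score each feature (column). With `chunks`, every chunk is
    normalized on its own; with `params=(mean, std)`, the given values
    replace the estimates computed from the data."""
    samples = np.asarray(samples, dtype=float)
    out = np.empty_like(samples)
    if chunks is None:
        # Treat the whole dataset as a single chunk.
        labels = np.zeros(len(samples), dtype=int)
    else:
        labels = np.asarray(chunks)
    for label in np.unique(labels):
        rows = labels == label
        block = samples[rows]
        if params is None:
            mean, std = block.mean(axis=0), block.std(axis=0)
        else:
            mean, std = params
        # Leave invariant features unscaled to avoid dividing by zero.
        std = np.where(np.asarray(std) == 0, 1.0, std)
        out[rows] = (block - mean) / std
    return out


# Hypothetical dataset: four samples, two features, two chunks.
data = np.array([[1.0, 10.0], [3.0, 30.0], [10.0, 5.0], [14.0, 7.0]])
chunks = np.array([0, 0, 1, 1])
per_chunk = zscore(data, chunks=chunks)      # like --zscore-chunks
fixed = zscore(data, params=(2.0, 4.0))      # like --zscore-params 2 4
```

              Chunk-wise normalization removes mean and scale differences between chunks, whereas fixed
              parameters apply the same shift and scale to every sample regardless of the data.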

   Output options:
       -o OUTPUT, --output OUTPUT
              output filename ('.hdf5' extension is added automatically if necessary). NOTE: The  output  format
              is  suitable  for  data  exchange  between  PyMVPA  commands, but is not recommended for long-term
              storage or exchange as its specific content may vary depending on the actual software environment.
              For long-term storage consider conversion into other data formats (see 'dump' command).

       --hdf5-compression TYPE
              compression  type  for  HDF5  storage.  Available values depend on the specific HDF5 installation.
              Typical values are: 'gzip', 'lzf', 'szip', or integers from 1 to  9  indicating  gzip  compression
              levels.

EXAMPLES

       Normalize all features in a dataset by Z-scoring

              $ pymvpa2 preproc --zscore -o ds_preprocessed -i dataset.hdf5

       Perform  Z-scoring  and  quadratic  detrending  of all features, but process all samples sharing a unique
       value of the "chunks" sample attribute individually

              $ pymvpa2 preproc --chunks "chunks" --poly-detrend 2 --zscore -o ds_pp2 -i ds.hdf5

AUTHOR

       Written by Michael Hanke & Yaroslav Halchenko, and numerous other contributors.

       Copyright © 2006-2015 PyMVPA developers

       Permission is hereby granted, free of charge, to any  person  obtaining  a  copy  of  this  software  and
       associated  documentation  files (the "Software"), to deal in the Software without restriction, including
       without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense,  and/or  sell
       copies  of the Software, and to permit persons to whom the Software is furnished to do so, subject to the
       following conditions:

       The above copyright notice and this permission notice shall be included  in  all  copies  or  substantial
       portions of the Software.

       THE  SOFTWARE  IS  PROVIDED  "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT
       LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO
       EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
       IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE  SOFTWARE  OR
       THE USE OR OTHER DEALINGS IN THE SOFTWARE.