Ubuntu Manpage: pymvpa2-mkds - create a PyMVPA dataset from various sources

NAME

       pymvpa2-mkds - create a PyMVPA dataset from various sources

SYNOPSIS

       pymvpa2  mkds  [--version]  [-h]  [-i [dataset [dataset ...]]] [--txt-data VALUE [VALUE ...] | --npy-data
       VALUE [VALUE ...] | --mri-data IMAGE [IMAGE ...] | --openfmri-modelbold SPEC SPEC  SPEC  SPEC]  [--add-sa
       VALUE  [VALUE  ...]]  [--add-fa  VALUE  [VALUE ...]] [--add-sa-txt VALUE [VALUE ...]] [--add-fa-txt VALUE
       [VALUE ...]] [--add-sa-attr FILENAME] [--add-sa-npy VALUE [VALUE ...]] [--add-fa-npy VALUE  [VALUE  ...]]
       [--mask IMAGE] [--add-vol-attr ARG ARG] [--add-fsl-mcpar FILENAME] -o OUTPUT [--hdf5-compression TYPE]

DESCRIPTION

       Create a PyMVPA dataset from various sources.

       This  command converts data from various sources, such as text files, NumPy's NPY files, and MR (magnetic
       resonance) images into a PyMVPA dataset that gets stored in HDF5 format. An arbitrary  number  of  sample
       and  feature  attributes  can  be  added  to  a  dataset,  and  individual  attributes  can  be read from
       heterogeneous sources (e.g. they do not have to be all from text files).

       For datasets from MR images this command also supports automatic conversion  of  additional  images  into
       (volumetric)  feature  attributes.  This  can  be useful for describing features with, for example, atlas
       labels.

       COMPOSE ATTRIBUTES ON THE COMMAND LINE

       Options --add-sa and --add-fa can be used to compose dataset attributes directly on the command line.
       The syntax is:

       ... --add-sa <attribute name> <comma-separated values> [DTYPE]

       where  the optional 'DTYPE' is any identifier of a NumPy data type (e.g. 'int', or 'float32'). If no data
       type is specified the attribute values will be strings.

       If only one attribute value is given, it will be copied and assigned to all entries in the dataset.
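
       As a hedged sketch of this syntax, the following builds a small dataset and composes its attributes on
       the command line. The file 'samples.csv', the attribute names, and all values are hypothetical. The ','
       after the file name is the DELIMITER parameter of the text-file options, and the guard only runs the
       command where pymvpa2 is installed.

```shell
# Hypothetical samples: 4 rows (samples) x 3 columns (features).
printf '%s\n' '1,2,3' '4,5,6' '7,8,9' '10,11,12' > samples.csv

if command -v pymvpa2 >/dev/null 2>&1; then
    # targets: one string per sample (no DTYPE given, so values stay strings)
    # chunks:  one value per sample, converted with DTYPE 'int'
    # subject: a single value, copied to all four samples
    # roi:     one value per feature (three columns, three values)
    pymvpa2 mkds \
        --txt-data samples.csv , \
        --add-sa targets face,house,face,house \
        --add-sa chunks 0,0,1,1 int \
        --add-sa subject s01 \
        --add-fa roi v1,v1,v2 \
        -o composed
else
    echo "pymvpa2 not found; command not run"
fi
```

       Values containing spaces or shell metacharacters would need to be quoted so that the whole
       comma-separated list reaches the command as a single argument.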

       LOAD DATA FROM TEXT FILES

       All options for loading data from text files support optional parameters to tweak the conversion:

       ... --add-sa-txt <mandatory values> [DELIMITER [DTYPE [SKIPROWS [COMMENTS]]]]

       where 'DELIMITER' is the string that is used to separate  values  in  the  input  file,  'DTYPE'  is  any
       identifier  of a NumPy data type (e.g. 'int', or 'float32'), 'SKIPROWS' is an integer indicating how many
       lines at the beginning of the respective file shall be ignored, and 'COMMENTS' is a string indicating how
       to-be-ignored comment lines are prefixed in the file.
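
       A minimal sketch of how the optional parameters line up, assuming hypothetical files 'samples.csv' and
       'labels.txt'. Because the parameters are positional, DELIMITER and DTYPE have to be given before
       SKIPROWS and COMMENTS can be set. The guard only runs the command where pymvpa2 is installed.

```shell
# Hypothetical samples: 4 rows (samples) x 3 columns (features).
printf '%s\n' '1,2,3' '4,5,6' '7,8,9' '10,11,12' > samples.csv

# Hypothetical label file: one header line, then one integer per sample,
# with '#'-prefixed comment lines in between.
printf '%s\n' 'label' '# run 1' '1' '2' '# run 2' '1' '2' > labels.txt

if command -v pymvpa2 >/dev/null 2>&1; then
    # For --add-sa-txt: DELIMITER=',' DTYPE=int SKIPROWS=1 (the header) COMMENTS='#'
    pymvpa2 mkds \
        --txt-data samples.csv , \
        --add-sa-txt targets labels.txt , int 1 '#' \
        -o ds_labeled
else
    echo "pymvpa2 not found; command not run"
fi
```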

       LOAD DATA FROM NUMPY NPY FILES

       All options for loading data from NumPy NPY files support an optional parameter:

       ... --add-fa-npy <mandatory values> [MEMMAP]

       where 'MEMMAP' is a flag that determines whether the respective file shall be read by memory-mapping,
       i.e. not read (immediately) into memory. Enable it with one of: 'yes|1|true|enable|on'.
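
       The following sketch shows where the MEMMAP flag goes. The file names 'ds.hdf5' and 'atlas_labels.npy'
       and the attribute name 'atlas' are hypothetical, and the example assumes that attributes can be added
       to a dataset loaded with -i. The command is only run if the tool and both files are present.

```shell
ds=ds.hdf5              # hypothetical existing PyMVPA dataset
npy=atlas_labels.npy    # hypothetical array with one value per feature

if command -v pymvpa2 >/dev/null 2>&1 && [ -f "$ds" ] && [ -f "$npy" ]; then
    # The trailing 'yes' is the MEMMAP flag: memory-map the file
    # instead of reading it into memory immediately.
    pymvpa2 mkds -i "$ds" --add-fa-npy atlas "$npy" yes -o ds_atlas
else
    echo "pymvpa2 or input files not available; command not run"
fi
```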

OPTIONS

--version
show program's version and license information and exit

-h, --help, --help-np
show this help message and exit. --help-np forcefully disables the use of a pager for displaying
the help.

-i [dataset [dataset ...]], --input [dataset [dataset ...]]
path(s) to one or more PyMVPA dataset files. All datasets will be merged into a single dataset
(vstack'ed) in order of specification. In some cases this option may need to be specified more
than once if multiple, but separate, input datasets are required.

Input data sources:
--txt-data VALUE [VALUE ...]
load samples from a text file. The first value is the filename the data will be loaded from.
Additional values modifying the way the data is loaded are described in the section "Load data
from text files".

--npy-data VALUE [VALUE ...]
load samples from a NumPy .npy file. Compressed files (i.e. .npy.gz) are supported as well. The
first value is the filename the data will be loaded from. Additional values modifying the way the
data is loaded are described in the section "Load data from NumPy NPY files".

--mri-data IMAGE [IMAGE ...]
load data from an MR image, such as a NIfTI file. This can either be a single 4D image, or a list
of 3D images, or a combination of both.

--openfmri-modelbold SPEC SPEC SPEC SPEC
load all data associated with a stimulation model in an OpenFMRI-compliant dataset. This option
needs 4 argument values: <path> <model ID> <subj ID> <flavor>. The first value is the base
directory of the dataset. The next two are the (integer) IDs of the desired stimulus model and
subject. The last argument is either a string indicating the data flavor to load, or an empty
string for the default image (bold.nii.gz).

Options for attributes from the command line:
--add-sa VALUE [VALUE ...]
compose a sample attribute from the command line input. The first value is the desired attribute
name, the second value is a comma-separated list (appropriately quoted) of actual attribute
values. An optional third value can be given to specify a data type. Additional information on
defining dataset attributes on the command line is given in the section "Compose attributes on
the command line".

--add-fa VALUE [VALUE ...]
compose a feature attribute from the command line input. The first value is the desired attribute
name, the second value is a comma-separated list (appropriately quoted) of actual attribute
values. An optional third value can be given to specify a data type. Additional information on
defining dataset attributes on the command line is given in the section "Compose attributes on
the command line".

Options for attributes from text files:
--add-sa-txt VALUE [VALUE ...]
load sample attribute from a text file. The first value is the desired attribute name, the second
value is the filename the attribute will be loaded from. Additional values modifying the way the
data is loaded are described in the section "Load data from text files".

--add-fa-txt VALUE [VALUE ...]
load feature attribute from a text file. The first value is the desired attribute name, the second
value is the filename the attribute will be loaded from. Additional values modifying the way the
data is loaded are described in the section "Load data from text files".

--add-sa-attr FILENAME
load sample attribute values from a legacy 'attributes file'. Column data is read as "literal".
Only two-column files ('targets' + 'chunks') without headers are supported. This option allows for
reading attributes files from early PyMVPA versions.

Options for attributes from stored NumPy arrays:
--add-sa-npy VALUE [VALUE ...]
load sample attribute from a NumPy .npy file. Compressed files (i.e. .npy.gz) are supported as
well. The first value is the desired attribute name, the second value is the filename the data
will be loaded from. Additional values modifying the way the data is loaded are described in the
section "Load data from NumPy NPY files".

--add-fa-npy VALUE [VALUE ...]
load feature attribute from a NumPy .npy file. Compressed files (i.e. .npy.gz) are supported as
well. The first value is the desired attribute name, the second value is the filename the data
will be loaded from. Additional values modifying the way the data is loaded are described in the
section "Load data from NumPy NPY files".

Options for input from MR images:
--mask IMAGE
mask image file with the same dimensions as an input data sample. All voxels corresponding to
non-zero mask elements will be permitted into the dataset.

--add-vol-attr ARG ARG
attribute name (1st argument) and image file with the same dimensions as an input data sample (2nd
argument). The image data will be added as a feature attribute under the specified name.

--add-fsl-mcpar FILENAME
6-column motion parameter file in FSL's MCFLIRT format. Six additional sample attributes will be
created: mc_{x,y,z} and mc_rot{1-3}, for translation and rotation estimates respectively.

Output options:
-o OUTPUT, --output OUTPUT
output filename ('.hdf5' extension is added automatically if necessary). NOTE: The output format
is suitable for data exchange between PyMVPA commands, but is not recommended for long-term
storage or exchange as its specific content may vary depending on the actual software environment.
For long-term storage consider conversion into other data formats (see 'dump' command).

--hdf5-compression TYPE
compression type for HDF5 storage. Available values depend on the specific HDF5 installation.
Typical values are: 'gzip', 'lzf', 'szip', or integers from 1 to 9 indicating gzip compression
levels.

EXAMPLES

       Load a 4D MRI image, assign atlas labels to a feature attribute, and attach class labels from a text
       file. The resulting dataset is stored as 'ds.hdf5' in the current directory.

              $ pymvpa2 mkds -o ds --mri-data bold.nii.gz --add-vol-attr area harvox.nii.gz \
                  --add-sa-txt targets labels.txt
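
       A further, hedged example that combines only options documented above: restrict a 4D MRI image to the
       non-zero voxels of a mask, attach motion estimates as sample attributes, and store the result with gzip
       compression as 'ds_masked.hdf5'. All file names are hypothetical, and the command is only run if the
       tool and the input files are present.

```shell
bold=bold.nii.gz          # hypothetical 4D BOLD image
mask=brain_mask.nii.gz    # hypothetical mask with the dimensions of one volume
mcpar=bold_mcf.par        # hypothetical 6-column motion parameter file

if command -v pymvpa2 >/dev/null 2>&1 \
        && [ -f "$bold" ] && [ -f "$mask" ] && [ -f "$mcpar" ]; then
    pymvpa2 mkds \
        --mri-data "$bold" \
        --mask "$mask" \
        --add-fsl-mcpar "$mcpar" \
        --hdf5-compression gzip \
        -o ds_masked
else
    echo "pymvpa2 or input files not available; command not run"
fi
```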

AUTHOR

       Written by Michael Hanke & Yaroslav Halchenko, and numerous other contributors.

COPYRIGHT

       Copyright © 2006-2015 PyMVPA developers

       Permission is hereby granted, free of charge, to any  person  obtaining  a  copy  of  this  software  and
       associated  documentation  files (the "Software"), to deal in the Software without restriction, including
       without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense,  and/or  sell
       copies  of the Software, and to permit persons to whom the Software is furnished to do so, subject to the
       following conditions:

       The above copyright notice and this permission notice shall be included  in  all  copies  or  substantial
       portions of the Software.

       THE  SOFTWARE  IS  PROVIDED  "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT
       LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO
       EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
       IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE  SOFTWARE  OR
       THE USE OR OTHER DEALINGS IN THE SOFTWARE.