lunar (1) h5import.1.gz

Provided by: hdf5-tools_1.10.8+repack1-1ubuntu1_amd64 bug

NAME

       h5import - Imports data into an existing or new HDF5 file.

SYNOPSIS

       h5import  infile  -d dim_list [ -p pathname ] [ -t input_class ] [ -s input_size ] [infile
       ...]  -o outfile

       h5import infile -dims dim_list [ -path pathname ] [ -type input_class ] [ -size input_size
       ] [infile ...]  -outfile outfile

       h5import infile -c config_file [infile ...]  -outfile outfile

       h5import -h

       h5import -help

DESCRIPTION

       h5import  converts  data  from  one  or  more ASCII or binary files, infile, into the same
       number of HDF5 datasets in the existing or new HDF5  file,  outfile.  Data  conversion  is
       performed  in  accordance with the user-specified type and storage properties specified in
       in_options.

       The primary objective of h5import is  to  import  floating  point  or  integer  data.  The
       utility's  design  allows  for  future versions that accept ASCII text files and store the
       contents as a compact array  of  one-dimensional  strings,  but  that  capability  is  not
       implemented in HDF5 Release 1.6.

       Input data and options

       Input data can be provided in one of the follwing forms:

       *      As an ASCII, or plain-text, file containing either floating point or integer data

       *      As a binary file containing either 32-bit or 64-bit native floating point data

       *      As  a  binary  file  containing  native integer data, signed or unsigned and 8-bit,
              16-bit, 32-bit, or 64-bit.

       *      As an ASCII, or plain-text,  file  containing  text  data.  (This  feature  is  not
              implemented in HDF5 Release 1.6.)

       Each  input  file,  infile,  contains a single n-dimensional array of values of one of the
       above types expressed in the order of fastest-changing dimensions first.

       Floating point data in an ASCII input file must be expressed in the  fixed  floating  form
       (e.g.,  323.56)  h5import is designed to accept scientific notation (e.g., 3.23E+02) in an
       ASCII, but that is not implemented in HDF5 release 1.6.

       Each input file can be  associated  with  options  specifying  the  datatype  and  storage
       properties.  These  options  can  be  specified  either  as command line arguments or in a
       configuration file. Note that exactly one of these approaches must be used with  a  single
       input file.

       Command  line  arguments,  best  used  with simple input files, can be used to specify the
       class, size, dimensions of the input data and a path identifying the output dataset.

       The recommended means of specifying input data options is in a configuration file; this is
       also  the  only  means  of specifying advanced storage features. See further discussion in
       "The configuration file" below.

       The only required option for input data is dimension sizes; defaults are available for all
       others.

       h5import  will accept up to 30 input files in a single call. Other considerations, such as
       the maximum length of a command line, may impose a more stringent limitation.

       Output data and options:

       The name of the output file is specified following the -o or -output  option  in  outfile.
       The data from each input file is stored as a separate dataset in this output file. outfile
       may be an existing file. If it does not yet exist, h5import will create it.

       Output dataset information and storage properties can be specified  only  by  means  of  a
       configuration file.

       Dataset path
              If the groups in the path leading to the dataset do not exist, h5import will create
              them.  If no group is specified, the dataset will be created as  a  member  of  the
              root  group.  If no dataset name is specified, the default name is dataset1 for the
              first input dataset, dataset2 for the second input dataset, dataset3 for the  third
              input  dataset,  etc.   h5import  does  not overwrite a pre-existing dataset of the
              specified or default name. When an  existing  dataset  of  a  confilcting  name  is
              encountered,  h5import  quits  with  an  error;  the  current  input  file  and any
              subsequent input files are not processed.

       Output type
              Datatype parameters for output data

       Output data class
              Signed or unsigned integer or floating point

       Output data size
              8-, 16-, 32-, or 64-bit integer 32- or 64-bit floating point

       Output architecture
              IEEE, STD, NATIVE (Default), Other  architectures  are  included  in  the  h5import
              design but are not implemented in this release.

       Output byte order
              Little- or big-endian.  Relevant only if output architecture is IEEE, UNIX, or STD;
              fixed for other architectures.

       Dataset layout and storage properties
              Denote how raw data is to be organized on the disk. If none of  the  following  are
              specified, the default configuration is contiguous layout and with no compression.

       Layout Contiguous (Default),  Chunked

       External storage
              Allows  raw  data  to  be  stored  in  a non-HDF5 file or in an external HDF5 file.
              Requires contiguous layout.

       Compressed
              Sets the type of compression and the level to which the dataset must be compressed.
              Requires chunked layout.

       Extendible
              Allows  the  dimensions  of  the dataset increase over time and/or to be unlimited.
              Requires chunked layout.

       Compressed and extendible
              Requires chunked layout.

FILES

       A configuration file is specified with the -c config_file option:
              h5import infile -c config_file [infile -

       The configuration file is an ASCII file and must be  organized  as  "Configuration_Keyword
       Value"  pairs, with one pair on each line. For example, the line indicating that the input
       data class (configuration keyword INPUT-CLASS) is floating point in  a  text  file  (value
       TEXTFP) would appear as follows:
              INPUT-CLASS TEXTFP

       A configuration file may have the following keywords each followed by one of the following
       defined values. One entry for each of the first two keywords, RANK and DIMENSION-SIZES, is
       required; all other keywords are optional.

       RANK rank
              An  integer  specifying the number of dimensions in the dataset.  Example:   RANK 4
              for a 4-dimensional dataset.

       DIMENSION-SIZES dim_sizes
              Sizes of the dataset dimensions. (Required) A string  of  space-separated  integers
              specifying  the sizes of the dimensions in the dataset. The number of sizes in this
              entry must match the value in the RANK entry. The fastest-changing  dimension  must
              be listed first.  Example:   DIMENSION_SIZES 4 3 4 38   for a 38x4x3x4 dataset.

       PATH path
              Path  of the output dataset.  The full HDF5 pathname identifying the output dataset
              relative to the root  group  within  the  output  file.  I.e.,  path  is  a  string
              consisting  of  optional  group  names, each followed by a slash, and ending with a
              dataset name. If the groups in the path do no exist, they will be created. If  PATH
              is  not  specified,  the output dataset is stored as a member of the root group and
              the default dataset name is dataset1 for the first input dataset, dataset2 for  the
              second  input  dataset,  dataset3  for  the  third  input  dataset, etc.  Note that
              h5import does not overwrite a pre-existing dataset  of  the  specified  or  default
              name. When an existing dataset of a confilcting name is encountered, h5import quits
              with an error; the current input file  and  any  subsequent  input  files  are  not
              processed.    Example:  The  configuration  file  entry  "PATH  grp1/grp2/dataset1"
              indicates that the output dataset dataset1 will be written in the group grp2/ which
              is in the group grp1/, a member of the root group in the output file.

       INPUT-CLASS {TEXTIN|TEXTUIN|TEXTFP|TEXTFPE|IN|UIN|FP|STR}
              A string denoting the type of input data.
              TEXTIN Input is signed integer data in an ASCII file.
              TEXTUIN Input is unsigned integer data in an ASCII file.
              TEXTFP  Input  is  floating point data in fixed notation (e.g., 325.34) in an ASCII
              file.
              TEXTFPE Input is floating point data in scientific notation (e.g.,  3.2534E+02)  in
              an ASCII file. (Not implemented in this release.)
              IN Input is signed integer data in a binary file.
              UIN Input is unsigned integer data in a binary file.
              FP Input is floating point data in a binary file. (Default)
              STR  Input  is  character data in an ASCII file. With this value, the configuration
              keywords RANK, DIMENSION-SIZES, OUTPUT-CLASS, OUTPUT-SIZE, OUTPUT-ARCHITECTURE, and
              OUTPUT-BYTE-ORDER will be ignored. (Not implemented in this release.)

       INPUT-SIZE {8|16|32|64}
              An  integer  denoting the size of the input data, in bits.  For signed and unsigned
              integer data (TEXTIN, TEXTUIN, IN, or UIN) any of 8, 16, 32, or  64  may  be  used.
              The default is 32.   For floating point data (TEXTFP, TEXTFPE, or FP), either 32 or
              64 may be specified.  The default is 32.

       OUTPUT-CLASS {IN|UIN|FP|STR}
              A string denoting the type of output data.
              IN Output is signed integer data.  (Default if INPUT-CLASS is IN or TEXTIN)
              UIN Output is unsigned integer data. (Default if INPUT-CLASS is UIN or TEXTUIN)
              FP Output is floating point data. (Default if INPUT-CLASS is not  specified  or  is
              FP, TEXTFP, or TEXTFPE)
              STR  Output  is  character data, to be written as a 1-dimensional array of strings.
              (Default if INPUT-CLASS is STR) (Not implemented in this release.)

       OUTPUT-SIZE {8|16|32|64}
              An integer denoting the size of the output data, in bits.  For signed and  unsigned
              integer  data (IN or UIN), any of the four sizes are valid. The default is the same
              as INPUT-SIZE, else 32.  For floating point data (FP),  either  32  or  64  may  be
              specified.  The default is the same as INPUT-SIZE, else 32.

       OUTPUT-ARCHITECTURE {NATIVE|STD|IEEE|INTEL|CRAY|MIPS|ALPHA|UNIX}
              A  string  denoting  the  type  of output architecture.  See the "Predefined Atomic
              Types" section in the "HDF5 Datatypes" chapter of  the  HDF5  User's  Guide  for  a
              discussion  of  these  architectures.   INTEL,  CRAY, MIPS, ALPHA, and UNIX are not
              implemented in this release.  (Default: NATIVE)

       OUTPUT-BYTE-ORDER {BE|LE}
              A string denoting the output byte order. This  entry  is  ignored  if  the  OUTPUT-
              ARCHITECTURE is not specified or if it is not specified as IEEE, UNIX, or STD.
              BE Big-endian. (Default)
              LE Little-endian.

       The  following  options  are disabled by default, making the default storage properties no
       chunking, no compression, no external storage, and no extensible dimensions.

       CHUNKED-DIMENSION-SIZES chunk_dims
              Dimension sizes of the chunk for chunked output data.  A string of  space-separated
              integers  specifying  the dimension sizes of the chunk for chunked output data. The
              number of dimensions must correspond to the value of RANK.  The  presence  of  this
              field  indicates that the output dataset is to be stored in chunked layout; if this
              configuration field is absent, the dataset will be stored in contiguous layout.

       COMPRESSION-TYPE {GZIP}
              Type of compression to  be  used  with  chunked  storage.  Requires  that  CHUNKED-
              DIMENSION-SIZES  be  specified.   GZIP        is gzip compression. Othe compression
              algorithms are not implemented in this release of h5import.

       COMPRESSION-PARAM [1-9]
              Compression level.  Required if COMPRESSION-TYPE is  specified.   Gzip  compression
              levels:  1  will  result in the fastest compression while 9 will result in the best
              compression ratio.  (Default: 6. The default gzip compression level is 6;  not  all
              compression methods will have a default level.)

       EXTERNAL-STORAGE external_file
              Name of an external file in which to create the output dataset. Cannot be used with
              CHUNKED-DIMENSIONS-SIZES,  COMPRESSION-TYPE,  OR  MAXIMUM-DIMENSIONS.    A   string
              specifying the name of an external file.

       MAXIMUM-DIMENSIONS max_dims
              Maximum   sizes   of  all  dimensions.  Requires  that  CHUNKED-DIMENSION-SIZES  be
              specified.  A string of space-separated integers specifying  the  maximum  size  of
              each  dimension  of  the  output  dataset.  A value of -1 for any dimension implies
              unlimited size for that  particular  dimension.   The  number  of  dimensions  must
              correspond to the value of RANK.

OPTIONS

       -h[elp]
              prints the h5import usage summary

       infile(s)
              Name of the Input file(s).

       -d[ims] dim_list
              Input  data  dimensions.   dim_list  is a string of comma-separated numbers with no
              spaces describing the dimensions of the  input  data.  For  example,  a  50  x  100
              2-dimensional  array  would be specified as -dims 50,100.  Required argument: if no
              configuration file is used, this command-line argument is mandatory.

       -p[athname] pathname
              pathname is a string consisting of one or more strings  separated  by  slashes  (/)
              specifying the path of the dataset in the output file. If the groups in the path do
              no exist, they will be created.  Optional argument: if not specified,  the  default
              path  is  dataset1  for  the  first  input  dataset,  dataset2 for the second input
              dataset, dataset3 for the third input dataset, etc.  h5import does not overwrite  a
              pre-existing  dataset of the specified or default name. When an existing dataset of
              a confilcting name is encountered, h5import quits with an error; the current  input
              file and any subsequent input files are not processed.

       -t[ype] input_class
              input_class  specifies  the class of the input data and determines the class of the
              output data.  Valid values are as  defined  in  the  Keyword/Values  table  in  the
              section  "The  configuration file" above.  Optional argument: if not specified, the
              default value is FP.

       -s[size] input_size
              input_size specifies the size in bits of the input data and determines the size  of
              the  output  data.  Valid values for signed or unsigned integers are 8, 16, 32, and
              64.  Valid values for floating point data are 32 and 64.  Optional argument: if not
              specified, the default value is 32.

       -c config_file
              config_file  specifies  a  configuration  file.   This  argument replaces all other
              arguments except infile and -o outfile

       outfile
              Name of the HDF5 output file.

NOTES

       If the -c config_file option is used with an input file, no other  argument  can  be  used
       with  that  input  file. If the -c config_file option is not used with an input data file,
       the -d dim_list argument (or -dims dim_list) must be  used  and  any  combination  of  the
       remaining options may be used. Any arguments used must appear in exactly the order used in
       the syntax declarations immediately above.

       Note that while only the -dims argument is required, arguments must used in the  order  in
       which they are listed below.

EXAMPLES

       Using command-line arguments:

       This  command  creates a file out1 containing a single 2x3x4 32-bit integer dataset. Since
       no pathname is specified, the dataset is stored in out1 as /dataset1.
              h5import infile -dims 2,3,4 -type TEXTIN -size 32 -o out1

       This command creates a file out2  containing  a  single  a  20x50  64-bit  floating  point
       dataset. The dataset is stored in out2 as /bin1/dset1.
              h5import infile -dims 20,50 -path bin1/dset1 -type FP -size 64 -o out2

       Sample configuration files: The following configuration file specifies the following:

       o      The input data is a 5x2x4 floating point array in an ASCII file.

       o      The  output  dataset will be saved in chunked layout, with chunk dimension sizes of
              2x2x2.

       o      The output datatype will be 64-bit floating point, little-endian, IEEE.

       o      The output dataset will be stored in outfile at /work/h5/pkamat/First-set.

       o      The maximum dimension sizes of the output dataset will be 8x8x(unlimited).

              PATH work/h5/pkamat/First-set
              INPUT-CLASS TEXTFP
              RANK 3
              DIMENSION-SIZES 5 2 4
              OUTPUT-CLASS FP
              OUTPUT-SIZE 64
              OUTPUT-ARCHITECTURE IEEE
              OUTPUT-BYTE-ORDER LE
              CHUNKED-DIMENSION-SIZES 2 2 2
              MAXIMUM-DIMENSIONS 8 8 -1

       The next configuration file specifies the following:

       o      The input data is a 6x3x5x2x4 integer array in a binary file.

       o      The output dataset will be saved in chunked layout, with chunk dimension  sizes  of
              2x2x2x2x2.

       o      The  output  datatype  will  be  32-bit  integer  in  NATIVE  format (as the output
              architecure is not specified).

       o      The output dataset will be compressed using Gzip  compression  with  a  compression
              level of 7.

       o      The output dataset will be stored in outfile at /Second-set.

              PATH Second-set
              INPUT-CLASS IN
              RANK 5
              DIMENSION-SIZES 6 3 5 2 4
              OUTPUT-CLASS IN
              OUTPUT-SIZE 32
              CHUNKED-DIMENSION-SIZES 2 2 2 2 2
              COMPRESSION-TYPE GZIP
              COMPRESSION-PARAM 7

SEE ALSO

       h5dump(1), h5ls(1), h5diff(1), h5repart(1), gif2h5(1), h52gif(1), h5perf(1)

                                                                                      h5import(1)