Provided by: python-cf_1.3.2+dfsg1-4_amd64

NAME

       cfa - create aggregated CF datasets

SYNOPSIS

       cfa [-d dir] [-f format] [-h] [-i] [-n] [-o file] [-u] [-v] [-x] [OPTIONS] INPUTS

DESCRIPTION

       The cfa tool creates and writes to disk the CF fields contained in the files given by the
       INPUTS (which may include directories if the --recursive option is set).

       Accepts CF-netCDF and CFA-netCDF files (or URLs if DAP access is enabled), Met Office (UK)
       PP  files  and Met Office (UK) fields files as input. Multiple input files in a mixture of
       formats may be given and normal UNIX file globbing rules apply.

       Output files are in CF-netCDF or CFA-netCDF format (see the -f option).  Both output types
       are  available in netCDF3 and netCDF4 formats. Note that the netCDF3 formats are generally
       slower to write than the netCDF4 formats, by several orders of  magnitude  if  files  with
       many  data  variables  are  involved. However, not all software can read netCDF4, so it is
       advisable to check before writing in this format.

       By default the contents of each input file are aggregated (i.e. combined) into as few
       multi-dimensional  CF  fields as possible. Unaggregatable fields in the input files may be
       omitted  from  the  output  (see  the  -x  option).  Information  on  which   fields   are
       unaggregatable,  and why, may be displayed (see the --info option). All aggregation may be
       turned off with the -n  option,  in  which  case  all  input  fields  are  output  without
       modification.

       See  the  AGGREGATION  section  for  details on the aggregation process and unaggregatable
       fields.

       By default one output file is created per input file. In this case there is no  inter-file
       aggregation and the contents of each file are aggregated independently of the others.
       Output file names are created by removing the suffix .pp, .nc or .nca, if  there  is  one,
       from  each  input  file name and then adding a new suffix of .nc or .nca for CF-netCDF and
       CFA-netCDF output formats respectively. If the -d option is set then all output files will
       be  written  to the specified directory, otherwise each output file will be written to the
       same directory as its input file.
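
       The naming rule above can be sketched in Python. This is an illustration only, not the
       code used by cfa: the function name output_name is hypothetical, and exact,
       case-sensitive matching of the suffixes is an assumption.

```python
import posixpath

# Suffixes that the text says are removed from input file names.
INPUT_SUFFIXES = (".pp", ".nc", ".nca")


def output_name(input_path, fmt="NETCDF3_CLASSIC", directory=None):
    """Return the output file name described in the text for one input file."""
    head, tail = posixpath.split(input_path)
    root, ext = posixpath.splitext(tail)
    if ext not in INPUT_SUFFIXES:
        root = tail  # no recognised suffix to remove
    # CFA-netCDF output gets .nca, CF-netCDF output gets .nc
    suffix = ".nca" if fmt in ("CFA3", "CFA4") else ".nc"
    target_dir = directory if directory is not None else head
    return posixpath.join(target_dir, root + suffix)


print(output_name("data/run1.pp"))                 # data/run1.nc
print(output_name("data/run1.pp", "CFA4", "out"))  # out/run1.nca
print(output_name("a.nc"))                         # a.nc
```

       In the last call the derived name equals the input name, which is a case in which an
       output file would have the same full name as an input file.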

       Alternatively, all of the input files may be treated collectively as a single  CF  dataset
       and  written  to  a  single  output  file (see the -o option). In this case aggregation is
       attempted within and between the input files.

       An error occurs if an output file has the same full name as any of the input files or  any
       other output file.

AGGREGATION

       Aggregation of input fields into as few multi-dimensional CF fields as possible is carried
       out   according   to   the   aggregation   rules   documented    in    CF    ticket    #78
       (http://kitt.llnl.gov/trac/ticket/78).  For  each  input  field,  the  aggregation process
       creates a structural signature which is essentially a subset of the metadata of the field,
       including  coordinate  metadata  and  other domain information, but which contains no data
       values. The structural signature accounts for the following standard CF properties:

              add_offset,  calendar,   cell_methods,   _FillValue,   flag_masks,   flag_meanings,
              flag_values, missing_value, scale_factor, standard_error_multiplier, standard_name,
              units, valid_max, valid_min, valid_range

       Aggregation is then attempted on  each  group  of  fields  with  the  same,  well  defined
       structural  signature,  and  will  succeed  where  the coordinate data values imply a safe
       combination into a single dataset.

       Not all fields are aggregatable. Unaggregatable fields are those without  a  well  defined
       structural  signature;  or  those  with the same structural signature when at least two of
       them 1) can't be unambiguously distinguished by coordinates or other domain information or
       2) contain coordinate reference fields or ancillary variable fields which themselves can't
       be unambiguously aggregated.
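
       As a rough illustration of grouping by structural signature, the following Python
       sketch builds a signature from the properties listed above only. It is a simplification
       under stated assumptions: fields are modelled as plain dictionaries, the coordinate and
       domain parts of the signature are omitted, and the function names are hypothetical.

```python
from collections import defaultdict

# The standard CF properties named in the text.
SIGNATURE_PROPERTIES = (
    "add_offset", "calendar", "cell_methods", "_FillValue", "flag_masks",
    "flag_meanings", "flag_values", "missing_value", "scale_factor",
    "standard_error_multiplier", "standard_name", "units",
    "valid_max", "valid_min", "valid_range",
)


def signature(properties):
    """Build a hashable signature from the listed properties only."""
    return tuple((name, properties.get(name)) for name in SIGNATURE_PROPERTIES)


def group_by_signature(fields):
    """Group field names whose signatures are identical."""
    groups = defaultdict(list)
    for name, properties in fields.items():
        groups[signature(properties)].append(name)
    return list(groups.values())


fields = {
    "tas_jan": {"standard_name": "air_temperature", "units": "K", "history": "run A"},
    "tas_feb": {"standard_name": "air_temperature", "units": "K", "history": "run B"},
    "pr_jan": {"standard_name": "precipitation_flux", "units": "kg m-2 s-1"},
}
groups = group_by_signature(fields)
print(groups)  # [['tas_jan', 'tas_feb'], ['pr_jan']]
```

       The history property differs between the two temperature fields but is not part of the
       signature, so they fall into the same group and are candidates for aggregation.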

EXAMPLES

       Create a new netCDF3 classic file containing the aggregatable fields in all of  the  input
       files:

              cfa -o newfile.nc *.nc

       Create,  in  an existing directory and overwriting any existing files, new netCDF3 classic
       files containing the aggregatable fields in each input file:

              cfa -d directory --overwrite *.pp

       Create a new netCDF4 file containing all fields in all of the input files:

              cfa -f NETCDF4 -o newfile.nc *.nc

       Create a new CFA-netCDF4 file containing all fields in all of the input  files  and  allow
       long names or netCDF variable names to identify fields and their components:

              cfa -i -f CFA4 -o newfile.nc *.nc

OPTIONS

       --axis=property
              Aggregation configuration: Create a new axis for each input field which has the given
              property. If an input field has the property then, prior to aggregation, a new axis
              is  created  with an auxiliary coordinate whose data array is the property's value.
              This allows for the possibility of aggregation along the  new  axis.  The  property
              itself is deleted from that field. No axis is created for input fields which do not
              have the specified property.

              Multiple axes may be created by specifying more than one --axis option.

              For example, if you wish to aggregate an ensemble of  model  experiments  that  are
              distinguished  by  the  source  property,  you  can  use --axis=source to create an
              ensemble axis which has an auxiliary  coordinate  variable  containing  the  source
              property values.
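
              The effect of --axis on a single field can be pictured with the Python sketch
              below. The dictionary layout, the function name add_property_axis and the
              choice to put the new axis first are all assumptions made for illustration;
              they do not describe the internal data model of cfa.

```python
def add_property_axis(field, prop):
    """Return a copy of `field` with `prop` moved onto a new size-1 axis."""
    if prop not in field["properties"]:
        return field  # no axis is created for fields lacking the property
    properties = dict(field["properties"])
    value = properties.pop(prop)  # the property itself is deleted
    auxiliary = dict(field.get("auxiliary", {}))
    auxiliary[prop] = [value]  # auxiliary coordinate holding the property value
    return {
        "properties": properties,
        "shape": (1,) + tuple(field["shape"]),  # assumed position of the new axis
        "auxiliary": auxiliary,
    }


member = {"properties": {"source": "model_A", "units": "K"}, "shape": (12, 73, 96)}
result = add_property_axis(member, "source")
print(result["shape"])      # (1, 12, 73, 96)
print(result["auxiliary"])  # {'source': ['model_A']}
```

              Fields from different ensemble members would then differ only in the value of
              the new auxiliary coordinate, which is what makes aggregation along the new
              axis possible.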

       --cfa_base=[value]
              For  output  CFA-netCDF  files  only. File names referenced by an output CFA-netCDF
              file have relative, as opposed to absolute, paths or URL bases. This may be  useful
              when relocating a CFA-netCDF file together with the datasets referenced by it.

              If  set with no value (--cfa_base=) or the value is empty then file names are given
              relative to the directory or URL base containing the output CFA-netCDF file. If set
              with  a  non-empty value then file names are given relative to the directory or URL
              base described by the value.

              By default, file names within CFA-netCDF files  are  stored  with  absolute  paths.
              Ignored for output files of any other format.
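
              The difference between absolute and relative references can be illustrated
              with the Python standard library. The paths and the function name
              referenced_name are hypothetical, and only local POSIX-style paths (not URL
              bases) are covered.

```python
import posixpath


def referenced_name(absolute_file, base_dir):
    """Express an absolute file name relative to a base directory."""
    return posixpath.relpath(absolute_file, start=base_dir)


# A referenced dataset seen from the directory holding the CFA-netCDF file.
print(referenced_name("/data/archive/tas.nc", "/data/aggregated"))  # ../archive/tas.nc
print(referenced_name("/data/aggregated/sub/a.nc", "/data/aggregated"))  # sub/a.nc
```

              If /data is later moved as a whole, the relative names remain valid whereas
              absolute names would not.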

       --compress=N
              Regulate  the speed and efficiency of compression. Must be an integer between 0 and
              9. By default N is 0, meaning no compression; 1 is the fastest, but has the  lowest
              compression ratio; 9 is the slowest but best compression ratio.

       --contiguous
              Aggregation  configuration: Requires that aggregated fields have adjacent dimension
              coordinate cells which partially overlap or share common boundary  values.  Ignored
              if the dimension coordinates do not have bounds.

       -d dir, --directory=dir
              Specify the output directory for all output files.

       --double
              Write  32-bit  floats  as  64-bit floats and 32-bit integers as 64-bit integers. By
              default, input data types are preserved.

       --equal=property
              Aggregation configuration: Require that an input field may only be aggregated  with
              other fields if they all have the given CF property (standard or non-standard) with
              equal values. Ignored for any input field which does not have this property, or  if
              the property is already accounted for in the structural signature.

              Supersedes  the  behaviour  for  the  given  property  that  may  be implied by the
              --exist_all option.

              Multiple properties may be set by specifying more than one --equal option.

       --equal_all
              Aggregation configuration: Require that an input field may only be aggregated  with
              other  fields  that  have  the  same  set of CF properties (excluding those already
              accounted for in the structural signature) with equal sets of values.

              The behaviour for individual properties may be overridden by the --exist and --ignore
              options.

              For example, to insist that a group of aggregated input fields must all have the
              same CF properties (other than those accounted for in the structural signature)
              with matching values, but allowing the long_name properties to have unequal values,
              you can use --equal_all --exist=long_name

       --exist=property
              Aggregation configuration: Require that an input field may only be aggregated  with
              other fields if they all have the given CF property (standard or non-standard), but
              not requiring the values to be the same. Ignored for any input field which does not
              have  this  property, or if the property is already accounted for in the structural
              signature.

              Supersedes the behaviour for  the  given  property  that  may  be  implied  by  the
              --equal_all option.

              Multiple properties may be set by specifying more than one --exist option.

       --exist_all
              Aggregation  configuration: Require that an input field may only be aggregated with
              other fields that have the same set  of  CF  properties  (excluding  those  already
              accounted  for in the structural signature), but not requiring the values to be the
              same.

              The behaviour for individual properties may be overridden by the --equal and --ignore
              options.

              For  example,  to  insist that a group of aggregated input fields must all have the
              same CF properties (other than those accounted for in  the  structural  signature),
              regardless  of  their values, but also insisting that the long_name properties have
              equal values, you can use --exist_all --equal=long_name

       -f format, --format=format
              Set the format of the output file(s). Valid choices are NETCDF3_CLASSIC,
              NETCDF3_64BIT, NETCDF4 and NETCDF4_CLASSIC for outputting CF-netCDF files in those
              netCDF formats and CFA3 or CFA4 for outputting CFA-netCDF files in
              NETCDF3_CLASSIC or NETCDF4 formats respectively. By default, NETCDF3_CLASSIC is
              assumed.

              Note that the netCDF3 formats are  generally  slower  to  write  than  the  netCDF4
              formats,  by  several  orders  of  magnitude  if files with many data variables are
              involved. However, not all software can read netCDF4, so it is advisable  to  check
              before writing in this format.

       -h, --help
              Display this man page.

       -i, --relaxed_identities
              Aggregation configuration: In the absence of standard names, allow fields and their
              components (such as coordinates) to be identified by their long_name CF  properties
              or else their netCDF file variable names.

       --ignore=property
              Aggregation  configuration:  An  input  field  may  be aggregated with other fields
              regardless of whether or not they have the given  CF  property  (standard  or  non-
              standard)  and regardless of its values. Ignored for any input field which does not
              have this property, or if the property is already accounted for in  the  structural
              signature.

              This is the default behaviour in the absence of all of the --exist, --equal,
              --exist_all and --equal_all options, and supersedes the behaviour for the given
              property that may be implied if any of these options are set.

              Multiple properties may be set by specifying more than one --ignore option.

              For  example,  to  insist that a group of aggregated input fields must all have the
              same CF properties (other than those accounted for  in  the  structural  signature)
              with  the  same  values, but with no restrictions on the existence or values of the
              long_name property you can use --equal_all --ignore=long_name
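
              One possible reading of how the --equal, --exist, --ignore, --equal_all and
              --exist_all options combine is sketched below in Python. It is an interpretation
              of the descriptions in this page, not the implementation: the function name
              may_aggregate is hypothetical, fields are modelled as dictionaries holding only
              the properties that are not part of the structural signature, and a field
              lacking a required property is assumed to be compatible only with other fields
              that also lack it.

```python
def may_aggregate(a, b, equal=(), exist=(), ignore=(),
                  equal_all=False, exist_all=False):
    """Return True if property dicts `a` and `b` satisfy the requested rules."""
    for name in set(a) | set(b):
        # Per-property options take precedence over the *_all options.
        if name in ignore:
            rule = "ignore"
        elif name in equal:
            rule = "equal"
        elif name in exist:
            rule = "exist"
        elif equal_all:
            rule = "equal"
        elif exist_all:
            rule = "exist"
        else:
            rule = "ignore"  # default when no option applies

        if rule == "ignore":
            continue
        if (name in a) != (name in b):
            return False  # only one of the fields has the property
        if rule == "equal" and a[name] != b[name]:
            return False
    return True


a = {"long_name": "temperature A", "project": "X"}
b = {"long_name": "temperature B", "project": "X"}
c = {"project": "X"}

print(may_aggregate(a, b))                                         # True
print(may_aggregate(a, b, equal_all=True))                         # False
print(may_aggregate(a, b, equal_all=True, exist=("long_name",)))   # True
print(may_aggregate(a, c, exist_all=True))                         # False
print(may_aggregate(a, c, exist_all=True, ignore=("long_name",)))  # True
```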

       --fletcher32
              Activate the Fletcher-32 HDF5 checksum  algorithm  to  detect  compression  errors.
              Ignored if there is no compression (see the --compress option).

       --follow_symlinks
              In combination with --recursive also search for files in directories which are
              symbolic links. Files specified by the INPUTS which are symbolic links are
              always followed. Note that setting --recursive --follow_symlinks can lead to
              infinite recursion if a symbolic link points to a parent directory of itself.

       --ignore_read_error
              Ignore, without failing, any input file which causes an error whilst being read, as
              would be the case for an empty file, unknown file format, etc. By default an  error
              occurs in this case.

       --info=N
              Aggregation configuration: Print information about the aggregation process. If N is
              0 then no information is displayed. If N is 1 or more then display  information  on
              which  fields are unaggregatable, and why. If N is 2 or more then display the field
              structural signatures and, when  there  is  more  than  one  field  with  the  same
              structural  signature,  their canonical first and last coordinate values. If N is 3
              or more then display the complete aggregation metadata of each field.

              By default N is 0.

       --least_sig_digit=N
              Truncate the input field data arrays. For a positive integer N the  precision  that
              is  retained in the compressed data is '10 to the power -N'. For example, if N is 2
              then a precision of 0.01 is retained. In conjunction with compression this produces
              'lossy', but significantly more efficient compression (see the --compress option).
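
              The meaning of "a precision of 10 to the power -N" can be illustrated with a
              small Python sketch. The power-of-two scaling used here is one common way of
              quantising data before compression and is an assumption; this page does not
              state which method cfa applies. The function name quantize is hypothetical.

```python
import math


def quantize(value, n):
    """Round `value` so that an accuracy of at least 10**-n is kept."""
    bits = math.ceil(math.log2(10.0 ** n))  # smallest power of two >= 10**n
    scale = 2.0 ** bits
    return round(value * scale) / scale


x = 3.14159
q = quantize(x, 2)
print(q)               # 3.140625
print(abs(q - x) < 0.01)  # True
```

              Quantised values share long runs of identical low-order bits, which is why they
              compress better than the original values.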

       --ncvar_identities
              Aggregation  configuration: Force fields and their components (such as coordinates)
              to be identified by their netCDF file variable names.

       -n, --no_aggregation
              Aggregation configuration: Do not aggregate fields. Writes the input fields as they
              exist in the input files.

       --no_overlap
              Aggregation  configuration: Requires that aggregated fields have adjacent dimension
              coordinate cells which do not overlap (but they may share common boundary  values).
              Ignored if the dimension coordinates do not have bounds.
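
              The conditions imposed by --no_overlap and by the related --contiguous option
              can be expressed as simple checks on cell bounds. The Python sketch below
              assumes that the cells are given as (lower, upper) pairs sorted in increasing
              order; the function names are hypothetical.

```python
def no_overlap(bounds):
    """True if no cell starts before the previous cell ends."""
    return all(nxt[0] >= prev[1] for prev, nxt in zip(bounds, bounds[1:]))


def contiguous(bounds):
    """True if adjacent cells overlap or share a boundary value (no gaps)."""
    return all(nxt[0] <= prev[1] for prev, nxt in zip(bounds, bounds[1:]))


abutting = [(0, 10), (10, 20), (20, 30)]
gappy = [(0, 10), (15, 20)]
overlapping = [(0, 10), (5, 15)]

print(no_overlap(abutting), contiguous(abutting))        # True True
print(no_overlap(gappy), contiguous(gappy))              # True False
print(no_overlap(overlapping), contiguous(overlapping))  # False True
```

              Setting both options together would therefore accept only cells that share
              common boundary values.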

       --no_shuffle
              Turn  off  the  HDF5  shuffle  filter,  which  de-interlaces a block of data before
              compression by reordering the  bytes  by  storing  the  first  byte  of  all  of  a
              variable's  values in the chunk contiguously, followed by all the second bytes, and
              so on. By default the filter is applied because if the data array  values  are  not
              all  wildly different, using the filter can make the data more easily compressible.
              Ignored if there is no compression (see the --compress option).
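
              The byte reordering described above can be demonstrated with the Python
              standard library. This sketch imitates the idea of the shuffle filter using
              zlib and is not the HDF5 implementation; the function name shuffle and the
              sample data are assumptions.

```python
import struct
import zlib


def shuffle(raw, itemsize):
    """Store byte 0 of every value contiguously, then byte 1, and so on."""
    return b"".join(raw[i::itemsize] for i in range(itemsize))


# 1000 slowly varying 32-bit integers, little-endian.
values = [1000 + i for i in range(1000)]
raw = struct.pack("<1000i", *values)
shuffled = shuffle(raw, 4)

plain_size = len(zlib.compress(raw, 6))
shuffled_size = len(zlib.compress(shuffled, 6))
print(shuffle(b"abcdABCD", 4))     # b'aAbBcCdD'
print(shuffled_size < plain_size)  # True
```

              Because the high-order bytes of neighbouring values are nearly identical,
              grouping them produces long repeated runs that the compressor handles well.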

       -o file, --outfile=file
              Treat all input files collectively as a single CF dataset. In this case aggregation
              is  attempted within and between the input files and all outputs are written to the
              specified file.

       --overwrite
              Allow pre-existing output files to be overwritten.

       --promote=component
              Promote field components to independent top-level fields. If component is ancillary
              then  ancillary  data fields are promoted. If component is auxiliary then auxiliary
              coordinate variables are promoted. If component is measure then cell measure
              variables  are  promoted.  If  component  is  reference then fields pointed to from
              formula_terms attributes are promoted. If component is  field  then  all  component
              fields are promoted.

              Multiple component types may be promoted by specifying more than one --promote
              option.

              For example, to promote ancillary data fields and cell measure variables to
              independent, top-level fields you can use --promote=ancillary --promote=measure

       --recursive
              Allow  directories  to  be  specified  by  the  INPUTS  and  recursively search the
              directories for actual files to read. Set the --ignore_read_error option to  bypass
              any  unreadable  files  and the --follow_symlinks option to allow directories to be
              symbolic links.

       --reference_datetime=datetime
              Set the reference date-time of time coordinate units to an ISO 8601-like date-time.
              Changing  the  reference  date-time  does not change the absolute date-times of the
              coordinates. Ignored for non-reference  date-time  coordinates.  Some  examples  of
              valid date-times: 1830-12-1, "1830-12-09 2:34:45Z".
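
              The statement that changing the reference date-time leaves the absolute
              date-times unchanged can be checked with a short Python sketch. It assumes
              units of days and the calendar of the Python datetime module, which need not
              match the calendar of a given CF coordinate; the function name rebase is
              hypothetical.

```python
from datetime import datetime, timedelta


def rebase(values_in_days, old_reference, new_reference):
    """Re-express 'days since old_reference' as 'days since new_reference'."""
    shift = (old_reference - new_reference) / timedelta(days=1)
    return [value + shift for value in values_in_days]


old_ref = datetime(1970, 1, 1)
new_ref = datetime(1830, 12, 1)
old_values = [0.0, 1.5, 365.0]
new_values = rebase(old_values, old_ref, new_ref)

# The coordinate values change, but each one still denotes the same instant.
same_instants = all(
    old_ref + timedelta(days=old) == new_ref + timedelta(days=new)
    for old, new in zip(old_values, new_values)
)
print(same_instants)  # True
```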

       --respect_valid
              Aggregation configuration: Take into account the CF properties valid_max, valid_min
              and valid_range during aggregation. By default they are ignored for the purposes of
              aggregation and deleted from any aggregated output CF fields.

       --shared_nc_domain
              Aggregation  configuration:  Match axes between a field and its contained ancillary
              variable and coordinate reference fields via their netCDF dimension names  and  not
              via their domains.

       --single
              Write  64-bit  floats  as  32-bit floats and 64-bit integers as 32-bit integers. By
              default, input data types are preserved.

       --squeeze
              Remove size 1 axes from the output field data arrays. If a size one  axis  has  any
              one dimensional coordinates then these are converted to CF scalar coordinates.

       -u, --relaxed_units
              Aggregation  configuration:  Assume  that  fields  or  their  components  (such  as
              coordinates) with the same standard name (or other identifiers, see the -i  option)
              but  missing units all have equivalent (but unspecified) units, so that aggregation
              may occur. This is the default for Met Office (UK) PP files  and  Met  Office  (UK)
              fields files, but not for other formats.

       --unsqueeze
              Include  size 1 axes in the output field data arrays. If a size one axis has any CF
              scalar coordinates then these are converted to one dimensional coordinates.

       --um_version=version
              For Met Office (UK) PP files and Met Office (UK) fields  files  only,  the  Unified
              Model  (UM)  version  to  be used when decoding the header. Valid versions are, for
              example, 4.2, 6.6.3 and 8.2. The default version is  4.5.  In  general,  the  given
              version is ignored if it can be inferred from the header (which is usually the case
              for files created by the UM at versions 5.3 and later). The exception  to  this  is
              when  the given version has a third element (such as the 3 in 6.6.3), in which case
              any version in the header is ignored. This option is ignored for input files  which
              are not Met Office (UK) PP files or Met Office (UK) fields files.
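
              The precedence between the given version and the version inferred from the
              header can be summarised in a few lines of Python. This is a sketch of the
              rules as stated above; the function name effective_um_version is hypothetical
              and versions are handled as plain strings.

```python
def effective_um_version(given="4.5", header_version=None):
    """Pick the UM version used for decoding, following the rules in the text."""
    # A third element (e.g. the 3 in "6.6.3") makes the given version take priority.
    forced = len(given.split(".")) >= 3
    if forced or header_version is None:
        return given
    return header_version


print(effective_um_version())                # 4.5
print(effective_um_version("4.2", "8.2"))    # 8.2
print(effective_um_version("6.6.3", "8.2"))  # 6.6.3
```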

       --unlimited=axis
              Create an unlimited dimension (a dimension that can be appended to). A dimension is
              identified by either a standard name; one of T, Z, Y, X denoting time, height or
              horizontal axes (as defined by the CF conventions); or the value of an arbitrary CF
              property preceded by the property name and a colon.

              Multiple unlimited axes may be defined by  specifying  more  than  one  --unlimited
              option.  Note,  however,  that  only  netCDF4  formats  support  multiple unlimited
              dimensions. For example, to set the time and Z dimensions to be unlimited you could
              use --unlimited=time --unlimited=Z

              An   example   of   defining   an  axis  by  an  arbitrary  CF  property  could  be
              --unlimited=long_name:pseudo_level
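
              The three ways of identifying a dimension can be told apart as in the Python
              sketch below. The function name parse_axis_spec and the returned tuples are
              hypothetical and serve only to show how the forms differ.

```python
def parse_axis_spec(spec):
    """Classify an --unlimited value as an axis letter, a property or a standard name."""
    if spec in ("T", "Z", "Y", "X"):
        return ("axis", spec)
    if ":" in spec:
        # Property name, a colon, then the property value.
        name, value = spec.split(":", 1)
        return ("property", name, value)
    return ("standard_name", spec)


print(parse_axis_spec("time"))                    # ('standard_name', 'time')
print(parse_axis_spec("Z"))                       # ('axis', 'Z')
print(parse_axis_spec("long_name:pseudo_level"))  # ('property', 'long_name', 'pseudo_level')
```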

       -v, --verbose
              Display a one-line summary of each output CF field.

       -x, --exclude
              Aggregation configuration: Omit unaggregatable fields from the output.  Ignored  if
              the  -n  option  is  set.  See  the  AGGREGATION  section  for the definition of an
              unaggregatable field.

SEE ALSO

       cfdump(1)

LIBRARY

       cf-python library version 1.3.1

BUGS

       Reports of bugs are welcome at http://cfpython.bitbucket.org/

LICENSE

       Open Source Initiative MIT License

AUTHOR

       David Hassell