Provided by: python-cf_1.3.2+dfsg1-4_amd64

NAME

       cfa - create aggregated CF datasets

SYNOPSIS

       cfa [-d dir] [-f format] [-h] [-i] [-n] [-o file] [-u] [-v] [-x] [OPTIONS] INPUTS

DESCRIPTION

       The cfa tool creates and writes to disk the CF fields contained in the files given by the INPUTS (which
       may include directories if the --recursive option is set).

       Accepts CF-netCDF and CFA-netCDF files (or URLs if DAP access is enabled), Met Office (UK) PP  files  and
       Met  Office  (UK)  fields  files  as input. Multiple input files in a mixture of formats may be given and
       normal UNIX file globbing rules apply.

       Output files are in CF-netCDF or CFA-netCDF format (see the -f option).  Both output types are  available
       in  netCDF3  and  netCDF4  formats.  Note that the netCDF3 formats are generally slower to write than the
       netCDF4 formats, by several orders of magnitude if files with many data variables are involved.  However,
       not all software can read netCDF4, so it is advisable to check before writing in this format.

       By default the contents of each input file are aggregated (i.e. combined) into as few multi-dimensional CF
       fields  as  possible. Unaggregatable fields in the input files may be omitted from the output (see the -x
       option). Information on which fields are unaggregatable, and  why,  may  be  displayed  (see  the  --info
       option).  All aggregation may be turned off with the -n option, in which case all input fields are output
       without modification.

       See the AGGREGATION section for details on the aggregation process and unaggregatable fields.

       By default one output file is created per input file. In this case there is no inter-file aggregation and
       the contents of each file are aggregated independently of the others. Output file names are created by
       removing  the  suffix  .pp, .nc or .nca, if there is one, from each input file name and then adding a new
       suffix of .nc or .nca for CF-netCDF and CFA-netCDF output formats respectively. If the -d option  is  set
       then  all  output  files  will  be written to the specified directory, otherwise each output file will be
       written to the same directory as its input file.
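
       The naming rule above can be sketched in Python as follows. This is an illustration only, not cfa's
       own code: the helper name output_name and the case-sensitive suffix check are assumptions.

```python
import os

# Suffixes that the text above says are removed from input file names.
INPUT_SUFFIXES = (".pp", ".nc", ".nca")


def output_name(input_path, fmt="NETCDF3_CLASSIC", directory=None):
    """Hypothetical helper: derive an output path from an input path.

    fmt is one of the -f choices; directory plays the role of the -d option.
    """
    head, tail = os.path.split(input_path)
    root, ext = os.path.splitext(tail)
    if ext not in INPUT_SUFFIXES:
        root = tail  # no recognised suffix, so nothing is stripped
    # CFA-netCDF outputs get .nca, CF-netCDF outputs get .nc
    new_suffix = ".nca" if fmt in ("CFA3", "CFA4") else ".nc"
    target_dir = directory if directory is not None else head
    return os.path.join(target_dir, root + new_suffix)


print(output_name("data/jan.pp"))                  # same directory as the input
print(output_name("data/jan.pp", directory="out")) # as if -d out were set
print(output_name("jan.nc", fmt="CFA4"))
```

       Under this rule an input named jan.nc maps to jan.nc again when a CF-netCDF format is selected and no
       output directory is given, so the output name coincides with the input name.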

       Alternatively, all of the input files may be treated collectively as a single CF dataset and written to a
       single output file (see the -o option). In this case aggregation is  attempted  within  and  between  the
       input files.

       An  error  occurs  if an output file has the same full name as any of the input files or any other output
       file.

AGGREGATION

       Aggregation of input fields into as few multi-dimensional CF fields as possible is carried out  according
       to  the  aggregation  rules  documented  in CF ticket #78 (http://kitt.llnl.gov/trac/ticket/78). For each
       input field, the aggregation process creates a structural signature which is essentially a subset of  the
       metadata  of the field, including coordinate metadata and other domain information, but which contains no
       data values. The structural signature accounts for the following standard CF properties:

              add_offset,  calendar,   cell_methods,   _FillValue,   flag_masks,   flag_meanings,   flag_values,
              missing_value,   scale_factor,   standard_error_multiplier,   standard_name,   units,   valid_max,
              valid_min, valid_range

       Aggregation is then attempted on each group of fields with the same, well defined  structural  signature,
       and will succeed where the coordinate data values imply a safe combination into a single dataset.

       Not  all  fields  are  aggregatable.  Unaggregatable  fields  are those without a well defined structural
       signature; or those with  the  same  structural  signature  when  at  least  two  of  them  1)  can't  be
       unambiguously distinguished by coordinates or other domain information or 2) contain coordinate reference
       fields or ancillary variable fields which themselves can't be unambiguously aggregated.
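
       The grouping step can be pictured with the following Python sketch. It is a simplification under stated
       assumptions: fields are modelled as plain dictionaries, only a small subset of the properties listed
       above is used, and the coordinate and domain metadata that form part of a real structural signature
       are left out. The names signature and group_by_signature are hypothetical.

```python
from collections import defaultdict

# A small, illustrative subset of the properties named above.
SIGNATURE_PROPERTIES = ("standard_name", "units", "calendar", "cell_methods")


def signature(field):
    """Build a hashable signature from metadata only (no data values)."""
    return tuple((name, field.get(name)) for name in SIGNATURE_PROPERTIES)


def group_by_signature(fields):
    """Collect fields that share a signature; each group is an aggregation candidate."""
    groups = defaultdict(list)
    for field in fields:
        groups[signature(field)].append(field)
    return list(groups.values())


fields = [
    {"standard_name": "air_temperature", "units": "K", "file": "a.nc"},
    {"standard_name": "air_temperature", "units": "K", "file": "b.nc"},
    {"standard_name": "eastward_wind", "units": "m s-1", "file": "a.nc"},
]
for group in group_by_signature(fields):
    print([f["file"] for f in group])
```

       Whether the members of a group can actually be combined still depends on their coordinate data values,
       as described above.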

EXAMPLES

       Create a new netCDF3 classic file containing the aggregatable fields in all of the input files:

              cfa -o newfile.nc *.nc

       Create, in an existing directory and overwriting any existing files, new netCDF3 classic files containing
       the aggregatable fields in each input file:

              cfa -d directory --overwrite *.pp

       Create a new netCDF4 file containing all fields in all of the input files:

              cfa -f NETCDF4 -o newfile.nc *.nc

       Create  a  new  CFA-netCDF4  file containing all fields in all of the input files and allow long names or
       netCDF variable names to identify fields and their components:

              cfa -i -f CFA4 -o newfile.nc *.nc

OPTIONS

       --axis=property
              Aggregation configuration: Create a new axis for each input field which has given property. If  an
              input  field  has the property then, prior to aggregation, a new axis is created with an auxiliary
              coordinate whose data  array  is  the  property's  value.  This  allows  for  the  possibility  of
              aggregation along the new axis. The property itself is deleted from that field. No axis is created
              for input fields which do not have the specified property.

              Multiple axes may be created by specifying more than one --axis option.

              For  example,  if you wish to aggregate an ensemble of model experiments that are distinguished by
              the source property, you can use --axis=source to create an ensemble axis which has  an  auxiliary
              coordinate variable containing the source property values.

       --cfa_base=[value]
              For  output  CFA-netCDF  files  only.  File  names  referenced  by  an output CFA-netCDF file have
              relative, as opposed to absolute, paths or URL  bases.  This  may  be  useful  when  relocating  a
              CFA-netCDF file together with the datasets referenced by it.

              If set with no value (--cfa_base=) or the value is empty then file names are given relative to the
              directory  or  URL  base containing the output CFA-netCDF file. If set with a non-empty value then
              file names are given relative to the directory or URL base described by the value.

              By default, file names within CFA-netCDF files are stored with absolute paths. Ignored for  output
              files of any other format.

       --compress=N
              Regulate the speed and efficiency of compression. Must be an integer between 0 and 9. By default N
              is 0, meaning no compression; 1 is the fastest, but has the lowest compression ratio; 9 is the
              slowest, but has the best compression ratio.

       --contiguous
              Aggregation configuration: Requires that aggregated  fields  have  adjacent  dimension  coordinate
              cells  which  partially  overlap  or  share  common  boundary  values.  Ignored  if  the dimension
              coordinates do not have bounds.

       -d dir, --directory=dir
              Specify the output directory for all output files.

       --double
              Write 32-bit floats as 64-bit floats and 32-bit integers as 64-bit  integers.  By  default,  input
              data types are preserved.

       --equal=property
              Aggregation configuration: Require that an input field may only be aggregated with other fields if
              they  all have the given CF property (standard or non-standard) with equal values. Ignored for any
              input field which does not have this property, or if the property is already accounted for in  the
              structural signature.

              Supersedes the behaviour for the given property that may be implied by the --exist_all option.

              Multiple properties may be set by specifying more than one --equal option.

       --equal_all
              Aggregation  configuration:  Require  that an input field may only be aggregated with other fields
              that have the same set of CF properties (excluding those already accounted for in  the  structural
              signature) with equal sets of values.

              The behaviour for individual properties may be overridden by the --exist and --ignore options.

              For  example,  to  insist  that  a  group  of  aggregated  input  fields must all have the same CF
              properties (other than those accounted for in the structural signature) with matching values,  but
              allowing the long_name properties to have unequal values, you can use --equal_all --exist=long_name

       --exist=property
              Aggregation configuration: Require that an input field may only be aggregated with other fields if
              they all have the given CF property (standard or non-standard), but not requiring the values to be
              the  same.  Ignored  for  any input field which does not have this property, or if the property is
              already accounted for in the structural signature.

              Supersedes the behaviour for the given property that may be implied by the --equal_all option.

              Multiple properties may be set by specifying more than one --exist option.

       --exist_all
              Aggregation configuration: Require that an input field may only be aggregated  with  other  fields
              that  have  the same set of CF properties (excluding those already accounted for in the structural
              signature), but not requiring the values to be the same.

              The behaviour for individual properties may be overridden by the --equal and --ignore options.

              For example, to insist that a group  of  aggregated  input  fields  must  all  have  the  same  CF
              properties  (other  than  those  accounted  for  in the structural signature), regardless of their
              values, but also  insisting  that  the  long_name  properties  have  equal  values,  you  can  use
              --exist_all --equal=long_name

       -f format, --format=format
              Set the format of the output file(s). Valid choices are NETCDF3_CLASSIC, NETCDF3_64BIT, NETCDF4
              and NETCDF4_CLASSIC for outputting CF-netCDF files in those netCDF formats, and CFA3 or CFA4 for
              outputting CFA-netCDF files in NETCDF3_CLASSIC or NETCDF4 formats respectively. By default,
              NETCDF3_CLASSIC is assumed.

              Note that the netCDF3 formats are generally slower to write than the netCDF4 formats,  by  several
              orders  of magnitude if files with many data variables are involved. However, not all software can
              read netCDF4, so it is advisable to check before writing in this format.

       -h, --help
              Display this man page.

       -i, --relaxed_identities
              Aggregation configuration: In the absence of standard names, allow  fields  and  their  components
              (such  as coordinates) to be identified by their long_name CF properties or else their netCDF file
              variable names.

       --ignore=property
              Aggregation configuration: An input field may  be  aggregated  with  other  fields  regardless  of
              whether  or  not  they have the given CF property (standard or non-standard) and regardless of its
              values. Ignored for any input field which does not have this  property,  or  if  the  property  is
              already accounted for in the structural signature.

              This is the default behaviour in the absence of all of the --exist, --equal, --exist_all and
              --equal_all options, and supersedes the behaviour for the given property that may be implied if
              any of these options are set.

              Multiple properties may be set by specifying more than one --ignore option.

              For  example,  to  insist  that  a  group  of  aggregated  input  fields must all have the same CF
              properties (other than those accounted for in the structural signature) with the same values,  but
              with  no restrictions on the existence or values of the long_name property you can use --equal_all
              --ignore=long_name

       --fletcher32
              Activate the Fletcher-32 HDF5 checksum algorithm to detect compression errors. Ignored if there is
              no compression (see the --compress option).

       --follow_symlinks
              In combination with --recursive also search for files in directories  which  resolve  to  symbolic
              links.  Files  specified  by  the  INPUTS  which are symbolic links are always followed. Note that
              setting --recursive --follow_symlinks can lead to infinite recursion if a directory which resolves
              to a symbolic link points to a parent directory of itself.

       --ignore_read_error
              Ignore, without failing, any input file which causes an error whilst being read, as would  be  the
              case for an empty file, unknown file format, etc. By default an error occurs in this case.

       --info=N
              Aggregation  configuration:  Print  information  about  the aggregation process. If N is 0 then no
              information is displayed. If N is  1  or  more  then  display  information  on  which  fields  are
              unaggregatable,  and why. If N is 2 or more then display the field structural signatures and, when
              there is more than one field with the same structural signature, their canonical  first  and  last
              coordinate values. If N is 3 or more then display each field's complete aggregation metadata.

              By default N is 0.

       --least_sig_digit=N
              Truncate  the  input field data arrays. For a positive integer N the precision that is retained in
              the compressed data is '10 to the power -N'. For example, if N is 2 then a precision  of  0.01  is
              retained.  In conjunction with compression this produces 'lossy', but significantly more efficient
              compression (see the --compress option).
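
              As a rough illustration of the retained precision, the Python sketch below rounds values to N
              decimal places. It is an assumption that plain decimal rounding is a fair stand-in: the
              underlying library may quantise differently while still keeping a precision of 10 to the power
              -N. The function names are hypothetical.

```python
def retained_precision(n):
    """Precision kept for a positive integer N, i.e. 10 to the power -N."""
    return 10.0 ** -n


def quantize(values, n):
    """Illustrative stand-in for the truncation: round to N decimal places."""
    return [round(v, n) for v in values]


data = [3.14159, 2.71828]
print(retained_precision(2))
print(quantize(data, 2))
```

              Values rounded in this way contain fewer distinct bit patterns, which is why the subsequent
              compression tends to be more effective.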

       --ncvar_identities
              Aggregation configuration:  Force  fields  and  their  components  (such  as  coordinates)  to  be
              identified by their netCDF file variable names.

       -n, --no_aggregation
              Aggregation  configuration:  Do not aggregate fields. Writes the input fields as they exist in the
              input files.

       --no_overlap
              Aggregation configuration: Requires that aggregated  fields  have  adjacent  dimension  coordinate
              cells  which  do not overlap (but they may share common boundary values). Ignored if the dimension
              coordinates do not have bounds.
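
              The conditions imposed by --contiguous and --no_overlap can be sketched in Python as below. This
              is a minimal illustration, assuming that each cell is a (lower, upper) pair and that the cells are
              already sorted in increasing order; cfa's own test may differ in detail, and both function names
              are hypothetical.

```python
def cells_contiguous(bounds):
    """True if each pair of adjacent cells overlaps or shares a boundary value."""
    return all(nxt[0] <= cur[1] for cur, nxt in zip(bounds, bounds[1:]))


def cells_no_overlap(bounds):
    """True if no pair of adjacent cells overlaps (a shared boundary is allowed)."""
    return all(nxt[0] >= cur[1] for cur, nxt in zip(bounds, bounds[1:]))


touching = [(0, 1), (1, 2)]     # share the boundary value 1
gapped = [(0, 1), (2, 3)]       # gap between 1 and 2
overlapping = [(0, 2), (1, 3)]  # overlap between 1 and 2

for cells in (touching, gapped, overlapping):
    print(cells, cells_contiguous(cells), cells_no_overlap(cells))
```

              Only cells that share a common boundary value satisfy both conditions at once.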

       --no_shuffle
              Turn off the HDF5 shuffle filter, which de-interlaces a block of data before compression by
              reordering its bytes: the first bytes of all of a variable's values in the chunk are stored
              contiguously, followed by all the second bytes, and so on. By default the filter is applied
              because, if the data array values are not all wildly different, using the filter can make the data
              more easily compressible. Ignored if there is no compression (see the --compress option).
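
              The byte reordering can be demonstrated with a short Python sketch. It illustrates the idea only
              and is not the HDF5 implementation; the function names and the choice of little-endian 32-bit
              floats are assumptions.

```python
import struct


def shuffle(values, fmt="<f"):
    """Group byte 0 of every value, then byte 1 of every value, and so on."""
    size = struct.calcsize(fmt)
    raw = b"".join(struct.pack(fmt, v) for v in values)
    return b"".join(raw[k::size] for k in range(size))


def unshuffle(data, size):
    """Invert shuffle() for values that are each `size` bytes wide."""
    n = len(data) // size
    return bytes(data[k * n + i] for i in range(n) for k in range(size))


values = [1.0, 1.5, 2.0]
shuffled = shuffle(values)
print(shuffled.hex())

# Round trip back to the original values
restored = unshuffle(shuffled, 4)
print(list(struct.unpack("<3f", restored)))
```

              For values of similar magnitude the shuffled stream contains long runs of identical bytes, which
              a general-purpose compressor can encode compactly.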

       -o file, --outfile=file
              Treat all input files collectively as a single CF dataset. In this case aggregation  is  attempted
              within and between the input files and all outputs are written to the specified file.

       --overwrite
              Allow pre-existing output files to be overwritten.

       --promote=component
              Promote field components to independent top-level fields. If component is ancillary then ancillary
              data fields are promoted. If component is auxiliary then auxiliary coordinate variables are
              promoted. If component is measure then cell measure variables are promoted. If component is
              reference then fields pointed to from formula_terms attributes are promoted. If component is field
              then all component fields are promoted.

              Multiple component types may be promoted by specifying more than one --promote option.

              For example, to promote ancillary data fields and cell measure variables to independent, top-level
              fields you can use --promote=ancillary --promote=measure

       --recursive
              Allow directories to be specified by the INPUTS and recursively search the directories for  actual
              files  to  read.  Set  the  --ignore_read_error  option  to  bypass  any  unreadable files and the
              --follow_symlinks option to allow directories to be symbolic links.

       --reference_datetime=datetime
              Set the reference date-time of time coordinate units to an ISO 8601-like date-time.  Changing  the
              reference  date-time  does not change the absolute date-times of the coordinates. Ignored for non-
              reference date-time  coordinates.  Some  examples  of  valid  date-times:  1830-12-1,  "1830-12-09
              2:34:45Z".
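
              The statement that the absolute date-times are unchanged can be checked with a small Python
              sketch. It assumes units of days and the standard library's proleptic Gregorian calendar, which
              is only one of the calendars allowed by CF; the function name rebase_days is hypothetical.

```python
from datetime import datetime, timedelta


def rebase_days(values, old_ref, new_ref):
    """Re-express 'days since old_ref' values as 'days since new_ref'."""
    shift = (old_ref - new_ref) / timedelta(days=1)
    return [v + shift for v in values]


old_ref = datetime(1970, 1, 1)
new_ref = datetime(1969, 12, 22)  # ten days earlier

old_values = [0.0, 1.5]
new_values = rebase_days(old_values, old_ref, new_ref)
print(new_values)

# The numbers change, but the date-times they denote do not.
for old, new in zip(old_values, new_values):
    print(old_ref + timedelta(days=old), new_ref + timedelta(days=new))
```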

       --respect_valid
              Aggregation   configuration:  Take  into  account  the  CF  properties  valid_max,  valid_min  and
              valid_range during aggregation. By default they are ignored for the purposes  of  aggregation  and
              deleted from any aggregated output CF fields.

       --shared_nc_domain
              Aggregation  configuration:  Match  axes  between a field and its contained ancillary variable and
              coordinate reference fields via their netCDF dimension names and not via their domains.

       --single
              Write 64-bit floats as 32-bit floats and 64-bit integers as 32-bit  integers.  By  default,  input
              data types are preserved.

       --squeeze
              Remove  size  1 axes from the output field data arrays. If a size one axis has any one dimensional
              coordinates then these are converted to CF scalar coordinates.

       -u, --relaxed_units
              Aggregation configuration: Assume that fields or their components (such as coordinates)  with  the
              same standard name (or other identifiers, see the -i option) but missing units all have equivalent
              (but unspecified) units, so that aggregation may occur. This is the default for Met Office (UK) PP
              files and Met Office (UK) fields files, but not for other formats.

       --unsqueeze
              Include  size  1  axes  in  the  output  field  data  arrays. If a size one axis has any CF scalar
              coordinates then these are converted to one dimensional coordinates.

       --um_version=version
              For Met Office (UK) PP files and Met Office (UK) fields files only, the Unified Model (UM) version
              to be used when decoding the header. Valid versions are, for example,  4.2,  6.6.3  and  8.2.  The
              default  version  is  4.5. In general, the given version is ignored if it can be inferred from the
              header (which is usually the case for files created by the UM at  versions  5.3  and  later).  The
              exception to this is when the given version has a third element (such as the 3 in 6.6.3), in which
              case  any  version  in the header is ignored. This option is ignored for input files which are not
              Met Office (UK) PP files or Met Office (UK) fields files.

       --unlimited=axis
              Create an unlimited dimension (a dimension that can be appended to). A dimension is identified by
              either a standard name; one of T, Z, Y, X denoting time, height or horizontal axes (as defined by
              the CF conventions); or the value of an arbitrary CF property preceded by the property name and a
              colon.

              Multiple  unlimited  axes  may  be  defined  by specifying more than one --unlimited option. Note,
              however, that only netCDF4 formats support multiple unlimited dimensions. For example, to set  the
              time and Z dimensions to be unlimited you could use --unlimited=time --unlimited=Z

              An    example    of    defining    an    axis    by    an   arbitrary   CF   property   could   be
              --unlimited=long_name:pseudo_level

       -v, --verbose
              Display a one-line summary of each output CF field.

       -x, --exclude
              Aggregation configuration: Omit unaggregatable fields from the output. Ignored if the -n option is
              set. See the AGGREGATION section for the definition of an unaggregatable field.

SEE ALSO

       cfdump(1)

LIBRARY

       cf-python library version 1.3.1

BUGS

       Reports of bugs are welcome at http://cfpython.bitbucket.org/

LICENSE

       Open Source Initiative MIT License

AUTHOR

       David Hassell

2016-09-09                                            1.3.1                                               CFA(1)