Provided by: libgetdata-doc_0.9.0-2.2_all

NAME

       dirfile-format — the dirfile database format specification file

DESCRIPTION

       The  dirfile  format  specification  fully  specifies  the  raw  and  derived  time streams and auxiliary
       information for a dirfile(5) database.

       The format specification is contained in one or more case-sensitive text files  located  in  the  dirfile
       tree.   Each  file is known as a fragment.  The primary fragment is the file called format located in the
       base dirfile directory.  This file may contain only part of the format specification, and  may  reference
       other  fragments  (using the /INCLUDE directive) containing further format specification.  This inclusion
       mechanism may be nested arbitrarily deep.

       The explicit text encoding of these files is not specified by these Standards, but it must be 7-bit ASCII
       compatible. Examples of acceptable character encodings include all the ISO 8859 character sets
       (Latin-1 through Latin-10, among others), as well as the UTF-8 encoding of Unicode and UCS.

       This  document  primarily  describes  the  latest  version of the Standards (Version 9); differences with
       previous versions are noted where relevant.  A complete list of changes between versions is given in  the
       HISTORY section below.

SYNTAX

       The  format  specification  is  composed  of  field  specification  lines and directive lines, optionally
       separated by blank lines or lines containing only whitespace.   Lines  are  separated  by  the  line-feed
       character  (0x0A).   Unless  escaped (see below), the hash mark (#) is the comment delimiter; the comment
       delimiter, and any text following it to the end of the line, is ignored.

   Tokens
       Both field specification lines and directive lines consist of several  tokens  separated  by  whitespace.
       Whitespace  consists  of  one  or  more  whitespace  characters.  These are: space (0x20), horizontal tab
       (0x09), vertical tab (0x0B), form-feed (0x0C),  and  carriage  return  (0x0D).   The  first  token  of  a
       directive  line  is  always  a  reserved  word.  The first token of a field specification line is never a
       reserved word.  Any amount of whitespace may precede the first token on a line.

       Since tokens are separated by whitespace, to include a whitespace character in a token, it must either
       be escaped by preceding it with a backslash character (\), or be replaced by a character escape sequence (see
       below), or else the token must be enclosed in quotation marks (").  The quotation  marks  themselves  are
       stripped  from  the  token.  The  null-token  (that  is,  the token consisting of zero characters) may be
       specified by a pair of quotation marks with nothing between them ("").  To include  a  literal  quotation
       mark in a token, it must be escaped (\").  Similarly, a hash mark may be included in a token by including
       it  in  a  quoted token or else by escaping it (\#), otherwise the hash mark is understood as the comment
       delimiter.

       It is a syntax error to have a line which contains unmatched  quotation  marks,  or  in  which  the  last
       character is the backslash character.

       Several  characters when escaped by a preceding backslash character are interpreted as special characters
       in tokens.  The character escape sequences are:

              \a     an alert (bell) character (ASCII 0x07 / U+0007)

              \b     a backspace character (ASCII 0x08 / U+0008)

              \e     an escape character (ASCII 0x1B / U+001B)

              \f     a form-feed character (ASCII 0x0C / U+000C)

              \n     a line-feed character (ASCII 0x0A / U+000A)

              \r     a carriage return character (ASCII 0x0D / U+000D)

              \t     a horizontal tab character (ASCII 0x09 / U+0009)

              \v     a vertical tab character (ASCII 0x0B / U+000B)

              \\     a backslash character (ASCII 0x5C / U+005C)

              \ooo   the single byte given by the octal number ooo (1 to 3 octal digits).

              \xhh   the single byte given by the hexadecimal number hh (1 or 2 hexadecimal digits).

              \uhhhhhhh
                     the UTF-8 byte sequence encoding the Unicode code point given  by  the  hexadecimal  number
                     hhhhhhh (1 to 7 hexadecimal digits).

       Any other character which is escaped is interpreted as the character itself (for example, \c is
       interpreted as c; also, as pointed out above, \" and \# are interpreted as simply " and #, without
       their special meanings).

       No token may contain the NULL character (ASCII 0x00 / U+0000).  Furthermore, although support is  present
       to  create  UTF-8 byte sequences, tokens are not required to be valid UTF-8 sequences.  Any byte sequence
       not containing the NULL character forms a valid token.  However, there may be further restrictions on
       allowed characters for a token in a particular situation (for example, when used as a field name).

       Standards  Version  5  and  earlier do not recognise the character escape sequences, nor allow quoting of
       tokens. As a result, they prohibit both whitespace and the comment delimiter from being used in tokens.
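
       The tokenisation rules above can be sketched in Python.  This is an illustration only, not the GetData
       parser: the names tokenize and take_digits are hypothetical, a \x or \u followed by no hexadecimal
       digit is assumed to fall under the "any other character" rule, and the text of a comment is not
       inspected.  Python also rejects \u code points above U+10FFFF and surrogates, which the text above
       does not address.

```python
WHITESPACE = b" \t\v\f\r"
OCTAL = b"01234567"
HEXADECIMAL = b"0123456789abcdefABCDEF"
SIMPLE_ESCAPES = {ord("a"): 0x07, ord("b"): 0x08, ord("e"): 0x1B, ord("f"): 0x0C,
                  ord("n"): 0x0A, ord("r"): 0x0D, ord("t"): 0x09, ord("v"): 0x0B}


def take_digits(line, start, allowed, limit):
    """Return up to `limit` consecutive bytes of line[start:] that are in `allowed`."""
    end = start
    while end < len(line) and end - start < limit and line[end] in allowed:
        end += 1
    return line[start:end]


def tokenize(line):
    """Split one line (bytes, without its line-feed) into a list of bytes tokens."""
    tokens, current = [], bytearray()
    in_token = in_quotes = False
    i = 0
    while i < len(line):
        c = line[i]
        if c == 0x5C:                                   # backslash starts an escape
            if i + 1 == len(line):
                raise ValueError("line ends with a backslash")
            escaped, i, in_token = line[i + 1], i + 2, True
            if escaped in SIMPLE_ESCAPES:
                current.append(SIMPLE_ESCAPES[escaped])
            elif escaped in OCTAL:                      # \ooo, 1 to 3 octal digits
                digits = take_digits(line, i - 1, OCTAL, 3)
                current.append(int(digits, 8))          # values above 0xFF raise ValueError
                i += len(digits) - 1
            elif escaped in b"xu":                      # \xhh or \uhhhhhhh
                digits = take_digits(line, i, HEXADECIMAL, 2 if escaped == 0x78 else 7)
                if not digits:
                    current.append(escaped)             # assumption: plain character
                elif escaped == 0x78:
                    current.append(int(digits, 16))
                else:
                    current += chr(int(digits, 16)).encode("utf-8")
                i += len(digits)
            else:
                current.append(escaped)                 # \c is c, \" is ", \# is #
        elif c == 0x22:                                 # quotation marks are stripped
            in_quotes, in_token, i = not in_quotes, True, i + 1
        elif not in_quotes and c == 0x23:               # unescaped, unquoted hash mark
            break
        elif not in_quotes and c in WHITESPACE:
            if in_token:
                tokens.append(bytes(current))
                current, in_token = bytearray(), False
            i += 1
        else:
            current.append(c)
            in_token, i = True, i + 1
    if in_quotes:
        raise ValueError("unmatched quotation mark")
    if in_token:
        tokens.append(bytes(current))
    if any(0 in token for token in tokens):
        raise ValueError("tokens may not contain the NULL character")
    return tokens
```

       Tokens are kept as bytes because \ooo and \xhh produce single bytes which need not form valid UTF-8.
       For instance, tokenize(rb'field\ name RAW "UINT 8" 1 # comment') yields the four tokens
       "field name", "RAW", "UINT 8" and "1".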

DIRECTIVES

       There are ten directives, each specified by a different reserved word, which  cannot  be  used  as  field
       names  in the dirfile.  As of Standards Version 8, all reserved words start with an initial forward slash
       (/), to distinguish them from field names.  Standards Versions 5, 6, and 7 permitted the omission of  the
       initial  forward  slash, while in Standards Version 4 and earlier, reserved words may not have an initial
       forward slash.  Like the rest of the format specification, directives are case sensitive.

       A number of the directives have fragment scope.  A directive with fragment  scope  only  applies  to  the
       fragment  in which it is present, plus any sub-fragments indicated by the /INCLUDE directive, but only if
       those sub-fragments don't have their own corresponding directive.  Directives which have  fragment  scope
       are:  /ENCODING, /ENDIAN, /FRAMEOFFSET, and /PROTECT.  Because of these scoping rules, different portions
       of the dirfile may have different encodings, endiannesses, frame offsets, or protection levels.

       If a directive with fragment scope appears more than once in a fragment, only the last such directive  is
       honoured,  with  the  exception  that the effect of a directive is not propagated to sub-fragments if the
       directive line appears after the sub-fragment is included.  The scoping rules of the remaining directives
       are discussed below.
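
       The fragment scope rules can be illustrated with a short Python sketch.  It is a reading of the two
       paragraphs above, not GetData code: the function resolve_scoped and the in-memory layout (a dict
       mapping each fragment name to an ordered list of keyword and argument pairs) are hypothetical.

```python
def resolve_scoped(fragments, name, directive, inherited=None, result=None):
    """Return {fragment name: effective argument} for one fragment-scope directive."""
    if result is None:
        result = {}
    current = inherited
    for keyword, argument in fragments[name]:
        if keyword == directive:
            # A later directive in the same fragment replaces an earlier one.
            current = argument
        elif keyword == "/INCLUDE":
            # The sub-fragment inherits only what was in effect when it was included.
            resolve_scoped(fragments, argument, directive, current, result)
    # For the fragment itself, the last directive seen is the one honoured.
    result[name] = current
    return result


fragments = {
    "format": [("/ENDIAN", "big"), ("/INCLUDE", "a"),
               ("/ENDIAN", "little"), ("/INCLUDE", "b"), ("/INCLUDE", "c")],
    "a": [],                      # included before the second /ENDIAN
    "b": [],                      # included after it
    "c": [("/ENDIAN", "big")],    # has its own directive
}
print(resolve_scoped(fragments, "format", "/ENDIAN"))
```

       In this example fragment a ends up big-endian, b little-endian, c big-endian, and the primary
       fragment itself little-endian.  A value of None stands for "no directive in effect", in which case
       behaviour is whatever the implementation chooses.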

       /ALIAS The /ALIAS directive defines an alternate name  for  a  field  defined  elsewhere  in  the  format
              specification  (called  the  "target").   Aliases  may  not be used as the parent field in a /META
              directive, but are in most other ways indistinguishable  from  the  target's  original,  canonical
              name.   Aliases  may  be  chained  (that  is, the target name appearing in an /ALIAS directive may
              itself be an alias).  In this case, the new alias is another name for  the  target's  own  target.
              Just as there is no requirement that the input fields of a derived field exist, it is not an error
              for the target of an alias to not exist.  Syntax is:

              /ALIAS <name> <target>

              A metafield alias may be defined using the <parent-field>/<alias-name> syntax for name in the /ALIAS
              directive.  No restriction is placed on target; specifically, a metafield alias may target a top-
              level field, or a metafield with a different parent; conversely, a top-level alias may target a
              metafield.

              A metafield alias may never appear as the parent part of a metafield field code, even if it refers
              to a top-level field.  That is, given the valid format:

              field1 RAW UINT8 1
              field1/meta CONST FLOAT64 0.0
              field2 RAW UINT8 1
              /ALIAS field2/alias field1

              the metafield field1/meta may not be referred to as field2/alias/meta, even though field2/alias is
              a valid field code referring to field1.

              The  /ALIAS directive has no scope: it is processed immediately.  It appeared in Standards Version
              9.

       /ENCODING
              The /ENCODING directive specifies the encoding scheme used to encode binary files in the  dirfile.
              The  encoding  scheme may be one of the predefined names listed below, which are described in more
              detail in dirfile-encoding(5), or any other site-specific encoding scheme.  The predefined  scheme
              names are:

              none   The dirfile is unencoded.

              bzip2  The dirfile is compressed using the bzip2 compression scheme.

              gzip   The dirfile is compressed using the gzip compression scheme.

              lzma   The dirfile is compressed using the LZMA compression scheme.

              slim   The dirfile is compressed using the slim compression scheme.

              sie    The dirfile is sample-index encoded (a variant of run-length encoding).

              text   The dirfile is text encoded.

              zzip   The dirfile is compressed and encapsulated using the zzip compression scheme.

              zzslim The  dirfile  is  compressed  and  encapsulated  using  a  combination of the zzip and slim
                     compression schemes.

              Implementations should fail gracefully when  encountering  an  unknown  encoding  scheme.   If  no
              encoding scheme is specified, behaviour is implementation dependent.  Syntax is:

                     /ENCODING <scheme> [<enc-datum>]

              The enc-datum token provides additional data for certain encoding schemes; see dirfile-encoding(5)
              for details.  The form of enc-datum is not specified.

              The  /ENCODING  directive has fragment scope.  It appeared in Standards Version 6.  The predefined
              schemes sie, zzip, and zzslim, and the optional enc-datum token, appeared in Standards Version  9;
              the  predefined scheme lzma appeared in Standards Version 7; all other predefined schemes appeared
              in Standards Version 6.

       /ENDIAN
              The /ENDIAN directive specifies the endianness of the raw data in the database.  The assumed endi‐
              anness of raw data in dirfiles which omit this directive is implementation dependent.  Syntax is:

                     /ENDIAN ( big | little ) [ arm ]

              where the "arm" token should be included if double precision floating point data are stored in the
              ARM middle-endian format.  The /ENDIAN directive has fragment scope.   It  appeared  in  Standards
              Version 5.  The optional arm token appeared in Standards Version 8.

       /FRAMEOFFSET
              The  /FRAMEOFFSET directive specifies the frame number of the first frame for which data exists in
              binary files associated with RAW fields.  Syntax is:

                     /FRAMEOFFSET <integer>

              The /FRAMEOFFSET directive has fragment scope.  It appeared in Standards Version 1.

       /HIDDEN
              The /HIDDEN directive indicates that the specified field name is hidden.  The difference (if  any)
              between  a field name which is hidden and one that is not is implementation dependent.  Hiddenness
              is not inherited by metafields of the specified field.  Hiddenness applies to the name, not the
              field itself; it does not hide all aliases of the field-name, and if field-name is an alias, the
              alias is hidden, not its target.  Syntax is:

                     /HIDDEN <field-name>

              A /HIDDEN directive must appear after the specification of field-name (which occurs either in a
              field specification line, or an /ALIAS directive, or a /META directive) in the same fragment.

              The /HIDDEN directive has no scope: it is processed immediately.  It appeared in Standards Version
              9.

       /INCLUDE
              The  /INCLUDE  directive specifies another file (called a fragment) to parse for additional format
              specification for the dirfile.  The inclusion is processed immediately, before the  fragment  con‐
              taining  the  /INCLUDE directive (the parent fragment) is parsed further.  RAW fields specified in
              the included fragment are located in the directory containing the fragment file, and  not  in  the
              directory  containing  the parent fragment, and the binary file encoding may be different for each
              fragment.  The fragment may be specified either with an absolute path, or else a path relative  to
              the directory containing the parent fragment.

              The  /INCLUDE  directive may optionally specify a prefix and/or suffix to apply to field names de‐
              fined in the included fragment.  If present, affixes are applied  to  all  field-names  (including
              aliases)  defined  in  the included fragment and any fragments it further includes.  Affixes nest,
              with the affixes of the deepest inclusion innermost.  Affixes are not applied to the names of  bi‐
              nary files associated with RAW fields.  Syntax is:

                     /INCLUDE <file> [<prefix> [<suffix>]]

              To specify only a suffix, use the null-token ("") as prefix.  The /INCLUDE directive has no scope:
              it  is processed immediately.  It appeared in Standards Version 3.  The optional prefix and suffix
              appeared in Standards Version 9.
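
              As a sketch of how nested affixes combine for a top-level field name, consider the following
              Python function.  The name apply_affixes and the list-of-pairs representation are hypothetical;
              the handling of metafield names is not covered here.

```python
def apply_affixes(name, affixes):
    """Apply nested /INCLUDE affixes to a top-level field name.

    `affixes` lists one (prefix, suffix) pair per /INCLUDE level, outermost
    first; an omitted prefix or suffix is given as the empty string.
    """
    # The deepest inclusion is applied first so that its affixes end up innermost.
    for prefix, suffix in reversed(affixes):
        name = prefix + name + suffix
    return name


# Outer fragment: /INCLUDE sub1 A_ _Z      Inner fragment: /INCLUDE sub2 b_ _y
print(apply_affixes("field", [("A_", "_Z"), ("b_", "_y")]))   # A_b_field_y_Z
```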

       /META  The /META directive specifies a metafield attached to a particular parent field.  The field  meta‐
              data  may  be of any allowed type except RAW.  Metafields are retrieved in exactly the same way as
              regular field data, but the field code specified consists of the parent and metafield names joined
              with a forward slash:

                     <parent-field>/<meta-field>

              META fields may not be specified before their parent field has been.  Syntax is:

                     /META <parent-field> {field specification line}

              The <parent-field> code may not be an alias.  As an illustration of this concept,

                     /META pfield meta CONST FLOAT64 3.291882

              provides a scalar metadatum called meta with value 3.291882 attached to the  field  pfield.   This
              particular metafield may be referred to by the field code "pfield/meta".  Note that different par‐
              ent fields may have metafields with the same name, since all references to metafields must include
              the parent field name.  Metafields may not themselves have further sub-metafields.

              As  an  alternative  to the /META directive, starting with Standards Version 7, a metafield may be
              specified by a standard field specification line, using

                     <parent-field>/<meta-field>

              as the field name.  That is, the above example metafield could have also been specified as:

                     pfield/meta CONST FLOAT64 3.291882

              The /META directive has no scope: it is processed immediately.  It appeared in  Standards  Version
              6.

       /PROTECT
              The  /PROTECT directive specifies the advisory protection level of the current fragment and of the
              RAW fields defined therein.  The protection level indicates whether writing to the fragment or to
              the binary data on disk is permitted.  Syntax is:

                     /PROTECT <level>

              Four advisory protection levels are defined:

              none   No  protection at all: data and metadata may be freely changed.  This is the default, if no
                     /PROTECT directive is present.

              format The dirfile metadata is protected from change, but RAW data on disk may be modified.

              data   The RAW data on disk is protected from change, but metadata may be modified.

              all    Both metadata and data on disk are protected from change.

              The /PROTECT directive has fragment scope.  It appeared in Standards Version 6.

       /REFERENCE
              The /REFERENCE directive specifies the name of the field to use as the dirfile's  reference  field
              (see  dirfile(5)).   If  no  /REFERENCE directive is specified, the first RAW field encountered is
              used as the reference field.  The /REFERENCE directive must specify a RAW field.  Syntax is:

                     /REFERENCE <field-code>

              The /REFERENCE directive has global scope: if multiple /REFERENCE directives appear in the dirfile
              metadata, only the last such is honoured.  It appeared in Standards Version 6.

       /VERSION
              The /VERSION directive specifies the particular version of the  Dirfile  Standards  to  which  the
              dirfile  format  specification conforms.  This directive should occur before any version dependent
              syntax is encountered.  As of Standards Version 6, no such syntax exists, and  this  directive  is
              provided primarily to ease forward compatibility.  Syntax is:

                     /VERSION <integer>

              The  /VERSION directive has immediate scope: its effect is immediate, and it applies only to meta‐
              data below it, including and propagating downwards to sub-fragments after the directive.

              In Standards Version 8 and earlier, its effect also propagates upwards back to  the  parent  frag‐
              ment, and affects subsequent metadata.  Starting with Standards Version 9, this no longer happens.
              As  a  result,  a  /VERSION directive which indicates a version of 9 or later never propagates up‐
              wards; additionally, /VERSION directives found in subfragments included in a Version  9  or  later
              fragment  aren't  propagated upwards into that fragment, regardless of the Version of the subfrag‐
              ments.  The /VERSION directive appeared in Standards Version 5.

FIELD SPECIFICATION LINES

       Any line which does not start with a reserved word is assumed to be a field specification line.  A  field
       specification line consists of at least two tokens.  The first token is the field name.  The second token
       is the field type.  Subsequent tokens are field parameters.  The meaning and number of these parameters
       depend on the field type specified.

   Field Names
       The first token in a field specification line is the field name.  The field name consists of one or  more
       characters, excluding both ASCII control characters (the bytes 0x01 through 0x1F), and the characters

              &    /    ;    <    >    |    .

       which are reserved (but see below for the use of / to specify metafields).  The full stop (.)  is allowed
       in Standards Version 5 and earlier.  The ampersand, semicolon, less than, greater than, and vertical line
       (& ; < > |) are allowed in Standards Version 4 and earlier.  Furthermore, due to the lack of an escape or
       quoting  mechanism  (see  Tokens above), Standards Version 5 and earlier also prohibit whitespace and the
       comment delimiter (#) in field names.

       The field name may not be INDEX, which is a special, implicit field which contains the integer frame  in‐
       dex.   Standards Version 5 and earlier also prohibit FILEFRAM, which was an alias for INDEX.  Field names
       are case sensitive.  Standards Version 3 and 4 restrict field names to 50 characters. Standards Version 2
       and earlier restrict field names to 16 characters. Additionally, the filesystem may put  restrictions  on
       the length and acceptable characters of a RAW field name, regardless of Standards Version.

       Starting  in Standards Version 7, if the field name beginning a field specification line contains exactly
       one / character, the line is assumed to specify a metafield.  See the /META directive above  for  further
       details.  A field name may not contain more than one /.
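
       A minimal Python check of the Standards Version 9 naming rules might look as follows.  The function
       name is hypothetical, the check covers only the rules stated in this section, and filesystem
       restrictions on RAW field names are not considered.

```python
RESERVED_BYTES = frozenset(b"&/;<>|.")


def is_valid_field_name(name):
    """Check a field name (bytes) against the Version 9 rules described above."""
    if not name or name == b"INDEX":
        return False
    # At most one '/' is allowed; it separates a parent from its metafield.
    parent, slash, meta = name.partition(b"/")
    parts = [parent, meta] if slash else [parent]
    for part in parts:
        if not part:
            return False
        # Reject NULL and control bytes (below 0x20) and the reserved characters.
        # A second '/' is caught here because '/' is itself reserved.
        if any(byte < 0x20 or byte in RESERVED_BYTES for byte in part):
            return False
    return True


print(is_valid_field_name(b"pfield/meta"))   # True
print(is_valid_field_name(b"a/b/c"))         # False
```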

   Field Types
       There are fifteen field types.  Of these, twelve are of vector type (BIT, DIVIDE, LINCOM, LINTERP, MPLEX,
       MULTIPLY,  PHASE, POLYNOM, RAW, RECIP, SBIT, and WINDOW) and three are of scalar type (CONST, CARRAY, and
       STRING).  The eleven vector field types other than RAW fields are also called derived fields, since  they
       derive their value from one or more input fields.

       Five  of  these derived fields (DIVIDE, LINCOM, MPLEX, MULTIPLY, and WINDOW) may have more than one input
       field.  In situations where these input fields have differing sample rates, the sample rate  of  the  de‐
       rived  field is the same as the sample rate of the first (left-most) input field specified.  Furthermore,
       the input fields are synchronised by aligning them on frame boundaries, assuming equally-spaced  sampling
       throughout a frame, and using the last sample of each input field which did not occur after the sample of
       the derived field being computed.  That is, if the first and second input fields have sample rates s1 and
       s2,  the  derived  field  also has sample rate s1 and, for every sample of the derived field, n, the n'th
       sample of the first field is used (since they have the same sample rate by definition),  and  the  sample
       number used of the second field, m, is computed as:

              m = floor((n * s2) / s1).
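
       The same computation, written as a small Python sketch (the function name is hypothetical) using
       integer floor division:

```python
def input_sample_index(n, s1, s2):
    """Sample number of the second input used for sample n of the derived field.

    s1 and s2 are the samples per frame of the first and second input fields.
    """
    return (n * s2) // s1


# First input at 4 samples per frame, second at 2: each slow sample is used twice.
print([input_sample_index(n, 4, 2) for n in range(8)])   # [0, 0, 1, 1, 2, 2, 3, 3]
# First input at 2 samples per frame, second at 5: some fast samples are skipped.
print([input_sample_index(n, 2, 5) for n in range(4)])   # [0, 2, 5, 7]
```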

       Starting in Standards Version 6, certain scalar field parameters in the field specifications may be spec‐
       ified  using  CONST  or CARRAY fields, instead of literal values.  A list of parameters for which this is
       allowed is given below in the Field Parameters section.

       The possible field types are:

       BIT    The BIT vector field type extracts one or more bits out of an input vector field  as  an  unsigned
              number.  Syntax is:

                     <fieldname> BIT <input> <first-bit> [<num-bits>]

              which  specifies  fieldname  to be the value of bits first-bit through first-bit+num-bits-1 of the
              input vector field input, when input is converted from its native type to an (endianness  correct‐
              ed)  unsigned 64-bit integer.  If num-bits is omitted, it is assumed to be 1.  The SBIT field type
              is a signed version of this field type.  The optional num-bits  parameter  appeared  in  Standards
              Version 1.
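
              A Python sketch of the extraction for a single integer sample follows.  It assumes that bit 0 is
              the least significant bit, which the text above does not state, and that conversion to an
              unsigned 64-bit integer wraps negative values modulo 2**64.  The function name is hypothetical.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def bit_field(value, first_bit, num_bits=1):
    """Extract num_bits bits starting at first_bit as an unsigned number."""
    as_u64 = value & MASK64                       # convert to unsigned 64-bit
    return (as_u64 >> first_bit) & ((1 << num_bits) - 1)


print(bit_field(0b10110100, 2, 3))   # bits 2 through 4 are 0b101, so 5
print(bit_field(0x80, 7))            # num_bits defaults to 1, so 1
```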

       CARRAY The  CARRAY  scalar  field type is a list of constants fully specified in the format specification
              metadata.  Syntax is:

                     <fieldname> CARRAY <type> <value0> <value1> <value2> ...

              where type may be any supported native data type (see the description of the RAW  field  type  be‐
              low), and value0, value1, &c. are the values of successive elements in the scalar list interpreted
              as  indicated  by type.  No limit is placed on the number of elements in a CARRAY.  (Note: despite
              being multivalued, this is not considered a vector field since the elements of the CARRAY are  not
              indexed by frames.)  It appeared in Standards Version 8.

       CONST  The  CONST  scalar  field type is a constant fully specified in the format specification metadata.
              Syntax is:

                     <fieldname> CONST <type> <value>

              where type may be any supported native data type (see the description of the RAW  field  type  be‐
              low),  and  value is the numerical value of the constant interpreted as indicated by type.  It ap‐
              peared in Standards Version 6.

       DIVIDE The DIVIDE vector field type is the quotient of two vector fields.  Syntax is:

                     <fieldname> DIVIDE <field1> <field2>

              The derived field is computed as:

                     fieldname = field1 / field2.

              It was introduced in Standards Version 8.

       LINCOM The LINCOM vector field type is the linear combination of one, two or three input  vector  fields.
              Syntax is:

                     <fieldname> LINCOM [<n>] <field1> <a1> <b1> [<field2> <a2> <b2> [<field3> <a3> <b3>]]

              where  n, if present, indicates the number of input vector fields (1, 2, or 3).  The derived field
              is computed as:

                     fieldname = (a1 * field1 + b1) + (a2 * field2 + b2) + (a3 * field3 + b3)

              with the field2 and field3 terms included only if specified.

              If n is not specified, the number of fields is determined by looking at the supplied parameters.
              Since it is possible to create a field code which is identical to a literal number, the third
              token on the line is assumed to be n if the entire token can be parsed as a literal number using
              the rules outlined in strtod(3).  That is, if the field code specifying field1 could be mistaken
              for a literal number, n must be specified to prevent ambiguity.  In Standards Version 6 and
              earlier, n is mandatory.
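
              The LINCOM formula can be sketched in Python as below.  For simplicity the sketch assumes that
              all input vectors share one sample rate and are given as plain lists; the function name and
              calling convention are hypothetical.

```python
def lincom(*terms):
    """Evaluate a LINCOM field from one to three (samples, a, b) terms."""
    if not 1 <= len(terms) <= 3:
        raise ValueError("LINCOM takes one, two or three input fields")
    length = min(len(samples) for samples, _, _ in terms)
    # Each output sample is the sum of (a * field + b) over the supplied terms.
    return [sum(a * samples[i] + b for samples, a, b in terms)
            for i in range(length)]


print(lincom(([1, 2, 3], 2.0, 1.0), ([10, 20, 30], 0.5, 0.0)))   # [8.0, 15.0, 22.0]
```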

       LINTERP
              The LINTERP vector field type specifies a table look up based on another vector field.  Syntax is:

                     <fieldname> LINTERP <input> <table>

              where input is the input vector field for the table lookup, and table is the path  to  the  lookup
              table file for the field.  If this path is relative, it is assumed to be relative to the directory
              containing the fragment defining this field.  The lookup table file is an ASCII text file with two
              whitespace  separated  columns  of  x  and y values.  Values are linearly interpolated between the
              points specified in the lookup table.
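
              A Python sketch of the table lookup is given below.  The function names are hypothetical.  The
              behaviour for inputs outside the range of the table is not described above; the sketch assumes
              linear extrapolation from the nearest end segment.  It also assumes a table of at least two
              points with distinct x values.

```python
from bisect import bisect_right


def load_table(text):
    """Parse two whitespace-separated columns of x and y values, sorted by x."""
    rows = []
    for line in text.splitlines():
        columns = line.split()
        if columns:
            rows.append((float(columns[0]), float(columns[1])))
    return sorted(rows)


def linterp(x, table):
    """Linearly interpolate y at x between the two neighbouring table points."""
    xs = [point[0] for point in table]
    # Pick the segment containing x; clamp so that out-of-range inputs
    # use the first or last segment (an assumption, see above).
    i = min(max(bisect_right(xs, x), 1), len(table) - 1)
    (x0, y0), (x1, y1) = table[i - 1], table[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


table = load_table("0 0\n10 100\n20 400\n")
print(linterp(5, table))    # 50.0
print(linterp(15, table))   # 250.0
```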

       MPLEX  The MPLEX vector field type permits the multiplexing of several low sample rate fields into a sin‐
              gle data field of higher sample rate.  Syntax is:

                     <fieldname> MPLEX <input> <index> <count> [<period>]

              where input is the input vector containing the multiplexed fields, index is the vector containing
              the multiplex index, count is the value of the multiplex index when the computed field is stored
              in input, and period, if present and non-zero, is the number of samples between successive
              occurrences of the value count in the index vector.  A period of zero (or, equivalently, its omission)
              indicates that either the value count is not equally spaced in the index vector, or else that the
              spacing is unknown.  Both count and period are integers, and period may not be negative.

              At every sample n, the derived field is computed as:

                     fieldname[n] = (index[n] == count) ? input[n] : fieldname[n - 1]

              The index vector is converted to an integer type for comparison.  The value of the  derived  field
              before the first sample where index equals count is implementation dependent.

              The  values of count and period place no restrictions on values contained in index.  Specifically,
              particular values of index (including count) need not be equally spaced (neither by period nor any
              other spacing); index need not ever take on the value count (in which case the value  of  the  en‐
              tirety of the derived field is implementation dependent).  Different MPLEX field definitions which
              use the same index vector may specify different periods.  MPLEX appeared in Standards Version 9.
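
              The per-sample rule can be sketched in Python as follows.  The sketch assumes input and index
              have the same sample rate and are given as lists.  Since the value before the first match is
              implementation dependent, the hypothetical parameter initial lets the caller choose it.  The
              period parameter is not needed for the computation and is ignored here.

```python
def mplex(input_samples, index, count, initial=None):
    """Demultiplex one channel: hold the last input seen while index == count."""
    output, last = [], initial
    for value, idx in zip(input_samples, index):
        if int(idx) == count:        # the index is compared as an integer
            last = value
        output.append(last)
    return output


print(mplex([10, 11, 12, 13, 14, 15], [0, 1, 2, 0, 1, 2], 1))
# [None, 11, 11, 11, 14, 14]
```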

       MULTIPLY
              The MULTIPLY vector field type is the product of two vector fields.  Syntax is:

                     <fieldname> MULTIPLY <field1> <field2>

              The derived field is computed as:

                     fieldname = field1 * field2.

              It appeared in Standards Version 2.

       PHASE  The PHASE vector field type shifts an input vector field by the specified number of samples.  Syn‐
              tax is:

                     <fieldname> PHASE <input> <shift>

              which specifies fieldname to be the input vector field, input, shifted by shift samples.  A
              positive shift indicates a forward shift, towards the end-of-field.  The result of shifting past the
              beginning- or end-of-field is implementation dependent.  PHASE appeared in Standards Version 4.

       POLYNOM
              The  POLYNOM  vector  field  type  specifies a polynomial function of a single input vector field.
              Syntax is:

                     <fieldname> POLYNOM <input> <a0> <a1> [<a2> [<a3> [<a4> [<a5>]]]]

              where <input> is the input field code, and the order of the computed polynomial is  determined  by
              how many co-efficients are present in the specification.  The derived field is computed as:

                     fieldname = a0 + a1 * input + a2 * input**2 + a3 * input**3 + a4 * input**4 + a5 * input**5

              where ** is the element-wise exponentiation operator, and the higher order terms are computed only
              if the corresponding co-efficients ai are specified.  POLYNOM appeared in Standards Version 7.
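
              A Python sketch of the evaluation follows.  It uses Horner's method, which is an implementation
              choice made here and not something the Standards prescribe; the function name is hypothetical.

```python
def polynom(samples, coefficients):
    """Evaluate a0 + a1*x + ... for every sample x; coefficients are [a0, a1, ...]."""
    if not 2 <= len(coefficients) <= 6:
        raise ValueError("POLYNOM takes between two and six co-efficients")
    result = []
    for x in samples:
        accumulator = 0
        # Horner's method: start from the highest-order co-efficient.
        for a in reversed(coefficients):
            accumulator = accumulator * x + a
        result.append(accumulator)
    return result


print(polynom([0, 1, 2], [1, 2, 3]))   # 1 + 2x + 3x**2 gives [1, 6, 17]
```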

       RAW    The RAW vector field type specifies raw time streams on disk.  In this case, the field name should
              correspond to the name of the file containing the time stream.  Syntax is:

                     <fieldname> RAW <type> <sample-rate>

              where sample-rate is the number of samples per dirfile frame for the time stream and type is a to‐
              ken specifying the native data format type:

                     UINT8  unsigned 8-bit integer

                     INT8   signed (two's complement) 8-bit integer

                     UINT16 unsigned 16-bit integer

                     INT16  signed (two's complement) 16-bit integer

                     UINT32 unsigned 32-bit integer

                     INT32  signed (two's complement) 32-bit integer

                     UINT64 unsigned 64-bit integer

                     INT64  signed (two's complement) 64-bit integer

                     FLOAT32
                            IEEE-754 standard 32-bit single precision floating point number

                     FLOAT64
                            IEEE-754 standard 64-bit double precision floating point number

                     COMPLEX64
                            a  64-bit complex number consisting of two IEEE-754 standard 32-bit single precision
                            floating point numbers representing the real and imaginary parts of the complex num‐
                            ber (Standards Version 7 and later)

                     COMPLEX128
                            a 128-bit complex number consisting of two IEEE-754 standard 64-bit double precision
                            floating point numbers representing the real and imaginary parts of the complex num‐
                            ber (Standards Version 7 and later).

              For more information on the storage of complex valued data, see dirfile(5).  Two  additional  type
              names  exist: FLOAT is equivalent to FLOAT32, and DOUBLE is equivalent to FLOAT64.  Standards Ver‐
              sion 9 deprecates these two aliases, but still allows them.
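
              As an illustration only, the type tokens above can be mapped to sample sizes in a few lines of
              Python.  The names RAW_TYPES, ALIASES, sample_size and frame_size are hypothetical helpers, not
              part of any dirfile library; the sketch only computes sizes and makes no claim about how complex
              data are laid out on disk (see dirfile(5) for that).

```python
import struct

# struct format characters for each RAW type token, using standard sizes.
# Complex types are counted as two floating point numbers of the stated width.
RAW_TYPES = {
    "UINT8": "B", "INT8": "b",
    "UINT16": "H", "INT16": "h",
    "UINT32": "I", "INT32": "i",
    "UINT64": "Q", "INT64": "q",
    "FLOAT32": "f", "FLOAT64": "d",
    "COMPLEX64": "ff", "COMPLEX128": "dd",
}

# Deprecated (but still allowed) aliases in Standards Version 9.
ALIASES = {"FLOAT": "FLOAT32", "DOUBLE": "FLOAT64"}


def sample_size(type_name):
    """Return the size in bytes of one sample of the given RAW type."""
    name = ALIASES.get(type_name, type_name)
    # "<" selects standard sizes with no padding.
    return struct.calcsize("<" + RAW_TYPES[name])


def frame_size(type_name, sample_rate):
    """Return the number of bytes one dirfile frame occupies for a RAW field."""
    return sample_size(type_name) * sample_rate
```

              Under these assumptions, a field declared as "data RAW UINT32 20" would occupy
              frame_size("UINT32", 20) bytes per frame in an unencoded file.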

              All these type names (except those for complex data, which came later) were  introduced  in  Stan‐
              dards  Version  5.   Earlier  Standards  Versions  specified data types with single character type
              aliases:

                     c      UINT8

                     u      UINT16

                     s      INT16

                     U      UINT32

                     i, S   INT32

                     f      FLOAT32

                     d      FLOAT64

              Types INT8, UINT64, and INT64 were introduced in Standards Version 5, and COMPLEX64 and COMPLEX128
              in Standards Version 7, so no single character type aliases exist for these types.  The single
              character type aliases were deprecated in Standards Version 5 and removed in Standards Version 8.

       RECIP  The RECIP vector field type computes the reciprocal of a single input vector field.  Syntax is:

                     <fieldname> RECIP <input> <dividend>

              where input is the input field code and dividend is a scalar quantity.  The derived field is
              computed as:

                     fieldname = dividend / input.

              RECIP appeared in Standards Version 8.
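
              The computation can be sketched sample by sample as follows.  The function name recip is
              hypothetical, and the behaviour for a zero-valued input sample is not defined by the text above;
              this sketch simply lets Python raise ZeroDivisionError in that case.

```python
def recip(dividend, samples):
    """Return dividend / x for every sample x of the input vector."""
    # Works for real or complex dividends and samples alike.
    return [dividend / x for x in samples]
```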

       SBIT   The SBIT vector field type extracts one or more bits out of an input vector field as a signed num‐
              ber.  Syntax is:

                     <fieldname> SBIT <input> <first-bit> [<bits>]

              which specifies fieldname to be the value of bits first-bit through first-bit+bits-1 of the  input
              vector  field  input,  when  input  is  converted from its native type to a (endianness corrected)
              signed 64-bit integer.  If bits is omitted, it is assumed to be 1.  The BIT field type is  an  un‐
              signed version of this field type.  SBIT appeared in Standards Version 7.
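
              A minimal sketch of the extraction, assuming the extracted bits are interpreted as a two's
              complement number (the text above says only "signed").  The function names sbit and bit are
              hypothetical and operate on a single sample.

```python
MASK64 = (1 << 64) - 1


def bit(value, first_bit, bits=1):
    """Unsigned extraction: bits first_bit .. first_bit+bits-1 of value."""
    value &= MASK64  # view the sample as a 64-bit pattern
    return (value >> first_bit) & ((1 << bits) - 1)


def sbit(value, first_bit, bits=1):
    """Signed extraction: same bits as bit(), sign-extended.

    Assumption: the top extracted bit acts as a two's complement sign bit.
    """
    field = bit(value, first_bit, bits)
    sign = 1 << (bits - 1)
    return field - (1 << bits) if field & sign else field
```

              With these definitions, bit(0b1110, 1, 3) yields 7 while sbit(0b1110, 1, 3) yields -1.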

       STRING The  STRING  scalar  field type is a character string fully specified in the format file metadata.
              Syntax is:

                     <fieldname> STRING <value>

              where value is the string value of the field.  Note that value is  a  single  token.   To  include
              whitespace in the string, enclose value in quotation marks ("), or else escape the whitespace with
              the backslash character (\).  STRING appeared in Standards Version 6.

       WINDOW The  WINDOW vector field type isolates a portion of an input vector based on a comparison.  Syntax
              is:

                     <fieldname> WINDOW <input> <check> <op> <threshold>

              where input is the vector containing the data to extract, check is the vector on which to test the
              comparison, threshold is the value against which check is compared, and op is one of the following
              tokens indicating the particular comparison performed:

                     EQ     data are extracted where check, converted to a 64-bit signed integer, equals thresh‐
                            old,

                     GE     data are extracted where check, converted to  a  64-bit  floating-point  number,  is
                            greater than or equal to threshold,

                     GT     data  are  extracted  where  check,  converted to a 64-bit floating-point number, is
                            strictly greater than threshold,

                     LE     data are extracted where check, converted to a 64-bit floating-point number, is less
                            than or equal to threshold,

                     LT     data are extracted where check, converted to  a  64-bit  floating-point  number,  is
                            strictly less than threshold,

                     NE     data  are  extracted where check, converted to a 64-bit signed integer, is not equal
                            to threshold,

                     SET    data are extracted where at least one bit set in threshold is also set in check,
                            when check is converted to a 64-bit unsigned integer,

                     CLR    data are extracted where at least one bit set in threshold is not set in check, when
                            check is converted to a 64-bit unsigned integer.

              The  storage  type  of threshold depends on the operator, and follows the interpretation of check.
              It may never be complex valued.

              Outside the region extracted, the value of the derived field is implementation dependent.

              Note: with the EQ operator, this derived field type is very similar to the MPLEX field type above.
              The primary difference is that MPLEX mandates the value of the derived field outside the extracted
              region, while WINDOW does not.  WINDOW appeared in Standards Version 9.
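
              The comparison operators can be sketched as a table of predicates.  This is an illustration under
              stated assumptions: input and check are taken to have the same number of samples, and the fill
              argument is a hypothetical placeholder for the implementation-dependent value outside the
              extracted region.

```python
def _u64(x):
    """Interpret x as a 64-bit unsigned integer."""
    return int(x) & 0xFFFFFFFFFFFFFFFF


# One predicate per WINDOW operator; c is a check sample, t is the threshold.
OPS = {
    "EQ": lambda c, t: int(c) == int(t),
    "NE": lambda c, t: int(c) != int(t),
    "GE": lambda c, t: float(c) >= float(t),
    "GT": lambda c, t: float(c) > float(t),
    "LE": lambda c, t: float(c) <= float(t),
    "LT": lambda c, t: float(c) < float(t),
    "SET": lambda c, t: (_u64(c) & _u64(t)) != 0,
    "CLR": lambda c, t: (~_u64(c) & _u64(t)) != 0,
}


def window(data, check, op, threshold, fill=None):
    """Keep data[i] where the comparison on check[i] holds, else fill."""
    test = OPS[op]
    return [d if test(c, threshold) else fill for d, c in zip(data, check)]
```

              For example, window([10, 20, 30], [1, 5, 9], "GE", 5) keeps the last two samples and replaces the
              first with the placeholder.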

   Field Parameters
       All input vector field parameters should be field codes (see below).  Additionally, the scalar field
       parameters listed may be either literal numbers, the field code of a CONST field containing the value,
       or the field code of a CARRAY followed by a left angle bracket (<), then a non-negative integer used as
       the CARRAY element index, then a right angle bracket (>), that is:

              fieldcode<n>

       If  the  angle  brackets  and element index are omitted from a CARRAY field code used as a parameter, the
       first element in the field (index zero) is assumed.

       Field parameters which may be specified using a scalar field code are:

              BIT, SBIT
                     first-bit, bits

              LINCOM any of the mi, or bi

              MPLEX  count, max

              PHASE  shift

              POLYNOM
                     any of the ai

              RAW    sample-rate

              RECIP  dividend

              WINDOW threshold

       Since it is possible to create a field code which is identical to a literal number, a  parameter  is  as‐
       sumed  to be the field code of a scalar field only if the entire token cannot be parsed as a literal num‐
       ber using the rules outlined in strtod(3).  For example, a CONST field whose field code  consists  solely
       of digits can never be used as a parameter in a field specification line.
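
       The precedence rule (literal number first, field code otherwise) can be sketched as below.  This is an
       approximation: Python's float() stands in for strtod(3) and differs from it in some edge cases, complex
       and hexadecimal literals are ignored here, and the function name parse_scalar_parameter is hypothetical.

```python
import re

# A field code followed by <n>, where n is a non-negative integer.
CARRAY_INDEX = re.compile(r"^(?P<code>.+)<(?P<index>\d+)>$")


def parse_scalar_parameter(token):
    """Classify a scalar parameter token as a literal or a field code."""
    # float() accepts digit-group underscores, which strtod(3) does not,
    # so tokens containing "_" are never treated as literals here.
    if "_" not in token:
        try:
            return ("literal", float(token))
        except ValueError:
            pass
    match = CARRAY_INDEX.match(token)
    if match:
        return ("field", match.group("code"), int(match.group("index")))
    # No element index given: for a CARRAY this means element zero.
    return ("field", token, None)
```

       Under this sketch the token 1234 is always a literal, so a CONST field named 1234 cannot be reached,
       while coeffs<2> refers to the third element of a CARRAY named coeffs.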

       Starting in Standards Version 7, a literal complex number is specified as two real (floating point)
       numbers separated by a semicolon (;) with no intervening whitespace.  So, for example, the tokens

              1;0  0;1  4;0  0;5  9.313e2;74.1

       represent,  respectively,  the  real unit, the imaginary unit, the real number four, the imaginary number
       5i, and the complex number 931.3 + 74.1i.  Because the semicolon character cannot be used in field names,
       a complex valued literal can never be mistaken for a field code.  This allows, among  other  things,  the
       composition of complex valued fields from purely real input fields.  For example, a complex valued field,
       z,  may be created from a real valued field re, representing the real part of the complex number, and the
       real valued field im, representing the imaginary part of the complex number, with  the  following  LINCOM
       specification:

              z LINCOM re 1 0 im 0;1 0
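
       The effect of this specification can be sketched numerically.  The helper names parse_complex_literal
       and lincom2 are hypothetical, and the sketch assumes the usual LINCOM form in which each input is
       scaled by its mi and offset by its bi before the terms are summed.

```python
def parse_complex_literal(token):
    """Parse "re;im" (or a plain real number) into a Python complex."""
    real, sep, imag = token.partition(";")
    if not sep:
        return complex(float(real), 0.0)
    return complex(float(real), float(imag))


def lincom2(in1, m1, b1, in2, m2, b2):
    """Two-input linear combination: in1*m1 + b1 + in2*m2 + b2 per sample."""
    return [x * m1 + b1 + y * m2 + b2 for x, y in zip(in1, in2)]


# Equivalent of: z LINCOM re 1 0 im 0;1 0
re_part = [1.0, 2.0]
im_part = [3.0, 4.0]
z = lincom2(
    re_part, parse_complex_literal("1"), parse_complex_literal("0"),
    im_part, parse_complex_literal("0;1"), parse_complex_literal("0"),
)
```

       Here each element of z has the corresponding element of re_part as its real part and of im_part as its
       imaginary part.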

       Starting in Standards Version 9, in addition to decimal notation, literal integer parameters may be
       specified as hexadecimal numbers, by prefixing the number (after an optional '+' or '-' sign) with 0x or
       0X, or as octal numbers, by prefixing the number with 0, as described in strtol(3).  Similarly, floating
       point literal numbers (both purely real ones and components of complex literals) may be specified in
       hexadecimal by prefixing them with 0x or 0X, and using p or P as the binary exponent prefix, as described
       in the C99 standard.  Both uppercase and lowercase hexadecimal digits may be used.  Wherever a literal
       floating point number may appear, the tokens INF or INFINITY, optionally preceded by a '+' or '-' sign,
       and NAN, optionally immediately followed by '(', then a sequence of characters, then ')', all
       disregarding case, are interpreted as the special floating point values explained in strtod(3).
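
       These literal forms can be approximated in Python as follows.  The function names are hypothetical, and
       the sketch is not a faithful reimplementation of strtol(3) or strtod(3); in particular, a sign before
       NAN is accepted here because strtod(3) allows it.

```python
import math
import re

# Optional sign, then a hexadecimal, octal (leading 0) or decimal integer.
_INT = re.compile(r"^([+-]?)(0[xX][0-9a-fA-F]+|0[0-7]*|[1-9][0-9]*)$")
# NAN, optionally followed by a parenthesised character sequence.
_NAN = re.compile(r"^[+-]?nan(\(\w*\))?$", re.IGNORECASE)


def parse_int_literal(token):
    """Parse a decimal, octal (0...) or hexadecimal (0x...) integer."""
    match = _INT.match(token)
    if not match:
        raise ValueError("not an integer literal: %r" % token)
    sign, digits = match.groups()
    if digits[:2].lower() == "0x":
        value = int(digits[2:], 16)
    elif digits.startswith("0"):
        value = int(digits, 8)
    else:
        value = int(digits, 10)
    return -value if sign == "-" else value


def parse_float_literal(token):
    """Parse a decimal or hexadecimal float, or INF / INFINITY / NAN."""
    if _NAN.match(token):
        return math.nan
    if token.lstrip("+-")[:2].lower() == "0x":
        return float.fromhex(token)  # understands the p exponent prefix
    return float(token)  # also accepts inf and infinity, any case
```

       For example, parse_int_literal("017") gives 15 and parse_float_literal("0x1.8p1") gives 3.0.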

   Field Codes
       When specifying the input to a field, either as a scalar parameter, or as an input vector field to a non-
       RAW vector field, field codes are used.  A field code is one of:

       •   a simple field name, possibly an alias, indicating a vector or scalar field

       •   a  parent  field  name,  followed  by  a  forward  slash,  followed by a metafield name, indicating a
           metafield.  See the description of the /META directive above for further details.

       •   either of the above, followed by a period, followed by a representation suffix, but only if the field
           or metafield specified is not a STRING type field.

       A representation suffix may be used to extract a real number from a complex value.  The available
       suffixes and their meanings are:

       .a     This representation indicates the angle (in radians) between the positive real axis and the value
              (i.e. the complex argument).  The argument is in the range [-pi, pi], and a branch cut exists along
              the negative real axis.  At the branch cut, -pi is returned if the imaginary part is -0, and pi is
              returned if the imaginary part is +0.  If z=0, zero is returned.

       .i     This representation indicates the projection of the value onto the imaginary axis (i.e. the
              imaginary part of the number).

       .m     This representation indicates the modulus of the value (i.e. its absolute value).

       .r     This representation indicates the projection of the value onto the real axis (i.e. the real part
              of the number).

       If the specified field is purely real, the representations are calculated as if the imaginary part were
       equal to +0.  For example, given a complex valued vector, z, a vector containing the real part of z,
       re_z, could be produced with:

              re_z PHASE z.r 0

       and similarly for the complex field's imaginary part, argument, and absolute value.  (This example is
       purely illustrative: the derived field is not strictly necessary, since z.r could be used wherever
       re_z would be.)
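
       The four representations, including the branch cut behaviour of .a, can be sketched with Python's cmath
       module.  The names REPRESENTATIONS, split_field_code and represent are hypothetical; splitting on the
       last period relies on the period being prohibited in field names.

```python
import cmath

# One function per representation suffix.
REPRESENTATIONS = {
    "a": cmath.phase,        # argument in [-pi, pi]
    "i": lambda z: z.imag,   # imaginary part
    "m": abs,                # modulus
    "r": lambda z: z.real,   # real part
}


def split_field_code(code):
    """Split "name.x" into (name, x); return (code, None) without a suffix."""
    base, dot, suffix = code.rpartition(".")
    if dot and suffix in REPRESENTATIONS:
        return base, suffix
    return code, None


def represent(value, suffix):
    """Apply a representation to one sample.

    Converting a real value with complex() gives it an imaginary part of +0.
    """
    return REPRESENTATIONS[suffix](complex(value))
```

       For instance, represent(-1.0, "a") gives pi, whereas represent(complex(-1.0, -0.0), "a") gives -pi.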

HISTORY

       This document describes Versions 9 and earlier of the Dirfile Standards.

       Version  9  of  the Standards (April 2012) added the MPLEX and WINDOW field types, the /ALIAS and /HIDDEN
       directives, the affixes to /INCLUDE, the sie, zzip, and zzslim encoding schemes, along with the  optional
       enc_datum  token  to /ENCODING.  It permitted specification of integer literals in octal and hexadecimal.
       Finally, it deprecated the type aliases FLOAT and DOUBLE.

       Version 8 of the Standards (November 2010) added the DIVIDE, RECIP, and CARRAY field types, made the for‐
       ward slash on reserved words mandatory, and prohibited using the single character data  type  aliases  in
       the  specification  of RAW fields.  It also introduced the optional second (arm) token to the /ENDIAN di‐
       rective.

       Version 7 of the Standards (October 2009) added the SBIT and POLYNOM field types, and the  directive-less
       method  of specifying metafields.  It also introduced the data types COMPLEX128 and COMPLEX64, along with
       the notion of representations, and the lzma encoding scheme.  Finally, it made the number of fields para‐
       meter for LINCOM optional.

       Version 6 of the Standards (October 2008) added the /ENCODING, /META,  /PROTECT,  and  /REFERENCE  direc‐
       tives,  and the CONST and STRING field types.  It permitted whitespace in tokens and introduced the char‐
       acter escape sequences. It allowed CONST fields to be used as parameters in  field  specification  lines.
       It also removed FILEFRAM as an alias for INDEX, and prohibited the period (.) but allowed # and \ in
       field names.

       Version  5  of the Standards (August 2008) added VERSION and ENDIAN, slash demarcation of reserved words,
       and removed the restriction on field name length.  It introduced the data types INT8, INT64, and  UINT64,
       the new-style type specifiers, and increased the range of the BIT field type from 32 to 64 bits.  It also
       prohibited the characters &;<>\| in field names.

       Version 4 of the Standards (October 2006) added the PHASE field type.

       Version  3 of the Standards (January 2006) added INCLUDE and increased the allowed length of a field name
       from 16 to 50 characters.

       Version 2 of the Standards (September 2005) added the MULTIPLY field type.

       Version 1 of the Standards (November 2004) added FRAMEOFFSET and the optional fourth argument to the  BIT
       field type.

       Version  0  of  the Standards (before March 2003) refers to the dirfile standards supported by the getda‐
       ta(3) library originally introduced into the kst(1) sources, which contained support for all  other  fea‐
       tures covered by this document.

AUTHORS

       The dirfile specification was developed by C. B. Netterfield <netterfield@astro.utoronto.ca>.

       Since   Standards   Version   3,   the   dirfile  specification  has  been  maintained  by  D.  V.  Wiebe
       <getdata@ketiltrout.net>.

SEE ALSO

       dirfile(5), dirfile-encoding(5)

Standards Version 9                               3 April 2013                                 dirfile-format(5)