Provided by: libgetdata-doc_0.9.0-2.2_all

NAME

       dirfile-format — the dirfile database format specification file

DESCRIPTION

       The  dirfile  format  specification  fully  specifies the raw and derived time streams and
       auxiliary information for a dirfile(5) database.

       The format specification is contained in one or more case-sensitive text files located  in
       the  dirfile  tree.   Each  file is known as a fragment.  The primary fragment is the file
       called format located in the base dirfile directory.  This file may contain only  part  of
       the format specification, and may reference other fragments (using the /INCLUDE directive)
       containing  further  format  specification.   This  inclusion  mechanism  may  be   nested
       arbitrarily deep.

       The explicit text encoding of these files is not specified by these Standards, but it must
       be 7-bit ASCII compatible. Examples of acceptable  character  encodings  include  all  the
       ISO  8859  character  sets  (i.e.  Latin-1 through Latin-10, among others), as well as the
       UTF-8 encoding of Unicode and UCS.

       This document primarily describes  the  latest  version  of  the  Standards  (Version  9);
       differences  with  previous versions are noted where relevant.  A complete list of changes
       between versions is given in the HISTORY section below.

SYNTAX

       The format specification is composed of field specification  lines  and  directive  lines,
       optionally  separated  by  blank  lines  or  lines  containing only whitespace.  Lines are
       separated by the line-feed character (0x0A).  Unless escaped (see below),  the  hash  mark
       (#)  is the comment delimiter; the comment delimiter, and any text following it to the end
       of the line, is ignored.

   Tokens
       Both field specification lines and directive lines consist of several tokens separated  by
       whitespace.   Whitespace  consists of one or more whitespace characters.  These are: space
       (0x20), horizontal tab (0x09), vertical tab (0x0B), form-feed (0x0C), and carriage  return
       (0x0D).   The  first token of a directive line is always a reserved word.  The first token
       of a field specification line is never a reserved word.   Any  amount  of  whitespace  may
       precede the first token on a line.

       Since tokens are separated by whitespace, to include a whitespace character in a token, it
       must either be escaped by preceding it with a backslash character (\), or be replaced by a
       character  escape  sequence  (see  below), or else the token must be enclosed in quotation
       marks (").  The quotation marks themselves are stripped from  the  token.  The  null-token
       (that is, the token consisting of zero characters) may be specified by a pair of quotation
       marks with nothing between them ("").  To include a literal quotation mark in a token,  it
       must  be  escaped (\").  Similarly, a hash mark may be included in a token by including it
       in a quoted token or else by escaping it (\#), otherwise the hash mark  is  understood  as
       the comment delimiter.

       It  is a syntax error to have a line which contains unmatched quotation marks, or in which
       the last character is the backslash character.

       Several characters when escaped by a preceding  backslash  character  are  interpreted  as
       special characters in tokens.  The character escape sequences are:

              \a     an alert (bell) character (ASCII 0x07 / U+0007)

              \b     a backspace character (ASCII 0x08 / U+0008)

              \e     an escape character (ASCII 0x1B / U+001B)

              \f     a form-feed character (ASCII 0x0C / U+000C)

              \n     a line-feed character (ASCII 0x0A / U+000A)

              \r     a carriage return character (ASCII 0x0D / U+000D)

              \t     a horizontal tab character (ASCII 0x09 / U+0009)

              \v     a vertical tab character (ASCII 0x0B / U+000B)

              \\     a backslash character (ASCII 0x5C / U+005C)

              \ooo   the single byte given by the octal number ooo (1 to 3 octal digits).

              \xhh   the  single  byte  given  by  the  hexadecimal number hh (1 or 2 hexadecimal
                     digits).

              \uhhhhhhh
                     the UTF-8 byte sequence  encoding  the  Unicode  code  point  given  by  the
                     hexadecimal number hhhhhhh (1 to 7 hexadecimal digits).

       Any other character which is escaped is interpreted as the character itself.  (i.e.  \c is
       interpreted as c; also, as pointed out above, \" and \# are interpreted as simply " and #,
       without their special meanings).

       No  token  may  contain  the  NULL character (ASCII 0x00 / U+0000).  Furthermore, although
       support is present to create UTF-8 byte sequences, tokens are not  required  to  be  valid
       UTF-8 sequences.  Any byte sequence not containing the NULL character forms a valid token.
       However, there may be further restrictions on the characters allowed in a token in a
       particular situation (for example, when used as a field name).
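
       The tokenisation rules above can be sketched in Python as follows.  This is only an
       illustration, not the parser of any real implementation: the names tokenize and
       read_digits are hypothetical, octal values above 0377 are reduced to their low byte
       (an assumption, since the text does not say how they are handled), and \u values that
       Python cannot encode simply raise an error.

```python
SIMPLE = {"a": 0x07, "b": 0x08, "e": 0x1B, "f": 0x0C,
          "n": 0x0A, "r": 0x0D, "t": 0x09, "v": 0x0B}
WHITESPACE = " \t\v\f\r"
OCT = "01234567"
HEX = "0123456789abcdefABCDEF"


def read_digits(line, start, limit, digits):
    """Return the index just past up to `limit` characters drawn from `digits`."""
    end = start
    while end < len(line) and end - start < limit and line[end] in digits:
        end += 1
    return end


def tokenize(line):
    """Split one line of a format file (a str without its line feed) into byte tokens."""
    tokens, cur, started, quoted = [], bytearray(), False, False
    i = 0
    while i < len(line):
        c = line[i]
        if c == "\\":
            if i + 1 == len(line):
                raise ValueError("line ends with a backslash")
            e, i, started = line[i + 1], i + 2, True
            if e in SIMPLE:
                cur.append(SIMPLE[e])
            elif e in OCT:
                end = read_digits(line, i - 1, 3, OCT)
                cur.append(int(line[i - 1:end], 8) & 0xFF)  # assumption: keep the low byte
                i = end
            elif e in "xu" and i < len(line) and line[i] in HEX:
                end = read_digits(line, i, 2 if e == "x" else 7, HEX)
                value = int(line[i:end], 16)
                cur += bytes([value]) if e == "x" else chr(value).encode("utf-8")
                i = end
            else:
                cur += e.encode("utf-8")  # any other escaped character stands for itself
        elif c == '"':
            quoted, started, i = not quoted, True, i + 1
        elif not quoted and c == "#":
            break  # comment delimiter: ignore the rest of the line
        elif not quoted and c in WHITESPACE:
            if started:
                tokens.append(bytes(cur))
                cur, started = bytearray(), False
            i += 1
        else:
            cur += c.encode("utf-8")
            started, i = True, i + 1
    if quoted:
        raise ValueError("unmatched quotation mark")
    if started:
        tokens.append(bytes(cur))
    if any(0 in token for token in tokens):
        raise ValueError("token contains the NULL character")
    return tokens
```

       Tokens are returned as bytes because \ooo and \xhh produce single bytes which need
       not form valid UTF-8.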

       Standards Version 5 and earlier do not recognise the character escape sequences, nor allow
       quoting of tokens. As a result, they prohibit both whitespace and  the  comment  delimiter
       from being used in tokens.

DIRECTIVES

       There  are  ten  directives,  each specified by a different reserved word, which cannot be
       used as field names in the dirfile.  As of Standards Version 8, all reserved  words  start
       with  an  initial  forward  slash  (/),  to  distinguish them from field names.  Standards
       Versions 5, 6, and 7 permitted the  omission  of  the  initial  forward  slash,  while  in
       Standards  Version  4  and  earlier, reserved words may not have an initial forward slash.
       Like the rest of the format specification, directives are case sensitive.

       A number of the directives have fragment scope.  A  directive  with  fragment  scope  only
       applies  to  the  fragment in which it is present, plus any sub-fragments indicated by the
       /INCLUDE directive, but only if those sub-fragments don't  have  their  own  corresponding
       directive.   Directives  which  have fragment scope are: /ENCODING, /ENDIAN, /FRAMEOFFSET,
       and /PROTECT.  Because of these scoping rules, different portions of the dirfile may  have
       different encodings, endiannesses, frame offsets, or protection levels.

       If  a  directive  with  fragment scope appears more than once in a fragment, only the last
       such directive is honoured, with the exception that the  effect  of  a  directive  is  not
       propagated  to  sub-fragments  if  the  directive  line  appears after the sub-fragment is
       included.  The scoping rules of the remaining directives are discussed below.

       /ALIAS The /ALIAS directive defines an alternate name for a field defined elsewhere in the
              format  specification (called the "target").  Aliases may not be used as the parent
              field in a /META directive, but are in most other ways indistinguishable  from  the
              target's  original,  canonical  name.   Aliases may be chained (that is, the target
              name appearing in an /ALIAS directive may itself be an alias).  In this  case,  the
              new  alias  is  another  name  for  the  target's  own target.  Just as there is no
              requirement that the input fields of a derived field exist, it is not an error  for
              the target of an alias to not exist.  Syntax is:

              /ALIAS <name> <target>

              A metafield alias may be defined using the <parent-field>/<alias-name> syntax for name
              in the /ALIAS directive.  No restriction is placed on target; specifically, a
              metafield alias may target a top-level field, or a metafield with a different
              parent; conversely, a top-level alias may target a metafield.

              A metafield alias may never appear as the parent part of a  metafield  field  code,
              even if it refers to a top-level field.  That is, given the valid format:

              field1 RAW UINT8 1
              field1/meta CONST FLOAT64 0.0
              field2 RAW UINT8 1
              /ALIAS field2/alias field1

              the  metafield field1/meta may not be referred to as field2/alias/meta, even though
              field2/alias is a valid field code referring to field1.

              The /ALIAS directive has no scope: it is processed  immediately.   It  appeared  in
              Standards Version 9.
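
              Alias chaining can be illustrated with a short Python sketch.  The function
              resolve_alias and the dictionary layout are hypothetical, and rejecting alias
              loops is an assumption; the text above does not say how loops are treated.

```python
def resolve_alias(name, aliases):
    """Follow /ALIAS links from name to the canonical field name.

    aliases maps each alias name to the target named in its /ALIAS line.
    The final target need not exist as a field.
    """
    seen = set()
    while name in aliases:
        if name in seen:
            raise ValueError("alias loop at " + name)  # assumption: loops are rejected
        seen.add(name)
        name = aliases[name]
    return name


# "short" targets the alias field2/alias, which in turn targets field1.
aliases = {"field2/alias": "field1", "short": "field2/alias"}
canonical = resolve_alias("short", aliases)
```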

       /ENCODING
              The  /ENCODING  directive specifies the encoding scheme used to encode binary files
              in the dirfile.  The encoding scheme may be one  of  the  predefined  names  listed
              below,  which  are  described  in  more detail in dirfile-encoding(5), or any other
              site-specific encoding scheme.  The predefined scheme names are:

              none   The dirfile is unencoded.

              bzip2  The dirfile is compressed using the bzip2 compression scheme.

              gzip   The dirfile is compressed using the gzip compression scheme.

              lzma   The dirfile is compressed using the LZMA compression scheme.

              slim   The dirfile is compressed using the slim compression scheme.

              sie    The dirfile is sample-index encoded (a variant of run-length encoding).

              text   The dirfile is text encoded.

              zzip   The dirfile is  compressed  and  encapsulated  using  the  zzip  compression
                     scheme.

              zzslim The  dirfile  is compressed and encapsulated using a combination of the zzip
                     and slim compression schemes.

              Implementations should  fail  gracefully  when  encountering  an  unknown  encoding
              scheme.  If no encoding scheme is specified, behaviour is implementation dependent.
              Syntax is:

                     /ENCODING <scheme> [<enc-datum>]

              The enc-datum token provides additional data  for  certain  encoding  schemes;  see
              dirfile-encoding(5) for details.  The form of enc-datum is not specified.

              The  /ENCODING  directive  has fragment scope.  It appeared in Standards Version 6.
              The predefined schemes sie, zzip, and zzslim, and  the  optional  enc-datum  token,
              appeared  in  Standards Version 9; the predefined scheme lzma appeared in Standards
              Version 7; all other predefined schemes appeared in Standards Version 6.

       /ENDIAN
              The /ENDIAN directive specifies the endianness of the raw  data  in  the  database.
              The  assumed  endianness  of  raw  data  in  dirfiles  which omit this directive is
              implementation dependent.  Syntax is:

                     /ENDIAN ( big | little ) [ arm ]

              where the "arm" token should be included if double precision  floating  point  data
              are  stored  in  the  ARM middle-endian format.  The /ENDIAN directive has fragment
              scope.  It appeared in Standards Version 5.  The optional  arm  token  appeared  in
              Standards Version 8.

       /FRAMEOFFSET
              The  /FRAMEOFFSET directive specifies the frame number of the first frame for which
              data exists in binary files associated with RAW fields.  Syntax is:

                     /FRAMEOFFSET <integer>

              The /FRAMEOFFSET directive has fragment scope.  It appeared in Standards Version 1.

       /HIDDEN
              The /HIDDEN directive indicates that the  specified  field  name  is  hidden.   The
              difference  (if  any)  between  a field name which is hidden and one that is not is
              implementation dependent.   Hiddenness  is  not  inherited  by  metafields  of  the
              specified field.  Hiddenness applies to the name, not the field itself; it does not
              hide all aliases of the field-name, and if field-name is an alias, the alias is
              hidden, not its target.  Syntax is:

                     /HIDDEN <field-name>

              A /HIDDEN directive must appear after the specification of field-name (which
              occurs either in a field specification line, or an /ALIAS  directive,  or  a  /META
              directive) in the same fragment.

              The  /HIDDEN  directive  has no scope: it is processed immediately.  It appeared in
              Standards Version 9.

       /INCLUDE
              The /INCLUDE directive specifies another file (called  a  fragment)  to  parse  for
              additional  format  specification  for  the  dirfile.   The  inclusion is processed
              immediately, before the fragment containing  the  /INCLUDE  directive  (the  parent
              fragment)  is  parsed  further.   RAW fields specified in the included fragment are
              located in the directory containing the fragment file, and  not  in  the  directory
              containing  the  parent fragment, and the binary file encoding may be different for
              each fragment.  The fragment may be specified either with an absolute path, or else
              a path relative to the directory containing the parent fragment.

              The  /INCLUDE  directive  may optionally specify a prefix and/or suffix to apply to
              field names defined in the included fragment.  If present, affixes are  applied  to
              all  field-names  (including  aliases)  defined  in  the  included fragment and any
              fragments it further includes.  Affixes nest,  with  the  affixes  of  the  deepest
              inclusion  innermost.   Affixes  are  not  applied  to  the  names  of binary files
              associated with RAW fields.  Syntax is:

                     /INCLUDE <file> [<prefix> [<suffix>]]

              To specify only a  suffix,  use  the  null-token  ("")  as  prefix.   The  /INCLUDE
              directive  has  no  scope:  it  is processed immediately.  It appeared in Standards
              Version 3.  The optional prefix and suffix appeared in Standards Version 9.
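
              The nesting of affixes can be sketched in Python.  The helper apply_affixes is
              hypothetical and handles only top-level field names; how affixes combine with the
              <parent-field>/<meta-field> form is not covered by this sketch.

```python
def apply_affixes(name, affixes):
    """Return name with every enclosing /INCLUDE affix applied.

    affixes lists (prefix, suffix) pairs from the outermost /INCLUDE to the
    deepest one; an empty string stands for an omitted affix.
    """
    for prefix, suffix in reversed(affixes):  # the deepest inclusion ends up innermost
        name = prefix + name + suffix
    return name
```

              For instance, a field x defined in a fragment included with prefix B_, which is
              itself included with prefix A_ and suffix _a, would be named A_B_x_a under this
              reading.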

       /META  The /META directive specifies a metafield attached to a  particular  parent  field.
              The field metadata may be of any allowed type except RAW.  Metafields are retrieved
              in exactly the same way as  regular  field  data,  but  the  field  code  specified
              consists of the parent and metafield names joined with a forward slash:

                     <parent-field>/<meta-field>

              META fields may not be specified before their parent field has been.  Syntax is:

                     /META <parent-field> {field specification line}

              The <parent-field> code may not be an alias.  As an illustration of this concept,

                     /META pfield meta CONST FLOAT64 3.291882

              provides  a  scalar metadatum called meta with value 3.291882 attached to the field
              pfield.   This  particular  metafield  may  be  referred  to  by  the  field   code
              "pfield/meta".  Note that different parent fields may have metafields with the same
              name, since all references to  metafields  must  include  the  parent  field  name.
              Metafields may not themselves have further sub-metafields.

              As  an  alternative  to  the  /META directive, starting with Standards Version 7, a
              metafield may be specified by a standard field specification line, using

                     <parent-field>/<meta-field>

              as the field name.  That is, the above  example  metafield  could  have  also  been
              specified as:

                     pfield/meta CONST FLOAT64 3.291882

              The  /META  directive  has  no  scope: it is processed immediately.  It appeared in
              Standards Version 6.

       /PROTECT
              The /PROTECT directive specifies the  advisory  protection  level  of  the  current
              fragment  and  of  the  RAW fields defined therein.  The protection level indicates
              whether writing to the fragment, or to the binary data on disk, is permitted.  Syntax
              is:

                     /PROTECT <level>

              Four advisory protection levels are defined:

              none   No  protection at all: data and metadata may be freely changed.  This is the
                     default, if no /PROTECT directive is present.

              format The dirfile metadata is protected from change, but RAW data on disk  may  be
                     modified.

              data   The RAW data on disk is protected from change, but metadata may be modified.

              all    Both metadata and data on disk are protected from change.

              The /PROTECT directive has fragment scope.  It appeared in Standards Version 6.

       /REFERENCE
              The  /REFERENCE  directive  specifies the name of the field to use as the dirfile's
              reference field (see dirfile(5)).  If no /REFERENCE  directive  is  specified,  the
              first  RAW  field  encountered  is  used  as  the  reference field.  The /REFERENCE
              directive must specify a RAW field.  Syntax is:

                     /REFERENCE <field-code>

              The /REFERENCE directive has global scope: if multiple /REFERENCE directives appear
              in  the dirfile metadata, only the last such is honoured.  It appeared in Standards
              Version 6.

       /VERSION
              The /VERSION directive specifies the particular version of the Dirfile Standards to
              which  the  dirfile  format  specification  conforms.   This directive should occur
              before any version dependent syntax is encountered.  As of Standards Version 6,  no
              such  syntax  exists,  and  this  directive  is  provided primarily to ease forward
              compatibility.  Syntax is:

                     /VERSION <integer>

              The /VERSION directive has immediate scope: its effect is immediate, and it applies
              only  to  metadata  below  it, including and propagating downwards to sub-fragments
              after the directive.

              In Standards Version 8 and earlier, its effect also propagates upwards back to  the
              parent  fragment, and affects subsequent metadata.  Starting with Standards Version
              9, this no longer happens.  As a result, a /VERSION  directive  which  indicates  a
              version  of  9 or later never propagates upwards; additionally, /VERSION directives
              found in subfragments included in a Version 9 or later fragment  aren't  propagated
              upwards  into  that  fragment,  regardless of the Version of the subfragments.  The
              /VERSION directive appeared in Standards Version 5.

FIELD SPECIFICATION LINES

       Any line which does not start with a reserved word is assumed to be a field  specification
       line.  A field specification line consists of at least two tokens.  The first token is the
       field name.  The second token is the field type.  Subsequent tokens are field  parameters.
       The meaning and number of these parameters depend on the field type specified.

   Field Names
       The  first token in a field specification line is the field name.  The field name consists
       of one or more characters, excluding both ASCII control characters (the bytes 0x01 through
       0x1F), and the characters

              &    /    ;    <    >    |    .

       which  are reserved (but see below for the use of / to specify metafields).  The full stop
       (.)  is allowed in Standards Version 5 and earlier.  The ampersand, semicolon, less  than,
       greater  than,  and  vertical  line  (&  ;  <  > |) are allowed in Standards Version 4 and
       earlier.  Furthermore, due to the lack of an  escape  or  quoting  mechanism  (see  Tokens
       above), Standards Version 5 and earlier also prohibit whitespace and the comment delimiter
       (#) in field names.

       The field name may not be INDEX, which is a special, implicit  field  which  contains  the
       integer frame index.  Standards Version 5 and earlier also prohibit FILEFRAM, which was an
       alias for INDEX.  Field names are case sensitive.  Standards  Version  3  and  4  restrict
       field  names  to 50 characters. Standards Version 2 and earlier restrict field names to 16
       characters. Additionally, the filesystem may put restrictions on the length and acceptable
       characters of a RAW field name, regardless of Standards Version.

       Starting  in  Standards  Version 7, if the field name beginning a field specification line
       contains exactly one / character, the line is assumed to specify  a  metafield.   See  the
       /META directive above for further details.  A field name may not contain more than one /.

   Field Types
       There  are fifteen field types.  Of these, twelve are of vector type (BIT, DIVIDE, LINCOM,
       LINTERP, MPLEX, MULTIPLY, PHASE, POLYNOM, RAW, RECIP, SBIT, and WINDOW) and three  are  of
       scalar  type  (CONST,  CARRAY,  and STRING).  The eleven vector field types other than RAW
       fields are also called derived fields, since they derive their  value  from  one  or  more
       input fields.

       Five  of  these derived fields (DIVIDE, LINCOM, MPLEX, MULTIPLY, and WINDOW) may have more
       than one input field.  In situations where these input fields have differing sample rates,
       the  sample  rate  of the derived field is the same as the sample rate of the first (left-
       most) input field specified.  Furthermore, the input fields are synchronised  by  aligning
       them  on  frame boundaries, assuming equally-spaced sampling throughout a frame, and using
       the last sample of each input field which did not occur after the sample  of  the  derived
       field  being computed.  That is, if the first and second input fields have sample rates s1
       and s2, the derived field also has sample rate s1 and, for every  sample  of  the  derived
       field, n, the n'th sample of the first field is used (since they have the same sample rate
       by definition), and the sample number used of the second field, m, is computed as:

              m = floor((n * s2) / s1).
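
       A small Python sketch of this index calculation, using integer arithmetic so that no
       rounding error can occur; the function name is hypothetical:

```python
def second_input_sample(n, s1, s2):
    """Index of the second input's sample used for sample n of the derived field."""
    return (n * s2) // s1  # floor division, exact for non-negative integers


# A derived field at 4 samples per frame reading a second input at 2 samples per frame.
indices = [second_input_sample(n, 4, 2) for n in range(8)]
```

       Here indices is [0, 0, 1, 1, 2, 2, 3, 3]: each sample of the slower input is used twice.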

       Starting  in  Standards  Version  6,  certain  scalar  field  parameters  in   the   field
       specifications  may  be specified using CONST or CARRAY fields, instead of literal values.
       A list of parameters for which this is allowed is given  below  in  the  Field  Parameters
       section.

       The possible field types are:

       BIT    The BIT vector field type extracts one or more bits out of an input vector field as
              an unsigned number.  Syntax is:

                     <fieldname> BIT <input> <first-bit> [<num-bits>]

              which specifies fieldname to be the value of bits first-bit through
              first-bit+num-bits-1 of the input vector field input, when input is converted from its
              native type to an (endianness corrected) unsigned 64-bit integer.  If num-bits is omitted,
              it is assumed to be 1.  The SBIT field type is a signed version of this field type.
              The optional num-bits parameter appeared in Standards Version 1.
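
              The extraction can be sketched in Python as below.  The function name is
              hypothetical, and the sketch assumes that bit 0 is the least significant bit,
              which the text above does not state explicitly.

```python
def extract_bits(value, first_bit, num_bits=1):
    """Return bits first_bit .. first_bit+num_bits-1 of value as an unsigned number."""
    value &= (1 << 64) - 1  # view the input as an unsigned 64-bit integer
    return (value >> first_bit) & ((1 << num_bits) - 1)
```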

       CARRAY The CARRAY scalar field type is a list of constants fully specified in  the  format
              specification metadata.  Syntax is:

                     <fieldname> CARRAY <type> <value0> <value1> <value2> ...

              where  type  may  be any supported native data type (see the description of the RAW
              field type below), and value0, value1, &c. are the values of successive elements in
              the scalar list interpreted as indicated by type.  No limit is placed on the number
              of elements in a CARRAY.  (Note: despite being multivalued, this is not  considered
              a  vector  field  since  the elements of the CARRAY are not indexed by frames.)  It
              appeared in Standards Version 8.

       CONST  The  CONST  scalar  field  type  is  a  constant  fully  specified  in  the  format
              specification metadata.  Syntax is:

                     <fieldname> CONST <type> <value>

              where  type  may  be any supported native data type (see the description of the RAW
              field type below), and value is the numerical value of the constant interpreted  as
              indicated by type.  It appeared in Standards Version 6.

       DIVIDE The DIVIDE vector field type is the quotient of two vector fields.  Syntax is:

                     <fieldname> DIVIDE <field1> <field2>

              The derived field is computed as:

                     fieldname = field1 / field2.

              It was introduced in Standards Version 8.

       LINCOM The  LINCOM  vector field type is the linear combination of one, two or three input
              vector fields.  Syntax is:

                     <fieldname> LINCOM [<n>] <field1> <a1> <b1> [<field2>  <a2>  <b2>  [<field3>
                     <a3> <b3>]]

              where n, if present, indicates the number of input vector fields (1, 2, or 3).  The
              derived field is computed as:

                     fieldname = (a1 * field1 + b1) + (a2 * field2 + b2) + (a3 * field3 + b3)

              with the field2 and field3 terms included only if specified.

              If n is not specified, the number of fields is determined by looking at the
              supplied parameters.  Since it is possible to create a field code which is
              identical to a literal number, the third token on the line is assumed to be n if
              the entire token can be parsed as a literal number using the rules outlined in
              strtod(3).  That is, if the field code specifying field1 could be mistaken for a
              literal number, n must be specified to prevent ambiguity.  In Standards Version 6
              and earlier, n is mandatory.
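
              The computation itself can be sketched in Python.  The function lincom and its
              argument layout are hypothetical, and the inputs are assumed to have already been
              resampled to the derived field's sample rate.

```python
def lincom(terms):
    """Evaluate a LINCOM field.

    terms is a list of one to three (samples, a, b) tuples, where samples is
    a list holding the input field's values.
    """
    if not 1 <= len(terms) <= 3:
        raise ValueError("LINCOM takes one, two or three input fields")
    length = min(len(samples) for samples, _, _ in terms)
    return [sum(a * samples[i] + b for samples, a, b in terms)
            for i in range(length)]
```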

       LINTERP
              The LINTERP vector field type specifies a table look up  based  on  another  vector
              field.  Syntax is:

                     <fieldname> LINTERP <input> <table>

              where  input  is the input vector field for the table lookup, and table is the path
              to the lookup table file for the field.  If this path is relative, it is assumed to
              be  relative  to  the  directory  containing the fragment defining this field.  The
              lookup table file is an ASCII text file with two whitespace separated columns of  x
              and y values.  Values are linearly interpolated between the points specified in the
              lookup table.
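
              A minimal Python sketch of the lookup follows.  The function linterp is
              hypothetical; it assumes the table has at least two points sorted by strictly
              increasing x, and it extends the first and last segments for inputs outside the
              table, a behaviour the text above does not specify.

```python
from bisect import bisect_right


def linterp(x, table):
    """Interpolate x in table, a list of (x, y) pairs sorted by increasing x."""
    xs = [point[0] for point in table]
    # Pick the segment containing x, clamped to the first or last segment.
    i = min(max(bisect_right(xs, x), 1), len(table) - 1)
    (x0, y0), (x1, y1) = table[i - 1], table[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
```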

       MPLEX  The MPLEX vector field type permits the multiplexing of  several  low  sample  rate
              fields into a single data field of higher sample rate.  Syntax is:

                     <fieldname> MPLEX <input> <index> <count> [<period>]

              where input is the input vector containing the multiplexed fields, index is the
              vector containing the multiplex index, count is the value of the multiplex index
              when the computed field is stored in input, and period, if present and non-zero, is
              the number of samples between successive occurrences of the value count in the
              index vector.  A period of zero (or, equivalently, its omission) indicates that
              either the value count is not equally spaced in the index vector, or else that the
              spacing is unknown.  Both count and period are integers, and period may not be
              negative.

              At every sample n, the derived field is computed as:

                     fieldname[n] = (index[n] == count) ? input[n] : fieldname[n - 1]

              The index vector is converted to an integer type for comparison.  The value of  the
              derived  field  before  the first sample where index equals count is implementation
              dependent.

              The values of count and period place no restrictions on values contained in  index.
              Specifically,  particular  values  of  index  (including count) need not be equally
              spaced (neither by period nor any other spacing); index need not ever take  on  the
              value  count  (in  which  case  the  value  of the entirety of the derived field is
              implementation dependent).  Different MPLEX field definitions which  use  the  same
              index vector may specify different periods.  MPLEX appeared in Standards Version 9.
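
              The sample-and-hold behaviour can be sketched in Python.  The function mplex is
              hypothetical, input and index are assumed to share one sample rate, and the fill
              argument stands in for the implementation-dependent value used before the first
              match.

```python
def mplex(input_samples, index_samples, count, fill=None):
    """Return the MPLEX-derived samples by holding the last matching input value."""
    out, held = [], fill
    for value, idx in zip(input_samples, index_samples):
        if int(idx) == count:  # the index vector is compared as an integer
            held = value
        out.append(held)
    return out


demuxed = mplex([10, 11, 12, 13, 14, 15], [0, 1, 2, 0, 1, 2], 1)
```

              In this example demuxed is [None, 11, 11, 11, 14, 14].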

       MULTIPLY
              The MULTIPLY vector field type is the product of two vector fields.  Syntax is:

                     <fieldname> MULTIPLY <field1> <field2>

              The derived field is computed as:

                     fieldname = field1 * field2.

              It appeared in Standards Version 2.

       PHASE  The PHASE vector field type shifts an input vector field by the specified number of
              samples.  Syntax is:

                     <fieldname> PHASE <input> <shift>

              which specifies fieldname to be the input vector field,  input,  shifted  by  shift
              samples.   A  positive  shift  indicates a forward shift, towards the end-of-field.
              The result of shifting past the beginning- or end-of-field is implementation
              dependent.  PHASE appeared in Standards Version 4.

       POLYNOM
              The  POLYNOM  vector  field  type specifies a polynomial function of a single input
              vector field.  Syntax is:

                     <fieldname> POLYNOM <input> <a0> <a1> [<a2> [<a3> [<a4> [<a5>]]]]

              where <input> is the input field code, and the order of the computed polynomial  is
              determined by how many co-efficients are present in the specification.  The derived
              field is computed as:

                     fieldname = a0 + a1 * input + a2 * input**2 + a3 * input**3 + a4 *  input**4
                     + a5 * input**5

              where  **  is  the element-wise exponentiation operator, and the higher order terms
              are computed only if the corresponding co-efficients  ai  are  specified.   POLYNOM
              appeared in Standards Version 7.
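
              One way to evaluate such a field is Horner's method, sketched here in Python.
              The function name and argument layout are hypothetical, and an implementation is
              free to evaluate the polynomial differently.

```python
def polynom(samples, coefficients):
    """Evaluate a0 + a1*x + a2*x**2 + ... for each input sample x."""
    if not 2 <= len(coefficients) <= 6:
        raise ValueError("POLYNOM takes between two and six co-efficients")
    result = []
    for x in samples:
        acc = 0
        for a in reversed(coefficients):  # Horner's method, highest order first
            acc = acc * x + a
        result.append(acc)
    return result
```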

       RAW    The  RAW  vector  field type specifies raw time streams on disk.  In this case, the
              field name should correspond to the name of the file containing  the  time  stream.
              Syntax is:

                     <fieldname> RAW <type> <sample-rate>

              where  sample-rate  is  the number of samples per dirfile frame for the time stream
              and type is a token specifying the native data format type:

                     UINT8  unsigned 8-bit integer

                     INT8   signed (two's complement) 8-bit integer

                     UINT16 unsigned 16-bit integer

                     INT16  signed (two's complement) 16-bit integer

                     UINT32 unsigned 32-bit integer

                     INT32  signed (two's complement) 32-bit integer

                     UINT64 unsigned 64-bit integer

                     INT64  signed (two's complement) 64-bit integer

                     FLOAT32
                            IEEE-754 standard 32-bit single precision floating point number

                     FLOAT64
                            IEEE-754 standard 64-bit double precision floating point number

                     COMPLEX64
                            a 64-bit complex number consisting of two  IEEE-754  standard  32-bit
                            single  precision  floating  point  numbers representing the real and
                            imaginary parts of the complex number (Standards Version 7 and later)

                     COMPLEX128
                            a 128-bit complex number consisting of two IEEE-754  standard  64-bit
                            double  precision  floating  point  numbers representing the real and
                            imaginary parts of  the  complex  number  (Standards  Version  7  and
                            later).

              For  more  information  on the storage of complex valued data, see dirfile(5).  Two
              additional type names  exist:  FLOAT  is  equivalent  to  FLOAT32,  and  DOUBLE  is
              equivalent to FLOAT64.  Standards Version 9 deprecates these two aliases, but still
              allows them.

              All these type names (except  those  for  complex  data,  which  came  later)  were
              introduced in Standards Version 5.  Earlier Standards Versions specified data types
              with single character type aliases:

                     c      UINT8

                     u      UINT16

                     s      INT16

                     U      UINT32

                     i, S   INT32

                     f      FLOAT32

                     d      FLOAT64

              Types INT8, UINT64, and INT64 were introduced in Standards Version 5, and COMPLEX64
              and COMPLEX128 in Standards Version 7, so no single character type aliases exist
              for these types.  The single character type aliases were deprecated in Standards
              Version 5 and removed in Standards Version 8.
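
              The type names and aliases above can be collected into lookup tables, as in the
              following Python sketch.  The names SAMPLE_SIZE, canonical_type and bytes_per_frame
              are hypothetical, only the removal of the single character aliases in Version 8 is
              checked, and the frame size assumes samples are stored unencoded at their native
              width.

```python
# Size in bytes of one sample of each new-style RAW data type.
SAMPLE_SIZE = {
    "UINT8": 1, "INT8": 1,
    "UINT16": 2, "INT16": 2,
    "UINT32": 4, "INT32": 4,
    "UINT64": 8, "INT64": 8,
    "FLOAT32": 4, "FLOAT64": 8,
    "COMPLEX64": 8,    # two 32-bit floats
    "COMPLEX128": 16,  # two 64-bit floats
}

# Aliases deprecated (but still allowed) in Standards Version 9.
ALIASES = {"FLOAT": "FLOAT32", "DOUBLE": "FLOAT64"}

# Single character aliases, removed in Standards Version 8.
OLD_ALIASES = {
    "c": "UINT8", "u": "UINT16", "s": "INT16", "U": "UINT32",
    "i": "INT32", "S": "INT32", "f": "FLOAT32", "d": "FLOAT64",
}


def canonical_type(token, version=9):
    """Return the new-style type name for a RAW <type> token."""
    if token in OLD_ALIASES:
        if version >= 8:
            raise ValueError("single character type aliases were removed in Version 8")
        return OLD_ALIASES[token]
    token = ALIASES.get(token, token)
    if token not in SAMPLE_SIZE:
        raise ValueError("unknown RAW data type: " + token)
    return token


def bytes_per_frame(type_token, sample_rate, version=9):
    """Bytes occupied by one frame of a RAW field (assumes unencoded data)."""
    return SAMPLE_SIZE[canonical_type(type_token, version)] * sample_rate
```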

       RECIP  The RECIP vector field type computes the reciprocal of a single input vector field.
              Syntax is:

                     <field_name> RECIP <input> <dividend>

              where <input> is the input field code and <dividend> is  a  scalar  quantity.   The
              derived field is computed as:

                     fieldname = dividend / input.

              RECIP appeared in Standards Version 8.

       SBIT   The  SBIT  vector field type extracts one or more bits out of an input vector field
              as a signed number.  Syntax is:

                     <fieldname> SBIT <input> <first-bit> [<bits>]

              which specifies fieldname to be the value of bits first-bit through
              first-bit+bits-1 of the input vector field input, when input is converted from its
              native type to an (endianness corrected) signed 64-bit integer.  If bits is
              omitted, it is assumed to be 1.  The BIT field type is an unsigned version of this
              field type.  SBIT appeared in Standards Version 7.
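
              A minimal Python sketch of this extraction is given below.  It assumes that bit
              zero is the least significant bit and that the sign is taken from the highest
              extracted bit; the function name sbit is hypothetical.

```python
def sbit(value, first_bit, bits=1):
    """Extract `bits` bits starting at `first_bit` as a signed number."""
    if first_bit < 0 or bits < 1 or first_bit + bits > 64:
        raise ValueError("bit range must lie within a 64-bit integer")
    # View the input as a 64-bit two's complement word.
    word = value & 0xFFFFFFFFFFFFFFFF
    field = (word >> first_bit) & ((1 << bits) - 1)
    # Sign-extend: if the top extracted bit is set, the result is negative.
    if field & (1 << (bits - 1)):
        field -= 1 << bits
    return field
```

              With this sketch, sbit(0b0110, 1, 3) gives 3, while sbit(0b1110, 1, 3) gives -1
              because the highest extracted bit is set.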

       STRING The STRING scalar field type is a character string fully specified  in  the  format
              file metadata.  Syntax is:

                     <fieldname> STRING <value>

              where  value  is the string value of the field.  Note that value is a single token.
              To include whitespace in the string, enclose value in quotation marks ("), or  else
              escape  the  whitespace  with  the  backslash  character  (\).   STRING appeared in
              Standards Version 6.

       WINDOW The WINDOW vector field type isolates a portion of  an  input  vector  based  on  a
              comparison.  Syntax is:

                     <fieldname> WINDOW <input> <check> <op> <threshold>

              where  input  is  the vector containing the data to extract, check is the vector on
              which to test the comparison,  threshold  is  the  value  against  which  check  is
              compared,  and  op  is  one  of  the  following  tokens  indicating  the particular
              comparison performed:

                     EQ     data are extracted where check, converted to a 64-bit signed integer,
                            equals threshold,

                     GE     data  are extracted where check, converted to a 64-bit floating-point
                            number, is greater than or equal to threshold,

                     GT     data are extracted where check, converted to a 64-bit  floating-point
                            number, is strictly greater than threshold,

                     LE     data  are extracted where check, converted to a 64-bit floating-point
                            number, is less than or equal to threshold,

                     LT     data are extracted where check, converted to a 64-bit  floating-point
                            number, is strictly less than threshold,

                     NE     data are extracted where check, converted to a 64-bit signed integer,
                            is not equal to threshold,

                     SET    data are extracted where at least one bit set in  threshold  is  also
                            set in check, when converted to a 64-bit unsigned integer,

                     CLR    data are extracted where at least one bit set in threshold is not set
                            in check, when converted to a 64-bit unsigned integer.

              The  storage  type  of  threshold  depends  on  the  operator,  and   follows   the
              interpretation of check.  It may never be complex valued.

              Outside  the  region  extracted,  the  value of the derived field is implementation
              dependent.

              Note: with the EQ operator, this derived field type is very similar  to  the  MPLEX
              field  type  above.  The primary difference is that MPLEX mandates the value of the
              derived field outside the extracted region, while WINDOW does not.  WINDOW appeared
              in Standards Version 9.
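
              The eight comparisons can be sketched in Python as below.  This is an illustration
              only: the names WINDOW_OPS and window are hypothetical, input and check are assumed
              to have the same number of samples, and None stands in for the implementation
              dependent value outside the extracted region.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF  # view integers as 64-bit unsigned words

# One predicate per <op> token, taking a check sample and the threshold.
WINDOW_OPS = {
    "EQ": lambda c, t: int(c) == int(t),
    "NE": lambda c, t: int(c) != int(t),
    "GE": lambda c, t: float(c) >= float(t),
    "GT": lambda c, t: float(c) > float(t),
    "LE": lambda c, t: float(c) <= float(t),
    "LT": lambda c, t: float(c) < float(t),
    # At least one bit set in threshold is also set in check.
    "SET": lambda c, t: (int(c) & int(t) & MASK64) != 0,
    # At least one bit set in threshold is not set in check.
    "CLR": lambda c, t: (~int(c) & int(t) & MASK64) != 0,
}


def window(data, check, op, threshold, fill=None):
    """Keep data[i] where check[i] passes the comparison, else `fill`."""
    test = WINDOW_OPS[op]
    return [d if test(c, threshold) else fill for d, c in zip(data, check)]
```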

   Field Parameters
       All  input  vector  field parameters should be field codes (see below).  Additionally, the
       scalar field parameters listed may be either literal numbers or else the field code  of  a
       CONST  field  containing the value, or the field code of a CARRAY followed by a left angle
       bracket (<), then a non-negative integer used as the CARRAY element index, then  a  right
       angle bracket (>), that is:

              fieldcode<n>

       If  the  angle  brackets  and element index are omitted from a CARRAY field code used as a
       parameter, the first element in the field (index zero) is assumed.

       Field parameters which may be specified using a scalar field code are:

              BIT, SBIT
                     first-bit, bits

              LINCOM any of the mi, or bi

              MPLEX  count, max

              PHASE  shift

              POLYNOM
                     any of the ai

              RAW    sample-rate

              RECIP  dividend

              WINDOW threshold

       Since it is possible to create a field code which is identical  to  a  literal  number,  a
       parameter  is  assumed  to  be  the  field code of a scalar field only if the entire token
       cannot be parsed as a literal number using the rules outlined in strtod(3).  For  example,
       a  CONST field whose field code consists solely of digits can never be used as a parameter
       in a field specification line.
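
       The precedence of literal numbers over field codes, and the optional CARRAY element
       index, can be sketched in Python as follows.  The function name parse_scalar_parameter
       is hypothetical, and Python's float() is used only as a stand-in for strtod(3): the two
       do not accept exactly the same set of strings (for example, float() rejects hexadecimal
       floating point).  Complex literals are not handled here.

```python
import re

# A field code followed by <n>, where n is a non-negative integer.
_CARRAY_INDEX = re.compile(r"^(?P<code>.+)<(?P<index>[0-9]+)>$")


def parse_scalar_parameter(token):
    """Classify a scalar parameter token as a literal or a field code."""
    # A token that parses entirely as a number is always a literal.
    try:
        return ("literal", float(token))
    except ValueError:
        pass
    match = _CARRAY_INDEX.match(token)
    if match:
        return ("field", match.group("code"), int(match.group("index")))
    # No element index given: index zero is assumed for a CARRAY.
    return ("field", token, 0)
```

       Here parse_scalar_parameter("1234") is always treated as the number 1234, even if a
       CONST field named 1234 exists, while "coeffs<3>" refers to element 3 of a field named
       coeffs.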

       Starting in Standards Version 7, a literal complex number is specified as two real
       (floating point) numbers separated by a semicolon (;) with no intervening whitespace.
       So, for example, the tokens

              1;0  0;1  4;0  0;5  9.313e2;74.1

       represent, respectively, the real unit, the imaginary unit,  the  real  number  four,  the
       imaginary  number  5i,  and  the  complex  number  931.3  +  74.1i.  Because the semicolon
       character cannot be used in field names, a complex valued literal can  never  be  mistaken
       for  a  field  code.   This  allows, among other things, the composition of complex valued
       fields from purely real input fields.  For example, a complex  valued  field,  z,  may  be
       created from a real valued field re, representing the real part of the complex number, and
       the real valued field im, representing the imaginary part of the complex number, with  the
       following LINCOM specification:

              z LINCOM re 1 0 im 0;1 0
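
       As an illustration, the following Python sketch parses the semicolon notation and
       evaluates the specification above for one sample.  The function names are
       hypothetical, and lincom2 assumes that each LINCOM input contributes m * input + b, in
       line with the mi and bi parameters named earlier in this section.

```python
def parse_complex_literal(token):
    """Parse 'real;imag' (or a plain real number) into a complex value."""
    real, sep, imag = token.partition(";")
    if not sep:
        return complex(float(real), 0.0)
    return complex(float(real), float(imag))


def lincom2(x1, m1, b1, x2, m2, b2):
    """Two-input linear combination (assumed form: sum of m*x + b)."""
    return m1 * x1 + b1 + m2 * x2 + b2


# z LINCOM re 1 0 im 0;1 0, evaluated for re = 3 and im = 4
m1, b1 = parse_complex_literal("1"), parse_complex_literal("0")
m2, b2 = parse_complex_literal("0;1"), parse_complex_literal("0")
z = lincom2(3.0, m1, b1, 4.0, m2, b2)
```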

       Starting in Standards Version 9, in addition to decimal notation, literal integer
       parameters may be specified as hexadecimal numbers, by  prefixing  the  number  (after  an
       optional '+' or '-' sign) with 0x or 0X, or as octal numbers, by prefixing the number with
       0, as described in strtol(3).  Similarly, floating point literal numbers (both purely real
       ones and components of complex literals) may be specified in hexadecimal by prefixing them
       with 0x or 0X, and using p or P as the binary exponent prefix, as  described  in  the  C99
       standard.   Both uppercase and lowercase hexadecimal digits may be used.  In cases where a
       literal floating point number may appear, the tokens INF or INFINITY, optionally preceded
       by  a '+' or '-' sign, and NAN, optionally immediately followed by '(', then a sequence of
       characters, then ')', and all disregarding  case,  will  be  interpreted  as  the  special
       floating point values explained in strtod(3).
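
       These notations can be sketched in Python as below.  The function names are
       hypothetical and the code only approximates strtol(3) and strtod(3); in particular, the
       sign and any character sequence attached to NAN are ignored.

```python
import math
import re

_HEX_DIGITS = re.compile(r"[0-9A-Fa-f]+")
_NAN = re.compile(r"[+-]?nan(\([^)]*\))?", re.IGNORECASE)


def parse_int_literal(token):
    """Parse a decimal, octal (0...) or hexadecimal (0x...) integer."""
    sign, body = 1, token
    if token[:1] in ("+", "-"):
        sign = -1 if token[0] == "-" else 1
        body = token[1:]
    if body[:2].lower() == "0x":
        base, body = 16, body[2:]
    elif len(body) > 1 and body[0] == "0":
        base, body = 8, body[1:]
    else:
        base = 10
    if not _HEX_DIGITS.fullmatch(body):
        raise ValueError("not an integer literal: " + token)
    return sign * int(body, base)  # int() rejects digits invalid for the base


def parse_float_literal(token):
    """Parse a decimal or hexadecimal float, INF, INFINITY or NAN."""
    if _NAN.fullmatch(token):
        return math.nan
    if token.lstrip("+-")[:2].lower() == "0x":
        return float.fromhex(token)  # p or P introduces the binary exponent
    return float(token)  # also accepts INF and INFINITY in any case
```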

   Field Codes
       When  specifying the input to a field, either as a scalar parameter, or as an input vector
       field to a non-RAW vector field, field codes are used.  A field code is one of:

       •   a simple field name, possibly an alias, indicating a vector or scalar field

       •   a parent field name, followed by a  forward  slash,  followed  by  a  metafield  name,
           indicating  a metafield.  See the description of the /META directive above for further
           details.

       •   either of the above, followed by a period, followed by a  representation  suffix,  but
           only if the field or metafield specified is not a STRING type field.

       A representation suffix may be used to extract a real number from a complex value.
       The available suffixes and their meanings are:

       .a     This representation indicates the angle (in radians) between the positive real axis
              and the value (i.e. the complex argument).  The argument is in the range [-pi, pi],
              and a branch cut exists along the negative real axis.  At the branch  cut,  -pi  is
              returned  if  the imaginary part is -0, and pi is returned if the imaginary part is
              +0.  If z=0, zero is returned.

       .i     This representation indicates the projection of the value onto the  imaginary  axis
              (i.e. the imaginary part of the number).

       .m     This representation indicates the modulus of the value (i.e. its absolute value).

       .r     This representation indicates the projection of the value onto the real axis (i.e.
              the real part of the number).

       If the specified field is purely real,  the  representations  are  calculated  as  if  the
       imaginary  part  was equal to +0.  For example, given a complex valued vector, z, a vector
       containing the real part of z, re_z, could be produced with:

              re_z PHASE z.r 0

       and similarly for the complex field's imaginary part, argument, and absolute value.
       (Such a simple example is not strictly necessary, however, since z.r could be used
       wherever re_z would be.)
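
       The four representations map onto Python's complex arithmetic as sketched below.  The
       function name represent is hypothetical; cmath.phase() follows the same branch cut and
       signed-zero conventions as described above.

```python
import cmath


def represent(value, suffix):
    """Apply a representation suffix (a, i, m or r) to one sample."""
    # A purely real value is promoted with an imaginary part of +0.
    z = complex(value)
    if suffix == "a":
        return cmath.phase(z)  # argument, in radians
    if suffix == "i":
        return z.imag
    if suffix == "m":
        return abs(z)
    if suffix == "r":
        return z.real
    raise ValueError("unknown representation suffix: " + suffix)
```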

HISTORY

       This document describes Versions 9 and earlier of the Dirfile Standards.

       Version 9 of the Standards (April 2012) added the MPLEX and WINDOW field types, the /ALIAS
       and  /HIDDEN  directives,  the  affixes  to  /INCLUDE,  the sie, zzip, and zzslim encoding
       schemes, along with the optional enc_datum token to /ENCODING.  It permitted specification
       of  integer  literals  in  octal and hexadecimal.  Finally, it deprecated the type aliases
       FLOAT and DOUBLE.

       Version 8 of the Standards (November 2010) added  the  DIVIDE,  RECIP,  and  CARRAY  field
       types, made the forward slash on reserved words mandatory, and prohibited using the single
       character data type aliases in the specification of RAW fields.  It  also  introduced  the
       optional second (arm) token to the /ENDIAN directive.

       Version  7 of the Standards (October 2009) added the SBIT and POLYNOM field types, and the
       directive-less method of  specifying  metafields.   It  also  introduced  the  data  types
       COMPLEX128  and COMPLEX64, along with the notion of representations, and the lzma encoding
       scheme.  Finally, it made the number of fields parameter for LINCOM optional.

       Version 6 of the Standards (October  2008)  added  the  /ENCODING,  /META,  /PROTECT,  and
       /REFERENCE  directives,  and the CONST and STRING field types.  It permitted whitespace in
       tokens and introduced the character escape sequences. It allowed CONST fields to  be  used
       as  parameters  in  field  specification  lines.  It also removed FILEFRAM as an alias for
       INDEX, and prohibited .  but allowed # and \ in field names.

       Version 5 of the Standards (August 2008) added VERSION and ENDIAN,  slash  demarcation  of
       reserved  words, and removed the restriction on field name length.  It introduced the data
       types INT8, INT64, and UINT64, the new-style type specifiers, and increased the  range  of
       the  BIT field type from 32 to 64 bits.  It also prohibited the characters &;<>\| in field
       names.

       Version 4 of the Standards (October 2006) added the PHASE field type.

       Version 3 of the Standards (January 2006) added INCLUDE and increased the  allowed  length
       of a field name from 16 to 50 characters.

       Version 2 of the Standards (September 2005) added the MULTIPLY field type.

       Version  1  of  the  Standards  (November  2004) added FRAMEOFFSET and the optional fourth
       argument to the BIT field type.

       Version 0 of the Standards (before March 2003) refers to the dirfile  standards  supported
       by  the  getdata(3) library originally introduced into the kst(1) sources, which contained
       support for all other features covered by this document.

AUTHORS

       The    dirfile    specification     was     developed     by     C.     B.     Netterfield
       <netterfield@astro.utoronto.ca>.

       Since  Standards  Version  3, the dirfile specification has been maintained by D. V. Wiebe
       <getdata@ketiltrout.net>.

SEE ALSO

       dirfile(5), dirfile-encoding(5)