Provided by: libgetdata-doc_0.9.0-2.2_all

NAME

       gd_getdata — retrieve data from a dirfile database

SYNOPSIS

       #include <getdata.h>

       size_t gd_getdata(DIRFILE *dirfile, const char *field_code, off_t first_frame, off_t first_sample, size_t
              num_frames, size_t num_samples, gd_type_t return_type, void *data_out);

DESCRIPTION

       The  gd_getdata()  function  queries a dirfile(5) database specified by dirfile for the field field_code.
       It fetches num_frames frames plus num_samples samples from this field, starting first_sample samples past
       frame first_frame.  The data is converted to the data type specified by return_type, and  stored  in  the
       user-supplied buffer data_out.

       The  field_code  may contain one of the representation suffixes listed in dirfile-format(5).  If it does,
       gd_getdata() will compute the appropriate complex norm before returning the data.

       The dirfile argument must point to a valid DIRFILE object previously created by  a  call  to  gd_open(3).
       The  argument data_out must point to a valid memory location of sufficient size to hold all data request‐
       ed.

       Unless using GD_HERE (see below), the first sample returned will be

              first_frame * samples_per_frame + first_sample

       as measured from the start of the dirfile, where samples_per_frame is the number of samples per frame  as
       returned by gd_spf(3).  The number of samples fetched is, similarly,

              num_frames * samples_per_frame + num_samples.

       Although calling gd_getdata() using both samples and frames is possible, the function is typically called
       with either num_samples and first_sample, or num_frames and first_frame, equal to zero.
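
       The arithmetic above can be sketched in a few lines of plain C.  This is an illustration only, not
       library code: the helper names first_sample_index() and samples_requested() are hypothetical, and
       spf stands for the value that gd_spf(3) would report for the field.

```c
#include <assert.h>
#include <stddef.h>

/* Index of the first sample read, counted from the start of the dirfile.
 * Hypothetical helper; spf is the samples-per-frame of the field. */
long long first_sample_index(long long first_frame, long long first_sample,
                             unsigned int spf)
{
  assert(spf > 0);
  return first_frame * (long long)spf + first_sample;
}

/* Total number of samples requested by a single call. */
size_t samples_requested(size_t num_frames, size_t num_samples,
                         unsigned int spf)
{
  assert(spf > 0);
  return num_frames * (size_t)spf + num_samples;
}
```

       With 20 samples per frame, for instance, a read starting 5 samples past frame 3 begins at sample 65,
       and a request for 2 frames and no extra samples asks for 40 samples.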

       Instead  of  explicitly specifying the origin of the read, the caller may pass the special symbol GD_HERE
       as first_frame.  This will result in the read occurring at the current position of the  I/O  pointer  for
       the  field  (see  GetData  I/O Pointers below for a discussion of field I/O pointers).  In this case, the
       value of first_sample is ignored.

       The return_type argument should be one of the following symbols, which indicates the desired return  type
       of the data:

              GD_UINT8   unsigned 8-bit integer

              GD_INT8    signed (two's complement) 8-bit integer

              GD_UINT16  unsigned 16-bit integer

              GD_INT16   signed (two's complement) 16-bit integer

              GD_UINT32  unsigned 32-bit integer

              GD_INT32   signed (two's complement) 32-bit integer

              GD_UINT64  unsigned 64-bit integer

              GD_INT64   signed (two's complement) 64-bit integer

              GD_FLOAT32 IEEE-754 standard 32-bit single precision floating point number

              GD_FLOAT64 IEEE-754 standard 64-bit double precision floating point number

              GD_COMPLEX64
                         C99-conformant 64-bit single precision complex number

              GD_COMPLEX128
                         C99-conformant 128-bit double precision complex number

              GD_NULL    the  null  type:  the  database  is queried as usual, but no data is returned.  In this
                         case, data_out is ignored and may be NULL.
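
       Because data_out must be large enough for every sample requested, it can help to work out the buffer
       size from the return type.  The sketch below does so with a hypothetical enum that merely mirrors
       the list above; its tags are not the GD_* symbols from <getdata.h>.  Real code would more likely
       rely on GD_SIZE(3), listed under SEE ALSO.

```c
#include <stddef.h>

/* Hypothetical tags mirroring the return types listed above.  These are
 * NOT the GD_* constants defined by <getdata.h>. */
enum model_type {
  MODEL_NULL, MODEL_UINT8, MODEL_INT8, MODEL_UINT16, MODEL_INT16,
  MODEL_UINT32, MODEL_INT32, MODEL_UINT64, MODEL_INT64,
  MODEL_FLOAT32, MODEL_FLOAT64, MODEL_COMPLEX64, MODEL_COMPLEX128
};

/* Size in bytes of one sample of the given type; zero for the null type. */
size_t model_type_size(enum model_type t)
{
  switch (t) {
    case MODEL_UINT8:  case MODEL_INT8:                        return 1;
    case MODEL_UINT16: case MODEL_INT16:                       return 2;
    case MODEL_UINT32: case MODEL_INT32: case MODEL_FLOAT32:   return 4;
    case MODEL_UINT64: case MODEL_INT64: case MODEL_FLOAT64:
    case MODEL_COMPLEX64:                                      return 8;
    case MODEL_COMPLEX128:                                     return 16;
    case MODEL_NULL:                                           return 0;
  }
  return 0;
}

/* Bytes needed in data_out to hold n_samples samples of type t. */
size_t buffer_bytes(size_t n_samples, enum model_type t)
{
  return n_samples * model_type_size(t);
}
```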

       The return type of the data need not be the same as the type of the data stored in  the  database.   Type
       conversion will be performed as necessary to return the requested type.  If the field_code does not indi‐
       cate a representation, but conversion from a complex value to a purely real one is required, only the re‐
       al portion of the requested vector will be returned.
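
       The complex-to-real rule described above amounts to discarding the imaginary part of each sample.
       A minimal C99 sketch of that rule follows; complex_to_real() is a hypothetical name and the code
       illustrates the behaviour rather than reproducing the library's conversion routine.

```c
#include <complex.h>
#include <stddef.h>

/* Keep only the real part of each complex sample, as happens when a
 * complex field is read into a purely real return type and no
 * representation suffix is given. */
void complex_to_real(const double complex *in, double *out, size_t n)
{
  for (size_t i = 0; i < n; ++i)
    out[i] = creal(in[i]);
}
```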

       Upon  successful completion, the I/O pointer of the field will be on the sample immediately following the
       last sample returned, if possible.  On error, the position of the I/O pointer is not specified,  and  may
       not even be well defined.
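
       The interaction between explicit reads, GD_HERE, and the I/O pointer can be modelled with a small
       stand-alone structure.  Everything in this sketch is hypothetical (struct model_field, MODEL_HERE,
       model_read()); positions are in samples rather than frames, and no data is actually transferred.

```c
#include <stddef.h>

#define MODEL_HERE (-1LL)  /* hypothetical stand-in for GD_HERE */

struct model_field {
  long long eof;  /* one past the last stored sample */
  long long pos;  /* I/O pointer, in samples */
};

/* Read n samples starting at `first`, or at the I/O pointer when first is
 * MODEL_HERE.  Returns the number of samples read (short at end-of-field)
 * and leaves the I/O pointer just after the last sample read. */
size_t model_read(struct model_field *f, long long first, size_t n)
{
  long long start = (first == MODEL_HERE) ? f->pos : first;
  long long avail = f->eof - start;
  size_t got;

  if (avail <= 0)
    got = 0;
  else
    got = ((long long)n < avail) ? n : (size_t)avail;

  f->pos = start + (long long)got;
  return got;
}
```

       In this model, an explicit read of 20 samples from sample 10 leaves the pointer at 30, so a
       following MODEL_HERE read continues from sample 30.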

   Behaviour While Reading Specific Field Types
       PHASE: A forward-shifted PHASE field will always encounter the end-of-field marker before its input field
              does.    This   has  ramifications  when  reading  streaming  data  with  gd_getdata()  and  using
              gd_nframes(3) to gauge field lengths (that is: a forward-shifted PHASE field always has less  data
              in it than gd_nframes(3) implies that it does).  As with any other field, gd_getdata() will return
              a short count whenever a read from a PHASE field encounters the end-of-field marker.

              Backward-shifted  PHASE fields do not suffer from this problem, since gd_getdata() pads reads past
              the beginning-of-field marker with NaN or zero as appropriate.  Database creators who wish to  use
              the PHASE field type with streaming data are encouraged to work around this limitation by only us‐
              ing  backward-shifted  PHASE  fields, by writing RAW data at the maximal frame lag, and then back-
              shifting all data which should have been written earlier.   Another  possible  work-around  is  to
              write  systematically  less  data  to the reference RAW field in proportion to the maximal forward
              phase shift.  This method will work with applications which respect the database size reported  by
              gd_nframes(3)  resulting in these applications effectively ignoring all frames past the frame con‐
              taining the maximally forward-shifted PHASE field's end-of-field marker.
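
              The asymmetry between forward and backward shifts can be seen in a small model.  The sketch
              below assumes that sample i of a PHASE field is sample i + shift of its input; phase_read()
              is a hypothetical helper operating on an in-memory array, not a GetData call.

```c
#include <math.h>
#include <stddef.h>

/* Model of reading n samples, starting at sample `first`, from a PHASE
 * field whose sample i is sample i + shift of the input.  Positions before
 * the start of the input are padded with NaN; the read stops, returning a
 * short count, when the input's end is reached. */
size_t phase_read(const double *in, size_t in_len, long shift,
                  long first, size_t n, double *out)
{
  size_t k;

  for (k = 0; k < n; ++k) {
    long src = first + (long)k + shift;
    if (src >= (long)in_len)
      break;                          /* end-of-field: short count */
    out[k] = (src < 0) ? NAN : in[src];
  }
  return k;
}
```

              With a five-sample input, a forward shift of 2 yields only three samples, while a backward
              shift of 2 yields all five requested samples, the first two being NaN.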

       MPLEX: Reading an MPLEX field typically requires GetData to read data before the range returned in  order
              to determine the value of the first sample returned.  This can become expensive if the encoding of
              the  underlying RAW data does not support seeking backwards (which is true of most compression en‐
              codings).  How much preceding data GetData searches for the initial value of the returned data can
              be adjusted, or the lookback disabled completely, using gd_mplex_lookback(3).  If the initial val‐
              ue of the field is not found in the data searched, GetData will fill the returned  vector,  up  to
              the next available sample of the multiplexed field, with zero for integer return types, or
              IEEE-754-conforming NaN (not-a-number) for floating point return types, as it does when  providing
              data before the beginning-of-field.

              GetData caches the value of the last sample from every MPLEX it reads so that a subsequent read of
              the  field starting from the following sample (either through an explicit starting sample given by
              the caller or else implicitly using GD_HERE) will not need to  scan  the  field  backwards.   This
              cache  is invalidated if a different return type is used, or if an intervening operation moves the
              field's I/O pointer.
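
              The effect of the lookback can be illustrated with a simplified model.  It assumes that an
              MPLEX field takes the input value wherever the count field equals a given value and holds
              the previous value elsewhere.  mplex_read() is hypothetical, and its lookback is counted in
              samples purely for simplicity; see gd_mplex_lookback(3) for how the library measures it.

```c
#include <math.h>
#include <stddef.h>

/* Simplified model of reading n samples of an MPLEX field from `first`.
 * The output takes in[i] where count[i] == count_val and otherwise holds
 * the previous value.  Up to `lookback` samples before `first` are
 * searched for an initial value; if none is found, the output is NaN until
 * the first matching sample. */
size_t mplex_read(const double *in, const int *count, size_t len,
                  int count_val, size_t first, size_t n, size_t lookback,
                  double *out)
{
  double held = NAN;
  size_t back, k;

  for (back = 1; back <= lookback && back <= first; ++back) {
    if (count[first - back] == count_val) {
      held = in[first - back];
      break;
    }
  }

  for (k = 0; k < n && first + k < len; ++k) {
    if (count[first + k] == count_val)
      held = in[first + k];
    out[k] = held;
  }
  return k;
}
```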

       WINDOW:
              The samples of a WINDOW for which the field conditional is false will be filled with  either  zero
              for  integer  return  types,  or  IEEE-754-conforming NaN (not-a-number) for floating point return
              types.
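
              As an illustration of this fill rule, the sketch below models a WINDOW whose conditional is
              "check field greater than a threshold".  The helpers window_gt_f64() and window_gt_i32() are
              hypothetical and cover only that one comparison.

```c
#include <math.h>
#include <stddef.h>
#include <stdint.h>

/* Floating point return type: samples failing the conditional become NaN. */
void window_gt_f64(const double *in, const double *check, double threshold,
                   size_t n, double *out)
{
  for (size_t i = 0; i < n; ++i)
    out[i] = (check[i] > threshold) ? in[i] : NAN;
}

/* Integer return type: samples failing the conditional become zero. */
void window_gt_i32(const int32_t *in, const double *check, double threshold,
                   size_t n, int32_t *out)
{
  for (size_t i = 0; i < n; ++i)
    out[i] = (check[i] > threshold) ? in[i] : 0;
}
```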

RETURN VALUE

       In all cases, gd_getdata() returns the number of samples (not bytes) successfully read from the database.
       If the end-of-field is encountered before the requested number of samples have been read, a  short  count
       will  result.   The  library does not consider this an error.  Requests for data before the beginning-of-
       field marker, which may have been shifted from frame zero by the presence  of  a  FRAMEOFFSET  directive,
       will result in the data being padded at the front by NaN or zero depending on whether the return type
       is of floating point or integral type.
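
       The two boundary cases, front padding and short counts, can be modelled on an in-memory array.  The
       sketch below is hypothetical throughout (raw_read_i32() and its arguments) and assumes that padded
       samples count toward the return value.

```c
#include <stddef.h>
#include <stdint.h>

/* Model of reading n samples of an integer field starting at sample
 * `first`.  data[0] holds the sample at index bof (the beginning-of-field,
 * shifted from zero as a FRAMEOFFSET would); eof is one past the last
 * sample.  Samples before bof are padded with zero, and the read stops
 * with a short count at eof.
 * Assumption: padded samples are included in the returned count. */
size_t raw_read_i32(const int32_t *data, long long bof, long long eof,
                    long long first, size_t n, int32_t *out)
{
  size_t k;

  for (k = 0; k < n; ++k) {
    long long s = first + (long long)k;
    if (s >= eof)
      break;                          /* end-of-field: short count */
    out[k] = (s < bof) ? 0 : data[s - bof];
  }
  return k;
}
```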

       If an error has occurred, zero is returned and the dirfile error will be set to a non-zero value.  Possi‐
       ble error values are:

       GD_E_ALLOC
               The library was unable to allocate memory.

       GD_E_BAD_CODE
               The  field  specified by field_code, or one of the fields it uses for input, was not found in the
               database.

       GD_E_BAD_DIRFILE
               An invalid dirfile was supplied.

       GD_E_BAD_SCALAR
               A scalar field used in the definition of the field was not found, or was not of scalar type.

       GD_E_BAD_TYPE
               An invalid return_type was specified.

       GD_E_DIMENSION
               The supplied field_code referred to a CONST, CARRAY, or STRING  field.   The  caller  should  use
               gd_get_constant(3),  gd_get_carray(3), or gd_get_string(3) instead.  Or, a scalar field was found
               where a vector field was expected in the definition of field_code or one of its inputs.

       GD_E_DOMAIN
               An immediate read was attempted using GD_HERE, but the I/O pointer of the field was not well  de‐
               fined  because  two  or  more  of  the field's inputs did not agree as to the location of the I/O
               pointer.

       GD_E_INTERNAL_ERROR
               An internal error occurred in the library while trying to perform the task.  This indicates a bug
               in the library.  Please report the incident to the maintainer.

       GD_E_IO An error occurred while trying to open or read from a file on disk containing a raw field or LIN‐
               TERP table.

       GD_E_LUT
               A LINTERP table was malformed.

       GD_E_RECURSE_LEVEL
               Too many levels of recursion were encountered while trying to resolve field_code.   This  usually
               indicates a circular dependency in field specification in the dirfile.

       GD_E_UNKNOWN_ENCODING
               The  encoding scheme of a RAW field could not be determined.  This may also indicate that the bi‐
               nary file associated with the RAW field could not be found.

       GD_E_UNSUPPORTED
               Reading from dirfiles with the encoding scheme of the specified dirfile is not supported  by  the
               library.  See dirfile-encoding(5) for details on dirfile encoding schemes.

       The dirfile error may be retrieved by calling gd_error(3).  A descriptive error string for the last error
       encountered can be obtained from a call to gd_error_string(3).

NOTES

       To  save memory, gd_getdata() uses the memory pointed to by data_out as scratch space while computing de‐
       rived fields.  As a result, if an error is encountered during the computation, the contents of this memo‐
       ry buffer are unspecified, and may have been modified by this call, even though gd_getdata() will  report
       zero samples returned on error.

       Reading slim-compressed data (see dirfile-encoding(5)) may cause unexpected memory usage.  This is because
       slimlib internally caches open decompressed files as they are read, and GetData doesn't close data
       files  between gd_getdata() calls for efficiency's sake.  Memory used by this internal slimlib buffer can
       be reclaimed by calling gd_raw_close(3) on fields when finished reading them.

GETDATA I/O POINTERS

       This is a general discussion of field I/O pointers in the GetData library, and contains  information  not
       directly applicable to gd_getdata().

       Every  RAW  field  in  an  open Dirfile has an I/O pointer which indicates the library's current read and
       write position in the field.  These I/O pointers are useful when performing sequential reads or writes
       on Dirfile fields (see GD_HERE in the description above).  The value of the I/O pointer of a field is re‐
       ported by gd_tell(3).

       Derived fields have virtual I/O pointers arising from the I/O pointers of their input fields.  These vir‐
       tual  I/O pointers may be valid (when all input fields agree on their position in the dirfile) or invalid
       (when the input fields are not in agreement).  The I/O pointer of some derived fields is always  invalid.
       The  usual  reason  for this is the derived field simultaneously reading from two different places in the
       same RAW field.  For example, given the following Dirfile metadata specification:

              a RAW UINT8 1
              b PHASE a 1
              c LINCOM 2 a 1 0 b 1 0

       the derived field c never has a valid I/O pointer, since any particular sample of c  ultimately  involves
       reading  from  more  than one place in the RAW field a.  Attempting to perform sequential reads or writes
       (with GD_HERE) on a derived field when its I/O pointer is invalid will result in an error  (specifically,
       GD_E_DOMAIN).
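
       One way to picture the validity rule is to list, for each input of a derived field, the offset at
       which that input reads the shared RAW field.  The sketch below is a simplified model under that
       assumption; io_pointer_can_be_valid() is a hypothetical helper, not part of GetData.

```c
#include <stddef.h>

/* shifts[i] is the offset, relative to the derived field's own position,
 * at which input i reads the shared RAW field.  In this model the virtual
 * I/O pointer can be valid only if every input reads from the same place.
 * Returns 1 if so, 0 otherwise. */
int io_pointer_can_be_valid(const long *shifts, size_t n)
{
  for (size_t i = 1; i < n; ++i)
    if (shifts[i] != shifts[0])
      return 0;
  return 1;
}
```

       For the field c above, the inputs a and b read the RAW field a at offsets 0 and 1 respectively, so
       the model reports that its I/O pointer can never be valid.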

       The implicit INDEX field has an effective I/O pointer that mostly behaves like a true RAW field I/O
       pointer, except that it permits simultaneous reads from multiple  locations.   So,  given  the  following
       metadata specification:

              d PHASE INDEX 1
              e LINCOM 2 INDEX 1 0 d 1 0

       the  I/O  pointer of the derived field e will always be valid, unlike the similarly defined c above.  The
       virtual I/O pointer of a derived field will change in response to movement of the RAW I/O pointers underlying
       the derived field's inputs, and vice versa: moving the I/O pointer of a derived field will move the
       I/O pointer of the RAW fields from which it ultimately derives.  As a result, the I/O pointer of any par‐
       ticular field may move in unexpected ways if multiple fields are manipulated at the same time.

       When a Dirfile is first opened, the I/O pointer of every RAW field is set to the beginning-of-field (the
       value returned by gd_bof(3)), as is the I/O pointer of any newly-created RAW field.

       The following library calls cause I/O pointers to move:

       gd_getdata() and gd_putdata(3)
              These functions move the I/O pointer of affected fields to the sample  immediately  following  the
              last  sample  read  or  written,  both when performed at an absolutely specified position and when
              called for a sequential read or write using GD_HERE.  When reading a derived field which  simulta‐
              neously  reads from more than one place in a RAW field (such as c above), the position of that RAW
              field's I/O pointer is unspecified (that is: it is not specified which input field is read first).

       gd_seek(3)
              This function is used to manipulate I/O pointers directly.

       gd_flush(3) and gd_raw_close(3)
              These functions set the I/O pointer of any RAW field which is closed  back  to  the  beginning-of-
              field.

       calls which result in modifications to raw data files:
              this may happen when calling any of: gd_alter_encoding(3), gd_alter_endianness(3), gd_alter_frame‐
              offset(3), gd_alter_entry(3), gd_alter_raw(3), gd_alter_spec(3), gd_malter_spec(3), gd_move(3), or
              gd_rename(3);  these  functions  close  affected  RAW fields before making changes to the raw data
              files, and so reset the corresponding I/O pointers to the beginning-of-field.

       In general, when these calls fail, the I/O pointers of affected fields  may  be  anything,  even  out-of-
       bounds or invalid.  After an error, the caller should issue an explicit gd_seek(3) to reposition I/O
       pointers before attempting further sequential operations.

SEE ALSO

       dirfile(5), dirfile-encoding(5), gd_get_constant(3), gd_get_string(3),  gd_error(3),  gd_error_string(3),
       gd_mplex_lookback(3),  gd_nframes(3),  gd_open(3), gd_raw_close(3), gd_seek(3), gd_spf(3), gd_putdata(3),
       GD_SIZE(3)

Version 0.9.0                                    16 October 2014                                   gd_getdata(3)