Provided by: pcp_5.3.7-1_amd64 bug

NAME

       LOGARCHIVE - performance metrics archive format

DESCRIPTION

       Performance  Co-Pilot  (PCP)  archives  store  historical  values  about arbitrary metrics
       recorded from a single host.  Archives are machine independent and  self-contained  -  all
       metadata required for off-line or off-site analysis is stored.

       The  format  is  stable  in  order to allow long-term historical storage and processing by
       PMAPI client tools.

       Archives may be read by most PCP client tools, using  the  -a/--archive  NAME  option,  or
       dumped  raw  by pmdumplog(1).  Archives are created primarily by pmlogger(1), however they
       can also be created using the LOGIMPORT(3) programming interface.

       Archives  may  be  merged,  analyzed,  modified  and  subsampled   using   pmlogreduce(1),
       pmlogsummary(1),  pmlogrewrite(1)  and  pmlogextract(1).  In addition, PCP archives may be
       examined in sets or grouped together into "archive folios", which are created and  managed
       by the mkaf(1) and pmafm(1) tools.

       Archives  consist  of  several  physical  files that share a common arbitrary prefix, e.g.
       myarchive.

       myarchive.0, myarchive.1, ...
              One or more  data  volumes  containing  the  metric  values  and  any  error  codes
              encountered  during  metric  sampling.   Typically the largest of the files and may
              grow very rapidly, depending on the pmlogger sampling interval(s) being used.

       myarchive.meta
              Information for PMAPI functions  such  as  pmLookupDesc(3),  pmLookupLabels(3)  and
              pmLookupInDom(3).   The  metadata  file  may  grow  sporadically as logged metrics,
              instance domains and labels vary over time.

       myarchive.index
              A temporal index, mapping timestamps to offsets in the other files.

COMMON FEATURES

       All three types of files have a similar record-based structure, a convention  of  network-
       byte-order (big-endian) encoding, and 32-bit fields for tagging/padding for those records.
       Strings are stored as 8-bit characters without assuming a specific encoding,  so  normally
       ASCII.  See also the __pmLog* types in libpcp.h.

   RECORD FRAMING
       The volume and meta files are divided into self-identifying records.

                ┌───────┬────────┬─────────────────────────────────────────────────────┐
                │Offset │ Length │                        Name                         │
                ├───────┼────────┼─────────────────────────────────────────────────────┤
                │  0    │   4    │ N, length of record, in bytes, including this field │
                │  4    │  N-8   │ record payload, usually starting with a 32-bit tag  │
                │ N-4   │   4    │ N, length of record (again)                         │
                └───────┴────────┴─────────────────────────────────────────────────────┘

   ARCHIVE LOG LABEL
       All  three types of files begin with a "log label" header, which identifies the host name,
       the time interval covered, and a time zone.

                 ┌───────┬────────┬────────────────────────────────────────────────────┐
                 │Offset │ Length │                        Name                        │
                 ├───────┼────────┼────────────────────────────────────────────────────┤
                 │  0    │   4    │ tag, PM_LOG_MAGIC | PM_LOG_VERS02=0x50052602       │
                 │  4    │   4    │ pid of pmlogger process that wrote file            │
                 │  8    │   4    │ log start time, seconds part (past UNIX epoch)     │
                 │  12   │   4    │ log start time, microseconds part                  │
                 │  16   │   4    │ current log volume number (or -1=.meta, -2=.index) │
                 │  20   │   64   │ name of collection host                            │
                 │  80   │   40   │ time zone string ($TZ environment variable)        │
                 └───────┴────────┴────────────────────────────────────────────────────┘

       All fields, except for the current log volume number field, match for all  archive-related
       files produced by a single run of the tool.

ARCHIVE VOLUME (.0, .1, ...) RECORDS

   pmResult
       After  the  archive  log  label  record,  an  archive  volume  file contains metric values
       corresponding to the pmResult set of one pmFetch operation, which is almost  identical  to
       the  form  on  disk.  The record size may vary according to number of PMIDs being fetched,
       the number of instances for their domains.  File size is limited to 2GiB, due  to  storage
       of 32-bit offsets within the temporal index.

                     ┌────────┬────────┬───────────────────────────────────────────┐
                     │Offset  │ Length │                   Name                    │
                     ├────────┼────────┼───────────────────────────────────────────┤
                     │   0    │   4    │ timestamp, seconds part (past UNIX epoch) │
                     │   4    │   4    │ timestamp, microseconds part              │
                     │   8    │   4    │ number of PMIDs with data following       │
                     │  12    │   M    │ pmValueSet #0                             │
                     │ 12+M   │   N    │ pmValueSet #1                             │
                     │12+M+N  │  ...   │ ...                                       │
                     │  NOP   │   X    │ pmValueBlock #0                           │
                     │ NOP+X  │   Y    │ pmValueBlock #1                           │
                     │NOP+X+Y │  ...   │ ...                                       │
                     └────────┴────────┴───────────────────────────────────────────┘

       Records   with  a  number-of-PMIDs  equal  to  zero  are  "mark  records",  and  represent
       interruptions, missing data, or time discontinuities in logging.

   pmValueSet
       This subrecord represents the measurements for one metric.

                   ┌───────┬────────┬────────────────────────────────────────────────┐
                   │Offset │ Length │                      Name                      │
                   ├───────┼────────┼────────────────────────────────────────────────┤
                   │  0    │   4    │ PMID                                           │
                   │  4    │   4    │ number of values                               │
                   │  8    │   4    │ storage mode, PM_VAL_INSITU=0 or PM_VAL_DPTR=1 │
                   │  12   │   M    │ pmValue #0                                     │
                   │ 12+M  │   N    │ pmValue #1                                     │
                   │12+M+N │  ...   │ ...                                            │
                   └───────┴────────┴────────────────────────────────────────────────┘

       The metric-description metadata for PMIDs is found in the .meta files.  These entries  are
       not  timestamped,  so  the  metadata  is assumed to be unchanging throughout the archiving
       session.

   pmValue
       This subrecord represents one measurement for one instance of the metric.  It is a variant
       type,  depending on the parent pmValueSet's value-format field.  This allows small numbers
       to be encoded compactly, but retain flexibility for larger or variable-length data  to  be
       stored later in the pmResult record.

                   ┌───────┬────────┬───────────────────────────────────────────────┐
                   │Offset │ Length │                     Name                      │
                   ├───────┼────────┼───────────────────────────────────────────────┤
                   │  0    │   4    │ number in instance-domain (or PM_IN_NULL=-1)  │
                   │  4    │   4    │ value (INSITU) or                             │
                   │       │        │ offset in pmResult to our pmValueBlock (DPTR) │
                   └───────┴────────┴───────────────────────────────────────────────┘

       The  instance-domain  metadata  for  PMIDs is found in the .meta files.  Since the numeric
       mappings may change during the lifetime of the logging session, it is important  to  match
       up  the timestamp of the measurement record with the corresponding instance-domain record.
       That is, the instance-domain corresponding to a measurement at time T are the records with
       largest timestamps T' <= T.

   pmValueBlock
       Instances of this subrecord are placed at the end of the pmValueSet, after all the pmValue
       subrecords.  If (and only if) needed, they are padded at the end to the next-higher 32-bit
       boundary.

                   ┌───────┬────────┬────────────────────────────────────────────────┐
                   │Offset │ Length │                      Name                      │
                   ├───────┼────────┼────────────────────────────────────────────────┤
                   │  0    │   1    │ value type (same as pmDesc.type)               │
                   │  1    │   3    │ 4 + N, the length of the subrecord             │
                   │  4    │   N    │ bytes that make up the raw value               │
                   │ 4+N   │  0-3   │ padding (not included in the 4+N length field) │
                   └───────┴────────┴────────────────────────────────────────────────┘

       Note  that  for PM_TYPE_STRING, the length includes an explicit NULL terminator byte.  For
       PM_TYPE_EVENT, the value bytestring is further structured.

METADATA FILE (.meta) RECORDS

       After the archive log  label  record,  the  metadata  file  contains  interleaved  metric-
       description  and  timestamped  instance-domain descriptors.  File size is limited to 2GiB,
       due to storage of 32-bit offsets within the temporal  index.   Unlike  the  data  volumes,
       these records are not forced to 32-bit alignment.  See also libpcp/logmeta.c.

   pmDesc
       Instances  of this record represent the metric description, giving a name, type, instance-
       domain identifier, and a set of names to each PMID used in the archive volume.

                 ┌───────┬────────┬───────────────────────────────────────────────────┐
                 │Offset │ Length │                       Name                        │
                 ├───────┼────────┼───────────────────────────────────────────────────┤
                 │  0    │   4    │ tag, TYPE_DESC=1                                  │
                 │  4    │   4    │ PMID                                              │
                 │  8    │   4    │ type (PM_TYPE_*)                                  │
                 │  12   │   4    │ instance domain number                            │
                 │  16   │   4    │ semantics of value (PM_SEM_*)                     │
                 │  20   │   4    │ units: bit-packed pmUnits                         │
                 │  4    │   4    │ number of alternative names for this PMID         │
                 │  28   │   4    │ N: number of bytes in this name                   │
                 │  32   │   N    │ bytes of the name, no NULL terminator nor padding │
                 │ 32+N  │   4    │ N2: number of bytes in next name                  │
                 │ 36+N  │   N2   │ bytes of the name, no NULL terminator nor padding │
                 │ ...   │  ...   │ ...                                               │
                 └───────┴────────┴───────────────────────────────────────────────────┘

   pmLogIndom
       Instances of this record represent the number-string mapping table of an instance  domain.
       The  instance domain number will have already been mentioned in a prior pmDesc record.  As
       new instances may appear over a long archiving run these records are timestamped, and must
       be  searched when decoding pmResult records from the archive data volumes.  Instance names
       may be reused between instance numbers, so an  offset-based  string  table  is  used  that
       facilitates sharing.

                  ┌─────────┬────────┬───────────────────────────────────────────────┐
                  │ Offset  │ Length │                     Name                      │
                  ├─────────┼────────┼───────────────────────────────────────────────┤
                  │   0     │   4    │ tag, TYPE_INDOM=2                             │
                  │   4     │   4    │ timestamp, seconds part (past UNIX epoch)     │
                  │   8     │   4    │ timestamp, microseconds part                  │
                  │   12    │   4    │ instance domain number                        │
                  │   16    │   4    │ N: number of instances in domain, normally >0 │
                  │   20    │   4    │ first instance number                         │
                  │   24    │   4    │ second instance number (if appropriate)       │
                  │  ...    │  ...   │ ...                                           │
                  │ 20+4*N  │   4    │ first offset into string table (see below)    │
                  │20+4*N+4 │   4    │ second offset into string table (etc.)        │
                  │  ...    │  ...   │ ...                                           │
                  │ 20+8*N  │   M    │ base of string table, containing              │
                  │         │        │ packed, NULL-terminated instance names        │
                  └─────────┴────────┴───────────────────────────────────────────────┘

       Records  of this form replace the existing instance-domain: prior records are not searched
       for resolving instance numbers in measurements after this timestamp.

   pmLogLabelSet
       Instances of this record represent sets of name:value pairs associated with labels of  the
       context,  instance domains and individual performance metrics - refer to pmLookupLabels(3)
       for further details.

       Any instance domain number will have already been mentioned in a prior pmDesc record.   As
       new  labels can appear during an archiving session, these records are timestamped and must
       be searched when decoding pmResult records from the archive data volumes.

                ┌────────────┬────────┬────────────────────────────────────────────────┐
                │  Offset    │ Length │                      Name                      │
                ├────────────┼────────┼────────────────────────────────────────────────┤
                │     0      │   4    │ tag, TYPE_LABEL=3                              │
                │     4      │   4    │ timestamp, seconds part (past UNIX epoch)      │
                │     8      │   4    │ timestamp, microseconds part                   │
                │    12      │   4    │ label type (PM_LABEL_* type macros.)           │
                │    16      │   4    │ numeric identifier - domain, PMID, etc         │
                │            │        │ or PM_IN_NULL=-1 for context labels            │
                │    20      │   4    │ N: number of label sets in this record,        │
                │            │        │ usually 1 except in the case of instances      │
                │    24      │   4    │ offset to the start of the JSONB labels string │
                │    28      │   L1   │ first labelset array entry (see below)         │
                │    ...     │  ...   │ ...                                            │
                │   28+L1    │   LN   │ N-th labelset array entry (see below)          │
                │    ...     │  ...   │ ...                                            │
                │28+L1+...LN │   M    │ concatenated JSONB strings for all labelsets   │
                └────────────┴────────┴────────────────────────────────────────────────┘

       Records of this form replace the existing labels for a given type: prior records  are  not
       searched for resolving that class of label in measurements after this timestamp.

       The  individual  labelset  array  entries  are variable length, depending on the number of
       labels present within that set.  These entries contain the instance  identifiers  (in  the
       case of type PM_LABEL_INSTANCES labels), lengths and offsets of each label name and value,
       and also any flags set for each label.

                     ┌───────┬────────┬───────────────────────────────────────────┐
                     │Offset │ Length │                   Name                    │
                     ├───────┼────────┼───────────────────────────────────────────┤
                     │  0    │   4    │ instance identifier (or PM_IN_NULL=-1)    │
                     │  4    │   4    │ length of JSONB label string              │
                     │  8    │   4    │ N: number of labels in this labelset      │
                     │  12   │   2    │ first label name offset                   │
                     │  14   │   1    │ first label name length                   │
                     │  15   │   1    │ first label flags (e.g. optionality)      │
                     │  16   │   2    │ first label value offset                  │
                     │  18   │   2    │ first label value length                  │
                     │  20   │   2    │ second label name offset (if appropriate) │
                     │ ...   │  ...   │ ...                                       │
                     └───────┴────────┴───────────────────────────────────────────┘

   pmLogText
       This record stores help text associated with a metric or an instance domain - as  provided
       by pmLookupText(3) and pmLookupInDomText(3).

       The  metric  identifier  and  instance domain number will have already been mentioned in a
       prior pmDesc record.

                    ┌───────┬────────┬──────────────────────────────────────────────┐
                    │Offset │ Length │                     Name                     │
                    ├───────┼────────┼──────────────────────────────────────────────┤
                    │  0    │   4    │ tag, TYPE_TEXT=4                             │
                    │  4    │   4    │ text and identifier type (PM_TEXT_* macros.) │
                    │  8    │   4    │ numeric identifier - PMID or instance domain │
                    │  12   │   M    │ help text string, arbitrary text             │
                    └───────┴────────┴──────────────────────────────────────────────┘

INDEX FILE (.index) RECORDS

       After  the  archive  log  label  record,  the  temporal  index  file  contains  a  plainly
       concatenated,  unframed group of tuples, which relate timestamps to 32-bit seek offsets in
       the volume and meta files.  These  records  are  fixed-size,  fixed-format,  and  are  not
       enclosed  in the standard length/payload/length wrapper: they take up the entire remainder
       of the .index file.  See also libpcp/logutil.c.

                 ┌───────┬────────┬───────────────────────────────────────────────────┐
                 │Offset │ Length │                       Name                        │
                 ├───────┼────────┼───────────────────────────────────────────────────┤
                 │  0    │   4    │ event time, seconds part (past UNIX epoch)        │
                 │  4    │   4    │ event time, microseconds part                     │
                 │  8    │   4    │ archive volume number (0...N)                     │
                 │  12   │   4    │ byte offset in .meta file of pmDesc or pmLogIndom │
                 │  16   │   4    │ byte offset in archive volume file of pmResult    │
                 └───────┴────────┴───────────────────────────────────────────────────┘

       Since the temporal index is optional, and exists only to speed up time-based random access
       to  metrics  and  their  metadata,  the index records are emitted only intermittently.  An
       archive reader program should not presume any particular rate of data flow into the index.
       However,  common  events  that  may trigger a new temporal-index record include changes in
       instance-domains, switching over to  a  new  archive  volume,  and  starting  or  stopping
       logging.  One reliable invariant however is that, for each index entry, there are to be no
       meta or archive-volume records with a timestamp after that in the  index,  but  physically
       before the byte-offset in the index.

FILES

       Several PCP tools create archives in standard locations:

       $HOME/.pcp/pmlogger
                 default location for the interactive chart recording mode in pmchart(1)
       $PCP_LOG_DIR/pmlogger
                 default location for pmlogger_daily(1) and pmlogger_check(1) scripts

SEE ALSO

       PCPIntro(1),    PMAPI(3),    pmLookupDesc(3),    pmLookupInDom(3),   pmLookupInDomText(3),
       pmLookupLabels(3),   pmLookupText(3),   mkaf(1),   pmafm(1),   pmchart(1),   pmdumplog(1),
       pmlogger(1),   pmlogger_check(1),   pmlogger_daily(1),   pmlogreduce(1),  pmlogrewrite(1),
       pmlogsummary(1), pcp.conf(5), and pcp.env(5).