Provided by: pcp_4.0.1-1_amd64

NAME

       pmlogrewrite - rewrite Performance Co-Pilot archives

SYNOPSIS

       $PCP_BINADM_DIR/pmlogrewrite [-Cdiqsvw] [-c config] inlog [outlog]

DESCRIPTION

       pmlogrewrite reads a set of Performance Co-Pilot (PCP) archive logs identified by inlog and creates a PCP
       archive  log  in outlog.  Under normal usage, the -c option will be used to nominate a configuration file
       or files that contain specifications (see the REWRITING RULES SYNTAX section below)  that  describe  how
       the data and metadata from inlog should be transformed to produce outlog.

       The  typical  uses  for  pmlogrewrite  would be to accommodate the evolution of Performance Metric Domain
       Agents (PMDAs) where the names, metadata and semantics of metrics and their associated  instance  domains
       may  change over time, e.g. promoting the type of a metric from a 32-bit to a 64-bit integer, or renaming
       a group of metrics.  Refer to the EXAMPLES section for some additional use cases.

       pmlogrewrite is most useful where PMDA changes, or  errors  in  the  production  environment,  result  in
       archives  that cannot be combined with pmlogextract(1).  By pre-processing the archives with pmlogrewrite
       the resulting archives may be able to be merged with pmlogextract(1).

       The input inlog must be a set of PCP archive logs created by pmlogger(1), or possibly one  of  the  tools
       that  read and create PCP archives, e.g.  pmlogextract(1) and pmlogreduce(1).  inlog is a comma-separated
       list of names, each of which may be the base name of an archive or the name of a directory containing one
       or more archives.

       If no -c option is specified, then the default behavior simply creates outlog as a copy of  inlog.   This
       is a little more complicated than cat(1), as each PCP archive is made up of several physical files.

       While  pmlogrewrite  may  be used to repair some data consistency issues in PCP archives, there is also a
       class of repair tasks that it cannot handle; pmloglabel(1) may be a useful tool in these cases.

OPTIONS

       The command line options for pmlogrewrite are as follows:

       -C     Parse  the rewriting rules and quit.  outlog is not created.  When -C is specified, this also sets
              -v and -w so that all warnings and verbose messages are displayed as config is parsed.

       -c config
              If config is a file or symbolic link, read and parse rewriting rules from there.  If config  is  a
              directory,  then  all  of the files or symbolic links in that directory (excluding those beginning
              with a period ``.'') will be used to  provide  the  rewriting  rules.   Multiple  -c  options  are
              allowed.

       -d     Desperate  mode.  Normally if a fatal error occurs, all trace of the partially written PCP archive
              outlog is removed.  With the -d option, the partially created outlog archive log is not removed.

       -i     Rather than creating outlog, inlog is rewritten in place when  the  -i  option  is  used.   A  new
              archive  is created using temporary file names and then renamed to inlog in such a way that if any
              errors (not warnings) are encountered, inlog remains unaltered.

       -q     Quick mode, where if there are no rewriting actions to be performed  (none  of  the  global  data,
              instance  domains or metrics from inlog will be changed), then pmlogrewrite will exit (with status
              0, so success) immediately after parsing the configuration file(s) and outlog is not created.

       -s     When the ``units'' of a metric are changed, if the dimension in terms of space, time and count  is
              unaltered,  then  the scaling factor is being changed, e.g. BYTE to KBYTE, or MSEC^-1 to USEC^-1, or
              the composite MBYTE.SEC^-1 to KBYTE.USEC^-1.  The motivation may be (a) that the  original  metadata
              was  wrong but the values in inlog are correct, or (b) the metadata is changing so the values need
              to change as well.  The default pmlogrewrite behaviour matches case (a).   If  case  (b)  applies,
              then use the -s option and the values of all the metrics with a scale factor change in each result
              will  be  rescaled.   For  finer  control over value rescaling refer to the RESCALE option for the
              UNITS clause of the metric rewriting rule described below.

       -v     Increase verbosity of diagnostic output.

       -w     Emit warnings.  Normally pmlogrewrite remains silent for any warning that is not fatal and  it  is
              expected  that for a particular archive, some (or indeed, all) of the rewriting specifications may
              not apply.  For example, changes to a PMDA may be captured in a set  of  rewriting  rules,  but  a
              single  archive  may  not  contain  all  of  the modified metrics nor all of the modified instance
              domains and/or instances.  Because these cases are expected,  they  do  not  prevent  pmlogrewrite
              executing,  and rules that do not apply to inlog are silently ignored by default.  Similarly, some
              rewriting rules may involve no change because the metadata in inlog already matches the intent  of
              the rewriting rule to correct data from a previous version of a PMDA.  The -w flag forces warnings
              to be emitted for all of these cases.

       The argument outlog is required in all cases, except when -i is specified.
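
       As an illustration of how the -c arguments described above expand into rule files, the following Python
       sketch lists the files that a set of -c values would name.  The function name expand_config is
       hypothetical and not part of PCP; sorting the directory entries and skipping sub-directories are
       assumptions made here, since the text above does not specify either.

```python
import os

def expand_config(paths):
    """Return the rule files named by a list of -c arguments (illustrative only)."""
    files = []
    for path in paths:
        if os.path.isdir(path):
            # Directory: use every file or symbolic link whose name does not
            # begin with a period.  Sorting is an assumption for stable output.
            for entry in sorted(os.listdir(path)):
                if entry.startswith("."):
                    continue
                full = os.path.join(path, entry)
                if os.path.isfile(full):  # follows symbolic links
                    files.append(full)
        else:
            # Plain file or symbolic link: read rules from it directly.
            files.append(path)
    return files
```

       Because multiple -c options are allowed, the sketch accepts a list and concatenates the results in the
       order given.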

REWRITING RULES SYNTAX

       A configuration file contains zero or more rewriting rules as defined below.

       Keywords  and special punctuation characters are shown below in bold italic font and are case-insensitive,
       so METRIC, metric and Metric are all equivalent in rewriting rules.

       The character ``#'' introduces a comment and the remainder of the line is ignored.  Otherwise  the  input
       is  relatively  free format with optional white space (spaces, tabs or newlines) between lexical items in
       the rules.

       A global rewriting rule has the form:

       GLOBAL { globalspec ...  }

       where globalspec is zero or more of the following clauses:

           HOSTNAME -> hostname

               Modifies the label records in the outlog PCP archive, so that the metrics  will  appear  to  have
               been collected from the host hostname.

           TIME -> delta

               Both  metric  values  and  the  instance domain metadata in a PCP archive carry timestamps.  This
               clause forces all the timestamps to be adjusted by delta, where delta is an optional  sign  ``+''
               (the default) or ``-'', an optional number of hours followed by a colon ``:'', an optional number
               of  minutes  followed  by  a  colon  ``:'',  a number of seconds, and an optional fraction of seconds
               following a period ``.''.  The simplest example would be ``30'' to increase the timestamps by  30
               seconds.   A  more complex example would be ``-23:59:59.999'' to move the timestamps backwards by
               one millisecond less than one day.
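
               The delta syntax can be sketched in Python as below.  This is not the parser used by
               pmlogrewrite; the helper name parse_delta is hypothetical, and reading a two-field value such
               as ``5:30'' as minutes and seconds is an assumption, as the description above leaves that case
               open.

```python
import re

def parse_delta(text):
    """Convert a TIME delta such as "-23:59:59.999" into signed seconds."""
    sign = -1.0 if text.startswith("-") else 1.0
    if text.startswith(("+", "-")):
        text = text[1:]
    fields = text.split(":")
    if not 1 <= len(fields) <= 3:
        raise ValueError("expected [[hours:]minutes:]seconds[.fraction]")
    # The last field is seconds with an optional fraction.
    if not re.fullmatch(r"\d+(\.\d+)?", fields[-1]):
        raise ValueError("bad seconds field")
    # Any leading fields must be plain integers.
    for field in fields[:-1]:
        if not re.fullmatch(r"\d+", field):
            raise ValueError("bad hours or minutes field")
    seconds = float(fields[-1])
    minutes = int(fields[-2]) if len(fields) >= 2 else 0
    hours = int(fields[-3]) if len(fields) == 3 else 0
    return sign * (hours * 3600 + minutes * 60 + seconds)
```

               With this sketch, ``30'' yields 30 seconds and ``-23:59:59.999'' yields a negative offset one
               millisecond short of a day, matching the two examples above.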

           TZ -> "timezone"

               Modifies the label records in the outlog PCP archive, so that the metrics  will  appear  to  have
               been  collected  from  a  host  with  a local timezone of timezone.  timezone must be enclosed in
               quotes, and should conform to the valid timezone syntax rules for the local platform.

       An indom rewriting rule modifies an instance domain and has the form:

       INDOM domain.serial { indomspec ...  }

       where domain and serial identify one or more existing instance domains  from  inlog  -  typically  domain
       would be an integer in the range 1 to 510 and serial would be an integer in the range 0 to 4194303.

       As a special case serial could be an asterisk ``*'' which means the rule applies to every instance domain
       with a domain number of domain.

       If a designated instance domain is not in inlog the rule has no effect.

       The indomspec is zero or more of the following clauses:

           INAME "oldname" -> "newname"

               The  instance  identified  by  the  external  instance  name oldname is renamed to newname.  Both
               oldname and newname must be enclosed in quotes.

               As a special case, the new name may be the keyword DELETE (with no quotes), and then the instance
               oldname will be expunged from outlog which removes it  from  the  instance  domain  metadata  and
               removes all values of this instance for all the associated metrics.

               If  the instance names contain any embedded spaces then special care needs to be taken in respect
               of the PCP instance naming rule that treats the leading non-space part of the  instance  name  as
               the  unique  portion  of  the name for the purposes of matching and ensuring uniqueness within an
               instance domain, refer to pmdaInstance(3) for a discussion of this issue.

               As an illustration, consider the hypothetical instance domain  for  a  metric  which  contains  2
               instances with the following names:
                   red
                   eek urk

               Then some possible INAME clauses might be:

               "eek" -> "yellow like a flower"
                         Acceptable, oldname "eek" matches the "eek urk" instance.

               "red" -> "eek"
                         Error, newname "eek" matches the existing "eek urk" instance.

               "eek urk" -> "red of another hue"
                         Error, newname "red of another hue" matches the existing "red" instance.
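
               The three cases above can be reproduced with the Python sketch below.  It is only a model of
               the naming rule, not code from PCP: the function names are hypothetical, and both the lookup
               order (exact match first, then leading-word match) and the exclusion of the instance being
               renamed from the clash check are assumptions.

```python
def leading_word(name):
    """Return the part of an instance name before the first space."""
    return name.split(" ", 1)[0]

def find_instance(names, oldname):
    """Locate the instance that oldname refers to, or None."""
    for name in names:
        if name == oldname:
            return name
    if " " not in oldname:
        # A name without spaces may match on the leading non-space part.
        for name in names:
            if leading_word(name) == oldname:
                return name
    return None

def check_rename(names, oldname, newname):
    """Classify an INAME clause as "ok", "error" or "no match"."""
    target = find_instance(names, oldname)
    if target is None:
        return "no match"
    for name in names:
        # The leading word of newname must not clash with another instance.
        if name != target and leading_word(name) == leading_word(newname):
            return "error"
    return "ok"
```

               For the instance domain containing ``red'' and ``eek urk'', the sketch accepts the first
               clause and rejects the other two.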

           INDOM -> newdomain.newserial

               Modifies  the  metadata  for  the  instance  domain and every metric associated with the instance
               domain.  As a special case, newserial could be an asterisk ``*'' which means use serial from  the
               indom  rewriting  rule,  although  this  is  most useful when serial is also an asterisk.  So for
               example:
                   indom 29.* { indom -> 109.* }
               will move all instance domains from domain 29 to domain 109.

           INDOM -> DUPLICATE newdomain.newserial

               A special case of the previous INDOM clause where the instance domain is a duplicate copy of  the
               domain.serial  instance  domain  from  the  indom  rewriting rule, and then any mapping rules are
               applied to the copied newdomain.newserial instance domain.  This is useful when a PMDA  is  split
               and  the  same instance domain needs to be replicated for domain domain and domain newdomain.  So
               for example if the metrics foo.one and foo.two are both defined over instance domain  12.34,  and
               foo.two  is  moved  to  another PMDA using domain 27, then the following rewriting rules could be
               used:
                   indom 12.34 { indom -> duplicate 27.34 }
                   metric foo.two { indom -> 27.34 pmid -> 27.*.*  }

           INST oldid -> newid

               The instance identified by the internal instance identifier oldid is renumbered to  newid.   Both
               oldid and newid are integers in the range 0 to 2^31-1.

               As  a  special case, newid may be the keyword DELETE and then the instance oldid will be expunged
               from outlog which removes it from the instance domain metadata and removes  all  values  of  this
               instance for all the associated metrics.

       A metric rewriting rule has the form:

       METRIC metricid { metricspec ...  }

       where  metricid  identifies  one  or  more existing metrics from inlog using either a metric name, or the
       internal encoding for a metric's PMID as domain.cluster.item.  In the latter case, typically domain would
       be an integer in the range 1 to 510, cluster would be an integer in the range 0 to 4095, and  item  would
       be an integer in the range 0 to 1023.

       As  special  cases  item  could  be an asterisk ``*'' which means the rule applies to every metric with a
       domain number of domain and a cluster number of cluster, or cluster could be an asterisk which means  the
       rule  applies  to every metric with a domain number of domain and an item number of item, or both cluster
       and item could be asterisks, and the rule applies to every metric with a domain number of domain.

       If a designated metric is not in inlog the rule has no effect.

       The metricspec is zero or more of the following clauses:

           DELETE

               The metric is completely removed from outlog, both the metadata and all  values  in  results  are
               expunged.

           INDOM -> newdomain.newserial [ pick ]

               Modifies  the  metadata  to  change the instance domain for this metric.  The new instance domain
               must exist in outlog.

               The optional pick clause may be used to select one input value, or  compute  an  aggregate  value
               from  the  instances  in  an  input result, or assign an internal instance identifier to a single
               output value.  If no pick clause is specified, the default behaviour is to copy all input  values
               from  each  input  result  to  an output result, however if the input instance domain is singular
               (indom PM_INDOM_NULL) then the one output value must be assigned an internal instance identifier,
               which is 0 by default, unless overridden by an INST or INAME clause as defined below.

               The choices for pick are as follows:

               OUTPUT FIRST
                           choose the value of the first instance from each input result

               OUTPUT LAST choose the value of the last instance from each input result

               OUTPUT INST instid
                           choose the value of the instance with internal instance identifier instid  from  each
                           result;  the sequence of rewriting rules ensures the OUTPUT processing happens before
                           instance identifier renumbering from any associated indom rule, so instid  should  be
                           one of the internal instance identifiers that appears in inlog

               OUTPUT INAME "name"
                           choose  the  value of the instance with name for its external instance name from each
                           result; the sequence of rewriting rules ensures the OUTPUT processing happens  before
                           instance  renaming  from  any  associated  indom  rule,  so name should be one of the
                           external instance names that appears in inlog

               OUTPUT MIN  choose the smallest value in each result (metric type  must  be  numeric  and  output
                           instance will be 0 for a non-singular instance domain)

               OUTPUT MAX  choose  the  largest  value  in  each  result (metric type must be numeric and output
                           instance will be 0 for a non-singular instance domain)

               OUTPUT SUM  choose the sum of all values in each result (metric type must be numeric  and  output
                           instance will be 0 for a non-singular instance domain)

               OUTPUT AVG  choose  the  average  of  all  values in each result (metric type must be numeric and
                           output instance will be 0 for a non-singular instance domain)

               If the input instance domain is singular (indom  PM_INDOM_NULL)  then  independent  of  any  pick
               specifications, there is at most one value in each input result and so FIRST, LAST, MIN, MAX, SUM
               and AVG are all equivalent and the output instance identifier will be 0.
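
               As a rough model of the pick choices above, the Python sketch below reduces one input result
               to a single value.  The function name pick_value and the representation of a result as a list
               of (instance identifier, value) pairs are assumptions for illustration; OUTPUT INAME is
               omitted, and computing AVG with floating point division is also an assumption.

```python
def pick_value(result, pick, instid=None):
    """Reduce a list of (instid, value) pairs to one value, or None if empty."""
    if not result:
        return None
    values = [value for _, value in result]
    if pick == "FIRST":
        return values[0]
    if pick == "LAST":
        return values[-1]
    if pick == "INST":
        # Select by the internal instance identifier as it appears in inlog.
        for inst, value in result:
            if inst == instid:
                return value
        return None
    if pick == "MIN":
        return min(values)
    if pick == "MAX":
        return max(values)
    if pick == "SUM":
        return sum(values)
    if pick == "AVG":
        return sum(values) / len(values)
    raise ValueError("unknown pick: " + pick)
```

               When the input result holds a single value, FIRST, LAST, MIN, MAX, SUM and AVG all return
               that value, which mirrors the singular case described above.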

               In  general  it  is an error to specify a rewriting action for the same metadata or result values
               more than once, e.g. more than one INDOM clause for the same instance domain.  The one  exception
               is the possible interaction between the INDOM clauses in the indom and metric rules.  For example
               the  metric  sample.bin  is  defined  over the instance domain 29.2 in inlog and the following is
               acceptable (albeit redundant):
                   indom 29.* { indom -> 109.* }
                   metric sample.bin { indom -> 109.2 }
               However the following is an error, because the instance domain for sample.bin has two conflicting
               definitions:
                   indom 29.* { indom -> 109.* }
                   metric sample.bin { indom -> 123.2 }

           INDOM -> NULL [ pick ]

               The metric (which must have been previously defined over an instance domain) is being modified to
               be a singular metric.  This involves a metadata change and collapsing all results for this metric
               so that multiple values become one value.

               The optional pick part of the clause defines  how  the  one  value  for  each  result  should  be
               calculated and follows the same rules as described for the non-NULL INDOM case above.

               In the absence of pick, the default is OUTPUT FIRST.

           NAME -> newname

               Renames the metric in the PCP archive's metadata that supports the Performance Metrics Name Space
               (PMNS).   newname  should  not  match any existing name in the archive's PMNS and must follow the
               syntactic rules for valid metric names as outlined in pmns(5).

           PMID -> newdomain.newcluster.newitem

               Modifies the metadata and results to renumber the metric's PMID.  As  special  cases,  newcluster
               could  be  an  asterisk  ``*'' which means use cluster from the metric rewriting rule and/or item
               could be an asterisk which means use item from the metric rewriting rule.  This  is  most  useful
               when cluster and/or item is also an asterisk.  So for example:
                   metric 30.*.* { pmid -> 123.*.* }
               will move all metrics from domain 30 to domain 123.
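
               The asterisk handling can be modelled with the short Python sketch below, which applies one
               metric rule with a PMID clause to a PMID held as a (domain, cluster, item) tuple.  The names
               parse_triple and rewrite_pmid are hypothetical and the code is not taken from pmlogrewrite.

```python
def parse_triple(text):
    """Parse "domain.cluster.item", mapping "*" to None."""
    parts = text.split(".")
    if len(parts) != 3:
        raise ValueError("expected domain.cluster.item")
    return tuple(None if part == "*" else int(part) for part in parts)

def rewrite_pmid(pmid, metricid, target):
    """Apply `metric metricid { pmid -> target }` to one PMID tuple."""
    pattern = parse_triple(metricid)
    new = parse_triple(target)
    # The rule only applies when every non-asterisk field of metricid matches.
    for wanted, actual in zip(pattern, pmid):
        if wanted is not None and wanted != actual:
            return pmid
    # An asterisk in the target keeps the corresponding field of the metric.
    return tuple(actual if field is None else field
                 for field, actual in zip(new, pmid))
```

               Under this model a PMID that does not match metricid is returned unchanged, in line with the
               statement that a rule has no effect on metrics it does not designate.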

           SEM -> newsem

               Change  the  semantics  of  the  metric.  newsem should be the XXX part of the name of one of the
               PM_SEM_XXX macros defined in <pcp/pmapi.h> or pmLookupDesc(3), e.g.  COUNTER for PM_SEM_COUNTER.

               No data value rewriting is performed as a result of the SEM clause, so the usefulness is  limited
               to cases where a version of the associated PMDA was exporting incorrect semantics for the metric.
               pmlogreduce(1)  may  provide  an  alternative  in  cases where re-computation of result values is
               desired.

           TYPE -> newtype

               Change the type of the metric which alters the metadata and may change the encoding of values  in
               results.   newtype should be the XXX part of the name of one of the PM_TYPE_XXX macros defined in
               <pcp/pmapi.h> or pmLookupDesc(3), e.g.  FLOAT for PM_TYPE_FLOAT.

               Type conversion is only supported for cases where the old and new  metric  type  is  numeric,  so
               PM_TYPE_STRING, PM_TYPE_AGGREGATE and PM_TYPE_EVENT are not allowed.  Even for the numeric cases,
               some  conversions  may produce run-time errors, e.g. integer overflow, or attempting to rewrite a
               negative value into an unsigned type.
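
               The run-time errors mentioned above can be illustrated for the integer types with the Python
               sketch below.  The helper convert_integer and the choice of exception classes are
               hypothetical; the sketch only shows the range checks involved and says nothing about how
               pmlogrewrite reports such errors.

```python
# Value ranges for the integer type names used in rewriting rules.
RANGES = {
    "32": (-2**31, 2**31 - 1),
    "U32": (0, 2**32 - 1),
    "64": (-2**63, 2**63 - 1),
    "U64": (0, 2**64 - 1),
}

def convert_integer(value, newtype):
    """Return value unchanged if it fits newtype, else raise an error."""
    low, high = RANGES[newtype]
    if low == 0 and value < 0:
        raise ValueError("negative value cannot be stored in " + newtype)
    if not low <= value <= high:
        raise OverflowError("value does not fit in " + newtype)
    return value
```

               Promotions such as 32 to 64 or U32 to U64 always pass these checks; demotions and
               signed-to-unsigned changes depend on the values actually present in inlog.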

           TYPE IF oldtype -> newtype

               The same as the preceding TYPE clause, except the type of the metric is only changed  to  newtype
               if the type of the metric in inlog is oldtype.

               This is useful in cases where the type of metricid in inlog may be platform dependent and so more
               than one type rewriting rule is required.

           UNITS -> newunits [ RESCALE ]

               newunits is six values separated by commas.  The first 3 values describe  the  dimension  of  the
               metric  along  the dimensions of space, time and count; these are integer values, usually 0, 1 or
               -1.  The remaining 3 values describe the scale of the metric's values in the dimensions of space,
               time and count.  Space scale values should be 0 (if the space dimension is 0), else the XXX  part
               of  the name of one of the PM_SPACE_XXX macros, e.g.  KBYTE for PM_SPACE_KBYTE.  Time scale values
               should be 0 (if the time dimension is 0), else the XXX part of the name of one of the PM_TIME_XXX
               macros, e.g.  SEC for PM_TIME_SEC.  Count scale values should be 0 (if the count dimension is 0),
               else ONE for PM_COUNT_ONE.

               The   PM_SPACE_XXX,   PM_TIME_XXX  and  PM_COUNT_XXX  macros  are  defined  in  <pcp/pmapi.h>  or
               pmLookupDesc(3).

               When the scale is changed (but the dimension is unaltered) the optional keyword  RESCALE  may  be
               used to choose value rescaling as per the -s command line option, but applied to just this metric.
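
               The arithmetic behind value rescaling can be sketched in Python as below.  This is a
               simplified model and not the PCP implementation: the function name rescale is hypothetical,
               only the space and time scales are handled, and the count scale is assumed to be unchanged.

```python
from fractions import Fraction

# Size of each space scale in bytes and each time scale in nanoseconds.
SPACE = {"BYTE": 1, "KBYTE": 1024, "MBYTE": 1024**2,
         "GBYTE": 1024**3, "TBYTE": 1024**4}
TIME = {"NSEC": 1, "USEC": 10**3, "MSEC": 10**6, "SEC": 10**9,
        "MIN": 60 * 10**9, "HOUR": 3600 * 10**9}

def rescale(value, dims, old_scale, new_scale):
    """Rescale value; dims is (space, time), scales are (space, time) names."""
    dim_space, dim_time = dims
    factor = Fraction(1)
    if dim_space != 0:
        # A larger new unit means a smaller number, raised to the dimension.
        factor *= Fraction(SPACE[old_scale[0]], SPACE[new_scale[0]]) ** dim_space
    if dim_time != 0:
        factor *= Fraction(TIME[old_scale[1]], TIME[new_scale[1]]) ** dim_time
    return float(Fraction(value) * factor)
```

               For example, changing BYTE to KBYTE divides each value by 1024, while changing MSEC^-1 to
               USEC^-1 divides each value by 1000 because the time dimension is -1.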

           When  changing  the domain number for a metric or instance domain, the new domain number will usually
           match an existing PMDA's domain number.  If this is not the case, then the new domain  number  should
           not  be  randomly  chosen;  consult  $PCP_VAR_DIR/pmns/stdpmid  for  domain  numbers that are already
           assigned to PMDAs.

EXAMPLES

       To promote the values of the per-disk IOPS metrics to 64-bit, either to allow aggregation over a long time
       period for capacity planning, or because the PMDA has changed to export 64-bit counters and old archives
       need to be converted so they can be processed alongside new archives, the following rules could be used:
           metric disk.dev.read { type -> U64 }
           metric disk.dev.write { type -> U64 }
           metric disk.dev.total { type -> U64 }

       The  instances associated with the load average metric kernel.all.load could be renamed and renumbered by
       the rules below.
           # for the Linux PMDA, the kernel.all.load metric is defined
           # over instance domain 60.2
           indom 60.2 {
               inst 1 -> 60 iname "1 minute" -> "60 second"
               inst 5 -> 300 iname "5 minute" -> "300 second"
               inst 15 -> 900 iname "15 minute" -> "900 second"
           }

       If we decide to split the ``proc'' metrics out of the Linux PMDA, this will involve changing  the  domain
       number  for the PMID of these metrics and the associated instance domains.  The rules below would rewrite
       an old archive to match the changes after the PMDA split.
           # all Linux proc metrics are in 7 clusters
           metric 60.8.* { pmid -> 123.*.* }
           metric 60.9.* { pmid -> 123.*.* }
           metric 60.13.* { pmid -> 123.*.* }
           metric 60.24.* { pmid -> 123.*.* }
           metric 60.31.* { pmid -> 123.*.* }
           metric 60.32.* { pmid -> 123.*.* }
           metric 60.51.* { pmid -> 123.*.* }
           # only one instance domain for Linux proc metrics
           indom 60.9 { indom -> 123.0 }

       If the metric foo.count_em was exported as a native ``long'' then it could be a 32-bit  integer  on  some
       platforms  and  a 64-bit integer on other platforms.  Subsequent investigations show the value is in fact
       unsigned, so the following rules could be used.
           metric foo.count_em {
                type if 32 -> U32
                type if 64 -> U64
           }

FILES

       For each of the inlog and outlog archive logs, several physical files are used.
       archive.meta
                 metadata (metric descriptions, instance domains, etc.) for the archive log
       archive.0 initial volume of metrics values (subsequent volumes have suffixes 1, 2, ...).
       archive.index
                 temporal index to support rapid random access to the other files in the archive log.
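
       As a small illustration of the file naming above, the Python sketch below collects the physical files
       that belong to one archive base name.  The function name archive_files is hypothetical, and the sketch
       only recognises the three kinds of file listed here.

```python
import os
import re

def archive_files(base):
    """List the .meta, .index and numbered volume files for an archive base name."""
    directory, name = os.path.split(base)
    directory = directory or "."
    volume = re.compile(re.escape(name) + r"\.\d+")
    found = []
    for entry in sorted(os.listdir(directory)):
        if entry in (name + ".meta", name + ".index"):
            found.append(entry)
        elif volume.fullmatch(entry):
            # Data volumes carry a numeric suffix: 0, 1, 2, ...
            found.append(entry)
    return found
```

       This is also why copying an archive is more involved than copying a single file: every file returned
       for the base name has to travel together.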

PCP ENVIRONMENT

       Environment variables with the prefix PCP_ are used to parameterize the file and directory names used  by
       PCP.   On  each  installation, the file /etc/pcp.conf contains the local values for these variables.  The
       $PCP_CONF variable may be used to specify an alternative configuration file, as described in pcp.conf(5).

SEE ALSO

       PCPIntro(1), pmdaInstance(3), pmdumplog(1), pmlogger(1), pmlogextract(1), pmloglabel(1),  pmlogreduce(1),
       pmLookupDesc(3), pmns(5), pcp.conf(5) and pcp.env(5).

DIAGNOSTICS

       All  error  conditions  detected by pmlogrewrite are reported on stderr with a textual (if sometimes terse)
       explanation.

       Should the input archive log be corrupted (this can happen if  the  pmlogger  instance  writing  the  log
       suddenly  dies), then pmlogrewrite will detect and report the position of the corruption in the file, and
       any subsequent information from that archive log will not be processed.

       If any error is detected, pmlogrewrite will exit with a non-zero status.

Performance Co-Pilot                                                                             PMLOGREWRITE(1)