Provided by: pcp_6.3.3-1_amd64 

NAME
pmlogger_daily - administration of Performance Co-Pilot archive files
SYNOPSIS
$PCP_BINADM_DIR/pmlogger_daily [-DEfKMNoprRVzZ?] [-c control] [-k time] [-l file] [-m addresses] [-s
size] [-t period] [-x time] [-X program] [-Y regex]
DESCRIPTION
pmlogger_daily and the related pmlogger_check(1) tools along with associated control files (see
pmlogger.control(5)) may be used to create a customized regime of administration and management for
historical archives of performance data within the Performance Co-Pilot (see PCPIntro(1)) infrastructure.
pmlogger_daily is intended to be run once per day, preferably in the early morning, as soon after
midnight as practicable. Its task is to aggregate, rotate and perform general housekeeping on one or
more sets of PCP archives.
To accommodate the evolution of PMDAs and changes in production logging environments, pmlogger_daily is
integrated with pmlogrewrite(1) to allow optional and automatic rewriting of archives before merging. If
there are global rewriting rules to be applied across all archives mentioned in the control file(s), then
create the directory $PCP_SYSCONF_DIR/pmlogrewrite and place any pmlogrewrite(1) rewriting rules in this
directory. For rewriting rules that are specific to only one family of archives, use the directory name
from the control file(s) - i.e. the fourth field - and create a file, or a directory, or a symbolic link
named pmlogrewrite within this directory and place the required rewriting rule(s) in the pmlogrewrite
file or in files within the pmlogrewrite subdirectory. pmlogger_daily will choose rewriting rules from
the archive directory if they exist, else rewriting rules from $PCP_SYSCONF_DIR/pmlogrewrite if that
directory exists, else no rewriting is attempted.
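The order of precedence described above can be sketched in shell. This is only an illustration, not the code used by pmlogger_daily; the function name choose_rewrite_rules and its two arguments are hypothetical.

```shell
#!/bin/sh
# Sketch only: choose_rewrite_rules is a hypothetical helper, not part of PCP.
#   $1  archive directory (the fourth field of a control file entry)
#   $2  system configuration directory (the value of $PCP_SYSCONF_DIR)
# Prints the rules location to use, or nothing if no rewriting should be done.
choose_rewrite_rules()
{
    archive_dir=$1
    sysconf_dir=$2
    if [ -e "$archive_dir/pmlogrewrite" ]
    then
        # a file, directory or symbolic link in the archive directory wins
        echo "$archive_dir/pmlogrewrite"
    elif [ -d "$sysconf_dir/pmlogrewrite" ]
    then
        # otherwise fall back to the global rules directory
        echo "$sysconf_dir/pmlogrewrite"
    fi
    # no output means no rewriting is attempted
    return 0
}
```

An empty result corresponds to the final ``else no rewriting is attempted'' case.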
As an alternate mechanism, if the file $PCP_LOG_DIR/pmlogger/.NeedRewrite exists when pmlogger_daily
starts then this is treated the same as specifying -R on the command line and
$PCP_LOG_DIR/pmlogger/.NeedRewrite will be removed once all the rewriting has been done.
OPTIONS
-c control, --control=control
Both pmlogger_daily and pmlogger_check(1) are controlled by PCP logger control file(s) that
specify the pmlogger instances to be managed. The default control file is
$PCP_PMLOGGERCONTROL_PATH, but an alternate may be specified using the -c option. If the directory
$PCP_PMLOGGERCONTROL_PATH.d (or control.d from the -c option) exists, then the contents of any
additional control files therein will be appended to the main control file (which must exist).
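As a rough illustration of how the main control file and the files in the matching .d directory combine, consider the sketch below. The helper name assemble_control is hypothetical, and the order in which the additional files are appended (shell glob order) is an assumption, since the text above does not specify it.

```shell
#!/bin/sh
# Sketch only: assemble_control is a hypothetical helper, not part of PCP.
#   $1  path of the main control file
# Prints the main control file followed by every regular file in "$1.d".
assemble_control()
{
    control=$1
    if [ ! -f "$control" ]
    then
        # the main control file must exist
        echo "assemble_control: $control: main control file not found" >&2
        return 1
    fi
    cat "$control"
    if [ -d "$control.d" ]
    then
        # assumption: additional files are appended in glob (sorted) order
        for f in "$control.d"/*
        do
            [ -f "$f" ] && cat "$f"
        done
    fi
    return 0
}
```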
-D, --noreport
Do not perform the conditional pmlogger_daily_report(1) processing as described below.
-E, --expunge
This option causes pmlogger_daily to pass the -E flag to pmlogger_merge(1) in order to expunge
metrics with metadata inconsistencies and continue rather than fail. This is intended for automated
daily archive rotation where it is highly desirable for unattended daily archive merging, rewriting
and compression to succeed. For further details, see pmlogger_merge(1) and the description of the -x
flag in pmlogextract(1).
-f, --force
This option forces pmlogger_daily to attempt compression actions. Using this option in production
is not recommended.
-k time, --discard=time
After some period, old PCP archives are discarded. time is a time specification in the syntax of
find-filter(1), so DD[:HH[:MM]]. The optional HH (hours) and MM (minutes) parts are 0 if not
specified. By default the time is 14:0:0 or 14 days, but may be changed using this option.
Some special values are recognized for the time, namely 0 to keep no archives beyond the ones
being currently written by pmlogger(1), and forever or never to prevent any archives being
discarded.
The time can also be set using the $PCP_CULLAFTER variable, set in either the environment or in a
control file. If both $PCP_CULLAFTER and -k specify different values for time then the environment
variable value is used and a warning is issued, i.e. if $PCP_CULLAFTER is set in the control file,
it overrides -k given on the command line.
Note that the semantics of time are that it is measured from the time of last modification of each
archive, and not from the original archive creation date. This has subtle implications for
compression (see below) - the compression process results in the creation of new archive files which
have new modification times. In this case, the time period (re)starts from the time of compression.
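To make the DD[:HH[:MM]] syntax concrete, the following sketch converts a time specification into minutes. The real parsing is done by find-filter(1); the helper name spec_to_minutes is hypothetical and exists only to show the arithmetic, including the rule that omitted HH and MM parts count as 0.

```shell
#!/bin/sh
# Sketch only: spec_to_minutes is a hypothetical helper, not part of PCP.
#   $1  a time specification of the form DD[:HH[:MM]], or forever/never
# Prints the period in minutes, or "never" for the special values.
spec_to_minutes()
{
    spec=$1
    case "$spec" in
        forever|never)
            echo "never"
            return 0
            ;;
        ''|*[!0-9:]*)
            echo "spec_to_minutes: bad time specification: $spec" >&2
            return 1
            ;;
    esac
    # missing HH and MM fields are empty, which awk treats as 0
    echo "$spec" | awk -F: '{ m = ($1 * 24 + $2) * 60 + $3; print m }'
}
```

With this sketch, the default of 14:0:0 and the shorter form 14 both yield 20160 minutes, i.e. 14 days.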
-K When this option is specified for pmlogger_daily then only the compression tasks are attempted, so
no pmlogger rotation, no culling, no rewriting, etc. Using -K with a period of 0 in effect
(from -x on the command line or $PCP_COMPRESSAFTER in the environment or via the control file)
is intended for environments where compression of archives is desired before the scheduled daily
processing happens. To achieve this, once pmlogger_check(1) has completed regular processing, it
calls pmlogger_daily with just the -K option. Provided $PCP_COMPRESSAFTER is set to 0 along with
any other required compression options to match the scheduled invocation of pmlogger_daily, then
this will compress all volumes except the ones being currently written by pmlogger(1). If
$PCP_COMPRESSAFTER is set to a value greater than zero, then manually running pmlogger_daily with
the -x option may be used to compress volumes that are younger than the $PCP_COMPRESSAFTER time.
This may be used to reclaim filesystem space by compressing volumes earlier than they would have
otherwise been compressed. Note that since the default value of $PCP_COMPRESSAFTER is 0 days, the
-x option has no effect unless the control file has been edited and $PCP_COMPRESSAFTER has been set
to a value greater than 0.
-l file, --logfile=file
In order to ensure that mail is not unintentionally sent when these scripts are run from cron(8) or
systemd(1), diagnostics are always sent to log files. By default, this file is
$PCP_LOG_DIR/pmlogger/pmlogger_daily.log but this can be changed using the -l option. If this log
file already exists when the script starts, it will be renamed with a .prev suffix (overwriting any
log file saved earlier) before diagnostics are generated to the log file. The -l and -t options
cannot be used together.
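The handling of an existing log file can be pictured with the sketch below. It is an illustration only; the helper name save_previous_log is hypothetical.

```shell
#!/bin/sh
# Sketch only: save_previous_log is a hypothetical helper, not part of PCP.
#   $1  path of the log file about to be written
# Renames an existing log file with a .prev suffix, overwriting any log
# file that was saved earlier.
save_previous_log()
{
    logfile=$1
    if [ -f "$logfile" ]
    then
        mv -f "$logfile" "$logfile.prev"
    fi
    return 0
}
```

Only one previous log is retained this way, which is why the SaveLogs directory described in the FILES section is useful when older logs need to be kept.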
-m addresses, --mail=addresses
Use of this option causes pmlogger_daily to construct a summary of the ``notices'' file entries
which were generated in the last 24 hours, and e-mail that summary to the set of space-separated
addresses. This daily summary is stored in the file $PCP_LOG_DIR/NOTICES.daily, which will be empty
when no new ``notices'' entries were made in the previous 24 hour period.
-M This option may be used to disable archive merging (or renaming) and rewriting (-M implies -r).
This is most useful in cases where the archives are being incrementally copied to a remote
repository, e.g. using rsync(1). Merging, renaming and rewriting all risk an increase in the
synchronization load, especially immediately after pmlogger_daily has run, so -M may be useful in
these cases.
-N, --showme
This option enables a ``show me'' mode, where the program's actions are echoed, but not executed, in
the style of ``make -n''. Using -N in conjunction with -V maximizes the diagnostic capabilities for
debugging.
-o By default all possible archives will be merged. This option reinstates the old behaviour in which
only yesterday's archives will be considered as merge candidates. In the special case where only a
single input archive needs to be merged, pmlogmv(1) is used to rename the archive, otherwise
pmlogger_merge(1) is used to merge all of the archives for a single host and a single day into a new
PCP archive and the individual archives are removed.
-p If this option is specified for pmlogger_daily then the status of the daily processing is polled and
if the daily pmlogger(1) rotation, culling, rewriting, compressing, etc. has not been done in the
last 24 hours then it is done now. The intent is to have pmlogger_daily called regularly with the
-p option (at 30 mins past the hour, every hour in the default cron(8) set up) to ensure daily
processing happens as soon as possible if it was missed at the regularly scheduled time (which is
00:10 by default), e.g. if the system was down or suspended at that time. With this option
pmlogger_daily simply exits if the previous day's processing has already been done. Note that this
option is not used on platforms supporting systemd(1) because the pmlogger_daily.timer service unit
specifies a timer setting with Persistent=true. The -K and -p options to pmlogger_daily are
mutually exclusive.
-r, --norewrite
This command line option acts as an override and prevents all archive rewriting with pmlogrewrite(1)
independent of the presence of any rewriting rule files or directories.
-R, --rewriteall
Sometimes PMDA changes require all archives to be rewritten, not just the ones involved in any
current merging. This is required for example after a PCP upgrade where a new version of an
existing PMDA has revised metadata. The -R command line forces this universal-style of rewriting.
The -R option to pmlogger_daily is mutually exclusive with both the -r and -M options.
-s size, --rotate=size
If the PCP ``notices'' file ($PCP_LOG_DIR/NOTICES) is larger than 20480 bytes, pmlogger_daily will
rename the file with a ``.old'' suffix, and start a new ``notices'' file. The rotate threshold may
be changed from 20480 to size bytes using the -s option.
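The size test behind this rotation might look like the following sketch. The helper name rotate_notices is hypothetical, and starting the new ``notices'' file as an empty file is an assumption made for the example.

```shell
#!/bin/sh
# Sketch only: rotate_notices is a hypothetical helper, not part of PCP.
#   $1  path of the notices file
#   $2  rotate threshold in bytes (defaults to 20480)
rotate_notices()
{
    notices=$1
    size=${2:-20480}
    [ -f "$notices" ] || return 0
    # wc -c may pad its output with blanks on some platforms
    bytes=$(wc -c <"$notices" | tr -d ' ')
    if [ "$bytes" -gt "$size" ]
    then
        mv -f "$notices" "$notices.old"
        # assumption: the new notices file starts out empty
        : >"$notices"
    fi
    return 0
}
```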
-t period
To assist with debugging or diagnosing intermittent failures the -t option may be used. This will
turn on very verbose tracing (-VV) and capture the trace output in a file named
$PCP_LOG_DIR/pmlogger/daily.datestamp.trace, where datestamp is the time pmlogger_daily was run in
the format YYYYMMDD.HH.MM. In addition, the period argument will ensure that trace files created
with -t will be kept for period days and then discarded.
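As an illustration of the trace file naming, the sketch below builds a name with a datestamp in the YYYYMMDD.HH.MM format. The helper name trace_file_name is hypothetical, and its argument stands in for the value of $PCP_LOG_DIR.

```shell
#!/bin/sh
# Sketch only: trace_file_name is a hypothetical helper, not part of PCP.
#   $1  log directory (the value of $PCP_LOG_DIR)
# Prints the trace file name for a run starting now.
trace_file_name()
{
    log_dir=$1
    # datestamp in the format YYYYMMDD.HH.MM
    datestamp=$(date +%Y%m%d.%H.%M)
    echo "$log_dir/pmlogger/daily.$datestamp.trace"
}
```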
-V, --verbose
The output from the cron execution of the scripts may be extended using the -V option to the scripts
which will enable verbose tracing of their activity. By default the scripts generate no output
unless some error or warning condition is encountered. A second -V increases the verbosity. Using
-N in conjunction with -V maximizes the diagnostic capabilities for debugging.
-x time, --compress-after=time
Archive data files can optionally be compressed after some period to conserve disk space. This is
particularly useful for large numbers of pmlogger processes under the control of pmlogger_daily.
time is a time specification in the syntax of find-filter(1), so DD[:HH[:MM]]. The optional HH
(hours) and MM (minutes) parts are 0 if not specified.
Some special values are recognized for the time, namely 0 to apply compression as soon as possible,
and forever or never to prevent any compression being done.
If transparent_decompress was enabled when libpcp was built (can be checked with the pmconfig(1) -L
option), then the default behaviour is compression ``as soon as possible''. Otherwise the default
behaviour is to not compress files (which matches the historical default behaviour in earlier PCP
releases).
The time can also be set using the $PCP_COMPRESSAFTER variable, set in either the environment or in
a control file. If both $PCP_COMPRESSAFTER and -x specify different values for time then the
environment variable value is used and a warning is issued. For other important detailed notes
concerning volume compression, see the -K and -k options (above).
-X program, --compressor=program
This option specifies the program to use for compression - by default this is xz(1). The
environment variable $PCP_COMPRESS may be used as an alternative mechanism to define program. If
both $PCP_COMPRESS and -X specify different compression programs then the environment variable value
is used and a warning is issued.
-Y regex, --regex=regex
This option allows a regular expression to be specified causing files in the set of files matched
for compression to be omitted - this allows only the data file to be compressed, and also prevents
the program from attempting to compress it more than once. The default regex is
"\.(index|Z|gz|bz2|zip|xz|lzma|lzo|lz4|zst)$"
- such files are filtered using the -v option to egrep(1). The environment variable
$PCP_COMPRESSREGEX may be used as an alternative mechanism to define regex. If both
$PCP_COMPRESSREGEX and -Y specify different values for regex then the environment variable value is
used and a warning is issued.
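The effect of the default regex can be tried out on a list of file names. In the sketch below, the helper name compress_candidates is hypothetical, and grep -E is used as the equivalent of egrep(1).

```shell
#!/bin/sh
# Sketch only: compress_candidates is a hypothetical helper, not part of PCP.
#   $1  optional regex; defaults to the default regex shown above
# Reads file names on standard input and prints those that do NOT match
# the regex, i.e. the files that remain candidates for compression.
compress_candidates()
{
    default_regex='\.(index|Z|gz|bz2|zip|xz|lzma|lzo|lz4|zst)$'
    regex=${1:-$default_regex}
    # -v inverts the match, so matching files are omitted
    grep -E -v "$regex"
}
```

For example, given the names 20240101.0, 20240101.index, 20240101.meta and 20240102.0.xz on standard input, the sketch prints only 20240101.0 and 20240101.meta, since the index file and the already compressed volume match the default regex.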
-z This option causes pmlogger_daily to not ``re-exec'', see pmlogger(1), when it would otherwise
choose to do so and is intended only for QA testing.
-Z This option causes pmlogger_daily to ``re-exec'', see pmlogger(1), whenever that is possible and is
intended only for QA testing.
-?, --help
Display usage message and exit.
CALLBACKS
Additionally pmlogger_daily supports the following ``hooks'' to allow auxiliary operations to be
performed at key points in the daily processing of the archives. These callbacks are controlled via
variables that may be set in the environment or via the control file.
Note that merge callbacks and autosaving described below are not enabled when only compression tasks are
being attempted, i.e. when -K command line option is used.
All of the callback script execution and the autosave file moving will be executed as the non-privileged
user ``pcp'' and group ``pcp'', so appropriate permissions may need to have been set up in advance.
$PCP_MERGE_CALLBACK
As each day's archive is created by merging and before any compression takes place, if
$PCP_MERGE_CALLBACK is defined, then it is assumed to be a script that will be called with one
argument being the name of the archive (stripped of any suffixes), so something of the form
/some/directory/path/YYYYMMDD. The script needs to be either a full pathname, or something that
will be found on the shell's $PATH. The callback script will be run in the foreground, so
pmlogger_daily will wait for it to complete.
If the control file contains more than one $PCP_MERGE_CALLBACK specification then these will be run
serially in the order they appear in the control file. If $PCP_MERGE_CALLBACK is defined in the
environment when pmlogger_daily is run, this is treated as though this option was the first in the
control file, i.e. it will be run before any merge callbacks mentioned in the control file.
If the pcp-zeroconf package is installed, then a special merge callback is added to call
pmlogger_daily_report(1) first, before any other merge callback options. Refer to
pmlogger_daily_report(1) for an explanation of the pcp-zeroconf requirements.
If pmlogger_daily is in ``catch up'' mode (more than one day's worth of archives need to be
combined) then each callback is executed once for each day's archive that is generated.
A typical use might be to produce daily reports from the PCP archive which needs to wait until the
archive has been created, but is more efficient if it is done before any potential compression of
the archive.
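A merge callback could be structured along the lines of the sketch below, which simply reports the physical files that make up the newly merged archive. The function name merge_callback and the report format are hypothetical; a real callback script would end by calling the function with "$@" and would do something more useful, such as producing a report.

```shell
#!/bin/sh
# Sketch only: merge_callback is a hypothetical example, not part of PCP.
#   $1  archive name stripped of any suffixes, e.g. /some/directory/path/YYYYMMDD
merge_callback()
{
    if [ $# -ne 1 ]
    then
        echo "Usage: merge_callback archive" >&2
        return 1
    fi
    archive=$1
    day=$(basename "$archive")
    echo "merged archive for $day:"
    # list the physical files (data volumes, metadata, index) of the archive
    for f in "$archive".*
    do
        [ -f "$f" ] && echo "  $f"
    done
    return 0
}
```

Because the callback runs in the foreground as the user ``pcp'', it should be quick and should only touch files that this user can read and write.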
$PCP_COMPRESS_CALLBACK
If pmlogger_daily is run with -x 0 or $PCP_COMPRESSAFTER=0, then compression is done immediately
after merging. As each day's archive is compressed, if $PCP_COMPRESS_CALLBACK is defined, then it
is assumed to be a script that will be called with one argument being the name of the archive
(stripped of any suffixes), so something of the form /some/directory/path/YYYYMMDD. The script
needs to be either a full pathname, or something that will be found on the shell's $PATH. The
callback script will be run in the foreground, so pmlogger_daily will wait for it to complete.
If the control file contains more than one $PCP_COMPRESS_CALLBACK specification then these will be
run serially in the order they appear in the control file. If $PCP_COMPRESS_CALLBACK is defined in
the environment when pmlogger_daily is run, this is treated as though this option was the first in
the control file, i.e. it will be run first.
If pmlogger_daily is in ``catch up'' mode (more than one day's worth of archives need to be
compressed) then each callback is executed once for each day's archive that is compressed.
A typical use might be to keep recent archives in uncompressed form for efficient querying, but move
the older archives to some other storage location once the compression has been done.
$PCP_AUTOSAVE_DIR
Once the merging and possible compression has been done by pmlogger_daily, if $PCP_AUTOSAVE_DIR is
defined then all of the physical files that make up one day's archive will be moved (autosaved) to
the directory specified by $PCP_AUTOSAVE_DIR.
The basename of the archive is used to set the reserved words DATEYYYY (year), DATEMM (month) and
DATEDD (day) and these (along with LOCALHOSTNAME) may appear literally in $PCP_AUTOSAVE_DIR, and
will be substituted at execution time to generate the destination directory name. For example:
$PCP_AUTOSAVE_DIR=/gpfs/LOCALHOSTNAME/DATEYYYY/DATEMM-DATEDD
Note that these ``date'' reserved words correspond to the date on which the archive data was
collected, not the date that pmlogger_daily was run.
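The substitution of the reserved words can be sketched as follows. The helper name expand_autosave_dir is hypothetical, and the sketch assumes the archive basename begins with YYYYMMDD, as in the callback descriptions above.

```shell
#!/bin/sh
# Sketch only: expand_autosave_dir is a hypothetical helper, not part of PCP.
#   $1  the $PCP_AUTOSAVE_DIR template containing reserved words
#   $2  archive name, assumed to have a basename starting with YYYYMMDD
#   $3  the local hostname to substitute for LOCALHOSTNAME
expand_autosave_dir()
{
    template=$1
    archive=$2
    host=$3
    base=$(basename "$archive")
    # the date fields come from the archive name, not from the current date
    yyyy=$(echo "$base" | cut -c1-4)
    mm=$(echo "$base" | cut -c5-6)
    dd=$(echo "$base" | cut -c7-8)
    echo "$template" | sed \
        -e "s/LOCALHOSTNAME/$host/g" \
        -e "s/DATEYYYY/$yyyy/g" \
        -e "s/DATEMM/$mm/g" \
        -e "s/DATEDD/$dd/g"
}
```

With the template from the example above, an archive named 20240131 on a host called myhost would be autosaved to /gpfs/myhost/2024/01-31.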
If $PCP_AUTOSAVE_DIR (after LOCALHOSTNAME and ``date'' substitution) does not exist then
pmlogger_daily will attempt to create it (along with any parent directories that do not exist).
Just be aware that this directory creation runs under the uid of the user ``pcp'', so directories
along the path to $PCP_AUTOSAVE_DIR may need to be writeable by this non-root user.
By ``move'' the archives we mean a paranoid checksum-copy-checksum-remove (using the -c option for
pmlogmv(1)) that will bail if the copy fails or the checksums do not match (the archives are
important so we cannot risk something like a full filesystem or a permissions issue messing with the
copy process).
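The checksum-copy-checksum-remove sequence can be illustrated for a single file with the sketch below. The real work is done by pmlogmv(1); the helper name paranoid_move and the use of cksum(1) and cp(1) are assumptions made for the example.

```shell
#!/bin/sh
# Sketch only: paranoid_move is a hypothetical helper, not part of PCP.
#   $1  source file
#   $2  destination directory (must already exist)
# The source is removed only after the copy has been verified.
paranoid_move()
{
    src=$1
    dst_dir=$2
    dst="$dst_dir/$(basename "$src")"
    # checksum the source before copying
    before=$(cksum <"$src") || return 1
    # bail if the copy fails, e.g. full filesystem or permissions issue
    cp "$src" "$dst" || return 1
    # checksum the copy and compare
    after=$(cksum <"$dst") || return 1
    if [ "$before" != "$after" ]
    then
        echo "paranoid_move: checksum mismatch for $src" >&2
        rm -f "$dst"
        return 1
    fi
    # only now is it safe to remove the original
    rm -f "$src"
}
```

On any failure the original file is left in place, so a later run can retry the move.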
If pmlogger_daily is in ``catch up'' mode (more than one day's worth of archives need to be
combined) then the archives for more than one day could be copied in this step.
A typical use might be to create PCP archives on a local filesystem initially, then once all the
data for a single day has been collected and merged, migrate that day's archive to a shared
filesystem or a remote filesystem. This may allow automatic backup to off-site storage and/or
reduce the number of I/O operations and filesystem metadata operations on the (potentially slower)
non-local filesystem.
CONFIGURATION
Refer to pmlogger.control(5) for a description of the control file(s) that are used to control which
pmlogger instances and which archives are managed by pmlogger_check(1) and pmlogger_daily.
FILES
$PCP_VAR_DIR/config/pmlogger/config.default
default pmlogger configuration file location for the local primary logger, typically generated
automatically by pmlogconf(1).
$PCP_ARCHIVE_DIR/<hostname>
default location for archives of performance information collected from the host hostname
$PCP_ARCHIVE_DIR/<hostname>/lock
transient lock file to guarantee mutual exclusion during pmlogger administration for the host
hostname - if present, can be safely removed if neither pmlogger_daily nor pmlogger_check(1) is
running
$PCP_ARCHIVE_DIR/<hostname>/Latest
PCP archive folio created by mkaf(1) for the most recently launched archive containing performance
metrics from the host hostname
$PCP_LOG_DIR/NOTICES
PCP ``notices'' file used by pmie(1) and friends
$PCP_LOG_DIR/pmlogger/pmlogger_daily.log
if the previous execution of pmlogger_daily produced any output it is saved here. The normal case
is no output in which case the file does not exist.
$PCP_ARCHIVE_DIR/SaveLogs
if this directory exists, then the log file from the -l argument for pmlogger_daily will be saved
in this directory with the name of the format <date>-pmlogger_daily.log.<pid> or
<date>-pmlogger_daily-K.log.<pid>. This allows the log file to be inspected at a later time, even if
several pmlogger_daily executions have been launched in the interim. Because the PCP archive
management tools run under the $PCP_USER account ``pcp'', $PCP_ARCHIVE_DIR/SaveLogs typically needs
to be owned by the user ``pcp''.
$PCP_ARCHIVE_DIR/<hostname>/SaveLogs
if this directory exists, then the log file from the -l argument of a newly launched pmlogger(1)
for hostname will be saved in this directory with the name archive.log where archive is the
basename of the associated pmlogger(1) PCP archive files. This allows the log file to be
inspected at a later time, even if several pmlogger(1) instances for hostname have been launched
in the interim. Because the PCP archive management tools run under the uid of the user ``pcp'',
$PCP_ARCHIVE_DIR/<hostname>/SaveLogs typically needs to be owned by the user ``pcp''.
$PCP_LOG_DIR/pmlogger/.NeedRewrite
if this file exists, then this is treated as equivalent to using -R on the command line and the
file will be removed once all rewriting has been done.
PCP ENVIRONMENT
Environment variables with the prefix PCP_ are used to parameterize the file and directory names used by
PCP. On each installation, the file /etc/pcp.conf contains the local values for these variables. The
$PCP_CONF variable may be used to specify an alternative configuration file, as described in pcp.conf(5).
COMPATIBILITY ISSUES
Earlier versions of pmlogger_daily used find(1) to locate files for compressing or culling and the -k and
-x options took only integer values to mean ``days''. The semantics of this was quite loose given that
find(1) offers different precision and semantics across platforms.
The current implementation of pmlogger_daily uses find-filter(1) which provides high precision intervals
and semantics that are relative to the time of execution and are consistent across platforms.
SEE ALSO
egrep(1), find-filter(1), PCPIntro(1), pmconfig(1), pmlc(1), pmlogconf(1), pmlogctl(1), pmlogextract(1),
pmlogger(1), pmlogger_check(1), pmlogger_daily_report(1), pmlogger_merge(1), pmlogmv(1), pmlogrewrite(1),
systemd(1), xz(1) and cron(8).
Performance Co-Pilot PCP PMLOGGER_DAILY(1)