Provided by: pcp_6.3.0-1_amd64 bug

NAME

       pmlogger_daily - administration of Performance Co-Pilot archive files

SYNOPSIS

       $PCP_BINADM_DIR/pmlogger_daily  [-DEfKMNoprRVzZ?]  [-c control] [-k time] [-l logfile] [-m
       addresses] [-s size] [-t want] [-x time] [-X program] [-Y regex]

DESCRIPTION

       pmlogger_daily and the related pmlogger_check(1) tools along with associated control files
       (see  pmlogger.control(5)) may be used to create a customized regime of administration and
       management for historical archives of performance data  within  the  Performance  Co-Pilot
       (see PCPIntro(1)) infrastructure.

       pmlogger_daily  is  intended  to  be run once per day, preferably in the early morning, as
       soon after midnight as practicable.  Its task is to aggregate, rotate and perform  general
       housekeeping one or more sets of PCP archives.

       To  accommodate  the  evolution  of  PMDAs and changes in production logging environments,
       pmlogger_daily  is  integrated  with  pmlogrewrite(1)  to  allow  optional  and  automatic
       rewriting  of  archives before merging.  If there are global rewriting rules to be applied
       across  all  archives  mentioned  in  the  control  file(s),  then  create  the  directory
       $PCP_SYSCONF_DIR/pmlogrewrite  and  place  any  pmlogrewrite(1)  rewriting  rules  in this
       directory.  For rewriting rules that are specific to only one family of archives, use  the
       directory  name from the control file(s) - i.e. the fourth field - and create a file, or a
       directory, or a symbolic link named pmlogrewrite  within  this  directory  and  place  the
       required  rewriting  rule(s)  in the pmlogrewrite file or in files within the pmlogrewrite
       subdirectory.  pmlogger_daily will choose rewriting rules from the  archive  directory  if
       they  exist,  else  rewriting  rules  from $PCP_SYSCONF_DIR/pmlogrewrite if that directory
       exists, else no rewriting is attempted.

       As an alternate mechanism, if  the  file  $PCP_LOG_DIR/pmlogger/.NeedRewrite  exists  when
       pmlogger_daily  starts  then this is treated the same as specifying -R on the command line
       and $PCP_LOG_DIR/pmlogger/.NeedRewrite will be removed once all  the  rewriting  has  been
       done.

OPTIONS

       -c control, --control=control
            Both  pmlogger_daily  and  pmlogger_check(1)  are  controlled  by  PCP logger control
            file(s) that specifies the pmlogger instances to be  managed.   The  default  control
            file  is  $PCP_PMLOGGERCONTROL_PATH,  but  an alternate may be specified using the -c
            option.  If the directory  $PCP_PMLOGGERCONTROL_PATH.d  (or  control.d  from  the  -c
            option)  exists,  then  the  contents of any additional control files therein will be
            appended to the main control file (which must exist).

       -D, --noreport
            Do not perform  the  conditional  pmlogger_daily_report(1)  processing  as  described
            below.

       -E, --expunge
            This  option  causes pmlogger_daily to pass the -E flag to pmlogger_merge(1) in order
            to expunge metrics with metadata inconsistencies and continue rather than fail.  This
            is  intended  for  automated  daily archive rotation where it is highly desirable for
            unattended daily archive merging, rewriting and compression to succeed.  For  further
            details, see pmlogger_merge(1) and description for the -x flag in pmlogextract(1).

       -f, --force
            This  option forces pmlogger_daily to attempt compression actions.  Using this option
            in production is not recommended.

       -k time, --discard=time
            After some period, old PCP archives are discarded.  time is a time  specification  in
            the  syntax  of  find-filter(1),  so  DD[:HH[:MM]].   The  optional HH (hours) and MM
            (minutes) parts are 0 if not specified.  By default the time is 14:0:0  or  14  days,
            but may be changed using this option.

            Some  special values are recognized for the time, namely 0 to keep no archives beyond
            the the ones being currently written by pmlogger(1), and forever or never to  prevent
            any archives being discarded.

            The  time  can  also  be  set  using  the  $PCP_CULLAFTER variable, set in either the
            environment or in a control file.  If both $PCP_CULLAFTER and  -k  specify  different
            values  for time then the environment variable value is used and a warning is issued,
            i.e. if $PCP_CULLAFTER is set in the control file,  it  overrides  -k  given  on  the
            command line.

            Note  that  the  semantics  of  time  are  that  it is measured from the time of last
            modification of each archive, and not from the original archive creation date.   This
            has subtle implications for compression (see below) - the compression process results
            in the creation of new archive files which have  new  modification  times.   In  this
            case, the time period (re)starts from the time of compression.

       -K   When  this option is specified for pmlogger_daily then only the compression tasks are
            attempted, so no pmlogger rotation, no culling, no rewriting, etc.  When -K  is  used
            and  a period of 0 is in effect (from -x on the command line or $PCP_COMPRESSAFTER in
            the environment or via the control file) this  is  intended  for  environments  where
            compression of archives is desired before the scheduled daily processing happens.  To
            achieve this, once pmlogger_check(1)  has  completed  regular  processing,  it  calls
            pmlogger_daily  with  just  the  -K  option.  Provided $PCP_COMPRESSAFTER is set to 0
            along with any other required compression options to match the  scheduled  invocation
            of  pmlogger_daily,  then  this  will  compress  all  volumes  except  the ones being
            currently written by pmlogger(1).  If $PCP_COMPRESSAFTER is set to  a  value  greater
            than  zero,  then  manually  running pmlogger_daily with the -x option may be used to
            compress volumes that are younger than the $PCP_COMPRESSAFTER time.  This may be used
            to  reclaim  filesystem  space  by  compressing  volumes earlier than they would have
            otherwise been compressed.  Note that since the default value  of  $PCP_COMPRESSAFTER
            is  0  days,  the -x option has no effect unless the control file has been edited and
            $PCP_COMPRESSAFTER has been set to a value greater than 0.

       -l file, --logfile=file
            In order to ensure that mail is not unintentionally sent when these scripts  are  run
            from  cron(8)  or  systemd(1)  diagnostics are always sent to log files.  By default,
            this file is $PCP_LOG_DIR/pmlogger/pmlogger_daily.log but this can be  changed  using
            the  -l  option.   If this log file already exists when the script starts, it will be
            renamed with  a  .prev  suffix  (overwriting  any  log  file  saved  earlier)  before
            diagnostics  are  generated  to  the  log file.  The -l and -t options cannot be used
            together.

       -m addresses, --mail=addresses
            Use of this option causes pmlogger_daily to construct a summary  of  the  ``notices''
            file  entries  which  were generated in the last 24 hours, and e-mail that summary to
            the set of space-separated addresses.  This daily  summary  is  stored  in  the  file
            $PCP_LOG_DIR/NOTICES.daily,  which will be empty when no new ``notices'' entries were
            made in the previous 24 hour period.

       -M   This option may be used to disable archive merging (or renaming)  and  rewriting  (-M
            implies -r).  This is most useful in cases where the archives are being incrementally
            copied to a remote repository, e.g. using rsync(1).  Merging, renaming and  rewriting
            all  risk  an  increase  in  the  synchronization  load, especially immediately after
            pmlogger_daily has run, so -M may be useful in these cases.

       -N, --showme
            This option enables a ``show me'' mode, where the programs actions  are  echoed,  but
            not executed, in the style of ``make -n''.  Using -N in conjunction with -V maximizes
            the diagnostic capabilities for debugging.

       -o   By default all possible archives will be merged.   This  option  reinstates  the  old
            behaviour  in which only yesterday's archives will be considered as merge candidates.
            In the special case where only a single input archive needs to be merged,  pmlogmv(1)
            is  used  to  rename the archive, otherwise pmlogger_merge(1) is used to merge all of
            the archives for a single host and a single day  into  a  new  PCP  archive  and  the
            individual archives are removed.

       -p   If  this  option  is  specified  for  pmlogger_daily  then  the  status  of the daily
            processing is polled and if  the  daily  pmlogger(1)  rotation,  culling,  rewriting,
            compressing,  etc.   has not been done in the last 24 hours then it is done now.  The
            intent is to have pmlogger_daily called regularly with the -p option (at 30 mins past
            the  hour,  every  hour  in  the  default  cron(8) set up) to ensure daily processing
            happens as soon as possible if it was missed at the regularly scheduled  time  (which
            is  00:10  by  default), e.g. if the system was down or suspended at that time.  With
            this option pmlogger_daily simply exits if the previous day's processing has  already
            been  done.   Note  that  this  option is not used on platforms supporting systemd(1)
            because  the  pmlogger_daily.timer  service  unit  specifies  a  timer  setting  with
            Persistent=true.  The -K and -p options to pmlogger_daily are mutually exclusive.

       -r, --norewrite
            This  command line option acts as an override and prevents all archive rewriting with
            pmlogrewrite(1)  independent  of  the  presence  of  any  rewriting  rule  files   or
            directories.

       -R, --rewriteall
            Sometimes  PMDA  changes  require  all  archives  to  be rewritten, not just the ones
            involved in any current merging.  This is required for example after  a  PCP  upgrade
            where  a  new  version of an existing PMDA has revised metadata.  The -R command line
            forces this universal-style  of  rewriting.   The  -R  option  to  pmlogger_daily  is
            mutually exclusive with both the -r and -M options.

       -s size, --rotate=size
            If  the  PCP  ``notices''  file  ($PCP_LOG_DIR/NOTICES)  is  larger than 20480 bytes,
            pmlogger_daily will rename  the  file  with  a  ``.old''  suffix,  and  start  a  new
            ``notices'' file.  The rotate threshold may be changed from 20480 to size bytes using
            the -s option.

       -t period
            To assist with debugging or diagnosing intermittent failures the  -t  option  may  be
            used.  This will turn on very verbose tracing (-VV) and capture the trace output in a
            file named $PCP_LOG_DIR/pmlogger/daily.datestamp.trace, where datestamp is  the  time
            pmlogger_daily  was  run  in  the  format  YYYYMMDD.HH.MM.   In  addition, the period
            argument will ensure that trace files created with -t will be kept  for  period  days
            and then discarded.

       -V, --verbose
            The output from the cron execution of the scripts may be extended using the -V option
            to the scripts which will enable verbose tracing of their activity.  By  default  the
            scripts  generate no output unless some error or warning condition is encountered.  A
            second -V increases the verbosity.  Using -N in conjunction  with  -V  maximizes  the
            diagnostic capabilities for debugging.

       -x time, --compress-after=time
            Archive  data  files  can optionally be compressed after some period to conserve disk
            space.  This is particularly useful for large numbers of pmlogger processes under the
            control of pmlogger_daily.

            time  is  a time specification in the syntax of find-filter(1), so DD[:HH[:MM]].  The
            optional HH (hours) and MM (minutes) parts are 0 if not specified.

            Some special values are recognized for the time, namely 0  to  apply  compression  as
            soon as possible, and forever or never to prevent any compression being done.

            If  transparent_decompress  is enabled when libpcp was built (can be checked with the
            pmconfig(1) -L option), then the  default  behaviour  is  compression  ``as  soon  as
            possible''.   Otherwise the default behaviour is to not compress files (which matches
            the historical default behaviour in earlier PCP releases).

            The time can also be set using the $PCP_COMPRESSAFTER variable,  set  in  either  the
            environment  or  in  a  control  file.   If  both  $PCP_COMPRESSAFTER  and -x specify
            different values for time then the environment variable value is used and  a  warning
            is issued.  For important other detailed notes concerning volume compression, see the
            -K and -k options (above).

       -X program, --compressor=program
            This option specifies the program to use for compression - by default this is  xz(1).
            The  environment  variable  $PCP_COMPRESS  may be used as an alternative mechanism to
            define program.  If both $PCP_COMPRESS and -X specify different compression  programs
            then the environment variable value is used and a warning is issued.

       -Y regex, --regex=regex
            This  option  allows a regular expression to be specified causing files in the set of
            files matched for compression to be omitted - this allows only the data  file  to  be
            compressed,  and  also  prevents the program from attempting to compress it more than
            once.  The default regex is ".(index|Z|gz|bz2|zip|xz|lzma|lzo|lz4)$" - such files are
            filtered   using   the   -v   option   to   egrep(1).    The   environment   variable
            $PCP_COMPRESSREGEX may be used as an alternative mechanism to define regex.  If  both
            $PCP_COMPRESSREGEX  and  -Y  specify  different values for regex then the environment
            variable value is used and a warning is issued.

       -z   This option causes pmlogger_daily to not ``re-exec'', see pmlogger(1), when it  would
            otherwise choose to do so and is intended only for QA testing.

       -Z   This  option  causes pmlogger_daily to ``re-exec'', see pmlogger(1), whenever that is
            possible and is intended only for QA testing.

       -?, --help
            Display usage message and exit.

CALLBACKS

       Additionally pmlogger_daily supports the following ``hooks'' to allow auxiliary operations
       to  be  performed  at key points in the daily processing of the archives.  These callbacks
       are controlled via variables that may be set in the environment or via the control file.

       Note that merge callbacks and  autosaving  described  below  are  not  enabled  when  only
       compression tasks are being attempted, i.e. when -K command line option is used.

       All  of the callback script execution and the autosave file moving will be executed as the
       non-privileged user ``pcp'' and group ``pcp'', so appropriate permissions may need to have
       been set up in advance.

       $PCP_MERGE_CALLBACK
            As  each  day's archive is created by merging and before any compression takes place,
            if $PCP_MERGE_CALLBACK is defined, then it is assumed to be a  script  that  will  be
            called with one argument being the name of the archive (stripped of any suffixes), so
            something of the form /some/directory/path/YYYYMMDD.  The script needs to be either a
            full  pathname,  or something that will be found on the shell's $PATH .  The callback
            script will be run in the foreground, so pmlogger_daily will wait for it to complete.

            If the control file contains more than  one  $PCP_MERGE_CALLBACK  specification  then
            these  will  be  run  serially  in  the  order  they  appear in the control file.  If
            $PCP_MERGE_CALLBACK is defined in the environment when pmlogger_daily is run, this is
            treated  as though this option was the first in the control file, i.e. it will be run
            before any merge callbacks mentioned in the control file.

            If the pcp-zeroconf packages is installed, then a special merge callback is added  to
            call  pmlogger_daily_report(1) first, before any other merge callback options.  Refer
            to pmlogger_daily_report(1) for an explanation of the pcp-zeroconf requirements.

            If pmlogger_daily is in ``catch up'' mode (more than one day's worth of archives need
            to  be  combined) then each call back is executed once for each day's archive that is
            generated.

            A typical use might be to produce daily reports from the PCP archive which  needs  to
            wait  until  the archive has been created, but is more efficient if it is done before
            any potential compression of the archive.

       $PCP_COMPRESS_CALLBACK
            If pmlogger_daily is run with -x 0 or $PCP_COMPRESSAFTER=0, then compression is  done
            immediately   after   merging.    As   each   day's   archive   is   compressed,   if
            $PCP_COMPRESS_CALLBACK is defined, then it is assumed to be a  script  that  will  be
            called with one argument being the name of the archive (stripped of any suffixes), so
            something of the form /some/directory/path/YYYYMMDD.  The script needs to be either a
            full  pathname,  or something that will be found on the shell's $PATH .  The callback
            script will be run in the foreground, so pmlogger_daily will wait for it to complete.

            If the control file contains more than one $PCP_COMPRESS_CALLBACK specification  then
            these  will  be  run  serially  in  the  order  they  appear in the control file.  If
            $PCP_COMPRESS_CALLBACK is defined in the environment when pmlogger_daily is run, this
            is  treated  as though this option was the first in the control file, i.e. it will be
            run first.

            If pmlogger_daily is in ``catch up'' mode (more than one day's worth of archives need
            to be compressed) then each call back is executed once for each day's archive that is
            compressed.

            A typical use might be to keep recent archives in  uncompressed  form  for  efficient
            querying,  but  move  the  older  archives  to  some  other storage location once the
            compression has been done.

       $PCP_AUTOSAVE_DIR
            Once the merging and  possible  compression  has  been  done  by  pmlogger_daily,  if
            $PCP_AUTOSAVE_DIR  is  defined  then all of the physical files that make up one day's
            archive will be moved (autosaved) to the directory specified by $PCP_AUTOSAVE_DIR.

            The basename of the archive is used to set the reserved words DATEYYYY (year), DATEMM
            (month) and DATEDD (day) and these (along with LOCALHOSTNAME) may appear literally in
            $PCP_AUTOSAVE_DIR, and  will  be  substituted  at  execution  time  to  generate  the
            destination directory name.  For example:
                  $PCP_AUTOSAVE_DIR=/gpfs/LOCALHOSTNAME/DATEYYYY/DATEMM-DATEDD

            Note  that  these ``date'' reserved words correspond to the date on which the archive
            data was collected, not the date that pmlogger_daily was run.

            If $PCP_AUTOSAVE_DIR (after LOCALHOSTNAME and ``date'' substitution) does  not  exist
            then pmlogger_daily will attempt to create it (along with any parent directories that
            do not exist).  Just be aware that this directory creation runs under the uid of  the
            user  ``pcp'',  so  directories  along  the  path to $PCP_AUTOSAVE_DIR may need to be
            writeable by this non-root user.

            By ``move'' the archives we mean a paranoid checksum-copy-checksum-remove (using  the
            -c  option  for  pmlogmv(1)) that will bail if the copy fails or the checksums do not
            match (the archives are important so we cannot risk something like a full  filesystem
            or a permissions issue messing with the copy process).

            If pmlogger_daily is in ``catch up'' mode (more than one day's worth of archives need
            to be combined) then the archives for more than one day could be copied in this step.

            A typical use might be to create PCP archives on a local filesystem  initially,  then
            once  all the data for a single day has been collected and merged, migrate that day's
            archive to a shared filesystem or a remote  filesystem.   This  may  allow  automatic
            backup  to off-site storage and/or reduce the number of I/O operations and filesystem
            metadata operations on the (potentially slower) non-local filesystem.

CONFIGURATION

       Refer to pmlogger.control(5) for a description of the contol  file(s)  that  are  used  to
       control  which  pmlogger  instances  and  which archives are managed by pmlogger_check and
       pmlogger_daily(1).

FILES

       $PCP_VAR_DIR/config/pmlogger/config.default
            default pmlogger configuration file location for the local primary logger,  typically
            generated automatically by pmlogconf(1).

       $PCP_ARCHIVE_DIR/<hostname>
            default  location  for  archives  of  performance information collected from the host
            hostname

       $PCP_ARCHIVE_DIR/<hostname>/lock
            transient lock file to guarantee mutual exclusion during pmlogger administration  for
            the  host  hostname - if present, can be safely removed if neither pmlogger_daily nor
            pmlogger_check(1) are running

       $PCP_ARCHIVE_DIR/<hostname>/Latest
            PCP archive  folio  created  by  mkaf(1)  for  the  most  recently  launched  archive
            containing performance metrics from the host hostname

       $PCP_LOG_DIR/NOTICES
            PCP ``notices'' file used by pmie(1) and friends

       $PCP_LOG_DIR/pmlogger/pmlogger_daily.log
            if  the  previous  execution  of pmlogger_daily produced any output it is saved here.
            The normal case is no output in which case the file does not exist.

       $PCP_ARCHIVE_DIR/SaveLogs
             if this directory exists, then the log file from the -l argument for  pmlogger_daily
             will    be    saved    in   this   directory   with   the   name   of   the   format
             <date>-pmlogger_daily.log.<pid> or <date>-pmlogger_daily-K.log.<pid> This allows the
             log  file to be inspected at a later time, even if several pmlogger_daily executions
             have been launched in the interim.  Because the PCP  archive  management  tools  run
             under the $PCP_USER account ``pcp'', $PCP_ARCHIVE_DIR/SaveLogs typically needs to be
             owned by the user ``pcp''.

       $PCP_ARCHIVE_DIR/<hostname>/SaveLogs
              if this directory exists, then the log  file  from  the  -l  argument  of  a  newly
              launched  pmlogger(1)  for  hostname  will be saved in this directory with the name
              archive.log where archive is the basename of the associated pmlogger(1) PCP archive
              files.   This  allows the log file to be inspected at a later time, even if several
              pmlogger(1) instances for hostname have been launched in the interim.  Because  the
              PCP   archive   management   tools   run   under  the  uid  of  the  user  ``pcp'',
              $PCP_ARCHIVE_DIR/<hostname>/SaveLogs typically  needs  to  be  owned  by  the  user
              ``pcp''.

       $PCP_LOG_DIR/pmlogger/.NeedRewrite
               if this file exists, then this is treated as equivalent to using -R on the command
               line and the file will be removed once all rewriting has been done.

PCP ENVIRONMENT

       Environment variables with the prefix PCP_ are used to parameterize the file and directory
       names used by PCP.  On each installation, the file /etc/pcp.conf contains the local values
       for these variables.  The $PCP_CONF  variable  may  be  used  to  specify  an  alternative
       configuration file, as described in pcp.conf(5).

COMPATIBILITY ISSUES

       Earlier versions of pmlogger_daily used find(1) to locate files for compressing or culling
       and the -k and -x options took only integer values to mean  ``days''.   The  semantics  of
       this  was  quite  loose given that find(1) offers different precision and semantics across
       platforms.

       The current implementation of  pmlogger_daily  uses  find-filter(1)  which  provides  high
       precision  intervals  and  semantics  that  are  relative to the time of execution and are
       consistent across platforms.

SEE ALSO

       egrep(1), find-filter(1), PCPIntro(1), pmconfig(1),  pmlc(1),  pmlogconf(1),  pmlogctl(1),
       pmlogextract(1),       pmlogger(1),      pmlogger_check(1),      pmlogger_daily_report(1),
       pmlogger_merge(1), pmlogmv(1), pmlogrewrite(1), systemd(1), xz(1) and cron(8).