Provided by: xymon_4.3.30-3_amd64

NAME

       xymond_alert - xymond worker module for sending out alerts

SYNOPSIS

       xymond_channel --channel=page xymond_alert [options]

DESCRIPTION

       xymond_alert  is  a  worker  module  for  xymond,  and  as such it is normally run via the
       xymond_channel(8) program. It receives xymond  page-  and  ack-messages  from  the  "page"
       channel  via stdin, and uses these to send out alerts about failed and recovered hosts and
       services.

       The operation of this module is controlled by the alerts.cfg(5) file. This file holds the
       definitions of the rules and recipients that determine who gets alerts, how often, for
       which servers, and so on.

OPTIONS

       --config=FILENAME
              Sets the filename for the alerts.cfg file. The default  value  is  "etc/alerts.cfg"
              below the Xymon server directory.

       --dump-config
              Dumps the configuration after parsing it. Useful for tracking down errors in the
              configuration file.

       --checkpoint-file=FILENAME
              File where the current state of the xymond_alert module is  saved.   When  starting
              up, xymond_alert will also read this file to restore the previous state.

       --checkpoint-interval=N
              Defines how often (in seconds) the checkpoint-file is saved.

       --cfid If this option is present, alert messages will include a line with "cfid:N" where N
              is the line number in the alerts.cfg file that caused this message to be sent. This
              can be useful to track down problems with duplicate alerts.

       --test HOST SERVICE [options]
              Shows which alert rules match the given HOST/SERVICE combination. Useful for
              debugging configuration problems and seeing which rules are used for an alert.

              The possible options are:

              --color=COLORNAME
                     The color of the alert: red, yellow or purple.

              --duration=MINUTES
                     The duration of the alert, in minutes.

              --group=GROUPNAME
                     A groupid string from the analysis.cfg file.

              --time=TIMESTRING
                     The time-of-day for the alert, expressed as an absolute time in the
                     epoch format (seconds since Jan 1 1970). This is easily obtained with
                     the GNU date utility using the "+%s" output format.
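
              As an illustration only, the following Python sketch converts a point in time
              to the epoch format expected by --time and assembles a --test command line. It
              only builds and prints the arguments; it does not run xymond_alert. The host
              and service names and the helper name build_test_argv are hypothetical, and it
              is assumed that xymond_alert can be found on the PATH.

```python
from datetime import datetime, timezone


def build_test_argv(host, service, when, color="red", duration_minutes=0):
    """Return an argv list for a `xymond_alert --test` run (nothing is executed)."""
    # --time takes seconds since Jan 1 1970, the same value `date +%s` prints.
    epoch = int(when.timestamp())
    return [
        "xymond_alert", "--test", host, service,
        f"--color={color}",
        f"--duration={duration_minutes}",
        f"--time={epoch}",
    ]


# Hypothetical host and service, chosen only for illustration.
argv = build_test_argv(
    "www.example.com", "http",
    datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc),
    color="yellow", duration_minutes=15,
)
print(" ".join(argv))
```

              Using a timezone-aware datetime keeps the resulting timestamp independent of
              the local timezone of the machine that builds the command.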

       --trace=FILENAME
              Send trace output to FILENAME. This allows for more detailed analysis of how alerts
              trigger, without having the full debugging enabled.

       --debug
              Enable debugging output.

HOW XYMON DECIDES WHEN TO SEND ALERTS

       The xymond_alert module is responsible for sending out all alerts. When a status first
       goes to one of the ALERTCOLORS, xymond_alert is notified of this change. It notes that the
       status is now in an alert state, records the timestamp when this event started, and adds
       the alert to the list of statuses that may potentially trigger one or more alert
       messages.

       This list is then matched against the alerts.cfg configuration. This happens at least
       once a minute, but may happen more often. For example, when a status first goes into an
       alert state, the matching is always triggered.

       When scanning the configuration, xymond_alert looks at all of the configuration rules. It
       also checks the DURATION setting against how much time has elapsed since the event
       started, i.e. against the timestamp logged when xymond_alert first heard of this event.

       When an alert recipient is found, the alert is sent, and xymond_alert records when this
       recipient is due for the next alert message, based on the REPEAT setting defined for this
       recipient. The next time xymond_alert scans the configuration for what alerts to send, it
       will still find this recipient because all of the configuration rules are fulfilled, but
       an alert message will not be generated until the repeat interval has elapsed.
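
       The interplay of DURATION and REPEAT described above can be sketched roughly as follows.
       This is a simplified Python model, not the actual xymond_alert code: the Recipient class,
       the scan function, the use of seconds as the time unit, and the reading of DURATION as a
       plain minimum age are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Recipient:
    name: str
    duration: int = 0      # minimum age of the event before alerting (assumed: seconds)
    repeat: int = 1800     # interval between alerts to this recipient (assumed: seconds)
    next_due: float = 0.0  # earliest time the next alert may be sent


def scan(recipients, event_start, now):
    """Return the names of the recipients that get an alert on this scan."""
    sent = []
    for r in recipients:
        if now - event_start < r.duration:
            continue  # the event has not lasted long enough yet
        if now < r.next_due:
            continue  # rules still match, but the repeat interval has not elapsed
        sent.append(r.name)
        r.next_due = now + r.repeat  # remember when this recipient is due again
    return sent


# Hypothetical recipient: alert after 5 minutes, then every 30 minutes.
oncall = [Recipient("oncall", duration=300, repeat=1800)]
start = 1000
for offset in (60, 300, 360, 2100):
    print(offset, scan(oncall, start, start + offset))
```

       In this model only the scans at offsets 300 and 2100 produce an alert: the first scan is
       too early for DURATION, and the third falls inside the repeat interval.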

       It can happen that a status first goes yellow and triggers an alert, and later it goes red
       - e.g. a disk filling up. In that case, xymond_alert clears the internal  timer  for  when
       the  next  (repeat)  alert  is  due for all recipients. You generally want to be told when
       something that has been in a warning state becomes critical, so in that  case  the  REPEAT
       setting  is  ignored and the alert is sent. This only happens the first time such a change
       occurs - if the status switches between yellow and red  multiple  times,  only  the  first
       transition from yellow->red causes this override.
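
       A minimal sketch of this one-time override, again as an illustrative Python model rather
       than the real implementation. The AlertState class and its attribute names are
       hypothetical; the only behaviour taken from the text is that the first yellow->red
       transition clears the repeat timers for all recipients and later transitions do not.

```python
class AlertState:
    """Tracks one status that is currently in an alert state."""

    def __init__(self, color):
        self.color = color
        self.escalated = False  # True once the first yellow->red override was used
        self.next_due = {}      # recipient name -> time of the next allowed alert

    def change_color(self, new_color):
        """Apply a color change; return True if the repeat timers were cleared."""
        cleared = False
        if self.color == "yellow" and new_color == "red" and not self.escalated:
            self.next_due.clear()  # every recipient becomes due immediately
            self.escalated = True
            cleared = True
        self.color = new_color
        return cleared


state = AlertState("yellow")
state.next_due["oncall"] = 5000
print(state.change_color("red"))     # first yellow->red: timers are cleared
state.next_due["oncall"] = 9000
state.change_color("yellow")
print(state.change_color("red"))     # second yellow->red: timers are kept
```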

       When a status recovers, a recovery message may be sent, depending on the configuration,
       and then xymond_alert forgets everything about this status. So the next time it goes into
       an alert state, the entire process starts all over again.

ENVIRONMENT

       MAIL   The first part of a command line used to send out an e-mail with a subject,
              typically set to "/usr/bin/mail -s". xymond_alert will add the subject and the
              mail recipients to form the command line used for sending out email alerts.

       MAILC  The  first  part  of  a  command line used to send out an e-mail without a subject.
              Typically this will be "/usr/bin/mail". xymond_alert will add the  mail  recipients
              to form the command line used for sending out email alerts.
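
       To show how these two settings combine with the subject and recipients, here is a small
       Python sketch. It is an assumption-laden illustration: the function name mail_command, the
       shell quoting, and the fallback values (taken from the typical settings mentioned above)
       are not from xymond_alert itself, and the exact command line it builds may differ.

```python
import os
import shlex


def mail_command(recipients, subject=None, env=os.environ):
    """Compose a shell command line for one alert e-mail (illustrative only)."""
    # Quote each recipient so it is passed to the shell as a single word.
    rcpts = " ".join(shlex.quote(r) for r in recipients)
    if subject is not None:
        # MAIL ends with the subject option, so the subject follows directly.
        prefix = env.get("MAIL", "/usr/bin/mail -s")
        return f"{prefix} {shlex.quote(subject)} {rcpts}"
    # MAILC is used when the message is sent without a subject.
    prefix = env.get("MAILC", "/usr/bin/mail")
    return f"{prefix} {rcpts}"


# Hypothetical recipient address and subject text.
print(mail_command(["admin@example.com"], "disk red", env={}))
print(mail_command(["admin@example.com"], env={}))
```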

FILES

       ~xymon/server/etc/alerts.cfg

SEE ALSO

       alerts.cfg(5), xymond(8), xymond_channel(8), xymon(7)