Ubuntu Manpage: alerts.cfg - Configuration for for xymond

NAME

       alerts.cfg - Configuration for for xymond_alert module

SYNOPSIS

       ~xymon/server/etc/alerts.cfg

DESCRIPTION

       The alerts.cfg file controls the sending of alerts by Xymon when monitoring detects a failure.

FILE FORMAT

       The  configuration  file  consists of rules, that may have one or more recipients associated. A recipient
       specification may include additional rules that limit the circumstances when this recipient  is  eligible
       for receiving an alert.

       Blank  lines and lines starting with a hash mark (#) are treated as comments and ignored.  Long lines can
       be broken up by putting a backslash at the end of the line and continuing the entry on the next line.

RULES

A rule consists of one of more filters using these keywords:

PAGE=targetstring Rule matching an alert by the name of the page in Xymon. This is the path of the page
as defined in the hosts.cfg file. E.g. if you have this setup:

page servers All Servers
subpage web Webservers
10.0.0.1 www1.foo.com
subpage db Database servers
10.0.0.2 db1.foo.com

Then the "All servers" page is found with PAGE=servers, the "Webservers" page is PAGE=servers/web and the
"Database servers" page is PAGE=servers/db. Note that you can also use regular expressions to specify the
page name, e.g. PAGE=%.*/db would find the "Database servers" page regardless of where this page was
placed in the hierarchy.

The PAGE name of top-level page is an empty string. To match this, use PAGE=%^$ to match the empty
string.

EXPAGE=targetstring Rule excluding an alert if the pagename matches.

DISPLAYGROUP=groupstring Rule matching an alert by the text of the display-group (text following the
group, group-only, group-except heading) in the hosts.cfg file. "groupstring" is the text for the group,
stripped of any HTML tags. E.g. if you have this setup:

group Web
10.0.0.1 www1.foo.com
10.0.0.2 www2.foo.com
group Production databases
10.0.1.1 db1.foo.com

Then the hosts in the Web-group can be matched with DISPLAYGROUP=Web, and the database servers can be
matched with DISPLAYGROUP="Production databases". Note that you can also use regular expressions, e.g.
DISPLAYGROUP=%database. If there is no group-setting for the host, use "DISPLAYGROUP=NONE".

EXDISPLAYGROUP=groupstring Rule excluding a group by matching the display-group string.

HOST=targetstring Rule matching an alert by the hostname.

EXHOST=targetstring Rule excluding an alert by matching the hostname.

SERVICE=targetstring Rule matching an alert by the service name.

EXSERVICE=targetstring Rule excluding an alert by matching the service name.

GROUP=groupname Rule matching an alert by the group name. Groupnames are assigned to a status via the
GROUP setting in the analysis.cfg file.

EXGROUP=groupname Rule excluding an alert by the group name. Groupnames are assigned to a status via the
GROUP setting in the analysis.cfg file.

CLASS=classname Rule matching an alert by the class that the host belongs to. By default, the classname
is the operating system name; you can set another class either in hosts.cfg(5) using the CLASS tag, or a
client running on the server can set the class (via a parameter to the client startup-script).

EXCLASS=classname Rule excluding an alert by the class name.

COLOR=color[,color] Rule matching an alert by color. Can be "red", "yellow", or "purple". The forms
"!red", "!yellow" and "!purple" can also be used to NOT send an alert if the color is the specified one.

TIME=timespecification Rule matching an alert by the time-of-day. This is specified as the DOWNTIME
timespecification in the hosts.cfg file.

EXTIME=timespecification Rule excluding an alert by the time-of-day. This is specified as the DOWNTIME
timespecification in the hosts.cfg file.

DURATION>time, DURATION<time Rule matcing an alert if the event has lasted longer/shorter than the given
duration. E.g. DURATION>1h (lasted longer than 1 hour) or DURATION<30 (only sends alerts the first 30
minutes). The duration is specified as a number, optionally followed by 'm' (minutes, default), 'h'
(hours) or 'd' (days).

RECOVERED Rule matches if the alert has recovered from an alert state.

NOTICE Rule matches if the message is a "notify" message. This type of message is sent when a host or
test is disabled or enabled.

The "targetstring" is either a simple pagename, hostname or servicename, OR a '%' followed by a Perl-
compatible regular expression. E.g. "HOST=%www(.*)" will match any hostname that begins with "www". The
same for the "groupname" setting.

RECIPIENTS

The recipients are listed after the initial rule. The following keywords can be used to define
recipients:

MAIL address[,address] Recipient who receives an e-mail alert. This takes one parameter, the e-mail
address. The strings "&host&", "&service&" and "&color&" in an address will be replaced with the
hostname, service and color of the alert, respectively.

SCRIPT /path/to/script recipientID Recipient that invokes a script. This takes two parameters: The script
filename, and the recipient that gets passed to the script. The strings "&host&", "&service&" and
"&color&" in the recipientID will be replaced with the hostname, service and color of the alert,
respectively.

IGNORE This is used to define a recipient that does NOT trigger any alerts, and also terminates the
search for more recipients. It is useful if you have a rule that handles most alerts, but there is just
that one particular server where you don't want cpu alerts on Monday morning. Note that the IGNORE
recipient always has the STOP flag defined, so when the IGNORE recipient is matched, no more recipients
will be considered. So the location of this recipient in your set of recipients is important.

FORMAT=formatstring Format of the text message with the alert. Default is "TEXT" (suitable for e-mail
alerts). "PLAIN" is the same as text, but without the URL link to the status webpage. "SMS" is a short
message with no subject for SMS alerts. "SCRIPT" is a brief message template for scripts.

REPEAT=time How often an alert gets repeated. As with DURATION, time is a number optionally followed by
'm', 'h' or 'd'.

UNMATCHED The alert is sent to this recipient ONLY if no other recipients received an alert for this
event.

STOP Stop looking for more recipients after this one matches. This is implicit on IGNORE recipients.

Rules You can specify rules for a recipient also. This limits the alerts sent to this particular
recipient.

MACROS

       It is possible to use macros in the configuration file. To define a macro:

            $MYMACRO=text extending to end of line

       After the definition of a macro, it can be used throughout the file. Wherever the text $MYMACRO  appears,
       it will be substituted with the text of the macro before any processing of rules and recipients.

       It is possible to nest macros, as long as the macro is defined before it is used.

ALERT SCRIPTS

       Alerts  can  go  out  via custom scripts, by using the SCRIPT keyword for a recipient.  Such scritps have
       access to the following environment variables:

       BBALPHAMSG The full text of the status log triggering the alert

       ACKCODE The "cookie" that can be used to acknowledge the alert

       RCPT The recipientID from the SCRIPT entry

       BBHOSTNAME The name of the host that the alert is about

       MACHIP The IP-address of the host that has a problem

       BBSVCNAME The name of the service that the alert is about

       BBSVCNUM The numeric code for the service. From the SVCCODES definition.

       BBHOSTSVC HOSTNAME.SERVICE that the alert is about.

       BBHOSTSVCCOMMAS As BBHOSTSVC, but dots in the hostname replaced with commas

       BBNUMERIC A 22-digit number made by BBSVCNUM, MACHIP and ACKCODE.

       RECOVERED Is "0" if the service is alerting, "1" if the service has recovered, "2"  if  the  service  was
       disabled.

       EVENTSTART Timestamp when the current status (color) began.

       SECS Number of seconds the service has been down.

       DOWNSECSMSG When recovered, holds the text "Event duration : N" where N is the DOWNSECS value.

       CFID  Line-number  in  the  alerts.cfg  file  that  caused  the script to be invoked.  Can be useful when
       troubleshooting alert configuration rules.