Provided by: xymon_4.3.0~beta2.dfsg-9.1_amd64

NAME

       hobbit-clients.cfg - Configuration file for the hobbitd_client module

SYNOPSIS

       ~Xymon/server/etc/hobbit-clients.cfg

DESCRIPTION

       The  hobbit-clients.cfg  file  controls what color is assigned to the status-messages that
       are generated from the Xymon client data - typically the cpu,  disk,  memory,  procs-  and
       msgs-columns.  Color  is  decided  on  the  basis  of  some settings defined in this file;
       settings apply to specific hosts through a set of rules.

       Note: This file is only used on the Xymon server - it is not used by the Xymon client,  so
       there is no need to distribute it to your client systems.

FILE FORMAT

       Blank lines and lines starting with a hash mark (#) are treated as comments and ignored.

CPU STATUS COLUMN SETTINGS

       LOAD warnlevel paniclevel

       If the system load exceeds "warnlevel" or "paniclevel", the "cpu" status will go yellow or
       red, respectively. These are decimal numbers.

       Defaults: warnlevel=5.0, paniclevel=10.0

       UP bootlimit toolonglimit

       The cpu status goes yellow if the system has been up for less than  "bootlimit"  time,  or
       longer than "toolonglimit". The time is in minutes, or you can add h/d/w for
       hours/days/weeks - e.g. "2h" for two hours, or "4w" for four weeks.

       Defaults: bootlimit=1h, toolonglimit=-1 (infinite).

       CLOCK max.offset

       The cpu status  goes  yellow  if  the  system  clock  on  the  client  differs  more  than
       "max.offset"  seconds  from that of the Xymon server. Note that this is not a particularly
       accurate test, since it is affected by network delays between the client and  the  server,
       and the load on both systems. You should therefore not rely on this being accurate to more
       than +/- 5 seconds, but it will let you catch a client clock that goes  completely  wrong.
       The default is NOT to check the system clock.
       NOTE: Correct operation of this test obviously requires that the system clock of the Xymon
       server is correct. You should therefore make sure that the Xymon server is synchronized to
       the real clock, e.g. by using NTP.

       Example:  Go  yellow  if  the  load  average exceeds 5, and red if it exceeds 10. Also, go
       yellow for 10 minutes after a reboot, and after 4 weeks uptime. Finally,  check  that  the
       system clock is at most 15 seconds offset from the clock of the Xymon system.

              LOAD 5 10
              UP 10m 4w
              CLOCK 15

DISK STATUS COLUMN SETTINGS

       DISK filesystem warnlevel paniclevel
       DISK filesystem IGNORE

       If  the utilization of "filesystem" is reported to exceed "warnlevel" or "paniclevel", the
       "disk" status will go yellow or  red,  respectively.   "warnlevel"  and  "paniclevel"  are
       either  the  percentage used, or the space available as reported by the local "df" command
       on the host.  For the latter type of check, the "warnlevel" must be followed by the letter
       "U", e.g. "1024U".

       The special keyword "IGNORE" causes this filesystem to be ignored completely, i.e. it will
       not appear in the "disk" status column and it will not be tracked  in  a  graph.  This  is
       useful for e.g. removable devices, backup-disks and similar hardware.

       "filesystem"  is the mount-point where the filesystem is mounted, e.g.  "/usr" or "/home".
       A filesystem-name that begins  with  "%"  is  interpreted  as  a  Perl-compatible  regular
       expression;  e.g.  "%^/oracle.*/"  will  match any filesystem whose mountpoint begins with
       "/oracle".

       Defaults: warnlevel=90%, paniclevel=95%
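
       Example: A sketch only; the mount points below are hypothetical. Go yellow when "/var"
       is more than 80% full and red above 90%, allow every filesystem mounted below an
       "/oracle..." mount point to fill up to 95% before going yellow and 98% before going
       red, and leave a backup disk out of the "disk" status entirely.

              DISK /var 80 90
              DISK %^/oracle.*/ 95 98
              DISK /mnt/backup IGNORE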

MEMORY STATUS COLUMN SETTINGS

       MEMPHYS warnlevel paniclevel
       MEMACT warnlevel paniclevel
       MEMSWAP warnlevel paniclevel

       If the memory utilization exceeds the "warnlevel" or  "paniclevel",  the  "memory"  status
       will  change to yellow or red, respectively.  Note: The words "PHYS", "ACT" and "SWAP" are
       also recognized.

       Example: Go yellow if more than 20% swap is used, and red if more than 40% swap is used or
       the actual memory utilisation exceeds 90%. Do not alert on physical memory usage.

              MEMSWAP 20 40
              MEMACT 90 90
              MEMPHYS 101 101

       Defaults:

              MEMPHYS warnlevel=100 paniclevel=101 (i.e. it will never go red).
              MEMSWAP warnlevel=50 paniclevel=80
              MEMACT  warnlevel=90 paniclevel=97

PROCS STATUS COLUMN SETTINGS

       PROC processname minimumcount maximumcount color [TRACK=id] [TEXT=text]

       The  "ps"  listing  sent  by  the client will be scanned for how many processes containing
       "processname" are running, and this is then matched against the min/max  settings  defined
       here.  If  the  running  count  is outside the thresholds, the color of the "procs" status
       changes to "color".

       To check for a process that must NOT be running: Set minimum and maximum to 0.

       "processname" can be a simple string, in which case this string must show up in  the  "ps"
       listing  as  a command. The scanner will find a ps-listing of e.g. "/usr/sbin/cron" if you
       only specify "processname" as "cron". "processname" can also be a Perl-compatible
       regular  expression,  e.g.   "%java.*inst[0123]"  can  be  used to find entries in the ps-
       listing for "java -Xmx512m inst2" and "java -Xmx256 inst3". In  that  case,  "processname"
       must begin with "%" followed by the regular expression.  Note that Xymon defaults to case-
       insensitive pattern matching; if that is not what you want, put "(?-i)"  between  the  "%"
       and  the regular expression to turn this off. E.g. "%(?-i)HTTPD" will match the word HTTPD
       only when it is upper-case.
       If "processname" contains whitespace (blanks or TAB), you must enclose the full string  in
       double quotes - including the "%" if you use regular expression matching. E.g.

           PROC "%hobbitd_channel --channel=data.*hobbitd_rrd" 1 1 yellow

       or

           PROC "java -DCLASSPATH=/opt/java/lib" 2 5

       You can have multiple "PROC" entries for the same host; all of the checks are merged
       into the "procs" status, and the most severe check defines the color of the status.

       The optional TRACK=id setting causes Xymon to track the number of processes  found  in  an
       RRD  file,  and put this into a graph which is shown on the "procs" status display. The id
       setting is a simple text string which will be used as the legend for the graph,  and  also
       as  part  of  the RRD filename. It is recommended that you use only letters and digits for
       the ID.
       Note that the tracked process count is sampled only once per client poll cycle, i.e.
       it is a snapshot of the system state, not an average over the poll cycle. Peaks or
       dips in the actual process count that occur between polls will therefore not show up
       in the graph.

       The optional TEXT=text setting is used in the summary of the "procs" status. Normally, the
       summary  will  show  the  "processname"  to identify the process and the related count and
       limits. But this may be a regular expression which  is  not  easily  recognizable,  so  if
       defined,  the  text  setting  string  will  be used instead. This only affects the "procs"
       status display - it has no effect on how the rule counts or recognizes  processes  in  the
       "ps" output.

       Example: Check that "cron" is running:
            PROC cron

       Example: Check that at least 5 "httpd" processes are running, but not more than 20:
            PROC httpd 5 20

       Defaults:
            mincount=1, maxcount=-1 (unlimited), color="red".
            Note that no processes are checked by default.
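
       Example: A sketch with hypothetical process names, track id and display text. Go
       yellow if any "telnetd" process is running, and go yellow unless between 1 and 4 of
       the Java instances are found, tracking their number in a graph:
            PROC telnetd 0 0 yellow
            PROC %java.*inst[0123] 1 4 yellow TRACK=java TEXT=JavaInstances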

MSGS STATUS COLUMN SETTINGS

       LOG logfilename pattern [COLOR=color] [IGNORE=excludepattern]

       The  Xymon  client  extracts interesting lines from one or more logfiles - see the client-
       local.cfg(5) man-page for information about how to configure which logs  a  client  should
       look at.

       The LOG setting determines how these extracts of log entries are processed, and what
       warnings or alerts trigger as a result.

       "logfilename" is the name of the logfile. Only  logentries  from  this  filename  will  be
       matched  against  this  rule.   Note  that  "logfilename"  can be a regular expression (if
       prefixed with a '%' character).

       "pattern" is a string or regular expression. If the logfile  data  matches  "pattern",  it
       will  trigger  the  "msgs" column to change color. If no "color" parameter is present, the
       default is to go "red" when the pattern is matched. To match against a regular expression,
       "pattern" must begin with a '%' sign - e.g. "%WARNING|NOTICE" will match any lines
       containing either of these two  words.   Note  that  Xymon  defaults  to  case-insensitive
       pattern  matching;  if  that  is  not  what  you want, put "(?-i)" between the "%" and the
       regular expression to turn this off. E.g. "%(?-i)WARNING" will match the word WARNING only
       when it is upper-case.

       "excludepattern"  is  a  string  or  regular expression that can be used to filter out any
       unwanted strings that happen to match "pattern".

       Example: Trigger a red alert when the string  "ERROR"  appears  in  the  "/var/adm/syslog"
       file:
            LOG /var/adm/syslog ERROR

       Example:  Trigger a yellow warning on all occurrences of the word "WARNING" or "NOTICE" in
       the "daemon.log" file, except those from the "lpr" system:
            LOG /var/log/daemon.log %WARNING|NOTICE COLOR=yellow IGNORE=lpr

       Defaults:
            color="red", no "excludepattern".

       Note that no logfiles are checked by default. Any log data reported by a client will  just
       show up on the "msgs" column with status OK (green).

FILES STATUS COLUMN SETTINGS

       FILE filename [color] [things to check] [TRACK]

       DIR directoryname [color] [size<MAXSIZE] [size>MINSIZE] [TRACK]

       These entries control the status of the "files" column. They allow you to check on various
       data for files and directories.

       filename and directoryname are names of files or directories, with a full  path.  You  can
       use  a  regular  expression  to  match  the names of files and directories reported by the
       client, if you prefix the expression with a '%' character.

       color is the color that triggers when one or more of the checks fail.

       The TRACK keyword causes the size of the file or directory to be tracked in an  RRD  file,
       and presented in a graph on the "files" status display.

       For files, you can check one or more of the following:

       noexist
              triggers a warning if the file exists. By default, a warning is triggered for files
              that have a FILE entry, but which do not exist.

       ifexist
              only checks the file if it exists. If the  file  is  reported  as  missing  by  the
              client, it is ignored.

       type=TYPE
              where  TYPE is one of "file", "dir", "char", "block", "fifo", or "socket". Triggers
              warning if the file is not of the specified type.

       ownerid=OWNER
              triggers a warning if the owner does not match  what  is  listed  here.   OWNER  is
              specified either with the numeric uid, or the user name.

       groupid=GROUP
              triggers  a  warning  if  the  group  does not match what is listed here.  GROUP is
              specified either with the numeric gid, or the group name.

       mode=MODE
              triggers a warning if the file permissions are not as listed. MODE  is  written  in
              the standard octal notation, e.g.  "644" for the rw-r--r-- permissions.

       size<MAX.SIZE and size>MIN.SIZE
              triggers a warning if the file size is greater than MAX.SIZE or less than MIN.SIZE,
              respectively. For filesizes, you can use the letters "K", "M", "G" or "T" to
              indicate that the filesize is in Kilobytes, Megabytes, Gigabytes or Terabytes,
              respectively. If there is no such modifier, Kilobytes is assumed. E.g. to warn if a
              file grows larger than 1MB, use size<1M.

       mtime>MIN.MTIME mtime<MAX.MTIME
              checks  how  long  ago the file was last modified (in seconds). E.g.  to check if a
              file was updated within the past 10 minutes (600 seconds): mtime<600. Or  to  check
              that a file has NOT been updated in the past 24 hours: mtime>86400.

       mtime=TIMESTAMP
              checks  if  a  file was last modified at TIMESTAMP.  TIMESTAMP is a unix epoch time
              (seconds since midnight Jan 1 1970 UTC).

       ctime>MIN.CTIME, ctime<MAX.CTIME, ctime=TIMESTAMP
              acts as the mtime checks, but for the ctime timestamp (when the directory entry  of
              the file was last changed, e.g. by chown, chgrp or chmod).

       md5=MD5SUM, sha1=SHA1SUM, rmd160=RMD160SUM
              triggers a warning if the file checksum computed with the MD5, SHA1 or RMD160
              message digest algorithm does not match the one configured here. Note: The "file"
              entry in the client-local.cfg(5) file must specify which algorithm to use.

       For directories, you can check one or more of the following:

       size<MAX.SIZE and size>MIN.SIZE
              triggers a warning if the directory size is greater than MAX.SIZE or less than
              MIN.SIZE, respectively. Directory sizes  are  reported  in  whatever  unit  the  du
              command on the client uses - often KB or diskblocks - so MAX.SIZE and MIN.SIZE must
              be given in the same unit.

       Experience shows that it can be difficult to get these rules right, especially when
       defining minimum/maximum values for file sizes or modification times. The one thing
       you must remember when setting up these checks is that the rules describe criteria
       that must be met - only when they are met will the status be green.

       So  "mtime<600"  means "the difference between current time and the mtime of the file must
       be less than 600 seconds - if not, the file status will go red".
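
       Example: A sketch only; the paths, owner and limits below are hypothetical, and the
       client must also be set up in client-local.cfg(5) to report on these files. Go yellow
       if "/var/log/messages" has not been modified within the past 10 minutes or has grown
       beyond 100 Megabytes, and track its size in a graph. Go red if "/etc/passwd" is not a
       regular file owned by root with mode 644, or if a maintenance lock file exists. Go
       yellow if the spool directory grows beyond 50000 of the units reported by "du" on the
       client.

              FILE /var/log/messages yellow mtime<600 size<100M TRACK
              FILE /etc/passwd red type=file ownerid=root mode=644
              FILE /var/run/maintenance.lock red noexist
              DIR /var/spool/mqueue yellow size<50000 TRACK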

PORTS STATUS COLUMN SETTINGS

       PORT criteria [MIN=mincount] [MAX=maxcount] [COLOR=color] [TRACK=id] [TEXT=displaytext]

       The "netstat" listing sent by the client will be scanned for how many  sockets  match  the
       criteria listed.  The criteria you can use are:

       LOCAL=addr
              "addr"  is a (partial) local address specification in the format used on the output
              from netstat.

       EXLOCAL=addr
              Exclude certain local addresses from the rule.

       REMOTE=addr
              "addr" is a (partial) remote address specification in the format used on the output
              from netstat.

       EXREMOTE=addr
              Exclude certain remote addresses from the rule.

       STATE=state
              Causes only the sockets in the specified state to be included. "state" is usually
              LISTEN or ESTABLISHED, but can be any socket state reported by the client's
              "netstat" command.

       EXSTATE=state
              Exclude certain states from the rule.

       "addr" is typically "10.0.0.1:80" for the IP 10.0.0.1, port 80, or "*:80" for any local
       address, port 80. Note that the Xymon clients normally report only the numeric data for
       IP-addresses and port-numbers, so you must specify the port number (e.g. "80") instead of
       the service name ("www").
       "addr" and "state" can also be a Perl-compatible regular expression, e.g.
       "LOCAL=%[.:](80|443)" can be used to find entries in the netstat local port for both http
       (port 80) and https (port 443). In that case, "addr" or "state" must begin with "%"
       followed by the regular expression.

       The  socket  count found is then matched against the min/max settings defined here. If the
       count is outside the thresholds, the color of the "ports" status changes to  "color".   To
       check for a socket that must NOT exist: Set minimum and maximum to 0.

       The  optional TRACK=id setting causes Xymon to track the number of sockets found in an RRD
       file, and put this into a graph which is shown on  the  "ports"  status  display.  The  id
       setting  is  a simple text string which will be used as the legend for the graph, and also
       as part of the RRD filename. It is recommended that you use only letters  and  digits  for
       the ID.
       Note that the tracked socket count is sampled only once per client poll cycle, i.e.
       it is a snapshot of the system state, not an average over the poll cycle. Peaks or
       dips in the actual socket count that occur between polls will therefore not show up
       in the graph.

       The  TEXT=displaytext  option  affects how the port appears on the "ports" status page. By
       default, the port is listed with the local/remote/state rules as identification, but  this
       may be somewhat difficult to understand. You can then use e.g. "TEXT=Secure Shell" to make
       these ports appear with the name "Secure Shell" instead.

       Defaults: mincount=1, maxcount=-1 (unlimited), color="red".  Note: No ports are checked by
       default.

       Example: Check that the SSH daemon is listening on port 22. Track the number of active SSH
       connections, and warn if there are more than 5.
               PORT LOCAL=%[.:]22$ STATE=LISTEN "TEXT=SSH listener"
               PORT LOCAL=%[.:]22$ STATE=ESTABLISHED MAX=5 TRACK=ssh TEXT=SSH
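
       Example: A sketch of a socket that must NOT exist, assuming a telnet daemon would
       listen on its conventional port 23; the display text is hypothetical. Go yellow if
       anything is listening on that port.
               PORT LOCAL=%[.:]23$ STATE=LISTEN MIN=0 MAX=0 COLOR=yellow TEXT=telnet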

CHANGING THE DEFAULT SETTINGS

       If you would like to use different defaults for the settings described above, then you can
       define the new defaults after a DEFAULT line. E.g. this would explicitly define all of the
       default settings:

              DEFAULT
                   UP      1h
                   LOAD    5.0 10.0
                   DISK    * 90 95
                   MEMPHYS 100 101
                   MEMSWAP 50 80
                   MEMACT  90 97

RULES TO SELECT HOSTS

       All of the settings can be applied to a group of hosts, by preceding them with rules. A
       rule consists of one or more filters using these keywords (note that this is identical
       to the rule definitions used in the hobbit-alerts.cfg(5) file).

       PAGE=targetstring Rule matching an alert by the name of the page in BB. "targetstring"  is
       the path of the page as defined in the bb-hosts file. E.g. if you have this setup:

              page servers All Servers
              subpage web Webservers
              10.0.0.1 www1.foo.com
              subpage db Database servers
              10.0.0.2 db1.foo.com

       Then the "All Servers" page is found with PAGE=servers, the "Webservers" page is
       PAGE=servers/web and the "Database servers" page is PAGE=servers/db.  Note  that  you  can
       also  use  regular  expressions  to specify the page name, e.g. PAGE=%.*/db would find the
       "Database servers" page regardless of where this page was placed in the hierarchy.

       The top-level page has the fixed name /, e.g. PAGE=/ would match all hosts on the Xymon
       frontpage.  If  you  need  it  in a regular expression, use PAGE=%^/ to avoid matching the
       forward-slash present in subpage-names.

       EXPAGE=targetstring Rule excluding a host if the pagename matches.

       HOST=targetstring Rule matching a host by the hostname.  "targetstring" is either a comma-
       separated  list  of  hostnames (from the bb-hosts file), "*" to indicate "all hosts", or a
       Perl-compatible regular expression.  E.g.  "HOST=dns.foo.com,www.foo.com"  identifies  two
       specific  hosts;  "HOST=%www.*.foo.com  EXHOST=www-test.foo.com"  matches all hosts with a
       name beginning with "www", except the "www-test" host.

       EXHOST=targetstring Rule excluding a host by matching the hostname.

       CLASS=classname Rule match by the client class-name. You specify the class-name for a host
       when  starting the client through the "--class=NAME" option to the runclient.sh script. If
       no class is specified, the host by default goes  into  a  class  named  by  the  operating
       system.

       EXCLASS=classname Exclude all hosts belonging to "classname" from this rule.

       TIME=timespecification Rule matching by the time-of-day. This is specified as the DOWNTIME
       time specification in the bb-hosts file.  E.g. "TIME=W:0800:2200" applied to a  rule  will
       make this rule active only on week-days between 8AM and 10PM.
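
       Example: A sketch with hypothetical host and class names, combining several filters
       on one line. Apply load limits to all hosts on the "Database servers" page except a
       test host, on week-days between 8AM and 10PM only; and go yellow for 30 minutes
       after a reboot on hosts in a class assumed to be named "linux".

              LOAD 8.0 12.0 PAGE=servers/db EXHOST=db-test.foo.com TIME=W:0800:2200
              UP 30m CLASS=linux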

DIRECTING ALERTS TO GROUPS

       For  some tests - e.g. "procs" or "msgs" - the right group of people to alert in case of a
       failure may be different, depending on which of  the  client  rules  actually  detected  a
       problem. E.g. if you have PROC rules for a host checking both "httpd" and "sshd"
       processes, then the Web admins should handle httpd-failures, whereas "sshd"  failures  are
       handled by the Unix admins.

       To  handle  this,  all  rules can have a "GROUP=groupname" setting.  When a rule with this
       setting triggers a yellow or red status, the groupname is passed on to  the  Xymon  alerts
       module,  so you can use it in the alert rule definitions in hobbit-alerts.cfg(5) to direct
       alerts to the correct group of people.
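
       Example: A sketch only. It assumes that GROUP is written at the end of the setting
       line in the same way as the host-selection rules; the group names are hypothetical
       and must correspond to whatever the alert rules in hobbit-alerts.cfg(5) refer to.

              PROC httpd 5 20 GROUP=webadmins
              PROC sshd 1 GROUP=unixadmins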

RULES: APPLYING SETTINGS TO SELECTED HOSTS

       Rules must be placed after the settings, e.g.

              LOAD 8.0 12.0  HOST=db.foo.com TIME=*:0800:1600

       If you have multiple settings that you want to apply the same rules to, you can write  the
       rules *only* on one line, followed by the settings. E.g.

              HOST=%db.*.foo.com TIME=W:0800:1600
                   LOAD 8.0 12.0
                   DISK /db  98 100
                   PROC mysqld 1

       will  apply  the three settings to all of the "db" hosts on week-days between 8AM and 4PM.
       This can be combined with per-setting rules, in which case the per-setting rule
       overrides the general rule; e.g.

              HOST=%.*.foo.com
                   LOAD 7.0 12.0 HOST=bax.foo.com
                   LOAD 3.0 8.0

       will  result in the load-limits being 7.0/12.0 for the "bax.foo.com" host, and 3.0/8.0 for
       all other foo.com hosts.

       The entire file is evaluated from the top to bottom, and the first match found is used. So
       you should put the specific settings first, and the generic ones last.
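
       Example: A sketch with hypothetical host names, ordered from specific to generic, and
       assuming the DEFAULT line is placed last as the most generic entry. The host
       "db1.foo.com" matches the first rule and gets the limits 10.0/15.0; the other "db"
       hosts get 8.0/12.0; all remaining hosts use the defaults. If the two HOST rules were
       swapped, "db1.foo.com" would match the regular expression first and get 8.0/12.0.

              HOST=db1.foo.com
                   LOAD 10.0 15.0
              HOST=%db.*.foo.com
                   LOAD 8.0 12.0
              DEFAULT
                   LOAD 5.0 10.0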

NOTES

       For  the  LOG, FILE and DIR checks, it is necessary also to configure the actual file- and
       directory-names in the client-local.cfg(5) file. If the filenames are  not  listed  there,
       the  clients  will not collect any data about these files/directories, and the settings in
       the hobbit-clients.cfg file will be silently ignored.

       The ability to compute file checksums with MD5, SHA1 or RMD160  should  not  be  used  for
       general-purpose  file  integrity  checking,  since  the overhead of calculating these on a
       large number of files can be significant. If you need this, look  at  tools  designed  for
       this purpose - e.g. Tripwire or AIDE.

       At the time of writing (April 2006), the SHA-1 and RMD160 algorithms are considered
       cryptographically safe. The MD5 algorithm has been shown to have some weaknesses,  and  is
       not considered strong enough when a high level of security is required.

SEE ALSO

       hobbitd_client(8), client-local.cfg(5), hobbitd(8), xymon(7)