Provided by: ganeti-2.16_2.16.1-2ubuntu1_all

NAME

       ganeti-watcher - Ganeti cluster watcher

SYNOPSIS

       ganeti-watcher [--debug] [--job-age=*age*] [--ignore-pause] [--rapi-ip=*IP*]
       [--no-verify-disks]

DESCRIPTION

       ganeti-watcher is a script, run periodically, that is responsible for keeping the
       instances in the correct status. It has two separate sets of functions: one that runs
       only on the master node and one that runs on every node.

       If the watcher is disabled at cluster level (via the gnt-cluster watcher pause command),
       it will exit without doing anything. The cluster-level pause can be overridden via the
       --ignore-pause option, for example when the watcher is disabled during maintenance but
       the administrator wants to run it just once.
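
       The pause-and-override workflow described above can be sketched as a small shell
       function. This is a dry run by default: the commands are only printed. It assumes that
       gnt-cluster watcher pause accepts a duration such as 2h and that a continue subcommand
       lifts the pause, as documented in gnt-cluster(8); check your version before relying on
       either.

```shell
#!/bin/sh
# Sketch of a maintenance window. RUN defaults to "echo", so the commands
# are only printed; set RUN= (empty) to execute them on a real master node.
# Assumption: "gnt-cluster watcher pause" takes a duration such as 2h and
# "gnt-cluster watcher continue" lifts the pause (see gnt-cluster(8)).
RUN=${RUN-echo}

maintenance_window() {
    $RUN gnt-cluster watcher pause 2h      # disable scheduled watcher runs
    # ... maintenance work happens here ...
    $RUN ganeti-watcher --ignore-pause     # one manual run despite the pause
    $RUN gnt-cluster watcher continue      # re-enable scheduled runs
}

maintenance_window
```

       Printing first and executing only after review keeps the sketch safe to try on a
       machine that has no Ganeti installation.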

       The --debug option will increase the verbosity of the watcher and also activate logging to
       the standard error.

       The --rapi-ip option needs to be set if the RAPI daemon was started with a particular IP
       (using the -b option). The address given to the watcher must be exactly the same as the
       one given to the RAPI daemon, so that the watcher can reach the RAPI interface.
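
       As an illustration of combining these options, the following sketch assembles a watcher
       command line from two arguments. The helper name build_watcher_cmd and its argument
       convention are hypothetical; only options listed in the synopsis are used, and
       192.0.2.10 is a documentation address standing in for whatever was passed to the RAPI
       daemon with -b.

```shell
#!/bin/sh
# Hypothetical helper: builds a ganeti-watcher command line.
#   $1  IP address the RAPI daemon was bound to with -b (empty if none)
#   $2  "yes" to add --debug (optional)
build_watcher_cmd() {
    rapi_ip=${1:-}
    debug=${2:-no}
    cmd="ganeti-watcher"
    if [ -n "$rapi_ip" ]; then
        # Must match the RAPI daemon's bind address exactly.
        cmd="$cmd --rapi-ip=$rapi_ip"
    fi
    if [ "$debug" = "yes" ]; then
        cmd="$cmd --debug"
    fi
    printf '%s\n' "$cmd"
}

# Prints: ganeti-watcher --rapi-ip=192.0.2.10 --debug
build_watcher_cmd 192.0.2.10 yes
```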

   Master operations
       On the master node, the watcher's primary function is to keep running all instances
       which are marked as up in the configuration file, by trying to start them a limited
       number of times.
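
       The idea of a limited number of start attempts can be illustrated with a per-instance
       counter. The sketch below is conceptual only: it is not the watcher's code, and the
       limit MAX_TRIES, the function should_restart and the one-file-per-instance layout are
       all invented for the example.

```shell
#!/bin/sh
# Conceptual sketch only: NOT the watcher's actual code or state-file format.
# MAX_TRIES and the one-file-per-instance counter layout are made up here.
MAX_TRIES=3

# should_restart STATE_DIR INSTANCE
# Increments the instance's counter and succeeds while the count is within
# MAX_TRIES; fails once the limit has been exceeded.
should_restart() {
    counter_file="$1/$2.tries"
    tries=0
    if [ -f "$counter_file" ]; then
        tries=$(cat "$counter_file")
    fi
    tries=$((tries + 1))
    printf '%s\n' "$tries" > "$counter_file"
    [ "$tries" -le "$MAX_TRIES" ]
}

# Simulate four consecutive watcher runs for one instance that stays down.
state_dir=$(mktemp -d)
for run in 1 2 3 4; do
    if should_restart "$state_dir" web1; then
        echo "run $run: would try to start web1"
    else
        echo "run $run: giving up on web1"
    fi
done
rm -rf "$state_dir"
```

       In this sketch, deleting the counter file makes the next run start counting from zero
       again.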

       Another  function is to "repair" DRBD links by reactivating the block devices of instances
       which have secondaries on nodes that have been rebooted.

       Additionally, it will verify and repair degraded DRBD disks; this step is skipped if the
       --no-verify-disks option is given.

       The watcher will also archive old jobs (older than the age given via the --job-age option,
       which defaults to 6 hours), in order to keep the job queue manageable.

   Node operations
       The watcher will restart any down daemons that are appropriate for the current node.

       In addition, it will execute any scripts which exist under the "watcher" directory in  the
       Ganeti  hooks  directory (@SYSCONFDIR@/ganeti/hooks).  This should be used for lightweight
       actions, like starting any extra daemons.
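
       A lightweight hook of the kind described above might look like the following sketch.
       It assumes hooks are ordinary executable scripts; the function ensure_daemon, the
       daemon name my-extra-daemon and the start command are placeholders and not part of
       Ganeti.

```shell
#!/bin/sh
# Hypothetical hook for the "watcher" hooks directory: starts an extra
# daemon if no process with that name is found. The daemon name and start
# command below are placeholders, not part of Ganeti.
ensure_daemon() {
    name=$1
    shift
    if pgrep -x "$name" >/dev/null 2>&1; then
        echo "$name already running"
    else
        echo "starting $name"
        "$@"    # remaining arguments are the start command
    fi
}

# Placeholder invocation; "true" stands in for the real start command.
ensure_daemon my-extra-daemon true
```

       The check is deliberately cheap, since the hook is executed on every watcher run.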

       If the cluster parameter maintain_node_health is enabled, then the watcher will also
       shut down instances and DRBD devices if the node is declared as offline by known master
       candidates.

       The watcher does synchronous queries but will submit jobs for executing the changes.   Due
       to locking, it could be that the jobs execute much later than the watcher submits them.

FILES

       The command has a set of state files (one per group) located at
       @LOCALSTATEDIR@/lib/ganeti/watcher.GROUP-UUID.data (only used on the master) and a log
       file at @LOCALSTATEDIR@/log/ganeti/watcher.log. Removing any of these files will not
       affect correct operation; removing a state file just resets the restart counters for
       the instances to zero and marks nodes as freshly rebooted (so, for example, DRBD minors
       will be re-activated).

       In some cases it is even desirable to reset the watcher state, for example after
       maintenance actions or when you want to simulate the reboot of all nodes. In these
       cases you can remove all state files:

              rm -f @LOCALSTATEDIR@/lib/ganeti/watcher.*.data
              rm -f @LOCALSTATEDIR@/lib/ganeti/watcher.*.instance-status
              rm -f @LOCALSTATEDIR@/lib/ganeti/instance-status

       And then re-run the watcher.
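
       The same reset can be wrapped in a function that reports what it removed. This is a
       sketch: the name reset_watcher_state is hypothetical, and the state directory is passed
       as an argument because @LOCALSTATEDIR@ is a build-time placeholder whose value depends
       on the installation.

```shell
#!/bin/sh
# Sketch: remove the watcher state files from a given directory and report
# each file that was removed. Other files in the directory are left alone.
reset_watcher_state() {
    state_dir=$1
    for f in "$state_dir"/watcher.*.data \
             "$state_dir"/watcher.*.instance-status \
             "$state_dir"/instance-status; do
        # Unmatched globs stay literal, so check the file really exists.
        if [ -e "$f" ]; then
            rm -f "$f" && echo "removed $f"
        fi
    done
}

# Usage (on the master; substitute the real state directory):
#   reset_watcher_state "/path/to/lib/ganeti"
```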

REPORTING BUGS

       Report bugs to the project website (http://code.google.com/p/ganeti/) or contact the
       developers using the Ganeti mailing list (ganeti@googlegroups.com).

SEE ALSO

       Ganeti  overview  and specifications: ganeti(7) (general overview), ganeti-os-interface(7)
       (guest OS definitions), ganeti-extstorage-interface(7) (external storage providers).

       Ganeti  commands:  gnt-cluster(8)   (cluster-wide   commands),   gnt-job(8)   (job-related
       commands),  gnt-node(8)  (node-related  commands),  gnt-instance(8)  (instance  commands),
       gnt-os(8) (guest OS commands), gnt-storage(8) (storage commands), gnt-group(8) (node group
       commands), gnt-backup(8) (instance import/export commands), gnt-debug(8) (debug commands).

       Ganeti  daemons:  ganeti-watcher(8) (automatic instance restarter), ganeti-cleaner(8) (job
       queue cleaner), ganeti-noded(8) (node daemon), ganeti-rapi(8) (remote API daemon).

       Ganeti htools: htools(1) (generic binary), hbal(1) (cluster balancer), hspace(1) (capacity
       calculation),  hail(1) (IAllocator plugin), hscan(1) (data gatherer from remote clusters),
       hinfo(1) (cluster information printer), mon-collector(7) (data collectors interface).

COPYRIGHT

       Copyright (C) 2006-2015 Google Inc.  All rights reserved.

       Redistribution and use in source and binary  forms,  with  or  without  modification,  are
       permitted provided that the following conditions are met:

       1.   Redistributions  of  source code must retain the above copyright notice, this list of
       conditions and the following disclaimer.

       2.  Redistributions in binary form must reproduce the above copyright notice, this list of
       conditions  and  the  following  disclaimer  in  the  documentation and/or other materials
       provided with the distribution.

       THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT  HOLDERS  AND  CONTRIBUTORS  "AS  IS"  AND  ANY
       EXPRESS  OR  IMPLIED  WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
       MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.  IN  NO  EVENT  SHALL
       THE  COPYRIGHT  HOLDER  OR  CONTRIBUTORS  BE  LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
       SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED  TO,  PROCUREMENT
       OF  SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
       HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT  LIABILITY,  OR
       TORT  (INCLUDING  NEGLIGENCE  OR  OTHERWISE)  ARISING  IN  ANY  WAY OUT OF THE USE OF THIS
       SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.