Provided by: gridengine-master_8.1.9+dfsg-11.1_amd64

NAME

       sge_shadowd - Grid Engine shadow master daemon

SYNOPSIS

       sge_shadowd

DESCRIPTION

       sge_shadowd is a lightweight process which can be run on so-called shadow master hosts
       in a Grid Engine cluster to detect failure of the current Grid Engine master daemon,
       sge_qmaster(8), and to start up a new sge_qmaster(8) on the host on which the
       sge_shadowd runs. If multiple shadow daemons are active in a cluster, they run a
       protocol which ensures that only one of them will start up a new master daemon.

       The  hosts  suitable as shadow master hosts must have shared root read/write access to the
       directory $SGE_ROOT/$SGE_CELL/common, as well as to the master daemon spool directory  (by
       default  $SGE_ROOT/$SGE_CELL/spool/qmaster).  The names of the shadow master hosts need to
       be contained in the file $SGE_ROOT/$SGE_CELL/common/shadow_masters.
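
       As an illustration of the last requirement, the following sketch checks whether a
       host is listed in the shadow_masters file. The helper name is_shadow_host is
       hypothetical and not part of Grid Engine, and the sketch assumes that the file
       holds one host name per line.

```shell
#!/bin/sh
# Hypothetical helper, not part of Grid Engine.
# Succeeds if HOST appears as a complete line in the shadow_masters FILE.
# Assumption: the file lists one host name per line.
is_shadow_host() {
    file=$1
    host=$2
    # -F: fixed string, -x: whole-line match, -q: quiet (exit status only)
    [ -r "$file" ] && grep -Fxq -- "$host" "$file"
}

# Example call, using the path layout described above:
# is_shadow_host "$SGE_ROOT/${SGE_CELL:-default}/common/shadow_masters" "$(hostname)"
```

       The match is exact, so a short host name will not match a fully qualified entry
       (or the reverse). Which form the file should use depends on the host name
       resolution settings of the cluster.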

RESTRICTIONS

       sge_shadowd may only be started by root.

ENVIRONMENT VARIABLES

       SGE_ROOT       Specifies the location of the Grid Engine standard configuration files.

       SGE_CELL       If set, specifies the default Grid Engine cell. To address  a  Grid  Engine
                      cell sge_shadowd uses (in order of precedence):

                             The name of the cell specified in the environment variable SGE_CELL,
                             if it is set.

                             The name of the default cell, i.e. default.

       SGE_DEBUG_LEVEL
                      If set, debug information is written to stderr. The value also
                      defines the level of detail of the debug information that is
                      generated.

       SGE_QMASTER_PORT
                      If set, specifies the TCP port  on  which  sge_qmaster(8)  is  expected  to
                      listen  for communication requests.  Most installations will use a services
                      map entry for the service "sge_qmaster" instead to define that port.

       SGE_DELAY_TIME This variable controls the time for which sge_shadowd pauses if a  takeover
                      bid  fails.  This  value  is  used only when there are multiple sge_shadowd
                      instances and they are contending to be the master.   The  default  is  600
                      seconds.

       SGE_CHECK_INTERVAL
                      This  variable  controls  the  interval  between  sge_shadowd checks of the
                      heartbeat file (60 seconds by default).

       SGE_GET_ACTIVE_INTERVAL
                      This variable controls the interval between attempts by an sge_shadowd
                      instance to take over when the heartbeat file has not changed.  The default
                      is 240 seconds.
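
       The cell fallback and the timing defaults described above can be summarised in a
       small sketch that prints the values sge_shadowd would be expected to use in the
       current environment. The function name show_shadowd_settings is hypothetical, and
       treating an empty variable like an unset one is an assumption of this sketch, not
       documented behaviour.

```shell
#!/bin/sh
# Hypothetical helper, not part of Grid Engine.
# Prints the cell name and timing values with the documented defaults
# applied to anything that is not set in the environment.
# Assumption: an empty variable is treated like an unset one.
show_shadowd_settings() {
    printf 'SGE_CELL=%s\n' "${SGE_CELL:-default}"
    printf 'SGE_CHECK_INTERVAL=%s\n' "${SGE_CHECK_INTERVAL:-60}"
    printf 'SGE_GET_ACTIVE_INTERVAL=%s\n' "${SGE_GET_ACTIVE_INTERVAL:-240}"
    printf 'SGE_DELAY_TIME=%s\n' "${SGE_DELAY_TIME:-600}"
}
```

       Calling show_shadowd_settings in the shell that will start the daemon gives a quick
       view of which overrides are in effect. The variables must be exported for the
       daemon itself to see them.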

FILES

       <sge_root>/<cell>/common
                       Default configuration directory.
       <sge_root>/<cell>/common/shadow_masters
                       Shadow master hostname file.
       <sge_root>/<cell>/spool/qmaster
                       Default master daemon spool directory.
       <sge_root>/<cell>/spool/qmaster/heartbeat
                       The heartbeat file.
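
       A minimal pre-flight sketch for a prospective shadow master host, based on the
       default paths listed above. The function name check_sge_paths is hypothetical. The
       sketch assumes the default spool location, so a cluster configured with a
       different qmaster spool directory would need the path adjusted.

```shell
#!/bin/sh
# Hypothetical helper, not part of Grid Engine.
# Usage: check_sge_paths SGE_ROOT [CELL]
# Returns 0 if the default common and spool directories are writable
# and the shadow_masters file is readable, 1 otherwise.
check_sge_paths() {
    root=$1
    cell=${2:-default}
    status=0
    for d in "$root/$cell/common" "$root/$cell/spool/qmaster"; do
        if [ ! -d "$d" ] || [ ! -w "$d" ]; then
            echo "not a writable directory: $d" >&2
            status=1
        fi
    done
    if [ ! -r "$root/$cell/common/shadow_masters" ]; then
        echo "missing or unreadable: $root/$cell/common/shadow_masters" >&2
        status=1
    fi
    return "$status"
}
```

       Run as root on the candidate host, for example
       check_sge_paths "$SGE_ROOT" "$SGE_CELL". The heartbeat file is not checked here,
       since it may not exist before a master daemon has run.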

SEE ALSO

       sge_intro(1), sge_conf(5), sge_qmaster(8)

       See sge_intro(1) for a full statement of rights and permissions.