Provided by: corosync-qdevice_3.0.0-4ubuntu1_amd64


       corosync-qdevice - QDevice daemon


       corosync-qdevice [-dfh] [-S option=value[,option2=value2,...]]


       corosync-qdevice  is  a daemon running on each node of a cluster. It provides a configured
       number of votes to the quorum subsystem based on a third-party arbitrator's decision.  Its
       primary use is to allow a cluster to sustain more node failures than standard quorum rules
       allow.  It is recommended for clusters with an even number of nodes and highly recommended
       for 2 node clusters.


       -d     Forcefully turn on debug information without the need to change corosync.conf.

       -f     Do not daemonize, run in the foreground.

       -h     Show short help text.

       -S     Set advanced settings, described in their own section below. This option should
              generally not be used because most of the settings are not safe to change.


       corosync-qdevice reads its configuration from the corosync.conf file.

       The main configuration is within the quorum.device sub-key. Each model also has its own
       configuration within a similarly named sub-key.

       model  Specifies  the  model  to be used. This parameter is required.  corosync-qdevice is
              modular and is able to support  multiple  different  models.  The  model  basically
              defines what type of arbitrator is used. Currently only net is supported.

              Specifies how often corosync-qdevice should call the votequorum_poll function. It
              is also used by the net model to adjust its heartbeat timeout. It is recommended
              that you don't change this value. Default is 10000.

              Specifies  how  often  corosync-qdevice  should  call  the votequorum_poll function
              during a sync phase. It is recommended that you don't change this  value.   Default
              is 30000.

       votes  The number of votes provided to the cluster by qdevice. Default is (number_of_nodes
              - 1) or generally sum(votes_per_node) - 1.
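
       The default vote count above can be worked through with a small sketch. The function
       names are hypothetical, and the simple-majority rule used for quorum (more than half of
       all votes) is an assumption made for illustration, not something this page specifies.

```python
def default_qdevice_votes(votes_per_node):
    """Documented default for the qdevice votes: sum(votes_per_node) - 1."""
    return sum(votes_per_node) - 1


def majority(total_votes):
    """Assumed quorum rule for this sketch: strictly more than half of all votes."""
    return total_votes // 2 + 1


node_votes = [1, 1, 1, 1]                      # four nodes, one vote each
qdevice_votes = default_qdevice_votes(node_votes)
total = sum(node_votes) + qdevice_votes

print("qdevice votes:", qdevice_votes)
print("votes needed for quorum:", majority(total))
# A single surviving node that also receives the qdevice votes:
print("one node + qdevice quorate:", 1 + qdevice_votes >= majority(total))
```

       Under these assumptions a single remaining node that is granted the qdevice votes still
       reaches the majority, which is the sense in which the cluster can sustain more node
       failures than standard quorum rules allow.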

       The quorum.device.heuristics subkey holds the configuration of the heuristics. Heuristics
       are a set of commands executed locally on startup, on cluster membership change, on
       successful connection to corosync-qnetd and, optionally, at regular intervals. Commands
       are executed in parallel. When all commands finish successfully (their exit code is zero)
       in time, the heuristics have passed; otherwise they have failed. The heuristics result is
       sent to corosync-qnetd, where it is used in calculations to determine which partition
       should be quorate.
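
       The pass/fail rule for heuristics can be sketched as follows. This is not the daemon's
       implementation: run_heuristics is a hypothetical helper, shlex.split stands in for
       qdevice's own command parser, and the timeout values are illustrative only.

```python
import shlex
import subprocess
import sys
import time


def run_heuristics(commands, timeout_s):
    """Start all commands in parallel; pass only if every one exits 0 in time."""
    procs = [
        subprocess.Popen(shlex.split(cmd),
                         stdout=subprocess.DEVNULL,
                         stderr=subprocess.DEVNULL)
        for cmd in commands
    ]
    deadline = time.monotonic() + timeout_s
    passed = True
    for proc in procs:
        remaining = max(0.0, deadline - time.monotonic())
        try:
            if proc.wait(timeout=remaining) != 0:
                passed = False          # non-zero exit code: heuristics fail
        except subprocess.TimeoutExpired:
            proc.kill()                 # a command that is late is killed
            proc.wait()
            passed = False
    return passed


# Stand-in commands built from the running interpreter, so the sketch is portable.
py = shlex.quote(sys.executable)
ok = f'{py} -c "raise SystemExit(0)"'
bad = f'{py} -c "raise SystemExit(1)"'
slow = f'{py} -c "import time; time.sleep(30)"'

print(run_heuristics([ok, ok], timeout_s=10.0))
print(run_heuristics([ok, bad], timeout_s=10.0))
print(run_heuristics([ok, slow], timeout_s=1.0))
```

       Only the first call is expected to report a pass: the second contains a command with a
       non-zero exit code and the third contains one that outlives the timeout.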

              Specifies the maximum time, in milliseconds, that corosync-qdevice waits for the
              heuristics commands to finish. If a command doesn't finish before the timeout, it
              is killed and the heuristics fail. This timeout is used for heuristics executed at
              regular intervals. Default value is half of quorum.device.timeout, so 5000.

              Similar  to  quorum.device.heuristics.timeout  but  used during membership changes.
              Default value is half of the quorum.device.sync_timeout, so 15000.

              Specifies the interval between two regular heuristics executions. Default value is
              3 * quorum.device.timeout, so 30000.

       mode   Can be one of on, sync or off and specifies the mode of operation of the
              heuristics. Default is off, which means heuristics are disabled. When sync is set,
              heuristics are executed only during startup, on membership change and when the
              connection to corosync-qnetd is established. To also run heuristics at regular
              intervals, set this option to on.

              defines executables. NAME can be an arbitrary valid cmap key name string and has
              no special meaning. The value of this variable must contain a command to execute.
              The value is parsed (split) into arguments similarly to how the Bourne shell would
              do it. Quoting is possible by using backslashes and double quotes.

       The quorum.device.net subkey holds the configuration for model net.

       tls    Can be one of on, off or required and specifies whether TLS should be used. on
              means a connection with TLS is attempted first, but if the server doesn't advertise
              TLS support then non-TLS will be used. off means TLS is not used and is not even
              attempted. This mode is the only one which doesn't need a properly initialized NSS
              database. required means TLS is required and, if the server doesn't support TLS,
              qdevice will exit with an error message. Default is on.

       host   Specifies  the  IP  address  or  host  name  of  the  qnetd server to be used. This
              parameter is required.

       port   Specifies the TCP port of the qnetd server. Default is 5403.

              Decision algorithm. Can be one of ffsplit or lms (there are also test and 2nodelms,
              both of which are mainly for developers and shouldn't be used for production
              clusters). For a description of what each algorithm means and how the algorithms
              differ, see their individual sections. Default value is ffsplit.

              can be one of lowest, highest or valid_node_id (a number). It is used as a
              fallback if qdevice has to decide between two or more equal partitions. lowest
              means the partition with the lowest node id is chosen. highest means the partition
              with the highest node id is chosen. valid_node_id means that the partition
              containing the node with the given node id is chosen. Default is lowest.

              Timeout  when corosync-qdevice is trying to connect to corosync-qnetd host. Default
              is 0.8 * quorum.sync_timeout.

              can be one of 0|4|6 and forces the  software  to  use  the  given  IP  version.   0
              (default value) means IPv6 is preferred and IPv4 should be used as a fallback.
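
       Two of the net options above, tls and tie_breaker, lend themselves to a short sketch.
       Both functions are hypothetical illustrations of the documented rules; the return values
       ("tls", "plain", "error") and the choice to return None when no partition contains the
       given node id are assumptions of this sketch.

```python
def tls_outcome(mode, server_advertises_tls):
    """Map the tls setting and the server's capability to a connection outcome."""
    if mode not in ("on", "off", "required"):
        raise ValueError(f"unknown tls mode: {mode!r}")
    if mode == "off":
        return "plain"                  # TLS is not even attempted
    if server_advertises_tls:
        return "tls"
    # Server has no TLS support: "on" falls back, "required" gives up.
    return "plain" if mode == "on" else "error"


def break_tie(partitions, tie_breaker="lowest"):
    """Pick one of several equal partitions (each a set of node ids)."""
    if tie_breaker == "lowest":
        return min(partitions, key=min)     # partition holding the lowest node id
    if tie_breaker == "highest":
        return max(partitions, key=max)     # partition holding the highest node id
    node_id = int(tie_breaker)              # otherwise: a specific node id
    for partition in partitions:
        if node_id in partition:
            return partition
    return None                             # assumption: no partition matches


halves = [{1, 2}, {3, 4}]
print(tls_outcome("on", server_advertises_tls=False))
print(tls_outcome("required", server_advertises_tls=False))
print(sorted(break_tie(halves, "lowest")))
print(sorted(break_tie(halves, "highest")))
print(sorted(break_tie(halves, 3)))
```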

       Logging configuration is within the logging directive. corosync-qdevice parses and
       supports most of the options, with the exception of to_logfile, logfile and
       logfile_priority. The logger_subsys sub-directive can also be used if subsys is set to
       QDEVICE.

       For corosync-qdevice to work correctly, the nodelist directive has to be used and properly
       configured. The net model also requires that the totem.cluster_name option is set.


       For model net to work using TLS, it's necessary to create the NSS database, import the
       Qnetd CA certificate, and get/distribute a valid client certificate.

       If pcs is used (recommended), the following steps are not needed because pcs does them.

       corosync-qdevice-net-certutil is the tool to perform the required actions
       semi-automatically. Please consult its help output and its man page. For a first-time
       configuration it may make sense to start with the -Q option.

       If TLS is not required, just edit the corosync.conf file and set the tls option to off.

       Depending on the configuration of NSS (stored in the nss.config file, usually in the
       /etc/crypto-policies/back-ends/ directory), disabled ciphers or too short keys may be
       rejected. The proper solution is to regenerate the NSS databases for both the
       corosync-qnetd and corosync-qdevice daemons. As a quick workaround, it's also possible to
       set the environment variable NSS_IGNORE_SYSTEM_POLICY=1 before running the
       corosync-qdevice daemon.

       When NSS is updated, it may also be necessary to upgrade the database to the new format.
       There is no consensus on the recommended way, but the following command seems to work
       (if the qdevice sysconfdir is set to /etc):

       # certutil -N -d /etc/corosync/qdevice/net/nssdb -f /etc/corosync/qdevice/net/nssdb/pwdfile.txt


       Algorithms change how corosync-qnetd provides votes to a given node/partition. Two
       algorithms are currently supported.

              This one makes sense only for clusters with an even number of nodes. It provides
              exactly one vote to the partition with the highest number of active nodes. If there
              are two equally sized partitions, it provides its vote to the partition with the
              higher score. The score is computed as (number_of_connected_nodes +
              number_of_connected_nodes_with_passed_heuristics -
              number_of_connected_nodes_with_failed_heuristics). If the scores are equal, the vote
              is provided to the partition with the most clients connected to the qnetd server. If
              this number is also equal, then the tie_breaker is used. It is able to transition
              its vote if the currently active partition becomes partitioned and a non-active
              partition still has at least 50% of the active nodes. Because of this, a vote is
              not provided if the qnetd connection is not active.

              To use this algorithm it's required to set the number of votes per node to 1
              (default), and the qdevice number of votes also has to be 1. This is achieved by
              setting the quorum.device.votes key in the corosync.conf file to 1.

       lms    Last-man-standing. If the node is the only one left in the cluster that can see the
              qnetd server then we return a vote.

              If more than one node can see the qnetd server but some nodes can't see each other,
              the cluster is divided into 'partitions' based on their ring_id, and this
              algorithm returns a vote to the partition with the highest heuristics score
              (computed the same way as for the ffsplit algorithm). If more than one partition
              has the same score, the vote goes to the largest active partition or, if more than
              one partition is equally large, to the partition that contains the tie_breaker node
              (lowest, highest, etc). For LMS to work, the number of qdevice votes has to be left
              at its default (so just delete the quorum.device.votes key from corosync.conf).
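
       The selection order of the two algorithms can be compared with a small sketch. It models
       only the ordering of the criteria described above, not the vote transition rules or the
       qnetd protocol. The Partition fields are hypothetical names, and the sketch assumes that
       at most one partition contains the tie_breaker node.

```python
from dataclasses import dataclass


@dataclass
class Partition:
    name: str
    active_nodes: int        # size of the partition
    connected: int           # nodes connected to the qnetd server
    passed: int              # connected nodes whose heuristics passed
    failed: int              # connected nodes whose heuristics failed
    has_tie_breaker: bool    # contains the node selected by tie_breaker


def score(p):
    """connected + passed heuristics - failed heuristics."""
    return p.connected + p.passed - p.failed


def ffsplit_choice(partitions):
    # Size first, then score, then connected clients, then the tie breaker.
    return max(partitions,
               key=lambda p: (p.active_nodes, score(p), p.connected, p.has_tie_breaker))


def lms_choice(partitions):
    # Score first, then size, then the tie breaker.
    return max(partitions,
               key=lambda p: (score(p), p.active_nodes, p.has_tie_breaker))


# A 2/2 split where only heuristics differ: both algorithms agree.
a = Partition("A", active_nodes=2, connected=2, passed=2, failed=0, has_tie_breaker=True)
b = Partition("B", active_nodes=2, connected=2, passed=1, failed=1, has_tie_breaker=False)
print(score(a), score(b), ffsplit_choice([a, b]).name, lms_choice([a, b]).name)

# A 3/1 split where the larger side fails its heuristics: the algorithms differ.
c = Partition("C", active_nodes=3, connected=3, passed=0, failed=3, has_tie_breaker=True)
d = Partition("D", active_nodes=1, connected=1, passed=1, failed=0, has_tie_breaker=False)
print(score(c), score(d), ffsplit_choice([c, d]).name, lms_choice([c, d]).name)
```

       In the second case ffsplit favours the larger partition while lms favours the one with the
       higher heuristics score, which is the main practical difference in how the two algorithms
       weigh heuristics.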


       Set by using the -S option. The default value is shown in parentheses. Options beginning
       with the net_ prefix are specific to model net.

              Lock file location. (/var/run/corosync-qdevice/

              Internal   IPC   socket   file    location.    (/var/run/corosync-qdevice/corosync-

              Parameter passed to listen syscall. (10)

              How  many  times  to  retry  the  call  to  a  corosync function which has returned
              CS_ERR_TRY_AGAIN. (10)

              Name used for qdevice registration. (Qdevice)

              Maximum allowed simultaneous IPC clients. (10)

              Maximum size of a message received by IPC client. (4096)

              Maximum size of a message allowed to be sent to an IPC client. (65536)

              Force enable/disable master wins. (default is model)

              Maximum number of heuristics worker send buffers. (128)

              Maximum size of a message allowed to be sent to, or received from, a heuristics
              worker. (4096)

              Minimum heuristics timeout accepted by client in ms. (1000)

              Maximum heuristics timeout accepted by client in ms. (120000)

              Minimum heuristics interval accepted by client in ms. (1000)

              Maximum heuristics interval accepted by client in ms. (3600000)

              Maximum number of exec_ commands. (32)

              Use execvp instead of execv for executing commands. (off)

              Maximum number of processes running at one time. (160)

              Interval, in ms, at which status is gathered and, if necessary, a signal is sent to
              processes which didn't finish on time. (5000)

              NSS database directory. (/etc/corosync/qdevice/net/nssdb)

              Initial (used during connection parameters negotiation) maximum size of the receive
              buffer for message (maximum allowed message size received from qnetd). (32768)

              Initial  (used  during  connection  parameter negotiation) maximum size of one send
              buffer (message) to be sent to server. (32768)

              Minimum required size of one send buffer (message) to be sent to server. (32768)

              Maximum allowed size of receive buffer for a message sent by server. (16777216)

              Maximum number of send buffers. (10)

              Canonical name of qnetd server certificate. (Qnetd Server)

              NSS nickname of qdevice client certificate. (Cluster Cert)

              Minimum heartbeat timeout accepted by client in ms. (1000)

              Maximum heartbeat timeout accepted by client in ms. (120000)

              Minimum connection timeout accepted by client in ms. (1000)

              Maximum connection timeout accepted by client in ms. (120000)

              Enable test algorithm. (if built with --enable-debug on, otherwise off)


       Define a qdevice with the net model connecting to qnetd running on host, using the
       ffsplit algorithm. Heuristics are set to sync mode and execute two commands.

       quorum {
         provider: corosync_votequorum
         device {
           votes: 1
           model: net
           net {
             tls: on
             algorithm: ffsplit
           }
           heuristics {
             mode: sync
             exec_ping: /bin/ping -q -c 1 ""
             exec_test_txt_exists: /usr/bin/test -f /tmp/test.txt
           }
         }
       }


       corosync-qdevice-tool(8)         corosync-qdevice-net-certutil(8)        corosync-qnetd(8)


       Jan Friesse

                                            2018-08-09                        COROSYNC-QDEVICE(8)