Provided by: corosync-qdevice_2.4.3-0ubuntu1.3_amd64

NAME

       corosync-qdevice - QDevice daemon

SYNOPSIS

       corosync-qdevice [-dfh] [-S option=value[,option2=value2,...]]

DESCRIPTION

       corosync-qdevice  is  a daemon running on each node of a cluster. It provides a configured
       number of votes to the quorum subsystem based on a third-party arbitrator's decision.  Its
       primary use is to allow a cluster to sustain more node failures than standard quorum rules
       allow.  It is recommended for clusters with an even number of nodes and highly recommended
       for 2 node clusters.

OPTIONS

       -d     Forcefully turn on debug information without the need to change corosync.conf.

       -f     Do not daemonize, run in the foreground.

       -h     Show short help text.

       -S     Set advanced settings, described in their own section below. This option should
              generally not be used because most of the settings are not safe to change.

CONFIGURATION

       corosync-qdevice reads its configuration from the corosync.conf file.

       The main configuration is within the quorum.device sub-key. Each model also has its own
       configuration within a similarly named sub-key.

       model  Specifies  the  model  to be used. This parameter is required.  corosync-qdevice is
              modular and is able to support  multiple  different  models.  The  model  basically
              defines what type of arbitrator is used. Currently only net is supported.

       timeout
              Specifies how often, in milliseconds, corosync-qdevice should call the
              votequorum_poll function. It is also used by the net model to adjust its heartbeat
              timeout. It is recommended that you don't change this value. Default is 10000.

       sync_timeout
              Specifies  how  often  corosync-qdevice  should  call  the votequorum_poll function
              during a sync phase. It is recommended that you don't change this  value.   Default
              is 30000.

       votes  The number of votes provided to the cluster by qdevice. Default is (number_of_nodes
              - 1) or generally sum(votes_per_node) - 1.

       The quorum.device.heuristics subkey holds the configuration of the heuristics. Heuristics
       are a set of commands executed locally on startup, on cluster membership change, on
       successful connection to corosync-qnetd and, optionally, at regular intervals. When all
       commands finish successfully (their exit code is zero) and on time, the heuristics have
       passed; otherwise they have failed. The heuristics result is sent to corosync-qnetd, where
       it is used in calculations to determine which partition should be quorate.

       timeout
              Specifies the maximum time, in milliseconds, that corosync-qdevice waits for the
              heuristics commands to finish. If a command doesn't finish before the timeout, it
              is killed and the heuristics fail. This timeout is used for heuristics executed at
              regular times. Default value is half of quorum.device.timeout, so 5000.

       sync_timeout
              Similar to quorum.device.heuristics.timeout but  used  during  membership  changes.
              Default value is half of the quorum.device.sync_timeout, so 15000.

       interval
              Specifies the interval between two regular heuristics executions. Default value is
              3 * quorum.device.timeout, so 30000.

       mode   Can be one of on, sync or off and specifies the mode of operation of heuristics.
              Default is off, which means heuristics are disabled. When sync is set, heuristics
              are executed only during startup, on membership change and when the connection to
              corosync-qnetd is established. If heuristics should also run on a regular basis,
              this option should be set to on.

       exec_NAME
              Defines an executable. NAME can be an arbitrary valid cmap key name string and has
              no special meaning. The value of this variable must contain a command to execute.
              The value is parsed (split) into arguments in a similar way to the Bourne shell.
              Quoting is possible by using backslash and double quotes.
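
       As an illustration of the heuristics keys above, the following sketch enables regular
       heuristics and writes the documented default timings out explicitly. The exec_ names, the
       address 192.0.2.1 and the file path are made up for this example; only the key names come
       from this page.

       quorum {
         device {
           model: net
           heuristics {
             # Run on startup, membership change, qnetd connect and at regular times
             mode: on
             # Documented defaults in milliseconds, written out for clarity
             timeout: 5000
             sync_timeout: 15000
             interval: 30000
             # The part after exec_ is arbitrary; double quotes keep an argument together
             exec_ping_gw: /bin/ping -q -c 1 "192.0.2.1"
             exec_check_file: /usr/bin/test -f "/tmp/my file.txt"
           }
         }
       }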

       quorum.device.net subkey holds the configuration for model 'net'.

       tls    Can be one of on, off or required and specifies if TLS should be used. on means a
              connection with TLS is attempted first, but if the server doesn't advertise TLS
              support then non-TLS will be used. off is used when TLS is not required; it is
              then not even tried. This mode is the only one which doesn't need a properly
              initialized NSS database. required means TLS is required and if the server doesn't
              support TLS, qdevice will exit with an error message. Default is on.

       host   Specifies the IP address or host  name  of  the  qnetd  server  to  be  used.  This
              parameter is required.

       port   Specifies TCP port of qnetd server. Default is 5403.

       algorithm
              Decision algorithm. Can be one of ffsplit or lms. (There are also test and
              2nodelms, both of which are mainly for developers and shouldn't be used for
              production clusters.) For a description of what each algorithm means and how the
              algorithms differ, see their individual sections. Default value is ffsplit.

       tie_breaker
              Can be one of lowest, highest or valid_node_id (a number). It is used as a
              fallback if qdevice has to decide between two or more equal partitions. lowest
              means the partition with the lowest node id is chosen. highest means the partition
              with the highest node id is chosen. valid_node_id means that the partition
              containing the node with the given node id is chosen. Default is lowest.

       connect_timeout
              Timeout when corosync-qdevice is trying to connect to corosync-qnetd host.  Default
              is 0.8 * quorum.sync_timeout.

       force_ip_version
              Can be one of 0, 4 or 6 and forces the software to use the given IP version. 0
              (default value) means IPv6 is preferred and IPv4 should be used as a fallback.
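
       A possible net sub-key using the options above might look like the following sketch. The
       host name is a placeholder, the port is the documented default, and the remaining values
       are arbitrary picks for illustration rather than recommendations.

       quorum {
         device {
           model: net
           votes: 1
           net {
             # Placeholder host name of the qnetd server
             host: qnetd.example.org
             # Documented default port, written out explicitly
             port: 5403
             # Exit with an error if the server does not support TLS
             tls: required
             algorithm: ffsplit
             # On a tie, prefer the partition with the highest node id
             tie_breaker: highest
             # Use IPv4 only
             force_ip_version: 4
           }
         }
       }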

       Logging configuration is within the logging directive. corosync-qdevice parses and
       supports most of the options, with the exception of to_logfile, logfile and
       logfile_priority. The logger_subsys sub-directive can also be used if subsys is set to
       QDEVICE.
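
       A minimal sketch of such a logging section is shown below. It assumes that to_syslog and
       debug, which are described in corosync.conf(5), are among the options corosync-qdevice
       supports; this page does not list the supported options individually.

       logging {
         to_syslog: yes
         # Keep debug output off in general
         debug: off
         logger_subsys {
           # Settings in this block apply to corosync-qdevice
           subsys: QDEVICE
           debug: on
         }
       }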

       For corosync-qdevice to work correctly, the nodelist directive has to be used and properly
       configured. The net model also requires that the totem.cluster_name option is set.
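
       The two requirements above might be met by a fragment like the following. The cluster
       name, node ids and addresses are placeholders, and the ring0_addr and nodeid keys are
       taken from corosync.conf(5) rather than from this page, so check that page for the exact
       syntax of your corosync version.

       totem {
         version: 2
         # Placeholder cluster name; required by the net model
         cluster_name: example-cluster
       }

       nodelist {
         node {
           ring0_addr: 192.0.2.11
           nodeid: 1
         }
         node {
           ring0_addr: 192.0.2.12
           nodeid: 2
         }
       }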

MODEL NET TLS CONFIGURATION

       For model net to work using TLS, it's necessary to create the NSS database, import the
       Qnetd CA certificate, and get/distribute a valid client certificate.

       If pcs is used (recommended) the following steps are not  needed  because  pcs  does  them
       automatically.

       corosync-qdevice-net-certutil  is the tool to perform required actions semi-automatically.
       Please consult the help output of it and its man page. For a first time  configuration  it
       may make sense to start with the -Q option.

       If TLS is not required, just edit the corosync.conf file and set quorum.device.net.tls to
       off.
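
       In corosync.conf terms, the key quorum.device.net.tls corresponds to the nesting sketched
       below. The host name is a placeholder.

       quorum {
         device {
           model: net
           net {
             # Do not attempt TLS; no NSS database is needed in this mode
             tls: off
             host: qnetd.example.org
           }
         }
       }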

MODEL NET ALGORITHMS

       Algorithms determine how corosync-qnetd provides votes to a given node/partition.
       Currently there are two algorithms supported.

       ffsplit
              This one makes sense only for clusters with an even number of nodes. It provides
              exactly one vote to the partition with the highest number of active nodes. If there
              are two partitions of exactly the same size, it provides its vote to the partition
              with the higher score. The score is computed as

                     number_of_connected_nodes
                     + number_of_connected_nodes_with_passed_heuristics
                     - number_of_connected_nodes_with_failed_heuristics

              If the scores are equal, the vote is provided to the partition with the most
              clients connected to the qnetd server. If this number is also equal, then the
              tie_breaker is used. It is able to transition its vote if the currently active
              partition becomes partitioned and a non-active partition still has at least 50% of
              the active nodes. Because of this, a vote is not provided if the qnetd connection
              is not active.

              To use this algorithm, the number of votes per node must be set to 1 (default) and
              the qdevice number of votes must also be 1. This is achieved by setting the
              quorum.device.votes key in the corosync.conf file to 1.

       lms    Last-man-standing. If the node is the only one left in the cluster that can see the
              qnetd server, it is given a vote.

              If more than one node can see the qnetd server but some nodes can't see each other,
              the cluster is divided up into 'partitions' based on their ring_id. This algorithm
              then returns a vote to the partition with the highest heuristics score (computed
              the same way as for the ffsplit algorithm). If more than one partition has the same
              score, the vote goes to the largest active partition. If the partitions are also
              equal in size, it goes to the partition that contains the tie_breaker node (lowest,
              highest, etc). For LMS to work, the number of qdevice votes has to be left at its
              default (so just delete the quorum.device.votes key from corosync.conf).
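
       A device section for lms could therefore look like the sketch below. The host name is a
       placeholder, and the votes key is deliberately absent so that the default applies.

       quorum {
         provider: corosync_votequorum
         device {
           # No votes key here: lms needs the default number of qdevice votes
           model: net
           net {
             host: qnetd.example.org
             algorithm: lms
             # Used only when score and size cannot separate the partitions
             tie_breaker: lowest
           }
         }
       }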

ADVANCED SETTINGS

       Set by using the -S option. (The default value is shown in parentheses.) Options beginning
       with the net_ prefix are specific to model net.

       lock_file
              Lock file location. (/var/run/corosync-qdevice/corosync-qdevice.pid)

       local_socket_file
              Internal    IPC    socket   file   location.   (/var/run/corosync-qdevice/corosync-
              qdevice.sock)

       local_socket_backlog
              Parameter passed to listen syscall. (10)

       max_cs_try_again
              How many times to retry  the  call  to  a  corosync  function  which  has  returned
              CS_ERR_TRY_AGAIN. (10)

       votequorum_device_name
              Name used for qdevice registration. (Qdevice)

       ipc_max_clients
              Maximum allowed simultaneous IPC clients. (10)

       ipc_max_receive_size
              Maximum size of a message received by IPC client. (4096)

       ipc_max_send_size
              Maximum size of a message allowed to be sent to an IPC client. (65536)

       master_wins
              Force enable/disable master wins. (default is model)

       heuristics_ipc_max_send_buffers
              Maximum number of heuristics worker send buffers. (128)

       heuristics_ipc_max_send_receive_size
              Maximum size of a message allowed to be sent to, or received from, a heuristics
              worker. (4096)

       heuristics_min_timeout
              Minimum heuristics timeout accepted by client in ms. (1000)

       heuristics_max_timeout
              Maximum heuristics timeout accepted by client in ms. (120000)

       heuristics_min_interval
              Minimum heuristics interval accepted by client in ms. (1000)

       heuristics_max_interval
              Maximum heuristics interval accepted by client in ms. (3600000)

       heuristics_max_execs
              Maximum number of exec_ commands. (32)

       heuristics_use_execvp
              Use execvp instead of execv for executing commands. (off)

       heuristics_max_processes
              Maximum number of processes running at one time. (160)

       heuristics_kill_list_interval
              Interval in ms at which process status is gathered and, eventually, a signal is
              sent to processes which didn't finish on time. (5000)

       net_nss_db_dir
              NSS database directory. (/etc/corosync/qdevice/net/nssdb)

       net_initial_msg_receive_size
              Initial (used during connection parameters negotiation) maximum size of the receive
              buffer for message (maximum allowed message size received from qnetd). (32768)

       net_initial_msg_send_size
              Initial  (used  during  connection  parameter negotiation) maximum size of one send
              buffer (message) to be sent to server. (32768)

       net_min_msg_send_size
              Minimum required size of one send buffer (message) to be sent to server. (32768)

       net_max_msg_receive_size
              Maximum allowed size of receive buffer for a message sent by server. (16777216)

       net_max_send_buffers
              Maximum number of send buffers. (10)

       net_nss_qnetd_cn
              Common name (CN) of qnetd server certificate. (Qnetd Server)

       net_nss_client_cert_nickname
              NSS nickname of qdevice client certificate. (Cluster Cert)

       net_heartbeat_interval_min
              Minimum heartbeat timeout accepted by client in ms. (1000)

       net_heartbeat_interval_max
              Maximum heartbeat timeout accepted by client in ms. (120000)

       net_min_connect_timeout
              Minimum connection timeout accepted by client in ms. (1000)

       net_max_connect_timeout
              Maximum connection timeout accepted by client in ms. (120000)

       net_test_algorithm_enabled
              Enable test algorithm. (if built with --enable-debug on, otherwise off)

EXAMPLE

       Define qdevice with the net model connecting to qnetd running on the qnetd.example.org
       host, using the ffsplit algorithm. Heuristics are set to sync mode and execute two
       commands.

       quorum {
         provider: corosync_votequorum
         device {
           votes: 1
           model: net
           net {
             tls: on
             host: qnetd.example.org
             algorithm: ffsplit
           }
           heuristics {
             mode: sync
             exec_ping: /bin/ping -q -c 1 "www.example.org"
             exec_test_txt_exists: /usr/bin/test -f /tmp/test.txt
           }
         }
       }

SEE ALSO

       corosync-qdevice-tool(8)         corosync-qdevice-net-certutil(8)        corosync-qnetd(8)
       corosync.conf(5)

AUTHOR

       Jan Friesse

                                            2017-10-17                        COROSYNC-QDEVICE(8)