Provided by: corosync_3.0.1-2ubuntu1_amd64


       corosync.conf - corosync executive configuration file




       The corosync.conf file instructs the corosync executive about various parameters needed
       to control the corosync executive.  Empty lines and lines starting with the # character
       are ignored.  The configuration file consists of bracketed top level directives.  The
       possible directive choices are:

       totem { }
              This top level directive contains configuration options for the totem protocol.

       logging { }
              This top level directive contains configuration options for logging.

       quorum { }
              This top level directive contains configuration options for quorum.

       nodelist { }
              This top level directive contains configuration options for nodes in the cluster.

       system { }
              This top level directive contains configuration options related to the system.

       resources { }
              This top level directive contains configuration options for resources.

       The interface sub-directive of totem is optional for UDP and knet transports.

       For knet, multiple interface subsections define parameters  for  each  knet  link  on  the

       For  UDPU  an  interface  section is not needed and it is recommended that the nodelist is
       used to define cluster nodes.

              This specifies the link number for the interface.  When using  the  knet  protocol,
              each  interface  should  specify  separate link numbers to uniquely identify to the
              membership protocol which interface to use for which  link.   The  linknumber  must
              start at 0. For UDP the only supported linknumber is 0.

              This  specifies the priority for the link when knet is used in 'passive' mode. (see
              link_mode below)

              This specifies the interval between knet link pings.  knet_ping_interval and
              knet_ping_timeout are a pair; if one is specified, the other should be too.
              Otherwise one will be calculated from the token timeout and one will be taken from
              the config file.  (default is token timeout / (knet_pong_count*2))

              If no ping is received within this time, the knet link is declared dead.
              knet_ping_interval and knet_ping_timeout are a pair; if one is specified, the other
              should be too.  Otherwise one will be calculated from the token timeout and one
              will be taken from the config file.  (default is token timeout / knet_pong_count)

              How many values of latency are used to calculate the average link latency. (default
              2048 samples)

              How many valid ping/pongs before a link is marked UP. (default 5)
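
              The two default formulas quoted above can be checked with a short Python sketch.
              The function name and the sample values (a 1000 ms token timeout and a pong count
              of 5) are illustrative assumptions, and whether corosync rounds the result is not
              stated here, so the sketch returns plain quotients.

```python
def knet_ping_defaults(token_ms, knet_pong_count):
    """Return (knet_ping_interval, knet_ping_timeout) in milliseconds.

    Hypothetical helper: applies the two default formulas from the text.
    """
    if knet_pong_count <= 0:
        raise ValueError("knet_pong_count must be positive")
    interval = token_ms / (knet_pong_count * 2)  # token timeout / (knet_pong_count*2)
    timeout = token_ms / knet_pong_count         # token timeout / knet_pong_count
    return interval, timeout


interval_ms, timeout_ms = knet_ping_defaults(token_ms=1000, knet_pong_count=5)
print(interval_ms, timeout_ms)  # 100.0 200.0
```

              With these defaults the ping timeout is always twice the ping interval.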

              Which  IP  transport  knet  should use. valid values are "sctp" or "udp". (default:

       bindnetaddr (udp only)
              This specifies the network address the corosync executive should bind to when using

              bindnetaddr  (udp  only)  should  be  an  IP address configured on the system, or a
              network address.

              For example, if the local interface is with netmask, you
              should  set  bindnetaddr to or  If the local interface is
     with netmask,  set  bindnetaddr  to  or
    , and so forth.
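
              The relationship between an interface address, its netmask and the network address
              can be sketched with Python's ipaddress module.  The addresses below come from the
              192.0.2.0/24 documentation range and are purely hypothetical, as is the helper
              name; the sketch only shows how a network address follows from address and netmask.

```python
import ipaddress


def bindnetaddr_candidates(address, netmask):
    """Return (interface address, network address) for an IPv4 interface.

    Hypothetical helper: either value is a plausible bindnetaddr per the text.
    """
    iface = ipaddress.IPv4Interface(f"{address}/{netmask}")
    return str(iface.ip), str(iface.network.network_address)


print(bindnetaddr_candidates("192.0.2.92", "255.255.255.0"))
# ('192.0.2.92', '192.0.2.0')
print(bindnetaddr_candidates("192.0.2.92", "255.255.255.192"))
# ('192.0.2.92', '192.0.2.64')
```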

              This may also be an IPv6 address, in which case IPv6 networking will be used.  In
              this case, the exact address must be specified and there is no automatic selection
              of the network interface within a specific subnet as with IPv4.

              If IPv6 networking is used, the nodeid field in nodelist must be specified.

       broadcast (udp only)
              This is optional and can be set to yes.  If it is set to yes, the broadcast address
              will be used for communication.  If this option is set,  mcastaddr  should  not  be

       mcastaddr (udp only)
              This  is the multicast address used by corosync executive.  The default should work
              for most networks,  but  the  network  administrator  should  be  queried  about  a
              multicast  address  to  use.   Avoid 224.x.x.x because this is a "config" multicast

              This may also be an IPv6 multicast address, in which case IPv6 networking will be
              used.  If IPv6 networking is used, the nodeid field in nodelist must be specified.

              It's  not  necessary  to  use  this  option if cluster_name option is used. If both
              options are used, mcastaddr has higher priority.

       mcastport (udp only)
              This specifies the UDP port number.  It is  possible  to  use  the  same  multicast
              address on a network with the corosync services configured for different UDP ports.
              Please note that corosync uses two UDP ports: mcastport (for mcast receives) and
              mcastport - 1 (for mcast sends).  If you have multiple clusters on the same network
              using the same mcastaddr, please configure the mcastports with a gap.
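
              Why a gap is needed can be seen from a small Python sketch.  The helper names and
              port numbers are illustrative assumptions; the only rule taken from the text is
              that each cluster uses mcastport and mcastport - 1.

```python
def ports_used(mcastport):
    """Ports a cluster occupies: mcastport (receives) and mcastport - 1 (sends)."""
    return {mcastport, mcastport - 1}


def clusters_conflict(mcastport_a, mcastport_b):
    """True if two clusters sharing one mcastaddr would share a UDP port."""
    return bool(ports_used(mcastport_a) & ports_used(mcastport_b))


print(clusters_conflict(5405, 5406))  # True: both would use port 5405
print(clusters_conflict(5405, 5407))  # False: {5404, 5405} and {5406, 5407}
```

              Adjacent mcastport values collide, so values must differ by at least 2.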

       ttl (udp only)
              This specifies the Time To Live (TTL). If you run your cluster on a routed  network
              then  the  default of "1" will be too small. This option provides a way to increase
              this up to 255. The valid range is 0..255.

       Within the totem directive, there are seven configuration options, of which one is
       required, five are optional, and one is required when IPv6 is configured in the interface
       subdirective.  The required directive controls the version of the totem configuration.
       The option that is optional unless IPv6 is used controls identification of the processor.
       The optional options control secrecy and authentication, the network mode of operation and
       the maximum network MTU field.

              This  specifies  the  version  of the configuration file.  Currently the only valid
              version for this directive is 2.

       clear_node_high_bit
              This configuration option is optional and is only relevant when no nodeid is
              specified.  Some corosync clients require a signed 32 bit nodeid that is greater
              than zero; however, by default corosync uses all 32 bits of the IPv4 address space
              when generating a nodeid.  Set this option to yes to force the high bit to be zero
              and therefore ensure the nodeid is a positive signed 32 bit integer.

              WARNING: Cluster behavior is undefined if this option is enabled on only a subset
              of the cluster (for example during a rolling upgrade).
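
              The effect of clearing the high bit can be illustrated in Python.  This is a
              sketch only: the helper names and the sample 32 bit value are hypothetical, and how
              corosync derives the raw value from the IPv4 address is not shown.

```python
def clear_high_bit(nodeid):
    """Force bit 31 to zero, keeping the lower 31 bits unchanged."""
    return nodeid & 0x7FFFFFFF


def as_signed_32(value):
    """Interpret an unsigned 32 bit value as a signed 32 bit integer."""
    return value - (1 << 32) if value & 0x80000000 else value


raw = 0xC0000201  # hypothetical 32 bit nodeid with the high bit set
print(as_signed_32(raw))                  # -1073741311
print(as_signed_32(clear_high_bit(raw)))  # 1073742337
```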

              This  specifies which cryptographic library should be used by knet. Options are nss
              and openssl.

              The default is nss.

              This specifies which  HMAC  authentication  should  be  used  to  authenticate  all
              messages.  Valid values are none (no authentication), md5, sha1, sha256, sha384 and
              sha512. Encrypted transmission is only supported for the knet transport.

              The default is none.

              This specifies which cipher should be used to encrypt all messages.  Valid values
              are none (no encryption), aes256, aes192, aes128 and 3des.  Enabling crypto_cipher
              also requires enabling crypto_hash.  Encrypted transmission is only supported for
              the knet transport.

              The default is none.

              This  specifies the fully qualified path to the shared key used to authenticate and
              encrypt data used within the Totem protocol.

              The default is /etc/corosync/authkey.

       key    Shared key stored in configuration instead of authkey file. This option  has  lower
              precedence  than  keyfile  option  so it's used only when keyfile is not specified.
              Using this option is not recommended for security reasons.

              This specifies the Kronosnet mode, which may be passive, active, or rr
              (round-robin).

              passive: the active link with the lowest priority will be used.  If two or more
              links share the same priority, the one with the lowest link ID will be used.

              active: all active links will be used simultaneously to send traffic.  Link
              priority is ignored.

              rr: round-robin policy.  Each packet will be sent to the next active link in order.

              If only one interface directive is specified, passive is automatically chosen.

              The maximum number of interface directives that is allowed with Kronosnet is 8. For
              other transports it is 1.
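
              The passive-mode selection rule, as worded in this page, can be sketched in Python.
              The Link type and function name are hypothetical and the sketch follows the text
              literally (lowest priority value wins, ties go to the lowest link ID); it is not
              taken from the knet sources.

```python
from typing import NamedTuple, Optional, Sequence


class Link(NamedTuple):
    link_id: int
    priority: int
    active: bool


def choose_passive_link(links: Sequence[Link]) -> Optional[Link]:
    """Pick the active link with the lowest priority, then the lowest link ID."""
    candidates = [link for link in links if link.active]
    if not candidates:
        return None
    return min(candidates, key=lambda link: (link.priority, link.link_id))


links = [
    Link(link_id=0, priority=1, active=True),
    Link(link_id=1, priority=1, active=True),
    Link(link_id=2, priority=5, active=True),
    Link(link_id=3, priority=0, active=False),  # lowest priority, but down
]
print(choose_passive_link(links).link_id)  # 0
```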

       netmtu This specifies the network maximum transmission unit.  Setting this value beyond
              1500, the regular frame MTU, requires ethernet devices that support large (also
              called jumbo) frames.  If any device in the network doesn't support large frames,
              the protocol will not operate properly.  The hosts must also have their MTU size
              changed from 1500 to whatever frame size is specified here.

              Please note that while some NICs or switches claim large frame support, they
              support 9000 MTU as the maximum frame size including the IP header.  Setting the
              netmtu and host MTUs to 9000 will cause totem to use the full 9000 bytes of the
              frame.  Linux will then add an 18-byte header, moving the full frame size to 9018.
              As a result some hardware will not operate properly with this size of data.  A
              netmtu of 8982 seems to work for the few large frame devices that have been tested.
              Some manufacturers claim large frame support when in fact they support frame sizes
              of 4500 bytes.

              When sending multicast traffic, if the network frequently reconfigures, chances are
              that some device in the network doesn't support large frames.

              Choose hardware carefully if intending to use large frame support.

              The default is 1500.
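
              The frame size arithmetic described above can be written out in Python.  The
              constant and function names are illustrative; the 18 byte header and the 9000 byte
              limit are the figures quoted in the text.

```python
LINUX_FRAME_HEADER_BYTES = 18  # header size quoted in the text


def full_frame_size(netmtu):
    """Size on the wire once the 18 byte header is added."""
    return netmtu + LINUX_FRAME_HEADER_BYTES


def largest_netmtu_for(frame_limit):
    """Largest netmtu whose full frame still fits the given hardware limit."""
    return frame_limit - LINUX_FRAME_HEADER_BYTES


print(full_frame_size(9000))     # 9018
print(largest_netmtu_for(9000))  # 8982
```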

              This directive controls the transport mechanism used.  The default  is  knet.   The
              transport type can also be set to udpu or udp.  Only knet allows crypto or multiple
              interfaces per node.

              This specifies the name of the cluster and is used to automatically generate the
              multicast address.

              This specifies the version of the config file, converted to an unsigned 64-bit
              integer.  The default is 0.  The option is used to prevent nodes with an
              out-of-date configuration from joining.  If the value is not 0 and the node is
              moving for the first time (only the first time; a join after a split does not
              follow this rule) from a single-node membership to a multiple-node membership, the
              config_versions of the other nodes are collected.  If the current node's
              config_version is not equal to the highest of the collected versions, corosync is

              This specifies the version of IP to ask the DNS resolver for.  The value can be one
              of ipv4 (look only for an IPv4 address), ipv6 (check only IPv6 address), ipv4-6
              (look for all address families and use the first IPv4 address found in the list if
              there is such an address, otherwise use the first IPv6 address) and ipv6-4 (look
              for all address families and use the first IPv6 address found in the list if there
              is such an address, otherwise use the first IPv4 address).

              Default (if unspecified) is ipv6-4 for knet and udpu transports and ipv4 for udp.
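
              The four selection rules can be sketched in Python over a list of already resolved
              addresses.  The function name and sample addresses (documentation ranges) are
              hypothetical; the sketch models only the choice described above, not the resolver
              call itself.

```python
import socket


def pick_address(addresses, ip_version):
    """Choose one address from (family, address) pairs in resolver order."""
    v4 = [addr for family, addr in addresses if family == socket.AF_INET]
    v6 = [addr for family, addr in addresses if family == socket.AF_INET6]
    preference = {
        "ipv4": [v4],
        "ipv6": [v6],
        "ipv4-6": [v4, v6],  # first IPv4 if any, otherwise first IPv6
        "ipv6-4": [v6, v4],  # first IPv6 if any, otherwise first IPv4
    }[ip_version]
    for group in preference:
        if group:
            return group[0]
    return None


resolved = [(socket.AF_INET6, "2001:db8::1"), (socket.AF_INET, "192.0.2.1")]
print(pick_address(resolved, "ipv4-6"))  # 192.0.2.1
print(pick_address(resolved, "ipv6-4"))  # 2001:db8::1
```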

              The knet transport supports IPv4 and IPv6 addresses concurrently, provided they are
              consistent on each link.

              Within  the totem directive, there are several configuration options which are used
              to control the operation of the protocol.   It  is  generally  not  recommended  to
              change  any  of  these values without proper guidance and sufficient testing.  Some
              networks may require larger values if  suffering  from  frequent  reconfigurations.
              Some  applications may require faster failure detection times which can be achieved
              by reducing the token timeout.

       token  This timeout is used directly or as a base for the real token timeout calculation
              (explained in the token_coefficient section).  The token timeout specifies the
              time in milliseconds until a token loss is declared after not receiving a token.
              This is the time spent detecting a failure of a processor in the current
              configuration.  Reforming a new configuration takes about 50 milliseconds in
              addition to this timeout.

              The real token timeout used by totem can be read from the cmap key
              runtime.config.totem.token.

              The default is 1000 milliseconds.

              Specifies  the interval between warnings that the token has not been received.  The
              value is a percentage of the token timeout and can be set to 0 to disable warnings.

              The default is 75%.

              This value is used only when the nodelist section is specified and contains at
              least 3 nodes.  If so, the real token timeout is computed as token +
              (number_of_nodes - 2) * token_coefficient.  This allows the cluster to scale
              without manually changing the token timeout every time a new node is added.  This
              value can be set to 0, which effectively disables this feature.

              The default is 650 milliseconds.
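
              The formula can be worked through in Python.  The function name is hypothetical;
              the defaults of 1000 ms for token and 650 ms for token_coefficient are the ones
              stated in this page.

```python
def real_token_timeout(token_ms=1000, token_coefficient_ms=650, number_of_nodes=0):
    """Real token timeout in milliseconds for a nodelist of the given size."""
    if number_of_nodes >= 3:
        return token_ms + (number_of_nodes - 2) * token_coefficient_ms
    return token_ms  # fewer than 3 nodes: token is used directly


for nodes in (2, 3, 5):
    print(nodes, real_token_timeout(number_of_nodes=nodes))
# 2 1000
# 3 1650
# 5 2950
```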

              This timeout specifies in milliseconds how long to wait for a token before the
              token is retransmitted.  This will be automatically calculated if token is
              modified.  It is not recommended to alter this value without guidance from the
              corosync community.

              The default is 238 milliseconds.

              The (optional) type of compression used by Kronosnet. The values available depend
              on the build and also on the available libraries. Typically zlib and lz4 will be
              available but bzip2 and others could also be allowed. The default is 'none'.

              Tells  knet  to NOT compress any packets that are smaller than the value indicated.
              Default 100 bytes.

              Set to 0 to reset to the default.  Set to 1 to compress everything.

              Many compression libraries allow tuning of compression parameters. For example 0 or
              1  ...  9  are  commonly  used to determine the level of compression. This value is
              passed unmodified to the compression library so it is recommended  to  consult  the
              library's documentation for more detailed information.

       hold   This  timeout  specifies  in  milliseconds how long the token should be held by the
              representative when the protocol is under low utilization.   It is not  recommended
              to alter this value without guidance from the corosync community.

              The default is 180 milliseconds.

              This value identifies how many token retransmits should be attempted before forming
              a  new  configuration.   If  this  value  is  set,  retransmit  and  hold  will  be
              automatically calculated from retransmits_before_loss and token.

              The default is 4 retransmissions.

       join   This  timeout  specifies  in milliseconds how long to wait for join messages in the
              membership protocol.

              The default is 50 milliseconds.

              This timeout specifies in milliseconds an upper range between 0 and send_join to
              wait before sending a join message.  For configurations with fewer than 32 nodes,
              this parameter is not necessary.  For larger rings, this parameter is necessary to
              ensure the NIC is not overflowed with join messages on formation of a new ring.  A
              reasonable value for large rings (128 nodes) would be 80 milliseconds.  Other timer
              values must also change if this value is changed.  Seek advice from the corosync
              mailing list if trying to run larger configurations.

              The default is 0 milliseconds.

              This timeout specifies in milliseconds  how  long  to  wait  for  consensus  to  be
              achieved  before  starting  a  new  round of membership configuration.  The minimum
              value for consensus must  be  1.2  *  token.   This  value  will  be  automatically
              calculated at 1.2 * token if the user doesn't specify a consensus value.

              For two node clusters, a consensus larger than the join timeout but less than token
              is safe.  For three node or larger clusters, consensus should be larger than token.
              There  is  an  increasing  risk  of  odd  membership changes, which still guarantee
              virtual synchrony,  as node count grows if consensus is less than token.

              The default is 1200 milliseconds.

       merge  This timeout specifies in milliseconds how long  to  wait  before  checking  for  a
              partition  when  no multicast traffic is being sent.  If multicast traffic is being
              sent, the merge detection happens automatically as a function of the protocol.

              The default is 200 milliseconds.

              This timeout specifies in milliseconds how long to  wait  before  checking  that  a
              network interface is back up after it has been downed.

              The default is 1000 milliseconds.

              This constant specifies how many rotations of the token may occur without
              receiving any messages, when messages should be received, before a new
              configuration is formed.

              The default is 2500 failures to receive a message.

              This  constant  specifies  how  many  rotations  of the token without any multicast
              traffic should occur before the hold timer is started.

              The default is 30 rotations.

              [HeartBeating mechanism] Configures the optional HeartBeating mechanism for  faster
              failure  detection.  Keep  in  mind  that engaging this mechanism in lossy networks
              could cause faulty loss declaration as the mechanism  relies  on  the  network  for

              As a rule of thumb, use this mechanism if you require improved failure detection
              in networks with low to medium utilization.

              This constant specifies the number of heartbeat failures the system should tolerate
              before declaring heartbeat failure, e.g. 3.  If this value is not set or is 0, the
              heartbeat mechanism is not engaged in the system and token rotation is the method
              of failure detection.

              The default is 0 (disabled).

              [HeartBeating mechanism] This constant specifies in milliseconds the approximate
              delay that your network takes to transport one packet from one machine to another.
              This value is to be set by system engineers; do not change it if unsure, as it
              affects the failure detection mechanism using heartbeat.

              The default is 50 milliseconds.

              This constant specifies the maximum number of messages that may be sent on one
              token rotation.  If all processors perform equally well, this value could be large
              (300), which would introduce higher latency from origination to delivery for very
              large rings.  To reduce latency in large rings (16+), the defaults are a safe
              compromise.  If one or more slow processors are present among fast processors,
              window_size should be no larger than 256000 / netmtu to avoid overflow of the
              kernel receive buffers.  The user is notified of this by the display of a
              retransmit list in the notification logs.  There is no loss of data, but
              performance is reduced when these errors occur.

              The default is 50 messages.

              This constant specifies the maximum number of messages that  may  be  sent  by  one
              processor on receipt of the token.  The max_messages parameter is limited to 256000
              / netmtu to prevent overflow of the kernel transmit buffers.

              The default is 17 messages.
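
              The 256000 / netmtu bound that applies to both window_size and max_messages can be
              computed with a short Python sketch.  The names are illustrative, and rounding down
              to a whole number of messages is an assumption, since the text gives only the
              quotient.

```python
KERNEL_BUFFER_BYTES = 256000  # figure quoted in the text


def message_limit(netmtu=1500):
    """Upper bound on window_size / max_messages for a given netmtu."""
    if netmtu <= 0:
        raise ValueError("netmtu must be positive")
    return KERNEL_BUFFER_BYTES // netmtu  # assumption: round down


print(message_limit(1500))  # 170
print(message_limit(8982))  # 28
```

              Under this reading the defaults of 50 and 17 messages are well below the bound at
              the default netmtu of 1500, while a large netmtu lowers the bound considerably.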

              This constant defines the maximum number of times on receipt of a token  a  message
              is  checked  for  retransmission before a retransmission occurs.  This parameter is
              useful to modify for switches that delay  multicast  packets  compared  to  unicast
              packets.  The default setting works well for nearly all modern switches.

              The default is 5 messages.

              How  often  the knet PMTUd runs to look for network MTU changes.  Value in seconds,
              default: 30

       Within the logging directive, there  are  several  configuration  options  which  are  all

       The following options are valid only for the top level logging directive:

              This specifies that a timestamp is placed on all log messages. It can be one of off
              (no timestamp), on (second precision timestamp)  or  hires  (millisecond  precision
              timestamp - only when supported by LibQB).

              The default is hires (or on if hires is not supported).

              This specifies that file and line should be printed.

              The default is off.

              This specifies that the code function name should be printed.

              The default is off.

              This specifies that blackbox functionality should be enabled.

              The default is on.

       The  following  options  are  valid  both  for top level logging directive and they can be
       overridden in logger_subsys entries.



              These specify the destination of logging output. Any combination of  these  options
              may be specified. Valid options are yes and no.

              The default is syslog and stderr.

              Please note, if you are using to_logfile and want to rotate the file, use
              logrotate(8) with the option copytruncate, e.g.
              /var/log/corosync.log {
                   rotate 7
                   copytruncate
              }

              If the to_logfile directive is set to yes, this option specifies the pathname of
              the log file.

              No default.

              This specifies the logfile priority for this particular subsystem. Ignored if debug
              is on.  Possible values are: alert, crit, debug (same as debug = on),  emerg,  err,
              info, notice, warning.

              The default is: info.

              This specifies the syslog facility type that will be used for any messages sent to
              syslog.  Options are daemon, local0, local1, local2, local3, local4, local5, local6
              and local7.

              The default is daemon.

              This  specifies the syslog level for this particular subsystem. Ignored if debug is
              on.  Possible values are: alert, crit, debug (same as  debug  =  on),  emerg,  err,
              info, notice, warning.

              The default is: info.

       debug  This specifies whether debug output is logged for this particular logger.  It can
              also be set to trace, which is the highest level of debug information.

              The default is off.

       Within the logging directive, logger_subsys directives are optional.

       Within the logger_subsys sub-directive, all of the above logging configuration options are
       valid  and  can  be  used  to  override the default settings.  The subsys entry, described
       below, is mandatory to identify the subsystem.

       subsys This specifies the subsystem identity (name) for which logging is  specified.  This
              is the name used by a service in the log_init() call. E.g. 'CPG'. This directive is

       Within the quorum directive it is possible to specify the quorum algorithm to use with the

              directive. At the time of  writing  only  corosync_votequorum  is  supported.   See
              votequorum(5) for configuration options.

       Within the nodelist directive it is possible to specify information about the nodes in
       the cluster.  The directive can contain only node sub-directives, which specify every node
       that should be a member of the membership, and where non-default options are needed.
       Every node must have at least the ring0_addr field set.

       Every node that should be a member of the membership must be specified.

       Possible options are:

              This specifies IP or network hostname address of the particular node.  X is a  link

       nodeid This configuration option is required for each node for Kronosnet mode.  It is a 32
              bit value specifying the  node  identifier  delivered  to  the  cluster  membership
              service.  The  node identifier value of zero is reserved and should not be used. If
              knet is set, this field must be set.

       name   This option is used mainly with knet transport to identify local node.   It's  also
              used  by  client  software  (pacemaker).   Algorithm  for identifying local node is

              1.     Looks up $HOSTNAME in the nodelist

              2.     If this fails strip the domain name from $HOSTNAME and looks up that in  the

              3.     If  this  fails  look in the nodelist for a fully-qualified name whose short
                     version matches the short version of $HOSTNAME

              4.     If all this fails then search  the  interfaces  list  for  an  address  that
                     matches a name in the nodelist
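
              The first three lookup steps can be sketched in Python.  This is a simplified
              model under stated assumptions: the function name is hypothetical, the nodelist is
              reduced to a list of name values, and step 4 (matching interface addresses) is
              not modeled.

```python
from typing import Optional, Sequence


def find_local_node(hostname: str, nodelist_names: Sequence[str]) -> Optional[str]:
    """Return the nodelist name matching this host, or None if steps 1-3 fail."""
    # Step 1: look up the hostname as given.
    if hostname in nodelist_names:
        return hostname
    # Step 2: strip the domain name and look up the short name.
    short = hostname.split(".", 1)[0]
    if short in nodelist_names:
        return short
    # Step 3: look for a fully-qualified name whose short version matches.
    for name in nodelist_names:
        if "." in name and name.split(".", 1)[0] == short:
            return name
    return None  # step 4 (interface address search) is outside this sketch


print(find_local_node("node1.example.com", ["node1", "node2"]))       # node1
print(find_local_node("node2", ["node1.example.com", "node2.example.com"]))
# node2.example.com
```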

       Within the system directive it is possible to specify system options.

       Possible options are:

              This specifies the type of IPC to use.  Can be one of native (default), shm and
              socket.  Native means one of shm or socket, depending on what is supported by the
              OS.  On systems with support for both, SHM is selected.  SHM is generally faster,
              but needs to allocate a ring buffer file in /dev/shm.

              Should be set to yes (default) if corosync should try to set round robin realtime
              scheduling with maximal priority for itself.  If setting the scheduler fails,
              corosync falls back to setting maximal priority.

              Set the priority of the corosync process.  Valid only when sched_rr is set to no.
              Can be either a numeric value with a meaning similar to nice(1), or max / min
              meaning maximal / minimal priority (so minimal / maximal nice value).

              Should be set to yes (default) if corosync  should  try  to  move  itself  to  root
              cgroup.  This  feature  is  available  only  for systems with cgroups with RT sched
              enabled (Linux with CONFIG_RT_GROUP_SCHED kernel option).

              Existing directory where corosync should  chdir  into.  Corosync  stores  important
              state files and blackboxes there.

              The default is /var/lib/corosync.

       Within the resources directive it is possible to specify options for resources.

       Possible option is:

              (Valid only if Corosync was compiled with watchdog support.)
              Watchdog  device  to  use, for example /dev/watchdog.  If unset, empty or "off", no
              watchdog is used.

              In a cluster  with  properly  configured  power  fencing  a  watchdog  provides  no
              additional  value.  On the other hand, slow watchdog communication may incur multi-
              second delays in the Corosync main  loop,  potentially  breaking  down  membership.
              IPMI   watchdogs   are   particularly   notorious   in   this  regard:  read  about
              kipmid_max_busy_us in IPMI.txt in the Linux kernel documentation.


       For example to add a node with address with nodeid 3. The node has  the  name
       NEW   (in  DNS  or  /etc/hosts)  and  is  not  currently  running  corosync.  The  current
       corosync.conf nodelist looks like this:

              nodelist {
                  node {
                      nodeid: 1
                      name: node1
                  }
                  node {
                      nodeid: 2
                      name: node2
                  }
              }


       Add a new entry for the node below the existing nodes. Node entries don't have  to  be  in
       nodeid order, but it will help keep you sane. So the nodelist now looks like this:

              nodelist {
                  node {
                      nodeid: 1
                      name: node1
                  }
                  node {
                      nodeid: 2
                      name: node2
                  }

                  node {
                      nodeid: 3
                      name: NEW
                  }
              }


       This  file must then be copied onto all three nodes -  the existing two nodes, and the new
       one.  On one of the existing corosync nodes, tell corosync to re-read the  updated  config
       file into memory:

              corosync-cfgtool -R

       This command only needs to be run on one node in the cluster.  You may then start corosync
       on the NEW node and it should join the cluster.  If this doesn't work as expected, check
       that communication between all three nodes is working, and check the syslog files on all
       nodes for more information.  It's important to note that the key bit of information about
       a node failing to join might be on a different node than you expect.


       This  is  the  reverse procedure to 'Adding a node' above. First you need to shut down the
       node you will be removing from the cluster.

              corosync-cfgtool -H

       Then delete the nodelist stanza from corosync.conf and  finally  update  corosync  on  the
       remaining nodes by running

              corosync-cfgtool -R

       on one of them.


       corosync resolves ringX_addr names/IP addresses using the getaddrinfo(3) call, respecting
       the totem.ip_version setting.

       The getaddrinfo() function uses a sophisticated algorithm to sort node addresses into a
       preferred order and corosync always chooses the first address in that list of the required
       family.  As such it  is  essential  that  your  DNS  or  /etc/hosts  files  are  correctly
       configured  so  that  all addresses for ringX appear on the same network (or are reachable
       with minimal hops) and over the same IP protocol. If this is not the case then some  nodes
       might not be able to join the cluster. It is possible to override the search order used by
       getaddrinfo() using the configuration file /etc/gai.conf(5) if necessary, but this is  not

       If  there  is  any  doubt about the order of addresses returned from getaddrinfo() then it
       might be simpler to use IP addresses (v4 or v6) in the ringX_addr field.


              The corosync executive configuration file.


       corosync_overview(7), votequorum(5), corosync-qdevice(8), logrotate(8), getaddrinfo(3)