Provided by: cman_3.0.12-2ubuntu5_i386 bug


       cman_tool - Cluster Management Tool


       cman_tool  join  |  leave  | kill | expected | votes | version | wait |
       status | nodes | services | debug [options]


       cman_tool is a program that manages the  cluster  management  subsystem
       CMAN.  cman_tool  can  be used to join the node to a cluster, leave the
       cluster, kill another cluster node or  change  the  value  of  expected
       votes of a cluster.
       Be  careful that you understand the consequences of the commands issued
       via cman_tool as they can affect all nodes in your cluster. Most of the
       time  the cman_tool will only be invoked from your startup and shutdown


       join   This is the main use of  cman_tool.  It  instructs  the  cluster
              manager  to  attempt  to  join  an  existing  cluster  or (if no
              existing cluster exists) then to form a new one on its own.
              If no options are given to this command then it  will  take  the
              cluster configuration information from cluster.conf. However, it
              is possible to provide all the information on  the  command-line
              or to override cluster.conf values by using the command line.

       leave  Tells CMAN to leave the cluster. You cannot do this if there are
              subsystems (eg DLM, GFS) active. You  should  dismount  all  GFS
              filesystems,  shutdown  CLVM, fenced and anything else using the
              cluster  manager  before  using  cman_tool   leave.    Look   at
              'cman_tool  status'  and  group_tool to see how many (and which)
              subsystems are active.
              When a node leaves the cluster, the remaining nodes  recalculate
              quorum  and  this  may  block  cluster  activity if the required
              number of votes is not present.  If this node is to be down  for
              an  extended  period  of  time  and you need to keep the cluster
              running, add the remove option, and  the  remaining  nodes  will
              recalculate quorum such that activity can continue.

       kill   Tells  CMAN to kill another node in the cluster. This will cause
              the local node to send a "KILL" message to that node and it will
              shut down.  Recovery will occur for the killed node as if it had
              failed.  This is a sort of remote version of  "leave  force"  so
              only use if if you really know what you are doing.

              Tells  CMAN  a  new  value of expected votes and instructs it to
              recalculate quorum based on this value.
              Use this option if your cluster has lost  quorum  due  to  nodes
              failing and you need to get it running again in a hurry.

              Used  alone  this will report the major, minor, patch and config
              versions used by CMAN (also displayed in 'cman_tool status'). It
              can also be used with -r to tell cluster members to update.
              The  argument  to -r is the version number that cman should look
              for. If that version is not currently available then  cman  will
              poll  for it. If a version of 0 is specified then cman will read
              the configuration file, validate it, distribute  it  around  the
              cluster (if necessary) and activate it.
              The  -D  flag  can  disable  the  validation  stage. This is NOT

       wait   Waits until the node  is  a  member  of  the  cluster  and  then

       status Displays the local view of the cluster status.

       nodes  Displays the local view of the cluster nodes.

              Displays  the  local  view of subsystems using cman (deprecated,
              group_tool should be used instead).

       debug  Sets the debug level of the running cman  daemon.  Debug  output
              will  be sent to syslog level LOG_DEBUG. the -d switch specifies
              the new logging  level.  This  is  the  same  bitmask  used  for
              cman_tool join -d


       -w     Normally,  "cman_tool  leave"  will  fail  if  the cluster is in
              transition (ie another node is joining or leaving the  cluster).
              By  adding  the -w flag, cman_tool will wait and retry the leave
              operation repeatedly until it succeeds or a more  serious  error

       -t <seconds>
              If  -w  is also specified then -t dictates the maximum amount of
              time cman_tool is prepared to wait. If the operation  times  out
              then a status of 2 is returned.

       force  Shuts  down the cluster manager without first telling any of the
              subsystems to close down. Use this option with extreme  care  as
              it could easily cause data loss.

       remove Tells  the  rest  of the cluster to recalculate quorum such that
              activity can continue without this node.


       -e <expected-votes>
              The new value of expected votes to use.  This  will  usually  be
              enough  to  bring  the  cluster  back to life. Values that would
              cause incorrect quorum will be rejected.


       -n <nodename>
              The node name of the node to  be  killed.  This  should  be  the
              unqualified node name as it appears in 'cman_tool nodes'.


       -r <config_version>
              Update  config version. You don't need to use this when adding a
              new node, the new cman node will tell the rest of the cluster to
              read the latest version of the config file automatically.
              In  fact  the argument to -r might look as though it is ignored.
              Its presence simply tells cman to re-read the configuration file
              and look for that version in the file. cman will keep re-reading
              the file until a version number >= the passed version is found.
              cman_tool version on  its  own  will  always  show  the  current
              version  and  not the one being looked for. So be aware that the
              display will possible not update immediately after you have  run
              cman_tool version -r.

              see "JOIN" options


       -q     Waits  until  the  cluster  is  quorate  before  returning.   -t
              <seconds> Dictates the  maximum  amount  of  time  cman_tool  is
              prepared to wait.  If the operation times out then a status of 2
              is returned.


       -c <clustername>
              Provides a text name for  the  cluster.  You  can  have  several
              clusters  on  one  LAN  and they are distinguished by this name.
              Note that the name is hashed to provide a unique number which is
              what  actually distinguishes the cluster, so it is possible that
              two different names can clash. If this happens,  the  node  will
              not  be  allowed  into the existing cluster and you will have to
              pick another name or  use  different  port  number  for  cluster

       -p <port>
              UDP port number used for cluster communication. This defaults to

       -v <votes>
              Number of votes this node has in the cluster. Defaults to 1.

       -e <expected votes>
              Number of expected votes for the  whole  cluster.  If  different
              nodes  provide  different  values  then the highest is used. The
              cluster will only operate when quorum is reached - that is  more
              than  half the available votes are available to the cluster. The
              default for this value is the total  number  of  votes  for  all
              nodes in the configuration file.

       -2     Sets  the cluster up for a special "two node only" mode. Because
              of the quorum requirements mentioned above, a  two-node  cluster
              cannot  be  valid.   This  option tells the cluster manager that
              there will only ever be two nodes in the cluster and  relies  on
              fencing  to  ensure  cluster integrity.  If you specify this you
              cannot add more nodes without taking down the  existing  cluster
              and  reconfiguring  it.  Expected votes should be set to 1 for a
              two-node cluster.

       -n <nodename>
              Overrides the node name. By default the unqualified hostname  is
              used.  This  option  is  also used to specify which interface is
              used for cluster communication.

       -N <nodeid>
              Overrides the  node  ID  for  this  node.  Normally,  nodes  are
              assigned  a node id in cluster.conf. If you specify an incorrect
              node ID here, the node might not be allowed to join the cluster.
              Setting  node IDs in the configuration is a far better way to do
              this.   Note that the node's application to join the cluster may
              be rejected if you try to set the nodeid to one that has already
              been used, or if the node was previously a member of the cluster
              but with a different nodeid.

       -o <nodename>
              Override  the name this node will have in the cluster. This will
              normally be the hostname or the  first  name  specified  by  -n.
              Note  how  this  differs from -n: -n tells cman_tool how to find
              the host address and/or the entry in the configuration file.  -o
              simply  changes  the  name the node will have in the cluster and
              has no bearing on the actual  name  of  the  machine.  Use  this
              option will extreme caution.

       -m <multicast-address>
              Specifies  a multicast address to use for cluster communication.
              This is required for IPv6 operation. You should also specify  an
              ethernet  interface  to bind to this multicast address using the
              -i option.

       -w     Join and wait until the node is a cluster member.

       -q     Join and wait until the cluster is quorate.  If the cluster join
              fails and -w (or -q) is specified, then it will be retried. Note
              that cman_tool cannot tell whether the cluster join was rejected
              by  another node for a good reason or that it timed out for some
              benign reason; so it is strongly recommended that a  timeout  is
              also given with the wait options to join. If you don't want join
              to retry on failure but do want to wait, use the cman_tool  join
              command without -w followed by cman_tool wait.

       -k <keyfile>
              All  traffic  sent out by cman/corosync is encrypted. By default
              the security key used is simply the cluster name.  If  you  need
              more  security  you can specify a key file that contains the key
              used to encrypt cluster communications.  Of course, the contents
              of the key file must be the same on all nodes in the cluster. It
              is up to you to securely copy the file to the nodes.

       -t <seconds>
              If -w or -q is also  specified  then  -t  dictates  the  maximum
              amount  of  time cman_tool is prepared to wait. If the operation
              times out then a status  of  2  is  returned.   Note  that  just
              because  cman_tool  has given up, does not mean that cman itself
              has stopped trying to join a cluster.

       -X     Tells cman not to use the  configuration  file  to  get  cluster
              information. If you use this option then cman will apply several
              defaults to the cluster to get it going. The cluster  name  will
              be  "RHCluster",  node IDs will default to the IP address of the
              node and remote node names will show up as Node<nodeid>. All  of
              these,  apart  from  the  node  names  can  be overridden on the
              cman_tool command-line if required.
              If you have to set up fence devices, services or  anything  else
              in  cluster.conf  then this option is probably not worthwhile to
              you - the extra readability of sensible node names  and  numbers
              will  make  it worth using cluster.conf for the cluster too. But
              for a simple failover cluster this might save you some effort.
              On each node using this configuration you will need to have  the
              same authorization key installed. To create this key run
              mv /etc/ais/authkey /etc/cluster/cman_authkey
              then copy that file to all nodes you want to join the cluster.

       -C     Overrides  the  default  configuration module. Usually cman uses
              xmlconfig (cluster.conf) to load its configuration. If you  have
              your  configuration database held elsewhere (eg LDAP) and have a
              configuration plugin for it, then you should specify the name of
              the module (see the documentation for the module for the name of
              it - it's not necessarily the same as the filename) here.
              It is possible to chain configuration modules by separating them
              with  colons.  So  to  add  two  modules  (eg)  'ldapconfig' and
              'ldappreproc'   to   the    chain    start    cman    with    -C
              The  default  value for this is 'xmlconfig'. Note that if the -X
              is on the command-line then -C will be ignored.

       -A     Don't load openais services. Normally cman_tool join  will  load
              the configuration module 'openaisserviceenablestable' which will
              load the services installed by openais.  If you  don't  want  to
              use  these  services  or  have  not  installed openais then this
              switch will disable them.

       -D     Tells cman_tool whether to  validate  the  configuration  before
              loading  or  reloading  it.   By  default  the  configuration is
              validated, which is equivalent to -Dfail.
              -Dwarn will validate the configuration and  print  any  messages
              arising, but will attempt to use it regardless of its validity.
              -Dnone (or just -D) will skip the validation completely.
              The  -D  switch  does  not  take  a  space  between  -D  and the
              parameter. so '-D fail' will cause an error. Use -Dfail.


       -a     Shows the IP address(es) the nodes are communicating on.

       -n <nodename>
              Shows node information for a specific node. This should  be  the
              unqualified node name as it appears in 'cman_tool nodes'.

       -F <format>
              Specify  the format of the output. The format string may contain
              one or more format options, each separated  by  a  comma.  Valid
              format options include: id, name, type, and addr.


       -d <value>
              The value is a bitmask of
              2 Barriers
              4 Membership messages
              8 Daemon operation, including command-line interaction
              16 Interaction with Corosync
              32 Startup debugging (cman_tool join operations only)


       the  nodes subcommand shows a list of nodes known to cman. the state is
       one of the following:
       M    The node is a member of the cluster
       X    The node is not a member of the cluster
       d    The node is known to the cluster but disallowed access to it.


       cman_tool removes most environment variables before forking and running
       Corosync,   as   well  as  adding  some  of  its  own  for  setting  up
       configuration parameters that were overridden on the command-line,  the
       exception  to  this is that variable with names starting COROSYNC_ will
       be passed down intact as they are assumed to be  used  for  configuring
       the daemon.


       Occasionally (but very infrequently I hope) you may see nodes marked as
       "Disallowed" in cman_tool status or "d" in cman_tool nodes.  This is  a
       bit  of  a  nasty  hack  to  get around mismatch between what the upper
       layers expect of the cluster manager and corosync.

       If a node experiences a momentary lack of connectivity, but one that is
       long enough to trigger the token timeouts, then it will be removed from
       the cluster. When connectivity is restored corosync will happily let it
       rejoin the cluster with no fuss. Sadly the upper layers don't like this
       very much. They may (indeed probably  will  have)  have  changed  their
       internal  state  while  the  other  node  was  away  and  there  is  no
       straightforward way to bring the rejoined  node  up-to-date  with  that
       state.  When  this  happens  the node is marked "Disallowed" and is not
       permitted to take part in cman operations.

       If the remainder of the cluster is quorate the the node will be sent  a
       kill  message and it will be forced to leave the cluster that way. Note
       that fencing should kick in to remove the node permanently anyway,  but
       it may take longer than the network outage for this to complete.

       If  the  remainder  of the cluster is inquorate then we have a problem.
       The likelihood is that we will have two (or more) partitioned  clusters
       and  we cannot decide which is the "right" one. In this case we need to
       defer to the system administrator to kill an appropriate  selection  of
       nodes to restore the cluster to sensible operation.

       The  latter  scenario  should  be  very  rare  and  may  indicate a bug
       somewhere in the code. If the local network is very flaky  or  busy  it
       may  be  necessary  to  increase  some  of  the  protocol  timeouts for
       corosync. We are trying to think of better solutions to this problem.

       Recovering  from  this  state  can,  unfortunately,   be   complicated.
       Fortunately, in the majority of cases, fencing will do the job for you,
       and the disallowed state will only be temporary. If  it  persists,  the
       recommended  approach  it  is to do a cman tool nodes on all systems in
       the cluster and determine the largest common subset of nodes  that  are
       valid members to each other. Then reboot the others and let them rejoin
       correctly. In the case of a single-node disconnection  this  should  be
       straightforward,  with  a  large cluster that has experienced a network
       partition it could get very complicated!


       In this example we have a five node  cluster  that  has  experienced  a
       network  partition.  Here  is  the  output  of cman_tool nodes from all
       Node  Sts   Inc   Joined               Name
          1   M   2372   2007-11-05 02:58:55
          2   d   2376   2007-11-05 02:58:56
          3   d   2376   2007-11-05 02:58:56
          4   M   2376   2007-11-05 02:58:56
          5   M   2376   2007-11-05 02:58:56

       Node  Sts   Inc   Joined               Name
          1   d   2372   2007-11-05 02:58:55
          2   M   2376   2007-11-05 02:58:56
          3   M   2376   2007-11-05 02:58:56
          4   d   2376   2007-11-05 02:58:56
          5   d   2376   2007-11-05 02:58:56

       Node  Sts   Inc   Joined               Name
          1   d   2372   2007-11-05 02:58:55
          2   M   2376   2007-11-05 02:58:56
          3   M   2376   2007-11-05 02:58:56
          4   d   2376   2007-11-05 02:58:56
          5   d   2376   2007-11-05 02:58:56

       Node  Sts   Inc   Joined               Name
          1   M   2372   2007-11-05 02:58:55
          2   d   2376   2007-11-05 02:58:56
          3   d   2376   2007-11-05 02:58:56
          4   M   2376   2007-11-05 02:58:56
          5   M   2376   2007-11-05 02:58:56

       Node  Sts   Inc   Joined               Name
          1   M   2372   2007-11-05 02:58:55
          2   d   2376   2007-11-05 02:58:56
          3   d   2376   2007-11-05 02:58:56
          4   M   2376   2007-11-05 02:58:56
          5   M   2376   2007-11-05 02:58:56
       In this scenario we should  kill  the  node  node-02  and  node-03.  Of
       course,  the 3 node cluster of node-01, node-04 & node-05 should remain
       quorate and be able to fenced the two rejoined nodes anyway, but it  is
       possible that the cluster has a qdisk setup that precludes this.


       This  section  details  how the configuration systems work in cman. You
       might need to know this if you are using the -C option to cman_tool, or
       writing your own configuration subsystem.
       By  default cman uses two configuration plugins to corosync. The first,
       'xmlconfig', reads the configuration information stored in cluster.conf
       and  stores  it in an internal database, in the same schema as it finds
       in  cluster.conf.   The  second  plugin,  'cmanpreconfig',  takes   the
       information   in   that  the  database,  adds  several  cman  defaults,
       determines  the  corosync  node  name  and  nodeID  and   formats   the
       information  in  a  similar  manner  to corosync.conf(5). Corosync then
       reads those keys to start the  cluster  protocol.   cmanpreconfig  also
       reads  several  environment  variables  that  might be set by cman_tool
       which can override information in the configuration.
       In the absence of xmlconfig, ie when 'cman_tool join' is  run  with  -X
       switch  (this  removes  xmlconfig  from the module list), cmanpreconfig
       also generates several defaults so that the cluster can be got  running
       without any configuration information - see above for the details.
       Note  that  cmanpreconfig  will  not  overwrite  corosync keys that are
       explicitly set in the  configuration  file,  allowing  you  to  provide
       custom  values  for  token  timeouts  etc, even though cman has its own
       defaults for some of those values. The exception to this  is  the  node
       name/address and multicast values, which are always taken from the cman
       configuration keys.
       Most of the extra keys that  cmanpreconfig  adds  are  outside  of  the
       /cluster/  tree  and  will  only  be  seen  if  you  dump  the whole of
       corosync's  object  database.  However  it  does  add  some  keys  into
       /cluster/cman  that you would not normally see in a normal cluster.conf
       file. These are harmless, though could be confusing. The  most  obvious
       of these is the "nodename" option which is passed from cmanpreconfig to
       the name cman module, to save it recalculating the node name again.