Provided by: slony1-2-doc_2.2.6-1_all

NAME

       FAILOVER - Fail a broken replication set over to a backup node

SYNOPSIS

       FAILOVER (options);

DESCRIPTION

       The  FAILOVER command causes the backup node to take over all sets that currently originate on the failed
       node. slonik will contact all other direct subscribers of the failed node to determine which node has the
       highest  sync  status  for  each  set. If another node has a higher sync status than the backup node, the
       replication will first be redirected so that the backup node replicates against that other  node,  before
       assuming the origin role and allowing update activity.

       After  successful failover, all former direct subscribers of the failed node become direct subscribers of
       the backup node. The failed node is abandoned, and can and should be removed from the configuration  with
       SLONIK DROP NODE(7).

       If multiple set origin nodes have failed, then you should tell FAILOVER about all of them in one request.
       This is done by passing a list like NODE=(ID=val,BACKUP NODE=val), NODE=(ID=val2,  BACKUP  NODE=val2)  to
       FAILOVER.

       Nodes  that  are  forwarding  providers  can also be passed to the failover command as a failed node. The
       failover process will redirect the subscriptions from these nodes to the backup node.
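
       As a rough sketch of the combined case, assuming node 1 is a failed origin and node 3 is a failed
       forwarding provider (both node IDs and the choice of node 2 as backup are illustrative only), the
       request might look like:

       #hypothetical: origin 1 and forwarding provider 3 have both failed
       FAILOVER(
          NODE=(ID=1, BACKUP NODE=2),
          NODE=(ID=3, BACKUP NODE=2)
       );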

       ID = ival
              ID of the failed node

       BACKUP NODE = ival
              Node ID of the node that will take over all sets originating on the failed node

       This uses “failednode(p_failed_node integer, p_backup_node integer, p_failed_nodes integer[])”
       [not available as a man page].

EXAMPLE

       FAILOVER (
          ID = 1,
          BACKUP NODE = 2
       );

       #example of multiple nodes
       FAILOVER(
          NODE=(ID=1, BACKUP NODE=2),
          NODE=(ID=3, BACKUP NODE=4)
       );
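
       #sketch of a complete script: fail over, then drop the abandoned node
       #the cluster name and connection strings are hypothetical placeholders,
       #and it is assumed that only the surviving nodes need admin conninfo
       CLUSTER NAME = testcluster;
       NODE 2 ADMIN CONNINFO = 'dbname=testdb host=server2 user=slony';
       NODE 3 ADMIN CONNINFO = 'dbname=testdb host=server3 user=slony';

       FAILOVER (
          ID = 1,
          BACKUP NODE = 2
       );

       #remove the failed node from the configuration, see SLONIK DROP NODE(7)
       DROP NODE (
          ID = 1,
          EVENT NODE = 2
       );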

LOCKING BEHAVIOUR

       Exclusive locks on each replicated table will be taken out on the new origin node as replication
       triggers are changed.  If the new origin was not completely up to date,  and  replication  data  must  be
       drawn  from  some  other  node that is more up to date, the new origin will not become usable until those
       updates are complete.

DANGEROUS/UNINTUITIVE BEHAVIOUR

       This command abandons the failed node. The failed node cannot rejoin the cluster without being rebuilt
       from scratch as a slave. If at all possible, you would likely prefer to use SLONIK MOVE SET(7) instead,
       as that does not abandon the failed node.
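
       When the current origin is still reachable, a controlled switchover might look like the following
       sketch. The set ID (1) and the node IDs (1 and 2) are placeholder values, and the usual slonik
       preamble (cluster name and admin conninfo) is assumed to have been supplied already:

       #hypothetical IDs: move set 1 from origin node 1 to node 2
       LOCK SET (
          ID = 1,
          ORIGIN = 1
       );
       MOVE SET (
          ID = 1,
          OLD ORIGIN = 1,
          NEW ORIGIN = 2
       );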

       If a second failure occurs in the middle of a FAILOVER operation, then recovery might be complicated.

SLONIK EVENT CONFIRMATION BEHAVIOUR

       Slonik submits the FAILOVER_EVENT without waiting, but then waits until the most-ahead node has
       received confirmations of the FAILOVER_EVENT from all nodes before completing.

VERSION INFORMATION

       This command was introduced in Slony-I 1.0.

       In version 2.0, the default BACKUP NODE value of 1 was removed, so it is mandatory to provide a value for
       this parameter.

       In version 2.2, support was added for passing multiple nodes to a single failover command.

                                                21 September 2017                             SLONIK FAILOVER(7)