Provided by: slony1-2-doc_2.2.11-4_all

NAME

       FAILOVER - Fail a broken replication set over to a backup node

SYNOPSIS

       FAILOVER (options);

DESCRIPTION

       The FAILOVER command causes the backup node to take over all sets that currently originate
       on the failed node. slonik will contact all other direct subscribers of the failed node to
       determine which node has the highest sync status for each set. If another node has a
       higher sync status than the backup node, replication is first redirected so that the
       backup node replicates against that other node; only then does the backup node assume
       the origin role and allow update activity.

       After successful failover, all former direct subscribers of the failed node become  direct
       subscribers  of  the  backup  node.  The  failed  node is abandoned, and can and should be
       removed from the configuration with SLONIK DROP NODE(7).
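
       As a minimal sketch of that sequence, the script below fails node 1 over to node 2 and
       then drops the abandoned node. The cluster name, the connection strings, and the node
       IDs are illustrative assumptions, not values from a real configuration; see SLONIK DROP
       NODE(7) for the authoritative DROP NODE options.

       # hypothetical preamble: cluster name and conninfo strings are placeholders
       CLUSTER NAME = mycluster;
       NODE 1 ADMIN CONNINFO = 'dbname=mydb host=node1 user=slony';
       NODE 2 ADMIN CONNINFO = 'dbname=mydb host=node2 user=slony';

       # node 2 takes over all sets that originate on failed node 1
       FAILOVER (
          ID = 1,
          BACKUP NODE = 2
       );

       # remove the abandoned node; the event is generated on the new origin
       DROP NODE (
          ID = 1,
          EVENT NODE = 2
       );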

       If multiple set origin nodes have failed, then you should tell FAILOVER about all of  them
       in one request. This is done by passing a list like NODE=(ID=val, BACKUP NODE=val),
       NODE=(ID=val2, BACKUP NODE=val2) to FAILOVER.

       Nodes that are forwarding providers can also be passed to the failover command as a failed
       node.  The failover process will redirect the subscriptions from these nodes to the backup
       node.

       ID = ival
              ID of the failed node

       BACKUP NODE = ival
              Node ID of the node that will take over all sets originating on the failed node

       This uses “schemadocfailednode(p_failed_nodes integer, p_backup_node integer,
       p_failed_node integer[])” [not available as a man page].

EXAMPLE

       FAILOVER (
          ID = 1,
          BACKUP NODE = 2
       );

       #example of multiple nodes
       FAILOVER(
          NODE=(ID=1, BACKUP NODE=2),
          NODE=(ID=3, BACKUP NODE=4)
       );

LOCKING BEHAVIOUR

       Exclusive locks on each replicated table will be taken out on the new origin node as
       replication triggers are changed. If the new origin was not completely up to date, and
       replication data must be drawn from some other node that is more up to date, the new
       origin will not become usable until those updates are complete.

DANGEROUS/UNINTUITIVE BEHAVIOUR

       This command abandons the failed node. The failed node cannot rejoin the cluster
       without being rebuilt from scratch as a slave. If at all possible, you would likely
       prefer to use SLONIK MOVE SET(7) instead, as that does not abandon the failed node.
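
       Where the old origin is still running, a controlled switchover might look like the
       following sketch. Set ID 1 and node IDs 1 and 2 are illustrative assumptions, and the
       usual CLUSTER NAME and ADMIN CONNINFO preamble is assumed to precede these commands;
       see SLONIK LOCK SET(7) and SLONIK MOVE SET(7) for the authoritative syntax.

       # lock the set on its current origin before moving it
       LOCK SET (
          ID = 1,
          ORIGIN = 1
       );

       # hand the origin role to node 2; node 1 stays in the cluster
       MOVE SET (
          ID = 1,
          OLD ORIGIN = 1,
          NEW ORIGIN = 2
       );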

       If a second failure occurs in the middle of a FAILOVER operation then recovery might be
       complicated.

SLONIK EVENT CONFIRMATION BEHAVIOUR

       Slonik submits the FAILOVER_EVENT without waiting, but then waits until the most-ahead
       node has received confirmations of the FAILOVER_EVENT from all nodes before completing.

VERSION INFORMATION

       This command was introduced in Slony-I 1.0.

       In version 2.0, the default BACKUP NODE value of 1 was removed, so it is mandatory to
       provide a value for this parameter.

       In version 2.2, support was added for passing multiple nodes to a single failover command.

                                           28 July 2024                        SLONIK FAILOVER(7)