Provided by: pg-auto-failover-cli_2.0-2_amd64

NAME

       pg_auto_failover - pg_auto_failover Documentation

       The  pg_auto_failover  project is an Open Source Software project. The development happens
       at https://github.com/citusdata/pg_auto_failover and is public:  everyone  is  welcome  to
       participate by opening issues or pull requests, giving feedback, etc.

       Remember  that the first steps are to actually play with the pg_autoctl command, then read
       the entire available documentation (after all, I took the time to write it), and  then  to
       address  the  community in a kind and polite way — the same way you would expect people to
       use when addressing you.

       NOTE:
           The development of pg_auto_failover has been driven by Citus Data, now a team at
           Microsoft. The Citus Data team at Microsoft generously maintains the pg_auto_failover
           Open Source Software so that its users may continue using it in production.

          For enhancements, improvements, and new features, consider contributing to the project.
          Pull Requests are reviewed as part of the offered maintenance.

       NOTE:
          Assistance is provided as usual with Open Source projects, on a voluntary basis. If you
          need help to cook a patch, enhance the documentation, or  even  to  use  the  software,
          you're welcome to ask questions and expect some level of free guidance.

INTRODUCTION TO PG_AUTO_FAILOVER

       pg_auto_failover  is  an  extension  for PostgreSQL that monitors and manages failover for
       postgres clusters. It is optimised for simplicity and correctness.

   Single Standby Architecture
          [image: pg_auto_failover architecture with a primary and a standby node]

         pg_auto_failover   implements   Business   Continuity   for  your  PostgreSQL  services.
         pg_auto_failover implements a  single  PostgreSQL  service  using  multiple  nodes  with
         automated  failover,  and  automates  PostgreSQL  maintenance  operations  in a way that
         guarantees availability of the service to its users and applications.

         To that end, pg_auto_failover  uses  three  nodes  (machines,  servers)  per  PostgreSQL
         service:

          • a PostgreSQL primary node,

          • a PostgreSQL secondary node, using Synchronous Hot Standby,

          • a pg_auto_failover Monitor node that acts both as a witness and an orchestrator.

        The pg_auto_failover Monitor implements a state machine and relies on in-core PostgreSQL
        facilities to deliver HA. For example, when the secondary node is detected to be
        unavailable, or when its lag is reported above a defined threshold (the default is 1 WAL
        file, or 16MB, see the pgautofailover.promote_wal_log_threshold GUC on the
        pg_auto_failover monitor), then the Monitor removes it from the synchronous_standby_names
        setting on the primary node. Until the secondary is back to being monitored healthy,
        failover and switchover operations are not allowed, preventing data loss.
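
        As a quick sketch of how to inspect that threshold, one can connect to the monitor's
        Postgres instance and use the standard GUC commands (the ALTER SYSTEM value below is only
        illustrative):

           -- run against the monitor's Postgres instance
           SHOW pgautofailover.promote_wal_log_threshold;

           -- the threshold can be changed like any other GUC, for example:
           -- ALTER SYSTEM SET pgautofailover.promote_wal_log_threshold = 67108864;
           -- SELECT pg_reload_conf();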

   Multiple Standby Architecture
          [image: pg_auto_failover architecture with a primary and two standby nodes]

          In the pictured architecture, pg_auto_failover implements Business Continuity and data
          availability by implementing a single PostgreSQL service using multiple nodes with
          automated failover and data redundancy. Even after losing any Postgres node in a
          production system, this architecture maintains two copies of the data on two different
          nodes.

         When  using  more  than  one  standby,  different  architectures  can  be  achieved with
         pg_auto_failover, depending on the objectives and trade-offs needed for your  production
         setup.

   Multiple Standbys Architecture with 3 standby nodes, one async
          [image: pg_auto_failover architecture with a primary and three standby nodes]

          When setting the three replication parameters (number_sync_standbys, candidate
          priority, and replication quorum), it's possible to design very different Postgres
          architectures for your production needs.

          In this case, the system is set up with two standby nodes participating in the
          replication quorum, allowing for number_sync_standbys = 1. The system always maintains
          a minimum of two copies of the data set: one on the primary, another one on either node
          B or node C. Whenever we lose one of those nodes, we can still hold to this guarantee
          of two copies of the data set.

         Adding to that, we have the standby server D which has been set up to not participate in
         the replication quorum. Node D will not be found in the  synchronous_standby_names  list
         of  nodes.  Also,  node  D is set up in a way to never be a candidate for failover, with
         candidate-priority = 0.
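
          Such a node could be created with those properties from the start; a minimal sketch of
          the relevant options (the usual options such as --pgdata, --hostname, and --monitor are
          omitted here for brevity):

             # standby that never becomes a failover candidate and does not
             # participate in the replication quorum
             $ pg_autoctl create postgres \
                 --candidate-priority 0 \
                 --replication-quorum false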

         This architecture would fit a situation where nodes A, B, and C are deployed in the same
         data center or availability zone, and node D in another. Those three nodes are set up to
         support the main production traffic and implement high availability of both the Postgres
         service and the data set.

          Node D might be set up for Business Continuity in case the first data center is lost,
          or maybe for reporting needs on another application domain.

   Citus Architecture
          [image: pg_auto_failover architecture with a Citus formation]

         pg_auto_failover    implements    Business   Continuity   for   your   Citus   services.
         pg_auto_failover implements a single Citus formation service using multiple Citus  nodes
         with  automated  failover, and automates PostgreSQL maintenance operations in a way that
         guarantees availability of the service to its users and applications.

          In that case, pg_auto_failover knows how to orchestrate a Citus coordinator failover
          and a Citus worker failover. A Citus worker failover can be achieved with very minimal
          downtime to the application: during a short time window, SQL writes may error out.

         In this figure we see a single  standby  node  for  each  Citus  node,  coordinator  and
         workers.  It  is  possible to implement more standby nodes, and even read-only nodes for
         load balancing, see Citus Secondaries and read-replica.

MAIN PG_AUTOCTL COMMANDS

        pg_auto_failover includes the command line tool pg_autoctl, which implements many
        commands to manage your Postgres nodes. To implement the Postgres architectures described
        in this documentation, and more, you generally need only a few of the many pg_autoctl
        commands.

       This  section  of  the documentation is a short introduction to the main commands that are
       useful when getting started with pg_auto_failover. More commands are  available  and  help
       deal with a variety of situations, see the Manual Pages for the whole list.

       To  understand  which  replication  settings  to use in your case, see Architecture Basics
       section and then the Multi-node Architectures section.

       To follow a step by step guide that you can reproduce on your own Azure  subscription  and
       create a production Postgres setup from VMs, see the pg_auto_failover Tutorial section.

        To understand how to set up pg_auto_failover in a way that is compliant with your
        internal security guidelines, read the Security settings for pg_auto_failover section.

   Command line environment, configuration files, etc
       As a command line tool pg_autoctl depends on some environment variables.  Mostly, the tool
       re-uses the Postgres environment variables that you might already know.

       To  manage  a  Postgres  node  pg_auto_failover  needs to know its data directory location
       on-disk. For that, some users will find it easier to export the PGDATA variable  in  their
       environment.  The  alternative  consists  of  always  using  the  --pgdata  option that is
       available to all the pg_autoctl commands.
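
        For example (the data directory path here is only illustrative):

           # either export PGDATA once per shell session...
           $ export PGDATA=/var/lib/postgres/pgaf
           $ pg_autoctl show state

           # ...or pass --pgdata explicitly to each command
           $ pg_autoctl show state --pgdata /var/lib/postgres/pgaf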

   Creating Postgres Nodes
       To get started with the  simplest  Postgres  failover  setup,  3  nodes  are  needed:  the
       pg_auto_failover  monitor,  and  2  Postgres  nodes  that  will  get assigned roles by the
       monitor. One Postgres node will be assigned the primary  role,  the  other  one  will  get
       assigned the secondary role.

       To create the monitor use the command:

          $ pg_autoctl create monitor

        To create the Postgres nodes, use the following command on each node you want to create:

          $ pg_autoctl create postgres

        Those create commands initialize your nodes; you then have to actually run the Postgres
        services that are expected to be running. For that you can manually run the following
        command on every node:

          $ pg_autoctl run

       It  is  also  possible (and recommended) to integrate the pg_auto_failover service in your
       usual service management facility. When using systemd the following commands can  be  used
       to produce the unit file configuration required:

          $ pg_autoctl show systemd
          INFO  HINT: to complete a systemd integration, run the following commands:
          INFO  pg_autoctl -q show systemd --pgdata "/tmp/pgaf/m" | sudo tee /etc/systemd/system/pgautofailover.service
          INFO  sudo systemctl daemon-reload
          INFO  sudo systemctl enable pgautofailover
          INFO  sudo systemctl start pgautofailover
          [Unit]
          ...

       While  it  is  expected  that for a production deployment each node actually is a separate
       machine (virtual or physical, or even a container), it is also  possible  to  run  several
       Postgres nodes all on the same machine for testing or development purposes.

       TIP:
          When  running  several pg_autoctl nodes on the same machine for testing or contributing
          to pg_auto_failover, each Postgres instance needs to run on its own port, and with  its
          own  data  directory.  It  can make things easier to then set the environment variables
          PGDATA and PGPORT in each terminal, shell, or tab where each instance is started.

   Inspecting nodes
       Once your Postgres nodes have been created, and once each pg_autoctl service  is  running,
       it is possible to inspect the current state of the formation with the following command:

          $ pg_autoctl show state

        The pg_autoctl show state command outputs the current state of the system only once.
        Sometimes it would be nice to have an auto-updating display, as provided by common tools
        such as watch(1) or top(1). For that, the following commands are available (see also
        pg_autoctl watch):

          $ pg_autoctl watch
          $ pg_autoctl show state --watch

       To analyze what's been happening to get to the current state, it is possible to review the
       past events generated by the pg_auto_failover monitor with the following command:

          $ pg_autoctl show events

       HINT:
           The pg_autoctl show commands can be run from any node in your system. Those commands
           need to connect to the monitor and print the current state or the current known list
           of events as per the monitor's view of the system.

          Use  pg_autoctl  show  state  --local to have a view of the local state of a given node
          without connecting to the monitor Postgres instance.

          The option --json is available in most pg_autoctl  commands  and  switches  the  output
          format  from  a  human  readable  table  form to a program friendly JSON pretty-printed
          output.
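
           For example, to get the full state in JSON form:

              $ pg_autoctl show state --json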

   Inspecting and Editing Replication Settings
       When  creating  a  node  it  is  possible  to  use  the   --candidate-priority   and   the
       --replication-quorum  options to set the replication properties as required by your choice
       of Postgres architecture.

       To review the current replication settings of a formation, use one of  the  two  following
       commands, which are convenient aliases (the same command with two ways to invoke it):

          $ pg_autoctl show settings
          $ pg_autoctl get formation settings

       It is also possible to edit those replication settings at any time while your nodes are in
       production: you can change your mind or adjust to new elements without having to re-deploy
       everything. Just use the following commands to adjust the replication settings on the fly:

          $ pg_autoctl set formation number-sync-standbys
          $ pg_autoctl set node replication-quorum
          $ pg_autoctl set node candidate-priority
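
        For example, to prevent a given node from ever being promoted (the node name here is
        illustrative):

           $ pg_autoctl set node candidate-priority --name node_D 0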

       IMPORTANT:
          The  pg_autoctl  get and pg_autoctl set commands always connect to the monitor Postgres
          instance.

          The  pg_autoctl  set  command  then  changes  the  replication  settings  on  the  node
          registration  on  the monitor. Then the monitor assigns the APPLY_SETTINGS state to the
          current primary node in the system for it to apply the new replication settings to  its
          Postgres streaming replication setup.

           As a result, the pg_autoctl set commands require a stable state in the system to be
           allowed to proceed. Namely, the current primary node in the system must have both its
           Current State and its Assigned State set to primary, as per the pg_autoctl show state
           output.

   Implementing Maintenance Operations
        When a Postgres node must be taken offline for a maintenance operation, such as a kernel
        security upgrade or a minor Postgres update, it is best to let the pg_auto_failover
        monitor know about it.

          • For one thing, a node that is known to be in  maintenance  does  not  participate  in
            failovers.  If  you are running with two Postgres nodes, then failover operations are
            entirely prevented while the standby node is in maintenance.

           • Moreover, depending on your replication settings, enabling maintenance on your
             standby ensures that the primary node switches to async replication before Postgres
             is shut down on the secondary, so that write queries are not blocked.

       To implement maintenance operations, use the following commands:

          $ pg_autoctl enable maintenance
          $ pg_autoctl disable maintenance

       The main pg_autoctl run service that is expected to be running in  the  background  should
       continue to run during the whole maintenance operation.  When a node is in the maintenance
       state, the pg_autoctl service is not controlling the Postgres service anymore.
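
        A typical maintenance window on a standby node could then look like this sketch (the
        data directory path and the actual maintenance task are illustrative):

           $ pg_autoctl enable maintenance --pgdata /var/lib/postgres/pgaf
           $ sudo apt-get upgrade -y        # or any other maintenance operation
           $ pg_autoctl disable maintenance --pgdata /var/lib/postgres/pgaf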

       Note that it is possible to enable maintenance  on  a  primary  Postgres  node,  and  that
       operation   then   requires   a   failover  to  happen  first.  It  is  possible  to  have
       pg_auto_failover orchestrate that for you when using the command:

          $ pg_autoctl enable maintenance --allow-failover

       IMPORTANT:
           The pg_autoctl enable and pg_autoctl disable commands require a stable state in the
           system to be allowed to proceed. Namely, the current primary node in the system must
           have both its Current State and its Assigned State set to primary, as per the
           pg_autoctl show state output.

   Manual failover, switchover, and promotions
        In cases where a failover is needed without an actual node failure, the pg_auto_failover
        monitor can be used to orchestrate the operation. Use one of the following commands,
        which are synonyms in the pg_auto_failover design:

          $ pg_autoctl perform failover
          $ pg_autoctl perform switchover

       Finally,  it  is  also  possible  to “elect” a new primary node in your formation with the
       command:

          $ pg_autoctl perform promotion

       IMPORTANT:
           The pg_autoctl perform commands require a stable state in the system to be allowed to
           proceed. Namely, the current primary node in the system must have both its Current
           State and its Assigned State set to primary, as per the pg_autoctl show state output.

   What's next?
        This section of the documentation is meant to help users get started by focusing on the
        main commands of the pg_autoctl tool. Each command has many options, some with a very
        small impact and some with a big impact in terms of security or architecture. Read the
        rest of the manual to understand how to best use the many pg_autoctl options to implement
        your specific Postgres production architecture.

PG_AUTO_FAILOVER TUTORIAL

       In this guide we’ll create a Postgres setup with two nodes, a primary and a standby.  Then
       we'll  add a second standby node. We’ll simulate failure in the Postgres nodes and see how
       the system continues to function.

        This tutorial uses docker-compose in order to separate the architecture design from some
        of the implementation details. This allows reasoning at the architecture level within
        this tutorial, and makes it easier to see which software component needs to be deployed
        and run on which node.

        The setup provided in this tutorial is good for replaying at home in the lab. It is not
        intended to be production ready though. In particular, no attention has been spent on
        volume management. After all, this is a tutorial: the goal is to walk through the first
        steps of using pg_auto_failover to implement Postgres automated failover.

   Pre-requisites
       When using docker-compose we describe a list of services, each service may run on  one  or
       more nodes, and each service just runs a single isolated process in a container.

        Within the context of a tutorial, or even a development environment, this matches very
        well to provisioning separate physical machines on-prem, or Virtual Machines either
        on-prem or in a Cloud service.

       The  docker  image  used  in  this tutorial is named pg_auto_failover:tutorial.  It can be
       built locally when using the attached Dockerfile found within the  GitHub  repository  for
       pg_auto_failover.

       To build the image, either use the provided Makefile and run make build, or run the docker
       build command directly:

          $ git clone https://github.com/citusdata/pg_auto_failover
          $ cd pg_auto_failover/docs/tutorial

          $ docker build -t pg_auto_failover:tutorial -f Dockerfile ../..
          $ docker-compose build

   Postgres failover with two nodes
       Using docker-compose makes it easy enough to create an architecture that  looks  like  the
       following diagram:
          [image: pg_auto_failover architecture with a primary and a standby node]

          Such an architecture provides failover capabilities, though it does not provide High
          Availability of both the Postgres service and the data. See the Multi-node
          Architectures chapter of our docs to understand more about this.

         To create a cluster we use the following docker-compose definition:

          version: "3.9"  # optional since v1.27.0

          services:

            app:
              build:
                context: .
                dockerfile: Dockerfile.app
              environment:
                PGUSER: tutorial
                PGDATABASE: tutorial
                PGHOST: node1,node2
                PGPORT: 5432
                PGAPPNAME: tutorial
                PGSSLMODE: require
                PGTARGETSESSIONATTRS: read-write

            monitor:
              image: pg_auto_failover:tutorial
              volumes:
                - ./monitor:/var/lib/postgres
              environment:
                PGDATA: /var/lib/postgres/pgaf
                PG_AUTOCTL_SSL_SELF_SIGNED: true
              expose:
                - 5432
              command: |
                pg_autoctl create monitor --auth trust --run

            node1:
              image: pg_auto_failover:tutorial
              hostname: node1
              volumes:
                - ./node1:/var/lib/postgres
              environment:
                PGDATA: /var/lib/postgres/pgaf
                PGUSER: tutorial
                PGDATABASE: tutorial
                PG_AUTOCTL_HBA_LAN: true
                PG_AUTOCTL_AUTH_METHOD: "trust"
                PG_AUTOCTL_SSL_SELF_SIGNED: true
                PG_AUTOCTL_MONITOR: "postgresql://autoctl_node@monitor/pg_auto_failover"
              expose:
                - 5432
              command: |
                pg_autoctl create postgres --name node1 --pg-hba-lan --run

            node2:
              image: pg_auto_failover:tutorial
              hostname: node2
              volumes:
                - ./node2:/var/lib/postgres
              environment:
                PGDATA: /var/lib/postgres/pgaf
                PGUSER: tutorial
                PGDATABASE: tutorial
                PG_AUTOCTL_HBA_LAN: true
                PG_AUTOCTL_AUTH_METHOD: "trust"
                PG_AUTOCTL_SSL_SELF_SIGNED: true
                PG_AUTOCTL_MONITOR: "postgresql://autoctl_node@monitor/pg_auto_failover"
              expose:
                 - 5432
              command: |
                pg_autoctl create postgres --name node2 --pg-hba-lan --run

        To run the full Postgres cluster with HA from this definition, we can use the following
        command:

          $ docker-compose up

        The command above starts the services up. The first service is the monitor and is created
        with the command pg_autoctl create monitor. The options for this command are exposed in
        the environment, and could have been specified on the command line too:

           $ pg_autoctl create monitor --ssl-self-signed --auth trust --run

        While the Postgres nodes are being provisioned by docker-compose, you can run the
        following command and have a dynamic dashboard to follow what's happening. The following
        command is like top for pg_auto_failover:

          $ docker-compose exec monitor pg_autoctl watch

       After  a  little  while,  you  can  run the pg_autoctl show state command and see a stable
       result:

          $ docker-compose exec monitor pg_autoctl show state

           Name |  Node |  Host:Port |       TLI: LSN |   Connection |      Reported State |      Assigned State
          ------+-------+------------+----------------+--------------+---------------------+--------------------
          node2 |     1 | node2:5432 |   1: 0/3000148 |   read-write |             primary |             primary
          node1 |     2 | node1:5432 |   1: 0/3000148 |    read-only |           secondary |           secondary

       We can review the available Postgres URIs with the pg_autoctl show uri command:

          $ docker-compose exec monitor pg_autoctl show uri
                   Type |    Name | Connection String
           -------------+---------+-------------------------------
                monitor | monitor | postgres://autoctl_node@58053a02af03:5432/pg_auto_failover?sslmode=require
              formation | default | postgres://node2:5432,node1:5432/tutorial?target_session_attrs=read-write&sslmode=require

   Add application data
       Let's create a database schema with a single table, and some data in there.

          $ docker-compose exec app psql

          -- in psql

          CREATE TABLE companies
          (
            id         bigserial PRIMARY KEY,
            name       text NOT NULL,
            image_url  text,
            created_at timestamp without time zone NOT NULL,
            updated_at timestamp without time zone NOT NULL
          );

       Next download and ingest some sample data, still from within our psql session:

          \copy companies from program 'curl -o- https://examples.citusdata.com/mt_ref_arch/companies.csv' with csv
           COPY 75
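
        Still from within the same psql session, we can quickly check that the rows landed:

           SELECT count(*) FROM companies;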

   Our first failover
       When using pg_auto_failover, it is possible (and  easy)  to  trigger  a  failover  without
       having to orchestrate an incident, or power down the current primary.

          $ docker-compose exec monitor pg_autoctl perform switchover
          14:57:16 992 INFO  Waiting 60 secs for a notification with state "primary" in formation "default" and group 0
          14:57:16 992 INFO  Listening monitor notifications about state changes in formation "default" and group 0
          14:57:16 992 INFO  Following table displays times when notifications are received
              Time |  Name |  Node |  Host:Port |       Current State |      Assigned State
          ---------+-------+-------+------------+---------------------+--------------------
          14:57:16 | node2 |     1 | node2:5432 |             primary |            draining
          14:57:16 | node1 |     2 | node1:5432 |           secondary |   prepare_promotion
          14:57:16 | node1 |     2 | node1:5432 |   prepare_promotion |   prepare_promotion
          14:57:16 | node1 |     2 | node1:5432 |   prepare_promotion |    stop_replication
          14:57:16 | node2 |     1 | node2:5432 |             primary |      demote_timeout
          14:57:17 | node2 |     1 | node2:5432 |            draining |      demote_timeout
          14:57:17 | node2 |     1 | node2:5432 |      demote_timeout |      demote_timeout
          14:57:19 | node1 |     2 | node1:5432 |    stop_replication |    stop_replication
          14:57:19 | node1 |     2 | node1:5432 |    stop_replication |        wait_primary
          14:57:19 | node2 |     1 | node2:5432 |      demote_timeout |             demoted
          14:57:19 | node2 |     1 | node2:5432 |             demoted |             demoted
          14:57:19 | node1 |     2 | node1:5432 |        wait_primary |        wait_primary
          14:57:19 | node2 |     1 | node2:5432 |             demoted |          catchingup
          14:57:26 | node2 |     1 | node2:5432 |             demoted |          catchingup
          14:57:38 | node2 |     1 | node2:5432 |             demoted |          catchingup
          14:57:39 | node2 |     1 | node2:5432 |          catchingup |          catchingup
          14:57:39 | node2 |     1 | node2:5432 |          catchingup |           secondary
          14:57:39 | node2 |     1 | node2:5432 |           secondary |           secondary
          14:57:40 | node1 |     2 | node1:5432 |        wait_primary |             primary
          14:57:40 | node1 |     2 | node1:5432 |             primary |             primary

       The new state after the failover looks like the following:

          $ docker-compose exec monitor pg_autoctl show state
           Name |  Node |  Host:Port |       TLI: LSN |   Connection |      Reported State |      Assigned State
          ------+-------+------------+----------------+--------------+---------------------+--------------------
          node2 |     1 | node2:5432 |   2: 0/5002698 |    read-only |           secondary |           secondary
          node1 |     2 | node1:5432 |   2: 0/5002698 |   read-write |             primary |             primary

       And we can verify that we still have the data available:

           $ docker-compose exec app psql -c "select count(*) from companies"
            count
           -------
               75
           (1 row)

   Next steps
        As mentioned in the first section of this tutorial, the way we use docker-compose here is
        not meant to be production ready. It's useful to understand and play with a distributed
        system such as a multi-node Postgres setup and its failovers.

        See the command pg_autoctl do tmux compose session for more details about how to run a
        test environment with docker-compose, including external volumes for each node.

        See also the complete Azure VMs Tutorial for a guide on how to provision an Azure network
        and then Azure VMs with pg_auto_failover, including systemd coverage and a failover
        triggered by stopping a full VM.

AZURE VMS TUTORIAL

       In   this   guide  we’ll  create  a  primary  and  secondary  Postgres  node  and  set  up
       pg_auto_failover to replicate data between them. We’ll simulate  failure  in  the  primary
       node and see how the system smoothly switches (fails over) to the secondary.

       For  illustration,  we'll run our databases on virtual machines in the Azure platform, but
       the techniques here are relevant to any cloud provider or on-premise  network.  We'll  use
       four  virtual  machines:  a  primary  database,  a  secondary  database, a monitor, and an
       "application." The monitor watches the other nodes’  health,  manages  global  state,  and
       assigns nodes their roles.

   Create virtual network
       Our  database machines need to talk to each other and to the monitor node, so let's create
       a virtual network.

          az group create \
              --name ha-demo \
              --location eastus

          az network vnet create \
              --resource-group ha-demo \
              --name ha-demo-net \
              --address-prefix 10.0.0.0/16

       We need to open ports 5432 (Postgres) and 22 (SSH) between the  machines,  and  also  give
       ourselves  access  from  our  remote IP. We'll do this with a network security group and a
       subnet.

          az network nsg create \
              --resource-group ha-demo \
              --name ha-demo-nsg

          az network nsg rule create \
              --resource-group ha-demo \
              --nsg-name ha-demo-nsg \
              --name ha-demo-ssh-and-pg \
              --access allow \
              --protocol Tcp \
              --direction Inbound \
              --priority 100 \
              --source-address-prefixes `curl ifconfig.me` 10.0.1.0/24 \
              --source-port-range "*" \
              --destination-address-prefix "*" \
              --destination-port-ranges 22 5432

          az network vnet subnet create \
              --resource-group ha-demo \
              --vnet-name ha-demo-net \
              --name ha-demo-subnet \
              --address-prefixes 10.0.1.0/24 \
              --network-security-group ha-demo-nsg

       Finally  add  four  virtual   machines   (ha-demo-a,   ha-demo-b,   ha-demo-monitor,   and
       ha-demo-app). For speed we background the az vm create processes and run them in parallel:

          # create VMs in parallel
          for node in monitor a b app
          do
          az vm create \
              --resource-group ha-demo \
              --name ha-demo-${node} \
              --vnet-name ha-demo-net \
              --subnet ha-demo-subnet \
              --nsg ha-demo-nsg \
              --public-ip-address ha-demo-${node}-ip \
              --image debian \
              --admin-username ha-admin \
              --generate-ssh-keys &
          done
          wait

       To  make  it  easier to SSH into these VMs in future steps, let's make a shell function to
       retrieve their IP addresses:

          # run this in your local shell as well

          vm_ip () {
            az vm list-ip-addresses -g ha-demo -n ha-demo-$1 -o tsv \
              --query '[] [] .virtualMachine.network.publicIpAddresses[0].ipAddress'
          }

          # for convenience with ssh

          for node in monitor a b app
          do
          ssh-keyscan -H `vm_ip $node` >> ~/.ssh/known_hosts
          done

       Let's review what we created so far.

          az resource list --output table --query \
            "[?resourceGroup=='ha-demo'].{ name: name, flavor: kind, resourceType: type, region: location }"

       This shows the following resources:

          Name                             ResourceType                                           Region
          -------------------------------  -----------------------------------------------------  --------
          ha-demo-a                        Microsoft.Compute/virtualMachines                      eastus
          ha-demo-app                      Microsoft.Compute/virtualMachines                      eastus
          ha-demo-b                        Microsoft.Compute/virtualMachines                      eastus
          ha-demo-monitor                  Microsoft.Compute/virtualMachines                      eastus
          ha-demo-appVMNic                 Microsoft.Network/networkInterfaces                    eastus
          ha-demo-aVMNic                   Microsoft.Network/networkInterfaces                    eastus
          ha-demo-bVMNic                   Microsoft.Network/networkInterfaces                    eastus
          ha-demo-monitorVMNic             Microsoft.Network/networkInterfaces                    eastus
          ha-demo-nsg                      Microsoft.Network/networkSecurityGroups                eastus
          ha-demo-a-ip                     Microsoft.Network/publicIPAddresses                    eastus
          ha-demo-app-ip                   Microsoft.Network/publicIPAddresses                    eastus
          ha-demo-b-ip                     Microsoft.Network/publicIPAddresses                    eastus
          ha-demo-monitor-ip               Microsoft.Network/publicIPAddresses                    eastus
          ha-demo-net                      Microsoft.Network/virtualNetworks                      eastus

   Install the pg_autoctl executable
       This guide uses Debian Linux, but similar steps will work on other distributions. All that
       differs are the packages and paths. See Installing pg_auto_failover.

       The  pg_auto_failover system is distributed as a single pg_autoctl binary with subcommands
       to initialize and manage a replicated PostgreSQL service.  We’ll install the  binary  with
       the  operating  system  package  manager  on  all  nodes.  It will help us run and observe
       PostgreSQL.

          for node in monitor a b app
          do
          az vm run-command invoke \
             --resource-group ha-demo \
             --name ha-demo-${node} \
             --command-id RunShellScript \
             --scripts \
                "sudo touch /home/ha-admin/.hushlogin" \
                "curl https://install.citusdata.com/community/deb.sh | sudo bash" \
                "sudo DEBIAN_FRONTEND=noninteractive apt-get install -q -y postgresql-common" \
                "echo 'create_main_cluster = false' | sudo tee -a /etc/postgresql-common/createcluster.conf" \
                "sudo DEBIAN_FRONTEND=noninteractive apt-get install -q -y postgresql-11-auto-failover-1.4" \
                "sudo usermod -a -G postgres ha-admin" &
          done
          wait

   Run a monitor
       The pg_auto_failover monitor is the first component to run. It  periodically  attempts  to
       contact  the  other  nodes  and  watches their health. It also maintains global state that
       “keepers” on each node consult to determine their own roles in the system.

          # on the monitor virtual machine

          ssh -l ha-admin `vm_ip monitor` -- \
            pg_autoctl create monitor \
              --auth trust \
              --ssl-self-signed \
              --pgdata monitor \
              --pgctl /usr/lib/postgresql/11/bin/pg_ctl

        This command initializes a PostgreSQL cluster at the location pointed to by the --pgdata
        option. When --pgdata is omitted, pg_autoctl attempts to use the PGDATA environment
        variable. If a PostgreSQL instance already existed in the destination directory, this
        command would have configured it to serve as a monitor.

        pg_auto_failover installs the pgautofailover Postgres extension and grants access to a
        new autoctl_node user.

        In this tutorial we use --auth trust to avoid complex security settings. The Postgres
        trust authentication method is not considered a reasonable choice for production
        environments. Consider either using the --skip-pg-hba option or --auth scram-sha-256 and
        then setting up passwords yourself.
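
        For instance, a more production-minded monitor could be created with a sketch like the
        following (you would still need to set up passwords for the autoctl_node user
        afterwards):

           ssh -l ha-admin `vm_ip monitor` -- \
             pg_autoctl create monitor \
               --auth scram-sha-256 \
               --ssl-self-signed \
               --pgdata monitor \
               --pgctl /usr/lib/postgresql/11/bin/pg_ctl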

       At  this  point  the monitor is created. Now we'll install it as a service with systemd so
       that it will resume if the VM restarts.

          ssh -T -l ha-admin `vm_ip monitor` << CMD
            pg_autoctl -q show systemd --pgdata ~ha-admin/monitor > pgautofailover.service
            sudo mv pgautofailover.service /etc/systemd/system
            sudo systemctl daemon-reload
            sudo systemctl enable pgautofailover
            sudo systemctl start pgautofailover
          CMD

   Bring up the nodes
       We’ll create the primary database using the pg_autoctl create subcommand.

          ssh -l ha-admin `vm_ip a` -- \
            pg_autoctl create postgres \
              --pgdata ha \
              --auth trust \
              --ssl-self-signed \
              --username ha-admin \
              --dbname appdb \
              --hostname ha-demo-a.internal.cloudapp.net \
              --pgctl /usr/lib/postgresql/11/bin/pg_ctl \
              --monitor 'postgres://autoctl_node@ha-demo-monitor.internal.cloudapp.net/pg_auto_failover?sslmode=require'

        Notice the user and database name in the monitor connection string -- these are what the
        monitor initialization created. We also give it the path to pg_ctl so that the keeper
        will use the correct version of pg_ctl in the future even if other versions of postgres
        are installed on the system.

       In  the  example above, the keeper creates a primary database. It chooses to set up node A
       as primary because the monitor reports there are no other nodes in the system yet. This is
       one  example  of how the keeper is state-based: it makes observations and then adjusts its
       state, in this case from "init" to "single."

       Also add a setting to trust connections from our "application" VM:

          ssh -T -l ha-admin `vm_ip a` << CMD
            echo 'hostssl "appdb" "ha-admin" ha-demo-app.internal.cloudapp.net trust' \
              >> ~ha-admin/ha/pg_hba.conf
          CMD

        At this point the monitor and primary node are created and running. Next we need to run
        the keeper. It’s an independent process so that it can continue operating even if the
        PostgreSQL process terminates on the node. We'll install it as a service with systemd so
        that it will resume if the VM restarts.

          ssh -T -l ha-admin `vm_ip a` << CMD
            pg_autoctl -q show systemd --pgdata ~ha-admin/ha > pgautofailover.service
            sudo mv pgautofailover.service /etc/systemd/system
            sudo systemctl daemon-reload
            sudo systemctl enable pgautofailover
            sudo systemctl start pgautofailover
          CMD

       Next connect to node B and do the same process. We'll do both steps at once:

          ssh -l ha-admin `vm_ip b` -- \
            pg_autoctl create postgres \
              --pgdata ha \
              --auth trust \
              --ssl-self-signed \
              --username ha-admin \
              --dbname appdb \
              --hostname ha-demo-b.internal.cloudapp.net \
              --pgctl /usr/lib/postgresql/11/bin/pg_ctl \
              --monitor 'postgres://autoctl_node@ha-demo-monitor.internal.cloudapp.net/pg_auto_failover?sslmode=require'

          ssh -T -l ha-admin `vm_ip b` << CMD
            pg_autoctl -q show systemd --pgdata ~ha-admin/ha > pgautofailover.service
            sudo mv pgautofailover.service /etc/systemd/system
            sudo systemctl daemon-reload
            sudo systemctl enable pgautofailover
            sudo systemctl start pgautofailover
          CMD

        Node B discovers from the monitor that a primary exists, then switches its own state to
        be a hot standby and begins streaming WAL contents from the primary.

   Node communication
       For convenience, pg_autoctl modifies each node's pg_hba.conf file to allow  the  nodes  to
       connect to one another. For instance, pg_autoctl added the following lines to node A:

          # automatically added to node A

          hostssl "appdb" "ha-admin" ha-demo-a.internal.cloudapp.net trust
          hostssl replication "pgautofailover_replicator" ha-demo-b.internal.cloudapp.net trust
          hostssl "appdb" "pgautofailover_replicator" ha-demo-b.internal.cloudapp.net trust

       For  pg_hba.conf  on  the monitor node pg_autoctl inspects the local network and makes its
       best guess about the subnet to allow. In our case it guessed correctly:

          # automatically added to the monitor

          hostssl "pg_auto_failover" "autoctl_node" 10.0.1.0/24 trust

       If worker nodes have more ad-hoc addresses and are not in the same subnet, it's better  to
       disable pg_autoctl's automatic modification of pg_hba using the --skip-pg-hba command line
       option during creation. You will then need to edit the hba file by  hand.  Another  reason
       for manual edits would be to use special authentication methods.
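
        Such a hand-written entry follows the regular Postgres pg_hba.conf format; for example
        (subnet and authentication method are illustrative):

           # added by hand when running with --skip-pg-hba
           hostssl "appdb" "pgautofailover_replicator" 10.0.9.0/24 scram-sha-256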

   Watch the replication
       First  let’s  verify  that  the  monitor knows about our nodes, and see what states it has
       assigned them:

          ssh -l ha-admin `vm_ip monitor` pg_autoctl show state --pgdata monitor

            Name |  Node |                            Host:Port |       LSN | Reachable |       Current State |      Assigned State
          -------+-------+--------------------------------------+-----------+-----------+---------------------+--------------------
          node_1 |     1 | ha-demo-a.internal.cloudapp.net:5432 | 0/3000060 |       yes |             primary |             primary
          node_2 |     2 | ha-demo-b.internal.cloudapp.net:5432 | 0/3000060 |       yes |           secondary |           secondary

       This looks good. We can add data to the primary, and later see it appear in the secondary.
       We'll  connect  to  the database from inside our "app" virtual machine, using a connection
       string obtained from the monitor.

          ssh -l ha-admin `vm_ip monitor` pg_autoctl show uri --pgdata monitor

                Type |    Name | Connection String
          -----------+---------+-------------------------------
             monitor | monitor | postgres://autoctl_node@ha-demo-monitor.internal.cloudapp.net:5432/pg_auto_failover?sslmode=require
           formation | default | postgres://ha-demo-b.internal.cloudapp.net:5432,ha-demo-a.internal.cloudapp.net:5432/appdb?target_session_attrs=read-write&sslmode=require

       Now we'll get the connection string and store it in a local environment variable:

          APP_DB_URI=$( \
            ssh -l ha-admin `vm_ip monitor` \
              pg_autoctl show uri --formation default --pgdata monitor \
          )

       The connection string contains both our nodes,  comma  separated,  and  includes  the  url
       parameter  ?target_session_attrs=read-write  telling  psql  that  we  want  to  connect to
       whichever of these servers supports reads and writes.  That will be the primary server.

          # connect to database via psql on the app vm and
          # create a table with a million rows
          ssh -l ha-admin -t `vm_ip app` -- \
            psql "'$APP_DB_URI'" \
              -c "'CREATE TABLE foo AS SELECT generate_series(1,1000000) bar;'"

   Cause a failover
       Now that we've added data to node A, let's switch which  is  considered  the  primary  and
       which  the  secondary.  After the switch we'll connect again and query the data, this time
       from node B.

          # initiate failover to node B
          ssh -l ha-admin -t `vm_ip monitor` \
            pg_autoctl perform switchover --pgdata monitor

       Once node B is marked "primary" (or "wait_primary") we can connect  and  verify  that  the
       data is still present:

          # connect to database via psql on the app vm
          ssh -l ha-admin -t `vm_ip app` -- \
            psql "'$APP_DB_URI'" \
              -c "'SELECT count(*) FROM foo;'"

       It shows

            count
          ---------
           1000000

   Cause a node failure
        This plot is too boring, time to introduce a problem. We’ll turn off the VM for node B
        (currently the primary after our previous failover) and watch node A get promoted.

       In one terminal let’s keep an eye on events:

          ssh -t -l ha-admin `vm_ip monitor` -- \
            watch -n 1 -d pg_autoctl show state --pgdata monitor

       In another terminal we’ll turn off the virtual server.

          az vm stop \
            --resource-group ha-demo \
            --name ha-demo-b

       After a number of failed attempts to talk to node B, the monitor determines  the  node  is
       unhealthy and puts it into the "demoted" state.  The monitor promotes node A to be the new
       primary.

            Name |  Node |                            Host:Port |       LSN | Reachable |       Current State |      Assigned State
          -------+-------+--------------------------------------+-----------+-----------+---------------------+--------------------
          node_1 |     1 | ha-demo-a.internal.cloudapp.net:5432 | 0/6D4E068 |       yes |        wait_primary |        wait_primary
          node_2 |     2 | ha-demo-b.internal.cloudapp.net:5432 | 0/6D4E000 |       yes |             demoted |          catchingup

       Node A cannot be considered in full "primary" state since there is no  secondary  present,
       but  it  can still serve client requests. It is marked as "wait_primary" until a secondary
       appears, to indicate that it's running without a backup.

       Let's add some data while B is offline.

          # notice how $APP_DB_URI continues to work no matter which node
          # is serving as primary
          ssh -l ha-admin -t `vm_ip app` -- \
            psql "'$APP_DB_URI'" \
              -c "'INSERT INTO foo SELECT generate_series(1000001, 2000000);'"

   Resurrect node B
       Run this command to bring node B back online:

          az vm start \
            --resource-group ha-demo \
            --name ha-demo-b

       Now the next time the keeper retries its health check, it brings the node  back.   Node  B
       goes  through  the  state  "catchingup"  while it updates its data to match A. Once that's
       done, B becomes a secondary, and A is now a full primary again.

            Name |  Node |                            Host:Port |        LSN | Reachable |       Current State |      Assigned State
          -------+-------+--------------------------------------+------------+-----------+---------------------+--------------------
          node_1 |     1 | ha-demo-a.internal.cloudapp.net:5432 | 0/12000738 |       yes |             primary |             primary
          node_2 |     2 | ha-demo-b.internal.cloudapp.net:5432 | 0/12000738 |       yes |           secondary |           secondary

       What's more, if we connect directly to the database again, all two million rows are  still
       present.

          ssh -l ha-admin -t `vm_ip app` -- \
            psql "'$APP_DB_URI'" \
              -c "'SELECT count(*) FROM foo;'"

       It shows

            count
          ---------
           2000000

ARCHITECTURE BASICS

        pg_auto_failover is designed as a simple and robust way to manage automated Postgres
        failover in production. On top of robust operations, the pg_auto_failover setup is
        flexible and allows either Business Continuity or High Availability configurations. The
        pg_auto_failover design includes configuration changes in a live system without downtime.

       pg_auto_failover is designed to be able to handle a single PostgreSQL service using  three
       nodes. In this setting, the system is resilient to losing any one of three nodes.
          [image: pg_auto_failover Architecture for a standalone PostgreSQL service]

          It is important to understand that when using only two Postgres nodes,
          pg_auto_failover is optimized for Business Continuity. In the event of losing a single
          node, pg_auto_failover is capable of continuing the PostgreSQL service, and prevents
          any data loss when doing so, thanks to PostgreSQL Synchronous Replication.

         That  said,  there is a trade-off involved in this architecture. The business continuity
         bias relaxes replication guarantees for asynchronous  replication  in  the  event  of  a
         standby node failure. This allows the PostgreSQL service to accept writes when there's a
         single server available, and opens the service for potential data loss  if  the  primary
         server were also to fail.

   The pg_auto_failover Monitor
       Each  PostgreSQL  node  in  pg_auto_failover runs a Keeper process which informs a central
       Monitor node about notable local changes. Some changes require the Monitor to  orchestrate
       a correction across the cluster:

          • New nodes

             At initialization time, it's necessary to prepare the configuration of each node
             for PostgreSQL streaming replication, and get the cluster to converge to the nominal
             state with both a primary and a secondary node in each group. The monitor determines
             each new node's role.

          • Node failure

            The monitor orchestrates a failover when it detects an unhealthy node.  The design of
            pg_auto_failover  allows  the monitor to shut down service to a previously designated
            primary node without causing a "split-brain" situation.

       The monitor is the authoritative node that manages global state and makes changes  in  the
       cluster  by  issuing  commands  to the nodes' keeper processes. A pg_auto_failover monitor
       node failure has limited impact on the system. While it prevents reacting to other  nodes'
       failures,  it  does  not  affect  replication.  The PostgreSQL streaming replication setup
       installed by pg_auto_failover does not depend on having the monitor up and running.

   pg_auto_failover Glossary
       pg_auto_failover handles a single PostgreSQL service with the following concepts:

   Monitor
       The pg_auto_failover monitor is a service that keeps track of one  or  several  formations
       containing groups of nodes.

       The  monitor  is  implemented  as  a  PostgreSQL  extension,  so  when you run the command
       pg_autoctl create monitor a  PostgreSQL  instance  is  initialized,  configured  with  the
       extension, and started. The monitor service embeds a PostgreSQL instance.

   Formation
       A formation is a logical set of PostgreSQL services that are managed together.

       It  is possible to operate many formations with a single monitor instance.  Each formation
       has a group of Postgres nodes and the FSM orchestration implemented by the monitor applies
       separately to each group.

   Group
       A  group of two PostgreSQL nodes work together to provide a single PostgreSQL service in a
       Highly Available fashion. A group consists of a PostgreSQL primary server and a  secondary
       server  setup  with  Hot  Standby  synchronous replication. Note that pg_auto_failover can
       orchestrate the whole setting-up of the replication for you.

       In pg_auto_failover versions up to 1.3, a single  Postgres  group  can  contain  only  two
       Postgres  nodes.  Starting  with  pg_auto_failover  1.4, there's no limit to the number of
       Postgres nodes in a single group. Note that each Postgres instance  that  belongs  to  the
       same group serves the same dataset in its data directory (PGDATA).

       NOTE:
          The  notion  of a formation that contains multiple groups in pg_auto_failover is useful
          when setting up and managing a whole  Citus  formation,  where  the  coordinator  nodes
          belong to group zero of the formation, and each Citus worker node becomes its own group
          and may have Postgres standby nodes.

   Keeper
       The pg_auto_failover keeper is an agent that must be running on the same server where your
       PostgreSQL  nodes  are  running.  The keeper controls the local PostgreSQL instance (using
       both the pg_ctl command-line tool and SQL queries), and communicates with the monitor:

          • it sends updated data about the local node, such as the WAL delta in between servers,
            measured via PostgreSQL statistics views.

          • it receives state assignments from the monitor.

        The keeper also maintains local state that includes the most recent communication
        established with the monitor and with the other PostgreSQL nodes of its group, enabling
        it to detect Network Partitions.

       NOTE:
          In  pg_auto_failover  versions up to and including 1.3, the keeper process started with
          pg_autoctl run manages a separate Postgres instance, running as its own process tree.

          Starting in pg_auto_failover version 1.4, the keeper process (started  with  pg_autoctl
          run)  runs  the  Postgres  instance  as  a  sub-process of the main pg_autoctl process,
          allowing tighter control over the Postgres  execution.  Running  the  sub-process  also
          makes  the  solution  work  better  both  in container environments (because it's now a
          single process tree) and with systemd, because it uses a specific  cgroup  per  service
          unit.

   Node
       A  node  is  a  server  (virtual  or physical) that runs PostgreSQL instances and a keeper
       service. At any given time, any node might be a primary or a secondary Postgres  instance.
       The whole point of pg_auto_failover is to decide this state.

       As  a  result,  refrain  from  naming your nodes with the role you intend for them.  Their
       roles can change. If they didn't, your system wouldn't need pg_auto_failover!

   State
       A state is the representation of the per-instance and per-group  situation.   The  monitor
       and  the  keeper  implement  a  Finite State Machine to drive operations in the PostgreSQL
       groups; allowing pg_auto_failover to implement High Availability with  the  goal  of  zero
       data loss.

       The keeper main loop enforces the current expected state of the local PostgreSQL instance,
       and reports the current state and some more information to the monitor. The  monitor  uses
       this  set  of  information and its own health-check information to drive the State Machine
       and assign a goal state to the keeper.

       The keeper implements the transitions between a current state and a monitor-assigned  goal
       state.

   Client-side HA
        Support for client-side High Availability is included in PostgreSQL's driver libpq from
        version 10 onward. Using this driver, it is possible to specify multiple host names or IP
        addresses in the same connection string:

           $ psql -d "postgresql://host1,host2/dbname?target_session_attrs=read-write"
           $ psql -d "postgresql://host1:port1,host2:port2/dbname?target_session_attrs=read-write"
           $ psql -d "host=host1,host2 port=port1,port2 target_session_attrs=read-write"

       When  using either of the syntax above, the psql application attempts to connect to host1,
       and when successfully connected, checks the target_session_attrs  as  per  the  PostgreSQL
       documentation of it:
          If  this  parameter  is  set  to  read-write,  only  a  connection  in which read-write
          transactions are  accepted  by  default  is  considered  acceptable.   The  query  SHOW
          transaction_read_only  will  be  sent upon any successful connection; if it returns on,
          the connection will be closed. If multiple  hosts  were  specified  in  the  connection
          string,  any  remaining  servers  will  be  tried just as if the connection attempt had
          failed.  The  default  value  of  this  parameter,  any,  regards  all  connections  as
          acceptable.

       When the connection attempt to host1 fails, or when the target_session_attrs cannot be
       verified, then the psql application attempts to connect to host2.

       The behavior is implemented in the connection library libpq, so any application  using  it
       can benefit from this implementation, not just psql.

       When  using  pg_auto_failover,  configure  your  application  connection string to use the
       primary and the secondary server host names, and set target_session_attrs=read-write  too,
       so  that  your  application  automatically  connects  to the current primary, even after a
       failover occurred.
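
       For instance, with two pg_auto_failover nodes reachable as node1 and node2 (hypothetical
       host names and database name), the application connection string could look like this:

           # node1, node2, and appdb are placeholder names; adjust to your setup
           $ psql -d "postgresql://node1:5432,node2:5432/appdb?target_session_attrs=read-write"

       The pg_autoctl show uri command, shown later in this documentation, prints a ready-to-use
       connection string of that form for your formation.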

   Monitoring protocol
       The monitor interacts with the data nodes in 2 ways:

          • Data nodes periodically connect and  run  SELECT  pgautofailover.node_active(...)  to
            communicate their current state and obtain their goal state.

          • The monitor periodically connects to all the data nodes to see if they are healthy,
            doing the equivalent of pg_isready, as shown just after this list.
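
       The health check amounts to something like the following, here run by hand against a
       hypothetical data node host name:

           # node1.example.com is a placeholder host name
           $ pg_isready -h node1.example.com -p 5432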

       When a data node calls node_active, the state of the node is stored in the
       pgautofailover.node table and the state machines of both nodes are progressed. The state
       machines are described later in this documentation. The monitor typically only moves one
       state forward and waits for the node(s) to converge, except in failure states.

       If a node stops communicating with the monitor, this either causes a failover (if the node
       is a primary), disables synchronous replication (if the node is a secondary), or pauses
       the state machine until the node comes back (in other cases). In most cases, the latter is
       harmless, though in some cases it may cause downtime to last longer, e.g. if a standby
       goes down during a failover.

       To  simplify operations, a node is only considered unhealthy if the monitor cannot connect
       and it hasn't reported its state  through  node_active  for  a  while.  This  allows,  for
       example, PostgreSQL to be restarted without causing a health check failure.

   Synchronous vs. asynchronous replication
       By  default,  pg_auto_failover  uses synchronous replication, which means all writes block
       until at least one standby node has reported receiving them. To handle cases in which  the
       standby  fails,  the  primary  switches between two states called wait_primary and primary
       based  on  the  health  of  standby  nodes,  and  based   on   the   replication   setting
       number_sync_standby.

       When  in  the  wait_primary  state,  synchronous  replication is disabled by automatically
       setting synchronous_standby_names = '' to allow writes to proceed. However doing  so  also
       disables  failover,  since the standby might get arbitrarily far behind. If the standby is
       responding to health checks and  within  1  WAL  segment  of  the  primary  (by  default),
       synchronous    replication    is    enabled    again    on    the   primary   by   setting
       synchronous_standby_names = '*' which may cause a short latency spike  since  writes  will
       then block until the standby has caught up.

       When using several standby nodes with replication quorum enabled, the actual setting for
       synchronous_standby_names is set to the list of those standby nodes that are set to
       participate in the replication quorum.

       If  you  wish  to  disable  synchronous  replication,  you  need  to  add the following to
       postgresql.conf:

          synchronous_commit = 'local'

       This ensures that writes return as soon as they are committed on the primary -- under all
       circumstances. In that case, failover might lead to some data loss, but failover is not
       initiated if the secondary is more than 10 WAL segments (by default) behind the primary.
       During a manual failover, the standby will continue receiving writes from the old primary.
       The standby will stop receiving writes only once it is fully caught up (the most common
       case), the primary fails, or it does not receive writes for 2 minutes.

   A note about performance
       In some cases the performance impact on write latency when setting synchronous replication
       makes the application fail to deliver  expected  performance.  If  testing  or  production
       feedback  shows  this  to  be  the  case, it is beneficial to switch to using asynchronous
       replication.

       The  way  to  use  asynchronous  replication  in  pg_auto_failover  is   to   change   the
       synchronous_commit  setting.  This setting can be set per transaction, per session, or per
       user. It does not have to be set globally on your Postgres instance.

       One way to benefit from that would be:

          alter role fast_and_loose set synchronous_commit to local;

       That way performance-critical parts of the application don't have to wait for the  standby
       nodes. Only use this when you can also lower your data durability guarantees.
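
       For sessions or transactions that only need the relaxed setting temporarily, standard
       Postgres commands can be used; a minimal sketch:

           -- relax durability for the current session only
           SET synchronous_commit TO local;

           -- or for a single transaction
           BEGIN;
           SET LOCAL synchronous_commit TO local;
           -- ... write statements that don't need to wait for the standby ...
           COMMIT;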

   Node recovery
       When  bringing  a  node  back  after a failover, the keeper (pg_autoctl run) can simply be
       restarted. It will also restart postgres if needed and obtain  its  goal  state  from  the
       monitor.  If  the  failed  node was a primary and was demoted, it will learn this from the
       monitor. Once the node reports, it is allowed  to  come  back  as  a  standby  by  running
       pg_rewind. If it is too far behind, the node performs a new pg_basebackup.
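
       For example, assuming a hypothetical data directory of /var/lib/postgresql/data,
       restarting the keeper on the recovered node is a single command:

           # adjust --pgdata to your actual data directory
           $ pg_autoctl run --pgdata /var/lib/postgresql/data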

MULTI-NODE ARCHITECTURES

       Pg_auto_failover  allows  you  to  have  more  than  one standby node, and offers advanced
       control over your production architecture characteristics.

   Architectures with two standby nodes
       When adding your second  standby  node  with  default  settings,  you  get  the  following
       architecture:
         [image:  pg_auto_failover  architecture with two standby nodes] [image] pg_auto_failover
         architecture with two standby nodes.UNINDENT

         In this case, three nodes get set up with the same  characteristics,  achieving  HA  for
         both  the  Postgres  service  and  the production dataset. An important setting for this
         architecture is number_sync_standbys.

         The replication setting number_sync_standbys sets how many  standby  nodes  the  primary
         should  wait  for when committing a transaction. In order to have a good availability in
         your  system,  pg_auto_failover  requires  number_sync_standbys  +   1   standby   nodes
         participating  in  the  replication quorum: this allows any standby node to fail without
         impact on the system's ability to respect the replication quorum.

         When only two nodes are registered in a group on the monitor we have  a  primary  and  a
         single  secondary node. Then number_sync_standbys can only be set to zero. When adding a
         second standby  node  to  a  pg_auto_failover  group,  then  the  monitor  automatically
         increments number_sync_standbys to one, as we see in the diagram above.

         When  number_sync_standbys  is set to zero then pg_auto_failover implements the Business
         Continuity setup as seen in Architecture Basics: synchronous replication is then used as
         a way to guarantee that failover can be implemented without data loss.

          In more detail:

          1. With  number_sync_standbys set to one, this architecture always maintains two copies
             of the dataset: one on the current primary node (node A in  the  previous  diagram),
             and  one  on  the  standby that acknowledges the transaction first (either node B or
             node C in the diagram).

             When one of the standby nodes is unavailable, the second copy  of  the  dataset  can
             still be maintained thanks to the remaining standby.

             When  both  the  standby  nodes  are  unavailable,  then  it's no longer possible to
             guarantee the replication quorum, and thus writes on the primary  are  blocked.  The
             Postgres  primary  node  waits  until  at  least  one  standby node acknowledges the
             transactions locally committed, thus degrading your Postgres service to read-only.

           2.  It is possible to manually set number_sync_standbys to zero when having registered
               two standby nodes to the monitor, overriding the default behavior (see the command
               sketch right after this list).

              In  that  case,  when the second standby node becomes unhealthy at the same time as
              the first standby node, the primary node is assigned  the  state  Wait_primary.  In
              that  state,  synchronous  replication  is  disabled  on  the  primary  by  setting
              synchronous_standby_names to an empty string. Writes are allowed  on  the  primary,
              even though there's no extra copy of the production dataset available at this time.

              Setting  number_sync_standbys  to  zero  allows  data  to be written even when both
              standby nodes are down. In this case, a single copy of the production data  set  is
              kept and, if the primary was then to fail, some data will be lost. How much depends
              on your backup and recovery mechanisms.
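
          For reference, overriding that default uses the formation-level command detailed in the
          Number Sync Standbys section below:

           $ pg_autoctl set formation number-sync-standbys 0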

   Replication Settings and Postgres Architectures
       The entire flexibility of pg_auto_failover can  be  leveraged  with  the  following  three
       replication settings:

           • Number of sync standbys

          • Replication quorum

          • Candidate priority

   Number Sync Standbys
       This parameter is used by Postgres in the synchronous_standby_names parameter:
       number_sync_standbys is the number of synchronous standbys for whose replies transactions
       must wait.

       This  parameter  can  be  set  at the formation level in pg_auto_failover, meaning that it
       applies to the current primary, and "follows" a failover to apply to any new primary  that
       might replace the current one.

       To set this parameter to the value <n>, use the following command:

          pg_autoctl set formation number-sync-standbys <n>

       The  default  value  in pg_auto_failover is zero. When set to zero, the Postgres parameter
       synchronous_standby_names can be set to either '*' or to '':

       • synchronous_standby_names  =  '*'  means  that  any  standby  may  participate  in   the
         replication quorum for transactions with synchronous_commit set to on or higher values.

         pg_autofailover  uses  synchronous_standby_names = '*' when there's at least one standby
         that is known to be healthy.

        • synchronous_standby_names = '' (empty string) disables synchronous commit and makes all
          your commits asynchronous, meaning that transaction commits will not wait for
          replication. In other words, a single copy of your production data is maintained when
          synchronous_standby_names is set that way.

         pg_autofailover  uses  synchronous_standby_names  = '' only when number_sync_standbys is
         set to zero and there's no standby node known healthy by the monitor.

       In order to set number_sync_standbys to a non-zero value, pg_auto_failover  requires  that
       at least number_sync_standbys + 1 standby nodes be registered in the system.

       When  the first standby node is added to the pg_auto_failover monitor, the only acceptable
       value for number_sync_standbys is zero. When a second standby is added  that  participates
       in the replication quorum, then number_sync_standbys is automatically set to one.

       The  command pg_autoctl set formation number-sync-standbys can be used to change the value
       of this parameter in a  formation,  even  when  all  the  nodes  are  already  running  in
       production.  The pg_auto_failover monitor then sets a transition for the primary to update
       its local value of synchronous_standby_names.

   Replication Quorum
       The replication quorum setting is a boolean and defaults to true, and can be set per-node.
       Pg_auto_failover  includes  a  given  node  in  synchronous_standby_names  only  when  the
       replication  quorum  parameter  has  been  set  to  true.  This  means  that  asynchronous
       replication will be used for nodes where replication-quorum is set to false.

       It is possible to force asynchronous replication globally by setting replication quorum to
       false on all the nodes in a formation. Remember that failovers will happen: when needed,
       set your replication settings on the current primary node too, since it is going to be a
       standby later.

       To set this parameter to either true or false, use one of the following commands:

          pg_autoctl set node replication-quorum true
          pg_autoctl set node replication-quorum false

   Candidate Priority
       The candidate priority setting is an integer that can be set to any value between 0 (zero)
       and  100 (one hundred). The default value is 50. When the pg_auto_failover monitor decides
       to orchestrate a failover, it uses each node's candidate priority to pick the new  primary
       node.

       When a node's candidate priority is set to zero, that node will never be selected for
       promotion as the new primary when a failover is orchestrated by the monitor. The monitor
       will instead wait until another registered node is healthy and in a position to be
       promoted.

       To set this parameter to the value <n>, use the following command:

          pg_autoctl set node candidate-priority <n>

       When nodes have the same candidate priority, the monitor then picks the standby  with  the
       most advanced LSN position published to the monitor. When more than one node has published
       the same LSN position, a random one is chosen.

       When the candidate for failover has not published the most advanced LSN position in the
       WAL, pg_auto_failover orchestrates an intermediate step in the failover mechanism. The
       candidate fetches the missing WAL bytes from one of the standby nodes with the most
       advanced LSN position prior to being promoted. Postgres allows this operation thanks to
       cascading replication: any standby can be the upstream node for another standby.

       It is required at all times that at least two nodes have a non-zero candidate priority  in
       any pg_auto_failover formation. Otherwise no failover is possible.

   Auditing replication settings
       The command pg_autoctl get formation settings (also known as pg_autoctl show settings) can
       be used to obtain a summary of all the replication  settings  currently  in  effect  in  a
       formation. Still using the first diagram on this page, we get the following summary:

          $ pg_autoctl get formation settings
            Context |    Name |                   Setting | Value
          ----------+---------+---------------------------+-------------------------------------------------------------
          formation | default |      number_sync_standbys | 1
            primary |  node_A | synchronous_standby_names | 'ANY 1 (pgautofailover_standby_3, pgautofailover_standby_2)'
               node |  node_A |        replication quorum | true
               node |  node_B |        replication quorum | true
               node |  node_C |        replication quorum | true
               node |  node_A |        candidate priority | 50
               node |  node_B |        candidate priority | 50
               node |  node_C |        candidate priority | 50

       We can see that the number_sync_standbys has been used to compute the current value of the
       synchronous_standby_names setting on the primary.

       Because all the nodes in that example have the same default candidate priority (50),
       pg_auto_failover uses the form ANY 1 with the list of standby nodes that are currently
       participating in the replication quorum.

       The entries in the synchronous_standby_names list are meant to match the  application_name
       connection  setting  used in the primary_conninfo, and the format used by pg_auto_failover
       there is the format string "pgautofailover_standby_%d" where %d is replaced  by  the  node
       id.  This  allows  keeping the same connection string to the primary when the node name is
       changed (using the command pg_autoctl set metadata --name).

       Here we can see the node id of each registered Postgres node with the following command:

          $ pg_autoctl show state
            Name |  Node |      Host:Port |       LSN | Reachable |       Current State |      Assigned State
          -------+-------+----------------+-----------+-----------+---------------------+--------------------
          node_A |     1 | localhost:5001 | 0/7002310 |       yes |             primary |             primary
          node_B |     2 | localhost:5002 | 0/7002310 |       yes |           secondary |           secondary
          node_C |     3 | localhost:5003 | 0/7002310 |       yes |           secondary |           secondary
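
       For instance, node_B above has node id 2 and therefore connects with an application_name
       of pgautofailover_standby_2, so its primary_conninfo contains something along these lines
       (illustrative snippet; the remaining settings depend on your setup):

           # illustrative, based on the node_A entry shown above
           primary_conninfo = 'host=localhost port=5001 application_name=pgautofailover_standby_2 ...'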

       When configuring pg_auto_failover with the per-formation number_sync_standbys setting and
       the per-node replication quorum and candidate priority settings, those properties are used
       to compute the synchronous_standby_names value on the primary node. This value is
       automatically maintained on the primary by pg_auto_failover, and is updated either when
       replication settings are changed or when a failover happens.

       The other situation in which the pg_auto_failover replication settings are used is a
       candidate election, when a failover happens and more than two nodes are registered in a
       group. Then the node with the highest candidate priority is selected, as detailed above in
       the Candidate Priority section.

   Sample architectures with three standby nodes
       When  setting  the three parameters above, it's possible to design very different Postgres
       architectures for your production needs.
         [image: pg_auto_failover architecture with three standby nodes] [image] pg_auto_failover
         architecture with three standby nodes.UNINDENT

         In  this  case, the system is set up with three standby nodes all set the same way, with
         default parameters. The default parameters support  setting  number_sync_standbys  =  2.
         This  means  that  Postgres will maintain three copies of the production data set at all
         times.

         On the other hand, if two standby nodes were to fail at the same time, despite the  fact
         that two copies of the data are still maintained, the Postgres service would be degraded
         to read-only.

         With this architecture diagram, here's the summary that we obtain:

          $ pg_autoctl show settings
            Context |    Name |                   Setting | Value
          ----------+---------+---------------------------+---------------------------------------------------------------------------------------
          formation | default |      number_sync_standbys | 2
            primary |  node_A | synchronous_standby_names | 'ANY 2 (pgautofailover_standby_2, pgautofailover_standby_4, pgautofailover_standby_3)'
               node |  node_A |        replication quorum | true
               node |  node_B |        replication quorum | true
               node |  node_C |        replication quorum | true
               node |  node_D |        replication quorum | true
               node |  node_A |        candidate priority | 50
               node |  node_B |        candidate priority | 50
               node |  node_C |        candidate priority | 50
               node |  node_D |        candidate priority | 50

   Sample architecture with three standby nodes, one async
         [image: pg_auto_failover architecture with  three  standby  nodes,  one  async]  [image]
         pg_auto_failover architecture with three standby nodes, one async.UNINDENT

         In  this  case,  the  system  is  set  up  with  two  standby nodes participating in the
         replication quorum, allowing for number_sync_standbys = 1. The system  always  maintains
         at  least  two  copies  of the data set, one on the primary, another on either node B or
          node C. Whenever we lose one of those nodes, we still hold the guarantee of having two
          copies of the data set.

         Additionally,  we  have the standby server D which has been set up to not participate in
         the replication quorum. Node D will not be found in the  synchronous_standby_names  list
         of  nodes.  Also,  node  D  is  set  up  to  never  be  a  candidate  for failover, with
         candidate-priority = 0.

          This architecture would fit a situation where nodes A, B, and C are deployed in the same
          data center or availability zone, and node D in another one. Those three nodes are set
         up to support the main production traffic and implement high availability  of  both  the
         Postgres service and the data set.

         Node D might be set up for Business Continuity in case the first data center is lost, or
         maybe for reporting needs on another application domain.

         With this architecture diagram, here's the summary that we obtain:

           $ pg_autoctl show settings
            Context |    Name |                   Setting | Value
          ----------+---------+---------------------------+-------------------------------------------------------------
          formation | default |      number_sync_standbys | 1
            primary |  node_A | synchronous_standby_names | 'ANY 1 (pgautofailover_standby_2, pgautofailover_standby_3)'
               node |  node_A |        replication quorum | true
               node |  node_B |        replication quorum | true
               node |  node_C |        replication quorum | true
               node |  node_D |        replication quorum | false
               node |  node_A |        candidate priority | 50
               node |  node_B |        candidate priority | 50
               node |  node_C |        candidate priority | 50
               node |  node_D |        candidate priority | 0

CITUS SUPPORT

       The usual pg_autoctl commands work both with Postgres  standalone  nodes  and  with  Citus
       nodes.
         [image:  pg_auto_failover  architecture with a Citus formation] [image] pg_auto_failover
         architecture with a Citus formation.UNINDENT

         When using pg_auto_failover with Citus, a pg_auto_failover formation is  composed  of  a
         coordinator and a set of worker nodes.

         When  High-Availability  is enabled at the formation level, which is the default, then a
         minimum of two coordinator nodes are required: a primary and a secondary coordinator  to
         be able to orchestrate a failover when needed.

          The same applies to the worker nodes: when using pg_auto_failover for Citus HA, each
          worker node is a pg_auto_failover group in the formation, and each worker group is set
          up with at least two nodes (primary, secondary).

   Setting-up your first Citus formation
       Have a look at our documentation of Citus Cluster Quick Start for more details with a full
       tutorial setup on a single VM, for testing and QA.

   Citus specific commands and operations
       When setting up Citus with pg_auto_failover, the following  Citus  specific  commands  are
       provided.  Other  pg_autoctl  commands  work as usual when deploying a Citus formation, so
       that you can use the rest of this documentation to operate your Citus deployments.

   pg_autoctl create coordinator
        This creates a Citus coordinator, that is to say a Postgres node with the Citus extension
        loaded and ready to act as a coordinator. The coordinator is always placed in the
        pg_auto_failover group zero of a given formation.

       See pg_autoctl create coordinator for details.

       IMPORTANT:
           The default --dbname is the same as the current system user name, which in many cases
           is going to be postgres. Please make sure to use the --dbname option with the actual
           database that you're going to use with your application.

           Citus does not support multiple databases; you have to use the database where the Citus
           extension is created. When using Citus, this is essential to the correct behavior of
           worker failover.

   pg_autoctl create worker
       This  command  creates  a  new  Citus worker node, that is to say a Postgres node with the
       Citus extensions loaded, and registered to the Citus coordinator created with the previous
       command.  Because  the  Citus coordinator is always given group zero, the pg_auto_failover
        monitor knows how to reach the Citus coordinator and automate worker registration.

       The default worker creation policy is to assign the  primary  role  to  the  first  worker
       registered, then secondary in the same group, then primary in a new group, etc.

       If  you  want  to  extend an existing group to have a third worker node in the same group,
       enabling multiple-standby capabilities in your setup, then make sure to  use  the  --group
       option to the pg_autoctl create worker command.

       See pg_autoctl create worker for details.

   pg_autoctl activate
       This  command  calls  the Citus “activation” API so that a node can be used to host shards
       for your reference and distributed tables.

       When creating a Citus worker, pg_autoctl create worker automatically activates the  worker
        node to the coordinator. You only need this command when something unexpected has
        happened and you want to manually make sure the worker node has been activated at the
       Citus coordinator level.

       Starting  with  Citus  10 it is also possible to activate the coordinator itself as a node
       with shard placement. Use pg_autoctl activate on your Citus coordinator node  manually  to
       use that feature.

       When  the  Citus  coordinator  is  activated,  an extra step is then needed for it to host
       shards of distributed tables. If you want your coordinator to have  shards,  then  have  a
       look  at  the  Citus  API  citus_set_node_property to set the shouldhaveshards property to
       true.
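
        A minimal sketch of that call, run on the coordinator, assuming the coordinator is
        registered under the hypothetical host name coord0a on port 5432:

           -- 'coord0a' and 5432 are placeholders for your coordinator's host and port
           SELECT citus_set_node_property('coord0a', 5432, 'shouldhaveshards', true);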

       See pg_autoctl activate for details.

   Citus worker failover
       When a failover is orchestrated by pg_auto_failover for a Citus worker node, Citus  offers
       the  opportunity to make the failover close to transparent to the application. Because the
       application connects to the coordinator, which in turn connects to the worker nodes,  then
       it  is  possible  with  Citus  to _pause_ the SQL write traffic on the coordinator for the
       shards hosted on a failed worker node.  The  Postgres  failover  then  happens  while  the
       traffic  is  kept  on  the  coordinator, and resumes as soon as a secondary worker node is
       ready to accept read-write queries.

        This is implemented thanks to the smart locking strategy of Citus in its citus_update_node
        API, and pg_auto_failover takes full advantage of it with a purpose-built set of FSM
        transitions for Citus workers.

   Citus Secondaries and read-replica
        It is possible to set up Citus read-only replicas. This Citus feature allows using a set of
       dedicated  nodes  (both  coordinator  and  workers)  to  serve  read-only traffic, such as
       reporting, analytics, or other parts of your workload that are read-only.

       Citus read-replica nodes are a solution for load-balancing. Those nodes can't be  used  as
       HA failover targets, and thus have their candidate-priority set to zero. This setting of a
        read-replica cannot be changed later.

       This setup is done by setting the Citus property citus.use_secondary_nodes to  always  (it
       defaults  to  never),  and the Citus property citus.cluster_name to your read-only cluster
       name.

       Both of those settings are entirely supported and managed by  pg_autoctl  when  using  the
       --citus-secondary    --cluster-name   cluster_d   options   to   the   pg_autoctl   create
       coordinator|worker commands.

       Here is an example where we have created a formation with  three  nodes  for  HA  for  the
       coordinator (one primary and two secondary nodes), then a single worker node with the same
       three nodes setup for HA, and then one read-replica environment  on-top  of  that,  for  a
       total of 8 nodes:

          $  pg_autoctl show state
              Name |  Node |      Host:Port |       LSN | Reachable |       Current State |      Assigned State
          ---------+-------+----------------+-----------+-----------+---------------------+--------------------
           coord0a |   0/1 | localhost:5501 | 0/5003298 |       yes |             primary |             primary
           coord0b |   0/3 | localhost:5502 | 0/5003298 |       yes |           secondary |           secondary
           coord0c |   0/6 | localhost:5503 | 0/5003298 |       yes |           secondary |           secondary
           coord0d |   0/7 | localhost:5504 | 0/5003298 |       yes |           secondary |           secondary
          worker1a |   1/2 | localhost:5505 | 0/4000170 |       yes |             primary |             primary
          worker1b |   1/4 | localhost:5506 | 0/4000170 |       yes |           secondary |           secondary
          worker1c |   1/5 | localhost:5507 | 0/4000170 |       yes |           secondary |           secondary
          reader1d |   1/8 | localhost:5508 | 0/4000170 |       yes |           secondary |           secondary

       Nodes named coord0d and reader1d are members of the read-replica cluster cluster_d. We can
       see that a read-replica cluster needs a  dedicated  coordinator  and  then  one  dedicated
       worker node per group.

       TIP:
          It  is  possible  to  name the nodes in a pg_auto_failover formation either at creation
          time or later, using one of those commands:

              $ pg_autoctl create worker --name ...
              $ pg_autoctl set node metadata --name ...

          Here coord0d is the node name for the dedicated coordinator for cluster_d, and reader1d
          is the node name for the dedicated worker for cluster_d in the worker group 1 (the only
          worker group in that setup).

        The usual command to show the connection strings for your application now also lists the
        read-replica setup:

              $ pg_autoctl show uri
                  Type |      Name | Connection String
          -------------+-----------+-------------------------------
               monitor |   monitor | postgres://autoctl_node@localhost:5500/pg_auto_failover?sslmode=prefer
             formation |   default | postgres://localhost:5503,localhost:5501,localhost:5502/postgres?target_session_attrs=read-write&sslmode=prefer
          read-replica | cluster_d | postgres://localhost:5504/postgres?sslmode=prefer

       Given  that  setup,  your  application  can  now use the formation default Postgres URI to
       connect to the highly-available read-write  service,  or  to  the  read-replica  cluster_d
       service  to  connect  to  the  read-only  replica  where  you can offload some of your SQL
       workload.

        When connecting to the cluster_d connection string, the Citus properties
        citus.use_secondary_nodes and citus.cluster_name are automatically set to their expected
        values.
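
        A quick sanity check is to run standard SHOW commands against the read-replica connection
        string; for example, reusing the cluster_d URI printed above:

           $ psql "postgres://localhost:5504/postgres?sslmode=prefer" -c 'SHOW citus.cluster_name;'
           $ psql "postgres://localhost:5504/postgres?sslmode=prefer" -c 'SHOW citus.use_secondary_nodes;'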

CITUS CLUSTER QUICK START

       In this guide we’ll create a Citus cluster with a  coordinator  node  and  three  workers.
       Every  node  will have a secondary for failover. We’ll simulate failure in the coordinator
       and worker nodes and see how the system continues to function.

       This tutorial uses docker-compose in order to separate the architecture design  from  some
        of the implementation details. This allows reasoning at the architecture level within
        this tutorial, and makes it easier to see which software component needs to be deployed
        and run on which node.

       The  setup  provided  in this tutorial is good for replaying at home in the lab. It is not
        intended to be production ready though. In particular, no attention has been paid to
        volume management. After all, this is a tutorial: the goal is to walk through the first
       steps of using pg_auto_failover to provide HA to a Citus formation.

   Pre-requisites
       When using docker-compose we describe a list of services, each service may run on  one  or
       more nodes, and each service just runs a single isolated process in a container.

        Within the context of a tutorial, or even a development environment, this maps very well
        to provisioning separate physical machines on-prem, or Virtual Machines either on-prem or
        in a Cloud service.

       The  docker  image  used in this tutorial is named pg_auto_failover:citus. It can be built
       locally when using  the  attached  Dockerfile  found  within  the  GitHub  repository  for
       pg_auto_failover.

       To build the image, either use the provided Makefile and run make build, or run the docker
       build command directly:

          $ git clone https://github.com/citusdata/pg_auto_failover
          $ cd pg_auto_failover/docs/cluster

          $ docker build -t pg_auto_failover:citus -f Dockerfile ../..
          $ docker-compose build

   Our first Citus Cluster
       To create a cluster we use the following docker-compose definition:

          version: "3.9"  # optional since v1.27.0

          services:

            monitor:
              image: pg_auto_failover:citus
              environment:
                PGDATA: /tmp/pgaf
              command: |
                pg_autoctl create monitor --ssl-self-signed --auth trust --run
              expose:
                - 5432

            coord:
              image: pg_auto_failover:citus
              environment:
                PGDATA: /tmp/pgaf
                PGUSER: citus
                PGDATABASE: citus
                PG_AUTOCTL_MONITOR: "postgresql://autoctl_node@monitor/pg_auto_failover"
              expose:
                - 5432
              command: |
                pg_autoctl create coordinator --ssl-self-signed --auth trust --pg-hba-lan --run

            worker:
              image: pg_auto_failover:citus
              environment:
                PGDATA: /tmp/pgaf
                PGUSER: citus
                PGDATABASE: citus
                PG_AUTOCTL_MONITOR: "postgresql://autoctl_node@monitor/pg_auto_failover"
              expose:
               - 5432
              command: |
                pg_autoctl create worker --ssl-self-signed --auth trust --pg-hba-lan --run

       To run the full Citus cluster with HA from this  definition,  we  can  use  the  following
       command:

          $ docker-compose up --scale coord=2 --scale worker=6

       The command above starts the services up. The command also specifies a --scale option that
       is different for each service. We need:

          • one monitor node, and the default scale for a service is 1,

           • one primary Citus coordinator node and one secondary Citus coordinator node, which is
             to say two coordinator nodes,

          • and  three  Citus  worker  nodes, each worker with both a primary Postgres node and a
            secondary Postgres node, so that's a scale of 6 here.

       The default policy for the pg_auto_failover monitor is to assign a primary and a secondary
       per  auto failover Group. In our case, every node being provisioned with the same command,
       we benefit from that default policy:

          $ pg_autoctl create worker --ssl-self-signed --auth trust --pg-hba-lan --run

        When provisioning a production cluster, it is often required to have better control over
        which node participates in which group, using the --group N option of the pg_autoctl
        create worker command.

       Within a given group, the first node that registers is a primary, and the other nodes  are
       secondary  nodes. The monitor takes care of that in a way that we don't have to. In a High
       Availability setup, every node should be ready to be promoted  primary  at  any  time,  so
       knowing which node in a group is assigned primary first is not very interesting.

        While the cluster is being provisioned by docker-compose, you can run the following
       command and have a dynamic dashboard to follow what's happening.  The following command is
       like top for pg_auto_failover:

          $ docker-compose exec monitor pg_autoctl watch

        The pg_basebackup operation that is used to create the secondary nodes takes some time
        when using Citus, because the first CHECKPOINT is quite slow. So at first, when inquiring
        about the cluster state, you might see the following output:

          $ docker-compose exec monitor pg_autoctl show state
              Name |  Node |         Host:Port |       TLI: LSN |   Connection |      Reported State |      Assigned State
          ---------+-------+-------------------+----------------+--------------+---------------------+--------------------
           coord0a |   0/1 | cd52db444544:5432 |   1: 0/200C4A0 |   read-write |        wait_primary |        wait_primary
           coord0b |   0/2 | 66a31034f2e4:5432 |         1: 0/0 |       none ! |        wait_standby |          catchingup
          worker1a |   1/3 | dae7c062e2c1:5432 |   1: 0/2003B18 |   read-write |        wait_primary |        wait_primary
          worker1b |   1/4 | 397e6069b09b:5432 |         1: 0/0 |       none ! |        wait_standby |          catchingup
          worker2a |   2/5 | 5bf86f9ef784:5432 |   1: 0/2006AB0 |   read-write |        wait_primary |        wait_primary
          worker2b |   2/6 | 23498b801a61:5432 |         1: 0/0 |       none ! |        wait_standby |          catchingup
          worker3a |   3/7 | c23610380024:5432 |   1: 0/2003B18 |   read-write |        wait_primary |        wait_primary
          worker3b |   3/8 | 2ea8aac8a04a:5432 |         1: 0/0 |       none ! |        wait_standby |          catchingup

        After a while though (typically around a minute or less), you can run that same command
        again for a stable result:

          $ docker-compose exec monitor pg_autoctl show state

              Name |  Node |         Host:Port |       TLI: LSN |   Connection |      Reported State |      Assigned State
          ---------+-------+-------------------+----------------+--------------+---------------------+--------------------
           coord0a |   0/1 | cd52db444544:5432 |   1: 0/3127AD0 |   read-write |             primary |             primary
           coord0b |   0/2 | 66a31034f2e4:5432 |   1: 0/3127AD0 |    read-only |           secondary |           secondary
          worker1a |   1/3 | dae7c062e2c1:5432 |   1: 0/311B610 |   read-write |             primary |             primary
          worker1b |   1/4 | 397e6069b09b:5432 |   1: 0/311B610 |    read-only |           secondary |           secondary
          worker2a |   2/5 | 5bf86f9ef784:5432 |   1: 0/311B610 |   read-write |             primary |             primary
          worker2b |   2/6 | 23498b801a61:5432 |   1: 0/311B610 |    read-only |           secondary |           secondary
          worker3a |   3/7 | c23610380024:5432 |   1: 0/311B648 |   read-write |             primary |             primary
          worker3b |   3/8 | 2ea8aac8a04a:5432 |   1: 0/311B648 |    read-only |           secondary |           secondary

       You can see from the above that the coordinator node has a primary and secondary  instance
       for  high  availability. When connecting to the coordinator, clients should try connecting
       to whichever instance is running and supports reads and writes.

       We can review the available Postgres URIs with the pg_autoctl show uri command:

          $ docker-compose exec monitor pg_autoctl show uri
                   Type |    Name | Connection String
           -------------+---------+-------------------------------
                monitor | monitor | postgres://autoctl_node@552dd89d5d63:5432/pg_auto_failover?sslmode=require
              formation | default | postgres://66a31034f2e4:5432,cd52db444544:5432/citus?target_session_attrs=read-write&sslmode=require

       To check that Citus worker nodes have been registered to the coordinator,  we  can  run  a
       psql session right from the coordinator container:

          $ docker-compose exec coord psql -d citus -c 'select * from citus_get_active_worker_nodes();'
            node_name   | node_port
          --------------+-----------
           dae7c062e2c1 |      5432
           5bf86f9ef784 |      5432
           c23610380024 |      5432
          (3 rows)

        We are now reaching the limits of using a simplified docker-compose setup. When using the
        --scale option, it is not possible to give a specific hostname to each running node, and
        we then get randomly generated strings instead of useful node names such as worker1a or
        worker3b.

   Create a Citus Cluster, take two
       In order to implement the following architecture, we need  to  introduce  a  more  complex
       docker-compose file than in the previous section.
         [image:  pg_auto_failover  architecture with a Citus formation] [image] pg_auto_failover
         architecture with a Citus formation.UNINDENT

         This time we create a cluster using the following docker-compose definition:

          version: "3.9"  # optional since v1.27.0

          x-coord: &coordinator
            image: pg_auto_failover:citus
            environment:
              PGDATA: /tmp/pgaf
              PGUSER: citus
              PGDATABASE: citus
              PG_AUTOCTL_HBA_LAN: true
              PG_AUTOCTL_AUTH_METHOD: "trust"
              PG_AUTOCTL_SSL_SELF_SIGNED: true
              PG_AUTOCTL_MONITOR: "postgresql://autoctl_node@monitor/pg_auto_failover"
            expose:
              - 5432

          x-worker: &worker
            image: pg_auto_failover:citus
            environment:
              PGDATA: /tmp/pgaf
              PGUSER: citus
              PGDATABASE: citus
              PG_AUTOCTL_HBA_LAN: true
              PG_AUTOCTL_AUTH_METHOD: "trust"
              PG_AUTOCTL_SSL_SELF_SIGNED: true
              PG_AUTOCTL_MONITOR: "postgresql://autoctl_node@monitor/pg_auto_failover"
            expose:
              - 5432

          services:
            app:
              build:
                context: .
                dockerfile: Dockerfile.app
              environment:
                PGUSER: citus
                PGDATABASE: citus
                PGHOST: coord0a,coord0b
                PGPORT: 5432
                PGAPPNAME: demo
                PGSSLMODE: require
                PGTARGETSESSIONATTRS: read-write

            monitor:
              image: pg_auto_failover:citus
              environment:
                PGDATA: /tmp/pgaf
                PG_AUTOCTL_SSL_SELF_SIGNED: true
              expose:
                - 5432
              command: |
                pg_autoctl create monitor --auth trust --run

            coord0a:
              <<: *coordinator
              hostname: coord0a
              command: |
                pg_autoctl create coordinator --name coord0a --run

            coord0b:
              <<: *coordinator
              hostname: coord0b
              command: |
                pg_autoctl create coordinator --name coord0b --run

            worker1a:
              <<: *worker
              hostname: worker1a
              command: |
                pg_autoctl create worker --group 1 --name worker1a --run

            worker1b:
              <<: *worker
              hostname: worker1b
              command: |
                pg_autoctl create worker --group 1 --name worker1b --run

            worker2a:
              <<: *worker
              hostname: worker2a
              command: |
                pg_autoctl create worker --group 2 --name worker2a --run

            worker2b:
              <<: *worker
              hostname: worker2b
              command: |
                pg_autoctl create worker --group 2 --name worker2b --run

            worker3a:
              <<: *worker
              hostname: worker3a
              command: |
                pg_autoctl create worker --group 3 --name worker3a --run

            worker3b:
              <<: *worker
              hostname: worker3b
              command: |
                pg_autoctl create worker --group 3 --name worker3b --run

        This definition is a little more involved than the previous one. We take advantage of YAML
        anchors and aliases to define a template for our coordinator nodes and worker nodes, and
        then apply that template to the actual nodes.

        Also this time we provision an application service (named "app") that sits in the
        background and allows us to later connect to our current primary coordinator. See
        Dockerfile.app for the complete definition of this service.

       We start this cluster with a simplified command line this time:

          $ docker-compose up

       And this time we get the following cluster as a result:

          $ docker-compose exec monitor pg_autoctl show state
              Name |  Node |     Host:Port |       TLI: LSN |   Connection |      Reported State |      Assigned State
          ---------+-------+---------------+----------------+--------------+---------------------+--------------------
           coord0a |   0/3 |  coord0a:5432 |   1: 0/312B040 |   read-write |             primary |             primary
           coord0b |   0/4 |  coord0b:5432 |   1: 0/312B040 |    read-only |           secondary |           secondary
          worker1a |   1/1 | worker1a:5432 |   1: 0/311C550 |   read-write |             primary |             primary
          worker1b |   1/2 | worker1b:5432 |   1: 0/311C550 |    read-only |           secondary |           secondary
          worker2b |   2/7 | worker2b:5432 |   2: 0/5032698 |   read-write |             primary |             primary
          worker2a |   2/8 | worker2a:5432 |   2: 0/5032698 |    read-only |           secondary |           secondary
          worker3a |   3/5 | worker3a:5432 |   1: 0/311C870 |   read-write |             primary |             primary
          worker3b |   3/6 | worker3b:5432 |   1: 0/311C870 |    read-only |           secondary |           secondary

       And then we have the following application connection string to use:

          $ docker-compose exec monitor pg_autoctl show uri
                  Type |    Name | Connection String
          -------------+---------+-------------------------------
               monitor | monitor | postgres://autoctl_node@f0135b83edcd:5432/pg_auto_failover?sslmode=require
             formation | default | postgres://coord0b:5432,coord0a:5432/citus?target_session_attrs=read-write&sslmode=require

       And finally, the nodes being registered as Citus worker nodes also make more sense:

          $ docker-compose exec coord0a psql -d citus -c 'select * from citus_get_active_worker_nodes()'
           node_name | node_port
          -----------+-----------
           worker1a  |      5432
           worker3a  |      5432
           worker2b  |      5432
          (3 rows)

       IMPORTANT:
          At this point, it is important to note that the Citus coordinator only knows about  the
          primary  nodes  in  each group. The High Availability mechanisms are all implemented in
          pg_auto_failover, which mostly uses the Citus API citus_update_node during worker  node
          failovers.
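
        One way to see this from the coordinator is to inspect the Citus metadata catalog
        pg_dist_node, which in this setup reflects only the primary node of each group (a
        sanity-check query, not required for operations):

           $ docker-compose exec coord0a psql -d citus -c 'select nodeid, groupid, nodename, nodeport from pg_dist_node;'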

   Our first Citus worker failover
        We see that in the citus_get_active_worker_nodes() output we have worker1a, worker2b, and
        worker3a. As mentioned before, that should have no impact on the operations of the Citus
        cluster when the nodes are all sized the same.

       That  said,  some  readers  among  you will prefer to have the A nodes as primaries to get
       started with. So let's implement our first worker failover  then.  With  pg_auto_failover,
       this is as easy as doing:

          $ docker-compose exec monitor pg_autoctl perform failover --group 2
          15:40:03 9246 INFO  Waiting 60 secs for a notification with state "primary" in formation "default" and group 2
          15:40:03 9246 INFO  Listening monitor notifications about state changes in formation "default" and group 2
          15:40:03 9246 INFO  Following table displays times when notifications are received
              Time |     Name |  Node |     Host:Port |       Current State |      Assigned State
          ---------+----------+-------+---------------+---------------------+--------------------
          22:58:42 | worker2b |   2/7 | worker2b:5432 |             primary |            draining
          22:58:42 | worker2a |   2/8 | worker2a:5432 |           secondary |   prepare_promotion
          22:58:42 | worker2a |   2/8 | worker2a:5432 |   prepare_promotion |   prepare_promotion
          22:58:42 | worker2a |   2/8 | worker2a:5432 |   prepare_promotion |        wait_primary
          22:58:42 | worker2b |   2/7 | worker2b:5432 |             primary |             demoted
          22:58:42 | worker2b |   2/7 | worker2b:5432 |            draining |             demoted
          22:58:42 | worker2b |   2/7 | worker2b:5432 |             demoted |             demoted
          22:58:43 | worker2a |   2/8 | worker2a:5432 |        wait_primary |        wait_primary
          22:58:44 | worker2b |   2/7 | worker2b:5432 |             demoted |          catchingup
          22:58:46 | worker2b |   2/7 | worker2b:5432 |          catchingup |          catchingup
          22:58:46 | worker2b |   2/7 | worker2b:5432 |          catchingup |           secondary
          22:58:46 | worker2b |   2/7 | worker2b:5432 |           secondary |           secondary
          22:58:46 | worker2a |   2/8 | worker2a:5432 |        wait_primary |             primary
          22:58:46 | worker2a |   2/8 | worker2a:5432 |             primary |             primary

        So it took around 5 seconds to do a full worker failover in worker group 2. Let's now
        review the resulting cluster state.

          $ docker-compose exec monitor pg_autoctl show state
              Name |  Node |     Host:Port |       TLI: LSN |   Connection |      Reported State |      Assigned State
          ---------+-------+---------------+----------------+--------------+---------------------+--------------------
           coord0a |   0/3 |  coord0a:5432 |   1: 0/312ADA8 |   read-write |             primary |             primary
           coord0b |   0/4 |  coord0b:5432 |   1: 0/312ADA8 |    read-only |           secondary |           secondary
          worker1a |   1/1 | worker1a:5432 |   1: 0/311B610 |   read-write |             primary |             primary
          worker1b |   1/2 | worker1b:5432 |   1: 0/311B610 |    read-only |           secondary |           secondary
          worker2b |   2/7 | worker2b:5432 |   2: 0/50000D8 |    read-only |           secondary |           secondary
          worker2a |   2/8 | worker2a:5432 |   2: 0/50000D8 |   read-write |             primary |             primary
          worker3a |   3/5 | worker3a:5432 |   1: 0/311B648 |   read-write |             primary |             primary
          worker3b |   3/6 | worker3b:5432 |   1: 0/311B648 |    read-only |           secondary |           secondary

        Seen from the Citus coordinator, this looks like the following:

          $ docker-compose exec coord0a psql -d citus -c 'select * from citus_get_active_worker_nodes()'
           node_name | node_port
          -----------+-----------
           worker1a  |      5432
           worker3a  |      5432
           worker2a  |      5432
          (3 rows)

   Distribute Data to Workers
       Let's create a database schema with a single distributed table.

          $ docker-compose exec app psql

          -- in psql

          CREATE TABLE companies
          (
            id         bigserial PRIMARY KEY,
            name       text NOT NULL,
            image_url  text,
            created_at timestamp without time zone NOT NULL,
            updated_at timestamp without time zone NOT NULL
          );

          SELECT create_distributed_table('companies', 'id');

       Next download and ingest some sample data, still from within our psql session:

          \copy companies from program 'curl -o- https://examples.citusdata.com/mt_ref_arch/companies.csv' with csv
          # ( COPY 75 )

   Handle Worker Failure
       Now we'll intentionally crash a worker's primary node and observe how the pg_auto_failover
       monitor unregisters that node in the coordinator and registers the secondary instead.

          # the pg_auto_failover keeper process will be unable to resurrect
          # the worker node if pg_control has been removed
          $ docker-compose exec worker1a rm /tmp/pgaf/global/pg_control

          # shut it down
          $ docker-compose exec worker1a /usr/lib/postgresql/14/bin/pg_ctl stop -D /tmp/pgaf

        The keeper will attempt to start worker1a three times and then report the failure to the
        monitor, which promotes worker1b to replace worker1a. Citus worker worker1a is
        unregistered from the coordinator node, and worker1b is registered in its stead.

       Asking the coordinator for active worker nodes now shows worker1b, worker2a, and worker3a:

          $ docker-compose exec app psql -c 'select * from master_get_active_worker_nodes();'

           node_name | node_port
          -----------+-----------
           worker3a  |      5432
           worker2a  |      5432
           worker1b  |      5432
          (3 rows)

       Finally, verify that all rows of data are still present:

          $ docker-compose exec app psql -c 'select count(*) from companies;'
           count
          -------
              75

        Meanwhile, the keeper on worker1a heals the node. It runs pg_basebackup to fetch the
        current PGDATA from worker1b, the new primary. This restores, among other things, a new
        copy of the file we removed. After streaming replication completes, worker1b becomes a
        full-fledged primary and worker1a its secondary.

          $ docker-compose exec monitor pg_autoctl show state
              Name |  Node |     Host:Port |       TLI: LSN |   Connection |      Reported State |      Assigned State
          ---------+-------+---------------+----------------+--------------+---------------------+--------------------
           coord0a |   0/3 |  coord0a:5432 |   1: 0/3178B20 |   read-write |             primary |             primary
           coord0b |   0/4 |  coord0b:5432 |   1: 0/3178B20 |    read-only |           secondary |           secondary
          worker1a |   1/1 | worker1a:5432 |   2: 0/504C400 |    read-only |           secondary |           secondary
          worker1b |   1/2 | worker1b:5432 |   2: 0/504C400 |   read-write |             primary |             primary
          worker2b |   2/7 | worker2b:5432 |   2: 0/50FF048 |    read-only |           secondary |           secondary
          worker2a |   2/8 | worker2a:5432 |   2: 0/50FF048 |   read-write |             primary |             primary
          worker3a |   3/5 | worker3a:5432 |   1: 0/31CD8C0 |   read-write |             primary |             primary
          worker3b |   3/6 | worker3b:5432 |   1: 0/31CD8C0 |    read-only |           secondary |           secondary

   Handle Coordinator Failure
       Because  our application connection string includes both coordinator hosts with the option
       target_session_attrs=read-write, the database client will connect to  whichever  of  these
       servers supports both reads and writes.

       However, if we use the same trick with the pg_control file to crash our primary
       coordinator, we can watch how the monitor promotes the secondary.

          $ docker-compose exec coord0a rm /tmp/pgaf/global/pg_control
          $ docker-compose exec coord0a /usr/lib/postgresql/14/bin/pg_ctl stop -D /tmp/pgaf

       After some time, coordinator A's keeper heals it, and the cluster converges in this state:

          $ docker-compose exec monitor pg_autoctl show state
              Name |  Node |     Host:Port |       TLI: LSN |   Connection |      Reported State |      Assigned State
          ---------+-------+---------------+----------------+--------------+---------------------+--------------------
           coord0a |   0/3 |  coord0a:5432 |   2: 0/50000D8 |    read-only |           secondary |           secondary
           coord0b |   0/4 |  coord0b:5432 |   2: 0/50000D8 |   read-write |             primary |             primary
          worker1a |   1/1 | worker1a:5432 |   2: 0/504C520 |    read-only |           secondary |           secondary
          worker1b |   1/2 | worker1b:5432 |   2: 0/504C520 |   read-write |             primary |             primary
          worker2b |   2/7 | worker2b:5432 |   2: 0/50FF130 |    read-only |           secondary |           secondary
          worker2a |   2/8 | worker2a:5432 |   2: 0/50FF130 |   read-write |             primary |             primary
          worker3a |   3/5 | worker3a:5432 |   1: 0/31CD8C0 |   read-write |             primary |             primary
          worker3b |   3/6 | worker3b:5432 |   1: 0/31CD8C0 |    read-only |           secondary |           secondary

       We can check that the data is still available through the new coordinator node too:

          $ docker-compose exec app psql -c 'select count(*) from companies;'
           count
          -------
              75

   Next steps
       As mentioned in the first section of this tutorial, the way we use docker-compose here is
       not meant to be production ready. It is useful for understanding and playing with a
       distributed system such as Citus, though, and it makes it simple to introduce faults and
       see how the pg_auto_failover High Availability reacts to them.

       One obvious limitation of this docker-compose based test rig is the lack of persistent
       volumes, which would allow better testing of the system. It is possible to create external
       volumes and use one for each node in the docker-compose definition, so that nodes can be
       restarted over the same data set.
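
       For instance, here is a sketch using the docker CLI and a hypothetical service and volume
       name, mounting the volume at the PGDATA used in this tutorial (/tmp/pgaf):

           $ docker volume create worker1a_data

           # then, in docker-compose.yml (names are examples):
           #
           #   services:
           #     worker1a:
           #       volumes:
           #         - worker1a_data:/tmp/pgaf
           #   volumes:
           #     worker1a_data:
           #       external: true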

       See the command pg_autoctl do tmux compose session for more details about how to run such
       a docker-compose test environment, including external volumes for each node.

       Now is a good time to go read Citus Documentation too, so that you know how  to  use  this
       cluster you just created!

FAILOVER STATE MACHINE

   Introduction
       pg_auto_failover  uses  a state machine for highly controlled execution. As keepers inform
       the monitor about new events (or fail to contact it at all), the monitor assigns each node
       both a current state and a goal state. A node's current state is a strong guarantee of its
       capabilities. States themselves do not cause any  actions;  actions  happen  during  state
       transitions. The assigned goal states inform keepers of what transitions to attempt.

   Example of state transitions in a new cluster
       A  good way to get acquainted with the states is by examining the transitions of a cluster
       from birth to high availability.

       After starting a monitor and running keeper init for the first data node ("node  A"),  the
       monitor registers the state of that node as "init" with a goal state of "single." The init
       state means the monitor knows nothing about the node other than its existence because  the
       keeper is not yet continuously running there to report node health.

       Once  the  keeper  runs  and reports its health to the monitor, the monitor assigns it the
       state "single," meaning it is just an ordinary Postgres server with no  failover.  Because
       there  are  not  yet  other nodes in the cluster, the monitor also assigns node A the goal
       state of single -- there's nothing that node A's keeper needs to change.

       As soon as a new node ("node B") is initialized, the monitor assigns node A the goal state
       of  "wait_primary."  This  means  the  node  still has no failover, but there's hope for a
       secondary to synchronize with it  soon.  To  accomplish  the  transition  from  single  to
       wait_primary, node A's keeper adds node B's hostname to pg_hba.conf to allow a hot standby
       replication connection.
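
       For illustration, the rule added by node A's keeper could look like the following (the
       hostname and authentication method here are assumptions; the actual values depend on node
       B's registration and on the --auth option in use):

           # hypothetical pg_hba.conf entry added on node A for node B
           host replication pgautofailover_replicator nodeB.example.com trust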

       At the same time, node B transitions into wait_standby with the goal initially of  staying
       in  wait_standby. It can do nothing but wait until node A gives it access to connect. Once
       node A has transitioned to wait_primary, the monitor assigns B the goal  of  "catchingup,"
       which  gives  B's  keeper  the  green  light  to  make the transition from wait_standby to
       catchingup. This transition involves  running  pg_basebackup,  editing  recovery.conf  and
       restarting PostgreSQL in Hot Standby mode.

       Node B reports to the monitor when it's in hot standby mode and able to connect to node A.
       The monitor then assigns node B the goal state of "secondary" and A the goal of "primary."
       Postgres  ships  WAL  logs  from  node A and replays them on B. Finally B is caught up and
       tells the monitor (specifically  B  reports  its  pg_stat_replication.sync_state  and  WAL
       replay  lag).  At  this  glorious  moment  the  monitor assigns A the state primary (goal:
       primary) and B secondary (goal: secondary).
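
       To observe what node A's keeper reports at this point, a query such as the following can
       be run on the primary (a sketch; the replay_lag column is available in Postgres 10 and
       later):

           -- on node A, inspect the streaming replication status
           SELECT application_name, sync_state, replay_lag
             FROM pg_stat_replication;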

   State reference
       The following diagram shows the pg_auto_failover State Machine. It's missing links to  the
       single state, which can always be reached by removing all the other nodes.
         [image:  pg_auto_failover  Finite State Machine diagram] [image] pg_auto_failover Finite
         State Machine diagram.UNINDENT

          In the previous diagram we can see the list of states where the application can connect
          to a read-write Postgres service: single, wait_primary, primary, prepare_maintenance,
          and apply_settings.

   Init
       A node is assigned the "init" state when it is first registered with the monitor.  Nothing
       is  known  about  the  node at this point beyond its existence.  If no other node has been
       registered with the monitor for the same formation and group ID then this node is assigned
       a goal state of "single." Otherwise the node has the goal state of "wait_standby."

   Single
       There  is only one node in the group. It behaves as a regular PostgreSQL instance, with no
       high availability and no failover. If the administrator removes a node, the remaining node
       will revert to the single state.

   Wait_primary
       Applied  to  a  node  intended  to  be  the  primary  but  not  yet in that position.  The
       primary-to-be at this point knows the secondary's node name or IP address, and has granted
       the node hot standby access in the pg_hba.conf file.

       The  wait_primary state may be caused either by a new potential secondary being registered
       with the monitor (good), or an existing secondary becoming unhealthy (bad). In the  latter
       case,  during  the  transition  from  primary  to  wait_primary, the primary node's keeper
       disables synchronous replication on the node. It also cancels currently blocked queries.

   Join_primary
       Applied to a primary node when another standby is  joining  the  group.  This  allows  the
       primary  node  to  apply  necessary  changes to its HBA setup before allowing the new node
       joining the system to run the pg_basebackup command.

       IMPORTANT:
          This state has been deprecated, and is no longer assigned to nodes. Any time  we  would
          have used join_primary before, we now use primary instead.

   Primary
       A healthy secondary node exists and has caught up with WAL replication.  Specifically, the
       keeper reports the primary state only when it has verified that the secondary is  reported
       "sync" in pg_stat_replication.sync_state, and with a WAL lag of 0.

       The  primary  state  is  a strong assurance. It's the only state where we know we can fail
       over when required.

       During the transition from wait_primary to primary, the keeper  also  enables  synchronous
       replication. This means that after a failover the secondary will be fully up to date.

   Wait_standby
       The monitor decides this node is a standby. The node must wait until the primary has
       authorized it to connect and set up hot standby replication.

   Catchingup
       The monitor assigns catchingup to the standby  node  when  the  primary  is  ready  for  a
       replication connection (pg_hba.conf has been properly edited, connection role added, etc).

       The standby node keeper runs pg_basebackup, connecting to the primary's hostname and port.
       The keeper then edits recovery.conf and starts PostgreSQL in hot standby mode.

   Secondary
       A node with this state is acting as a hot standby for the primary, and is up to date  with
       the WAL log there. In particular, it is within 16MB or 1 WAL segment of the primary.

   Maintenance
       The  cluster  administrator  can  manually  move a secondary into the maintenance state to
       gracefully take it offline. The  primary  will  then  transition  from  state  primary  to
       wait_primary,  during  which  time the secondary will be online to accept writes. When the
       old primary reaches the wait_primary state then the secondary is safe to take offline with
       minimal consequences.
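
       In practice, maintenance on a node is typically driven with the pg_autoctl enable
       maintenance and pg_autoctl disable maintenance commands (see the Manual Pages section); a
       minimal sketch, assuming the node's data directory path:

           # put this node into maintenance, then bring it back online
           $ pg_autoctl enable maintenance --pgdata /var/lib/postgresql/data
           $ pg_autoctl disable maintenance --pgdata /var/lib/postgresql/data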

   Prepare_maintenance
       The  cluster  administrator can manually move a primary node into the maintenance state to
       gracefully take it offline. The primary then transitions to the prepare_maintenance  state
       to  make  sure  the secondary is not missing any writes. In the prepare_maintenance state,
       the primary shuts down.

   Wait_maintenance
       The cluster administrator can manually move a secondary into the maintenance state to
       gracefully  take  it  offline.  Before  reaching  the maintenance state though, we want to
       switch the primary node to asynchronous  replication,  in  order  to  avoid  writes  being
       blocked.  In  the  state  wait_maintenance the standby waits until the primary has reached
       wait_primary.

   Draining
       A state between primary and demoted where replication buffers finish flushing. A  draining
       node  will  not  accept  new client writes, but will continue to send existing data to the
       secondary.

       To implement that with Postgres we actually stop  the  service.  When  stopping,  Postgres
       ensures  that  the  current  replication  buffers  are  flushed  correctly  to synchronous
       standbys.

   Demoted
       The primary keeper or its database were unresponsive past a certain threshold. The monitor
       assigns  demoted state to the primary to avoid a split-brain scenario where there might be
       two nodes that don't communicate with each other and both accept client writes.

       In that state the keeper stops PostgreSQL and prevents it from running.

   Demote_timeout
       If the monitor assigns the primary a demoted goal state but  the  primary  keeper  doesn't
       acknowledge  transitioning to that state within a timeout window, then the monitor assigns
       demote_timeout to the primary.

       This most commonly happens when the primary machine goes silent and the keeper stops
       reporting to the monitor.

   Stop_replication
       The  stop_replication  state is meant to ensure that the primary goes to the demoted state
       before the standby goes to single and accepts writes (in case the  primary  can’t  contact
       the  monitor anymore). Before promoting the secondary node, the keeper stops PostgreSQL on
       the primary to avoid split-brain situations.

       For safety, when  the  primary  fails  to  contact  the  monitor  and  fails  to  see  the
       pg_auto_failover  connection  in pg_stat_replication, then it goes to the demoted state of
       its own accord.

   Prepare_promotion
       The prepare_promotion state is meant to prepare the standby server to be promoted. This
       state  allows  synchronisation  on  the  monitor, making sure that the primary has stopped
       Postgres before promoting the secondary, hence preventing split brain situations.

   Report_LSN
       The report_lsn state is assigned to standby nodes when  a  failover  is  orchestrated  and
       there are several standby nodes. In order to pick the standby that is the most advanced
       in the replication stream, pg_auto_failover first needs a fresh report of the current LSN
       position reached on each standby node.

       When a node reaches the report_lsn state, the replication stream is stopped, by restarting
       Postgres without a primary_conninfo. This  allows  the  primary  node  to  detect  Network
       Partitions,  i.e.  when  the  primary  can't connect to the monitor and there's no standby
       listed in pg_stat_replication.

   Fast_forward
       The fast_forward state is assigned to the selected promotion candidate during  a  failover
       when  it won the election thanks to the candidate priority settings, but the selected node
       is not the most advanced standby node as reported in the report_lsn state.

       Missing WAL bytes are fetched from one  of  the  most  advanced  standby  nodes  by  using
       Postgres  cascading  replication  features:  it is possible to use any standby node in the
       primary_conninfo.

   Dropped
       The dropped state is assigned to a node when the pg_autoctl drop  node  command  is  used.
       This  allows  the  node  to implement specific local actions before being entirely removed
       from the monitor database.

       When a node reports reaching the dropped state, the monitor removes its entry. If  a  node
       is not reporting anymore, maybe because it's completely unavailable, then it's possible to
       run the pg_autoctl drop node --force command, and then the node entry is removed from  the
       monitor.

   pg_auto_failover keeper's State Machine
       When pg_autoctl is built in TEST mode, it is possible to use the following command to get
       a visual representation of the Keeper's Finite State Machine:

          $ PG_AUTOCTL_DEBUG=1 pg_autoctl do fsm gv | dot -Tsvg > fsm.svg

       The dot program is part of the Graphviz suite and produces the following output:
         [image: Keeper state machine] [image] Keeper State Machine.UNINDENT

FAILOVER AND FAULT TOLERANCE

       At the heart of the pg_auto_failover implementation is a State Machine. The state  machine
       is driven by the monitor, and its transitions are implemented in the keeper service, which
       then reports success to the monitor.

       The keeper is allowed to retry transitions as many times as needed until they succeed, and
       it also reports failures to reach the assigned state to the monitor node. The monitor also
       implements frequent health-checks targeting the registered PostgreSQL nodes.

       When the monitor detects something is not as expected, it takes action by assigning a  new
       goal state to the keeper, which is responsible for implementing the transition to this new
       state and then reporting back.

   Unhealthy Nodes
       The pg_auto_failover monitor is responsible for running regular health-checks  with  every
       PostgreSQL node it manages. A health-check is successful when it is able to connect to the
       PostgreSQL node using the PostgreSQL protocol (libpq), imitating the pg_isready command.

       How frequently those health checks run (every 5 seconds by default), the PostgreSQL
       connection timeout in use (5 seconds by default), and how many times to retry in case of a
       failure before marking the node unhealthy (2 by default) are GUC variables that you can
       set on the Monitor node itself. Remember, the monitor is implemented as a PostgreSQL
       extension, so the setup is a set of PostgreSQL configuration settings:

            SELECT name, setting
              FROM pg_settings
             WHERE name ~ 'pgautofailover\.health';
                            name                   | setting
          -----------------------------------------+---------
           pgautofailover.health_check_max_retries | 2
           pgautofailover.health_check_period      | 5000
           pgautofailover.health_check_retry_delay | 2000
           pgautofailover.health_check_timeout     | 5000
          (4 rows)
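
       To change one of those settings you can use the usual Postgres mechanisms on the monitor
       node; for instance, a minimal sketch using ALTER SYSTEM (a restart may be needed for some
       settings to take effect, so check the monitor logs):

           -- on the monitor: allow one more failed health check before
           -- marking a node unhealthy, then reload the configuration
           ALTER SYSTEM SET pgautofailover.health_check_max_retries = 3;
           SELECT pg_reload_conf();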

       The pg_auto_failover keeper also reports whether PostgreSQL is running as expected. This
       is useful for situations where the PostgreSQL server / OS is running fine and the keeper
       (pg_autoctl run) is still active, but PostgreSQL has failed. Such situations include a
       full file system on the WAL disk, file system level corruption, missing files, etc.

       Here's what happens to your PostgreSQL service when any single-node failure is observed:

          • Primary node is monitored unhealthy

            When  the  primary  node  is unhealthy, and only when the secondary node is itself in
            good health, then the primary node is asked to transition to the DRAINING state,  and
            the attached secondary is asked to transition to the state PREPARE_PROMOTION. In this
            state, the secondary is asked to catch-up with the WAL traffic from the primary,  and
            then report success.

            The  monitor  then continues orchestrating the promotion of the standby: it stops the
            primary (implementing STONITH in order to prevent any data loss),  and  promotes  the
            secondary into being a primary now.

             Depending on the exact situation that caused the primary to be unhealthy, the
             secondary might fail to catch up with its WAL; in that case, after the
             PREPARE_PROMOTION_CATCHUP_TIMEOUT has elapsed, the standby reports success anyway,
             and the failover sequence continues from the monitor.

          • Secondary node is monitored unhealthy

            When the secondary node is unhealthy, the monitor assigns to it the state CATCHINGUP,
            and  assigns  the  state  WAIT_PRIMARY  to  the  primary  node. When implementing the
            transition from PRIMARY to WAIT_PRIMARY, the keeper disables synchronous replication.

             When the keeper reports an acceptable WAL difference between the two nodes again,
             the replication is upgraded back to being synchronous. While a secondary node is not
             in the SECONDARY state, its promotion is disabled.

          • Monitor node has failed

             Then the primary and secondary nodes just work as if pg_auto_failover had not been
             set up in the first place: the keepers fail to report local state from the nodes,
             and health checks are not performed. This means that no automated failover can
             happen, even if needed.

   Network Partitions
       Adding to those simple situations, pg_auto_failover is also resilient to Network
       Partitions. Here's the list of situations that have an impact on pg_auto_failover
       behavior, and the actions taken to ensure High Availability of your PostgreSQL service:

          • Primary can't connect to Monitor

             It could be either that the primary is alone on its side of a network split, or
             that the monitor has failed. The keeper decides depending on whether the secondary
             node is still connected to its replication slot; if it is, the primary continues to
             serve PostgreSQL queries.

             Otherwise, when the secondary isn't connected, and after the
             NETWORK_PARTITION_TIMEOUT has elapsed, the primary considers it might be alone in a
             network partition: that's a potential split-brain situation, and there is only one
             way to prevent it. The primary stops, and reports a new state of DEMOTE_TIMEOUT.

             The network_partition_timeout can be set up in the keeper's configuration and
             defaults to 20s.

          • Monitor can't connect to Primary

             Once all the retries have been done and the timeouts have elapsed, the primary node
             is considered unhealthy, and the monitor begins the failover routine. This routine
             has several steps, each of which allows us to check our expectations and step back
             if needed.

             For the failover to happen, the secondary node needs to be healthy and caught-up
             with the primary. Only when the wait for the WAL delta to be resorbed times out (30s
             by default) is the secondary promoted with uncertainty about the data durability in
             the group.

          • Monitor can't connect to Secondary

             As soon as the secondary is considered unhealthy, the monitor changes the
             replication setting to asynchronous on the primary, by assigning it the WAIT_PRIMARY
             state. The secondary is also assigned the state CATCHINGUP, which means it can't be
             promoted in case of primary failure.

             As the monitor tracks the WAL delta between the two servers, and both report it
             independently, the standby becomes eligible for promotion again as soon as it has
             caught up with the primary. At that point it is assigned the SECONDARY state, and
             replication is switched back to synchronous.

   Failure handling and network partition detection
       If a node cannot communicate to the monitor, either because the monitor is down or because
       there  is  a  problem  with the network, it will simply remain in the same state until the
       monitor comes back.

       If there is a network partition, it might be that the  monitor  and  secondary  can  still
       communicate  and  the  monitor  decides  to  promote the secondary since the primary is no
       longer responsive. Meanwhile, the primary is still up-and-running on the other side of the
       network  partition.  If  a  primary  cannot  communicate to the monitor it starts checking
       whether the  secondary  is  still  connected.  In  PostgreSQL,  the  secondary  connection
       automatically  times  out  after 30 seconds. If last contact with the monitor and the last
       time a connection from the secondary was observed are both more than  30  seconds  in  the
       past,  the  primary  concludes  it  is on the losing side of a network partition and shuts
       itself down. It may be that the secondary and the  monitor  were  actually  down  and  the
       primary  was  the  only  node  that  was  alive,  but  we  currently  do not have a way to
       distinguish such a situation.  As with consensus  algorithms,  availability  can  only  be
       correctly preserved if at least 2 out of 3 nodes are up.

       In  asymmetric  network  partitions,  the  primary  might  still  be  able  to talk to the
       secondary, while unable to talk to the monitor. During  failover,  the  monitor  therefore
       assigns  the  secondary the stop_replication state, which will cause it to disconnect from
       the primary. After that, the primary is expected to shut down after at  least  30  and  at
       most  60  seconds.  To  factor  in  worst-case scenarios, the monitor waits for 90 seconds
       before promoting the secondary to become the new primary.

INSTALLING PG_AUTO_FAILOVER

       We  provide  native  system  packages  for  pg_auto_failover   on   most   popular   Linux
       distributions.

       Use the steps below to install pg_auto_failover alongside the Postgres version used by
       your nodes. The examples below use Postgres 14 packages on debian based systems and
       Postgres 12 packages on RPM based systems.

   Ubuntu or Debian
   Postgres apt repository
       Binary packages for debian and derivatives (ubuntu) are available from the
       apt.postgresql.org repository. Install them by following the linked documentation and then
       running:

          $ sudo apt-get install pg-auto-failover-cli
          $ sudo apt-get install postgresql-14-auto-failover

       The  Postgres  extension  named "pgautofailover" is only necessary on the monitor node. To
       install that extension, you can install the postgresql-14-auto-failover package when using
       Postgres 14. It's available for other Postgres versions too.

   Avoiding the default Postgres service
       When installing the debian Postgres package, the installation script will initialize a
       Postgres data directory automatically, and register it with systemd. When using
       pg_auto_failover, it is best to avoid that step.

       To avoid the automated creation of a Postgres data directory when installing the debian
       package, follow these steps:

           $ curl https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
           $ echo "deb http://apt.postgresql.org/pub/repos/apt buster-pgdg main" | sudo tee /etc/apt/sources.list.d/pgdg.list

           # bypass initdb of a "main" cluster
           $ echo 'create_main_cluster = false' | sudo tee -a /etc/postgresql-common/createcluster.conf
           $ sudo apt-get update
           $ sudo apt-get install -y --no-install-recommends postgresql-14

       That way when it's time to pg_autoctl create monitor or pg_autoctl create  postgres  there
       is no confusion about how to handle the default Postgres service created by debian: it has
       not been created at all.

   Fedora, CentOS, or Red Hat
   Quick install
       The following installation method downloads a bash script that automates several steps.
       The full script is available for review at our package cloud installation instructions
       page.

          # add the required packages to your system
          curl https://install.citusdata.com/community/rpm.sh | sudo bash

          # install pg_auto_failover
          sudo yum install -y pg-auto-failover14_12

          # confirm installation
          /usr/pgsql-12/bin/pg_autoctl --version

   Manual installation
       If you'd prefer to add the repository to your system manually, follow the instructions
       from the package cloud manual installation page. This page will guide you through the
       specific details to achieve these 3 steps:

           1. install the pygpgme and yum-utils packages for your distribution

           2. install a new RPM repository for CitusData packages

           3. update your local yum cache

       Then when that's done, you can proceed with installing pg_auto_failover itself as  in  the
       previous case:

          # install pg_auto_failover
          sudo yum install -y pg-auto-failover14_12

          # confirm installation
          /usr/pgsql-12/bin/pg_autoctl --version

   Installing a pgautofailover Systemd unit
       The command pg_autoctl show systemd outputs a systemd unit file that you can use to set up
       a boot-time registered service for pg_auto_failover on your machine.

       Here's a sample output from the command:

          $ export PGDATA=/var/lib/postgresql/monitor
          $ pg_autoctl show systemd
          13:44:34 INFO  HINT: to complete a systemd integration, run the following commands:
          13:44:34 INFO  pg_autoctl -q show systemd --pgdata "/var/lib/postgresql/monitor" | sudo tee /etc/systemd/system/pgautofailover.service
          13:44:34 INFO  sudo systemctl daemon-reload
          13:44:34 INFO  sudo systemctl start pgautofailover
          [Unit]
          Description = pg_auto_failover

          [Service]
          WorkingDirectory = /var/lib/postgresql
          Environment = 'PGDATA=/var/lib/postgresql/monitor'
          User = postgres
          ExecStart = /usr/lib/postgresql/10/bin/pg_autoctl run
          Restart = always
          StartLimitBurst = 0

          [Install]
          WantedBy = multi-user.target

       Copy/pasting the commands given in the hint output will enable the pgautofailover service
       on your system, when using systemd.

       It is important that PostgreSQL is started by pg_autoctl rather than by systemd itself, as
       it might be that a failover has been done during a reboot, for instance, and that once the
       reboot completes we want the local Postgres to re-join as a secondary node where it used to
       be a primary node.
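
       To also have the service start at boot time, enable it with systemctl (assuming the unit
       file has been installed as pgautofailover.service, as in the hint above):

           $ sudo systemctl enable pgautofailover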

SECURITY SETTINGS FOR PG_AUTO_FAILOVER

       In order to be able to orchestrate fully automated failovers, pg_auto_failover needs to be
       able to establish the following Postgres connections:

          • from the monitor node to each Postgres node to check the node's “health”

          • from  each  Postgres  node  to  the monitor to implement our node_active protocol and
            fetch the current assigned state for this node

          • from the secondary node to the primary node for Postgres streaming replication.

       Postgres Client authentication is controlled by a configuration  file:  pg_hba.conf.  This
       file contains a list of rules where each rule may allow or reject a connection attempt.

       For pg_auto_failover to work as intended, some HBA rules need to be added to each node's
       configuration. You can choose to provision the pg_hba.conf file yourself and pass the
       pg_autoctl option --skip-pg-hba, or you can use the following options to control which
       kind of rules are going to be added for you.

   Postgres HBA rules
       For your application to be able to connect to the current Postgres primary  servers,  some
       application  specific HBA rules have to be added to pg_hba.conf. There is no provision for
       doing that in pg_auto_failover.

       In other words, it is expected that you have to edit pg_hba.conf to open  connections  for
       your application needs.
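
       For example, an application-facing rule added to pg_hba.conf might look like the following
       (the database, role, network, and authentication method are hypothetical):

           # allow the application role to connect to its database
           host appdb appuser 10.0.0.0/8 scram-sha-256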

   The trust security model
       As its name suggests, the trust security model does not enable any kind of security
       validation. This setting is popular for testing deployments though, as it makes it very
       easy to verify that everything works as intended before putting security restrictions in
       place.

       To enable a “trust” security model with pg_auto_failover, use the pg_autoctl option --auth
       trust when creating nodes:

          $ pg_autoctl create monitor --auth trust ...
          $ pg_autoctl create postgres --auth trust ...
          $ pg_autoctl create postgres --auth trust ...

       When  using  --auth  trust  pg_autoctl  adds new HBA rules in the monitor and the Postgres
       nodes to enable connections as seen above.

   Authentication with passwords
       To setup pg_auto_failover with password for connections, you can use one of  the  password
       based  authentication methods supported by Postgres, such as password or scram-sha-256. We
       recommend the latter, as in the following example:

          $ pg_autoctl create monitor --auth scram-sha-256 ...

       The pg_autoctl command does not set the password for you. The first step is to set the
       database user password in the monitor database, using the following command:

          $ psql postgres://monitor.host/pg_auto_failover
          > alter user autoctl_node password 'h4ckm3';

       Now  that the monitor is ready with our password set for the autoctl_node user, we can use
       the password in the monitor connection string used when creating Postgres nodes.

       On the primary node, we can create the Postgres setup as usual, and then set our
       replication password, which we will use if we are demoted and then re-join as a standby:

          $ pg_autoctl create postgres       \
                 --auth scram-sha-256        \
                 ...                         \
                 --monitor postgres://autoctl_node:h4ckm3@monitor.host/pg_auto_failover

          $ pg_autoctl config set replication.password h4ckm3m0r3

       The second Postgres node is going to be initialized as a secondary, and pg_autoctl then
       calls pg_basebackup at create time. We need to have the replication password already set
       at this time, and we can achieve that in the following way:

          $ export PGPASSWORD=h4ckm3m0r3
          $ pg_autoctl create postgres       \
                 --auth scram-sha-256        \
                 ...                         \
                 --monitor postgres://autoctl_node:h4ckm3@monitor.host/pg_auto_failover

          $ pg_autoctl config set replication.password h4ckm3m0r3

       Note  that  you  can  use  The  Password  File  mechanism  as  discussed  in  the Postgres
       documentation in order to maintain your passwords in a separate file,  not  in  your  main
       pg_auto_failover  configuration  file. This also avoids using passwords in the environment
       and in command lines.
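
       For instance, a ~/.pgpass file covering both the monitor connection and the streaming
       replication connection could look like the following sketch (hostnames and passwords match
       the examples above); remember to make the file readable only by its owner (chmod 0600):

           # hostname:port:database:username:password
           monitor.host:5432:pg_auto_failover:autoctl_node:h4ckm3
           *:5432:replication:pgautofailover_replicator:h4ckm3m0r3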

   Encryption of network communications
       Postgres knows how to  use  SSL  to  enable  network  encryption  of  all  communications,
       including  authentication with passwords and the whole data set when streaming replication
       is used.

       To enable SSL on the server an SSL certificate is needed. It could be as simple as a
       self-signed certificate, and pg_autoctl creates such a certificate for you when using the
       --ssl-self-signed command line option:

          $ pg_autoctl create monitor --ssl-self-signed ...      \
                                      --auth scram-sha-256 ...   \
                                      --ssl-mode require         \
                                      ...

          $ pg_autoctl create postgres --ssl-self-signed ...      \
                                       --auth scram-sha-256 ...   \
                                       ...

          $ pg_autoctl create postgres --ssl-self-signed ...      \
                                       --auth scram-sha-256 ...   \
                                       ...

       In that example we set up SSL connections to encrypt the network traffic, and we still
       have to set up an authentication mechanism exactly as in the previous sections of this
       document. Here scram-sha-256 has been selected, and the password will be sent over an
       encrypted channel.

       When  using the --ssl-self-signed option, pg_autoctl creates a self-signed certificate, as
       per the Postgres documentation at the Creating Certificates page.

       The certificate subject CN defaults to  the  --hostname  parameter,  which  can  be  given
       explicitly  or  computed  by  pg_autoctl  as either your hostname when you have proper DNS
       resolution, or your current IP address.
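
       For reference, the generated certificate is roughly equivalent to what the following
       openssl invocation from the Postgres documentation produces (the hostname used as CN is an
       example):

           # create a self-signed server certificate with the node's hostname as CN
           $ openssl req -new -x509 -days 365 -nodes -text \
                 -out server.crt -keyout server.key \
                 -subj "/CN=node1.example.com"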

       Self-signed certificates provide protection against eavesdropping;  this  setup  does  NOT
       protect  against  Man-In-The-Middle  attacks  nor  Impersonation  attacks.  See PostgreSQL
       documentation page SSL Support for details.

   Using your own SSL certificates
       In many cases you will want to  install  certificates  provided  by  your  local  security
       department  and signed by a trusted Certificate Authority. In that case one solution is to
       use --skip-pg-hba and do the whole setup yourself.

       It is still possible to give the certificates to pg_auto_failover and have it  handle  the
       Postgres setup for you:

          $ pg_autoctl create monitor --ssl-ca-file root.crt   \
                                      --ssl-crl-file root.crl  \
                                      --server-cert server.crt  \
                                      --server-key server.key  \
                                      --ssl-mode verify-full \
                                      ...

          $ pg_autoctl create postgres --ssl-ca-file root.crt   \
                                       --server-cert server.crt  \
                                       --server-key server.key  \
                                       --ssl-mode verify-full \
                                       ...

          $ pg_autoctl create postgres --ssl-ca-file root.crt   \
                                       --server-cert server.crt  \
                                       --server-key server.key  \
                                       --ssl-mode verify-full \
                                       ...

       The  option  --ssl-mode  can  be  used  to  force connection strings used by pg_autoctl to
       contain your preferred ssl mode. It defaults to require when using  --ssl-self-signed  and
       to allow when --no-ssl is used.  Here, we set --ssl-mode to verify-full which requires SSL
       Certificates Authentication, covered next.

       The default --ssl-mode when providing your own certificates (signed by your trusted CA) is
       then verify-full. This setup applies to the client connection where the server identity is
       going to be checked against the root  certificate  provided  with  --ssl-ca-file  and  the
       revocation  list optionally provided with the --ssl-crl-file. Both those files are used as
       the respective parameters sslrootcert and sslcrl in pg_autoctl connection strings to  both
       the monitor and the streaming replication primary server.

   SSL Certificates Authentication
       Given  those  files, it is then possible to use certificate based authentication of client
       connections. For that, it is necessary to prepare client certificates signed by your  root
       certificate  private  key  and  using  the  target  user  name  as its CN, as per Postgres
       documentation for Certificate Authentication:
          The cn (Common Name) attribute of the certificate will be  compared  to  the  requested
          database user name, and if they match the login will be allowed

       For enabling the cert authentication method with pg_auto_failover, you need to prepare a
       client certificate for the user postgres, used by pg_autoctl when connecting to the
       monitor, and to place it in ~/.postgresql/postgresql.crt along with its key
       ~/.postgresql/postgresql.key, in the home directory of the user that runs the pg_autoctl
       service (which defaults to postgres).
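
       A minimal sketch for preparing such a client certificate, assuming the root CA private key
       is available as root.key next to the root.crt used for --ssl-ca-file (file names are
       assumptions):

           # create a client key and a certificate signing request with CN "postgres"
           $ openssl req -new -nodes \
                 -keyout ~/.postgresql/postgresql.key \
                 -out postgresql.csr -subj "/CN=postgres"

           # sign it with the root CA
           $ openssl x509 -req -in postgresql.csr -days 365 \
                 -CA root.crt -CAkey root.key -CAcreateserial \
                 -out ~/.postgresql/postgresql.crt

           $ chmod 0600 ~/.postgresql/postgresql.key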

       Then  you  need to create a user name map as documented in Postgres page User Name Maps so
       that your certificate can be used to authenticate pg_autoctl users.

       The ident map in pg_ident.conf on  the  pg_auto_failover  monitor  should  then  have  the
       following  entry,  to  allow  postgres  to connect as the autoctl_node user for pg_autoctl
       operations:

          # MAPNAME       SYSTEM-USERNAME         PG-USERNAME

          # pg_autoctl runs as postgres and connects to the monitor autoctl_node user
          pgautofailover   postgres               autoctl_node

       To enable streaming replication, the pg_ident.conf file on each Postgres node  should  now
       allow    the   postgres   user   in   the   client   certificate   to   connect   as   the
       pgautofailover_replicator database user:

          # MAPNAME       SYSTEM-USERNAME         PG-USERNAME

          # pg_autoctl runs as postgres and connects to the monitor autoctl_node user
          pgautofailover  postgres                pgautofailover_replicator

       Given that user name map, you can then use the cert authentication  method.  As  with  the
       pg_ident.conf  provisioning, it is best to now provision the HBA rules yourself, using the
       --skip-pg-hba option:

          $ pg_autoctl create postgres --skip-pg-hba --ssl-ca-file ...

       The HBA rule will use the authentication method cert with a map  option,  and  might  then
       look like the following on the monitor:

          # allow certificate based authentication to the monitor
          hostssl pg_auto_failover autoctl_node 10.0.0.0/8 cert map=pgautofailover

       Then  your  pg_auto_failover  nodes  on the 10.0.0.0 network are allowed to connect to the
       monitor with the user autoctl_node used by pg_autoctl, assuming  they  have  a  valid  and
       trusted client certificate.

       The  HBA  rule  to  use  on the Postgres nodes to allow for Postgres streaming replication
       connections looks like the following:

          # allow streaming replication for pg_auto_failover nodes
          hostssl replication pgautofailover_replicator 10.0.0.0/8 cert map=pgautofailover

       Because the Postgres server runs as the  postgres  system  user,  the  connection  to  the
       primary  node  can  be  made  with  SSL  enabled and will then use the client certificates
       installed in the postgres user's home directory, at ~/.postgresql/postgresql.{key,crt}.

   Postgres HBA provisioning
       While pg_auto_failover knows how to manage the Postgres HBA rules that are  necessary  for
       your  stream  replication  needs  and  for  its  monitor  protocol, it will not manage the
       Postgres HBA rules that are needed for your applications.

       If you have your own HBA provisioning solution, you  can  include  the  rules  needed  for
       pg_auto_failover and then use the --skip-pg-hba option to the pg_autoctl create commands.

   Enable SSL connections on an existing setup
       Whether you upgrade pg_auto_failover from a previous version that did not have support for
       the SSL features, or you started with --no-ssl and later change your mind, it is possible
       with pg_auto_failover to add SSL settings to a system that has already been set up without
       explicit SSL support.

       In this section we detail how to upgrade to SSL settings.

       Installing SSL certificates on top of an already existing pg_auto_failover setup is done
       with one of the following pg_autoctl command variants, depending on whether you want
       self-signed certificates or fully verified SSL certificates:

           $ pg_autoctl enable ssl --ssl-self-signed --ssl-mode require

          $ pg_autoctl enable ssl --ssl-ca-file root.crt   \
                                  --ssl-crl-file root.crl  \
                                  --server-cert server.crt  \
                                  --server-key server.key  \
                                  --ssl-mode verify-full

       The  pg_autoctl  enable  ssl  command  edits  the  postgresql-auto-failover.conf  Postgres
       configuration file to match the command line arguments given and enable SSL as instructed,
       and then updates the pg_autoctl configuration.

       The connection string to connect to the monitor  is  also  automatically  updated  by  the
       pg_autoctl enable ssl command. You can verify your new configuration with:

          $ pg_autoctl config get pg_autoctl.monitor

       Note  that an already running pg_autoctl daemon will try to reload its configuration after
       pg_autoctl enable ssl has finished. In some cases this is not possible  to  do  without  a
       restart.  So  be  sure  to check the logs from a running daemon to confirm that the reload
       succeeded. If it did not you may need to restart the daemon to ensure the  new  connection
       string is used.

       The HBA settings are not edited, irrespective of the --skip-pg-hba option that was used at
       creation time. That's because the host records match either SSL or non-SSL connection
       attempts in the Postgres HBA file, so the pre-existing setup will continue to work. To
       enhance the SSL setup, you can manually edit the HBA files and change the existing lines
       from host to hostssl to disallow unencrypted connections at the server side.
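
       For instance, a streaming replication rule would change as follows (the network and
       authentication method are examples):

           # before: matches both SSL and non-SSL connection attempts
           host    replication pgautofailover_replicator 10.0.0.0/8 scram-sha-256

           # after: only SSL connection attempts are accepted
           hostssl replication pgautofailover_replicator 10.0.0.0/8 scram-sha-256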

       In summary, to upgrade an existing pg_auto_failover setup to enable SSL:

          1. run  the  pg_autoctl  enable  ssl  command on your monitor and then all the Postgres
             nodes,

          2. on the Postgres nodes, review your pg_autoctl logs to  make  sure  that  the  reload
             operation  has  been effective, and review your Postgres settings to verify that you
             have the expected result,

          3. review your HBA rules setup to  change  the  pg_auto_failover  rules  from  host  to
             hostssl to disallow insecure connections.

MANUAL PAGES

       The pg_autoctl tool hosts many commands and sub-commands. Each of them has its own manual
       page.

   pg_autoctl
       pg_autoctl - control a pg_auto_failover node

   Synopsis
       pg_autoctl provides the following commands:

          pg_autoctl
          + create    Create a pg_auto_failover node, or formation
          + drop      Drop a pg_auto_failover node, or formation
          + config    Manages the pg_autoctl configuration
          + show      Show pg_auto_failover information
          + enable    Enable a feature on a formation
          + disable   Disable a feature on a formation
          + get       Get a pg_auto_failover node, or formation setting
          + set       Set a pg_auto_failover node, or formation setting
          + perform   Perform an action orchestrated by the monitor
            activate  Activate a Citus worker from the Citus coordinator
            run       Run the pg_autoctl service (monitor or keeper)
            stop      signal the pg_autoctl service for it to stop
            reload    signal the pg_autoctl for it to reload its configuration
            status    Display the current status of the pg_autoctl service
            help      print help message
            version   print pg_autoctl version

          pg_autoctl create
            monitor      Initialize a pg_auto_failover monitor node
            postgres     Initialize a pg_auto_failover standalone postgres node
            coordinator  Initialize a pg_auto_failover citus coordinator node
            worker       Initialize a pg_auto_failover citus worker node
            formation    Create a new formation on the pg_auto_failover monitor

          pg_autoctl drop
            monitor    Drop the pg_auto_failover monitor
            node       Drop a node from the pg_auto_failover monitor
            formation  Drop a formation on the pg_auto_failover monitor

          pg_autoctl config
            check  Check pg_autoctl configuration
            get    Get the value of a given pg_autoctl configuration variable
            set    Set the value of a given pg_autoctl configuration variable

          pg_autoctl show
            uri            Show the postgres uri to use to connect to pg_auto_failover nodes
            events         Prints monitor's state of nodes in a given formation and group
            state          Prints monitor's state of nodes in a given formation and group
            settings       Print replication settings for a formation from the monitor
            standby-names  Prints synchronous_standby_names for a given group
            file           List pg_autoctl internal files (config, state, pid)
            systemd        Print systemd service file for this node

          pg_autoctl enable
            secondary    Enable secondary nodes on a formation
            maintenance  Enable Postgres maintenance mode on this node
            ssl          Enable SSL configuration on this node
            monitor      Enable a monitor for this node to be orchestrated from

          pg_autoctl disable
            secondary    Disable secondary nodes on a formation
            maintenance  Disable Postgres maintenance mode on this node
            ssl          Disable SSL configuration on this node
            monitor      Disable the monitor for this node

          pg_autoctl get
          + node       get a node property from the pg_auto_failover monitor
          + formation  get a formation property from the pg_auto_failover monitor

          pg_autoctl get node
            replication-quorum  get replication-quorum property from the monitor
            candidate-priority  get candidate property from the monitor

          pg_autoctl get formation
            settings              get replication settings for a formation from the monitor
            number-sync-standbys  get number_sync_standbys for a formation from the monitor

          pg_autoctl set
          + node       set a node property on the monitor
          + formation  set a formation property on the monitor

          pg_autoctl set node
            metadata            set metadata on the monitor
            replication-quorum  set replication-quorum property on the monitor
            candidate-priority  set candidate property on the monitor

          pg_autoctl set formation
            number-sync-standbys  set number-sync-standbys for a formation on the monitor

          pg_autoctl perform
            failover    Perform a failover for given formation and group
            switchover  Perform a switchover for given formation and group
            promotion   Perform a failover that promotes a target node

   Description
       The pg_autoctl tool is the client tool provided by pg_auto_failover to create  and  manage
       Postgres  nodes  and  the  pg_auto_failover  monitor  node. The command is built with many
       sub-commands that each have their own manual page.

   Help
       To get the full recursive list of supported commands, use:

          pg_autoctl help

   Version
       To grab the version of pg_autoctl that you're using, use:

          pg_autoctl --version
          pg_autoctl version

       A typical output would be:

          pg_autoctl version 1.4.2
          pg_autoctl extension version 1.4
          compiled with PostgreSQL 12.3 on x86_64-apple-darwin16.7.0, compiled by Apple LLVM version 8.1.0 (clang-802.0.42), 64-bit
          compatible with Postgres 10, 11, 12, and 13

       The version is also available as a JSON document when using the --json option:

          pg_autoctl --version --json
          pg_autoctl version --json

       A typical JSON output would be:

          {
              "pg_autoctl": "1.4.2",
              "pgautofailover": "1.4",
              "pg_major": "12",
              "pg_version": "12.3",
              "pg_version_str": "PostgreSQL 12.3 on x86_64-apple-darwin16.7.0, compiled by Apple LLVM version 8.1.0 (clang-802.0.42), 64-bit",
              "pg_version_num": 120003
          }

       This is for version 1.4.2 of pg_auto_failover. This particular version of  the  pg_autoctl
       client  tool  has  been  compiled  using  libpq for PostgreSQL 12.3 and is compatible with
       Postgres 10, 11, 12, and 13.

   Environment
       PG_AUTOCTL_DEBUG
           When this environment variable is set (to anything) then pg_autoctl allows more
           commands. Use with care: this opens up the ability to destroy your production
           clusters.

   pg_autoctl create
       pg_autoctl create - Create a pg_auto_failover node, or formation

   pg_autoctl create monitor
       pg_autoctl create monitor - Initialize a pg_auto_failover monitor node

   Synopsis
       This command initializes a PostgreSQL cluster and installs the pgautofailover extension so
       that it's possible to use the new instance to monitor PostgreSQL services:

          usage: pg_autoctl create monitor  [ --pgdata --pgport --pgctl --hostname ]

          --pgctl           path to pg_ctl
          --pgdata          path to data directory
          --pgport          PostgreSQL's port number
          --hostname        hostname by which postgres is reachable
          --auth            authentication method for connections from data nodes
          --skip-pg-hba     skip editing pg_hba.conf rules
          --run             create node then run pg_autoctl service
          --ssl-self-signed setup network encryption using self signed certificates (does NOT protect against MITM)
          --ssl-mode        use that sslmode in connection strings
          --ssl-ca-file     set the Postgres ssl_ca_file to that file path
          --ssl-crl-file    set the Postgres ssl_crl_file to that file path
          --no-ssl          don't enable network encryption (NOT recommended, prefer --ssl-self-signed)
          --server-key      set the Postgres ssl_key_file to that file path
          --server-cert     set the Postgres ssl_cert_file to that file path

   Description
       The pg_autoctl tool is the client tool provided by pg_auto_failover to create  and  manage
       Postgres  nodes  and  the  pg_auto_failover  monitor  node. The command is built with many
       sub-commands that each have their own manual page.

   Options
       The following options are available to pg_autoctl create monitor:

       --pgctl
              Path to the pg_ctl tool to use for the version of PostgreSQL you want to use.

              Defaults to the pg_ctl found in the PATH when there is a single entry for pg_ctl in
              the PATH. Check your setup using which -a pg_ctl.

              When using an RPM based distribution such as RHEL or CentOS, the path would usually
              be /usr/pgsql-13/bin/pg_ctl for Postgres 13.

              When using a debian based distribution such as debian or  ubuntu,  the  path  would
              usually  be /usr/lib/postgresql/13/bin/pg_ctl for Postgres 13.  Those distributions
              also use the package postgresql-common which provides /usr/bin/pg_config. This tool
              can be automatically used by pg_autoctl to discover the default version of Postgres
              to use on your setup.

       --pgdata
              Location where to initialize a  Postgres  database  cluster,  using  either  pg_ctl
              initdb or pg_basebackup. Defaults to the environment variable PGDATA.

       --pgport
              Postgres port to use, defaults to 5432.

       --hostname
              Hostname or IP address (both v4 and v6 are supported) to use from any other node to
              connect to this node.

              When not provided, a default value is computed by running the following algorithm.

                  1. We get this machine's "public IP" by opening a connection to the 8.8.8.8:53
                     public service. Then we get the TCP/IP client address that was used to make
                     that connection.

                 2. We then do a reverse DNS lookup on the IP address found in the previous  step
                    to fetch a hostname for our local machine.

                  3. If the reverse DNS lookup is successful, then pg_autoctl does a forward DNS
                     lookup of that hostname.

              When the forward DNS lookup response in step 3. is an IP address found  in  one  of
              our local network interfaces, then pg_autoctl uses the hostname found in step 2. as
              the default --hostname. Otherwise it uses the IP address found in step 1.

              You may use the --hostname command line option to bypass the whole DNS lookup based
              process and force the local node name to a fixed value.

       --auth Authentication method used by pg_autoctl when editing the Postgres HBA file to open
              connections to other nodes. No default value, must be provided  by  the  user.  The
               value trust is only a good choice for testing and evaluation of pg_auto_failover,
              see Security settings for pg_auto_failover for more information.

       --skip-pg-hba
               When this option is used, pg_autoctl refrains from any editing of the Postgres
              HBA file. Please note that editing the HBA file is still needed so that other nodes
              can connect using either read privileges or replication streaming privileges.

              When --skip-pg-hba is used, pg_autoctl still outputs the HBA entries  it  needs  in
              the logs, it only skips editing the HBA file.

       --run  Immediately run the pg_autoctl service after having created this node.

       --ssl-self-signed
              Generate  SSL self-signed certificates to provide network encryption. This does not
              protect against man-in-the-middle kinds  of  attacks.  See  Security  settings  for
              pg_auto_failover for more about our SSL settings.

       --ssl-mode
              SSL  Mode  used  by  pg_autoctl  when  connecting  to  other  nodes, including when
              connecting for streaming replication.

       --ssl-ca-file
              Set the Postgres ssl_ca_file to that file path.

       --ssl-crl-file
              Set the Postgres ssl_crl_file to that file path.

       --no-ssl
              Don't enable network encryption. This is not recommended, prefer --ssl-self-signed.

       --server-key
              Set the Postgres ssl_key_file to that file path.

       --server-cert
              Set the Postgres ssl_cert_file to that file path.

   Environment
       PGDATA
          Postgres directory location. Can be used instead of the --pgdata option.

       PG_CONFIG
          Can be set to the absolute path to the pg_config Postgres tool. This is mostly used  in
          the  context of building extensions, though it can be a useful way to select a Postgres
          version when several are installed on the same system.

       PATH
           Used in the usual way. Some entries that are searched in the PATH by the pg_autoctl
           command are expected to be found only once, to avoid mistakes with Postgres major
           versions.

       PGHOST, PGPORT, PGDATABASE, PGUSER, PGCONNECT_TIMEOUT, ...
          See the Postgres docs about Environment Variables for details.

       TMPDIR
          The pgcopydb command creates all its work files and directories in  ${TMPDIR}/pgcopydb,
          and defaults to /tmp/pgcopydb.

       XDG_CONFIG_HOME
          The   pg_autoctl   command  stores  its  configuration  files  in  the  standard  place
          XDG_CONFIG_HOME. See the XDG Base Directory Specification.

       XDG_DATA_HOME
          The pg_autoctl  command  stores  its  internal  states  files  in  the  standard  place
          XDG_DATA_HOME,   which   defaults   to  ~/.local/share.  See  the  XDG  Base  Directory
          Specification.
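
   Examples
        As an illustration only, a monitor could be created locally with self-signed certificates
        using a command such as the following; the data directory, port, hostname, and the use of
        --auth trust are assumptions to adapt to your environment:

           $ pg_autoctl create monitor --pgdata ./monitor --pgport 5500 --hostname localhost \
                --auth trust --ssl-self-signed --run

        The Postgres URI that other nodes use to register to this monitor can then be displayed
        with the pg_autoctl show uri command, documented later in this manual.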

   pg_autoctl create postgres
       pg_autoctl create postgres - Initialize a pg_auto_failover postgres node

   Synopsis
        The command pg_autoctl create postgres initializes a standalone Postgres node and
        registers it to a pg_auto_failover monitor. The monitor then handles auto-failover for
        this Postgres node (as soon as a secondary has also been registered, and is known to be
        healthy).

          usage: pg_autoctl create postgres

            --pgctl           path to pg_ctl
            --pgdata          path to data directory
            --pghost          PostgreSQL's hostname
            --pgport          PostgreSQL's port number
            --listen          PostgreSQL's listen_addresses
            --username        PostgreSQL's username
            --dbname          PostgreSQL's database name
            --name            pg_auto_failover node name
            --hostname        hostname used to connect from the other nodes
            --formation       pg_auto_failover formation
            --monitor         pg_auto_failover Monitor Postgres URL
            --auth            authentication method for connections from monitor
            --skip-pg-hba     skip editing pg_hba.conf rules
            --pg-hba-lan      edit pg_hba.conf rules for --dbname in detected LAN
            --ssl-self-signed setup network encryption using self signed certificates (does NOT protect against MITM)
            --ssl-mode        use that sslmode in connection strings
            --ssl-ca-file     set the Postgres ssl_ca_file to that file path
            --ssl-crl-file    set the Postgres ssl_crl_file to that file path
            --no-ssl          don't enable network encryption (NOT recommended, prefer --ssl-self-signed)
            --server-key      set the Postgres ssl_key_file to that file path
            --server-cert     set the Postgres ssl_cert_file to that file path
            --candidate-priority    priority of the node to be promoted to become primary
            --replication-quorum    true if node participates in write quorum
            --maximum-backup-rate   maximum transfer rate of data transferred from the server during initial sync

   Description
        Four different modes of initialization are supported by this command, corresponding to as
        many implementation strategies.

          1. Initialize a primary node from scratch

              This happens when --pgdata (or the environment variable PGDATA) points to a
              non-existing or empty directory. Then the given --hostname is registered to the
              pg_auto_failover --monitor as a member of the --formation.

             The  monitor  answers  to  the  registration  call with a state to assign to the new
             member of the group, either SINGLE or  WAIT_STANDBY.  When  the  assigned  state  is
             SINGLE,  then  pg_autoctl  create  postgres  proceeds to initialize a new PostgreSQL
             instance.

          2. Initialize an already existing primary server

             This happens when --pgdata (or the environment variable PGDATA) points to an already
             existing  directory  that  belongs to a PostgreSQL instance. The standard PostgreSQL
             tool pg_controldata is  used  to  recognize  whether  the  directory  belongs  to  a
             PostgreSQL instance.

              In that case, the given --hostname is registered to the monitor in the tentative
              SINGLE state. When the given --formation and --group are currently empty, the
              monitor accepts the registration and pg_autoctl create prepares the already
              existing primary server for pg_auto_failover.

          3. Initialize a secondary node from scratch

             This happens when  --pgdata  (or  the  environment  variable  PGDATA)  points  to  a
             non-existing  or empty directory, and when the monitor registration call assigns the
             state WAIT_STANDBY in step 1.

             In that case, the pg_autoctl create command steps  through  the  initial  states  of
             registering  a  secondary  server,  which  includes  preparing  the  primary  server
             PostgreSQL HBA rules and creating a replication slot.

             When the command ends successfully, a PostgreSQL secondary server has  been  created
             with pg_basebackup and is now started, catching-up to the primary server.

          4. Initialize a secondary node from an existing data directory

             When  the  data  directory  pointed  to  by  the  option --pgdata or the environment
             variable PGDATA already exists,  then  pg_auto_failover  verifies  that  the  system
             identifier matches the one of the other nodes already existing in the same group.

              The system identifier can be obtained with the command pg_controldata, as shown in
              the example after this list. All nodes in a physical replication setting must have
              the same system identifier, and so in pg_auto_failover all the nodes in the same
              group have that constraint too.

             When the system identifier matches the already registered system identifier of other
             nodes in the same group, then the node is  set-up  as  a  standby  and  Postgres  is
             started with the primary conninfo pointed at the current primary.
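
        For instance, the system identifier of an existing data directory can be inspected with
        pg_controldata; the directory and the value shown below are illustrative:

           $ pg_controldata ./node2 | grep 'system identifier'
           Database system identifier:           6941034382470571312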

        The --auth option allows setting up the authentication method used when the monitor node
        makes a connection to the data node with the pgautofailover_monitor user. As with the
        pg_autoctl create monitor command, you could use --auth trust when playing with
        pg_auto_failover at first and consider something production grade later. Also, consider
        using --skip-pg-hba if you already have your own provisioning tools with a security
        compliance process.

        See Security settings for pg_auto_failover for notes on .pgpass.

   Options
       The following options are available to pg_autoctl create postgres:

       --pgctl
              Path to the pg_ctl tool to use for the version of PostgreSQL you want to use.

              Defaults to the pg_ctl found in the PATH when there is a single entry for pg_ctl in
              the PATH. Check your setup using which -a pg_ctl.

               When using an RPM-based distribution such as RHEL or CentOS, the path would usually
               be /usr/pgsql-13/bin/pg_ctl for Postgres 13.

               When using a Debian-based distribution such as Debian or Ubuntu, the path would
               usually be /usr/lib/postgresql/13/bin/pg_ctl for Postgres 13. Those distributions
              also use the package postgresql-common which provides /usr/bin/pg_config. This tool
              can be automatically used by pg_autoctl to discover the default version of Postgres
              to use on your setup.

       --pgdata
              Location  where  to  initialize  a  Postgres  database cluster, using either pg_ctl
              initdb or pg_basebackup. Defaults to the environment variable PGDATA.

       --pghost
               Hostname to use when connecting to the local Postgres instance from the pg_autoctl
               process. By default, this field is left blank in the connection string, which
               allows using Unix Domain Sockets with the default path compiled in your libpq
               version, usually provided by the Operating System. That would be
               /var/run/postgresql when using Debian or Ubuntu.

       --pgport
              Postgres port to use, defaults to 5432.

       --listen
               PostgreSQL's listen_addresses to set up. At the moment only one address is
               supported in this command line option.

       --username
              PostgreSQL's  username  to  use  when  connecting to the local Postgres instance to
              manage it.

       --dbname
              PostgreSQL's database name to use in your application. Defaults to being  the  same
              as the --username, or to postgres when none of those options are used.

        --name Node name used on the monitor to refer to this node. The hostname is a technical
               piece of information, and given Postgres requirements on the HBA setup and DNS
               resolution (both forward and reverse lookups), IP addresses are often used for the
               hostname.

              The --name option allows using a user-friendly name for your Postgres nodes.

       --hostname
              Hostname or IP address (both v4 and v6 are supported) to use from any other node to
              connect to this node.

              When not provided, a default value is computed by running the following algorithm.

                  1. We get this machine's "public IP" by opening a connection to the given
                     monitor hostname or IP address. Then we get the TCP/IP client address that
                     has been used to make that connection.

                  2. We then do a reverse DNS lookup on the IP address found in the previous step
                     to fetch a hostname for our local machine.

                  3. If the reverse DNS lookup is successful, then pg_autoctl does a forward DNS
                     lookup of that hostname.

              When the forward DNS lookup response in step 3. is an IP address found  in  one  of
              our local network interfaces, then pg_autoctl uses the hostname found in step 2. as
              the default --hostname. Otherwise it uses the IP address found in step 1.

              You may use the --hostname command line option to bypass the whole DNS lookup based
              process and force the local node name to a fixed value.

       --formation
               Formation to register the node into on the monitor. Defaults to the default
               formation, which is automatically created on the monitor by the pg_autoctl create
               monitor command.

       --monitor
              Postgres URI used to connect to the monitor. Must use the autoctl_node username and
              target the pg_auto_failover database name. It is possible to show the Postgres  URI
              from the monitor node using the command pg_autoctl show uri.

       --auth Authentication method used by pg_autoctl when editing the Postgres HBA file to open
              connections to other nodes. No default value, must be provided  by  the  user.  The
               value trust is only a good choice for testing and evaluation of pg_auto_failover,
              see Security settings for pg_auto_failover for more information.

       --skip-pg-hba
              When this option is used then pg_autoctl refrains from any editing of the  Postgres
              HBA file. Please note that editing the HBA file is still needed so that other nodes
              can connect using either read privileges or replication streaming privileges.

              When --skip-pg-hba is used, pg_autoctl still outputs the HBA entries  it  needs  in
              the logs, it only skips editing the HBA file.

       --pg-hba-lan
              When this option is used pg_autoctl determines the local IP address used to connect
              to the monitor, and retrieves its netmask, and uses that to compute your local area
              network CIDR. This CIDR is then opened for connections in the Postgres HBA rules.

               For instance, when the monitor resolves to 192.168.0.1 and your local Postgres node
               uses an interface with IP address 192.168.0.2/255.255.255.0 to connect to the
               monitor, then the LAN CIDR is computed to be 192.168.0.0/24.

       --candidate-priority
               Sets this node's replication setting for candidate priority to the given value
               (between 0 and 100) at node registration on the monitor. Defaults to 50.

       --replication-quorum
               Sets this node's replication setting for replication quorum to the given value
               (either true or false) at node registration on the monitor. Defaults to true,
               which enables synchronous replication.

       --maximum-backup-rate
              Sets the maximum transfer rate of data transferred from the server  during  initial
              sync. This is used by pg_basebackup.  Defaults to 100M.

       --run  Immediately run the pg_autoctl service after having created this node.

       --ssl-self-signed
              Generate  SSL self-signed certificates to provide network encryption. This does not
              protect against man-in-the-middle kinds  of  attacks.  See  Security  settings  for
              pg_auto_failover for more about our SSL settings.

       --ssl-mode
              SSL  Mode  used  by  pg_autoctl  when  connecting  to  other  nodes, including when
              connecting for streaming replication.

       --ssl-ca-file
              Set the Postgres ssl_ca_file to that file path.

       --ssl-crl-file
              Set the Postgres ssl_crl_file to that file path.

       --no-ssl
              Don't enable network encryption. This is not recommended, prefer --ssl-self-signed.

       --server-key
              Set the Postgres ssl_key_file to that file path.

       --server-cert
              Set the Postgres ssl_cert_file to that file path.

   Environment
       PGDATA
          Postgres directory location. Can be used instead of the --pgdata option.

       PG_AUTOCTL_MONITOR
          Postgres URI to connect to the monitor node, can  be  used  instead  of  the  --monitor
          option.

       PG_AUTOCTL_NODE_NAME
          Node name to register to the monitor, can be used instead of the --name option.

       PG_AUTOCTL_REPLICATION_QUORUM
          Can be used instead of the --replication-quorum option.

       PG_AUTOCTL_CANDIDATE_PRIORITY
          Can be used instead of the --candidate-priority option.

       PG_CONFIG
          Can  be set to the absolute path to the pg_config Postgres tool. This is mostly used in
          the context of building extensions, though it can be a useful way to select a  Postgres
          version when several are installed on the same system.

       PATH
           Used in the usual way. Some entries that are searched in the PATH by the pg_autoctl
           command are expected to be found only once, to avoid mistakes with Postgres major
           versions.

       PGHOST, PGPORT, PGDATABASE, PGUSER, PGCONNECT_TIMEOUT, ...
          See the Postgres docs about Environment Variables for details.

       TMPDIR
          The  pgcopydb command creates all its work files and directories in ${TMPDIR}/pgcopydb,
          and defaults to /tmp/pgcopydb.

       XDG_CONFIG_HOME
          The  pg_autoctl  command  stores  its  configuration  files  in  the   standard   place
          XDG_CONFIG_HOME. See the XDG Base Directory Specification.

       XDG_DATA_HOME
          The  pg_autoctl  command  stores  its  internal  states  files  in  the  standard place
          XDG_DATA_HOME,  which  defaults  to  ~/.local/share.  See  the   XDG   Base   Directory
          Specification.
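
   Examples
        As an illustration only, a first Postgres node could be created and registered to an
        existing monitor with a command such as the following; the paths, port, node name,
        database name, monitor URI, and the use of --auth trust are assumptions to adapt to your
        environment:

           $ pg_autoctl create postgres --pgdata ./node1 --pgport 5501 --hostname localhost \
                --name node1 --dbname demo --auth trust --ssl-self-signed \
                --monitor 'postgres://autoctl_node@localhost:5500/pg_auto_failover?sslmode=prefer' \
                --run

        Running a similar command with another --pgdata, --pgport, and --name (or on another
        host) registers a secondary node in the same formation and group, which pg_autoctl then
        initializes with pg_basebackup.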

   pg_autoctl create coordinator
       pg_autoctl create coordinator - Initialize a pg_auto_failover coordinator node

   Synopsis
        The command pg_autoctl create coordinator initializes a pg_auto_failover Coordinator node
        for a Citus formation. The coordinator is special in a Citus formation: that's where the
        client application connects, either to manage the formation and the sharding of the
        tables, or for its normal SQL traffic.

       The coordinator also has to register every worker in the formation.

          usage: pg_autoctl create coordinator

            --pgctl           path to pg_ctl
            --pgdata          path to data directory
            --pghost          PostgreSQL's hostname
            --pgport          PostgreSQL's port number
            --hostname        hostname by which postgres is reachable
            --listen          PostgreSQL's listen_addresses
            --username        PostgreSQL's username
            --dbname          PostgreSQL's database name
            --name            pg_auto_failover node name
            --formation       pg_auto_failover formation
            --monitor         pg_auto_failover Monitor Postgres URL
            --auth            authentication method for connections from monitor
            --skip-pg-hba     skip editing pg_hba.conf rules
            --citus-secondary when used, this worker node is a citus secondary
            --citus-cluster   name of the Citus Cluster for read-replicas
            --ssl-self-signed setup network encryption using self signed certificates (does NOT protect against MITM)
            --ssl-mode        use that sslmode in connection strings
            --ssl-ca-file     set the Postgres ssl_ca_file to that file path
            --ssl-crl-file    set the Postgres ssl_crl_file to that file path
            --no-ssl          don't enable network encryption (NOT recommended, prefer --ssl-self-signed)
            --server-key      set the Postgres ssl_key_file to that file path
            --server-cert     set the Postgres ssl_cert_file to that file path

   Description
        This command works the same as the pg_autoctl create postgres command and implements the
       following extra steps:

          1. adds   shared_preload_libraries   =   citus   to   the   local  PostgreSQL  instance
             configuration.

          2. enables the whole local area network to connect to the  coordinator,  by  adding  an
             entry for e.g. 192.168.1.0/24 in the PostgreSQL HBA configuration.

          3. creates the extension citus in the target database.

       IMPORTANT:
           The default --dbname is the same as the current system user name, which in many cases
           is going to be postgres. Please make sure to use the --dbname option with the actual
           database that you're going to use with your application.

           Citus does not support multiple databases; you have to use the database where Citus is
           created. When using Citus, this is essential for worker failover to behave correctly.

   Options
       See the manual page for pg_autoctl create postgres for the common  options.  This  section
       now lists the options that are specific to pg_autoctl create coordinator:

       --citus-secondary
              Use this option to create a coordinator dedicated to a Citus Secondary cluster.

              See Citus Secondaries and read-replica for more information.

       --citus-cluster
              Use  this  option  to  name  the Citus Secondary cluster that this coordinator node
              belongs to. Use the same cluster name again for the worker nodes that are  part  of
              this cluster.

              See Citus Secondaries and read-replica for more information.
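
   Examples
        As an illustration only, and assuming a build with Citus support, a coordinator node
        could be created with a command such as the following; the paths, database name, and
        monitor URI are assumptions to adapt to your environment:

           $ pg_autoctl create coordinator --pgdata ./coord --hostname localhost \
                --dbname app --auth trust --ssl-self-signed \
                --monitor 'postgres://autoctl_node@localhost:5500/pg_auto_failover?sslmode=prefer'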

   pg_autoctl create worker
       pg_autoctl create worker - Initialize a pg_auto_failover worker node

   Synopsis
        The command pg_autoctl create worker initializes a pg_auto_failover Worker node for a
        Citus formation. A worker node in a Citus formation hosts shards of the distributed
        tables; the client application connects to the coordinator, not to the workers.

        As part of this command, the worker node is registered to the coordinator of the
        formation, as detailed in the description below.

          usage: pg_autoctl create worker

            --pgctl           path to pg_ctl
             --pgdata          path to data directory
            --pghost          PostgreSQL's hostname
            --pgport          PostgreSQL's port number
            --hostname        hostname by which postgres is reachable
            --listen          PostgreSQL's listen_addresses
            --proxyport       Proxy's port number
            --username        PostgreSQL's username
            --dbname          PostgreSQL's database name
            --name            pg_auto_failover node name
            --formation       pg_auto_failover formation
            --group           pg_auto_failover group Id
            --monitor         pg_auto_failover Monitor Postgres URL
            --auth            authentication method for connections from monitor
            --skip-pg-hba     skip editing pg_hba.conf rules
            --citus-secondary when used, this worker node is a citus secondary
            --citus-cluster   name of the Citus Cluster for read-replicas
            --ssl-self-signed setup network encryption using self signed certificates (does NOT protect against MITM)
            --ssl-mode        use that sslmode in connection strings
            --ssl-ca-file     set the Postgres ssl_ca_file to that file path
            --ssl-crl-file    set the Postgres ssl_crl_file to that file path
            --no-ssl          don't enable network encryption (NOT recommended, prefer --ssl-self-signed)
            --server-key      set the Postgres ssl_key_file to that file path
            --server-cert     set the Postgres ssl_cert_file to that file path

   Description
        This command works the same as the pg_autoctl create postgres command and implements the
       following extra steps:

          1. adds   shared_preload_libraries   =   citus   to   the   local  PostgreSQL  instance
             configuration.

          2. creates the extension citus in the target database.

          3. gets the coordinator node hostname from the pg_auto_failover monitor.

             This operation is retried when it fails, as the coordinator might appear later  than
             some of the workers when the whole formation is initialized at once, in parallel, on
             multiple nodes.

           4. adds the node to the coordinator

             This is done in two steps. First, we call the SQL function  master_add_inactive_node
             on the coordinator, and second, we call the SQL function master_activate_node.

              This two-step approach allows for easier diagnostics in case things go wrong. In
              the first step, the network and authentication setup needs to allow nodes to
              connect to each other. In the second step, the Citus reference tables are
              distributed to the new node, and this operation has its own set of failure cases to
              handle.

       IMPORTANT:
           The default --dbname is the same as the current system user name, which in many cases
           is going to be postgres. Please make sure to use the --dbname option with the actual
           database that you're going to use with your application.

           Citus does not support multiple databases; you have to use the database where Citus is
           created. When using Citus, this is essential for worker failover to behave correctly.

   Options
       See  the  manual  page for pg_autoctl create postgres for the common options. This section
       now lists the options that are specific to pg_autoctl create worker:

       --proxyport
              The --proxyport option allows pg_auto_failover to register the proxy  port  in  the
              pg_dist_poolinfo  entry  for  the  worker  node in its Coordinator, rather than the
              --pgport entry as would usually be done.

       --citus-secondary
              Use this option to create a worker dedicated to a Citus Secondary cluster.

              See Citus Secondaries and read-replica for more information.

       --citus-cluster
              Use this option to name the Citus Secondary cluster that this worker  node  belongs
              to.  Use  the  same  cluster  name again for the worker nodes that are part of this
              cluster.

              See Citus Secondaries and read-replica for more information.
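
   Examples
        As an illustration only, and assuming a build with Citus support, a worker node could be
        created in group 1 with a command such as the following; the paths, database name, group
        id, and monitor URI are assumptions to adapt to your environment:

           $ pg_autoctl create worker --pgdata ./worker1 --hostname localhost --group 1 \
                --dbname app --auth trust --ssl-self-signed \
                --monitor 'postgres://autoctl_node@localhost:5500/pg_auto_failover?sslmode=prefer'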

   pg_autoctl create formation
       pg_autoctl create formation - Create a new formation on the pg_auto_failover monitor

   Synopsis
       This command registers a new formation on the monitor, with the specified kind:

           usage: pg_autoctl create formation  [ --pgdata --monitor --formation --kind --dbname --enable-secondary --disable-secondary ]

          --pgdata      path to data directory
          --monitor     pg_auto_failover Monitor Postgres URL
          --formation   name of the formation to create
          --kind        formation kind, either "pgsql" or "citus"
          --dbname      name for postgres database to use in this formation
          --enable-secondary     create a formation that has multiple nodes that can be
                                 used for fail over when others have issues
          --disable-secondary    create a citus formation without nodes to fail over to
          --number-sync-standbys minimum number of standbys to confirm write

   Description
       A single pg_auto_failover monitor may manage any number of formations, each composed of at
        least one Postgres service group. This command creates a new formation so that it is then
       possible to register Postgres nodes in the new formation.

   Options
       The following options are available to pg_autoctl create formation:

       --pgdata
              Location where to initialize a  Postgres  database  cluster,  using  either  pg_ctl
              initdb or pg_basebackup. Defaults to the environment variable PGDATA.

       --monitor
              Postgres URI used to connect to the monitor. Must use the autoctl_node username and
              target the pg_auto_failover database name. It is possible to show the Postgres  URI
              from the monitor node using the command pg_autoctl show uri.

       --formation
              Name of the formation to create.

        --kind A pg_auto_failover formation can be of kind pgsql or of kind citus. At the moment
               citus formation kinds are not managed in the Open Source version of
               pg_auto_failover.

       --dbname
              Name  of  the  database  to  use in the formation, mostly useful to formation kinds
              citus where the Citus extension is only installed in a single target database.

       --enable-secondary
              The formation to be created allows using standby nodes. Defaults  to  true.  Mostly
              useful for Citus formations.

       --disable-secondary
              See --enable-secondary above.

        --number-sync-standbys
              Postgres  streaming  replication  uses  synchronous_standby_names to setup how many
              standby nodes should have received a copy  of  the  transaction  data.  When  using
              pg_auto_failover this setup is handled at the formation level.

               Defaults to zero when creating the first two Postgres nodes in a formation in the
               same group. When set to zero, pg_auto_failover uses synchronous replication only
               when a standby node is available: the idea is to allow failover, even though this
               setting does not allow proper HA for Postgres.

              When adding a third  node  that  participates  in  the  quorum  (one  primary,  two
              secondaries), the setting is automatically changed from zero to one.

   Environment
       PGDATA
          Postgres directory location. Can be used instead of the --pgdata option.

       PG_AUTOCTL_MONITOR
          Postgres  URI  to  connect  to  the  monitor node, can be used instead of the --monitor
          option.

       XDG_CONFIG_HOME
          The  pg_autoctl  command  stores  its  configuration  files  in  the   standard   place
          XDG_CONFIG_HOME. See the XDG Base Directory Specification.

       XDG_DATA_HOME
          The  pg_autoctl  command  stores  its  internal  states  files  in  the  standard place
          XDG_DATA_HOME,  which  defaults  to  ~/.local/share.  See  the   XDG   Base   Directory
          Specification.
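
   Examples
        As an illustration only, a new formation named sales could be created with a command such
        as the following; the formation name, database name, and monitor URI are assumptions:

           $ pg_autoctl create formation --formation sales --kind pgsql --dbname sales \
                --enable-secondary \
                --monitor 'postgres://autoctl_node@localhost:5500/pg_auto_failover?sslmode=prefer'

        Postgres nodes can then be registered into that formation by passing --formation sales to
        the pg_autoctl create postgres command.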

   pg_autoctl drop
       pg_autoctl drop - Drop a pg_auto_failover node, or formation

   pg_autoctl drop monitor
       pg_autoctl drop monitor - Drop the pg_auto_failover monitor

   Synopsis
        This command drops the pg_auto_failover monitor:

          usage: pg_autoctl drop monitor [ --pgdata --destroy ]

          --pgdata      path to data directory
          --destroy     also destroy Postgres database

   Options
       --pgdata
              Location of the Postgres node being managed locally. Defaults  to  the  environment
              variable  PGDATA.  Use --monitor to connect to a monitor from anywhere, rather than
              the monitor URI used by a local Postgres node managed with pg_autoctl.

       --destroy
               By default the pg_autoctl drop monitor command does not remove the Postgres
               database for the monitor. When using --destroy, the Postgres installation is also
               deleted.

   Environment
       PGDATA
          Postgres directory location. Can be used instead of the --pgdata option.

       XDG_CONFIG_HOME
          The  pg_autoctl  command  stores  its  configuration  files  in  the   standard   place
          XDG_CONFIG_HOME. See the XDG Base Directory Specification.

       XDG_DATA_HOME
          The  pg_autoctl  command  stores  its  internal  states  files  in  the  standard place
          XDG_DATA_HOME,  which  defaults  to  ~/.local/share.  See  the   XDG   Base   Directory
          Specification.
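
   Examples
        As an illustration only, and assuming the monitor's data directory is ./monitor, the
        monitor node and its Postgres instance could be removed entirely with:

           $ pg_autoctl drop monitor --pgdata ./monitor --destroy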

   pg_autoctl drop node
       pg_autoctl drop node - Drop a node from the pg_auto_failover monitor

   Synopsis
       This command drops a Postgres node from the pg_auto_failover monitor:

          usage: pg_autoctl drop node [ [ [ --pgdata ] [ --destroy ] ] | [ --monitor [ [ --hostname --pgport ] | [ --formation --name ] ] ] ]

          --pgdata      path to data directory
          --monitor     pg_auto_failover Monitor Postgres URL
          --formation   pg_auto_failover formation
          --name        drop the node with the given node name
          --hostname    drop the node with given hostname and pgport
          --pgport      drop the node with given hostname and pgport
          --destroy     also destroy Postgres database
          --force       force dropping the node from the monitor
          --wait        how many seconds to wait, default to 60

   Description
       Two modes of operations are implemented in the pg_autoctl drop node command.

       When  removing  a  node  that  still  exists,  it  is possible to use pg_autoctl drop node
       --destroy to remove the node both from the monitor and  also  delete  the  local  Postgres
       instance entirely.

       When  removing  a  node that doesn't exist physically anymore, or when the VM that used to
       host the node has been lost entirely, use  either  the  pair  of  options  --hostname  and
       --pgport  or  the  pair  of  options --formation and --name to match the node registration
       record on the monitor database, and get it removed from the known list  of  nodes  on  the
       monitor.

        The option --force can be used when the target node to remove does not exist anymore.
        When a node has been lost entirely, it's not going to be able to finish the procedure
        itself, and it is then possible to instruct the monitor of the situation.
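
        For instance, here is a sketch of removing a node that has been lost entirely, matching
        it by formation and name; the monitor URI and node name are assumptions:

           $ pg_autoctl drop node --formation default --name node3 --force \
                --monitor 'postgres://autoctl_node@localhost:5500/pg_auto_failover?sslmode=prefer'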

   Options
       --pgdata
              Location  of  the  Postgres node being managed locally. Defaults to the environment
              variable PGDATA. Use --monitor to connect to a monitor from anywhere,  rather  than
              the monitor URI used by a local Postgres node managed with pg_autoctl.

       --monitor
              Postgres URI used to connect to the monitor. Must use the autoctl_node username and
              target the pg_auto_failover database name. It is possible to show the Postgres  URI
              from the monitor node using the command pg_autoctl show uri.

       --hostname
              Hostname  of  the  Postgres  node  to remove from the monitor. Use either --name or
              --hostname --pgport, but not both.

       --pgport
              Port of the Postgres node  to  remove  from  the  monitor.  Use  either  --name  or
              --hostname --pgport, but not both.

       --name Name  of  the  node  to  remove  from  the monitor. Use either --name or --hostname
              --pgport, but not both.

       --destroy
               By default the pg_autoctl drop node command does not remove the local Postgres
               database. When using --destroy, the Postgres installation is also deleted.

       --force
              By default a node is expected to reach  the  assigned  state  DROPPED  when  it  is
              removed  from  the  monitor, and has the opportunity to implement clean-up actions.
              When the target node to remove is not available anymore, it is possible to use  the
              option --force to immediately remove the node from the monitor.

       --wait How  many  seconds  to  wait for the node to be dropped entirely. The command stops
              when the target node is not to be found on the monitor anymore, or when the timeout
              has  elapsed,  whichever  comes  first. The value 0 (zero) disables the timeout and
              disables waiting entirely, making the command async.

   Environment
       PGDATA
          Postgres directory location. Can be used instead of the --pgdata option.

       PG_AUTOCTL_MONITOR
          Postgres URI to connect to the monitor node, can  be  used  instead  of  the  --monitor
          option.

       PG_AUTOCTL_NODE_NAME
          Node name to register to the monitor, can be used instead of the --name option.

       PG_AUTOCTL_REPLICATION_QUORUM
          Can be used instead of the --replication-quorum option.

       PG_AUTOCTL_CANDIDATE_PRIORITY
          Can be used instead of the --candidate-priority option.

       PG_CONFIG
          Can  be set to the absolute path to the pg_config Postgres tool. This is mostly used in
          the context of building extensions, though it can be a useful way to select a  Postgres
          version when several are installed on the same system.

       PATH
           Used in the usual way. Some entries that are searched in the PATH by the pg_autoctl
           command are expected to be found only once, to avoid mistakes with Postgres major
           versions.

       PGHOST, PGPORT, PGDATABASE, PGUSER, PGCONNECT_TIMEOUT, ...
          See the Postgres docs about Environment Variables for details.

       TMPDIR
          The  pgcopydb command creates all its work files and directories in ${TMPDIR}/pgcopydb,
          and defaults to /tmp/pgcopydb.

       XDG_CONFIG_HOME
          The  pg_autoctl  command  stores  its  configuration  files  in  the   standard   place
          XDG_CONFIG_HOME. See the XDG Base Directory Specification.

       XDG_DATA_HOME
          The  pg_autoctl  command  stores  its  internal  states  files  in  the  standard place
          XDG_DATA_HOME,  which  defaults  to  ~/.local/share.  See  the   XDG   Base   Directory
          Specification.

   Examples
          $ pg_autoctl drop node --destroy --pgdata ./node3
          17:52:21 54201 INFO  Reaching assigned state "secondary"
          17:52:21 54201 INFO  Removing node with name "node3" in formation "default" from the monitor
          17:52:21 54201 WARN  Postgres is not running and we are in state secondary
          17:52:21 54201 WARN  Failed to update the keeper's state from the local PostgreSQL instance, see above for details.
          17:52:21 54201 INFO  Calling node_active for node default/4/0 with current state: PostgreSQL is running is false, sync_state is "", latest WAL LSN is 0/0.
          17:52:21 54201 INFO  FSM transition to "dropped": This node is being dropped from the monitor
          17:52:21 54201 INFO  Transition complete: current state is now "dropped"
          17:52:21 54201 INFO  This node with id 4 in formation "default" and group 0 has been dropped from the monitor
          17:52:21 54201 INFO  Stopping PostgreSQL at "/Users/dim/dev/MS/pg_auto_failover/tmux/node3"
          17:52:21 54201 INFO  /Applications/Postgres.app/Contents/Versions/12/bin/pg_ctl --pgdata /Users/dim/dev/MS/pg_auto_failover/tmux/node3 --wait stop --mode fast
          17:52:21 54201 INFO  /Applications/Postgres.app/Contents/Versions/12/bin/pg_ctl status -D /Users/dim/dev/MS/pg_auto_failover/tmux/node3 [3]
          17:52:21 54201 INFO  pg_ctl: no server running
          17:52:21 54201 INFO  pg_ctl stop failed, but PostgreSQL is not running anyway
          17:52:21 54201 INFO  Removing "/Users/dim/dev/MS/pg_auto_failover/tmux/node3"
          17:52:21 54201 INFO  Removing "/Users/dim/dev/MS/pg_auto_failover/tmux/config/pg_autoctl/Users/dim/dev/MS/pg_auto_failover/tmux/node3/pg_autoctl.cfg"

   pg_autoctl drop formation
       pg_autoctl drop formation - Drop a formation on the pg_auto_failover monitor

   Synopsis
       This command drops an existing formation on the monitor:

          usage: pg_autoctl drop formation  [ --pgdata --formation ]

          --pgdata      path to data directory
          --monitor     pg_auto_failover Monitor Postgres URL
          --formation   name of the formation to drop

   Options
       --pgdata
              Location  of  the  Postgres node being managed locally. Defaults to the environment
              variable PGDATA. Use --monitor to connect to a monitor from anywhere,  rather  than
              the monitor URI used by a local Postgres node managed with pg_autoctl.

       --monitor
              Postgres URI used to connect to the monitor. Must use the autoctl_node username and
              target the pg_auto_failover database name. It is possible to show the Postgres  URI
              from the monitor node using the command pg_autoctl show uri.

       --formation
              Name of the formation to drop from the monitor.

   Environment
       PGDATA
          Postgres directory location. Can be used instead of the --pgdata option.

       PG_AUTOCTL_MONITOR
          Postgres  URI  to  connect  to  the  monitor node, can be used instead of the --monitor
          option.

       XDG_CONFIG_HOME
          The  pg_autoctl  command  stores  its  configuration  files  in  the   standard   place
          XDG_CONFIG_HOME. See the XDG Base Directory Specification.

       XDG_DATA_HOME
          The  pg_autoctl  command  stores  its  internal  states  files  in  the  standard place
          XDG_DATA_HOME,  which  defaults  to  ~/.local/share.  See  the   XDG   Base   Directory
          Specification.
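
   Examples
        As an illustration only, a formation named sales could be dropped with the following
        command; the formation name and monitor URI are assumptions:

           $ pg_autoctl drop formation --formation sales \
                --monitor 'postgres://autoctl_node@localhost:5500/pg_auto_failover?sslmode=prefer'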

   pg_autoctl config
       pg_autoctl config - Manages the pg_autoctl configuration

   pg_autoctl config get
       pg_autoctl config get - Get the value of a given pg_autoctl configuration variable

   Synopsis
       This command prints a pg_autoctl configuration setting:

          usage: pg_autoctl config get  [ --pgdata ] [ --json ] [ section.option ]

          --pgdata      path to data directory

   Options
       --pgdata
              Location  of  the  Postgres node being managed locally. Defaults to the environment
              variable PGDATA. Use --monitor to connect to a monitor from anywhere,  rather  than
              the monitor URI used by a local Postgres node managed with pg_autoctl.

       --json Output JSON formatted data.

   Environment
       PGDATA
          Postgres directory location. Can be used instead of the --pgdata option.

       PG_AUTOCTL_MONITOR
          Postgres  URI  to  connect  to  the  monitor node, can be used instead of the --monitor
          option.

       XDG_CONFIG_HOME
          The  pg_autoctl  command  stores  its  configuration  files  in  the   standard   place
          XDG_CONFIG_HOME. See the XDG Base Directory Specification.

       XDG_DATA_HOME
          The  pg_autoctl  command  stores  its  internal  states  files  in  the  standard place
          XDG_DATA_HOME,  which  defaults  to  ~/.local/share.  See  the   XDG   Base   Directory
          Specification.

   Description
        When the argument section.option is used, this is the name of a configuration option. The
        configuration file for pg_autoctl is stored using the INI format.

       When no argument is given to pg_autoctl config get the entire configuration file is  given
       in  the  output. To figure out where the configuration file is stored, see pg_autoctl show
       file and use pg_autoctl show file --config.

   Examples
       Without arguments, we get the entire file:

          $ pg_autoctl config get --pgdata node1
          [pg_autoctl]
          role = keeper
          monitor = postgres://autoctl_node@localhost:5500/pg_auto_failover?sslmode=prefer
          formation = default
          group = 0
          name = node1
          hostname = localhost
          nodekind = standalone

          [postgresql]
          pgdata = /Users/dim/dev/MS/pg_auto_failover/tmux/node1
          pg_ctl = /Applications/Postgres.app/Contents/Versions/12/bin/pg_ctl
          dbname = demo
          host = /tmp
          port = 5501
          proxyport = 0
          listen_addresses = *
          auth_method = trust
          hba_level = app

          [ssl]
          active = 1
          sslmode = require
          cert_file = /Users/dim/dev/MS/pg_auto_failover/tmux/node1/server.crt
          key_file = /Users/dim/dev/MS/pg_auto_failover/tmux/node1/server.key

          [replication]
          maximum_backup_rate = 100M
          backup_directory = /Users/dim/dev/MS/pg_auto_failover/tmux/backup/node_1

          [timeout]
          network_partition_timeout = 20
          prepare_promotion_catchup = 30
          prepare_promotion_walreceiver = 5
          postgresql_restart_failure_timeout = 20
          postgresql_restart_failure_max_retries = 3

       It is possible to pipe JSON formatted output to the jq command line and filter the  result
       down to a specific section of the file:

          $ pg_autoctl config get --pgdata node1 --json | jq .pg_autoctl
          {
            "role": "keeper",
            "monitor": "postgres://autoctl_node@localhost:5500/pg_auto_failover?sslmode=prefer",
            "formation": "default",
            "group": 0,
            "name": "node1",
            "hostname": "localhost",
            "nodekind": "standalone"
          }

       Finally, a single configuration element can be listed:

          $ pg_autoctl config get --pgdata node1 ssl.sslmode --json
          require

   pg_autoctl config set
       pg_autoctl config set - Set the value of a given pg_autoctl configuration variable

   Synopsis
        This command sets a pg_autoctl configuration setting to the given value:

          usage: pg_autoctl config set  [ --pgdata ] [ --json ] section.option [ value ]

          --pgdata      path to data directory

   Options
       --pgdata
              Location  of  the  Postgres node being managed locally. Defaults to the environment
              variable PGDATA. Use --monitor to connect to a monitor from anywhere,  rather  than
              the monitor URI used by a local Postgres node managed with pg_autoctl.

       --json Output JSON formatted data.

   Environment
       PGDATA
          Postgres directory location. Can be used instead of the --pgdata option.

       PG_AUTOCTL_MONITOR
          Postgres  URI  to  connect  to  the  monitor node, can be used instead of the --monitor
          option.

       XDG_CONFIG_HOME
          The  pg_autoctl  command  stores  its  configuration  files  in  the   standard   place
          XDG_CONFIG_HOME. See the XDG Base Directory Specification.

       XDG_DATA_HOME
          The  pg_autoctl  command  stores  its  internal  states  files  in  the  standard place
          XDG_DATA_HOME,  which  defaults  to  ~/.local/share.  See  the   XDG   Base   Directory
          Specification.

   Description
        This command allows setting a pg_autoctl configuration setting to a new value. Most
        settings can be changed and then reloaded online.

        Some of those changes can then be applied to an already running process with the
        pg_autoctl reload command.
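
        For instance, here is a sketch that lowers the pg_basebackup transfer rate and then asks
        the running pg_autoctl process to reload its configuration; the data directory is an
        assumption:

           $ pg_autoctl config set --pgdata node1 replication.maximum_backup_rate 50M

           $ pg_autoctl reload --pgdata node1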

   Settings
       pg_autoctl.role
          This setting can not be changed. It can be either monitor or keeper and the rest of the
          configuration file is read depending on this value.

       pg_autoctl.monitor
          URI of the pg_autoctl monitor Postgres service. Can be changed with a reload.

          To register an existing node to a new monitor, use pg_autoctl disable monitor and  then
          pg_autoctl enable monitor.

       pg_autoctl.formation
          Formation  to  which  this  node  has  been  registered.  Changing  this setting is not
          supported.

       pg_autoctl.group
          Group in which this node has been registered. Changing this setting is not supported.

       pg_autoctl.name
          Name of the node as known to the monitor and listed in pg_autoctl show  state.  Can  be
          changed with a reload.

       pg_autoctl.hostname
          Hostname  or  IP  address  of  the node, as known to the monitor. Can be changed with a
          reload.

       pg_autoctl.nodekind
          This setting can not be changed and depends on the command that has been used to create
          this pg_autoctl node.

       postgresql.pgdata
          Directory  where the managed Postgres instance is to be created (or found) and managed.
          Can't be changed.

       postgresql.pg_ctl
           Path to the pg_ctl tool used to manage this Postgres instance. The absolute path
           depends on the major version of Postgres and looks like
           /usr/lib/postgresql/13/bin/pg_ctl when using a Debian or Ubuntu OS.

          Can be changed after a major upgrade of Postgres.

       postgresql.dbname
          Name of the database that is used to connect to Postgres. Can be changed, but then must
          be changed manually on the monitor's pgautofailover.formation table with a SQL command.

          WARNING:
              When  using  pg_auto_failover  enterprise  edition  with Citus support, this is the
              database where pg_autoctl maintains the list of Citus  nodes  on  the  coordinator.
              Using the same database name as your application that uses Citus is then crucial.

       postgresql.host
           Hostname to use in connection strings when connecting from the local pg_autoctl process
           to the local Postgres database. Defaults to the Operating System default value for the
           Unix Domain Socket directory, which is either /tmp or, when using Debian or Ubuntu,
           /var/run/postgresql.

          Can be changed with a reload.

       postgresql.port
          Port on which Postgres should be managed. Can be changed offline, between a  pg_autoctl
          stop and a subsequent pg_autoctl start.

       postgresql.listen_addresses
           Value to set for the Postgres parameter of the same name. At the moment pg_autoctl
           only supports a single address for this parameter.

       postgresql.auth_method
          Authentication method to use when editing HBA rules to allow the Postgres  nodes  of  a
          formation  to  connect  to  each other, and to the monitor, and to allow the monitor to
          connect to the nodes.

          Can be changed online with a reload, but actually  adding  new  HBA  rules  requires  a
          restart of the "node-active" service.

       postgresql.hba_level
          This  setting  reflects  the choice of --skip-pg-hba or --pg-hba-lan that has been used
          when creating this pg_autoctl node. Can be changed with a reload, though the HBA  rules
          that have been previously added will not get removed.

       ssl.active, ssl.sslmode, ssl.cert_file, ssl.key_file, etc
          Please  use  the  command pg_autoctl enable ssl or pg_autoctl disable ssl to manage the
          SSL settings in the ssl  section  of  the  configuration.  Using  those  commands,  the
          settings can be changed online.

       replication.maximum_backup_rate
          Used  as  a parameter to pg_basebackup, defaults to 100M. Can be changed with a reload.
          Changing this value does not affect an already running pg_basebackup command.

          Limiting the bandwidth used by pg_basebackup makes the operation slower, and still  has
          the advantage of limiting the impact on the disks of the primary server.

       replication.backup_directory
           Target location of the pg_basebackup command used by pg_autoctl when creating a
           secondary node. When done fetching the data over the network, pg_autoctl uses the
           rename(2) system call to rename the temporary download location to the target PGDATA
           location.

           The rename(2) system call is known to be atomic when both the source and the target of
           the operation are using the same file system / mount point.

          Can  be  changed  online  with  a reload, will not affect already running pg_basebackup
          sub-processes.

       replication.password
           Used as a parameter in the connection string to the upstream Postgres node. The
           "replication" connection uses the password set up in the pg_autoctl configuration
           file.

          Changing  the  replication.password  of a pg_autoctl configuration has no effect on the
          Postgres database itself. The password must  match  what  the  Postgres  upstream  node
          expects,  which  can  be  set with the following SQL command run on the upstream server
          (primary or other standby node):

              alter user pgautofailover_replicator password 'h4ckm3m0r3';

          The replication.password can be changed online with a reload, but  requires  restarting
          the  Postgres  service  to  be  activated.  Postgres  only  reads  the primary_conninfo
          connection string at start-up, up to and including Postgres 12. With  Postgres  13  and
          following, it is possible to reload this Postgres parameter.
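
           For instance, here is a sketch of setting the keeper-side configuration value to match
           the password set on the upstream node above; the data directory is an assumption:

               $ pg_autoctl config set --pgdata node1 replication.password h4ckm3m0r3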

       timeout.network_partition_timeout
          Timeout  (in  seconds)  that  pg_autoctl waits before deciding that it is on the losing
          side of a network partition. When pg_autoctl fails to connect to the monitor  and  when
          the  local  Postgres  instance pg_stat_replication system view is empty, and after this
          many seconds have passed, then pg_autoctl demotes itself.

          Can be changed with a reload.

       timeout.prepare_promotion_catchup
          Currently not used in the source code. Can be changed with a reload.

       timeout.prepare_promotion_walreceiver
          Currently not used in the source code. Can be changed with a reload.

       timeout.postgresql_restart_failure_timeout
          When pg_autoctl fails to start Postgres for at  least  this  duration  from  the  first
          attempt,  then  it  starts reporting that Postgres is not running to the monitor, which
          might then decide to implement a failover.

          Can be changed with a reload.

       timeout.postgresql_restart_failure_max_retries
           When pg_autoctl fails to start Postgres for at least this many times then it starts
           reporting that Postgres is not running to the monitor, which then might decide to
           implement a failover.

          Can be changed with a reload.

   pg_autoctl config check
       pg_autoctl config check - Check pg_autoctl configuration

   Synopsis
       This command implements a very basic list of sanity checks for a pg_autoctl node setup:

          usage: pg_autoctl config check  [ --pgdata ] [ --json ]

          --pgdata      path to data directory
          --json        output data in the JSON format

   Options
       --pgdata
              Location of the Postgres node being managed locally. Defaults  to  the  environment
              variable  PGDATA.  Use --monitor to connect to a monitor from anywhere, rather than
              the monitor URI used by a local Postgres node managed with pg_autoctl.

       --json Output JSON formatted data.

   Environment
       PGDATA
          Postgres directory location. Can be used instead of the --pgdata option.

       PG_AUTOCTL_MONITOR
          Postgres URI to connect to the monitor node, can  be  used  instead  of  the  --monitor
          option.

       XDG_CONFIG_HOME
          The   pg_autoctl   command  stores  its  configuration  files  in  the  standard  place
          XDG_CONFIG_HOME. See the XDG Base Directory Specification.

       XDG_DATA_HOME
          The pg_autoctl  command  stores  its  internal  states  files  in  the  standard  place
          XDG_DATA_HOME,   which   defaults   to  ~/.local/share.  See  the  XDG  Base  Directory
          Specification.

   Examples
          $ pg_autoctl config check --pgdata node1
          18:37:27 63749 INFO  Postgres setup for PGDATA "/Users/dim/dev/MS/pg_auto_failover/tmux/node1" is ok, running with PID 5501 and port 99698
          18:37:27 63749 INFO  Connection to local Postgres ok, using "port=5501 dbname=demo host=/tmp"
          18:37:27 63749 INFO  Postgres configuration settings required for pg_auto_failover are ok
          18:37:27 63749 WARN  Postgres 12.1 does not support replication slots on a standby node
          18:37:27 63749 INFO  Connection to monitor ok, using "postgres://autoctl_node@localhost:5500/pg_auto_failover?sslmode=prefer"
          18:37:27 63749 INFO  Monitor is running version "1.5.0.1", as expected
          pgdata:                /Users/dim/dev/MS/pg_auto_failover/tmux/node1
          pg_ctl:                /Applications/Postgres.app/Contents/Versions/12/bin/pg_ctl
          pg_version:            12.3
          pghost:                /tmp
          pgport:                5501
          proxyport:             0
          pid:                   99698
          is in recovery:        no
          Control Version:       1201
          Catalog Version:       201909212
          System Identifier:     6941034382470571312
          Latest checkpoint LSN: 0/6000098
          Postmaster status:     ready

   pg_autoctl show
       pg_autoctl show - Show pg_auto_failover information

   pg_autoctl show uri
       pg_autoctl show uri - Show the postgres uri to use to connect to pg_auto_failover nodes

   Synopsis
       This command outputs  the  monitor  or  the  coordinator  Postgres  URI  to  use  from  an
       application to connect to Postgres:

          usage: pg_autoctl show uri  [ --pgdata --monitor --formation --json ]

            --pgdata      path to data directory
            --monitor     monitor uri
            --formation   show the coordinator uri of given formation
            --json        output data in the JSON format

   Options
       --pgdata
              Location  of  the  Postgres node being managed locally. Defaults to the environment
              variable PGDATA. Use --monitor to connect to a monitor from anywhere,  rather  than
              the monitor URI used by a local Postgres node managed with pg_autoctl.

       --monitor
              Postgres URI used to connect to the monitor. Must use the autoctl_node username and
              target the pg_auto_failover database name. It is possible to show the Postgres  URI
              from the monitor node using the command pg_autoctl show uri.

              Defaults to the value of the environment variable PG_AUTOCTL_MONITOR.

        --formation
               When --formation is used, print only the Postgres URI to connect to the given
               formation. Without this option, the URIs for the monitor and all the known
               formations are listed.

        --json Output JSON formatted data instead of a table formatted list.

   Environment
       PGDATA
          Postgres directory location. Can be used instead of the --pgdata option.

       PG_AUTOCTL_MONITOR
          Postgres URI to connect to the monitor node, can  be  used  instead  of  the  --monitor
          option.

       XDG_CONFIG_HOME
          The   pg_autoctl   command  stores  its  configuration  files  in  the  standard  place
          XDG_CONFIG_HOME. See the XDG Base Directory Specification.

       XDG_DATA_HOME
          The pg_autoctl  command  stores  its  internal  states  files  in  the  standard  place
          XDG_DATA_HOME,   which   defaults   to  ~/.local/share.  See  the  XDG  Base  Directory
          Specification.

   Examples
          $ pg_autoctl show uri
                  Type |    Name | Connection String
          -------------+---------+-------------------------------
               monitor | monitor | postgres://autoctl_node@localhost:5500/pg_auto_failover
             formation | default | postgres://localhost:5502,localhost:5503,localhost:5501/demo?target_session_attrs=read-write&sslmode=prefer

          $ pg_autoctl show uri --formation monitor
          postgres://autoctl_node@localhost:5500/pg_auto_failover

          $ pg_autoctl show uri --formation default
          postgres://localhost:5503,localhost:5502,localhost:5501/demo?target_session_attrs=read-write&sslmode=prefer

          $ pg_autoctl show uri --json
          [
           {
               "uri": "postgres://autoctl_node@localhost:5500/pg_auto_failover",
               "name": "monitor",
               "type": "monitor"
           },
           {
               "uri": "postgres://localhost:5503,localhost:5502,localhost:5501/demo?target_session_attrs=read-write&sslmode=prefer",
               "name": "default",
               "type": "formation"
           }
          ]

   Multi-hosts Postgres connection strings
       PostgreSQL since version 10 includes support for multiple hosts in its  connection  driver
       libpq, with the special target_session_attrs connection property.

       This  multi-hosts  connection  string  facility allows applications to keep using the same
       stable connection string over server-side failovers. That's why pg_autoctl show  uri  uses
       that format.
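
        Any libpq based client can use the formation URI as-is. Here is a minimal sketch with
        psql, assuming the default formation and demo database from the examples above:

           $ psql "postgres://localhost:5502,localhost:5503,localhost:5501/demo?target_session_attrs=read-write&sslmode=prefer" -c 'SELECT pg_is_in_recovery();'

        With target_session_attrs=read-write, libpq tries the listed hosts until it finds the one
        that accepts writes, so the query above always runs on the current primary.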

   pg_autoctl show events
        pg_autoctl show events - Prints the monitor's events for nodes in a given formation and group

   Synopsis
        This command outputs the events that the pg_auto_failover monitor records about state
        changes of the pg_auto_failover nodes it manages:

          usage: pg_autoctl show events  [ --pgdata --formation --group --count ]

          --pgdata      path to data directory
          --monitor     pg_auto_failover Monitor Postgres URL
          --formation   formation to query, defaults to 'default'
          --group       group to query formation, defaults to all
          --count       how many events to fetch, defaults to 10
          --watch       display an auto-updating dashboard
          --json        output data in the JSON format

   Options
       --pgdata
              Location of the Postgres node being managed locally. Defaults  to  the  environment
              variable  PGDATA.  Use --monitor to connect to a monitor from anywhere, rather than
              the monitor URI used by a local Postgres node managed with pg_autoctl.

       --monitor
              Postgres URI used to connect to the monitor. Must use the autoctl_node username and
              target  the pg_auto_failover database name. It is possible to show the Postgres URI
              from the monitor node using the command pg_autoctl show uri.

       --formation
              List the events recorded for nodes in the given formation. Defaults to default.

       --count
               Number of events to fetch from the monitor. By default only the last 10 events
               are printed.

       --watch
              Take control of the terminal and display the current state of the  system  and  the
              last  events  from  the  monitor.  The  display  is updated automatically every 500
              milliseconds (half a second) and reacts properly to window size change.

              Depending on the terminal window size, a different set of columns is visible in the
              state part of the output. See pg_autoctl watch.

        --json Output JSON formatted data instead of a table formatted list.

   Environment
       PGDATA
          Postgres directory location. Can be used instead of the --pgdata option.

       PG_AUTOCTL_MONITOR
          Postgres  URI  to  connect  to  the  monitor node, can be used instead of the --monitor
          option.

       XDG_CONFIG_HOME
          The  pg_autoctl  command  stores  its  configuration  files  in  the   standard   place
          XDG_CONFIG_HOME. See the XDG Base Directory Specification.

       XDG_DATA_HOME
          The  pg_autoctl  command  stores  its  internal  states  files  in  the  standard place
          XDG_DATA_HOME,  which  defaults  to  ~/.local/share.  See  the   XDG   Base   Directory
          Specification.

   Examples
          $ pg_autoctl show events --count 2 --json
          [
           {
               "nodeid": 1,
               "eventid": 15,
               "groupid": 0,
               "nodehost": "localhost",
               "nodename": "node1",
               "nodeport": 5501,
               "eventtime": "2021-03-18T12:32:36.103467+01:00",
               "goalstate": "primary",
               "description": "Setting goal state of node 1 \"node1\" (localhost:5501) to primary now that at least one secondary candidate node is healthy.",
               "formationid": "default",
               "reportedlsn": "0/4000060",
               "reportedstate": "wait_primary",
               "reportedrepstate": "async",
               "candidatepriority": 50,
               "replicationquorum": true
           },
           {
               "nodeid": 1,
               "eventid": 16,
               "groupid": 0,
               "nodehost": "localhost",
               "nodename": "node1",
               "nodeport": 5501,
               "eventtime": "2021-03-18T12:32:36.215494+01:00",
               "goalstate": "primary",
               "description": "New state is reported by node 1 \"node1\" (localhost:5501): \"primary\"",
               "formationid": "default",
               "reportedlsn": "0/4000110",
               "reportedstate": "primary",
               "reportedrepstate": "quorum",
               "candidatepriority": 50,
               "replicationquorum": true
           }
          ]

   pg_autoctl show state
       pg_autoctl show state - Prints monitor's state of nodes in a given formation and group

   Synopsis
       This  command  outputs  the  current  state  of the formation and groups registered to the
       pg_auto_failover monitor:

          usage: pg_autoctl show state  [ --pgdata --formation --group ]

          --pgdata      path to data directory
          --monitor     pg_auto_failover Monitor Postgres URL
          --formation   formation to query, defaults to 'default'
          --group       group to query formation, defaults to all
          --local       show local data, do not connect to the monitor
          --watch       display an auto-updating dashboard
          --json        output data in the JSON format

   Options
       --pgdata
              Location of the Postgres node being managed locally. Defaults  to  the  environment
              variable  PGDATA.  Use --monitor to connect to a monitor from anywhere, rather than
              the monitor URI used by a local Postgres node managed with pg_autoctl.

       --monitor
              Postgres URI used to connect to the monitor. Must use the autoctl_node username and
              target  the pg_auto_failover database name. It is possible to show the Postgres URI
              from the monitor node using the command pg_autoctl show uri.

       --formation
               Show the state of the nodes in the given formation. Defaults to default.

       --group
               Limit output to a single group in the formation. Defaults to including all groups
               registered in the target formation.

       --local
              Print the local state information without connecting to the monitor.

       --watch
              Take  control  of  the terminal and display the current state of the system and the
              last events from the monitor.  The  display  is  updated  automatically  every  500
              milliseconds (half a second) and reacts properly to window size change.

              Depending on the terminal window size, a different set of columns is visible in the
              state part of the output. See pg_autoctl watch.

        --json Output JSON formatted data instead of a table formatted list.

   Environment
       PGDATA
          Postgres directory location. Can be used instead of the --pgdata option.

       PG_AUTOCTL_MONITOR
          Postgres URI to connect to the monitor node, can  be  used  instead  of  the  --monitor
          option.

       XDG_CONFIG_HOME
          The   pg_autoctl   command  stores  its  configuration  files  in  the  standard  place
          XDG_CONFIG_HOME. See the XDG Base Directory Specification.

       XDG_DATA_HOME
          The pg_autoctl  command  stores  its  internal  states  files  in  the  standard  place
          XDG_DATA_HOME,   which   defaults   to  ~/.local/share.  See  the  XDG  Base  Directory
          Specification.

   Description
       The pg_autoctl show state output includes the following columns:

          • Name
                Name of the node.

          • Node
                Node information. When the formation has a single group (group zero),  then  this
                column only contains the nodeId.

                Only Citus formations allow several groups. When using a Citus formation the Node
                column contains the groupId and the nodeId, separated by a colon, such as 0:1 for
                the first coordinator node.

          • Host:Port
                Hostname and port number used to connect to the node.

          • TLI: LSN
                Timeline identifier (TLI) and Postgres Log Sequence Number (LSN).

                The LSN is the current position in the Postgres WAL stream. This is a hexadecimal
                number. See pg_lsn for more information.

                 The current timeline is incremented each time a failover happens, or when doing
                 Point In Time Recovery. A node can only reach the secondary state when it is on
                 the same timeline as its primary node. A query sketch to read the current LSN
                 and timeline directly from a Postgres node is given after this list.

          • Connection
                This  output  field  contains  two  bits  of  information.  First,  the  Postgres
                connection  type that the node provides, either read-write or read-only. Then the
                 mark ! is added when the monitor has failed to connect to this node, and ? when
                 the monitor has not connected to the node yet.

          • Reported State
                The  latest  reported  FSM  state,  as  reported to the monitor by the pg_autoctl
                process running on the Postgres node.

          • Assigned State
                 The assigned FSM state on the monitor. When the assigned state is not the same
                 as the reported state, then the pg_autoctl process running on the Postgres node
                 might not have retrieved the assigned state yet, or might still be implementing
                 the FSM transition from the current state to the assigned state.
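
        The reported TLI and LSN can be cross-checked directly on the primary Postgres node; a
        minimal query sketch using psql, assuming the node1 instance from the examples below
        (database demo, port 5501):

           $ psql -p 5501 -d demo -c 'SELECT pg_current_wal_lsn();'
           $ psql -p 5501 -d demo -c 'SELECT timeline_id FROM pg_control_checkpoint();'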

   Examples
          $ pg_autoctl show state
           Name |  Node |      Host:Port |       TLI: LSN |   Connection |      Reported State |      Assigned State
          ------+-------+----------------+----------------+--------------+---------------------+--------------------
          node1 |     1 | localhost:5501 |   1: 0/4000678 |   read-write |             primary |             primary
          node2 |     2 | localhost:5502 |   1: 0/4000678 |    read-only |           secondary |           secondary
          node3 |     3 | localhost:5503 |   1: 0/4000678 |    read-only |           secondary |           secondary

          $ pg_autoctl show state --local
           Name |  Node |      Host:Port |       TLI: LSN |   Connection |      Reported State |      Assigned State
          ------+-------+----------------+----------------+--------------+---------------------+--------------------
          node1 |     1 | localhost:5501 |   1: 0/4000678 | read-write ? |             primary |             primary

          $ pg_autoctl show state --json
          [
              {
                  "health": 1,
                  "node_id": 1,
                  "group_id": 0,
                  "nodehost": "localhost",
                  "nodename": "node1",
                  "nodeport": 5501,
                  "reported_lsn": "0/4000678",
                  "reported_tli": 1,
                  "formation_kind": "pgsql",
                  "candidate_priority": 50,
                  "replication_quorum": true,
                  "current_group_state": "primary",
                  "assigned_group_state": "primary"
              },
              {
                  "health": 1,
                  "node_id": 2,
                  "group_id": 0,
                  "nodehost": "localhost",
                  "nodename": "node2",
                  "nodeport": 5502,
                  "reported_lsn": "0/4000678",
                  "reported_tli": 1,
                  "formation_kind": "pgsql",
                  "candidate_priority": 50,
                  "replication_quorum": true,
                  "current_group_state": "secondary",
                  "assigned_group_state": "secondary"
              },
              {
                  "health": 1,
                  "node_id": 3,
                  "group_id": 0,
                  "nodehost": "localhost",
                  "nodename": "node3",
                  "nodeport": 5503,
                  "reported_lsn": "0/4000678",
                  "reported_tli": 1,
                  "formation_kind": "pgsql",
                  "candidate_priority": 50,
                  "replication_quorum": true,
                  "current_group_state": "secondary",
                  "assigned_group_state": "secondary"
              }
          ]

   pg_autoctl show settings
       pg_autoctl show settings - Print replication settings for a formation from the monitor

   Synopsis
        This command allows reviewing all the replication settings of a given formation
        (defaulting to 'default' as usual):

          usage: pg_autoctl show settings  [ --pgdata ] [ --json ] [ --formation ]

          --pgdata      path to data directory
          --monitor     pg_auto_failover Monitor Postgres URL
          --json        output data in the JSON format
          --formation   pg_auto_failover formation

   Description
       See also pg_autoctl get formation settings which is a synonym.

        The output contains settings and values that apply in different contexts, as shown here
        with a formation of four nodes, where node_4 is not participating in the replication
        quorum and is also not a candidate for failover:

          $ pg_autoctl show settings
             Context |    Name |                   Setting | Value
           ----------+---------+---------------------------+-------------------------------------------------------------
           formation | default |      number_sync_standbys | 1
             primary |  node_1 | synchronous_standby_names | 'ANY 1 (pgautofailover_standby_3, pgautofailover_standby_2)'
                node |  node_1 |        replication quorum | true
                node |  node_2 |        replication quorum | true
                node |  node_3 |        replication quorum | true
                node |  node_4 |        replication quorum | false
                node |  node_1 |        candidate priority | 50
                node |  node_2 |        candidate priority | 50
                node |  node_3 |        candidate priority | 50
                node |  node_4 |        candidate priority | 0

        Three replication settings contexts are listed:

          1. The "formation" context contains a single entry, the value  of  number_sync_standbys
             for the target formation.

           2. The "primary" context contains one entry per group of Postgres nodes in the
              formation, and shows the current value of the synchronous_standby_names Postgres
              setting as computed by the monitor. It should match what is currently set on the
              primary node, except while a change is being applied, as shown by the primary
              being in the APPLY_SETTINGS state.

           3. The "node" context contains two entries per node: one line shows the replication
              quorum setting of the node, and another line shows its candidate priority.

       This command gives an overview of all the settings that apply to the current formation.
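
        Those settings can be changed at run-time with the pg_autoctl set family of commands,
        documented in their own sections of this manual. A minimal sketch, assuming the nodes
        node_1 .. node_4 above are managed locally under matching --pgdata directories:

           $ pg_autoctl set formation number-sync-standbys 1 --pgdata ./node_1
           $ pg_autoctl set node candidate-priority 0 --pgdata ./node_4
           $ pg_autoctl set node replication-quorum false --pgdata ./node_4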

   Options
       --pgdata
              Location of the Postgres node being managed locally. Defaults  to  the  environment
              variable  PGDATA.  Use --monitor to connect to a monitor from anywhere, rather than
              the monitor URI used by a local Postgres node managed with pg_autoctl.

       --monitor
              Postgres URI used to connect to the monitor. Must use the autoctl_node username and
              target  the pg_auto_failover database name. It is possible to show the Postgres URI
              from the monitor node using the command pg_autoctl show uri.

              Defaults to the value of the environment variable PG_AUTOCTL_MONITOR.

       --formation
              Show the current replication settings for the  given  formation.  Defaults  to  the
              default formation.

        --json Output JSON formatted data instead of a table formatted list.

   Environment
       PGDATA
          Postgres directory location. Can be used instead of the --pgdata option.

       PG_AUTOCTL_MONITOR
          Postgres  URI  to  connect  to  the  monitor node, can be used instead of the --monitor
          option.

       XDG_CONFIG_HOME
          The  pg_autoctl  command  stores  its  configuration  files  in  the   standard   place
          XDG_CONFIG_HOME. See the XDG Base Directory Specification.

       XDG_DATA_HOME
          The  pg_autoctl  command  stores  its  internal  states  files  in  the  standard place
          XDG_DATA_HOME,  which  defaults  to  ~/.local/share.  See  the   XDG   Base   Directory
          Specification.

   Examples
          $ pg_autoctl show settings
               Context |    Name |                   Setting | Value
             ----------+---------+---------------------------+-------------------------------------------------------------
             formation | default |      number_sync_standbys | 1
               primary |   node1 | synchronous_standby_names | 'ANY 1 (pgautofailover_standby_2, pgautofailover_standby_3)'
                  node |   node1 |        candidate priority | 50
                  node |   node2 |        candidate priority | 50
                  node |   node3 |        candidate priority | 50
                  node |   node1 |        replication quorum | true
                  node |   node2 |        replication quorum | true
                  node |   node3 |        replication quorum | true

   pg_autoctl show standby-names
       pg_autoctl show standby-names - Prints synchronous_standby_names for a given group

   Synopsis
       This  command  prints  the  current  value  for  synchronous_standby_names for the primary
       Postgres server of the target group (default 0) in the target formation (default default),
       as computed by the monitor:

          usage: pg_autoctl show standby-names  [ --pgdata ] --formation --group

            --pgdata      path to data directory
            --monitor     pg_auto_failover Monitor Postgres URL
            --formation   formation to query, defaults to 'default'
            --group       group to query formation, defaults to all
            --json        output data in the JSON format

   Options
       --pgdata
              Location  of  the  Postgres node being managed locally. Defaults to the environment
              variable PGDATA. Use --monitor to connect to a monitor from anywhere,  rather  than
              the monitor URI used by a local Postgres node managed with pg_autoctl.

       --monitor
              Postgres URI used to connect to the monitor. Must use the autoctl_node username and
              target the pg_auto_failover database name. It is possible to show the Postgres  URI
              from the monitor node using the command pg_autoctl show uri.

              Defaults to the value of the environment variable PG_AUTOCTL_MONITOR.

       --formation
              Show  the current synchronous_standby_names value for the given formation. Defaults
              to the default formation.

       --group
              Show the current synchronous_standby_names value for the given group in  the  given
              formation. Defaults to group 0.

        --json Output JSON formatted data instead of a table formatted list.

   Environment
       PGDATA
          Postgres directory location. Can be used instead of the --pgdata option.

       PG_AUTOCTL_MONITOR
          Postgres  URI  to  connect  to  the  monitor node, can be used instead of the --monitor
          option.

       XDG_CONFIG_HOME
          The  pg_autoctl  command  stores  its  configuration  files  in  the   standard   place
          XDG_CONFIG_HOME. See the XDG Base Directory Specification.

       XDG_DATA_HOME
          The  pg_autoctl  command  stores  its  internal  states  files  in  the  standard place
          XDG_DATA_HOME,  which  defaults  to  ~/.local/share.  See  the   XDG   Base   Directory
          Specification.

   Examples
          $ pg_autoctl show standby-names
          'ANY 1 (pgautofailover_standby_2, pgautofailover_standby_3)'

          $ pg_autoctl show standby-names --json
          {
              "formation": "default",
              "group": 0,
              "synchronous_standby_names": "ANY 1 (pgautofailover_standby_2, pgautofailover_standby_3)"
          }

   pg_autoctl show file
       pg_autoctl show file - List pg_autoctl internal files (config, state, pid)

   Synopsis
        This command lists the files that pg_autoctl uses internally for its own configuration,
        state, and pid:

          usage: pg_autoctl show file  [ --pgdata --all --config | --state | --init | --pid --contents ]

          --pgdata      path to data directory
          --all         show all pg_autoctl files
          --config      show pg_autoctl configuration file
          --state       show pg_autoctl state file
          --init        show pg_autoctl initialisation state file
          --pid         show pg_autoctl PID file
          --contents    show selected file contents
          --json        output data in the JSON format

   Description
       The pg_autoctl command follows  the  XDG  Base  Directory  Specification  and  places  its
       internal  and  configuration  files  by default in places such as ~/.config/pg_autoctl and
       ~/.local/share/pg_autoctl.

       It is possible to change the default XDG locations  by  using  the  environment  variables
       XDG_CONFIG_HOME, XDG_DATA_HOME, and XDG_RUNTIME_DIR.

        Also, pg_autoctl uses sub-directories that are specific to a given PGDATA, making it
        possible to run several Postgres nodes on the same machine, which is very practical for
        testing and development purposes, though not advised for production setups.

   Configuration File
       The  pg_autoctl  configuration  file  for  an  instance  serving  the  data  directory  at
       /data/pgsql is found at ~/.config/pg_autoctl/data/pgsql/pg_autoctl.cfg, written in the INI
       format.

       It  is  possible  to  get  the  location  of  the  configuration file by using the command
       pg_autoctl show file --config --pgdata /data/pgsql and to output its content by using  the
       command pg_autoctl show file --config --contents --pgdata /data/pgsql.

       See also pg_autoctl config get and pg_autoctl config set.
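
        For instance, with the /data/pgsql instance used above (a sketch, not a captured
        session), where the last command reads a single setting from that file, here one of the
        timeout settings documented earlier:

           $ pg_autoctl show file --config --pgdata /data/pgsql
           $ pg_autoctl show file --config --contents --pgdata /data/pgsql
           $ pg_autoctl config get --pgdata /data/pgsql timeout.postgresql_restart_failure_timeout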

   State File
       The  pg_autoctl  state  file  for an instance serving the data directory at /data/pgsql is
       found at  ~/.local/share/pg_autoctl/data/pgsql/pg_autoctl.state,  written  in  a  specific
       binary format.

        This file is not intended to be written by anything other than pg_autoctl itself. In
        case of state corruption, see the troubleshooting section of the documentation.

       It is possible to get the location of the state file by using the command pg_autoctl  show
       file  --state  --pgdata  /data/pgsql  and  to  output  its  content  by  using the command
       pg_autoctl show file --state --contents --pgdata /data/pgsql.

   Init State File
       The pg_autoctl init state file for an instance serving the data directory  at  /data/pgsql
       is  found  at  ~/.local/share/pg_autoctl/data/pgsql/pg_autoctl.init, written in a specific
       binary format.

        This file is not intended to be written by anything other than pg_autoctl itself. In
        case of state corruption, see the troubleshooting section of the documentation.

       This initialization state file only exists during the initialization of a pg_auto_failover
       node. In normal operations, this file does not exist.

       It is possible to get the location of the state file by using the command pg_autoctl  show
       file --init --pgdata /data/pgsql and to output its content by using the command pg_autoctl
       show file --init --contents --pgdata /data/pgsql.

   PID File
       The pg_autoctl PID file for an instance serving the data directory at /data/pgsql is found
       at /tmp/pg_autoctl/data/pgsql/pg_autoctl.pid, written in a specific text format.

       The  PID  file  is  located in a temporary directory by default, or in the XDG_RUNTIME_DIR
       directory when this is setup.
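
        The --pid and --contents options documented below apply here too; a sketch, using the
        same /data/pgsql instance:

           $ pg_autoctl show file --pid --pgdata /data/pgsql
           $ pg_autoctl show file --pid --contents --pgdata /data/pgsql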

   Options
       --pgdata
              Location of the Postgres node being managed locally. Defaults  to  the  environment
              variable  PGDATA.  Use --monitor to connect to a monitor from anywhere, rather than
              the monitor URI used by a local Postgres node managed with pg_autoctl.

       --all  List all the files that belong to this pg_autoctl node.

       --config
              Show only the configuration file.

       --state
              Show only the state file.

        --init Show only the init state file, which only exists while the command pg_autoctl
               create postgres or the command pg_autoctl create monitor is running, or when that
               command has failed (and can then be retried).

       --pid  Show only the pid file.

       --contents
              When one of the options to show a specific file is in use,  then  --contents  shows
              the contents of the selected file instead of showing its absolute file path.

       --json Output JSON formatted data.

   Environment
       PGDATA
          Postgres directory location. Can be used instead of the --pgdata option.

       PG_AUTOCTL_MONITOR
          Postgres  URI  to  connect  to  the  monitor node, can be used instead of the --monitor
          option.

       XDG_CONFIG_HOME
          The  pg_autoctl  command  stores  its  configuration  files  in  the   standard   place
          XDG_CONFIG_HOME. See the XDG Base Directory Specification.

       XDG_DATA_HOME
          The  pg_autoctl  command  stores  its  internal  states  files  in  the  standard place
          XDG_DATA_HOME,  which  defaults  to  ~/.local/share.  See  the   XDG   Base   Directory
          Specification.

   Examples
       The  following  examples  are taken from a QA environment that has been prepared thanks to
       the make cluster command made available to the pg_auto_failover contributors. As a result,
       the XDG environment variables have been tweaked to obtain a self-contained test:

          $  tmux show-env | grep XDG
          XDG_CONFIG_HOME=/Users/dim/dev/MS/pg_auto_failover/tmux/config
          XDG_DATA_HOME=/Users/dim/dev/MS/pg_auto_failover/tmux/share
          XDG_RUNTIME_DIR=/Users/dim/dev/MS/pg_auto_failover/tmux/run

       Within that self-contained test location, we can see the following examples.

          $ pg_autoctl show file --pgdata ./node1
             File | Path
          --------+----------------
           Config | /Users/dim/dev/MS/pg_auto_failover/tmux/config/pg_autoctl/Users/dim/dev/MS/pg_auto_failover/tmux/node1/pg_autoctl.cfg
            State | /Users/dim/dev/MS/pg_auto_failover/tmux/share/pg_autoctl/Users/dim/dev/MS/pg_auto_failover/tmux/node1/pg_autoctl.state
             Init | /Users/dim/dev/MS/pg_auto_failover/tmux/share/pg_autoctl/Users/dim/dev/MS/pg_auto_failover/tmux/node1/pg_autoctl.init
              Pid | /Users/dim/dev/MS/pg_auto_failover/tmux/run/pg_autoctl/Users/dim/dev/MS/pg_auto_failover/tmux/node1/pg_autoctl.pid

          $ pg_autoctl show file --pgdata node1 --state
          /Users/dim/dev/MS/pg_auto_failover/tmux/share/pg_autoctl/Users/dim/dev/MS/pg_auto_failover/tmux/node1/pg_autoctl.state

          $ pg_autoctl show file --pgdata node1 --state --contents
          Current Role:             primary
          Assigned Role:            primary
          Last Monitor Contact:     Thu Mar 18 17:32:25 2021
          Last Secondary Contact:   0
          pg_autoctl state version: 1
          group:                    0
          node id:                  1
          nodes version:            0
          PostgreSQL Version:       1201
          PostgreSQL CatVersion:    201909212
          PostgreSQL System Id:     6940955496243696337

           $ pg_autoctl show file --pgdata node1 --config --contents --json | jq .pg_autoctl
          {
            "role": "keeper",
            "monitor": "postgres://autoctl_node@localhost:5500/pg_auto_failover?sslmode=prefer",
            "formation": "default",
            "group": 0,
            "name": "node1",
            "hostname": "localhost",
            "nodekind": "standalone"
          }

   pg_autoctl show systemd
       pg_autoctl show systemd - Print systemd service file for this node

   Synopsis
       This command outputs a configuration unit that is suitable for registering pg_autoctl as a
       systemd service.

   Examples
          $ pg_autoctl show systemd --pgdata node1
          17:38:29 99778 INFO  HINT: to complete a systemd integration, run the following commands:
          17:38:29 99778 INFO  pg_autoctl -q show systemd --pgdata "node1" | sudo tee /etc/systemd/system/pgautofailover.service
          17:38:29 99778 INFO  sudo systemctl daemon-reload
          17:38:29 99778 INFO  sudo systemctl enable pgautofailover
          17:38:29 99778 INFO  sudo systemctl start pgautofailover
          [Unit]
          Description = pg_auto_failover

          [Service]
          WorkingDirectory = /Users/dim
          Environment = 'PGDATA=node1'
          User = dim
          ExecStart = /Applications/Postgres.app/Contents/Versions/12/bin/pg_autoctl run
          Restart = always
          StartLimitBurst = 0
          ExecReload = /Applications/Postgres.app/Contents/Versions/12/bin/pg_autoctl reload

          [Install]
          WantedBy = multi-user.target

       To avoid the logs output, use the -q option:

          $ pg_autoctl show systemd --pgdata node1 -q
          [Unit]
          Description = pg_auto_failover

          [Service]
          WorkingDirectory = /Users/dim
          Environment = 'PGDATA=node1'
          User = dim
          ExecStart = /Applications/Postgres.app/Contents/Versions/12/bin/pg_autoctl run
          Restart = always
          StartLimitBurst = 0
          ExecReload = /Applications/Postgres.app/Contents/Versions/12/bin/pg_autoctl reload

          [Install]
          WantedBy = multi-user.target

   pg_autoctl enable
       pg_autoctl enable - Enable a feature on a formation

   pg_autoctl enable secondary
       pg_autoctl enable secondary - Enable secondary nodes on a formation

   Synopsis
        This feature makes the most sense when using the Enterprise Edition of pg_auto_failover,
        which is fully compatible with Citus formations. When secondary is enabled, the Citus
        worker creation policy is to assign a primary node and then a standby node for each
        group. When secondary is disabled, the Citus worker creation policy is to assign only
        the primary nodes.

           usage: pg_autoctl enable secondary  [ --pgdata --formation ]

          --pgdata      path to data directory
          --formation   Formation to enable secondary on

   Options
       --pgdata
              Location of the Postgres node being managed locally. Defaults  to  the  environment
              variable  PGDATA.  Use --monitor to connect to a monitor from anywhere, rather than
              the monitor URI used by a local Postgres node managed with pg_autoctl.

       --formation
               Target formation in which to enable the secondary feature.

   Environment
       PGDATA
          Postgres directory location. Can be used instead of the --pgdata option.

       PG_AUTOCTL_MONITOR
          Postgres URI to connect to the monitor node, can  be  used  instead  of  the  --monitor
          option.

       XDG_CONFIG_HOME
          The   pg_autoctl   command  stores  its  configuration  files  in  the  standard  place
          XDG_CONFIG_HOME. See the XDG Base Directory Specification.

       XDG_DATA_HOME
          The pg_autoctl  command  stores  its  internal  states  files  in  the  standard  place
          XDG_DATA_HOME,   which   defaults   to  ~/.local/share.  See  the  XDG  Base  Directory
          Specification.
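
    Examples
        A minimal sketch, not captured from a real run, assuming a node that is managed locally
        under --pgdata ./node1 and registered to the monitor of the target formation:

           $ pg_autoctl enable secondary --formation default --pgdata ./node1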

   pg_autoctl enable maintenance
       pg_autoctl enable maintenance - Enable Postgres maintenance mode on this node

   Synopsis
        A pg_auto_failover node can be put into a maintenance state. The Postgres node is then
        still registered to the monitor, and is known to be unreliable until maintenance is
        disabled. A node in the maintenance state is not a candidate for promotion.

        Typical uses of the maintenance state include Operating System or Postgres reboots, e.g.
        when applying security upgrades.

           usage: pg_autoctl enable maintenance  [ --pgdata --allow-failover ]

          --pgdata      path to data directory

   Options
       --pgdata
              Location  of  the  Postgres node being managed locally. Defaults to the environment
              variable PGDATA. Use --monitor to connect to a monitor from anywhere,  rather  than
              the monitor URI used by a local Postgres node managed with pg_autoctl.

        --allow-failover
               Allow the maintenance operation to proceed even when the target node is the
               current primary, in which case a failover to a secondary node happens before the
               node reaches the maintenance state.

   Environment
       PGDATA
          Postgres directory location. Can be used instead of the --pgdata option.

       PG_AUTOCTL_MONITOR
          Postgres  URI  to  connect  to  the  monitor node, can be used instead of the --monitor
          option.

       XDG_CONFIG_HOME
          The  pg_autoctl  command  stores  its  configuration  files  in  the   standard   place
          XDG_CONFIG_HOME. See the XDG Base Directory Specification.

       XDG_DATA_HOME
          The  pg_autoctl  command  stores  its  internal  states  files  in  the  standard place
          XDG_DATA_HOME,  which  defaults  to  ~/.local/share.  See  the   XDG   Base   Directory
          Specification.

   Examples
           $ pg_autoctl show state
           Name |  Node |      Host:Port |       LSN |   Connection |       Current State |      Assigned State
          ------+-------+----------------+-----------+--------------+---------------------+--------------------
          node1 |     1 | localhost:5501 | 0/4000760 |   read-write |             primary |             primary
          node2 |     2 | localhost:5502 | 0/4000760 |    read-only |           secondary |           secondary
          node3 |     3 | localhost:5503 | 0/4000760 |    read-only |           secondary |           secondary

          $ pg_autoctl enable maintenance --pgdata node3
          12:06:12 47086 INFO  Listening monitor notifications about state changes in formation "default" and group 0
          12:06:12 47086 INFO  Following table displays times when notifications are received
              Time |  Name |  Node |      Host:Port |       Current State |      Assigned State
          ---------+-------+-------+----------------+---------------------+--------------------
          12:06:12 | node1 |     1 | localhost:5501 |             primary |        join_primary
          12:06:12 | node3 |     3 | localhost:5503 |           secondary |    wait_maintenance
          12:06:12 | node3 |     3 | localhost:5503 |    wait_maintenance |    wait_maintenance
          12:06:12 | node1 |     1 | localhost:5501 |        join_primary |        join_primary
          12:06:12 | node3 |     3 | localhost:5503 |    wait_maintenance |         maintenance
          12:06:12 | node1 |     1 | localhost:5501 |        join_primary |             primary
          12:06:13 | node3 |     3 | localhost:5503 |         maintenance |         maintenance

          $ pg_autoctl show state
           Name |  Node |      Host:Port |       LSN |   Connection |       Current State |      Assigned State
          ------+-------+----------------+-----------+--------------+---------------------+--------------------
          node1 |     1 | localhost:5501 | 0/4000810 |   read-write |             primary |             primary
          node2 |     2 | localhost:5502 | 0/4000810 |    read-only |           secondary |           secondary
          node3 |     3 | localhost:5503 | 0/4000810 |         none |         maintenance |         maintenance

   pg_autoctl enable ssl
       pg_autoctl enable ssl - Enable SSL configuration on this node

   Synopsis
        It is possible to manage Postgres SSL settings with the pg_autoctl command, both at
        pg_autoctl create postgres time and again later at run-time, to update the SSL settings
        of a running node.

           usage: pg_autoctl enable ssl  [ --pgdata ] [ --json ]

          --pgdata      path to data directory
          --ssl-self-signed setup network encryption using self signed certificates (does NOT protect against MITM)
          --ssl-mode        use that sslmode in connection strings
          --ssl-ca-file     set the Postgres ssl_ca_file to that file path
          --ssl-crl-file    set the Postgres ssl_crl_file to that file path
          --no-ssl          don't enable network encryption (NOT recommended, prefer --ssl-self-signed)
          --server-key      set the Postgres ssl_key_file to that file path
          --server-cert     set the Postgres ssl_cert_file to that file path

   Options
       --pgdata
              Location  of  the  Postgres node being managed locally. Defaults to the environment
              variable PGDATA. Use --monitor to connect to a monitor from anywhere,  rather  than
              the monitor URI used by a local Postgres node managed with pg_autoctl.

       --ssl-self-signed
              Generate  SSL self-signed certificates to provide network encryption. This does not
              protect against man-in-the-middle kinds  of  attacks.  See  Security  settings  for
              pg_auto_failover for more about our SSL settings.

       --ssl-mode
              SSL  Mode  used  by  pg_autoctl  when  connecting  to  other  nodes, including when
              connecting for streaming replication.

       --ssl-ca-file
              Set the Postgres ssl_ca_file to that file path.

       --ssl-crl-file
              Set the Postgres ssl_crl_file to that file path.

       --no-ssl
              Don't enable network encryption. This is not recommended, prefer --ssl-self-signed.

       --server-key
              Set the Postgres ssl_key_file to that file path.

       --server-cert
              Set the Postgres ssl_cert_file to that file path.

   Environment
       PGDATA
          Postgres directory location. Can be used instead of the --pgdata option.

       PG_AUTOCTL_MONITOR
          Postgres URI to connect to the monitor node, can  be  used  instead  of  the  --monitor
          option.

       XDG_CONFIG_HOME
          The   pg_autoctl   command  stores  its  configuration  files  in  the  standard  place
          XDG_CONFIG_HOME. See the XDG Base Directory Specification.

       XDG_DATA_HOME
          The pg_autoctl  command  stores  its  internal  states  files  in  the  standard  place
          XDG_DATA_HOME,   which   defaults   to  ~/.local/share.  See  the  XDG  Base  Directory
          Specification.
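
    Examples
        A minimal sketch, not captured from a real run, assuming a node managed locally under
        --pgdata ./node1: switch to self-signed certificates and require SSL for replication
        connections.

           $ pg_autoctl enable ssl --ssl-self-signed --ssl-mode require --pgdata ./node1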

   pg_autoctl enable monitor
       pg_autoctl enable monitor - Enable a monitor for this node to be orchestrated from

   Synopsis
        It is possible to disable the pg_auto_failover monitor and enable it again online in a
        running pg_autoctl Postgres node. The main use case for this operation is when the
        monitor node has to be replaced, either after a full crash of the previous monitor node,
        or for migrating to a new monitor node (hardware replacement, region or zone migration,
        etc).

           usage: pg_autoctl enable monitor  [ --pgdata --allow-failover ] postgres://autoctl_node@new.monitor.add.ress/pg_auto_failover

          --pgdata      path to data directory

   Options
       --pgdata
              Location of the Postgres node being managed locally. Defaults  to  the  environment
              variable  PGDATA.  Use --monitor to connect to a monitor from anywhere, rather than
              the monitor URI used by a local Postgres node managed with pg_autoctl.

   Environment
       PGDATA
          Postgres directory location. Can be used instead of the --pgdata option.

       PG_AUTOCTL_MONITOR
          Postgres URI to connect to the monitor node, can  be  used  instead  of  the  --monitor
          option.

       XDG_CONFIG_HOME
          The   pg_autoctl   command  stores  its  configuration  files  in  the  standard  place
          XDG_CONFIG_HOME. See the XDG Base Directory Specification.

       XDG_DATA_HOME
          The pg_autoctl  command  stores  its  internal  states  files  in  the  standard  place
          XDG_DATA_HOME,   which   defaults   to  ~/.local/share.  See  the  XDG  Base  Directory
          Specification.

   Examples
          $ pg_autoctl show state
           Name |  Node |      Host:Port |       LSN |   Connection |       Current State |      Assigned State
          ------+-------+----------------+-----------+--------------+---------------------+--------------------
          node1 |     1 | localhost:5501 | 0/4000760 |   read-write |             primary |             primary
          node2 |     2 | localhost:5502 | 0/4000760 |    read-only |           secondary |           secondary

          $ pg_autoctl enable monitor --pgdata node3 'postgres://autoctl_node@localhost:5500/pg_auto_failover?sslmode=require'
          12:42:07 43834 INFO  Registered node 3 (localhost:5503) with name "node3" in formation "default", group 0, state "wait_standby"
          12:42:07 43834 INFO  Successfully registered to the monitor with nodeId 3
          12:42:08 43834 INFO  Still waiting for the monitor to drive us to state "catchingup"
          12:42:08 43834 WARN  Please make sure that the primary node is currently running `pg_autoctl run` and contacting the monitor.

          $ pg_autoctl show state
           Name |  Node |      Host:Port |       LSN |   Connection |       Current State |      Assigned State
          ------+-------+----------------+-----------+--------------+---------------------+--------------------
          node1 |     1 | localhost:5501 | 0/4000810 |   read-write |             primary |             primary
          node2 |     2 | localhost:5502 | 0/4000810 |    read-only |           secondary |           secondary
          node3 |     3 | localhost:5503 | 0/4000810 |    read-only |           secondary |           secondary

   pg_autoctl disable
       pg_autoctl disable - Disable a feature on a formation

   pg_autoctl disable secondary
       pg_autoctl disable secondary - Disable secondary nodes on a formation

   Synopsis
        This feature makes the most sense when using the Enterprise Edition of pg_auto_failover,
        which is fully compatible with Citus formations. When secondary is enabled, the Citus
        worker creation policy is to assign a primary node and then a standby node for each
        group. When secondary is disabled, the Citus worker creation policy is to assign only
        the primary nodes.

           usage: pg_autoctl disable secondary  [ --pgdata --formation ]

          --pgdata      path to data directory
          --formation   Formation to disable secondary on

   Options
       --pgdata
              Location of the Postgres node being managed locally. Defaults  to  the  environment
              variable  PGDATA.  Use --monitor to connect to a monitor from anywhere, rather than
              the monitor URI used by a local Postgres node managed with pg_autoctl.

       --formation
               Target formation in which to disable the secondary feature.

   Environment
       PGDATA
          Postgres directory location. Can be used instead of the --pgdata option.

       PG_AUTOCTL_MONITOR
          Postgres URI to connect to the monitor node, can  be  used  instead  of  the  --monitor
          option.

       XDG_CONFIG_HOME
          The   pg_autoctl   command  stores  its  configuration  files  in  the  standard  place
          XDG_CONFIG_HOME. See the XDG Base Directory Specification.

       XDG_DATA_HOME
          The pg_autoctl  command  stores  its  internal  states  files  in  the  standard  place
          XDG_DATA_HOME,   which   defaults   to  ~/.local/share.  See  the  XDG  Base  Directory
          Specification.

   pg_autoctl disable maintenance
       pg_autoctl disable maintenance - Disable Postgres maintenance mode on this node

   Synopsis
        A pg_auto_failover node can be put into a maintenance state. The Postgres node is then
        still registered to the monitor, and is known to be unreliable until maintenance is
        disabled. A node in the maintenance state is not a candidate for promotion.

        Typical uses of the maintenance state include Operating System or Postgres reboots, e.g.
        when applying security upgrades.

           usage: pg_autoctl disable maintenance  [ --pgdata --allow-failover ]

          --pgdata      path to data directory

   Options
       --pgdata
              Location  of  the  Postgres node being managed locally. Defaults to the environment
              variable PGDATA. Use --monitor to connect to a monitor from anywhere,  rather  than
              the monitor URI used by a local Postgres node managed with pg_autoctl.

   Environment
       PGDATA
          Postgres directory location. Can be used instead of the --pgdata option.

       PG_AUTOCTL_MONITOR
          Postgres  URI  to  connect  to  the  monitor node, can be used instead of the --monitor
          option.

       XDG_CONFIG_HOME
          The  pg_autoctl  command  stores  its  configuration  files  in  the   standard   place
          XDG_CONFIG_HOME. See the XDG Base Directory Specification.

       XDG_DATA_HOME
          The  pg_autoctl  command  stores  its  internal  states  files  in  the  standard place
          XDG_DATA_HOME,  which  defaults  to  ~/.local/share.  See  the   XDG   Base   Directory
          Specification.

   Examples
          $ pg_autoctl show state
           Name |  Node |      Host:Port |       LSN |   Connection |       Current State |      Assigned State
          ------+-------+----------------+-----------+--------------+---------------------+--------------------
          node1 |     1 | localhost:5501 | 0/4000810 |   read-write |             primary |             primary
          node2 |     2 | localhost:5502 | 0/4000810 |    read-only |           secondary |           secondary
          node3 |     3 | localhost:5503 | 0/4000810 |         none |         maintenance |         maintenance

          $ pg_autoctl disable maintenance --pgdata node3
          12:06:37 47542 INFO  Listening monitor notifications about state changes in formation "default" and group 0
          12:06:37 47542 INFO  Following table displays times when notifications are received
              Time |  Name |  Node |      Host:Port |       Current State |      Assigned State
          ---------+-------+-------+----------------+---------------------+--------------------
          12:06:37 | node3 |     3 | localhost:5503 |         maintenance |          catchingup
          12:06:37 | node3 |     3 | localhost:5503 |          catchingup |          catchingup
          12:06:37 | node3 |     3 | localhost:5503 |          catchingup |           secondary
          12:06:37 | node3 |     3 | localhost:5503 |           secondary |           secondary

          $ pg_autoctl show state
           Name |  Node |      Host:Port |       LSN |   Connection |       Current State |      Assigned State
          ------+-------+----------------+-----------+--------------+---------------------+--------------------
          node1 |     1 | localhost:5501 | 0/4000848 |   read-write |             primary |             primary
          node2 |     2 | localhost:5502 | 0/4000848 |    read-only |           secondary |           secondary
          node3 |     3 | localhost:5503 | 0/4000000 |    read-only |           secondary |           secondary

   pg_autoctl disable ssl
       pg_autoctl disable ssl - Disable SSL configuration on this node

   Synopsis
        It is possible to manage Postgres SSL settings with the pg_autoctl command, both at
        pg_autoctl create postgres time and again later at run-time, to update the SSL settings
        of a running node.

           usage: pg_autoctl disable ssl  [ --pgdata ] [ --json ]

          --pgdata      path to data directory
          --ssl-self-signed setup network encryption using self signed certificates (does NOT protect against MITM)
          --ssl-mode        use that sslmode in connection strings
          --ssl-ca-file     set the Postgres ssl_ca_file to that file path
          --ssl-crl-file    set the Postgres ssl_crl_file to that file path
           --no-ssl          don't enable network encryption (NOT recommended, prefer --ssl-self-signed)
          --server-key      set the Postgres ssl_key_file to that file path
          --server-cert     set the Postgres ssl_cert_file to that file path

   Options
       --pgdata
              Location  of  the  Postgres node being managed locally. Defaults to the environment
              variable PGDATA. Use --monitor to connect to a monitor from anywhere,  rather  than
              the monitor URI used by a local Postgres node managed with pg_autoctl.

       --ssl-self-signed
              Generate  SSL self-signed certificates to provide network encryption. This does not
              protect against man-in-the-middle kinds  of  attacks.  See  Security  settings  for
              pg_auto_failover for more about our SSL settings.

       --ssl-mode
              SSL  Mode  used  by  pg_autoctl  when  connecting  to  other  nodes, including when
              connecting for streaming replication.

       --ssl-ca-file
              Set the Postgres ssl_ca_file to that file path.

       --ssl-crl-file
              Set the Postgres ssl_crl_file to that file path.

       --no-ssl
               Don't enable network encryption. This is not recommended, prefer
               --ssl-self-signed.

       --server-key
              Set the Postgres ssl_key_file to that file path.

       --server-cert
              Set the Postgres ssl_cert_file to that file path.

   Environment
       PGDATA
          Postgres directory location. Can be used instead of the --pgdata option.

       PG_AUTOCTL_MONITOR
          Postgres  URI  to  connect  to  the  monitor node, can be used instead of the --monitor
          option.

       XDG_CONFIG_HOME
          The  pg_autoctl  command  stores  its  configuration  files  in  the   standard   place
          XDG_CONFIG_HOME. See the XDG Base Directory Specification.

       XDG_DATA_HOME
          The  pg_autoctl  command  stores  its  internal  states  files  in  the  standard place
          XDG_DATA_HOME,  which  defaults  to  ~/.local/share.  See  the   XDG   Base   Directory
          Specification.

   pg_autoctl disable monitor
       pg_autoctl disable monitor - Disable the monitor for this node

   Synopsis
        It is possible to disable the pg_auto_failover monitor and enable it again online in a
        running pg_autoctl Postgres node. The main use case for this operation is when the
        monitor node has to be replaced, either after a full crash of the previous monitor node,
        or for migrating to a new monitor node (hardware replacement, region or zone migration,
        etc).

           usage: pg_autoctl disable monitor  [ --pgdata --force ]

          --pgdata      path to data directory
          --force       force unregistering from the monitor

   Options
       --pgdata
              Location  of  the  Postgres node being managed locally. Defaults to the environment
              variable PGDATA. Use --monitor to connect to a monitor from anywhere,  rather  than
              the monitor URI used by a local Postgres node managed with pg_autoctl.

       --force
              The --force covers the two following situations:

                 1. By default, the command expects to be able to connect to the current monitor.
                    When the current known monitor in the  setup  is  not  running  anymore,  use
                    --force to skip this step.

                  2. When pg_autoctl can connect to the monitor and the node is found there,
                     this is normally an error that prevents disabling the monitor. Using
                     --force allows the command to drop the node from the monitor and continue
                     with disabling the monitor.

   Environment
       PGDATA
          Postgres directory location. Can be used instead of the --pgdata option.

       PG_AUTOCTL_MONITOR
          Postgres URI to connect to the monitor node, can  be  used  instead  of  the  --monitor
          option.

       XDG_CONFIG_HOME
          The   pg_autoctl   command  stores  its  configuration  files  in  the  standard  place
          XDG_CONFIG_HOME. See the XDG Base Directory Specification.

       XDG_DATA_HOME
          The pg_autoctl  command  stores  its  internal  states  files  in  the  standard  place
          XDG_DATA_HOME,   which   defaults   to  ~/.local/share.  See  the  XDG  Base  Directory
          Specification.

   Examples
          $ pg_autoctl show state
              Name |  Node |      Host:Port |       LSN |   Connection |       Current State |      Assigned State
          ------+-------+----------------+-----------+--------------+---------------------+--------------------
          node1 |     1 | localhost:5501 | 0/4000148 |   read-write |             primary |             primary
          node2 |     2 | localhost:5502 | 0/4000148 |    read-only |           secondary |           secondary
          node3 |     3 | localhost:5503 | 0/4000148 |    read-only |           secondary |           secondary

          $ pg_autoctl disable monitor --pgdata node3
          12:41:21 43039 INFO  Found node 3 "node3" (localhost:5503) on the monitor
          12:41:21 43039 FATAL Use --force to remove the node from the monitor

          $ pg_autoctl disable monitor --pgdata node3 --force
          12:41:32 43219 INFO  Removing node 3 "node3" (localhost:5503) from monitor

          $ pg_autoctl show state
           Name |  Node |      Host:Port |       LSN |   Connection |       Current State |      Assigned State
          ------+-------+----------------+-----------+--------------+---------------------+--------------------
          node1 |     1 | localhost:5501 | 0/4000760 |   read-write |             primary |             primary
          node2 |     2 | localhost:5502 | 0/4000760 |    read-only |           secondary |           secondary
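
        Once a new monitor is available, the node can be registered to it again with the
        pg_autoctl enable monitor command. A minimal sketch, where the new monitor URI is a
        placeholder:

           $ pg_autoctl enable monitor --pgdata node3 \
                 'postgres://autoctl_node@new-monitor-host:5432/pg_auto_failover?sslmode=require'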

   pg_autoctl get
       pg_autoctl get - Get a pg_auto_failover node, or formation setting

   pg_autoctl get formation settings
       pg_autoctl get formation settings - get replication settings  for  a  formation  from  the
       monitor

   Synopsis
        This command prints the replication settings of a pg_autoctl formation:

          usage: pg_autoctl get formation settings  [ --pgdata ] [ --json ] [ --formation ]

          --pgdata      path to data directory
          --json        output data in the JSON format
          --formation   pg_auto_failover formation

   Description
       See also pg_autoctl show settings which is a synonym.

   Options
       --pgdata
              Location  of  the  Postgres node being managed locally. Defaults to the environment
              variable PGDATA. Use --monitor to connect to a monitor from anywhere,  rather  than
              the monitor URI used by a local Postgres node managed with pg_autoctl.

       --json Output JSON formatted data.

       --formation
              Show replication settings for given formation. Defaults to default.

   Environment
       PGDATA
          Postgres directory location. Can be used instead of the --pgdata option.

       PG_AUTOCTL_MONITOR
          Postgres  URI  to  connect  to  the  monitor node, can be used instead of the --monitor
          option.

       XDG_CONFIG_HOME
          The  pg_autoctl  command  stores  its  configuration  files  in  the   standard   place
          XDG_CONFIG_HOME. See the XDG Base Directory Specification.

       XDG_DATA_HOME
          The  pg_autoctl  command  stores  its  internal  states  files  in  the  standard place
          XDG_DATA_HOME,  which  defaults  to  ~/.local/share.  See  the   XDG   Base   Directory
          Specification.

   Examples
          $ pg_autoctl get formation settings
            Context |    Name |                   Setting | Value
          ----------+---------+---------------------------+-------------------------------------------------------------
          formation | default |      number_sync_standbys | 1
            primary |   node1 | synchronous_standby_names | 'ANY 1 (pgautofailover_standby_2, pgautofailover_standby_3)'
               node |   node1 |        candidate priority | 50
               node |   node2 |        candidate priority | 50
               node |   node3 |        candidate priority | 50
               node |   node1 |        replication quorum | true
               node |   node2 |        replication quorum | true
               node |   node3 |        replication quorum | true

          $ pg_autoctl get formation settings --json
          {
              "nodes": [
                  {
                      "value": "true",
                      "context": "node",
                      "node_id": 1,
                      "setting": "replication quorum",
                      "group_id": 0,
                      "nodename": "node1"
                  },
                  {
                      "value": "true",
                      "context": "node",
                      "node_id": 2,
                      "setting": "replication quorum",
                      "group_id": 0,
                      "nodename": "node2"
                  },
                  {
                      "value": "true",
                      "context": "node",
                      "node_id": 3,
                      "setting": "replication quorum",
                      "group_id": 0,
                      "nodename": "node3"
                  },
                  {
                      "value": "50",
                      "context": "node",
                      "node_id": 1,
                      "setting": "candidate priority",
                      "group_id": 0,
                      "nodename": "node1"
                  },
                  {
                      "value": "50",
                      "context": "node",
                      "node_id": 2,
                      "setting": "candidate priority",
                      "group_id": 0,
                      "nodename": "node2"
                  },
                  {
                      "value": "50",
                      "context": "node",
                      "node_id": 3,
                      "setting": "candidate priority",
                      "group_id": 0,
                      "nodename": "node3"
                  }
              ],
              "primary": [
                  {
                      "value": "'ANY 1 (pgautofailover_standby_2, pgautofailover_standby_3)'",
                      "context": "primary",
                      "node_id": 1,
                      "setting": "synchronous_standby_names",
                      "group_id": 0,
                      "nodename": "node1"
                  }
              ],
              "formation": {
                  "value": "1",
                  "context": "formation",
                  "node_id": null,
                  "setting": "number_sync_standbys",
                  "group_id": null,
                  "nodename": "default"
              }
          }
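
        The JSON output is convenient for scripting. For instance, assuming the jq command is
        installed, the formation-level number_sync_standbys value can be extracted like this:

           $ pg_autoctl get formation settings --json | jq -r '.formation.value'
           1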

   pg_autoctl get formation number-sync-standbys
       pg_autoctl  get  formation number-sync-standbys - get number_sync_standbys for a formation
       from the monitor

   Synopsis
        This command prints the number_sync_standbys replication setting of a pg_autoctl formation:

          usage: pg_autoctl get formation number-sync-standbys  [ --pgdata ] [ --json ] [ --formation ]

          --pgdata      path to data directory
          --json        output data in the JSON format
          --formation   pg_auto_failover formation

   Description
       See also pg_autoctl show settings for the full list of replication settings.

   Options
       --pgdata
              Location of the Postgres node being managed locally. Defaults  to  the  environment
              variable  PGDATA.  Use --monitor to connect to a monitor from anywhere, rather than
              the monitor URI used by a local Postgres node managed with pg_autoctl.

       --json Output JSON formatted data.

       --formation
              Show replication settings for given formation. Defaults to default.

   Environment
       PGDATA
          Postgres directory location. Can be used instead of the --pgdata option.

       PG_AUTOCTL_MONITOR
          Postgres URI to connect to the monitor node, can  be  used  instead  of  the  --monitor
          option.

       XDG_CONFIG_HOME
          The   pg_autoctl   command  stores  its  configuration  files  in  the  standard  place
          XDG_CONFIG_HOME. See the XDG Base Directory Specification.

       XDG_DATA_HOME
          The pg_autoctl  command  stores  its  internal  states  files  in  the  standard  place
          XDG_DATA_HOME,   which   defaults   to  ~/.local/share.  See  the  XDG  Base  Directory
          Specification.

   Examples
          $ pg_autoctl get formation number-sync-standbys
          1

          $ pg_autoctl get formation number-sync-standbys --json
          {
              "number-sync-standbys": 1
          }

   pg_autoctl get node replication-quorum
        pg_autoctl get node replication-quorum - get the replication-quorum property of a node
        from the monitor

   Synopsis
        This command prints the replication quorum setting for a given pg_autoctl node:

          usage: pg_autoctl get node replication-quorum  [ --pgdata ] [ --json ] [ --formation ] [ --name ]

          --pgdata      path to data directory
          --formation   pg_auto_failover formation
          --name        pg_auto_failover node name
          --json        output data in the JSON format

   Description
       See also pg_autoctl show settings for the full list of replication settings.

   Options
       --pgdata
              Location of the Postgres node being managed locally. Defaults  to  the  environment
              variable  PGDATA.  Use --monitor to connect to a monitor from anywhere, rather than
              the monitor URI used by a local Postgres node managed with pg_autoctl.

       --json Output JSON formatted data.

       --formation
              Show replication settings for given formation. Defaults to default.

       --name Show replication settings for given node, selected by name.

   Environment
       PGDATA
          Postgres directory location. Can be used instead of the --pgdata option.

       PG_AUTOCTL_MONITOR
          Postgres URI to connect to the monitor node, can  be  used  instead  of  the  --monitor
          option.

       XDG_CONFIG_HOME
          The   pg_autoctl   command  stores  its  configuration  files  in  the  standard  place
          XDG_CONFIG_HOME. See the XDG Base Directory Specification.

       XDG_DATA_HOME
          The pg_autoctl  command  stores  its  internal  states  files  in  the  standard  place
          XDG_DATA_HOME,   which   defaults   to  ~/.local/share.  See  the  XDG  Base  Directory
          Specification.

   Examples
          $ pg_autoctl get node replication-quorum --name node1
          true

          $ pg_autoctl get node replication-quorum --name node1 --json
          {
              "name": "node1",
              "replication-quorum": true
          }

   pg_autoctl get node candidate-priority
        pg_autoctl get node candidate-priority - get the candidate-priority property of a node
        from the monitor

   Synopsis
        This command prints the candidate priority setting for a given pg_autoctl node:

          usage: pg_autoctl get node candidate-priority  [ --pgdata ] [ --json ] [ --formation ] [ --name ]

          --pgdata      path to data directory
          --formation   pg_auto_failover formation
          --name        pg_auto_failover node name
          --json        output data in the JSON format

   Description
       See also pg_autoctl show settings for the full list of replication settings.

   Options
       --pgdata
              Location of the Postgres node being managed locally. Defaults  to  the  environment
              variable  PGDATA.  Use --monitor to connect to a monitor from anywhere, rather than
              the monitor URI used by a local Postgres node managed with pg_autoctl.

       --json Output JSON formatted data.

       --formation
              Show replication settings for given formation. Defaults to default.

       --name Show replication settings for given node, selected by name.

   Examples
          $ pg_autoctl get node candidate-priority --name node1
          50

          $ pg_autoctl get node candidate-priority --name node1 --json
          {
              "name": "node1",
              "candidate-priority": 50
          }

   pg_autoctl set
       pg_autoctl set - Set a pg_auto_failover node, or formation setting

   pg_autoctl set formation number-sync-standbys
       pg_autoctl set formation number-sync-standbys - set number_sync_standbys for  a  formation
       from the monitor

   Synopsis
        This command sets the number_sync_standbys replication setting for a pg_autoctl formation:

          usage: pg_autoctl set formation number-sync-standbys  [ --pgdata ] [ --json ] [ --formation ] <number_sync_standbys>

          --pgdata      path to data directory
          --formation   pg_auto_failover formation
          --json        output data in the JSON format

   Description
       The  pg_auto_failover  monitor  ensures  that  at  least  N+1  candidate standby nodes are
       registered when number-sync-standbys is N. This means that to be able to run the following
       command, at least 3 standby nodes with a non-zero candidate priority must be registered to
       the monitor:

          $ pg_autoctl set formation number-sync-standbys 2

       See also pg_autoctl show settings for the full list of replication settings.

   Options
       --pgdata
              Location of the Postgres node being managed locally. Defaults  to  the  environment
              variable  PGDATA.  Use --monitor to connect to a monitor from anywhere, rather than
              the monitor URI used by a local Postgres node managed with pg_autoctl.

       --json Output JSON formatted data.

       --formation
              Show replication settings for given formation. Defaults to default.

   Environment
       PGDATA
          Postgres directory location. Can be used instead of the --pgdata option.

       PG_AUTOCTL_MONITOR
          Postgres URI to connect to the monitor node, can  be  used  instead  of  the  --monitor
          option.

       XDG_CONFIG_HOME
          The   pg_autoctl   command  stores  its  configuration  files  in  the  standard  place
          XDG_CONFIG_HOME. See the XDG Base Directory Specification.

       XDG_DATA_HOME
          The pg_autoctl  command  stores  its  internal  states  files  in  the  standard  place
          XDG_DATA_HOME,   which   defaults   to  ~/.local/share.  See  the  XDG  Base  Directory
          Specification.
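
        As a quick sketch, assuming enough standby nodes with a non-zero candidate priority are
        registered (at least three for a value of 2, per the rule above), setting the value and
        reading it back could look like this:

           $ pg_autoctl set formation number-sync-standbys 2
           $ pg_autoctl get formation number-sync-standbys
           2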

   pg_autoctl set node replication-quorum
        pg_autoctl set node replication-quorum - set the replication-quorum property of a node
        on the monitor

   Synopsis
        This command sets the replication quorum setting for a given pg_autoctl node:

          usage: pg_autoctl set node replication-quorum  [ --pgdata ] [ --json ] [ --formation ] [ --name ] <true|false>

          --pgdata      path to data directory
          --formation   pg_auto_failover formation
          --name        pg_auto_failover node name
          --json        output data in the JSON format

   Description
       See also pg_autoctl show settings for the full list of replication settings.

   Options
       --pgdata
              Location of the Postgres node being managed locally. Defaults  to  the  environment
              variable  PGDATA.  Use --monitor to connect to a monitor from anywhere, rather than
              the monitor URI used by a local Postgres node managed with pg_autoctl.

       --json Output JSON formatted data.

       --formation
              Show replication settings for given formation. Defaults to default.

       --name Show replication settings for given node, selected by name.

   Environment
       PGDATA
          Postgres directory location. Can be used instead of the --pgdata option.

       PG_AUTOCTL_MONITOR
          Postgres URI to connect to the monitor node, can  be  used  instead  of  the  --monitor
          option.

       XDG_CONFIG_HOME
          The   pg_autoctl   command  stores  its  configuration  files  in  the  standard  place
          XDG_CONFIG_HOME. See the XDG Base Directory Specification.

       XDG_DATA_HOME
          The pg_autoctl  command  stores  its  internal  states  files  in  the  standard  place
          XDG_DATA_HOME,   which   defaults   to  ~/.local/share.  See  the  XDG  Base  Directory
          Specification.

   Examples
          $ pg_autoctl set node replication-quorum --name node1 false
          12:49:37 94092 INFO  Waiting for the settings to have been applied to the monitor and primary node
          12:49:37 94092 INFO  New state is reported by node 1 "node1" (localhost:5501): "apply_settings"
          12:49:37 94092 INFO  Setting goal state of node 1 "node1" (localhost:5501) to primary after it applied replication properties change.
          12:49:37 94092 INFO  New state is reported by node 1 "node1" (localhost:5501): "primary"
          false

          $ pg_autoctl set node replication-quorum --name node1 true --json
          12:49:42 94199 INFO  Waiting for the settings to have been applied to the monitor and primary node
          12:49:42 94199 INFO  New state is reported by node 1 "node1" (localhost:5501): "apply_settings"
          12:49:42 94199 INFO  Setting goal state of node 1 "node1" (localhost:5501) to primary after it applied replication properties change.
          12:49:43 94199 INFO  New state is reported by node 1 "node1" (localhost:5501): "primary"
          {
              "replication-quorum": true
          }

   pg_autoctl set node candidate-priority
        pg_autoctl set node candidate-priority - set the candidate-priority property of a node
        on the monitor

   Synopsis
       This command sets the pg_autoctl candidate priority for a given node:

          usage: pg_autoctl set node candidate-priority  [ --pgdata ] [ --json ] [ --formation ] [ --name ] <priority: 0..100>

          --pgdata      path to data directory
          --formation   pg_auto_failover formation
          --name        pg_auto_failover node name
          --json        output data in the JSON format

   Description
       See also pg_autoctl show settings for the full list of replication settings.

   Options
       --pgdata
              Location of the Postgres node being managed locally. Defaults  to  the  environment
              variable  PGDATA.  Use --monitor to connect to a monitor from anywhere, rather than
              the monitor URI used by a local Postgres node managed with pg_autoctl.

       --json Output JSON formatted data.

       --formation
              Show replication settings for given formation. Defaults to default.

       --name Show replication settings for given node, selected by name.

   Environment
       PGDATA
          Postgres directory location. Can be used instead of the --pgdata option.

       PG_AUTOCTL_MONITOR
          Postgres URI to connect to the monitor node, can  be  used  instead  of  the  --monitor
          option.

       XDG_CONFIG_HOME
          The   pg_autoctl   command  stores  its  configuration  files  in  the  standard  place
          XDG_CONFIG_HOME. See the XDG Base Directory Specification.

       XDG_DATA_HOME
          The pg_autoctl  command  stores  its  internal  states  files  in  the  standard  place
          XDG_DATA_HOME,   which   defaults   to  ~/.local/share.  See  the  XDG  Base  Directory
          Specification.

   Examples
          $ pg_autoctl set node candidate-priority --name node1 65
          12:47:59 92326 INFO  Waiting for the settings to have been applied to the monitor and primary node
          12:47:59 92326 INFO  New state is reported by node 1 "node1" (localhost:5501): "apply_settings"
          12:47:59 92326 INFO  Setting goal state of node 1 "node1" (localhost:5501) to primary after it applied replication properties change.
          12:47:59 92326 INFO  New state is reported by node 1 "node1" (localhost:5501): "primary"
          65

          $ pg_autoctl set node candidate-priority --name node1 50 --json
          12:48:05 92450 INFO  Waiting for the settings to have been applied to the monitor and primary node
          12:48:05 92450 INFO  New state is reported by node 1 "node1" (localhost:5501): "apply_settings"
          12:48:05 92450 INFO  Setting goal state of node 1 "node1" (localhost:5501) to primary after it applied replication properties change.
          12:48:05 92450 INFO  New state is reported by node 1 "node1" (localhost:5501): "primary"
          {
              "candidate-priority": 50
          }

   pg_autoctl perform
       pg_autoctl perform - Perform an action orchestrated by the monitor

   pg_autoctl perform failover
       pg_autoctl perform failover - Perform a failover for given formation and group

   Synopsis
       This command starts a Postgres failover orchestration from the pg_auto_failover monitor:

          usage: pg_autoctl perform failover  [ --pgdata --formation --group ]

          --pgdata      path to data directory
          --formation   formation to target, defaults to 'default'
          --group       group to target, defaults to 0
          --wait        how many seconds to wait, default to 60

   Description
        The pg_auto_failover monitor can be used to orchestrate a manual failover, sometimes also
        known as a switchover. When doing so, split-brain situations are prevented thanks to the
        intermediary states used in the Finite State Machine.

        The pg_autoctl perform failover command waits until the failover is known to be complete
        on the monitor, or until the hard-coded 60s timeout has passed.

        The failover orchestration is done in the background by the monitor, so even if the
        pg_autoctl perform failover command stops on the timeout, the failover orchestration
        continues at the monitor.

   Options
       --pgdata
              Location  of  the  Postgres node being managed locally. Defaults to the environment
              variable PGDATA. Use --monitor to connect to a monitor from anywhere,  rather  than
              the monitor URI used by a local Postgres node managed with pg_autoctl.

       --formation
              Formation to target for the operation. Defaults to default.

       --group
              Postgres  group  to  target for the operation. Defaults to 0, only Citus formations
              may have more than one group.

       --wait How many seconds to wait for notifications about the promotion. The  command  stops
              when  the  promotion  is  finished  (a  node  is  primary), or when the timeout has
              elapsed, whichever comes first. The value 0 (zero) disables the timeout and  allows
              the command to wait forever.

   Environment
       PGDATA
          Postgres directory location. Can be used instead of the --pgdata option.

       PG_AUTOCTL_MONITOR
          Postgres  URI  to  connect  to  the  monitor node, can be used instead of the --monitor
          option.

       PGHOST, PGPORT, PGDATABASE, PGUSER, PGCONNECT_TIMEOUT, ...
          See the Postgres docs about Environment Variables for details.

       TMPDIR
          The pgcopydb command creates all its work files and directories in  ${TMPDIR}/pgcopydb,
          and defaults to /tmp/pgcopydb.

       XDG_CONFIG_HOME
          The   pg_autoctl   command  stores  its  configuration  files  in  the  standard  place
          XDG_CONFIG_HOME. See the XDG Base Directory Specification.

       XDG_DATA_HOME
          The pg_autoctl  command  stores  its  internal  states  files  in  the  standard  place
          XDG_DATA_HOME,   which   defaults   to  ~/.local/share.  See  the  XDG  Base  Directory
          Specification.

   Examples
          $ pg_autoctl perform failover
          12:57:30 3635 INFO  Listening monitor notifications about state changes in formation "default" and group 0
          12:57:30 3635 INFO  Following table displays times when notifications are received
              Time |  Name |  Node |      Host:Port |       Current State |      Assigned State
          ---------+-------+-------+----------------+---------------------+--------------------
          12:57:30 | node1 |     1 | localhost:5501 |             primary |            draining
          12:57:30 | node1 |     1 | localhost:5501 |            draining |            draining
          12:57:30 | node2 |     2 | localhost:5502 |           secondary |          report_lsn
          12:57:30 | node3 |     3 | localhost:5503 |           secondary |          report_lsn
          12:57:36 | node3 |     3 | localhost:5503 |          report_lsn |          report_lsn
          12:57:36 | node2 |     2 | localhost:5502 |          report_lsn |          report_lsn
          12:57:36 | node2 |     2 | localhost:5502 |          report_lsn |   prepare_promotion
          12:57:36 | node2 |     2 | localhost:5502 |   prepare_promotion |   prepare_promotion
          12:57:36 | node2 |     2 | localhost:5502 |   prepare_promotion |    stop_replication
          12:57:36 | node1 |     1 | localhost:5501 |            draining |      demote_timeout
          12:57:36 | node3 |     3 | localhost:5503 |          report_lsn |      join_secondary
          12:57:36 | node1 |     1 | localhost:5501 |      demote_timeout |      demote_timeout
          12:57:36 | node3 |     3 | localhost:5503 |      join_secondary |      join_secondary
          12:57:37 | node2 |     2 | localhost:5502 |    stop_replication |    stop_replication
          12:57:37 | node2 |     2 | localhost:5502 |    stop_replication |        wait_primary
          12:57:37 | node1 |     1 | localhost:5501 |      demote_timeout |             demoted
          12:57:37 | node1 |     1 | localhost:5501 |             demoted |             demoted
          12:57:37 | node2 |     2 | localhost:5502 |        wait_primary |        wait_primary
          12:57:37 | node3 |     3 | localhost:5503 |      join_secondary |           secondary
          12:57:37 | node1 |     1 | localhost:5501 |             demoted |          catchingup
          12:57:38 | node3 |     3 | localhost:5503 |           secondary |           secondary
          12:57:38 | node2 |     2 | localhost:5502 |        wait_primary |             primary
          12:57:38 | node1 |     1 | localhost:5501 |          catchingup |          catchingup
          12:57:38 | node2 |     2 | localhost:5502 |             primary |             primary

          $ pg_autoctl show state
           Name |  Node |      Host:Port |       LSN |   Connection |       Current State |      Assigned State
          ------+-------+----------------+-----------+--------------+---------------------+--------------------
          node1 |     1 | localhost:5501 | 0/4000F50 |    read-only |           secondary |           secondary
          node2 |     2 | localhost:5502 | 0/4000F50 |   read-write |             primary |             primary
          node3 |     3 | localhost:5503 | 0/4000F50 |    read-only |           secondary |           secondary

   pg_autoctl perform switchover
       pg_autoctl perform switchover - Perform a switchover for given formation and group

   Synopsis
        This command starts a Postgres switchover orchestration from the pg_auto_failover
        monitor:

          usage: pg_autoctl perform switchover  [ --pgdata --formation --group ]

          --pgdata      path to data directory
          --formation   formation to target, defaults to 'default'
          --group       group to target, defaults to 0

   Description
        The pg_auto_failover monitor can be used to orchestrate a manual switchover, sometimes
        also known as a failover. When doing so, split-brain situations are prevented thanks to
        the intermediary states used in the Finite State Machine.

        The pg_autoctl perform switchover command waits until the switchover is known to be
        complete on the monitor, or until the hard-coded 60s timeout has passed.

        The switchover orchestration is done in the background by the monitor, so even if the
        pg_autoctl perform switchover command stops on the timeout, the switchover orchestration
        continues at the monitor.

       See also pg_autoctl perform failover, a synonym for this command.

   Options
       --pgdata
              Location of the Postgres node being managed locally. Defaults  to  the  environment
              variable  PGDATA.  Use --monitor to connect to a monitor from anywhere, rather than
              the monitor URI used by a local Postgres node managed with pg_autoctl.

       --formation
              Formation to target for the operation. Defaults to default.

       --group
              Postgres group to target for the operation. Defaults to 0,  only  Citus  formations
              may have more than one group.

   Environment
       PGDATA
          Postgres directory location. Can be used instead of the --pgdata option.

       PG_AUTOCTL_MONITOR
          Postgres  URI  to  connect  to  the  monitor node, can be used instead of the --monitor
          option.

       PG_CONFIG
          Can be set to the absolute path to the pg_config Postgres tool. This is mostly used  in
          the  context of building extensions, though it can be a useful way to select a Postgres
          version when several are installed on the same system.

       PATH
          Used the usual way mostly. Some entries that are searched in the PATH by the pg_autoctl
          command  are  expected  to  be  found  only once, to avoid mistakes with Postgres major
          versions.

       PGHOST, PGPORT, PGDATABASE, PGUSER, PGCONNECT_TIMEOUT, ...
          See the Postgres docs about Environment Variables for details.

       TMPDIR
          The pgcopydb command creates all its work files and directories in  ${TMPDIR}/pgcopydb,
          and defaults to /tmp/pgcopydb.

       XDG_CONFIG_HOME
          The   pg_autoctl   command  stores  its  configuration  files  in  the  standard  place
          XDG_CONFIG_HOME. See the XDG Base Directory Specification.

       XDG_DATA_HOME
          The pg_autoctl  command  stores  its  internal  states  files  in  the  standard  place
          XDG_DATA_HOME,   which   defaults   to  ~/.local/share.  See  the  XDG  Base  Directory
          Specification.
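
        A minimal invocation targets the default formation and group; the orchestration output is
        then the same as in the pg_autoctl perform failover example above:

           $ pg_autoctl perform switchover --formation default --group 0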

   pg_autoctl perform promotion
       pg_autoctl perform promotion - Perform a failover that promotes a target node

   Synopsis
        This command starts a Postgres failover orchestration from the pg_auto_failover monitor
        and targets the given node:

          usage: pg_autoctl perform promotion  [ --pgdata --formation --group ]

          --pgdata      path to data directory
          --formation   formation to target, defaults to 'default'
          --name        node name to target, defaults to current node
          --wait        how many seconds to wait, default to 60

   Description
        The pg_auto_failover monitor can be used to orchestrate a manual promotion, sometimes
        also known as a switchover. When doing so, split-brain situations are prevented thanks to
        the intermediary states used in the Finite State Machine.

        The pg_autoctl perform promotion command waits until the promotion is known to be
        complete on the monitor, or until the hard-coded 60s timeout has passed.

        The promotion orchestration is done in the background by the monitor, so even if the
        pg_autoctl perform promotion command stops on the timeout, the promotion orchestration
        continues at the monitor.

   Options
       --pgdata
              Location of the Postgres node being managed locally. Defaults  to  the  environment
              variable  PGDATA.  Use --monitor to connect to a monitor from anywhere, rather than
              the monitor URI used by a local Postgres node managed with pg_autoctl.

       --formation
              Formation to target for the operation. Defaults to default.

       --name Name of the node that should be elected as the new primary node.

       --wait How many seconds to wait for notifications about the promotion. The  command  stops
              when  the  promotion  is  finished  (a  node  is  primary), or when the timeout has
              elapsed, whichever comes first. The value 0 (zero) disables the timeout and  allows
              the command to wait forever.

   Environment
       PGDATA
          Postgres directory location. Can be used instead of the --pgdata option.

       PG_AUTOCTL_MONITOR
          Postgres  URI  to  connect  to  the  monitor node, can be used instead of the --monitor
          option.

       PG_CONFIG
          Can be set to the absolute path to the pg_config Postgres tool. This is mostly used  in
          the  context of building extensions, though it can be a useful way to select a Postgres
          version when several are installed on the same system.

       PATH
          Used the usual way mostly. Some entries that are searched in the PATH by the pg_autoctl
          command  are  expected  to  be  found  only once, to avoid mistakes with Postgres major
          versions.

       PGHOST, PGPORT, PGDATABASE, PGUSER, PGCONNECT_TIMEOUT, ...
          See the Postgres docs about Environment Variables for details.

       TMPDIR
          The pgcopydb command creates all its work files and directories in  ${TMPDIR}/pgcopydb,
          and defaults to /tmp/pgcopydb.

       XDG_CONFIG_HOME
          The   pg_autoctl   command  stores  its  configuration  files  in  the  standard  place
          XDG_CONFIG_HOME. See the XDG Base Directory Specification.

       XDG_DATA_HOME
          The pg_autoctl  command  stores  its  internal  states  files  in  the  standard  place
          XDG_DATA_HOME,   which   defaults   to  ~/.local/share.  See  the  XDG  Base  Directory
          Specification.

   Examples
          $ pg_autoctl show state
           Name |  Node |      Host:Port |       LSN |   Connection |       Current State |      Assigned State
          ------+-------+----------------+-----------+--------------+---------------------+--------------------
          node1 |     1 | localhost:5501 | 0/4000F88 |    read-only |           secondary |           secondary
          node2 |     2 | localhost:5502 | 0/4000F88 |   read-write |             primary |             primary
          node3 |     3 | localhost:5503 | 0/4000F88 |    read-only |           secondary |           secondary

          $ pg_autoctl perform promotion --name node1
          13:08:13 15297 INFO  Listening monitor notifications about state changes in formation "default" and group 0
          13:08:13 15297 INFO  Following table displays times when notifications are received
              Time |  Name |  Node |      Host:Port |       Current State |      Assigned State
          ---------+-------+-------+----------------+---------------------+--------------------
          13:08:13 | node1 |   0/1 | localhost:5501 |           secondary |           secondary
          13:08:13 | node2 |   0/2 | localhost:5502 |             primary |            draining
          13:08:13 | node2 |   0/2 | localhost:5502 |            draining |            draining
          13:08:13 | node1 |   0/1 | localhost:5501 |           secondary |          report_lsn
          13:08:13 | node3 |   0/3 | localhost:5503 |           secondary |          report_lsn
          13:08:19 | node3 |   0/3 | localhost:5503 |          report_lsn |          report_lsn
          13:08:19 | node1 |   0/1 | localhost:5501 |          report_lsn |          report_lsn
          13:08:19 | node1 |   0/1 | localhost:5501 |          report_lsn |   prepare_promotion
          13:08:19 | node1 |   0/1 | localhost:5501 |   prepare_promotion |   prepare_promotion
          13:08:19 | node1 |   0/1 | localhost:5501 |   prepare_promotion |    stop_replication
          13:08:19 | node2 |   0/2 | localhost:5502 |            draining |      demote_timeout
          13:08:19 | node3 |   0/3 | localhost:5503 |          report_lsn |      join_secondary
          13:08:19 | node2 |   0/2 | localhost:5502 |      demote_timeout |      demote_timeout
          13:08:19 | node3 |   0/3 | localhost:5503 |      join_secondary |      join_secondary
          13:08:20 | node1 |   0/1 | localhost:5501 |    stop_replication |    stop_replication
          13:08:20 | node1 |   0/1 | localhost:5501 |    stop_replication |        wait_primary
          13:08:20 | node2 |   0/2 | localhost:5502 |      demote_timeout |             demoted
          13:08:20 | node1 |   0/1 | localhost:5501 |        wait_primary |        wait_primary
          13:08:20 | node3 |   0/3 | localhost:5503 |      join_secondary |           secondary
          13:08:20 | node2 |   0/2 | localhost:5502 |             demoted |             demoted
          13:08:20 | node2 |   0/2 | localhost:5502 |             demoted |          catchingup
          13:08:21 | node3 |   0/3 | localhost:5503 |           secondary |           secondary
          13:08:21 | node1 |   0/1 | localhost:5501 |        wait_primary |             primary
          13:08:21 | node2 |   0/2 | localhost:5502 |          catchingup |          catchingup
          13:08:21 | node1 |   0/1 | localhost:5501 |             primary |             primary

          $ pg_autoctl show state
           Name |  Node |      Host:Port |       LSN |   Connection |       Current State |      Assigned State
          ------+-------+----------------+-----------+--------------+---------------------+--------------------
          node1 |     1 | localhost:5501 | 0/40012F0 |   read-write |             primary |             primary
          node2 |     2 | localhost:5502 | 0/40012F0 |    read-only |           secondary |           secondary
          node3 |     3 | localhost:5503 | 0/40012F0 |    read-only |           secondary |           secondary

   pg_autoctl do
       pg_autoctl do - Internal commands and internal QA tooling

       The debug commands for  pg_autoctl  are  only  available  when  the  environment  variable
       PG_AUTOCTL_DEBUG is set (to any value).

       When testing pg_auto_failover, it is helpful to be able to play with the local nodes using
       the same lower-level API as used by the pg_auto_failover Finite State Machine transitions.
       Some  commands  could be useful in contexts other than pg_auto_failover development and QA
       work, so some documentation has been made available.
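
        As a sketch, the variable can be exported once in the current shell rather than prefixing
        every command; the discover example below is the same one documented later in this page:

           $ export PG_AUTOCTL_DEBUG=1
           $ pg_autoctl do pgsetup discover --pgdata node1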

   pg_autoctl do tmux
       pg_autoctl do tmux - Set of facilities to handle tmux interactive sessions

   Synopsis
       pg_autoctl do tmux provides the following commands:

          pg_autoctl do tmux
          + compose  Set of facilities to handle docker-compose sessions
            script   Produce a tmux script for a demo or a test case (debug only)
            session  Run a tmux session for a demo or a test case
            stop     Stop pg_autoctl processes that belong to a tmux session
            wait     Wait until a given node has been registered on the monitor
            clean    Clean-up a tmux session processes and root dir

          pg_autoctl do tmux compose
            config   Produce a docker-compose configuration file for a demo
            script   Produce a tmux script for a demo or a test case (debug only)
            session  Run a tmux session for a demo or a test case

   Description
        An easy way to get started with pg_auto_failover in a localhost-only formation is to run
        the following command:

          $ PG_AUTOCTL_DEBUG=1 pg_autoctl do tmux session \
               --root /tmp/pgaf    \
               --first-pgport 9000 \
               --nodes 4           \
               --layout tiled

        This requires the tmux command to be available in your PATH. The pg_autoctl do tmux
        session command prepares a self-contained root directory in which to create the
        pg_auto_failover nodes and their configuration, then prepares a tmux script, and then
        runs the script with a command such as:

          /usr/local/bin/tmux -v start-server ; source-file /tmp/pgaf/script-9000.tmux

        The tmux session contains a single tmux window with multiple panes:

          • one pane for the monitor

           • one pane per Postgres node, here 4 of them

          • one pane for running pg_autoctl watch

          • one extra pane for an interactive shell.

        Usually the first two commands to run in the interactive shell, once the formation is
        stable (one node is primary, the other nodes are all secondary), are the following:

          $ pg_autoctl get formation settings
          $ pg_autoctl perform failover

   Using docker-compose to run a distributed system
        The same idea can also be implemented with docker-compose to run the nodes, while still
        using tmux, this time with three control panes:

          • one pane for the docker-compose cumulative logs of all the nodes

          • one pane for running pg_autoctl watch

          • one extra pane for an interactive shell.

       For this setup, you can use the following command:

          PG_AUTOCTL_DEBUG=1 pg_autoctl do tmux compose session \
                 --root ./tmux/citus           \
                 --first-pgport 5600           \
                 --nodes 3                     \
                 --async-nodes 0               \
                 --node-priorities 50,50,0     \
                 --sync-standbys -1            \
                 --citus-workers 4             \
                 --citus-secondaries 0         \
                 --citus                       \
                 --layout even-vertical

       The pg_autoctl do tmux compose session command also takes care of creating external docker
       volumes and referencing them for each node in the docker-compose file.
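
        To only produce the docker-compose file without starting a session, the config
        sub-command can be used. A sketch, assuming it accepts the same options as the session
        sub-command:

           $ PG_AUTOCTL_DEBUG=1 pg_autoctl do tmux compose config \
                  --root ./tmux/demo    \
                  --first-pgport 5600   \
                  --nodes 3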

   pg_autoctl do tmux session
       This command runs a tmux session for a demo or a test case.

          usage: pg_autoctl do tmux session [option ...]

            --root              path where to create a cluster
            --first-pgport      first Postgres port to use (5500)
            --nodes             number of Postgres nodes to create (2)
            --async-nodes       number of async nodes within nodes (0)
            --node-priorities   list of nodes priorities (50)
            --sync-standbys     number-sync-standbys to set (0 or 1)
            --skip-pg-hba       use --skip-pg-hba when creating nodes
            --citus             start a Citus formation
            --citus-workers     number of Citus workers to create (2)
            --citus-secondaries number of Citus secondaries to create (0)
            --layout            tmux layout to use (even-vertical)
            --binpath           path to the pg_autoctl binary (current binary path)

   pg_autoctl do tmux compose session
       This  command runs a tmux session for a demo or a test case. It generates a docker-compose
       file and then uses docker-compose to drive many nodes.

          usage: pg_autoctl do tmux compose session [option ...]

            --root            path where to create a cluster
            --first-pgport    first Postgres port to use (5500)
            --nodes           number of Postgres nodes to create (2)
            --async-nodes     number of async nodes within nodes (0)
            --node-priorities list of nodes priorities (50)
            --sync-standbys   number-sync-standbys to set (0 or 1)
            --skip-pg-hba     use --skip-pg-hba when creating nodes
            --layout          tmux layout to use (even-vertical)
            --binpath         path to the pg_autoctl binary (current binary path)

   pg_autoctl do demo
       pg_autoctl do demo - Use a demo application for pg_auto_failover

   Synopsis
       pg_autoctl do demo provides the following commands:

          pg_autoctl do demo
           run      Run the pg_auto_failover demo application
           uri      Grab the application connection string from the monitor
           ping     Attempt to connect to the application URI
           summary  Display a summary of the previous demo app run

       To run a demo, use pg_autoctl do demo run:

          usage: pg_autoctl do demo run [option ...]

          --monitor        Postgres URI of the pg_auto_failover monitor
          --formation      Formation to use (default)
          --group          Group Id to failover (0)
          --username       PostgreSQL's username
          --clients        How many client processes to use (1)
          --duration       Duration of the demo app, in seconds (30)
          --first-failover Timing of the first failover (10)
          --failover-freq  Seconds between subsequent failovers (45)

   Description
       The pg_autoctl debug tooling includes a demo application.

        The demo prepares its Postgres schema on the target database, and then starts several
        clients (see --clients) that concurrently connect to the target application URI and
        record the time it takes to establish the Postgres connection to the current read-write
        node, along with information about the retry policy metrics.

   Example
          $ pg_autoctl do demo run --monitor 'postgres://autoctl_node@localhost:5500/pg_auto_failover?sslmode=prefer' --clients 10
          14:43:35 19660 INFO  Using application connection string "postgres://localhost:5502,localhost:5503,localhost:5501/demo?target_session_attrs=read-write&sslmode=prefer"
          14:43:35 19660 INFO  Using Postgres user PGUSER "dim"
          14:43:35 19660 INFO  Preparing demo schema: drop schema if exists demo cascade
          14:43:35 19660 WARN  NOTICE:  schema "demo" does not exist, skipping
          14:43:35 19660 INFO  Preparing demo schema: create schema demo
          14:43:35 19660 INFO  Preparing demo schema: create table demo.tracking(ts timestamptz default now(), client integer, loop integer, retries integer, us bigint, recovery bool)
          14:43:36 19660 INFO  Preparing demo schema: create table demo.client(client integer, pid integer, retry_sleep_ms integer, retry_cap_ms integer, failover_count integer)
          14:43:36 19660 INFO  Starting 10 concurrent clients as sub-processes
          14:43:36 19675 INFO  Failover client is started, will failover in 10s and every 45s after that
          ...

          $ pg_autoctl do demo summary --monitor 'postgres://autoctl_node@localhost:5500/pg_auto_failover?sslmode=prefer' --clients 10
          14:44:27 22789 INFO  Using application connection string "postgres://localhost:5503,localhost:5501,localhost:5502/demo?target_session_attrs=read-write&sslmode=prefer"
          14:44:27 22789 INFO  Using Postgres user PGUSER "dim"
          14:44:27 22789 INFO  Summary for the demo app running with 10 clients for 30s
                  Client        | Connections | Retries | Min Connect Time (ms) |   max    |   p95   |   p99
          ----------------------+-------------+---------+-----------------------+----------+---------+---------
           Client 1             |         136 |      14 |                58.318 | 2601.165 | 244.443 | 261.809
           Client 2             |         136 |       5 |                55.199 | 2514.968 | 242.362 | 259.282
           Client 3             |         134 |       6 |                55.815 | 2974.247 | 241.740 | 262.908
           Client 4             |         135 |       7 |                56.542 | 2970.922 | 238.995 | 251.177
           Client 5             |         136 |       8 |                58.339 | 2758.106 | 238.720 | 252.439
           Client 6             |         134 |       9 |                58.679 | 2813.653 | 244.696 | 254.674
           Client 7             |         134 |      11 |                58.737 | 2795.974 | 243.202 | 253.745
           Client 8             |         136 |      12 |                52.109 | 2354.952 | 242.664 | 254.233
           Client 9             |         137 |      19 |                59.735 | 2628.496 | 235.668 | 253.582
           Client 10            |         133 |       6 |                57.994 | 3060.489 | 242.156 | 256.085
           All Clients Combined |        1351 |      97 |                52.109 | 3060.489 | 241.848 | 258.450
          (11 rows)

           Min Connect Time (ms) |   max    | freq |                      bar
          -----------------------+----------+------+-----------------------------------------------
                          52.109 |  219.105 | 1093 | ▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒
                         219.515 |  267.168 |  248 | ▒▒▒▒▒▒▒▒▒▒
                        2354.952 | 2354.952 |    1 |
                        2514.968 | 2514.968 |    1 |
                        2601.165 | 2628.496 |    2 |
                        2758.106 | 2813.653 |    3 |
                        2970.922 | 2974.247 |    2 |
                        3060.489 | 3060.489 |    1 |
          (8 rows)

   pg_autoctl do service restart
       pg_autoctl do service restart - Run pg_autoctl sub-processes (services)

   Synopsis
       pg_autoctl do service restart provides the following commands:

          pg_autoctl do service restart
           postgres     Restart the pg_autoctl postgres controller service
           listener     Restart the pg_autoctl monitor listener service
           node-active  Restart the pg_autoctl keeper node-active service

   Description
       It  is  possible  to  restart the pg_autoctl or the Postgres service without affecting the
       other running service. Typically,  to  restart  the  pg_autoctl  parts  without  impacting
       Postgres:

          $ pg_autoctl do service restart node-active --pgdata node1
          14:52:06 31223 INFO  Sending the TERM signal to service "node-active" with pid 26626
          14:52:06 31223 INFO  Service "node-active" has been restarted with pid 31230
          31230

       The Postgres service has not been impacted by the restart of the pg_autoctl process.
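
        The same approach applies to the other services. For instance, on the monitor node, the
        monitor listener service can be restarted in the same way (the --pgdata value below is a
        placeholder for the monitor's data directory):

           $ pg_autoctl do service restart listener --pgdata monitor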

   pg_autoctl do show
       pg_autoctl do show - Show some debug level information

   Synopsis
       The  commands pg_autoctl create monitor and pg_autoctl create postgres both implement some
       level of automated detection of the node network settings when the  option  --hostname  is
       not used.

        In addition to those commands, when a new node is registered to the monitor, the other
        nodes also edit their Postgres HBA rules to allow the new node to connect, unless the
        option --skip-pg-hba has been used.

        The debug sub-commands for pg_autoctl do show can be used to see in detail the network
        discovery done by pg_autoctl.

       pg_autoctl do show provides the following commands:

          pg_autoctl do show
            ipaddr    Print this node's IP address information
            cidr      Print this node's CIDR information
            lookup    Print this node's DNS lookup information
            hostname  Print this node's default hostname
            reverse   Lookup given hostname and check reverse DNS setup

   pg_autoctl do show ipaddr
       Connects to an external IP address and uses getsockname(2) to retrieve the current address
       to which the socket is bound.

        The external IP address defaults to 8.8.8.8, the IP address of a Google-provided public
        DNS server, or to the monitor IP address or hostname in the context of pg_autoctl create
        postgres.

          $ pg_autoctl do show ipaddr
          16:42:40 62631 INFO  ipaddr.c:107: Connecting to 8.8.8.8 (port 53)
          192.168.1.156

   pg_autoctl do show cidr
       Connects  to  an external IP address in the same way as the previous command pg_autoctl do
       show ipaddr and then matches the  local  socket  name  with  the  list  of  local  network
       interfaces.  When  a match is found, uses the netmask of the interface to compute the CIDR
       notation from the IP address.

       The computed CIDR notation is then used in HBA rules.

          $ pg_autoctl do show cidr
          16:43:19 63319 INFO  Connecting to 8.8.8.8 (port 53)
          192.168.1.0/24
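
        For instance, an HBA rule generated from that CIDR might look roughly like the following;
        the database, role, and authentication method below are hypothetical and depend on the
        actual node setup:

           # hypothetical pg_hba.conf entry using the computed CIDR notation
           hostssl  replication  "pgautofailover_replicator"  192.168.1.0/24  trust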

   pg_autoctl do show hostname
        Uses either its first (and only) argument or the result of gethostname(2) as the candidate
        hostname to use in HBA rules, and then checks that the hostname resolves to an IP address
        that belongs to one of the machine's network interfaces.

        When the hostname forward-DNS lookup resolves to an IP address that is local to the node
        where the command is run, then a reverse lookup from the IP address is made to see if it
        matches the candidate hostname.

          $ pg_autoctl do show hostname
          DESKTOP-IC01GOOS.europe.corp.microsoft.com

          $ pg_autoctl -vv do show hostname 'postgres://autoctl_node@localhost:5500/pg_auto_failover'
          13:45:00 93122 INFO  cli_do_show.c:256: Using monitor hostname "localhost" and port 5500
          13:45:00 93122 INFO  ipaddr.c:107: Connecting to ::1 (port 5500)
          13:45:00 93122 DEBUG cli_do_show.c:272: cli_show_hostname: ip ::1
          13:45:00 93122 DEBUG cli_do_show.c:283: cli_show_hostname: host localhost
          13:45:00 93122 DEBUG cli_do_show.c:294: cli_show_hostname: ip ::1
          localhost

   pg_autoctl do show lookup
        Checks that the given argument is a hostname that resolves to a local IP address, that
        is, an IP address associated with a local network interface.

          $ pg_autoctl do show lookup DESKTOP-IC01GOOS.europe.corp.microsoft.com
          DESKTOP-IC01GOOS.europe.corp.microsoft.com: 192.168.1.156

   pg_autoctl do show reverse
        Implements the same DNS checks as the Postgres HBA matching code: it first does a forward
        DNS lookup of the given hostname, and then a reverse lookup from all the IP addresses
        obtained. Success is reached when at least one of the IP addresses from the forward lookup
        resolves back to the given hostname (as the first answer to the reverse DNS lookup).

          $ pg_autoctl do show reverse DESKTOP-IC01GOOS.europe.corp.microsoft.com
          16:44:49 64910 FATAL Failed to find an IP address for hostname "DESKTOP-IC01GOOS.europe.corp.microsoft.com" that matches hostname again in a reverse-DNS lookup.
          16:44:49 64910 INFO  Continuing with IP address "192.168.1.156"

          $ pg_autoctl -vv do show reverse DESKTOP-IC01GOOS.europe.corp.microsoft.com
          16:44:45 64832 DEBUG ipaddr.c:719: DESKTOP-IC01GOOS.europe.corp.microsoft.com has address 192.168.1.156
          16:44:45 64832 DEBUG ipaddr.c:733: reverse lookup for "192.168.1.156" gives "desktop-ic01goos.europe.corp.microsoft.com" first
          16:44:45 64832 DEBUG ipaddr.c:719: DESKTOP-IC01GOOS.europe.corp.microsoft.com has address 192.168.1.156
          16:44:45 64832 DEBUG ipaddr.c:733: reverse lookup for "192.168.1.156" gives "desktop-ic01goos.europe.corp.microsoft.com" first
          16:44:45 64832 DEBUG ipaddr.c:719: DESKTOP-IC01GOOS.europe.corp.microsoft.com has address 2a01:110:10:40c::2ad
          16:44:45 64832 DEBUG ipaddr.c:728: Failed to resolve hostname from address "192.168.1.156": nodename nor servname provided, or not known
          16:44:45 64832 DEBUG ipaddr.c:719: DESKTOP-IC01GOOS.europe.corp.microsoft.com has address 2a01:110:10:40c::2ad
          16:44:45 64832 DEBUG ipaddr.c:728: Failed to resolve hostname from address "192.168.1.156": nodename nor servname provided, or not known
          16:44:45 64832 DEBUG ipaddr.c:719: DESKTOP-IC01GOOS.europe.corp.microsoft.com has address 100.64.34.213
          16:44:45 64832 DEBUG ipaddr.c:728: Failed to resolve hostname from address "192.168.1.156": nodename nor servname provided, or not known
          16:44:45 64832 DEBUG ipaddr.c:719: DESKTOP-IC01GOOS.europe.corp.microsoft.com has address 100.64.34.213
          16:44:45 64832 DEBUG ipaddr.c:728: Failed to resolve hostname from address "192.168.1.156": nodename nor servname provided, or not known
          16:44:45 64832 FATAL cli_do_show.c:333: Failed to find an IP address for hostname "DESKTOP-IC01GOOS.europe.corp.microsoft.com" that matches hostname again in a reverse-DNS lookup.
          16:44:45 64832 INFO  cli_do_show.c:334: Continuing with IP address "192.168.1.156"

   pg_autoctl do pgsetup
       pg_autoctl do pgsetup - Manage a local Postgres setup

   Synopsis
       The pg_autoctl do pgsetup commands implement low-level management tooling for a local
       Postgres instance. Some of these low-level Postgres commands can be useful on their own in
       some cases.

       pg_autoctl do pgsetup provides the following commands:

          pg_autoctl do pgsetup
            pg_ctl    Find a non-ambiguous pg_ctl program and Postgres version
            discover  Discover local PostgreSQL instance, if any
             ready     Return true if the local Postgres server is ready
            wait      Wait until the local Postgres server is ready
            logs      Outputs the Postgres startup logs
            tune      Compute and log some Postgres tuning options

   pg_autoctl do pgsetup pg_ctl
       In a similar way to the which -a command, this command scans your PATH for pg_ctl entries.
       Then it runs the pg_ctl --version command and parses the output to determine the version
       of Postgres that is available in the path.

          $ pg_autoctl do pgsetup pg_ctl --pgdata node1
          16:49:18 69684 INFO  Environment variable PG_CONFIG is set to "/Applications/Postgres.app//Contents/Versions/12/bin/pg_config"
          16:49:18 69684 INFO  `pg_autoctl create postgres` would use "/Applications/Postgres.app/Contents/Versions/12/bin/pg_ctl" for Postgres 12.3
          16:49:18 69684 INFO  `pg_autoctl create monitor` would use "/Applications/Postgres.app/Contents/Versions/12/bin/pg_ctl" for Postgres 12.3

   pg_autoctl do pgsetup discover
       Given a PGDATA or --pgdata option, the command discovers if  a  running  Postgres  service
       matches  the  pg_autoctl setup, and prints the information that pg_autoctl typically needs
       when managing a Postgres instance.

          $ pg_autoctl do pgsetup discover --pgdata node1
          pgdata:                /Users/dim/dev/MS/pg_auto_failover/tmux/node1
          pg_ctl:                /Applications/Postgres.app/Contents/Versions/12/bin/pg_ctl
          pg_version:            12.3
          pghost:                /tmp
          pgport:                5501
          proxyport:             0
          pid:                   21029
          is in recovery:        no
          Control Version:       1201
          Catalog Version:       201909212
          System Identifier:     6942422768095393833
          Latest checkpoint LSN: 0/4059C18
          Postmaster status:     ready

   pg_autoctl do pgsetup ready
       Similar to the pg_isready command, though it uses the Postgres specifications found in the
       pg_autoctl node setup.

          $ pg_autoctl do pgsetup ready --pgdata node1
          16:50:08 70582 INFO  Postgres status is: "ready"

   pg_autoctl do pgsetup wait
       When pg_autoctl do pgsetup ready would return false because Postgres is not ready yet,
       this command continues probing every second for 30 seconds, and exits as soon as Postgres
       is ready.

          $ pg_autoctl do pgsetup wait --pgdata node1
          16:50:22 70829 INFO  Postgres is now serving PGDATA "/Users/dim/dev/MS/pg_auto_failover/tmux/node1" on port 5501 with pid 21029
          16:50:22 70829 INFO  Postgres status is: "ready"

   pg_autoctl do pgsetup logs
       Outputs the Postgres logs from the most recent log file in the PGDATA/log directory.

          $ pg_autoctl do pgsetup logs --pgdata node1
          16:50:39 71126 WARN  Postgres logs from "/Users/dim/dev/MS/pg_auto_failover/tmux/node1/startup.log":
          16:50:39 71126 INFO  2021-03-22 14:43:48.911 CET [21029] LOG:  starting PostgreSQL 12.3 on x86_64-apple-darwin16.7.0, compiled by Apple LLVM version 8.1.0 (clang-802.0.42), 64-bit
          16:50:39 71126 INFO  2021-03-22 14:43:48.913 CET [21029] LOG:  listening on IPv6 address "::", port 5501
          16:50:39 71126 INFO  2021-03-22 14:43:48.913 CET [21029] LOG:  listening on IPv4 address "0.0.0.0", port 5501
          16:50:39 71126 INFO  2021-03-22 14:43:48.913 CET [21029] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5501"
          16:50:39 71126 INFO  2021-03-22 14:43:48.931 CET [21029] LOG:  redirecting log output to logging collector process
          16:50:39 71126 INFO  2021-03-22 14:43:48.931 CET [21029] HINT:  Future log output will appear in directory "log".
          16:50:39 71126 WARN  Postgres logs from "/Users/dim/dev/MS/pg_auto_failover/tmux/node1/log/postgresql-2021-03-22_144348.log":
          16:50:39 71126 INFO  2021-03-22 14:43:48.937 CET [21033] LOG:  database system was shut down at 2021-03-22 14:43:46 CET
          16:50:39 71126 INFO  2021-03-22 14:43:48.937 CET [21033] LOG:  entering standby mode
          16:50:39 71126 INFO  2021-03-22 14:43:48.942 CET [21033] LOG:  consistent recovery state reached at 0/4022E88
          16:50:39 71126 INFO  2021-03-22 14:43:48.942 CET [21033] LOG:  invalid record length at 0/4022E88: wanted 24, got 0
          16:50:39 71126 INFO  2021-03-22 14:43:48.946 CET [21029] LOG:  database system is ready to accept read only connections
          16:50:39 71126 INFO  2021-03-22 14:43:49.032 CET [21038] LOG:  fetching timeline history file for timeline 4 from primary server
          16:50:39 71126 INFO  2021-03-22 14:43:49.037 CET [21038] LOG:  started streaming WAL from primary at 0/4000000 on timeline 3
          16:50:39 71126 INFO  2021-03-22 14:43:49.046 CET [21038] LOG:  replication terminated by primary server
          16:50:39 71126 INFO  2021-03-22 14:43:49.046 CET [21038] DETAIL:  End of WAL reached on timeline 3 at 0/4022E88.
          16:50:39 71126 INFO  2021-03-22 14:43:49.047 CET [21033] LOG:  new target timeline is 4
          16:50:39 71126 INFO  2021-03-22 14:43:49.049 CET [21038] LOG:  restarted WAL streaming at 0/4000000 on timeline 4
          16:50:39 71126 INFO  2021-03-22 14:43:49.210 CET [21033] LOG:  redo starts at 0/4022E88
          16:50:39 71126 INFO  2021-03-22 14:52:06.692 CET [21029] LOG:  received SIGHUP, reloading configuration files
          16:50:39 71126 INFO  2021-03-22 14:52:06.906 CET [21029] LOG:  received SIGHUP, reloading configuration files
          16:50:39 71126 FATAL 2021-03-22 15:34:24.920 CET [21038] FATAL:  terminating walreceiver due to timeout
          16:50:39 71126 INFO  2021-03-22 15:34:24.973 CET [21033] LOG:  invalid record length at 0/4059CC8: wanted 24, got 0
          16:50:39 71126 INFO  2021-03-22 15:34:25.105 CET [35801] LOG:  started streaming WAL from primary at 0/4000000 on timeline 4
          16:50:39 71126 FATAL 2021-03-22 16:12:56.918 CET [35801] FATAL:  terminating walreceiver due to timeout
          16:50:39 71126 INFO  2021-03-22 16:12:57.086 CET [38741] LOG:  started streaming WAL from primary at 0/4000000 on timeline 4
          16:50:39 71126 FATAL 2021-03-22 16:23:39.349 CET [38741] FATAL:  terminating walreceiver due to timeout
          16:50:39 71126 INFO  2021-03-22 16:23:39.497 CET [41635] LOG:  started streaming WAL from primary at 0/4000000 on timeline 4

   pg_autoctl do pgsetup tune
       Outputs the pg_autoctl automated tuning options. Depending on the number of CPUs and the
       amount of RAM detected in the environment where it is run, pg_autoctl can adjust some very
       basic Postgres tuning knobs to get started.

          $ pg_autoctl do pgsetup tune --pgdata node1 -vv
          13:25:25 77185 DEBUG pgtuning.c:85: Detected 12 CPUs and 16 GB total RAM on this server
          13:25:25 77185 DEBUG pgtuning.c:225: Setting autovacuum_max_workers to 3
          13:25:25 77185 DEBUG pgtuning.c:228: Setting shared_buffers to 4096 MB
          13:25:25 77185 DEBUG pgtuning.c:231: Setting work_mem to 24 MB
          13:25:25 77185 DEBUG pgtuning.c:235: Setting maintenance_work_mem to 512 MB
          13:25:25 77185 DEBUG pgtuning.c:239: Setting effective_cache_size to 12 GB
          # basic tuning computed by pg_auto_failover
          track_functions = pl
          shared_buffers = '4096 MB'
          work_mem = '24 MB'
          maintenance_work_mem = '512 MB'
          effective_cache_size = '12 GB'
          autovacuum_max_workers = 3
          autovacuum_vacuum_scale_factor = 0.08
          autovacuum_analyze_scale_factor = 0.02

       The  low-level  API  is  made available through the following pg_autoctl do commands, only
       available in debug environments:

          pg_autoctl do
          + monitor  Query a pg_auto_failover monitor
          + fsm      Manually manage the keeper's state
          + primary  Manage a PostgreSQL primary server
          + standby  Manage a PostgreSQL standby server
          + show     Show some debug level information
          + pgsetup  Manage a local Postgres setup
          + pgctl    Signal the pg_autoctl postgres service
          + service  Run pg_autoctl sub-processes (services)
          + tmux     Set of facilities to handle tmux interactive sessions
          + azure    Manage a set of Azure resources for a pg_auto_failover demo
          + demo     Use a demo application for pg_auto_failover

          pg_autoctl do monitor
          + get                 Get information from the monitor
            register            Register the current node with the monitor
            active              Call in the pg_auto_failover Node Active protocol
            version             Check that monitor version is 1.5.0.1; alter extension update if not
            parse-notification  parse a raw notification message

          pg_autoctl do monitor get
            primary      Get the primary node from pg_auto_failover in given formation/group
            others       Get the other nodes from the pg_auto_failover group of hostname/port
            coordinator  Get the coordinator node from the pg_auto_failover formation

          pg_autoctl do fsm
            init    Initialize the keeper's state on-disk
            state   Read the keeper's state from disk and display it
            list    List reachable FSM states from current state
            gv      Output the FSM as a .gv program suitable for graphviz/dot
            assign  Assign a new goal state to the keeper
            step    Make a state transition if instructed by the monitor
          + nodes   Manually manage the keeper's nodes list

          pg_autoctl do fsm nodes
            get  Get the list of nodes from file (see --disable-monitor)
            set  Set the list of nodes to file (see --disable-monitor)

          pg_autoctl do primary
          + slot      Manage replication slot on the primary server
          + adduser   Create users on primary
            defaults  Add default settings to postgresql.conf
            identify  Run the IDENTIFY_SYSTEM replication command on given host

          pg_autoctl do primary slot
            create  Create a replication slot on the primary server
            drop    Drop a replication slot on the primary server

          pg_autoctl do primary adduser
            monitor  add a local user for queries from the monitor
            replica  add a local user with replication privileges

          pg_autoctl do standby
            init     Initialize the standby server using pg_basebackup
            rewind   Rewind a demoted primary server using pg_rewind
            promote  Promote a standby server to become writable

          pg_autoctl do show
            ipaddr    Print this node's IP address information
            cidr      Print this node's CIDR information
            lookup    Print this node's DNS lookup information
            hostname  Print this node's default hostname
            reverse   Lookup given hostname and check reverse DNS setup

          pg_autoctl do pgsetup
            pg_ctl    Find a non-ambiguous pg_ctl program and Postgres version
            discover  Discover local PostgreSQL instance, if any
             ready     Return true if the local Postgres server is ready
            wait      Wait until the local Postgres server is ready
            logs      Outputs the Postgres startup logs
            tune      Compute and log some Postgres tuning options

          pg_autoctl do pgctl
            on   Signal pg_autoctl postgres service to ensure Postgres is running
            off  Signal pg_autoctl postgres service to ensure Postgres is stopped

          pg_autoctl do service
          + getpid        Get the pid of pg_autoctl sub-processes (services)
          + restart       Restart pg_autoctl sub-processes (services)
            pgcontroller  pg_autoctl supervised postgres controller
            postgres      pg_autoctl service that start/stop postgres when asked
            listener      pg_autoctl service that listens to the monitor notifications
            node-active   pg_autoctl service that implements the node active protocol

          pg_autoctl do service getpid
            postgres     Get the pid of the pg_autoctl postgres controller service
            listener     Get the pid of the pg_autoctl monitor listener service
            node-active  Get the pid of the pg_autoctl keeper node-active service

          pg_autoctl do service restart
            postgres     Restart the pg_autoctl postgres controller service
            listener     Restart the pg_autoctl monitor listener service
            node-active  Restart the pg_autoctl keeper node-active service

          pg_autoctl do tmux
            script   Produce a tmux script for a demo or a test case (debug only)
            session  Run a tmux session for a demo or a test case
            stop     Stop pg_autoctl processes that belong to a tmux session
            wait     Wait until a given node has been registered on the monitor
            clean    Clean-up a tmux session processes and root dir

          pg_autoctl do azure
          + provision  provision azure resources for a pg_auto_failover demo
          + tmux       Run a tmux session with an Azure setup for QA/testing
          + show       show azure resources for a pg_auto_failover demo
             deploy     Deploy pg_autoctl VMs, given by name
            create     Create an azure QA environment
            drop       Drop an azure QA environment: resource group, network, VMs
            ls         List resources in a given azure region
            ssh        Runs ssh -l ha-admin <public ip address> for a given VM name
            sync       Rsync pg_auto_failover sources on all the target region VMs

          pg_autoctl do azure provision
            region  Provision an azure region: resource group, network, VMs
            nodes   Provision our pre-created VM with pg_autoctl Postgres nodes

          pg_autoctl do azure tmux
            session  Create or attach a tmux session for the created Azure VMs
            kill     Kill an existing tmux session for Azure VMs

          pg_autoctl do azure show
            ips    Show public and private IP addresses for selected VMs
            state  Connect to the monitor node to show the current state

          pg_autoctl do demo
            run      Run the pg_auto_failover demo application
            uri      Grab the application connection string from the monitor
            ping     Attempt to connect to the application URI
            summary  Display a summary of the previous demo app run

   pg_autoctl run
       pg_autoctl run - Run the pg_autoctl service (monitor or keeper)

   Synopsis
       This command starts the processes needed to run a monitor node or a keeper node, depending
       on the configuration file that belongs to the --pgdata option or PGDATA environment
       variable.

          usage: pg_autoctl run  [ --pgdata --name --hostname --pgport ]

          --pgdata      path to data directory
          --name        pg_auto_failover node name
          --hostname    hostname used to connect from other nodes
          --pgport      PostgreSQL's port number

   Description
       When registering Postgres nodes to  the  pg_auto_failover  monitor  using  the  pg_autoctl
       create  postgres  command, the nodes are registered with metadata: the node name, hostname
       and Postgres port.

       The node name is mostly used in the logs and in the pg_autoctl show state output, and
       helps human administrators identify nodes in the formation.

       The  node  hostname  and  pgport  are  used by other nodes, including the pg_auto_failover
       monitor, to open a Postgres connection.

       Both the node name and  the  node  hostname  and  port  can  be  changed  after  the  node
       registration  by  using  either this command (pg_autoctl run) or the pg_autoctl config set
       command.
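
       For instance, here is a hedged sketch of changing the hostname of a node after
       registration, using pg_autoctl config set; the pgdata path and the new hostname are
       placeholders to adapt to your setup:

          # placeholders: adapt --pgdata and the hostname value to your setup
          $ pg_autoctl config set --pgdata node1 pg_autoctl.hostname node1.new.example.com
          $ pg_autoctl config get --pgdata node1 pg_autoctl.hostname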

   Options
       --pgdata
              Location of the Postgres node being managed locally. Defaults  to  the  environment
              variable  PGDATA.  Use --monitor to connect to a monitor from anywhere, rather than
              the monitor URI used by a local Postgres node managed with pg_autoctl.

        --name Node name used on the monitor to refer to this node. The hostname is a piece of
               technical information, and given the Postgres requirements on HBA setup and DNS
               resolution (both forward and reverse lookups), IP addresses are often used for the
               hostname.

              The --name option allows using a user-friendly name for your Postgres nodes.

       --hostname
              Hostname or IP address (both v4 and v6 are supported) to use from any other node to
              connect to this node.

              When not provided, a default value is computed by running the following algorithm.

                  1. We get this machine's "public IP" by opening a connection to the given
                     monitor hostname or IP address. Then we get the TCP/IP client address that
                     has been used to make that connection.

                 2. We  then do a reverse DNS lookup on the IP address found in the previous step
                    to fetch a hostname for our local machine.

                  3. If the reverse DNS lookup is successful, then pg_autoctl does a forward DNS
                     lookup of that hostname.

               When the forward DNS lookup response in step 3 is an IP address found in one of
               our local network interfaces, then pg_autoctl uses the hostname found in step 2 as
               the default --hostname. Otherwise it uses the IP address found in step 1.

              You may use the --hostname command line option to bypass the whole DNS lookup based
              process and force the local node name to a fixed value.
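
               To check which default hostname this algorithm would pick on a given machine, you
               may point the pg_autoctl do show hostname debug command (documented above) at your
               monitor URI; the URI below is only a placeholder to adapt to your setup:

                  # placeholder: replace the monitor URI with your own
                  $ pg_autoctl do show hostname \
                        'postgres://autoctl_node@monitor.example.com:5432/pg_auto_failover'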

       --pgport
              Postgres port to use, defaults to 5432.

   pg_autoctl watch
       pg_autoctl watch - Display an auto-updating dashboard

   Synopsis
       This command displays an auto-updating dashboard of the state changes and events that the
       pg_auto_failover monitor records for the nodes it manages:

          usage: pg_autoctl watch  [ --pgdata --formation --group ]

          --pgdata      path to data directory
          --monitor     show the monitor uri
          --formation   formation to query, defaults to 'default'
          --group       group to query formation, defaults to all
          --json        output data in the JSON format

   Options
       --pgdata
              Location  of  the  Postgres node being managed locally. Defaults to the environment
              variable PGDATA. Use --monitor to connect to a monitor from anywhere,  rather  than
              the monitor URI used by a local Postgres node managed with pg_autoctl.

       --monitor
              Postgres URI used to connect to the monitor. Must use the autoctl_node username and
              target the pg_auto_failover database name. It is possible to show the Postgres  URI
              from the monitor node using the command pg_autoctl show uri.

       --formation
              List the events recorded for nodes in the given formation. Defaults to default.

        --group
               Limit output to a single group in the formation. Defaults to including all groups
               registered in the target formation.

   Environment
       PGDATA
          Postgres directory location. Can be used instead of the --pgdata option.

       PG_AUTOCTL_MONITOR
          Postgres URI to connect to the monitor node, can  be  used  instead  of  the  --monitor
          option.

       XDG_CONFIG_HOME
          The   pg_autoctl   command  stores  its  configuration  files  in  the  standard  place
          XDG_CONFIG_HOME. See the XDG Base Directory Specification.

       XDG_DATA_HOME
          The pg_autoctl  command  stores  its  internal  states  files  in  the  standard  place
          XDG_DATA_HOME,   which   defaults   to  ~/.local/share.  See  the  XDG  Base  Directory
          Specification.

   Description
       The pg_autoctl watch output is divided into three sections.

       The first section is a single header line which includes the name of the currently
       selected formation, the formation replication setting Number Sync Standbys, and, in the
       rightmost position, the current time.

       The second section displays one line per node, and each line contains a list of columns
       that describe the current state of the node. This list can include the following columns;
       which columns are part of the output depends on the terminal window size, and the choice
       is adjusted dynamically whenever the window size changes:

          • Name
                Name of the node.

          • Node, or Id
                Node  information.  When the formation has a single group (group zero), then this
                column only contains the nodeId.

                Only Citus formations allow several groups. When using a Citus formation the Node
                column contains the groupId and the nodeId, separated by a colon, such as 0:1 for
                the first coordinator node.

          • Last Report, or Report
                Time interval between now and the last known time when a node has reported to the
                monitor, using the node_active protocol.

                 This value is expected to stay under about 2s, and is known to increase when
                 either the pg_autoctl run service is not running, or when there is a network
                 split.

          • Last Check, or Check
                Time  interval between now and the last known time when the monitor could connect
                to a node's Postgres instance, via its health check mechanism.

                 This value is known to increase when the Postgres service is not running on the
                 target node, when there is a network split, or when the internal machinery (the
                 health check worker background process) implements jitter.

          • Host:Port
                Hostname and port number used to connect to the node.

          • TLI: LSN
                Timeline identifier (TLI) and Postgres Log Sequence Number (LSN).

                The LSN is the current position in the Postgres WAL stream. This is a hexadecimal
                number. See pg_lsn for more information.

                The  current  timeline is incremented each time a failover happens, or when doing
                Point In Time Recovery. A node can only reach the secondary state when it  is  on
                the same timeline as its primary node.

          • Connection
                 This output field contains two bits of information: first, the Postgres
                 connection type that the node provides, either read-write or read-only; then the
                 mark ! is added when the monitor has failed to connect to this node, and ? when
                 the monitor has not connected to the node yet.

          • Reported State
                The current FSM state as reported  to  the  monitor  by  the  pg_autoctl  process
                running on the Postgres node.

          • Assigned State
                 The assigned FSM state on the monitor. When the assigned state is not the same
                 as the reported state, then the pg_autoctl process running on the Postgres node
                 might not have retrieved the assigned state yet, or might still be implementing
                 the FSM transition from the current state to the assigned state.

       The third and last section lists the most recent events that the monitor has registered;
       the most recent event is found at the bottom of the screen.

       To quit the command hit either the F1 key or the q key.

   pg_autoctl stop
       pg_autoctl stop - signal the pg_autoctl service for it to stop

   Synopsis
       This command stops the processes needed to run a monitor node or a keeper node, depending
       on the configuration file that belongs to the --pgdata option or PGDATA environment
       variable.

          usage: pg_autoctl stop  [ --pgdata --fast --immediate ]

          --pgdata      path to data directory
          --fast        fast shutdown mode for the keeper
          --immediate   immediate shutdown mode for the keeper

   Description
       The pg_autoctl stop command finds the PID of the running service for the given --pgdata,
       and if the process is still running, sends a SIGTERM signal to the process.

       When pg_autoctl receives a shutdown signal, a shutdown sequence is triggered. Depending on
       the signal received, an operation that has been started (such as a state transition) is
       either run to completion, stopped at the next opportunity, or stopped immediately even
       when in the middle of the transition.
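
       For example, a fast shutdown of the pg_autoctl service for a given data directory looks
       like the following (the pgdata path is a placeholder):

          # placeholder: adapt --pgdata to your local node directory
          $ pg_autoctl stop --pgdata node1 --fast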

   Options
       --pgdata
              Location  of  the  Postgres node being managed locally. Defaults to the environment
              variable PGDATA. Use --monitor to connect to a monitor from anywhere,  rather  than
              the monitor URI used by a local Postgres node managed with pg_autoctl.

       --fast Fast  Shutdown mode for pg_autoctl. Sends the SIGINT signal to the running service,
              which is the same as using C-c on an interactive process running  as  a  foreground
              shell job.

       --immediate
              Immediate  Shutdown  mode  for  pg_autoctl. Sends the SIGQUIT signal to the running
              service.

   Environment
       PGDATA
          Postgres directory location. Can be used instead of the --pgdata option.

       PG_AUTOCTL_MONITOR
          Postgres URI to connect to the monitor node, can  be  used  instead  of  the  --monitor
          option.

       XDG_CONFIG_HOME
          The   pg_autoctl   command  stores  its  configuration  files  in  the  standard  place
          XDG_CONFIG_HOME. See the XDG Base Directory Specification.

       XDG_DATA_HOME
          The pg_autoctl  command  stores  its  internal  states  files  in  the  standard  place
          XDG_DATA_HOME,   which   defaults   to  ~/.local/share.  See  the  XDG  Base  Directory
          Specification.

   pg_autoctl reload
       pg_autoctl reload - signal the pg_autoctl service for it to reload its configuration

   Synopsis
       This command signals a running pg_autoctl process to reload its configuration from disk,
       and also signals the managed Postgres service to reload its configuration.

          usage: pg_autoctl reload  [ --pgdata ] [ --json ]

          --pgdata      path to data directory

   Description
       The pg_autoctl reload command finds the PID of the running service for the given --pgdata,
       and if the process is still running, sends a SIGHUP signal to the process.
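
       For example (the pgdata path below is a placeholder):

          # placeholder: adapt --pgdata to your local node directory
          $ pg_autoctl reload --pgdata node1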

   Options
       --pgdata
              Location of the Postgres node being managed locally. Defaults  to  the  environment
              variable  PGDATA.  Use --monitor to connect to a monitor from anywhere, rather than
              the monitor URI used by a local Postgres node managed with pg_autoctl.

   pg_autoctl status
       pg_autoctl status - Display the current status of the pg_autoctl service

   Synopsis
       This command outputs the current process status for the pg_autoctl service running for the
       given --pgdata location.

          usage: pg_autoctl status  [ --pgdata ] [ --json ]

          --pgdata      path to data directory
          --json        output data in the JSON format

   Options
       --pgdata
              Location  of  the  Postgres node being managed locally. Defaults to the environment
              variable PGDATA. Use --monitor to connect to a monitor from anywhere,  rather  than
              the monitor URI used by a local Postgres node managed with pg_autoctl.

        --json Output JSON-formatted data instead of a table-formatted list.

   Environment
       PGDATA
          Postgres directory location. Can be used instead of the --pgdata option.

       PG_AUTOCTL_MONITOR
          Postgres  URI  to  connect  to  the  monitor node, can be used instead of the --monitor
          option.

       XDG_CONFIG_HOME
          The  pg_autoctl  command  stores  its  configuration  files  in  the   standard   place
          XDG_CONFIG_HOME. See the XDG Base Directory Specification.

       XDG_DATA_HOME
          The  pg_autoctl  command  stores  its  internal  states  files  in  the  standard place
          XDG_DATA_HOME,  which  defaults  to  ~/.local/share.  See  the   XDG   Base   Directory
          Specification.

   Example
          $ pg_autoctl status --pgdata node1
          11:26:30 27248 INFO  pg_autoctl is running with pid 26618
          11:26:30 27248 INFO  Postgres is serving PGDATA "/Users/dim/dev/MS/pg_auto_failover/tmux/node1" on port 5501 with pid 26725

          $ pg_autoctl status --pgdata node1 --json
          11:26:37 27385 INFO  pg_autoctl is running with pid 26618
          11:26:37 27385 INFO  Postgres is serving PGDATA "/Users/dim/dev/MS/pg_auto_failover/tmux/node1" on port 5501 with pid 26725
          {
              "postgres": {
                  "pgdata": "\/Users\/dim\/dev\/MS\/pg_auto_failover\/tmux\/node1",
                  "pg_ctl": "\/Applications\/Postgres.app\/Contents\/Versions\/12\/bin\/pg_ctl",
                  "version": "12.3",
                  "host": "\/tmp",
                  "port": 5501,
                  "proxyport": 0,
                  "pid": 26725,
                  "in_recovery": false,
                  "control": {
                      "version": 0,
                      "catalog_version": 0,
                      "system_identifier": "0"
                  },
                  "postmaster": {
                      "status": "ready"
                  }
              },
              "pg_autoctl": {
                  "pid": 26618,
                  "status": "running",
                  "pgdata": "\/Users\/dim\/dev\/MS\/pg_auto_failover\/tmux\/node1",
                  "version": "1.5.0",
                  "semId": 196609,
                  "services": [
                      {
                          "name": "postgres",
                          "pid": 26625,
                          "status": "running",
                          "version": "1.5.0",
                          "pgautofailover": "1.5.0.1"
                      },
                      {
                          "name": "node-active",
                          "pid": 26626,
                          "status": "running",
                          "version": "1.5.0",
                          "pgautofailover": "1.5.0.1"
                      }
                  ]
              }
          }

   pg_autoctl activate
       pg_autoctl activate - Activate a Citus worker from the Citus coordinator

   Synopsis
       This  command  calls  the Citus “activation” API so that a node can be used to host shards
       for your reference and distributed tables.

          usage: pg_autoctl activate  [ --pgdata ]

            --pgdata      path to data directory

   Description
       When creating a Citus worker, pg_autoctl create worker automatically activates the worker
       node at the coordinator. You only need this command when something unexpected has happened
       and you want to manually make sure the worker node has been activated at the Citus
       coordinator level.

       Starting  with  Citus  10 it is also possible to activate the coordinator itself as a node
       with shard placement. Use pg_autoctl activate on your Citus coordinator node  manually  to
       use that feature.

       When  the  Citus  coordinator  is  activated,  an extra step is then needed for it to host
       shards of distributed tables. If you want your coordinator to have  shards,  then  have  a
       look  at  the  Citus  API  citus_set_node_property to set the shouldhaveshards property to
       true.
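
       As a hedged sketch, that property can be set from psql on the coordinator; the connection
       string, node name, and port below are placeholders to adapt to your Citus setup:

          # placeholders: adapt the connection string and the registered coordinator name/port
          $ psql 'postgres://citus@coordinator.example.com:5432/appdb' \
                -c "SELECT citus_set_node_property('coordinator.example.com', 5432, 'shouldhaveshards', true);"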

   Options
       --pgdata
              Location of the Postgres node being managed locally. Defaults  to  the  environment
              variable  PGDATA.  Use --monitor to connect to a monitor from anywhere, rather than
              the monitor URI used by a local Postgres node managed with pg_autoctl.

CONFIGURING PG_AUTO_FAILOVER

       Several default settings of pg_auto_failover can be reviewed and changed depending on the
       trade-offs you want to implement in your own production setup. The settings that you can
       change will have an impact on the following operations:

          • Deciding when to promote the secondary

            pg_auto_failover decides to implement a  failover  to  the  secondary  node  when  it
            detects that the primary node is unhealthy. Changing the following settings will have
            an impact on when the pg_auto_failover  monitor  decides  to  promote  the  secondary
            PostgreSQL node:

                pgautofailover.health_check_max_retries
                pgautofailover.health_check_period
                pgautofailover.health_check_retry_delay
                pgautofailover.health_check_timeout
                pgautofailover.node_considered_unhealthy_timeout

          • Time taken to promote the secondary

             At secondary promotion time, pg_auto_failover waits for the following timeout to
             make sure that all pending writes on the primary server made it to the secondary at
             shutdown time, thus preventing data loss:

                pgautofailover.primary_demote_timeout

          • Preventing promotion of the secondary

            pg_auto_failover  implements  a  trade-off  where  data  availability  trumps service
            availability. When the primary node of a PostgreSQL service  is  detected  unhealthy,
            the  secondary is only promoted if it was known to be eligible at the moment when the
            primary is lost.

             When synchronous replication was in use at the moment the primary node is lost, we
             know we can switch to the secondary safely: the WAL lag is then 0.

             When the secondary server had previously been detected unhealthy, the
             pg_auto_failover monitor switches it from the state SECONDARY to the state
             CATCHING-UP, and promotion is then prevented.

             The following setting makes it possible to still promote the secondary, at the cost
             of allowing for a window of data loss:

                pgautofailover.promote_wal_log_threshold

   pg_auto_failover Monitor
       The configuration for the behavior of the monitor happens in the PostgreSQL database where
       the extension has been deployed:

          pg_auto_failover=> select name, setting, unit, short_desc from pg_settings where name ~ 'pgautofailover.';
          -[ RECORD 1 ]----------------------------------------------------------------------------------------------------
          name       | pgautofailover.enable_sync_wal_log_threshold
          setting    | 16777216
          unit       |
          short_desc | Don't enable synchronous replication until secondary xlog is within this many bytes of the primary's
          -[ RECORD 2 ]----------------------------------------------------------------------------------------------------
          name       | pgautofailover.health_check_max_retries
          setting    | 2
          unit       |
          short_desc | Maximum number of re-tries before marking a node as failed.
          -[ RECORD 3 ]----------------------------------------------------------------------------------------------------
          name       | pgautofailover.health_check_period
          setting    | 5000
          unit       | ms
          short_desc | Duration between each check (in milliseconds).
          -[ RECORD 4 ]----------------------------------------------------------------------------------------------------
          name       | pgautofailover.health_check_retry_delay
          setting    | 2000
          unit       | ms
          short_desc | Delay between consecutive retries.
          -[ RECORD 5 ]----------------------------------------------------------------------------------------------------
          name       | pgautofailover.health_check_timeout
          setting    | 5000
          unit       | ms
          short_desc | Connect timeout (in milliseconds).
          -[ RECORD 6 ]----------------------------------------------------------------------------------------------------
          name       | pgautofailover.node_considered_unhealthy_timeout
          setting    | 20000
          unit       | ms
          short_desc | Mark node unhealthy if last ping was over this long ago
          -[ RECORD 7 ]----------------------------------------------------------------------------------------------------
          name       | pgautofailover.primary_demote_timeout
          setting    | 30000
          unit       | ms
          short_desc | Give the primary this long to drain before promoting the secondary
          -[ RECORD 8 ]----------------------------------------------------------------------------------------------------
          name       | pgautofailover.promote_wal_log_threshold
          setting    | 16777216
          unit       |
          short_desc | Don't promote secondary unless xlog is with this many bytes of the master
          -[ RECORD 9 ]----------------------------------------------------------------------------------------------------
          name       | pgautofailover.startup_grace_period
          setting    | 10000
          unit       | ms
          short_desc | Wait for at least this much time after startup before initiating a failover.

       You can edit the parameters as usual with PostgreSQL, either in the  postgresql.conf  file
       or  using  ALTER DATABASE pg_auto_failover SET parameter = value; commands, then issuing a
       reload.
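
       As a hedged example of the ALTER DATABASE form mentioned above, the connection string and
       the value below are placeholders to adapt to your monitor:

          # placeholders: adapt the connection string and the parameter value to your setup
          $ psql 'postgres://autoctl@monitor.example.com:5432/pg_auto_failover' \
                -c "ALTER DATABASE pg_auto_failover SET pgautofailover.health_check_period = 2500"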

   pg_auto_failover Keeper Service
       For an introduction to the pg_autoctl commands relevant  to  the  pg_auto_failover  Keeper
       configuration, please see pg_autoctl config.

       An example configuration file looks like the following:

          [pg_autoctl]
          role = keeper
          monitor = postgres://autoctl_node@192.168.1.34:6000/pg_auto_failover
          formation = default
          group = 0
          hostname = node1.db
          nodekind = standalone

          [postgresql]
          pgdata = /data/pgsql/
          pg_ctl = /usr/pgsql-10/bin/pg_ctl
          dbname = postgres
          host = /tmp
          port = 5000

          [replication]
          slot = pgautofailover_standby
          maximum_backup_rate = 100M
          backup_directory = /data/backup/node1.db

          [timeout]
          network_partition_timeout = 20
          postgresql_restart_failure_timeout = 20
          postgresql_restart_failure_max_retries = 3

       To  output,  edit  and  check  entries  of  the  configuration, the following commands are
       provided:

          pg_autoctl config check [--pgdata <pgdata>]
          pg_autoctl config get [--pgdata <pgdata>] section.option
          pg_autoctl config set [--pgdata <pgdata>] section.option value

       The [postgresql] section is discovered automatically by the pg_autoctl command and is  not
       intended to be changed manually.
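
       For example (the pgdata path and the new value are placeholders; output is omitted here):

          # placeholders: adapt --pgdata and the new value to your setup
          $ pg_autoctl config check --pgdata node1
          $ pg_autoctl config get --pgdata node1 replication.maximum_backup_rate
          $ pg_autoctl config set --pgdata node1 replication.maximum_backup_rate 50M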

       pg_autoctl.monitor

       PostgreSQL  service  URL  of  the  pg_auto_failover monitor, as given in the output of the
       pg_autoctl show uri command.

       pg_autoctl.formation

       A single pg_auto_failover monitor may handle  several  postgres  formations.  The  default
       formation name default is usually fine.

       pg_autoctl.group

       This  information  is  retrieved by the pg_auto_failover keeper when registering a node to
       the monitor, and should not be changed afterwards. Use at your own risk.

       pg_autoctl.hostname

        Node hostname used by all the other nodes in the cluster to contact this node. In
        particular, if this node is a primary then its standby uses that address to set up
        streaming replication.

       replication.slot

       Name  of  the  PostgreSQL  replication  slot  used  in  the  streaming  replication  setup
       automatically  deployed  by  pg_auto_failover.  Replication  slots  can't  be  renamed  in
       PostgreSQL.

       replication.maximum_backup_rate

       When pg_auto_failover (re-)builds a standby node using  the  pg_basebackup  command,  this
       parameter  is  given  to pg_basebackup to throttle the network bandwidth used. Defaults to
       100Mbps.

       replication.backup_directory

        When pg_auto_failover (re-)builds a standby node using the pg_basebackup command, this
        parameter is the target directory where the bits from the primary server are copied. When
        the copy has been successful, the directory is renamed to postgresql.pgdata.

        The default value is computed from ${PGDATA}/../backup/${hostname} and can be set to any
        value of your preference. Remember that the directory renaming is an atomic operation
        only when both the source and the target of the copy are in the same filesystem, at least
        on Unix systems.

       timeout

        This section allows setting up the behavior of the pg_auto_failover keeper in interesting
        scenarios.

       timeout.network_partition_timeout

        Timeout in seconds before we consider that a failure to communicate with other nodes
        indicates a network partition. This check is only done on a PRIMARY server, so other
        nodes here means both the monitor and the standby.

       When a PRIMARY node is detected to be on the losing  side  of  a  network  partition,  the
       pg_auto_failover keeper enters the DEMOTE state and stops the PostgreSQL instance in order
       to protect against split brain situations.

       The default is 20s.

       timeout.postgresql_restart_failure_timeout

       timeout.postgresql_restart_failure_max_retries

        When PostgreSQL is not running, the first thing the pg_auto_failover keeper does is try
        to restart it. In case of a transient failure (e.g. the file system is full, or another
        dynamic OS resource constraint), the best course of action is to try again for a little
        while before reaching out to the monitor and asking for a failover.

        The pg_auto_failover keeper tries to restart PostgreSQL
        timeout.postgresql_restart_failure_max_retries times in a row (default 3) or for up to
        timeout.postgresql_restart_failure_timeout (default 20s) after it detected that
        PostgreSQL is not running, whichever comes first.

OPERATING PG_AUTO_FAILOVER

       This section is not yet complete. Please contact us with any questions.

   Deployment
       pg_auto_failover is a general purpose tool for setting up PostgreSQL replication in  order
       to implement High Availability of the PostgreSQL service.

   Provisioning
       It  is also possible to register pre-existing PostgreSQL instances with a pg_auto_failover
       monitor. The pg_autoctl create command honors the PGDATA environment variable, and  checks
       whether PostgreSQL is already running. If Postgres is detected, the new node is registered
       in SINGLE mode, bypassing the monitor's role assignment policy.
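
       A hedged sketch of registering such a pre-existing instance follows; the data directory,
       monitor URI, and authentication and SSL choices are placeholders to adapt to your
       environment:

          # placeholders: adapt PGDATA and the monitor URI; --auth trust and --ssl-self-signed
          # are for illustration only, pick stronger settings in production
          $ export PGDATA=/var/lib/postgresql/data
          $ pg_autoctl create postgres \
                --monitor 'postgres://autoctl_node@monitor.example.com:5432/pg_auto_failover' \
                --auth trust \
                --ssl-self-signed \
                --run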

   Postgres configuration management
       The pg_autoctl create postgres command  edits  the  default  Postgres  configuration  file
       (postgresql.conf) to include pg_auto_failover settings.

        The include directive is placed at the top of the postgresql.conf file in a way that lets
        you override any setting by editing it later in the file.

        Unless you use the --skip-pg-hba option, pg_autoctl edits a minimal set of HBA rules for
        you, so that the pg_auto_failover nodes are able to connect to each other. The HBA rules
        that your application needs to connect to your Postgres nodes still have to be added. As
        pg_autoctl knows nothing about your applications, you are responsible for editing the HBA
        file, as illustrated below.
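
        For instance, allowing an application role to connect might look like the following
        sketch; the database name, role, network, and authentication method are placeholders to
        adapt to your environment:

          # placeholders: database, role, network and auth method are examples only
          $ echo 'host appdb appuser 10.0.0.0/8 scram-sha-256' >> "$PGDATA/pg_hba.conf"
          $ pg_autoctl reload --pgdata "$PGDATA"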

   Upgrading pg_auto_failover, from versions 1.4 onward
       When upgrading a pg_auto_failover setup, the procedure is different on the monitor and  on
       the Postgres nodes:

          • on  the monitor, the internal pg_auto_failover database schema might have changed and
            needs to be upgraded to its new definition,  porting  the  existing  data  over.  The
            pg_auto_failover  database  contains the registration of every node in the system and
            their current state.
                It is not possible to trigger a failover during  the  monitor  update.   Postgres
                operations on the Postgres nodes continue normally.

                During  the restart of the monitor, the other nodes might have trouble connecting
                to the monitor. The pg_autoctl command is designed to  retry  connecting  to  the
                monitor and handle errors gracefully.

          • on the Postgres nodes, the pg_autoctl command connects to the monitor every once in a
            while (every second by default), and then calls the node_active  protocol,  a  stored
            procedure in the monitor databases.
                The pg_autoctl command also verifies at each connection to the monitor that it is
                running the expected version of the extension. When that's not the case, the
                "node-active" sub-process quits, to be restarted with the possibly new version of
                the pg_autoctl binary found on-disk.

       As a result, here is the standard upgrade plan for pg_auto_failover:

           1. Upgrade the pg_auto_failover packages on all the nodes, monitor included.
                  When using a Debian-based OS, this looks like the following commands when
                  upgrading from 1.4 to 1.5:

                    sudo apt-get remove pg-auto-failover-cli-1.4 postgresql-11-auto-failover-1.4
                    sudo apt-get install -q -y pg-auto-failover-cli-1.5 postgresql-11-auto-failover-1.5

          2. Restart the pgautofailover service on the monitor.
                 When using the systemd integration, all we need to do is:

                    sudo systemctl restart pgautofailover

                 Then  we may use the following commands to make sure that the service is running
                 as expected:

                    sudo systemctl status pgautofailover
                    sudo journalctl -u pgautofailover

                 At this point it is expected that the pg_autoctl logs show that an  upgrade  has
                 been  performed  by  using  the  ALTER  EXTENSION  pgautofailover  UPDATE TO ...
                 command. The monitor is ready with the new version of pg_auto_failover.

        When the pg_autoctl process on a Postgres node connects to the new monitor version, the
        check for version compatibility fails, and the "node-active" sub-process exits. The main
        pg_autoctl process supervisor then restarts the "node-active" sub-process from its
        on-disk binary executable file, which has been upgraded to the new version. That's why we
        first install the new packages for pg_auto_failover on every node, and only then restart
        the monitor.

       IMPORTANT:
          Before  upgrading  the monitor, which is a simple restart of the pg_autoctl process, it
          is important that the OS packages for pgautofailover be updated  on  all  the  Postgres
          nodes.

          When  that's not the case, pg_autoctl on the Postgres nodes will still detect a version
          mismatch with the monitor extension, and the "node-active" sub-process will  exit.  And
          when  restarted  automatically,  the  same  version  of  the  local  pg_autoctl  binary
          executable is found on-disk, leading to the same  version  mismatch  with  the  monitor
          extension.

          After  restarting  the  "node-active"  process  5  times, pg_autoctl quits retrying and
          stops. This includes stopping the Postgres service too, and a  service  downtime  might
          then occur.

        And when the upgrade is done we can use pg_autoctl show state on the monitor to see that
        everything is as expected.

   Upgrading from previous pg_auto_failover versions
       The new upgrade procedure described in the previous section is  part  of  pg_auto_failover
       since  version  1.4. When upgrading from a previous version of pg_auto_failover, up to and
       including version 1.3, then all the pg_autoctl processes have to be restarted fully.

       To prevent triggering a failover during the upgrade, it's best to put your secondary nodes
       in maintenance. The procedure then looks like the following:

          1. Enable maintenance on your secondary node(s):

                 pg_autoctl enable maintenance

          2. Upgrade the OS packages for pg_auto_failover on every node, as per previous section.

          3. Restart the monitor to upgrade it to the new pg_auto_failover version:
                 When using the systemd integration, all we need to do is:

                    sudo systemctl restart pgautofailover

                 Then  we may use the following commands to make sure that the service is running
                 as expected:

                    sudo systemctl status pgautofailover
                    sudo journalctl -u pgautofailover

                 At this point it is expected that the pg_autoctl logs show that an  upgrade  has
                 been  performed  by  using  the  ALTER  EXTENSION  pgautofailover  UPDATE TO ...
                 command. The monitor is ready with the new version of pg_auto_failover.

          4. Restart pg_autoctl on all Postgres nodes on the cluster.
                 When using the systemd integration, all we need to do is:

                    sudo systemctl restart pgautofailover

                 As in the previous point in this list, make sure the service is now  running  as
                 expected.

           5. Disable maintenance on your secondary node(s):

                 pg_autoctl disable maintenance

   Extension dependencies when upgrading the monitor
       Since  version  1.4.0 the pgautofailover extension requires the Postgres contrib extension
       btree_gist. The pg_autoctl command arranges for the creation of this dependency,  and  has
       been buggy in some releases.

        As a result, you might have trouble upgrading the pg_auto_failover monitor to a recent
        version. It is possible to fix the error by connecting to the monitor Postgres database
        and running the create extension command manually:

          # create extension btree_gist;

   Cluster Management and Operations
       It  is  possible  to  operate  pg_auto_failover  formations  and  groups directly from the
       monitor. All that is needed is an access to the monitor Postgres  database  as  a  client,
       such  as  psql.  It's also possible to add those management SQL function calls in your own
       ops application if you have one.

        For security reasons, the autoctl_node user is not allowed to perform maintenance
        operations. This user is limited to what pg_autoctl needs. You can either create a
        specific user and authentication rule to expose for management, or edit the default HBA
        rules for the autoctl user. In the following examples we connect directly as the autoctl
        role.
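
        As a hedged example, such a connection could look like the following; the monitor
        hostname and port are placeholders:

          # placeholder: adapt the monitor hostname and port to your setup
          $ psql 'postgres://autoctl@monitor.example.com:5432/pg_auto_failover'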

       The  main  operations with pg_auto_failover are node maintenance and manual failover, also
       known as a controlled switchover.

   Maintenance of a secondary node
       It is possible to put a secondary node in any group in a MAINTENANCE state,  so  that  the
       Postgres  server  is  not  doing synchronous replication anymore and can be taken down for
       maintenance purposes, such as security kernel upgrades or the like.

       The command line tool pg_autoctl exposes an API to schedule maintenance operations on  the
       current node, which must be a secondary node at the moment when maintenance is requested.

       Here's  an  example  of  using the maintenance commands on a secondary node, including the
       output. Of course, when you try that on your own nodes, dates and  PID  information  might
       differ:

          $ pg_autoctl enable maintenance
          17:49:19 14377 INFO  Listening monitor notifications about state changes in formation "default" and group 0
          17:49:19 14377 INFO  Following table displays times when notifications are received
              Time |  ID |      Host |   Port |       Current State |      Assigned State
          ---------+-----+-----------+--------+---------------------+--------------------
          17:49:19 |   1 | localhost |   5001 |             primary |        wait_primary
          17:49:19 |   2 | localhost |   5002 |           secondary |    wait_maintenance
          17:49:19 |   2 | localhost |   5002 |    wait_maintenance |    wait_maintenance
          17:49:20 |   1 | localhost |   5001 |        wait_primary |        wait_primary
          17:49:20 |   2 | localhost |   5002 |    wait_maintenance |         maintenance
          17:49:20 |   2 | localhost |   5002 |         maintenance |         maintenance

       The  command listens to the state changes in the current node's formation and group on the
       monitor and displays those changes as it receives them. The operation  is  done  when  the
       node has reached the maintenance state.

       It  is now possible to disable maintenance to allow pg_autoctl to manage this standby node
       again:

          $ pg_autoctl disable maintenance
          17:49:26 14437 INFO  Listening monitor notifications about state changes in formation "default" and group 0
          17:49:26 14437 INFO  Following table displays times when notifications are received
              Time |  ID |      Host |   Port |       Current State |      Assigned State
          ---------+-----+-----------+--------+---------------------+--------------------
          17:49:27 |   2 | localhost |   5002 |         maintenance |          catchingup
          17:49:27 |   2 | localhost |   5002 |          catchingup |          catchingup
          17:49:28 |   2 | localhost |   5002 |          catchingup |           secondary
          17:49:28 |   1 | localhost |   5001 |        wait_primary |             primary
          17:49:28 |   2 | localhost |   5002 |           secondary |           secondary
          17:49:29 |   1 | localhost |   5001 |             primary |             primary

       When a standby node is in maintenance, the monitor sets the primary  node  replication  to
       WAIT_PRIMARY:  in  this role, the PostgreSQL streaming replication is now asynchronous and
       the standby PostgreSQL server may be stopped, rebooted, etc.

   Maintenance of a primary node
        A primary node must be available at all times in any formation and group in
        pg_auto_failover; that is the invariant provided by the whole solution. With that in
        mind, the only way to allow a primary node to go to a maintenance mode is to first fail
        over and promote the secondary node.

       The  same  command  pg_autoctl  enable maintenance implements that operation when run on a
       primary node with the option --allow-failover. Here is an example of such an operation:

          $ pg_autoctl enable maintenance
          11:53:03 50526 WARN  Enabling maintenance on a primary causes a failover
          11:53:03 50526 FATAL Please use --allow-failover to allow the command proceed

       As we can see, the option --allow-failover is mandatory. In the next example we use it:

          $ pg_autoctl enable maintenance --allow-failover
          13:13:42 1614 INFO  Listening monitor notifications about state changes in formation "default" and group 0
          13:13:42 1614 INFO  Following table displays times when notifications are received
              Time |  ID |      Host |   Port |       Current State |      Assigned State
          ---------+-----+-----------+--------+---------------------+--------------------
          13:13:43 |   2 | localhost |   5002 |             primary | prepare_maintenance
          13:13:43 |   1 | localhost |   5001 |           secondary |   prepare_promotion
          13:13:43 |   1 | localhost |   5001 |   prepare_promotion |   prepare_promotion
          13:13:43 |   2 | localhost |   5002 | prepare_maintenance | prepare_maintenance
          13:13:44 |   1 | localhost |   5001 |   prepare_promotion |    stop_replication
          13:13:45 |   1 | localhost |   5001 |    stop_replication |    stop_replication
          13:13:46 |   1 | localhost |   5001 |    stop_replication |        wait_primary
          13:13:46 |   2 | localhost |   5002 | prepare_maintenance |         maintenance
          13:13:46 |   1 | localhost |   5001 |        wait_primary |        wait_primary
          13:13:47 |   2 | localhost |   5002 |         maintenance |         maintenance

       When the operation is done we can have the old primary re-join the group, this time  as  a
       secondary:

          $ pg_autoctl disable maintenance
          13:14:46 1985 INFO  Listening monitor notifications about state changes in formation "default" and group 0
          13:14:46 1985 INFO  Following table displays times when notifications are received
              Time |  ID |      Host |   Port |       Current State |      Assigned State
          ---------+-----+-----------+--------+---------------------+--------------------
          13:14:47 |   2 | localhost |   5002 |         maintenance |          catchingup
          13:14:47 |   2 | localhost |   5002 |          catchingup |          catchingup
          13:14:52 |   2 | localhost |   5002 |          catchingup |           secondary
          13:14:52 |   1 | localhost |   5001 |        wait_primary |             primary
          13:14:52 |   2 | localhost |   5002 |           secondary |           secondary
          13:14:53 |   1 | localhost |   5001 |             primary |             primary

   Triggering a failover
       It is possible to trigger a manual failover, or a switchover, using the command pg_autoctl
       perform failover. Here's an example of what happens when running the command:

          $ pg_autoctl perform failover
          11:58:00 53224 INFO  Listening monitor notifications about state changes in formation "default" and group 0
          11:58:00 53224 INFO  Following table displays times when notifications are received
              Time |  ID |      Host |   Port |      Current State |     Assigned State
          ---------+-----+-----------+--------+--------------------+-------------------
          11:58:01 |   1 | localhost |   5001 |            primary |           draining
          11:58:01 |   2 | localhost |   5002 |          secondary |  prepare_promotion
          11:58:01 |   1 | localhost |   5001 |           draining |           draining
          11:58:01 |   2 | localhost |   5002 |  prepare_promotion |  prepare_promotion
          11:58:02 |   2 | localhost |   5002 |  prepare_promotion |   stop_replication
          11:58:02 |   1 | localhost |   5001 |           draining |     demote_timeout
          11:58:03 |   1 | localhost |   5001 |     demote_timeout |     demote_timeout
          11:58:04 |   2 | localhost |   5002 |   stop_replication |   stop_replication
          11:58:05 |   2 | localhost |   5002 |   stop_replication |       wait_primary
          11:58:05 |   1 | localhost |   5001 |     demote_timeout |            demoted
          11:58:05 |   2 | localhost |   5002 |       wait_primary |       wait_primary
          11:58:05 |   1 | localhost |   5001 |            demoted |            demoted
          11:58:06 |   1 | localhost |   5001 |            demoted |         catchingup
          11:58:06 |   1 | localhost |   5001 |         catchingup |         catchingup
          11:58:08 |   1 | localhost |   5001 |         catchingup |          secondary
          11:58:08 |   2 | localhost |   5002 |       wait_primary |            primary
          11:58:08 |   1 | localhost |   5001 |          secondary |          secondary
          11:58:08 |   2 | localhost |   5002 |            primary |            primary

       Again, timings and PID numbers are not expected to be the same when you run the command on
       your own setup.

       Also note in the output that the command shows the whole set of transitions, including
       the old primary becoming a secondary node. The database is available for read-write
       traffic as soon as the new primary reaches the wait_primary state.
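
       Applications that connect through a multi-host connection string do not need any
       reconfiguration after a failover: Postgres client libraries re-discover the read-write
       node thanks to target_session_attrs. As a sketch, such a connection string can be
       displayed with the following command, here using the default formation name from the
       examples in this section:

          $ pg_autoctl show uri --formation default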

   Implementing a controlled switchover
       It is generally useful to distinguish a controlled switchover from a failover. In a
       controlled switchover it is possible to organise the sequence of events in a way that
       avoids data loss and keeps downtime to a minimum.

       In  the  case  of  pg_auto_failover, because we use synchronous replication, we don't face
       data loss risks when triggering a manual failover. Moreover, our monitor knows the current
       primary  health  at  the  time  when  the  failover  is triggered, and drives the failover
       accordingly.

       So to trigger a controlled switchover with pg_auto_failover you can use the  same  API  as
       for a manual failover:

          $ pg_autoctl perform switchover

       Because the subtleties of orchestrating either a controlled switchover or an unplanned
       failover are all handled by the monitor, rather than by the client-side command line, at
       the client level the two commands pg_autoctl perform failover and pg_autoctl perform
       switchover are synonyms, or aliases.
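
       On a deployment with several formations or groups, you would typically name the target
       explicitly. The values below are simply the defaults used throughout this section:

          $ pg_autoctl perform switchover --formation default --group 0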

   Current state, last events
       The following commands  display  information  from  the  pg_auto_failover  monitor  tables
       pgautofailover.node and pgautofailover.event:

          $ pg_autoctl show state
          $ pg_autoctl show events

       When run on the monitor, the commands output all the known states and events for the
       whole set of formations handled by the monitor. When run on a PostgreSQL node, the
       command connects to the monitor and outputs only the information relevant to the service
       group of the local node.

       For interactive debugging it is helpful to run the following command from the monitor node
       while e.g. initializing a formation from scratch, or performing a manual failover:

          $ watch pg_autoctl show state
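
       When scripting around the monitor, the same information is also available in a
       machine-readable format:

          $ pg_autoctl show state --json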

   Monitoring pg_auto_failover in Production
       The monitor reports every state change decision to a LISTEN/NOTIFY channel named state.
       Events are also stored on the monitor in the pgautofailover.event table, and the
       monitor's log entries are broadcast by NOTIFY in the channel named log.
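
       As a minimal sketch, the events table can also be queried directly on the monitor, and
       any client that speaks LISTEN/NOTIFY can subscribe to the state and log channels. The
       connection string below is a placeholder for your own monitor URI, and the exact column
       list of pgautofailover.event depends on your pg_auto_failover version:

          $ psql 'postgresql://autoctl_node@monitor/pg_auto_failover' \
                 -c 'SELECT * FROM pgautofailover.event ORDER BY eventid DESC LIMIT 10'

       For most use cases, pg_autoctl show state and pg_autoctl show events already expose the
       same information without writing any SQL.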

   Replacing the monitor online
       When the monitor node is not available anymore, it is possible to create a new monitor
       node and then switch the existing nodes over to the new monitor using the following
       procedure.

          1. Apply the STONITH approach on the old monitor to make sure this node is not going to
             show up again during the procedure. This step is sometimes referred to as “fencing”.

          2. On  every  node,  ending  with  the  (current) Postgres primary node for each group,
             disable the monitor while pg_autoctl is still running:

                 $ pg_autoctl disable monitor --force

          3. Create a new monitor node:

                 $ pg_autoctl create monitor ...

          4. On the current primary node first, so that it's registered first and  as  a  primary
             still, for each group in your formation(s), enable the monitor online again:

                 $ pg_autoctl enable monitor postgresql://autoctl_node@.../pg_auto_failover

          5. On every other (secondary) node, enable the monitor online again:

                 $ pg_autoctl enable monitor postgresql://autoctl_node@.../pg_auto_failover

       See  pg_autoctl  disable  monitor  and  pg_autoctl  enable monitor for details about those
       commands.

       This operation relies on the fact that pg_autoctl can be operated without a monitor, and
       that when reconnecting to a new monitor, the process resets the parts of the node state
       that come from the monitor, such as the node identifier.
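
       Put together, and with entirely hypothetical host names (node1 being the current primary,
       node2 the secondary, and mon2 the machine hosting the new monitor), the procedure looks
       like the following sketch:

          # on node2 first, then on node1 (the current primary), while pg_autoctl still runs
          $ pg_autoctl disable monitor --force

          # on mon2, create the replacement monitor
          $ pg_autoctl create monitor --pgdata /var/lib/postgres/monitor --hostname mon2

          # on node1 (the primary) first, then on node2
          $ pg_autoctl enable monitor 'postgresql://autoctl_node@mon2/pg_auto_failover'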

   Trouble-Shooting Guide
       pg_auto_failover commands can be run repeatedly. If initialization fails the first time
       -- for instance because a firewall rule hasn't been activated yet -- it's possible to run
       pg_autoctl create again. pg_auto_failover will review its previous progress and repeat
       idempotent operations (create database, create extension, etc.), gracefully handling
       errors.
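
       For instance, re-running the same create command after fixing the firewall issue is
       safe, repeating the options you used the first time. The values below are placeholders
       for your own setup:

          $ pg_autoctl create postgres \
               --pgdata /var/lib/postgres/node1 \
               --hostname node1 \
               --monitor 'postgresql://autoctl_node@monitor/pg_auto_failover'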

FREQUENTLY ASKED QUESTIONS

       Those questions have been asked in GitHub issues for the project by several people. If you
       have more questions, feel free to open a new issue, and your question and its answer might
       make it to this FAQ.

   I stopped the primary and no failover is happening for 20s to 30s, why?
       In order to avoid spurious failovers when the network connectivity is not stable,
       pg_auto_failover implements a timeout of 20s before acting on a node that is known to be
       unavailable. This timeout is added on top of the delay between health checks and the
       retry policy.

       See the Configuring pg_auto_failover part for more information  about  how  to  setup  the
       different delays and timeouts that are involved in the decision making.

       See also pg_autoctl watch for a dashboard that helps you understand the system and what
       is going on at any given moment.
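
       Most of the timeouts involved in that decision making are exposed as regular Postgres
       settings (GUCs) on the monitor, so one way to review their current values is to list the
       pgautofailover settings there. The connection string below is a placeholder for your own
       monitor URI:

          $ psql 'postgresql://autoctl_node@monitor/pg_auto_failover' \
                 -c "SELECT name, setting, unit FROM pg_settings WHERE name LIKE 'pgautofailover.%'"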

   The secondary is blocked in the CATCHING_UP state, what should I do?
       In the pg_auto_failover design, the following two things are needed for the monitor to
       be able to orchestrate a node's integration completely:

          1. Health Checks must be successful

             The monitor runs periodic health checks with all the nodes registered in the system.
             Those health checks are Postgres connections from  the  monitor  to  the  registered
             Postgres nodes, and use the hostname and port as registered.

             The Reachable column of the pg_autoctl show state command contains "yes" when the
             monitor could connect to a specific node, "no" when this connection failed, and
             "unknown" when no connection has been attempted yet since the last startup of the
             monitor.

             The Reachable column in the pg_autoctl show state output must show "yes" before a
             new standby node can be orchestrated up to the "secondary" goal state.

          2. pg_autoctl service must be running

             The pg_auto_failover monitor works by assigning goal states to  individual  Postgres
             nodes.  The  monitor will not assign a new goal state until the current one has been
             reached.

             To implement a transition from the current state to the goal state assigned  by  the
             monitor, the pg_autoctl service must be running on every node.

       When your new standby node stays in the "catchingup" state for a long time, please check
       that the node is reachable from the monitor using the hostname and port registered on
       the monitor, and check that the pg_autoctl run command is running for this node.
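
       In practice that boils down to two quick checks. The systemd unit name below assumes you
       installed the service file produced by pg_autoctl show systemd; adjust it to your own
       deployment:

          # from the monitor or any node: the Reachable column must say "yes"
          $ pg_autoctl show state

          # on the standby itself: the keeper service must be up and running
          $ systemctl status pgautofailover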

       When  things are not obvious, the next step is to go read the logs. Both the output of the
       pg_autoctl command and the Postgres logs are relevant. See the Should  I  read  the  logs?
       Where are the logs? question for details.

   Should I read the logs? Where are the logs?
       Yes. If anything seems strange to you, please do read the logs.

       As maintainers of the pg_autoctl tool, we can't foresee everything that may happen to
       your production environment. Still, a lot of effort is spent on producing meaningful
       output. So when you're in a situation that's hard to understand, please make sure to
       read the pg_autoctl logs and the Postgres logs.

       When using the systemd integration, the pg_autoctl logs are handled entirely by the
       journal facility of systemd. Please refer to journalctl for viewing the logs.

       The  Postgres  logs  are  to  be  found  in  the  $PGDATA/log  directory  with the default
       configuration deployed by pg_autoctl create .... When a custom  Postgres  setup  is  used,
       please refer to your actual setup to find Postgres logs.
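
       For instance, with a systemd unit named pgautofailover (an assumption about your
       deployment) and the default Postgres logging setup, the two sources of logs can be
       followed with:

          # pg_autoctl logs, through the systemd journal
          $ journalctl -u pgautofailover -f

          # Postgres logs, with the default setup created by pg_autoctl
          $ tail -f "$PGDATA"/log/*.log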

   The state of the system is blocked, what should I do?
       This is a more general case of the previous situation, which is reached when adding a
       new standby to a group of Postgres nodes. Please check the same two elements: the
       monitor health checks are successful, and the pg_autoctl run command is running.

       When things are not obvious, the next step is to go read the logs. Both the output of  the
       pg_autoctl  command  and  the  Postgres logs are relevant. See the Should I read the logs?
       Where are the logs? question for details.

   Impossible / unresolvable state after crash - How to recover?
       The pg_auto_failover Failover State Machine is great for simplifying node management and
       orchestrating multi-node operations such as a switchover or a failover. That said, it
       might happen that the FSM is unable to proceed in some cases, usually after a hard crash
       of some components of the system, and mostly due to bugs.

       Even though we have an extensive test suite to prevent such bugs from happening, you
       might have to deal with a situation that the monitor doesn't know how to solve.

       The FSM has been designed with a last-resort operation mode. It is always possible to
       unregister a node from the monitor with the pg_autoctl drop node command. This helps the
       FSM get back to a simpler situation, the simplest possible one being when only one node
       is left registered in a given formation and group (the state is then SINGLE).

       When the monitor is back on its feet again, you may add your nodes again with the
       pg_autoctl create postgres command. The command understands that a Postgres service is
       already running and will recover from where you left off.

       In some cases you might also have to delete the local pg_autoctl state file; error
       messages will instruct you about the situation.
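
       As a last-resort sketch, the sequence looks like the following; the --pgdata value is of
       course specific to your own installation:

          # on the node that the monitor can no longer drive
          $ pg_autoctl drop node --pgdata /var/lib/postgres/node1

          # then register it again, re-using the Postgres instance that is already running
          $ pg_autoctl create postgres --pgdata /var/lib/postgres/node1 ...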

   The monitor is a SPOF in pg_auto_failover design, how should we handle that?
       When using pg_auto_failover, the monitor is needed to make decisions and orchestrate
       changes in all the registered Postgres groups. Decisions are transmitted to the Postgres
       nodes by the monitor, which assigns each node a goal state that is different from its
       current state.

   Consequences of the monitor being unavailable
       Nodes  contact  the  monitor  each second and call the node_active stored procedure, which
       returns a goal state that is possibly different from the current state.

       The monitor only assigns a new goal state to a Postgres node when a cluster-wide
       operation is needed. In practice, only the following operations require the monitor to
       assign a new goal state to a Postgres node:

          • a new node is registered

          • a failover needs to happen, either triggered automatically or manually

          • a node is being put to maintenance

          • a node replication setting is being changed.

       When the monitor node is not available, the pg_autoctl processes on the Postgres nodes
       will fail to contact the monitor every second, and will log those failures. On top of
       that, no orchestration is possible.

       The Postgres streaming replication does not need the monitor to be available in order to
       deliver its service guarantees to your application, so your Postgres service remains
       available while the monitor is down.

       To repair your installation after having lost a monitor, the following scenarios are to be
       considered.

   The monitor node can be brought up again without data having been lost
       This  is  typically  the  case  in Cloud Native environments such as Kubernetes, where you
       could have a service migrated to another pod and re-attached  to  its  disk  volume.  This
       scenario is well supported by pg_auto_failover, and no intervention is needed.

       It is also possible to use synchronous archiving with the monitor, so that you can
       recover from the current archives and continue operating without intervention on the
       Postgres nodes, except for updating their monitor URI. This requires an archiving setup
       that uses synchronous replication, so that any transaction committed on the monitor is
       known to have been replicated in your WAL archive.

       At  the  moment,  you  have to take care of that setup yourself. Here's a quick summary of
       what needs to be done:

          1. Schedule base backups

             Use pg_basebackup every once in a while to have a full copy of the monitor  Postgres
             database available.

          2. Archive WAL files in a synchronous fashion

             Use pg_receivewal --synchronous ... as a service to keep a WAL archive in sync
             with the monitor Postgres instance at all times.

          3. Prepare a recovery tool on top of your archiving strategy

             Write a utility that knows how to create a new monitor node from your most recent
             pg_basebackup copy and the archived WAL files.

             Bonus  points  if  that tool/script is tested at least once a day, so that you avoid
             surprises on the unfortunate day that you actually need to use it in production.

       A future version of pg_auto_failover will include this facility, but the current  versions
       don't.
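
       A rough sketch of such a setup, with entirely hypothetical paths, host name, and
       replication user, could look like this:

          # 1. take a base backup of the monitor into a fresh directory (schedule e.g. daily)
          $ pg_basebackup -h monitor -U replicator -D /backups/monitor/base -Ft -z

          # 2. run a long-lived synchronous WAL archiver against the monitor
          $ pg_receivewal -h monitor -U replicator -D /backups/monitor/wal --synchronous

       For the synchronous guarantee, the monitor's synchronous_standby_names setting must also
       list the pg_receivewal session (by its application_name), which is part of the setup
       that is currently left to you.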

   The monitor node can only be built from scratch again
       If you don't have synchronous archiving set up for the monitor, then you might not be
       able to restore a monitor database with the expected up-to-date node metadata.
       Specifically, we need the nodes' state to be in sync with what each pg_autoctl process
       received the last time it could contact the monitor, before it became unavailable.

       It is possible to register nodes that are currently  running  to  a  new  monitor  without
       restarting  Postgres  on  the  primary. For that, the procedure mentioned in Replacing the
       monitor online must be followed, using the following commands:

          $ pg_autoctl disable monitor
          $ pg_autoctl enable monitor

AUTHOR

       Microsoft

COPYRIGHT

       Copyright (c) Microsoft Corporation. All rights reserved.