PostgreSQL HA operations
Triggering a failover
To call the function successfully, you need to figure out the formation and group of the group where the failover happens.
The following commands when run on a pg_auto_failover
keeper node provide for the necessary information:
Machine: node-1
Get the variables [formation] and [group] from these commands:
su - postgres
export PATH="$PATH:/usr/pgsql-12/bin"
# [formation] -> the default value is 'default'
pg_autoctl config get pg_autoctl.formation --pgdata ./[node-1]
# [group] -> the default value is '0'
pg_autoctl config get pg_autoctl.group --pgdata ./[node-1]
Machine: monitor
sudo su - postgres
export PATH="$PATH:/usr/pgsql-12/bin"
psql -p [port] -d pg_auto_failover
> select pgautofailover.perform_failover(formation_id => '[formation]', group_id => [group]);
Implementing a controlled switchover
It is generally useful to distinguish a controlled switchover from a failover. In a controlled switchover situation it is possible to organize the sequence of events in a way to avoid data loss and lower downtime to a minimum.
In the case of pg_auto_failover
, because we use synchronous replication, we don’t face data loss risks when triggering a manual failover. Moreover, our monitor knows the current primary health at the time when the failover is triggered and drives the failover accordingly.
So to trigger a controlled switchover with pg_auto_failover
you can use the same API as for a manual failover above.
Maintenance of a secondary node
It is possible to put a secondary node in any group in a MAINTENANCE state so that the Postgres server is not doing synchronous replication anymore and can be taken down for maintenance purposes, such as security kernel upgrades or the like.
Machine: node-1 | node-2
To enable maintenance we use:
pg_autoctl enable maintenance --pgdata ./[node-1 | node-2]
When a standby node is in maintenance, the monitor sets the primary node replication to WAIT_PRIMARY: in this role, the PostgreSQL streaming replication is now asynchronous and the standby PostgreSQL server may be stopped, rebooted, etc.
Note
pg_auto_failover
does not provide support for primary server maintenance.
To disable maintenance we use
pg_autoctl disable maintenance --pgdata ./[node-1 | node-2]
Show current state and events
$ pg_autoctl show state --pgdata [monitor | node-1 | node-2]
$ pg_autoctl show events --pgdata [monitor | node-1 | node-2]
Monitoring pg_auto_failover
in production
The monitor reports every state change decision to a LISTEN/NOTIFY channel named state. PostgreSQL logs on the monitor are also stored in a table, pgautofailover.event
, and broadcast by NOTIFY in the channel log.
Machine: monitor
sudo su - postgres
tail -f ./[monitor]/pg_log/postgresql-[WeekDay].log