How to run Vidispine with high availability Part 3 – pgpool AND PostgreSQL

In the third part of the Vidispine High Availability series we take a look on how you can use pgpool-II to configure PostgreSQL for high availability.

In this third part of the high availability series we will show you how to configure Postgresql for high availability. In Part #1 we made a simple Vidispine cluster, and in Part #2 we added HAProxy, and in the final Part #4 we will show how you can work with SolrCloud.

PostgreSQL HA Configuration

In this part, we will make use of the PostgreSQL “stream replication” (binary replication) feature and pgpool-II to setup a HA PostgreSQL backend. “Stream replication” is a new feature since PostgreSQL 9.0, so make sure PostgreSQL 9.0 or above is used.

In stream replication, one or multiple standby server(s) will need to be configured to receive WAL logs from the primary server.

When the primary server fails, a trigger file will need to be created on one of the standby server. So that standby can stop reading WAL logs and switch to read-write or “master” mode. In this case, the trigger file will be created by pgpool

Please note that if there are multiple standby servers, promoting one of them would break the others. The others need be reconfigured to pointing to the new primary.

For more detailed explanation about stream replication, please refer to https://wiki.postgresql.org/wiki/Binary_Replication_Tutorial
https://www.postgresql.org/docs/9.1/static/runtime-config.html

Testing environment

Ubuntu 12.04 + PostgreSQL 9.1.8

PostgreSQL server 1 (primary): 10.185.20.1
PostgreSQL server 2 (standby): 10.185.20.2
pgpool server: 10.185.20.100

PostgreSQL data folder is

CODE

/var/lib/PostgreSQL/9.1/main

Installation

Install PostgreSQL on server 1 and server 2 as usual, and make sure the databases are initialized and running without problems.

Install pgpool-II on server 10.185.20.100

CODE

apt-get install postgresql-contrib
apt-get install pgpool2

Configuration

STREAM REPLICATION

First of all, please make sure that the user running PostgreSQL primary server is able to access standby servers using the following command without entering password:

CODE

ssh postgres@10.185.20.2

Stop both Server 1 and Server 2

ON SERVER 1 (PRIMARY)

Edit /etc/postgresql/9.1/main/postgresql.conf :

CODE

listen_addresses = '*'
wal_level = hot_standby
max_wal_senders = 1
wal_keep_segments = 32
archive_mode = on
archive_command = 'rsync -aq %p postgres@10.185.20.2:/var/lib/postgresql/archive/%f'

Edit /etc/postgresql/9.1/main/pg_hba.conf to allow hot standby server to read WAL logs:

CODE

host     replication     postgres        10.185.20.2/16          trust

Restart primary server

CODE

/etc/init.d/postgresql restart

Make a base backup by copying the primary server’s data directory to the standby server.

CODE

psql -c "SELECT pg_start_backup('test-backup', true)"
rsync -ac /var/lib/postgresql/9.1/main/  postgres@10.185.20.2:/var/lib/postgresql/9.1/main --exclude postmaster.pid
psql -c "SELECT pg_stop_backup()"

ON SERVER 2 (SECONDARY)

Edit /etc/postgresql/9.1/main/postgresql.conf

CODE

hot_standby = on

Create a recovery.conf under the standby server’s data directory.

CODE

standby_mode          = 'on'
primary_conninfo      = 'host=10.185.20.1 port=5432 user=postgres'
trigger_file = '/var/lib/postgresql/pg_trigger'
restore_command = 'cp -f /var/lib/postgresql/archive/%f %p </dev/null'

Restart standby server. It will then start streaming replication.

Verify

Run these commands on primary and standby respectively:

CODE

psql -c "SELECT pg_current_xlog_location()" -h10.185.20.1      (primary host)
psql -c "select pg_last_xlog_receive_location()" -h10.185.20.2 (standby host)

They should give the same results.

There should be one “sender” and one “receiver” process running on primary and standby server, respectively:

CODE

[primary]$ ps -aux | grep sender
[standby]$ ps -aux | grep receiver

Failover

Now, consider the pgpool server.

Before starting, please make sure that the user running pgpool is able to execute this command successfully without entering the password manually:

CODE

/usr/bin/ssh -T postgres@10.185.20.2 /bin/touch /var/lib/postgresql/pg_trigger_test

Now, configure pgpool to handle failover (by creating the trigger file automatilly)

Edit /etc/pgpool2/pgpool.conf :

CODE

listen_addresses = '*'

    backend_hostname0 = '10.185.20.1'
    backend_port0 = 5432
    backend_weight0 = 1
    backend_data_directory0 = '/var/lib/postgresql/9.1/main'
    backend_flag0 = 'ALLOW_TO_FAILOVER'

backend_hostname1 = '10.185.20.2'
    backend_port1 = 5432
    backend_weight1 = 1
    backend_data_directory1 = '/var/lib/postgresql/9.1/main'
    backend_flag1 = 'ALLOW_TO_FAILOVER'

log_destination = 'syslog'
  log_standby_delay = 'always'
    syslog_facility = 'LOCAL2'
    syslog_ident = 'pgpool'
    logdir = '/var/log/pgpool'

master_slave_mode = on
    master_slave_sub_mode = 'stream'
    delay_threshold = 1
    sr_check_period = 5
    sr_check_user = 'postgres'
    sr_check_password = ''
    delay_threshold = 1

health_check_period = 10
    health_check_timeout = 20
    health_check_user = 'postgres'
    health_check_password = ''

failover_command = '/usr/local/bin/failover_stream.sh %d 10.185.20.2 /var/lib/postgresql/pg_trigger_test'

In configuration above, two backend servers were defined for pgpool, and pgpool is configured in master/slave mode.

When a server failure is detected, it will try to execute the failover_command .

pgpool uses syslog for logging, so you would need to add one more line to your syslog configuration file ( /etc/rsyslog.conf under Ubuntu):

CODE

local2.* /var/log/pgpool/pgpool.log

the failover_stream.sh script looks like:

CODE

#! /bin/sh
# Failover command for streaming replication.
# This script assumes that DB node 0 is primary, and 1 is standby.
#
# If standby goes down, do nothing. If primary goes down, create a
# trigger file so that standby takes over primary node.
#
# Arguments: $1: failed node id. $2: new master hostname. $3: path to
# trigger file.

failed_node=$1
#new_master=$2
#trigger_file=$3

# Do nothing if standby goes down.
if [ $failed_node = 1 ]; then
    exit 0;
        fi 


# Create the trigger file.
/usr/bin/ssh -T postgres@10.185.20.2 /bin/touch /var/lib/postgresql/pg_trigger_test

exit 0;

Pgpooladmin (optional)

There is a web tool called pgpoolAdmin that can assist with the configure files and monitor the backend servers.

Download the source from https://www.pgpool.net/mediawiki/index.php/Downloads#pgpoolAdmin_.28…

Unzip it into your apache folder and then follow the installation guide.

Please make sure that you have php5-pgsql installed.

CODE

apt-get install php5-pgsql

For more information, please refer to https://pgpool.projects.pgfoundry.org/pgpoolAdmin/doc/index_en.html

Read the next part in the series here: