Postgres Streaming Replication by Hand
Setting up a primary, configuring pg_basebackup, streaming WAL to a standby — no managed service, no Patroni, just Postgres.
Real-World Analogy
Training a replacement employee by giving them a complete copy of your work history (pg_basebackup) and then letting them shadow every new task you do in real time (streaming WAL). When you leave, they’re fully up to date and can take over immediately.
Primary Configuration
# postgresql.conf on primary
wal_level = replica # minimum for replication (logical for Debezium)
max_wal_senders = 5 # max concurrent WAL sender processes
max_replication_slots = 5 # if using replication slots
wal_keep_size = 1GB # keep 1GB of WAL segments (backup if replica falls behind)
hot_standby = on # replica can serve reads
# Optional: track commit timestamps (needed for some HA tools)
track_commit_timestamp = on # pg_hba.conf on primary — allow replica to connect for replication
# TYPE DATABASE USER ADDRESS METHOD
host replication replicator 10.0.0.11/32 scram-sha-256
host replication replicator 10.0.0.12/32 scram-sha-256 -- Create replication user
CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD 'strong-password'; # Apply changes
pg_ctlcluster 16 main reload Creating the Replica
# On the replica server — stop any existing Postgres
pg_ctlcluster 16 main stop
# Wipe data directory
rm -rf /var/lib/postgresql/16/main/*
# Take base backup from primary (pg_basebackup streams data and WAL simultaneously)
pg_basebackup \
--host=10.0.0.10 \ # primary IP
--username=replicator \
--pgdata=/var/lib/postgresql/16/main \
--wal-method=stream \ # stream WAL during backup (consistent snapshot)
--write-recovery-conf \ # write standby.signal + postgresql.auto.conf
--checkpoint=fast \ # don't wait for natural checkpoint
--progress \
--verbose --write-recovery-conf creates two files:
standby.signal— presence of this file tells Postgres to start as a standbypostgresql.auto.conf— containsprimary_conninfopointing to the primary
# postgresql.auto.conf (written by pg_basebackup)
primary_conninfo = 'host=10.0.0.10 port=5432 user=replicator password=strong-password application_name=replica1' # Start the replica
pg_ctlcluster 16 main start
# Verify it's running as a standby
psql -c "SELECT pg_is_in_recovery();"
# t → it's a replica
# Check replication on primary
psql -h 10.0.0.10 -c "SELECT * FROM pg_stat_replication;" postgresql.conf on Replica
# postgresql.conf on replica
hot_standby = on # allow read queries
hot_standby_feedback = on # prevent primary from vacuuming rows replica still needs
max_standby_streaming_delay = 30s # max delay before cancelling queries that conflict with WAL
max_standby_archive_delay = 30s hot_standby_feedback tells the primary which rows the replica is reading so the primary doesn’t vacuum them away. The trade-off: table bloat on the primary if the replica has long-running queries.
Verifying Replication
-- On primary
SELECT
application_name,
client_addr,
state, -- streaming, catchup, startup
sync_state, -- async, sync, quorum
replay_lag,
write_lag,
flush_lag
FROM pg_stat_replication;
-- On replica
SELECT
now() - pg_last_xact_replay_timestamp() AS replication_lag,
pg_is_in_recovery(),
pg_last_wal_receive_lsn(),
pg_last_wal_replay_lsn(); Write a row on the primary, check it appears on the replica:
# Primary
psql -h 10.0.0.10 -c "INSERT INTO test_replication VALUES (1);"
# Replica (should appear within milliseconds)
psql -h 10.0.0.11 -c "SELECT * FROM test_replication;" Manual Failover
If the primary fails, promote a replica:
# On the replica to promote
pg_ctlcluster 16 main promote
# Or:
pg_ctl promote -D /var/lib/postgresql/16/main
# Verify it's now primary
psql -c "SELECT pg_is_in_recovery();"
# f → it's now primary After promotion:
- Update application connection strings to point to the new primary
- Re-point any other replicas to the new primary
- If the old primary recovers, it must be rebuilt as a replica (it has diverged)
Rebuilding the old primary as a new replica:
# On old primary (now demoted)
pg_ctlcluster 16 main stop
rm -rf /var/lib/postgresql/16/main/*
pg_basebackup \
--host=10.0.0.11 \ # new primary (former replica)
--username=replicator \
--pgdata=/var/lib/postgresql/16/main \
--wal-method=stream \
--write-recovery-conf \
--checkpoint=fast
pg_ctlcluster 16 main start Replication with Multiple Replicas
# Primary postgresql.conf
max_wal_senders = 10 # one per replica
# Synchronous commit with one sync replica, rest async
synchronous_standby_names = 'ANY 1 (replica1, replica2)'
# Primary waits for any 1 of these two to confirm WAL receipt Cascade replication (replica replicates from another replica):
# replica2 replicates from replica1 instead of primary
# primary_conninfo in replica2's postgresql.auto.conf
primary_conninfo = 'host=10.0.0.11 port=5432 user=replicator ...'
# replica1 must also be configured to allow WAL streaming Cascade reduces network load on the primary but increases replication lag (replica2 = primary lag + replica1 lag).
Monitoring Script
#!/bin/bash
# /usr/local/bin/check-replication.sh
PRIMARY="10.0.0.10"
REPLICAS=("10.0.0.11" "10.0.0.12")
MAX_LAG_SECONDS=30
for replica in "${REPLICAS[@]}"; do
lag=$(psql -h "$replica" -U postgres -tAc \
"SELECT EXTRACT(EPOCH FROM (now() - pg_last_xact_replay_timestamp()))::int")
if [ "$lag" -gt "$MAX_LAG_SECONDS" ]; then
echo "ALERT: $replica is ${lag}s behind primary"
# Send to alertmanager, PagerDuty, etc.
curl -X POST "$ALERTMANAGER_URL/api/v1/alerts" \
-d "[{\"labels\":{\"alertname\":\"ReplicationLagHigh\",\"replica\":\"$replica\"},\"annotations\":{\"lag\":\"${lag}s\"}}]"
else
echo "OK: $replica is ${lag}s behind primary"
fi
done # Add to cron
*/5 * * * * /usr/local/bin/check-replication.sh >> /var/log/replication-check.log 2>&1 WAL Archiving (Point-in-Time Recovery)
WAL archiving keeps old WAL segments for PITR:
# postgresql.conf
archive_mode = on
archive_command = 'aws s3 cp %p s3://my-wal-archive/%f'
# %p = full path of WAL file, %f = filename
# Verify archiving works
psql -c "SELECT * FROM pg_stat_archiver;"
# last_archived_wal should be recent
# failed_count should be 0 With WAL archiving + base backup, you can restore to any point in time:
# 1. Restore base backup
pg_basebackup --host=... --pgdata=/var/lib/postgresql/16/restore ...
# 2. Configure PITR target
cat >> /var/lib/postgresql/16/restore/postgresql.conf <<EOF
restore_command = 'aws s3 cp s3://my-wal-archive/%f %p'
recovery_target_time = '2024-01-15 14:30:00'
recovery_target_action = 'promote'
EOF
# 3. Create recovery signal
touch /var/lib/postgresql/16/restore/recovery.signal
# 4. Start Postgres — it replays WAL up to the target time
pg_ctlcluster 16 restore start