docker-pgrepl

This Dockerfile uses the standard postgres 9.4 Docker image and adds a script that sets up streaming replication between two or more Docker containers running PostgreSQL.

This is based on the work by @mgudmund at https://github.com/mgudmund/docker-pgrepl. It has been modified to better support use on DC/OS with persistent volumes, via:

  • better support for re-entrant containers
  • more environment variable support
  • more flexible PGDATA placement

In addition, there are utility scripts for deployment and management on:

  • docker
  • DC/OS (Marathon)

Additionally, the image has been customized to bootstrap the databases necessary for the gestalt-framework; this can be disabled by removing the gestalt.sh script.


Usage under local docker

To clone this git repo run:

# git clone https://github.com/GalacticFog/docker-pgrepl.git

To build the docker image do:

# docker build -t postgres_repl .

To create the first docker container with the primary node run:

# docker run -d -P --name pgrepl1  postgres_repl 

Check the logs to see if postgres started correctly:

# docker logs pgrepl1
...
LOG:  database system was shut down at 2015-06-30 08:14:39 UTC
LOG:  MultiXact member wraparound protections are now enabled
LOG:  database system is ready to accept connections
LOG:  autovacuum launcher started

To add a standby to the primary, pgrepl1, run:

# docker run -d --link pgrepl1:postgres -P --name pgrepl2 -e PGREPL_ROLE=STANDBY  postgres_repl

Check the logs to make sure it has entered standby mode:

# docker logs pgrepl2 
...
LOG:  database system was interrupted while in recovery at log time 2015-06-30 08:15:14 UTC
HINT:  If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target.
LOG:  entering standby mode
LOG:  started streaming WAL from primary at 0/4000000 on timeline 1
LOG:  redo starts at 0/4000060
LOG:  consistent recovery state reached at 0/5000000
LOG:  database system is ready to accept read only connections

To add a second standby to the primary, pgrepl1, run:

# docker run -d --link pgrepl1:postgres -P --name pgrepl3 -e PGREPL_ROLE=STANDBY  postgres_repl

To add a third standby, downstream of the first standby, pgrepl2, run:

# docker run -d --link pgrepl2:postgres -P --name pgrepl4 -e PGREPL_ROLE=STANDBY  postgres_repl

The --link directive specifies which upstream postgres node the standby connects to. After the above commands have been run, you should have a PostgreSQL streaming replication setup like this:

pgrepl1 
   |      
   |--> pgrepl2 --> pgrepl4
   |
   |--> pgrepl3

To promote a standby to primary, you can use docker exec. For example, if pgrepl1 crashes, run the following command to promote pgrepl2 to primary:

# docker exec pgrepl2 gosu postgres pg_ctl promote
server promoting

Check the logs to see if it has promoted successfully:

# docker logs pgrepl2
...
LOG:  received promote request
FATAL:  terminating walreceiver process due to administrator command
LOG:  record with zero length at 0/5000060
LOG:  redo done at 0/5000028
LOG:  selected new timeline ID: 2
LOG:  archive recovery complete
LOG:  MultiXact member wraparound protections are now enabled
LOG:  database system is ready to accept connections
LOG:  autovacuum launcher started

This promotes pgrepl2 to primary. The downstream standby of pgrepl2, pgrepl4, will switch timelines and continue as its downstream standby. Checking the logs for pgrepl4 will show that:

# docker logs pgrepl4
LOG:  replication terminated by primary server
DETAIL:  End of WAL reached on timeline 1 at 0/5000060.
LOG:  fetching timeline history file for timeline 2 from primary server
LOG:  new target timeline is 2
LOG:  record with zero length at 0/5000060
LOG:  restarted WAL streaming at 0/5000000 on timeline 2

pgrepl3 would in this case not have any primary to connect to. You could reconfigure it to follow pgrepl2, or just remove it and create a new standby, downstream from pgrepl2.
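The second option reuses the same commands shown above; a sketch, keeping the container names from this example:

```shell
# Remove the orphaned standby and recreate it downstream of the new primary.
docker rm -f pgrepl3
docker run -d --link pgrepl2:postgres -P --name pgrepl3 \
  -e PGREPL_ROLE=STANDBY postgres_repl
```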

If you don't want to use the docker --link, you can specify the IP and port of the replication primary using PGREPL_MASTER_IP and PGREPL_MASTER_PORT as environment variables in your docker run command.
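For example, if the primary's address is already known, a standby could be started without linking; the address 172.17.0.2 below is a placeholder:

```shell
# Point the standby at the primary directly; 172.17.0.2:5432 is a placeholder address.
docker run -d -P --name pgrepl2 \
  -e PGREPL_ROLE=STANDBY \
  -e PGREPL_MASTER_IP=172.17.0.2 \
  -e PGREPL_MASTER_PORT=5432 \
  postgres_repl
```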

There are example scripts in the docker directory that perform these operations, using container linking:

# ./docker/primary.sh pgrepl1
Container ID: 0b0fb5dec403c81b9cca514473ea54c9189bd3f205ce6cf53923e5ee4fac488f
# ./docker/standby.sh pgrepl1 pgrepl2
Container ID: e94b7c5359b3af2ac7f9afe42841453e2bc18bd73770fb5d0fd9bc642e20f25e
# ./docker/standby.sh pgrepl1 pgrepl3
Container ID: d64e3020efdf4402bd41ed586a12cb60fecdc788f4829506a08acf19c9f95f79
# ./docker/standby.sh pgrepl2 pgrepl4
Container ID: 4a49008b507b3c25c6d59c7e8bcde65c8d89d4422d26417e99ead459a1ba7961
# docker kill pgrepl1
pgrepl1
# ./docker/promote.sh pgrepl2
server promoting
# docker logs pgrepl4 
LOG:  replication terminated by primary server
DETAIL:  End of WAL reached on timeline 1 at 0/3000060.
LOG:  fetching timeline history file for timeline 2 from primary server
LOG:  new target timeline is 2
LOG:  record with zero length at 0/3000060
LOG:  restarted WAL streaming at 0/3000000 on timeline 2

Usage under DC/OS

The image supports usage under DC/OS (1.8 or later), deployed via Marathon. The marathon directory contains scripts to deploy primary and standby containers and to promote standby containers to primary operation:

  • All containers are provisioned using local persistent volumes. This has the effect of locking the container to a specific host, but it means that if the container is restarted (after being suspended or crashing), the PGDATA directory is still available.
  • All containers are provisioned with a virtual IP (VIP). The promote script has the option to modify the VIP when moving a container from standby, allowing it to take over the VIP of the primary. This means that there is no need to update the configuration of downstream services (including other standbys).

Assuming that Marathon is available at http://marathon.mesos:8080, to create a primary with Marathon application ID pgrepl1 and VIP primary.pgrepl, run:

# ./marathon/primary.sh http://marathon.mesos:8080 pgrepl1 /primary.pgrepl:5432

At this point, applications can access the primary at the URL:

primary.pgrepl.marathon.l4lb.thisdcos.directory:5432
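For instance, a client with psql installed could connect to the primary through this VIP (the postgres user is an assumption; use whatever credentials your deployment configures):

```shell
# Connect through the Marathon-assigned VIP; user and database are assumptions.
psql -h primary.pgrepl.marathon.l4lb.thisdcos.directory -p 5432 -U postgres
```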

Creating a standby pgrepl2 against this primary can be done as follows:

# ./marathon/standby.sh http://marathon.mesos:8080 pgrepl1 pgrepl2 /standby.pgrepl:5432

The pgrepl2 standby is configured to reach the primary using the URL above, and is itself available (for read-only access) at:

standby.pgrepl.marathon.l4lb.thisdcos.directory:5432

Similarly, other standbys can be created, against pgrepl2 or the pgrepl1 primary.

Marathon will restart pgrepl1 should it fail, albeit on the same host, due to the persistent volume. In the case that the Mesos agent hosting pgrepl1 is taken offline, it will be necessary to promote one of the standbys to primary. This can be done as follows:

# ./marathon/promote.sh http://marathon.mesos:8080 pgrepl2 /primary.pgrepl:5432

This will restart the pgrepl2 container, changing its PGREPL_ROLE environment variable from STANDBY to PRIMARY and causing it to enter primary mode. It will also associate the container with the VIP primary.pgrepl:5432 formerly associated with pgrepl1, such that any downstream apps that had been configured to communicate on that address do not require configuration changes.

In addition to HA, this process can be used to upgrade the disk allocation for the database. The primary and standby scripts use the environment variable PGREPL_DISK_SIZE to indicate the disk allocation size in megabytes. Spinning up a new standby with a larger disk and then promoting it to primary allows the database disk allocation to be increased, which is not possible for an existing Mesos task:

# PGREPL_DISK_SIZE=250 ./marathon/standby.sh https://marathon.mesos:8080 pgrepl1 pgrepl_bigger /standby.pgrepl:5432
# # now delete pgrepl1...
# ./marathon/promote.sh https://marathon.mesos:8080 pgrepl_bigger /primary.pgrepl:5432

There are some improvements to be made to this project:

  • Add support for WAL archiving
  • Add a tool for automatic failover, such as repmgr
  • Mesos framework for deployment, monitoring and automatic failover
  • DC/OS Universe package

The image supports all features of the official postgres image, so options such as setting the postgres password work as usual, though they are not shown in the examples above.
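For example, the standard POSTGRES_PASSWORD and POSTGRES_DB variables of the official postgres image can be passed through; the values here are placeholders:

```shell
# Set a superuser password and create an initial database, as with the stock image.
docker run -d -P --name pgrepl1 \
  -e POSTGRES_PASSWORD=changeme \
  -e POSTGRES_DB=mydb \
  postgres_repl
```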

Replication connections use a user called pgrepl, with a password that is, for now, generated from a token. There is a default token, but you can specify your own using the environment variable PGREPL_TOKEN.
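For example, a random token could be supplied at container start (openssl is used here only to generate one; any string works):

```shell
# Supply a custom replication token instead of the default.
docker run -d -P --name pgrepl1 \
  -e PGREPL_TOKEN=$(openssl rand -hex 16) \
  postgres_repl
```

Presumably, any standby started against this primary needs the same PGREPL_TOKEN so the generated replication passwords match.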
