PostgreSQL Replication Update
Data Duplication
Streaming Replication
The form of data duplication that is simplest to set up and maintain is thus streaming replication. First, a backup of the database instance is created; then, the changes are replicated continuously from the transaction log. If you want to create this kind of setup, edit the postgresql.conf file on the master so that it contains the following entries:
wal_level = replica     # write enough WAL information for replication
max_wal_senders = 3     # allow up to three WAL sender processes
hot_standby = on        # allow read-only queries on the standby
Next, you need to add the following line to the pg_hba.conf file
host replication postgres <slave IP address>/32 md5
and restart the master. Now, it's the slave's turn. Install PostgreSQL on the slave and create space for the data:
mkdir /data
chown postgres:postgres /data
chmod 700 /data
You can then launch the initial backup on the slave as the postgres user:
pg_basebackup -D /data -h <Master_IP> --checkpoint=fast --xlog-method=stream -R
As soon as the data has been copied, the slave can be started. The -R option generates the configuration file (recovery.conf) for the slave; note that in PostgreSQL 10, the --xlog-method option of pg_basebackup was renamed --wal-method. Once it's running, the slave can be used as a read-only machine. Of course, you can create as many slaves as you like; star-shaped replication is possible. In the default case, this replication is an asynchronous, single-master setup.
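To make sure the slave is really connected, you can run a quick sanity check (pg_stat_replication and pg_is_in_recovery() are standard PostgreSQL features):

-- On the master: one row per connected slave
SELECT client_addr, state, sync_state FROM pg_stat_replication;
-- On the slave: returns true while the server is in recovery (standby) mode
SELECT pg_is_in_recovery();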
Synchronous Replication
If you need synchronous replication, you can also configure it very easily. In the master's postgresql.conf file, set the synchronous_standby_names option; for example,
synchronous_standby_names = 'node1, node2, node3'
and add the application_name setting to the primary_conninfo connection string in the recovery.conf file on the slave:
primary_conninfo = 'host=<Master_IP> user=postgres application_name=node1'
node1 then becomes the synchronously replicated node; the other entries act as candidates that take over in priority order if node1 fails. With this configuration, synchronous replication is the default; however, PostgreSQL lets you decide how to replicate each individual transaction, which allows you to avoid the overhead of a synchronous commit for every single transaction.
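To verify which standby is currently synchronous and which ones are merely candidates, you can query the pg_stat_replication view on the master (a quick check using standard columns):

SELECT application_name, sync_priority, sync_state
FROM pg_stat_replication;

A sync_state of sync marks the active synchronous standby; potential marks the candidates next in line.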
To determine the quality (durability) of a transaction, the synchronous_commit option can be adjusted to one of these five values at the session level:

remote_apply: A transaction is valid only once it has been flushed on both systems and is visible on the slave, which is important for load balancing.

on: A transaction is flushed on both systems but might not be visible for other reads yet. You can thus replicate fully synchronously in terms of durability, although a query on the slave can still return slightly older data.

remote_write: A transaction is flushed on the master but only propagated to the kernel on the standby system (without flushing), making the transaction faster, on the assumption that the two storage systems do not die at the same time.

local: A transaction is flushed on the master, but no feedback is required from the slave (asynchronous).

off: Even the flush on the master is delayed, and replication happens asynchronously. The database guarantees integrity at any time but can lose the transactions of the last few milliseconds in the event of a fault. The idea is to speed up short transactions.
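If a whole session, rather than a single transaction, should run with a different durability level, you can simply set the option for the session (a small sketch; remote_write is just an example value):

SET synchronous_commit TO remote_write;
SHOW synchronous_commit;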
The following example shows how to influence the durability of a transaction:
BEGIN;
SET LOCAL synchronous_commit TO remote_write;
INSERT INTO tab VALUES <[...]>;
COMMIT;
The durability can thus be set for each transaction individually; SET LOCAL ensures that the setting reverts as soon as the transaction ends.
PostgreSQL 9.6 and 10.0
In many cases, a cluster will not be limited to two nodes; therefore, PostgreSQL lets you manage as many slaves as needed and set the number of synchronous slaves. For example, if you have 10 slaves, you can tell PostgreSQL in simple syntax to keep seven of them in sync. A commit only occurs if a sufficient number of machines have given their okay and the data is therefore safe.
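The exact syntax depends on the version (a sketch with placeholder node names): PostgreSQL 9.6 expects the number of synchronous standbys in front of the list, which is then treated in priority order, whereas PostgreSQL 10 adds the FIRST and ANY keywords, with ANY providing true quorum commit:

# PostgreSQL 9.6: the first seven reachable standbys in the list are synchronous
synchronous_standby_names = '7 (node1, node2, node3, node4, node5, node6, node7, node8, node9, node10)'
# PostgreSQL 10: any seven of the ten listed standbys must confirm a commit
synchronous_standby_names = 'ANY 7 (node1, node2, node3, node4, node5, node6, node7, node8, node9, node10)'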
A single line in the configuration lets you keep machines on several continents in sync, but keep in mind that the laws of physics continue to apply: no one can replicate data synchronously halfway around the globe and assume it has no effect on the speed of write transactions. You will always see a trade-off between durability and performance.