Saturday 9 February 2013

PostgreSQL replication

Here is a summary of the information that I gleaned from the PostgreSQL and Slony docs :

Write Ahead Logging (WAL) :
  • Technique for providing atomicity and durability and ensuring data integrity.
  • Changes to data files (where tables and index files reside) must be written only after those changes have been logged (i.e., flushed to permanent storage).
  • In the event of a crash, we can recover the DB using the log.
  • Only log file needs to be flushed to disk to guarantee that a transaction is committed, rather than every data file changed by the transaction. This means reduced number of disk writes.
  • Log file is written sequentially.
  • By archiving the WAL data, we can revert to any point in time.
Log shipping (also known as warm standby) :
  • Directly moving WAL records from one DB server to another is known as log shipping.
  • Standby servers are ready to take over operations if the primary server fails. 
  • Primary server operates in continuous archiving mode while standby server operates in continuous recovery mode, reading the WAL files from the primary.
  • The primary and standby servers are loosely coupled.
  • Low administration overhead since no changes to DB tables are required.
  • Low performance impact on primary server.
  • Shipping of WAL files is easy and cheap.
  • File-based log shipping (16 MB in postgres) vs record-based log shipping (streams WAL changes incrementally; more granular; also known as stream replication)
  • Log shipping is asynchronous i.e., WAL records are shipped after transaction commit. As a result, there is a window for data loss. The size of data loss window can be adjusted if archive_timeout is set very low (to a few seconds). However, this will increase the bandwidth for file shipping.
  • Streaming replication allows a much smaller window of data loss.
  • Recovery performance is good (i.e., high availability). Restoring a server from an archived backup takes longer, and hence, only offers a solution for disaster recovery, not high availability.
  • A standby server can also be used for read-only queries, in which case it is called a Hot Standby server.
Hot standby : 
  • Introduced in PostgreSQL 9.0.
  • Queries execute normally while the standby database continuously replays the stream of binary modifications coming from the primary DB.
  • Enable on primary by adding wal_level = 'hot_standby' to postgresql.conf
  • Enable of standby by adding hot_standby = on to postgresql.conf
  • Hot standby works with streaming replication or with file-based log shipping.
  • In cases of conflict (e.g., DROP TABLE on master, but standby is still executing a query against that table; cannot execute DROP without cancelling current query; the longer it delays executing DROP, the further behind the standby falls), we can either pause the replay or cancel the read-only queries and move forward.
Streaming replication :
  • Introduced in PostgreSQL 9.0. Compliments the hot standby.
  • Allows a standby sever to stay more up-to-date than is possible with file-based log shipping.
  • Primary streams WAL records to the standby as they're generated, without waiting for the WAL file to be filled.
  • It is asynchronous, but the delay is much smaller than with file-based log shipping (<1s). archive_timeout is not required to reduce the data loss window.
  • Load on master for each slave is minimal- a master can have many slaves.
  • Primary and standby DBs are almost identical at the binary level.
  • wal_level should be 'archive' or 'hot_standby'.
  • A lot of private data can be extracted from WAL records, so be careful with user permissions.
  • If you don't use continuous archiving, you have to set wal_keep_segments in the master to a value high enough to ensure that old WAL segments are not recycled too early while the standby might still need them to catch up. If the standby falls behind too much, it needs to be reinitialized from a new based backup. 
  • If you set up a continuous WAL archive thats accessible to the standby, wal_keep_segments is not required as it can always use the archive to catch up.
  • Standby servers can get sent whenever they want, what they are missing from WAL, not in terms of complete files ('WAL segments'), but in terms of records in the WAL (also known as record-based file logging).
  • You can use hot standby and streaming replication together. This means you can have a near-realtime standby DB, and run read-only queries on it.
  • A standby DB can be hot standby with file shipping only, and a streaming replication DB can stream without accepting queries.
Synchronous replication :
  • Streaming replication is asynchronous by default. Data loss is possible if primary server crashes, proportional to the replication delay at the time of failover.
  • Synchronous replication offers the ability to confirm that all changes made by a transaction have been transferred to one synchronous standby server.
  • This extends the standard level of durability offered by a transaction commit. This level of protection is referred to as 2-safe replication.
  • Each write transaction will wait until confirmation is received that the commit has been written to the transaction log on disk of both primary and standby server.
  • This increases durability but also increase the response time for the requesting transaction.
  • The only possibility that data can be lost is if both the primary and standby suffer creashes at the same time.
  • Read only transactions need not wait for replies from standby servers.
  • Long running actions such as data loading or index building do not wait until the very final commit message.
Slony :
  • Third-party replication system.
  • Trigger based replication system.
  • Quite old, and not really used much nowdays.
  • When you might want to use Slony :
    • You need to interact with different postgres versions. WAL-based replication requires the same version of postgres and the same architecture too.
    • You only want to replicate part of the changes. WAL-based replication duplicates absolutely everything.
    • You want extra behaviour (e.g., populating cache management information).

1 comment:

  1. Thank you for the interesting article.
    I think it is definitely worth reading. I also would like to recommend getting an information about data room software because it is expected to be very efficient for business.