High Availability support for PostgreSQL Data lake

Installation

  • Replication Manager is compatible solely with Linux-based operating systems.
  • The Replication Manager version in use must be compatible with the major version of the installed PostgreSQL.
  • All PostgreSQL servers must be of the same version and run on the same operating system.

High availability solution for PostgreSQL data lake requires a single primary server along with a minimum of two standby servers. Both PostgreSQL and Replication Manager must be installed on every server.

On the servers that target to run as standby:

  • Do not create a PostgreSQL instance (i.e., do not execute initdb or any database creation scripts provided by packages).
  • Ensure the destination data directory exists and is owned by the postgres system user.
  1. Select your Linux distribution here and follow the steps to install PostgreSQL accordingly.
  2. Follow the steps to install Replication Manager for PostgreSQL clusters - repmgr.

Primary configuration

  1. Choose one of the installed servers to be the primary one.
  2. Navigate to the folder containing postgresql.conf file and create a replication config file named postgresql.replication.conf.

By default, postgresql.conf file is placed at

  • /var/lib/pgsql/<version>/data/ on Red Hat/Rocky.
  • /var/lib/postgresql/<version>/main on Debian/Ubuntu.
Bash
Copy
  1. Add the replication configuration file name to the end of postgresql.conf file and save the modifications.
Bash
Copy
  1. In Terminal, run the following commands to create repmgr user and database.
Bash
Copy

In this guideline, although the term repmgr is used for both user and database, any names can be used.

  1. Edit pg_hba.conf file to configure the authentication.
Bash
Copy
  1. Restart PostgreSQL server.
Bash
Copy
  1. Create repmgr.conf file, fill out information in brackets and store it in a location of your choice.

repmgr.conf file should not be placed inside PostgreSQL data folder as it may be overwritten.

Bash
Copy
KeyRed Hat/RockyDebian/Ubuntu
postgres_data_dir/var/lib/pgsql/<version>/data//var/lib/postgresql/<version>/main/
postgres_dir/usr/pgsql-<version>/bin//usr/lib/postgresql/<version>/bin/
your_dirDirectory torepmgr.conf file.Directory torepmgr.conf file.
  1. In Terminal, run the following commands to register the primary server.
Bash
Copy

Standby configuration

  1. Create repmgr.conf file and modify values of node, node_name, conninfo accordingly.
  2. Store the file in your reference location.
  3. Stop PostgreSQL server.
Bash
Copy
  1. In Terminal, run the following commands to clone data from the primary server.
Bash
Copy
  1. Start PostgreSQL server.
Bash
Copy
  1. In Terminal, run the following commands to register the standby server.
Bash
Copy
  1. Check if the node was registered successfully.
Bash
Copy

Automatic failover

In Terminal, run the following command to start Replication manager daemon on all PostgreSQL servers (including primary and standbys)

Bash
Copy

Rejoin after a failure

Replication manager daemon repmgrd does not automatically join a failed PostgreSQL server node to the cluster. Consequently, the cluster contains at least two primary nodes at one time, and the system administrator has to join the node to the cluster manually.

  1. Ensure the failed PostgreSQL server is not running. Run the following command in Terminal to stop the server if it has, by chance, already been started by the Linux system and service manager.
Bash
Copy
  1. In Terminal, run the following command to rejoin the server.
Bash
Copy
  1. If a node rejoin fails, do register the failed node as a standby. In Terminal, run the following command.
Bash
Copy
  1. Start PostgreSQL server.
Bash
Copy
  1. Force register the node as a standby.
Bash
Copy

Setup instructions

  1. Sign to MDDC Control Center console with your Administrator account.
  2. Navigate to Inventory > Services.
  3. Expand the Data Lake group.
  4. Click Add service.
  5. Enter the values for Name, Host, Port, Username and Password fields of individual PostgreSQL instance.
  6. Click the Check icon in the bottom right to complete.
  1. Ensure all PostgreSQL instances are reachable by the MDDC Control Center.
Type to search, ESC to discard
Type to search, ESC to discard
Type to search, ESC to discard