High Availability (HA) with Database Replication

AI Tools

MDSSC database redundancy setup (primary + standby)

This page explains how to set up a second PostgreSQL server that can take over if the main database server fails (sometimes called a standby, replica, or disaster-recovery database).

It includes:

  • how to configure the main database server (primary)
  • how to set up the standby server (replica)
  • how to validate replication
  • how to promote the standby during an outage
  • Primary (Dockerized MDSSC Postgres): MDSSC1
  • Standby database server (replica): MDSSC2

Terminology used in this page:

  • Primary: the database server that MDSSC writes to (read/write).
  • Standby/replica: a second server that continuously copies data from the primary (read-only until promoted).
  • Failover / promotion: switching the standby to become the new primary during an outage.

Note: MDSSC1 and MDSSC2 are hostnames. Ensure they resolve (DNS or /etc/hosts) and that MDSSC2 can reach MDSSC1:5432. If you do not have DNS entries for these hosts, replace MDSSC1 and MDSSC2 on this page with their IP addresses.

Store credentials in your secrets manager and export them as environment variables at runtime.

1) Primary host (MDSSC1)

1.1 Ensure Postgres is exposed on host port 5432

Bash
Copy

Expected ports include 0.0.0.0:5432->5432/tcp (or equivalent host binding).

If port 5432 is not exposed on 0.0.0.0

  1. Edit /etc/mdssc/docker-compose.main.yml. Under the postgres: service, if there is no ports: section, add one. For example, immediately after shm_size (or before command:), insert:
YAML
Copy
  1. In /etc/mdssc/customer.env, set the host port (default 5432):
Bash
Copy
  1. Save the changes. You can apply them during the restart in 1.2 (recommended to only restart once).

Optional: if you want to verify the port exposure immediately before proceeding:

Bash
Copy

1.2 Enable replication parameters on the primary

Recommended (production): add or merge these into /etc/mdssc/customer.env (they map to wal_level, max_wal_senders, and max_replication_slots on the Postgres process):

Bash
Copy

Then restart the stack so the container picks up the new values:

Bash
Copy

If you cannot update /etc/mdssc/customer.env, you can set these directly in Postgres. This approach is less repeatable than environment-managed configuration.

Rollback: run ALTER SYSTEM RESET for each parameter and restart Postgres.

Bash
Copy

1.3 Verify replication settings on primary

Bash
Copy

1.4 Create replication user and slot

Bash
Copy

1.5 Allow replica host in pg_hba.conf

This allows the replicator user to connect for replication from the replica node.

Set REPLICA_ADDR to one of:

  • MDSSC2 (if the primary resolves MDSSC2 and Postgres is configured to use hostnames in pg_hba.conf)
  • the replica IP/CIDR, for example 10.30.160.240/32
Bash
Copy
Bash
Copy

2) Replica VM (MDSSC2)

2.1 Install PostgreSQL 17

Bash
Copy

2.2 Take base backup from primary (MDSSC1)

This will replace the replica data directory.

Set the replication password in your current shell (same value used when creating the replicator role on the primary):

Bash
Copy

2.3 Required recovery fix (must match primary limits)

If startup fails with insufficient parameter settings and max_connections ... lower setting than on the primary, run:

Bash
Copy

3) Check everything is working

3.1 Replica checks (MDSSC2)

Bash
Copy

Expected:

  • cluster status includes online,recovery

3.2 Primary check (MDSSC1)

Bash
Copy

Expected:

  • a row exists for the replica connection
  • state = streaming
  • sync_state = async

4) Known warnings (observed during validation)

  • database "MDSS" has a collation version mismatch can appear on the replica and does not block replication startup in this setup.
  • role "postgres" does not exist may appear in service checks because this deployment uses mdss as the primary database role.

5) Switch over to the standby during an outage (manual)

Promote replica:

Bash
Copy

After promotion, update the application DB host (POSTGRESQL_URL Host=...) to the promoted node (or use a VIP/HA proxy for cleaner failover).

6) Two MDSSC app servers sharing one resilient PostgreSQL database

Use this when you run two MDSSC application nodes (or two full stacks) and want both to use the same external PostgreSQL setup that can survive a database server failure (for example: primary + standby, or a managed/failover PostgreSQL service).

6.1 Align with OPSWAT external PostgreSQL

Configure MetaDefender Software Supply Chain to use an external database per OPSWAT:

That covers product-specific settings for pointing MDSSC at a non-bundled Postgres. This page only adds how to wire two MDSSC nodes to one resilient PostgreSQL setup.

6.2 One writer for both nodes

MDSSC expects a single read/write database. With primary + standby replication:

  • Both MDSSC instances must use the same connection target that resolves to the current primary (the only node that accepts writes).
  • Do not point the second MDSSC instance at the standby/replica for normal operation

In practice, the host in POSTGRESQL_URL should be one of:

  • The primary server’s IP or DNS name (steady state).
  • A virtual IP (VIP), managed DNS name, or HA proxy / connection pooler (PgBouncer, HAProxy) that always routes to the primary, including after failover.

6.3 Environment on each MDSSC node

On both nodes, set the same application database URL in /etc/mdssc/customer.env (production). The entry looks like:

Bash
Copy

Replace <primary_or_vip> with the hostname or IP that reaches the write primary from each MDSSC host. Use identical values on node 1 and node 2 unless you deliberately split traffic (not typical for this stack).

6.4 PostgreSQL access from both application hosts

On the Postgres primary (and any node that accepts app connections, if you route through a proxy), ensure pg_hba.conf allows the MDSSC database user from both MDSSC server IPs, for example:

Copy

Reload PostgreSQL configuration after changes (SELECT pg_reload_conf(); or service reload).

6.5 Failover and both MDSSC instances

After manual promotion of a standby (see the manual failover section above) or after automatic failover (for example Patroni, repmgr), every MDSSC instance must still reach the new primary:

  • If you use a VIP/DNS/proxy that moves with the primary, POSTGRESQL_URL in /etc/mdssc/customer.env may stay unchanged on both nodes.
  • If you use fixed IPs, update POSTGRESQL_URL in /etc/mdssc/customer.env on both MDSSC nodes to the new primary host (or redeploy with the updated secret/config store), then restart using your standard procedure (for example mdssc -c restart where appropriate).

Until all MDSSC instances point at the writable primary, one node may hit read-only errors or stale routing after failover.

VariableType to search · ESC to discard
GlossaryType to search · ESC to discard
InsertType to search · ESC to discard
No matches