How to Migrate Between Two RDS Postgres Instances

Jacob Reed

Jacob Reed / December 20, 2021

––– views

When searching for how to migrate between two RDS instances you may be met with articles that point you in the direction of using Amazon Database Migration Services (DMS), from experience DON'T DO IT. You'll be met with constant unexplainable failures, digging through logs that tell you nothing and paying AWS for support. DMS might be great for more complex migrations i.e. moving from one database engine to another. For Postgres -> Postgres migrations there is an easier option!

Enter Postgres Logical Replication

Logical replication was introduced in Postgres 10.0. From Postgres docs:

Logical replication is a method of replicating data objects and their changes, based upon their replication identity (usually a primary key). We use the term logical in contrast to physical replication, which uses exact block addresses and byte-by-byte replication.

This method uses a publication and subscription model for replicating changes between databases.

This is a great use case for when you're migrating between two database instances. In my case I needed to migrate to a different VPC because AWS RDS didn't support the instance class I wanted to move to.

Getting Started

The first thing we need is 2 RDS instances that are able to communicate with each other. I'll use terraform in this article but use whatever Infrastructure as Code (IaaC) that makes you happy.

resource "aws_db_instance" "default" {
  allocated_storage    = 10
  engine               = "postgres"
  engine_version       = "10.0"
  instance_class       = "db.t3.micro"
  name                 = "sourcedb"
  username             = "foo"
  password             = "foobarbaz"
  parameter_group_name = aws_db_parameter_group.source-parameters.id
  skip_final_snapshot  = true
}


resource "aws_db_instance" "target" {
  allocated_storage    = 10
  engine               = "postgres"
  engine_version       = "10.0"
  instance_class       = "db.t3.micro"
  name                 = "sourcedb"
  username             = "foo"
  password             = "foobar"
  parameter_group_name = aws_db_parameter_group.target-parameters.id
  skip_final_snapshot  = true
}

main.tf

We also want these databases to have a special role to perform the replication process. I'm using terraform here as well but you could use the console to do this as well. You'll need to do the following twice for the source and target databases.

Important Note: The role must have the roles: "rds_replication", "rds_superuser"

provider "postgresql" {
  host            = aws_db_instance.source.endpoint
  port            = 5432
  database        = "postgres"
  username        = "foo"
  password        = "foobar"
  sslmode         = "require"
  connect_timeout = 15
}

resource "postgresql_schema" "my_schema" {
  name  = "my_schema"
  owner = "postgres"


  # app_releng can create new objects in the schema.  This is the role that
  # migrations are executed as.
  policy {
    create = true
    usage  = true
    role   = "my_migration_role"
  }
}


resource "postgresql_role" "my_migration_role" {
  name             = "my_migration_role"
  login            = true
  connection_limit = 5
  password         = "apassword"
  roles            = ["rds_replication", "rds_superuser"]
}

database.tf

After deploying this we should have two databases.

Database Setup

The first thing we need on the source database is a special parameter defined to allow postgres to create replication slots. See below for my parameter group I used.

resource "aws_db_parameter_group" "source-parameters" {
  name   = "rds-postgres"
  family = "postgres10.0"

  parameter {
    name  = "rds.logical_replication"
    value = "1"
  }
}

main.tf

Enabling the Publication

After deploying the parameter group and rebooting your database you should be able to start enabling the publication. Login using your migration role into the source database.

Perform the following command:

CREATE PUBLICATION pub FOR TABLE <table>, <table2>;

Publication Docs

Enabling the Subscription

Log in to your target database and perform the following command:

CREATE SUBSCRIPTION migration CONNECTION 'host=<AWS RDS Source Server Address> port=<port> user=<user> sslmode=prefer dbname=<database name>' PUBLICATION pub

Subscription Docs

Wrapping Up

You should now be able to see your data flowing between your 2 databases. Some things to note:

  • Changes to the database schema are not replicated
  • Sequences are not replicated
  • If a table does not have a Primary Key you'll see errors

To track that everything is up to date run the following command on source and target and validate they match:

Select pg_current_wal_lsn()