Skip to main content

PostgreSQL

The Conduit Platform by default supports PostgreSQL as a source and a destination.

The PostgreSQL source can connect to and emit records from one or more tables.

Required Configurations

NameDescriptionRequiredDefault
urlConnection string for the PostgreSQL database. Example: postgres://username:[email protected]:5432/databaseYes
tablesComma-separated list of tables the source connector should read from. Example: "transactions,merchants,shipments". Using * will read from all public tables.Yes

Looking for something else? See advanced configurations.

Initial Snapshot

Snapshot mode is enabled by default. When the source connector starts, the source connector obtains a read-only lock on the table and reads all rows of the table. After reading all rows in the initial snapshot, the connector releases its lock and switches to CDC mode.

Note: The default snapshot mode can be disabled by setting snapshot to never in the configuration.

Updates

The source connector utilizes Change Data Capture (CDC) to detect changes in a PostgreSQL table by creating a logical replication slot and a publication that listens to changes in the configured tables. Each detected change is converted into a record and returned in the Read call. If no record is available when Read is invoked, it blocks until a record is available or the connector receives a stop signal.

Logical Replication

When the connector transitions to CDC mode, it will run the initial setup commands to create its logical replication slot and publication. It will establish a connection to an existing slot if one with the configured name already exists.

Configuration of a PostgreSQL user with the necessary privileges to run all of these setup commands is required.

Note: When the connector or pipeline is removed, the connector will automatically attempt to remove the replication slot and publication. This is the default behaviour and can be disabled by setting logrepl.autoCleanup to false.

Key Handling

The source connector automatically searches for the primary key column in the specified tables. If it cannot be determined, the connector will return an error.

User privileges

The PostgreSQL user you provide to the Conduit Platform must have privileges to CREATE PUBLICATION, needs to be an owner of the tables to be replicated, and must have the rds_replication role. To do this, you may use psql in Terminal to connect to the database and run the following queries:

  1. Create a unique role and user for the platform. You may replace conduit_user and conduit_role with whatever you prefer.
CREATE USER conduit_user WITH REPLICATION ENCRYPTED PASSWORD 'very-secure-password';
CREATE ROLE conduit_role;
  1. Grant the CREATE privilege and table OWNER privilege to the role and grant the role to the user. Replace my_database and my_table with your respective database and table names.
GRANT CREATE ON DATABASE my_database TO conduit_role;
ALTER TABLE my_table OWNER TO conduit_role;
GRANT conduit_role TO conduit_user;
  1. Grant rds_replication to that role. Replace conduit_user with whatever you have created in step 1.
GRANT rds_replication TO conduit_user;

If you do not have the privileges to perform above actions, please contact your PostgreSQL administrator.

Advanced Configurations

NameDescriptionRequiredDefault
snapshotModeEnable or disable snapshot of entire table before starting CDC mode. Options: initial or never.Noinitial
cdcModeDetermines the CDC mode. Options: auto, logrepl or long_polling.Noauto
logrepl.publicationNameName of the publication to listen for WAL events.Noconduitpub
logrepl.slotNameName of the slot opened for replication events.Noconduitslot
logrepl.autoCleanupEnable or disable replication slot and publication cleanup when connector or pipeline is deleted.Notrue
tableComma-separated list of tables the source connector should read from. Example: "transactions,merchants,shipments".(Deprecated: Use tables instead)

Amazon RDS Setup

Requirements

  1. Configure Security Groups
  2. Configure Network ACLs (only if your PostgreSQL database is within a VPC)
  3. Enable Logical Replication (required for CDC connection)

Configure Security Groups

To allow access to your PostgreSQL database from Conduit Platform IPs, follow these steps:

  1. Go to Amazon RDS in your AWS account.
  2. Go to Databases in the Amazon RDS menu.
  3. Click into the PostgreSQL instance you'd like to use.
  4. In the Connectivity & security section, under Security and VPC security groups, click your security group by clicking the blue hyperlink.
  5. You should now see a list of Security Groups, click into the selected security group by clicking the blue hyperlink under Security group ID.
  6. Under Inbound rules, click the Edit inbound rules button.
  7. Click the Add rule button.
  8. Set Protocol to TCP.
  9. Set Port to the port of your database. For PostgreSQL Database it's 5432 by default.
  10. Set the Source to include the Conduit Platform IPs.

Configure Network ACLs

You only need to update this if your database is within a VPC.

To allow access to your PostgreSQL database from Conduit Platform IPs, follow these steps:

  1. Go to Amazon RDS in your AWS account.
  2. Go to Databases in the Amazon RDS menu.
  3. Click into the PostgresQL Database instance you'd like to use.
  4. In the Connectivity & security section, under Networking and VPC, click your VPC by clicking the blue hyperlink.
  5. You should now see a list of Your VPCs. Click into the selected VPC by clicking the blue hyperlink under VPC ID.
  6. In your VPC Details, click the blue hyperlink under Main network ACL.
  7. You should now see a list of Network ACLs. Click into the selected Network ACL by clicking the blue hyperlink under Network ACL ID.
  8. You will update both the Inbound and Outbound Rules of the Network ACL to contain ALL - 0.0.0.0/0 - ALLOW.
  9. If your Inbound and Outbound Rules do not contain ALL - 0.0.0.0/0 - ALLOW—update the Source to allow Conduit Platform IPs both Inbound and Outbound.

Enable Logical Replication

You can enable logical replication with an existing or new parameter group:

Using an existing parameter group

  1. Go to Amazon RDS in your AWS account.
  2. Go to Databases in the Amazon RDS menu.
  3. Click into the PostgresQL database instance you'd like to use.
  4. In the Configuration section, under Instance, click the blue hyperlink under Parameter group.
  5. Click the Edit parameters button.
  6. Search and set the following key-value pair in your DB parameter group rds.logical_repliation - 1.
  7. Click the Save changes button.

Using a new parameter group

  1. Go to Amazon RDS in your AWS account.
  2. Go to Parameter groups in the Amazon RDS menu.
  3. Click Set Parameter group family to the PostgreSQL version of your database (e.g. postgres15).
  4. Set Type to DB Parameter Group.
  5. Set Group name to an identifiable name (e.g. meroxa-conduit).
  6. Click the Create button.
  7. You should now see a list of Parameter groups. Click into the selected parameter group by clicking the blue hyperlink under Name.
  8. Click the Edit parameters button.
  9. Search and set the following key-value pair in your DB parameter group rds.logical_repliation - 1.
  10. Click the Save changes button.
  11. Go to Databases in the Amazon RDS menu.
  12. Click into the PostgresQL database instance you'd like to use.
  13. Click the Modify button.
  14. Under the Additional configuration dropdown, look for DB parameter group and select your new parameter group.
  15. Click the Continue button.
  16. Under the Scheduling and modifications dialog, select Apply immediately.
  17. Click the Modify DB Instance button to confirm changes.
  18. Next, you must reboot the PostgreSQL server. To do this, click the Actions menu and click Reboot.
  19. You will be asked to confirm the reboot. Click the Confirm button.

Logical replication should now be enabled on your Amazon RDS PostgreSQL database instance.