PostgreSQL
The Conduit Platform by default supports PostgreSQL as a source and a destination.
The PostgreSQL source can connect to and emit records from one or more tables.
Required Configurations
Name | Description | Required | Default |
---|---|---|---|
url | Connection string for the PostgreSQL database. Example: postgres://username:[email protected]:5432/database | Yes | |
tables | Comma-separated list of tables the source connector should read from. Example: "transactions,merchants,shipments" . Using * will read from all public tables. | Yes |
Looking for something else? See advanced configurations.
Initial Snapshot
Snapshot mode is enabled by default. When the source connector starts, the source connector obtains a read-only lock on the table and reads all rows of the table. After reading all rows in the initial snapshot, the connector releases its lock and switches to CDC mode.
Note: The default snapshot mode can be disabled by setting snapshot
to never
in the configuration.
Updates
The source connector utilizes Change Data Capture (CDC) to detect changes in a PostgreSQL table by creating a logical replication slot and a publication that listens to changes in the configured tables. Each detected change is converted into a record and returned in the Read
call. If no record is available when Read
is invoked, it blocks until a record is available or the connector receives a stop signal.
Logical Replication
When the connector transitions to CDC mode, it will run the initial setup commands to create its logical replication slot and publication. It will establish a connection to an existing slot if one with the configured name already exists.
Configuration of a PostgreSQL user with the necessary privileges to run all of these setup commands is required.
Note: When the connector or pipeline is removed, the connector will automatically attempt to remove the replication slot and publication. This is the default behaviour and can be disabled by setting logrepl.autoCleanup
to false
.
Key Handling
The source connector automatically searches for the primary key column in the specified tables. If it cannot be determined, the connector will return an error.
User privileges
The PostgreSQL user you provide to the Conduit Platform must have privileges to CREATE PUBLICATION
, needs to be an owner of the tables to be replicated, and must have the rds_replication
role. To do this, you may use psql
in Terminal to connect to the database and run the following queries:
- Create a unique role and user for the platform. You may replace
conduit_user
andconduit_role
with whatever you prefer.
CREATE USER conduit_user WITH REPLICATION ENCRYPTED PASSWORD 'very-secure-password';
CREATE ROLE conduit_role;
- Grant the
CREATE
privilege and tableOWNER
privilege to the role and grant the role to the user. Replacemy_database
andmy_table
with your respective database and table names.
GRANT CREATE ON DATABASE my_database TO conduit_role;
ALTER TABLE my_table OWNER TO conduit_role;
GRANT conduit_role TO conduit_user;
- Grant
rds_replication
to that role. Replaceconduit_user
with whatever you have created in step 1.
GRANT rds_replication TO conduit_user;
If you do not have the privileges to perform above actions, please contact your PostgreSQL administrator.
Advanced Configurations
Name | Description | Required | Default |
---|---|---|---|
snapshotMode | Enable or disable snapshot of entire table before starting CDC mode. Options: initial or never . | No | initial |
cdcMode | Determines the CDC mode. Options: auto , logrepl or long_polling . | No | auto |
logrepl.publicationName | Name of the publication to listen for WAL events. | No | conduitpub |
logrepl.slotName | Name of the slot opened for replication events. | No | conduitslot |
logrepl.autoCleanup | Enable or disable replication slot and publication cleanup when connector or pipeline is deleted. | No | true |
table | Comma-separated list of tables the source connector should read from. Example: "transactions,merchants,shipments" . | (Deprecated: Use tables instead) |
Amazon RDS Setup
Requirements
- Configure Security Groups
- Configure Network ACLs (only if your PostgreSQL database is within a VPC)
- Enable Logical Replication (required for CDC connection)
Configure Security Groups
To allow access to your PostgreSQL database from Conduit Platform IPs, follow these steps:
- Go to Amazon RDS in your AWS account.
- Go to Databases in the Amazon RDS menu.
- Click into the PostgreSQL instance you'd like to use.
- In the Connectivity & security section, under Security and VPC security groups, click your security group by clicking the blue hyperlink.
- You should now see a list of Security Groups, click into the selected security group by clicking the blue hyperlink under Security group ID.
- Under Inbound rules, click the Edit inbound rules button.
- Click the Add rule button.
- Set Protocol to
TCP
. - Set Port to the port of your database. For PostgreSQL Database it's
5432
by default. - Set the Source to include the Conduit Platform IPs.
Configure Network ACLs
You only need to update this if your database is within a VPC.
To allow access to your PostgreSQL database from Conduit Platform IPs, follow these steps:
- Go to Amazon RDS in your AWS account.
- Go to Databases in the Amazon RDS menu.
- Click into the PostgresQL Database instance you'd like to use.
- In the Connectivity & security section, under Networking and VPC, click your VPC by clicking the blue hyperlink.
- You should now see a list of Your VPCs. Click into the selected VPC by clicking the blue hyperlink under VPC ID.
- In your VPC Details, click the blue hyperlink under Main network ACL.
- You should now see a list of Network ACLs. Click into the selected Network ACL by clicking the blue hyperlink under Network ACL ID.
- You will update both the Inbound and Outbound Rules of the Network ACL to contain
ALL - 0.0.0.0/0 - ALLOW
. - If your Inbound and Outbound Rules do not contain
ALL - 0.0.0.0/0 - ALLOW
—update the Source to allow Conduit Platform IPs both Inbound and Outbound.
Enable Logical Replication
You can enable logical replication with an existing or new parameter group:
Using an existing parameter group
- Go to Amazon RDS in your AWS account.
- Go to Databases in the Amazon RDS menu.
- Click into the PostgresQL database instance you'd like to use.
- In the Configuration section, under Instance, click the blue hyperlink under Parameter group.
- Click the Edit parameters button.
- Search and set the following key-value pair in your DB parameter group
rds.logical_repliation - 1
. - Click the Save changes button.
Using a new parameter group
- Go to Amazon RDS in your AWS account.
- Go to Parameter groups in the Amazon RDS menu.
- Click Set Parameter group family to the PostgreSQL version of your database (e.g.
postgres15
). - Set Type to
DB Parameter Group
. - Set Group name to an identifiable name (e.g.
meroxa-conduit
). - Click the Create button.
- You should now see a list of Parameter groups. Click into the selected parameter group by clicking the blue hyperlink under Name.
- Click the Edit parameters button.
- Search and set the following key-value pair in your DB parameter group
rds.logical_repliation - 1
. - Click the Save changes button.
- Go to Databases in the Amazon RDS menu.
- Click into the PostgresQL database instance you'd like to use.
- Click the Modify button.
- Under the Additional configuration dropdown, look for DB parameter group and select your new parameter group.
- Click the Continue button.
- Under the Scheduling and modifications dialog, select Apply immediately.
- Click the Modify DB Instance button to confirm changes.
- Next, you must reboot the PostgreSQL server. To do this, click the Actions menu and click Reboot.
- You will be asked to confirm the reboot. Click the Confirm button.
Logical replication should now be enabled on your Amazon RDS PostgreSQL database instance.