Amazon Redshift
By default, the Conduit Platform supports Amazon Redshift as both a source and a destination.
The Amazon Redshift source can connect to and emit records from a table.
Required Configurations
Name | Description | Required | Default |
---|---|---|---|
dsn | Data source name (DSN) to connect to Redshift. Example: redshift://username:password@redshift-cluster-endpoint:5439/database | Yes | |
table | The table the source connector should read from. | Yes | |
orderingColumn | The name of a column that the connector will use for ordering rows. The values must be unique and suitable for sorting; otherwise the snapshot won't work correctly. | Yes | |
Looking for something else? See advanced configurations.
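The DSN format shown in the table follows ordinary URL syntax, so it can be checked before it is handed to the connector. The following is a minimal sketch using Go's standard net/url package. The DSNParts type and the ParseDSN function are hypothetical names introduced for illustration; this page does not describe how the connector itself parses the DSN.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// DSNParts holds the pieces of a Redshift DSN.
// Hypothetical type, used only for this illustration.
type DSNParts struct {
	User, Password, Host, Port, Database string
}

// ParseDSN splits a DSN of the form
// redshift://username:password@host:port/database.
// Hypothetical helper; not part of the connector.
func ParseDSN(dsn string) (DSNParts, error) {
	u, err := url.Parse(dsn)
	if err != nil {
		return DSNParts{}, fmt.Errorf("parse dsn: %w", err)
	}
	if u.Scheme != "redshift" {
		return DSNParts{}, fmt.Errorf("unexpected scheme %q, want \"redshift\"", u.Scheme)
	}
	// Password returns ("", false) when no password is present.
	password, _ := u.User.Password()
	return DSNParts{
		User:     u.User.Username(),
		Password: password,
		Host:     u.Hostname(),
		Port:     u.Port(),
		Database: strings.TrimPrefix(u.Path, "/"),
	}, nil
}

func main() {
	parts, err := ParseDSN("redshift://username:password@redshift-cluster-endpoint:5439/database")
	if err != nil {
		fmt.Println("invalid DSN:", err)
		return
	}
	fmt.Printf("host=%s port=%s database=%s user=%s\n",
		parts.Host, parts.Port, parts.Database, parts.User)
}
```

A check like this catches a wrong scheme or a missing database name early, before any connection attempt is made.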
Initial Snapshot
Snapshot mode is enabled by default. When the source connector starts, it captures the state of the Redshift table and its data at that point in time. It retrieves the maximum value of orderingColumn and saves that value to the position.
The snapshot iterator then reads all rows whose orderingColumn value is less than or equal to that maximum, fetching them in batches ordered by the orderingColumn value.
Note: The default snapshot mode can be disabled by setting snapshot to false in the configuration.
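The snapshot behaviour described above can be illustrated with an in-memory simulation. This is only a sketch under stated assumptions: the Row and Position types and the function names are hypothetical, the ID field stands in for orderingColumn, IDs are assumed to be positive integers, and the batch limit is assumed to come from batchSize. The real connector queries Redshift rather than a slice.

```go
package main

import (
	"fmt"
	"sort"
)

// Row is a stand-in for a table row; ID plays the role of orderingColumn.
type Row struct {
	ID   int
	Name string
}

// Position is a hypothetical snapshot position: the last ordering value
// already read, and the maximum captured when the snapshot started.
type Position struct {
	Last int
	Max  int
}

// maxOrdering returns the largest ordering value currently in the table.
func maxOrdering(table []Row) int {
	m := 0
	for i, r := range table {
		if i == 0 || r.ID > m {
			m = r.ID
		}
	}
	return m
}

// nextBatch returns up to batchSize rows with Last < ID <= Max, ordered by ID.
func nextBatch(table []Row, pos Position, batchSize int) []Row {
	var batch []Row
	for _, r := range table {
		if r.ID > pos.Last && r.ID <= pos.Max {
			batch = append(batch, r)
		}
	}
	sort.Slice(batch, func(i, j int) bool { return batch[i].ID < batch[j].ID })
	if len(batch) > batchSize {
		batch = batch[:batchSize]
	}
	return batch
}

// snapshot reads batches until no rows at or below pos.Max remain.
func snapshot(table []Row, pos Position, batchSize int) [][]Row {
	if batchSize < 1 {
		return nil
	}
	var batches [][]Row
	for {
		batch := nextBatch(table, pos, batchSize)
		if len(batch) == 0 {
			return batches
		}
		batches = append(batches, batch)
		pos.Last = batch[len(batch)-1].ID
	}
}

func main() {
	table := []Row{{3, "c"}, {1, "a"}, {2, "b"}, {5, "e"}, {4, "d"}}
	pos := Position{Max: maxOrdering(table)} // captured at startup

	// A row inserted after startup lies above Max and is not part of the snapshot.
	table = append(table, Row{6, "late"})

	for i, batch := range snapshot(table, pos, 2) {
		fmt.Printf("batch %d: %v\n", i+1, batch)
	}
}
```

Fixing the maximum at startup is what gives the snapshot a stable upper bound: rows that arrive later have a larger ordering value and are left out of it.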
Updates
The source connector uses Change Data Capture (CDC) to detect changes in a Redshift table. It reads the table with keyset pagination, limiting each query to batchSize rows and ordering by orderingColumn. Only rows added after the source connector starts are moved in batches. Each INSERT, UPDATE, or DELETE operation executed on the table is captured by the CDC iterator, which emits a record for each change.
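Keyset pagination means each query resumes from the last ordering value seen, instead of skipping rows with an offset. The sketch below builds such a query as a string. The exact SQL the connector issues is not documented on this page, so the statement shape, the $1 placeholder style, and the quoteIdent and keysetQuery helpers are all assumptions made for illustration. It covers only the pagination shape, not how updates or deletes are detected.

```go
package main

import (
	"fmt"
	"strings"
)

// quoteIdent wraps an identifier in double quotes and escapes embedded quotes.
// Hypothetical helper; it does not handle schema-qualified names.
func quoteIdent(name string) string {
	return `"` + strings.ReplaceAll(name, `"`, `""`) + `"`
}

// keysetQuery builds a query that returns the next batch of rows whose
// ordering value is greater than the last value already read (bound to $1).
// The batchSize range check mirrors the documented limits of 1 to 100000.
func keysetQuery(table, orderingColumn string, batchSize int) (string, error) {
	if batchSize < 1 || batchSize > 100000 {
		return "", fmt.Errorf("batchSize %d out of range [1, 100000]", batchSize)
	}
	col := quoteIdent(orderingColumn)
	return fmt.Sprintf(
		"SELECT * FROM %s WHERE %s > $1 ORDER BY %s ASC LIMIT %d",
		quoteIdent(table), col, col, batchSize,
	), nil
}

func main() {
	query, err := keysetQuery("users", "id", 1000)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(query)
}
```

Because the WHERE clause filters on the ordering column directly, the cost of each page does not grow with how far into the table the reader has advanced, which is the usual reason to prefer keyset pagination over OFFSET.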
Key Handling
The connector constructs sdk.Record.Key as sdk.StructuredData, using the columns listed in the keyColumns configuration field. If keyColumns is unspecified, the connector defaults to the primary keys of the specified table; if no primary keys exist, it falls back to the column named in the orderingColumn field. The values for the sdk.Record.Key field are taken from sdk.Payload.After, matched by column name.
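The fallback order for key columns can be sketched in a few lines. In this illustration a plain map[string]interface{} stands in for sdk.StructuredData so the example runs without the Conduit SDK, and chooseKeyColumns and buildKey are hypothetical names, not functions of the connector.

```go
package main

import "fmt"

// chooseKeyColumns applies the documented fallback order:
// keyColumns, then the table's primary keys, then orderingColumn.
func chooseKeyColumns(keyColumns, primaryKeys []string, orderingColumn string) []string {
	if len(keyColumns) > 0 {
		return keyColumns
	}
	if len(primaryKeys) > 0 {
		return primaryKeys
	}
	return []string{orderingColumn}
}

// buildKey copies the chosen columns out of the "after" payload.
// The maps stand in for sdk.StructuredData.
func buildKey(after map[string]interface{}, columns []string) (map[string]interface{}, error) {
	key := make(map[string]interface{}, len(columns))
	for _, column := range columns {
		value, ok := after[column]
		if !ok {
			return nil, fmt.Errorf("key column %q not found in payload", column)
		}
		key[column] = value
	}
	return key, nil
}

func main() {
	after := map[string]interface{}{"id": 7, "email": "a@example.com", "name": "Ada"}

	// No keyColumns configured and no primary keys: fall back to orderingColumn.
	columns := chooseKeyColumns(nil, nil, "id")
	key, err := buildKey(after, columns)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(columns, key)
}
```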
Table Name
For each record, the source connector adds a redshift.table property to the metadata, which holds the table name.
Advanced Configurations
Name | Description | Required | Default |
---|---|---|---|
snapshot | Enable or disable a snapshot of the entire table before starting CDC mode. Options: true or false. | No | true |
keyColumns | Comma-separated list of column names used to build the sdk.Record.Key. Learn more: Key handling. | No | |
batchSize | Number of rows per batch. Minimum is 1 and maximum is 100000. | No | 1000 |
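The required fields, defaults, and limits from the two configuration tables can be summarised as a small validation routine. This is a sketch, not the connector's own configuration code: the Config struct and ParseConfig function are hypothetical, and the raw settings are assumed to arrive as a map of strings.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Config mirrors the documented source settings. Hypothetical type.
type Config struct {
	DSN            string
	Table          string
	OrderingColumn string
	Snapshot       bool
	KeyColumns     []string
	BatchSize      int
}

// ParseConfig applies the documented defaults (snapshot=true, batchSize=1000)
// and checks required fields and the batchSize range of 1 to 100000.
func ParseConfig(raw map[string]string) (Config, error) {
	for _, name := range []string{"dsn", "table", "orderingColumn"} {
		if raw[name] == "" {
			return Config{}, fmt.Errorf("%q is required", name)
		}
	}
	cfg := Config{
		DSN:            raw["dsn"],
		Table:          raw["table"],
		OrderingColumn: raw["orderingColumn"],
		Snapshot:       true,
		BatchSize:      1000,
	}
	if v, ok := raw["snapshot"]; ok {
		// Only the two documented options are accepted here.
		if v != "true" && v != "false" {
			return Config{}, fmt.Errorf("snapshot must be true or false, got %q", v)
		}
		cfg.Snapshot = v == "true"
	}
	if v := raw["keyColumns"]; v != "" {
		for _, column := range strings.Split(v, ",") {
			cfg.KeyColumns = append(cfg.KeyColumns, strings.TrimSpace(column))
		}
	}
	if v, ok := raw["batchSize"]; ok {
		n, err := strconv.Atoi(v)
		if err != nil || n < 1 || n > 100000 {
			return Config{}, fmt.Errorf("batchSize must be an integer from 1 to 100000, got %q", v)
		}
		cfg.BatchSize = n
	}
	return cfg, nil
}

func main() {
	cfg, err := ParseConfig(map[string]string{
		"dsn":            "redshift://username:password@redshift-cluster-endpoint:5439/database",
		"table":          "users",
		"orderingColumn": "id",
		"keyColumns":     "id, email",
	})
	if err != nil {
		fmt.Println("invalid configuration:", err)
		return
	}
	fmt.Printf("%+v\n", cfg)
}
```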