Amazon S3

Amazon S3 provides flexible object storage. As a Meroxa destination, it lets you capture events from any source and write them to an S3 bucket in real time.

Adding Resource

To add an Amazon S3 resource to your Meroxa Resource Catalog, you can run the following command:

meroxa resource add datalake --type s3 -u "s3://$AWS_ACCESS_KEY:$AWS_ACCESS_SECRET@$AWS_REGION/$AWS_S3_BUCKET"

datalake is a human-friendly name for the S3 resource. Feel free to change it as desired.

In the command above, replace the following variables with valid credentials from your S3 environment:

  • $AWS_ACCESS_KEY - AWS Access Key
  • $AWS_ACCESS_SECRET - AWS Access Secret
  • $AWS_REGION - AWS Region (e.g., us-east-2)
  • $AWS_S3_BUCKET - AWS S3 Bucket Name
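
As a minimal sketch of how the four values above combine into the connection URL, the helper below assembles the string and refuses to continue if any value is missing. The function name build_s3_url is hypothetical and is not part of the Meroxa CLI; it only mirrors the URL layout shown in the command above.

```shell
# Hypothetical helper (not part of the Meroxa CLI): assembles the connection URL
# from access key, access secret, region, and bucket name, in that order.
build_s3_url() {
  for value in "${1:-}" "${2:-}" "${3:-}" "${4:-}"; do
    if [ -z "$value" ]; then
      echo "build_s3_url: access key, secret, region, and bucket are all required" >&2
      return 1
    fi
  done
  printf 's3://%s:%s@%s/%s\n' "$1" "$2" "$3" "$4"
}

# Example usage with the variables from the list above:
#   meroxa resource add datalake --type s3 \
#     -u "$(build_s3_url "$AWS_ACCESS_KEY" "$AWS_ACCESS_SECRET" "$AWS_REGION" "$AWS_S3_BUCKET")"
```

Checking for empty values up front avoids registering a resource with a malformed URL such as s3://:@/.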

Permissions

The following AWS access policy must be attached to the IAM user that owns the AWS_ACCESS_KEY provided in the connection URL:

{
  "Statement": [
    {
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:AbortMultipartUpload",
        "s3:ListMultipartUploadParts",
        "s3:ListBucketMultipartUploads"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::<bucket-name>/*",
        "arn:aws:s3:::<bucket-name>"
      ]
    }
  ],
  "Version": "2012-10-17"
}
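
The policy above contains a <bucket-name> placeholder in two places. One way to fill it in consistently is to render the document from a small shell function, sketched below. The function name render_s3_policy is hypothetical, and the IAM user name and policy name in the usage comment are placeholders you would choose yourself.

```shell
# Hypothetical helper: prints the access policy shown above with the bucket
# name substituted into both Resource entries. Runs in a subshell so the
# "bucket" variable does not leak into the caller's environment.
render_s3_policy() (
  bucket="${1:-}"
  if [ -z "$bucket" ]; then
    echo "render_s3_policy: bucket name required" >&2
    exit 1
  fi
  cat <<EOF
{
  "Statement": [
    {
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject",
        "s3:AbortMultipartUpload",
        "s3:ListMultipartUploadParts",
        "s3:ListBucketMultipartUploads"
      ],
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::${bucket}/*",
        "arn:aws:s3:::${bucket}"
      ]
    }
  ],
  "Version": "2012-10-17"
}
EOF
)

# Example usage (user and policy names are placeholders):
#   render_s3_policy data-lake-bucket > policy.json
#   aws iam put-user-policy --user-name <iam-user> \
#     --policy-name <policy-name> --policy-document file://policy.json
```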

Destination Configuration

The S3 Destination connector allows you to store Data Records from a Stream in an S3 bucket.

To configure an Amazon S3 resource as a destination:

meroxa connector create to-s3 --to datalake --input $STREAM_NAME

The command above creates a new Destination Connector called to-s3, sets the destination to a resource named datalake, and configures the input with the stream $STREAM_NAME.

Output

Data Records are written to a folder within the root of the S3 bucket as gzipped JSON, one record per file, using the following naming format:

<stream-name>-<partition-number>-<starting-offset>

In the following example, the record is from the resource-5-499379.public.orders stream with starting offset 0000000000 and partition 0.

aws s3 ls s3://data-lake-bucket/resource-7-133274/resource-5-499379.public.orders-0-0000000000.gz
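
To make the naming format concrete, the sketch below splits an object key back into its stream name, partition number, and starting offset. The function name parse_record_key is hypothetical. It assumes the key ends in .gz, as in the example above, and that the partition and offset are the last two hyphen-separated fields, so hyphens inside the stream name are preserved.

```shell
# Hypothetical helper: splits an object key of the form
#   <folder>/<stream-name>-<partition-number>-<starting-offset>.gz
# into "stream partition offset". Runs in a subshell to keep variables local.
parse_record_key() (
  key="${1##*/}"           # drop the folder prefix, if any
  base="${key%.gz}"        # drop the .gz suffix
  offset="${base##*-}"     # text after the last hyphen
  rest="${base%-*}"        # everything before the last hyphen
  partition="${rest##*-}"  # text after the second-to-last hyphen
  stream="${rest%-*}"      # the remainder is the stream name
  printf '%s %s %s\n' "$stream" "$partition" "$offset"
)

# Example usage:
#   parse_record_key resource-7-133274/resource-5-499379.public.orders-0-0000000000.gz
```

Parsing from the right-hand side matters here because stream names such as resource-5-499379.public.orders contain hyphens themselves.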

Here is an example of a Data Record: Example Data Record.

Advanced Configuration

The following configuration is supported for this Connector:

Configuration Options

Configuration         Destination
output_compression    Compression type for output files. Supported algorithms are gzip and none. Defaults to gzip.
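
As a rough sketch, the helper below validates a value for output_compression against the two supported algorithms and prints it as a JSON object. The function name compression_config is hypothetical. The JSON shape is also an assumption: this page does not say how configuration is passed to the connector, so check the help output of your Meroxa CLI version before relying on it.

```shell
# Hypothetical helper: accepts only the supported values (gzip, none) and
# prints a JSON object with the output_compression setting.
# Assumption: the connector accepts its configuration as a JSON object.
compression_config() {
  case "${1:-}" in
    gzip|none)
      printf '{"output_compression":"%s"}\n' "$1"
      ;;
    *)
      echo "compression_config: unsupported value '${1:-}' (use gzip or none)" >&2
      return 1
      ;;
  esac
}

# Example usage:
#   compression_config none
```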