Elasticsearch

Elasticsearch is a search engine based on the Lucene library. It can be used as an upstream resource in your Turbine data applications by using the records function to select a type in an index.

Meroxa supports self-hosted and Elastic Cloud instances of Elasticsearch.

Setup

Networking

To add an Elasticsearch instance as a resource, the instance must be accessible by Meroxa. If it is not publicly accessible, you must give Meroxa network access to it before creating the resource.

Resource Configuration

Use the meroxa resource create command to configure your Elasticsearch resource.

info

During the public beta, Meroxa only recognizes the https:// scheme for the Elasticsearch URL. Using elasticsearch:// will result in an error.

The following example depicts how this command is used to create an Elasticsearch resource named elasticsearch with the minimum configuration required.

meroxa resource create elasticsearch \
--type elasticsearch \
--url https://$ES_USER:$ES_PASS@$ES_URL:$ES_PORT \
--metadata '{"index.prefix": "$ES_INDEX","incrementing.field.name": "$ES_INCREMENTING_FIELD"}'

In the example above, replace the following variables with valid values from your Elasticsearch environment:

  • $ES_USER - Elasticsearch Username
  • $ES_PASS - Elasticsearch Password
  • $ES_URL - Elasticsearch URL
  • $ES_PORT - Elasticsearch Port (e.g., 9200)
  • $ES_INDEX - Elasticsearch Index
  • $ES_INCREMENTING_FIELD - Elasticsearch Incremental Field (see Configuration Requirements)
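
As a minimal sketch, the command above can also be assembled programmatically, which helps when the password contains characters such as @ or : that would otherwise break the URL. The helper name build_resource_command and the sample values are hypothetical, and percent-encoding the credentials is an assumption based on standard URL rules, not documented Meroxa behavior.

```python
import json
import shlex
from urllib.parse import quote


def build_resource_command(user, password, host, port,
                           index_prefix, incrementing_field,
                           name="elasticsearch"):
    """Return the `meroxa resource create` invocation as a list of arguments.

    Hypothetical helper: it only mirrors the flags shown in the example above.
    """
    # Percent-encode credentials so '@', ':' and '/' cannot be mistaken
    # for URL delimiters (assumption: the URL is parsed by standard rules).
    url = (
        f"https://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{int(port)}"
    )
    # json.dumps guarantees the metadata is valid JSON with proper quoting.
    metadata = json.dumps({
        "index.prefix": index_prefix,
        "incrementing.field.name": incrementing_field,
    })
    return [
        "meroxa", "resource", "create", name,
        "--type", "elasticsearch",
        "--url", url,
        "--metadata", metadata,
    ]


if __name__ == "__main__":
    # Sample values are placeholders, not real credentials.
    args = build_resource_command(
        "elastic", "p@ss:word", "es.example.com", 9200, "orders", "updated_at"
    )
    # shlex.join quotes each argument so the line can be pasted into a shell.
    print(shlex.join(args))
```

Passing the arguments as a list (for example to subprocess.run) avoids shell quoting issues entirely; shlex.join is used here only to print a copyable command line.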

Configuration Requirements

Meroxa tracks changes to an Elasticsearch index, which serves as the input for this resource. To fetch new data, it relies on an incremental or temporal field, such as a timestamp or an incrementing ID.

The following configuration is required:

Configuration            Description
index.prefix             Prefix of the indices to include when copying.
incrementing.field.name  An incremental/temporal field, such as a timestamp or an incrementing ID.
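
To illustrate why an incrementing field is needed, the sketch below shows one common way to fetch only new documents: ask for everything whose field value is greater than the last value seen, sorted in ascending order. This is an illustration of the general technique using the Elasticsearch range query, not Meroxa's actual implementation; the function names and the field updated_at are hypothetical.

```python
def next_batch_query(field, last_seen=None, size=100):
    """Build a search body that selects documents newer than `last_seen`."""
    if last_seen is None:
        # First run: nothing has been read yet, so match every document.
        query = {"match_all": {}}
    else:
        # Later runs: only documents whose field is strictly greater.
        query = {"range": {field: {"gt": last_seen}}}
    # Ascending sort means the last hit carries the newest value.
    return {"query": query, "sort": [{field: {"order": "asc"}}], "size": size}


def advance_offset(hits, field, last_seen=None):
    """Return the highest value of `field` seen so far.

    `hits` is expected to look like the `hits.hits` array of a search
    response, where each entry stores the document under `_source`.
    """
    values = [h["_source"][field] for h in hits if field in h.get("_source", {})]
    if last_seen is not None:
        values.append(last_seen)
    return max(values, default=None)


if __name__ == "__main__":
    offset = None
    print(next_batch_query("updated_at", offset))
    # Pretend the search returned two documents.
    fake_hits = [
        {"_source": {"updated_at": "2023-01-01T00:00:00Z"}},
        {"_source": {"updated_at": "2023-01-02T00:00:00Z"}},
    ]
    offset = advance_offset(fake_hits, "updated_at", offset)
    print(next_batch_query("updated_at", offset))
```

A field that never decreases is what makes this work: if documents could be written with older values, they would fall below the stored offset and be skipped.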

Advanced Configuration

The following advanced configuration is supported and optional:

Configuration      Description
filters.whitelist  Whitelist filter for extracting a subset of fields from Elasticsearch JSON documents. The whitelist filter supports nested fields. To provide multiple fields, use ; as the separator (e.g. customer;order.qty;order.price).
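
The effect of a whitelist such as customer;order.qty;order.price can be sketched as follows. This is an illustrative reading of the description above, not the connector's code: the function apply_whitelist is hypothetical, and keeping the nested structure in the output (rather than flattening it) is an assumption.

```python
def apply_whitelist(document, whitelist):
    """Keep only the fields named in `whitelist`.

    `whitelist` uses ';' between fields and '.' between nesting levels,
    e.g. "customer;order.qty;order.price".
    """
    result = {}
    paths = [p.strip() for p in whitelist.split(";") if p.strip()]
    for path in paths:
        keys = path.split(".")
        # Walk down the source document along the dotted path.
        value = document
        for key in keys:
            if not isinstance(value, dict) or key not in value:
                break  # Path is absent in this document: skip it.
            value = value[key]
        else:
            # Path fully resolved: rebuild the same nesting in the result.
            target = result
            for key in keys[:-1]:
                target = target.setdefault(key, {})
            target[keys[-1]] = value
    return result


if __name__ == "__main__":
    doc = {
        "customer": "ada",
        "order": {"qty": 2, "price": 9.5, "sku": "x-1"},
        "internal_note": "do not export",
    }
    print(apply_whitelist(doc, "customer;order.qty;order.price"))
```

Fields that are missing from a given document are skipped silently in this sketch, so documents with differing shapes can pass through the same filter.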

Data Record Format

Data records from Elasticsearch have the following format:

{
  "schema": { <schema of payload> },
  "payload": { <json data captured from index data> }
}
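
A minimal sketch of reading a record in this format is shown below. The helper split_record and the sample record are hypothetical; in particular, the contents of the sample schema are placeholders and do not reflect the actual schema Meroxa emits.

```python
import json


def split_record(raw):
    """Parse a JSON record and return its (schema, payload) pair."""
    record = json.loads(raw)
    # Fail early with a clear message if the envelope is incomplete.
    missing = {"schema", "payload"} - record.keys()
    if missing:
        raise ValueError(f"record is missing keys: {sorted(missing)}")
    return record["schema"], record["payload"]


if __name__ == "__main__":
    # Hypothetical record; the schema content is a placeholder.
    raw = json.dumps({
        "schema": {"name": "orders"},
        "payload": {"customer": "ada", "order": {"qty": 2, "price": 9.5}},
    })
    schema, payload = split_record(raw)
    # The document fields captured from the index live under "payload".
    print(payload["order"]["qty"])
```

Downstream code usually only needs the payload; the schema is useful when field types must be checked before the data is written elsewhere.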