Developing Turbine data apps with Ruby

Before following this guide, you must have completed the setup steps.

Requirements

  • Recommended: a Ruby version management tool of your choice.
  • The latest Ruby version. If you have a Ruby version management tool installed, you can use it to install Ruby and specify which version to use for your development use case.
  • You’ll also need to download the Meroxa CLI.

The Application

When you run the meroxa apps init my-ruby-app-name --lang ruby command, Turbine automatically scaffolds an example data app in an empty Git repository on your local machine. To initialize the app somewhere else, append the --path flag to the command (meroxa apps init my-ruby-app-name --lang ruby --path ~/anotherdir). Once you enter the my-ruby-app-name directory, its contents will look like this:

my-ruby-app-name
├── Gemfile # Describes gem dependencies required to run a Ruby program.
├── app.json # A configuration file for your data app.
├── app.rb # The base of your application code. Includes an example data app to get you started.
└── fixtures # Fixtures are JSON-formatted data records you can develop against and run with your data app locally.
    └── demo.json # A data record sample for the example data app.

The codebase contains comments that describe each function and its purpose. All that awaits is your creativity demonstrated through code.

# frozen_string_literal: true

require "rubygems"
require "bundler/setup"
require "turbine_rb"

class MyApp
  def call(app)
    # To configure resources for your production datastores
    # on Meroxa, use the Dashboard, CLI, or Terraform Provider
    # For more details refer to: http://docs.meroxa.com/
    #
    # Identify the upstream datastore with the `resource` function
    # Replace `demopg` with the resource name configured on Meroxa
    database = app.resource(name: "demopg")

    # Specify which upstream records to pull
    # with the `records` function
    # Replace `collection_name` with a table, collection,
    # or bucket name in your data store.
    # If a configuration is needed for your source,
    # you can pass it as a second argument to the `records` function. For example:
    # database.records(collection: "collection_name", configs: {"incrementing.column.name" => "id"})
    records = database.records(collection: "collection_name")

    # Register secrets to be available in the function:
    # app.register_secrets("MY_ENV_TEST")

    # Register several secrets at once:
    # app.register_secrets(["MY_ENV_TEST", "MY_OTHER_ENV_TEST"])

    # Specify the code to execute against `records` with the `process` function.
    # Replace `Passthrough` with your desired function.
    # Ensure the desired function matches `Passthrough`'s function signature.
    processed_records = app.process(records: records, process: Passthrough.new)

    # Specify where to write records using the `write` function.
    # Replace `collection_archive` with whatever data organisation method
    # is relevant to the datastore (e.g., table, bucket, collection, etc.)
    # If additional connector configs are needed, provide another argument. For example:
    # database.write(
    #   records: processed_records,
    #   collection: "collection_archive",
    #   configs: {"behavior.on.null.values": "ignore"})
    database.write(records: processed_records, collection: "collection_archive")
  end
end

class Passthrough < TurbineRb::Process
  def call(records:)
    puts "got records: #{records}"
    # To get the value of unformatted records, use record .value getter method
    # records.map { |r| puts r.value }
    #
    # To transform unformatted records, use record .value setter method
    # records.map { |r| r.value = "newdata" }
    #
    # To get the value of json formatted records, use record .get method
    # records.map { |r| puts r.get("message") }
    #
    # To transform json formatted records, use record .set methods
    # records.map { |r| r.set('message', 'goodbye') }
    records
  end
end

TurbineRb.register(MyApp.new)

Managing application data

Configuration

The app.json file is a configuration file that Meroxa uses to interpret details about a Turbine data app while developing or running locally. Configuration options include the name, codebase language, environment, and relevant datastores and their corresponding fixtures.

{
  "name": "ruby-example",
  "language": "ruby",
  "environment": "common",
  "resources": {
    "demopg": "fixtures/demo.json"
  }
}

In this example, the app configuration identifies a resource named demopg that points to the demo.json fixture. Fixtures are JSON-formatted data record samples used in place of your production data when developing or running a data app locally. Here, demopg is a placeholder for your own resource name. Replace it with the name of the resource that you set up in Meroxa using the meroxa resources create command or the Dashboard. The value is the path to the fixture that mocks the resource when you run meroxa apps run.

Without changing anything, you can run the command meroxa apps run within the root of your data app project and see a result. The records the application pulls come from the data records contained within the demo.json fixture.
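
As a rough illustration of how the resources mapping ties a resource name to a fixture file, the sketch below parses the example configuration with Ruby's standard JSON library and lists each pairing. It only inspects the configuration and is not part of Turbine; fixture_map is a hypothetical helper name.

```ruby
require "json"

# The example configuration from this guide, inlined so the sketch is self-contained.
APP_JSON = <<~JSON
  {
    "name": "ruby-example",
    "language": "ruby",
    "environment": "common",
    "resources": {
      "demopg": "fixtures/demo.json"
    }
  }
JSON

# Hypothetical helper (not a Turbine API): returns a hash of
# resource name => fixture path from an app.json document.
def fixture_map(app_json)
  JSON.parse(app_json).fetch("resources", {})
end

fixture_map(APP_JSON).each do |resource, fixture|
  puts "#{resource} -> #{fixture}"
end
```

Each key printed here is a name you would pass to app.resource(name: ...), and each value is the fixture file used in its place during a local run.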

Using resources

To get your production data into your data app, you will need to create Meroxa resources for your datastores. Once you have a set of Meroxa resources, you can now call them within your data app.

To do this, use the resource function within the Turbine section of your application code. In the example data app, it should appear something like this:

database = app.resource(name: 'resource_name')

Replace resource_name with the name of your Meroxa resource.

Getting data in and out

Now that you have called your Meroxa resources in your data app, you must tell Turbine how to use them. For this, you can use the Records and Write functions.

Records

The Records function tells Turbine what records you want to use in your data app. Here you indicate the collection of records by naming them using whatever data collection construct applies to the datastore (e.g., table, collection, bucket, index, etc.).

To do this, use the records function within the Turbine section of your application code. In the example data app, it should appear something like this:

records = database.records(collection: 'collection_name')

Replace collection_name with the name of the data collection construct in your datastore. For example, say you have a table named Users in a relational database. For PostgreSQL using logical replication, you might use public.Users; for MySQL using binlog, you might use Users.

Write

The Write function tells Turbine where to write the output of your data app. Here you indicate the resource as well as whatever data collection construct applies to the datastore (e.g., table, collection, bucket, index, etc.).

To do this, use the Write function within the Turbine section of your application code. In the example data app, it should appear something like this:

database.write(records: processed_records, collection: "collection_archive")

Replace collection_archive with the name of the data collection construct you want to write to in your datastore. For example, say you have a table named Users in a relational database. For PostgreSQL using logical replication, you might use public.Users; for MySQL using binlog, you might use Users. If the table does not exist and your credentials have permission to create tables, it will be generated automatically in the datastore.

Processing data

Now that you have called your Meroxa resources in your data app and told Turbine how to direct that data, you can decide what code to run against that stream of data.

Process

The Process function provides Turbine with the code you want to run against the record or event stream in your data app. This is where you write your custom code to achieve a specific outcome.

To do this, use the Process function within the Turbine section of your application code. In the example data app, this comes in two parts:

Part 1: Where you have written the custom code you wish to run against the record or event stream:

class Passthrough < TurbineRb::Process
  def call(records:)
    puts "got records: #{records}"
    # To get the value of unformatted records, use record .value getter method
    # records.map { |r| puts r.value }
    #
    # To transform unformatted records, use record .value setter method
    # records.map { |r| r.value = "newdata" }
    #
    # To get the value of json formatted records, use record .get method
    # records.map { |r| puts r.get("message") }
    #
    # To transform json formatted records, use record .set methods
    # records.map { |r| r.set('message', 'goodbye') }
    records
  end
end

Part 2: Where you apply the Process function to the record or event stream:

processed_records = app.process(records: records, process: Passthrough.new)
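
Passthrough can be swapped for a process that actually changes records. The following is a minimal sketch, assuming JSON-formatted records that expose the get and set methods mentioned in the scaffold comments. So that it runs without the turbine_rb gem, it uses a hypothetical StubRecord stand-in and a plain class named UpcaseMessage (also hypothetical); in a real data app the class would inherit from TurbineRb::Process, as Passthrough does.

```ruby
# Hypothetical stand-in for a Turbine record, exposing only the
# get/set methods named in the scaffold comments.
class StubRecord
  def initialize(fields)
    @fields = fields
  end

  def get(key)
    @fields[key]
  end

  def set(key, value)
    @fields[key] = value
  end
end

# In a real data app this would be `class UpcaseMessage < TurbineRb::Process`.
class UpcaseMessage
  def call(records:)
    # Upcase the "message" field where it is present and a String.
    records.each do |r|
      message = r.get("message")
      r.set("message", message.upcase) if message.is_a?(String)
    end
    # Return the collection itself, as the scaffolded Passthrough does.
    records
  end
end

sample = [StubRecord.new("message" => "hello"), StubRecord.new("id" => 1)]
processed = UpcaseMessage.new.call(records: sample)
puts processed.first.get("message")
```

The method keeps the same call(records:) signature as Passthrough and returns the records collection, so it could be passed to app.process in the same way.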

You can also decide to forgo Process and do a direct end-to-end stream of records or events.

You can accomplish this by removing the call to Process in your data app, removing the Passthrough class, and changing the write code to use records as a parameter instead of processed_records:

database.write(records: records, collection: "events_copy")

Records and Events

Examples

A demo fixture is included within /fixtures/ in your example data app to demonstrate CDC-formatted (change data capture) data records.

In most cases, Meroxa defaults to the CDC format because the application framework is built around real-time data. However, you may configure some resources (e.g., PostgreSQL) to use polling instead, which produces a slightly different record format that your code needs to account for.
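
One way to account for the difference is to normalize both shapes before the rest of your process code runs. The sketch below is only an illustration: it assumes a CDC record nests the row under payload.after, while a polled record carries the row directly under payload. Those key names and the sample records are assumptions, so check them against the demo fixture and your own resources. row_from is a hypothetical helper.

```ruby
require "json"

# Hypothetical helper: returns the row data from either record shape.
def row_from(record)
  payload = record.fetch("payload", {})
  # Assumption: CDC-style payloads carry an "after" image of the row.
  # For a CDC delete, "after" may be null, in which case this returns nil.
  payload.key?("after") ? payload["after"] : payload
end

# Illustrative records only; real fixtures contain more fields.
cdc = JSON.parse('{"payload": {"before": null, "after": {"id": 1, "email": "a@example.com"}, "op": "c"}}')
polled = JSON.parse('{"payload": {"id": 1, "email": "a@example.com"}}')

puts row_from(cdc) == row_from(polled)
```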

Unit testing

Those familiar with software development are likely already in the habit of writing unit tests. We encourage you to follow this convention throughout the development process to ensure each unit of code works as expected and as a way to identify any issues or bugs early on.

Unit testing is language-specific, so you may use any testing framework of your choice.
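
As one possibility, a process can be tested with Minitest, which is bundled with most Ruby installations. The sketch below assumes records respond to get and set. RecordDouble and AddGreeting are hypothetical names, and the process is written as a plain class so the test runs without the turbine_rb gem.

```ruby
require "minitest/autorun"

# Hypothetical test double exposing the get/set methods of a record.
class RecordDouble
  def initialize(fields)
    @fields = fields
  end

  def get(key)
    @fields[key]
  end

  def set(key, value)
    @fields[key] = value
  end
end

# Hypothetical process under test; in a data app it would
# inherit from TurbineRb::Process.
class AddGreeting
  def call(records:)
    records.each { |r| r.set("greeting", "hello #{r.get("name")}") }
    records
  end
end

class AddGreetingTest < Minitest::Test
  def test_sets_greeting_from_name
    record = RecordDouble.new("name" => "Ada")
    result = AddGreeting.new.call(records: [record])
    assert_equal "hello Ada", result.first.get("greeting")
  end
end
```

Because the process only depends on the get and set methods, it can be exercised with a small double instead of a live resource.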

Run your app locally

You can run your app locally throughout development to ensure your application is outputting the intended result.

To run your app locally, use the meroxa apps run command either at the root of your application or with the --path argument set to the local path of your application. Your app will run against the sample JSON data specified in app.json.

# You can run the app by referencing its local path
$ meroxa apps run --path /path/to/liveapp

# Or run the app directly from the app directory
$ meroxa apps run

What's next?

Once your data application is running as expected, you're ready to take the next steps and deploy!

Next up: Deploy your data application