This article is based on the webinar “Automate Data Flows Between Legacies and the Cloud”. It is more detailed and step-by-step, with code. If you prefer to watch the webinar now, click here and choose 'view replay' at the end.
Why orchestrate processes?
Orchestrating and automating processes is one of the objectives of companies in their digital transformation phase. Many companies with years in the market have legacy systems that have fulfilled essential roles for decades. When these companies seek to modernize their processes, the right approach is to do it incrementally, with decoupled services deployed in a hybrid cloud: cloud and on-premise components working together.
One of the Amazon Web Services offerings that we like the most at Kranio, and in which we are experts, is Step Functions. It is a state machine, very similar to a flow chart with sequential inputs and outputs, where each output depends on its input.

Each step is a Lambda function: serverless code that only executes when needed. AWS provides the runtime, and we don't have to manage any kind of server.
Use case
A case that helps us understand how to apply Step Functions: creating sequential records in multiple tables of an on-premise DB from a cloud application, through a REST API with an event-based architecture.
This case can be summarized in a diagram like this:

Here we can see:
- A source of the data, such as a web form.
- Data payload: the data we need to register in the DB.
- CloudWatch Events: also called EventBridge, these are events that allow you to “trigger” AWS services; in this case, the State Machine.
- API Gateway: the AWS service that allows you to create, publish, maintain and monitor REST, HTTP or WebSocket APIs.
- A relational database.
Advantages
The advantages of orchestrating on premise from the cloud are:
- Reuse of components that already exist without leaving them aside
- The solution is decoupled, so each action that must be performed has its own development, making it easier to maintain, identify errors, etc.
- If business requirements change, we know what happened and what should be changed, or between which steps a new state should be added.
- If changes are required, their impact on the on-premise side is mitigated, since the orchestration is in the cloud.
- Because there are serverless alternatives, there is no need to manage servers or their operating systems.
- They are low-cost solutions. If you want to know more, check the pricing for Lambda, API Gateway, SNS and CloudWatch Events.
And now what?
You already know the theory about orchestrating a flow of data. Now we'll show you the considerations and steps you should take into account to put it into practice.
Development
The resources to be used are:
- An Amazon Web Services account, with the AWS CLI configured as in this link
- Python 3.7+
- The Serverless framework (see here how to set it up)
- The Boto3 library for Python
Where to start
Since this is an orchestration, we need to identify the sequential steps we want to orchestrate. And since orchestrating means automating, the flow must also start automatically.
For this, we will rely on the use case presented above, and we will assume that the DB in which we write corresponds to one of the components of a company's CRM, that is, one of the technologies used to manage the customer base.
We will create an event-based solution, starting the flow with the receipt of a message originated by some source (such as a web form).
After the event is received, its content (payload) must be sent via POST to an endpoint to be entered into the database. This DB can be in the cloud or on premise, and the endpoint must have a backend that can perform limited operations on the DB.
To facilitate the deployment of what needs to be developed, we will use the Serverless framework, which allows us to develop and deploy.
The project will be divided into 3 parts:
These projects are deployed in the order infrastructure >> step-functions >> api-gateway.
They can all live in the same directory, split into 3 folders. The structure would look like the following:
├── gateway-api
│   ├── database-scripts
│   │   ├── db.py
│   │   └── script.py
│   ├── libs
│   │   └── api_responses.py
│   ├── serverless.yml
│   └── service
│       ├── create_benefit_back.py
│       ├── create_client_back.py
│       └── create_partner_back.py
├── infrastructure
│   └── serverless.yml
└── step-functions
    ├── functions
    │   ├── catch_errors.py
    │   ├── create_benefit.py
    │   ├── create_client.py
    │   ├── create_partner.py
    │   └── receive_lead.py
    ├── serverless.yml
    └── services
        └── crm_service.py
Talk is cheap. Show me the code.
And with this famous quote from Linus Torvalds, let's look at the essential code of the project we are creating. You can see the details here.
Backend
The endpoints above are of no use to us if they don't have a backend. To relate each endpoint to a backend, you must create Lambda functions that write the parameters the endpoint receives into the database. Once the Lambda functions are created, we enter their ARNs in the “uri” parameter inside “x-amazon-apigateway-integration”.
A key thing about Lambda functions is that they consist of a main method called the handler, which receives 2 parameters: event and context. Event is the input payload, and context contains data about the function's invocation and the execution itself. All Lambda functions must receive an input and generate an output. You can learn more here.
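As a minimal illustration of that handler shape (the function body and names here are illustrative, not taken from the project):

```python
import json

def handler(event, context):
    # "event" carries the input payload; "context" carries invocation
    # metadata (function name, remaining time, etc.)
    name = event.get("name", "unknown")
    # Every Lambda must produce an output for the caller
    return {
        "statusCode": 200,
        "body": json.dumps({"received": name}),
    }
```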
The functions of each endpoint are very similar and only vary in the amount of data that the function needs to be able to write to the corresponding table.
Function: CreateClient
Role: create record in the CLIENTS table of our DB
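A hypothetical sketch of what create_client_back.py could look like; the real project uses its own DB helper (gateway-api/database-scripts/db.py), so sqlite3 stands in here just to keep the example self-contained, and the column names are assumptions:

```python
import json
import sqlite3

def handler(event, context):
    # With the API Gateway proxy integration, the request payload
    # arrives JSON-encoded under "body"
    data = json.loads(event["body"])
    # Stand-in connection; the real code would connect to the CRM DB
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS CLIENTS "
        "(name TEXT, last_name TEXT, phone TEXT, branch TEXT)"
    )
    conn.execute(
        "INSERT INTO CLIENTS (name, last_name, phone, branch) VALUES (?, ?, ?, ?)",
        (data["name"], data["last_name"], data["phone"], data["branch"]),
    )
    conn.commit()
    return {"statusCode": 201, "body": json.dumps({"created": data["name"]})}
```

CreatePartner and CreateBenefit follow the same pattern against their own tables.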
Function: CreatePartner
Role: create record in the PARTNER table of our DB
Function: CreateBenefit
Role: create record in the BENEFIT table of our DB
IaC - Infrastructure as Code
In the serverless.yml code we declare all the resources we are defining. For deployment, you must have the AWS CLI properly configured and then execute the deploy command.
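With the Serverless framework, that command is run from the folder containing serverless.yml; the stage name below is illustrative:

```bash
# Deploy the stack declared in serverless.yml
serverless deploy --stage dev
# "sls" is the short alias for the same CLI
sls deploy --stage dev
```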
This generates a Cloudformation stack that groups together all the resources you declared. Learn more here.
In the serverless.yml files you'll see values like these:
These correspond to references to strings in other yml documents within the same path that point to a particular value. You can learn more about this way of working here.
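A sketch of what such references look like, using the framework's `${file(...)}` variable syntax (file and key names here are illustrative):

```yaml
provider:
  name: aws
  runtime: python3.8
  environment:
    # Pull values from another yml file in the same path
    DB_NAME: ${file(./config.yml):database.name}
    EVENT_BUS: ${file(./config.yml):events.busName}
```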
API Gateway
For the REST API, we will build an API Gateway with a Serverless project.
The purpose of the API is to receive requests from Step Functions, recording data in the database.
The API Gateway will allow us to expose endpoints against which to call methods. In this project we will only create POST methods.
We'll show you the basics of the project and you can see the details here.
OpenAPI Specification
An alternative way to declare the API, its resources and methods is to do so with OpenAPI. To learn more about OpenAPI, read this article we wrote about it.
This file is read by the Api Gateway service and generates the API.
Important: if we want to create an API Gateway, it is necessary to add an extension to the OpenAPI spec with information that only AWS can interpret. For example: the create_client endpoint, which we call via POST, receives a request body that a specific backend must process. That backend is a Lambda. The relationship between the endpoint and the Lambda function is stated in this extension. You can learn more about this here.
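A sketch of that extension on the create_client endpoint (the region, account id and function name are placeholders, not values from the project):

```yaml
paths:
  /create_client:
    post:
      # AWS-specific extension: ties this method to its Lambda backend
      x-amazon-apigateway-integration:
        type: aws_proxy
        # Lambda invocations always use POST, regardless of the API method
        httpMethod: POST
        uri: arn:aws:apigateway:us-east-1:lambda:path/2015-03-31/functions/arn:aws:lambda:us-east-1:123456789012:function:create_client_back/invocations
      responses:
        "200":
          description: Client created
```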
When deploying the project, Api Gateway will interpret this file and create this in your AWS console:

To find the URL of the deployed API, go to the Stages menu. A stage is a state of your API at a given time (fact: you can have as many stages as you want, with different versions of your API). Here you can indicate an abbreviation for the environment you are working in (dev, qa, prd), the version of the API (v1, v2), or that it corresponds to a test version.
In the API Gateway console, we indicated that we would deploy with the stage name “dev”, so when you go to Stages you will see something like this:

You can find out the URLs of each endpoint by clicking on the names listed. This is what the create_client endpoint looks like:

Infrastructure
Here we will create the relational database and the Event Bridge event bus.
For now, the DB will be in the AWS cloud, but it could be a database in your own data center or in another cloud.
The EventBridge event bus allows us to connect 2 isolated components that may even be in different architectures. Learn more about this service here.
This repository is smaller than the previous one, since it only declares 2 resources.
Serverless.yml
You need to create the following tables in your DB. You can use the database scripts here as a guide.
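A sketch of such a table-creation script; the column names are inferred from the payloads described later in this article (adjust them to the real scripts in the repo), and sqlite3 stands in for the real engine so the example runs on its own:

```python
import sqlite3

# Assumed schema: CLIENTS, PARTNERS and BENEFITS tables matching the
# payloads that the Step Functions flow will send
DDL = [
    "CREATE TABLE IF NOT EXISTS CLIENTS (name TEXT, last_name TEXT, phone TEXT, branch TEXT)",
    "CREATE TABLE IF NOT EXISTS PARTNERS (rut TEXT, branch TEXT)",
    "CREATE TABLE IF NOT EXISTS BENEFITS (rut TEXT, wants_benefit INTEGER)",
]

def create_tables(conn):
    for statement in DDL:
        conn.execute(statement)
    conn.commit()
```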
With these steps, we conclude the creation of the infrastructure.
Step Functions
Each “step” of the State Machine we will create is a Lambda function. Unlike the Lambdas discussed in the API Gateway section, which have the role of writing to the DB, these make requests to the API Gateway endpoints.
According to the architecture indicated above based on a sequential flow, the State Machine should have these steps:
- Receive data from a source (e.g. web form) through the Event Bridge event.
- Take the event data, assemble a payload with the name, last name, phone and branch, and send it to the create_client endpoint for the backend to write it to the CLIENTS table.
- Take the event data, assemble a payload with the rut and branch, and send it to the create_partner endpoint for the backend to write it to the PARTNERS table.
- Take the event data, assemble a payload with the rut and WantsBenefit, and send it to the create_benefit endpoint for the backend to write it to the BENEFITS table.
- You can create an additional Lambda that the flow reaches if there is an error in the execution (example: that the endpoint is down). In the case of this project, it's called catch_errors.
Therefore, one Lambda is made per action for each step.
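Putting those steps together, the State Machine definition in Amazon States Language would look roughly like this; the state names, the `States.ALL` catch-all and the placeholder ARNs are illustrative, not taken from the repo:

```json
{
  "Comment": "Sketch of the sequential CRM flow",
  "StartAt": "ReceiveLead",
  "States": {
    "ReceiveLead": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:receive_lead",
      "Next": "CreateClient"
    },
    "CreateClient": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:create_client",
      "Catch": [{ "ErrorEquals": ["States.ALL"], "Next": "CatchErrors" }],
      "Next": "CreatePartner"
    },
    "CreatePartner": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:create_partner",
      "Catch": [{ "ErrorEquals": ["States.ALL"], "Next": "CatchErrors" }],
      "Next": "CreateBenefit"
    },
    "CreateBenefit": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:create_benefit",
      "Catch": [{ "ErrorEquals": ["States.ALL"], "Next": "CatchErrors" }],
      "End": true
    },
    "CatchErrors": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:catch_errors",
      "End": true
    }
  }
}
```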
Function: receive_lead
Role: receives the EventBridge event, cleans it, and passes it on to the next step. This step is important because the event arrives as a JSON document with attributes defined by Amazon, and the content of the event (the form's JSON) is nested inside an attribute called “detail”.
When a source sends you an event through Event Bridge, the payload looks like this:
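The envelope attributes below are the standard EventBridge ones; the values inside “detail” (and the source/detail-type names) are illustrative for this CRM example:

```json
{
  "version": "0",
  "id": "6a7e8feb-b491-4cf7-a9f1-bf3703467718",
  "detail-type": "lead_received",
  "source": "crm.web_form",
  "account": "123456789012",
  "time": "2021-01-01T12:00:00Z",
  "region": "us-east-1",
  "resources": [],
  "detail": {
    "name": "Jane",
    "last_name": "Doe",
    "phone": "+56911111111",
    "branch": "santiago",
    "rut": "11.111.111-1",
    "WantsBenefit": true
  }
}
```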
We can define a Lambda that returns the content of “detail” to the following function, as in the following example:
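A minimal sketch of that unwrapping Lambda (the real receive_lead.py may do more cleaning):

```python
def handler(event, context):
    # EventBridge nests the original form payload under "detail";
    # return only that content so the next step gets clean input
    return event["detail"]
```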
Function: create_client
Role: receives as input the output of the Lambda from the previous step. It takes the content and passes it as an argument to an instance of the CRMService class.
In the CRMService class we declare the methods that perform the requests, depending on the endpoint. In this example, the request goes to the create_client endpoint. For calls to the API, we used Python's Requests library:
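A sketch of what services/crm_service.py could look like; the base URL, payload keys and method names are assumptions, with the real class living in the project repository:

```python
import requests

class CRMService:
    def __init__(self, base_url):
        # e.g. the API Gateway stage URL from the Stages menu
        self.base_url = base_url.rstrip("/")

    @staticmethod
    def build_client_payload(lead):
        # Keep only the fields the CLIENTS table needs
        keys = ("name", "last_name", "phone", "branch")
        return {k: lead[k] for k in keys}

    def create_client(self, lead):
        # POST the trimmed payload to the create_client endpoint
        response = requests.post(
            f"{self.base_url}/create_client",
            json=self.build_client_payload(lead),
            timeout=10,
        )
        response.raise_for_status()
        return response.json()
```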
The Lambda functions for create_partner and create_benefit are similar to create_client, with the difference that they call the appropriate endpoints. You can review on a case-by-case basis at this part of the repository.
Function: catch_errors.py
Role: it catches the errors that occur and returns them so you can diagnose what happened. It's a Lambda function like any other, so it also has a handler and context, and returns a JSON.
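A sketch of such an error handler, assuming it is reached through a Catch rule in the state machine (Step Functions then passes the failure details in the event under "Error" and "Cause"):

```python
import json

def handler(event, context):
    # Log the raw failure so it can be inspected in CloudWatch Logs
    print(json.dumps(event))
    return {
        "status": "FAILED",
        "error": event.get("Error", "Unknown"),
        "cause": event.get("Cause", ""),
    }
```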
Then we declare the serverless.yml of this project:
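As a rough sketch, assuming the serverless-step-functions plugin and illustrative names (abbreviated to two states for brevity):

```yaml
service: step-functions
provider:
  name: aws
  runtime: python3.8
plugins:
  - serverless-step-functions
functions:
  receiveLead:
    handler: functions/receive_lead.handler
  createClient:
    handler: functions/create_client.handler
stepFunctions:
  stateMachines:
    crmFlow:
      definition:
        StartAt: ReceiveLead
        States:
          ReceiveLead:
            Type: Task
            Resource:
              Fn::GetAtt: [ReceiveLeadLambdaFunction, Arn]
            Next: CreateClient
          CreateClient:
            Type: Task
            Resource:
              Fn::GetAtt: [CreateClientLambdaFunction, Arn]
            End: true
```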
Now we have the Lambda functions for each step of the State Machine, the API that writes to the DB, and the endpoints exposed to receive requests.
Sending a message to the Event Bridge event bus
For all of this to start interacting, it is necessary to send the event that initializes the flow.
Assuming that we are working on your company's CRM, and that you are getting the initial data from a web form, the way to write to the event bus that initializes the flow is through the AWS SDK. See the languages for which it is available here.
If you're working with Python, the way to submit the form would be this:
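A sketch using Boto3's `put_events` call; the bus name, source and detail-type below are assumptions and must match whatever rule triggers your State Machine:

```python
import json

def build_event_entry(form_data, bus_name="crm-bus"):
    # Shape one EventBridge entry; the form payload goes JSON-encoded
    # into "Detail" and arrives nested under "detail" on the other side
    return {
        "Source": "crm.web_form",
        "DetailType": "lead_received",
        "Detail": json.dumps(form_data),
        "EventBusName": bus_name,
    }

def send_lead(form_data):
    # Requires AWS credentials configured (e.g. via the AWS CLI)
    import boto3
    client = boto3.client("events")
    return client.put_events(Entries=[build_event_entry(form_data)])
```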
Once you have configured everything correctly, go to the Step Functions service in your AWS console, where you will see the list of executions:

If you choose the latest execution, you will see the execution sequence of the Step Functions and the details of its inputs and outputs:

When you choose the ReceiveLead step, the input corresponds to the payload sent as an event via EventBridge.

The Truth Test
If you access your database (either with a terminal client or an intuitive visual client) you will see that each piece of data is in its corresponding table.

Conclusions
Step Functions is a very powerful service if you need to automate a sequential flow of actions. Here we worked through a simple example, but it's highly scalable. In addition, working with Step Functions is an invitation to decouple the requirements of the solution you need to implement, making it easier to identify points of failure.
This type of orchestration is completely serverless, so it is much cheaper than developing an application that runs on a server just to fulfill this role.
It's a great way to experiment with a hybrid cloud, reusing and integrating applications from your data center, and interacting with cloud services.
Ready to optimize your data flows in hybrid environments?
At Kranio, we have experts in system integration and serverless solutions who will help you orchestrate efficient processes between your on-premise systems and the cloud. Contact us and discover how we can drive your company's digital transformation.