Streamline Real-Time Data Integration by Leveraging Workflows

January 7th, 2020

An IDC study predicts that worldwide data volume will grow to 175 zettabytes (ZB) by 2025. Managing growing volumes of data from diverse sources is difficult, so many organizations use data integration tools with workflow automation capabilities to accelerate their data processes.

Whether you want to consolidate transactional data, migrate data from legacy systems, or integrate partner or vendor data, the workflow component in Astera Centerprise automates the execution of a sequence of tasks, in series or in parallel, on multiple servers. By eliminating the manual steps involved in designing and deploying data integration flows, it reduces the likelihood of errors, optimizes business processes, and improves time-to-value.

Integrate Data with Workflow Automation in Centerprise

Designed to offer ease-of-use and flexibility, the workflow component helps visualize and automate the entire process, from the point data enters an organization to when it is cleaned, validated, and loaded into the preferred destination.

To illustrate the data integration capabilities of Astera Centerprise with workflow automation, let us consider a scenario in which a business receives customer data in a spreadsheet every month. The requirement is to cleanse the incoming data and load the processed data into the company’s CRM, Salesforce.com, for a unified view.

The screenshot below shows how the workflow functionality in Astera Centerprise accomplishes this task by automating the ETL process and sending email notifications to users when the job completes successfully.

Workflow in Astera Centerprise

Figure 1: Using a Workflow Component

Let’s look at the steps involved in the workflow in detail.

Step 1: Looping Through the Source Directory

First, a File System object loops through the source directory to pick up the path of each source file.

File System Action Object

Figure 2: Specifying the Source Directory
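
For readers who think in code, the looping step can be pictured roughly as follows. This is a minimal Python sketch, not how Centerprise implements the File System object; the function name `iter_source_files` and the `*.xlsx` pattern are assumptions made for illustration.

```python
from pathlib import Path
from typing import Iterator


def iter_source_files(source_dir: str, pattern: str = "*.xlsx") -> Iterator[Path]:
    """Yield the path of each matching file in the source directory.

    Hypothetical helper: the name and the default pattern are illustrative.
    """
    # sorted() gives a deterministic order; is_file() skips sub-directories.
    for path in sorted(Path(source_dir).glob(pattern)):
        if path.is_file():
            yield path
```

Each yielded path plays the role of the source file path that the workflow hands to the next task.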

Step 2: Performing ETL

The Run Dataflow task is used to call a dataflow within a workflow. In this scenario, we will call an existing dataflow for executing the ETL process.

Run Dataflow Object

Figure 3: Run Dataflow Object Properties

Workflow Orchestration

Figure 4: Orchestrating the ETL Process in a Workflow

The dataflow is used to extract data from source files, cleanse the raw data to create a standardized structure, process it according to the business requirements, and load the transformed data into the company’s CRM, Salesforce.com, as shown in Figure 5.

Moreover, the input and output variables are defined in the Variables object. The former is used to provide the source file path to the Excel source object and the latter is used for passing the job status value to the containing workflow for decision-making, as shown in the image below.

Dataflow

Figure 5: ETL Process to streamline database
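
The relationship between the input variable, the ETL steps, and the output variable can be sketched in Python. This is an illustration under stated assumptions rather than Centerprise's own logic: it reads a CSV file instead of an Excel workbook to stay within the standard library, the cleansing rules are invented, the `load` callable stands in for the Salesforce destination, and the status strings "Completed" and "Terminated" are hypothetical values for the ‘Job Status’ variable.

```python
import csv
from typing import Callable, Dict, List

Record = Dict[str, str]


def cleanse(row: Record) -> Record:
    """Trim whitespace and lower-case the Email field (hypothetical rules)."""
    cleaned = {key.strip(): (value or "").strip() for key, value in row.items()}
    if "Email" in cleaned:
        cleaned["Email"] = cleaned["Email"].lower()
    return cleaned


def run_dataflow(source_path: str, load: Callable[[List[Record]], None]) -> str:
    """Extract, cleanse, and load one file; return the job status.

    source_path is the input variable; the returned string is the output
    variable handed back to the containing workflow.
    """
    try:
        with open(source_path, newline="", encoding="utf-8") as handle:
            rows = [cleanse(row) for row in csv.DictReader(handle)]
        load(rows)  # stand-in for the CRM destination
    except Exception:
        return "Terminated"  # assumed status value
    return "Completed"  # assumed status value
```

Returning a status instead of raising keeps the decision about what to do next in the workflow, which mirrors the division of labor described above.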

Step 3: Sending Email Notification

A Decision task invokes one of two paths in the workflow, depending on whether the logical expression inside the Decision object returns Yes (True) or No (False). In this scenario, the value of the output variable ‘Job Status’ is passed from the dataflow to the workflow for decision-making. If the job completed, the Decision object routes the workflow to send an email notifying the users; if the job was terminated, it routes the workflow to copy the file to a directory.

Figure 6: Decision Properties

Workflow

Figure 7: Using a Decision Object in the Workflow
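
The two-way branch can be expressed as a short Python sketch. It is only an analogy for the Decision object: the function name `decide`, the "Completed" status string, and the `notify` callable (which stands in for the email task) are assumptions, not Centerprise identifiers.

```python
import shutil
from pathlib import Path
from typing import Callable


def decide(job_status: str, source_path: str, failed_dir: str,
           notify: Callable[[str], None]) -> str:
    """Follow the Yes path when the job completed, otherwise the No path."""
    if job_status == "Completed":  # assumed status value
        # Yes path: tell the users the job finished.
        notify(f"ETL job completed for {Path(source_path).name}")
        return "notified"
    # No path: copy the file to a directory for later inspection.
    Path(failed_dir).mkdir(parents=True, exist_ok=True)
    shutil.copy(source_path, failed_dir)
    return "copied"
```

Passing `notify` as a parameter keeps the sketch independent of any particular mail setup; in practice it could wrap an SMTP client or any other notification channel.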

Step 4: Automating the ETL Flow

Astera Centerprise has a built-in job scheduler that allows you to automate ETL flows and streamline database integration by specifying job frequency. This eliminates the need to manually run the flow every time a file is received.

In this case, the flow runs every time a file is dropped in the source directory, as shown below.

Job Scheduler | Astera Centerprise

Figure 8: Scheduling a Job
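
A file-drop trigger of this kind can be approximated with a simple polling loop. The Python sketch below shows the general idea only; it says nothing about how the Centerprise scheduler detects new files, and the names `poll_once` and `watch` and the five-second interval are assumptions.

```python
import time
from pathlib import Path
from typing import Callable, Set


def poll_once(source_dir: str, seen: Set[Path], run: Callable[[Path], None]) -> int:
    """Run the flow for every file not seen before; return how many ran."""
    new_files = sorted(p for p in Path(source_dir).iterdir()
                       if p.is_file() and p not in seen)
    for path in new_files:
        seen.add(path)  # remember the file so it is processed only once
        run(path)
    return len(new_files)


def watch(source_dir: str, run: Callable[[Path], None],
          interval_s: float = 5.0) -> None:
    """Poll the directory forever; stop with Ctrl+C."""
    seen: Set[Path] = set()
    while True:
        poll_once(source_dir, seen, run)
        time.sleep(interval_s)
```

A polling loop like this can pick up a file that is still being written, so a production trigger would normally also wait until the file is complete before running the flow.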

Streamline Data Integration with Centerprise

Data integration jobs involve complex workflows that extract, cleanse, and validate structured and unstructured data. Automation plays a crucial role in data event streaming, as it helps increase throughput and productivity.

Using Astera’s workflow component, you can visually piece together workflows of any complexity, and scale and automate the entire real-time data integration process – from the extraction of source data to its transformation and loading into the data warehouse. With real-time streaming, this data can be displayed in reports as it arrives.

Learn more about the data integration and workflow automation capabilities of Astera Centerprise. Download a free 14-day trial version and experience it first-hand!