A study by IDC predicts that worldwide data volume will grow to 175 zettabytes (ZB) by 2025. Managing growing volumes of data from diverse sources is difficult. For this reason, many organizations use data integration tools with workflow automation capabilities to accelerate their data processes.
Whether you want to consolidate transactional data, migrate data from legacy systems, or integrate partner or vendor data, the workflow component in Astera Centerprise automates the execution of a sequence of tasks, in serial or parallel, on multiple servers. This helps minimize error probability, optimize business processes, and improve time-to-value by eliminating the manual steps involved in designing and deploying data integration flows.
Workflow Automation Capabilities in Astera Centerprise
Designed to offer ease-of-use and flexibility, the workflow component helps visualize and automate the entire process, from the point data enters an organization to when it is cleaned, validated, and loaded into the preferred destination.
To illustrate the workflow functionality, let us consider a scenario in which a business receives customer data in a spreadsheet every month. The requirement is to cleanse the incoming data and load the processed data into the company’s CRM, Salesforce.com, for a unified view.
The screenshot below shows how the workflow functionality in Astera Centerprise accomplishes this task by automating the ETL process and sending email notifications to users on successful completion of the job.
Let’s look at the steps involved in the workflow in detail.
Step 1: Looping Through the Source Directory
First, a File System object is used for looping through the source directory to pick up the source file path.
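For readers who think in code, the looping step can be approximated outside the product with a few lines of Python. This is only a sketch of the idea, not how the File System object is implemented: the function name `iter_source_files` and the `*.xlsx` pattern are assumptions made for illustration.

```python
from pathlib import Path
from typing import Iterator


def iter_source_files(source_dir: str, pattern: str = "*.xlsx") -> Iterator[Path]:
    """Yield the path of each matching file in source_dir, in sorted order.

    Hypothetical helper: the name and the default pattern are illustrative.
    """
    for path in sorted(Path(source_dir).glob(pattern)):
        # Skip directories that happen to match the pattern.
        if path.is_file():
            yield path
```

Each yielded path plays the role of the source file path that the workflow hands to the next task.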
Step 2: Performing ETL
The Run Dataflow task is used to call a dataflow within a workflow. In this scenario, we will call an existing dataflow for executing the ETL process.
The dataflow is used to extract data from source files, cleanse the raw data to create a standardized structure, process it according to the business requirements, and load the transformed data into the company’s CRM, Salesforce.com, as shown in figure 5.
Moreover, the input and output variables are defined in the Variables object. The former is used to provide the source file path to the Excel source object and the latter is used for passing the job status value to the containing workflow for decision-making, as shown in the next step.
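The relationship between the input variable, the ETL steps, and the output variable can be sketched in Python as a single function. Several things here are assumptions for illustration: the source is read as CSV rather than Excel to keep the example within the standard library, the cleansing rules are invented, the `load` callable stands in for the Salesforce destination, and the status strings `"Completed"` and `"Terminated"` are placeholders rather than values taken from the product.

```python
import csv
from typing import Callable, Dict, List

Record = Dict[str, str]


def cleanse(row: Record) -> Record:
    """Trim whitespace and lowercase the email field (illustrative rules only)."""
    cleaned = {key.strip(): (value or "").strip() for key, value in row.items()}
    if "email" in cleaned:
        cleaned["email"] = cleaned["email"].lower()
    return cleaned


def run_dataflow(source_file_path: str, load: Callable[[List[Record]], None]) -> str:
    """Extract, cleanse, and load; return a job status for the calling workflow.

    source_file_path is the input variable; the returned string is the
    output variable. `load` is a hypothetical stand-in for the CRM destination.
    """
    try:
        with open(source_file_path, newline="", encoding="utf-8") as handle:
            rows = [cleanse(row) for row in csv.DictReader(handle)]
        load(rows)
        return "Completed"
    except Exception:
        # Any failure is reported as a status instead of being raised,
        # so the caller can branch on it.
        return "Terminated"
```

Returning a status rather than raising keeps the decision about what to do next in the calling workflow, which mirrors the role of the output variable described above.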
Step 3: Sending Email Notification
A Decision task invokes one of two paths in the workflow, depending on whether the logical expression inside the Decision object returns Yes (True) or No (False). In this scenario, the value of the output variable ‘Job Status’ is passed from the dataflow into the workflow for decision-making. The Decision object then either sends an email to notify users when the job is completed or copies the file to a directory if the job is terminated.
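The branching logic can be sketched in Python as follows. This is an illustration under stated assumptions rather than the product's behavior: the status string `"Completed"`, the mapping of the completed case to the Yes path, the `failed_dir` parameter, and the `notify` callable (a stand-in for the email task) are all hypothetical.

```python
import shutil
from pathlib import Path
from typing import Callable


def decide(job_status: str, source_file: str, failed_dir: str,
           notify: Callable[[str], None]) -> str:
    """Take the Yes path (notify users) or the No path (copy the file aside)."""
    if job_status == "Completed":
        # Yes path: hand a message to the hypothetical email sender.
        notify(f"ETL job finished for {Path(source_file).name}")
        return "notified"
    # No path: keep a copy of the source file for later inspection.
    Path(failed_dir).mkdir(parents=True, exist_ok=True)
    shutil.copy(source_file, failed_dir)
    return "copied"
```

Passing `notify` in as a parameter keeps the sketch free of mail server details; a real implementation would supply a function that sends the message.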
Step 4: Automating the ETL Flow
Astera Centerprise has a built-in job scheduler that allows you to automate ETL flows by specifying job frequency. This eliminates the need to manually run the flow every time a file is received.
In this case, the flow runs every time a file is dropped in the source directory, as shown below.
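A file-drop trigger of this kind can be approximated with simple polling. The sketch below is not how the built-in scheduler works internally; it only shows the general idea, and the names `poll_once`, `seen`, and `run_flow` are hypothetical.

```python
from pathlib import Path
from typing import Callable, Set


def poll_once(source_dir: str, seen: Set[str], run_flow: Callable[[str], None]) -> int:
    """Run the flow for every file not seen on a previous poll; return the count."""
    started = 0
    for path in sorted(Path(source_dir).iterdir()):
        if path.is_file() and path.name not in seen:
            # Remember the file so the next poll does not process it again.
            seen.add(path.name)
            run_flow(str(path))
            started += 1
    return started
```

Calling `poll_once` in a loop with a pause between calls (for example `time.sleep(60)`) would give a crude equivalent of a file-drop schedule. A production setup would also need to handle files that are still being written when the poll runs.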
Data integration jobs involve complex workflows that extract, cleanse, and validate structured and unstructured data. Automation plays a crucial role in streamlining these processes, as it helps increase throughput and productivity. Using Astera’s workflow component, you can visually piece together workflows of any complexity, and scale and automate the entire integration process – from the extraction of source data to transformation and loading.
Interested in giving Astera Centerprise’s workflow automation feature a try? Download a free 14-day trial version and experience it first-hand!