Centerprise Best Practices: Modularity and Reusability in Dataflow Design

November 18th, 2013

Dataflows are the cornerstone of any data integration project in Centerprise. The Dataflow Designer, with its visual interface, drag-and-drop capabilities, instant preview, and full complement of sources, targets, and transformations, ensures users will be able to create and maintain effective and efficient dataflows. This two-part blog offers some best practices for getting the most out of your Centerprise integration projects. Part One addresses modularity and reusability, key design principles in the development of good dataflows. We’ll discuss performance-tweaking best practices in the second part of this blog.

Modularity and Reusability
Modularity enhances the maintainability of your dataflows by making them easier to read and understand. It also promotes reusability by isolating frequently used logic into individual components that can be leveraged as “black boxes” by other flows. Centerprise supports multiple types of reusable components, including subflows, shared actions, shared connections, and detached transformations.

Subflows are reusable blocks of frequently-used dataflow steps that have inputs and outputs. Once created, subflows can be used just like built-in Centerprise transformations. Examples of reusable logic that can be housed in subflows include:

  • Validations that are applied to data coming from multiple sources, frequently in incompatible formats
  • Transformation sequences such as a combination of lookup, expression, and function transformations that occur in multiple places in the project
  • Processing of incoming data that arrives in different formats but must be normalized, validated, and boarded
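Centerprise subflows are built visually, so the following is only an analogy in Python, not Centerprise syntax. It sketches the idea of a reusable block with one input and one output that several flows can share. The function name, the field names, and the "non-empty id" validation rule are all assumptions made for illustration.

```python
from typing import Iterable, Iterator

Record = dict[str, str]


def normalize_and_validate(records: Iterable[Record]) -> Iterator[Record]:
    """Hypothetical 'subflow': takes records in, yields cleaned records out."""
    for rec in records:
        # Normalize: trim whitespace and lower-case the field names.
        cleaned = {key.strip().lower(): value.strip() for key, value in rec.items()}
        # Validate (assumed rule): drop records that have no id.
        if cleaned.get("id"):
            yield cleaned


# Two made-up sources with incompatible header styles reuse the same block.
csv_rows = [{" ID ": " 17 ", "Name": "Acme "}]
api_rows = [{"id": "42", "NAME": "Globex"}, {"id": "", "NAME": "dropped"}]

clean_csv = list(normalize_and_validate(csv_rows))
clean_api = list(normalize_and_validate(api_rows))
```

Because the cleanup logic lives in one place, a change to the validation rule reaches every caller, which is the maintainability benefit described above.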

Example Centerprise subflow

Shared Actions are similar to subflows but contain only a single action. They are useful when a source or destination is used in multiple places within a project. If a field is added to or removed from the source, every flow that uses the shared action inherits that change automatically.

Shared Connections contain database connection information that can be shared by multiple actions within a dataflow. They can also be used to enforce transaction management across a number of database destinations. Use them whenever multiple actions in a dataflow use the same database connection information.
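The transaction-management point can be illustrated outside Centerprise with a short Python sketch. It uses the standard-library `sqlite3` module and an in-memory database as a stand-in for a real target. The table names and the `load_orders` helper are hypothetical. The idea it shows is that two destinations writing through one shared connection either both commit or both roll back.

```python
import sqlite3


def load_orders(conn, orders, lines):
    """Write to two destination tables through one shared connection."""
    try:
        with conn:  # commits on success, rolls back if any insert raises
            conn.executemany("INSERT INTO orders(id, customer) VALUES (?, ?)", orders)
            conn.executemany("INSERT INTO order_lines(order_id, sku) VALUES (?, ?)", lines)
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT NOT NULL)")
conn.execute("CREATE TABLE order_lines (order_id INTEGER NOT NULL, sku TEXT NOT NULL)")

ok = load_orders(conn, [(1, "Acme")], [(1, "A-100")])
# The second load violates NOT NULL on sku, so its order row is rolled back too.
failed = load_orders(conn, [(2, "Globex")], [(2, None)])
order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

With separate connections per destination, the first insert of the failed load could have been committed on its own, leaving an order without lines.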

Detached Transformations are a capability within Centerprise developed for scenarios where a lookup or expression is used in multiple places within a dataflow. Detached Transformations enable you to create a single instance and use it in multiple places. They are available in expressions as callable functions, enabling you to use them in multiple expressions. Additionally, Detached Transformations allow you to use lookups conditionally. An example of a conditional lookup would be, “if party type is insurer, perform lookup on insurer table else perform lookup on provider table.”
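The conditional lookup in that example can be sketched in Python as follows. This is an analogy rather than Centerprise expression syntax. The two dictionaries stand in for the insurer and provider tables, and their contents and all function names are made up for illustration.

```python
# Stand-ins for the insurer and provider lookup tables (sample data is invented).
INSURERS = {"I-01": "Northwind Insurance"}
PROVIDERS = {"P-07": "Lakeside Clinic"}


def lookup_insurer(party_id):
    """Single lookup definition, callable from any expression."""
    return INSURERS.get(party_id)


def lookup_provider(party_id):
    return PROVIDERS.get(party_id)


def party_name(party_type, party_id):
    """If party type is insurer, look up the insurer table, else the provider table."""
    if party_type == "insurer":
        return lookup_insurer(party_id)
    return lookup_provider(party_id)
```

Only the lookup that matches the party type runs for a given record, and each lookup is defined once no matter how many expressions call it.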

Centerprise Detached Transformation

Input parameters and output variables make it possible to supply values to dataflows at runtime and return output results from dataflows. Well-designed parameter and output variable structures promote reusability and reduce ongoing maintenance costs.

Input Parameters

Input parameter values can be supplied from a calling workflow using data mapping. When designing dataflows, analyze the values that could change between different runs of the flow and define these values as input parameters. Input parameters can include file paths, database connection information, and other data values.
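As a rough Python analogy, the values that change between runs can be gathered into one parameter object that the caller fills in. The `FlowParameters` class, its fields, and `run_flow` are hypothetical. To keep the sketch self-contained, inline text stands in for the file that a path parameter would normally point to.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowParameters:
    """Values that vary per run (all field names are illustrative)."""
    source_text: str          # stands in for a file path parameter
    delimiter: str = ","
    min_amount: float = 0.0


def run_flow(params: FlowParameters) -> list[dict[str, str]]:
    """Same flow logic for every run; only the parameters differ."""
    reader = csv.DictReader(io.StringIO(params.source_text), delimiter=params.delimiter)
    return [row for row in reader if float(row["amount"]) >= params.min_amount]


# Two runs of the same flow with different parameter values.
filtered = run_flow(FlowParameters(source_text="id,amount\n1,5\n2,50\n", min_amount=10))
piped = run_flow(FlowParameters(source_text="id|amount\n3|7\n", delimiter="|"))
```

Nothing inside `run_flow` is hard-coded to a particular source or threshold, so the same flow serves both runs without edits.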

Output Variables

If you would like to make decisions about subsequent execution paths based on the result of a dataflow run, define output variables and use an expression transformation to store values into output variables. These output variables can be used in subsequent workflow actions to control the execution path.
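A minimal Python sketch of this pattern, under the assumption that a run reports a row count and a rejected-row count: the dataflow returns its output variables, and the calling workflow inspects them to pick the next step. The variable names, step names, and the 10% threshold are all invented for illustration.

```python
def run_dataflow(rows):
    """Return output variables describing the run (names are hypothetical)."""
    rejected = sum(1 for row in rows if row.get("amount") is None)
    return {"row_count": len(rows), "rejected_count": rejected}


def next_step(outputs, max_reject_ratio=0.1):
    """Workflow decision: choose the execution path from the output variables."""
    if outputs["row_count"] == 0:
        return "skip_load"
    if outputs["rejected_count"] / outputs["row_count"] > max_reject_ratio:
        return "notify_and_stop"
    return "load_warehouse"


good_run = run_dataflow([{"amount": 10}, {"amount": 20}])
bad_run = run_dataflow([{"amount": 10}, {"amount": None}])
empty_run = run_dataflow([])
```

Here `next_step(good_run)` selects the load path, `next_step(bad_run)` stops because half the rows were rejected, and `next_step(empty_run)` skips the load entirely.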

Centerprise Performance Best Practices

Next week we’ll share performance best practices. Because Centerprise has been designed as a parallel-processing platform to deliver superior speed and performance, designing dataflows to take advantage of the software’s abilities can significantly affect your data integration performance. The performance best practices we’ll discuss next week can result in a major performance boost.
