In a conventional ETL or ELT pipeline, all data mapping is tightly bound to a statically defined schema in which data types, number of columns, and column names are generated based on the original layout of corresponding source tables.
Under this approach, any modifications in source files or tables must be manually reflected in related dataflows and workflows. While these tasks are generally limited, in situations where transactional data is being received from a wide variety of locations, some input feeds may contain additional columns or follow different recording conventions for specific fields. What’s more, concerned departments are usually more than willing to adjust how their data is communicated on an ongoing basis.
With static layouts in place, users may be required to make several manual adjustments to their existing mappings and transformations to resolve deviations from the updated schema.
Astera’s new dynamic layout option streamlines these time-consuming tasks with intuitive features that allow parameter configuration for source and destination entities with all changes automatically propagated throughout linked data maps. These changes are initiated based on the paths and relationships that are already incumbent on existing fields within your dataflows and workflows, regardless of the visible structure of source entities.
With dynamic layouts enabled, these differentials can be automatically identified and implemented in your ETL and ELT processes without disrupting your data integrations.
Dynamic Layout Use Cases
We’ve identified a number of scenarios in which the ability to automate structural changes to layouts would be advantageous to our users. Potential uses for this feature include:
- Quickly updating dataflow layouts when columns are added, removed, or modified at source.
- Performing identical processes (cleansing, transformations, migrations) across multiple dataflows with minimal manual adjustment.
- Creating a dynamic ETL/ELT process that can pick disparate files from a given location and load them into new destinations automatically.
These use cases are explained in more detail below.
Dealing with Layout Anomalies
In the following dataflow, you can see that the source entity allows for three default fields, which are then enhanced with a variable and constant value transformation.
Now, we can use the layout builder screen, to set up a new parameter for the source as shown below.
Once dynamic layouts are enabled, the output for this entity is automatically updated with all new fields pushed through the same mapping logic. The new table structure appears as follows:
We can further test this feature by adding an additional field to the source file without making any change to the layout of the source object in Astera Centerprise. This will show us if the change is accurately reflected in dataflow.
Now, if we check output at the destination after re-running the dataflow, we can see that a differential has been automatically executed to resolve any discrepancies between source entity and Excel file. The additional field is then seamlessly incorporated into the existing mapping. The same process will take place whenever fields are removed from the source file.
Dynamic Layouts in a High-Volume Data Migrations
Dynamically Updated Databases
The true effectiveness of this feature becomes apparent when we implement it in a workflow object, which picks an array of source files from a repository and runs them through a series of pre-set transformations.
Above, we have a dynamic database source that is configured to point towards an input defined in the variable transformation.
However, the variable transformation contains no definitions for a source file or directory.
In this case, the definitions are supplied by an external File System Items Source in a workflow (See below)
This source object is set up to point towards a directory containing 26 different files, each one with a different layout.
As you can see above, the full path and input file name have been linked to variable object in the dataflow. This workflow has also been set up to run on a loop, so each file in the source directory will be picked up and run through the dataflow object, which will dynamically adapt to the new layout and run the source file through the transformations provided in the last use case without the need for any manual adjustment by the user.
We have also configured the destination table in our original dataflow to load all transformed files into a CSV target, which is defined as an uppercase version of the original source filename as denoted by an additional expression in the variable entity.
Dynamically Updated SQL Queries
In the diagram below, we’ve used the SQL Query Source object to select all fields from the Employees table in the source database.
Once again, the parameter paths in the variable are dependent on inputs from an external workflow object as seen below.
In this case, we have selected a delimited source that contains a total of 26 table names. This object is passed through an expression transformation that places the supplied table name in a SELECT statement
This statement is then linked to the sqlquery1 input variable field in the dataflow. As a result, the select statement performed at source in Fig 4 will be updated dynamically to pick data from a new table each time the dataflow is executed.
Each new dynamically generated dataflow can also be accessed individually, if additional adjustments are required based on the input table.
As you can see, the source file in Fig 5 has been set to run on a loop so the workflow will continue to supply new table names which will be used in Fig 4’s SQL query source. By implementing dynamic layouts in this way, users can save themselves the time and effort spent on building several individual dataflows to perform the same operations on disparate source tables.
Data Cleansing and Dynamic Layouts
Another highly effective implementation of dynamic layouts is in large-scale data cleansing operations, in which a constantly updated dataset must be validated for accuracy and consistency before being passed onto reporting and analytics systems.
In a conventional schema, the layout would need to be manually updated whenever source data is updated. In cases where the concerned function made structural changes to their tables without informing developers, data cleansing activities may be disrupted altogether. This will create a sizeable impact on decision-making at both executive and operational level.
What’s more, if multiple tables need to be cleansed in the same manner, then the developers would need to create individual processes to deal with each source.
The figure above shows a source object which is configured to pick CSV files from an input source based on values provided in the variable transformation.
The target entity has been similarly configured to create new destination based on definitions provided in the variable transformation.
While the dataflow currently shows only one field being pushed through this process, we can add several additional columns to the source without any issue.
These will be integrated seamlessly into the existing mapping based on the CustomerID guide field, transformed, and output to the newly configured destination.
Add Instant Adaptability to your Dataflows with Astera Centerprise’s Dynamic Layouts Feature
Whether you’re looking to propagate changes to a destination object or build a complete dataflow pipeline that can respond in real-time to updates and modifications to your source datasets, dynamic layouts can add flexibility to even the most complex data mappings.
For more information on this and other features, you can schedule a no-obligation consultation with our technical team today.