Overview
Data Synchronization is a key feature of Centerprise Data Integrator and is
designed to optimize large-scale data transformation processes. This feature is
especially valuable in situations where you periodically replicate data between
different systems or update large data sets with data refreshes from other
systems. By eliminating database writes that are not necessary and optimizing
the ones that are, this feature can bring about major performance gains.
This document provides an overview of Centerprise Data Integrator's Data
Synchronization feature. It discusses circumstances where Data Synchronization
can be used to optimize transfer tasks. It also discusses techniques and options
to control the data synchronization process.
How Data Synchronization Works
Data Synchronization is designed for situations where an existing database is
updated periodically with data from another source. The source can be a file,
database, query, or any other source type supported by Centerprise. Typically,
the source contains a large number of records. However, usually only a subset
of these records has actually changed since the previous update.
Centerprise data synchronization compares data in the destination with the
source data and, based on a set of user-controlled options, updates the
destination database. Synchronization is performed in batches. Centerprise reads
a batch of records from the source and maps them to the destination. For all
records in the batch, it retrieves the destination records and performs
reconciliation between the source and destination records. If source and
destination records are identical, no updates are performed. If there is no
corresponding destination record for a source record, the record is inserted.
Multiple batches are run in parallel to increase throughput.
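The reconciliation step described above can be illustrated with a minimal Python sketch. This is not Centerprise code: the function name reconcile_batch, the dictionary-based destination, and the "id" key field are assumptions made for illustration. The sketch also assumes that a record which differs from its destination counterpart is updated, whereas in the product that behavior is governed by user-controlled options.

```python
from typing import Dict, Hashable, List

Record = Dict[str, object]


def reconcile_batch(source_batch: List[Record],
                    destination: Dict[Hashable, Record],
                    key: str = "id") -> Dict[str, int]:
    """Reconcile one batch of source records against the destination.

    `destination` is a hypothetical in-memory stand-in for a database
    table, indexed by the key field.
    """
    counts = {"inserted": 0, "updated": 0, "unchanged": 0}
    for src in source_batch:
        dest = destination.get(src[key])
        if dest is None:
            # No corresponding destination record: insert it.
            destination[src[key]] = dict(src)
            counts["inserted"] += 1
        elif dest == src:
            # Identical records: no write is performed.
            counts["unchanged"] += 1
        else:
            # Assumption: differing records are updated.
            destination[src[key]] = dict(src)
            counts["updated"] += 1
    return counts
```

In this sketch only the inserted and updated records would result in a database write, which is where the savings come from when most source records are unchanged.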
Synchronizing Data
Centerprise Data Integrator provides a number of options to influence data
synchronization behavior. This section provides a brief description of
synchronization options.
If Record exists in source and destination
You can choose Update, Skip, or Delete and Insert actions
on records that exist in both source and destination. You can also use business
rules to specify conditions for updates. For instance, if you want to skip
updating orders that are already shipped, you can specify a condition that
filters out these orders.
If Record exists only in source
You can choose Skip or Insert actions on records that are found in
source but do not exist in destination.
If Record exists only in destination
For records that exist in destination but do not exist in source, you can
choose to leave them untouched or delete them.
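As a rough illustration of how these three options interact, the following Python sketch models them as a small configuration object and derives a list of planned actions. All names here (SyncOptions, plan_actions, the enum members, and the update_condition callback) are hypothetical and are not part of the Centerprise product or its APIs.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, Hashable, List, Optional, Tuple

Record = Dict[str, object]


class BothAction(Enum):
    UPDATE = "update"
    SKIP = "skip"
    DELETE_AND_INSERT = "delete_and_insert"


class SourceOnlyAction(Enum):
    SKIP = "skip"
    INSERT = "insert"


class DestinationOnlyAction(Enum):
    LEAVE = "leave"
    DELETE = "delete"


@dataclass
class SyncOptions:
    in_both: BothAction = BothAction.UPDATE
    source_only: SourceOnlyAction = SourceOnlyAction.INSERT
    destination_only: DestinationOnlyAction = DestinationOnlyAction.LEAVE
    # Optional business rule: return True to allow the update.
    update_condition: Optional[Callable[[Record, Record], bool]] = None


def plan_actions(source: Dict[Hashable, Record],
                 destination: Dict[Hashable, Record],
                 options: SyncOptions) -> List[Tuple[str, Hashable]]:
    """Return (action, key) pairs describing the writes to perform."""
    plan: List[Tuple[str, Hashable]] = []
    for key, src in source.items():
        dest = destination.get(key)
        if dest is None:
            if options.source_only is SourceOnlyAction.INSERT:
                plan.append(("insert", key))
        elif dest != src and options.in_both is not BothAction.SKIP:
            rule = options.update_condition
            if rule is None or rule(src, dest):
                plan.append((options.in_both.value, key))
    if options.destination_only is DestinationOnlyAction.DELETE:
        for key in destination:
            if key not in source:
                plan.append(("delete", key))
    return plan
```

For example, passing update_condition=lambda src, dest: dest["status"] != "shipped" would leave already-shipped orders untouched, mirroring the business-rule example given above.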
Using Data Synchronization for Data Replication
Data synchronization can be used to perform data replication across databases.
Using Centerprise Data Integrator's scheduler, you can trigger data
synchronization at specific intervals. Because Centerprise supports popular
databases such as Microsoft SQL Server, Oracle, Sybase, and DB2, you can perform
these updates directly between two databases without using intermediate flat
files.
Speeding up Synchronization with Database Writer Options
Centerprise Data Integrator provides a number of features that enable you to
control the database writing process. Several options are provided to speed up
database writes. These options can be combined to give a substantial
performance boost to your data synchronization tasks. Please refer to
‘Optimizing Large Jobs in Centerprise’ for a detailed discussion of this topic.
Controlling Synchronization Process
There are other options that enable you to control the data synchronization
process. These include specifying the synchronization batch size and the number
of synchronization batches that can be executed concurrently.
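The effect of these two settings can be sketched in Python with a thread pool. This is only an illustration under stated assumptions: run_batches, split_into_batches, and the parameter names batch_size and max_concurrent_batches are hypothetical and do not correspond to actual Centerprise setting names.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def split_into_batches(records: Sequence[T], batch_size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of at most `batch_size` records."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]


def run_batches(records: Sequence[T],
                process_batch: Callable[[Sequence[T]], R],
                batch_size: int = 1000,
                max_concurrent_batches: int = 4) -> List[R]:
    """Process batches concurrently; results are returned in batch order."""
    with ThreadPoolExecutor(max_workers=max_concurrent_batches) as pool:
        # At most `max_concurrent_batches` batches are in flight at once.
        return list(pool.map(process_batch, split_into_batches(records, batch_size)))
```

A larger batch size generally means fewer round trips per record, while a higher concurrency limit keeps more batches in flight at the same time. Suitable values depend on the source, the destination database, and the available resources.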
Creating Custom Synchronizers
While Centerprise Data Integrator's Data Synchronization provides a high degree
of configurability, there may be situations where you want to create customized
synchronizers. Centerprise exposes all the synchronization functionality
through extensive APIs that can be used to create customized variants. Please
refer to the ‘Creating Custom Integrators using Centerprise APIs’ series of
articles.