Data synchronization is also known as data version concurrency or data version synchronization. It ensures the consistency of replicated data, making sure that every replicated data value matches the official data version.
Many data synchronization technologies are available to synchronize a single set of data between two or more electronic devices, such as computers, cellular phones, and other personal digital assistants. These technologies can automatically copy changes back and forth. For instance, a contact list on one user's mobile phone can be synchronized with a matching contact list on another mobile phone or on a computer.
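The back-and-forth copying described above can be sketched in a few lines. The sketch below is a minimal, hypothetical illustration (the data layout and the newest-wins rule are assumptions, not any particular product's protocol): each contact carries a last-modified timestamp, and whichever side holds the newer entry wins on both devices.

```python
# Minimal sketch of two-way contact-list synchronization.
# Assumption: each entry is (phone_number, last_modified) and the
# entry with the newer timestamp wins on both sides.

def sync_contacts(device_a: dict, device_b: dict) -> None:
    """Merge two {name: (phone, last_modified)} dicts in place."""
    for name in set(device_a) | set(device_b):
        entry_a = device_a.get(name)
        entry_b = device_b.get(name)
        if entry_a is None:                # only B has it: copy to A
            device_a[name] = entry_b
        elif entry_b is None:              # only A has it: copy to B
            device_b[name] = entry_a
        elif entry_a[1] >= entry_b[1]:     # newer timestamp wins
            device_b[name] = entry_a
        else:
            device_a[name] = entry_b

phone = {"Alice": ("555-0100", 10), "Bob": ("555-0101", 5)}
computer = {"Bob": ("555-0199", 8), "Carol": ("555-0102", 3)}
sync_contacts(phone, computer)
# After syncing, both devices hold the same, newest entries.
```

Real synchronizers must also handle deletions and true conflicts (both sides edited since the last sync), which a simple newest-wins rule glosses over.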
Data synchronization can be broadly categorized as local synchronization or remote synchronization. In local data synchronization, the devices trying to synchronize data are located near each other, often side by side.

The transfer of data to be synchronized may employ connection technologies such as infrared, network cable, or Bluetooth. In remote synchronization, the devices trying to synchronize data are located far from each other, and data transfer employs networking protocols such as the File Transfer Protocol (FTP).
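How a remote synchronizer decides what to send can be sketched as follows. This is a minimal illustration under assumed conditions: the "remote" side is simulated with a plain local directory, whereas in practice the listing and the copy would travel over FTP or a similar protocol. Files are compared by content hash, and anything missing or different on the remote side is queued for transfer.

```python
# Minimal sketch of the "what needs transferring?" step of remote sync.
# Assumption: the remote store is simulated by a local directory; a real
# implementation would fetch the remote listing over FTP or similar.

import hashlib
from pathlib import Path

def file_hash(path: Path) -> str:
    """Content hash used to detect changed files."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def files_to_transfer(local: Path, remote: Path) -> list:
    """Return names of local files that are absent or stale remotely."""
    pending = []
    for src in sorted(local.iterdir()):
        dst = remote / src.name
        if not dst.exists() or file_hash(src) != file_hash(dst):
            pending.append(src.name)
    return pending
```

Hashing avoids re-sending unchanged files, which matters when the link between the two sites is slow or metered.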
In the past, data management was a scenario in which data could be either consistent or highly available, but never both at the same time. This was what was referred to as the Heisenbergian dilemma. But with today's rapid advances in information technology, especially in the field of real-time processing, data synchronization is much more efficient than ever before.
During the time when Usenet was very popular on the internet, it was more sensible to replicate content across a federation of news servers. The authors of RFC 977, which specifies a standard for the stream-based transmission of news, wrote:
“There are few (if any) shared or networked file systems that can offer the generality of service that stream connections using Internet TCP provide.”
Today, however, there are thousands of shared file systems that offer this generality of service. Generality of service is what web servers provide as they serve today's dynamic webpages, including forums, blogs, and wikis. These developments have led to better data synchronization techniques, cast from the model of internet infrastructure and implemented in organizational data warehouses.
Data synchronization helps greatly in a large data warehouse, which also maintains various data marts and other data sources in order to distribute the processing load for efficiency. Since the data warehouse is expected to feed updated, clean, relevant, meaningful, and timely information to business intelligence systems, it has to make sure that data is synchronized throughout the whole system of data sources.
Since data synchronization requires frequent communication between the data warehouse and the other data sources, problems related to network traffic management can spring up. This has to be managed carefully by a set of standard network protocols, as well as proprietary protocols employed by developers of network application solutions.
Data synchronization is not just about overcoming network problems, though. It works closely with the whole IT and business data architecture. One aspect of data synchronization is isolating multiple data views from the underlying model, so that updates to the data model not only alter a single data view but also propagate to every synchronized instance of that model.
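The view-isolation idea above can be sketched with a small observer-style model. The class and method names here are illustrative assumptions, not any particular framework's API: each view registers with the model, and every update to the model propagates to all registered views.

```python
# Minimal sketch of views kept in sync with an underlying data model,
# in the spirit of the observer pattern. Names are illustrative only.

class DataView:
    def __init__(self) -> None:
        self.snapshot: dict = {}

    def refresh(self, data: dict) -> None:
        self.snapshot = dict(data)   # views hold a copy, not the model itself

class DataModel:
    def __init__(self) -> None:
        self._data: dict = {}
        self._views: list = []

    def attach(self, view: DataView) -> None:
        self._views.append(view)
        view.refresh(self._data)     # bring a new view up to date immediately

    def update(self, key: str, value) -> None:
        self._data[key] = value
        for view in self._views:     # propagate to every synchronized view
            view.refresh(self._data)

model = DataModel()
report_view, dashboard_view = DataView(), DataView()
model.attach(report_view)
model.attach(dashboard_view)
model.update("revenue", 1000)
# Both views now reflect the updated model.
```

Because the views depend only on the model's notifications, new views can be attached without touching the model's update logic, which is what makes the isolation useful.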
There are actually hundreds of techniques for data synchronization, and different software vendors have different implementations of these techniques. There can never be an all-in-one solution, as needs differ from one data warehouse to another.