Disparate Operational Data

Operational data is generally used to support day-to-day business transactions and reflects the current status of an organization's operations. It becomes disparate when it is spread across systems that store and format it differently.

Dealing with disparate operational data is an everyday task in a data warehouse implementation. A data warehouse is the repository of a business organization's current and historical data, and one of its main goals is to supply the input for a business intelligence system, so it has to handle very large volumes of data at frequent intervals.

A single central database alone cannot handle an intensive data warehousing load, so other physical servers share the load and store part of the data. In a large business organization, for instance, some data may come from the finance department and other data from human resources, manufacturing, sales (for example, point-of-sale systems in department stores), and many other sources.

Even with the load shared, handling all of the data simultaneously would still be too heavy for the data warehouse system. It therefore needs a "temporary" area where current data values can be worked on. This area is commonly referred to as the operational data store (ODS).

An operational data store is essentially another set of relational databases containing data that is extracted on a regular basis, say nightly, from different sections of the business organization. In a school's operational system, for example, the extracted data may come from the student, personnel, financial aid, admissions, and billing and accounts receivable systems.

The operational data store is designed to integrate data from multiple sources so that operations, analysis, and reporting can be carried out efficiently. Because the data comes from a variety of sources, the integration involves cleaning, redundancy resolution, and business rule enforcement. The data store is usually designed to hold low-level, atomic (indivisible) data such as transactions and prices, in contrast to aggregated or summarized data such as net contributions, which is usually stored in the data warehouse.
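The three integration steps can be illustrated with a minimal Python sketch. The row layout, the field names (txn_id, dept, amount), and the "amount must not be negative" rule are all assumptions made up for this example; a real ODS would apply its own keys and rules.

```python
from collections import defaultdict

# Hypothetical atomic transaction rows as they might arrive from source systems.
raw_rows = [
    {"txn_id": "T1", "dept": " Sales ", "amount": 120.0},
    {"txn_id": "T2", "dept": "sales", "amount": 80.0},
    {"txn_id": "T2", "dept": "SALES", "amount": 80.0},    # same transaction sent twice
    {"txn_id": "T3", "dept": "Finance", "amount": -5.0},  # breaks the assumed rule
    {"txn_id": "T4", "dept": "finance", "amount": 40.0},
]

def integrate(rows):
    """Return cleaned, de-duplicated, rule-checked atomic rows."""
    seen = set()
    clean = []
    for row in rows:
        dept = row["dept"].strip().lower()   # cleaning: trim and normalize case
        if row["txn_id"] in seen:            # redundancy resolution: keep first copy
            continue
        if row["amount"] < 0:                # business rule (assumed): no negative amounts
            continue
        seen.add(row["txn_id"])
        clean.append({"txn_id": row["txn_id"], "dept": dept, "amount": row["amount"]})
    return clean

def summarize(rows):
    """Aggregate atomic rows into per-department totals (warehouse-style summary)."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["dept"]] += row["amount"]
    return dict(totals)

ods_rows = integrate(raw_rows)             # atomic data, as held in the ODS
warehouse_totals = summarize(ods_rows)     # aggregated data, as held in the warehouse
```

Here ods_rows keeps one row per transaction, while warehouse_totals holds only the per-department sums, which mirrors the atomic versus aggregated distinction described above.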

The data sources are very often disparate in nature. In practice, the sources feeding a data warehousing system are powered by different database systems: some databases may run on Oracle, others on MySQL or Microsoft SQL Server, and still others on products from other relational database vendors.

Even though the underlying frameworks of these relational databases are basically similar, each has implemented its own proprietary formatting. The outputs of these databases may therefore still be disparate operational data once they reach the operational data store.
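As a small illustration of format differences, the Python sketch below converts date strings from three sources into one ISO 8601 form. The source names and their date layouts are invented for the example and are not the defaults of any particular database product.

```python
from datetime import datetime

# Assumed date layout per source system (illustrative only, not vendor defaults).
SOURCE_DATE_FORMATS = {
    "source_a": "%d.%m.%Y",   # e.g. 05.03.2024
    "source_b": "%Y-%m-%d",   # e.g. 2024-03-05
    "source_c": "%m/%d/%Y",   # e.g. 03/05/2024
}

def to_iso_date(source: str, raw: str) -> str:
    """Parse a date string using its source's layout and return YYYY-MM-DD."""
    fmt = SOURCE_DATE_FORMATS[source]
    return datetime.strptime(raw.strip(), fmt).date().isoformat()

incoming = [
    ("source_a", "05.03.2024"),
    ("source_b", "2024-03-05"),
    ("source_c", "03/05/2024"),
]
normalized = [to_iso_date(source, raw) for source, raw in incoming]
```

All three inputs describe the same day, so after normalization the ODS can compare and join them directly.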

Another cause of disparate operational data lies not in the relational database management system itself but in the overall design of the enterprise information system. For instance, a company implementing a database system may not have defined a single, complete, integrated inventory of all its data. The real substance, meaning, and content of the data within the organizational data resource may not be readily known or well defined. There may also be very high variability in data formats and contents across the company's information system.

Many commercial software tools handle disparate operational data. A common data warehouse process called ETL, which stands for extract, transform, load, ensures that all types of disparate data and metadata are transformed before they are loaded into the data warehouse.
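The three ETL stages can be sketched in a few lines of Python using the standard sqlite3 module as a stand-in for the warehouse. The two source systems, their column names, and the product_price table are hypothetical; commercial ETL tools do far more than this.

```python
import sqlite3
from decimal import Decimal

def extract():
    """Extract: pull rows from two hypothetical sources with different conventions."""
    pos_rows = [{"sku": "A-1", "price_cents": 1999}, {"sku": "B-2", "price_cents": 500}]
    erp_rows = [{"ItemCode": "a-1 ", "Price": "19.99"}, {"ItemCode": "C-3", "Price": "7.25"}]
    return pos_rows, erp_rows

def transform(pos_rows, erp_rows):
    """Transform: map both layouts onto one schema of (sku, price_cents)."""
    unified = {}
    for row in pos_rows:
        unified[row["sku"].strip().upper()] = row["price_cents"]
    for row in erp_rows:
        sku = row["ItemCode"].strip().upper()
        cents = int(Decimal(row["Price"]) * 100)   # decimal string to integer cents
        unified.setdefault(sku, cents)             # keep the first value seen per SKU
    return sorted(unified.items())

def load(conn, rows):
    """Load: write the unified rows into the warehouse table."""
    conn.execute("CREATE TABLE product_price (sku TEXT PRIMARY KEY, price_cents INTEGER)")
    conn.executemany("INSERT INTO product_price (sku, price_cents) VALUES (?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")   # in-memory database standing in for the warehouse
load(conn, transform(*extract()))
loaded = conn.execute("SELECT sku, price_cents FROM product_price ORDER BY sku").fetchall()
```

Keeping each stage in its own function makes it possible to swap a source or a transformation rule without touching the load step.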

Editorial Team at Geekinterview is a team of HR and Career Advice members led by Chandra Vennapoosa.
