Data Quality indicates how well the data in a data resource meets the information demands of the business. Data Quality includes data integrity, data accuracy, and data completeness.
Today’s business organizations cannot function at their best without relying on information. Data sources that supply this information, such as data warehouses and data marts, are fast becoming ubiquitous in business environments around the globe. These data warehouses and data marts work together with business intelligence systems so that companies get a picture of industry trends and how those trends relate to the performance of their business operations.
But inaccurate and inconsistent data is a great hindrance to a company's understanding of its current and future business situation. No matter how advanced the business intelligence system is, if there is no way to guarantee high data quality, the resulting information can lead to disastrous decisions as well as a variety of other negative effects such as lost profits, operational delays and customer dissatisfaction. As the saying goes, garbage in, garbage out: if the data is not accurate to begin with, the final output will not be accurate either, however sophisticated the algorithm that processes it.
An effective strategy for producing quality data should be integrated into data management. In fact, the primary goal of the data manager should be to ensure that the data source infrastructure can efficiently transform data from its raw state into consistent, timely, accurate and reliable information that the business organization can use. The foundation of data management can generally be divided into five aspects: data profiling, data quality, data integration, data augmentation and data monitoring.
Data Profiling is about inspecting data for errors, identifying inconsistencies, checking for data redundancy and filling in incomplete information. At this point, the database manager already has an overview of the data based on its profile.
Data Quality is the process in which data is corrected, standardized and verified. This process calls for meticulous inspection because any mistake at this point can propagate errors through every later step.
Data Integration is the process of matching, merging and linking data from a wide variety of sources, which usually reside on disparate platforms.
Data Augmentation is the process of enhancing data with information from internal and external data sources.
Finally, Data Monitoring is making sure that data integrity is checked and controlled over time.
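To make three of these aspects more concrete, here is a minimal sketch in Python of what profiling, a quality (standardization) pass and a simple monitoring metric might look like. The records, field names and the helper functions `profile`, `standardize` and `completeness` are all hypothetical and invented for illustration; real tools cover far more checks than this.

```python
from collections import Counter

# Hypothetical customer records; names and values are invented for illustration.
RECORDS = [
    {"id": 1, "name": "Ada Lovelace", "email": "ADA@Example.com ", "country": "uk"},
    {"id": 2, "name": "Alan Turing", "email": "alan@example.com", "country": "UK"},
    {"id": 3, "name": "Grace Hopper", "email": None, "country": "US"},
    {"id": 4, "name": "Ada Lovelace", "email": "ada@example.com", "country": "UK"},
]


def profile(records, key_field):
    """Data profiling: count missing values per field and find repeated keys."""
    missing = Counter()
    for rec in records:
        for field, value in rec.items():
            if value is None or str(value).strip() == "":
                missing[field] += 1
    key_counts = Counter(
        rec[key_field] for rec in records if rec[key_field] is not None
    )
    duplicates = sorted(k for k, n in key_counts.items() if n > 1)
    return {
        "rows": len(records),
        "missing": dict(missing),
        "duplicate_keys": duplicates,
    }


def standardize(record):
    """Data quality: normalize formatting so equivalent values compare equal."""
    cleaned = dict(record)  # copy, so the raw record is left untouched
    if cleaned["email"] is not None:
        cleaned["email"] = cleaned["email"].strip().lower()
    cleaned["country"] = cleaned["country"].strip().upper()
    return cleaned


def completeness(records, field):
    """Data monitoring: fraction of records with a non-empty value for a field."""
    if not records:
        return 1.0
    filled = sum(1 for rec in records if rec[field] not in (None, ""))
    return filled / len(records)


cleaned = [standardize(r) for r in RECORDS]
raw_report = profile(RECORDS, "email")      # profile before standardization
clean_report = profile(cleaned, "email")    # profile after standardization
email_completeness = completeness(cleaned, "email")

print(raw_report)
print(clean_report)
print(email_completeness)
```

In this sketch the duplicate e-mail address only becomes visible after standardization, which is one reason profiling is often repeated once the data has been cleaned. A monitoring job could recompute `email_completeness` on a schedule and raise an alert when it falls below an agreed threshold.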
In real-life implementations, data quality is a concern for professionals who work with a wide range of information systems. These professionals know the technical ins and outs of a variety of business solution systems, ranging from data warehousing and business intelligence to customer relationship management and supply chain management.
According to a 2002 study, the total cost of dealing with problems related to achieving and maintaining high data quality in the United States alone was estimated at about US$600 billion every year. This figure shows that data quality is a serious enough concern that many companies have begun to set up data governance teams solely dedicated to maintaining it.
Aside from the dedicated data quality teams that many companies have formed, software developers and vendors have also responded to the problem. Many software vendors today market tools for analyzing and repairing poor quality data. There are also many service providers that specialize in data cleaning on a contract basis, and data consultancy firms that offer advice on avoiding poor quality data.