Any data resource must meet both current and future demand for information. Data completeness is the property that ensures this requirement is met.
Data completeness indicates whether all the data necessary to meet the current and future business information demand are available in the data resource.
It deals with determining the data needed to meet the business information demand and ensuring those data are captured and maintained in the data resource so they are available when needed.
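One simple way to turn this definition into a number is to count, for each required field, the share of records that actually hold a value. The sketch below is only an illustration: the function name `completeness_report`, the sample `customers` records and the rule that `None` and the empty string count as missing are all assumptions, not part of any standard.

```python
def completeness_report(records, required_fields):
    """Return, per required field, the fraction of records holding a usable value.

    Assumption: a value counts as missing when the key is absent,
    or when the value is None or an empty string.
    """
    records = list(records)
    total = len(records)
    report = {}
    for field in required_fields:
        present = sum(1 for r in records if r.get(field) not in (None, ""))
        # An empty data set is reported as 0.0 rather than raising an error.
        report[field] = present / total if total else 0.0
    return report


# Hypothetical sample data, for illustration only.
customers = [
    {"id": 1, "name": "Ada", "email": "ada@example.com"},
    {"id": 2, "name": "Grace", "email": ""},
    {"id": 3, "name": None, "email": "linus@example.com"},
    {"id": 4, "name": "Edsger"},
]

report = completeness_report(customers, ["id", "name", "email"])
print(report)  # {'id': 1.0, 'name': 0.75, 'email': 0.5}
```

Which fields are "required" comes from the business information demand, so the list passed in would normally be derived from the data definitions rather than hard-coded.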
A data warehouse has six main processes. The data warehouse administrator should carry them out carefully in order to achieve data completeness. They include the following:
• Data Extraction – The data in the warehouse can come from many sources and in multiple formats and types, which may be incompatible from system to system. Extraction includes converting the disparate data types into one type understood by the warehouse. It also includes compressing the data and handling encryption where applicable.
• Data Transformation – This process includes data integration, denormalization, surrogate key management, data cleansing, conversion, auditing and aggregation.
• Data Loading – After the first two processes, the data is ready to be stored optimally in the data warehouse.
• Security Implementation – Sensitive data, such as bank records and credit card numbers, should be protected from unauthorized access. The data warehouse administrator implements access and data encryption policies.
• Job Control – This is the ongoing work of the data warehouse administrator and staff. It includes job definition, time-based and event-based job scheduling, logging, monitoring, error handling, exception handling and notification.
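The processes listed above can be sketched as a very small pipeline. This is a minimal illustration using only the Python standard library, not a description of how any particular warehouse product works: the `SOURCE_CSV` sample, the `sales` table, and the function names `extract`, `mask_card`, `transform`, `load` and `run_job` are all hypothetical, and masking a card number stands in for the much broader access and encryption policies a real warehouse needs.

```python
import csv
import io
import logging
import sqlite3

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("etl_job")

# Hypothetical source extract; a real job would read from source systems.
SOURCE_CSV = """customer_id,name,amount,card_number
1, Ada ,10.50,4111111111111111
2,Grace,n/a,5500000000000004
3,Linus,7,340000000000009
"""


def extract(text):
    """Extraction: parse the source format into plain dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def mask_card(number):
    """Security (simplified): keep only the last four digits visible."""
    return "*" * (len(number) - 4) + number[-4:]


def transform(rows):
    """Transformation: cleanse and convert; set aside rows that fail conversion."""
    clean, rejected = [], []
    for row in rows:
        try:
            clean.append((
                int(row["customer_id"]),
                row["name"].strip(),
                float(row["amount"]),
                mask_card(row["card_number"]),
            ))
        except ValueError as exc:
            # Error handling: log the bad row instead of stopping the job.
            log.warning("rejected row %r: %s", row, exc)
            rejected.append(row)
    return clean, rejected


def load(conn, rows):
    """Loading: store the transformed rows in the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(customer_id INTEGER, name TEXT, amount REAL, card TEXT)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def run_job(conn, text):
    """Job control: run the steps in order and log the outcome."""
    clean, rejected = transform(extract(text))
    load(conn, clean)
    log.info("loaded %d rows, rejected %d", len(clean), len(rejected))
    return len(clean), len(rejected)


conn = sqlite3.connect(":memory:")
loaded, rejected = run_job(conn, SOURCE_CSV)
```

Counting rejected rows matters for completeness: a row that fails transformation is data the warehouse does not have, so the reject count is something the administrator would want to monitor and be notified about.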
One measure of a data warehouse’s performance is the availability of useful data, which is also an indication of how well the business organization is reaching its own goals. All data is imperfect in some fashion and to some degree. It is not the goal of the data warehouse manager to pursue perfect data, since doing so consumes resources without creating appreciable value. Instead, the data warehouse manager and staff should devise strategies that provide substantial accuracy and timeliness of data at a reasonable cost, so as not to burden the company with extra expenses.
In most cases, data warehouses are available twenty-four hours a day, seven days a week. Regular updates are needed so that comprehensive data is gathered, extracted, loaded and shared within the data warehouse. Parallel and distributed servers aim at worldwide availability of data, so data completeness can be supported by investing in high-powered servers and robust software applications. Data warehouses are also designed for customer-level analysis, in addition to organizational-level analysis and reporting, so flexible tools should be implemented in the data warehouse database to accommodate new data sources and to support metadata. Reliability can be achieved when all of these are considered.
Success in achieving data completeness in a warehouse does not depend only on the current status of the database and its physical set-up. At the planning stage, every detail of the data warehouse should be carefully scrutinized. All other frameworks of the data warehouse should also be carefully planned, including the business architecture, business data, business schema, business activities, data model, critical success factors, metadata, comprehensive data definitions and other related aspects of organizational functions.
Complete data gives accurate guidance to the business organization’s decision makers. With complete data, the statistical reports that are generated will reflect an accurate status of the company, how it is faring against the trends and patterns in the industry, and how it can make innovative moves to gain an advantage over competitors.