Lightly Summarized Data

In lightly summarized data, the evaluational data is summarized by removing one or a few data characteristics from the primary key of the data focus.
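The idea of dropping a characteristic from the key can be illustrated with a minimal sketch. The field names (`customer_id`, `region`, `day`, `minutes`) and the `summarize` helper are hypothetical, chosen only for illustration, and summing the measure is just one possible way to aggregate.

```python
from collections import defaultdict

def summarize(rows, key_fields, drop_fields, measure):
    """Roll rows up by removing `drop_fields` from the key and summing `measure`."""
    # The summary key is the original key minus the dropped characteristics.
    kept = [f for f in key_fields if f not in drop_fields]
    totals = defaultdict(float)
    for row in rows:
        totals[tuple(row[f] for f in kept)] += row[measure]
    return kept, dict(totals)

# Hypothetical detail rows keyed by (customer_id, region, day).
detail = [
    {"customer_id": 1, "region": "east", "day": "2024-01-01", "minutes": 10.0},
    {"customer_id": 1, "region": "east", "day": "2024-01-02", "minutes": 5.0},
    {"customer_id": 2, "region": "west", "day": "2024-01-01", "minutes": 7.5},
]

# Removing "day" from the key collapses the daily rows into one row per customer and region.
kept, light = summarize(detail, ["customer_id", "region", "day"], ["day"], "minutes")
```

Removing more characteristics from the key (for example `customer_id` as well) would produce a more highly summarized result from the same detail rows.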

Any company implementing a data warehouse is investing a large amount of money in the hope of getting relevant information that will help it make sound decisions and gain a competitive edge in today’s data-driven business environment.

But a data warehouse, as can be expected of a system that handles very large volumes of data, is often implemented alongside many other databases hosted on various computer systems, which are called data sources, data stores or data marts.

All these systems supply disparate data to the warehouse, to be processed according to the business data architecture and business rules. As a result, the data warehouse is subject to very intensive loads, so it needs a mechanism by which it can serve its core purpose: delivering relevant information from among the millions of records inside it.

Aside from data archives, the systems of record, and the integration and transformation programs, the data warehouse also contains current detail and summarized data.

The heart of the data warehouse is its current detail. This is where the bulk of the data resides. The current detail is supplied directly from the operational systems and may be stored either as raw data or as aggregated raw data.

The current detail is often organized into subject areas that represent the entire enterprise rather than a given application. Among all the data in the warehouse, the current detail sits at the lowest level of granularity, that is, it holds the most detail.

The period covered by the current detail depends on the company's data architecture, but it is common for the current detail to cover about two to five years. The current detail is refreshed as frequently as necessary to support the requirements of the enterprise.

One of the most distinctive features of the current detail in particular, and of the data warehouse in general, is lightly summarized data. Enterprise elements such as regions, functions and departments do not all have the same information requirements, so an effectively designed and implemented data warehouse can supply customized lightly summarized data for each enterprise element. The enterprise elements can access both detailed and summarized data.

A data warehouse is designed and implemented so that data is stored and generated at many levels of granularity. To illustrate the different levels, let us imagine a cellular phone company that wants to implement a data warehouse in order to analyze user behavior.

At the finest level of granularity are the call detail records kept for every call a customer makes during a 30-day period. At the next level, the lightly summarized data history, monthly statistics are stored for each customer, such as calls by hour of day, calls by day of week, area codes of numbers called, average duration of calls and other related information.
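A minimal sketch of this roll-up from call detail records to a monthly, per-customer summary might look like the following. The record layout (`customer_id`, `start`, `duration_sec`, `called_number`) is an assumption made for illustration, as is treating the first three digits of the called number as the area code.

```python
from collections import Counter, defaultdict
from datetime import datetime

def lightly_summarize(cdrs):
    """Group call detail records by (customer_id, 'YYYY-MM') and compute monthly statistics."""
    groups = defaultdict(list)
    for cdr in cdrs:
        start = datetime.fromisoformat(cdr["start"])
        groups[(cdr["customer_id"], start.strftime("%Y-%m"))].append((start, cdr))

    summary = {}
    for key, items in groups.items():
        durations = [c["duration_sec"] for _, c in items]
        summary[key] = {
            "call_count": len(items),
            "calls_by_hour": dict(Counter(s.hour for s, _ in items)),
            # weekday(): 0 is Monday, 6 is Sunday
            "calls_by_weekday": dict(Counter(s.weekday() for s, _ in items)),
            # Assumption: the area code is the first three digits of the number.
            "area_codes": dict(Counter(c["called_number"][:3] for _, c in items)),
            "avg_duration_sec": sum(durations) / len(durations),
        }
    return summary

# Hypothetical call detail records for one customer.
cdrs = [
    {"customer_id": "C1", "start": "2024-03-04T09:15:00", "duration_sec": 120, "called_number": "2125550101"},
    {"customer_id": "C1", "start": "2024-03-04T09:45:00", "duration_sec": 60, "called_number": "2125550199"},
    {"customer_id": "C1", "start": "2024-03-09T20:05:00", "duration_sec": 300, "called_number": "4155550123"},
]
monthly = lightly_summarize(cdrs)
```

Each summary row is keyed by customer and month only: the individual call timestamp has been removed from the key, which is what makes the result lightly summarized rather than detailed.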

Finally, at the highly summarized level, which is the next level of granularity, the records may include the number of calls made from a zip code by all customers, roaming call activity and the customer churn rate, all of which can be used for other statistical analyses.
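As a rough sketch of the highly summarized level, the example below totals calls per zip code across all customers and computes a simple churn rate. The row layout (`zip_code`, `call_count`) is hypothetical, and the churn definition used here (customers active at the start of a period who are no longer active at its end, divided by those active at the start) is one common convention assumed for illustration, not something the scenario prescribes.

```python
from collections import defaultdict

def calls_by_zip(monthly_rows):
    """Total calls per zip code across all customers."""
    totals = defaultdict(int)
    for row in monthly_rows:
        totals[row["zip_code"]] += row["call_count"]
    return dict(totals)

def churn_rate(active_at_start, active_at_end):
    """Share of customers active at the start who are no longer active at the end."""
    if not active_at_start:
        return 0.0
    lost = active_at_start - active_at_end
    return len(lost) / len(active_at_start)

# Hypothetical per-customer monthly rows; the customer is dropped from the key in the roll-up.
monthly_rows = [
    {"customer_id": "C1", "zip_code": "10001", "call_count": 40},
    {"customer_id": "C2", "zip_code": "10001", "call_count": 25},
    {"customer_id": "C3", "zip_code": "94105", "call_count": 10},
]
zip_totals = calls_by_zip(monthly_rows)

# C4 left during the period and C5 joined; only C4 counts toward churn.
rate = churn_rate({"C1", "C2", "C3", "C4"}, {"C1", "C2", "C3", "C5"})
```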

Data warehouses are typically implemented with different databases handling the different levels of data granularity, such as raw data, lightly summarized data and highly summarized data, within a large information system built on a federated database. Lightly summarized data retains relatively fine granularity. For maximum efficiency, a stable network infrastructure should be in place.

Editorial Team at Geekinterview is a team of HR and Career Advice members led by Chandra Vennapoosa.
