What Is a Data Dimension?

Data dimensions are mainly used in data warehouse implementations. A data warehouse is implemented so that an organization can profit from data-driven operations, which are a major component of running a business these days.

To be effective, a data-driven operation needs data that is both accurate and timely, since that data is the basis for the statistical results used in trend analysis. To achieve timeliness, a company can invest in top-of-the-line server hardware, including fast computers and network equipment, a task that is relatively easy as long as there is money.

To achieve accuracy, however, the data warehouse should rest on a carefully planned data architecture based on real-life business rules. This process is not only expensive; it also takes a great deal of time and careful attention to the smallest details, so that the data architecture reflects real-life business operations.

The data dimensions employed in a data warehouse are designed to compartmentalize the data in the warehouse. A data dimension gives structured labels to otherwise unordered numeric measures. To illustrate, think of a sales receipt, which may contain several pieces of information. Items such as "Date", "Customer Name" and "Product" are all data dimensions that give meaning to the amounts on the receipt. A data dimension is to some degree similar to a categorical variable in statistics.

A data dimension has three main functions: filtering, grouping and labeling. For instance, in a company data warehouse, each person, whether a client, a staff member or a company official, may be categorized by gender: male, female or unknown. If a data consumer wants a report for one gender category, say, all males, the dimension gives the data warehouse a fast and efficient means of sifting through the bulk of its data.
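The three functions can be sketched in a few lines of Python. This is only an illustration: the sample rows, the `amount` measure, the single-letter gender codes and the helper names `filter_by` and `total_by_gender` are all assumptions made for the example, not part of any particular warehouse product.

```python
from collections import defaultdict

# Hypothetical sample rows: "gender" is the dimension, "amount" is a numeric measure.
people = [
    {"name": "Ana", "gender": "F", "amount": 120.0},
    {"name": "Ben", "gender": "M", "amount": 80.0},
    {"name": "Cho", "gender": "U", "amount": 50.0},
    {"name": "Dev", "gender": "M", "amount": 20.0},
]

# Labeling: map the stored codes to human-readable labels (assumed codes).
GENDER_LABELS = {"M": "Male", "F": "Female", "U": "Unknown"}


def filter_by(rows, gender_code):
    """Filtering: keep only the rows that match one dimension value."""
    return [row for row in rows if row["gender"] == gender_code]


def total_by_gender(rows):
    """Grouping: sum the measure per dimension value, keyed by its label."""
    totals = defaultdict(float)
    for row in rows:
        totals[GENDER_LABELS[row["gender"]]] += row["amount"]
    return dict(totals)


males = filter_by(people, "M")
totals = total_by_gender(people)
print([row["name"] for row in males])
print(totals)
```

In a real warehouse the same three operations would typically be expressed as a WHERE clause, a GROUP BY clause and a column selected for display.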

In general, each dimension in the data warehouse can have one or more hierarchies. For example, the "Date" dimension may contain several hierarchies, such as Day > Month > Year or Week > Year. How the hierarchies in a data dimension are laid out is up to the design of the data warehouse.
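As a rough sketch of how the two "Date" hierarchies above might be used, the following Python derives the hierarchy levels for a calendar date and rolls a measure up to a chosen level. The sample sales figures and the names `date_dimension_row` and `roll_up` are assumptions for illustration, and the week level assumes ISO week numbering.

```python
from collections import Counter
from datetime import date


def date_dimension_row(d):
    """Derive the hierarchy levels of a hypothetical Date dimension for one day."""
    iso_year, iso_week, _ = d.isocalendar()
    return {
        "day": d.isoformat(),                   # Day level
        "month": f"{d.year}-{d.month:02d}",     # Day > Month
        "year": d.year,                         # Day > Month > Year
        "week": f"{iso_year}-W{iso_week:02d}",  # Week level (ISO weeks assumed)
    }


# Hypothetical measures: (sale date, amount).
sales = [
    (date(2024, 1, 15), 100),
    (date(2024, 1, 20), 50),
    (date(2024, 2, 3), 70),
]


def roll_up(rows, level):
    """Aggregate the measure at one level of a hierarchy."""
    totals = Counter()
    for d, amount in rows:
        totals[date_dimension_row(d)[level]] += amount
    return dict(totals)


by_month = roll_up(sales, "month")
by_year = roll_up(sales, "year")
by_week = roll_up(sales, "week")
```

Here `by_month` and `by_year` follow the Day > Month > Year hierarchy, while `by_week` follows the Week > Year hierarchy over the same underlying days.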

A concept in data warehousing called the "role-playing dimension" applies when the same data dimension is reused in several roles, for example by multiple applications sharing the same database. The "Date" dimension, for instance, can serve as "Date of Delivery" as well as "Date of Sale" or "Date of Hire". This can help the database save storage space.
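One common way to realize a role-playing dimension is to store the "Date" dimension once and join to it several times under different aliases. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the table and column names (`dim_date`, `fact_order`, `sale_date_key`, `delivery_date_key`) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- A single physical Date dimension (hypothetical schema).
CREATE TABLE dim_date (
    date_key  INTEGER PRIMARY KEY,
    full_date TEXT
);

-- The fact table references the same dimension in two roles.
CREATE TABLE fact_order (
    order_id          INTEGER PRIMARY KEY,
    sale_date_key     INTEGER REFERENCES dim_date(date_key),
    delivery_date_key INTEGER REFERENCES dim_date(date_key),
    amount            REAL
);

INSERT INTO dim_date VALUES (20240115, '2024-01-15'), (20240118, '2024-01-18');
INSERT INTO fact_order VALUES (1, 20240115, 20240118, 99.5);
""")

# Each alias of dim_date plays a different role in the query.
row = conn.execute("""
    SELECT f.order_id, sale.full_date, delivery.full_date
    FROM fact_order AS f
    JOIN dim_date AS sale     ON sale.date_key = f.sale_date_key
    JOIN dim_date AS delivery ON delivery.date_key = f.delivery_date_key
""").fetchone()
print(row)
```

Because only one `dim_date` table exists, the date attributes are stored once no matter how many roles the dimension plays.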

A dimension table is used in a data warehouse as one of a set of companion tables to a fact table, which contains the business facts (measures). A dimension table contains attributes, or fields, that are used to constrain and group data when performing a query.
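A minimal sketch of a fact table with one companion dimension table, again using Python's built-in `sqlite3` module. The schema (`dim_product`, `fact_sales`) and the sample rows are assumptions for illustration. The query uses a dimension attribute both as a constraint (the WHERE clause) and for grouping (the GROUP BY clause).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension table: descriptive attributes (hypothetical schema).
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY,
    name        TEXT,
    category    TEXT
);

-- Fact table: numeric measures plus a key into the dimension.
CREATE TABLE fact_sales (
    product_key INTEGER REFERENCES dim_product(product_key),
    quantity    INTEGER,
    amount      REAL
);

INSERT INTO dim_product VALUES
    (1, 'Pen', 'Stationery'),
    (2, 'Notebook', 'Stationery'),
    (3, 'Mug', 'Kitchen');

INSERT INTO fact_sales VALUES
    (1, 10, 15.0),
    (2, 2, 8.0),
    (3, 1, 6.5),
    (1, 5, 7.5);
""")

# Constrain by one dimension attribute (category), group by another (name).
rows = conn.execute("""
    SELECT p.name, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_key = f.product_key
    WHERE p.category = 'Stationery'
    GROUP BY p.name
    ORDER BY p.name
""").fetchall()
print(rows)
```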

Another related term in data warehousing is the degenerate dimension. This is a dimension stored in the fact table itself, without a dimension table of its own. It is generally used when the grain of the fact table represents transaction-level data and a user wants to maintain specific system identifiers such as invoice or order numbers. When one wants a direct reference back to a transactional system without the overhead of maintaining a separate dimension table, a degenerate dimension is the way to go.
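To make the idea concrete, here is a small Python sketch in which the invoice number lives directly on line-item fact rows and is still usable for grouping. The rows, the `invoice_number` field and the helper `total_per_invoice` are hypothetical.

```python
from collections import defaultdict

# Fact rows at line-item grain. "invoice_number" is the degenerate dimension:
# it is stored on the fact row and has no dimension table of its own.
# "product_key" would point at an ordinary dimension table (not shown).
fact_sales_lines = [
    {"invoice_number": "INV-1001", "product_key": 1, "amount": 15.0},
    {"invoice_number": "INV-1001", "product_key": 3, "amount": 6.5},
    {"invoice_number": "INV-1002", "product_key": 2, "amount": 8.0},
]


def total_per_invoice(lines):
    """Group line items by the degenerate dimension and sum the measure."""
    totals = defaultdict(float)
    for line in lines:
        totals[line["invoice_number"]] += line["amount"]
    return dict(totals)


invoice_totals = total_per_invoice(fact_sales_lines)
print(invoice_totals)
```

The invoice number also serves as the reference back to the transactional system, with no extra table to load or maintain.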

Editorial Team at Geekinterview is a team of HR and Career Advice members led by Chandra Vennapoosa.
