Star Schema

The star schema, which is sometimes called a star join schema, is one of the most simple styles of a data warehouse schema. It consists of a few fact tables that reference any number of dimension tables. The facts tables hold the main data with the typically smaller dimension tables describing each individual value of a dimension.

Star Schema is characterized by

  • simplicity
  • allows easy navigation
  • has rapid response time

Normalization is not a goal of star schema design. Star schemas are usually divided into fact tables and dimensional tables, where the dimensional tables supply supporting information. 

A fact table contains a compound primary key that consists of aggregate of relevant dimension keys while a dimension table has a simple primary key.

A dimension table is also commonly in the second normal form as it consolidates redundant data while a fact table is commonly in the third normal form as all data depend on either one dimension or all of them and not just on combinations of a few dimensions.

A star schema is a very important aspect in a data warehouse implementation in that it is the best way to implement a multi-dimensional database by using any common mainstream relational database. It is also a very simple method and from the perspective of the users, the queries are simple because the only joins and conditions involve a fact table and a single level of dimension tables, without the indirect dependencies to other tables that are possible in a better normalized snowflake schema.

Making a star schema for a database may relatively easy but it still very important to make some investments in time and research because the schema’s effect on the usability and performance of the database is very important in the long run.

In a data warehouse implementation, the creation of the star schema database is one of the most important and often the final process in implementing the data warehouse. A star schema has also a significant importance in some business intelligence processes such as on-line transaction processing (OLTP) system and the on-line analytical processing (OLAP).

On-line Transaction Processing is a standard, normalized database structure and as the name implies, it is used for transactions, which means that involves database table inserts, updates, and deletes must be fast. For instance, let us take the scenario in the organization’s call center.

Several call center agents continuously take calls and enter order typically involving numerous items which must be stored immediately in the database. This makes the scenario very critical and that the speed of inserts, updates and deletes should be maximized. In order to optimize the performance, the database should hold as few records as possible at any given time.

On the other hand, On-line Analytical Processing, though this may mean many different things to different people, are many for analyzing corporate data. But in some cases, the terms OLAP and star schema are used interchangeably. But a more precise way of thinking would be to think of a star schema database is an OLAP system which can be any system of read-only, historical, aggregated data

The same OLAP and OLTP can be optimized with a star schema in a data warehouse implementation. Since a data warehouse is the main repository of a company’s historical data, it naturally contains very high volumes of data which can be used for analysis with OLAP. Querying these data may take a long time but with the help of star schema in implementation, the access time may be made faster and more efficient.

Editorial Team at Geekinterview is a team of HR and Career Advice members led by Chandra Vennapoosa.

Editorial Team – who has written posts on Online Learning.


Pin It