Normalized data are the data in the data view schema and the external schema that have gone through data normalization.
Maintaining a data warehouse means dealing with millions of records, as the data warehouse is the main repository of a company's historical data, its corporate memory. Data should therefore be well managed, and one of the ways to manage a data warehouse effectively is to reduce data redundancy.
One technique commonly employed for reducing data redundancy is database normalization. This technique is used mainly for relational database tables to minimize the duplication of data; in doing so, it safeguards the database against certain structural or logical problems such as data anomalies.
Consider the case in which a certain piece of information has multiple instances in a table. When the table is updated, those instances may not be kept consistent with one another, which leads to a loss of data integrity. A normalized table is less prone to such data integrity problems.
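The inconsistency described above can be reproduced in a few lines. The following is a minimal sketch, assuming an in-memory SQLite database and hypothetical tables (orders_flat, customers, orders) invented purely for illustration; it is not taken from any particular warehouse design.

```python
import sqlite3


def cities_in_flat_table():
    """Update one of Ana's rows in a denormalized table and list her cities."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical flat table: the customer's city is repeated on every order.
    conn.execute(
        "CREATE TABLE orders_flat "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, city TEXT)"
    )
    conn.executemany(
        "INSERT INTO orders_flat VALUES (?, ?, ?)",
        [(1, "Ana", "Lisbon"), (2, "Ana", "Lisbon"), (3, "Ben", "Oslo")],
    )
    # The update reaches only one of the two rows that store Ana's city.
    conn.execute("UPDATE orders_flat SET city = 'Porto' WHERE order_id = 1")
    rows = conn.execute(
        "SELECT DISTINCT city FROM orders_flat "
        "WHERE customer = 'Ana' ORDER BY city"
    ).fetchall()
    conn.close()
    return [city for (city,) in rows]


def cities_in_normalized_tables():
    """Store each city once, update it once, and list Ana's cities via a join."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (
            customer_id INTEGER PRIMARY KEY, name TEXT, city TEXT);
        CREATE TABLE orders (
            order_id INTEGER PRIMARY KEY,
            customer_id INTEGER REFERENCES customers(customer_id));
        INSERT INTO customers VALUES (1, 'Ana', 'Lisbon'), (2, 'Ben', 'Oslo');
        INSERT INTO orders VALUES (1, 1), (2, 1), (3, 2);
    """)
    # A single-row update is seen by every order that refers to this customer.
    conn.execute("UPDATE customers SET city = 'Porto' WHERE customer_id = 1")
    rows = conn.execute("""
        SELECT DISTINCT c.city
        FROM orders AS o
        JOIN customers AS c ON c.customer_id = o.customer_id
        WHERE c.name = 'Ana'
    """).fetchall()
    conn.close()
    return [city for (city,) in rows]


print(cities_in_flat_table())         # two conflicting cities for Ana
print(cities_in_normalized_tables())  # a single city for Ana
```

In the flat table the same fact lives in two rows, so a partial update leaves them disagreeing; in the normalized layout the fact is stored once and cannot contradict itself.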
When a database is normalized to a high degree, more tables are created to avoid data redundancy within any one table, but queries then need a larger number of joins, and this can result in reduced performance.
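To make the join overhead concrete, here is a small sketch, again assuming SQLite and hypothetical customers, products and orders tables. Reading back one human-readable order report from three normalized tables takes two joins, whereas a single flat table would need none.

```python
import sqlite3


def build_order_report():
    """Return (order_id, customer name, product title) rows from three tables."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical normalized schema: names and titles are each stored once.
    conn.executescript("""
        CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE products (product_id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE orders (
            order_id INTEGER PRIMARY KEY,
            customer_id INTEGER REFERENCES customers(customer_id),
            product_id INTEGER REFERENCES products(product_id));
        INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben');
        INSERT INTO products VALUES (10, 'Lamp'), (11, 'Desk');
        INSERT INTO orders VALUES (100, 1, 10), (101, 1, 11), (102, 2, 10);
    """)
    # Two joins are needed to turn the stored keys back into readable values.
    report = conn.execute("""
        SELECT o.order_id, c.name, p.title
        FROM orders AS o
        JOIN customers AS c ON c.customer_id = o.customer_id
        JOIN products AS p ON p.product_id = o.product_id
        ORDER BY o.order_id
    """).fetchall()
    conn.close()
    return report


for row in build_order_report():
    print(row)
```

Each additional level of normalization tends to add another table and another join to queries like this one, which is the performance trade-off described above.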
As a general rule, database applications that involve many isolated transactions, such as those in an automated teller machine, need to be more highly normalized, while database applications that do not need to map complex relationships can use less normalized tables.
The degrees of normalization of database tables are described in database theory in terms of normal forms. The normal forms include first normal form, second normal form, third normal form, Boyce-Codd normal form, fourth normal form, fifth normal form, domain/key normal form and sixth normal form.
As mentioned earlier, normalized data are the data used for the data view schema. A data view is a logical or virtual table composed of the results of a query on a database. Unlike an ordinary table in a relational database, a data view is not part of the physical schema; it is a dynamic, virtual table whose contents come from collated or computed data.
A data view can be a subset of the data contained in a table, and it can join and simplify several tables into one virtual table. Its contents may also be aggregated from different tables through computations such as averages, products and sums.
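The dynamic, computed nature of a data view can be sketched as follows. The example assumes SQLite and a hypothetical sales table with a region_totals view; both names are illustrative only. The view stores no rows of its own, so its sums and averages change as soon as the underlying table does.

```python
import sqlite3


def region_totals_before_and_after():
    """Query an aggregating view, add a sale, and query the view again."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE sales (
            sale_id INTEGER PRIMARY KEY, region TEXT, amount REAL);
        INSERT INTO sales VALUES
            (1, 'North', 100.0), (2, 'North', 300.0), (3, 'South', 50.0);
        -- The view holds only the query; results are computed when it is read.
        CREATE VIEW region_totals AS
            SELECT region, SUM(amount) AS total, AVG(amount) AS average
            FROM sales
            GROUP BY region;
    """)
    query = "SELECT region, total, average FROM region_totals ORDER BY region"
    before = conn.execute(query).fetchall()
    # Insert into the base table only; the view itself is never modified.
    conn.execute("INSERT INTO sales VALUES (4, 'South', 150.0)")
    after = conn.execute(query).fetchall()
    conn.close()
    return before, after


before, after = region_totals_before_and_after()
print(before)
print(after)
```

The second result reflects the new sale even though only the base table was written to, which is what distinguishes a view from a physical table holding a copy of the aggregates.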
As also mentioned earlier, normalized data are used for the external schema as well. An external schema is designed to support user views of data and to give programmers easy access to data from a database.
The data that users see are presented in terms of an external view, which is defined by an external schema of the database. The external schema basically consists of descriptions of each of the various types of external record in that external view, together with a definition of the mapping between the external schema and the underlying conceptual schema.
Because redundancies in normalized data are greatly reduced or eliminated, the database and the entire information system become much easier to manage and maintain. If redundant data were scattered throughout the system, additional overhead, such as the purchase of extra hardware and software, would be needed to ensure data consistency and integrity.