Data warehouses are often managed by more than one server, because a single machine cannot handle the volume of data involved. In the past, mainframes were used for processes involving large volumes of data. Mainframes were giant computers housed in large rooms and used for critical, data-heavy applications such as census processing, consumer statistics, enterprise resource planning and financial transaction processing.
In today's business environment, data management has become a critical part of a company's operations. Customers and company staff access data through a combination of laptops, desktops and terminal services. This poses a serious challenge for data administrators, who must make data available to its consumers in a consistent and timely fashion.
Data distribution partially answers these challenges. At first, data distribution may seem to contradict data normalization, the relational database process of eliminating redundant data to avoid inconsistencies and to reduce spending on expensive storage devices. Because data distribution is basically a way of replicating data, it can give the impression of defeating the purpose of database normalization.
In data distribution, however, the replicated data are stored in a separate database on another computer, and that database is itself normalized. It may be a mirror of a database on another computer system. In a distributed system, several computers run a program, each using data from its own database, and these computers continuously update one another to keep their data synchronized. This is useful in several ways. On the one hand, sharing the processing load can mean faster and more efficient computing.
On the other hand, a distributed computing system is less prone to failure. If one of the computers in the distributed system fails, another can take over its load. Centralized computing, as in the case of mainframes, is different: if the mainframe breaks down, the whole system goes down with it.
When a mainframe fails, business operations can come to a halt until it is fixed, or the company may temporarily switch to a manual system while waiting for the mainframe to come back up. The cost of this manual period often falls on the system administrator, whose work piles up as the data accumulated in the meantime must later be entered into the system.
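The replication and failover behavior described above can be illustrated with a minimal in-memory sketch. It is only an illustration under stated assumptions: the names Node, Cluster, write, read and recover are hypothetical, every node is assumed to hold a full copy of the data, and synchronization is simplified to copying each write to every live node. A real data warehouse would rely on its database's own replication features instead.

```python
class Node:
    """One database server holding a full copy of the data (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.data = {}      # stands in for the node's own database
        self.alive = True   # False simulates a failed server


class Cluster:
    """A group of nodes that keep each other synchronized (hypothetical)."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def write(self, key, value):
        # Replicate the write to every node that is currently up.
        live = [n for n in self.nodes if n.alive]
        if not live:
            raise RuntimeError("no live nodes")
        for node in live:
            node.data[key] = value

    def read(self, key):
        # Failover: serve the read from the first node that is up.
        for node in self.nodes:
            if node.alive:
                return node.name, node.data.get(key)
        raise RuntimeError("no live nodes")

    def recover(self, node):
        # Resynchronize a repaired node from a live peer, then bring it back.
        for peer in self.nodes:
            if peer.alive and peer is not node:
                node.data = dict(peer.data)
                break
        node.alive = True


a, b = Node("a"), Node("b")
cluster = Cluster([a, b])
cluster.write("order-1", "paid")

a.alive = False                      # node "a" fails
cluster.write("order-2", "shipped")  # the system keeps accepting writes
print(cluster.read("order-1"))       # served by node "b"

cluster.recover(a)                   # "a" catches up on what it missed
print(sorted(a.data))
```

In this sketch, the failure of node "a" does not stop reads or writes, which is the contrast with the single mainframe: the surviving node carries the load, and the failed node copies the missed updates when it returns.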
Data distribution can be a costly undertaking, but when properly managed it can be very beneficial to a company implementing a data warehouse.