
Distributed Databases


The Three Expenses in Distributed Databases


Essentially, replication entails keeping exact copies of a database on many computers and updating every copy simultaneously whenever an update is performed on any one of them. The pitfalls of this process fall under three expenses: financial, time, and data.


Replication has a financial expense because every server, every hard drive, every battery-backed RAID card, every network switch, every fast network connection, every battery-backed power supply, and every other piece of associated hardware must be purchased. On top of that come the costs of bandwidth, maintenance, backup servers, co-location, remote management, and many other things. For a decent-sized database, the total could very easily run into the tens of thousands of dollars for the servers alone, before any of the other items on the list are counted.


Replication has a time expense because each operation performed on one node’s database must be performed on every other node’s database simultaneously. Before the operation can be said to be committed, every other node must verify that the operation succeeded in its own database. A commit therefore takes at least as long as the slowest node takes to respond, which can produce considerable lag in the interface to the database.
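To make the time expense concrete, here is a minimal sketch of how commit latency could be estimated when every node must confirm an operation. The function names and the timing figures are hypothetical, chosen only for illustration; real latencies depend on the network and on the database system.

```python
def commit_latency_parallel(local_ms, peer_round_trips_ms):
    """Estimated commit time when all peers are contacted at once.

    The commit cannot finish until the slowest peer has confirmed,
    so the slowest round trip is added to the local work.
    """
    return local_ms + max(peer_round_trips_ms, default=0.0)


def commit_latency_sequential(local_ms, peer_round_trips_ms):
    """Estimated commit time when peers are contacted one after another."""
    return local_ms + sum(peer_round_trips_ms)


# Hypothetical figures: 5 ms of local work, two nearby peers and one remote peer.
round_trips = [1.5, 2.0, 80.0]
parallel_ms = commit_latency_parallel(5.0, round_trips)
sequential_ms = commit_latency_sequential(5.0, round_trips)
print(parallel_ms, sequential_ms)
```

Even in the parallel case, a single distant or overloaded node sets the pace for every commit.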


Finally, replication has a data expense because every time the database is replicated, another hard drive (or two, or more) fills up with a copy of its data. Then, every time one node gets a request to update that data, it must transmit one request to each of the other nodes, and each of those nodes must send a confirmation back to the node that requested the update. That means a lot of data is flying around among the database nodes, which, in turn, means ample bandwidth must be available to handle it.
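The message counts described above can be worked out with a short calculation. This is a rough sketch under simple assumptions: one request and one confirmation per peer per update, with hypothetical message sizes. The function name `replication_traffic` and the byte figures are illustrative, not taken from any particular database.

```python
def replication_traffic(nodes, updates, payload_bytes, ack_bytes=64):
    """Count the messages and bytes exchanged to replicate a batch of updates.

    Assumes each update is sent once to every other node and that every
    other node answers with one confirmation.
    """
    if nodes < 1:
        raise ValueError("a cluster needs at least one node")
    peers = nodes - 1
    requests = updates * peers
    confirmations = updates * peers
    total_bytes = requests * payload_bytes + confirmations * ack_bytes
    return {
        "requests": requests,
        "confirmations": confirmations,
        "bytes": total_bytes,
    }


# Hypothetical workload: 5 nodes, 1000 updates of 512 bytes each.
traffic = replication_traffic(nodes=5, updates=1000, payload_bytes=512)
print(traffic)
```

Under these assumptions the traffic grows with the number of peers for every single update, so adding nodes raises the bandwidth bill as well as the hardware bill.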


How to Initiate Replication

Many of the more popular databases support some sort of native replication. MySQL, for example, has built-in replication: the GRANT REPLICATION SLAVE statement gives an account the privilege to replicate, and replication itself is then configured and started on the replica with statements such as CHANGE MASTER TO and START SLAVE. PostgreSQL historically relied on external software for replication, most often Slony-I, a comprehensive replication suite; current PostgreSQL releases also include built-in streaming and logical replication. Each database platform has a different method for initiating replication services, so it is best to consult that platform’s manual before implementing a replication solution.
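As a rough illustration of what initiating replication involves, the sketch below assembles the statements used in a classic MySQL source/replica setup. The helper name, host, user, and log coordinates are hypothetical placeholders, and newer MySQL 8.0 releases prefer CHANGE REPLICATION SOURCE TO and START REPLICA, so treat this as an outline to check against the manual for your version rather than a ready-made procedure.

```python
def mysql_replication_statements(source_host, repl_user, log_file, log_pos):
    """Build the SQL for a classic MySQL source/replica setup.

    Returns two lists: statements to run on the source server and
    statements to run on the replica. '<password>' is left as a
    placeholder to be filled in by the administrator. Inputs are
    interpolated as-is, so only pass trusted values.
    """
    source_sql = [
        # Create a dedicated account and allow it to read the binary log.
        f"CREATE USER '{repl_user}'@'%' IDENTIFIED BY '<password>';",
        f"GRANT REPLICATION SLAVE ON *.* TO '{repl_user}'@'%';",
    ]
    replica_sql = [
        # Point the replica at the source and at a binary log position.
        "CHANGE MASTER TO "
        f"MASTER_HOST='{source_host}', "
        f"MASTER_USER='{repl_user}', "
        "MASTER_PASSWORD='<password>', "
        f"MASTER_LOG_FILE='{log_file}', "
        f"MASTER_LOG_POS={log_pos};",
        # Replication only begins once this statement runs.
        "START SLAVE;",
    ]
    return source_sql, replica_sql


# Hypothetical values for illustration only.
source_sql, replica_sql = mysql_replication_statements(
    "db1.example.com", "repl", "mysql-bin.000001", 4
)
for statement in source_sql + replica_sql:
    print(statement)
```

Note that granting the privilege is only the first step; nothing is copied until the replica has been pointed at the source and started.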


Considerations

When implementing a distributed database, one must take care to properly weigh the advantages and disadvantages of the distribution. Distributing a database is a complicated and sometimes expensive task, and it may not be the best solution for every project. On the other hand, with some spare equipment and a passionate database developer, distributing a database could be a relatively simple and straightforward task.


The most important thing to consider is how extensively your database system supports distribution. PostgreSQL, MySQL, and Oracle, for example, have a number of native and external methods for distributing their databases, but not all database systems are so robust or so focused on providing a distributed service. Before committing to a database system, research whether it supports the sort of distribution your project requires.


The field of distributed database management is relatively young, so the appropriate distribution model for a particular task may not be readily available. In a situation like this, designing one’s own distributed database system may be the best development option.


Regardless of the approach taken, distributing a database can be very rewarding because of the scalability and reliability it adds to a system.



