Dual Data Partitioning

Dual data partitioning is common in data distribution. Data partitioning is the task of determining which data subjects, data occurrence groups, and data characteristics are needed at each data site. It allocates data to the data sites in an orderly manner, within a single common data architecture.

In Dual Data Partitioning, both data occurrence and characteristic partitioning are performed on the same data subject.
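As a rough illustration, the sketch below applies both kinds of partitioning to one "person" data subject. It assumes that occurrence partitioning selects which records a site receives and that characteristic partitioning selects which attributes of those records it receives. The sample records, the `SiteSpec` class, and the `dual_partition` function are hypothetical names invented for this example, not part of any particular product.

```python
from dataclasses import dataclass

# Hypothetical "person" data subject held at a central location.
PEOPLE = [
    {"id": 1, "name": "Ana", "age": 34, "region": "east", "job": "sales"},
    {"id": 2, "name": "Ben", "age": 51, "region": "west", "job": "support"},
    {"id": 3, "name": "Cy", "age": 28, "region": "east", "job": "support"},
]


@dataclass(frozen=True)
class SiteSpec:
    """What one data site needs (illustrative assumption)."""
    region: str             # occurrence rule: which records the site gets
    characteristics: tuple  # characteristic rule: which attributes it gets


def dual_partition(records, spec):
    """Apply occurrence and characteristic partitioning in one pass."""
    return [
        {key: record[key] for key in spec.characteristics}
        for record in records
        if record["region"] == spec.region
    ]


east_site = SiteSpec(region="east", characteristics=("id", "name", "job"))
east_data = dual_partition(PEOPLE, east_site)
```

Here the east site would receive only the records for its own region, and only the `id`, `name`, and `job` attributes of each, so less data has to travel to that site than if the whole data subject were copied.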

Data partitioning also refers to logically and/or physically dividing data into segments that are easier to access and maintain, which supports the organization's information system as a whole.

Today, most modern relational database management systems provide this kind of distribution functionality, where data partitioning can help both overall performance and utility processing.

Dual partitioning is more advanced than ordinary data partitioning, which partitions data subjects, data occurrence groups, and data characteristics one at a time.

Because dual data partitioning applies both kinds of partitioning at once, more data is processed and more network resources can be consumed. Dual data partitioning is common in implementations of a data distribution service.

The data characteristic partition is the aspect of the data subject that contains the entities and attributes pertaining to that subject. For example, if the data subject is "person", the data characteristic partition includes the digital representation of the person’s name, age, address, marital status, job, and other information. The data occurrence partition, on the other hand, is the aspect of the data subject that contains records of where the "person" entity has occurred.

For instance, a person who has many transactions with the company may be found in many places in the database within the data warehouse, and a salesperson associated with many customers may occur in many tables in the databases. Separating these two aspects of the data subject can boost the speed and performance of the database.

The Data Distribution Service (DDS) is a publish/subscribe middleware specification for real-time distributed systems. It was created in response to the need to augment the Common Object Request Broker Architecture (CORBA) with a data-centric publish-subscribe specification.

CORBA enables software components that are written in different programming languages and run on different platforms and computers to work together in an information system. CORBA is a standard defined by the Object Management Group (OMG).

A Data Distribution Service can be provided by different kinds of software, including networking middleware that offers a real-time publish-subscribe communication model and lets distributed processes share data without concern for the physical location or architecture of their peers. Such software also supports both best-effort and reliable communication, including reliable multicast and client-server communication.
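The publish-subscribe model described above can be sketched in a few lines. This is only an in-process illustration of the idea that publishers and subscribers know a topic name rather than each other. It is not the OMG DDS API, and it has no networking, discovery, or reliability settings. The `TopicBus` class and the topic name are hypothetical.

```python
class TopicBus:
    """Toy in-process publish-subscribe bus (illustrative, not DDS)."""

    def __init__(self):
        # Maps a topic name to the callbacks subscribed to it.
        self._subscribers = {}

    def subscribe(self, topic, callback):
        """Register a callback to receive every sample published on topic."""
        self._subscribers.setdefault(topic, []).append(callback)

    def publish(self, topic, sample):
        """Deliver sample to all subscribers of topic; return the count."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(sample)
        return len(callbacks)


bus = TopicBus()
received = []

# The subscriber only names the topic; it does not know who publishes.
bus.subscribe("person/east", received.append)

# The publisher only names the topic; it does not know who is listening.
delivered = bus.publish("person/east", {"id": 1, "name": "Ana"})
```

A real middleware product would replace the dictionary with network transport and would add delivery guarantees, but the decoupling through topic names is the same idea.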

Most DDS implementations adhere to an open-architecture, data-critical platform based on the Object Management Group’s (OMG) Data Distribution Service for Real-Time Systems.

DDS commonly works in close collaboration with large data warehouses. Because a data warehouse requires heavy resources to process and store very large volumes of data that arrive and leave at short intervals, it is common practice to break the system into sub-systems, including data stores, data marts, data sources, and related technologies. Data partitioning can be regarded as one of the smaller of these sub-systems.

Editorial Team at Geekinterview is a team of HR and Career Advice members led by Chandra Vennapoosa.
