Entity Attribute Value (EAV)

What is an Entity-Attribute-Value Model (EAV)?

The Entity-Attribute-Value model or EAV is also sometimes referred to at the Object-Attribute-Value Model, or even the Open Schema. This is a data model that is often used in instances where the amount of attributes, properties, or parameters that can be used to define an entity are potentially limitless, however the number that will apply to the entity is some what modest.

The easiest way to understand the function of the Entity-Attribute-Value model design is to try to understand row modeling, as Entity-Attribute Value models are a universal form. Let’s think of a department store database. These databases are responsible for managing endless amounts of products and product brands.

It is innately obvious that the product names wouldn’t be hard-coded as the names of the columns in a table. Alternatively one department’s product descriptions in a product table may function as follows: purchases/sales of an individual item are recorded in another table that would have separate rows with a way to use the product ID for referencing.

An Entity-Attribute-Value design normally involves a solitary table with three columns, these columns most often contain data referring to; the entity, an attribute, and a value for that attribute.

In this design one row actually stores a single fact, in a traditional table that has one column per attribute, one row stores a set of facts. The Entity-Attribute-Value design is applicable when the number of parameters that could apply to an entity is significantly more then those that truly apply to a single entity.

Where is the Entity-Attribute-Value Model used?

Perhaps the most notable example of the EAV model is in the production databases we see with clinical work. This includes clinical past history, present clinical complaints, physical examinations, lab test, special investigations, and diagnoses. Basically all of the aspects that could apply to a patient.  When we take into account all of the specialties of medicine, this information can consist of hundreds of thousands of units of data.

However, most people who visit a health care provides have few findings. Physicians simply do not have the time to ask a patient about every possible thing, this is just not the way in which patients are examined. Rather then using the process of elimination against thousands of possibilities the health care provider focuses on the primary complaints of the patient, and then asks questions related to those complaints.

Now let’s consider how some one would attempt to represent a general-purpose clinical record in a database like those we discussed earlier.

By creating a table or even a set of tables with thousands of columns would not be the best choice of action, the vast majority of the columns would be unacceptable, also the user interface would be obsolete with out an extremely elaborate logic that could hide groups of columns based on the data that has been entered in the previous columns.

To complicate things further the patient record and medical findings continue to grow. The Entity-Attribute-Value data model is a natural solution for this perplexing issue, and you shouldn’t be surprised to find that larger clinical data repositories do use this model.

{qbapagebreak title=Structure of the Entity-Attribute-Value Table}

What is the Structure of the Entity-Attribute-Value Table?

Earlier we covered the facts that the EAV table consists of thee columns in which data is recorded. Those columns were the entity, the attribute, and the value. Now we will talk a little more in-depth about each column.

  • The Entity, sticking to the scenario of clinical finding, the entity would be the patient event. This would contain at the very least a patient ID and the date and time of the examination. 
  • The Attribute, or often referred to as the parameter, is a foreign key into a table of attribute definitions. In our example it would be the definitions of the clinical findings. The attributes table should contain the attribute ID, the attribute name, description, data type, units of measurement, and columns aiding input validation. 
  • The Value of an attribute, this depends on the data type.

Entity-Attribute-Value Database

This database is most commonly called the EAV database; this is a database where a large portion of data is modeled as EAV. Yet, you may still find some traditional relational tables within this type of database.

  • We stated earlier what the EAV modeling does for certain categories of data such as clinical findings where attributes are many and few. However where these specific functions do not apply we can use a traditional relational model instead. Using EAV has nothing to do with leaving the common sense and principles of a relational model behind. 
  • The EAV database is basically un-maintainable without the support of many tables that store supportive metadata. These metadata tables usually outnumber the EAV tables by about three or more, they are normally traditional relational tables. 
  • The Entity in clinical data is usually a Clinical Event as we have discussed above. However for more general purposes the entity is a key into an Objects table that is used to note common information about all of the objects in the database. The use of an Object table does not need EAV, traditional tables can be used to store the category-specific details of each object. 
  • The Value brings all values into lings, as in the EAV data example above as well, this results in a simple, yet still not scalable structure. Larger systems use separate EAV tables for each of their data types, including the binary larger objects, this deals with the metadata for a specific attribute in identifying the EAV table in which the data will be stored. 
  • The Attribute, in the EAV table this is no more then an Attribute ID, there are normally multiple metadata tables that contain the attribute related information.

Issues Associated with the EAV Model

There have been a number of issues with the Entity-Attribute-Value model brought to light throughout its lifetime. We will briefly discuss those now. It is important that we clarify first that these issues arise when metadata is not used with the EAV model, for metadata is vital for its functionality.

Here are some of the issues:

Flaccidity. The litheness is wonderful, still there is a time where we no longer have any structure at all. Normally you can not rely on the built in database features like the referential integrity any longer. To ensure that the column takes the values within an acceptable range only you need to code the integrity checks inside of the application. This does not aid in making the model maintainable.

Designer issues. Adding attributes as you go is tolerable for a prototype. Yet if you are unaware of what data you want to use from the go, you are just looking for problem.

The technology of the relational databases will be inaccessible and will have to be recreated by a development team, this could include system tables, Graphical query tools, fine grained data security, incremental back-up and restore, and exception handling, partitioned tabled, and clustered indexes. All of which are currently non-existent.

The actual format is not supported well by the DBMS internals. Standard query optimizers for SQL do not handle the EAV formatted data that well, and a lot of time will need to be dedicated to performance tuning for an acceptable production quality application.

As you can see from above there are still a few issues that need to be addressed by developers in order to make the EAV optimal. Regardless of those issues we have also learned that if we use metadata with the EAV we can avoid many if not all of these issues.

Editorial Team at Geekinterview is a team of HR and Career Advice members led by Chandra Vennapoosa.

Editorial Team – who has written posts on Online Learning.

Pin It