Semi-Structured Model

What is the Semi-Structured Data Model?

The semi-structured data model is a data model in which the information that would normally be held in a schema is instead contained within the data itself. This is often referred to as a self-describing model.

With this type of database there is no clear separation between the data and the schema, and the degree to which the data is structured depends on the application being used.

Some forms of semi-structured data have no separate schema at all, while in others a separate schema exists but places only loose restrictions on the data.
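
As a minimal sketch of what "self-describing" means, the Python snippet below parses two JSON records that carry their own field names, recovers each record's structure from the data alone, and then checks a deliberately loose schema that restricts only one field. The records, the field names, and the helpers describe and satisfies_loose_schema are all made up for illustration.

```python
import json

# Two records from the same hypothetical collection. Each record names its
# own fields, so no external schema is needed to interpret it.
raw_records = [
    '{"name": "Alice", "email": "alice@example.com"}',
    '{"name": "Bob", "phones": ["555-0100", "555-0101"], "age": 42}',
]

records = [json.loads(raw) for raw in raw_records]


def describe(record):
    """Recover a record's structure (field name -> type name) from the data."""
    return {field: type(value).__name__ for field, value in record.items()}


# A loose schema: it constrains a single field and leaves the rest open.
REQUIRED_FIELDS = {"name"}


def satisfies_loose_schema(record):
    """True when every required field is present; extra fields are allowed."""
    return REQUIRED_FIELDS.issubset(record)


for record in records:
    print(describe(record), satisfies_loose_schema(record))
```

The two records have different shapes, yet both are valid members of the collection, which is the situation a fixed relational schema would not allow.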

It is natural to model semi-structured data as a graph whose labels give meaning to its underlying structure. Databases of this type subsume the modeling power of several extensions of flat relational databases: nested databases, which allow the encapsulation of entities, and object databases, which in addition allow cyclic references between objects.
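
The graph view can be sketched in a few lines of Python. This is only an illustration under assumed names (LabeledGraph, add_value, follow, and the sample people are hypothetical); it shows labeled edges, a nested entity, and a cyclic reference between two objects.

```python
class LabeledGraph:
    """A directed graph whose edges carry labels; leaf nodes hold atomic values."""

    def __init__(self):
        self.edges = {}    # node id -> list of (label, target node id)
        self.values = {}   # leaf node id -> atomic value

    def add_edge(self, source, label, target):
        self.edges.setdefault(source, []).append((label, target))

    def add_value(self, source, label, leaf, value):
        """Attach a labeled leaf that holds an atomic value."""
        self.add_edge(source, label, leaf)
        self.values[leaf] = value

    def follow(self, node, label):
        """Return the nodes reached from `node` over edges carrying `label`."""
        return [target for (edge_label, target) in self.edges.get(node, [])
                if edge_label == label]


g = LabeledGraph()
g.add_edge("root", "person", "p1")
g.add_edge("root", "person", "p2")
g.add_value("p1", "name", "p1.name", "Ann")
g.add_value("p2", "name", "p2.name", "Raj")

# Nesting: an address is an entity encapsulated inside a person.
g.add_edge("p1", "address", "p1.addr")
g.add_value("p1.addr", "city", "p1.addr.city", "Oslo")

# Cyclic references: the two people point at each other.
g.add_edge("p1", "friend", "p2")
g.add_edge("p2", "friend", "p1")

city_node = g.follow(g.follow("p1", "address")[0], "city")[0]
print(g.values[city_node])
print(g.follow(g.follow("p1", "friend")[0], "friend"))
```

Note that the second person has no address edge at all. In a graph of this kind, a missing attribute is simply a missing edge rather than a violation of a schema.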

Semi-structured data has only recently emerged as an important area of study, for several reasons. One reason is that there are data sources, such as the World Wide Web, that we often treat as a database but that cannot be constrained by a schema.

Another reason is that it can be advantageous to have a very flexible format for data exchange between dissimilar databases. Finally, even when dealing with structured data, it may still be helpful to view it as semi-structured data for the purposes of browsing.

What is Semi-Structured Data?

We are familiar with structured data, which is data that has been clearly formed, formatted, modeled, and organized into forms that are easy for us to work with and manage. We are also familiar with unstructured data.

Unstructured data makes up the bulk of information that does not fit neatly into a database. The most easily recognized form of unstructured data is the text in a document, like this article.

What you may not have known is that there is a middle ground for data; this is the data we refer to as semi-structured. These are data sets in which some implied structure is usually followed, but the structure is not regular enough to meet the criteria for the kinds of management and automation that are normally applied to structured data.

We deal with semi-structured data every day, in both technical and non-technical environments. Web pages follow certain distinctive forms, and the content embedded within HTML usually carries a certain amount of metadata within its tags.
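
To make the point about HTML concrete, here is a small sketch that uses Python's standard html.parser module to pull the metadata out of a page's tags while ignoring the free text. The sample page, the class name MetadataExtractor, and the choice of which tags count as metadata are assumptions made for the example.

```python
from html.parser import HTMLParser


class MetadataExtractor(HTMLParser):
    """Collect the title, named <meta> tags, and headings from an HTML page."""

    def __init__(self):
        super().__init__()
        self.title = None
        self.meta = {}
        self.headings = []
        self._current = None   # tag whose text is currently being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and "name" in attrs:
            self.meta[attrs["name"]] = attrs.get("content", "")
        elif tag in ("title", "h1", "h2"):
            self._current = tag

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._current == "title":
            self.title = text
        elif self._current in ("h1", "h2"):
            self.headings.append(text)


# A hypothetical page: structured hints in the tags, free text in the body.
PAGE = """
<html>
  <head>
    <title>Semi-Structured Model</title>
    <meta name="keywords" content="data model, semi-structured">
  </head>
  <body>
    <h1>What is Semi-Structured Data?</h1>
    <p>Free text with no fixed format.</p>
  </body>
</html>
"""

extractor = MetadataExtractor()
extractor.feed(PAGE)
print(extractor.title, extractor.meta, extractor.headings)
```

The paragraph text is skipped because nothing in its tag says what it means, while the title, keywords, and heading are recoverable precisely because the markup labels them.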

Details about the data are implied immediately when this information is used. This is why semi-structured data is so intriguing: although there is no fixed formatting rule, there is still enough regularity that some interesting information can be extracted from it.

What does the Semi-Structured Data Model do?

Some advantages of the semi-structured data model include:

  • It can represent information from data sources that cannot normally be constrained by a schema.
  • It provides a flexible format for data exchange between dissimilar kinds of databases.
  • It is helpful for viewing structured data as semi-structured data, for example when browsing.
  • The schema can easily be changed.
  • The data transfer format can be portable.
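
The exchange and schema-change points in the list above can be sketched as follows. The table columns, the rows, and the tags field are invented for the example; the snippet only assumes Python's standard json module.

```python
import json

# Rows as they might come from a relational table with fixed columns.
columns = ("id", "title", "year")
rows = [(1, "First article", 2007), (2, "Second article", 2008)]

# Export: each row becomes a self-describing record.
exported = json.dumps([dict(zip(columns, row)) for row in rows])

# The receiving side needs no knowledge of the table definition.
received = json.loads(exported)

# Changing the "schema" amounts to writing a record with an extra field;
# the older records are left untouched.
received.append({"id": 3, "title": "Third article", "year": 2008, "tags": ["xml"]})


def tags_of(record):
    """Read an optional field, tolerating records written before it existed."""
    return record.get("tags", [])


for record in received:
    print(record["title"], tags_of(record))
```

The cost of this flexibility is that every reader has to tolerate missing fields, as tags_of does here.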

The most important trade-off in using a semi-structured database model is quite possibly that queries cannot be executed as efficiently as in a more constrained structure, such as the relational model.

Typically the records in a semi-structured database are stored with unique IDs that are referenced by pointers to their location on disk. This makes navigational or path-based queries quite efficient, but searches over many records are less practical, because the database has to seek around the disk by following the pointers.
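
The difference between the two kinds of query can be illustrated with a toy in-memory store. This is a rough sketch, not a model of any real database: RecordStore, follow_path, and scan are hypothetical names, and the lookups counter is only a stand-in for disk seeks.

```python
class RecordStore:
    """Records keyed by unique ID; references between records are IDs."""

    def __init__(self):
        self._records = {}
        self.lookups = 0   # counts record fetches, a stand-in for disk seeks

    def put(self, record_id, record):
        self._records[record_id] = record

    def get(self, record_id):
        self.lookups += 1
        return self._records[record_id]

    def ids(self):
        return list(self._records)


def follow_path(store, start_id, path):
    """Navigate from one record through a sequence of reference fields."""
    record = store.get(start_id)
    for field in path:
        record = store.get(record[field])
    return record


def scan(store, predicate):
    """Search every record; each one costs a fetch."""
    return [rid for rid in store.ids() if predicate(store.get(rid))]


store = RecordStore()
for i in range(10):
    store.put(f"a{i}", {"name": f"Author {i}"})
for i in range(1000):
    store.put(f"art{i}", {"title": f"Article {i}",
                          "year": 2000 + i % 10,
                          "author": f"a{i % 10}"})

store.lookups = 0
author = follow_path(store, "art5", ["author"])
path_cost = store.lookups

store.lookups = 0
hits = scan(store, lambda record: record.get("year") == 2008)
scan_cost = store.lookups

print(author["name"], path_cost)
print(len(hits), scan_cost)
```

In this toy setup the path query touches only the records on the path, while the search has to fetch every record in the store, including the author records that could never match.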

Clearly the semi-structured data model has some disadvantages, as all other models do. Let's take a moment to outline a few of them.

Issues with Semi-Structured Data

Semi-structured data needs to be characterized, exchanged, stored, manipulated, and analyzed efficiently. Even so, there are challenges in using semi-structured data. Some of these challenges include:

Data Diversity: Data diversity in federated systems is a complex issue. It involves areas such as unit and semantic incompatibilities, grouping incompatibilities, and inconsistent overlapping of sets.

Extensibility: It is vital to realize that extensibility, as applied to data, refers to data presentation and not data processing. Data processing should be able to happen without the aid of database updates.

Storage: Transfer formats like XML are universally text or Unicode. They are prime candidates for transfer, but not so much for storage. The representations are instead stored by underlying, accessible systems that support such standards.
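
The storage point can be seen in a short round trip through XML text. The sketch below uses Python's standard xml.etree.ElementTree module; the record, the article element name, and the helpers to_xml and from_xml are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def to_xml(record):
    """Serialize a flat record to XML text for transfer."""
    root = ET.Element("article")
    for field, value in record.items():
        child = ET.SubElement(root, field)
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")


def from_xml(text):
    """Parse the transferred text back into a flat record."""
    root = ET.fromstring(text)
    return {child.tag: child.text for child in root}


record = {"title": "Semi-Structured Model", "year": 2008}
wire = to_xml(record)
restored = from_xml(wire)

print(wire)
print(restored)
```

Everything on the wire is text, so the year comes back as a string rather than a number. That is acceptable for transfer, but it is one reason a system would usually keep its own typed, indexed representation for storage and use XML only at the boundary.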

In short, much of the academic, open source, and other attention directed at these particular issues has stayed at the surface level of resolving representations, definitions, or even units.

The development of adequate processing engines for efficient and scalable storage and retrieval has been largely missing from the overall push toward a semi-structured data model. This clearly needs further study and attention from developers.


We have examined many areas of the semi-structured data model, including the differences between structured data, unstructured data, and semi-structured data. We have also explored the various uses for the model.

After looking at the advantages and the disadvantages, we are now educated enough about the semi-structured model to make a decision regarding its usefulness.

This model is still worthy of more research and deeper contemplation, but the flexibility and diversity that it offers are more than praiseworthy.

After researching, one can see many conventional and unconventional uses for this model in our systems. An example of the semi-structured data model is depicted below.

The semi-structured information used in the example is actually the detail pertaining to this very article. Each line or arrow in the model has a specific purpose, and each is clearly labeled: Article, Author, Title, and Year.

At the end of each arrow you can find the corresponding information. The example therefore expresses information about this article: the title, which is The Semi-Structure Data Model; the year in which the article was written, which is 2008; and finally the author. As you can see from the example, this data model is easy to follow and useful when dealing with semi-structured information like web pages.
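
The same example can be written down as a nested Python structure and flattened into its labeled arrows. This is only a sketch: the title and year are the ones quoted above, the author's name is not given in the text and is therefore left unset, and the helper arrows and the node names root and article are hypothetical.

```python
# The article example as nested, labeled data.
article_model = {
    "Article": {
        "Title": "The Semi-Structure Data Model",
        "Year": 2008,
        "Author": None,   # not stated in the text, so left unset
    }
}


def arrows(tree, source="root"):
    """Yield (source, label, target) for each arrow in the nested structure."""
    for label, child in tree.items():
        if isinstance(child, dict):
            node = label.lower()
            yield (source, label, node)
            yield from arrows(child, source=node)
        else:
            yield (source, label, child)


for arrow in arrows(article_model):
    print(arrow)
```

Each printed tuple corresponds to one arrow: where it starts, the label it carries, and the information found at its end.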

Editorial Team at Geekinterview is a team of HR and Career Advice members led by Chandra Vennapoosa.
