Legacy data comes from virtually everywhere within an information system and its supporting legacy systems. The many sources of legacy data include databases, most often relational, but also hierarchical, network, object, XML, and object/relational databases. Legacy data is another term for disparate data.
Files such as XML documents and “flat files” such as configuration files and comma-delimited text files may also be sources of legacy data. But the biggest sources of legacy data are old, outdated, and antiquated legacy systems.
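As a simple illustration, a comma-delimited flat file exported from a legacy system can be read with nothing more than a standard CSV parser. The file contents and field names below are hypothetical:

```python
import csv
import io

# Hypothetical comma-delimited export from a legacy order system.
legacy_export = io.StringIO(
    "order_id,customer,amount\n"
    "1001,ACME Corp,250.00\n"
    "1002,Globex,99.50\n"
)

# csv.DictReader maps each data row onto the header names.
orders = list(csv.DictReader(legacy_export))
print(orders[0]["customer"])  # ACME Corp
```

Note that every value arrives as a string; type information is lost in flat files, which is one reason such data counts as disparate.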
A legacy system refers to an existing group of computers or application programs that has become old and outdated, but companies still refuse to give these systems up because they still serve the business well.
These systems are usually large, and companies have invested so much money in implementing them that, despite potential problems identified by IT professionals, many still want to keep them for several reasons.
One of the main problems with legacy systems is that they often run on very slow and obsolete hardware for which replacement parts are very difficult to find when something breaks. Because of the general lack of understanding of these old technologies, they are often very hard to maintain, improve, and expand. And because they are old and obsolete, chances are the operations manual and other documentation have been lost over the years.
Despite the emergence of newer technologies with relatively cheaper individual parts, many companies still have compelling reasons to keep such old and antiquated systems, whose data adds to the disparity in data warehouse systems.
One of the biggest reasons is that legacy systems were implemented to be large and monolithic in nature, so a one-time redesign and reimplementation would be very costly and complicated. If a legacy system were taken out at a single moment, the whole business process would be halted for some time because of the monolithic and centralized nature of these systems.
Most companies cannot afford any business stoppage, especially in today’s fast-paced, data-driven business environment. What worsens the situation even more is that legacy systems are not well understood by younger IT professionals, so redesigning them to adopt newer technologies would require a long time and intensive planning.
That is why it is very common to see data warehouses today that combine new and legacy systems. The effect is legacy data that is highly incompatible with data coming from sources that use newer technologies.
In fact, technology vendors encounter a range of data disparity problems when working with legacy systems. IBM alone has enumerated some typical legacy data problems, which include, among others:
- Incorrect data values
- Inconsistent/incorrect data formatting
- Missing data
- Missing columns
- Additional columns
- Multiple sources for the same data
- A single column being used for several purposes
- The purpose of a column is determined by the value of one or more other columns
- Important entities, attributes, and relationships hidden and floating in text fields
- Data values that stray from their field descriptions and business rules
- Various key strategies for the same type of entity
- Unrealized relationships between data records
- One attribute is stored in several fields
- Inconsistent use of special characters
- Different data types for similar columns
- Different levels of detail
- Different modes of operation
- Varying timeliness of data
- Varying default values
- Various other representations
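Several of the problems above, such as inconsistent data formatting and missing data, show up concretely when the same logical value is recorded in different ways. The sketch below normalizes date strings with inconsistent formats; the sample values and the list of candidate formats are assumptions for illustration:

```python
from datetime import datetime

# Hypothetical legacy date values: inconsistent formats plus a missing entry.
raw_dates = ["2023-01-15", "15/01/2023", "Jan 15 2023", ""]

# Candidate formats assumed to occur in the legacy data.
FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%b %d %Y")

def normalize_date(value):
    """Return an ISO date string, or None when the value is missing or unparseable."""
    value = value.strip()
    if not value:
        return None  # missing data: flag it rather than guess a default
    for fmt in FORMATS:
        try:
            return datetime.strptime(value, fmt).date().isoformat()
        except ValueError:
            continue
    return None  # stray value that matches no known business rule

print([normalize_date(d) for d in raw_dates])
# ['2023-01-15', '2023-01-15', '2023-01-15', None]
```

Returning `None` instead of a silent default keeps the “varying default values” problem from being reintroduced downstream.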
Legacy data, and the data disparity problems it brings to a data warehouse, can be addressed by the process of ETL (extract, transform, load). ETL is a mechanism for converting disparate data, not just from legacy systems but from all other disparate data sources as well, before it is loaded into the data warehouse.
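The three ETL stages can be sketched end to end. This is a minimal illustration, not a production pipeline; the source data, field names, and the in-memory “warehouse” are all hypothetical:

```python
import csv
import io

# Hypothetical legacy export with messy whitespace and uppercase headers.
LEGACY_CSV = "ID, NAME ,SALARY\n1, Alice ,50000\n2, Bob ,60000\n"

def extract(source):
    # Extract: pull raw rows out of the disparate source as-is.
    return list(csv.reader(io.StringIO(source)))

def transform(rows):
    # Transform: normalize headers and trim stray whitespace from values.
    header = [h.strip().lower() for h in rows[0]]
    return [dict(zip(header, (cell.strip() for cell in row))) for row in rows[1:]]

def load(records, warehouse):
    # Load: append the cleaned records to the warehouse table.
    warehouse.extend(records)

warehouse = []
load(transform(extract(LEGACY_CSV)), warehouse)
print(warehouse[0])  # {'id': '1', 'name': 'Alice', 'salary': '50000'}
```

Keeping the three stages as separate functions mirrors the ETL separation of concerns: the transform step is where legacy data disparities are resolved before anything reaches the warehouse.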