A logical data model defines the conceptual data model, which is based on business semantics. Entities and relationships, together with the corresponding table and column designs, object-oriented classes, and XML tags, among other things, are laid out regardless of how the database will be physically implemented.
A data file is a physical file, meaning that it is represented as actual bits on the computer's storage hardware.
Dealing with data files in a large data warehouse is not as simple as dealing with them on a standalone computer. Large data warehouses are typically managed by relational database management systems. In a relational database, an entity is anything of interest about which data is kept, and each entity has attributes.
For example, CUSTOMER is an entity in the database. A customer could have attributes such as First Name, Middle Name, Last Name, Customer Number, SSS Number, and many more. When an entity is entered into the database, the database management system stores it together with its attributes in a structure called a relation, which is commonly presented as a table.
An attribute may also hold multiple values for a single entity, such as all the places a customer has lived during his life. All of this information is saved in data files by the database management system.
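The CUSTOMER example can be sketched with Python's built-in sqlite3 module. This is only a minimal illustration, not a warehouse design: the table names, column names, and sample values are all hypothetical, and placing the multi-valued "places lived" attribute in a separate table is one common choice among several.

```python
import sqlite3

# In-memory database, so the sketch leaves no data file behind.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        customer_number INTEGER PRIMARY KEY,
        first_name      TEXT NOT NULL,
        middle_name     TEXT,
        last_name       TEXT NOT NULL,
        sss_number      TEXT
    );
    -- A multi-valued attribute (places lived) gets its own table.
    CREATE TABLE residence (
        customer_number INTEGER NOT NULL REFERENCES customer(customer_number),
        city            TEXT NOT NULL
    );
""")

# Hypothetical sample data, invented for illustration only.
conn.execute(
    "INSERT INTO customer VALUES (?, ?, ?, ?, ?)",
    (1001, "Maria", "Santos", "Cruz", "00-0000000-0"),
)
conn.executemany(
    "INSERT INTO residence VALUES (?, ?)",
    [(1001, "Manila"), (1001, "Cebu")],
)


def places_lived(connection, customer_number):
    """Return the cities recorded for one customer, sorted by name."""
    rows = connection.execute(
        "SELECT city FROM residence WHERE customer_number = ? ORDER BY city",
        (customer_number,),
    ).fetchall()
    return [city for (city,) in rows]
```

Calling places_lived(conn, 1001) returns the customer's recorded cities. If the database were opened on a path instead of ":memory:", the same rows would end up in a physical data file on disk.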
Today’s data warehouses also make intensive use of Extensible Markup Language (XML), a general-purpose markup language. XML is primarily used to share structured data across information systems that may run on disparate servers, such as those connected over the Internet. XML is also used to encode documents and to serialize data so that it is easy to process.
XML has in fact been used by many as an alternative to relational databases. In a distributed system, or in a data warehouse that gets data from several sources, relational database files from one server may not be compatible with the other servers. XML addresses this portability problem because XML files are standard text files, so different servers can read and understand them.
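As a rough illustration of that portability, the sketch below serializes one customer record to XML text using Python's standard xml.etree.ElementTree module. The tag names, the "number" attribute, and the sample record are assumptions made for this example; they do not follow any particular schema.

```python
import xml.etree.ElementTree as ET


def customer_to_xml(record):
    """Serialize a customer dictionary to an XML string."""
    # The customer number is stored as an attribute of the root element.
    root = ET.Element("customer", number=str(record["customer_number"]))
    # Each remaining attribute of the entity becomes a child element.
    for tag in ("first_name", "middle_name", "last_name"):
        ET.SubElement(root, tag).text = record[tag]
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample record, invented for illustration only.
record = {
    "customer_number": 1001,
    "first_name": "Maria",
    "middle_name": "Santos",
    "last_name": "Cruz",
}
xml_text = customer_to_xml(record)
```

The result is ordinary text, so it can be written to a file or sent to another server and parsed there by any XML-aware program.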
Because XML lets users define their own markup, a data warehouse can use an XML data file to store information about an entity and retrieve it later. XML data files may reside anywhere in the computer's storage. When information about an entity is needed from an XML data file, the file has to be processed with a programming language in conjunction with either the SAX API or the DOM API.
A transformation engine or a filter can also be used. Newer techniques for XML processing include push parsing, data binding, and non-extractive XML processing APIs.
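The two classic APIs can be compared on a small document. The sketch below reads the same value once with a DOM parser (xml.dom.minidom) and once with a SAX handler (xml.sax), both from the Python standard library. The XML snippet and the helper names are hypothetical and exist only for this example.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical document, invented for illustration only.
XML_TEXT = (
    "<customer number='1001'>"
    "<first_name>Maria</first_name>"
    "<last_name>Cruz</last_name>"
    "</customer>"
)


def last_name_dom(xml_text):
    """DOM: build the whole tree in memory, then navigate it."""
    doc = minidom.parseString(xml_text)
    node = doc.getElementsByTagName("last_name")[0]
    return node.firstChild.data


class LastNameHandler(xml.sax.ContentHandler):
    """SAX: react to parser events and keep only the text we need."""

    def __init__(self):
        super().__init__()
        self._inside = False
        self.parts = []

    def startElement(self, name, attrs):
        self._inside = (name == "last_name")

    def endElement(self, name):
        self._inside = False

    def characters(self, content):
        # Text may arrive in several chunks, so collect the pieces.
        if self._inside:
            self.parts.append(content)


def last_name_sax(xml_text):
    handler = LastNameHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return "".join(handler.parts)
```

Both functions return the same text. The DOM version is shorter, while the SAX version never holds the full document in memory, which tends to matter more as XML data files grow.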
An entity can also be represented by manual data files. In fact, there are many cases in which manual data filing is used instead of a database system or XML. For example, document files such as a last will and testament or a long contract have to be stored separately as manual data files.
Large video or photo files pertaining to a person also need to be stored as data files. There has to be a mechanism to reference these manual files, however, so that they can be tied by ownership to a data entity. Both the database management system and the XML technique can be used to do this referencing.
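One possible way to do such referencing in a database is a small lookup table that maps a customer number to the locations of that customer's separately stored files. The sketch below uses sqlite3 again. The table name, the columns, and the file paths are all hypothetical, and the files themselves are assumed to live outside the database.

```python
import sqlite3

ref_db = sqlite3.connect(":memory:")
ref_db.execute("""
    CREATE TABLE customer_file (
        customer_number INTEGER NOT NULL,
        kind            TEXT NOT NULL,   -- e.g. 'will', 'contract', 'video'
        location        TEXT NOT NULL    -- where the manual file is kept
    )
""")

# Hypothetical references, invented for illustration only.
ref_db.executemany(
    "INSERT INTO customer_file VALUES (?, ?, ?)",
    [
        (1001, "will", "/archive/wills/1001.pdf"),
        (1001, "contract", "/archive/contracts/1001.pdf"),
        (1002, "video", "/archive/video/1002.mp4"),
    ],
)


def files_for(connection, customer_number):
    """Return (kind, location) pairs for one customer, sorted by kind."""
    return connection.execute(
        "SELECT kind, location FROM customer_file "
        "WHERE customer_number = ? ORDER BY kind",
        (customer_number,),
    ).fetchall()
```

The database stores only the reference. The document, photo, or video stays in its own file, and the customer number ties it back to the entity that owns it.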