Topic 3 Notes
3.1 Data model
A data model is an abstract model that organizes elements of data and standardizes how they relate
to one another and to properties of the real world entities. For instance, a data model may specify
that the data element representing a car be composed of a number of other elements which, in turn,
represent the color and size of the car and define its owner.
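The car example above can be sketched in Python. This is only an illustration: the class names (Car, Owner) and field names (color, size_m, owner) are assumptions made for this sketch, not part of any standard data model.

```python
from dataclasses import dataclass


@dataclass
class Owner:
    # Hypothetical element describing who owns the car.
    name: str


@dataclass
class Car:
    # The car element is composed of other elements.
    color: str
    size_m: float  # overall length in metres (assumed unit)
    owner: Owner   # defines the car's owner


car = Car(color="red", size_m=4.2, owner=Owner(name="A. Driver"))
```

The model fixes which elements a car record must contain and how they relate, independently of any particular car.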
The term data model is used in two distinct but closely related senses. Sometimes it refers to an
abstract formalization of the objects and relationships found in a particular application domain, for
example the customers, products, and orders found in a manufacturing organization. At other times
it refers to a set of concepts used in defining such formalizations: for example concepts such as
entities, attributes, relations, or tables.
A data model is based on data, data relationships, data semantics, and data constraints. A data model
provides the details of information to be stored, and is of primary use when the final product is the
generation of computer software code for an application or the preparation of a functional
specification to aid a computer software make-or-buy decision. A data model explicitly determines
the structure of data. A data model can sometimes be referred to as a data structure, especially in
the context of programming languages. Data models are often complemented by function models,
especially in the context of enterprise models.
3.2 The role of data models
The main aim of data models is to support the development of information systems by providing
the definition and format of data. If the same data structures are used to store and access data, then
different applications can share data.
Typical applications of data models
include database models, design of information systems, and enabling exchange of data. Usually
data models are specified in a data modeling language.
3.3 Types of data models
A database model is a specification describing how a database is structured and used.
Several such models have been suggested. Common models include:
Flat model
The flat (or table) model may not strictly qualify as a data model. It consists of a single,
two-dimensional array of data elements, where all members of a given column are assumed to be
similar values, and all members of a row are assumed to be related to one another.
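A flat model can be sketched as a plain two-dimensional list. The column names and values below are invented for illustration.

```python
# One table: every row is a related group of values,
# every column holds values of a similar kind.
columns = ["name", "city", "age"]
rows = [
    ["Amina", "Nairobi", 31],
    ["Brian", "Kisumu", 27],
]


def column(name):
    """Return all values stored in the named column."""
    index = columns.index(name)
    return [row[index] for row in rows]


ages = column("age")
```

Nothing in this structure links one table to another, which is why the flat model is sometimes not counted as a full data model.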
Hierarchical model
The hierarchical model is similar to the network model except that links in the hierarchical model
form a tree structure, while the network model allows an arbitrary graph.
Network model
This model organizes data using two fundamental constructs, called records and sets. Records
contain fields, and sets define one-to-many relationships between records: one owner, many
members. The network data model is an abstraction of the design concept used in the
implementation of databases.
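The records-and-sets idea can be sketched as follows. The names Record, SetOccurrence, works_in and assigned_to are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    # A record is a collection of named fields.
    fields: dict


@dataclass
class SetOccurrence:
    # A set links one owner record to many member records.
    owner: Record
    members: list = field(default_factory=list)


dept = Record({"name": "Sales"})
project = Record({"title": "Launch"})
emp = Record({"name": "Chen"})

# The same employee record is a member of two different sets.
works_in = SetOccurrence(owner=dept, members=[emp])
assigned_to = SetOccurrence(owner=project, members=[emp])


def owners_of(record, sets):
    """Return the owner of every set that contains the record."""
    return [s.owner for s in sets if any(m is record for m in s.members)]


emp_owners = owners_of(emp, [works_in, assigned_to])
```

Because the employee record has two owners, the links form a graph. In a hierarchical model each record would have at most one parent, giving a tree.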
Relational model
The relational model is a database model based on first-order predicate logic. Its core idea is to describe a database as a
collection of predicates over a finite set of predicate variables, describing constraints on the
possible values and combinations of values. The power of the relational data model lies in its
mathematical foundations and a simple user-level paradigm.
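A small sketch using Python's built-in sqlite3 module shows relations as tables and constraints on possible values. The table names, column names and data are assumptions made for illustration.

```python
import sqlite3

# In-memory database; the schema below is illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    # Note: SQLite only enforces REFERENCES when PRAGMA foreign_keys = ON.
    " customer_id INTEGER NOT NULL REFERENCES customer(id),"
    " total REAL NOT NULL CHECK (total >= 0))"
)


def insert_order(order_id, customer_id, total):
    """Try to insert a row; return False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO orders VALUES (?, ?, ?)", (order_id, customer_id, total)
        )
        return True
    except sqlite3.IntegrityError:
        return False


conn.execute("INSERT INTO customer VALUES (1, 'Amina')")
accepted = insert_order(10, 1, 25.0)   # satisfies CHECK (total >= 0)
rejected = insert_order(11, 1, -5.0)   # violates the CHECK constraint

# A declarative query joins the two relations.
rows = conn.execute(
    "SELECT c.name, o.total FROM customer c JOIN orders o ON o.customer_id = c.id"
).fetchall()
```

The user states what rows are allowed and what rows are wanted, and the database decides how to store and retrieve them.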
Object-relational model
The object-relational model is similar to the relational model, but objects, classes and inheritance
are directly supported in database schemas and in the query language.
Geographic data model
A data model in geographic information systems (GIS) is a mathematical construct for representing
geographic objects or surfaces as data. For example,
i. the vector data model represents geography as collections of points, lines, and polygons;
ii. the raster data model represents geography as matrices of cells that store numeric values;
iii. the triangulated irregular network (TIN) data model represents geography as sets of
contiguous, nonoverlapping triangles.
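The vector and raster models listed above can be sketched with plain Python lists. All coordinates and cell values are invented, and polygon_area and cell_value are hypothetical helper names.

```python
# Vector model: geometry stored as coordinates.
point = (36.82, -1.29)
line = [(0.0, 0.0), (1.0, 1.0), (2.0, 1.0)]
polygon = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]


def polygon_area(vertices):
    """Area of a simple polygon using the shoelace formula."""
    total = 0.0
    count = len(vertices)
    for i in range(count):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % count]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


# Raster model: a matrix of cells, each storing a numeric value
# (here, made-up elevations).
elevation = [
    [10, 12, 13],
    [11, 15, 14],
]


def cell_value(raster, row, col):
    """Return the value stored in one raster cell."""
    return raster[row][col]


area = polygon_area(polygon)
highest = max(max(row) for row in elevation)
```

The vector form suits discrete objects with clear boundaries, while the raster form suits values that vary continuously over a surface.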
Generic data model
Generic data models are generalizations of conventional data models. They define standardised
general relation types, together with the kinds of things that may be related by such a relation type.
Generic data models are developed as an approach to solve some shortcomings of conventional
data models. For example, different modelers usually produce different conventional data models
of the same domain. This can lead to difficulty in bringing the models of different people together
and is an obstacle to data exchange and data integration. In most cases, however, this difference is
attributable to different levels of abstraction in the models and differences in the kinds of facts that
can be instantiated (the semantic expression capabilities of the models). The modelers need to
communicate and agree on certain elements which are to be rendered more concretely, in order to
make the differences less significant.
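One possible way to picture standardised relation types is a small lookup table that states which kinds of things each relation type may connect. The relation names, kinds and objects below are assumptions for this sketch and are not taken from any published generic model.

```python
# Each relation type names the kind of thing allowed on each side.
RELATION_TYPES = {
    "is_part_of": ("physical_object", "physical_object"),
    "is_owned_by": ("physical_object", "person"),
}

# The kind of each known thing (hypothetical data).
KIND_OF = {
    "wheel_1": "physical_object",
    "car_1": "physical_object",
    "amina": "person",
}


def make_fact(left, relation, right):
    """Build a fact only if both things are of the kinds the relation allows."""
    left_kind, right_kind = RELATION_TYPES[relation]
    if KIND_OF.get(left) != left_kind or KIND_OF.get(right) != right_kind:
        raise ValueError(f"{relation} cannot relate {left} to {right}")
    return (left, relation, right)


facts = [
    make_fact("wheel_1", "is_part_of", "car_1"),
    make_fact("car_1", "is_owned_by", "amina"),
]
```

Modelers who share the same relation types can exchange facts without first reconciling two differently shaped schemas.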
Semantic data model
A semantic data model in software engineering is a technique to define the meaning of data within
the context of its interrelationships with other data. A semantic data model is an abstraction which
defines how the stored symbols relate to the real world. A semantic data model is sometimes called
a conceptual data model.
Object model
An object model in computer science is a collection of objects or classes through which a program
can examine and manipulate some specific parts of its world. In other words, it is the object-oriented
interface to some service or system. Such an interface is said to be the object model of the
represented service or system. For example, the Document Object Model (DOM) is a collection of
objects that represent a page in a web browser, used by script programs to examine and
dynamically change the page. There is a Microsoft Excel object model for controlling Microsoft
Excel from another program, and the ASCOM Telescope Driver is an object model for controlling
an astronomical telescope.
In computing, the term object model has a distinct second meaning: the general properties of
objects in a specific computer programming language, technology, notation or methodology that
uses them. Examples include the Java object model, the COM object model, and the object model of
OMT. Such object models are usually defined using concepts such as class, message, inheritance,
polymorphism, and encapsulation. There is an extensive literature on formalized object models as
a subset of the formal semantics of programming languages.
Unified Modeling Language models
The Unified Modeling Language (UML) is a standardized general-purpose modeling language in
the field of software engineering. It is a graphical language for visualizing, specifying,
constructing, and documenting the artifacts of a software-intensive system. The Unified Modeling
Language offers a standard way to write a system's blueprints, including:
i. Conceptual things such as business processes and system functions
ii. Concrete things such as programming language statements, database schemas, and
reusable software components.
UML offers a mix of functional models, data models, and database models.
3.4 Data architecture
Data architecture is the design of data for use in defining the target state and the subsequent
planning needed to reach that state. It is usually one of several architecture domains that form
the pillars of an enterprise architecture or solution architecture.
A data architecture describes the data structures used by a business and/or its applications. It
includes descriptions of data in storage and data in motion; descriptions of data stores, data groups and
data items; and mappings of those data artifacts to data qualities, applications, locations, etc.
Essential to realizing the target state, data architecture describes how data is processed, stored,
and utilized in a given system. It provides criteria for data processing operations that make it
possible to design data flows and also control the flow of data in the system.
3.5 Data modeling
Data modeling in software engineering is the process of creating a data model by applying formal
data model descriptions using data modeling techniques. Data modeling is a technique for defining
business requirements for a database. It is sometimes called database modeling because a data
model is eventually implemented in a database.
A conceptual data model is developed based on the data requirements for the application that is
being developed, perhaps in the context of an activity model. The data model will normally consist
of entity types, attributes, relationships, integrity rules, and the definitions of those objects. This is
then used as the starting point for interface or database design.[8]
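The parts of a conceptual data model named above (entity types, attributes, relationships, integrity rules) can be sketched as follows. Customer, Order and the two rules are hypothetical examples, not requirements from any real application.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    # Entity type with two attributes.
    customer_id: int
    name: str


@dataclass
class Order:
    # Entity type; the customer field is a relationship to Customer.
    order_id: int
    customer: Customer
    quantity: int


def violations(order):
    """Return the integrity rules that the given order breaks."""
    problems = []
    if order.quantity <= 0:
        problems.append("quantity must be positive")
    if not order.customer.name:
        problems.append("customer must have a name")
    return problems


good = Order(order_id=1, customer=Customer(1, "Amina"), quantity=3)
bad = Order(order_id=2, customer=Customer(2, ""), quantity=0)
```

Here violations(good) returns an empty list, while violations(bad) reports both rules. A database design would later turn such rules into constraints.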
3.6 Data properties
Some important properties of data for which requirements need to be met are:
i. definition-related properties:
a) relevance: the usefulness of the data in the context of your business.
b) clarity: the availability of a clear and shared definition for the data.
c) consistency: the compatibility of the same type of data from different sources.
ii. content-related properties
a) timeliness: the availability of data at the time required and how up to date that data
is.
b) accuracy: how close to the truth the data is.
iii. properties related to both definition and content
a) completeness: how much of the required data is available.
b) accessibility: where, how, and to whom the data is available or not available (e.g.
security).
c) cost: the cost incurred in obtaining the data, and making it available for use.
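Some of these properties can be measured directly. The sketch below shows one possible measure of completeness and one of timeliness. The records, field names and dates are invented for illustration.

```python
from datetime import date

# Hypothetical records; None marks a missing value.
records = [
    {"name": "Amina", "email": "amina@example.com", "updated": date(2024, 1, 10)},
    {"name": "Brian", "email": None, "updated": date(2023, 6, 1)},
]


def completeness(rows, field_name):
    """Fraction of rows in which the field has a value."""
    present = sum(1 for row in rows if row.get(field_name) is not None)
    return present / len(rows)


def age_in_days(row, today):
    """Days since the row was last updated (a simple timeliness measure)."""
    return (today - row["updated"]).days


email_completeness = completeness(records, "email")
amina_age = age_in_days(records[0], date(2024, 1, 20))
```

Properties such as relevance and clarity are harder to compute and usually need agreement among the people who use the data.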
3.7 Data model and Data structure
A data structure is a way of storing data in a computer so that it can be used efficiently. It is an
organization of mathematical and logical concepts of data. Often a carefully chosen data structure
will allow the most efficient algorithm to be used. The choice of the data structure often begins
from the choice of an abstract data type.
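As a small illustration, a stack is an abstract data type (push, pop, is_empty) that can be implemented with a Python list as the underlying data structure. The class below is a minimal sketch, not a complete implementation.

```python
class Stack:
    """Stack abstract data type backed by a Python list."""

    def __init__(self):
        self._items = []

    def push(self, item):
        # Appending at the end of a list is cheap.
        self._items.append(item)

    def pop(self):
        # Removes and returns the most recently pushed item.
        return self._items.pop()

    def is_empty(self):
        return not self._items


stack = Stack()
stack.push(1)
stack.push(2)
top = stack.pop()
```

Code that uses the stack depends only on the three operations, so the list could be replaced by another structure without changing that code.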
A data model describes the structure of the data within a given domain and, by implication, the
underlying structure of that domain itself. This means that a data model in fact specifies a dedicated
grammar for a dedicated artificial language for that domain. A data model represents classes of
entities (kinds of things) about which a company wishes to hold information, the attributes of that
information, and relationships among those entities and (often implicit) relationships among those
attributes. The model describes the organization of the data to some extent irrespective of how data
might be represented in a computer system.
The entities represented by a data model can be tangible entities, but models that include such
concrete entity classes tend to change over time. Robust data models often identify abstractions of
such entities. For example, a data model might include an entity class called "Person", representing
all the people who interact with an organization. Such an abstract entity class is typically more
appropriate than ones called "Vendor" or "Employee", which identify specific roles played by
those people.
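The "Person" abstraction can be sketched as one entity class with roles stored as data. The field names and role labels are assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    # One abstract entity class for everyone who interacts
    # with the organization.
    name: str
    roles: set = field(default_factory=set)


person = Person(name="Chen", roles={"Employee"})

# The same person later also becomes a vendor; no new class is needed.
person.roles.add("Vendor")
```

With separate "Vendor" and "Employee" classes, the same person would have to be stored twice or moved between classes when their role changed.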
Revision questions
1. What is a data model?
2. What properties define data for a model?
3. Identify models that can be used for structuring data.