CEF342 - Database and Design Chapter 2 - Data Models
Instructor Eng. Nukamnduwoh Harvey Sama
Academic Year 2024/2025
Semester 2
Chapter 2: Data Models
1. Introduction to Data Models
1.1 What Is a Data Model?
● Definition:
A data model is a relatively simple abstraction that represents complex real-world data
structures and their relationships. It serves as a blueprint for designing a database.
● Purpose:
○ To bridge the gap between how data exist in the real world and how they are
implemented in a database.
○ To support communication among designers, programmers, and end users by
presenting a clear and unambiguous description of the data.
● Iterative Nature:
Data modeling is a progressive process. Designers start with a simple understanding of
the problem domain and refine the model as their understanding deepens. The final
model acts as a “blueprint” that directs the construction of a database meeting all
end-user requirements.
2. The Importance of Data Models
2.1 Communication and Understanding
● Key Role:
A well-developed data model fosters better communication among:
○ Designers (who build the database),
○ Programmers (who implement applications),
○ End users (who interact with the data).
● Organizational Insight:
A robust data model can even help users understand the entire organization by showing
how data elements interrelate.
● Real-World Impact:
One client noted that, on seeing a clear data model, they truly “saw how all the
pieces fit together” for the first time.
2.2 Avoiding Inconsistencies
● Blueprint Importance:
Just as a house cannot be built without a blueprint, an effective database cannot be built
without a proper data model.
● Business Process Alignment:
A good data model ensures that systems (like inventory management and order entry)
use consistent data definitions, thus preventing costly errors such as conflicting
numbering schemes.
3. Data Modeling and Data Models
3.1 Data Modeling Process
● Purpose:
To create a specific data model for a clearly defined problem domain.
● Components of an Implementation-Ready Data Model:
○ Data Structure Description:
Defines how data will be stored.
○ Integrity Rules:
A set of enforceable rules to maintain data quality.
○ Data Manipulation Methodology:
Defines how data transformations are performed.
● Blueprint Characteristics:
The final model is both narrative (textual descriptions) and graphical (diagrams) to
ensure clarity and precision.
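The three components listed above can be seen side by side in a minimal sketch that uses Python's built-in sqlite3 module. The PRODUCT table, its column names, and the sample rows are assumptions invented for illustration; they do not come from the text.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# 1. Data structure description: how the data will be stored.
# 2. Integrity rules: NOT NULL and CHECK are enforced by the DBMS itself.
conn.execute(
    """
    CREATE TABLE PRODUCT (
        PROD_CODE  TEXT PRIMARY KEY,
        PROD_NAME  TEXT NOT NULL,
        PROD_PRICE REAL NOT NULL CHECK (PROD_PRICE >= 0)
    )
    """
)

# 3. Data manipulation methodology: statements that add and query data.
conn.execute("INSERT INTO PRODUCT VALUES ('P100', 'Hammer', 12.50)")
conn.execute("INSERT INTO PRODUCT VALUES ('P200', 'Saw', 18.00)")


def try_insert(code, name, price):
    """Return True if the row is accepted, False if an integrity rule rejects it."""
    try:
        conn.execute("INSERT INTO PRODUCT VALUES (?, ?, ?)", (code, name, price))
        return True
    except sqlite3.IntegrityError:
        return False


# A negative price violates the CHECK rule, so the row is rejected.
rejected = not try_insert("P300", "Drill", -5.0)
names = [row[0] for row in conn.execute(
    "SELECT PROD_NAME FROM PRODUCT ORDER BY PROD_CODE")]
print(rejected, names)
```

The rejected row never reaches the table, which is the point of declaring integrity rules in the model rather than relying on each application to check them.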
3.2 Data Model vs. Database Model
● Interchangeable Use:
The terms “data model” and “database model” are often used interchangeably. In these
notes, “database model” refers to the implementation of a data model within a specific
DBMS.
4. Basic Building Blocks of Data Models
4.1 Entities
● Definition:
An entity represents a person, place, thing, or event about which data are collected.
● Example:
A CUSTOMER entity might represent individual customers.
● Entity Set:
A collection of similar entities (e.g., all customers).
4.2 Attributes
● Definition:
Characteristics or properties that describe an entity.
● Example:
Attributes of a CUSTOMER might include Last Name, First Name, Phone Number, and
Credit Limit.
● Analogy to File Systems:
Attributes are similar to fields in a file system, but within a data model they contribute to
the logical structure.
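As a rough illustration, an entity and its attributes can be mirrored by a small Python data class, with the entity set as a collection of instances. The attribute names follow the CUSTOMER example above; the class name, field types, and sample values are assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    """One CUSTOMER entity; each field is one attribute."""
    last_name: str
    first_name: str
    phone_number: str
    credit_limit: float


# The entity set: the collection of all CUSTOMER entities (sample data invented).
customer_set = [
    Customer("Doe", "Jane", "555-0101", 1500.00),
    Customer("Roe", "Sam", "555-0102", 500.00),
]

# Every entity in the set carries the same attributes, so they can be
# queried uniformly by name.
high_limit = [c.last_name for c in customer_set if c.credit_limit > 1000]
print(high_limit)  # ['Doe']
```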
4.3 Relationships
● Definition:
Associations among entities.
● Types:
○ One-to-Many (1:M):
A single instance of entity A is associated with multiple instances of entity B.
Example: One CUSTOMER may generate many INVOICES.
○ Many-to-Many (M:N):
Multiple instances of entity A relate to multiple instances of entity B.
Example: Many STUDENTS take many CLASSES.
○ One-to-One (1:1):
A single instance of entity A is associated with only one instance of entity B,
and vice versa.
Example: Each STORE is managed by one EMPLOYEE, and each EMPLOYEE manages at
most one STORE.
● Bidirectional Nature:
Relationships work both ways; understanding how many instances of B relate to one
instance of A—and vice versa—helps define the relationship type.
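One common way to implement the three relationship types in a relational DBMS is sketched below with Python's built-in sqlite3 module. The table names come from the examples above; the column names, the ENROLL bridge table, and the sample values are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript(
    """
    -- 1:M  One CUSTOMER may generate many INVOICEs.
    CREATE TABLE CUSTOMER (CUS_ID INTEGER PRIMARY KEY);
    CREATE TABLE INVOICE (
        INV_ID INTEGER PRIMARY KEY,
        CUS_ID INTEGER NOT NULL REFERENCES CUSTOMER (CUS_ID)
    );

    -- M:N  Many STUDENTs take many CLASSes, through a bridge table.
    CREATE TABLE STUDENT (STU_ID INTEGER PRIMARY KEY);
    CREATE TABLE CLASS (CLASS_ID INTEGER PRIMARY KEY);
    CREATE TABLE ENROLL (
        STU_ID   INTEGER NOT NULL REFERENCES STUDENT (STU_ID),
        CLASS_ID INTEGER NOT NULL REFERENCES CLASS (CLASS_ID),
        PRIMARY KEY (STU_ID, CLASS_ID)
    );

    -- 1:1  Each STORE is managed by one EMPLOYEE; UNIQUE stops the same
    --      employee from being assigned to a second store.
    CREATE TABLE EMPLOYEE (EMP_ID INTEGER PRIMARY KEY);
    CREATE TABLE STORE (
        STORE_ID INTEGER PRIMARY KEY,
        EMP_ID   INTEGER NOT NULL UNIQUE REFERENCES EMPLOYEE (EMP_ID)
    );
    """
)


def accepted(sql):
    """Run one statement; return False if the DBMS rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


conn.execute("INSERT INTO CUSTOMER VALUES (1)")
one_to_many = (accepted("INSERT INTO INVOICE VALUES (10, 1)")
               and accepted("INSERT INTO INVOICE VALUES (11, 1)"))
orphan_invoice = accepted("INSERT INTO INVOICE VALUES (12, 99)")  # no customer 99

conn.execute("INSERT INTO EMPLOYEE VALUES (7)")
first_store = accepted("INSERT INTO STORE VALUES (1, 7)")
second_store = accepted("INSERT INTO STORE VALUES (2, 7)")  # employee 7 already manages a store
print(one_to_many, orphan_invoice, first_store, second_store)  # True False True False
```

The foreign key sits on the "many" side of a 1:M relationship, an M:N relationship is broken into two 1:M relationships through the bridge table, and a 1:1 relationship is a foreign key that must also be unique.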
4.4 Constraints
● Definition:
Restrictions or rules that ensure data integrity.
● Examples:
○ A salary must be between 6,000 and 350,000.
○ A student’s GPA must be between 0.00 and 4.00.
○ Each class must have one and only one teacher.
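The three example constraints can be declared so that the DBMS enforces them. The sketch below uses Python's built-in sqlite3 module; the table and column names (EMP_SALARY, STU_GPA, TEACH_ID, and so on) are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for SQLite to enforce REFERENCES

conn.executescript(
    """
    CREATE TABLE EMPLOYEE (
        EMP_ID     INTEGER PRIMARY KEY,
        EMP_SALARY REAL NOT NULL CHECK (EMP_SALARY BETWEEN 6000 AND 350000)
    );
    CREATE TABLE STUDENT (
        STU_ID  INTEGER PRIMARY KEY,
        STU_GPA REAL NOT NULL CHECK (STU_GPA BETWEEN 0.00 AND 4.00)
    );
    CREATE TABLE TEACHER (TEACH_ID INTEGER PRIMARY KEY);
    -- A single NOT NULL foreign key column: each class has one and only one teacher.
    CREATE TABLE CLASS (
        CLASS_ID INTEGER PRIMARY KEY,
        TEACH_ID INTEGER NOT NULL REFERENCES TEACHER (TEACH_ID)
    );
    INSERT INTO TEACHER VALUES (1);
    """
)


def accepted(sql, params=()):
    """Run one statement; return False if a constraint rejects it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


results = {
    "salary 50000": accepted("INSERT INTO EMPLOYEE VALUES (1, ?)", (50000,)),
    "salary 500": accepted("INSERT INTO EMPLOYEE VALUES (2, ?)", (500,)),
    "gpa 3.25": accepted("INSERT INTO STUDENT VALUES (1, ?)", (3.25,)),
    "gpa 4.50": accepted("INSERT INTO STUDENT VALUES (2, ?)", (4.50,)),
    "class with teacher": accepted("INSERT INTO CLASS VALUES (1, ?)", (1,)),
    "class without teacher": accepted("INSERT INTO CLASS VALUES (2, ?)", (None,)),
}
for label, ok in results.items():
    print(label, "accepted" if ok else "rejected")
```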
5. Business Rules
5.1 Role of Business Rules
● Definition:
Precise, unambiguous descriptions of policies, procedures, or principles within an
organization.
● Purpose:
They form the basis for identifying entities, attributes, relationships, and constraints in
the data model.
● Examples:
○ “A customer may generate many invoices.”
○ “An invoice is generated by only one customer.”
○ “A training session must include between 10 and 30 employees.”
5.2 Discovering and Translating Business Rules
● Sources:
Interviews with managers, policy documents, and department standards.
● Translation:
○ Nouns in business rules typically become entities.
○ Verbs that associate those nouns become relationships.
● Naming Conventions:
Consistent and descriptive naming (e.g., using prefixes like CUS_ for CUSTOMER
attributes) helps ensure that the model is self-documenting and easily understood by all
stakeholders.
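The translation step can be made concrete with a small sketch: each business rule is recorded as noun, verb, noun, and a rule is paired with its reverse to determine the relationship type. The BusinessRule class and the relationship_type function are hypothetical helpers written for this illustration, not part of any library or of the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BusinessRule:
    subject: str  # noun, becomes an entity
    verb: str     # verb, becomes the relationship name
    obj: str      # noun, becomes an entity
    many: bool    # True if one subject may relate to many objects


def relationship_type(forward, reverse):
    """Combine a rule and its reverse into '1:1', '1:M', 'M:1', or 'M:N'."""
    if forward.subject != reverse.obj or forward.obj != reverse.subject:
        raise ValueError("both rules must describe the same pair of entities")
    left = "M" if reverse.many else "1"
    if forward.many:
        right = "N" if reverse.many else "M"
    else:
        right = "1"
    return f"{left}:{right}"


# "A customer may generate many invoices."
generates = BusinessRule("CUSTOMER", "generates", "INVOICE", many=True)
# "An invoice is generated by only one customer."
generated_by = BusinessRule("INVOICE", "is generated by", "CUSTOMER", many=False)

entities = sorted({generates.subject, generates.obj})
print(entities, relationship_type(generates, generated_by))
# ['CUSTOMER', 'INVOICE'] 1:M
```

Reading the rule in both directions is what separates 1:M from M:N; a single rule on its own is not enough to classify the relationship.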
6. Evolution of Data Models
6.1 Overview of Major Data Models
Data models have evolved to address different business and technical challenges. The following
table summarizes this evolution:
Generation                     | Data Model                                 | Examples                                           | Comments
-------------------------------|--------------------------------------------|----------------------------------------------------|---------
First (1960s–1970s)            | File Systems                               | VMS/VSAM                                           | Managed records, not relationships; used on mainframes.
Second (1970s)                 | Hierarchical and Network Models            | IMS, ADABAS, IDS-II                                | Early database systems with navigational access.
Third (Mid-1970s)              | Relational Model                           | DB2, Oracle, MS SQL Server, MySQL                  | Conceptual simplicity; based on set theory and predicate logic.
Fourth (Mid-1980s)             | Object-Oriented & Object/Relational Models | Versant, Objectivity/DB                            | Support for object data types and data warehousing.
Fifth (Mid-1990s)              | XML and Hybrid Models                      | dbXML, Tamino                                      | Unstructured data support, merging relational and object models.
Emerging (Early 2000s–Present) | Big Data & NoSQL                           | Amazon SimpleDB, Google BigTable, Apache Cassandra | Distributed, highly scalable, and designed for unstructured data.
Table adapted from Table 2.1 in the text
6.2 Data Model Notations and Abstraction Levels
● Entity Relationship Diagrams (ERDs):
Graphical tools used to model entities and relationships. Three common notations
include:
○ Chen Notation:
Uses rectangles for entities and diamonds for relationships.
○ Crow’s Foot Notation:
Uses a “crow’s foot” symbol to indicate the “many” side of a relationship.
○ UML Class Diagram Notation:
Part of the Unified Modeling Language; uses multiplicity labels (e.g., 1..*, 1..1) to
denote how many instances of each class take part in a relationship.
● Levels of Abstraction:
Data models can be classified by their abstraction level—from conceptual (ER model) to
logical (relational model) to physical (DBMS implementation details).
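UML labels such as 1..* and 1..1 pair a minimum with a maximum number of related instances. The helper below is a hypothetical sketch that builds such a label from a (minimum, maximum) pair; the function name and the use of None for "many" are assumptions made for illustration.

```python
def uml_multiplicity(minimum, maximum=None):
    """Format a (min, max) pair as a UML multiplicity label; None means 'many'."""
    if minimum < 0 or (maximum is not None and maximum < minimum):
        raise ValueError("need 0 <= minimum <= maximum")
    upper = "*" if maximum is None else str(maximum)
    return f"{minimum}..{upper}"


print(uml_multiplicity(1))       # 1..*   one or more
print(uml_multiplicity(1, 1))    # 1..1   exactly one
print(uml_multiplicity(0, 1))    # 0..1   optional, at most one
```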
7. Emerging Data Models: Big Data and NoSQL
7.1 Big Data Challenges
● The 3 Vs Framework:
○ Volume:
Massive amounts of data, now reaching petabytes.
○ Velocity:
The speed at which data are generated and must be processed.
○ Variety:
Data in multiple formats, including structured, semi-structured, and unstructured.
7.2 NoSQL Databases
● Characteristics:
○ Not Based on the Relational Model:
They are “schema-less” or have a flexible schema.
○ Distributed Architecture:
Designed for high scalability and fault tolerance.
○ Types:
Key-value stores, document databases, column stores, and graph databases.
● Example – Key-Value Model:
○ Each record is stored as a key and a corresponding value.
○ Illustration: In a key-value representation for a company like Trucks-R-Us, the
key might be “Driver” and the value may contain various attributes in a long string
format.
● Practical Considerations:
NoSQL databases are best for applications requiring rapid scalability and handling of
sparse or varied data, while relational databases still dominate day-to-day transactional
applications.
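The key-value idea can be sketched with a plain Python dictionary standing in for the store. The key "Driver" follows the illustration above; the put, get, and parse helpers, the attribute string, and its "name=value;..." layout are assumptions invented for this sketch and do not reflect any particular NoSQL product.

```python
# The store maps keys to opaque strings; it knows nothing about what a value contains.
store = {}


def put(key, value):
    """Save a value under a key, replacing any earlier value."""
    store[key] = value


def get(key):
    """Return the value stored under a key, or None if the key is absent."""
    return store.get(key)


def parse(value):
    """Interpret the long string; this is the application's job, not the store's."""
    return dict(pair.split("=", 1) for pair in value.split(";"))


put("Driver", "first_name=Sam;last_name=Doe;license=CDL-A")

driver = parse(get("Driver"))
print(driver["last_name"])  # Doe
print(get("Truck"))         # None
```

Because the store imposes no schema, a second value could carry different attributes without any change to the store itself, which is the flexibility the notes describe; the cost is that every application must agree on how to read the string.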
Summary and Key Takeaways
● Data Modeling is Fundamental:
It is the first and critical step in database design—helping convert real-world information
into a structured blueprint.
● Models Facilitate Communication:
They align the views of end users, designers, and programmers, reducing
misinterpretations.
● Basic Building Blocks:
Entities, attributes, relationships, and constraints form the foundation of every data
model.
● Business Rules are Essential:
They provide the context and policies that drive the design of a data model.
● Evolution of Models:
From file systems to relational, then to object-oriented and NoSQL, each model
addresses specific challenges.
● Emerging Needs:
Big Data and NoSQL technologies are designed to handle the increasing volume,
velocity, and variety of data in today’s environment.
Study Aids and Diagrams
Diagrammatic Representations:
● ER Diagrams (ERDs):
○ Chen Notation: Entities as rectangles with diamonds for relationships.
○ Crow’s Foot Notation: The “crow’s foot” symbol denotes the “many” side (e.g.,
CUSTOMER 1 – ∞ INVOICE).
○ UML Class Diagram: Uses symbols such as 1..* and 1..1 to depict relationship
multiplicities.
Refer to Figure 2.3 in the text for a side-by-side comparison of these notations.
Key Definition Table:
Term         | Definition
-------------|-----------
Entity       | A distinct object (person, place, thing, or event) about which data is collected.
Attribute    | A property or characteristic of an entity (similar to a field in a file).
Relationship | An association between two or more entities (e.g., CUSTOMER generates INVOICE).
Constraint   | A rule that limits the values that can be stored (e.g., GPA between 0.00 and 4.00).
Data Model   | An abstraction that describes the data structures, relationships, and constraints of a problem domain.
Final Notes
These lecture notes for Chapter 2 provide a comprehensive review of the concepts of data
modeling. Be sure to:
● Understand how data models serve as the blueprint for database design.
● Recognize the importance of business rules in shaping the data model.
● Familiarize yourself with the evolution of data models, from early file systems to
emerging Big Data and NoSQL models.
● Practice reading and drawing ER diagrams using different notations (Chen, Crow’s Foot,
UML).
Review these notes alongside your textbook and class discussions to reinforce your
understanding of the material. A solid grasp of these concepts is essential as they form the
foundation for all subsequent database design and implementation topics.