CHAPTER TWO
DATA MODELING AND ENTITY RELATIONSHIP DIAGRAMS
Using High-Level Data Models for Database Design
• The Entity-Relationship (ER) model is a popular high-level conceptual
data model
Database Design Process
Requirements collection and analysis:
• Here, the database designers interview prospective database users to
understand and document their data requirements
• The result of this step is a concisely written set of users’ requirements
Functional requirements:
• These are specified in parallel with requirements collection
• These consist of the user-defined operations (or transactions) that will be
applied to the database, including both retrievals and updates.
Conceptual design:
• Once all the requirements have been collected and analyzed, the next step
is to create a conceptual schema for the database, using a high-level
conceptual data model
• The conceptual schema includes detailed descriptions of the entity types,
relationships, and constraints
Logical design or data model mapping:
• Deals with the actual implementation of the database
• Uses an implementation data model, such as the relational or the
object-relational database model, so the conceptual schema is
transformed from the high-level data model into the implementation
data model
Physical design:
• Here the internal storage structures, indexes, access paths, and file
organizations for the database files are specified
For more detail, see Figure 3.1 (a simplified diagram illustrating the main
phases of database design) on page 51
Example of Database Application
• The COMPANY database keeps track of a company's employees, departments,
and projects. After the requirements analysis phase, the database designers
provided the following description.
1. The company is organized into departments. Each department has a unique
name, a unique number, and a particular employee who manages the
department.
We keep track of the start date when that employee began managing the
department. A department may have several locations.
2. A department controls a number of projects, each of which has a unique
name, a unique number, and a single location.
3. We store each employee's name, social security number, address, salary, sex,
and birth date. An employee is assigned to one department but may work on
several projects, which are not necessarily controlled by the same
department. We keep track of the number of hours per week that an
employee works on each project. We also keep track of the direct supervisor
of each employee.
4. We want to keep track of the dependents of each employee for insurance
purposes. We keep each dependent's first name, sex, birth date, and
relationship to the employee.
ENTITY TYPES, ENTITY SETS, ATTRIBUTES, AND KEYS
• The ER model describes data as entities, relationships, and attributes.
Entity:
• The basic object that the ER model represents is an entity, which is a
"thing" in the real world with an independent existence.
• An entity may be an object with a physical existence
– for example, a particular person, car, house
• It may be an object with a conceptual existence
– for example, a company, a job, or a university course.
Attribute:
• Each entity has attributes: the particular properties that describe it.
• For example, an employee entity may be described by the
employee's name, age, address, salary, and job. A particular entity
will have a value for each of its attributes
Types of attributes
Composite versus Simple (Atomic) Attributes:
• Composite attributes can be divided into smaller subparts, which
represent more basic attributes with independent meanings.
• For example, the Address attribute of the employee entity can be
subdivided into StreetAddress, City, State, and Zip.
• Attributes that are not divisible are called simple or atomic
attributes
• Composite attributes can form a hierarchy; for example,
StreetAddress can be further subdivided into three simple
attributes: Number, Street, and ApartmentNumber
• If there is no need to refer to the individual components of an
address (zip code, street, and so on), then the whole address can
be designated as a simple attribute.
Single-Valued versus Multivalued Attributes
Single-valued: Most attributes have a single value for a particular
entity; such attributes are called single-valued. For example, Age is
a single-valued attribute of a person.
Multivalued: In some cases an attribute can have a set of values for the
same entity, for example, a Colors attribute for a car or a
CollegeDegrees attribute for a person. Such attributes are called
multivalued attributes.
Stored versus Derived Attributes
Derived attribute:
• In some cases, two (or more) attribute values are related,
for example, the Age and BirthDate attributes of a person.
• For a particular person entity, the value of Age can be determined from the
current (today's) date and the value of that person's BirthDate.
• The Age attribute is hence called a derived attribute and is said to be
derivable from the BirthDate attribute, which is called a stored attribute.
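The derivation of Age from BirthDate can be sketched in a few lines of Python. This is only an illustration: the function name age_on and the sample dates are assumptions, not part of the COMPANY example.

```python
from datetime import date

def age_on(birth_date: date, today: date) -> int:
    """Derive Age (in whole years) from the stored BirthDate."""
    years = today.year - birth_date.year
    # Subtract one year if this year's birthday has not happened yet.
    if (today.month, today.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years

# Hypothetical person: only BirthDate is stored; Age is computed on demand.
birth = date(1990, 6, 15)
print(age_on(birth, date(2024, 6, 14)))  # 33
print(age_on(birth, date(2024, 6, 15)))  # 34
```

Because Age is recomputed from the stored value, it never goes stale as the current date changes.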
Null Values:
• In some cases a particular entity may not have an applicable value for
an attribute
• For example, the ApartmentNumber attribute of an address applies
only to addresses that are in apartment buildings and not to other
types of residences, such as single-family homes.
• Similarly, a CollegeDegrees attribute applies only to persons with
college degrees. In both cases null means "not applicable"
• Null can also be used if we do not know the value of an attribute for a
particular entity
– for example, if we do not know the home phone of "John Smith". Here null
means "unknown"
Complex attributes:
• Composite and multivalued attributes can be nested
• Components of a composite attribute are grouped between
parentheses () and separated by commas
• Multivalued attributes are displayed between braces {}
• For example, if a person can have more than one residence and
each residence can have multiple phones, an attribute
AddressPhone for a person can be specified as
{AddressPhone({Phone(AreaCode, PhoneNumber)}, Address(StreetAddress(Number, Street, ApartmentNumber), City, State, Zip))}
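One way to read the nested notation is as nested record types, with a list for every pair of braces. The Python sketch below is an assumption-laden illustration: the class names mirror the attribute names, and the sample person and all values are invented.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Phone:                      # composite: Phone(AreaCode, PhoneNumber)
    area_code: str
    phone_number: str

@dataclass
class StreetAddress:              # composite: StreetAddress(Number, Street, ApartmentNumber)
    number: str
    street: str
    apartment_number: Optional[str]  # None plays the role of a null value

@dataclass
class Address:                    # composite: Address(StreetAddress(...), City, State, Zip)
    street_address: StreetAddress
    city: str
    state: str
    zip_code: str

@dataclass
class AddressPhone:               # one residence together with its phones
    phones: List[Phone]           # {Phone(...)}: multivalued
    address: Address

@dataclass
class Person:
    name: str
    address_phones: List[AddressPhone]  # {AddressPhone(...)}: multivalued

# Invented sample data: one person with two residences.
person = Person(
    name="John Smith",
    address_phones=[
        AddressPhone(
            phones=[Phone("713", "555-0100"), Phone("713", "555-0101")],
            address=Address(StreetAddress("2311", "Kirby", None), "Houston", "TX", "77001"),
        ),
        AddressPhone(
            phones=[Phone("512", "555-0199")],
            address=Address(StreetAddress("45", "Main", "3B"), "Austin", "TX", "73301"),
        ),
    ],
)

total_phones = sum(len(ap.phones) for ap in person.address_phones)
print(len(person.address_phones), total_phones)  # 2 3
```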
Entity Types, Entity Sets, Keys, and Value Sets
Entity type:
• Defines a collection (or set) of entities that have the same attributes.
• Ex: a company employing hundreds of employees may want to store
similar information concerning each of the employees
• These employee entities share the same attributes, but each entity
has its own value(s) for each attribute
• Each entity type in the database is described by its name and
attributes
Entity set:
• The collection of all entities of a particular entity type in the
database at any point in time
• For example, EMPLOYEE refers to both a type of entity as well as the
current set of all employee entities in the database.
Entity type name:  EMPLOYEE                     COMPANY
Attributes:        Name, Age, Salary            Name, Headquarters, President
Entity set:        e1 (John Smith, 55, 80K)     c1 (Sunco Oil, Houston, John Smith)
                   e2 (Fred Brown, 40, 30K)     c2 (Fast Computer, Dallas, Bob King)
                   e3 (Judy Clark, 25, 20K)     ...
                   ...
Figure: Two entity types, EMPLOYEE and COMPANY, and some member entities of
each.
Key Attributes of an Entity Type:
• An important constraint on the entities of an entity type is the key or
uniqueness constraint on attributes
• An entity type usually has an attribute whose values are distinct for
each individual entity in the entity set.
• Such an attribute is called a key attribute, and its values can be used to
identify each entity uniquely
• For example, the Name attribute is a key of the COMPANY entity type
because no two companies are allowed to have the same name.
• For the PERSON entity type, a typical key attribute is
SocialSecurityNumber
• Sometimes, several attributes together form a key, meaning that the
combination of the attribute values must be distinct for each entity.
• If a set of attributes possesses this property, the proper way to
represent this in the ER model is to define a composite attribute and
designate it as a key attribute of the entity type
• In ER diagram notation, each key attribute has its name underlined inside the oval
• Some entity types have more than one key attribute
• For example, each of the VehicleID and Registration attributes of
the entity type CAR is a key in its own right
CAR:
• Registration(RegistrationNumber, State), VehicleID, Make, Model,
Year, {Color}
Ex: car1: ((ABC 123, TEXAS), TK629, Ford Mustang, convertible, 1998, {red, black})
car2: ((ABC 123, NEW YORK), WP9872, Nissan Maxima, 4-door, 1999, {blue})
car3: ((VSY 720, TEXAS), TD729, Chrysler LeBaron, 4-door, 1995, {white, blue})
• The Registration attribute is an example of a composite key formed
from two simple component attributes, RegistrationNumber and
State, neither of which is a key on its own
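The three CAR entities above can be used to test candidate keys. In the Python sketch below the helper is_key is an assumption for illustration. A check on sample data can only refute a key; the key constraint itself must hold for every possible entity set, not just this one.

```python
# The three CAR entities from the example, reduced to the attributes of interest.
cars = [
    {"RegistrationNumber": "ABC 123", "State": "TEXAS", "VehicleID": "TK629"},
    {"RegistrationNumber": "ABC 123", "State": "NEW YORK", "VehicleID": "WP9872"},
    {"RegistrationNumber": "VSY 720", "State": "TEXAS", "VehicleID": "TD729"},
]

def is_key(entities, attrs):
    """True if the combination of attrs is distinct for every entity given."""
    seen = {tuple(entity[a] for a in attrs) for entity in entities}
    return len(seen) == len(entities)

print(is_key(cars, ["RegistrationNumber"]))           # False
print(is_key(cars, ["State"]))                        # False
print(is_key(cars, ["RegistrationNumber", "State"]))  # True
print(is_key(cars, ["VehicleID"]))                    # True
```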
Weak entity type:
• An entity type may also have no key, in which case it is called a weak
entity type
Value Sets (Domains of Attributes)
• Each simple attribute of an entity type is associated with a value
set (or domain of values), which specifies the set of values that
may be assigned to that attribute for each individual entity
• Ex: we can specify the value set of the Age attribute of EMPLOYEE
to be the set of integer numbers between 16 and 70
• Value sets are typically specified using the basic data types
available in most programming languages, such as integer, string,
boolean, float, enumerated type, and so on
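As a rough illustration, the Age value set from the example can be written as a Python range and used to validate candidate values. The names AGE_VALUE_SET and valid_age are assumptions made for this sketch.

```python
# Value set for the Age attribute of EMPLOYEE: the integers 16, 17, ..., 70.
AGE_VALUE_SET = range(16, 71)

def valid_age(value) -> bool:
    """True if value is an integer that belongs to the Age value set."""
    return isinstance(value, int) and value in AGE_VALUE_SET

print(valid_age(16), valid_age(70))  # True True
print(valid_age(15), valid_age(71))  # False False
print(valid_age("40"))               # False
```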
Initial Conceptual Design of the COMPANY Database
• We can identify four entity types, one corresponding to each of the
four items in the specification
1. An entity type DEPARTMENT with attributes Name, Number,
Locations, Manager, and ManagerStartDate. Locations is the only
multivalued attribute. We can specify that both Name and Number
are (separate) key attributes, because each was specified to be
unique.
2. An entity type PROJECT with attributes Name, Number, Location, and
ControllingDepartment. Both Name and Number are (separate) key
attributes.
3. An entity type EMPLOYEE with attributes Name, SSN (for social
security number), Sex, Address, Salary, BirthDate, Department, and
Supervisor.
4. An entity type DEPENDENT with attributes Employee,
DependentName, Sex, BirthDate, and Relationship (to the employee).
DEPARTMENT
Name, Number, {Locations}, Manager, ManagerStartDate
PROJECT
Name, Number, Location, ControllingDepartment
EMPLOYEE
Name(FName, MInit, LName), SSN, Sex, Address, Salary,
BirthDate, Department, Supervisor, {WorksOn(Project, Hours)}
DEPENDENT
Employee, DependentName, Sex, BirthDate, Relationship
RELATIONSHIPS IN DATABASE
Relationship:
• Whenever an attribute of one entity type refers to another entity
type, some relationship exists
For example:
• the attribute Manager of DEPARTMENT refers to an employee who
manages the department;
• the attribute ControllingDepartment of PROJECT refers to the
department that controls the project;
• the attribute Supervisor of EMPLOYEE refers to another employee
(the one who supervises this employee);
• the attribute Department of EMPLOYEE refers to the department
for which the employee works
Relationship Types, Sets, and Instances
• A relationship type R among n entity types E1, E2, ..., En defines a
set of associations, or a relationship set, among entities from these
entity types
• Each of the entity types E1, E2, ..., En is said to participate in the
relationship type R
• Similarly, each of the individual entities e1, e2, ..., en is said to
participate in the relationship instance ri = (e1, e2, ..., en)
• The entities participating in ri are related in some way in the
corresponding miniworld situation
• For example: a relationship type WORKS_FOR between the two
entity types EMPLOYEE and DEPARTMENT, which associates each
employee with the department for which the employee works.
• Each relationship instance in the relationship set WORKS_FOR
associates one employee entity and one department entity
Figure: Some instances r1, ..., r7 of the WORKS_FOR relationship set, each
connecting one EMPLOYEE entity (e1, ..., e7) to one DEPARTMENT entity (d1, d2, d3).
• The figure shows that employees e1, e3, and e6 work for department
d1; e2 and e4 work for d2; and e5 and e7 work for d3
• In ER diagrams, relationship types are displayed as diamond-shaped
boxes (see Figure 3.2)
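The WORKS_FOR relationship set described for the figure can be written as a set of (employee, department) pairs, one pair per relationship instance. This is a minimal Python sketch; the helper name employees_of is an assumption for illustration.

```python
# Each pair is one relationship instance (e, d) of WORKS_FOR.
works_for = {
    ("e1", "d1"), ("e2", "d2"), ("e3", "d1"), ("e4", "d2"),
    ("e5", "d3"), ("e6", "d1"), ("e7", "d3"),
}

def employees_of(department):
    """Employees related to the given department via WORKS_FOR."""
    return sorted(e for e, d in works_for if d == department)

print(employees_of("d1"))  # ['e1', 'e3', 'e6']
print(employees_of("d2"))  # ['e2', 'e4']
print(employees_of("d3"))  # ['e5', 'e7']
```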
Relationship Degree, Role Names, and Recursive Relationships
Degree of a Relationship Type:
• The degree of a relationship type is the number of participating
entity types.
• The WORKS_FOR relationship is of degree two
• A relationship type of degree two is called binary, and one of
degree three is called ternary
• Ex: a ternary relationship instance exists whenever supplier s supplies
part p to project j (see Figure 3.10, page 63)
• Relationships can generally be of any degree, and higher degrees become
more complex, but the most common ones are binary relationships
Role Names
• Each entity type that participates in a relationship type plays a
particular role in the relationship
• The role name signifies the role that a participating entity from the
entity type plays in each relationship instance
• For example, in the WORKS_FOR relationship type, EMPLOYEE plays
the role of employee or worker and DEPARTMENT plays the role of
department or employer.
Recursive Relationships:
• Role names are not technically necessary in relationship types where
all the participating entity types are distinct, since each participating
entity type name can be used as the role name
• However, in some cases the same entity type participates more than
once in a relationship type in different roles
• In such cases the role name becomes essential for distinguishing the
meaning of each participation
• The SUPERVISION relationship type relates an employee to a
supervisor, where both employee and supervisor entities are
members of the same EMPLOYEE entity type.
• The EMPLOYEE entity type participates twice in SUPERVISION:
once in the role of supervisor (or boss), and once in the role of
supervisee (or subordinate) (see Figure 3.11, page 65)
Constraints on Relationship Types
• Relationship types usually have certain constraints that
limit the possible combinations of entities that may
participate in the corresponding relationship set
• Determined from the miniworld situation
• For example, in the WORKS_FOR relationship between EMPLOYEE and
DEPARTMENT, if the company has a rule that each employee must work
for exactly one department, then we would like to describe
this constraint in the schema
• The two main types of relationship constraints are
cardinality ratio and participation
Cardinality Ratios for Binary Relationships
• The cardinality ratio for a binary relationship specifies the maximum
number of relationship instances that an entity can participate in
• The possible cardinality ratios for binary relationship types are:
– 1:1 one to one
– 1:N one to many
– M:N many to many
• One to many: for example, in the WORKS_FOR binary relationship type,
DEPARTMENT:EMPLOYEE is of cardinality ratio 1:N,
meaning that each department can be related to (that is, employs) any
number of employees, but an employee can be related to (work for)
only one department.
• One to one: An example of a 1:1 binary relationship is MANAGES which
relates a department entity to the employee who manages that
department
• The miniworld constraints are that, at any point in time, an employee can
manage only one department and a department has only one manager
• Many to many: the relationship type WORKS_ON is of cardinality ratio
M:N, because the miniworld rule is that an employee can work on
several projects and a project can have several employees
• See Figures 3.11, 3.12, and 3.13 in your book
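To make the three cardinality ratios concrete, the sketch below classifies the ratio observed in a sample relationship set. The function cardinality_ratio and the sample data are assumptions for illustration. A cardinality ratio is a rule about every allowed database state, so a sample can show that a rule is violated but cannot prove that it holds.

```python
from collections import Counter

def cardinality_ratio(instances):
    """Classify the observed ratio left:right of a binary relationship set.

    instances is an iterable of (left_entity, right_entity) pairs.
    """
    instances = set(instances)
    per_left = Counter(left for left, _ in instances)
    per_right = Counter(right for _, right in instances)
    # A left entity linked to several right entities makes the right side "many".
    many_right = max(per_left.values(), default=0) > 1
    # A right entity linked to several left entities makes the left side "many".
    many_left = max(per_right.values(), default=0) > 1
    if many_left and many_right:
        return "M:N"
    if many_right:
        return "1:N"
    if many_left:
        return "N:1"
    return "1:1"

works_for = {("d1", "e1"), ("d1", "e3"), ("d2", "e2")}  # DEPARTMENT:EMPLOYEE
manages = {("d1", "e1"), ("d2", "e2")}                  # DEPARTMENT:EMPLOYEE
works_on = {("e1", "p1"), ("e1", "p2"), ("e2", "p1")}   # EMPLOYEE:PROJECT

print(cardinality_ratio(works_for))  # 1:N
print(cardinality_ratio(manages))    # 1:1
print(cardinality_ratio(works_on))   # M:N
```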
Participation Constraints and Existence Dependencies
• The participation constraint specifies whether the existence of an
entity depends on its being related to another entity via the
relationship type
• There are two types of participation constraints: total and partial
Total participation:
• Here an entity can exist only if it participates in at least one
relationship instance
• Ex: If a company policy states that every employee must work for a
department, then an employee entity can exist only if it
participates in at least one WORKS_FOR relationship instance
• Thus, the participation of EMPLOYEE in WORKS_FOR is called total
participation
• meaning that every entity in "the total set" of employee entities
must be related to a department entity via WORKS_FOR
• Total participation is also called existence dependency
Partial participation:
• Not all entities are necessarily involved in a relationship instance
• Ex: We do not expect every employee to manage a department, so
the participation of EMPLOYEE in the MANAGES relationship type
is partial,
• meaning that some or "part of the set of" employee entities are
related to some department entity via MANAGES, but not
necessarily all.
• We will refer to the cardinality ratio and participation constraints,
taken together, as the structural constraints of a relationship type
• In ER diagrams, total participation (or existence dependency) is
displayed as a double line connecting the participating entity type
to the relationship, whereas partial participation is represented by
a single line
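Total versus partial participation can be checked on sample data by asking whether every entity of the entity set appears in at least one relationship instance. The Python sketch below uses invented entity names, and the helper participation is an assumption for illustration.

```python
def participation(entity_set, instances, position):
    """Return 'total' if every entity appears at `position` in some instance."""
    participating = {instance[position] for instance in instances}
    return "total" if set(entity_set) <= participating else "partial"

employees = {"e1", "e2", "e3"}
departments = {"d1", "d2"}
works_for = {("e1", "d1"), ("e2", "d1"), ("e3", "d2")}  # (employee, department)
manages = {("e1", "d1"), ("e3", "d2")}                  # (employee, department)

print(participation(employees, works_for, 0))  # total
print(participation(employees, manages, 0))    # partial
print(participation(departments, manages, 1))  # total
```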
Weak entities
• Entity types that do not have key attributes of their own are called
weak entity types.
• In contrast, regular entity types that do have a key attribute (which
include all the examples we discussed so far) are called strong entity
types.
• Entities belonging to a weak entity type are identified by being
related to specific entities from another entity type in combination
with one of their attribute values
• We call this other entity type the identifying or owner entity type
• and we call the relationship type that relates a weak entity type to
its owner the identifying relationship of the weak entity type.
• A weak entity type always has a total participation constraint
(existence dependency) with respect to its identifying relationship,
because a weak entity cannot be identified without an owner entity.
• Consider the entity type DEPENDENT, related to EMPLOYEE, which is
used to keep track of the dependents of each employee via a 1:N
relationship
• The attributes of DEPENDENT are Name (the first name of the
dependent), BirthDate, Sex, and Relationship (to the employee)
• Two dependents of two distinct employees may, by chance, have the
same values for Name, BirthDate, Sex, and Relationship, but they
are still distinct entities
• They are identified as distinct entities only after determining the
particular employee entity to which each dependent is related
• A weak entity type normally has a partial key, which is the set of
attributes that can uniquely identify weak entities that are related to
the same owner entity
• Ex: if we assume that no two dependents of the same employee ever
have the same first name, the attribute Name of DEPENDENT is the
partial key
• In ER diagrams, both a weak entity type and its identifying
relationship are distinguished by surrounding their boxes and
diamonds with double lines (see Figure 3.2).
• The partial key attribute is underlined with a dashed or dotted line.
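The role of the partial key can be illustrated with a small Python sketch. The sample dependents and employee numbers are invented; the point is only that Name alone does not identify a dependent, whereas the owner's key combined with Name does.

```python
# (owner SSN, Name, Sex, BirthDate, Relationship); all values are invented.
dependents = [
    ("123456789", "Alice", "F", "1988-12-30", "Daughter"),
    ("987654321", "Alice", "F", "1988-12-30", "Daughter"),
    ("123456789", "Theodore", "M", "1983-10-25", "Son"),
]

# The partial key alone: two distinct dependents collapse into one name.
names_only = {name for _, name, *_ in dependents}

# Owner key + partial key: every dependent is identified uniquely.
by_full_key = {
    (essn, name): (sex, bdate, rel)
    for essn, name, sex, bdate, rel in dependents
}

print(len(names_only))   # 2
print(len(by_full_key))  # 3
```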
ER DIAGRAMS, NAMING CONVENTIONS, AND DESIGN ISSUES
• Figure 3.2 displays the COMPANY ER database schema as an ER diagram.
• Entity types such as EMPLOYEE, DEPARTMENT, and PROJECT are shown in rectangular boxes.
• Relationship types such as WORKS_FOR, MANAGES, CONTROLS, and WORKS_ON are shown
in diamond-shaped boxes attached to the participating entity types with straight lines.
• Attributes are shown in ovals, and each attribute is attached by a straight line to its entity
type or relationship type.
• Component attributes of a composite attribute are attached to the oval representing the
composite attribute, as illustrated by the Name attribute of EMPLOYEE.
• Multivalued attributes are shown in double ovals, as illustrated by the Locations attribute
of DEPARTMENT.
• Key attributes have their names underlined.
• Derived attributes are shown in dotted ovals, as illustrated by the NumberOfEmployees
attribute of DEPARTMENT.
• Weak entity types are distinguished by being placed in double rectangles and by having
their identifying relationship placed in double diamonds.
• The partial key of the weak entity type is underlined with a dotted line.
• The cardinality ratio of each binary relationship type is specified by attaching
a 1, M, or N on each participating edge.
• The cardinality ratio of DEPARTMENT:EMPLOYEE in MANAGES is 1:1,
whereas it is 1:N for DEPARTMENT:EMPLOYEE in WORKS_FOR, and M:N for WORKS_ON.
REFINING THE ER DESIGN FOR THE COMPANY DATABASE
1. MANAGES
• A 1:1 relationship type between EMPLOYEE and DEPARTMENT.
EMPLOYEE participation is partial.
• DEPARTMENT participation is not clear from the requirements.
We question the users, who say that a department must have a
manager at all times, which implies total participation. The
attribute StartDate is assigned to this relationship type.
2. WORKS_FOR
• A 1:N relationship type between DEPARTMENT and EMPLOYEE.
Both participations are total.
3. CONTROLS
• A 1:N relationship type between DEPARTMENT and PROJECT.
• The participation of PROJECT is total, whereas that of
DEPARTMENT is determined to be partial, after consultation with
the users indicates that some departments may control no
projects.
4. SUPERVISION
• A 1:N relationship type between EMPLOYEE (in the supervisor role)
and EMPLOYEE (in the supervisee role).
• Both participations are determined to be partial, after the users
indicate that not every employee is a supervisor and not every
employee has a supervisor.
5. WORKS_ON
• Determined to be an M:N relationship type with attribute Hours,
after the users indicate that a project can have several employees
working on it. Both participations are determined to be total.
6. DEPENDENTS_OF
• A 1:N relationship type between EMPLOYEE and DEPENDENT, which is
also the identifying relationship for the weak entity type DEPENDENT.
The participation of EMPLOYEE is partial, whereas that of
DEPENDENT is total.
ER DIAGRAMS, NAMING CONVENTIONS, AND DESIGN ISSUES
• See Figure 3.14: Summary of the notation for ER
diagrams, on page 72
• See Figure 3.2: the COMPANY ER database schema as an ER
diagram, drawn with this notation
• See also Figure 3.15: ER diagrams for the
COMPANY schema, with structural constraints
specified using (min, max) notation and with
role names
Mapping ER Models to Relational Tables
• The steps of an algorithm for ER-to-relational mapping follow
Step 1: Mapping of Regular Entity Types
• For each regular (strong) entity type E in the ER schema, create a
relation R that includes all the simple attributes of E
• Choose one of the key attributes of E as primary key for R.
• Foreign key and relationship attributes are not included yet; they are
added in later steps, where the primary key of one table is placed as a
foreign key in a second table to support referential integrity
• In the COMPANY database these later additions are SUPERSSN
and DNO of EMPLOYEE, MGRSSN and MGRSTARTDATE of
DEPARTMENT, and DNUM of PROJECT
• SSN, DNUMBER, and PNUMBER are chosen as primary keys for the relations
EMPLOYEE, DEPARTMENT, and PROJECT, respectively
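Step 1 can be tried out with Python's built-in sqlite3 module and an in-memory database. This is a sketch under assumptions: apart from SSN, DNUMBER, and PNUMBER, the column names (FNAME, MINIT, LNAME, BDATE, DNAME, PNAME, PLOCATION) and all column types are chosen for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Step 1: one relation per strong entity type, simple attributes only.
-- The composite Name is stored through its simple components.
CREATE TABLE EMPLOYEE (
    FNAME   TEXT,
    MINIT   TEXT,
    LNAME   TEXT,
    SSN     TEXT PRIMARY KEY,
    BDATE   TEXT,
    ADDRESS TEXT,
    SEX     TEXT,
    SALARY  REAL
);
CREATE TABLE DEPARTMENT (
    DNAME   TEXT UNIQUE,          -- the second key attribute
    DNUMBER INTEGER PRIMARY KEY   -- the key chosen as primary key
);
CREATE TABLE PROJECT (
    PNAME     TEXT UNIQUE,
    PNUMBER   INTEGER PRIMARY KEY,
    PLOCATION TEXT
);
""")

tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)  # ['DEPARTMENT', 'EMPLOYEE', 'PROJECT']
```

No foreign keys appear yet, and the multivalued Locations attribute of DEPARTMENT is left out because it is handled in Step 6.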
Step 2: Mapping of Weak Entity Types
• For each weak entity type W in the ER schema with owner entity type
E, create a relation R and include all simple attributes (or simple
components of composite attributes) of W as attributes of R
• In addition, include as foreign key attributes of R the primary key
attribute(s) of the relation(s) that correspond to the owner entity
type(s);
this takes care of the identifying relationship type of W
• The primary key of R is the combination of the primary key(s) of the
owner(s) and the partial key of the weak entity type W, if any
• Example: we include the primary key SSN of the EMPLOYEE relation, which
corresponds to the owner entity type, as a foreign key attribute of
DEPENDENT, and rename it ESSN.
• The primary key of the DEPENDENT relation is the combination
{ESSN, DEPENDENT_NAME} because DEPENDENT_NAME is the
partial key of DEPENDENT
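A possible rendering of Step 2 with Python's sqlite3 module is sketched below. The EMPLOYEE table is cut down to its key, and the sample values, the column types, and the helper try_insert are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE EMPLOYEE (SSN TEXT PRIMARY KEY);
CREATE TABLE DEPENDENT (
    ESSN           TEXT NOT NULL REFERENCES EMPLOYEE(SSN),  -- owner's primary key
    DEPENDENT_NAME TEXT NOT NULL,                            -- partial key
    SEX            TEXT,
    BDATE          TEXT,
    RELATIONSHIP   TEXT,
    PRIMARY KEY (ESSN, DEPENDENT_NAME)
);
INSERT INTO EMPLOYEE VALUES ('111'), ('222');
-- The same first name under two different owners is allowed.
INSERT INTO DEPENDENT (ESSN, DEPENDENT_NAME) VALUES ('111', 'Alice'), ('222', 'Alice');
""")

def try_insert(essn, name):
    """Attempt to add a dependent; report whether the constraints allowed it."""
    try:
        conn.execute(
            "INSERT INTO DEPENDENT (ESSN, DEPENDENT_NAME) VALUES (?, ?)",
            (essn, name))
        return True
    except sqlite3.IntegrityError:
        return False

same_owner_same_name = try_insert("111", "Alice")  # violates the primary key
unknown_owner = try_insert("999", "Bob")           # violates the foreign key
new_dependent = try_insert("111", "Bob")
print(same_owner_same_name, unknown_owner, new_dependent)  # False False True
```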
Step 3: Mapping of Binary 1:1 Relationship Types
• For each binary 1:1 relationship type R in the ER schema, identify the
relations S and T that correspond to the entity types participating in R
• Use the foreign key approach
• That is, choose one of the relations (S, say) and include as a foreign key
in S the primary key of T.
• It is better to choose an entity type with total participation in R in the
role of S.
• Example: we map the 1:1 relationship type MANAGES by
choosing the participating entity type DEPARTMENT to serve in the role
of S, because its participation in the MANAGES relationship type is total
(every department has a manager).
• We include the primary key of the EMPLOYEE relation as foreign key in
the DEPARTMENT relation and rename it MGRSSN.
• We also include the simple attribute STARTDATE of the MANAGES
relationship type in the DEPARTMENT relation and rename it
MGRSTARTDATE.
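The foreign key approach for MANAGES might look as follows in Python with sqlite3. Declaring MGRSSN as NOT NULL (total participation of DEPARTMENT) and UNIQUE (an employee manages at most one department) is one way to enforce the 1:1 constraints and is an assumption of this sketch, as are the sample rows and the helper add_department.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE EMPLOYEE (SSN TEXT PRIMARY KEY);
CREATE TABLE DEPARTMENT (
    DNUMBER      INTEGER PRIMARY KEY,
    DNAME        TEXT UNIQUE,
    MGRSSN       TEXT NOT NULL UNIQUE REFERENCES EMPLOYEE(SSN),
    MGRSTARTDATE TEXT   -- attribute of the MANAGES relationship type
);
INSERT INTO EMPLOYEE VALUES ('111'), ('222');
INSERT INTO DEPARTMENT VALUES (1, 'Research', '111', '2020-01-01');
""")

def add_department(number, name, mgrssn, start):
    """Attempt to add a department; report whether the constraints allowed it."""
    try:
        conn.execute("INSERT INTO DEPARTMENT VALUES (?, ?, ?, ?)",
                     (number, name, mgrssn, start))
        return True
    except sqlite3.IntegrityError:
        return False

second_for_same_manager = add_department(2, "Admin", "111", "2021-01-01")
without_manager = add_department(3, "Sales", None, None)
with_new_manager = add_department(4, "HQ", "222", "2022-01-01")
print(second_for_same_manager, without_manager, with_new_manager)  # False False True
```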
Step 4: Mapping of Binary 1:N Relationship Types
• For each regular binary 1:N relationship type R, identify the relation S
that represents the participating entity type at the N-side of the
relationship type
• Include as foreign key in S the primary key of the relation T that
represents the other entity type participating in R
• Example:
– For WORKS_FOR we include the primary key DNUMBER of the DEPARTMENT
relation as foreign key in the EMPLOYEE relation and call it DNO.
– For SUPERVISION we include the primary key of the EMPLOYEE relation as
foreign key in the EMPLOYEE relation itself, because the relationship is
recursive, and call it SUPERSSN.
– The CONTROLS relationship is mapped to the foreign key attribute DNUM of
PROJECT, which references the primary key DNUMBER of the DEPARTMENT
relation.
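The three 1:N mappings of Step 4 can be sketched together with Python's sqlite3 module. Tables are reduced to the columns involved; the sample rows and the NOT NULL choices (which reflect total participation in WORKS_FOR and CONTROLS) are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE DEPARTMENT (DNUMBER INTEGER PRIMARY KEY, DNAME TEXT);
CREATE TABLE EMPLOYEE (
    SSN      TEXT PRIMARY KEY,
    DNO      INTEGER NOT NULL REFERENCES DEPARTMENT(DNUMBER),  -- WORKS_FOR
    SUPERSSN TEXT REFERENCES EMPLOYEE(SSN)                     -- SUPERVISION (recursive)
);
CREATE TABLE PROJECT (
    PNUMBER INTEGER PRIMARY KEY,
    DNUM    INTEGER NOT NULL REFERENCES DEPARTMENT(DNUMBER)    -- CONTROLS
);
INSERT INTO DEPARTMENT VALUES (1, 'Research');
INSERT INTO DEPARTMENT VALUES (2, 'Admin');
INSERT INTO EMPLOYEE VALUES ('111', 1, NULL);   -- no supervisor: partial participation
INSERT INTO EMPLOYEE VALUES ('222', 1, '111');
INSERT INTO EMPLOYEE VALUES ('333', 2, '111');
INSERT INTO PROJECT VALUES (10, 1);
INSERT INTO PROJECT VALUES (20, 1);
""")

# The foreign key sits on the N-side, so the 1-side is found by grouping.
per_department = conn.execute(
    "SELECT DNO, COUNT(*) FROM EMPLOYEE GROUP BY DNO ORDER BY DNO").fetchall()
supervisees = [row[0] for row in conn.execute(
    "SELECT SSN FROM EMPLOYEE WHERE SUPERSSN = '111' ORDER BY SSN")]
print(per_department)  # [(1, 2), (2, 1)]
print(supervisees)     # ['222', '333']
```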
Step 5: Mapping of Binary M:N Relationship Types.
• For each binary M:N relationship type R, create a new relation S to
represent R
• Include as foreign key attributes in S the primary keys of the
relations that represent the participating entity types;
• their combination will form the primary key of S
• We cannot represent an M:N relationship type by a single foreign key
attribute in one of the participating relations (as we did for 1:1 or 1:N
relationship types) because of the M:N cardinality ratio; we must
create a separate relationship relation S.
• Example: we map the M:N relationship type WORKS_ON by creating
the relation WORKS_ON. We include the primary keys of the
PROJECT and EMPLOYEE relations as foreign keys in WORKS_ON and
rename them PNO and ESSN, respectively
• The primary key of the WORKS_ON relation is the combination of the
foreign key attributes {ESSN, PNO}.
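A minimal sqlite3 sketch of the WORKS_ON relation follows. The HOURS column carries the Hours attribute of the relationship type; the reduced EMPLOYEE and PROJECT tables, the column types, and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE EMPLOYEE (SSN TEXT PRIMARY KEY);
CREATE TABLE PROJECT (PNUMBER INTEGER PRIMARY KEY);
CREATE TABLE WORKS_ON (
    ESSN  TEXT    NOT NULL REFERENCES EMPLOYEE(SSN),
    PNO   INTEGER NOT NULL REFERENCES PROJECT(PNUMBER),
    HOURS REAL,                 -- attribute of the relationship type
    PRIMARY KEY (ESSN, PNO)     -- combination of the two foreign keys
);
INSERT INTO EMPLOYEE VALUES ('111'), ('222');
INSERT INTO PROJECT VALUES (1), (2);
INSERT INTO WORKS_ON VALUES ('111', 1, 32.5), ('111', 2, 7.5), ('222', 1, 20.0);
""")

# Many projects per employee and many employees per project.
projects_of_111 = [row[0] for row in conn.execute(
    "SELECT PNO FROM WORKS_ON WHERE ESSN = '111' ORDER BY PNO")]
staff_of_1 = [row[0] for row in conn.execute(
    "SELECT ESSN FROM WORKS_ON WHERE PNO = 1 ORDER BY ESSN")]
print(projects_of_111)  # [1, 2]
print(staff_of_1)       # ['111', '222']
```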
Step 6: Mapping of Multivalued Attributes.
• For each multivalued attribute A, create a new relation R.
• This relation R will include an attribute corresponding to A, plus the
primary key attribute K (as a foreign key in R) of the relation that
represents the entity type that has A as an attribute
• The primary key of R is the combination of A and K
• Example: we create a relation DEPT_LOCATIONS.
• The attribute DLOCATION represents the multivalued attribute
LOCATIONS of DEPARTMENT, while DNUMBER, as a foreign key,
represents the primary key of the DEPARTMENT relation.
• The primary key of DEPT_LOCATIONS is the combination of
{DNUMBER, DLOCATION}.
• A separate tuple will exist in DEPT_LOCATIONS for each location that
a department has.
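Step 6 can be sketched the same way with sqlite3. The department number, its name, and the three locations are invented sample data; only the table and column names DEPT_LOCATIONS, DNUMBER, and DLOCATION come from the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE DEPARTMENT (DNUMBER INTEGER PRIMARY KEY, DNAME TEXT);
CREATE TABLE DEPT_LOCATIONS (
    DNUMBER   INTEGER NOT NULL REFERENCES DEPARTMENT(DNUMBER),
    DLOCATION TEXT    NOT NULL,
    PRIMARY KEY (DNUMBER, DLOCATION)
);
INSERT INTO DEPARTMENT VALUES (5, 'Research');
-- One tuple per location of the department.
INSERT INTO DEPT_LOCATIONS VALUES (5, 'Bellaire'), (5, 'Sugarland'), (5, 'Houston');
""")

locations = [row[0] for row in conn.execute(
    "SELECT DLOCATION FROM DEPT_LOCATIONS WHERE DNUMBER = 5 ORDER BY DLOCATION")]
print(locations)  # ['Bellaire', 'Houston', 'Sugarland']
```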
Figure: Result of mapping the COMPANY ER
schema into a relational database schema.
See also the populated COMPANY database with sample
data on the next slide.