07 Normalization
CII1J3 Database Modelling
Learning Outcomes
Students can create database models correctly using the E-R model,
the relational model, and normalization.
Topics
Steps in Normalization
First Normal Form (1NF)
Second Normal Form (2NF)
Third Normal Form (3NF)
Intro
• Now that we have examined functional dependencies and keys, we are
ready to describe and illustrate the steps of normalization.
• If an EER data model has been transformed into a comprehensive set of
relations for the database, then each of these relations needs to be
normalized.
• In other cases in which the logical data model is being derived from user
interfaces, such as screens, forms, and reports, you will want to create
relations for each user interface and normalize those relations.
Steps in Normalization
Steps in Normalization (cont.)
1. First normal form: Any multivalued attributes (also called repeating groups) have been
removed, so there is a single value (possibly null) at the intersection of each row and column
of the table.
2. Second normal form: Any partial functional dependencies have been removed (i.e., nonkey
attributes are identified by the whole primary key).
3. Third normal form: Any transitive dependencies have been removed (i.e., nonkey attributes
are identified by only the primary key).
4. Boyce-Codd normal form: Any remaining anomalies that result from functional
dependencies have been removed (because there was more than one possible primary key
for the same nonkeys).
5. Fourth normal form: Any multivalued dependencies have been removed.
6. Fifth normal form: Any remaining anomalies have been removed.
Example
• For a simple illustration, we use a customer invoice from Pine Valley Furniture Company
Represent the View in Tabular Form
Invoice data (Pine Valley Furniture Company)
*Plus data for second order (Order ID 1007)
Convert to First Normal Form
1NF Constraints
A relation is in first normal form (1NF) if the following two constraints both
apply:
1. There are no repeating groups in the relation (thus, there is a single fact at
the intersection of each row and column of the table).
2. A primary key has been defined, which uniquely identifies each row in the
relation.
Remove Repeating Group
Repeating Group
Relation with no repeating group
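Removing the repeating group can be sketched in Python as follows. This is a minimal illustration, not the slide's figure: the nested `invoice_view` structure, the `lines` field, the helper name `remove_repeating_group`, and all sample values are assumptions made for the example. Each order line is copied into its own row together with the order-level values, so every cell holds a single value.

```python
# Hypothetical invoice view: each order carries a repeating group of lines.
# All sample values are illustrative assumptions.
invoice_view = [
    {
        "OrderID": 1006, "OrderDate": "2015-10-24", "CustomerID": 2,
        "CustomerName": "Value Furniture", "CustomerAddress": "Plano, TX",
        "lines": [
            {"ProductID": 7, "ProductDescription": "Dining Table",
             "ProductFinish": "Natural Ash", "ProductStandardPrice": 800,
             "OrderedQuantity": 2},
            {"ProductID": 5, "ProductDescription": "Writer's Desk",
             "ProductFinish": "Cherry", "ProductStandardPrice": 325,
             "OrderedQuantity": 2},
            {"ProductID": 4, "ProductDescription": "Entertainment Center",
             "ProductFinish": "Natural Maple", "ProductStandardPrice": 650,
             "OrderedQuantity": 1},
        ],
    },
    {
        "OrderID": 1007, "OrderDate": "2015-10-25", "CustomerID": 6,
        "CustomerName": "Furniture Gallery", "CustomerAddress": "Boulder, CO",
        "lines": [
            {"ProductID": 11, "ProductDescription": "4-Dr Dresser",
             "ProductFinish": "Oak", "ProductStandardPrice": 500,
             "OrderedQuantity": 4},
            {"ProductID": 4, "ProductDescription": "Entertainment Center",
             "ProductFinish": "Natural Maple", "ProductStandardPrice": 650,
             "OrderedQuantity": 3},
        ],
    },
]


def remove_repeating_group(orders):
    """Return one flat row per order line, repeating the order-level values."""
    rows = []
    for order in orders:
        header = {k: v for k, v in order.items() if k != "lines"}
        for line in order["lines"]:
            rows.append({**header, **line})
    return rows


invoice_1nf = remove_repeating_group(invoice_view)
for row in invoice_1nf:
    print(row["OrderID"], row["ProductID"], row["OrderedQuantity"])
```

The flat rows repeat the order and customer values on every line, which is exactly the redundancy that the later normal forms remove.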
Determine FD & Select the Candidate Key
There are four determinants in INVOICE, and their functional dependencies are the following:
• OrderID → OrderDate, CustomerID, CustomerName, CustomerAddress
• CustomerID → CustomerName, CustomerAddress
• ProductID → ProductDescription, ProductFinish, ProductStandardPrice
• OrderID, ProductID → OrderedQuantity
From the four determinants we can infer that:
• OrderID, ProductID → OrderDate, CustomerID, CustomerName, CustomerAddress,
ProductDescription, ProductFinish, ProductStandardPrice, OrderedQuantity
As you can see, the only candidate key for INVOICE is the composite key consisting of the
attributes OrderID and ProductID (because there is only one row in the table for any
combination of values for these attributes).
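The claim that (OrderID, ProductID) is the only candidate key can be checked mechanically with an attribute-closure computation. The sketch below encodes the four functional dependencies listed above; the function names `closure` and `candidate_keys` are illustrative helpers, and the brute-force search over attribute subsets is only practical for small relations such as this one.

```python
from itertools import combinations

# The four functional dependencies of INVOICE, as (determinant, dependents).
FDS = [
    ({"OrderID"}, {"OrderDate", "CustomerID", "CustomerName", "CustomerAddress"}),
    ({"CustomerID"}, {"CustomerName", "CustomerAddress"}),
    ({"ProductID"}, {"ProductDescription", "ProductFinish", "ProductStandardPrice"}),
    ({"OrderID", "ProductID"}, {"OrderedQuantity"}),
]
ATTRS = set().union(*(lhs | rhs for lhs, rhs in FDS))


def closure(attrs, fds):
    """Return every attribute functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(attrs, fds):
    """Return all minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == attrs:
                keys.append(candidate)
    return keys


print(candidate_keys(ATTRS, FDS))
```

Because OrderID and ProductID never appear on the right-hand side of any dependency, every key must contain both, and their closure already covers all attributes.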
Relation in 1NF with CK
Candidate Key
Anomalies still occur in 1NF
• Insertion anomaly
• Deletion anomaly
• Update anomaly
Convert to Second Normal Form
2NF Constraints
A relation is in second normal form (2NF) if it:
1. Is in first normal form.
2. Contains no partial functional dependencies.
A partial functional dependency exists when a nonkey attribute is
functionally dependent on part (but not all) of the primary key.
Functional Dependency Diagram for Invoice
As you can see, the following partial dependencies exist in
the diagram:
▪ OrderID → OrderDate, CustomerID, CustomerName, CustomerAddress
▪ ProductID → ProductDescription, ProductFinish, ProductStandardPrice
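The two partial dependencies above can also be found programmatically: a dependency is partial when its determinant is a proper subset of the primary key and it determines at least one nonkey attribute. The following is a small sketch under that definition; `partial_dependencies` is an illustrative helper name.

```python
# Primary key and functional dependencies of INVOICE in 1NF.
KEY = {"OrderID", "ProductID"}
FDS = [
    ({"OrderID"}, {"OrderDate", "CustomerID", "CustomerName", "CustomerAddress"}),
    ({"CustomerID"}, {"CustomerName", "CustomerAddress"}),
    ({"ProductID"}, {"ProductDescription", "ProductFinish", "ProductStandardPrice"}),
    ({"OrderID", "ProductID"}, {"OrderedQuantity"}),
]


def partial_dependencies(key, fds):
    """Return the FDs whose determinant is only part of the primary key."""
    found = []
    for lhs, rhs in fds:
        nonkey = rhs - key
        if lhs < key and nonkey:  # "<" tests for a proper subset
            found.append((lhs, nonkey))
    return found


for lhs, rhs in partial_dependencies(KEY, FDS):
    print(sorted(lhs), "->", sorted(rhs))
```

CustomerID → CustomerName, CustomerAddress is not reported here because CustomerID is not part of the key; it is handled later as a transitive dependency.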
Removing partial dependencies
To convert a relation with partial dependencies to second normal form, the
following steps are required:
1. Create a new relation for each primary key attribute (or combination of
attributes) that is a determinant in a partial dependency. That attribute is the
primary key in the new relation.
2. Move the nonkey attributes that are only dependent on this primary key
attribute (or attributes) from the old relation to the new relation.
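Applied to data, the two steps amount to projecting the 1NF rows onto each new relation and dropping duplicate rows. The sketch below uses made-up sample rows (an assumption, not the slide's figure data) and an illustrative `project` helper to show how the decomposition removes the repeated order and product values.

```python
COLUMNS = ["OrderID", "OrderDate", "CustomerID", "CustomerName", "CustomerAddress",
           "ProductID", "ProductDescription", "ProductFinish",
           "ProductStandardPrice", "OrderedQuantity"]

# Illustrative INVOICE rows in 1NF (sample values are assumptions).
DATA = [
    (1006, "2015-10-24", 2, "Value Furniture", "Plano, TX", 7, "Dining Table", "Natural Ash", 800, 2),
    (1006, "2015-10-24", 2, "Value Furniture", "Plano, TX", 5, "Writer's Desk", "Cherry", 325, 2),
    (1006, "2015-10-24", 2, "Value Furniture", "Plano, TX", 4, "Entertainment Center", "Natural Maple", 650, 1),
    (1007, "2015-10-25", 6, "Furniture Gallery", "Boulder, CO", 11, "4-Dr Dresser", "Oak", 500, 4),
    (1007, "2015-10-25", 6, "Furniture Gallery", "Boulder, CO", 4, "Entertainment Center", "Natural Maple", 650, 3),
]
invoice = [dict(zip(COLUMNS, values)) for values in DATA]


def project(rows, attrs):
    """Keep only attrs from each row and drop duplicate rows."""
    result = []
    for row in rows:
        projected = {a: row[a] for a in attrs}
        if projected not in result:
            result.append(projected)
    return result


# Step 1: one new relation per partial-dependency determinant.
# Step 2: the dependent nonkey attributes move with their determinant.
customer_order = project(invoice, ["OrderID", "OrderDate", "CustomerID",
                                   "CustomerName", "CustomerAddress"])
product = project(invoice, ["ProductID", "ProductDescription",
                            "ProductFinish", "ProductStandardPrice"])
# What remains depends on the whole key (OrderID, ProductID).
order_line = project(invoice, ["OrderID", "ProductID", "OrderedQuantity"])

print(len(customer_order), len(product), len(order_line))
```

With this sample data the five INVOICE rows yield two CUSTOMER ORDER rows, four PRODUCT rows, and five ORDER LINE rows, so each order and each product is now described only once.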
Relation in 2NF (no partial dependencies)
Convert to Third Normal Form
3NF Constraints
A relation is in third normal form (3NF) if:
1. It is in second normal form.
2. No transitive dependencies exist.
A transitive dependency in a relation is a functional dependency between the
primary key and one or more nonkey attributes that are dependent on the
primary key via another nonkey attribute.
There are two transitive dependencies in the CUSTOMER ORDER relation
shown in the diagram:
1. OrderID → CustomerID → CustomerName
2. OrderID → CustomerID → CustomerAddress
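These two chains can be derived from the functional dependencies of CUSTOMER ORDER. The sketch below looks for nonkey determinants that are themselves determined by the primary key; the helper name `transitive_dependencies` is illustrative, and the approach assumes the dependencies are listed in the simple form used on this slide.

```python
# CUSTOMER ORDER after conversion to 2NF.
KEY = {"OrderID"}
FDS = [
    ({"OrderID"}, {"OrderDate", "CustomerID", "CustomerName", "CustomerAddress"}),
    ({"CustomerID"}, {"CustomerName", "CustomerAddress"}),
]


def transitive_dependencies(key, fds):
    """Return (key, nonkey determinant, attribute) chains."""
    # Attributes that the primary key determines directly.
    key_determines = set()
    for lhs, rhs in fds:
        if lhs == key:
            key_determines |= rhs
    chains = []
    for lhs, rhs in fds:
        # A nonkey determinant reached from the key forms a chain.
        if lhs.isdisjoint(key) and lhs <= key_determines:
            for attr in sorted(rhs):
                chains.append((sorted(key), sorted(lhs), attr))
    return chains


for key, via, attr in transitive_dependencies(KEY, FDS):
    print(" -> ".join([",".join(key), ",".join(via), attr]))
```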
Removing transitive dependencies
You can easily remove transitive dependencies from a relation by means of a three-step
procedure:
1. For each nonkey attribute (or set of attributes) that is a determinant in a relation, create a
new relation. That attribute (or set of attributes) becomes the primary key of the new
relation.
2. Move all of the attributes that are functionally dependent only on the primary key of the
new relation from the old to the new relation.
3. Leave the attribute that serves as a primary key in the new relation in the old relation to
serve as a foreign key that allows you to associate the two relations.
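The three-step procedure can be sketched at the schema level as follows. The function name `remove_transitive_dependencies` and the dictionary layout with `pk`, `attrs`, and `fks` entries are assumptions chosen for this illustration; the input is the CUSTOMER ORDER relation and its one nonkey determinant.

```python
def remove_transitive_dependencies(attrs, key, fds):
    """Split a 2NF relation into 3NF relations.

    fds lists only the dependencies to examine, as (determinant, dependents).
    """
    old_attrs = list(attrs)
    new_relations = []
    for determinant, dependents in fds:
        if set(determinant) & set(key):
            continue  # only nonkey determinants cause transitive dependencies
        # Step 1: the nonkey determinant becomes the key of a new relation.
        # Step 2: its dependent attributes move to the new relation.
        moved = [a for a in dependents if a in old_attrs]
        new_relations.append({"pk": list(determinant),
                              "attrs": list(determinant) + moved})
        old_attrs = [a for a in old_attrs if a not in moved]
        # Step 3: the determinant itself stays behind as a foreign key.
    old_relation = {"pk": list(key), "attrs": old_attrs,
                    "fks": [rel["pk"] for rel in new_relations]}
    return old_relation, new_relations


order, others = remove_transitive_dependencies(
    attrs=["OrderID", "OrderDate", "CustomerID", "CustomerName", "CustomerAddress"],
    key=["OrderID"],
    fds=[(["CustomerID"], ["CustomerName", "CustomerAddress"])],
)
print(order)
print(others)
```

The old relation keeps OrderID, OrderDate, and CustomerID, with CustomerID acting as the foreign key to the new customer relation.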
Removing transitive dependencies (cont.)
[Figure: the CUSTOMER ORDER relation decomposed into the ORDER and CUSTOMER relations]
Relation in 3NF (no transitive dependencies)
Relational schema for invoice data
Customer
PK CustomerID
CustomerName
CustomerAddress

Order
PK OrderID
OrderDate
FK CustomerID

Product
PK ProductID
ProductDescription
ProductFinish
ProductStandardPrice

OrderLine
PK FK OrderID
PK FK ProductID
OrderedQuantity
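One way to realize this schema is with SQL DDL, shown here through Python's built-in `sqlite3` module. This is a sketch only: the column data types and NOT NULL choices are assumptions, since the slide specifies just the keys, and `create_schema` is an illustrative helper name. The table name Order is quoted because ORDER is a reserved word in SQL.

```python
import sqlite3

SCHEMA = """
CREATE TABLE Customer (
    CustomerID      INTEGER PRIMARY KEY,
    CustomerName    TEXT NOT NULL,
    CustomerAddress TEXT
);
CREATE TABLE "Order" (
    OrderID    INTEGER PRIMARY KEY,
    OrderDate  TEXT NOT NULL,
    CustomerID INTEGER NOT NULL REFERENCES Customer (CustomerID)
);
CREATE TABLE Product (
    ProductID            INTEGER PRIMARY KEY,
    ProductDescription   TEXT,
    ProductFinish        TEXT,
    ProductStandardPrice REAL
);
CREATE TABLE OrderLine (
    OrderID         INTEGER NOT NULL REFERENCES "Order" (OrderID),
    ProductID       INTEGER NOT NULL REFERENCES Product (ProductID),
    OrderedQuantity INTEGER NOT NULL,
    PRIMARY KEY (OrderID, ProductID)
);
"""


def create_schema():
    """Create the four 3NF relations in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
    conn.executescript(SCHEMA)
    return conn


conn = create_schema()
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)
```

The composite primary key on OrderLine mirrors the candidate key (OrderID, ProductID) of the original INVOICE relation.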
Advanced Normalization:
Boyce-Codd Normal Form
Definition & Constraint
• A relation is in Boyce-Codd normal form (BCNF) if and only if
every determinant in the relation is a candidate key
Invoice Relation
There are four determinants in INVOICE, taken from its functional dependencies:
• OrderID → OrderDate, CustomerID, CustomerName, CustomerAddress
• CustomerID → CustomerName, CustomerAddress
• ProductID → ProductDescription, ProductFinish, ProductStandardPrice
• OrderID, ProductID → OrderedQuantity
3NF relations:
Based on the 3NF result, the relations derived from INVOICE are already in BCNF, because within each relation every determinant is a candidate key.
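The BCNF definition translates directly into a check that every determinant of a relation appears among its candidate keys. The sketch below applies it to the four 3NF relations and, for contrast, to the original INVOICE relation; the dictionary layout and the helper name `is_bcnf` are assumptions made for this illustration.

```python
# Candidate keys and determinants of each relation, taken from the
# functional dependencies listed for INVOICE.
RELATIONS = {
    "Customer": {"candidate_keys": [{"CustomerID"}],
                 "determinants": [{"CustomerID"}]},
    "Order": {"candidate_keys": [{"OrderID"}],
              "determinants": [{"OrderID"}]},
    "Product": {"candidate_keys": [{"ProductID"}],
                "determinants": [{"ProductID"}]},
    "OrderLine": {"candidate_keys": [{"OrderID", "ProductID"}],
                  "determinants": [{"OrderID", "ProductID"}]},
    # The original relation, before any decomposition.
    "Invoice": {"candidate_keys": [{"OrderID", "ProductID"}],
                "determinants": [{"OrderID"}, {"CustomerID"}, {"ProductID"},
                                 {"OrderID", "ProductID"}]},
}


def is_bcnf(relation):
    """True if every determinant is one of the relation's candidate keys."""
    return all(d in relation["candidate_keys"] for d in relation["determinants"])


for name, relation in RELATIONS.items():
    print(name, is_bcnf(relation))
```

Only the undecomposed INVOICE relation fails the check, because OrderID, CustomerID, and ProductID are determinants without being candidate keys there.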
Relation in 3NF but not in BCNF
Relation with sample data
Functional dependencies in STUDENT ADVISOR
2-step process
1. The relation is modified so that the determinant in the
relation that is not a candidate key becomes a
component of the primary key of the revised relation.
2-step process (cont.)
2. Decompose the relation to eliminate the partial
functional dependency.
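The two steps can be traced in a short sketch. The figure with the STUDENT ADVISOR attributes is not reproduced in this text, so the attribute names (SID, Major, Advisor, MajGPA) and the dependencies SID, Major → Advisor, MajGPA and Advisor → Major are assumptions based on the usual textbook form of this example. The `closure` helper is illustrative.

```python
# Assumed STUDENT ADVISOR relation (attributes and FDs are assumptions).
ATTRS = {"SID", "Major", "Advisor", "MajGPA"}
KEY = {"SID", "Major"}
FDS = [
    ({"SID", "Major"}, {"Advisor", "MajGPA"}),
    ({"Advisor"}, {"Major"}),
]


def closure(attrs, fds):
    """Return every attribute functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


# A determinant that does not determine the whole relation violates BCNF.
violations = [(lhs, rhs) for lhs, rhs in FDS if closure(lhs, FDS) != ATTRS]
determinant, dependents = violations[0]

# Step 1: the offending determinant replaces what it determines in the key.
revised_key = (KEY - dependents) | determinant
assert closure(revised_key, FDS) == ATTRS  # still identifies every row

# Step 2: decompose to remove the resulting partial dependency.
advisor = determinant | dependents   # new relation keyed by Advisor
student = ATTRS - dependents         # remaining relation keyed by revised_key

print(sorted(revised_key), sorted(advisor), sorted(student))
```

Under these assumptions the result is a relation with SID, Advisor, and MajGPA plus a second relation with Advisor and Major, and in each of them every determinant is a candidate key.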
Note for Decomposition
• The decomposition should be a lossless decomposition.
Let R be a relation schema and let R1 and R2 form a decomposition of R. That
is, R = R1 ∪ R2.
We say that the decomposition is a lossless decomposition if there is no
loss of information by replacing R with the two relation schemas R1 and R2; that is,
the natural join of the projections of R onto R1 and R2 returns exactly the rows of R.
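For a given set of rows, losslessness can be tested by projecting onto R1 and R2, joining the projections on their common attributes, and comparing the result with the original rows. The sketch below does this for made-up order data (the sample rows, including order 1008, are assumptions); the helper names are illustrative. Note that passing this test on one sample does not prove the decomposition is lossless for every possible instance.

```python
def project(rows, attrs):
    """Project rows onto attrs, removing duplicates."""
    return {frozenset((a, row[a]) for a in attrs) for row in rows}


def natural_join(left, right):
    """Join two projected relations on their common attributes."""
    joined = set()
    for l_row in left:
        for r_row in right:
            l_dict, r_dict = dict(l_row), dict(r_row)
            common = l_dict.keys() & r_dict.keys()
            if all(l_dict[a] == r_dict[a] for a in common):
                joined.add(frozenset({**l_dict, **r_dict}.items()))
    return joined


def is_lossless(rows, attrs1, attrs2):
    """True if joining the two projections reproduces exactly the rows."""
    original = {frozenset(row.items()) for row in rows}
    return natural_join(project(rows, attrs1), project(rows, attrs2)) == original


# Illustrative rows (sample values are assumptions).
orders = [
    {"OrderID": 1006, "OrderDate": "2015-10-24", "CustomerID": 2, "CustomerName": "Value Furniture"},
    {"OrderID": 1007, "OrderDate": "2015-10-25", "CustomerID": 6, "CustomerName": "Furniture Gallery"},
    {"OrderID": 1008, "OrderDate": "2015-10-25", "CustomerID": 2, "CustomerName": "Value Furniture"},
]

# Splitting on CustomerID, a key of the second relation, loses nothing.
good = is_lossless(orders, ["OrderID", "OrderDate", "CustomerID"],
                   ["CustomerID", "CustomerName"])
# Splitting on OrderDate, which is not a key, creates spurious rows.
bad = is_lossless(orders, ["OrderID", "OrderDate"],
                  ["OrderDate", "CustomerID", "CustomerName"])
print(good, bad)
```

In the second split, two orders share the same date, so the join pairs each of them with both customers and produces rows that never existed in the original relation.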
Example of Lossy Decomposition
FAILURE DECOMPOSITION
Reference
Hoffer, Jeffrey A., et al., "Modern Database Management", Twelfth Edition,
Pearson, 2016, Chapter 4.
Questions