Overview: Normalization
Database Normalization
Data Anomalies Caused by Update, Insertion, Deletion
Brief History/Overview
1st Normal Form
2nd Normal Form
3rd Normal Form
Conclusion
Database Normalization
The main goal of Database Normalization is to
restructure the logical data model of a
database to:
Eliminate redundancy
Organize data efficiently
Reduce the potential for data anomalies.
Data Anomalies
Data anomalies are inconsistencies in the
data stored in a database as a result of an
operation such as update, insertion, and/or
deletion.
Such inconsistencies may arise when a
particular record is stored in multiple
locations and not all of the copies are
updated.
We can prevent such anomalies by
implementing different levels of
normalization called Normal Forms (NF).
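The update anomaly described above can be illustrated with a minimal Python sketch. The rows, the replacement manager name "Lee", and the helper name `find_inconsistent_departments` are illustrative assumptions, not part of the original slides.

```python
# Illustrative rows: the manager's name is stored once per salesperson,
# so the same fact ("department 73 is managed by Scott") exists in three copies.
rows = [
    {"salesperson": "Baker", "department": 73, "manager": "Scott"},
    {"salesperson": "Dickens", "department": 73, "manager": "Scott"},
    {"salesperson": "Carlyle", "department": 73, "manager": "Scott"},
]

# An update that reaches only one copy ("Lee" is a made-up replacement manager).
rows[0]["manager"] = "Lee"


def find_inconsistent_departments(rows):
    """Return {department: set of manager names} for departments with conflicting copies."""
    managers = {}
    for row in rows:
        managers.setdefault(row["department"], set()).add(row["manager"])
    return {dept: names for dept, names in managers.items() if len(names) > 1}


inconsistent = find_inconsistent_departments(rows)
print(inconsistent)
```

Department 73 now has two different manager names on record, which is exactly the kind of inconsistency that storing the fact in only one place would rule out.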
Brief History/Overview
Database Normalization was first proposed by
Edgar F. Codd.
Codd defined the first three Normal Forms.
One of the key requirements to remember is that
Normal Forms are progressive. That is, in order to
have 3rd NF we must have 2nd NF and in order to
have 2nd NF we must have 1st NF.
Dependencies: Definitions
Multivalued Attributes (or repeating
groups): non-key attributes, or groups of
non-key attributes, whose values are not
uniquely identified (directly or indirectly)
by the value of the Primary Key (or a part
of it); that is, they are not functionally
dependent on it.
General Hardware Company: Unnormalized Data
Salesperson  Product  Salesperson  Commission  Year of  Department  Manager  Product  Unit
Number       Number   Name         Percentage  Hire     Number      Name     Name     Price   Quantity
137          19440    Baker        10          1995     73          Scott    Hammer   17.50   473
             24013                                                           Saw      26.25   170
             26722                                                           Pliers   11.50   688
186          16386    Adams        15          2001     59          Lopez    Wrench   12.95   1745
             19440                                                           Hammer   17.50   2529
             21765                                                           Drill    32.99   1962
             24013                                                           Saw      26.25   3071
204          21765    Dickens      10          1998     73          Scott    Drill    32.99   809
             26722                                                           Pliers   11.50   734
361          16386    Carlyle      20          2001     73          Scott    Wrench   12.95   3729
             21765                                                           Drill    32.99   3110
             26722                                                           Pliers   11.50   2738
SALESPERSON/PRODUCT Table
Records contain multivalued attributes.
General Hardware Company: First Normal Form
SALESPERSON/PRODUCT Table
The attributes under consideration have been listed
in one table, and a primary key has been established.
The number of records has been increased so that
every attribute of every record has just one value.
The multivalued attributes have been eliminated.
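As a rough sketch of the step described above, the following Python flattens records that carry a repeating group of products into one row per salesperson and product. The field names and the function name `to_first_normal_form` are assumptions made for illustration, and only a subset of the General Hardware data is used. The composite key (salesperson number, product number) is likewise an assumption, since the slide does not name the key.

```python
# Unnormalized records: each salesperson carries a repeating group of products.
unnormalized = [
    {"salesperson_number": 137, "salesperson_name": "Baker",
     "products": [(19440, "Hammer", 473), (24013, "Saw", 170), (26722, "Pliers", 688)]},
    {"salesperson_number": 204, "salesperson_name": "Dickens",
     "products": [(21765, "Drill", 809), (26722, "Pliers", 734)]},
]


def to_first_normal_form(records):
    """Emit one row per (salesperson, product) so every attribute holds a single value."""
    rows = []
    for record in records:
        for product_number, product_name, quantity in record["products"]:
            rows.append({
                "salesperson_number": record["salesperson_number"],
                "product_number": product_number,
                "salesperson_name": record["salesperson_name"],
                "product_name": product_name,
                "quantity": quantity,
            })
    return rows


first_nf = to_first_normal_form(unnormalized)
for row in first_nf:
    print(row)
```

The two input records become five rows. The salesperson name is now repeated on every row, which is the redundancy that the later normal forms address.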
Dependencies: Definitions
Functional Dependency
Salesperson Number (determinant) → Salesperson Name (dependent)
Salesperson Number is the determinant.
The value of Salesperson Number
determines the value of Salesperson Name.
Salesperson Name is functionally
dependent on Salesperson Number.
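One way to make the definition concrete is to test whether a set of rows is consistent with a proposed functional dependency. The sketch below is a minimal illustration; the function name `holds` and the field names are assumptions. A check like this can only show that sample data does not contradict a dependency. Whether the dependency truly holds is a statement about the business rules, not about one sample.

```python
def holds(rows, determinant, dependent):
    """Return True if no two rows agree on `determinant` but differ on `dependent`."""
    seen = {}
    for row in rows:
        key = tuple(row[name] for name in determinant)
        value = tuple(row[name] for name in dependent)
        # setdefault stores the first value seen for this key and returns it afterwards.
        if seen.setdefault(key, value) != value:
            return False
    return True


rows = [
    {"salesperson_number": 137, "salesperson_name": "Baker", "product_number": 19440},
    {"salesperson_number": 137, "salesperson_name": "Baker", "product_number": 24013},
    {"salesperson_number": 186, "salesperson_name": "Adams", "product_number": 19440},
]

name_depends_on_number = holds(rows, ["salesperson_number"], ["salesperson_name"])
product_depends_on_number = holds(rows, ["salesperson_number"], ["product_number"])
print(name_depends_on_number, product_depends_on_number)  # True False
```

Salesperson Number determines Salesperson Name in this sample, but it does not determine Product Number, because salesperson 137 appears with two different products.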
Dependencies: Definitions
Partial Dependency: when a non-key attribute is
determined by a part, but not the whole, of a COMPOSITE
primary key.
Dependencies: Definitions
Transitive Dependency: when a non-key
attribute determines another non-key
attribute.
Normal Forms: Review
Unnormalized: There are multivalued attributes or repeating groups.
1NF: No multivalued attributes or repeating groups.
2NF: 1NF plus no partial dependencies.
3NF: 2NF plus no transitive dependencies.
Example 1: Determine NF
ISBN → Title
ISBN → Publisher
Publisher → Address
All attributes are directly
or indirectly determined
by the primary key;
therefore, the relation is
at least in 1NF.
BOOK(ISBN, Title, Publisher, Address)
Example 1: Determine NF
ISBN → Title
ISBN → Publisher
Publisher → Address
The relation is at least in 1NF.
There is no COMPOSITE
primary key; therefore, there
cannot be partial dependencies.
Therefore, the relation is at
least in 2NF.
BOOK(ISBN, Title, Publisher, Address)
Example 1: Determine NF
ISBN → Title
ISBN → Publisher
Publisher → Address
We know that the relation is at
least in 2NF, and it is not in
3NF. Therefore, we conclude
that the relation is in 2NF.
BOOK(ISBN, Title, Publisher, Address)
Example 1: Determine NF
ISBN → Title
ISBN → Publisher
Publisher → Address
Publisher is a non-key attribute,
and it determines Address,
another non-key attribute.
Therefore, there is a transitive
dependency, which means that
the relation is NOT in 3NF.
BOOK(ISBN, Title, Publisher, Address)
Example 1: Determine NF
ISBN → Title
ISBN → Publisher
Publisher → Address
In your solution you will write the
following justification:
1) No M/V attributes, therefore at
least 1NF
2) No partial dependencies,
therefore at least 2NF
3) There is a transitive dependency
(Publisher → Address), therefore
not 3NF
Conclusion: The relation is in 2NF
BOOK(ISBN, Title, Publisher, Address)
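The three-step justification above can be sketched as a small Python routine that applies the checks in order. This is a simplified illustration of the slide rules under stated assumptions: a single primary key, dependencies given as (determinant, dependent) pairs of attribute sets, and ISBN taken as the primary key of BOOK (inferred from the dependencies, not stated on the slide). The function names `closure` and `classify` are hypothetical.

```python
def closure(attributes, dependencies):
    """Return every attribute determined, directly or indirectly, by `attributes`."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for determinant, dependent in dependencies:
            if determinant <= result and not dependent <= result:
                result |= dependent
                changed = True
    return result


def classify(attributes, primary_key, dependencies):
    """Classify a relation using the slide rules, checked in order."""
    key = set(primary_key)
    # 1NF check as worded in the slides: the key must determine every attribute.
    if not set(attributes) <= closure(key, dependencies):
        return "unnormalized"
    # 2NF check: no non-key attribute may depend on a proper subset of the key.
    for determinant, dependent in dependencies:
        if determinant < key and not dependent <= key:
            return "1NF"
    # 3NF check: no non-key attribute may determine another non-key attribute.
    for determinant, dependent in dependencies:
        if not (determinant & key) and not (dependent & key):
            return "2NF"
    return "3NF"


book_dependencies = [
    ({"ISBN"}, {"Title"}),
    ({"ISBN"}, {"Publisher"}),
    ({"Publisher"}, {"Address"}),
]
book_nf = classify({"ISBN", "Title", "Publisher", "Address"}, {"ISBN"}, book_dependencies)
print(book_nf)  # 2NF
```

Because Normal Forms are progressive, the routine stops at the first check that fails, which mirrors the way the written justification is built up.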
Example 2: Determine NF
Product_ID → Description
All attributes are directly or
indirectly determined by the
primary key; therefore, the relation
is at least in 1NF.
ORDER(Order_No, Product_ID, Description)
Example 2: Determine NF
Product_ID → Description
The relation is at least in 1NF.
There is a COMPOSITE Primary Key (PK)
(Order_No, Product_ID); therefore, there can be
partial dependencies. Product_ID, which is a part
of the PK, determines Description; hence, there is a
partial dependency. Therefore, the relation is not
in 2NF. There is no need to check for transitive
dependencies.
ORDER(Order_No, Product_ID, Description)
Example 2: Determine NF
Product_ID → Description
We know that the relation is at least
in 1NF, and it is not in 2NF.
Therefore, we conclude that the
relation is in 1NF.
ORDER(Order_No, Product_ID, Description)
Example 2: Determine NF
Product_ID → Description
In your solution you will write the
following justification:
1) No M/V attributes, therefore at least
1NF
2) There is a partial dependency
(Product_ID → Description), therefore
not in 2NF
Conclusion: The relation is in 1NF
ORDER(Order_No, Product_ID, Description)
Example 3: Determine NF
Part_ID → Description
Part_ID → Price
Part_ID, Comp_ID → No
Comp_ID and No are not
determined by the primary
key; therefore, the relation
is NOT in 1NF. There is no need
to look at partial or
transitive dependencies.
PART(Part_ID, Descr, Price, Comp_ID, No)
Example 3: Determine NF
Part_ID → Description
Part_ID → Price
Part_ID, Comp_ID → No
In your solution you will write
the following justification:
1) There are M/V attributes;
therefore, not 1NF
Conclusion: The relation is not
normalized.
PART(Part_ID, Descr, Price, Comp_ID, No)
Bringing a Relation to 1NF
Option 1: Make a determinant of the
repeating group (or the multivalued
attribute) a part of the primary key.
Bringing a Relation to 1NF
Option 2: Remove the entire repeating group
from the relation. Create another relation which
would contain all the attributes of the repeating
group, plus the primary key from the first
relation. In this new relation, the primary key
from the original relation and the determinant of
the repeating group will comprise a primary key.
Bringing a Relation to 1NF
STUDENT_COURSE
Stud_ID  Course   Units
101      MSI 250  3.00
101      MSI 415  3.00
125      MSI 331  3.00
Bringing a Relation to 2NF
Goal: Remove Partial Dependencies
Bringing a Relation to 2NF
Remove the attributes that depend on a part, but not
the whole, of the primary key from the original
relation. For each partial dependency, create
a new relation, with the corresponding part of the
primary key from the original as the primary key.
Bringing a Relation to 2NF
STUDENT_COURSE
Stud_ID  Course_ID
101      MSI 250
101      MSI 415
125      MSI 331
COURSE
Course_ID  Units
MSI 250    3.00
MSI 415    3.00
MSI 331    3.00
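The decomposition shown in the two tables above can be sketched in Python as a pair of projections. The starting rows assume that each enrollment originally carried the course's Units value, and the helper name `project` is hypothetical.

```python
# Assumed starting point: a 1NF relation with composite key (Stud_ID, Course_ID).
student_course = [
    {"Stud_ID": 101, "Course_ID": "MSI 250", "Units": 3.00},
    {"Stud_ID": 101, "Course_ID": "MSI 415", "Units": 3.00},
    {"Stud_ID": 125, "Course_ID": "MSI 331", "Units": 3.00},
]


def project(rows, attributes):
    """Keep only `attributes`, dropping duplicate rows while preserving order."""
    seen = set()
    result = []
    for row in rows:
        values = tuple(row[name] for name in attributes)
        if values not in seen:
            seen.add(values)
            result.append(dict(zip(attributes, values)))
    return result


# Course_ID -> Units is the partial dependency, so Units moves to its own relation
# keyed by Course_ID, the part of the original key that determines it.
enrollment = project(student_course, ["Stud_ID", "Course_ID"])
course = project(student_course, ["Course_ID", "Units"])

# Joining the two relations on Course_ID recovers the original rows.
units_by_course = {row["Course_ID"]: row["Units"] for row in course}
rejoined = [{**row, "Units": units_by_course[row["Course_ID"]]} for row in enrollment]
print(rejoined == student_course)  # True
```

The final comparison is a quick check that no information was lost by splitting the relation.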
Bringing a Relation to 3NF
Goal: Get rid of transitive dependencies.
Bringing a Relation to 3NF
Remove the attributes that depend on a
non-key attribute from the original relation. For
each transitive dependency, create a new relation
with the non-key attribute that is the determinant
in the transitive dependency as the primary key, and
the dependent non-key attribute as a dependent.
Bringing a Relation to 3NF
DEPARTMENT
Dept_ID  Dept_Name
1        Acct
         Mktg
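The 3NF procedure can also be sketched on the BOOK relation from Example 1, where Publisher → Address is the transitive dependency. The row values below are invented for illustration, and the function name `split_transitive` is hypothetical.

```python
# Hypothetical BOOK rows; the values are invented for illustration only.
book = [
    {"ISBN": "111", "Title": "Title A", "Publisher": "P1", "Address": "City X"},
    {"ISBN": "222", "Title": "Title B", "Publisher": "P1", "Address": "City X"},
    {"ISBN": "333", "Title": "Title C", "Publisher": "P2", "Address": "City Y"},
]


def split_transitive(rows, determinant, dependent):
    """Split `rows` on the transitive dependency determinant -> dependent.

    Returns the original relation without `dependent`, plus a new relation
    keyed by `determinant` that holds each dependent value once.
    """
    remaining = [{k: v for k, v in row.items() if k != dependent} for row in rows]
    lookup = {}
    for row in rows:
        lookup[row[determinant]] = row[dependent]
    new_relation = [{determinant: d, dependent: v} for d, v in lookup.items()]
    return remaining, new_relation


book_3nf, publisher = split_transitive(book, "Publisher", "Address")
print(book_3nf)
print(publisher)
```

Publisher stays in BOOK as the link to the new relation, and each publisher's address is now stored once rather than once per book.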
Conclusion
We have seen how Database Normalization
can decrease redundancy, increase efficiency,
and reduce anomalies by implementing three
of seven different levels of normalization
called Normal Forms. The first three NFs are
usually sufficient for most small- to
medium-size applications.