Normalization - Why?
• Eliminate anomalies
• Avoid duplication
• Increase flexibility and stability
• Reduce maintenance
Normalization - What?!?
• Analysis of functional dependencies
between attributes
• Building several smaller tables from larger
ones
• Decomposing relations with anomalies to
produce smaller, well-structured relations
• Reducing complexity & increasing stability
Normalization - What (2)
• Series of Steps
– Recipe for constructing a “good” physical
model of a database from a logical model
– Applied to all existing tables, including ones
produced by earlier normalization steps
Example
Sales
(Order#, Date, CustID, Name, Address,
City, State, Zip, {Product#, ProductDesc,
Price, QuantityOrdered}, Subtotal, Tax,
S&H, Total)
• What are the problems with using a single
table for all order information?
Problems
• Implementing Repeating Groups
• Duplication of Data (customer name &
address)
• Unnecessary Data (subtotal, total, tax)
• Others
Normalization is a process to eliminate these
problems.
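To make the duplication problem concrete, here is a minimal Python sketch. The rows, column names, and values are invented for illustration and are not part of the original example. It shows how storing a customer's address on every order row lets the copies drift apart.

```python
# Hypothetical sample data: two orders by the same customer in one wide table.
sales = [
    {"order": 1, "cust_id": 10, "name": "Ann Lee", "city": "Austin"},
    {"order": 2, "cust_id": 10, "name": "Ann Lee", "city": "Austin"},
]

# The customer moves, but only one of the duplicated rows gets updated.
sales[0]["city"] = "Dallas"

# Collect every city recorded for customer 10.
cities = {row["city"] for row in sales if row["cust_id"] == 10}

# More than one city for a single customer is an update anomaly.
inconsistent = len(cities) > 1
print(sorted(cities), inconsistent)
```

With the address stored once per customer, a single update would be enough and the two copies could not disagree.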
1st Normal Form
• Eliminate Repeating Groups
• 1st Normal Form has no repeating groups
• Create definition with all other attributes,
remove the repeat {}, and change the
primary key to include the “key” for the
repeating group.
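The step above can be sketched in Python as flattening a nested order into one row per product line. The dictionary layout, the field names, and the helper `to_first_normal_form` are assumptions made for this illustration only.

```python
# Hypothetical unnormalized order: the product lines form a repeating group.
order = {
    "order": 1,
    "date": "2024-01-15",
    "cust_id": 10,
    "lines": [
        {"product": "P1", "desc": "Widget", "price": 2.50, "qty": 4},
        {"product": "P2", "desc": "Gadget", "price": 10.00, "qty": 1},
    ],
}

def to_first_normal_form(order):
    """Return one flat row per product line, repeating the order attributes."""
    header = {k: v for k, v in order.items() if k != "lines"}
    return [{**header, **line} for line in order["lines"]]

rows = to_first_normal_form(order)

# The order number alone no longer identifies a row;
# the primary key becomes the pair (order, product).
keys = [(r["order"], r["product"]) for r in rows]
print(keys)
```

Each flat row now holds a single value per column, at the cost of repeating the order attributes on every line.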
Example
Sales
(Order#, Date, CustID, Name, Address,
City, State, Zip, Product#, ProductDesc,
Price, QuantityOrdered, Subtotal, Tax,
S&H, Total)
• Why is this better?
1st NF Improvements
• Implementation is possible (each column holds
one value per row)
• Querying is possible (ordinary SQL works on the
flat rows)
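As a rough illustration of the querying point, the sketch below loads a cut-down 1st NF Sales table into an in-memory SQLite database and totals the quantity ordered per product. The table name, column names, and sample rows are assumptions chosen for the example; only a few of the Sales attributes are included.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Cut-down 1st NF table with the composite key (order_no, product_no).
conn.execute("""
    CREATE TABLE sales (
        order_no INTEGER,
        product_no TEXT,
        product_desc TEXT,
        price REAL,
        quantity INTEGER,
        PRIMARY KEY (order_no, product_no)
    )
""")

# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?, ?)",
    [
        (1, "P1", "Widget", 2.5, 4),
        (1, "P2", "Gadget", 10.0, 1),
        (2, "P1", "Widget", 2.5, 3),
    ],
)

# Total quantity ordered per product, across all orders.
totals = conn.execute(
    "SELECT product_no, SUM(quantity) FROM sales "
    "GROUP BY product_no ORDER BY product_no"
).fetchall()
print(totals)
```

A query like this has no simple equivalent while the product lines are packed into a repeating group inside one row.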
2nd Normal Form
• Remove all partial functional dependencies
• 2nd Normal Form has no partial functional
dependencies and is in 1st Normal Form
• Partial dependencies get their own tables --
original table gets a foreign key
Partial Functional Dependencies
• An attribute is only dependent on part of the
primary key
– only possible when the primary key is composite
– a 1st NF table with a single-attribute key is
already in 2nd NF
• Functional dependencies can be specified
explicitly but usually come from the E-R model,
user specifications, and common sense
key → non-key attributes
Example - Functional Dependencies
Order# → Date, CustID, Name, Address, City, State, Zip,
Subtotal, Tax, S&H, Total
Order#, Product# → ProductDesc, Price, QuantityOrdered
CustID → Name, Address, City, State, Zip
Product# → ProductDesc, Price
Which are partial functional dependencies?
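One way to answer the question mechanically is sketched below. It assumes the 1st NF primary key is (Order#, Product#) and encodes the dependencies listed above as pairs of Python sets; the function name `partial_dependencies` is made up for this example.

```python
# Functional dependencies from the example, as (determinant, dependents).
fds = [
    ({"Order#"}, {"Date", "CustID", "Name", "Address", "City", "State",
                  "Zip", "Subtotal", "Tax", "S&H", "Total"}),
    ({"Order#", "Product#"}, {"ProductDesc", "Price", "QuantityOrdered"}),
    ({"CustID"}, {"Name", "Address", "City", "State", "Zip"}),
    ({"Product#"}, {"ProductDesc", "Price"}),
]

# Assumed primary key of the 1st NF Sales table.
primary_key = {"Order#", "Product#"}

def partial_dependencies(fds, key):
    """Return FDs whose determinant is a non-empty proper subset of the key."""
    return [(lhs, rhs) for lhs, rhs in fds if lhs and lhs < key]

partial = partial_dependencies(fds, primary_key)

# Each partial determinant here is a single attribute; list them by name.
determinants = sorted(next(iter(lhs)) for lhs, _ in partial)
print(determinants)
```

Under these assumptions the check flags the dependencies on Order# alone and on Product# alone. The dependency on CustID is not partial, because CustID is not part of the key.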
Example
Sales (Order#, Date, CustID, Name, Address,
City, State, Zip, Subtotal, Tax, S&H, Total)
OrderLine (Order#, Product#, ProductDesc,
Price, QuantityOrdered)
• Is this 2nd NF?
Example
Sales (Order#, Date, CustID, Name, Address, City,
State, Zip, Subtotal, Tax, S&H, Total)
OrderLine (Order#, Product#, QuantityOrdered)
Product (Product#, ProductDesc, Price)
• Is this 2nd NF? Why is this better than 1st NF?
2nd NF Improvements
• Elimination of Duplicate Data
• No loss of information (joining the new tables
reproduces the original rows)
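The two improvements can be checked on a small scale. This sketch uses invented order-line rows, splits them into OrderLine and Product as in the example, and then rejoins them on Product#. It is an illustration on sample data, not a proof.

```python
# Hypothetical 1st NF rows: (Order#, Product#, ProductDesc, Price, QuantityOrdered).
order_lines_1nf = [
    (1, "P1", "Widget", 2.5, 4),
    (1, "P2", "Gadget", 10.0, 1),
    (2, "P1", "Widget", 2.5, 3),
]

# OrderLine keeps only the key and the attribute that depends on the whole key.
order_line = [(o, p, q) for o, p, _, _, q in order_lines_1nf]

# Product stores each description and price once, keyed by Product#.
product = {p: (desc, price) for _, p, desc, price, _ in order_lines_1nf}

# Rejoin on Product# (the foreign key) to rebuild the original rows.
rejoined = [(o, p, *product[p], q) for o, p, q in order_line]

print(len(product), rejoined == order_lines_1nf)
```

The Widget description and price appear twice in the original rows but only once in `product`, and the rejoined rows match the originals.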
3rd Normal Form
• Eliminate transitive functional
dependencies
• 3rd Normal Form has no transitive
dependencies and is in 2nd Normal Form
• Transitive dependencies get their own
tables -- original table gets a foreign key
Transitive Functional Dependencies
• Attribute is dependent on another, non-key
attribute or attributes
• Attribute is the result of a calculation
CustID → Name, Address, City, State, Zip
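A simplified check for the first bullet is sketched below. It assumes the Sales table has the single key Order# and treats any dependency whose determinant contains no key attribute as transitive; the function name `transitive_dependencies` is invented for this example, and calculated attributes (the second bullet) are not covered.

```python
# Dependencies in the 2nd NF Sales table, as (determinant, dependents).
sales_key = {"Order#"}
sales_fds = [
    ({"Order#"}, {"Date", "CustID", "Subtotal", "Tax", "S&H", "Total"}),
    ({"CustID"}, {"Name", "Address", "City", "State", "Zip"}),
]

def transitive_dependencies(fds, key):
    """Return FDs whose determinant is made up only of non-key attributes."""
    return [(lhs, rhs) for lhs, rhs in fds if lhs and lhs.isdisjoint(key)]

transitive = transitive_dependencies(sales_fds, sales_key)

# Attributes that would move to a new table keyed by CustID.
moved = sorted(transitive[0][1])
print(moved)
```

The customer attributes depend on Order# only by way of CustID, so they are the ones that move out, leaving CustID behind as a foreign key.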
Example
Sales (Order#, Date, CustID, Subtotal, Tax, S&H, Total)
OrderLine (Order#, Product#, QuantityOrdered)
Product (Product#, ProductDesc, Price)
Customer (CustID, Name, Address, City, State, Zip)
• Is this 3rd NF? Why is this better than 2nd NF?
Example
Sales (Order#, Date, CustID)
OrderLine (Order#, Product#, QuantityOrdered)
Product (Product#, ProductDesc, Price)
Customer (CustID, Name, Address, City, State, Zip)
• Is this 3rd NF? Why is this better than 2nd NF?
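The final design could be declared along these lines in SQLite. This is a sketch: the lowercase table and column names, the column types, and the sample rows are assumptions. It also shows that Subtotal can be computed by a query instead of being stored. Tax, S&H, and Total are left out because the rules for calculating them are not given.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces REFERENCES clauses only when this pragma is on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
    CREATE TABLE customer (
        cust_id INTEGER PRIMARY KEY,
        name TEXT, address TEXT, city TEXT, state TEXT, zip TEXT
    );
    CREATE TABLE product (
        product_no TEXT PRIMARY KEY,
        product_desc TEXT,
        price REAL
    );
    CREATE TABLE sales (
        order_no INTEGER PRIMARY KEY,
        order_date TEXT,
        cust_id INTEGER REFERENCES customer(cust_id)
    );
    CREATE TABLE order_line (
        order_no INTEGER REFERENCES sales(order_no),
        product_no TEXT REFERENCES product(product_no),
        quantity_ordered INTEGER,
        PRIMARY KEY (order_no, product_no)
    );

    -- Hypothetical sample rows.
    INSERT INTO customer VALUES (10, 'Ann Lee', '1 Main St', 'Austin', 'TX', '78701');
    INSERT INTO product VALUES ('P1', 'Widget', 2.5), ('P2', 'Gadget', 10.0);
    INSERT INTO sales VALUES (1, '2024-01-15', 10);
    INSERT INTO order_line VALUES (1, 'P1', 4), (1, 'P2', 1);
""")

# Subtotal is derived on demand from price and quantity.
subtotal = conn.execute("""
    SELECT SUM(p.price * ol.quantity_ordered)
    FROM order_line AS ol
    JOIN product AS p ON p.product_no = ol.product_no
    WHERE ol.order_no = 1
""").fetchone()[0]
print(subtotal)
```

Because the subtotal is calculated from the stored price and quantity, it cannot fall out of step with the order lines the way a stored copy could.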