Week 1

Uploaded by

osesayjr

WEEK 1

LECTURE 1: NORMALIZATION TECHNIQUES

Lecture Synopsis:

1.1 Understanding Normalization

1.2 Detailed steps of normalization (1NF, 2NF, 3NF)

1.3 Practical exercises on normalizing data

What Is Database Normalization?

Database normalization, or just normalization as it’s commonly called, is a process used in data modelling and database design in which you organize your data and tables so that records can be added and updated efficiently.

Normalization is the process of organizing the data in the database.

It’s something a person does manually, as opposed to a system or a tool doing it. It’s commonly done by database developers and database administrators.

It can be done on any relational database, where data is stored in tables that are linked to each other. This means that normalization in a DBMS (Database Management System) can be done in Oracle, Microsoft SQL Server, MySQL, PostgreSQL and any other relational database.

To perform the normalization process, you start with a rough idea of the data you want to store, and apply certain rules to it in order to get it to a more efficient form.

Why Normalize a Database?

So why would anyone want to normalize their database?

Why do we want to go through this manual process of rearranging the data?

There are a few reasons we would want to go through this process:

• Make the database more efficient
• Prevent the same data from being stored in more than one place (data redundancy)
• Prevent a new record from being blocked, or stored with gaps, because other data is missing (called an “Insert Anomaly”)
• Prevent updates being made to some copies of the data but not others (called an “Update Anomaly”)
• Prevent data from not being deleted when it is supposed to be, or from being lost when it is not supposed to be (called a “Delete Anomaly”)
• Ensure the data is accurate
• Reduce the storage space that a database takes up
• Ensure the queries on a database run as fast as possible

Normalization in a DBMS is done to achieve these points and remove these anomalies.

Without normalization, a database can accumulate redundant data, which causes data integrity and other problems as the database grows. The database can become slow, incorrect, and messy.

Data Anomalies

An anomaly is an issue in the data that is not meant to be there. This can happen if a database is not normalized.

There are different kinds of data anomalies that can occur, and they can be prevented with a normalized database.

Example

We’ll be using a student database as an example in this note, which records student, course, and class (module) information.

Student ID  Student Name   Fees Paid  Course Name  Module 1    Module 2  Module 3
1           John Sesay     200        BIT          Web Design  OOP 1
2           Maria Kanu     500        BSEM         Database    Maths     Programming 2
3           Susan Johnson  400        BICT         Multimedia
4           Mathew Cole    850        DIT

This table keeps track of a few pieces of information:

• The student names
• The fees a student has paid
• The classes a student is taking, if any

This is not a normalized table, and there are a few issues with this.

Insert Anomaly

An insert anomaly happens when we try to insert a record into this table without knowing all the data we need to know. That is, a new record cannot be inserted into a table properly because some of its data is missing.

For example, suppose we wanted to add a new student but did not know their course name.

The new record would look like this:

Student ID  Student Name   Fees Paid  Course Name  Class 1     Class 2   Class 3
1           John Sesay     200        BIT          Web Design  Database
2           Maria Kanu     500        BSEM         Database    Maths     Programming 2
3           Susan Johnson  400        BICT         Multimedia
4           Mathew Cole    850        DIT
5           Alie Turay     500        ?            ?

We would be adding incomplete data to our table, which can cause issues when trying to analyze this data.
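The insert anomaly can be tried out with a minimal Python sketch using the standard sqlite3 module. The table name student, the column names, and the decision to make course_name mandatory are all assumptions made for illustration; the note does not prescribe a schema. With a mandatory course column the new student is rejected outright, and with an optional one the row would go in with gaps.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical single-table design mirroring the unnormalized student table.
conn.execute(
    """
    CREATE TABLE student (
        student_id   INTEGER PRIMARY KEY,
        student_name TEXT NOT NULL,
        fees_paid    INTEGER NOT NULL,
        course_name  TEXT NOT NULL,   -- assumed mandatory for this sketch
        class_1      TEXT,
        class_2      TEXT,
        class_3      TEXT
    )
    """
)


def try_insert(row):
    """Return True if the row was stored, False if the database rejected it."""
    try:
        conn.execute(
            "INSERT INTO student (student_id, student_name, fees_paid, course_name)"
            " VALUES (?, ?, ?, ?)",
            row,
        )
        return True
    except sqlite3.IntegrityError:
        return False


known = try_insert((4, "Mathew Cole", 850, "DIT"))
unknown = try_insert((5, "Alie Turay", 500, None))  # course not known yet
stored = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(known, unknown, stored)
```

In a normalized design the student could be stored in a student table first, and the course link added later once it is known.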

Update Anomaly

An update anomaly happens when we want to update data and we update some of the data but not other data.

For example, let’s say the class Database was renamed to “Database Systems”. We would have to search every column that could hold this class (Class 1, Class 2 and Class 3) and rename each value that was found.

Student ID  Student Name   Fees Paid  Course Name  Class 1     Class 2   Class 3
1           John Sesay     200        BIT          Web Design  Database
2           Maria Kanu     500        BSEM         Database    Maths     Programming
3           Susan Johnson  400        BICT         Multimedia
4           Mathew Cole    850        DIT

There’s a risk that we miss out on a value, which would cause issues.

Ideally, we would only update the value once, in one location.
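The risk of missing a value can be sketched in Python using the rows from the table above. The class being renamed here is Database, chosen as an assumption because it sits in two different columns for two different students; the function name rename_class and the dictionary keys are also hypothetical.

```python
CLASS_COLUMNS = ("class_1", "class_2", "class_3")

# Rows copied from the table above; None marks an empty class slot.
rows = [
    {"id": 1, "name": "John Sesay", "class_1": "Web Design", "class_2": "Database", "class_3": None},
    {"id": 2, "name": "Maria Kanu", "class_1": "Database", "class_2": "Maths", "class_3": "Programming"},
    {"id": 3, "name": "Susan Johnson", "class_1": "Multimedia", "class_2": None, "class_3": None},
    {"id": 4, "name": "Mathew Cole", "class_1": None, "class_2": None, "class_3": None},
]


def rename_class(rows, old, new, columns=CLASS_COLUMNS):
    """Rename a class in every listed column; return how many cells changed."""
    changed = 0
    for row in rows:
        for col in columns:
            if row[col] == old:
                row[col] = new
                changed += 1
    return changed


# A careless update that only checks class_1 misses John's class_2 value.
partial = [dict(r) for r in rows]
rename_class(partial, "Database", "Database Systems", columns=("class_1",))
leftover = [r["name"] for r in partial if "Database" in (r[c] for c in CLASS_COLUMNS)]

# The complete update has to touch two cells in two different columns.
cells_changed = rename_class(rows, "Database", "Database Systems")
print(leftover, cells_changed)
```

If class names lived in one row of a separate table, the rename would be a single change in a single place.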

Delete Anomaly

A delete anomaly occurs when we want to delete data from the table, but we end up deleting more than what we intended.

For example, let’s say Susan Johnson quits and her record needs to be deleted from the system. We could delete her row:

Student ID  Student Name   Fees Paid  Course Name  Class 1     Class 2   Class 3
1           John Sesay     200        BIT          Web Design  Database
2           Maria Kanu     500        BSEM         Database    Maths     Programming
3           Susan Johnson  400        BICT         Multimedia
4           Mathew Cole    850        DIT

But if we delete this row, we lose the record of the Multimedia class, because it’s not stored anywhere else. The same can be said for the BICT course.

We should be able to delete one type of data or one record without having impacts on other records we don’t want to delete.
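A short Python sketch makes the loss visible. It uses the student rows from the update anomaly table earlier in this note, and the helper names known_classes and known_courses are hypothetical. It compares what the table "knows" before and after Susan Johnson's row is removed.

```python
# Rows taken from the student table used for the update anomaly example.
rows = [
    {"name": "John Sesay", "course": "BIT", "classes": ["Web Design", "Database"]},
    {"name": "Maria Kanu", "course": "BSEM", "classes": ["Database", "Maths", "Programming"]},
    {"name": "Susan Johnson", "course": "BICT", "classes": ["Multimedia"]},
    {"name": "Mathew Cole", "course": "DIT", "classes": []},
]


def known_classes(rows):
    """Every class name that appears somewhere in the table."""
    return {c for row in rows for c in row["classes"]}


def known_courses(rows):
    """Every course name that appears somewhere in the table."""
    return {row["course"] for row in rows}


# Delete only Susan's row, as the example describes.
remaining = [r for r in rows if r["name"] != "Susan Johnson"]

# Anything in these sets existed only because Susan's row existed.
lost_classes = known_classes(rows) - known_classes(remaining)
lost_courses = known_courses(rows) - known_courses(remaining)
print(lost_classes, lost_courses)
```

Keeping classes and courses in their own tables would let the student row go without taking them along.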

What Are The Normal Forms?

The process of normalization involves applying rules to a set of data. Each of these rules transforms the data to a certain structure, called a normal form.

There are three main normal forms that you should consider (actually, there are six normal forms in total, but the first three are the most common).

When the first rule is applied, the data is in “first normal form”. Then the second rule is applied and the data is in “second normal form”. The third rule is then applied and the data is in “third normal form”.

Fourth and fifth normal forms are then achieved from their specific rules.

First Normal Form (1NF)

• A relation is in 1NF if every attribute contains only atomic (indivisible) values.
• An attribute of a table cannot hold multiple values. It must hold only a single value.
• First normal form disallows multi-valued attributes, composite attributes, and their combinations.
• 1NF ensures that the table is organized so that each column contains atomic values and each record is unique. This eliminates repeating groups, thereby structuring data into tables and columns.

Example 1:

EMPLOYEE table:

EMP_ID  EMP_NAME      DEPARTMENT  BRANCH
1       John Sesay    IT          Aberdeen, Lumley
2       Henry Tucker  Finance     Kissy, Wellington
3       Sam Johnny    HR          Aberdeen, Lumley, Kissy, Wellington

The decomposition of the EMPLOYEE table into 1NF is shown below:

EMP_ID  EMP_NAME      DEPARTMENT  BRANCH
1       John Sesay    IT          Aberdeen
1       John Sesay    IT          Lumley
2       Henry Tucker  Finance     Kissy
2       Henry Tucker  Finance     Wellington
3       Sam Johnny    HR          Aberdeen
3       Sam Johnny    HR          Lumley
3       Sam Johnny    HR          Kissy
3       Sam Johnny    HR          Wellington
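The step from the first EMPLOYEE table to the second can be sketched in a few lines of Python. The function name to_1nf and the use of dictionaries for rows are assumptions for illustration; the idea is simply to emit one row per value in the multi-valued column.

```python
def to_1nf(rows, column, sep=","):
    """Yield one row per value found in the multi-valued column."""
    for row in rows:
        for value in row[column].split(sep):
            # Copy the row and keep a single, trimmed value in the column.
            yield {**row, column: value.strip()}


employees = [
    {"EMP_ID": 1, "EMP_NAME": "John Sesay", "DEPARTMENT": "IT",
     "BRANCH": "Aberdeen, Lumley"},
    {"EMP_ID": 2, "EMP_NAME": "Henry Tucker", "DEPARTMENT": "Finance",
     "BRANCH": "Kissy, Wellington"},
    {"EMP_ID": 3, "EMP_NAME": "Sam Johnny", "DEPARTMENT": "HR",
     "BRANCH": "Aberdeen, Lumley, Kissy, Wellington"},
]

flat = list(to_1nf(employees, "BRANCH"))
for row in flat:
    print(row["EMP_ID"], row["EMP_NAME"], row["DEPARTMENT"], row["BRANCH"])
```

Every BRANCH cell now holds one value, at the cost of repeating the employee's other details on each row.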

Example 2:

Assume a video library maintains a database of movies rented out. Without any normalization in the database, all information is stored in one table as shown below.

ID   Full name    Address            Movies Rented                                          Contact
101  John Sesay   7 Campbell Street  Strike Back, 13 Hours The Secret Soldiers Of Benghazi  23200123456
102  Abu Koroma   6 Sanders Street   Picture Of Her, Bob Marley One Love                    23211258963
103  Peter Amara  12 Amara Drive     Strike Back                                            23233000000

Here you can see that the Movies Rented column has multiple values.

Now let’s move the table into First Normal Form (1NF):

• Each table cell should contain a single value.
• Each record needs to be unique.

The above table in 1NF:

ID   Full name    Address            Movies Rented                             Contact
101  John Sesay   7 Campbell Street  Strike Back                               23200123456
101  John Sesay   7 Campbell Street  13 Hours The Secret Soldiers Of Benghazi  23200123456
102  Abu Koroma   6 Sanders Street   Picture Of Her                            23211258963
102  Abu Koroma   6 Sanders Street   Bob Marley One Love                       23211258963
103  Peter Amara  12 Amara Drive     Strike Back                               23233000000
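The rule that each record needs to be unique can be enforced by the database itself. The sketch below uses Python's standard sqlite3 module with a hypothetical table named rental, and it assumes that ID identifies the customer (so both of John Sesay's rows carry ID 101) and that the pair (ID, movie) is what makes a row unique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical 1NF table: one movie per row, unique on (customer_id, movie).
conn.execute(
    """
    CREATE TABLE rental (
        customer_id INTEGER NOT NULL,
        full_name   TEXT NOT NULL,
        address     TEXT NOT NULL,
        movie       TEXT NOT NULL,
        contact     TEXT NOT NULL,
        PRIMARY KEY (customer_id, movie)
    )
    """
)

rentals = [
    (101, "John Sesay", "7 Campbell Street", "Strike Back", "23200123456"),
    (101, "John Sesay", "7 Campbell Street",
     "13 Hours The Secret Soldiers Of Benghazi", "23200123456"),
    (102, "Abu Koroma", "6 Sanders Street", "Picture Of Her", "23211258963"),
    (102, "Abu Koroma", "6 Sanders Street", "Bob Marley One Love", "23211258963"),
    (103, "Peter Amara", "12 Amara Drive", "Strike Back", "23233000000"),
]
conn.executemany("INSERT INTO rental VALUES (?, ?, ?, ?, ?)", rentals)

# Inserting an exact copy of an existing row violates the composite key.
try:
    conn.execute("INSERT INTO rental VALUES (?, ?, ?, ?, ?)", rentals[0])
    duplicate_accepted = True
except sqlite3.IntegrityError:
    duplicate_accepted = False

row_count = conn.execute("SELECT COUNT(*) FROM rental").fetchone()[0]
print(duplicate_accepted, row_count)
```

Two customers may rent the same movie (Strike Back appears twice), but the same customer and movie pair cannot be stored twice.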

Example 3:

The Employee table below shows employees who work in multiple departments.

ID  Employee  Age  Department
1   Melvin    32   Marketing, Sales
2   Edward    45   Quality Assurance
3   Alex      36   Human Resource

Employee table following 1NF:

ID  Employee  Age  Department
1   Melvin    32   Marketing
1   Melvin    32   Sales
2   Edward    45   Quality Assurance
3   Alex      36   Human Resource

Example 4:

Student Database

ID  Name             Age  Gender  Phone Number  Faculty  Courses  Modules                                                                Lecturers
1   John Doe         22   Male    +23276123456  FICT     DIT      Database Systems, Object-Oriented Programming Methods 1                Mr Umar, Mr Jelil
2   Jane Smith       21   Female  +23277654321  FDI      BIT      Fundamentals of Computer Systems, Database Design & Management 2       Mr Amandus, Mr Kanu
3   Abdul Kamara     23   Male    +23278987654  FABE     BBIT     Multimedia Technology, Web Design 1                                    Mr Sahid, Mr Umar
4   Aminata Bangura  20   Female  +23279321987  FICT     BICT     Object-Oriented Programming Methods 1, Database Design & Management 2  Mr Jelil, Mr Amandus
5   Mohamed Sesay    24   Male    +23276555666  FDI      BSEM     Web Design 1, Fundamentals of Computer Systems                         Mr Kanu, Mr Sahid

The above table in 1NF:

ID  Name             Age  Gender  Phone Number  Faculty  Courses  Modules                                Lecturers
1   John Doe         22   Male    +23276123456  FICT     DIT      Database Systems                       Mr Umar
1   John Doe         22   Male    +23276123456  FICT     DIT      Object-Oriented Programming Methods 1  Mr Jelil
2   Jane Smith       21   Female  +23277654321  FDI      BIT      Fundamentals of Computer Systems       Mr Amandus
2   Jane Smith       21   Female  +23277654321  FDI      BIT      Database Design & Management 2         Mr Kanu
3   Abdul Kamara     23   Male    +23278987654  FABE     BBIT     Multimedia Technology                  Mr Sahid
3   Abdul Kamara     23   Male    +23278987654  FABE     BBIT     Web Design 1                           Mr Umar
4   Aminata Bangura  20   Female  +23279321987  FICT     BICT     Object-Oriented Programming Methods 1  Mr Jelil
4   Aminata Bangura  20   Female  +23279321987  FICT     BICT     Database Design & Management 2         Mr Amandus
5   Mohamed Sesay    24   Male    +23276555666  FDI      BSEM     Web Design 1                           Mr Kanu
5   Mohamed Sesay    24   Male    +23276555666  FDI      BSEM     Fundamentals of Computer Systems       Mr Sahid
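This example has two multi-valued columns, Modules and Lecturers, that have to be split together. The Python sketch below assumes that the first module pairs with the first lecturer, the second with the second, and so on, which is how the 1NF table above reads. The function name split_pairs is hypothetical. All other columns, including the phone number, are copied unchanged onto every new row.

```python
def split_pairs(row, left, right, sep=","):
    """Split two parallel multi-valued columns into one row per pair."""
    lefts = [v.strip() for v in row[left].split(sep)]
    rights = [v.strip() for v in row[right].split(sep)]
    if len(lefts) != len(rights):
        raise ValueError("the two columns do not pair up one-to-one")
    # Every other column is carried over as-is.
    return [{**row, left: l, right: r} for l, r in zip(lefts, rights)]


student = {
    "ID": 1,
    "Name": "John Doe",
    "Phone Number": "+23276123456",
    "Modules": "Database Systems, Object-Oriented Programming Methods 1",
    "Lecturers": "Mr Umar, Mr Jelil",
}

flat = split_pairs(student, "Modules", "Lecturers")
for row in flat:
    print(row["ID"], row["Phone Number"], row["Modules"], row["Lecturers"])
```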

Second Normal Form (2NF)

• For a relation to be in 2NF, it must already be in 1NF.
• In second normal form, all non-key attributes are fully functionally dependent on the primary key. That is, every attribute that is not part of the primary key should depend on the whole unique identifier of the entity, not on just part of it.

Example 1: Student Database

Students Table:

Student ID  Name             Age  Gender  Phone Number  Faculty  Courses
1           John Doe         22   Male    23276123456   FICT     DIT
2           Jane Smith       21   Female  23277654321   FDI      BIT
3           Abdul Kamara     23   Male    23278987654   FABE     BBIT
4           Aminata Bangura  20   Female  23279321987   FICT     BICT
5           Mohamed Sesay    24   Male    23276555666   FDI      BSEM

Modules Table:

Module ID  Module Name                            Lecturer
1          Database Systems                       Mr Umar
2          Object-Oriented Programming Methods 1  Mr Jelil
3          Fundamentals of Computer Systems       Mr Amandus
4          Database Design & Management 2         Mr Kanu
5          Multimedia Technology                  Mr Sahid
6          Web Design 1                           Mr Umar

In this 2NF structure:

• The Students table contains unique student information, fully dependent on the Student ID.
• The Modules table contains unique module information, fully dependent on the Module ID.
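To see why the 1NF student table had to be split, it helps to test which columns depend on only part of its key. The Python sketch below assumes the key of the 1NF table is the pair (student_id, module) and uses a hypothetical helper, determines, that checks a functional dependency against sample rows. Such a check can only show that a dependency holds in the sample data, not that it holds in general.

```python
def determines(rows, lhs, rhs):
    """True if rows that agree on lhs always agree on rhs (in this sample)."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for a key and returns it after.
        if seen.setdefault(key, val) != val:
            return False
    return True


# A few rows from the 1NF student table; assumed key: (student_id, module).
rows = [
    {"student_id": 1, "name": "John Doe", "module": "Database Systems", "lecturer": "Mr Umar"},
    {"student_id": 1, "name": "John Doe", "module": "Object-Oriented Programming Methods 1", "lecturer": "Mr Jelil"},
    {"student_id": 2, "name": "Jane Smith", "module": "Fundamentals of Computer Systems", "lecturer": "Mr Amandus"},
    {"student_id": 2, "name": "Jane Smith", "module": "Database Design & Management 2", "lecturer": "Mr Kanu"},
]

# name depends on student_id alone, which is only part of the key: a 2NF violation.
name_on_student = determines(rows, ("student_id",), ("name",))
# lecturer is not fixed by the student alone...
lecturer_on_student = determines(rows, ("student_id",), ("lecturer",))
# ...but in these rows it is fixed by the module alone.
lecturer_on_module = determines(rows, ("module",), ("lecturer",))
print(name_on_student, lecturer_on_student, lecturer_on_module)
```

Moving the student columns into a Students table and the module columns into a Modules table removes these partial dependencies. A further table linking Student ID to Module ID would be needed to record who takes what; that table is an assumption here and is not shown in the note.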

Third Normal Form (3NF)

A relation (table) is in Third Normal Form if:

1. It is in Second Normal Form (2NF): This means that it is already in First Normal Form (1NF), and all non-key attributes are fully functionally dependent on the primary key.

2. It contains no transitive dependencies: This means that non-key attributes must not depend on other non-key attributes. Every non-key attribute must be directly dependent on the primary key and not on any other non-key attribute.

In simpler terms, 3NF ensures that:

• There is no redundancy in non-key attributes.
• All attributes are directly dependent on the primary key.
• Data duplication is reduced, which also helps to achieve data integrity.
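A transitive dependency can be illustrated with the Faculty and Courses columns of the 2NF Students table. The Python sketch below assumes that each course belongs to exactly one faculty, so that Student ID determines Course and Course in turn determines Faculty. That assumption is consistent with the five sample rows but is not stated in the note, and the names course_faculty, students_3nf and faculty_of are hypothetical.

```python
# Columns taken from the 2NF Students table.
students = [
    {"student_id": 1, "name": "John Doe", "course": "DIT", "faculty": "FICT"},
    {"student_id": 2, "name": "Jane Smith", "course": "BIT", "faculty": "FDI"},
    {"student_id": 3, "name": "Abdul Kamara", "course": "BBIT", "faculty": "FABE"},
    {"student_id": 4, "name": "Aminata Bangura", "course": "BICT", "faculty": "FICT"},
    {"student_id": 5, "name": "Mohamed Sesay", "course": "BSEM", "faculty": "FDI"},
]

# Move the course -> faculty fact into its own lookup (assumed one faculty per course).
course_faculty = {s["course"]: s["faculty"] for s in students}

# The student rows keep only attributes that depend directly on student_id.
students_3nf = [{k: v for k, v in s.items() if k != "faculty"} for s in students]


def faculty_of(student_id):
    """Recover a student's faculty by following student -> course -> faculty."""
    course = next(s["course"] for s in students_3nf if s["student_id"] == student_id)
    return course_faculty[course]


print(faculty_of(4))
```

Nothing is lost by the split: the faculty can still be reached through the course, but it is now stored once per course rather than once per student.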

Students Table:

Student ID  Name             Age  Gender  Phone Number
1           John Doe         22   Male    23276123456
2           Jane Smith       21   Female  23277654321
3           Abdul Kamara     23   Male    23278987654
4           Aminata Bangura  20   Female  23279321987
5           Mohamed Sesay    24   Male    23276555666

Modules Table:

Module ID  Module Name                            Student ID  Lecturer ID
1          Database Systems
2          Object-Oriented Programming Methods 1
3          Fundamentals of Computer Systems
4          Database Design & Management 2
5          Multimedia Technology
6          Web Design 1

Lecturer Table:

Lecturer ID  Lecturer
1            Mr Umar
2            Mr Jelil
3            Mr Amandus
4            Mr Kanu
5            Mr Sahid

Faculty Table:

Faculty ID  Faculty  Student ID  Lecturer ID
1           FICT
2           FDI
3           FABE

Course Table:

Course ID  Courses  Student ID  Lecturer ID
1          DIT
2          BIT
3          BBIT
4          BICT
5          BSEM
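The way such tables link together through ID columns can be sketched with Python's standard sqlite3 module. The schema below is an assumption for illustration only: it covers just the Lecturer and Modules tables, lists each lecturer once, and fills in the Lecturer ID of each module from the pairing given in the 2NF Modules table. The table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.executescript(
    """
    CREATE TABLE lecturer (
        lecturer_id INTEGER PRIMARY KEY,
        lecturer    TEXT NOT NULL UNIQUE
    );
    CREATE TABLE module (
        module_id   INTEGER PRIMARY KEY,
        module_name TEXT NOT NULL,
        lecturer_id INTEGER NOT NULL REFERENCES lecturer(lecturer_id)
    );
    """
)

conn.executemany(
    "INSERT INTO lecturer VALUES (?, ?)",
    [(1, "Mr Umar"), (2, "Mr Jelil"), (3, "Mr Amandus"), (4, "Mr Kanu"), (5, "Mr Sahid")],
)
conn.executemany(
    "INSERT INTO module VALUES (?, ?, ?)",
    [
        (1, "Database Systems", 1),
        (2, "Object-Oriented Programming Methods 1", 2),
        (3, "Fundamentals of Computer Systems", 3),
        (4, "Database Design & Management 2", 4),
        (5, "Multimedia Technology", 5),
        (6, "Web Design 1", 1),
    ],
)

# The lecturer's name is stored once and reached through a join.
umar_modules = [
    name
    for (name,) in conn.execute(
        "SELECT m.module_name FROM module m"
        " JOIN lecturer l ON l.lecturer_id = m.lecturer_id"
        " WHERE l.lecturer = ? ORDER BY m.module_id",
        ("Mr Umar",),
    )
]

# A module pointing at a lecturer that does not exist is rejected.
try:
    conn.execute("INSERT INTO module VALUES (7, 'Unknown Module', 99)")
    orphan_accepted = True
except sqlite3.IntegrityError:
    orphan_accepted = False

print(umar_modules, orphan_accepted)
```

Because Mr Umar appears in one row only, correcting his name would be a single update, which is exactly what the update anomaly section asks for.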
