
DBMS, unit-5

Normalization is a process in database management that reduces redundancy and prevents anomalies such as insertion, update, and deletion. It involves organizing data into normal forms (1NF, 2NF, 3NF, BCNF) to ensure data integrity and consistency. Each normal form addresses specific types of dependencies and anomalies to improve database design and efficiency.

Uploaded by

Patel Shubham
Copyright: © All Rights Reserved

Database Management System

Unit 5: Normalization

Normalization is the process of minimizing redundancy in a relation or set of relations.

Redundancy in a relation may cause insertion, deletion, and update anomalies, so normalization
helps to avoid these anomalies by minimizing the redundancy.

Normalization is used to keep data consistent and to preserve data integrity, and it must be done
without any loss of data.

Anomalies in DBMS -
There are three types of anomalies that occur when the database is not normalized. These are:
Insertion, update and deletion anomaly. Let’s take an example to understand this.

Example: A manufacturing company stores employee details in a table Employee that has four
attributes: Emp_Id for the employee’s id, Emp_Name for the employee’s name, Emp_Address
for the employee’s address and Emp_Dept for the department in which the
employee works. At some point in time the table looks like this:

Emp_Id  Emp_Name  Emp_Address  Emp_Dept
101     Rick      Delhi        D001
101     Rick      Delhi        D002
123     Maggie    Agra         D890
166     Glenn     Chennai      D900
166     Glenn     Chennai      D004

This table is not normalized. The problems that arise when a table in a database is not
normalized are described below.

Update anomaly: The table has two rows for employee Rick because he belongs to two
departments of the company. To update Rick’s address we have to update it in both rows.
If the address is updated in one row but not in the other, then as per the database Rick
has two different addresses, and the data is inconsistent.

Insert anomaly: Suppose a new employee joins the company who is under training and not yet
assigned to any department. We cannot insert this employee into the table if the
Emp_Dept field doesn’t allow null.

Delete anomaly: Suppose the company later closes department D890. Deleting the rows
that have Emp_Dept as D890 also deletes the information of employee Maggie, since she
is assigned only to this department.

To overcome these anomalies we need to normalize the data.
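
The three anomalies can be reproduced on the Employee table with a small script. This is only a sketch using Python's built-in sqlite3 module and an in-memory database; the new address "Mumbai" and the trainee row (200, "Trainee", "Pune") are hypothetical values added for illustration and are not part of the notes.

```python
import sqlite3


def build_employee_table():
    """Create the un-normalized Employee table from the example in memory."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Employee ("
        "Emp_Id INTEGER, Emp_Name TEXT, Emp_Address TEXT, Emp_Dept TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO Employee VALUES (?, ?, ?, ?)",
        [
            (101, "Rick", "Delhi", "D001"),
            (101, "Rick", "Delhi", "D002"),
            (123, "Maggie", "Agra", "D890"),
            (166, "Glenn", "Chennai", "D900"),
            (166, "Glenn", "Chennai", "D004"),
        ],
    )
    return conn


def update_anomaly():
    """Update only one of Rick's rows; return the addresses now stored for him."""
    conn = build_employee_table()
    # Hypothetical new address, applied to the D001 row only.
    conn.execute(
        "UPDATE Employee SET Emp_Address = 'Mumbai' "
        "WHERE Emp_Id = 101 AND Emp_Dept = 'D001'"
    )
    rows = conn.execute(
        "SELECT DISTINCT Emp_Address FROM Employee "
        "WHERE Emp_Id = 101 ORDER BY Emp_Address"
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


def insert_anomaly():
    """Try to insert a trainee with no department; return True if it is rejected."""
    conn = build_employee_table()
    try:
        # Hypothetical trainee row; Emp_Dept is NULL because no department is assigned.
        conn.execute(
            "INSERT INTO Employee VALUES (?, ?, ?, ?)",
            (200, "Trainee", "Pune", None),
        )
        return False
    except sqlite3.IntegrityError:
        return True
    finally:
        conn.close()


def delete_anomaly():
    """Close department D890; return how many rows still mention Maggie."""
    conn = build_employee_table()
    conn.execute("DELETE FROM Employee WHERE Emp_Dept = 'D890'")
    remaining = conn.execute(
        "SELECT COUNT(*) FROM Employee WHERE Emp_Name = 'Maggie'"
    ).fetchone()[0]
    conn.close()
    return remaining


print(update_anomaly())   # two different addresses for the same employee
print(insert_anomaly())   # True: the row without a department is rejected
print(delete_anomaly())   # 0: Maggie's details vanished with the department
```

Each function starts from a fresh copy of the table, so the three anomalies can be observed independently of one another.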

Created By : Krupa Patel Page 1


Normalization:

Normalization is done through normal forms. The normal forms in DBMS are:

1. First Normal Form (1NF)
2. Second Normal Form (2NF)
3. Third Normal Form (3NF)
4. Boyce Codd Normal Form (BCNF)


First normal form (1NF):
A relation is said to be in 1NF (first normal form) if it doesn’t contain any multi-valued
attribute.

In other words, a relation is in 1NF if each attribute contains only an
atomic (single) value.

Example:

Let’s say a company wants to store the names and contact details of its employees. It creates a
table in the database that looks like this:

Emp_Id  Emp_Name  Emp_Address  Emp_Mobile
101     Roshan    New Delhi    8912312390
102     Aman      Kanpur       8812121212, 9900012222
103     Hitesh    Chennai      7778881212
104     Raj       Bangalore    9990000123, 8123450987

Two employees (Aman & Raj) have two mobile numbers each, so the Emp_Mobile field holds
multiple values for these two employees.

This table is not in 1NF. The rule says “each attribute of a table must have atomic (single) values”,
and the Emp_Mobile values for employees Aman & Raj violate that rule.

To make the table comply with 1NF we create a separate row for each mobile number, so
that none of the attributes contains multiple values.

Emp_Id  Emp_Name  Emp_Address  Emp_Mobile
101     Roshan    New Delhi    8912312390
102     Aman      Kanpur       8812121212
102     Aman      Kanpur       9900012222
103     Hitesh    Chennai      7778881212
104     Raj       Bangalore    9990000123
104     Raj       Bangalore    8123450987
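
The conversion to 1NF shown above can be sketched in a few lines of Python. The function name `to_first_normal_form` and the assumption that multiple values are stored as one comma-separated string are illustrative choices, not something the notes prescribe.

```python
def to_first_normal_form(rows, multi_valued_column):
    """Return new rows in which `multi_valued_column` holds one atomic value per row.

    Assumes multiple values are stored as a single comma-separated string.
    """
    result = []
    for row in rows:
        values = [v.strip() for v in row[multi_valued_column].split(",") if v.strip()]
        for value in values:
            new_row = dict(row)              # copy the other attributes unchanged
            new_row[multi_valued_column] = value
            result.append(new_row)
    return result


EMPLOYEES = [
    {"Emp_Id": 101, "Emp_Name": "Roshan", "Emp_Address": "New Delhi",
     "Emp_Mobile": "8912312390"},
    {"Emp_Id": 102, "Emp_Name": "Aman", "Emp_Address": "Kanpur",
     "Emp_Mobile": "8812121212, 9900012222"},
    {"Emp_Id": 103, "Emp_Name": "Hitesh", "Emp_Address": "Chennai",
     "Emp_Mobile": "7778881212"},
    {"Emp_Id": 104, "Emp_Name": "Raj", "Emp_Address": "Bangalore",
     "Emp_Mobile": "9990000123, 8123450987"},
]

for row in to_first_normal_form(EMPLOYEES, "Emp_Mobile"):
    print(row["Emp_Id"], row["Emp_Name"], row["Emp_Address"], row["Emp_Mobile"])
```

The four input rows become six output rows, one per mobile number, which matches the 1NF table above.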



Second Normal Form (2NF):

A relation is in Second Normal Form (2NF) if and only if:

1. The relation is in the First Normal Form.
2. There is no Partial Dependency.

A Partial Dependency occurs when a non-prime attribute is functionally dependent on only part of a
candidate key.

The 2nd Normal Form (2NF) eliminates Partial Dependencies.


Let us see an example −

<StudentProject>

StudentID  ProjectNo  StudentName  ProjectName
S01        199        Mahek        Geo Location
S02        120        Rahil        Cluster Exploration

StudentID = Unique ID of the student
StudentName = Name of the student
ProjectNo = Unique ID of the project
ProjectName = Name of the project

In the above table, we have partial dependencies; let us see how −

The candidate key is the combination of StudentID and ProjectNo, so these two are the prime
attributes. StudentName and ProjectName are the non-prime attributes.

As stated, a Partial Dependency exists when a non-prime attribute is functionally
dependent on only part of a candidate key.

StudentName can be determined by StudentID alone, which is a Partial Dependency.

ProjectName can be determined by ProjectNo alone, which is also a Partial Dependency.

Therefore, the <StudentProject> relation violates 2NF and is considered a bad
database design.

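To remove the Partial Dependencies and the violation of 2NF, decompose the table –

<StudentInfo>

StudentID  ProjectNo  StudentName
S01        199        Mahek
S02        120        Rahil

<ProjectInfo>

ProjectNo  ProjectName
199        Geo Location
120        Cluster Exploration

Note that <StudentInfo> as shown still contains StudentID -> StudentName while its key is
(StudentID, ProjectNo). For strict 2NF, StudentName would move to a relation keyed by StudentID
alone, leaving (StudentID, ProjectNo) as the relation that links students to projects.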
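
The definition of a Partial Dependency can be checked mechanically once the candidate key and the functional dependencies are written down. The sketch below assumes a relation with a single candidate key; the function name `partial_dependencies` and the representation of each dependency as a (left side, right side) pair of sets are illustrative choices.

```python
def partial_dependencies(candidate_key, fds):
    """List dependencies where a non-prime attribute depends on part of the key.

    Assumes the relation has exactly one candidate key, so the prime
    attributes are exactly the attributes of `candidate_key`.
    """
    key = frozenset(candidate_key)
    found = []
    for lhs, rhs in fds:
        lhs = frozenset(lhs)
        non_prime_rhs = sorted(set(rhs) - key)
        # `lhs < key` is true only when lhs is a proper subset of the key.
        if lhs < key and non_prime_rhs:
            found.append((sorted(lhs), non_prime_rhs))
    return found


# <StudentProject> with candidate key (StudentID, ProjectNo)
STUDENT_PROJECT_KEY = {"StudentID", "ProjectNo"}
STUDENT_PROJECT_FDS = [
    ({"StudentID"}, {"StudentName"}),
    ({"ProjectNo"}, {"ProjectName"}),
]

# <ProjectInfo> with candidate key (ProjectNo)
PROJECT_INFO_KEY = {"ProjectNo"}
PROJECT_INFO_FDS = [({"ProjectNo"}, {"ProjectName"})]

print(partial_dependencies(STUDENT_PROJECT_KEY, STUDENT_PROJECT_FDS))
print(partial_dependencies(PROJECT_INFO_KEY, PROJECT_INFO_FDS))
```

For <StudentProject> both dependencies are reported, while <ProjectInfo> yields an empty list because ProjectName depends on the whole key.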



Third Normal Form(3NF) :

A relation is in Third Normal Form (3NF) if and only if:

1. The relation is already in 2NF.
2. No transitive dependency exists for non-prime attributes.

In other words, a relation is in Third Normal Form (3NF) if it is in First and Second Normal Form
and no non-primary-key attribute is transitively dependent on the primary key.

Note – If A->B and B->C are two FDs, then A->C is called a transitive dependency.

The normalization of 2NF relations to 3NF involves the removal of transitive dependencies.

If a transitive dependency exists, we remove the transitively dependent attribute(s) from the
relation and place them in a new relation along with a copy of the determinant.
Consider the examples given below.

Example-1:
In relation STUDENT,

STUD_NO  STUD_NAME  STUD_STATE  STUD_COUNTRY  STUD_AGE
1        Harshit    Gujarat     India         20
2        Karan      Punjab      India         19
3        Jaimin     Punjab      India         21

FD set:
{STUD_NO -> STUD_NAME,
STUD_NO -> STUD_STATE,
STUD_STATE -> STUD_COUNTRY,
STUD_NO -> STUD_AGE}

Candidate Key:
{STUD_NO}

For this relation, STUD_NO -> STUD_STATE and STUD_STATE -> STUD_COUNTRY are true.
So STUD_COUNTRY is transitively dependent on STUD_NO.
It violates the third normal form.

To convert it to third normal form, we decompose the relation

STUDENT (STUD_NO, STUD_NAME, STUD_STATE, STUD_COUNTRY, STUD_AGE) as:

STUDENT (STUD_NO, STUD_NAME, STUD_STATE, STUD_AGE)
STATE_COUNTRY (STUD_STATE, STUD_COUNTRY)
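
The 3NF check for the STUDENT example can be sketched with the standard attribute-closure computation. The function names `closure` and `third_nf_violations` are illustrative; the test applied is the usual one: a dependency X -> A violates 3NF when X is not a super key and A is not a prime attribute.

```python
def closure(attributes, fds):
    """Return every attribute determined by `attributes` under the given FDs."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already know the whole left side, we also know the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def third_nf_violations(all_attributes, candidate_keys, fds):
    """List (determinant, attribute) pairs that break the 3NF condition."""
    prime = set().union(*candidate_keys)
    violations = []
    for lhs, rhs in fds:
        lhs_is_superkey = closure(lhs, fds) >= set(all_attributes)
        for attr in sorted(set(rhs) - set(lhs)):
            if not lhs_is_superkey and attr not in prime:
                violations.append((sorted(lhs), attr))
    return violations


ATTRS = {"STUD_NO", "STUD_NAME", "STUD_STATE", "STUD_COUNTRY", "STUD_AGE"}
FDS = [
    ({"STUD_NO"}, {"STUD_NAME"}),
    ({"STUD_NO"}, {"STUD_STATE"}),
    ({"STUD_STATE"}, {"STUD_COUNTRY"}),
    ({"STUD_NO"}, {"STUD_AGE"}),
]

# STUD_NO reaches STUD_COUNTRY only through STUD_STATE (the transitive step).
print(sorted(closure({"STUD_NO"}, FDS)))
print(third_nf_violations(ATTRS, [{"STUD_NO"}], FDS))

# After decomposition, each relation is checked with the FDs that apply to it.
STUDENT_ATTRS = {"STUD_NO", "STUD_NAME", "STUD_STATE", "STUD_AGE"}
STUDENT_FDS = [fd for fd in FDS if fd[0] == {"STUD_NO"}]
print(third_nf_violations(STUDENT_ATTRS, [{"STUD_NO"}], STUDENT_FDS))
```

The only violation reported for the original relation is STUD_STATE -> STUD_COUNTRY, and the decomposed STUDENT relation reports none.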
Note –
Third Normal Form (3NF) is considered adequate for normal relational database design because most
3NF tables are free of insertion, update, and deletion anomalies. Moreover, a relation can always be
decomposed into 3NF in a way that is both lossless and functional dependency preserving.


Normal forms at a glance

It’s time to summarize our reading on normal forms.

[Figure: summary of the normal forms]


Boyce Codd normal form (BCNF)
Rules for BCNF
For a table to satisfy the Boyce-Codd Normal Form, it should satisfy the following two conditions:

1. It should be in the Third Normal Form.
2. For every non-trivial dependency A → B, A should be a super key.

The second point sounds a bit tricky, right? In simple words, it means that for a dependency A → B,
A cannot be a non-prime attribute if B is a prime attribute.

Example

Below we have a college enrolment table with columns student_id, subject and professor.

student_id  subject  professor
101         Java     P.Java
101         C++      P.Cpp
102         Java     P.Java2
103         C#       P.Chash
104         Java     P.Java

As you can see, we have also added some sample data to the table.

In the table above:

• One student can enrol for multiple subjects. For example, the student with student_id 101 has opted
  for the subjects Java & C++.
• For each subject, a professor is assigned to the student.
• There can be multiple professors teaching one subject, as we have for Java.

What do you think should be the Primary Key?

Well, in the table above student_id, subject together form the primary key, because using
student_id and subject, we can find all the columns of the table.

One more important point to note here is, one professor teaches only one subject, but one subject
may have two different professors.

Hence, there is a dependency between subject and professor here, where subject depends on the
professor name.

This table satisfies the 1st Normal form because all the values are atomic, column names are unique
and all the values stored in a particular column are of same domain.

This table also satisfies the 2nd Normal Form as there is no Partial Dependency.

And, there is no Transitive Dependency, hence the table also satisfies the 3rd Normal Form.



But this table is not in Boyce-Codd Normal Form.

Why is this table not in BCNF?

In the table above, student_id and subject form the primary key, which means the subject column is a
prime attribute.

But there is one more dependency, professor → subject.

And while subject is a prime attribute, professor is a non-prime attribute and is not a super key,
which is not allowed by BCNF.
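
The BCNF rule (every non-trivial dependency must have a super key on its left side) can be tested in the same spirit. This is a minimal sketch; `closure` and `bcnf_violations` are illustrative names, and the two dependencies listed are the ones stated for the enrolment table.

```python
def closure(attributes, fds):
    """Return every attribute determined by `attributes` under the given FDs."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(all_attributes, fds):
    """List the non-trivial dependencies whose left side is not a super key."""
    violations = []
    for lhs, rhs in fds:
        is_trivial = set(rhs) <= set(lhs)
        is_superkey = closure(lhs, fds) >= set(all_attributes)
        if not is_trivial and not is_superkey:
            violations.append((sorted(lhs), sorted(rhs)))
    return violations


ENROLMENT_ATTRS = {"student_id", "subject", "professor"}
ENROLMENT_FDS = [
    ({"student_id", "subject"}, {"professor"}),  # the primary key determines the professor
    ({"professor"}, {"subject"}),                # one professor teaches only one subject
]

print(bcnf_violations(ENROLMENT_ATTRS, ENROLMENT_FDS))
```

Only professor → subject is reported: the closure of professor is {professor, subject}, which does not include student_id, so professor is not a super key.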

How to satisfy BCNF?

To make this relation (table) satisfy BCNF, we decompose it into two tables, a student
table and a professor table.

Below is the structure of both tables.

Student Table

student_id  p_id
101         1
101         2
and so on...

Professor Table

p_id  professor  subject
1     P.Java     Java
2     P.Cpp      C++
and so on...

Now both relations satisfy Boyce-Codd Normal Form.
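
The decomposition can be tried out with Python's built-in sqlite3 module to confirm that joining the two tables gives back the original enrolment rows. This is a sketch under assumptions: the notes show p_id values only for P.Java and P.Cpp, so the ids generated for the remaining professors, and the function name `decompose_and_rejoin`, are hypothetical.

```python
import sqlite3

ORIGINAL = [
    (101, "Java", "P.Java"),
    (101, "C++", "P.Cpp"),
    (102, "Java", "P.Java2"),
    (103, "C#", "P.Chash"),
    (104, "Java", "P.Java"),
]


def decompose_and_rejoin(enrolments):
    """Split the enrolment rows into Student and Professor tables, then join them back."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Professor ("
        "p_id INTEGER PRIMARY KEY, professor TEXT UNIQUE, subject TEXT)"
    )
    conn.execute(
        "CREATE TABLE Student ("
        "student_id INTEGER, p_id INTEGER REFERENCES Professor(p_id), "
        "PRIMARY KEY (student_id, p_id))"
    )
    for student_id, subject, professor in enrolments:
        # Each professor is stored once; repeated names are ignored.
        conn.execute(
            "INSERT OR IGNORE INTO Professor (professor, subject) VALUES (?, ?)",
            (professor, subject),
        )
        p_id = conn.execute(
            "SELECT p_id FROM Professor WHERE professor = ?", (professor,)
        ).fetchone()[0]
        conn.execute("INSERT INTO Student VALUES (?, ?)", (student_id, p_id))
    rejoined = conn.execute(
        "SELECT s.student_id, p.subject, p.professor "
        "FROM Student s JOIN Professor p ON p.p_id = s.p_id "
        "ORDER BY s.student_id, p.subject"
    ).fetchall()
    conn.close()
    return rejoined


for row in decompose_and_rejoin(ORIGINAL):
    print(row)
```

The join returns the same five rows as the original table, so no information is lost. The cost is that the subject of an enrolment can now only be read through a join.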

Advantages and Disadvantages of Normalization :

Advantages :
• Eliminate modification anomalies
• Reduce duplicate data
• Eliminate data integrity problems
• Save file space
• Single table queries will run faster

Disadvantages :
• More complicated SQL required for multitable subqueries and joins
• Extra work for DBMS can mean slower applications
