
DBMS Module 3.2 PDF

Uploaded by

Nitya
Copyright: © All Rights Reserved
DATABASE MANAGEMENT SYSTEM

MCA 104
Module-3
3.2 Normal Forms, Decomposition into Normalized Relations

Normal Forms

In the field of database management systems (DBMS), normal forms are
a set of guidelines that define how to organize and structure relational
databases to minimize data redundancy, avoid data anomalies, and
ensure data integrity. Normal forms are used in the process of database
normalization, which is essential for creating well-structured and
efficient databases.

The concept of normalization goes back to Edgar F. Codd, a computer
scientist, whose seminal paper "A Relational Model of Data for Large
Shared Data Banks" (1970) introduced the relational model together with
the first normal form. Codd defined the second and third normal forms in
follow-up work, and his paper laid the foundation for the relational
database model that is widely used today. There are several normal
forms, each building upon the previous ones; the basic ones are denoted
by numbers (e.g., First Normal Form, Second Normal Form, etc.).

1. First Normal Form (1NF):

2. Second Normal Form (2NF):

3. Third Normal Form (3NF):

4. Boyce-Codd Normal Form (BCNF):

There are higher normal forms like Fourth Normal Form (4NF) and Fifth
Normal Form (5NF), which further refine the normalization process and
address additional anomalies, but they are less commonly encountered
in practical database design.

The process of achieving higher normal forms involves decomposition,
which is the breaking down of tables into smaller, more normalized
tables to remove data redundancy and undesirable dependencies. Proper
database normalization helps in designing efficient databases that are
easy to maintain, update, and query, while reducing the risk of data
inconsistencies and anomalies.

First Normal Form (1NF)

First Normal Form (1NF) is the foundational level of normalization in
database management systems (DBMS). It requires that each attribute
in a table contains only atomic values and that there are no repeating
groups or arrays within the table. In other words, 1NF rules out
multivalued and composite attributes: each cell in the table holds
a single value.

Example - Student Course Registration:

Suppose we have a table called "StudentCourses" that stores
information about students and the courses they have registered for.
The table has the following attributes:

• StudentID (Primary Key)
• StudentName
• Courses (a multivalued attribute containing a list of courses the
  student is registered for)

Here is the initial table, which is not in 1NF:

StudentID   StudentName    Courses
101         John Smith     Mathematics, Science, History
102         Jane Doe       English, Mathematics
103         Mike Johnson   Science, Computer Science

In the above table, the "Courses" attribute contains multiple values,
creating a repeating group within the table for each student. This
violates the principles of 1NF, which require an atomic value in each
cell.

To convert the table into 1NF, we need to break down the multivalued
attribute "Courses" into individual rows for each course registered by a
student. The resulting table would look like this:

Table - StudentCourses (in 1NF):

StudentID   StudentName    Courses
101         John Smith     Mathematics
101         John Smith     Science
101         John Smith     History
102         Jane Doe       English
102         Jane Doe       Mathematics
103         Mike Johnson   Science
103         Mike Johnson   Computer Science

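Now, the "StudentCourses" table is in 1NF because each cell contains
a single atomic value. The multivalued attribute "Courses" has been
transformed into separate rows, one for each course registered by a
student. Note that "StudentID" alone no longer identifies a row; the
primary key of the 1NF table is the combination (StudentID, Courses).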
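
The conversion above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rows are held as tuples, the course list is stored as a comma-separated string, and the function name `to_first_normal_form` is hypothetical.

```python
# Unnormalized rows: the "Courses" cell holds a comma-separated list.
unnormalized = [
    (101, "John Smith", "Mathematics, Science, History"),
    (102, "Jane Doe", "English, Mathematics"),
    (103, "Mike Johnson", "Science, Computer Science"),
]


def to_first_normal_form(rows):
    """Emit one row per (student, course) pair so every cell is atomic."""
    result = []
    for student_id, student_name, courses in rows:
        for course in courses.split(","):
            result.append((student_id, student_name, course.strip()))
    return result


student_courses_1nf = to_first_normal_form(unnormalized)
for row in student_courses_1nf:
    print(row)
```

Each output row carries exactly one course, and the pair (StudentID, course) is unique across the result.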

By achieving 1NF, we ensure that the data is properly organized, and we
avoid repeating groups, making it easier to manage and query the
database. The next level of normalization, the Second Normal Form
(2NF), further refines the organization of data by addressing partial
dependencies within the table.

Second Normal Form (2NF)

A table is in Second Normal Form (2NF) if it is in 1NF and there are no
partial dependencies. In other words, each non-key attribute (attribute
that is not part of the primary key) must be fully functionally dependent
on the entire primary key.

Example - Student Course Grades:

Suppose we have a table called "StudentGrades" that stores
information about students, the courses they have taken, and their
grades. The table has the following attributes:

• StudentID (part of the composite Primary Key)
• CourseID (part of the composite Primary Key)
• StudentName
• CourseName
• Grade

Here is the initial table, which is not in 2NF:

StudentID   CourseID   StudentName    CourseName         Grade
101         1          John Smith     Mathematics        A
101         2          John Smith     Science            B
101         3          John Smith     History            A-
102         1          Jane Doe       Mathematics        B+
102         4          Jane Doe       English            A
103         2          Mike Johnson   Science            A-
103         5          Mike Johnson   Computer Science   B

In the above table, we have a composite primary key (StudentID,
CourseID) that uniquely identifies each record. However, the non-key
attributes "StudentName" and "CourseName" are not fully dependent
on the entire primary key. Instead, they are partially dependent on just
the "StudentID" and "CourseID," respectively.

To convert the table into 2NF, we need to separate the non-key
attributes that are partially dependent on the primary key into their
own tables. The resulting tables would look like this:

1. Table - Students:

StudentID   StudentName
101         John Smith
102         Jane Doe
103         Mike Johnson

2. Table - Courses:

CourseID   CourseName
1          Mathematics
2          Science
3          History
4          English
5          Computer Science

3. Table - StudentGrades (in 2NF):

StudentID   CourseID   Grade
101         1          A
101         2          B
101         3          A-
102         1          B+
102         4          A
103         2          A-
103         5          B

Now, the "StudentGrades" table is in 2NF because its only non-key
attribute, "Grade," is fully functionally dependent on the entire
primary key (StudentID, CourseID). The "StudentName" and
"CourseName" attributes have been moved to their respective tables
("Students" and "Courses") where they are fully dependent on their
corresponding primary keys.

By achieving 2NF, we ensure that the database is more organized and
free from partial dependencies, which can lead to data redundancy and
anomalies.
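
As a rough illustration, the 2NF decomposition amounts to projecting the wide table onto three column subsets and dropping duplicate rows. The sketch below assumes rows are tuples in the column order (StudentID, CourseID, StudentName, CourseName, Grade), with the grades taken from the "StudentGrades" table above; the helper name `project` is hypothetical.

```python
# Rows of the wide table:
# (StudentID, CourseID, StudentName, CourseName, Grade)
student_grades_wide = [
    (101, 1, "John Smith", "Mathematics", "A"),
    (101, 2, "John Smith", "Science", "B"),
    (101, 3, "John Smith", "History", "A-"),
    (102, 1, "Jane Doe", "Mathematics", "B+"),
    (102, 4, "Jane Doe", "English", "A"),
    (103, 2, "Mike Johnson", "Science", "A-"),
    (103, 5, "Mike Johnson", "Computer Science", "B"),
]


def project(rows, *positions):
    """Project rows onto the given column positions, dropping duplicates
    while keeping first-seen order (dicts preserve insertion order)."""
    seen = {}
    for row in rows:
        seen[tuple(row[i] for i in positions)] = None
    return list(seen)


students = project(student_grades_wide, 0, 2)          # StudentID, StudentName
courses = project(student_grades_wide, 1, 3)           # CourseID, CourseName
student_grades = project(student_grades_wide, 0, 1, 4)  # StudentID, CourseID, Grade

print(len(students), len(courses), len(student_grades))
```

Each student name and each course name is now stored once, instead of once per enrolment.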

Third Normal Form (3NF)

A table is in Third Normal Form (3NF) if it is in 2NF and there are no
transitive dependencies. In other words, each non-key attribute
(attribute that is not part of the primary key) must be dependent only
on the primary key and not on other non-key attributes.

Example - Employee Department Information:

Suppose we have a table called "EmployeeDepartments" that stores
information about employees, their departments, and the department
managers. The table has the following attributes:

• EmployeeID (Primary Key)
• EmployeeName
• DepartmentID
• DepartmentName
• ManagerID (Foreign Key referencing EmployeeID)
• ManagerName

Here is the initial table, which is not in 3NF:

EmployeeID  EmployeeName  DepartmentID  DepartmentName  ManagerID  ManagerName
101         John Smith    D001          HR              201        Mary Brown
102         Jane Doe      D002          Marketing       202        Mike Johnson
201         Mary Brown    D001          HR              NULL       NULL
202         Mike Johnson  D002          Marketing       NULL       NULL

In the above table, each employee appears in exactly one row, so
"EmployeeID" alone is the primary key. However, the non-key attributes
"DepartmentName," "ManagerID," and "ManagerName" are not directly
dependent on the primary key. Instead, they are transitively dependent
on it through the "DepartmentID" attribute: EmployeeID -> DepartmentID
and DepartmentID -> DepartmentName, ManagerID. Likewise, the
"ManagerName" is determined by the "ManagerID," which, in turn, is
dependent on the "DepartmentID." (The NULL entries mark the rows of the
managers themselves; for DepartmentID -> ManagerID to hold strictly,
those rows would also carry their department's manager.)

To convert the table into 3NF, we need to remove the transitive
dependencies by breaking down the table into separate tables. The
resulting tables would look like this:

1. Table - Employees:

EmployeeID   EmployeeName   DepartmentID
101          John Smith     D001
102          Jane Doe       D002
201          Mary Brown     D001
202          Mike Johnson   D002

2. Table - Departments:

DepartmentID   DepartmentName   ManagerID
D001           HR               201
D002           Marketing        202

3. Table - Managers:

ManagerID   ManagerName
201         Mary Brown
202         Mike Johnson

Now, each of the resulting tables is in 3NF because every non-key
attribute is directly dependent on the primary key of its own table:
"EmployeeName" on "EmployeeID," "DepartmentName" and "ManagerID" on
"DepartmentID," and "ManagerName" on "ManagerID." The transitive
dependencies have been removed by splitting the original
"EmployeeDepartments" table into three separate tables: "Employees,"
"Departments," and "Managers."

By achieving 3NF, we ensure that the database is free from transitive
dependencies, which can lead to data redundancy and anomalies. The
next level of normalization, Boyce-Codd Normal Form (BCNF), further
refines the organization of data by requiring that the determinant of
every non-trivial functional dependency is a super key.
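
To make the "redundancy and anomalies" point concrete, the following sketch compares renaming a department in the wide table with renaming it in a separate department table. It is only an illustration: the rows are reduced to the columns that matter here, and the function names are hypothetical.

```python
# Wide table: DepartmentName is repeated for every employee of a department.
employee_departments = [
    {"EmployeeID": 101, "DepartmentID": "D001", "DepartmentName": "HR"},
    {"EmployeeID": 102, "DepartmentID": "D002", "DepartmentName": "Marketing"},
    {"EmployeeID": 201, "DepartmentID": "D001", "DepartmentName": "HR"},
    {"EmployeeID": 202, "DepartmentID": "D002", "DepartmentName": "Marketing"},
]

# Normalized: the name is stored once per department.
departments = {"D001": "HR", "D002": "Marketing"}


def rename_in_wide_table(rows, department_id, new_name):
    """Rename a department in the wide table; returns how many rows changed."""
    changed = 0
    for row in rows:
        if row["DepartmentID"] == department_id:
            row["DepartmentName"] = new_name
            changed += 1
    return changed


def rename_in_departments(table, department_id, new_name):
    """Rename a department in the separate table; always a single write."""
    table[department_id] = new_name
    return 1


wide_writes = rename_in_wide_table(employee_departments, "D001", "Human Resources")
normalized_writes = rename_in_departments(departments, "D001", "Human Resources")
print(wide_writes, normalized_writes)  # prints "2 1"
```

In the wide table every matching row must be updated, and missing one leaves two different names for the same department (an update anomaly). After the decomposition there is only one place to change.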

Boyce-Codd Normal Form (BCNF)

A table is in Boyce-Codd Normal Form (BCNF) if, for every non-trivial
functional dependency X -> Y, the determinant X is a super key. In other
words, no attribute may be determined by anything less than a key. This
is stricter than 3NF, which tolerates a determinant that is not a super
key when the dependent attribute is itself part of a candidate key.
(Multivalued dependencies are a separate matter, addressed by Fourth
Normal Form.)

To better understand BCNF, let's consider an example:

Example - Employee Project Allocation:

Suppose we have a table called "EmployeeProjects" that stores
information about employees, the projects they are assigned to, and
their roles in each project. The table has the following attributes:

• EmployeeID (part of the composite Primary Key)
• ProjectID (part of the composite Primary Key)
• EmployeeName
• ProjectName
• Role

Here is the initial table, which is not in BCNF:

EmployeeID   ProjectID   EmployeeName   ProjectName   Role
101          P001        John Smith     ProjectA      Developer
101          P002        John Smith     ProjectB      Tester
102          P001        Jane Doe       ProjectA      Tester
103          P003        Mike Johnson   ProjectC      Manager

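In the above table, we have a composite primary key (EmployeeID,
ProjectID) that uniquely identifies each record. However, the functional
dependencies EmployeeID -> EmployeeName and ProjectID -> ProjectName
have determinants that are only part of the key and therefore not super
keys, which violates BCNF. (These are also partial dependencies, so the
table fails 2NF as well; every BCNF table is necessarily in 2NF and
3NF.) Only the "Role" attribute depends on the entire composite primary
key, since an employee can hold different roles in different projects.

To convert the table into BCNF, we need to remove the offending
dependencies by breaking down the table into separate tables. The
resulting tables would look like this:
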
1. Table - Employees:

EmployeeID   EmployeeName
101          John Smith
102          Jane Doe
103          Mike Johnson

2. Table - Projects:

ProjectID   ProjectName
P001        ProjectA
P002        ProjectB
P003        ProjectC

3. Table - EmployeeRoles:

EmployeeID   ProjectID   Role
101          P001        Developer
101          P002        Tester
102          P001        Tester
103          P003        Manager


Now, all three tables are in BCNF because each non-trivial functional
dependency has a super key of its own table as the determinant:
EmployeeID -> EmployeeName in "Employees," ProjectID -> ProjectName in
"Projects," and (EmployeeID, ProjectID) -> Role in "EmployeeRoles."

By achieving BCNF, we ensure that every determinant in the database is
a super key, which removes the redundancy and anomalies that functional
dependencies can cause. BCNF is a stronger level of normalization
compared to Third Normal Form (3NF) and is useful in creating efficient
and well-structured databases. However, it is essential to note that
achieving BCNF may result in a higher number of tables, which means more
joins at query time and requires careful consideration during database
design.
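
The super key test behind BCNF can be sketched with the standard attribute-closure algorithm. The code below is a minimal illustration, not a full normalization tool: the function names are hypothetical, the dependencies are the ones from the "EmployeeProjects" example, and only the listed dependencies are checked against each table, which is enough for this example but not for arbitrary decompositions.

```python
def closure(attributes, fds):
    """Return every attribute functionally determined by `attributes`."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(relation, fds):
    """List the given FDs that apply to `relation`, are non-trivial there,
    and whose determinant is not a super key of `relation`."""
    violations = []
    for lhs, rhs in fds:
        if not lhs <= relation:
            continue  # determinant is not contained in this table
        dependents = (rhs & relation) - lhs
        if dependents and not relation <= closure(lhs, fds):
            violations.append((lhs, dependents))
    return violations


fds = [
    (frozenset({"EmployeeID"}), frozenset({"EmployeeName"})),
    (frozenset({"ProjectID"}), frozenset({"ProjectName"})),
    (frozenset({"EmployeeID", "ProjectID"}), frozenset({"Role"})),
]

employee_projects = {"EmployeeID", "ProjectID", "EmployeeName", "ProjectName", "Role"}
employees = {"EmployeeID", "EmployeeName"}
projects = {"ProjectID", "ProjectName"}
employee_roles = {"EmployeeID", "ProjectID", "Role"}

for lhs, dependents in bcnf_violations(employee_projects, fds):
    print(sorted(lhs), "->", sorted(dependents), "violates BCNF")
```

For the original table the check reports the two dependencies whose determinant is only part of the key; for each of the three decomposed tables it reports nothing.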

6.1 Decomposition into Normalized Relations

Decomposition into normalized relations, also known as database
normalization, is the process of breaking down a single table with
complex attributes into multiple smaller tables with simpler structures.
The goal of decomposition is to eliminate data redundancy, prevent
data anomalies, and ensure data integrity while adhering to the
principles of normalization.

The normalization process involves several steps, each aiming to
achieve a higher normal form. The most commonly used normal forms
are First Normal Form (1NF), Second Normal Form (2NF), Third Normal
Form (3NF), Boyce-Codd Normal Form (BCNF), Fourth Normal Form
(4NF), and Fifth Normal Form (5NF).

Step-by-Step Process of Decomposition into Normalized Relations:

1. Identify the Functional Dependencies: Examine the table's
   attributes and identify the functional dependencies between
   them. A functional dependency is a relationship where the
   value of one attribute determines the value of another
   attribute.

2. Ensure First Normal Form (1NF): Ensure that each table cell
   contains atomic (indivisible) values. Eliminate repeating groups,
   nested relations, and composite attributes. This step ensures
   that each cell holds a single value, and there are no arrays or
   lists within a single cell.

3. Ensure Second Normal Form (2NF): Address partial
   dependencies. Ensure that each non-key attribute (attribute
   that is not part of the primary key) is fully functionally
   dependent on the entire primary key. If any non-key attribute
   depends on only part of the primary key, decompose the table
   to resolve this issue.

4. Ensure Third Normal Form (3NF): Address transitive
   dependencies. Ensure that each non-key attribute depends
   only on the primary key and not on other non-key attributes. If
   any non-key attribute depends on another non-key attribute,
   decompose the table to resolve this issue.

5. Ensure Boyce-Codd Normal Form (BCNF): Address functional
   dependencies whose determinant is not a superkey. Ensure that
   each non-trivial functional dependency has a superkey as the
   determinant. If any determinant is not a superkey, decompose
   the table to eliminate that dependency.

6. Ensure Fourth Normal Form (4NF) and Fifth Normal Form (5NF)
   (if applicable): Address multivalued and join dependencies that
   arise in certain complex scenarios. Decompose the table
   further to achieve higher normal forms if necessary.

7. Decompose the Table: Based on the identified functional
   dependencies and the normalization principles, decompose the
   original table into multiple smaller tables. Each table should
   have a clear and specific purpose, and the data should be
   logically distributed among the new tables.

8. Define Relationships: Establish appropriate relationships
   (primary key and foreign key relationships) between the newly
   created tables to maintain the integrity and connectivity of the
   data.

9. Review and Optimize: Review the resulting tables and
   relationships to ensure that the database design is efficient,
   without any data anomalies or redundancy. Optimize the
   database structure as needed for better performance.
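
Step 8 (defining relationships) can be illustrated with Python's built-in `sqlite3` module. The sketch below is one possible way to declare the keys for the "Students," "Courses," and "StudentGrades" tables from the 2NF example; the column types and the helper `try_insert_grade` are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Students (
    StudentID   INTEGER PRIMARY KEY,
    StudentName TEXT NOT NULL
);
CREATE TABLE Courses (
    CourseID   INTEGER PRIMARY KEY,
    CourseName TEXT NOT NULL
);
CREATE TABLE StudentGrades (
    StudentID INTEGER NOT NULL REFERENCES Students(StudentID),
    CourseID  INTEGER NOT NULL REFERENCES Courses(CourseID),
    Grade     TEXT,
    PRIMARY KEY (StudentID, CourseID)
);
""")

conn.execute("INSERT INTO Students VALUES (101, 'John Smith')")
conn.execute("INSERT INTO Courses VALUES (1, 'Mathematics')")
conn.execute("INSERT INTO StudentGrades VALUES (101, 1, 'A')")


def try_insert_grade(student_id, course_id, grade):
    """Return True if the row is accepted, False if a key constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO StudentGrades VALUES (?, ?, ?)",
            (student_id, course_id, grade),
        )
        return True
    except sqlite3.IntegrityError:
        return False


orphan_accepted = try_insert_grade(999, 1, "B")     # no student 999 exists
duplicate_accepted = try_insert_grade(101, 1, "C")  # key (101, 1) already used
print(orphan_accepted, duplicate_accepted)
```

The foreign keys stop a grade from referring to a student or course that does not exist, and the composite primary key stops a second grade for the same student and course, so the decomposed tables stay consistent with one another.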

Example - Library Book Catalog:

Consider a table called "Books" in a library database that stores
information about books available in the library. The table has the
following attributes:

• BookID (Primary Key)
• Title
• Author
• Genre
• ISBN
• ShelfNumber

Here is the initial "Books" table:

BookID  Title                    Author               Genre        ISBN           ShelfNumber
1       "To Kill a Mockingbird"  Harper Lee           Fiction      9780061120084  A-101
2       "1984"                   George Orwell        Fiction      9780451524935  B-205
3       "The Great Gatsby"       F. Scott Fitzgerald  Fiction      9780743273565  A-101
4       "Introduction to DBMS"   John Doe             Non-Fiction  9780132145198  C-303
5       "Data Science Handbook"  Jake VanderPlas      Non-Fiction  9781491912058  D-408

Step-by-Step Decomposition:

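1. Identify Functional Dependencies:

Based on the "Books" table, we can observe the following functional
dependencies:

• BookID -> Title, Author, Genre, ISBN, ShelfNumber
• ISBN -> Title, Author, Genre

Note that "ShelfNumber" does not determine a book: shelf A-101 holds
both "To Kill a Mockingbird" and "The Great Gatsby," so
ShelfNumber -> Title, Author, Genre does not hold.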
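
Candidate dependencies can be sanity-checked against the sample rows with a small helper. This is a sketch with a hypothetical function name, `fd_holds`; keep in mind that sample data can only refute a dependency, it can never prove that the dependency holds for all future data.

```python
books = [
    {"BookID": 1, "Title": "To Kill a Mockingbird", "Author": "Harper Lee",
     "Genre": "Fiction", "ISBN": "9780061120084", "ShelfNumber": "A-101"},
    {"BookID": 2, "Title": "1984", "Author": "George Orwell",
     "Genre": "Fiction", "ISBN": "9780451524935", "ShelfNumber": "B-205"},
    {"BookID": 3, "Title": "The Great Gatsby", "Author": "F. Scott Fitzgerald",
     "Genre": "Fiction", "ISBN": "9780743273565", "ShelfNumber": "A-101"},
    {"BookID": 4, "Title": "Introduction to DBMS", "Author": "John Doe",
     "Genre": "Non-Fiction", "ISBN": "9780132145198", "ShelfNumber": "C-303"},
    {"BookID": 5, "Title": "Data Science Handbook", "Author": "Jake VanderPlas",
     "Genre": "Non-Fiction", "ISBN": "9781491912058", "ShelfNumber": "D-408"},
]


def fd_holds(rows, lhs, rhs):
    """True if any two rows that agree on `lhs` also agree on `rhs`."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


descriptive = ["Title", "Author", "Genre"]
print(fd_holds(books, ["BookID"], descriptive + ["ISBN", "ShelfNumber"]))
print(fd_holds(books, ["ISBN"], descriptive))
print(fd_holds(books, ["ShelfNumber"], descriptive))
```

On these five rows the first two checks succeed, while the third fails because shelf A-101 holds two different books.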

2. Ensure First Normal Form (1NF):

The "Books" table is already in 1NF since each cell contains atomic
values, and there are no repeating groups or composite attributes.

3. Ensure Second Normal Form (2NF):

Since the "Books" table has a single-attribute primary key (BookID),
no attribute can depend on only part of the key. There are no partial
dependencies, so the table is already in 2NF.

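4. Ensure Third Normal Form (3NF):

The "Books" table contains the chain BookID -> ISBN -> Title, Author,
Genre: the descriptive attributes are determined by "ISBN" as well as
by "BookID," while "ShelfNumber" describes where the physical copy is
kept. In this sample every ISBN occurs only once, so "ISBN" is itself a
candidate key and the chain is not a strict 3NF violation; it would
become one as soon as the library held several copies (BookIDs) of the
same ISBN. To separate the descriptive data from the copy and location
data, we decompose the table as follows: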

• Table - Titles:

BookID  Title                    Author               Genre
1       "To Kill a Mockingbird"  Harper Lee           Fiction
2       "1984"                   George Orwell        Fiction
3       "The Great Gatsby"       F. Scott Fitzgerald  Fiction
4       "Introduction to DBMS"   John Doe             Non-Fiction
5       "Data Science Handbook"  Jake VanderPlas      Non-Fiction

• Table - BookDetails:

ISBN           BookID  ShelfNumber
9780061120084  1       A-101
9780451524935  2       B-205
9780743273565  3       A-101
9780132145198  4       C-303
9781491912058  5       D-408

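5. Ensure Boyce-Codd Normal Form (BCNF):

In the "BookDetails" table, the non-trivial functional dependencies are
"ISBN -> BookID, ShelfNumber" and "BookID -> ISBN, ShelfNumber." Both
"ISBN" and "BookID" are superkeys of the table, so it is in BCNF. The
"Titles" table is in BCNF as well, since its only determinant,
"BookID," is its key.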

6. Optimization (Optional):

At this point, the decomposition is complete, and the tables are in
BCNF. You may review the database design for any potential
optimizations based on query patterns and performance
considerations.

By decomposing the "Books" table into "Titles" and "BookDetails"
tables, we have achieved a normalized database structure. Each table
now serves a specific purpose, and the data is logically distributed,
reducing redundancy and ensuring data integrity. This decomposition
makes the database easier to maintain, query, and update, while
minimizing the risk of data anomalies.
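
A decomposition is only useful if the original table can be rebuilt from its parts without losing or inventing rows (a lossless join). The sketch below checks this for "Titles" and "BookDetails" by joining them on BookID; the tuple layouts and the function name `join_on_book_id` are assumptions made for the illustration.

```python
# Titles: (BookID, Title, Author, Genre)
titles = [
    (1, "To Kill a Mockingbird", "Harper Lee", "Fiction"),
    (2, "1984", "George Orwell", "Fiction"),
    (3, "The Great Gatsby", "F. Scott Fitzgerald", "Fiction"),
    (4, "Introduction to DBMS", "John Doe", "Non-Fiction"),
    (5, "Data Science Handbook", "Jake VanderPlas", "Non-Fiction"),
]

# BookDetails: (ISBN, BookID, ShelfNumber)
book_details = [
    ("9780061120084", 1, "A-101"),
    ("9780451524935", 2, "B-205"),
    ("9780743273565", 3, "A-101"),
    ("9780132145198", 4, "C-303"),
    ("9781491912058", 5, "D-408"),
]

# Original Books: (BookID, Title, Author, Genre, ISBN, ShelfNumber)
original_books = [
    (1, "To Kill a Mockingbird", "Harper Lee", "Fiction", "9780061120084", "A-101"),
    (2, "1984", "George Orwell", "Fiction", "9780451524935", "B-205"),
    (3, "The Great Gatsby", "F. Scott Fitzgerald", "Fiction", "9780743273565", "A-101"),
    (4, "Introduction to DBMS", "John Doe", "Non-Fiction", "9780132145198", "C-303"),
    (5, "Data Science Handbook", "Jake VanderPlas", "Non-Fiction", "9781491912058", "D-408"),
]


def join_on_book_id(title_rows, detail_rows):
    """Join the two tables on BookID, in the original column order."""
    by_id = {book_id: rest for book_id, *rest in title_rows}
    joined = []
    for isbn, book_id, shelf in detail_rows:
        if book_id in by_id:
            joined.append((book_id, *by_id[book_id], isbn, shelf))
    return joined


reconstructed = join_on_book_id(titles, book_details)
print(sorted(reconstructed) == sorted(original_books))
```

Because BookID is a key of both tables, the join returns exactly the five original rows.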
