Module – 3
Normalization
Module – 3: Database Management System
errors.
• Simplified database design: Normalization provides guidelines for organizing
tables and data relationships, making it easier to design and maintain a database.
• Improved query performance: Normalized tables are smaller and hold less redundant
data, so searches and updates on a single table are typically faster, although
queries that span several tables require joins.
• Easier database maintenance: Normalization reduces the complexity of a
database by breaking it down into smaller, more manageable tables, making it
easier to add, modify, and delete data.
Functional dependencies
• In relational database management, a functional dependency is a constraint
between two sets of attributes in which the value of one set determines the
value of the other.
• It is denoted as X → Y, where the attribute set X on the left side of the arrow is
called the determinant, and Y is called the dependent.
• X → Y holds in a relation when any two rows that agree on X also agree on Y.
• If attribute A functionally determines attribute B, we write this as A → B.
• Functional dependencies are used to express relationships among the attributes
of database entities mathematically.
Example:
roll_no   name   dept_name   dept_building
42        abc    CO          A4
43        pqr    IT          A3
44        xyz    CO          A4
45        xyz    IT          A3
46        mno    EC          B2
47        jkl    ME          B2
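The definition of X → Y can be tested against the rows of this table. The sketch below is only an illustration: the helper name fd_holds is an assumption, and a check on sample rows can disprove a dependency but cannot prove that it holds for every future row.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # setdefault stores val the first time key is seen and returns
        # the stored value afterwards, so a mismatch means lhs -/-> rhs
        if seen.setdefault(key, val) != val:
            return False
    return True


# Rows copied from the example table above.
students = [
    {"roll_no": 42, "name": "abc", "dept_name": "CO", "dept_building": "A4"},
    {"roll_no": 43, "name": "pqr", "dept_name": "IT", "dept_building": "A3"},
    {"roll_no": 44, "name": "xyz", "dept_name": "CO", "dept_building": "A4"},
    {"roll_no": 45, "name": "xyz", "dept_name": "IT", "dept_building": "A3"},
    {"roll_no": 46, "name": "mno", "dept_name": "EC", "dept_building": "B2"},
    {"roll_no": 47, "name": "jkl", "dept_name": "ME", "dept_building": "B2"},
]

print(fd_holds(students, ["roll_no"], ["name", "dept_name", "dept_building"]))  # True
print(fd_holds(students, ["dept_name"], ["dept_building"]))  # True
print(fd_holds(students, ["name"], ["dept_name"]))           # False (xyz is in CO and IT)
print(fd_holds(students, ["dept_building"], ["dept_name"]))  # False (B2 houses EC and ME)
```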
From the above table we can conclude some valid functional dependencies:
• roll_no → { name, dept_name, dept_building }→ Here, roll_no can determine values
42 abc 17
43 pqr 18
44 xyz 18
Here, {roll_no, name} → name is a trivial functional dependency, since the dependent
name is a subset of determinant set {roll_no, name}. Similarly, roll_no → roll_no is also an
example of trivial functional dependency.
Example:
roll_no   name   age
42        abc    17
43        pqr    18
44        xyz    18
Here, roll_no → name is a non-trivial functional dependency, since the dependent name
is not a subset of the determinant roll_no. Similarly, {roll_no, name} → age is also a
non-trivial functional dependency, since age is not a subset of {roll_no, name}.
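The difference between a trivial and a non-trivial dependency reduces to a subset test on the attribute names. A minimal sketch, in which the function name is_trivial is an assumption chosen for illustration:

```python
def is_trivial(lhs, rhs):
    """X -> Y is trivial exactly when Y is a subset of X."""
    return set(rhs) <= set(lhs)


print(is_trivial({"roll_no", "name"}, {"name"}))  # True
print(is_trivial({"roll_no"}, {"roll_no"}))       # True
print(is_trivial({"roll_no"}, {"name"}))          # False
print(is_trivial({"roll_no", "name"}, {"age"}))   # False
```

Note that the test looks only at attribute names, never at the data: a trivial dependency holds in every relation.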
A table is in 1NF if:
In the above table, Courses is a multi-valued attribute, so the table is not in 1NF. The
table below is in 1NF, as it has no multi-valued attribute.
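The step from a multi-valued Courses column to a 1NF table can be sketched as follows. The rows, the ID and Name columns, and the helper name to_1nf are hypothetical and chosen only for illustration. Each value of the multi-valued attribute becomes its own row.

```python
def to_1nf(rows, attr):
    """Emit one row per value of the multi-valued attribute attr."""
    flat = []
    for row in rows:
        for value in row[attr]:
            new_row = dict(row)    # copy the single-valued columns
            new_row[attr] = value  # replace the list with one atomic value
            flat.append(new_row)
    return flat


# Hypothetical rows: Courses holds a list, which violates 1NF.
students = [
    {"ID": 1, "Name": "A", "Courses": ["c1", "c2"]},
    {"ID": 2, "Name": "B", "Courses": ["c3"]},
]

for row in to_1nf(students, "Courses"):
    print(row)
# {'ID': 1, 'Name': 'A', 'Courses': 'c1'}
# {'ID': 1, 'Name': 'A', 'Courses': 'c2'}
# {'ID': 2, 'Name': 'B', 'Courses': 'c3'}
```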
• Second Normal Form (2NF) is based on the concept of full functional dependency.
• For a table to be in 2NF, it must first meet the requirements of First Normal Form
(1NF). Additionally, the table should not have partial dependencies. In other
words, no non-prime attribute may depend on only a part of a candidate key.
• 2NF eliminates redundant data by requiring that each non-key attribute be
fully dependent on the primary key or candidate key.
• This means that each non-key column should depend on the whole key, and not
on just a part of it.
Example-1: Consider the table below.
• There are many courses having the same course fee. Here, COURSE_FEE cannot
alone decide the value of COURSE_NO or STUD_NO.
• COURSE_FEE together with STUD_NO cannot decide the value of COURSE_NO.
• COURSE_FEE together with COURSE_NO cannot decide the value of STUD_NO.
• The candidate key for this table is {STUD_NO, COURSE_NO} because the
combination of these two columns uniquely identifies each row in the table.
• COURSE_FEE is a non-prime attribute because it is not part of the candidate
key {STUD_NO, COURSE_NO}.
• However, COURSE_NO → COURSE_FEE, i.e., COURSE_FEE is dependent on COURSE_NO,
which is a proper subset of the candidate key.
• Therefore, the non-prime attribute COURSE_FEE is dependent on a proper subset of
the candidate key. This is a partial dependency, so the relation is not in
2NF.
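The reasoning in the bullets above can be written as a small check. This is a sketch under stated assumptions: the function name partial_dependencies is hypothetical, the relation is assumed to have a single candidate key, and only the dependencies that are listed explicitly are examined (implied ones are not derived).

```python
def partial_dependencies(fds, candidate_key):
    """List the given FDs whose determinant is a proper subset of the key
    and whose dependent side contains a non-prime attribute."""
    key = frozenset(candidate_key)
    found = []
    for lhs, rhs in fds:
        non_prime = set(rhs) - key       # attributes outside the key
        if frozenset(lhs) < key and non_prime:  # "<" is proper subset
            found.append((set(lhs), non_prime))
    return found


fds = [({"COURSE_NO"}, {"COURSE_FEE"})]
key = {"STUD_NO", "COURSE_NO"}

print(partial_dependencies(fds, key))
# [({'COURSE_NO'}, {'COURSE_FEE'})]  -> the relation is not in 2NF
```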
To convert the above relation to 2NF, we split the table into two tables:
• Table 1: STUD_NO, COURSE_NO
• Table 2: COURSE_NO, COURSE_FEE
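The split can be sketched as two projections of the original rows. The sample rows below are hypothetical, and project is an assumed helper name. Joining the two projections on COURSE_NO gives back the original rows, so no information is lost, while each course fee is now stored only once.

```python
def project(rows, attrs):
    """Project rows (dicts) onto attrs, dropping duplicate tuples."""
    return {tuple(row[a] for a in attrs) for row in rows}


# Hypothetical rows of the original (STUD_NO, COURSE_NO, COURSE_FEE) table.
enrolment = [
    {"STUD_NO": 1, "COURSE_NO": "C1", "COURSE_FEE": 1000},
    {"STUD_NO": 2, "COURSE_NO": "C2", "COURSE_FEE": 1500},
    {"STUD_NO": 1, "COURSE_NO": "C4", "COURSE_FEE": 2000},
    {"STUD_NO": 4, "COURSE_NO": "C3", "COURSE_FEE": 1000},
    {"STUD_NO": 4, "COURSE_NO": "C1", "COURSE_FEE": 1000},
    {"STUD_NO": 2, "COURSE_NO": "C5", "COURSE_FEE": 2000},
]

table1 = project(enrolment, ["STUD_NO", "COURSE_NO"])
table2 = project(enrolment, ["COURSE_NO", "COURSE_FEE"])

# Natural join of the two tables on COURSE_NO.
rejoined = {
    (stud, course, fee)
    for stud, course in table1
    for course2, fee in table2
    if course == course2
}
original = project(enrolment, ["STUD_NO", "COURSE_NO", "COURSE_FEE"])

print(len(table1), len(table2))  # 6 5
print(rejoined == original)      # True
```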
o A relation is in 3NF if it is in 2NF and does not contain any transitive
dependency of a non-prime attribute on a candidate key.
o 3NF is used to reduce data duplication. It is also used to achieve data
integrity.
o If a relation is in 2NF and has no transitive dependency for non-prime
attributes, then it is in third normal form.
A relation is in third normal form if it holds at least one of the following conditions for
every non-trivial functional dependency X → Y.
1. X is a super key.
2. Y is a prime attribute, i.e., each element of Y is part of some candidate key.
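The two conditions can be tested with an attribute-closure computation. The sketch below is illustrative only: closure and violates_3nf are assumed names, and the attribute list for EMPLOYEE_DETAIL (in particular EMP_ID and EMP_NAME) together with its dependencies is an assumption, since only EMP_ZIP, EMP_CITY and EMP_STATE are named in these notes.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def violates_3nf(all_attrs, fds, candidate_keys):
    """Return the given FDs X -> Y where X is not a super key
    and Y is not made up of prime attributes only."""
    prime = set().union(*candidate_keys)
    violations = []
    for lhs, rhs in fds:
        rhs_new = set(rhs) - set(lhs)  # drop the trivial part of Y
        if not rhs_new:
            continue
        is_super_key = closure(lhs, fds) >= set(all_attrs)
        if not is_super_key and not rhs_new <= prime:
            violations.append((set(lhs), rhs_new))
    return violations


# Assumed schema and dependencies for EMPLOYEE_DETAIL.
attrs = {"EMP_ID", "EMP_NAME", "EMP_ZIP", "EMP_STATE", "EMP_CITY"}
fds = [
    ({"EMP_ID"}, {"EMP_NAME", "EMP_ZIP"}),
    ({"EMP_ZIP"}, {"EMP_STATE", "EMP_CITY"}),
]

print(violates_3nf(attrs, fds, [{"EMP_ID"}]))
```

Under these assumptions the only dependency reported is EMP_ZIP → {EMP_STATE, EMP_CITY}: EMP_ZIP is not a super key, and neither dependent attribute is prime.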
Example:
EMPLOYEE_DETAIL table
• That is why we need to move EMP_CITY and EMP_STATE to a new relation,
with EMP_ZIP as its primary key.
EMP_DEPT_MAPPING table:
In other words, consider a relation R(X, Y, Z), where X, Y, and Z are assumed to be
pairwise disjoint sets of attributes:
X = {x1, x2, ..., xn}
Y = {y1, y2, ..., yn}
Z = {z1, z2, ..., zn}
The given STUDENT table is in 3NF, but COURSE and HOBBY are two independent
entities: there is no relationship between a student's courses and a student's hobbies.
In the STUDENT relation, the student with STU_ID 21 has two courses, Computer and
Math, and two hobbies, Dancing and Singing. STU_ID therefore multi-determines both
COURSE and HOBBY, which leads to unnecessary repetition of data. To bring the above
table into 4NF, we can decompose it into two tables:
STUDENT_COURSE:
STU_ID COURSE
21 Computer
21 Math
34 Chemistry
74 Biology
59 Physics
STUDENT_HOBBY:
STU_ID HOBBY
21 Dancing
21 Singing
34 Dancing
74 Cricket
59 Hockey
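Joining the two tables on STU_ID shows the repetition that the decomposition removes. The sketch below uses the rows listed above; the variable names are chosen for illustration. Every course of a student is paired with every hobby of that student, so student 21 alone accounts for four rows.

```python
student_course = {
    (21, "Computer"), (21, "Math"), (34, "Chemistry"),
    (74, "Biology"), (59, "Physics"),
}
student_hobby = {
    (21, "Dancing"), (21, "Singing"), (34, "Dancing"),
    (74, "Cricket"), (59, "Hockey"),
}

# Natural join on STU_ID: each course is combined with each hobby.
student = {
    (stu_id, course, hobby)
    for stu_id, course in student_course
    for stu_id2, hobby in student_hobby
    if stu_id == stu_id2
}

rows_for_21 = sorted(row for row in student if row[0] == 21)
print(len(student))      # 7
print(len(rows_for_21))  # 4
```

The two 4NF tables hold 10 short rows in total, while each new course or hobby for student 21 would add several rows to the joined table.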
Join Dependency
o Join dependency is a further generalization of multivalued dependency.
o If the join of R1 and R2 over C is equal to relation R, then we can say that a join
dependency (JD) exists.
o Here R1 and R2 are the decompositions R1(A, B, C) and R2(C, D) of a given
relation R(A, B, C, D).
o Equivalently, R1 and R2 form a lossless decomposition of R.
o A JD ⋈ {R1, R2, ..., Rn} is said to hold over a relation R if R1, R2, ..., Rn is a
lossless-join decomposition of R.
Types of Join Dependency
There are two types of Join Dependencies:
• Lossless Join Dependency: When the decomposed tables are joined, no information
is lost and no extra rows appear; the result contains exactly the content of the
original table.
Example:
Student
Roll No.   S_name   S_dept
1          Raju     CSE
2          Raju     Quantum Computing
StudentDetails
Roll No.   S_name
1          Raju
2          Raju

Dept
Roll No.   S_dept
1          CSE
2          Quantum Computing
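That this decomposition is lossless can be confirmed by projecting Student onto the two smaller tables and joining them again on Roll No. A minimal sketch using the rows above, with variable names chosen for illustration:

```python
student = {
    (1, "Raju", "CSE"),
    (2, "Raju", "Quantum Computing"),
}

# The two projections shown above.
student_details = {(roll, name) for roll, name, _ in student}
dept = {(roll, s_dept) for roll, _, s_dept in student}

# Natural join on Roll No.
rejoined = {
    (roll, name, s_dept)
    for roll, name in student_details
    for roll2, s_dept in dept
    if roll == roll2
}

print(rejoined == student)  # True: nothing lost, nothing added
```

The join works out because Roll No. is unique in both tables, so each row of StudentDetails matches exactly one row of Dept.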
C1 AC Aman
C2 Refrigerator Mohan
C2 TV Mohit
Table: R1
Company   Product
C1        TV
C1        AC
C2        TV

Table: R2
Product   Agent
TV        Aman
AC        Aman
TV        Mohit
In the above result, the join produces two additional tuples, (C1, TV, Mohit) and
(C2, TV, Aman), that were not in the original relation. Such tuples are known as
spurious tuples, and a join dependency does not permit them. This type of dependency
is therefore called a lossy join dependency.
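The spurious tuples can be found by joining the two projections and subtracting the original relation. The sketch below is an illustration under one assumption: the first row of R, (C1, TV, Aman), is inferred from the R1 and R2 rows and is not among the rows of R listed above.

```python
# Original relation R(Company, Product, Agent).
# The (C1, TV, Aman) row is an assumption inferred from R1 and R2.
r = {
    ("C1", "TV", "Aman"),
    ("C1", "AC", "Aman"),
    ("C2", "Refrigerator", "Mohan"),
    ("C2", "TV", "Mohit"),
}

r1 = {(company, product) for company, product, _ in r}
r2 = {(product, agent) for _, product, agent in r}

# Natural join of R1 and R2 on Product.
joined = {
    (company, product, agent)
    for company, product in r1
    for product2, agent in r2
    if product == product2
}

spurious = joined - r
print(len(joined))       # 6
print(sorted(spurious))  # [('C1', 'TV', 'Mohit'), ('C2', 'TV', 'Aman')]
```

Product does not determine Company or Agent, so the shared column cannot tell which TV row belongs to which agent, and the join pairs them in every combination.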