Normalization
Advantages of Normalization
o Normalization helps to minimize data redundancy.
o Greater overall database organization.
o Data consistency within the database.
o Much more flexible database design.
o Enforces the concept of relational integrity.
Disadvantages of Normalization
o You cannot start building the database before knowing what the user needs.
o Query performance can degrade when relations are normalized to higher normal forms, e.g., 4NF
and 5NF, because the data is spread over more tables that must be joined.
o It is very time-consuming and difficult to normalize relations of a higher degree.
o Careless decomposition may lead to a bad database design, leading to serious problems.
EMPLOYEE table:
EMP_ID EMP_NAME EMP_PHONE EMP_STATE
14 John 7272826385, 9064738238 UP
12 Sam 7390372389, 8589830302 Punjab
The above EMPLOYEE table is an unnormalized relation because it holds multiple values in the
EMP_PHONE attribute, i.e., these values are non-atomic. Relations with multi-valued entries are
called unnormalized relations.
To overcome this problem, we have to eliminate the non-atomic values of the EMP_PHONE attribute.
The decomposition of the EMPLOYEE table into 1NF is shown below:
EMP_ID EMP_NAME EMP_PHONE EMP_STATE
14 John 7272826385 UP
14 John 9064738238 UP
12 Sam 7390372389 Punjab
12 Sam 8589830302 Punjab
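The 1NF step can be sketched in a few lines of Python. This is only an illustration: the tuple layout (ID, name, phones, state), the comma-separated phone string, and the helper name to_1nf are assumptions made for the example, not part of the original notes.

```python
# Unnormalized EMPLOYEE rows: the phone field holds several values at once.
employee = [
    (14, "John", "7272826385, 9064738238", "UP"),
    (12, "Sam", "7390372389, 8589830302", "Punjab"),
]


def to_1nf(rows):
    """Return one row per (employee, phone) pair so that every value is atomic."""
    result = []
    for emp_id, name, phones, state in rows:
        for phone in phones.split(","):
            result.append((emp_id, name, phone.strip(), state))
    return result


employee_1nf = to_1nf(employee)
for row in employee_1nf:
    print(*row)
```

Each employee now appears once per phone number, which removes the non-atomic value at the cost of repeating the name and state.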
STUD_NO COURSE_NO COURSE_FEE
1 C1 1000
2 C2 1500
1 C4 2000
4 C3 1000
4 C1 1000
2 C5 2000
• Many courses have the same course fee, so COURSE_FEE alone cannot
decide the value of COURSE_NO or STUD_NO.
• COURSE_FEE together with STUD_NO cannot decide the value of COURSE_NO.
• COURSE_FEE together with COURSE_NO cannot decide the value of STUD_NO.
• The candidate key for this table is {STUD_NO, COURSE_NO} because the combination of
these two columns uniquely identifies each row in the table.
• COURSE_FEE is a non-prime attribute because it is not part of the candidate
key {STUD_NO, COURSE_NO}.
• However, COURSE_NO -> COURSE_FEE, i.e., COURSE_FEE depends on COURSE_NO,
which is a proper subset of the candidate key.
• Therefore, the non-prime attribute COURSE_FEE depends on a proper subset of the
candidate key. This is a partial dependency, so the relation is not in 2NF.
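The claims in the list above can be checked against the sample rows with a small Python sketch. The helper fd_holds and the COLS mapping are hypothetical names introduced for this example. Note that sample data can only refute a functional dependency; a result of True means the dependency is not contradicted by these six rows, not that it is guaranteed by the schema.

```python
# Sample rows: (STUD_NO, COURSE_NO, COURSE_FEE)
rows = [
    (1, "C1", 1000),
    (2, "C2", 1500),
    (1, "C4", 2000),
    (4, "C3", 1000),
    (4, "C1", 1000),
    (2, "C5", 2000),
]
COLS = {"STUD_NO": 0, "COURSE_NO": 1, "COURSE_FEE": 2}


def fd_holds(rows, lhs, rhs):
    """True if rows that agree on the lhs columns always agree on the rhs columns."""
    seen = {}
    for row in rows:
        key = tuple(row[COLS[c]] for c in lhs)
        val = tuple(row[COLS[c]] for c in rhs)
        # setdefault stores the first rhs value seen for this lhs value
        if seen.setdefault(key, val) != val:
            return False
    return True


print(fd_holds(rows, ["COURSE_FEE"], ["COURSE_NO"]))               # False
print(fd_holds(rows, ["COURSE_FEE", "STUD_NO"], ["COURSE_NO"]))    # False
print(fd_holds(rows, ["COURSE_FEE", "COURSE_NO"], ["STUD_NO"]))    # False
print(fd_holds(rows, ["COURSE_NO"], ["COURSE_FEE"]))               # True
```

The last call is the partial dependency: COURSE_FEE is determined by COURSE_NO alone, which is only part of the candidate key.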
To convert the above relation to 2NF, we split it into two tables: Table 1 with STUD_NO and
COURSE_NO, and Table 2 with COURSE_NO and COURSE_FEE.
STUD_COURSE table:
STUD_NO COURSE_NO
1 C1
2 C2
1 C4
4 C3
4 C1
2 C5
COURSES table:
COURSE_NO COURSE_FEE
C1 1000
C2 1500
C4 2000
C3 1000
C5 2000
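As a rough check that this 2NF split loses no information, the following Python sketch projects the original rows onto the two tables and joins them back on COURSE_NO. The variable names are illustrative assumptions, and the check only covers the sample rows shown here.

```python
# Original relation: (STUD_NO, COURSE_NO, COURSE_FEE)
rows = [
    (1, "C1", 1000),
    (2, "C2", 1500),
    (1, "C4", 2000),
    (4, "C3", 1000),
    (4, "C1", 1000),
    (2, "C5", 2000),
]

# Projection onto (STUD_NO, COURSE_NO) and (COURSE_NO, COURSE_FEE);
# sets remove the duplicate (C1, 1000) pair.
stud_course = sorted({(stud, course) for stud, course, _ in rows})
courses = sorted({(course, fee) for _, course, fee in rows})

# Natural join on COURSE_NO: look up each course's fee once.
fee_of = dict(courses)
rejoined = sorted((stud, course, fee_of[course]) for stud, course in stud_course)

print(len(courses))              # 5
print(rejoined == sorted(rows))  # True
```

The fee for C1 is now stored once in COURSES instead of once per enrolled student, and the join still reproduces every original row.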
EMPLOYEE_ZIP table:
EMP_ZIP EMP_STATE EMP_CITY
201010 UP Noida
02228 US Boston
60007 US Chicago
06389 UK Norwich
462007 MP Bhopal
264 India
364 UK
EMP_DEPT table:
EMP_DEPT_MAPPING table:
EMP_ID EMP_DEPT
D394 283
D394 300
D283 232
D283 549
STUDENT table:
STU_ID COURSE HOBBY
21 Computer Dancing
21 Math Singing
34 Chemistry Dancing
74 Biology Cricket
59 Physics Hockey
The given STUDENT table is in 3NF, but COURSE and HOBBY are two independent entities.
Hence, there is no relationship between COURSE and HOBBY.
In the STUDENT relation, the student with STU_ID 21 has two courses, Computer and Math, and two
hobbies, Dancing and Singing. So, there are multi-valued dependencies on STU_ID
(STU_ID ->> COURSE and STU_ID ->> HOBBY), which lead to unnecessary repetition of data.
To bring the above table into 4NF, we can decompose it into two tables:
STUDENT_COURSE
STU_ID COURSE
21 Computer
21 Math
34 Chemistry
74 Biology
59 Physics
STUDENT_HOBBY
STU_ID HOBBY
21 Dancing
21 Singing
34 Dancing
74 Cricket
59 Hockey
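The 4NF decomposition above amounts to projecting the STUDENT table onto (STU_ID, COURSE) and (STU_ID, HOBBY). A minimal Python sketch follows; the project helper and the tuple layout are assumptions made for illustration.

```python
# STUDENT rows: (STU_ID, COURSE, HOBBY)
student = [
    (21, "Computer", "Dancing"),
    (21, "Math", "Singing"),
    (34, "Chemistry", "Dancing"),
    (74, "Biology", "Cricket"),
    (59, "Physics", "Hockey"),
]


def project(rows, indexes):
    """Project rows onto the given column positions, dropping duplicates in order."""
    result = []
    for row in rows:
        picked = tuple(row[i] for i in indexes)
        if picked not in result:
            result.append(picked)
    return result


student_course = project(student, (0, 1))  # STUDENT_COURSE
student_hobby = project(student, (0, 2))   # STUDENT_HOBBY

for row in student_course:
    print(*row)
for row in student_hobby:
    print(*row)
```

Each resulting table records one independent fact about a student, so adding a new hobby for student 21 no longer requires repeating a course.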