0% found this document useful (0 votes)
20 views

Unit-2-Normalization

Uploaded by

vvyogeshwari95
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
20 views

Unit-2-Normalization

Uploaded by

vvyogeshwari95
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
You are on page 1/ 11

Normalization

Normalization is the process of organizing the data in the database with two goal:
 To minimize the redundancy from a relation or set of relations.
 To ensure the data dependencies

It is also used to eliminate the undesirable characteristics like Insertion, Update and Deletion
Anomalies.

Normalization divides the larger table into the smaller table and links them using relationship.

Benefits of Normalization:

 Quicker updates
 Less storage space
 Clear relationship
 Flexible structure
 Less inconsistency
 Less redundancy

Types of Normalization:

Normal Form Description

1NF A relation is in 1NF if every attribute contains an atomic value.

2NF A relation will be in 2NF if it is in 1NF and all non-key attributes are fully
functional dependent on the primary key.

3NF A relation will be in 3NF if it is in 2NF and no transitive dependency exists.

BCNF A relation will be in BCNF if it is in 3NF and every determinant set must the
super key

4NF A relation will be in 4NF if it is in Boyce Codd normal form and has no
multi- valued dependency.

5NF A relation is in 5NF if it is in 4NF and not contains any join dependency and
joining should be lossless.

First Normal Form (INF):

 A relation will be 1NF if it contains an atomic value.


 It states that an attribute of a table cannot hold multiple values. It must hold only single-
valued attribute.
 First normal form disallows the multi-valued attribute, composite attribute, and their
combinations.
Example: Relation EMPLOYEE is not in 1NF because of multi-valued attribute EMP_PHONE.

EMPLOYEE table:

EMP_ID EMP_NAME EMP_PHONE EMP_STATE

14 John 7272826385, UP
9064738238

20 Harry 8574783832 Bihar

12 Sam 7390372389, Punjab


8589830302

The decomposition of the EMPLOYEE table into 1NF has been shown below:

EMP_ID EMP_NAME EMP_PHONE EMP_STATE

14 John 7272826385 UP

14 John 9064738238 UP

20 Harry 8574783832 Bihar

12 Sam 7390372389 Punjab

12 Sam 8589830302 Punjab

Second Normal Form (2NF):

 In the 2NF, relational must be in 1NF.


 In the second normal form, every attributes are fully functional dependent on all prime
attribute.

Prime attribute – Member of the candidate key


Non-Prime attribute – Non Member of the candidate key

Example: Let's assume, a school can store the data of teachers and the subjects they teach. In a
school, a teacher can teach more than one subject.
TEACHER table
TEACHER_ID SUBJECT TEACHER_AGE
25 Chemistry 30

25 Biology 30

47 English 35

83 Math 38

83 Computer 38

In the given table, non-prime attribute TEACHER_AGE is dependent on TEACHER_ID which


is a proper subset of a candidate key. That's why it violates the rule for 2NF.

To convert the given table into 2NF, we decompose it into two tables: TEACHER_DETAIL

table:
TEACHER_ID TEACHER_AGE

25 30

47 35

83 38

TEACHER_SUBJECT table:
TEACHER_ID SUBJECT

25 Chemistry

25 Biology

47 English

83 Math

83 Computer

Third Normal Form (3NF):

A relation will be in 3NF if it is in 2NF and not contain any transitive dependency.. If there is no
transitive dependency for non-prime attributes, then the relation must be in third normal form.
3NF states that all column reference in referenced data that are not dependent on the primary key
should be removed.
3NF is used to reduce the data duplication. It is also used to achieve the data integrity

Consider the following example:

In the table able, [Book ID] determines [Genre ID], and [Genre ID] determines [Genre Type].
Therefore, [Book ID] determines [Genre Type] via [Genre ID] and we have transitive
functional dependency, and this structure does not satisfy third normal form.

To bring this table to third normal form, we split the table into two as follows:

Boyce codd Normal Form (BCNF):

 BCNF is the advance version of 3NF. It is stricter than 3NF.


 A table is in BCNF if every functional dependency X → Y, X is the super key of the
table.
 For BCNF, the table should be in 3NF, and for every FD, LHS is super key.

Example: Let's assume there is a company where employees work in more than one
department.
Below we have a college enrolment table with columns student_id, subject and professor.

student_id subject professor

101 Java P.Java

101 C++ P.Cpp

102 Java P.Java2

103 C# P.Chash

104 Java P.Java

 One student can enrol for multiple subjects. For example, student with student_id 101,
has opted for subjects - Java & C++
 For each subject, a professor is assigned to the student.
 And, there can be multiple professors teaching one subject like we have for Java.

In the table above, student_id, subject together form the primary key, because
using student_id and subject, we can find all the columns of the table.

One more important point to note here is, one professor teaches only one subject, but one subject
may have two different professors. Hence, there is a dependency
between subject and professor here, where subject depends on the professor name.

In the table above, student_id, subject form primary key, which means subject column is a prime
attribute.
But, there is one more dependency, professor → subject.

And while subject is a prime attribute, professor is a non-prime attribute, which is not
allowed by BCNF.

To make this relation(table) satisfy BCNF, we will decompose this table into two
tables, student table and professor table.
Student Table

student_id p_id

101 1

101 2

102 3

103 4

104 1

And, Professor Table

p_id professor subject

1 P.Java Java

2 P.Cpp C++

3 P.Java2 java

4 P.hash C#

COMPARISION OF 3NF AND BCNF

BASIS FOR
3NF BCNF
COMPARISON

Concept No non-prime attribute must For any trivial dependency


be transitively dependent on in a relation R say X->Y, X
the Candidate key. should be a super key of
relation R.

Dependency 3NF can be obtained without Dependencies may not be


sacrificing all dependencies. preserved in BCNF.

Decomposition Lossless decomposition can Lossless decomposition is


be achieved in 3NF. hard to achieve in BCNF.
Fourth normal form (4NF)

A relation will be in 4NF if it is in Boyce Codd normal form and has no multi-valued
dependency. If the table has two or more independent one – many relationships then the relation
is said to be in multivalued dependency. i.e, For a dependency A → B, if for a single value of A,
multiple values of B exists, then the relation will be a multi-valued dependency.

Example
STUDENT

STU_ID COURSE HOBBY

21 Computer Dancing

21 Math Singing

34 Chemistry Dancing

74 Biology Cricket

59 Physics Hockey

The given STUDENT table is in 3NF, but the COURSE and HOBBY are two independent
entity. Hence, there is no relationship between COURSE and HOBBY.

In the STUDENT relation, a student with STU_ID, 21 contains two


courses, Computer and Math and two hobbies, Dancing and Singing. So there is a Multi-
valued dependency on STU_ID, which leads to unnecessary repetition of data.
So to make the above table into 4NF, we can decompose it into two tables:
STUDENT_COURSE
STU_ID COURSE

21 Computer

21 Math

34 Chemistry

74 Biology
59 Physics

STUDENT_HOBBY

STU_ID HOBBY

21 Dancing

21 Singing

34 Dancing

74 Cricket

59 Hockey

Fifth Normal Form (5NF):


 A relation is in 5NF if it is in 4NF and not contains any join dependency and joining
should be lossless.
 5NF is satisfied when all the tables are broken into as many tables as possible in order to
avoid redundancy.
 5NF is also known as Project-join normal form (PJ/NF).

If the above table is decomposed as:


if the natural join of PI and P2 IS taken, the result is:

The spurious row as asterisked. Now, if this result is joined with P3 over the
column 'company 'product_name' the following table is obtained:

Hence, in this example, all the redundancies are eliminated, and the decomposition of ACP is a
lossless join decomposition. Therefore, the relation is in 5NF as it does not violate the property
of lossless join.

Denormalization:
The process of introducing redundancy in the relations is called as denormalization.
Example: Auditing

DKNF:

DKNF is a normal form used in database normalization which requires that


the database contains no constraints other than domain constraints and key
constraints.
Reason to use DKNF are as follows:
1. To avoid general constraints in the database that are not clear key
constraints. Most database can easily test or check key constraints on
attributes

You might also like