Normal Forms
A relation is in Second Normal Form (2NF) if it is in First Normal Form and every
non-primary-key attribute is fully functionally dependent on the primary key.
Note – If a proper subset of a candidate key determines a non-prime attribute, the
dependency is called a partial dependency. Normalizing a 1NF relation to 2NF means
removing partial dependencies: the partially dependent attribute(s) are moved into
a new relation along with a copy of their determinant.
Consider the examples given below.
Example-1: Consider the table below.
STUD_NO COURSE_NO COURSE_FEE
1 C1 1000
2 C2 1500
1 C4 2000
4 C3 1000
4 C1 1000
2 C5 2000
{Note that several courses have the same course fee.}
Here, COURSE_FEE alone cannot determine the value of COURSE_NO or STUD_NO;
COURSE_FEE together with STUD_NO cannot determine the value of COURSE_NO;
COURSE_FEE together with COURSE_NO cannot determine the value of STUD_NO.
Hence, COURSE_FEE is a non-prime attribute, as it does not belong
to the only candidate key {STUD_NO, COURSE_NO}.
But COURSE_NO -> COURSE_FEE, i.e., COURSE_FEE is dependent on
COURSE_NO, which is a proper subset of the candidate key.
A non-prime attribute that depends on a proper subset of the candidate key
is a partial dependency, so this relation is not in 2NF.
To convert the above relation to 2NF, we split the table into two tables:
Table 1: STUD_NO, COURSE_NO
Table 2: COURSE_NO, COURSE_FEE
Table 1
STUD_NO COURSE_NO
1 C1
2 C2
1 C4
4 C3
4 C1
2 C5
Table 2
COURSE_NO COURSE_FEE
C1 1000
C2 1500
C3 1000
C4 2000
C5 2000
Note – 2NF reduces the amount of redundant data that is stored. For instance, if
100 students take course C1, we do not need to store its fee of 1000 in all 100
records; we store it once in the second table as the course fee for C1.
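The split can be checked on the sample rows. The following is a minimal Python sketch, not part of the original notes; the variable names (enrolment, table1, table2, rejoined) are made up for illustration. It projects the rows onto the two tables and joins them back on COURSE_NO to confirm that nothing is lost.

```python
# Rows of the original relation (STUD_NO, COURSE_NO, COURSE_FEE) from Example-1.
enrolment = [
    (1, "C1", 1000),
    (2, "C2", 1500),
    (1, "C4", 2000),
    (4, "C3", 1000),
    (4, "C1", 1000),
    (2, "C5", 2000),
]

# Table 1: projection onto (STUD_NO, COURSE_NO).
table1 = sorted({(stud, course) for stud, course, _ in enrolment})

# Table 2: projection onto (COURSE_NO, COURSE_FEE); each course is stored once.
table2 = {course: fee for _, course, fee in enrolment}

# Joining the two tables on COURSE_NO rebuilds the original rows.
rejoined = sorted((stud, course, table2[course]) for stud, course in table1)

print(len(table2))                    # 5 distinct courses, so 5 fee values stored
print(rejoined == sorted(enrolment))  # True: the decomposition is lossless here
```

The fee for C1 appears twice in the original rows but only once in table2, which is the redundancy the note describes.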
Example-2: Consider following functional dependencies in relation
R (A, B, C, D )
AB -> C [A and B together determine C]
BC -> D [B and C together determine D]
In this case, the relation R has the composite candidate key {A, B}: AB -> C gives C,
and then BC -> D gives D, so the closure of AB is {A, B, C, D}, while neither A nor B
alone determines any other attribute.
Similarly, BC -> D shows that B and C together uniquely determine the value of
D.
The relation R is already in 1NF because it does not have any repeating groups or
nested relations.
The non-prime attribute D depends on BC, but BC is not a proper subset of the candidate
key {A, B}, because C is not part of the key. BC -> D is therefore not a partial
dependency. No proper subset of AB determines C or D, so R satisfies the 2NF condition.
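The candidate key in this example can be confirmed by computing attribute closures. Below is a small Python sketch of the usual closure computation; the function name closure and the constant FDS are illustrative choices, not something defined in these notes. Each FD is written as a (left side, right side) pair of attribute strings.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply lhs -> rhs whenever the whole left side is already known.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# FDs of R(A, B, C, D) from Example-2.
FDS = [("AB", "C"), ("BC", "D")]

print(sorted(closure("AB", FDS)))  # ['A', 'B', 'C', 'D']
print(sorted(closure("A", FDS)))   # ['A']
print(sorted(closure("B", FDS)))   # ['B']
print(sorted(closure("BC", FDS)))  # ['B', 'C', 'D']
```

AB determines every attribute while A and B on their own determine nothing else, so {A, B} is a minimal key. BC does not determine A, so BC is not a key.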
Third Normal Form (3NF):
A relation is in third normal form if it is in second normal form and has no
transitive dependency for non-prime attributes.
A relation is in 3NF if at least one of the following conditions holds for every
non-trivial functional dependency X -> Y:
X is a super key.
Y is a prime attribute (each element of Y is part of some candidate key).
In other words, a relation that is in First and Second Normal Form, and in which no
non-primary-key attribute is transitively dependent on the primary key, is in Third
Normal Form (3NF).
Note – If A->B and B->C are two FDs, then A->C is called a transitive dependency.
The normalization of 2NF relations to 3NF involves the removal of transitive
dependencies. If a transitive dependency exists, we remove the transitively
dependent attribute(s) from the relation by placing the attribute(s) in a new relation
along with a copy of the determinant.
Consider the examples given below.
Example-1:
In relation STUDENT given in Table 4,
FD set:
{STUD_NO -> STUD_NAME, STUD_NO -> STUD_STATE, STUD_NO -> STUD_AGE,
STUD_STATE -> STUD_COUNTRY}
Candidate Key:
{STUD_NO}
For this relation in Table 4, STUD_NO -> STUD_STATE and STUD_STATE ->
STUD_COUNTRY both hold, so STUD_COUNTRY is transitively dependent on
STUD_NO. This violates the third normal form. To convert it to third normal form,
we decompose the relation STUDENT (STUD_NO, STUD_NAME,
STUD_PHONE, STUD_STATE, STUD_COUNTRY, STUD_AGE) as:
STUDENT (STUD_NO, STUD_NAME, STUD_PHONE, STUD_STATE,
STUD_AGE)
STATE_COUNTRY (STATE, COUNTRY)
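The 3NF test from the definition can be applied mechanically to this FD set. The sketch below is illustrative only: the names closure, violates_3nf, ATTRS, FDS and PRIME are invented here, and STUD_PHONE is left out because the FD set in the text does not mention it. The candidate key {STUD_NO} is taken from the text rather than computed.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# Assumption: only the attributes that occur in the FD set are modelled.
ATTRS = {"STUD_NO", "STUD_NAME", "STUD_STATE", "STUD_COUNTRY", "STUD_AGE"}
FDS = [
    ({"STUD_NO"}, {"STUD_NAME"}),
    ({"STUD_NO"}, {"STUD_STATE"}),
    ({"STUD_NO"}, {"STUD_AGE"}),
    ({"STUD_STATE"}, {"STUD_COUNTRY"}),
]
PRIME = {"STUD_NO"}  # attributes of the candidate key stated in the text

def violates_3nf(lhs, rhs, attrs, fds, prime):
    """True if lhs -> rhs is non-trivial, lhs is not a super key,
    and rhs contains at least one non-prime attribute."""
    if rhs <= lhs:
        return False                      # trivial FD
    if closure(lhs, fds) >= attrs:
        return False                      # lhs is a super key
    return not (rhs - lhs) <= prime       # some attribute of rhs is non-prime

bad = [(lhs, rhs) for lhs, rhs in FDS
       if violates_3nf(lhs, rhs, ATTRS, FDS, PRIME)]
print(bad)  # [({'STUD_STATE'}, {'STUD_COUNTRY'})]
```

Only STUD_STATE -> STUD_COUNTRY is reported, which is the dependency that the decomposition moves into STATE_COUNTRY.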
Example-2:
Consider relation R(A, B, C, D, E) with
A -> BC,
CD -> E,
B -> D,
E -> A
The candidate keys of this relation are {A, E, CD, BC}, so every attribute is prime.
In particular, every attribute on the right-hand side of a functional dependency is
prime, so the relation is already in 3NF.
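The list of candidate keys can be reproduced by brute force: try attribute sets in increasing size and keep those whose closure is the whole relation and that contain no smaller key. The Python sketch below does exactly that. It is only meant for small textbook relations, and the names closure, candidate_keys, keys and prime are illustrative.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attributes, fds):
    """Minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            if any(set(key) <= set(combo) for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(combo, fds) == set(attributes):
                keys.append(combo)
    return keys

# R(A, B, C, D, E) with A -> BC, CD -> E, B -> D, E -> A.
fds = [("A", "BC"), ("CD", "E"), ("B", "D"), ("E", "A")]
keys = candidate_keys("ABCDE", fds)
prime = set().union(*keys)

print(keys)           # [('A',), ('E',), ('B', 'C'), ('C', 'D')]
print(sorted(prime))  # ['A', 'B', 'C', 'D', 'E']
```

Since every attribute is prime, each FD satisfies the second 3NF condition.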
Note –
Third Normal Form (3NF) is considered adequate for normal relational database
design because most 3NF tables are free of insertion, update, and deletion
anomalies.
Moreover, a decomposition into 3NF that is both lossless and dependency preserving
always exists.
Boyce-Codd Normal Form (BCNF)
•Application of the general definitions of 2NF and 3NF may identify
additional redundancy caused by dependencies that violate one or more
candidate keys.
•However, despite these additional constraints, dependencies can still
exist that will cause redundancy to be present in 3NF relations.
•This weakness in 3NF resulted in the presentation of a stronger normal
form called the Boyce-Codd Normal Form (Codd, 1974).
Although 3NF is an adequate normal form for most relational databases, it may
not remove all redundancy: a functional dependency X -> Y can remain in which
X is not a candidate key of the given relation. Boyce-Codd Normal Form (BCNF)
addresses this.
Boyce–Codd Normal Form (BCNF) is based on functional dependencies
that take into account all candidate keys in a relation; however, BCNF
also has additional constraints compared with the general definition of
3NF.
Rules for BCNF
Rule 1: The table should be in the 3rd Normal Form.
Rule 2: X should be a superkey for every non-trivial functional dependency (FD)
X -> Y in a given relation.
Note: To test whether a relation is in BCNF, we identify all the
determinants and make sure that each one is a candidate key (or, more
generally, a superkey).
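The determinant test in the note translates directly into code. The sketch below is a hedged illustration, not a standard API: bcnf_violations and closure are hypothetical helper names, and the input reuses the STUD_NO, COURSE_NO, COURSE_FEE relation from the 2NF example earlier in these notes.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(attributes, fds):
    """Return the non-trivial FDs whose determinant is not a superkey."""
    everything = set(attributes)
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and closure(lhs, fds) != everything
    ]

fds = [(("COURSE_NO",), ("COURSE_FEE",))]

# Original relation: COURSE_NO is a determinant but not a superkey.
enrolment = ("STUD_NO", "COURSE_NO", "COURSE_FEE")
print(bcnf_violations(enrolment, fds))  # [(('COURSE_NO',), ('COURSE_FEE',))]

# After the split, COURSE_NO is the key of its own table.
course = ("COURSE_NO", "COURSE_FEE")
print(bcnf_violations(course, fds))     # []
```

An empty result means every determinant is a superkey, which is Rule 2.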
BCNF in DBMS
Every relation in BCNF is also in 3NF. The converse does not hold: a
relation in 3NF need not be in BCNF.
To determine the highest normal form of a given relation R with
functional dependencies, the first step is to check whether the BCNF
condition holds. If R is found to be in BCNF, it can be safely deduced
that the relation is also in 3NF, 2NF, and 1NF as the hierarchy shows.
The 1NF has the least restrictive constraint – it only requires a relation
R to have atomic values in each tuple. The 2NF has a slightly more
restrictive constraint.
The 3NF has a more restrictive constraint than the first two normal forms but is less
restrictive than the BCNF. In this manner, the restriction increases as we traverse
down the hierarchy.
Example 1
Let us consider the student database, in which data of the student are mentioned.
Stu_ID Stu_Branch Stu_Course Branch_Number Stu_Course_No
101 Computer Science & Engineering DBMS B_001 201
101 Computer Science & Engineering Computer Networks B_001 202
Stu_ID Stu_Course_No
101 201
101 202
102 401
102 402
Example 2 –
ID Name Courses
------------------
1 A c1, c2
2 E c3
3 M c2, c3
In the above table, Courses is a multi-valued attribute, so the table is not in 1NF.
The table below is in 1NF, as it has no multi-valued attribute.
ID Name Course
------------------
1 A c1
1 A c2
2 E c3
3 M c2
3 M c3
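The conversion to 1NF amounts to emitting one row per course. A minimal Python sketch follows; it assumes the multi-valued column is stored as a comma-separated string, and the names unnormalized and to_1nf are made up for this illustration.

```python
# Rows with a multi-valued Courses column, as in the first table of Example 2.
unnormalized = [
    (1, "A", "c1, c2"),
    (2, "E", "c3"),
    (3, "M", "c2, c3"),
]

def to_1nf(rows):
    """Emit one (ID, Name, Course) row per course so every value is atomic."""
    flat = []
    for row_id, name, courses in rows:
        for course in courses.split(","):
            flat.append((row_id, name, course.strip()))
    return flat

for row in to_1nf(unnormalized):
    print(row)
# (1, 'A', 'c1')
# (1, 'A', 'c2')
# (2, 'E', 'c3')
# (3, 'M', 'c2')
# (3, 'M', 'c3')
```

The output rows match the 1NF table above.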
Second Normal Form
To be in second normal form, a relation must be in first normal form and must
not contain any partial dependency. That is, no non-prime attribute (an attribute
that is not part of any candidate key) may depend on a proper subset of any
candidate key of the table.
Partial Dependency – If a proper subset of a candidate key determines a non-prime
attribute, the dependency is called a partial dependency.
Example 1 – Consider again the relation (STUD_NO, COURSE_NO, COURSE_FEE) from
Example-1 above (table-3). Its only candidate key is {STUD_NO, COURSE_NO}, and
COURSE_FEE is a non-prime attribute. Because COURSE_NO -> COURSE_FEE and
COURSE_NO is a proper subset of the candidate key, the relation has a partial
dependency and is not in 2NF. Splitting it into Table 1 (STUD_NO, COURSE_NO) and
Table 2 (COURSE_NO, COURSE_FEE) removes the partial dependency and stores each
course fee only once.
Example 2 – Consider the following functional dependencies in relation R(A, B, C, D):
AB -> C [A and B together determine C]
BC -> D [B and C together determine D]
In the above relation, AB is the only candidate key and there is no partial dependency,
i.e., no proper subset of AB determines any non-prime attribute. R is therefore in 2NF.
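The absence of a partial dependency can be checked by taking the closure of every proper subset of the key and looking for non-prime attributes in it. The Python sketch below is illustrative: partial_dependencies, closure, FDS, KEY and COURSE_FDS are names chosen here, not taken from the notes. The course-fee relation is included as a contrasting case.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def partial_dependencies(key, prime, fds):
    """List (subset, attributes) pairs where a proper subset of key
    determines one or more non-prime attributes."""
    found = []
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            determined = closure(subset, fds) - set(subset) - set(prime)
            if determined:
                found.append((subset, sorted(determined)))
    return found

# R(A, B, C, D): the only candidate key is AB, so the prime attributes are A, B.
FDS = [("AB", "C"), ("BC", "D")]
print(partial_dependencies("AB", "AB", FDS))    # []

# Contrast: (STUD_NO, COURSE_NO, COURSE_FEE) with COURSE_NO -> COURSE_FEE.
KEY = ("STUD_NO", "COURSE_NO")
COURSE_FDS = [(("COURSE_NO",), ("COURSE_FEE",))]
print(partial_dependencies(KEY, KEY, COURSE_FDS))
# [(('COURSE_NO',), ['COURSE_FEE'])]
```

For a relation with several candidate keys, the same check would be repeated for each key.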
Example 2: Find the highest normal form of a relation R(A, B, C, D, E) with FD set
{BC -> D, AC -> BE, B -> E}
Step 1: (AC)+ = {A, C, B, E, D}, but no proper subset of AC determines all
attributes of the relation, so AC is a candidate key. Neither A nor C can be derived
from any other attribute of the relation, so every candidate key must contain both,
and {AC} is the only candidate key.
Step 2: Prime attributes are those that are part of a candidate key, {A, C} in this
example; the others, {B, D, E}, are non-prime.
Step 3: The relation R is in 1st normal form, as a relational DBMS does not allow
multi-valued or composite attributes. The relation is in 2nd normal form because no
proper subset of the candidate key AC determines a non-prime attribute: in BC -> D, BC
is not a proper subset of AC; in AC -> BE, AC is the candidate key itself; and in
B -> E, B is not a proper subset of AC.
The relation is not in 3rd normal form: in BC -> D, BC is not a super key and D is not
a prime attribute, and in B -> E, B is not a super key and E is not a prime attribute.
To satisfy 3rd normal form, either the LHS of an FD must be a super key or the RHS must
be a prime attribute. So the highest normal form of the relation is 2nd normal form.
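The three steps can be combined into one routine. The Python sketch below is a brute-force illustration for small relations, assuming the relation is already in 1NF; the names closure, candidate_keys and highest_normal_form are chosen for this example and do not come from any library.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attributes, fds):
    """Step 1: minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            if any(set(key) <= set(combo) for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(combo, fds) == set(attributes):
                keys.append(combo)
    return keys

def highest_normal_form(attributes, fds):
    """Return '1NF', '2NF', '3NF' or 'BCNF' for a relation assumed to be in 1NF."""
    everything = set(attributes)
    keys = candidate_keys(attributes, fds)
    prime = set().union(*keys)            # Step 2: prime attributes

    def trivial(lhs, rhs):
        return set(rhs) <= set(lhs)

    def superkey(lhs):
        return closure(lhs, fds) == everything

    # BCNF: the left side of every non-trivial FD is a super key.
    if all(trivial(l, r) or superkey(l) for l, r in fds):
        return "BCNF"
    # 3NF: otherwise the right side may only add prime attributes.
    if all(trivial(l, r) or superkey(l) or set(r) - set(l) <= prime
           for l, r in fds):
        return "3NF"
    # 2NF: no proper subset of a candidate key determines a non-prime attribute.
    for key in keys:
        for size in range(1, len(key)):
            for part in combinations(key, size):
                if closure(part, fds) - set(part) - prime:
                    return "1NF"
    return "2NF"

fds = [("BC", "D"), ("AC", "BE"), ("B", "E")]
print(candidate_keys("ABCDE", fds))       # [('A', 'C')]
print(highest_normal_form("ABCDE", fds))  # 2NF
```

The result agrees with the worked steps: AC is the only candidate key, and BC -> D and B -> E block 3NF while no partial dependency exists.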
For example, consider relation R(A, B, C) with A -> BC and B -> A. A and B are both
super keys, so the relation is in BCNF.
Third Normal Form
A relation is in third normal form if it is in second normal form and has no
transitive dependency for non-prime attributes.
At least one of the following conditions must hold for every non-trivial functional
dependency X -> Y:
X is a Super Key.
Y is a Prime Attribute (this means that each element of Y is part of some Candidate Key).
BCNF (Boyce-Codd Normal Form)
BCNF (Boyce-Codd Normal Form) is a stricter version of Third Normal Form. A relation
must be in Third Normal Form before it can be in BCNF.
The basic rules for BCNF are:
1. The table must be in Third Normal Form.
2. For every non-trivial functional dependency X -> Y, X must be a superkey of the
relation.
Fourth Normal Form
A relation in Fourth Normal Form has no non-trivial multivalued dependency whose
determinant is not a superkey. A relation must be in BCNF before it can be in Fourth
Normal Form. The basic rules are mentioned below.
1. It must be in BCNF.
2. It must not have any non-trivial multi-valued dependency X ->> Y unless X is a
superkey.
Fifth Normal Form
Fifth Normal Form is also called Project-Join Normal Form (PJNF). The basic conditions
of Fifth Normal Form are mentioned below.
The relation must be in Fourth Normal Form.
The relation cannot be losslessly decomposed into smaller relations any further.
Applications of Normal Forms in DBMS
Data consistency: Normal forms help keep data consistent by removing redundant copies
of the same fact. This helps to prevent inconsistencies and errors in the database.
Data redundancy: Normal forms minimize data redundancy by storing each fact in only
one table. This reduces the amount of storage space required for the database and
makes it easier to manage.
Query performance: Normal forms can improve the performance of updates and of queries
that touch a single table, because tables are smaller and each fact is stored once.
Queries that combine data from several tables may need more joins, so heavily
normalized schemas are sometimes denormalized for read performance.
Database maintenance: Normal forms make it easier to maintain the database by
reducing the amount of redundant data that needs to be updated, deleted, or modified.
This helps to improve database management and reduce the risk of errors or
inconsistencies.
Database design: Normal forms provide guidelines for designing databases that are
efficient, flexible, and scalable. This helps to ensure that the database can be easily
modified, updated, or expanded as needed.
Some Important Points about Normal Forms
BCNF is free from redundancy caused by functional dependencies.
If a relation is in BCNF, then 3NF is also satisfied.
If all attributes of a relation are prime attributes, then the relation is always in 3NF.
A relation in a relational database is always at least in 1NF.
Every binary relation (a relation with only 2 attributes) is always in BCNF.
If a relation has only singleton candidate keys (i.e., every candidate key consists of
only 1 attribute), then the relation is always in 2NF (because no partial functional
dependency is possible).
Decomposing to BCNF may not preserve every functional dependency. In that case,
go for BCNF only if the lost FD(s) are not required; otherwise normalize only up to
3NF.
There are many more Normal forms that exist after BCNF, like 4NF and more. But in
real world database systems it’s generally not required to go beyond BCNF.