Module-3

Module 3 covers Database Management Systems with a focus on normalization, which organizes data to eliminate redundancy and improve consistency. It discusses various normal forms (1NF, 2NF, 3NF, BCNF, 4NF) and their advantages, as well as functional dependencies and types, including trivial, non-trivial, multivalued, and transitive dependencies. The module emphasizes the importance of structuring databases effectively to enhance performance and maintainability.

Uploaded by

gracyma007
Module – 3: Database Management System

Module – 3
Normalization

• Normalization is a systematic approach to organizing data in a database to eliminate
redundancy, avoid anomalies and ensure data consistency.
• The process involves breaking down large tables into smaller, well-structured
ones and defining relationships between them.
• This not only reduces the chances of storing duplicate data but also improves the
overall efficiency of the database.
• Purpose of Normal Forms:
To organize data efficiently, eliminate redundancy, and prevent anomalies
during data operations like insertion, deletion and updates.
• By following a series of rules called normal forms, normalization ensures that the
data is logically organized and maintains its integrity.
Types of Normal Forms

➢ First Normal Form (1NF)
➢ Second Normal Form (2NF)
➢ Third Normal Form (3NF)
➢ Boyce-Codd Normal Form (BCNF)
➢ Fourth Normal Form (4NF)
➢ Fifth Normal Form (5NF)

Advantages of Normal Form

• Reduced data redundancy: Normalization helps to eliminate duplicate data in
tables, reducing the amount of storage space needed and improving database
efficiency.
• Improved data consistency: Normalization ensures that data is stored in a
consistent and organized manner, reducing the risk of data inconsistencies and
errors.
• Simplified database design: Normalization provides guidelines for organizing
tables and data relationships, making it easier to design and maintain a database.
• Improved query performance: Normalized tables are smaller and hold each fact only
once, which typically makes searches and updates on a single table faster, although
queries that combine several tables need joins.
• Easier database maintenance: Normalization reduces the complexity of a
database by breaking it down into smaller, more manageable tables, making it
easier to add, modify, and delete data.
Functional dependencies
• In relational database management, a functional dependency is a constraint
between two sets of attributes in a relation, where the value of one set
determines the value of the other.
• It is denoted as X → Y, where X, the attribute set on the left side of the arrow, is
called the determinant, and Y is called the dependent.
• X → Y holds when any two rows that agree on X also agree on Y, that is, each
value of X is associated with exactly one value of Y.
• It is a constraint that describes how attributes in a table relate to each other. If
attribute A functionally determines attribute B, we write this as A → B.
• Functional dependencies are used to mathematically express relations among
database entities.
Example:
roll_no name dept_name dept_building
42 abc CO A4
43 pqr IT A3
44 xyz CO A4
45 xyz IT A3
46 mno EC B2
47 jkl ME B2
From the above table we can conclude some valid functional dependencies:
• roll_no → {name, dept_name, dept_building}: roll_no determines the values of the
fields name, dept_name and dept_building, hence this is a valid functional dependency.
• roll_no → dept_name: since roll_no determines the whole set {name, dept_name,
dept_building}, it also determines its subset dept_name.
• dept_name → dept_building: every row with the same dept_name has the same
dept_building, so dept_name determines dept_building. The reverse does not hold,
because EC and ME share building B2.
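
The three dependencies above can be checked mechanically against the example rows. The following is a minimal sketch, not part of the original notes; the helper name fd_holds is an assumption made for illustration. Note that a table instance can only show that a dependency is consistent with the stored rows, or refute it; whether it holds in general is a design decision.

```python
# Rows copied from the example table above.
COLUMNS = ("roll_no", "name", "dept_name", "dept_building")
ROWS = [
    (42, "abc", "CO", "A4"),
    (43, "pqr", "IT", "A3"),
    (44, "xyz", "CO", "A4"),
    (45, "xyz", "IT", "A3"),
    (46, "mno", "EC", "B2"),
    (47, "jkl", "ME", "B2"),
]


def fd_holds(rows, columns, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    lhs_idx = [columns.index(c) for c in lhs]
    rhs_idx = [columns.index(c) for c in rhs]
    seen = {}  # maps each lhs value to the rhs value first seen with it
    for row in rows:
        key = tuple(row[i] for i in lhs_idx)
        value = tuple(row[i] for i in rhs_idx)
        if seen.setdefault(key, value) != value:
            return False
    return True


print(fd_holds(ROWS, COLUMNS, ["roll_no"], ["name", "dept_name", "dept_building"]))
print(fd_holds(ROWS, COLUMNS, ["dept_name"], ["dept_building"]))
# name does not determine dept_name: "xyz" appears with both CO and IT.
print(fd_holds(ROWS, COLUMNS, ["name"], ["dept_name"]))
```

The first two calls report that the dependency is consistent with the rows, while the last one is refuted by the two students named xyz.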

Types of Functional Dependencies in DBMS

1. Trivial functional dependency:

In a trivial functional dependency, the dependent is always a subset of the
determinant, i.e. if X → Y and Y is a subset of X, then it is called a trivial functional
dependency.
Symbolically: A → B is a trivial functional dependency if B is a subset of A.
Example:

roll_no name age
42 abc 17
43 pqr 18
44 xyz 18

Here, {roll_no, name} → name is a trivial functional dependency, since the dependent
name is a subset of determinant set {roll_no, name}. Similarly, roll_no → roll_no is also an
example of trivial functional dependency.

2. Non-Trivial functional dependency:

In a non-trivial functional dependency, the dependent is not a subset of the
determinant, i.e. if X → Y and Y is not a subset of X, then it is called a
non-trivial functional dependency.


Example:
roll_no name age
42 abc 17
43 pqr 18
44 xyz 18

Here, roll_no → name is a non-trivial functional dependency, since the dependent name
is not a subset of the determinant roll_no. Similarly, {roll_no, name} → age is also a
non-trivial functional dependency, since age is not a subset of {roll_no, name}.
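
The distinction between trivial and non-trivial dependencies is purely a subset test on the attribute names, so it can be decided without looking at any data. A minimal sketch, with the function name is_trivial chosen here for illustration:

```python
def is_trivial(lhs, rhs):
    """X -> Y is trivial exactly when Y is a subset of X."""
    return set(rhs) <= set(lhs)


# The four dependencies used as examples in the text.
print(is_trivial({"roll_no", "name"}, {"name"}))
print(is_trivial({"roll_no"}, {"roll_no"}))
print(is_trivial({"roll_no"}, {"name"}))
print(is_trivial({"roll_no", "name"}, {"age"}))
```

The first two calls correspond to the trivial examples and the last two to the non-trivial ones.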

3. Multivalued functional dependency:

In a multivalued functional dependency, the attributes of the dependent set are not
dependent on each other, i.e. if a → {b, c} and there exists no functional dependency
between b and c, then it is called a multivalued functional dependency.
Example:
bike_model manuf_year color
tu1001 2007 Black
tu1001 2007 Red
tu2012 2008 Black
tu2012 2008 Red
tu2222 2009 Black
In this table:
• X: bike_model
• Y: color
• Z: manuf_year
For each bike model (bike_model):
1. There is a group of colors (color) and a group of manufacturing years
(manuf_year).
2. The colors do not depend on the manufacturing year, and the manufacturing year
does not depend on the colors. They are independent.
3. The sets of color and manuf_year are linked only to bike_model.


4. Transitive functional dependency:


In a transitive functional dependency, the dependent is indirectly dependent on the
determinant, i.e. if a → b and b → c, then according to the axiom of transitivity, a → c. This
is a transitive functional dependency.
Example:
enrol_no name dept building_no
42 abc CO 4
43 pqr EC 2
44 xyz IT 1
45 abc EC 2
Here, enrol_no → dept and dept → building_no. Hence, according to the axiom of
transitivity, enrol_no → building_no is a valid functional dependency. This is an indirect
functional dependency, hence called Transitive functional dependency.
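
Transitive dependencies can be derived mechanically by computing an attribute closure, that is, the set of all attributes reachable from a starting set by repeatedly applying the known dependencies. The sketch below uses only the two dependencies stated in the example; the function name closure is an illustrative choice.

```python
def closure(attrs, fds):
    """Return every attribute derivable from attrs using the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply lhs -> rhs once its whole left side has been reached.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# The two dependencies stated in the example above.
FDS = [({"enrol_no"}, {"dept"}), ({"dept"}, {"building_no"})]

reachable = closure({"enrol_no"}, FDS)
print(sorted(reachable))
print("building_no" in reachable)  # enrol_no -> building_no follows transitively
```

Because building_no is reached only by way of dept, the dependency enrol_no → building_no is transitive rather than direct.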

First Normal Form (1NF)

• A relation is in first normal form if every attribute in that relation is a single-valued
attribute.
• If a relation contains a composite or multi-valued attribute, it violates the first
normal form.

A table is in 1NF if:
• There are only single-valued attributes.
• The attribute domain does not change.
• There is a unique name for every attribute/column.
• The order in which data is stored does not matter.
Example for 1 NF: Consider the below COURSES Relation:


In the above table, Courses is a multi-valued attribute, so the relation is not in 1NF. The
table below is in 1NF, as it has no multi-valued attribute.
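
The conversion to 1NF amounts to emitting one row per value of the multi-valued attribute. The rows in this sketch are hypothetical, since the COURSES table itself is not reproduced in the text; the column names stud_no, name and courses and the helper to_1nf are assumptions made for illustration.

```python
# Hypothetical rows: one student can hold a list of courses (not in 1NF).
unnormalized = [
    {"stud_no": 1, "name": "abc", "courses": ["C1", "C2"]},
    {"stud_no": 2, "name": "pqr", "courses": ["C3"]},
]


def to_1nf(rows, multi_valued_column, new_column):
    """Emit one row per value of the multi-valued column."""
    flat = []
    for row in rows:
        base = {k: v for k, v in row.items() if k != multi_valued_column}
        for value in row[multi_valued_column]:
            flat.append({**base, new_column: value})
    return flat


for row in to_1nf(unnormalized, "courses", "course"):
    print(row)
```

Every cell of the result holds a single value, at the cost of repeating stud_no and name once per course.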

Second Normal Form (2NF)

• Second Normal Form (2NF) is based on the concept of fully functional dependency.
• For a table to be in 2NF, it must first meet the requirements of First Normal Form
(1NF). Additionally, the table should not have partial dependencies, i.e. no
non-prime attribute may depend on only a part of a candidate key.
• 2NF eliminates redundant data by requiring that each non-key attribute be fully
dependent on the whole primary key or candidate key.
• This means that each non-key column should be related to the entire key, and not
to just a part of it.
Example-1: Consider the table below.


• There are many courses having the same course fee. Here, COURSE_FEE cannot
alone decide the value of COURSE_NO or STUD_NO.
• COURSE_FEE together with STUD_NO cannot decide the value of COURSE_NO.
• COURSE_FEE together with COURSE_NO cannot decide the value of STUD_NO.
• The candidate key for this table is {STUD_NO, COURSE_NO} because the
combination of these two columns uniquely identifies each row in the table.
• COURSE_FEE is a non-prime attribute because it is not part of the candidate
key {STUD_NO, COURSE_NO}.
• But, COURSE_NO -> COURSE_FEE, i.e., COURSE_FEE is dependent on COURSE_NO,
which is a proper subset of the candidate key.
• Therefore, non-prime attribute COURSE_FEE is dependent on a proper subset of
the candidate key, which is a partial dependency and so this relation is not in
2NF.

To convert the above relation to 2NF, we need to split the table into two tables such
as:
• Table 1: STUD_NO, COURSE_NO
• Table 2: COURSE_NO, COURSE_FEE.
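
The partial dependency in this example can be found by computing the closure of every proper subset of the candidate key and looking for non-prime attributes in it. This is a minimal sketch under the assumption that the relation has the single candidate key named in the text; the function names are illustrative.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute derivable from attrs using the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def partial_dependencies(all_attrs, candidate_key, fds):
    """List (key_part, attribute) pairs where a non-prime attribute
    depends on a proper subset of the candidate key."""
    key = set(candidate_key)
    non_prime = set(all_attrs) - key
    found = []
    for size in range(1, len(key)):  # proper, non-empty subsets only
        for part in combinations(sorted(key), size):
            for attr in sorted(closure(part, fds) & non_prime):
                found.append((set(part), attr))
    return found


FDS = [({"COURSE_NO"}, {"COURSE_FEE"})]

# Original relation: COURSE_FEE depends on part of the key, so not 2NF.
print(partial_dependencies(
    {"STUD_NO", "COURSE_NO", "COURSE_FEE"}, {"STUD_NO", "COURSE_NO"}, FDS))

# Table 2 after the split: the key is COURSE_NO alone, nothing is partial.
print(partial_dependencies({"COURSE_NO", "COURSE_FEE"}, {"COURSE_NO"}, FDS))
```

The first call reports COURSE_FEE as depending on COURSE_NO alone, while the second returns an empty list for the decomposed table.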


Third Normal Form (3NF)

o A relation is in 3NF if it is in 2NF and does not contain any transitive
dependency of a non-prime attribute on a candidate key.
o 3NF is used to reduce data duplication. It is also used to achieve data
integrity.
o If a relation is in 2NF and has no transitive dependency for non-prime attributes,
then the relation is in third normal form.
A relation is in third normal form if it holds at least one of the following conditions for
every non-trivial functional dependency X → Y.
1. X is a super key.
2. Y is a prime attribute, i.e., each element of Y is part of some candidate key.
Example:
EMPLOYEE_DETAIL table

EMP_ID  EMP_NAME   EMP_ZIP  EMP_STATE  EMP_CITY
222     Harry      201010   UP         Noida
333     Stephan    2228     US         Boston
444     Lan        60007    US         Chicago
555     Katharine  6389     UK         Norwich
666     John       462007   MP         Bhopal

Super keys in the table above:
{EMP_ID}, {EMP_ID, EMP_NAME}, {EMP_ID, EMP_NAME, EMP_ZIP}, and so on
Candidate key: {EMP_ID}
Non-prime attributes:
In the given table, all attributes except EMP_ID are non-prime.
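
Candidate keys and non-prime attributes can be derived from the functional dependencies by brute force: a candidate key is a minimal attribute set whose closure covers the whole relation. The sketch below assumes the dependencies EMP_ID → {EMP_NAME, EMP_ZIP} and EMP_ZIP → {EMP_STATE, EMP_CITY} for this table; the function names are illustrative, and the search is exponential, so it only suits small schemas.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute derivable from attrs using the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(all_attrs, fds):
    """Find all minimal attribute sets whose closure is the whole relation."""
    attrs = sorted(all_attrs)
    keys = []
    for size in range(1, len(attrs) + 1):  # smallest sets first
        for combo in combinations(attrs, size):
            combo = set(combo)
            if any(key <= combo for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(combo, fds) == set(attrs):
                keys.append(combo)
    return keys


ATTRS = {"EMP_ID", "EMP_NAME", "EMP_ZIP", "EMP_STATE", "EMP_CITY"}
FDS = [  # assumed dependencies for the EMPLOYEE_DETAIL table
    ({"EMP_ID"}, {"EMP_NAME", "EMP_ZIP"}),
    ({"EMP_ZIP"}, {"EMP_STATE", "EMP_CITY"}),
]

keys = candidate_keys(ATTRS, FDS)
prime = set().union(*keys)
print(keys)
print(sorted(ATTRS - prime))  # the non-prime attributes
```

Under these assumptions the only candidate key is {EMP_ID}, and every other attribute is non-prime.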
• In the above relation, EMP_STATE and EMP_CITY are dependent on EMP_ZIP, and
EMP_ZIP is dependent on EMP_ID.
• The non-prime attributes (EMP_STATE, EMP_CITY) are therefore transitively dependent on
the super key (EMP_ID), which violates the rule of third normal form.
• That's why we need to move EMP_CITY and EMP_STATE to a new relation,
with EMP_ZIP as its primary key.
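
The two 3NF conditions (X is a super key, or Y consists of prime attributes) can be tested dependency by dependency. This is a minimal sketch that assumes the candidate keys are already known and that EMP_ID → {EMP_NAME, EMP_ZIP} holds in addition to the dependency on EMP_ZIP; the function names are illustrative.

```python
def closure(attrs, fds):
    """Return every attribute derivable from attrs using the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def third_nf_violations(all_attrs, fds, keys):
    """Return the FDs X -> Y where X is not a super key and
    some attribute of Y outside X is non-prime."""
    prime = set().union(*keys)
    violations = []
    for lhs, rhs in fds:
        extra = set(rhs) - set(lhs)  # non-trivial part of the dependency
        if not extra:
            continue
        is_super_key = set(all_attrs) <= closure(lhs, fds)
        if not is_super_key and not extra <= prime:
            violations.append((set(lhs), set(rhs)))
    return violations


FD_ID = ({"EMP_ID"}, {"EMP_NAME", "EMP_ZIP"})       # assumed
FD_ZIP = ({"EMP_ZIP"}, {"EMP_STATE", "EMP_CITY"})   # stated in the text

# Original table: the dependency on EMP_ZIP breaks 3NF.
print(third_nf_violations(
    {"EMP_ID", "EMP_NAME", "EMP_ZIP", "EMP_STATE", "EMP_CITY"},
    [FD_ID, FD_ZIP], [{"EMP_ID"}]))

# After moving EMP_STATE and EMP_CITY out, both relations are clean.
print(third_nf_violations({"EMP_ID", "EMP_NAME", "EMP_ZIP"}, [FD_ID], [{"EMP_ID"}]))
print(third_nf_violations({"EMP_ZIP", "EMP_STATE", "EMP_CITY"}, [FD_ZIP], [{"EMP_ZIP"}]))
```

Only the first call reports a violation, which matches the decomposition described above.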

Boyce Codd normal form (BCNF)


o BCNF is the advanced version of 3NF. It is stricter than 3NF.
o A table is in BCNF if, for every non-trivial functional dependency X → Y, X is a super key of the
table.
o For BCNF, the table should be in 3NF, and for every FD, the LHS is a super key.
Example:
EMPLOYEE table:
EMP_ID  EMP_COUNTRY  EMP_DEPT    DEPT_TYPE  EMP_DEPT_NO
264     India        Designing   D394       283
264     India        Testing     D394       300
364     UK           Stores      D283       232
364     UK           Developing  D283       549

In the above table, the functional dependencies are as follows:
1. EMP_ID → EMP_COUNTRY
2. EMP_DEPT → {DEPT_TYPE, EMP_DEPT_NO}
Candidate key: {EMP_ID, EMP_DEPT}
• The table is not in BCNF because neither EMP_DEPT nor EMP_ID alone is a super key.
• To convert the given table into BCNF, we decompose it into three tables:

EMP_COUNTRY table: EMP_ID, EMP_COUNTRY
EMP_DEPT table: EMP_DEPT, DEPT_TYPE, EMP_DEPT_NO
EMP_DEPT_MAPPING table: EMP_ID, EMP_DEPT
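
The BCNF condition can be checked by testing, for every non-trivial dependency, whether its left-hand side is a super key. The sketch below uses the two dependencies listed for the EMPLOYEE table. The attribute lists of the three decomposed tables are an assumption derived from those dependencies (each dependency gets its own table, and the candidate key forms the mapping table); the function names are illustrative.

```python
def closure(attrs, fds):
    """Return every attribute derivable from attrs using the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(all_attrs, fds):
    """Return the non-trivial FDs whose left-hand side is not a super key."""
    bad = []
    for lhs, rhs in fds:
        if set(rhs) <= set(lhs):
            continue  # trivial dependencies never violate BCNF
        if not set(all_attrs) <= closure(lhs, fds):
            bad.append((set(lhs), set(rhs)))
    return bad


FD1 = ({"EMP_ID"}, {"EMP_COUNTRY"})
FD2 = ({"EMP_DEPT"}, {"DEPT_TYPE", "EMP_DEPT_NO"})

# Original EMPLOYEE table: both dependencies violate BCNF.
employee = {"EMP_ID", "EMP_COUNTRY", "EMP_DEPT", "DEPT_TYPE", "EMP_DEPT_NO"}
print(bcnf_violations(employee, [FD1, FD2]))

# Assumed schemas of the three decomposed tables.
print(bcnf_violations({"EMP_ID", "EMP_COUNTRY"}, [FD1]))
print(bcnf_violations({"EMP_DEPT", "DEPT_TYPE", "EMP_DEPT_NO"}, [FD2]))
print(bcnf_violations({"EMP_ID", "EMP_DEPT"}, []))
```

Each decomposed table carries at most one dependency, and its left-hand side is the key of that table, so no violations remain.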

Multivalued Dependency (MVD) in DBMS


• Multivalued dependency occurs when two attributes in a table are independent of
each other but both depend on a third attribute.
• A multivalued dependency consists of at least two attributes that are dependent
on a third attribute, which is why it always requires at least three attributes.
• Multivalued dependencies are a consequence of 1NF, which does not allow an
attribute in a tuple to have a set of values.
• In a relation, the functional dependency A → B relates a value of A to a single value of B,
while the multivalued dependency A →→ B relates a value of A to a set of values of B
that is determined by A alone.


Example: Suppose there is a bike manufacturing company which produces two
colors (white and black) of each model every year.

BIKE_MODEL  MANUF_YEAR  COLOR
M2011       2008        White
M2001       2008        Black
M3001       2013        White
M3001       2013        Black
M4006       2017        White
M4006       2017        Black
In the above table, COLOR and MANUF_YEAR are dependent on BIKE_MODEL and
independent of each other. In this case, these two columns are said to be multivalued
dependent on BIKE_MODEL. The representation of these dependencies is shown below:
1. BIKE_MODEL →→ MANUF_YEAR
2. BIKE_MODEL →→ COLOR
A multivalued dependency can be defined as follows:
Multivalued dependencies occur when two or more independent multivalued facts
about an attribute are stored in the same table.

In other words, consider a relation R(X, Y, Z) in which the attribute sets X, Y and Z are
pairwise disjoint:
X = {x1, x2, ..., xn}
Y = {y1, y2, ..., yn}
Z = {z1, z2, ..., zn}
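
With X, Y and Z as above, X →→ Y holds in a table instance when, for each X value, the stored (Y, Z) combinations are exactly the cross product of the Y values and the Z values seen with that X. The following minimal sketch tests this on the bike table; the function name mvd_holds and the second, two-row table are hypothetical additions for illustration.

```python
from itertools import product


def mvd_holds(rows, columns, lhs, rhs):
    """Check X ->> Y on a table instance; Z is every remaining column."""
    x_idx = [columns.index(c) for c in lhs]
    y_idx = [columns.index(c) for c in rhs]
    z_idx = [i for i in range(len(columns)) if i not in x_idx and i not in y_idx]
    groups = {}  # X value -> (Y values, Z values, stored (Y, Z) pairs)
    for row in rows:
        x = tuple(row[i] for i in x_idx)
        y = tuple(row[i] for i in y_idx)
        z = tuple(row[i] for i in z_idx)
        ys, zs, pairs = groups.setdefault(x, (set(), set(), set()))
        ys.add(y)
        zs.add(z)
        pairs.add((y, z))
    # Every combination of a seen Y and a seen Z must be stored.
    return all(pairs == set(product(ys, zs)) for ys, zs, pairs in groups.values())


COLUMNS = ("BIKE_MODEL", "MANUF_YEAR", "COLOR")
BIKES = [
    ("M2011", 2008, "White"), ("M2001", 2008, "Black"),
    ("M3001", 2013, "White"), ("M3001", 2013, "Black"),
    ("M4006", 2017, "White"), ("M4006", 2017, "Black"),
]
print(mvd_holds(BIKES, COLUMNS, ["BIKE_MODEL"], ["COLOR"]))
print(mvd_holds(BIKES, COLUMNS, ["BIKE_MODEL"], ["MANUF_YEAR"]))

# Hypothetical counter-example: year and color are tied together here,
# so (2008, Black) and (2009, White) are missing and the MVD fails.
TIED = [("M1", 2008, "White"), ("M1", 2009, "Black")]
print(mvd_holds(TIED, COLUMNS, ["BIKE_MODEL"], ["COLOR"]))
```

Both dependencies hold on the bike table, while the counter-example fails because color and year are not independent there.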


Fourth normal form (4NF)


o A relation is in 4NF if it is in Boyce-Codd normal form and has no non-trivial multi-valued
dependency other than those whose determinant is a super key.
o For a dependency A →→ B, if a single value of A is associated with multiple values of B,
independently of the other attributes, then the dependency is a multi-valued dependency.
Example :
STU_ID COURSE HOBBY
21 Computer Dancing
21 Math Singing
34 Chemistry Dancing
74 Biology Cricket
59 Physics Hockey

The given STUDENT table is in 3NF, but COURSE and HOBBY are two independent
entities. Hence, there is no relationship between COURSE and HOBBY.
In the STUDENT relation, the student with STU_ID 21 has two
courses, Computer and Math, and two hobbies, Dancing and Singing. So, there is a
multi-valued dependency on STU_ID, which leads to unnecessary repetition of data. So,
to bring the above table into 4NF, we can decompose it into two tables:
STUDENT_COURSE:
STU_ID COURSE
21 Computer
21 Math
34 Chemistry
74 Biology
59 Physics


STUDENT_HOBBY:
STU_ID HOBBY
21 Dancing
21 Singing
34 Dancing
74 Cricket
59 Hockey
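
The two tables above are simply projections of STUDENT onto (STU_ID, COURSE) and (STU_ID, HOBBY) with duplicate rows removed. A minimal sketch of that projection, with the helper name project chosen for illustration:

```python
# Rows of the STUDENT table: (STU_ID, COURSE, HOBBY).
STUDENT = [
    (21, "Computer", "Dancing"),
    (21, "Math", "Singing"),
    (34, "Chemistry", "Dancing"),
    (74, "Biology", "Cricket"),
    (59, "Physics", "Hockey"),
]


def project(rows, indexes):
    """Keep only the given column positions, dropping duplicate rows
    while preserving the original row order."""
    result = []
    for row in rows:
        projected = tuple(row[i] for i in indexes)
        if projected not in result:
            result.append(projected)
    return result


student_course = project(STUDENT, (0, 1))
student_hobby = project(STUDENT, (0, 2))
print(student_course)
print(student_hobby)
```

After the split, courses and hobbies are recorded independently, so adding a hobby for a student no longer requires repeating any course row.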

Join Dependency
o Join dependency is a further generalization of multivalued dependency.
o If the join of R1 and R2 over C is equal to relation R, then we can say that a join
dependency (JD) exists.
o Here R1 and R2 are the decompositions R1(A, B, C) and R2(C, D) of a given
relation R(A, B, C, D).
o Equivalently, R1 and R2 are a lossless decomposition of R.
o A JD ⋈ {R1, R2, ..., Rn} is said to hold over a relation R if R1, R2, ..., Rn is a lossless-join
decomposition of R.
Types of Join Dependency
There are two types of Join Dependencies:
• Lossless Join Dependency: It means that whenever the join occurs between the
tables, then no information should be lost, the new table must have all the
content in the original table.
Example:
Student
Roll No. S_name S_dept
1 Raju CSE
2 Raju Quantum Computing


StudentDetails
Roll No. S_name
1 Raju
2 Raju

Dept
Roll No. S_dept
1 CSE
2 Quantum Computing

StudentDetails ⨝ Dept = Student


Student
Roll No. S_name S_dept
1 Raju CSE
2 Raju Quantum Computing
In the above result we got the original table back after performing the join. Therefore, this type of
dependency is called a lossless join dependency.
• Lossy Join Dependency: In this type of join dependency, joining the decomposed
tables does not reproduce the original table exactly. The result may contain extra
(spurious) tuples that were not in the original table, so the original information
can no longer be recovered.
Table: Company_Stats
Company Product Agent
C1 TV Aman
C1 AC Aman
C2 Refrigerator Mohan
C2 TV Mohit

Table: R1
Company Product
C1 TV
C1 AC
C2 Refrigerator
C2 TV

Table: R2
Product Agent
TV Aman
AC Aman
Refrigerator Mohan
TV Mohit

On performing the join operation between R1 and R2:
R1 ⨝ R2
Company Product Agent
C1 TV Aman
C1 TV Mohit
C1 AC Aman
C2 Refrigerator Mohan
C2 TV Aman
C2 TV Mohit

In the above result we got two additional tuples after performing the join, i.e. (C1, TV, Mohit)
and (C2, TV, Aman). These tuples are known as spurious tuples, which violate the property of
join dependency. Therefore, this type of dependency is called a lossy join dependency.
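
Both examples can be reproduced by projecting the original rows, joining the projections on the shared column, and comparing the result with the original. This is a minimal sketch in which the helper name join and the tuple layout are assumptions made for illustration.

```python
def join(left, right, left_col, right_col):
    """Pair rows whose shared column matches, keeping that column once."""
    out = set()
    for l_row in left:
        for r_row in right:
            if l_row[left_col] == r_row[right_col]:
                rest = tuple(v for i, v in enumerate(r_row) if i != right_col)
                out.add(l_row + rest)
    return out


# Lossless example: Student split on Roll No.
STUDENT = {(1, "Raju", "CSE"), (2, "Raju", "Quantum Computing")}
student_details = {(roll, name) for roll, name, dept in STUDENT}
dept_table = {(roll, dept) for roll, name, dept in STUDENT}
print(join(student_details, dept_table, 0, 0) == STUDENT)

# Lossy example: Company_Stats split on Product.
COMPANY_STATS = {
    ("C1", "TV", "Aman"), ("C1", "AC", "Aman"),
    ("C2", "Refrigerator", "Mohan"), ("C2", "TV", "Mohit"),
}
r1 = {(company, prod) for company, prod, agent in COMPANY_STATS}
r2 = {(prod, agent) for company, prod, agent in COMPANY_STATS}
joined = join(r1, r2, 1, 0)
spurious = joined - COMPANY_STATS
print(len(joined))
print(sorted(spurious))
```

The join never drops an original row; a lossy decomposition shows up only as extra rows, here the two TV rows that pair each company with the other company's agent.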

Fifth normal form (5NF)


• A relation R is in Fifth Normal Form if and only if every join dependency in R
is implied by the candidate keys of R.
• A relation decomposed into two relations must have the lossless property, which
ensures that no spurious or extra tuples are generated when the relations are reunited
through a natural join.
Properties
A relation R is in 5NF if and only if it satisfies the following conditions:
• R is already in 4NF.
• It cannot be decomposed any further into smaller relations without losing information.
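
For a decomposition into two relations, the lossless property can be tested from the functional dependencies alone: the split is lossless if the attributes shared by R1 and R2 functionally determine all of R1 or all of R2. The sketch below applies this standard test. The dependency Roll No. → {S_name, S_dept} and the absence of any dependency in Company_Stats are assumptions made for illustration, as are the function and attribute names.

```python
def closure(attrs, fds):
    """Return every attribute derivable from attrs using the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_lossless_binary(r1, r2, fds):
    """Binary lossless-join test: the common attributes must
    determine every attribute of r1 or every attribute of r2."""
    reach = closure(set(r1) & set(r2), fds)
    return set(r1) <= reach or set(r2) <= reach


# Student split on roll_no, assuming roll_no -> {s_name, s_dept}.
student_fds = [({"roll_no"}, {"s_name", "s_dept"})]
print(is_lossless_binary({"roll_no", "s_name"}, {"roll_no", "s_dept"}, student_fds))

# Company_Stats split on product, assuming no dependency on product.
print(is_lossless_binary({"company", "product"}, {"product", "agent"}, []))
```

The first split passes because roll_no is a key of both parts, while the second fails because product alone determines neither the company nor the agent.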
