NORMALIZATION PART 2!!
ABOUT SYSTEM
ABC is an online bus reservation system. There are 3 important entities: 1) bus 2) passenger 3) route.
Any user can log in with a username and email address and check the schedule of buses.
The route table has the details of schedule of every bus.
The other attributes are departure date, departure time, bus number, capacity, seat number, status,
fare, route name, source, destination, and distance.
The status attribute indicates whether a seat is available.
Once the user finds an available seat, they go ahead with booking.
A user can book tickets for many passengers.
Booking generates a ticket, which has attributes such as ticket number and mode of payment.
The mode of payment can be either cash or credit card.
NORMALIZATION
Normalization is a process in which a given set of relations is replaced by successive collections of
relations that have a simpler and more regular structure.
It transforms data from a problem into relations while ensuring data integrity and eliminating or
reducing data redundancy.
The four most commonly used normal forms are first normal form (1NF), second normal form (2NF), third
normal form (3NF), and Boyce-Codd normal form (BCNF).
OBJECTIVES
To make it feasible to represent any relation in the database.
To free relations from undesirable insertion, update, and deletion anomalies.
BEFORE UNDERSTANDING THE NORMAL FORMS, WE NEED TO UNDERSTAND SOME TERMS:
TYPES OF FUNCTIONAL DEPENDENCIES
1) Full dependency
In a relation, the attribute(s) B is fully functionally dependent on A if B is functionally
dependent on A, but not on any proper subsets of A.
2) Partial dependency
In a relation, B is partially dependent on A if some attribute can be removed from A and the
dependency still holds.
Ex. P_id, p_name -> userid (userid depends on P_id alone, so p_name can be removed)
3) Transitive dependency
In a relation, if attribute(s) A -> B and B -> C, then C is transitively dependent on A via B
(provided that A is not functionally dependent on B or C).
Ex. Bus_no -> Route_no and Route_no -> Route_name
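The dependency types above can be checked against sample data. The following is a minimal sketch, not part of the original notes: the helper `holds` and the sample rows (including bus 12) are hypothetical, chosen only to illustrate the Bus_no -> Route_no -> Route_name chain.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # The same left-hand value must always map to the same right-hand value.
        if seen.setdefault(key, val) != val:
            return False
    return True


# Hypothetical sample rows, loosely based on the bus and route attributes.
rows = [
    {"bus_no": 10, "route_no": 2000, "route_name": "Delhi-Jaipur"},
    {"bus_no": 11, "route_no": 2001, "route_name": "Pune-Mumbai"},
    {"bus_no": 12, "route_no": 2000, "route_name": "Delhi-Jaipur"},
]

bus_to_route = holds(rows, ["bus_no"], ["route_no"])       # A -> B
route_to_name = holds(rows, ["route_no"], ["route_name"])  # B -> C
route_to_bus = holds(rows, ["route_no"], ["bus_no"])       # B -> A (should fail)

# Route_name is transitively dependent on Bus_no when A -> B and B -> C hold
# but B -> A does not.
is_transitive = bus_to_route and route_to_name and not route_to_bus
```

Note that a check like this can only show that a dependency is not contradicted by the sample rows; whether it holds in general is a design decision about the data.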
UNNORMALIZED FORM
A table that contains one or more repeating groups.
To create an unnormalized table –
1) Transform the data from the information source (e.g., a form) into table format with columns and rows
ABC TRAVELS
bookingDate userid email ticketNo D_date D_time P_id P_name P_address DOB
GENDER phno busNo busName Capacity type routeno routeName source destination Fare modeofpay
1NF
Table cells must hold single (atomic) values.
Eliminate repeating groups in individual tables.
Create a separate table for each set of related data.
Identify each set of related data with a primary key.
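As an illustration of eliminating a repeating group, the sketch below flattens a record that holds several phone numbers into rows with one atomic value each. The function `to_1nf` and the sample records are hypothetical and only loosely follow the passenger data in these notes.

```python
# Hypothetical unnormalized records: the "phones" field is a repeating group.
unnormalized = [
    {"pid": 1, "pname": "Ram Sharma", "phones": ["111111111"]},
    {"pid": 2, "pname": "Siya Varma", "phones": ["222222222", "333333333"]},
]


def to_1nf(records, group_field, atomic_field):
    """Emit one row per value of the repeating group, so every cell is atomic."""
    rows = []
    for rec in records:
        # Copy every attribute except the repeating group itself.
        base = {k: v for k, v in rec.items() if k != group_field}
        for value in rec[group_field]:
            rows.append({**base, atomic_field: value})
    return rows


flat = to_1nf(unnormalized, "phones", "phno")
```

After flattening, the passenger with two phone numbers appears in two rows, which is the redundancy that the later normal forms remove.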
User
userid username email
111 Savita Marwal [email protected]
222 Himansh Mansinghani [email protected]
Passenger
Pid phno pname paddress DOB gender
1 111111111 Ram Sharma pune 6-1-1990 Male
2 222222222 Siya Varma Nasik 3-5-1997 Female
3 333333333 Siya Varma Nasik 3-5-1997 Female
BusRoute
routeno busno routename source destination distance fare Dtime Ddate busName capacity type
2000 10 Delhi-jaipur delhi jaipur 2000km 2000 11:00am 3-4-2023 AA 20 A/c
2001 11 Pune-mumbai pune mumbai 200km 500 12:00pm 4-4-2023 BB 25 Non a/c
Reservation
seatno busno status bookingdate ticketno Mode of payment
1110 10 booked 1-4-2023 1122 Cash
1111 11 booked 2-4-2023 1121 credit
2NF
A table is in 2NF if it is in 1NF and every non-key attribute depends on the whole key, i.e., there is
no partial dependency.
Passenger
pid pname paddress DOB gender userid
101 Ram Sharma Pune 6-1-1990 Male 111
102 Siya Varma Nasik 3-5-1997 female 112
Contact table (phid (PK) -> pid, phno)
phid pid Phno
1 101 11111111
2 102 22222222
3 102 33333333
The remaining tables stay the same.
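The split into Passenger and Contact can be sketched as two projections of the 1NF rows. This is an illustrative sketch only: it assumes the 1NF table has the composite key (pid, phno), the helper `project` is hypothetical, and the rows are a shortened version of the sample data.

```python
def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate tuples but keeping order."""
    seen, out = set(), []
    for row in rows:
        values = tuple(row[a] for a in attrs)
        if values not in seen:
            seen.add(values)
            out.append(dict(zip(attrs, values)))
    return out


# Assumed 1NF rows with composite key (pid, phno).
passenger_1nf = [
    {"pid": 101, "pname": "Ram Sharma", "paddress": "Pune", "phno": "11111111"},
    {"pid": 102, "pname": "Siya Varma", "paddress": "Nasik", "phno": "22222222"},
    {"pid": 102, "pname": "Siya Varma", "paddress": "Nasik", "phno": "33333333"},
]

# pname and paddress depend on pid alone (a partial dependency), so they move
# to a table keyed by pid. Phone numbers keep a reference to pid.
passenger = project(passenger_1nf, ["pid", "pname", "paddress"])
contact = [
    {"phid": i, **row}
    for i, row in enumerate(project(passenger_1nf, ["pid", "phno"]), start=1)
]
```

Each passenger is now stored once, and a passenger with several phone numbers has several Contact rows.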
3NF
A table is in 3NF if it is in 2NF and has no transitive dependencies.
Bus (busno -> bname, capacity, type, routeno)
busno bname capacity Type Routeno
10 AA 20 a/c 2000
11 BB 25 Non a/c 2001
Route (routeno -> routename, source, destination, Ddate, Dtime, distance, fare)
From Reservation: seatno -> ticketno and ticketno -> modeofpayment form a transitive dependency, so Reservation is split into Booking and Ticket.
Booking (seatno -> pid, busno, status, ticketno)
Ticket (ticketno -> bookingdate, mode of payment)
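One way to see the benefit of removing the transitive dependency busno -> routeno -> routename is to store the route name once and recover it with a join. The sketch below uses Python's built-in sqlite3 module with a reduced set of columns. The renamed route 'Delhi-Jaipur Express' is a made-up value used only to demonstrate a single-row update.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Route (
    routeno   INTEGER PRIMARY KEY,
    routename TEXT NOT NULL
);
CREATE TABLE Bus (
    busno   INTEGER PRIMARY KEY,
    bname   TEXT NOT NULL,
    routeno INTEGER NOT NULL REFERENCES Route(routeno)
);
INSERT INTO Route VALUES (2000, 'Delhi-Jaipur'), (2001, 'Pune-Mumbai');
INSERT INTO Bus VALUES (10, 'AA', 2000), (11, 'BB', 2001);
""")

# The route name lives in exactly one row, so renaming a route is one update
# regardless of how many buses run on it (no update anomaly).
conn.execute(
    "UPDATE Route SET routename = 'Delhi-Jaipur Express' WHERE routeno = 2000"
)

# A join reconstructs the combined bus and route view on demand.
joined = conn.execute(
    "SELECT b.busno, r.routename "
    "FROM Bus b JOIN Route r ON r.routeno = b.routeno "
    "ORDER BY b.busno"
).fetchall()
```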
BCNF
A table is in BCNF if it is in 3NF and every determinant is a candidate key.
BCNF is a stronger form of 3NF:
BCNF => 3NF
but 3NF does not imply BCNF
Booking (seatno -> pid, busno, status)
seatno pid busno Status
1001 101 10 Booked
1008 102 11 booked
Here, attributes other than seatno can also act as candidate keys in this sample data:
e.g., pid can act as a primary key alone,
and busno can also act as a primary key.
status is not unique (i.e., booked or available), so we use (seatno, status) as a candidate key.
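The BCNF condition can be tested mechanically with attribute closures. The sketch below is an assumption-laden illustration: the functions `closure` and `bcnf_violations` are hypothetical helpers, and they only examine the dependencies that are explicitly listed. A listed dependency violates BCNF when its left-hand side does not determine every attribute of the table.

```python
def closure(attrs, fds):
    """Return all attributes determined by attrs under the dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(all_attrs, fds):
    """List the non-trivial dependencies whose determinant is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and not set(all_attrs) <= closure(lhs, fds)
    ]


# Booking after the split: seatno determines every other attribute.
booking_attrs = ("seatno", "pid", "busno", "status")
booking_fds = [(("seatno",), ("pid", "busno", "status"))]

# Reservation before the split: ticketno is a determinant but not a key.
reservation_attrs = (
    "seatno", "busno", "status", "ticketno", "bookingdate", "modeofpayment",
)
reservation_fds = [
    (("seatno",), ("busno", "status", "ticketno")),
    (("ticketno",), ("bookingdate", "modeofpayment")),
]

booking_bad = bcnf_violations(booking_attrs, booking_fds)
reservation_bad = bcnf_violations(reservation_attrs, reservation_fds)
```

Under these assumed dependencies, Booking has no violating determinant, while the unsplit Reservation table is flagged because of ticketno.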
FULLY NORMALIZED DATA
User
Passenger
Contact
Bus
Route
Booking
Ticket
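For reference, the seven tables could be declared roughly as follows. This is a sketch under several assumptions, not the schema of the ABC system: the column types, the NOT NULL and CHECK constraints, and the ticketno reference in Booking (taken from the 3NF version of Booking) are all illustrative choices. It uses Python's built-in sqlite3 module.

```python
import sqlite3

SCHEMA = """
CREATE TABLE User (
    userid   INTEGER PRIMARY KEY,
    username TEXT NOT NULL,
    email    TEXT NOT NULL
);
CREATE TABLE Passenger (
    pid      INTEGER PRIMARY KEY,
    pname    TEXT NOT NULL,
    paddress TEXT,
    DOB      TEXT,
    gender   TEXT,
    userid   INTEGER NOT NULL REFERENCES User(userid)
);
CREATE TABLE Contact (
    phid INTEGER PRIMARY KEY,
    pid  INTEGER NOT NULL REFERENCES Passenger(pid),
    phno TEXT NOT NULL
);
CREATE TABLE Route (
    routeno     INTEGER PRIMARY KEY,
    routename   TEXT NOT NULL,
    source      TEXT NOT NULL,
    destination TEXT NOT NULL,
    Ddate       TEXT,
    Dtime       TEXT,
    distance    TEXT,
    fare        INTEGER
);
CREATE TABLE Bus (
    busno    INTEGER PRIMARY KEY,
    bname    TEXT NOT NULL,
    capacity INTEGER,
    type     TEXT,
    routeno  INTEGER NOT NULL REFERENCES Route(routeno)
);
CREATE TABLE Ticket (
    ticketno      INTEGER PRIMARY KEY,
    bookingdate   TEXT,
    modeofpayment TEXT CHECK (modeofpayment IN ('cash', 'credit card'))
);
CREATE TABLE Booking (
    seatno   INTEGER PRIMARY KEY,
    pid      INTEGER REFERENCES Passenger(pid),
    busno    INTEGER NOT NULL REFERENCES Bus(busno),
    status   TEXT NOT NULL,
    ticketno INTEGER REFERENCES Ticket(ticketno)
);
"""

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(SCHEMA)

tables = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
```

Referenced tables are created before the tables that point at them, which keeps the script readable in dependency order.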