
Week1 (5)

Normalization is a process in relational database design aimed at organizing data to eliminate redundancy and ensure data integrity. It involves structuring databases into smaller tables based on normal forms (1NF, 2NF, 3NF, BCNF, 4NF, 5NF) to improve maintenance, query performance, and data consistency. The process helps avoid anomalies during updates, inserts, and deletes, making databases easier to manage.

Uploaded by

haederredha

What is Normalization?

Normalization is a systematic process used in relational database design to organize data
efficiently. It involves structuring a database in such a way that it eliminates redundancy
(duplicate data) and ensures data integrity while making the database easier to maintain
and query. Normalization is achieved by dividing a large table into smaller tables and
establishing relationships between them using keys.

The process of normalization is guided by a set of normal forms (NF), which are rules
that ensure the database adheres to specific standards.

Steps of Normalization (Normal Forms):

1. First Normal Form (1NF): Ensures that each column contains atomic (indivisible)
values and eliminates repeating groups.

2. Second Normal Form (2NF): Removes partial dependencies, meaning that every
non-key attribute must be fully functionally dependent on the primary key.

3. Third Normal Form (3NF): Eliminates transitive dependencies, ensuring that non-
key attributes are not dependent on other non-key attributes.

4. Boyce-Codd Normal Form (BCNF): A stricter version of 3NF, ensuring that the
determinant of every non-trivial functional dependency is a superkey (a candidate
key or a superset of one).

5. Fourth Normal Form (4NF): Removes multi-valued dependencies.

6. Fifth Normal Form (5NF): Resolves join dependencies, ensuring that every
non-trivial join dependency is implied by the candidate keys, so that no redundancy
remains that a further lossless decomposition could remove.

Benefits of Normalization

1. Reduces Redundancy:

o By splitting large tables into smaller ones, normalization minimizes
duplicate data, reducing storage costs and making updates easier.

2. Improves Data Integrity:

o Ensures the database remains consistent. For example, because each fact is
stored in one place, a single update is seen by every table that references it,
reducing errors.

3. Avoids Update, Insert, and Delete Anomalies:


o Update Anomaly: Avoids inconsistencies during updates (e.g., changing a
value in one place but forgetting to change it elsewhere).

o Insert Anomaly: Allows data to be inserted without unnecessary
restrictions (e.g., recording a new course before any student is enrolled in it).

o Delete Anomaly: Prevents unintended loss of data when deleting records.

4. Easier Maintenance:

o Smaller, well-structured tables are easier to modify and maintain,
improving the database’s flexibility and scalability.

5. Faster Query Performance:

o Although normalization can increase the number of joins, it keeps tables
compact and makes queries more predictable and easier to optimize.

6. Efficient Data Relationships:

o Ensures a clear and logical relationship between tables, making the
database easier to understand and work with.

7. Facilitates Data Consistency:

o By enforcing relationships through foreign keys, normalized databases
ensure that related data remains consistent.

8. Better Data Organization:

o Data is organized systematically, which simplifies database design and
helps future modifications or expansions.
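
The update anomaly from item 3 can be illustrated with a few lines of Python. This is a minimal sketch, not part of the original notes: the row layout and the variable names are assumptions chosen for the example.

```python
# Denormalized rows: the student's name is repeated on every enrollment row.
denormalized = [
    {"student_id": 1, "student_name": "Alice", "course": "Math"},
    {"student_id": 1, "student_name": "Alice", "course": "Physics"},
    {"student_id": 2, "student_name": "Bob", "course": "Chemistry"},
]

# Update anomaly: a careless update changes only the first matching row.
for row in denormalized:
    if row["student_id"] == 1:
        row["student_name"] = "Alicia"
        break

# Two different names are now recorded for the same student.
names_for_student_1 = {
    r["student_name"] for r in denormalized if r["student_id"] == 1
}

# Normalized layout: the name is stored exactly once, keyed by student_id.
students = {1: "Alice", 2: "Bob"}
enrollments = [(1, "Math"), (1, "Physics"), (2, "Chemistry")]

students[1] = "Alicia"  # one update, nothing else to keep in sync
normalized_names = {students[sid] for sid, _ in enrollments if sid == 1}

print(sorted(names_for_student_1))
print(sorted(normalized_names))
```

In the denormalized list the same student ends up with two names, while the normalized layout cannot get into that state because there is only one place to change.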

Example:

Without Normalization:

StudentID  StudentName  Course1    Course2  Course3
1          Alice        Math       Physics  NULL
2          Bob          Chemistry  Biology  Math

• Issues:

o Repeating columns (Course1, Course2, Course3) make the table hard to
scale: a fourth course would require a new column.

o NULL values fill the unused columns and can waste storage space.

With Normalization (1NF):

StudentID  StudentName  Course
1          Alice        Math
1          Alice        Physics
2          Bob          Chemistry
2          Bob          Biology
2          Bob          Math

• Benefits:

o The table is simpler, avoids repeating groups, and scales easily.
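
The conversion above can be sketched in Python. This is an illustrative sketch rather than part of the original notes; the function name `unpivot_courses` and the dictionary layout are assumptions made for the example.

```python
wide_rows = [
    {"StudentID": 1, "StudentName": "Alice",
     "Course1": "Math", "Course2": "Physics", "Course3": None},
    {"StudentID": 2, "StudentName": "Bob",
     "Course1": "Chemistry", "Course2": "Biology", "Course3": "Math"},
]


def unpivot_courses(rows):
    """Turn Course1..CourseN columns into one row per (student, course)."""
    long_rows = []
    for row in rows:
        course_columns = sorted(k for k in row if k.startswith("Course"))
        for column in course_columns:
            course = row[column]
            if course is not None:  # empty slots simply produce no row
                long_rows.append((row["StudentID"], row["StudentName"], course))
    return long_rows


long_rows = unpivot_courses(wide_rows)
for r in long_rows:
    print(r)
```

A student with a fourth course now just needs one more row, with no change to the table's columns.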

First Normal Form (1NF)

The First Normal Form (1NF) is the most basic level of database normalization. It ensures
that every column holds a single atomic value in each row, which is the starting point for
removing redundancy and anomalies in the higher normal forms.

Rules for a Table to Be in 1NF:

1. Atomic Values:

o All attributes (columns) must contain atomic (indivisible) values.

o No repeating groups or arrays are allowed.

2. Unique Column Names:

o Every column in the table should have a unique name for clarity.

3. Uniqueness of Rows:

o Each row (record) in the table must be unique and identifiable using a
primary key.

Example of 1NF

Step 1: A Table That Is Not in 1NF

Consider the following table:

StudentID  StudentName  Courses
1          Alice        Math, Physics
2          Bob          Chemistry
3          Carol        Math, Biology

• Violations of 1NF:

o The Courses column contains non-atomic values (multiple courses
separated by commas).

Step 2: Converting to 1NF

To bring the table into 1NF, split the non-atomic values into separate rows:

StudentID  StudentName  Course
1          Alice        Math
1          Alice        Physics
2          Bob          Chemistry
3          Carol        Math
3          Carol        Biology

• How It Satisfies 1NF:

o Each cell contains only atomic values.

o There are no repeating groups or arrays.

o Each row is unique.
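
Step 2 can be expressed as a short Python function. This is a sketch under the assumption that courses are separated by commas exactly as in the table above; the name `to_1nf` is hypothetical.

```python
rows = [
    (1, "Alice", "Math, Physics"),
    (2, "Bob", "Chemistry"),
    (3, "Carol", "Math, Biology"),
]


def to_1nf(rows):
    """Emit one row per (student, course) pair from a comma-separated column."""
    result = []
    for student_id, name, courses in rows:
        for course in courses.split(","):
            result.append((student_id, name, course.strip()))
    return result


atomic_rows = to_1nf(rows)

# Every row is distinct, so (StudentID, Course) can serve as the key.
all_rows_unique = len(set(atomic_rows)) == len(atomic_rows)

for r in atomic_rows:
    print(r)
```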

Advantages of 1NF

• Eliminates repeating groups, making data easier to query and maintain.

• Provides a solid foundation for higher normal forms (2NF, 3NF, etc.).

Real-Life Scenario

Imagine you are storing a list of employees and their skills in a table. Without 1NF, you
might use a single row for an employee and list all their skills in one column. For
example:

EmployeeID  EmployeeName  Skills
101         John          Python, SQL
102         Sarah         Java, JavaScript

To convert it to 1NF:

1. Split the Skills column into atomic values.

2. Create a separate row for each skill:

EmployeeID  EmployeeName  Skill
101         John          Python
101         John          SQL
102         Sarah         Java
102         Sarah         JavaScript
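
One practical payoff of the 1NF layout is that a skill can be looked up with an exact match instead of a substring search over a comma-separated list (where, in general, searching for "Java" could also hit "JavaScript"). The sketch below uses Python's built-in `sqlite3` module; the table name `EmployeeSkill` and the helper `employees_with` are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EmployeeSkill ("
    " EmployeeID INTEGER, EmployeeName TEXT, Skill TEXT,"
    " PRIMARY KEY (EmployeeID, Skill))"
)
conn.executemany(
    "INSERT INTO EmployeeSkill VALUES (?, ?, ?)",
    [
        (101, "John", "Python"),
        (101, "John", "SQL"),
        (102, "Sarah", "Java"),
        (102, "Sarah", "JavaScript"),
    ],
)


def employees_with(skill):
    """Return the names of employees who have exactly this skill."""
    cur = conn.execute(
        "SELECT EmployeeName FROM EmployeeSkill"
        " WHERE Skill = ? ORDER BY EmployeeName",
        (skill,),
    )
    return [name for (name,) in cur]


print(employees_with("SQL"))
print(employees_with("Java"))
```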

Second Normal Form (2NF)

The Second Normal Form (2NF) is the next step in the database normalization process. It builds
upon the First Normal Form (1NF) and removes partial dependencies, so that no non-key
attribute depends on only part of a composite key.

Rules for a Table to be in 2NF:

1. The table must first satisfy 1NF:

o Each column should contain atomic (indivisible) values.

o Rows should be unique, and columns must have a single value (no
multivalued or composite attributes).

2. No Partial Dependency:

o A table is in 2NF if all non-prime attributes (attributes that are not part of
any candidate key) are fully functionally dependent on the entire primary
key.

o In other words, no non-prime attribute should depend on only a part of a
composite primary key.

Partial Dependency: When a non-prime attribute depends on part (not all) of a
composite primary key.

Example of 2NF

Step 1: A Table in 1NF

Imagine a table storing data about courses and instructors:

CourseID  InstructorID  InstructorName  CourseName
1         101           Alice           Mathematics
2         102           Bob             Physics
1         103           Carol           Mathematics

• Primary Key: (CourseID, InstructorID) (composite key).

• The table is in 1NF because all values are atomic and there are no repeating
groups.

Step 2: Identifying Partial Dependencies

In the above table:

• InstructorName depends only on InstructorID (part of the composite key) and
not on the entire composite key (CourseID, InstructorID).

• CourseName likewise depends only on CourseID.

• These are partial dependencies, violating 2NF.
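
A functional dependency can be tested against sample rows with a small helper. This is a sketch, and the name `holds` is hypothetical. A dependency is really a rule about all valid data, so passing this check on three rows only shows that the sample does not contradict it.

```python
def holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


rows = [
    {"CourseID": 1, "InstructorID": 101,
     "InstructorName": "Alice", "CourseName": "Mathematics"},
    {"CourseID": 2, "InstructorID": 102,
     "InstructorName": "Bob", "CourseName": "Physics"},
    {"CourseID": 1, "InstructorID": 103,
     "InstructorName": "Carol", "CourseName": "Mathematics"},
]

# Each of these uses only part of the key (CourseID, InstructorID).
instructor_partial = holds(rows, ["InstructorID"], ["InstructorName"])
course_partial = holds(rows, ["CourseID"], ["CourseName"])

# CourseID alone does not determine the instructor: course 1 has two.
course_to_instructor = holds(rows, ["CourseID"], ["InstructorName"])

print(instructor_partial, course_partial, course_to_instructor)
```

Both partial dependencies hold in the sample, while CourseID alone does not determine InstructorName, which is why the full pair is needed as the key.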

Step 3: Decomposing into 2NF

To eliminate partial dependencies, we split the table into two:


1. Course Table:

o Stores course-related data.

o CourseID is the primary key.

CourseID  CourseName
1         Mathematics
2         Physics

2. Instructor Table:

o Stores instructor-related data.

o InstructorID is the primary key.

InstructorID  InstructorName
101           Alice
102           Bob
103           Carol

3. Course-Instructor Table:

o Links courses and instructors.

o (CourseID, InstructorID) remains the composite key.

CourseID  InstructorID
1         101
1         103
2         102
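
The decomposition above can be checked by projecting the original rows into the three tables and joining them back. The following is a minimal sketch using plain Python dictionaries and sets in place of database tables; the variable names are assumptions made for the example.

```python
# (CourseID, InstructorID, InstructorName, CourseName)
original = [
    (1, 101, "Alice", "Mathematics"),
    (2, 102, "Bob", "Physics"),
    (1, 103, "Carol", "Mathematics"),
]

# Projections: each fact is kept once, keyed by the attribute that determines it.
courses = {course_id: name for course_id, _, _, name in original}
instructors = {instr_id: name for _, instr_id, name, _ in original}
course_instructor = sorted({(c, i) for c, i, _, _ in original})

# Joining the three tables on their keys rebuilds the original rows,
# so the decomposition loses no information.
rejoined = [(c, i, instructors[i], courses[c]) for c, i in course_instructor]

print(courses)
print(instructors)
print(course_instructor)
```

"Mathematics" is now stored once in `courses` instead of once per instructor.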

Key Points of the 2NF Result

• Now, every non-prime attribute depends on the entire primary key (which is a
single column in the Course and Instructor tables).

• Redundancy is reduced: Instructor names and course names are no longer
repeated unnecessarily.

• The database structure is now easier to maintain and update.

Third Normal Form (3NF)

The Third Normal Form (3NF) is the next step after 2NF in database normalization. It
aims to further reduce redundancy by eliminating transitive dependencies.

Rules for a Table to be in 3NF:

1. The table must be in 2NF:

o No partial dependency should exist, as covered in 2NF.

2. No Transitive Dependency:

o A transitive dependency occurs when a non-prime attribute depends on
another non-prime attribute, which in turn depends on the primary key.

o In simpler terms:
If A → B and B → C, then A → C is a transitive dependency.

Example of 3NF

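Step 1: The Starting Table

Consider the following table:

StudentID  StudentName  CourseID  CourseName  InstructorID  InstructorName
1          Alice        101       Math        201           Dr. Smith
2          Bob          102       Physics     202           Dr. Johnson
3          Carol        101       Math        201           Dr. Smith

• Primary Key: (StudentID, CourseID) (composite key).

• Strictly, this table is not yet in 2NF:

o StudentName depends only on StudentID, and CourseName and InstructorID
depend only on CourseID, so partial dependencies remain.

o It also contains a transitive dependency, which is the focus of this
section. The decomposition in Step 3 removes both kinds.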

Step 2: Identifying Transitive Dependencies


• CourseName depends on CourseID alone, not on the whole composite key (StudentID,
CourseID). Strictly, this is a partial dependency rather than a transitive one.

• InstructorName depends on InstructorID, and InstructorID depends on CourseID.

o Thus, InstructorName depends on CourseID only indirectly, via InstructorID.
This is the transitive dependency.
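
The chain CourseID → InstructorID → InstructorName can be made concrete by building the two lookups from the sample rows and composing them. This is a sketch only: the helper name `mapping` and the use of column positions are assumptions made for the example.

```python
# (StudentID, StudentName, CourseID, CourseName, InstructorID, InstructorName)
rows = [
    (1, "Alice", 101, "Math", 201, "Dr. Smith"),
    (2, "Bob", 102, "Physics", 202, "Dr. Johnson"),
    (3, "Carol", 101, "Math", 201, "Dr. Smith"),
]


def mapping(rows, src, dst):
    """Build the src -> dst lookup; fail if one src value maps to two dst values."""
    lookup = {}
    for row in rows:
        if lookup.setdefault(row[src], row[dst]) != row[dst]:
            raise ValueError(f"column {src} does not determine column {dst}")
    return lookup


course_to_instructor = mapping(rows, 2, 4)  # CourseID -> InstructorID
instructor_to_name = mapping(rows, 4, 5)    # InstructorID -> InstructorName

# Composing the two lookups gives CourseID -> InstructorName,
# which is the transitive dependency.
course_to_name = {
    course: instructor_to_name[instr]
    for course, instr in course_to_instructor.items()
}

print(course_to_name)
```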

Step 3: Decomposing into 3NF

To remove transitive dependencies, we decompose the table into smaller tables:

1. Student Table:

o Stores student information.

o StudentID is the primary key.

StudentID  StudentName
1          Alice
2          Bob
3          Carol

2. Course Table:

o Stores course information.

o CourseID is the primary key.

CourseID  CourseName  InstructorID
101       Math        201
102       Physics     202

3. Instructor Table:

o Stores instructor information.

o InstructorID is the primary key.

InstructorID  InstructorName
201           Dr. Smith
202           Dr. Johnson

4. Enrollment Table:

o Links students and courses.

o (StudentID, CourseID) remains the composite key.

StudentID  CourseID
1          101
2          102
3          101
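
The four tables can be declared with primary and foreign keys and then joined to rebuild the original wide table. The sketch below uses Python's built-in `sqlite3` module with an in-memory database. The column types and NOT NULL constraints are assumptions, since the notes do not specify them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Student (
    StudentID INTEGER PRIMARY KEY, StudentName TEXT NOT NULL);
CREATE TABLE Instructor (
    InstructorID INTEGER PRIMARY KEY, InstructorName TEXT NOT NULL);
CREATE TABLE Course (
    CourseID INTEGER PRIMARY KEY, CourseName TEXT NOT NULL,
    InstructorID INTEGER NOT NULL REFERENCES Instructor(InstructorID));
CREATE TABLE Enrollment (
    StudentID INTEGER NOT NULL REFERENCES Student(StudentID),
    CourseID INTEGER NOT NULL REFERENCES Course(CourseID),
    PRIMARY KEY (StudentID, CourseID));
INSERT INTO Student VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
INSERT INTO Instructor VALUES (201, 'Dr. Smith'), (202, 'Dr. Johnson');
INSERT INTO Course VALUES (101, 'Math', 201), (102, 'Physics', 202);
INSERT INTO Enrollment VALUES (1, 101), (2, 102), (3, 101);
""")

# Joining along the keys reproduces the rows of the original table.
rebuilt = conn.execute("""
    SELECT s.StudentID, s.StudentName, c.CourseID, c.CourseName,
           i.InstructorID, i.InstructorName
    FROM Enrollment e
    JOIN Student s ON s.StudentID = e.StudentID
    JOIN Course c ON c.CourseID = e.CourseID
    JOIN Instructor i ON i.InstructorID = c.InstructorID
    ORDER BY s.StudentID
""").fetchall()

# The foreign key rejects an enrollment in a course that does not exist.
try:
    conn.execute("INSERT INTO Enrollment VALUES (1, 999)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

for row in rebuilt:
    print(row)
print("invalid enrollment rejected:", rejected)
```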

Key Points of the 3NF Result

• No Transitive Dependencies:

o For example, InstructorName is now stored separately in the Instructor
Table and depends only on InstructorID, not indirectly through CourseID.

• Data Redundancy is Minimized:

o Instructor and course information is stored once and reused.

• Improved Integrity:

o Updating an instructor's name or a course's details is now easier and
avoids anomalies.
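
The same structure also avoids the delete and insert anomalies. A minimal sketch, using plain Python dictionaries and a set to stand in for the Instructor, Course, and Enrollment tables; the instructor "Dr. Lee" with ID 203 is a hypothetical addition that does not appear in the original data.

```python
instructors = {201: "Dr. Smith", 202: "Dr. Johnson"}
courses = {101: ("Math", 201), 102: ("Physics", 202)}
enrollments = {(1, 101), (2, 102), (3, 101)}

# Delete: Bob (StudentID 2) drops Physics, the only enrollment in that course.
enrollments.discard((2, 102))

# The course and its instructor are still on record.
physics_still_known = 102 in courses and courses[102][1] in instructors

# Insert: a new instructor can be recorded before teaching any course.
instructors[203] = "Dr. Lee"  # hypothetical instructor
courses_taught_by_203 = [cid for cid, (_, iid) in courses.items() if iid == 203]

print(physics_still_known)
print(courses_taught_by_203)
```

In the single wide table, deleting Bob's row would have erased the only record of Physics and Dr. Johnson, and Dr. Lee could not have been stored without a student and a course.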

Comparison of 2NF and 3NF

Aspect           Second Normal Form (2NF)                   Third Normal Form (3NF)
Focus            Eliminates partial dependency.             Eliminates transitive dependency.
Data Redundancy  Reduced, but some redundancy might exist.  Further reduces redundancy.
Complexity       Relatively simple to achieve.              More decomposition might be required.
