Normalization
-Santosh Bajgain
Definition
• Data normalization is the process of organizing a database into a normal
form to avoid problems such as repetition of information, inability to
represent information, and loss of information. It improves performance by
greatly reducing data redundancy.
• Or
• Normalization is a database design technique that reduces data redundancy
and eliminates undesirable characteristics such as insertion, update, and
deletion anomalies. Normalization rules divide larger tables into smaller
tables and link them using relationships. The purpose of normalization in
SQL is to eliminate redundant (repetitive) data and ensure data is stored
logically.
• The inventor of the relational model, Edgar Codd, proposed the theory of
normalization of data with the introduction of the First Normal Form, and he
continued to extend the theory with the Second and Third Normal Forms.
Later he worked with Raymond F. Boyce to develop the theory of the
Boyce-Codd Normal Form.
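To illustrate the definition above, here is a minimal sketch of dividing a larger table into smaller tables linked by a relationship. It uses Python's built-in sqlite3 module; the student and department tables, their columns, and the sample rows are hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical sample data: one wide table in which the department
# name is repeated for every student in that department.
wide_rows = [
    (1, "Asha", "D1", "Computer Science"),
    (2, "Bibek", "D1", "Computer Science"),
    (3, "Chandra", "D2", "Mathematics"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE department ("
    "dept_id TEXT PRIMARY KEY, "
    "dept_name TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE student ("
    "student_id INTEGER PRIMARY KEY, "
    "name TEXT NOT NULL, "
    "dept_id TEXT NOT NULL REFERENCES department(dept_id))"
)

# Split each wide row: department facts are stored once, and student rows
# keep only the dept_id that links them to the department table.
for student_id, name, dept_id, dept_name in wide_rows:
    conn.execute(
        "INSERT OR IGNORE INTO department VALUES (?, ?)", (dept_id, dept_name)
    )
    conn.execute(
        "INSERT INTO student VALUES (?, ?, ?)", (student_id, name, dept_id)
    )

department_count = conn.execute("SELECT COUNT(*) FROM department").fetchone()[0]

# A join over the relationship reconstructs the original wide rows.
rejoined = conn.execute(
    "SELECT s.student_id, s.name, d.dept_id, d.dept_name "
    "FROM student AS s JOIN department AS d ON s.dept_id = d.dept_id "
    "ORDER BY s.student_id"
).fetchall()
```

In this sketch each department name is stored only once, yet the join still returns the same rows as the wide table, so no information is lost by the split.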
Advantages of Normalization
• It reduces data redundancy.
• It enables faster sorting and index creation.
• It results in fewer indexes per table and fewer null values.
• It avoids the repetition of information.
• It simplifies the structure of tables.
• It improves the performance of a system.
• It avoids the loss of information.
Need of normalization
The table is the basic building block in the database design process, so the
structure of the table is of great interest in relational database design. A
poor table structure degrades the performance of the RDBMS. Normalization is
the basis for recognizing a poor table structure and producing a good one.
Normalization is a process for assigning attributes to entities. Normalization
reduces data redundancies and helps to eliminate the data anomalies in a
database. Normalization does not eliminate data redundancies entirely.
Instead, it produces a controlled redundancy: the limited repetition of data
that lets us link database tables. So, normalization is the process of
decomposing a big table into smaller and simpler tables.
The need for normalization can be listed point-wise as:
1) It eliminates redundant data.
2) It reduces the chances of data error.
3) Normalization is important because it allows the database to take up less disk space.
4) It also helps in increasing performance.
5) It improves data integrity and consistency.
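As an illustration of points 2 and 5, the following minimal sketch shows how repeated data can lead to an update anomaly and how storing the value once avoids it. It uses plain Python dictionaries rather than a real database, and the student and department names are hypothetical.

```python
# Hypothetical unnormalized rows: the department name is stored in every row.
unnormalized = [
    {"student": "Asha", "dept_id": "D1", "dept_name": "Computer Science"},
    {"student": "Bibek", "dept_id": "D1", "dept_name": "Computer Science"},
]

# Update anomaly: renaming the department in only one row leaves
# two conflicting names for the same dept_id.
unnormalized[0]["dept_name"] = "Computing"
names_for_d1 = {row["dept_name"] for row in unnormalized if row["dept_id"] == "D1"}
has_conflict = len(names_for_d1) > 1

# Normalized layout: the name is stored once and referenced by dept_id.
departments = {"D1": "Computer Science"}
students = [
    {"student": "Asha", "dept_id": "D1"},
    {"student": "Bibek", "dept_id": "D1"},
]

# A single update is seen by every student row that references D1.
departments["D1"] = "Computing"
resolved_names = {departments[s["dept_id"]] for s in students}
```

In the unnormalized rows a partial update leaves the data inconsistent, while in the normalized layout there is only one place to change, so the rows cannot disagree.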
Types
• The most important and widely used normal forms are:
• First Normal Form (1NF)
• Second Normal Form (2NF)
• Third Normal Form (3NF)
• Boyce-Codd Normal Form (BCNF)
• Fourth Normal Form (4NF)
A relation is said to be in a particular normal form if it satisfies a
prescribed set of rules.
From a structural point of view, 2NF is better than 1NF, and 3NF is better
than 2NF.
1NF
• A relation or table is said to be in 1NF if all of its attributes are atomic.
• That is, there should not be any repeating group of attributes.
• Purpose of 1NF
• To eliminate repeating groups of attributes in an entity.
• In other words,
• If any attribute is repeated again and again in the same table/row, then
such attributes are either moved to a separate table or decomposed into
several rows.
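The "decomposed into several rows" option above can be sketched as follows. This is a minimal illustration in plain Python, not a prescribed procedure; the `to_first_normal_form` helper, the `phones` attribute, and the sample values are all hypothetical.

```python
# Hypothetical non-1NF rows: the "phones" attribute holds a repeating group.
non_1nf = [
    {"student_id": 1, "name": "Asha", "phones": ["9800000001", "9800000002"]},
    {"student_id": 2, "name": "Bibek", "phones": ["9800000003"]},
]


def to_first_normal_form(rows):
    """Return one row per (student, phone) so that every attribute is atomic."""
    flat = []
    for row in rows:
        for phone in row["phones"]:
            flat.append(
                {
                    "student_id": row["student_id"],
                    "name": row["name"],
                    "phone": phone,
                }
            )
    return flat


rows_1nf = to_first_normal_form(non_1nf)
```

After the conversion every attribute holds a single value, and a row is identified by the combination of student_id and phone rather than by student_id alone. The student name is now repeated across rows, which is the kind of redundancy the higher normal forms address.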