4 5NF

The document discusses database normalization. It defines first (1NF), second (2NF), and third (3NF) normal forms and Boyce-Codd normal form (BCNF), then introduces fourth (4NF), fifth (5NF), and domain/key (DKNF) normal forms. The normal forms aim to minimize data redundancy and avoid update anomalies. Normalization analyzes relations based on their keys and dependencies and splits them until the target normal form is reached.

Uploaded by: desert_05
Copyright: © Attribution Non-Commercial (BY-NC)
Normalization

 Main objective in developing a logical data
model for relational database systems is to
create an accurate representation of the data,
its relationships, and constraints.

 To achieve this objective, we must identify a
suitable set of relations.

Normalization
 Four most commonly used normal forms are first
(1NF), second (2NF) and third (3NF) normal
forms, and Boyce–Codd normal form (BCNF).

 These normal forms are based on functional
dependencies among the attributes of a relation.

 A relation can be normalized to a specific form to
prevent possible occurrence of update anomalies.

Data Redundancy
 Major aim of relational database design is to
group attributes into relations to minimize
data redundancy and reduce file storage space
required by base relations.

 Problems associated with data redundancy are
illustrated by comparing the following Staff
and Branch relations with the StaffBranch
relation.

Data Redundancy
[Figure: Staff and Branch relations alongside the StaffBranch relation; image not included.]

Data Redundancy
 StaffBranch relation has redundant data: details
of a branch are repeated for every member of
staff.

 In contrast, branch information appears only
once for each branch in Branch relation and only
branchNo is repeated in Staff relation, to
represent where each member of staff works.

Update Anomalies
 Relations that contain redundant information
may potentially suffer from update anomalies.

 Types of update anomalies include:
– Insertion,
– Deletion,
– Modification.
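The anomalies listed above can be made concrete with a small sketch. The rows below are hypothetical StaffBranch data (the attribute names staffNo, name and branchAddress and all values are assumptions for illustration; only branchNo appears in the slides). The sketch shows a modification anomaly and a deletion anomaly on a relation that repeats branch details per member of staff.

```python
# Hypothetical StaffBranch rows: (staffNo, name, branchNo, branchAddress).
staff_branch = [
    ("S1", "Ann", "B1", "10 High St"),
    ("S2", "Bob", "B1", "10 High St"),
    ("S3", "Cy", "B2", "5 Low Rd"),
]


def branch_addresses(rows):
    """Map each branchNo to the set of addresses recorded for it."""
    found = {}
    for _staff_no, _name, branch_no, address in rows:
        found.setdefault(branch_no, set()).add(address)
    return found


# Modification anomaly: the address of B1 is changed in only one of its rows,
# so the relation now records two different addresses for the same branch.
updated = list(staff_branch)
updated[0] = ("S1", "Ann", "B1", "22 New Ave")
inconsistent = {b for b, addrs in branch_addresses(updated).items() if len(addrs) > 1}

# Deletion anomaly: removing the last member of staff at B2 also removes
# every trace of branch B2 from the relation.
after_delete = [row for row in staff_branch if row[0] != "S3"]
lost_branches = set(branch_addresses(staff_branch)) - set(branch_addresses(after_delete))
```

With separate Staff and Branch relations, the address would be stored once per branch, so neither problem could arise in this form.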

Lossless-join and Dependency Preservation
Properties
 Two important properties of decomposition:
- Lossless-join property enables us to find any
instance of original relation from
corresponding instances in the smaller
relations.
- Dependency preservation property enables us
to enforce a constraint on original relation by
enforcing some constraint on each of the
smaller relations.
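A minimal sketch of the lossless-join idea, assuming relations are held as lists of dictionaries: a decomposition is tested on one instance by projecting, re-joining with a natural join, and comparing with the original. The helper names (`project`, `natural_join`, `is_lossless`) and the sample rows are hypothetical. Passing this test on one instance does not prove the property for the schema; it can only expose a lossy decomposition.

```python
def project(rows, attrs):
    """Project a relation (list of dicts) onto attrs, removing duplicates."""
    return {tuple((a, r[a]) for a in attrs) for r in rows}


def natural_join(left, right):
    """Natural join of two projected relations (sets of (attr, value) tuples)."""
    out = set()
    for lt in left:
        for rt in right:
            ld, rd = dict(lt), dict(rt)
            common = ld.keys() & rd.keys()
            if all(ld[a] == rd[a] for a in common):
                out.add(tuple(sorted({**ld, **rd}.items())))
    return out


def is_lossless(rows, attrs1, attrs2):
    """True if joining the two projections reproduces exactly the original rows."""
    original = {tuple(sorted(r.items())) for r in rows}
    return natural_join(project(rows, attrs1), project(rows, attrs2)) == original


# Hypothetical instance: two branches that happen to be in the same city.
rows = [
    {"staffNo": "S1", "branchNo": "B1", "city": "London"},
    {"staffNo": "S2", "branchNo": "B2", "city": "London"},
]

good = is_lossless(rows, ["staffNo", "branchNo"], ["branchNo", "city"])
bad = is_lossless(rows, ["staffNo", "city"], ["branchNo", "city"])
```

Here `good` is true because the join is on branchNo, while `bad` is false because joining on city pairs every member of staff with every branch and so generates spurious tuples.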

Functional Dependency
 Main concept associated with normalization.

 Functional Dependency
– Describes relationship between attributes in
a relation.
– If A and B are attributes of relation R, B is
functionally dependent on A (denoted A → B)
if each value of A in R is associated with
exactly one value of B in R.
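The definition above can be checked against a sample instance with a short sketch. The function name `fd_holds` and the staff rows are hypothetical. Because a functional dependency is a statement about all possible instances, a check like this can refute a dependency but cannot establish it.

```python
def fd_holds(rows, lhs, rhs):
    """Return False if two rows agree on lhs but differ on rhs, else True."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample data.
staff = [
    {"staffNo": "S1", "position": "Manager"},
    {"staffNo": "S2", "position": "Assistant"},
    {"staffNo": "S3", "position": "Assistant"},
]

staff_to_position = fd_holds(staff, ["staffNo"], ["position"])
position_to_staff = fd_holds(staff, ["position"], ["staffNo"])
```

In this sample, each staffNo is associated with exactly one position, but the position Assistant is associated with two staff numbers, so only the first dependency survives the check.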

Functional Dependency
 Property of the meaning (or semantics)
of the attributes in a relation.

 Diagrammatic representation: [figure not included]

 Determinant of a functional dependency refers
to attribute or group of attributes on left-hand
side of the arrow.

Example - Functional Dependency
[Figure not included.]

Functional Dependency
 Main characteristics of functional dependencies
used in normalization:
– have a 1:1 relationship between attribute(s)
on left and right-hand side of a dependency;
– hold for all time;
– are nontrivial.

Functional Dependency
 Complete set of functional dependencies for a given
relation can be very large.

 Important to find an approach that can reduce set
to a manageable size.

 Need to identify set of functional dependencies (X)
for a relation that is smaller than complete set of
functional dependencies (Y) for that relation and
has property that every functional dependency in Y
is implied by functional dependencies in X.

Functional Dependency
 Set of all functional dependencies implied by a
given set of functional dependencies X is called
the closure of X (written X+).

 Set of inference rules, called Armstrong’s
axioms, specifies how new functional
dependencies can be inferred from given ones.

Functional Dependency
 Let A, B, and C be subsets of the attributes of
relation R. Armstrong’s axioms are as follows:
1. Reflexivity
If B is a subset of A, then A → B
2. Augmentation
If A → B, then A,C → B,C
3. Transitivity
If A → B and B → C, then A → C
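In practice, whether a dependency follows from a given set is usually tested with attribute closure rather than by applying the axioms one at a time. The sketch below uses that swapped-in technique: it computes the set of attributes determined by a starting set, then tests implication. The function names and the two sample dependencies are assumptions for illustration.

```python
def attribute_closure(attrs, fds):
    """Return every attribute functionally determined by attrs under fds.

    fds is a list of (lhs, rhs) pairs, each side a set of attribute names.
    """
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the closure already covers lhs, it also determines rhs.
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure


def implies(fds, lhs, rhs):
    """True if lhs -> rhs follows from fds."""
    return set(rhs) <= attribute_closure(lhs, fds)


fds = [({"A"}, {"B"}), ({"B"}, {"C"})]

closure_of_a = attribute_closure({"A"}, fds)
transitivity = implies(fds, {"A"}, {"C"})
augmentation = implies(fds, {"A", "C"}, {"B", "C"})
not_implied = implies(fds, {"C"}, {"A"})
```

The results agree with the axioms: A → C follows by transitivity, A,C → B,C by augmentation, and C → A is not implied.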

The Process of Normalization
 Formal technique for analyzing a relation based on
its primary key and functional dependencies
between its attributes.

 Often executed as a series of steps. Each step
corresponds to a specific normal form, which has
known properties.

 As normalization proceeds, relations become
progressively more restricted (stronger) in format
and also less vulnerable to update anomalies.

Relationship Between Normal Forms
[Figure not included.]

Unnormalized Form (UNF)
 A table that contains one or more repeating
groups.

 To create an unnormalized table:
– transform data from information source
(e.g. form) into table format with columns
and rows.

First Normal Form (1NF)
 A relation in which intersection of each row
and column contains one and only one value.

UNF to 1NF
 Nominate an attribute or group of attributes to
act as the key for the unnormalized table.

 Identify repeating group(s) in unnormalized
table that repeat for the key attribute(s).

UNF to 1NF
 Remove repeating group by:
– entering appropriate data into the empty
columns of rows containing repeating data
(‘flattening’ the table).
Or by
– placing repeating data along with copy of
the original key attribute(s) into a separate
relation.
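Both ways of removing a repeating group can be sketched briefly. The unnormalized rows below are hypothetical (a client with a list of property numbers; the attribute names clientNo, cName and propertyNo are assumptions), and the helper names `flatten` and `split_out` are made up for this illustration.

```python
# Hypothetical unnormalized table: propertyNo is a repeating group.
unf = [
    {"clientNo": "C1", "cName": "Ann", "propertyNo": ["P1", "P2"]},
    {"clientNo": "C2", "cName": "Bob", "propertyNo": ["P3"]},
]


def flatten(rows, group):
    """Approach 1: one output row per repeated value, copying the other columns."""
    flat = []
    for row in rows:
        rest = {k: v for k, v in row.items() if k != group}
        for value in row[group]:
            flat.append({**rest, group: value})
    return flat


def split_out(rows, key, group):
    """Approach 2: move the repeating data, with a copy of the key, to a new relation."""
    parent = [{k: v for k, v in row.items() if k != group} for row in rows]
    child = [{key: row[key], group: value} for row in rows for value in row[group]]
    return parent, child


flat = flatten(unf, "propertyNo")
clients, client_properties = split_out(unf, "clientNo", "propertyNo")
```

Flattening keeps a single relation but repeats cName for every property, while the second approach stores each client once and leaves the repetition confined to the key.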

Second Normal Form (2NF)
 Based on concept of full functional dependency:
– A and B are attributes of a relation,
– B is fully dependent on A if B is functionally
dependent on A but not on any proper subset of
A.

 2NF - A relation that is in 1NF and in which
every non-primary-key attribute is fully
functionally dependent on the primary key.

1NF to 2NF
 Identify primary key for the 1NF relation.

 Identify functional dependencies in the
relation.

 If partial dependencies exist on the primary
key, remove them by placing them in a new
relation along with copy of their determinant.
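The three steps above can be sketched as follows, assuming the functional dependencies have already been identified and are passed in explicitly. The relation, its attribute names, and the helper names are hypothetical. The sketch inspects only the dependencies it is given and does not derive implied ones.

```python
def partial_dependencies(primary_key, fds):
    """Find FDs whose determinant is a proper subset of the primary key."""
    key = set(primary_key)
    found = []
    for lhs, rhs in fds:
        dependents = set(rhs) - key
        if set(lhs) < key and dependents:
            found.append((set(lhs), dependents))
    return found


def to_2nf(attributes, primary_key, fds):
    """Move partially dependent attributes out with a copy of their determinant."""
    by_determinant = {}
    for lhs, dependents in partial_dependencies(primary_key, fds):
        by_determinant.setdefault(frozenset(lhs), set()).update(dependents)
    moved = set().union(*by_determinant.values())
    remaining = set(attributes) - moved
    return [remaining] + [set(lhs) | deps for lhs, deps in by_determinant.items()]


# Hypothetical 1NF relation with a composite primary key.
attributes = {"clientNo", "propertyNo", "cName", "pAddress", "rentStart"}
primary_key = {"clientNo", "propertyNo"}
fds = [
    ({"clientNo"}, {"cName"}),
    ({"propertyNo"}, {"pAddress"}),
    ({"clientNo", "propertyNo"}, {"rentStart"}),
]

relations = to_2nf(attributes, primary_key, fds)
```

The result keeps rentStart with the full key and places cName and pAddress in new relations keyed by clientNo and propertyNo respectively.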

Third Normal Form (3NF)
 Based on concept of transitive dependency:
– A, B and C are attributes of a relation such that if
A → B and B → C,
– then C is transitively dependent on A through B.
(Provided that A is not functionally dependent on
B or C).

 3NF - A relation that is in 1NF and 2NF and in
which no non-primary-key attribute is transitively
dependent on the primary key.

2NF to 3NF
 Identify the primary key in the 2NF relation.

 Identify functional dependencies in the relation.

 If transitive dependencies exist on the primary
key, remove them by placing them in a new
relation along with copy of their determinant.
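A minimal sketch of these steps, assuming the dependencies are supplied explicitly: it looks for a determinant that is neither part of the primary key nor able to determine the whole relation, and moves its dependents out. The attribute names sName and bAddress, like the helper names, are assumptions; the Staff and Branch split mirrors the relations named earlier in the slides.

```python
def closure(attrs, fds):
    """Attributes determined by attrs under fds (list of (lhs, rhs) sets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def transitive_dependencies(attributes, primary_key, fds):
    """Find FDs whose determinant is not a superkey and not inside the key."""
    key = set(primary_key)
    found = []
    for lhs, rhs in fds:
        lhs, rhs = set(lhs), set(rhs)
        is_superkey = closure(lhs, fds) >= set(attributes)
        dependents = rhs - key - lhs
        if not is_superkey and not lhs <= key and dependents:
            found.append((lhs, dependents))
    return found


def to_3nf(attributes, primary_key, fds):
    """Move transitively dependent attributes out with a copy of their determinant."""
    remaining = set(attributes)
    new_relations = []
    for lhs, dependents in transitive_dependencies(attributes, primary_key, fds):
        new_relations.append(lhs | dependents)
        remaining -= dependents
    return [remaining] + new_relations


attributes = {"staffNo", "sName", "branchNo", "bAddress"}
fds = [
    ({"staffNo"}, {"sName", "branchNo"}),
    ({"branchNo"}, {"bAddress"}),
]

found = transitive_dependencies(attributes, {"staffNo"}, fds)
relations = to_3nf(attributes, {"staffNo"}, fds)
```

Here bAddress depends on staffNo only through branchNo, so it moves to a Branch-style relation and branchNo stays behind in the Staff-style relation as the link.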

General Definitions of 2NF and 3NF
 Second normal form (2NF)
– A relation that is in 1NF and in which every
non-candidate-key attribute is fully functionally
dependent on any candidate key.

 Third normal form (3NF)
– A relation that is in 1NF and 2NF and in
which no non-candidate-key attribute is
transitively dependent on any candidate key.

Boyce–Codd Normal Form (BCNF)
 Based on functional dependencies that take
into account all candidate keys in a relation;
however, BCNF also has additional constraints
compared with general definition of 3NF.

 BCNF - A relation is in BCNF if and only if
every determinant is a candidate key.
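The BCNF condition can be tested mechanically once the dependencies are listed. The sketch below checks that the determinant of every nontrivial dependency determines all attributes of the relation, i.e. is a superkey, which is the usual way to phrase the test when the listed determinants may not be minimal. The relation (student, course, tutor) and its two dependencies are hypothetical.

```python
def closure(attrs, fds):
    """Attributes determined by attrs under fds (list of (lhs, rhs) sets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Return the nontrivial FDs whose determinant is not a superkey."""
    violations = []
    for lhs, rhs in fds:
        lhs, rhs = set(lhs), set(rhs)
        if rhs <= lhs:
            continue  # trivial dependency, never a violation
        if not closure(lhs, fds) >= set(attributes):
            violations.append((lhs, rhs))
    return violations


# Hypothetical relation: each tutor teaches one course, and a student
# has one tutor per course.
attributes = {"student", "course", "tutor"}
fds = [
    ({"student", "course"}, {"tutor"}),
    ({"tutor"}, {"course"}),
]

violations = bcnf_violations(attributes, fds)
```

Only tutor → course is reported: tutor is a determinant but does not determine student, so it cannot be a candidate key.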

Boyce–Codd normal form (BCNF)
 Difference between 3NF and BCNF is that for a
functional dependency A → B, 3NF allows this
dependency in a relation if B is a primary-key
attribute and A is not a candidate key.

 Whereas, BCNF insists that for this dependency to
remain in a relation, A must be a candidate key.

 Every relation in BCNF is also in 3NF. However,
a relation in 3NF is not necessarily in BCNF.

Boyce–Codd normal form (BCNF)
 Violation of BCNF is quite rare.

 Potential to violate BCNF may occur in a
relation that:
– contains two (or more) composite candidate
keys;
– has candidate keys that overlap (i.e. have at least
one attribute in common).
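Whether a relation has overlapping composite candidate keys can be found by brute force for small schemas. The sketch below enumerates attribute subsets in order of size and keeps the minimal ones whose closure covers the relation. The relation and dependencies are hypothetical, and the exhaustive search is only practical for a handful of attributes.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attributes determined by attrs under fds (list of (lhs, rhs) sets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Return all minimal attribute sets that determine every attribute."""
    names = sorted(attributes)
    keys = []
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            combo = set(combo)
            if any(k <= combo for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(combo, fds) >= set(attributes):
                keys.append(combo)
    return keys


attributes = {"student", "course", "tutor"}
fds = [
    ({"student", "course"}, {"tutor"}),
    ({"tutor"}, {"course"}),
]

keys = candidate_keys(attributes, fds)
composite = [k for k in keys if len(k) > 1]
overlap = set.intersection(*composite) if len(composite) > 1 else set()
```

This relation has two composite candidate keys that share the attribute student, which is exactly the situation described above.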

Review of Normalization (UNF to BCNF)
[Four figure slides; images not included.]

Fourth Normal Form (4NF)
 Although BCNF removes anomalies due to
functional dependencies, another type of
dependency called a multi-valued dependency
(MVD) can also cause data redundancy.

 Possible existence of MVDs in a relation is due
to 1NF and can result in data redundancy.

Fourth Normal Form (4NF) - MVD
 Dependency between attributes (for example,
A, B, and C) in a relation, such that for each
value of A there is a set of values for B and a
set of values for C. However, the sets of values
for B and C are independent of each other.

Fourth Normal Form (4NF)
 MVD between attributes A, B, and C in a
relation is represented using the following notation:
A →→ B
A →→ C

 Department d1 works on jobs j1 and j2 with parts p1 and p2.
 Department d2 works on jobs j3, j4, and j5 with parts p2 and p4.
 Department d3 works on job j2 only with parts p5 and p6.

Department   Job   Part#
------------------------
d1           j1    p1
d1           j1    p2
d1           j2    p1
d1           j2    p2
d2           j3    p2
d2           j3    p4
d2           j4    p2
d2           j4    p4
d2           j5    p2
d2           j5    p4
d3           j2    p5
d3           j2    p6

Department →→ Job
Department →→ Part

Fourth Normal Form (4NF)
 MVD can be further defined as being trivial or
nontrivial.
– MVD A →→ B in relation R is defined as
being trivial if (a) B is a subset of A or (b) A ∪ B
= R.
– MVD is defined as being nontrivial if neither (a)
nor (b) is satisfied.
– Trivial MVD does not specify a constraint on a
relation, while a nontrivial MVD does specify a
constraint.

Fourth Normal Form (4NF)
 Defined as a relation that is in BCNF and
contains no nontrivial MVDs.
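The Department/Job/Part table shown earlier can be used to sketch what a nontrivial MVD means and how a 4NF decomposition removes it. The rows are copied from that table; the function name `mvd_holds` is hypothetical. The check asks whether, for each department, every one of its jobs appears with every one of its parts.

```python
rows = {
    ("d1", "j1", "p1"), ("d1", "j1", "p2"), ("d1", "j2", "p1"), ("d1", "j2", "p2"),
    ("d2", "j3", "p2"), ("d2", "j3", "p4"), ("d2", "j4", "p2"), ("d2", "j4", "p4"),
    ("d2", "j5", "p2"), ("d2", "j5", "p4"),
    ("d3", "j2", "p5"), ("d3", "j2", "p6"),
}


def mvd_holds(rows):
    """Check Department ->> Job (and so Department ->> Part) on this instance."""
    by_dept = {}
    for dept, job, part in rows:
        jobs, parts = by_dept.setdefault(dept, (set(), set()))
        jobs.add(job)
        parts.add(part)
    # The MVD holds when each department's rows are exactly jobs x parts.
    expected = {
        (dept, job, part)
        for dept, (jobs, parts) in by_dept.items()
        for job in jobs
        for part in parts
    }
    return expected == set(rows)


# 4NF decomposition: one relation per independent multi-valued fact.
dept_job = {(dept, job) for dept, job, _part in rows}
dept_part = {(dept, part) for dept, _job, part in rows}
rejoined = {
    (dept, job, part)
    for dept, job in dept_job
    for dept2, part in dept_part
    if dept == dept2
}
```

The two smaller relations hold 6 rows each instead of 12 in the original, and joining them on Department gives back the original table without spurious tuples.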

4NF - Example
[Figure not included.]

Fifth Normal Form (5NF)
 A relation decomposed into two relations must
have lossless-join property, which ensures that no
spurious tuples are generated when relations are
reunited through a natural join.

 However, there are requirements to decompose a
relation into more than two relations.

 Although rare, these cases are managed by join
dependency and fifth normal form (5NF).

Fifth Normal Form (5NF)
 A relation that has no join dependency
other than those implied by its candidate keys.
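A join dependency over three projections can be illustrated with a small sketch. The supplier/part/project rows and the function names are hypothetical. The instance below cannot be split into two projections without spurious tuples, yet it is recovered exactly from three, which is the kind of case that 5NF addresses.

```python
# Hypothetical relation with attributes (supplier, part, project).
rows = {
    ("s1", "p1", "j2"),
    ("s1", "p2", "j1"),
    ("s2", "p1", "j1"),
    ("s1", "p1", "j1"),
}


def projections(rows):
    """Project onto (supplier, part), (part, project) and (supplier, project)."""
    sp = {(s, p) for s, p, _j in rows}
    pj = {(p, j) for _s, p, j in rows}
    sj = {(s, j) for s, _p, j in rows}
    return sp, pj, sj


def join3(sp, pj, sj):
    """Natural join of the three binary projections."""
    return {(s, p, j) for s, p in sp for p2, j in pj if p == p2 and (s, j) in sj}


def satisfies_jd(rows):
    """True if this instance equals the join of its three projections."""
    return join3(*projections(rows)) == set(rows)


sp, pj, sj = projections(rows)
two_way = {(s, p, j) for s, p in sp for p2, j in pj if p == p2}
spurious = two_way - rows
three_way_ok = satisfies_jd(rows)
```

Joining only two of the projections produces a tuple that was never in the relation, while adding the third projection filters it out again.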

5NF - Example
[Figure not included.]

Domain/Key Normal Form (DKNF)
 Domain/key normal form (DKNF) offers a complete
solution to the problem of avoiding modification
anomalies.
 A key uniquely identifies each row in a table; a
domain is the set of permissible values for an attribute.
 By enforcing key and domain restrictions, the
database is assured of being free from any
modification anomaly.
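What enforcing key and domain restrictions amounts to can be sketched as a simple validator. Everything here is an assumption for illustration: the attribute names, the domain rules, and the function `constraint_violations`. A real DBMS would enforce the same rules declaratively through key and check constraints.

```python
def constraint_violations(rows, key, domains):
    """Return (row_index, reason) pairs for domain and key violations."""
    problems = []
    seen_keys = set()
    for i, row in enumerate(rows):
        # Domain constraints: each value must be allowed for its attribute.
        for attr, allowed in domains.items():
            if not allowed(row[attr]):
                problems.append((i, f"domain of {attr}"))
        # Key constraint: no two rows may share the same key value.
        key_value = tuple(row[a] for a in key)
        if key_value in seen_keys:
            problems.append((i, "duplicate key"))
        seen_keys.add(key_value)
    return problems


# Hypothetical domains: staff numbers start with "S", salary is bounded.
domains = {
    "staffNo": lambda v: isinstance(v, str) and v.startswith("S"),
    "salary": lambda v: isinstance(v, int) and 0 <= v <= 100_000,
}
rows = [
    {"staffNo": "S1", "salary": 30_000},
    {"staffNo": "S1", "salary": 25_000},  # repeats the key S1
    {"staffNo": "X9", "salary": -5},      # both values fall outside their domains
]

problems = constraint_violations(rows, ["staffNo"], domains)
```

In a DKNF design, checks of these two kinds are the only ones needed, because every other constraint on the relation follows from them.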

 Ronald Fagin (1981) proved that if a relation is
in DKNF then it is free from any anomalies
(redundancies), including the ones caused by
FDs, MVDs, and JDs.

 If DKNF seems simple enough, then why all
the hoopla about 1NF, 2NF, 3NF, BCNF, 4NF,
and 5NF?

 DKNF is not always achievable, and there is no
known general algorithm for converting a relation
schema into DKNF or for deciding when that is possible.

 In short, sets of single-theme tables will most
likely be in DKNF.
