
Schema Refinement & Normalization Theory

The document discusses third normal form (3NF) as a weaker normal form than Boyce-Codd normal form (BCNF) that allows some redundancy but allows functional dependencies to be checked on individual relations without joins. 3NF is a compromise that is used when BCNF is not achievable. A relation is in 3NF if, for every non-trivial functional dependency X → A, either A is in X, X is a superkey, or A is part of some candidate key. Decomposition into 3NF relations that are lossless-join and dependency preserving is always possible.

Uploaded by

29pradeep29
Copyright
© Attribution Non-Commercial (BY-NC)

Schema Refinement & Normalization Theory

Normalization/Decomposition

Week 13

Third Normal Form: Motivation


There are some situations where:
- a BCNF decomposition is not dependency preserving, and
- efficient checking for FD violations on updates is important.

Solution: define a weaker normal form, called Third Normal Form (3NF).
- It allows some redundancy (with resultant problems; we will see examples later).
- But functional dependencies can be checked on individual relations without computing a join.
- There is always a lossless-join, dependency-preserving decomposition into 3NF.

Third Normal Form (3NF)


- If R is in BCNF, it is obviously in 3NF.
- If R is in 3NF, some redundancy is possible. 3NF is a compromise, used when BCNF is not achievable (e.g., no good decomposition, or performance considerations).
- A lossless-join, dependency-preserving decomposition of R into a collection of 3NF relations is always possible.

3NF
Relation R with FDs F is in 3NF if, for each FD X → A in F, one of the following statements is true:
- A ∈ X (trivial FD), or
- X is a superkey, or
- A is part of some key for R.
If one of the first two statements holds for every FD, then R is in BCNF.

What Does 3NF Achieve?


If 3NF is violated by X → A, one of the following holds:

- X is a proper subset of some key K (partial dependency).
  We store (X, A) pairs redundantly.
- X is not a proper subset of any key (transitive dependency).
  There is a chain of FDs K → X → A, which means that we cannot associate an X value with a K value unless we also associate an A value with an X value.

But: even if a relation is in 3NF, these problems could arise.

e.g., Reserves SBDC (keys are SBD and CBD) with S → C and C → S is in 3NF, but for each reservation of sailor S, the same (S, C) pair is stored.

Thus, 3NF is indeed a compromise relative to BCNF.
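The 3NF and BCNF conditions can be checked mechanically. The following is a minimal sketch, assuming attributes are single characters and FDs are (left, right) string pairs; the helper names `closure`, `candidate_keys`, and `normal_form` are illustrative and not from the slides. It finds candidate keys by brute force, so it only suits small schemas.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds (pairs of attribute collections)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(schema, fds):
    """All minimal attribute sets whose closure is the whole schema."""
    schema = set(schema)
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            s = set(combo)
            if any(k <= s for k in keys):
                continue  # contains a smaller key, so not minimal
            if closure(s, fds) >= schema:
                keys.append(s)
    return keys

def normal_form(schema, fds):
    """Return 'BCNF', '3NF', or 'not 3NF' by testing each FD in fds."""
    prime = set().union(*candidate_keys(schema, fds))  # attributes in some key
    bcnf = three_nf = True
    for lhs, rhs in fds:
        if closure(lhs, fds) >= set(schema):
            continue  # X is a superkey
        for a in set(rhs) - set(lhs):  # skip the trivial part of the FD
            bcnf = False
            if a not in prime:
                three_nf = False
    return "BCNF" if bcnf else "3NF" if three_nf else "not 3NF"

# Reserves(S, B, D, C) with S -> C and C -> S
reserves_fds = [("S", "C"), ("C", "S")]
print(normal_form("SBDC", reserves_fds))  # prints 3NF
```

For Reserves the sketch finds the keys SBD and CBD, so C and S are both part of some key and the relation is reported as 3NF but not BCNF.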



Decomposition into 3NF


Obviously, the algorithm for lossless-join decomposition into BCNF can be used to obtain a lossless-join decomposition into 3NF (typically, it can stop earlier).

To ensure dependency preservation, one idea:
- If X → Y is not preserved, add relation XY.
- Problem: XY may violate 3NF!

Refinement: instead of the given set of FDs F, use a minimal cover for F.

Minimal Cover for a Set of FDs


A minimal cover G for a set of FDs F satisfies:
- Closure of F = closure of G.
- The right-hand side of each FD in G is a single attribute.
- If we modify G by deleting an FD or by deleting attributes from an FD in G, the closure changes.

Intuitively, every FD in G is needed, and is as small as possible, in order to get the same closure as F.

Obtaining Minimal Cover


Step 1: Put the FDs in a standard form (i.e., each right-hand side contains only a single attribute).
Step 2: Minimize the left side of each FD.
Step 3: Delete redundant FDs.

Example
F = {A → B, ABCD → E, EF → G, EF → H, ACDF → EG}

- Rewrite ACDF → EG as {ACDF → E, ACDF → G}.
- ACDF → G is implied by {A → B, ABCD → E, EF → G}: ACDF gives B (by A → B), then E (by ABCD → E), and then EF → G gives G. So delete it.
- ACDF → E is implied by {A → B, ABCD → E}: ACDF gives B, and then ABCD → E gives E. So delete it.
- ABCD → E can be replaced by ACD → E, since A → B.

So the minimal cover for F is:

A → B, ACD → E, EF → G, and EF → H
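The three steps for obtaining a minimal cover can be sketched in code. This is one possible implementation, assuming single-character attributes and FDs given as (left, right) string pairs; `closure` and `minimal_cover` are illustrative names. Minimal covers are not unique in general, so a different processing order could give a different (equally valid) result on other inputs.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds (pairs of attribute collections)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def minimal_cover(fds):
    # Step 1: standard form, one attribute on each right-hand side.
    g = list(dict.fromkeys((frozenset(lhs), a) for lhs, rhs in fds for a in rhs))

    # Step 2: drop left-side attributes that are not needed.
    for i in range(len(g)):
        lhs, a = g[i]
        for b in sorted(lhs):
            reduced = lhs - {b}
            if reduced and a in closure(reduced, g):
                lhs = reduced
                g[i] = (lhs, a)
    g = list(dict.fromkeys(g))  # remove duplicates created by step 2

    # Step 3: delete FDs that are implied by the remaining ones.
    for fd in list(g):
        rest = [other for other in g if other != fd]
        if fd[1] in closure(fd[0], rest):
            g = rest
    return g

F = [("A", "B"), ("ABCD", "E"), ("EF", "G"), ("EF", "H"), ("ACDF", "EG")]
cover = minimal_cover(F)
for lhs, a in cover:
    print("".join(sorted(lhs)), "->", a)
```

On the example F this sketch ends with A → B, ACD → E, EF → G, and EF → H, the same cover as derived by hand.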

What to do with Minimal Cover?


After obtaining the minimal cover, for each FD X → A in the minimal cover that is not preserved, create a table consisting of XA (so we can check the dependency in this new table, i.e., the dependency is preserved).

Why is this new table guaranteed to be in 3NF (whereas if we created the new table from F, it might not be)?
- Since X → A is in the minimal cover, Y → A does not hold for any Y that is a strict subset of X.
- So X is a key for XA (satisfies condition #2).
- If any other dependencies hold over XA, the right side can involve only attributes in X because A is a single attribute (satisfies condition #3).

Comparison of BCNF and 3NF


It is always possible to decompose a relation into a set of relations that are in 3NF such that:
- the decomposition is lossless, and
- the dependencies are preserved.

It is always possible to decompose a relation into a set of relations that are in BCNF such that:
- the decomposition is lossless,
- but it may not be possible to preserve dependencies.

Normalization Review
- Identify all FDs in F+
- Identify candidate keys
- Identify (strongest, or specific) normal forms: BCNF, 3NF
- Schema decomposition
  - When to decompose
  - How to check if a decomposition is lossless-join and/or dependency preserving
    (use the projection of F+ to check for dependency preservation)
- Decompose into relations that are:
  - lossless-join
  - dependency preserving (use a minimal cover)

Normalization Theory Practice Questions

Example
A  B  C
1  1  2
1  1  3
2  2  3
2  2  2

FDs with A on the left side    Satisfied by the relation?
A → A                          Yes (trivial FD)
A → B                          Yes
A → C                          No: tuples 1 & 2
AB → A                         Yes (trivial FD)
AC → B                         Yes
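Checking whether an instance satisfies an FD is a simple scan: two tuples that agree on the left side must agree on the right side. A minimal sketch, assuming the flattened table is read column by column (A = 1, 1, 2, 2; B = 1, 1, 2, 2; C = 2, 3, 3, 2) and using an illustrative helper named `satisfies`:

```python
def satisfies(rows, lhs, rhs):
    """True if every pair of rows agreeing on lhs also agrees on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # Remember the first rhs value seen for this lhs value.
        if seen.setdefault(key, val) != val:
            return False
    return True

rows = [
    {"A": 1, "B": 1, "C": 2},
    {"A": 1, "B": 1, "C": 3},
    {"A": 2, "B": 2, "C": 3},
    {"A": 2, "B": 2, "C": 2},
]

for lhs, rhs in [("A", "A"), ("A", "B"), ("A", "C"), ("AB", "A"), ("AC", "B")]:
    print(f"{lhs} -> {rhs}:", satisfies(rows, lhs, rhs))
```

Note that an instance can only refute an FD; passing this check on one instance does not show that the FD holds for the schema in general.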

Example
Let F = {A → BC, B → C}. Is C → AB in F+?

Answer: No. Either of the following two reasons is OK:
- Reason 1: C+ = C, which does not include AB.
- Reason 2: We can find a relation instance that satisfies F but does not satisfy C → AB:

A  B  C
1  1  2
2  1  2

List all the non-trivial FDs in F+


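Given F = {A → B, B → C}. Compute F+ (with attributes A, B, C).

Possible left-hand and right-hand sides: A, B, C, AB, AC, BC, ABC. An FD X → Y is in F+ exactly when Y ⊆ X+, so F+ can be read off the attribute closures.

Attribute closure:

A+ = ABC
B+ = BC
C+ = C
AB+ = ABC
AC+ = ABC
BC+ = BC
ABC+ = ABC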
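The attribute closures can be computed with the standard fixpoint loop: keep adding the right side of any FD whose left side is already covered. A minimal sketch, assuming single-character attributes and FDs as (left, right) string pairs; the function name `closure` is illustrative.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds (pairs of attribute strings)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

F = [("A", "B"), ("B", "C")]
closures = {
    x: "".join(sorted(closure(x, F)))
    for x in ["A", "B", "C", "AB", "AC", "BC", "ABC"]
}
for x, c in closures.items():
    print(f"{x}+ = {c}")
```

The non-trivial FDs in F+ are then the pairs X → Y with Y ⊆ X+ and Y not a subset of X, for example A → C and AB → C.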



Example
Given F = {A → B, B → C}. Find a relation that satisfies F:

A  B  C
1  1  2
2  1  2

Given F = {A → B, B → C}. Find a relation that satisfies F but does not satisfy B → A. The above example suffices.

Can you find an instance that satisfies F but not A → C? No, because A → C is in F+.

Examples
R(A, B, C, D, E), F = {A → B, C → D}

Candidate key: ACE. How do we know? Intuitively:
- B cannot be in a candidate key.
- A (like E) is not determined by any other attributes, so A has to be in a candidate key (because a candidate key has to determine all the attributes).
- Now if A is in a candidate key, B cannot be in the same candidate key, since we can drop B from the candidate without losing the property of being a key.
- The same reasoning applies to the other attributes.

Example
R(A, B, C, D, E), F = {A → B, C → D} [Same as previous]

Which normal form? Not in BCNF. This is the case where all attributes in the FDs appear in R. We consider A and C to see whether either is a superkey. Obviously, neither A nor C is a superkey, and hence R is not in BCNF. More precisely, A → B is in F+ and non-trivial, but A is not a superkey of R.

Example
R(A, B, C, D, E), F = {A → B, C → D} [Same as previous]

Which normal form? We already know that it is not in BCNF. It is not in 3NF either: A → B is in F+ and non-trivial, but A is not a superkey of R. Furthermore, B is not in any candidate key (since the only candidate key is ACE).

Example
R(A, B, F), F = {AC → E, B → F}
(AC → E mentions attributes C and E that are not in R, so only B → F constrains R.)

Candidate key? AB
BCNF? No, because of B → F (B is not a superkey).
3NF? No, because of B → F (F is not part of a candidate key).

Example
R(D, C, H, G), F = {A → I, I → A}
(No FD in F involves attributes of R, so only trivial FDs hold over R.)

Candidate key? DCHG
BCNF? Yes
3NF? Yes

Example
R(A, B, C, D, E, G, H), F = {AB → C, AC → B, B → D, BC → A, E → G}

Candidate keys?
- H has to be in all candidate keys.
- E has to be in all candidate keys.
- G cannot be in any candidate key (since E is in all candidate keys already).
- Since AB → C, AC → B, and BC → A, we know no candidate key can have ABC together.
- AEH, BEH, and CEH are not superkeys.
- Try ABEH, ACEH, BCEH. They are all superkeys, and by the properties above they are all candidate keys.
- These are the only candidate keys: (1) each candidate key contains A, B, or C, since no attributes other than A, B, C determine A, B, C, and (2) if a candidate key contains A, then it must also contain either B or C, and so on.
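The reasoning about candidate keys can be cross-checked by brute force: try attribute sets in order of increasing size and keep those whose closure is the whole schema and that contain no smaller key. A minimal sketch, assuming single-character attributes; the helper names are illustrative, and the search is exponential in the number of attributes, so it is only meant for small examples like this one.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds (pairs of attribute strings)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(schema, fds):
    """All minimal attribute sets whose closure is the whole schema."""
    schema = set(schema)
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            s = set(combo)
            if any(k <= s for k in keys):
                continue  # contains a smaller key, so not minimal
            if closure(s, fds) >= schema:
                keys.append(s)
    return keys

F = [("AB", "C"), ("AC", "B"), ("B", "D"), ("BC", "A"), ("E", "G")]
keys = sorted("".join(sorted(k)) for k in candidate_keys("ABCDEGH", F))
print(keys)  # prints ['ABEH', 'ACEH', 'BCEH']
```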

Example
Same as previous. Not in BCNF, not in 3NF.

Decomposition:
- ABCDEGH splits into BD and ABCEGH (using B → D).
- ABCEGH splits into ABC and ABEGH (using AB → C).
- ABEGH splits into EG and ABEH (using E → G).

Example
R(A, B, C, D, E, G, H), F = {AB → C, AC → B, B → D, BC → A, E → G}

Decomposition: BD, ABC, EG, ABEH

Why is this a good decomposition?
- The relations are all in BCNF.
- It is a lossless-join decomposition.
- All dependencies are preserved.

Example
R(A, B, D, E) decomposed into R1(A, B, D) and R2(A, B, E), with F = {AB → DE}.

It is a dependency-preserving decomposition!
- {AB → DE} is equivalent to {AB → D, AB → E}.
- AB → D can be checked in R1.
- AB → E can be checked in R2.
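Dependency preservation can be tested by projecting the FDs onto each relation and checking that the union of the projections still implies every FD in F. The sketch below does this by enumerating attribute subsets, which is exponential and only meant for small examples; it also includes the usual test for a lossless binary decomposition (the common attributes determine one of the two relations). All function names are illustrative assumptions.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds (pairs of attribute collections)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def project(fds, attrs):
    """Non-trivial FDs X -> (X+ within attrs) for every non-empty X in attrs."""
    attrs = sorted(attrs)
    projected = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(attrs, size):
            rhs = (closure(combo, fds) & set(attrs)) - set(combo)
            if rhs:
                projected.append((frozenset(combo), frozenset(rhs)))
    return projected

def preserves_dependencies(fds, parts):
    """True if the projected FDs of all parts together imply every FD in fds."""
    union = [fd for part in parts for fd in project(fds, part)]
    return all(set(rhs) <= closure(lhs, union) for lhs, rhs in fds)

def lossless_binary(fds, r1, r2):
    """True if the common attributes of r1 and r2 determine r1 or r2."""
    common_closure = closure(set(r1) & set(r2), fds)
    return set(r1) <= common_closure or set(r2) <= common_closure

F = [("AB", "DE")]
print(preserves_dependencies(F, ["ABD", "ABE"]))  # prints True
print(lossless_binary(F, "ABD", "ABE"))           # prints True
```

For this example the projections are AB → D on R1 and AB → E on R2, which together imply AB → DE.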
