Schema Refinement & Normalization Theory
Normalization/Decomposition
Week 13
Solution: define a weaker normal form, called Third Normal Form (3NF)
Allows some redundancy (with resultant problems; we will see examples later).
But functional dependencies can be checked on individual relations without computing a join.
There is always a lossless-join, dependency-preserving decomposition into 3NF.
3NF
Reln R with FDs F is in 3NF if, for each FD X -> A in F, one of the following statements is true:
A ∈ X (trivial FD), or X is a superkey, or A is part of some key for R
If one of the first two statements holds for every FD, then R is in BCNF
e.g., Reserves SBDC (keys are SBD, CBD) with S -> C, C -> S is in 3NF, but for each reservation of sailor S, the same (S, C) pair is stored.
If X -> Y is not preserved, add relation XY. The problem is that XY may violate 3NF!
Refinement: Instead of the given set of FDs F, use a minimal cover for F.
A minimal cover for F is a set of FDs G such that:
Closure of F = closure of G.
The right-hand side of each FD in G is a single attribute.
If we modify G by deleting an FD or by deleting attributes from an FD in G, the closure changes.
Intuitively, every FD in G is needed and is as small as possible in order to get the same closure as F.
Example
F = {A -> B, ABCD -> E, EF -> G, EF -> H, ACDF -> EG}
Rewrite ACDF -> EG as {ACDF -> E, ACDF -> G}.
ACDF -> G is implied by {A -> B, ABCD -> E, EF -> G}: ACDF -> ABCDF -> EF -> G, so DELETE it.
ACDF -> E is implied by {A -> B, ABCD -> E}: ACDF -> ABCD -> E, so DELETE it.
ABCD -> E can be replaced by ACD -> E, since A -> B.
So the minimal cover for F is:
A -> B, ACD -> E, EF -> G and EF -> H
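The derivation above can be replayed mechanically. The following is a minimal sketch, not the course's reference implementation: the names `closure` and `minimal_cover` are hypothetical, and FDs are assumed to be given as pairs of attribute strings. Unlike the slide, it shrinks left-hand sides before deleting redundant FDs, which is the order usually recommended for this procedure.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def minimal_cover(fds):
    # Step 1: split every right-hand side into single attributes.
    g = []
    for lhs, rhs in fds:
        for a in sorted(rhs):
            fd = (frozenset(lhs), frozenset(a))
            if fd not in g:
                g.append(fd)
    # Step 2: drop left-hand-side attributes that are not needed.
    for i in range(len(g)):
        lhs, rhs = g[i]
        for a in sorted(lhs):
            smaller = lhs - {a}
            if rhs <= closure(smaller, g):
                lhs = smaller
                g[i] = (lhs, rhs)
    g = list(dict.fromkeys(g))  # remove duplicates, keep order
    # Step 3: drop FDs that are implied by the remaining ones.
    for fd in list(g):
        rest = [f for f in g if f != fd]
        if fd[1] <= closure(fd[0], rest):
            g = rest
    return g


F = [("A", "B"), ("ABCD", "E"), ("EF", "G"), ("EF", "H"), ("ACDF", "EG")]
cover = minimal_cover(F)
for lhs, rhs in cover:
    print("".join(sorted(lhs)), "->", "".join(sorted(rhs)))
```

For the F above this prints A -> B, ACD -> E, EF -> G and EF -> H, matching the slide. In general a set of FDs can have several minimal covers, so a different processing order may give a different but equivalent result.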
It is always possible to decompose a relation into a set of relations that are in BCNF such that the decomposition is lossless-join.
However, it may not be possible to preserve dependencies.
Normalization Review
Identify all FDs in F+
Identify candidate keys
Identify (strongest, or specific) normal forms
BCNF, 3NF
Schema decomposition
When to decompose
How to check if a decomposition is lossless-join and/or dependency preserving
Use projection of F+ to check for dependency preservation
Decompose into:
Lossless-join
Dependency preserving
Use minimal cover
Example
A  B  C
1  1  2
1  1  3
2  2  3
2  2  2
FDs with A on the left side: satisfied by the relation?
A -> A     Yes (trivial FD)
A -> B     Yes
A -> C     No: tuples 1&2
AB -> A    Yes (trivial FD)
AC -> B    Yes
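Checking whether an instance satisfies an FD is easy to automate. This is a small sketch under the assumption that the relation is a list of dictionaries; the function name `satisfies` is hypothetical. It applies the definition directly: any two tuples that agree on the left side must agree on the right side.

```python
def satisfies(rows, lhs, rhs):
    """True if all rows that agree on `lhs` also agree on `rhs`."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # Remember the first rhs value seen for this lhs value.
        if seen.setdefault(key, val) != val:
            return False
    return True


# The four-tuple instance from the example above.
rows = [dict(zip("ABC", t)) for t in [(1, 1, 2), (1, 1, 3), (2, 2, 3), (2, 2, 2)]]

for lhs, rhs in [("A", "A"), ("A", "B"), ("A", "C"), ("AB", "A"), ("AC", "B")]:
    print(lhs, "->", rhs, satisfies(rows, lhs, rhs))
```

Only A -> C comes out False, because tuples 1 and 2 agree on A but differ on C.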
Example
Let F = {A -> BC, B -> C}. Is C -> AB in F+?
Answer: No. Either of the following 2 reasons is ok:
Reason 1) C+ = C, which does not include AB.
Reason 2) We can find a relation instance that satisfies F but does not satisfy C -> AB:
A  B  C
1  1  2
2  1  2
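Reason 1 relies on the attribute closure C+. A minimal sketch of the usual fixpoint computation follows; the function names `closure` and `implies` are hypothetical, and FDs are assumed to be written as pairs of attribute strings.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire an FD once its whole left side has been derived.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def implies(fds, lhs, rhs):
    """X -> Y is in F+ exactly when Y is contained in X+."""
    return set(rhs) <= closure(lhs, fds)


F = [("A", "BC"), ("B", "C")]
print(sorted(closure("C", F)))   # ['C']
print(implies(F, "C", "AB"))     # False
print(implies(F, "A", "C"))      # True
```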
Example
Given F = {A -> B, B -> C}. Find a relation that satisfies F:
A  B  C
1  1  2
2  1  2
Given F = {A -> B, B -> C}. Find a relation that satisfies F but does not satisfy B -> A. The above example suffices.
Can you find an instance that satisfies F but not A -> C? No, because A -> C is in F+.
Examples
R(A, B, C, D, E), F = {A -> B, C -> D}
Candidate key: ACE. How do we know? Intuitively,
- B cannot be in a candidate key.
- A, like E, is not determined by any other attributes, so A has to be in every candidate key (because a candidate key has to determine all the attributes).
- Now if A is in a candidate key, B cannot be in the same candidate key, since we can drop B from the candidate without losing the property of being a key.
- The same reasoning applies to the other attributes.
Example
R(A, B, C, D, E), F = {A -> B, C -> D} [Same as previous]
Which normal form? Not in BCNF. This is the case where all attributes in the FDs appear in R. We consider A and C to see whether either is a superkey. Neither A nor C is a superkey, and hence R is not in BCNF. More precisely, A -> B is in F+ and is nontrivial, but A is not a superkey of R.
Example
R(A, B, C, D, E), F = {A -> B, C -> D} [Same as previous]
Which normal form? We already know that it is not in BCNF. It is not in 3NF either. A -> B is in F+ and is nontrivial, but A is not a superkey of R. Furthermore, B is not in any candidate key (since the only candidate key is ACE).
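The BCNF and 3NF tests used in the last two examples can be written down as a short check. This is only a sketch under stated assumptions: the candidate keys are supplied by the caller rather than computed, only the given FDs are examined, every FD mentions only attributes of R, and the names `closure` and `violations` are hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def violations(attrs, fds, keys, form):
    """List the (lhs, attribute) pairs that violate `form` ("BCNF" or "3NF")."""
    all_attrs = set(attrs)
    prime = set().union(*map(set, keys))   # attributes that occur in some key
    bad = []
    for lhs, rhs in fds:
        if closure(lhs, fds) >= all_attrs:
            continue                       # lhs is a superkey
        for a in sorted(set(rhs) - set(lhs)):   # skip the trivial part
            if form == "3NF" and a in prime:
                continue                   # a is part of some key
            bad.append((lhs, a))
    return bad


# R(A, B, C, D, E) with the single candidate key ACE.
F = [("A", "B"), ("C", "D")]
print(violations("ABCDE", F, ["ACE"], "BCNF"))  # [('A', 'B'), ('C', 'D')]
print(violations("ABCDE", F, ["ACE"], "3NF"))   # [('A', 'B'), ('C', 'D')]

# Reserves SBDC with keys SBD and CBD: in 3NF but not in BCNF.
G = [("S", "C"), ("C", "S")]
print(violations("SBDC", G, ["SBD", "CBD"], "3NF"))   # []
print(violations("SBDC", G, ["SBD", "CBD"], "BCNF"))  # [('S', 'C'), ('C', 'S')]
```

An empty list means the relation passes the test for that normal form.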
Example
R(A, B, F), F = {AC -> E, B -> F}.
Candidate key? AB
BCNF? No, because of B -> F (B is not a superkey).
3NF? No, because of B -> F (F is not part of a candidate key).
Example
R(D, C, H, G), F = {A -> I, I -> A}
Candidate key? DCHG
BCNF? Yes
3NF? Yes
Example
R(A, B, C, D, E, G, H), F = {AB -> C, AC -> B, B -> D, BC -> A, E -> G}
Candidate keys?
- H has to be in all candidate keys.
- E has to be in all candidate keys.
- G cannot be in any candidate key (since E is in all candidate keys already).
- Since AB -> C, AC -> B and BC -> A, no candidate key can contain A, B and C together.
- AEH, BEH, CEH are not superkeys.
- Try ABEH, ACEH, BCEH. They are all superkeys, and by the properties above they are all candidate keys.
- These are the only candidate keys: (1) each candidate key contains A, B, or C, since no attributes other than A, B, C determine A, B, C, and (2) if a candidate key contains A, then it must also contain either B or C, and so on.
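The reasoning above can be cross-checked by brute force. The sketch below (function names `closure` and `candidate_keys` are hypothetical) enumerates attribute subsets from smallest to largest and keeps each superkey that does not contain an already found key. It is exponential in the number of attributes, so it is only meant for slide-sized examples.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attrs, fds):
    """All minimal attribute sets whose closure is the whole relation."""
    all_attrs = set(attrs)
    keys = []
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(sorted(all_attrs), size):
            subset = set(combo)
            if any(k <= subset for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(subset, fds) == all_attrs:
                keys.append(subset)
    return ["".join(sorted(k)) for k in keys]


F = [("AB", "C"), ("AC", "B"), ("B", "D"), ("BC", "A"), ("E", "G")]
print(candidate_keys("ABCDEGH", F))                       # ['ABEH', 'ACEH', 'BCEH']
print(candidate_keys("ABCDE", [("A", "B"), ("C", "D")]))  # ['ACE']
```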
Example
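Same as previous
Not in BCNF, not in 3NF
Decomposition:
ABCDEGH splits into BD and ABCEGH (using B -> D)
ABCEGH splits into ABC and ABEGH (using AB -> C)
ABEGH splits into EG and ABEH (using E -> G)
Resulting relations: BD, ABC, EG, ABEH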
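Each step of this decomposition splits one relation in two, so its lossless-join property can be checked with the standard binary test: the common attributes must determine all of one of the two parts. A minimal sketch, with hypothetical function names and FDs written as pairs of attribute strings:

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_lossless_split(r1, r2, fds):
    """Binary test: (R1 ∩ R2) -> R1 or (R1 ∩ R2) -> R2 must hold."""
    common_closure = closure(set(r1) & set(r2), fds)
    return set(r1) <= common_closure or set(r2) <= common_closure


F = [("AB", "C"), ("AC", "B"), ("B", "D"), ("BC", "A"), ("E", "G")]

# The three splits from the decomposition above.
steps = [("BD", "ABCEGH"), ("ABC", "ABEGH"), ("EG", "ABEH")]
for r1, r2 in steps:
    print(r1, r2, is_lossless_split(r1, r2, F))   # True for every step
```

Since every individual split is lossless, joining the four final relations back together recovers the original relation.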
Example
R(A, B, C, D, E, G, H), F = {AB -> C, AC -> B, B -> D, BC -> A, E -> G}
Example
R(A, B, D, E) decomposed into R1(A, B, D), R2(A, B, E)
F = {AB -> DE}
It is a dependency-preserving decomposition!
{AB -> DE} is equivalent to {AB -> D, AB -> E}
AB -> D can be checked in R1
AB -> E can be checked in R2
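Dependency preservation can also be tested without listing the projected FDs explicitly. The sketch below uses the standard iterative test: for each FD X -> Y, grow X using only what can be derived inside one decomposed relation at a time, then check that Y was reached. The function names are hypothetical, and the second example (R(A, B, C) split into AB and AC) is a textbook-style counterexample that is not from the slides.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def closure_via_projections(attrs, fds, schemas):
    """Closure of `attrs` using only FDs checkable inside a single schema."""
    z = set(attrs)
    changed = True
    while changed:
        changed = False
        for schema in schemas:
            r = set(schema)
            gained = (closure(z & r, fds) & r) - z
            if gained:
                z |= gained
                changed = True
    return z


def preserves_dependencies(fds, schemas):
    return all(set(rhs) <= closure_via_projections(lhs, fds, schemas)
               for lhs, rhs in fds)


# Example from the slide: R1(A, B, D), R2(A, B, E), F = {AB -> DE}.
print(preserves_dependencies([("AB", "DE")], ["ABD", "ABE"]))         # True

# Not from the slides: B -> C is lost when ABC is split into AB and AC.
print(preserves_dependencies([("A", "B"), ("B", "C")], ["AB", "AC"]))  # False
```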