Lecture7-Keys_and_FD
A functional dependency (FD) on a relation
R is a statement of the form X → Y, where X
and Y are sets of attributes in a relation R.
A1, A2, …, An → B1, B2, …, Bm
In a relation r, a set of attributes Y is functionally
dependent upon another set of attributes X (X → Y) iff
for all pairs of tuples t1 and t2 in r,
if t1[X] = t2[X] then t1[Y] = t2[Y].
FD EXAMPLE (2)
Every student is classified as either a Freshman, Sophomore,
Junior, or Senior.
{StudentID} → {Year}
Students can take only a single section of a class, taught by a
single instructor.
{StudentID, Class} → {Instructor}
Key(s): {StudentID, Class}
     StudentID  Year       Class    Instructor
t1   1          Sophomore  COMP355  Wu
t2   2          Sophomore  COMP285  Wu
t3   3          Junior     COMP355  Wu
t4   3          Junior     COMP285  Wu
t5   2          Sophomore  COMP355  Russo
t6   4          Sophomore  COMP355  Russo
FD EXAMPLE (3)
{StudentID} ↛{Instructor} {Class} ↛{Year}
{StudentID} ↛ {Class} {Class} ↛ {StudentID}
{Year} ↛ {StudentID} {Class} ↛ {Instructor}
{Year} ↛ {Instructor} {Instructor} ↛ {Class}
{Year} ↛ {Class} {Instructor} ↛ {Year}
{Instructor} ↛ {StudentID}
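The definition can be checked mechanically against the example table. The following is a minimal sketch, not part of the lecture: the helper name `fd_holds` and the row encoding are assumptions made for illustration. Note that a single instance can only refute an FD; it cannot prove that the FD holds for every legal instance of the relation.

```python
# Rows of the example relation: (StudentID, Year, Class, Instructor).
COLUMNS = ("StudentID", "Year", "Class", "Instructor")
ROWS = [
    (1, "Sophomore", "COMP355", "Wu"),
    (2, "Sophomore", "COMP285", "Wu"),
    (3, "Junior", "COMP355", "Wu"),
    (3, "Junior", "COMP285", "Wu"),
    (2, "Sophomore", "COMP355", "Russo"),
    (4, "Sophomore", "COMP355", "Russo"),
]


def fd_holds(rows, columns, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    lhs_idx = [columns.index(a) for a in lhs]
    rhs_idx = [columns.index(a) for a in rhs]
    seen = {}  # maps each lhs value to the rhs value first seen with it
    for row in rows:
        x = tuple(row[i] for i in lhs_idx)
        y = tuple(row[i] for i in rhs_idx)
        if seen.setdefault(x, y) != y:
            return False  # same X value, different Y value
    return True


print(fd_holds(ROWS, COLUMNS, ["StudentID"], ["Year"]))
print(fd_holds(ROWS, COLUMNS, ["StudentID", "Class"], ["Instructor"]))
print(fd_holds(ROWS, COLUMNS, ["StudentID"], ["Instructor"]))
```

The first two calls test the FDs stated for the example; the third tests one of the dependencies listed as not holding (student 2 appears with both Wu and Russo).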
Title      Year  Length  Genre  studioName  starName
Star Wars  1977  124     SciFi  Fox         Carrie Fisher
FD EXERCISE
Attributes: A B C D E
Functional Dependencies:
A → B
CD → E
BD → A
D → C
Keys: DA, DB
We say a set of one or more attributes K is a key for a relation R if:
▪ Those attributes functionally determine all other attributes
of the relation. That is, it’s impossible for two distinct tuples
of R to agree on all attributes of K.
▪ No proper subset of K functionally determines all other
attributes of R, i.e., a key must be minimal.
When a key consists of a single attribute A, we often say that A
(rather than {A}) is a key.
A set of attributes 𝐾 is a key for a relation 𝑅 if
• 𝐾 → all (other) attributes of 𝑅
• That is, 𝐾 is a “super key”
• No proper subset of 𝐾 satisfies the above condition
• That is, 𝐾 is minimal
Ex: Attributes {title, year, starName}
form a key for the relation Movies1.
▪ Suppose two tuples agree on
these three attributes: title, year,
and starName.
▪ Because they agree on title and
year, they must agree on other
attributes length, genre, and
studioName.
▪ Next, argue that no proper subset of
{title, year, starName} functionally
determines all other attributes.
▪ Title and year do not determine
starName, because many movies
have more than one star.
▪ Thus {title, year} is not a key.
Similar arguments show that
{year, starName} and {title, starName}
are not keys either.
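The two conditions of the key definition (superkey, and minimal) can be sketched in code. This is an illustrative sketch only: the function names are hypothetical, and the single FD used for Movies1 (title, year → length, genre, studioName) is an assumption read off the argument above, not a complete list of the FD's of the relation.

```python
def closure(attrs, fds):
    """Set of attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_superkey(k, all_attrs, fds):
    """K is a superkey if it determines every attribute of the relation."""
    return closure(k, fds) >= set(all_attrs)


def is_key(k, all_attrs, fds):
    """K is a key if it is a superkey and no proper subset is one."""
    k = set(k)
    if not is_superkey(k, all_attrs, fds):
        return False
    # Dropping one attribute at a time is enough: any smaller superkey
    # would be contained in one of these subsets.
    return not any(is_superkey(k - {a}, all_attrs, fds) for a in k)


MOVIES1 = {"title", "year", "length", "genre", "studioName", "starName"}
# Assumed FD, taken from the example's reasoning.
FDS = [({"title", "year"}, {"length", "genre", "studioName"})]

print(is_key({"title", "year", "starName"}, MOVIES1, FDS))
print(is_key({"title", "year"}, MOVIES1, FDS))
```

Under the assumed FD, {title, year} fails because its closure never reaches starName, which matches the argument in the example.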
▪ Sometimes a relation has more than one key. If so, it’s
common to designate one of the keys as the primary key
(PK).
▪ In commercial database systems, the choice of PK can
influence some implementation issues such as how the
relation is stored on disk. However, the theory of FD’s gives no
special role to the PK.
• Source attributes (𝑆𝐴): Those that appear only on the
left side of the functional dependencies (FDs), or that
are not part of any FD.
• Intermediate attributes (𝐼𝐴): Those that appear on both
sides of the FDs.
• Target attributes (𝑇𝐴): Those that appear only on the
right side of the FDs.
Step 1: Determine source attributes (SA), Intermediate
attributes (IA).
Step 2: If IA = ∅ then
K = SA is the only key.
Return K;
Step 3: Determine all subsets of IA.
Step 4: Determine the super keys Si from ∀𝑋𝑖 ⊆ 𝐼𝐴.
IF (𝑆𝐴 ∪ 𝑋𝑖)+ = 𝑅+ THEN
𝑆𝑖 = 𝑆𝐴 ∪ 𝑋𝑖
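The steps above can be sketched as follows. This is a hedged illustration rather than the lecture's own implementation: the name `find_keys` is hypothetical, and the final filtering of superkeys down to minimal ones is an assumed last step that the listed steps do not spell out. Step 2 needs no special case here, because with an empty IA the only subset tried is the empty set.

```python
from itertools import combinations


def closure(attrs, fds):
    """Set of attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def find_keys(all_attrs, fds):
    all_attrs = set(all_attrs)
    left = set().union(*(lhs for lhs, _ in fds))
    right = set().union(*(rhs for _, rhs in fds))
    # Step 1: SA = attributes never on a right side (left-only, or in
    # no FD at all); IA = attributes on both sides.
    sa = all_attrs - right
    ia = sorted(left & right)
    # Steps 3-4: try SA together with every subset Xi of IA.
    superkeys = []
    for size in range(len(ia) + 1):
        for xi in combinations(ia, size):
            candidate = sa | set(xi)
            if closure(candidate, fds) == all_attrs:
                superkeys.append(candidate)
    # Assumed final step: keep only superkeys with no smaller superkey.
    return [s for s in superkeys if not any(t < s for t in superkeys)]


# FDs from the FD exercise: A -> B, CD -> E, BD -> A, D -> C.
FDS = [("A", "B"), ("CD", "E"), ("BD", "A"), ("D", "C")]
keys = find_keys("ABCDE", FDS)
print(sorted("".join(sorted(k)) for k in keys))
```

For the exercise's FD's, SA = {D} and IA = {A, B, C}; the sketch reports the keys {A, D} and {B, D}, which agrees with the keys DA and DB given in the exercise.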
For example:
Title, year → title, genre
is equivalent to
Title, year → genre
Given 𝑅, a set of FD’s ℱ that hold in 𝑅, and a set of
attributes X in 𝑅:
The closure of X (denoted X+) with respect to ℱ
is the set of all attributes {𝐴1,𝐴2,…} functionally
determined by X (that is, X → 𝐴1𝐴2 …)
Input: A set of attributes X and a set of FD’s ℱ
Output: The closure X+
Start with closure = X
If Z → Y is in ℱ and Z is already in the closure, then also add Y to
the closure
Repeat until no new attributes can be added.
Input: R, ℱ, X ⊆ R+
Output: X+
Step 1: Set X+ = X
Step 2: temp = X+
  ∀ f: Z → Y ∈ ℱ
    if (Z ⊆ X+)
      X+ = X+ ∪ Y
      ℱ = ℱ – f
Step 3: if (X+ = temp)
  “X+ is the result”, stop
else
  return to Step 2
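A minimal Python sketch of the closure procedure follows. The function name and the encoding of FD's as (left side, right side) pairs of attribute strings are assumptions made for illustration. The list `remaining` mirrors the step ℱ = ℱ – f: an FD that has already fired is not examined again.

```python
def closure(x, fds):
    """Compute X+ for attribute set x under FDs given as (Z, Y) pairs."""
    result = set(x)                                   # Step 1: X+ = X
    remaining = [(set(z), set(y)) for z, y in fds]
    while True:
        before = set(result)                          # Step 2: temp = X+
        unused = []
        for z, y in remaining:
            if z <= result:                           # Z is inside X+
                result |= y                           # X+ = X+ ∪ Y
            else:
                unused.append((z, y))                 # keep f for later
        remaining = unused                            # ℱ = ℱ – used FDs
        if result == before:                          # Step 3: no change
            return result


# Single-letter attributes, so "AB" stands for the set {A, B}.
FDS = [("AB", "C"), ("BC", "AD"), ("D", "E"), ("CG", "B")]
print(sorted(closure("AB", FDS)))
```

With these FD's, CG → B never fires for the input {A, B}, because G is never added to the closure.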
Let us consider a relation R(A,B,C,D,E,G) and FD’s
ℱ ={AB → C, BC → AD, D → E, CG → B} .
What is {A,B}+?
X ={A,B}
Next, X = {A,B,C} based on AB → C
X = {A,B,C,D} based on BC → AD
X = {A,B,C,D,E} based on D → E and no more changes to X are
possible.
Thus, {A,B}+ = {A,B,C,D,E}
▪ Let us consider a relation R(A,B,C,D,E,G) and FD’s
ℱ = {AB → C, BC → AD, D → E, CG → B} .
Suppose we wish to test whether AB → D follows from these
FD’s.
▪ We compute {A,B}+ = {A,B,C,D,E}. Since D is a member of
the closure, we conclude that AB → D does follow.
▪ However, D → A does not follow, because {D}+ = {D,E}
does not contain A.
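The test used here (X → Y follows from ℱ exactly when Y is contained in X+) can be written as a short sketch. The helper names are hypothetical, and FD's are encoded as pairs of attribute strings for brevity.

```python
def closure(attrs, fds):
    """Set of attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def follows(lhs, rhs, fds):
    """True if lhs -> rhs is implied by fds, i.e. rhs is inside lhs+."""
    return set(rhs) <= closure(lhs, fds)


FDS = [("AB", "C"), ("BC", "AD"), ("D", "E"), ("CG", "B")]
print(follows("AB", "D", FDS))
print(follows("D", "A", FDS))
```

The first call tests AB → D and the second tests D → A, the two cases discussed above.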
1. Let us consider a relation R(A,B,C,D,E,G) and FD’s
ℱ ={AB → D, AC → BD, D → G, CG →A} .
What is {A,C}+, {B,D}+?
Find F1?
Input: Two relations R, R1, where R1 is computed by the projection
R1 = π_L(R). Also, a set of FD’s F that hold in R.
Output: The set of FD’s that hold in R1.
1. Let F1 be the eventual output set of FD’s. Initially, F1 is
empty
2. For each set of attributes X that is a subset of the
attributes of R1, compute X+. This computation is
performed with respect to the set of FD’s F, and may
involve attributes that are in the schema of R but not R1.
Add to F1 all nontrivial FD’s X → A such that A is both in X+
and an attribute of R1.
3. Now, F1 is a basis for the FD’s that hold in R1 but may not
be a minimal cover. We may construct a minimal basis by
modifying F1 as follows:
a. If there is an FD f1 in F1 that follows from the other FD’s
in F1, remove f1 from F1.
b. Let Y → B be an FD in F1, with at least two attributes in
Y, and let Z be Y with one of its attributes removed. If Z
→B follows from the FD’s in F1 (including Y →B), then
replace Y → B by Z →B.
c. Repeat the above steps in all possible ways until no
more changes to F1 can be made.
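The projection procedure can be sketched in Python as below. This is an illustration under stated assumptions, not a definitive implementation: the function names are hypothetical, FD's in F1 are stored as (left side, single right attribute) pairs, and the minimization applies steps 3a and 3b in one fixed order, so when several minimal bases exist it returns just one of them. Like the procedure itself, it examines every subset of the attributes of R1, so it is exponential in the size of R1.

```python
from itertools import combinations


def closure(attrs, fds):
    """Set of attributes functionally determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def project_fds(r1_attrs, fds):
    """Step 2: all nontrivial FDs X -> A with X and A inside R1."""
    r1 = sorted(set(r1_attrs))
    f1 = []
    for size in range(1, len(r1) + 1):
        for x in combinations(r1, size):
            # The closure uses the full set F, so it may pass through
            # attributes of R that are not in R1.
            for a in sorted((closure(x, fds) & set(r1)) - set(x)):
                f1.append((frozenset(x), a))
    return f1


def as_pairs(f1):
    """Convert (lhs, attribute) entries into (lhs, {attribute}) pairs."""
    return [(lhs, {a}) for lhs, a in f1]


def minimal_basis(f1):
    """Step 3: shrink left sides (3b), drop redundant FDs (3a), repeat."""
    basis = list(dict.fromkeys(f1))
    changed = True
    while changed:
        changed = False
        for i, (lhs, a) in enumerate(basis):
            for b in sorted(lhs):
                z = lhs - {b}
                if z and a in closure(z, as_pairs(basis)):
                    basis[i] = (z, a)          # replace Y -> B by Z -> B
                    changed = True
                    break
        basis = list(dict.fromkeys(basis))     # drop exact duplicates
        for fd in list(basis):
            rest = [g for g in basis if g != fd]
            if fd[1] in closure(fd[0], as_pairs(rest)):
                basis.remove(fd)               # fd follows from the rest
                changed = True
    return set(basis)


# R(A,B,C,D) with F = {A -> B, B -> C, C -> D}, projected onto {A,C,D}.
F = [("A", "B"), ("B", "C"), ("C", "D")]
F1 = minimal_basis(project_fds("ACD", F))
print(sorted((("".join(sorted(lhs)), a) for lhs, a in F1)))
```

The usage lines project out B from R(A,B,C,D) with F = {A → B, B → C, C → D}, so the result can be compared with a hand computation over the subsets of {A, C, D}.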
Given R(A,B,C,D) has FD’s F={A→B, B →C, C →D}. Suppose also that we wish to
project out the attribute B, leaving a relation R1(A,C,D). Find F1?
Answer:
In principle, to find the FD’s for R1, we need to take the closure of all eight
subsets of {A,C,D}, using the full set of FD’s including those involving B.
Subsets of {A,C,D}: ∅, A, C, D, AC, AD, CD, ACD.
Closures of all subsets: {}+ = ∅, {A}+ = {A,B,C,D}, {C}+ = {C,D},
{D}+ = {D}, {AC}+ = {A,B,C,D}, {AD}+ = {A,B,C,D}, {CD}+ = {C,D},
{ACD}+ = {A,B,C,D}.
F={A→B, B →C, C →D}
F1+ = R1(F) = {A→C, A→D, C→D}. We can observe that A→D
follows from the other two by transitivity. Therefore, a minimal
cover for the FD’s of R1 is
F1 = {A→C, C→D}.
1. Consider a relation with schema R(A,B,C,D) and FD’s
F ={AB →C, C →D, D →A}
a. What are all the nontrivial FD’s that follow from the
given FD’s? You should restrict yourself to FD’s with
single attribute on the right side.
b. What are all the keys of R?
c. What are all the super keys for R that are not keys?
2. Consider a relation with schema R(A,B,C,D)
and FD’s
F ={AB →C, BC →D, CD →A, AD →B}
a. What are all the nontrivial FD’s that follow from