Database Design Theory And Normalization
Module No. 3
Functional dependency (FD), Closure of FD, Closure of Attributes, Cover, Equivalence of FD,
Canonical cover, Key generation, Normalization, Desirable properties of decomposition.
Informal Design Guidelines for Relation Schemas
Introduction
• Database design methods
• Bottom-up approach
• Top-down approach (forms, invoices)
• Two levels of relation schemas
• The logical "user view" level
• The storage "base relation" level
• What are the criteria for "good" base relations?
• Objective: Goodness/quality of DB
• Functional dependency
• Normal Forms
Informal Design Guidelines for Relation Schemas
1. Semantics of the Attributes
2. Reducing the Redundant Value in Tuples.
3. Reducing Null values in Tuples.
4. Disallowing spurious Tuples.
1. Semantics of the Attributes
• To form a relational schema there should be some meaning among the attributes.
• This meaning is called semantics.
• This semantics relates one attribute to another with some relation.
• Informally, each tuple in a relation should represent one entity or relationship instance. (Applies to
individual relations and their attributes).
• Attributes of different entities (EMPLOYEEs, DEPARTMENTs, PROJECTs) should not be
mixed in the same relation.
• Only foreign keys should be used to refer to other entities.
• Entity and relationship attributes should be kept apart as much as possible.
2. Redundant Information in Tuples and Update Anomalies
• Data redundancy in a Database Management System (DBMS) refers to the repetition of the same
data in multiple places within a database. It is a concern because it can lead to inconsistencies,
update anomalies, and increased storage requirements, impacting data integrity and database
performance.
• Mixing attributes of multiple entities may cause problems
• Information is stored redundantly wasting storage
• Problems with update anomalies
• Insertion anomalies
• Deletion anomalies
• Modification anomalies
• Insert Anomaly: Cannot insert a project unless an employee is assigned to it.
Conversely, cannot insert an employee unless he/she is assigned to a project.
• Delete Anomaly: When a project is deleted, all the employees who work on that
project are deleted with it. Alternatively, if an employee is the sole employee on a
project, deleting that employee also deletes the corresponding project.
• Update Anomaly: Changing the name of project number P1 from “ProductX” to
“ProductY” requires this update to be made in the tuple of every employee
working on project P1.
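The update anomaly can be sketched in a few lines of Python. The relation below is hypothetical: the name `emp_proj`, the attributes `ssn`, `pnumber`, and `pname`, and the rows are assumptions made for illustration only. It mixes employee and project attributes, so the project name is repeated in every assignment row.

```python
# Hypothetical denormalized relation: one row per (employee, project) assignment.
emp_proj = [
    {"ssn": "E1", "pnumber": "P1", "pname": "ProductX"},
    {"ssn": "E2", "pnumber": "P1", "pname": "ProductX"},
    {"ssn": "E3", "pnumber": "P2", "pname": "ProductZ"},
]

def rename_project(rows, pnumber, new_name):
    """Rename a project; every row that repeats the name must be touched."""
    touched = 0
    for row in rows:
        if row["pnumber"] == pnumber:
            row["pname"] = new_name
            touched += 1
    return touched

def project_names(rows, pnumber):
    """Return the set of names stored for one project number."""
    return {row["pname"] for row in rows if row["pnumber"] == pnumber}

# Updating only one of the redundant copies leaves the data inconsistent.
emp_proj[0]["pname"] = "ProductY"
inconsistent = project_names(emp_proj, "P1")  # two different names for P1

# A correct update has to visit every assignment row of the project.
rows_touched = rename_project(emp_proj, "P1", "ProductY")
consistent = project_names(emp_proj, "P1")
```

If the project name were stored once in a separate project relation, the rename would touch a single tuple and the inconsistent state could not arise.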
3. Null Values in Tuples
• Attributes that are frequently NULL could be placed in separate relations
(with the primary key).
• Reasons for nulls:
- Attribute not applicable or invalid
- Attribute value unknown (may exist)
- Value known to exist, but unavailable
4. Spurious Tuples
• Bad designs for a relational database may result in erroneous results for
certain JOIN operations
• The "lossless join" property is used to guarantee meaningful results for join
operations
There are two important properties of decompositions:
- Non-additive join (losslessness of the corresponding join)
- Preservation of the functional dependencies.
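How spurious tuples arise can be illustrated with a small Python sketch. The relation `works`, its attributes, and its rows are hypothetical, and `project` and `natural_join` are simple helper functions written for this example, not part of any library.

```python
def project(rows, attrs):
    """Projection: keep only attrs and drop duplicate tuples."""
    return {tuple((a, row[a]) for a in attrs) for row in rows}

def natural_join(r1, r2):
    """Natural join of two relations given as sets of (attr, value) tuples."""
    result = set()
    for t1 in r1:
        d1 = dict(t1)
        for t2 in r2:
            d2 = dict(t2)
            common = d1.keys() & d2.keys()
            if all(d1[a] == d2[a] for a in common):
                merged = {**d1, **d2}
                result.add(tuple(sorted(merged.items())))
    return result

# Hypothetical relation: who works on which project, and where.
works = [
    {"emp": "Smith", "proj": "P1", "loc": "Houston"},
    {"emp": "Wong", "proj": "P2", "loc": "Houston"},
]
original = {tuple(sorted(row.items())) for row in works}

# Bad decomposition: the only shared attribute, loc, is not a key of either part.
r1 = project(works, ["emp", "loc"])
r2 = project(works, ["proj", "loc"])

joined = natural_join(r1, r2)
spurious = joined - original  # tuples that were never in the original relation
```

Here the join returns four tuples although the original relation had two; the extra ones (for example Smith on P2) are spurious, so this decomposition is not lossless.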
Guidelines - Summary
GUIDELINE 1: Design a schema that can be explained easily relation by relation. The
semantics of attributes should be easy to interpret.
GUIDELINE 2: Design a schema that does not suffer from insertion, deletion and
update anomalies. If there are any present, then note them so that applications can
be made to take them into account.
GUIDELINE 3: Relations should be designed such that their tuples will have as few
NULL values as possible
GUIDELINE 4: The relations should be designed to satisfy the lossless join condition.
No spurious tuples should be generated by doing a natural-join of any relations.
Functional Dependency
• A functional dependency (FD) describes the relationship of one attribute to another
attribute in a database management system (DBMS).
• Functional dependency helps you to maintain the quality of data in the database.
• A functional dependency is denoted by an arrow →.
• The functional dependency of Y on X is represented by X → Y.
Example
The City, Employee Name, and Salary are functionally dependent on the Employee Number.
X → Y holds if, for any two tuples t1 and t2:
If t1(X) = t2(X),
then t1(Y) = t2(Y)
• Is there a problem here? No.
• We have the FD that
• EmpNo → Name. This means that every time we find 104,
we find the name, Fred.
• Just because something is on the left-hand side of a FD, it does not imply
that you have a key or that it will be unique in the database.
• i.e. the FD X → Y only means that for every occurrence of X you will get the
same value of Y.
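The definition can be checked directly against data. The sketch below is a minimal illustration, assuming a relation is a list of dictionaries; the function name `fd_holds` and the sample rows (including the `Project` attribute) are hypothetical. It shows that EmpNo can repeat and still determine Name.

```python
from itertools import combinations

def fd_holds(rows, lhs, rhs):
    """Check X -> Y from the definition: for every pair of tuples,
    t1[X] == t2[X] must imply t1[Y] == t2[Y]."""
    for t1, t2 in combinations(rows, 2):
        same_x = all(t1[a] == t2[a] for a in lhs)
        same_y = all(t1[a] == t2[a] for a in rhs)
        if same_x and not same_y:
            return False
    return True

# Hypothetical rows: EmpNo 104 appears twice, always with the name Fred.
rows = [
    {"EmpNo": 101, "Name": "Kate", "Project": "P1"},
    {"EmpNo": 104, "Name": "Fred", "Project": "P1"},
    {"EmpNo": 104, "Name": "Fred", "Project": "P2"},
]

empno_name = fd_holds(rows, ["EmpNo"], ["Name"])        # holds, although 104 is not unique
empno_project = fd_holds(rows, ["EmpNo"], ["Project"])  # violated by the two 104 rows
```

A check like this can only show that a given set of rows does not violate an FD; whether the FD is meant to hold for all possible data is a design decision.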
• Here, we will define the following FDs:
• 1. SSN → Name
• 2. School → Location
• 3. SSN → School
• First, have we violated any FDs with our data? Because all SSNs are unique, there cannot be
a FD violation of SSN → Name. Why? Because a FD X → Y says that given some value for X,
you always get the same Y.
• Because the X's are unique, you will always get the same value. The same comment is true
for SSN → School.
• How about our second FD, School→ Location? There are only
three schools in the example and you may note that for
every school, there is only one location, so no FD violation.
• Now, we want to point out something interesting. If we
define a functional dependency X → Y and we define a
functional dependency Y → Z, then we know by inference
that X → Z.
• Here, we defined SSN → School. We also defined School → Location, so we can infer that
SSN → Location although that FD was not originally mentioned.
FD Satisfies
Algorithm: SATISFIES
Input: A relation r and an FD X → Y.
Output: true if r satisfies X → Y, false otherwise.
SATISFIES(r, X → Y);
1. Sort the relation r on its X columns to bring tuples with
equal X-values together.
2. If each set of tuples with equal X-values has equal Y-values, return true.
Otherwise, return false.
SATISFIES tests if a relation r satisfies an FD X → Y.
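The two steps of SATISFIES can be sketched in Python as follows. This is one possible rendering, assuming a relation is a list of dictionaries with sortable values; the sample rows in `flights` are hypothetical and are not the table from the slides.

```python
from itertools import groupby

def satisfies(r, x, y):
    """Return True if relation r (a list of dicts) satisfies the FD x -> y."""
    def x_value(t):
        return tuple(t[a] for a in x)

    def y_value(t):
        return tuple(t[a] for a in y)

    # Step 1: sort on the X columns so tuples with equal X-values are adjacent.
    ordered = sorted(r, key=x_value)
    # Step 2: every group of equal X-values must contain a single Y-value.
    for _, group in groupby(ordered, key=x_value):
        if len({y_value(t) for t in group}) > 1:
            return False
    return True

# Hypothetical rows: flight 83 always departs at the same time,
# but two different flights share the departure time 10:15.
flights = [
    {"FLIGHT": 83, "DAY": "Mon", "DEPARTS": "10:15"},
    {"FLIGHT": 83, "DAY": "Tue", "DEPARTS": "10:15"},
    {"FLIGHT": 116, "DAY": "Mon", "DEPARTS": "10:15"},
]

flight_departs = satisfies(flights, ["FLIGHT"], ["DEPARTS"])
departs_flight = satisfies(flights, ["DEPARTS"], ["FLIGHT"])
```

Sorting dominates the cost, so the check takes O(n log n) time for n tuples; a hash map keyed on the X-values would be an alternative that avoids the sort.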
Algorithm: SATISFIES
Using the algorithm SATISFIES to test if FLIGHT → DEPARTS
Question: DEPARTS → FLIGHT ?
Example -2
What FDs may exist?
• A relation R(A, B, C, D) with its extension.
• Which FDs may exist in this relation?
{AB} → C and B → C
Key Terms
Key Term: Description
Axiom: Axioms are a set of inference rules used to infer all the functional
dependencies on a relational database.
Decomposition: A rule that suggests that if you have a table that appears to contain
two entities that are determined by the same primary key, then you
should consider breaking them up into two different tables.
Dependent: It is displayed on the right side of the functional dependency diagram.
Determinant: It is displayed on the left side of the functional dependency diagram.
Union: It suggests that if two tables are separate, and the PK is the same, you
should consider putting them together.
Rules of Functional Dependencies
• There are three functional dependency rules, known as Armstrong’s axioms, that you use to infer
functional dependencies within relational databases. They were proposed by William W. Armstrong in 1974.
Primary Rules
• Reflexive rule
• Augmentation rule
• Transitivity rule
Secondary Rules (inference rules)
• Union
• Composition
• Decomposition
• Pseudo Transitivity
• Self Determination
• Extensivity
1. Reflexive rule
• If X is a composite, composed of A and B, then X → A and X → B.
Example
• The rule, which seems quite obvious, says that if I give you
the combination of Name and City, you can tell me this person's Name
and this person's City. While this rule seems obvious enough,
it is needed to derive other functional dependencies.
• If A is a set of attributes and B is a subset of A, then A → B holds. If B ⊆ A then A → B.
• Such a dependency is trivial: a set of attributes always determines itself and any of its subsets.
2. Augmentation Axiom
• If A→B holds and Y is a set of attributes, then AY→BY also holds. That is,
adding attributes to both sides of a dependency does not change the basic dependency.
If A→B, then AC→BC for any C.
• Now, I claim that because Name → City, it follows that Name + Shoe Size → City
• (i.e., we augmented Name with Shoe Size).
• Will there ever be a contradiction here? No: because we defined Name → City, Name
plus more information will always identify the unique City for that individual. We can
always add information to the LHS of an FD and still have the FD be true.
3. Transitivity rule
• Same as the transitive rule in algebra: if A→B holds and B→C
holds, then A→C also holds. A→B is read as “A functionally
determines B”. If X→Y and Y→Z, then X→Z.
• Show_ID → Telecast_ID
• Telecast_ID → Telecast_Type
• Thus, by transitivity, the following functional dependency
also holds:
• Show_ID → Telecast_Type
Secondary Rules or Derived (inference rules)
• Union: If A→B holds and A→C holds, then A→BC holds. If X→Y and X→Z then
X→YZ.
• Composition: If A→B and X→Y hold, then AX→BY holds.
• Decomposition: If A→BC holds then A→B and A→C hold. If X→YZ then X→Y and
X→Z.
• Pseudo Transitivity: If A→B holds and BC→D holds, then AC→D holds. If X→Y and
YZ→W then XZ→W.
• Self Determination: It is similar to the Axiom of Reflexivity, i.e. A→A for any A.
• Extensivity: Extensivity is a case of augmentation. If AC→A, and A→B, then AC→B.
Similarly, AC→ABC and ABC→BC. This leads to AC→BC.
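Whether an FD follows from a given set by these rules can be tested mechanically with the attribute closure, which is listed among this module's topics. The sketch below is a standard fixed-point computation written for illustration; the function names `closure` and `implies` and the representation of an FD as a pair of sets are assumptions of this example.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the whole left side is already determined, add the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def implies(fds, lhs, rhs):
    """True if lhs -> rhs can be inferred from fds."""
    return set(rhs) <= closure(lhs, fds)

# Pseudo transitivity: from A -> B and BC -> D infer AC -> D.
fds = [({"A"}, {"B"}), ({"B", "C"}, {"D"})]
pseudo_transitivity = implies(fds, {"A", "C"}, {"D"})

# Union: from A -> B and A -> C infer A -> BC.
union_rule = implies([({"A"}, {"B"}), ({"A"}, {"C"})], {"A"}, {"B", "C"})

# A alone does not determine D, because C is missing.
not_implied = implies(fds, {"A"}, {"D"})
```

Self determination and reflexivity come for free here, since the closure always starts from the attributes themselves.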
Advantages of Using Armstrong’s Axioms in Functional Dependency
• They provide a systematic and efficient method for inferring additional functional
dependencies from a given set of functional dependencies, which can help to optimize
database design.
• They can be used to identify redundant functional dependencies, which can help to
eliminate unnecessary data and improve database performance.
• They can be used to verify whether a set of functional dependencies is a minimal cover,
which is a set of dependencies that cannot be further reduced without losing
information.
Types of Functional Dependencies in DBMS
1. Trivial functional dependency
2. Non-Trivial functional dependency
3. Multivalued functional dependency
4. Transitive functional dependency
Trivial Functional Dependency
• In Trivial functional dependency, a dependent is always a subset of the determinant. In other words,
a functional dependency is called trivial if the attributes on the right side are the subset of the
attributes on the left side of the functional dependency.
• X → Y is called a trivial functional dependency if Y is the subset of X.
Employee_Id  Name  Age
1            sss   23
2            kkk   43
3            Xxx   32
4            yyy   45
• Here, { Employee_Id, Name } → { Name } is a Trivial functional dependency, since the
dependent Name is a subset of the determinant { Employee_Id, Name }.
• { Employee_Id } → { Employee_Id }, { Name } → { Name } and { Age } → { Age } are also Trivial.
Non-Trivial Functional Dependency
• It is the opposite of Trivial functional dependency. Formally speaking, in Non-Trivial functional
dependency, the dependent is not a subset of the determinant.
• X → Y is called a Non-trivial functional dependency if Y is not a subset of X.
• So, a functional dependency X → Y where X is a set of attributes and Y is also a set of the attribute
but not a subset of X, then it is called Non-trivial functional dependency.
Employee_Id  Name  Age
1            sss   23
2            kkk   43
3            Xxx   32
4            yyy   45
• Here, { Employee_Id } → { Name } is a non-trivial functional dependency because
Name (dependent) is not a subset of Employee_Id (determinant).
• Similarly, { Employee_Id, Name } → { Age } is also a
non-trivial functional dependency.
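The distinction between trivial and non-trivial dependencies reduces to a subset test, as the following minimal sketch shows. The function name `is_trivial` is an assumption of this example; the attribute names are those of the table above.

```python
def is_trivial(lhs, rhs):
    """X -> Y is trivial exactly when Y is a subset of X."""
    return set(rhs) <= set(lhs)

# { Employee_Id, Name } -> { Name }: the dependent is contained in the determinant.
trivial_fd = is_trivial({"Employee_Id", "Name"}, {"Name"})

# { Employee_Id } -> { Name } and { Employee_Id, Name } -> { Age } are non-trivial.
non_trivial_1 = is_trivial({"Employee_Id"}, {"Name"})
non_trivial_2 = is_trivial({"Employee_Id", "Name"}, {"Age"})
```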
Multivalued Functional Dependency
• In Multivalued functional dependency, attributes in the dependent set are not dependent on each
other.
• For example, given X → { Y, Z }, if there exists no functional dependency between Y and Z, then it is
called a Multivalued functional dependency.
• Note that this is different from the multivalued dependency X →→ Y that is used to define
fourth normal form.
Employee_Id  Name  Age
1            sss   23
2            kkk   23
3            Xxx   32
4            yyy   32
• Here, { Employee_Id } → { Name, Age } is a Multivalued functional dependency, since the
dependent attributes Name and Age are not functionally dependent on each other
(i.e. neither Name → Age nor Age → Name exists).
Transitive Functional Dependency
• Consider two functional dependencies A → B and B → C then according to the transitivity axiom A → C
must also exist. This is called a transitive functional dependency.
• In other words, dependent is indirectly dependent on the determinant in Transitive functional
dependency.
Employee_Id  Name  Department  Street Number
1            sss   CD          11
2            kkk   SDC         24
3            Xxx   CD          11
4            yyy   CTS         12
• Here, { Employee_Id → Department } and { Department → Street Number } hold true. Hence,
according to the axiom of transitivity, { Employee_Id → Street Number } is a valid
functional dependency.
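The transitive dependency can be checked against the four rows of the table above. This is a minimal sketch: the helper `fd_holds` and the column name `Street_Number` (written with an underscore) are assumptions of this example.

```python
rows = [
    {"Employee_Id": 1, "Name": "sss", "Department": "CD", "Street_Number": 11},
    {"Employee_Id": 2, "Name": "kkk", "Department": "SDC", "Street_Number": 24},
    {"Employee_Id": 3, "Name": "Xxx", "Department": "CD", "Street_Number": 11},
    {"Employee_Id": 4, "Name": "yyy", "Department": "CTS", "Street_Number": 12},
]

def fd_holds(rows, lhs, rhs):
    """True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True

id_to_dept = fd_holds(rows, ["Employee_Id"], ["Department"])
dept_to_street = fd_holds(rows, ["Department"], ["Street_Number"])
id_to_street = fd_holds(rows, ["Employee_Id"], ["Street_Number"])

# The reverse direction fails: street number 11 belongs to employees 1 and 3.
street_to_id = fd_holds(rows, ["Street_Number"], ["Employee_Id"])
```

The first two checks confirm the given dependencies on this data, and the third confirms the dependency that transitivity predicts.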