DBMS Unit 3.1
What is RDBMS?
RDBMS stands for Relational Database Management System. It is a type of database management
system that stores data in a structured format, using rows and columns (i.e., tables) to manage
relationships between data.
Designing an RDBMS involves several systematic steps to ensure efficient data storage, integrity, and
retrieval. Here is the general design process:
1. Requirement Analysis
Example: In a university system, gather requirements like student records, courses, enrollments,
etc.
Printed using ChatGPT to PDF, powered by PDFCrowd HTML to PDF API. 1/33
3. Logical Design
4. Normalization
Apply normal forms (1NF, 2NF, 3NF, etc.) to organize data efficiently.
5. Physical Design
Decide how the logical design will be implemented on the database system.
6. Implementation
8. Maintenance
Monitor database performance.
Summary Table:
| Step | Purpose |
|---|---|
| Requirement Analysis | Understand what data and processes are needed |
| Conceptual Design | Build high-level ER diagrams |
In an RDBMS (Relational Database Management System), a table is the basic unit of data storage. It
organizes data into rows and columns, where each row represents a record and each column
represents a field or attribute of that record.
Structure of a Table:
| ID | Name | Age |
|---|---|---|
| 1 | Alice | 21 |
| 2 | Bob | 25 |
Columns (Fields): Define the type of data (e.g., name, age, salary).
Rows (Records): Contain the actual data entries.
Primary Key: Uniquely identifies each row (e.g., ID).
Example:
Here:
Summary:
A table in RDBMS is a structured collection of related data organized in rows and columns, used to
store and manage information efficiently in a relational format.
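As a minimal sketch of the table described above, the following uses Python's built-in `sqlite3` module with an in-memory database. The table name `Student` and the column types are assumptions made for illustration; only the columns `ID`, `Name`, `Age` and the two sample rows come from the text.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Columns define the kind of data; ID is the primary key.
conn.execute(
    "CREATE TABLE Student ("
    " ID INTEGER PRIMARY KEY,"
    " Name TEXT NOT NULL,"
    " Age INTEGER)"
)

# Rows hold the actual data entries.
conn.executemany(
    "INSERT INTO Student (ID, Name, Age) VALUES (?, ?, ?)",
    [(1, "Alice", 21), (2, "Bob", 25)],
)
rows = conn.execute("SELECT ID, Name, Age FROM Student ORDER BY ID").fetchall()


def insert_duplicate_id():
    """Try to reuse ID 1; return True if the database rejects the row."""
    try:
        conn.execute("INSERT INTO Student (ID, Name, Age) VALUES (1, 'Carol', 30)")
    except sqlite3.IntegrityError:
        return True
    return False
```

Calling `insert_duplicate_id()` should return `True`, because the primary key forces every row to have a distinct `ID`.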
Tables
Fields (columns) and their data types
Constraints (e.g., PRIMARY KEY, FOREIGN KEY)
Does not contain actual data; it only defines the structure of the data.
Each user or application interacts with the database using this schema.
✅ Example:
Consider a `Student` table:
It tells:
Tuple 1: (101, Alice, 20, CSE)
Tuple 2: (102, Rahul, 21, ECE)
📝 Final Notes:
Schema helps design the database.
Let me know if you want an ER diagram or real-world analogy for better understanding!
Explain briefly about super key, candidate key and primary key in RDBMS
Sure! Here's a brief explanation of Super Key, Candidate Key, and Primary Key in RDBMS:
🔹 Super Key
A super key is any combination of attributes that can uniquely identify a row in a table.
It can include extra attributes that are not necessary for uniqueness.
✅ Example:
For a `Student` table:
🔹 Candidate Key
A candidate key is a minimal super key, meaning it has no extra attributes.
✅ Example:
In the `Student` table:
`{StudentID}` and `{Email}` could both be candidate keys (if both are unique).
Both can uniquely identify a student, and neither includes unnecessary attributes.
🔹 Primary Key
A primary key is the chosen candidate key that is used to uniquely identify rows in the table.
✅ Example:
From the candidate keys `{StudentID}` and `{Email}`, we might choose `StudentID` as the primary key.
🔁 Summary Table:
Term | Definition | Uniqueness | Nullable | Quantity per Table
"A candidate key is a super key but a super key may not always be a candidate key"-explain
"A candidate key is a super key, but a super key may not always be a candidate key."
That’s also what a super key does.
✅ Example:
For a `Student` table with:
Assume:
`StudentID` is unique.
`Email` is also unique.
Then:
Since both uniquely identify rows, they are also super keys.
Some super keys contain extra attributes that are not necessary for uniquely identifying
records.
✅ Example:
`{StudentID, Name}` is a super key (since `StudentID` is unique, the combination is still unique).
Candidate keys must be minimal—they should not have any extra attribute.
🔁 Summary:
Key Type Unique Minimal Example
✅ Final Thought:
All candidate keys are super keys (because they ensure uniqueness),
but not all super keys are candidate keys (because some are not minimal).
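The difference between super keys and candidate keys can be explored on sample data. The sketch below is only an illustration: the rows and e-mail addresses are hypothetical, and a small sample can rule a key out but can never prove that an attribute set is a key for every possible table state.

```python
from itertools import combinations

# Hypothetical sample rows; the e-mail addresses are made up for illustration.
rows = [
    {"StudentID": 101, "Name": "Alice", "Email": "alice@example.com"},
    {"StudentID": 102, "Name": "Rahul", "Email": "rahul@example.com"},
    {"StudentID": 103, "Name": "Alice", "Email": "alice2@example.com"},
]
attributes = ["StudentID", "Name", "Email"]


def is_unique(attrs):
    """True if no two sample rows agree on every attribute in attrs."""
    projected = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(projected)) == len(projected)


# Every attribute combination that is unique in the sample behaves like a super key.
super_keys = [
    set(combo)
    for size in range(1, len(attributes) + 1)
    for combo in combinations(attributes, size)
    if is_unique(combo)
]

# A candidate key is a super key with no smaller super key inside it.
candidate_keys = [k for k in super_keys if not any(other < k for other in super_keys)]
```

On this sample, six attribute sets are unique, but only `{StudentID}` and `{Email}` are minimal.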
✅ Definition:
If attribute A functionally determines attribute B, it is written as:
A → B
This means:
If two rows have the same value of A, they must have the same value of B.
🔸 Example 1:
Consider a `Student` table:
Here:
`StudentID → Name`
(Because StudentID is unique, it determines the student's Name)
`StudentID → Department`
(Each student is in only one department)
🔸 Example 2:
Consider a `Course` table:
CS101 DBMS 4
CS102 OS 3
Here:
`CourseID → CourseName`
`CourseID → Credits`
Together:
`CourseID → {CourseName, Credits}`
This means: Knowing the `CourseID` is enough to find the CourseName and Credits.
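A functional dependency can be checked against a set of rows directly from its definition. The helper below, `fd_holds`, is a hypothetical name used for this sketch; it reports whether any two rows agree on the left-hand side but differ on the right-hand side. The two course rows are the ones from the example.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs is not violated by any pair of rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


courses = [
    {"CourseID": "CS101", "CourseName": "DBMS", "Credits": 4},
    {"CourseID": "CS102", "CourseName": "OS", "Credits": 3},
]

# A made-up conflicting row: the same CourseID with a different name.
conflicting = courses + [{"CourseID": "CS101", "CourseName": "Networks", "Credits": 4}]
```

`fd_holds(courses, ["CourseID"], ["CourseName", "Credits"])` is `True`, while the same check on `conflicting` is `False`. Passing the check on sample rows does not prove the dependency in general; it only shows that the sample does not violate it.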
A → A, or A → part of A
Example: `{StudentID, CourseID} → Grade` (if grade depends on both student and course)
🔁 Summary:
Term Meaning
Functional dependency is the foundation for normalization in RDBMS to eliminate redundancy and
ensure data integrity.
Let me know if you want a diagram or explanation of how FDs help in normalization!
🔸 1. Trivial Functional Dependency
A functional dependency is trivial if the RHS is a subset of the LHS.
✅ Notation:
X → Y is trivial if Y ⊆ X
🔹 Example:
`{StudentID, Name} → StudentID` ✔️
`{CourseID, Name} → Name` ✔️
These are trivial because the RHS is already included in the LHS.
🔸 2. Non-Trivial Functional Dependency
✅ Notation:
X → Y is non-trivial if Y ⊈ X
🔹 Example:
`StudentID → Name` ❌
`CourseID → Credits` ❌
These are non-trivial because the RHS attributes are not present in the LHS.
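🔸 3. Semi-Trivial Functional Dependency
✅ Notation:
X → Y is semi-trivial if X ∩ Y ≠ ∅ and Y ⊈ X
In simple terms:
The RHS contains some attributes from the LHS, but also at least one attribute that is not in the LHS.
🔹 Example:
`{StudentID, Name} → {Name, Age}`
`Name` is part of the LHS, but `Age` is not → Semi-trivial
(By contrast, `{StudentID, Name} → Name` is trivial, because the whole RHS is inside the LHS.)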
🔁 Summary Table:
| Type | Definition | Example |
|---|---|---|
| Trivial | RHS is a subset of LHS | `{A, B} → A` |
| Non-Trivial | RHS is not a subset of LHS | `StudentID → Name` |
| Semi-Trivial | RHS overlaps with LHS but is not a subset of it | `{A, B} → {B, C}` |
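The three categories can be told apart with plain set operations. In this sketch, `classify_fd` is a hypothetical helper; it treats a dependency as trivial when the RHS is a subset of the LHS, as semi-trivial when the two sides overlap without the RHS being a subset, and as non-trivial when they share no attribute.

```python
def classify_fd(lhs, rhs):
    """Classify the dependency lhs -> rhs by how the two sides overlap."""
    lhs, rhs = set(lhs), set(rhs)
    if rhs <= lhs:
        return "trivial"       # whole RHS already sits inside the LHS
    if lhs & rhs:
        return "semi-trivial"  # partial overlap, but RHS adds something new
    return "non-trivial"       # no shared attribute at all
```

For example, `classify_fd({"StudentID", "Name"}, {"Name"})` gives `"trivial"`, and `classify_fd({"StudentID", "Name"}, {"Name", "Age"})` gives `"semi-trivial"`.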
Let me know if you'd like a visual or diagram to represent these more clearly!
🔸 Formal Definition:
Given:
A set of attributes X
A set of functional dependencies F
Then, the closure of X, written as `X⁺`, is the set of all attributes Y such that:
X → Y can be derived from F
🔹 Example:
Consider a relation R(A, B, C, D)
and functional dependencies F:
1. A → B
2. B → C
3. A → D
✅ Step-by-Step:
Start with:
`A⁺ = {A}`
Apply FDs:
`A → B` ⇒ `{A, B}`
`B → C` ⇒ `{A, B, C}`
`A → D` ⇒ `{A, B, C, D}`
✅ Final:
`A⁺ = {A, B, C, D}`
🔸 Another Example:
Let R(P, Q, R, S)
FDs:
1. P → Q
2. Q → R
3. R → S
Find `P⁺`:
Start: `P⁺ = {P}`
Apply:
`P → Q` ⇒ `{P, Q}`
`Q → R` ⇒ `{P, Q, R}`
`R → S` ⇒ `{P, Q, R, S}`
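The step-by-step procedure used in both examples can be sketched as a short loop. The function name `closure` and the encoding are assumptions for this sketch: each attribute is a single letter, and each FD is a pair of strings `(lhs, rhs)`, so `("AC", "D")` would mean `A, C → D`.

```python
def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire an FD when its whole LHS is known and it adds something new.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


fds_1 = [("A", "B"), ("B", "C"), ("A", "D")]   # R(A, B, C, D)
fds_2 = [("P", "Q"), ("Q", "R"), ("R", "S")]   # R(P, Q, R, S)
```

`closure("A", fds_1)` yields `{A, B, C, D}` and `closure("P", fds_2)` yields `{P, Q, R, S}`, matching the worked examples. The loop stops once a full pass adds no new attribute.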
🔁 Summary:
| Term | Meaning |
|---|---|
| Attribute Closure | Set of all attributes that can be functionally derived from a given attribute set |
| Notation | `X⁺` |
| Usage | Candidate key check, Normalization, Lossless Join |
| Key Point | Apply FDs iteratively until no new attributes are added |
Let me know if you'd like help with solving closure problems or visualizing the process!
Define super key and candidate key from the context of FDs
Sure! Here's the definition of Super Key and Candidate Key from the perspective of Functional
Dependencies (FDs):
1. `K⁺ = {A1, A2, ..., An}` (it functionally determines all attributes — same as super key)
2. No proper subset of `K` has this property (minimality condition)
In other words, you cannot remove any attribute from `K` without losing the ability to determine
all attributes.
🔹 Example:
Relation: `R(A, B, C, D)`
FDs:
1. A → B
2. B → C
3. A → D
Check `A`:
`A⁺ = {A, B, C, D}`, so `{A}` is a super key.
Candidate keys:
`{A}` is minimal (no attribute can be removed).
`{A, B}` is not minimal (since `A` alone suffices).
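Finding candidate keys from FDs can be sketched as a brute-force search: try attribute sets from smallest to largest and keep those whose closure covers the whole relation and that contain no smaller key. The function names and the single-letter encoding are assumptions for this sketch, and the search is exponential, so it only suits small textbook relations.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Return all minimal attribute sets whose closure is the whole relation."""
    attributes = set(attributes)
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            k = set(combo)
            if any(key <= k for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(k, fds) == attributes:
                keys.append(k)
    return keys


fds = [("A", "B"), ("B", "C"), ("A", "D")]
```

`candidate_keys("ABCD", fds)` returns `[{'A'}]`: `{A}` is the only candidate key, and sets such as `{A, B}` are skipped because they contain it.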
If you want, I can help with finding keys from given FDs!
Sure! Here’s an explanation of Key Hierarchy in RDBMS from the perspective of Functional
Dependencies (FDs):
1. Super Key
A super key is any set of attributes `K` whose closure (`K⁺`) contains all attributes of the relation.
Formally:
`K⁺ = {A1, A2, ..., An}` (all attributes of the relation)
It may have extra attributes that are not necessary for uniqueness.
Example: `{A}`, `{A, B}`, `{A, C, D}` could all be super keys.
2. Candidate Key
A candidate key is a minimal super key: no proper subset of it is also a super key.
3. Primary Key
The primary key is the selected candidate key to uniquely identify tuples.
Each relation has only one primary key.
It is used as the main reference key for identifying rows.
4. Foreign Key
A foreign key is an attribute or a set of attributes in one relation that references the primary key
in another relation.
It ensures referential integrity between tables.
5. Alternate Key
Any candidate key that is not chosen as primary key is called an alternate key.
It can still uniquely identify tuples but is not the main key.
6. Composite Key
A composite key is a key that consists of two or more attributes.
Super Key
└─ Candidate Key (minimal Super Key)
   ├─ Primary Key (chosen Candidate Key)
   └─ Alternate Key (non-chosen Candidate Keys)
Foreign Key (references Primary Key in another table)
Composite Key (can be Super, Candidate, or Primary with multiple attributes)
Example:
A → B
B → C
`{A}` closure: `{A, B, C}` → super key and candidate key (minimal)
`{A, B}` closure: `{A, B, C}` → super key but not candidate key (not minimal)
Let me know if you want me to dive deeper into any of these keys or explain with more examples!
What is a membership set? Explain in detail. How do you construct it? Explain with an example.
🔹 Definition:
A Membership Set is a concept used in functional dependency analysis and database normalization
to represent the collection of all functional dependencies that involve a particular attribute or set
of attributes — either on the left-hand side (determinant) or right-hand side (dependent).
In simpler terms:
For an attribute A, its membership set is the set of all FDs where A appears (usually on the left or
right).
It helps in understanding how attributes relate to others in a relation and is useful for dependency
preservation and decomposition.
Step 2: Choose the attribute(s) for which you want to find the membership set.
Step 3: Collect all FDs that contain the chosen attribute(s) either on the left side
(determinant) or right side (dependent).
Step 4 (optional): Separate membership sets can be defined for left membership
(attributes determining others) and right membership (attributes being
determined).
Example:
1. A → B
2. B → C
3. A, C → D
4. D → A
For A:
| FD No. | Functional Dependency | Involves A on LHS or RHS? |
|---|---|---|
| 1 | A → B | Yes (A on LHS) |
| 2 | B → C | No |
| 3 | A, C → D | Yes (A on LHS) |
| 4 | D → A | Yes (A on RHS) |
Membership set of A:
{ A → B, A, C → D, D → A }
For B:
FDs containing B:
`A → B` (B on RHS)
`B → C` (B on LHS)
For C:
FDs containing C:
`B → C` (C on RHS)
`A, C → D` (C on LHS)
For D:
FDs containing D:
`A, C → D` (D on RHS)
`D → A` (D on LHS)
Summary:
| Attribute | Membership Set |
|---|---|
| A | { A → B, A, C → D, D → A } |
| B | { A → B, B → C } |
| C | { B → C, A, C → D } |
| D | { A, C → D, D → A } |
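Collecting a membership set as defined here is a simple filter over the FD list. In this sketch, `membership_set` is a hypothetical helper, attributes are single letters, and each FD is a pair of strings, so `("AC", "D")` stands for `A, C → D`.

```python
fds = [("A", "B"), ("B", "C"), ("AC", "D"), ("D", "A")]


def membership_set(attr, fds):
    """Return every FD in which attr appears on the left or the right side."""
    return [(lhs, rhs) for lhs, rhs in fds if attr in lhs or attr in rhs]


# One entry per attribute of the relation.
all_sets = {attr: membership_set(attr, fds) for attr in "ABCD"}
```

`all_sets["A"]` contains `A → B`, `A, C → D` and `D → A`, while `B → C` is left out because A does not occur in it.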
Applications:
If you want, I can also explain how membership sets are used in specific algorithms like lossless join
test or dependency preservation!
Closure of a functional dependency set, denoted as F⁺, is the set of all functional dependencies that
can be inferred (or logically derived) from the original set of functional dependencies F using
Armstrong's axioms (rules of inference).
Key Points:
F⁺ includes:
All FDs in F itself
All other FDs that can be derived from F by applying inference rules like:
Reflexivity
Augmentation
Transitivity
Essentially, F⁺ represents the complete knowledge of dependencies implied by F.
Why is it important?
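Example:
If
F = { A → B, B → C }
then, by transitivity,
A → C
is also implied, so
F⁺ ⊇ { A → B, B → C, A → C }
F⁺ also contains every trivial dependency (such as A → A and AB → B) and every combination the axioms allow (such as A → BC), so the full closure is much larger than these three FDs.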
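Because F⁺ grows very quickly, it is rarely listed in full. A common shortcut, sketched below, answers the question "is X → Y in F⁺?" by computing the attribute closure of X and checking that it contains Y. The names `closure` and `implies` and the single-letter encoding are assumptions for this sketch.

```python
def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def implies(fds, lhs, rhs):
    """True if lhs -> rhs belongs to the closure of fds."""
    return set(rhs) <= closure(lhs, fds)


F = [("A", "B"), ("B", "C")]
```

`implies(F, "A", "C")` is `True` (transitivity), `implies(F, "AB", "B")` is `True` (a trivial dependency), and `implies(F, "C", "A")` is `False`.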
If you'd like, I can help explain the inference rules used to compute the closure!
Armstrong's axioms are a set of inference rules used to derive all possible functional dependencies
(FDs) from a given set of FDs in a relational database.
They provide a sound and complete method to reason about FDs.
1. Reflexivity
If Y is a subset of X, then:
X → Y
2. Augmentation
If
X → Y
then, for any attribute set Z:
XZ → YZ
3. Transitivity
If
X → Y and Y → Z
then:
X → Z
Additional rules derived from the axioms:
Union: If
X → Y and X → Z
then:
X → YZ
Decomposition: If
X → YZ
then:
X → Y and X → Z
Pseudo-transitivity: If
X → Y and WY → Z
then:
WX → Z
Summary:
| Axiom | Meaning |
|---|---|
| Reflexivity | Attribute set determines its subsets |
| Augmentation | Add attributes to both sides of FD |
| Transitivity | Chain FDs to get new dependencies |
Armstrong's axioms are fundamental for reasoning about FDs, computing closures, and normalization.
Briefly explain the equality of two FD sets. Explain in detail the process of checking whether two FD sets are equal or not, with an example.
Two sets of functional dependencies, say F and G, are equal (equivalent) if they imply exactly the same functional dependencies. In other words, their closures are the same:
F⁺ = G⁺
Where F⁺ and G⁺ are the closures (sets of all FDs implied by F and G, respectively).
Step 2: Check if F⁺ ⊆ G⁺
Verify that every FD implied by F is also implied by G. It is enough to check each FD X → Y in F: compute X⁺ under G and confirm that it contains Y.
Step 3: Check if G⁺ ⊆ F⁺
Verify that every FD implied by G is also implied by F, in the same way.
Step 4: Conclusion
If both checks succeed, F and G are equivalent; if either check fails, they are not.
Example:
Relation R(A, B, C)
F = {A → B, B → C}
G = {A → B, A → C}
Check F⁺ ⊆ G⁺ (does G imply every FD in F?):
A → B is in G.
B → C: under G, B⁺ = {B}, which does not contain C, so G does not imply B → C.
So F⁺ ⊄ G⁺.
Check G⁺ ⊆ F⁺ (does F imply every FD in G?):
A → B is in F.
A → C: under F, A⁺ = {A, B, C}, which contains C (via transitivity), so F implies A → C.
So G⁺ ⊆ F⁺.
Conclusion:
F⁺ ≠ G⁺
F covers G, but G does not cover F, so F and G are not equivalent FD sets. Adding B → C to G would make the two sets equivalent.
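The two-way check can be sketched with attribute closures: F covers G when every FD of G follows from F, and the sets are equivalent when each covers the other. The names `covers` and `equivalent` and the single-letter encoding are assumptions for this sketch.

```python
def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def covers(f, g):
    """True if every FD in g is implied by f."""
    return all(set(rhs) <= closure(lhs, f) for lhs, rhs in g)


def equivalent(f, g):
    """True if f and g imply exactly the same dependencies."""
    return covers(f, g) and covers(g, f)


F = [("A", "B"), ("B", "C")]
G = [("A", "B"), ("A", "C")]
```

For these two sets, `covers(F, G)` is `True` but `covers(G, F)` is `False`, because `B → C` cannot be derived from G (under G, B⁺ is just `{B}`). `equivalent(F, G)` is therefore `False`, and it becomes `True` once `("B", "C")` is added to G.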
If you want, I can help you step-by-step compute closures for any FD set!
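A Minimal Cover (also called Canonical Cover) of a set of functional dependencies F is a simplified version of F that is equivalent to F (it has the same closure) but has no redundant FDs and no extraneous attributes. It can be built in three steps: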
1. Make the right side of every FD a single attribute (decompose FDs with multiple attributes on
RHS).
2. Remove extraneous attributes from the left side of FDs.
3. Remove redundant FDs if they can be derived from others.
Example:
Given:
F = {AB → C, A → B, B → C}
Step 1: Make every RHS a single attribute
AB → C (already single)
A → B (single)
B → C (single)
Step 2: Remove extraneous attributes from the LHS
Only AB → C has more than one attribute on the left.
B⁺ = {B, C} already contains C, so A is extraneous and AB → C reduces to B → C.
B → C is already in F, so the reduced FD adds nothing new.
Step 3: Remove redundant FDs
Now F = {A → B, B → C}.
Neither FD is redundant.
Minimal cover:
{A → B, B → C}
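The three steps can be sketched in code as follows. This is one possible arrangement, not the only one: the function names and the single-letter encoding are assumptions, and a relation can have several different minimal covers depending on the order in which attributes and FDs are examined.

```python
def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def minimal_cover(fds):
    """Return a minimal cover as a list of (frozenset lhs, single attribute) pairs."""
    # Step 1: split every RHS into single attributes.
    single = [(frozenset(lhs), a) for lhs, rhs in fds for a in rhs]

    # Step 2: drop LHS attributes that are not needed to derive the RHS.
    reduced = []
    for lhs, a in single:
        lhs = set(lhs)
        for attr in sorted(lhs):
            if len(lhs) > 1 and a in closure(lhs - {attr}, single):
                lhs.discard(attr)
        reduced.append((frozenset(lhs), a))
    reduced = list(dict.fromkeys(reduced))  # remove duplicates, keep order

    # Step 3: drop any FD that the remaining FDs already imply.
    result = list(reduced)
    for fd in reduced:
        rest = [other for other in result if other != fd]
        if fd[1] in closure(fd[0], rest):
            result = rest
    return result


F = [("AB", "C"), ("A", "B"), ("B", "C")]
```

For the example set, `minimal_cover(F)` keeps exactly two dependencies, `B → C` and `A → B`.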
Summary:
Want me to help you find the minimal cover for a specific FD set?
What is decomposition? Explain the properties and the process with an example.
✅ What is Decomposition?
Decomposition is the process of breaking a relation (table) into two or more smaller relations to
eliminate redundancy, anomalies, and ensure a better database design.
It is commonly used in Normalization to achieve desirable forms like 2NF, 3NF, or BCNF.
🎯 Goals of Decomposition:
1. Eliminate redundancy
2. Avoid anomalies (insertion, update, deletion)
3. Preserve dependencies
4. Maintain data without loss
Ensures no data is lost.
A decomposition of R into R1 and R2 is lossless if:
R1 ∩ R2 → R1 or R1 ∩ R2 → R2
R(A, B, C)
F = {A → B, B → C}
Decompose R into:
1. R1 (A, B), to satisfy A → B
2. R2 (B, C), to satisfy B → C
✅ Lossless Join:
R1 ∩ R2 = {B}, and B → C ∈ F, so B determines all of R2 and the decomposition is lossless
✅ Dependency Preservation:
✅ Final Result:
Decomposition is correct and useful if it is:
Lossless
Dependency preserving
Free from redundancy and anomalies
Let me know if you want to normalize a relation through decomposition with your own example!
What is a lossless and a lossy join? Explain in detail about dependency preservation.
🔹 Condition:
For a decomposition of relation R into R₁ and R₂ to be lossless, at least one of the following must hold:
R1 ∩ R2 → R1
or R1 ∩ R2 → R2
This ensures that the common attributes can act as a key to join the relations back.
✅ Example:
Relation:
R(A, B, C)
FDs:
F = {A → B}
Decompose into:
R1 (A, B)
R2 (A, C)
Join: R1 ⋈ R2 = R
✔️ This is a lossless join because A (the common attribute) determines B, so A is a key of R1.
❌ Example:
Same relation:
R(A, B, C)
Decompose into:
R1 (B, C)
R2 (A, B)
Now if we try to join R1 ⋈ R2, we may get spurious tuples that were not present in the original relation, because the common attribute B is not a key of either R1 or R2.
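For a decomposition into two relations, the lossless condition can be tested with one attribute closure: compute the closure of the common attributes and check whether it covers R1 or R2. The names below and the single-letter encoding are assumptions for this sketch, and the test applies only to binary decompositions.

```python
def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_lossless(r1, r2, fds):
    """True if (r1 ∩ r2) -> r1 or (r1 ∩ r2) -> r2 follows from fds."""
    common_closure = closure(set(r1) & set(r2), fds)
    return set(r1) <= common_closure or set(r2) <= common_closure


F = [("A", "B")]
```

With `F = {A → B}`, `is_lossless("AB", "AC", F)` is `True`, while `is_lossless("BC", "AB", F)` is `False`, which matches the two examples.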
🔁 Dependency Preservation
📌 What is it?
A decomposition is said to have dependency preservation if all functional dependencies (FDs) from
the original relation can be enforced without needing to join the decomposed relations.
This is important because checking FDs by joining tables is costly in practice.
✅ Goal:
Let original FDs be F, and decomposed relations be R1, R2, ..., Rn with their FDs F1, F2, ..., Fn.
The decomposition is dependency preserving if (F1 ∪ F2 ∪ ... ∪ Fn)⁺ = F⁺.
✅ Example:
R(A, B, C), F = {A → B, B → C}
Decompose into:
R1 (A, B)
R2 (B, C)
FDs in:
R1 : A → B
R2 : B → C
Union: {A → B, B → C} = F
✔️ Dependency preserved
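❌ Example of Not Dependency Preserving:
Given:
R(A, B, C), F = {A → B, B → C}
Decompose into:
R1 (A, B)
R2 (A, C)
Now,
R1 : A → B
R2 : A → C (implied by F through transitivity)
The original FD B → C cannot be derived from {A → B, A → C}, because B and C never appear together in the same relation. Enforcing it would require joining R1 and R2.
❌ Not dependency preserving
Note: an FD does not have to appear verbatim in one table to be preserved. For F = {A → BC}, the same decomposition is dependency preserving, because A → B and A → C together imply A → BC by the union rule.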
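Dependency preservation can be tested without listing the projected FD sets. The sketch below uses a commonly taught procedure: for each FD X → Y, start from X and repeatedly add whatever can be derived inside a single decomposed relation, then check whether Y was reached. The function names and the single-letter encoding are assumptions for this sketch.

```python
def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_preserved(lhs, rhs, relations, fds):
    """True if lhs -> rhs can be enforced using the decomposed relations alone."""
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for rel in relations:
            rel = set(rel)
            # Attributes derivable while looking only at this relation.
            gained = closure(result & rel, fds) & rel
            if not gained <= result:
                result |= gained
                changed = True
    return set(rhs) <= result


def preserves_dependencies(relations, fds):
    """True if every FD in fds is preserved by the decomposition."""
    return all(is_preserved(lhs, rhs, relations, fds) for lhs, rhs in fds)


F = [("A", "B"), ("B", "C")]
```

`preserves_dependencies(["AB", "BC"], F)` is `True`. `preserves_dependencies(["AB", "AC"], F)` is `False`, because `B → C` spans two relations. For the single dependency `A → BC`, the decomposition into `AB` and `AC` is preserving, since `A → B` and `A → C` together imply it.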
🔚 Summary Table
Property Lossless Join Dependency Preservation
Let me know if you'd like help checking lossless join and dependency preservation for your own schema!
✅ Key Attribute
A Key Attribute is:
An attribute (or part of a set of attributes) that forms a candidate key or primary key.
It is used to uniquely identify tuples in a relation.
🔹 Examples:
In a relation `Student(roll_no, name, age)`, if `roll_no` is the primary key, then:
`roll_no` is a key attribute
If the candidate key is a composite like `(emp_id, dept_id)`, both `emp_id` and `dept_id` are key
attributes.
❌ Non-Key Attribute
A Non-Key Attribute is:
🔹 Examples:
In `Student(roll_no, name, age)`:
`name` and `age` are non-key attributes, because they are not part of the key.
Normalization:
2NF: No partial dependency of non-key attributes on part of a candidate key.
3NF: No transitive dependency of non-key attributes on key attributes.
Ensuring data integrity
Avoiding redundancy and anomalies
🧾 Summary:
| Type | Definition | Role in Database |
|---|---|---|
| Key Attribute | Part of a candidate/primary key | Helps uniquely identify tuples |
| Non-Key Attribute | Not part of any candidate key | Stores descriptive data dependent on keys |
Let me know if you'd like to see how key and non-key attributes are used in normalization with
examples!
What are anomalies in RDBMS? Explain in detail the three types of anomalies (insertion, update, deletion).
Data duplication
Inconsistent data
Inefficient data operations
🔹 Example:
| Student_ID | Student_Name | Course | Instructor |
|---|---|---|---|
| 101 | Alice | DBMS | Dr. Smith |
Suppose you want to add a new course but no student has enrolled yet.
❌ You cannot insert it because `Student_ID` and `Student_Name` are required, even though you only
want to store course data.
🔹 Example:
If `Dr. Smith` changes his name to `Dr. John Smith`, and he is listed as the instructor for multiple
students:
🔹 Example:
| Student_ID | Student_Name | Course | Instructor |
|---|---|---|---|
| 101 | Alice | DBMS | Dr. Smith |
If Alice drops the course and her record is deleted, and she was the only student enrolled, then:
❌ The course DBMS and Instructor Dr. Smith info are also lost.
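The insertion and deletion anomalies can be reproduced on a small scale. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the table name `Enrollment`, the `NOT NULL` constraints, and the extra course `OS` with instructor `Dr. Lee` are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One wide table that mixes student data with course data.
conn.execute(
    "CREATE TABLE Enrollment ("
    " Student_ID INTEGER NOT NULL,"
    " Student_Name TEXT NOT NULL,"
    " Course TEXT,"
    " Instructor TEXT)"
)
conn.execute("INSERT INTO Enrollment VALUES (101, 'Alice', 'DBMS', 'Dr. Smith')")


def try_insert_course_only():
    """Insertion anomaly: a course cannot be stored without a student."""
    try:
        conn.execute(
            "INSERT INTO Enrollment (Course, Instructor) VALUES ('OS', 'Dr. Lee')"
        )
    except sqlite3.IntegrityError:
        return "rejected"
    return "inserted"


insertion_result = try_insert_course_only()

# Deletion anomaly: removing the only enrolled student also removes the course.
conn.execute("DELETE FROM Enrollment WHERE Student_ID = 101")
remaining_dbms_rows = conn.execute(
    "SELECT COUNT(*) FROM Enrollment WHERE Course = 'DBMS'"
).fetchone()[0]
```

Here `insertion_result` is `"rejected"` and `remaining_dbms_rows` is `0`: the course cannot be added on its own, and deleting Alice's row erases every trace of the DBMS course and its instructor.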
🔁 Summary Table
| Type | What It Means | Resulting Problem |
|---|---|---|
| Insertion | Can't add data without other unnecessary data | Prevents adding independent data |
✅ Solution: Normalization
All these anomalies can be resolved by designing the schema using proper normal forms (1NF, 2NF,
3NF, BCNF), ensuring:
Less redundancy
Clear functional dependencies
Logical data separation
Let me know if you want to see how normalization helps remove these anomalies with examples!