MCS-207 2024-25

The document is a solved assignment for MCS-207 covering various topics in database management systems, including DBMS architecture, relational model terms, ER diagrams, normalization, and secondary indexes. It includes detailed explanations, examples, and tasks related to functional dependencies and normal forms. The assignment consists of five questions, each with specific requirements and marks allocated.

Uploaded by

jicof22068

MCS-207

SOLVED ASSIGNMENT
2024-25
There are five questions in this assignment, which carries 80 marks. Rest 20 marks are
for viva voce. You may use illustrations and diagrams to enhance the explanations.
Please go through the guidelines regarding assignments given in the Programme Guide
for the format of the presentation. The answer to each part of the question should be
confined to about 300 words. Make suitable assumption, if any.
Question 1: (Covers Block 1) (4+4+4+4+4=20 Marks)
(a) Explain the three level DBMS architecture with the help of an example. Also, explain
the concept of data independence in the context of database systems with the help of an
example.
Ans. Three-Level DBMS Architecture

The three-level architecture of a Database Management System (DBMS) separates the database
into three layers:

1. Internal Level: This is the lowest level, representing how data is physically stored in the
system. It manages data storage, indexing, and memory allocation. For example, data might
be stored as binary files or on disks in blocks.

2. Conceptual Level: This middle layer provides a community view of the entire database,
abstracting away details of physical storage. It defines what data is stored and the
relationships between different data types. For instance, in a college database, tables like
"Students" and "Courses" would be defined, but physical storage details would be hidden.

3. External Level: This topmost level shows how individual users or applications view the
data. Different users can see different views of the same database. For example, a professor
may see only students' academic records, while an administrator might see financial
information.

Data Independence

Data independence allows changes to one level of the DBMS without affecting others. There
are two types:

- Logical Data Independence: Changing the conceptual schema (e.g., adding a field to a table)
without affecting external views.

- Physical Data Independence: Changing the internal schema (e.g., how data is stored)
without altering the conceptual schema.

For example, the physical storage format of student records can be changed, but users
accessing those records via SQL queries remain unaffected.
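
Logical data independence can be tried out with a small sketch. It uses Python's built-in sqlite3 module; the Students table, the Professor_View view and the sample rows are assumptions made up for illustration. A view plays the role of the external level, and a column is then added to the base table.

```python
import sqlite3

# Hypothetical college database held in memory; all names are illustrative.
conn = sqlite3.connect(":memory:")

# Conceptual level: the base table that describes what data is stored.
conn.execute(
    "CREATE TABLE Students (Student_ID INTEGER PRIMARY KEY, Name TEXT, Grade TEXT, Fees_Due REAL)"
)
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?, ?)",
    [(1, "Asha", "A", 0.0), (2, "Ravi", "B", 1500.0)],
)

# External level: a professor's view that exposes academic data only.
conn.execute("CREATE VIEW Professor_View AS SELECT Student_ID, Name, Grade FROM Students")
before = conn.execute("SELECT * FROM Professor_View ORDER BY Student_ID").fetchall()

# Change the conceptual schema by adding a field to the table.
conn.execute("ALTER TABLE Students ADD COLUMN Email TEXT")

# The external view returns exactly the same rows as before the change.
after = conn.execute("SELECT * FROM Professor_View ORDER BY Student_ID").fetchall()
print(before == after)
```

The professor's view never mentions Fees_Due or Email, so neither the hidden financial column nor the schema change is visible through it.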

(b) Explain the following terms in the context of a relational model with the help of one
example of each– Super key, Domain, Cartesian Product, Primary Key, Natural join,
Set Intersection, Set Difference operation and referential integrity constraint.
Ans. Relational Model Terms

1. Super Key: A super key is a set of one or more attributes that uniquely identifies a tuple
(row) in a table. It does not have to be minimal.
Example: In a table Students, (Student_ID, Email) is a super key: Student_ID alone already
identifies each student, so any set of attributes that contains it does too.

2. Domain: A domain refers to the set of permissible values for an attribute.
Example: The domain of the attribute Age in a Students table could be integers between 18
and 30.

3. Cartesian Product: This is the combination of all possible pairs of rows from two tables.
Example: If Students has 3 rows and Courses has 2 rows, the Cartesian product of Students
× Courses will produce 6 rows.

4. Primary Key: A primary key is a minimal super key that uniquely identifies each row in a
table.
Example: In a Students table, the attribute Student_ID can serve as the primary key because
it uniquely identifies each student.

5. Natural Join: A natural join combines two tables on equal values of their identically named
attributes and keeps only one copy of each common attribute.
Example: Joining Students and Courses on the common column Student_ID returns one row for
each pair of rows whose Student_ID values match, with Student_ID appearing only once.

6. Set Intersection: It returns the rows that appear in both of two union-compatible tables
(tables with the same attributes).
Example: If Table A and Table B both contain student records, the intersection will
return only the records present in both tables.

7. Set Difference: This operation returns the rows present in one table but not in the other.
Example: If Table A has students enrolled in Course A, and Table B has students enrolled in
Course B, the difference between Table A and Table B will show students enrolled only in
Course A.

8. Referential Integrity Constraint: This ensures that a foreign key value in one table either
matches an existing value of the referenced primary key in another table or is NULL.
Example: In a Courses table, the Student_ID (foreign key) must match a valid Student_ID
in the Students table.
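
The set-based operations above can be sketched with Python sets of tuples. The relations and rows below are made-up sample data, and the enrolled relation is a hypothetical helper introduced only to give the join, intersection and difference something to work on.

```python
# Relations modelled as sets of tuples; all rows are made-up sample data.
students = {(1, "Asha"), (2, "Ravi"), (3, "Meena")}      # (Student_ID, Name)
courses = {("C1", "DBMS"), ("C2", "Networks")}           # (Course_ID, Title)
enrolled = {(1, "C1"), (2, "C1"), (2, "C2")}             # (Student_ID, Course_ID)

# Cartesian product: every student row paired with every course row.
product = {s + c for s in students for c in courses}

# Natural join of students and enrolled on the shared attribute Student_ID;
# the common column appears only once in each result row.
joined = {(sid, name, cid)
          for (sid, name) in students
          for (sid2, cid) in enrolled if sid == sid2}

# Intersection and difference need union-compatible relations,
# here two sets of Student_ID values.
in_c1 = {sid for (sid, cid) in enrolled if cid == "C1"}
in_c2 = {sid for (sid, cid) in enrolled if cid == "C2"}
both = in_c1 & in_c2        # students in both courses
only_c1 = in_c1 - in_c2     # students in C1 but not in C2

print(len(product), sorted(joined), both, only_c1)
```

With 3 student rows and 2 course rows the product has 6 rows, which matches the Cartesian product example in the list.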

(c) A University maintains the list of the books available in its library using a database
system. In addition, this system is used for issue and return of books to its students. This
database is used to find the following details by the students of the university and the
staff of the library:
• List of the classification number, ISBN number, Title, Author Names, Subject Area of
the books.
• Searching of books using subject area, Title and Author name.
• List of books that are issued to a specific student. Draw an ER diagram for the library.
Specify key attributes and constraints on each entity type and on each relationship type.
Note any unspecified requirements and make appropriate assumptions to make the
specification complete.
Ans. ER Diagram for Library Database System

The university library system tracks the books, students, and book transactions (issue and
return). The following ER diagram captures the entities, relationships, key attributes, and
constraints.

Entities:

1. Book:
- Attributes:
- ISBN (Primary Key): Unique identifier for each book.

- Classification Number: A code representing the categorization of the book.
- Title: The name of the book.
- Author Names: Names of the book’s authors.
- Subject Area: The topic the book covers.
- Constraints:
- A book must have one ISBN and one classification number.
- A book can have multiple authors.

2. Student:
- Attributes:
- Student_ID (Primary Key): Unique identifier for each student.
- Name: Name of the student.
- Department: Academic department to which the student belongs.
- Year of Study: Year the student is currently in.
- Constraints:
- A student must have a unique Student_ID.

3. Transaction (Book Issue/Return):
- Attributes:
- Transaction_ID (Primary Key): Unique identifier for each transaction.
- Issue Date: Date when the book was issued.
- Return Date: Date when the book was returned.
- Constraints:
- A transaction must have an associated issue date and optionally a return date.

Relationships:

1. Written By (Between Book and Author):
- Cardinality:
- A book can have one or more authors.
- An author can write multiple books.
- Taken together, the relationship is many-to-many (M:N).

2. Issued To (Between Book and Student):
- Attributes:
- Issue Date: Date of issue of the book.
- Return Date: Date of return.
- Cardinality:
- A student can borrow multiple books (1:N).
- A book can be issued to one student at a time (N:1).

Key Attributes and Constraints:

- Book: ISBN is the primary key.
- Student: Student_ID is the primary key.
- Transaction: Transaction_ID is the primary key, and foreign keys are Student_ID and ISBN
(both are necessary to track issued books).

- Constraints:
- Each book must have a unique ISBN.
- A book can only be issued to one student at a time.
- A student cannot issue more than a predefined number of books (assume 5 books).

Unspecified Requirements and Assumptions:

- Each student is allowed to borrow a maximum of 5 books.
- A return date is optional and exists only if the book has been returned.
- The library staff can add books and students to the system.
- Authors are stored as a separate entity due to their multiple book contributions.

This model captures the essential functions of the system: listing, searching, and tracking
book transactions. Enforcing referential integrity ensures that every transaction refers to an
existing book and an existing student, so all records stay consistent.

(d) Design normalised tables in 3NF for the ER diagram drawn in part (c), with the required
integrity constraints.
Ans. Normalized Tables in 3NF with Integrity Constraints

Based on the ER diagram for the university library system, the normalized tables in Third
Normal Form (3NF) are:

1. Book
- Attributes:
- ISBN (Primary Key): Unique identifier for each book.
- Classification_Number: Classification code for the book.
- Title: Title of the book.
- Subject_Area: Subject area of the book.

- Constraints:
- ISBN is the primary key.
- Each book’s Classification_Number, Title, and Subject_Area must be uniquely associated
with ISBN.

2. Author
- Attributes:
- Author_ID (Primary Key): Unique identifier for each author.
- Author_Name: Name of the author.

- Constraints:
- Author_ID is the primary key.
- Author_Name must be unique.

3. Book_Author
- Attributes:
- ISBN (Foreign Key): References Book.ISBN.
- Author_ID (Foreign Key): References Author.Author_ID.
- Constraints:
- Composite primary key (ISBN, Author_ID).
- Ensures that each book-author pair is unique.

4. Student
- Attributes:
- Student_ID (Primary Key): Unique identifier for each student.
- Name: Name of the student.
- Department: Department to which the student belongs.
- Year_of_Study: Academic year of the student.

- Constraints:
- Student_ID is the primary key.

5. Transaction
- Attributes:
- Transaction_ID (Primary Key): Unique identifier for each transaction.
- ISBN (Foreign Key): References Book.ISBN.
- Student_ID (Foreign Key): References Student.Student_ID.
- Issue_Date: Date when the book was issued.
- Return_Date: Date when the book was returned (nullable).

- Constraints:
- Transaction_ID is the primary key.
- ISBN and Student_ID together with Issue_Date ensure accurate tracking of issued books.

Integrity Constraints
- Referential Integrity:
- ISBN in Book_Author and Transaction references Book.ISBN.
- Author_ID in Book_Author references Author.Author_ID.
- Student_ID in Transaction references Student.Student_ID.
- Uniqueness:
- ISBN and Author_ID combination in Book_Author must be unique.
- Each Student_ID and Transaction_ID must be unique.
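
The five tables and their constraints could be declared roughly as follows. This is a sketch in SQLite through Python's sqlite3 module, not the only possible schema: the column types, the NOT NULL choices and the sample rows are assumptions, and the table name Transaction is quoted because it is also an SQL keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.executescript("""
CREATE TABLE Book (
    ISBN TEXT PRIMARY KEY,
    Classification_Number TEXT NOT NULL,
    Title TEXT NOT NULL,
    Subject_Area TEXT
);
CREATE TABLE Author (
    Author_ID INTEGER PRIMARY KEY,
    Author_Name TEXT NOT NULL UNIQUE
);
CREATE TABLE Book_Author (
    ISBN TEXT NOT NULL REFERENCES Book(ISBN),
    Author_ID INTEGER NOT NULL REFERENCES Author(Author_ID),
    PRIMARY KEY (ISBN, Author_ID)
);
CREATE TABLE Student (
    Student_ID INTEGER PRIMARY KEY,
    Name TEXT NOT NULL,
    Department TEXT,
    Year_of_Study INTEGER
);
CREATE TABLE "Transaction" (
    Transaction_ID INTEGER PRIMARY KEY,
    ISBN TEXT NOT NULL REFERENCES Book(ISBN),
    Student_ID INTEGER NOT NULL REFERENCES Student(Student_ID),
    Issue_Date TEXT NOT NULL,
    Return_Date TEXT            -- NULL until the book is returned
);
""")

# Made-up sample rows.
conn.execute("INSERT INTO Book VALUES ('978-0', 'QA76', 'Sample Title', 'Databases')")
conn.execute("INSERT INTO Student VALUES (1, 'Asha', 'CS', 2)")
conn.execute("INSERT INTO \"Transaction\" VALUES (1, '978-0', 1, '2024-08-01', NULL)")

# A transaction for a student that does not exist violates referential integrity.
try:
    conn.execute("INSERT INTO \"Transaction\" VALUES (2, '978-0', 99, '2024-08-02', NULL)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)
```

The rule that a student may hold at most 5 books at a time is not expressible as a simple column constraint and is left out of this sketch.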

(e) Explain how the secondary index can be created in a file. Also, explain the
advantages and disadvantages of using secondary indexes. When should you use
secondary Indexes? Give reasons in support of your answer
Ans. Creating a Secondary Index

A secondary index is created to improve query performance for non-primary key attributes in
a file. Here's how it can be created:

1. Select the Attribute: Choose the attribute (non-primary key) that frequently appears in
query conditions.

2. Create the Index: Build a separate index file with entries containing the values of the
chosen attribute and pointers (addresses) to the corresponding records in the main file. For
example, if indexing the Author_Name attribute, the secondary index will list Author_Name
values and their record addresses.

3. Maintain the Index: Update the secondary index whenever records are added, deleted, or
modified in the main file to ensure it remains synchronized.
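
The three steps can be sketched in Python. This is a toy in-memory model rather than a real file organisation: list positions stand in for record addresses, and the records and function names are assumptions made for illustration.

```python
from collections import defaultdict

# Main file: records stored in primary-key order; the position in the list
# stands in for a record address. Sample data is made up.
main_file = [
    {"ISBN": "111", "Author_Name": "Date"},
    {"ISBN": "222", "Author_Name": "Korth"},
    {"ISBN": "333", "Author_Name": "Date"},
]

def build_secondary_index(records, attribute):
    """Map each attribute value to the list of record addresses holding it."""
    index = defaultdict(list)
    for address, record in enumerate(records):
        index[record[attribute]].append(address)
    return index

def insert_record(records, index, attribute, record):
    """Append a record and keep the secondary index synchronized."""
    records.append(record)
    index[record[attribute]].append(len(records) - 1)

author_index = build_secondary_index(main_file, "Author_Name")
insert_record(main_file, author_index, "Author_Name",
              {"ISBN": "444", "Author_Name": "Korth"})

# Look up by the non-key attribute without scanning the whole file.
date_books = [main_file[a]["ISBN"] for a in author_index["Date"]]
print(date_books)
```

Because Author_Name is not unique, each index entry holds a list of addresses; the extra work in insert_record is the maintenance overhead that every secondary index adds to updates.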

Advantages and Disadvantages

Advantages:
- Faster Query Performance: Speeds up searches, especially for non-primary key attributes.
- Improved Sorting and Filtering: Allows efficient sorting and filtering based on non-primary
key attributes.

Disadvantages:
- Increased Storage: Requires additional storage for the index file.
- Performance Overhead: Can slow down insert, update, and delete operations due to the need
to maintain the index.

When to Use Secondary Indexes

Use secondary indexes when:

- Queries frequently involve non-primary key attributes.
- Performance improvement for these queries outweighs the overhead of maintaining the
index.

Reasons:
- They enhance performance for specific query patterns.
- They help in optimizing the retrieval of data based on attributes other than the primary key,
improving overall efficiency.

However, avoid excessive indexing, as it can lead to significant maintenance costs and
storage overhead. Use secondary indexes selectively to balance query performance and
system resources.

Question 2: (Covers Block 2) (4+4+2+10=20 Marks)


(a) Consider a Relation: Student (EnrolNo, StudentName, ProgrammeCode,
ProgrammeName, CourseCode, CourseName, Grade). Some of the constraints on the
relation Student are:
• A student is assigned one unique Enrolment Number (EnrolNo).
• Two students may have same name.
• A student can register in one Programme only.
• A Programme consists of several compulsory courses.
• Grades can be “A”, “B”, “C”, or “D” and only one grade value is recorded for a
student in a course. Perform the following task for the relation given above:
(i) What is the key to the relation?
(ii) Identify and list the functional dependencies in the relation.
(iii)Make an instance of this relation showing possible redundancies.
(iv) Decompose R into 2NF and 3NF relations.
Ans. (a) Relation: Student (EnrolNo, StudentName, ProgrammeCode, ProgrammeName,
CourseCode, CourseName, Grade)

(i) Key to the Relation

The primary key for this relation is a composite key: (EnrolNo, CourseCode). This
combination uniquely identifies each record because:
- EnrolNo ensures the uniqueness of each student.
- CourseCode specifies the unique course for each student.

(ii) Functional Dependencies

1. EnrolNo → StudentName, ProgrammeCode
2. ProgrammeCode → ProgrammeName
3. CourseCode → CourseName
4. (EnrolNo, CourseCode) → Grade
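
The choice of key can be checked against these dependencies by computing an attribute closure. The sketch below encodes the four dependencies as Python sets; the function name closure is an illustrative choice.

```python
# Functional dependencies from the list above, as (left side, right side) pairs.
FDS = [
    ({"EnrolNo"}, {"StudentName", "ProgrammeCode"}),
    ({"ProgrammeCode"}, {"ProgrammeName"}),
    ({"CourseCode"}, {"CourseName"}),
    ({"EnrolNo", "CourseCode"}, {"Grade"}),
]
ALL_ATTRS = {"EnrolNo", "StudentName", "ProgrammeCode", "ProgrammeName",
             "CourseCode", "CourseName", "Grade"}

def closure(attrs, fds):
    """Return every attribute determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

key = {"EnrolNo", "CourseCode"}
is_superkey = closure(key, FDS) == ALL_ATTRS
# Minimal: dropping either attribute loses the key property.
is_minimal = all(closure(key - {a}, FDS) != ALL_ATTRS for a in key)
print(is_superkey, is_minimal)
```

Both checks succeed: the closure of (EnrolNo, CourseCode) covers every attribute, while neither EnrolNo nor CourseCode does so alone.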

(iii) Instance of the Relation Showing Redundancies

Redundancies:
- StudentName and ProgrammeCode are repeated for every course a student takes.
- ProgrammeName is repeated for each student in the same programme.
- CourseName is repeated for every student who takes the same course.
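
These redundancies can be counted on a small instance. Every value in the rows below is invented for illustration and is not taken from the assignment; the helper repeats is likewise a hypothetical name.

```python
from collections import Counter

# Hypothetical instance of Student; every value here is invented for illustration.
COLUMNS = ("EnrolNo", "StudentName", "ProgrammeCode", "ProgrammeName",
           "CourseCode", "CourseName", "Grade")
rows = [
    ("E001", "Anita", "P01", "Programme Alpha", "C01", "Course One", "A"),
    ("E001", "Anita", "P01", "Programme Alpha", "C02", "Course Two", "B"),
    ("E002", "Anita", "P01", "Programme Alpha", "C01", "Course One", "C"),
    ("E003", "Bharat", "P02", "Programme Beta", "C03", "Course Three", "A"),
]

def repeats(left, right):
    """Count how many times each (left, right) pairing of column values is stored."""
    i, j = COLUMNS.index(left), COLUMNS.index(right)
    return Counter((r[i], r[j]) for r in rows)

programme_repeats = repeats("ProgrammeCode", "ProgrammeName")
course_repeats = repeats("CourseCode", "CourseName")
print(programme_repeats[("P01", "Programme Alpha")])  # stored 3 times
print(course_repeats[("C01", "Course One")])          # stored 2 times
```

Two different students share the name Anita, which the constraints allow. The same programme name is stored three times and the same course name twice, so a change to either would have to be made in several rows.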

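(iv) Decompose into 2NF and 3NF Relations

2NF Decomposition

The key is (EnrolNo, CourseCode). Attributes that depend on only part of the key are moved
into their own relations:

1. Student:
- EnrolNo (Primary Key)
- StudentName
- ProgrammeCode
- ProgrammeName

2. Course:
- CourseCode (Primary Key)
- CourseName

3. Enrollment:
- EnrolNo (Foreign Key)
- CourseCode (Foreign Key)
- Grade
- Primary key: (EnrolNo, CourseCode)

3NF Decomposition

Student still contains the transitive dependency EnrolNo → ProgrammeCode → ProgrammeName,
so ProgrammeName is moved to a separate relation:

1. Student:
- EnrolNo (Primary Key)
- StudentName
- ProgrammeCode (Foreign Key)

2. Programme:
- ProgrammeCode (Primary Key)
- ProgrammeName

3. Course: Already in 3NF.

4. Enrollment: Already in 3NF.

In 3NF, every non-key attribute in each relation is fully functionally dependent on the
primary key, and there are no transitive dependencies.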
(b) Explain the concept of Multi-valued dependency and Join dependency with the help
of an example of each. Also, explain the 4th Normal Form and 5th Normal form.
Ans. Multi-Valued Dependency (MVD) and Join Dependency

Multi-Valued Dependency (MVD)

Concept: A multi-valued dependency A →→ B holds in a relation when, for a given value of A,
the set of B values is determined by A alone and is independent of the remaining attributes.
A single value of A can therefore be associated with several values of B.

Example:
Consider a relation `R(A, B, C)` where:
- `A` →→ `B`
- `A` →→ `C`

This implies that for each value of `A`, there can be multiple values for `B` and `C`
independently.

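Instance of the relation (illustrative values):
| A | B | C |
| 1 | X | M |
| 1 | X | N |
| 1 | Y | M |
| 1 | Y | N |

Explanation: For `A = 1`, every value of `B` (X, Y) appears with every value of `C` (M, N),
so `B` and `C` vary independently, showing a multi-valued dependency.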

Join Dependency (JD)

Concept: A join dependency occurs when a relation can be decomposed into multiple
relations such that the original relation can be reconstructed by joining these decomposed
relations.

Example:
Consider a relation `R(A, B, C)` with join dependency if `R` can be decomposed into `R1(A,
B)` and `R2(A, C)`, and joining `R1` and `R2` on attribute `A` will yield the original relation
`R`.

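Instance of the relation (the join of the two projections listed under Decomposition):
| A | B | C |
| 1 | X | M |
| 1 | X | N |
| 1 | Y | M |
| 1 | Y | N |

Decomposition:
- `R1(A, B)`: | 1 | X | and | 1 | Y |
- `R2(A, C)`: | 1 | M | and | 1 | N |

Join:
- `R1` ⨝ `R2` on `A` reconstructs the original relation.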
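
Whether a join dependency holds for a given instance can be tested by projecting and rejoining. The sketch below uses Python sets; the relation R is the one obtained by joining the R1 and R2 rows listed above, and the variable names are illustrative.

```python
# Instance obtained by joining the R1 and R2 rows listed above, as (A, B, C) tuples.
R = {(1, "X", "M"), (1, "X", "N"), (1, "Y", "M"), (1, "Y", "N")}

# Project R onto (A, B) and (A, C).
R1 = {(a, b) for (a, b, c) in R}
R2 = {(a, c) for (a, b, c) in R}

# Natural join of R1 and R2 on A.
joined = {(a, b, c) for (a, b) in R1 for (a2, c) in R2 if a == a2}
print(joined == R)   # the join dependency holds for this instance

# Removing one tuple breaks the dependency: the join brings back a
# tuple that the smaller relation does not contain.
S = R - {(1, "Y", "N")}
S1 = {(a, b) for (a, b, c) in S}
S2 = {(a, c) for (a, b, c) in S}
rejoined = {(a, b, c) for (a, b) in S1 for (a2, c) in S2 if a == a2}
print(rejoined == S)
```

The second check fails because the join produces a spurious tuple, which is exactly the loss of information that a decomposition without the join dependency would cause.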

4th Normal Form (4NF)

Concept: A relation is in 4NF if it is in Boyce-Codd Normal Form (BCNF) and, for every
non-trivial multi-valued dependency X →→ Y that holds in it, X is a superkey.

Example:
Consider a relation `R(A, B, C)` with the multi-valued dependencies `A →→ B` and `A →→ C`,
where `A` is not a superkey (the only key is the whole of `(A, B, C)`). `R` is therefore not
in 4NF.

Decomposition:
- `R1(A, B)`
- `R2(A, C)`
Each decomposed relation is in 4NF because there are no non-trivial multi-valued
dependencies.

5th Normal Form (5NF)

Concept: A relation is in 5NF (or Project-Join Normal Form) if it is in 4NF and every
non-trivial join dependency in the relation is a consequence of the candidate keys. Such a
relation cannot be losslessly decomposed any further.

Example:
Consider a relation `R(A, B, C, D)` with join dependency where:

- `R` can be decomposed into `R1(A, B)`, `R2(B, C)`, and `R3(C, D)`.

Instance (the join of the three projections listed under Decomposition):
| A | B | C | D |
| 1 | X | M | N |
| 1 | Y | M | N |

Decomposition:
- `R1(A, B)`: | 1 | X | and | 1 | Y |
- `R2(B, C)`: | X | M | and | Y | M |
- `R3(C, D)`: | M | N |

Join: Joining `R1`, `R2`, and `R3` on their common attributes reconstructs the original
relation. If this join dependency is not implied by the candidate keys of `R`, then `R`
itself is not in 5NF, and decomposing it into `R1`, `R2`, and `R3` is what brings the design
to 5NF.

In summary, 4NF addresses multi-valued dependencies, while 5NF addresses join
dependencies, ensuring that the decomposed relations can be joined back to recreate the
original data without loss of information.

(c) Explain the following terms with the help of an example of each – Assertion, Cursor,
Stored Procedure, Triggers.
Ans. Assertion, Cursor, Stored Procedure, and Triggers
1. Assertion

Concept: An assertion is a condition or constraint that must always hold true for a database. It
enforces business rules at the database level.

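Example: Consider a database where employees should not have a salary greater than
$100,000. An assertion for this rule states that no row of the employee table has a salary
above 100,000. In standard SQL it is written with `CREATE ASSERTION ... CHECK (...)`,
although many DBMSs do not implement that statement.

This ensures that no employee can have a salary exceeding $100,000.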
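
SQLite has no `CREATE ASSERTION`, so the sketch below swaps in a column-level CHECK constraint that enforces the same salary rule. It is a stand-in for an assertion, not an assertion itself; the Employee table, its columns and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical Employee table; the CHECK clause carries the salary rule.
conn.execute("""
    CREATE TABLE Employee (
        EmpID INTEGER PRIMARY KEY,
        Name TEXT NOT NULL,
        Salary REAL NOT NULL CHECK (Salary <= 100000)
    )
""")
conn.execute("INSERT INTO Employee VALUES (1, 'Asha', 90000)")

def try_insert(emp_id, name, salary):
    """Return True if the row is accepted, False if the rule rejects it."""
    try:
        conn.execute("INSERT INTO Employee VALUES (?, ?, ?)", (emp_id, name, salary))
        return True
    except sqlite3.IntegrityError:
        return False

accepted = try_insert(2, "Ravi", 100000)        # exactly at the limit
rejected = not try_insert(3, "Meena", 120000)   # above the limit
print(accepted, rejected)
```

A CHECK constraint covers a single table, whereas an assertion can span several tables; for a one-table rule like this one the effect is the same.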

2. Cursor

Concept: A cursor is a database object used to retrieve, manipulate, and navigate through a
result set row by row.

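Example: To process each employee’s record in a database, a cursor is declared over a query
that selects the employee ID and name, opened, fetched from one row at a time until no rows
remain, and then closed.

This cursor fetches and processes each employee’s ID and name.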
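
The declare, open, fetch and close cycle can be sketched with the cursor object of Python's sqlite3 module, which behaves much like an SQL cursor. The Employee table and its rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employee (EmpID INTEGER PRIMARY KEY, Name TEXT)")
conn.executemany("INSERT INTO Employee VALUES (?, ?)",
                 [(1, "Asha"), (2, "Ravi"), (3, "Meena")])

cursor = conn.cursor()                                             # declare
cursor.execute("SELECT EmpID, Name FROM Employee ORDER BY EmpID")  # open
processed = []
while True:
    row = cursor.fetchone()          # fetch the next row
    if row is None:                  # no rows left
        break
    emp_id, name = row
    processed.append(f"{emp_id}:{name}")
cursor.close()                                                     # close
print(processed)
```

Each call to fetchone advances the cursor by one row, so the loop body sees exactly one employee at a time.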

3. Stored Procedure

Concept: A stored procedure is a precompiled collection of SQL statements and optional
control-of-flow statements stored under a name and processed as a unit.

Example: A stored procedure to update an employee's salary takes the employee ID and the
new salary as parameters and runs a single UPDATE statement on the employee table.

This procedure updates the salary of an employee with a specific ID.

4. Trigger

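Concept: A trigger is a set of instructions automatically executed in response to certain events
on a table or view.

Example: A trigger to automatically log salary changes is defined to fire after each update of
the salary column and to insert the employee ID, the old salary, and the new salary into a
log table.

This trigger logs changes to employee salaries into a separate table.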
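
A salary-logging trigger of this kind might look as follows in SQLite, run here through Python's sqlite3 module. The table names, column names and trigger name are assumptions, and other DBMSs use slightly different trigger syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (EmpID INTEGER PRIMARY KEY, Name TEXT, Salary REAL);
CREATE TABLE Salary_Log (EmpID INTEGER, Old_Salary REAL, New_Salary REAL);

-- Fires after every salary update and records the old and new values.
CREATE TRIGGER log_salary_change
AFTER UPDATE OF Salary ON Employee
FOR EACH ROW
BEGIN
    INSERT INTO Salary_Log VALUES (OLD.EmpID, OLD.Salary, NEW.Salary);
END;

INSERT INTO Employee VALUES (1, 'Asha', 50000);
UPDATE Employee SET Salary = 55000 WHERE EmpID = 1;
""")
log_rows = conn.execute("SELECT * FROM Salary_Log").fetchall()
print(log_rows)
```

The UPDATE statement never mentions Salary_Log; the log row appears because the trigger runs automatically.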

(d) Consider the following relational database:


Customer (custId, custName, custAddress, custPhone)
Account (AccountNumber, custId, TypeOfAccount, Balance)
Transaction (DateTimeofTransaction, AccountNumber, DebitORCredit, Amount)
The underlined attribute(s) in the relation forms the primary key. In relation Customer,
the custID is the unique identifier of a customer. The purposes of other attributes in
Customer relation are self explanatory. You may define the domain of different
attributes. The TypeOfAccount attribute can take the value (“Saving”, “Current”,
“Salary”, “Other”). Please note that at a specific time of a date only one transaction can
be performed from an account. Please also note that the Account relation has foreign
key custID and Transaction relation has foreign key AccountNumber. Write and run the
following SQL queries on the database:
(i) Create the tables with the primary and foreign key constraints
(ii) Insert at least 5 records in the first 2 tables and 20 records in the 3rd table.
(iii) List “Saving” Account details showing the AccountNumber, custName, custPhone,
Balance of all the accounts. These records should be shown in the order of custName.
(iv) Find all the transactions made by customer whose custID is “240002”.
(v) Find the list of those customers who have more than one account.
(vi) Find the total of Debit transactions made for Account Number “A0054”
(vii) Find the total of balance of all the accounts of a customer.
(viii) Find the list of customers whose name start with an alphabet “B”
(ix) List the pair of customers who share the same phone number.
(x) Find the list of the customers, who have not made any credit transaction since 1st
January, 2023.
Ans. SQL Queries for the Given Database

(i) Create Tables with Primary and Foreign Key Constraints

(ii) Insert Records


(iii) List “Saving” Account Details
(iv) Find All Transactions by Customer with `custID` 240002

(v) Find Customers with More Than One Account

(vi) Total Debit Transactions for Account Number “A0054”

(vii) Total Balance of All Accounts of a Customer

(viii) Customers Whose Name Starts with “B”


(ix) List of Customers Sharing the Same Phone Number

(x) Customers Who Have Not Made Any Credit Transaction Since January 1, 2023

These queries cover a range of operations including table creation, data insertion, and
complex data retrieval.
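
One possible form of the table definitions and of queries (iii) to (x) is sketched below in SQLite through Python's sqlite3 module. Several things are assumptions: the column types, the composite primary key of Transaction (taken from the rule that an account has at most one transaction at a given date and time), the values 'Debit' and 'Credit', and all sample rows. Only a few rows are inserted to keep the sketch short, not the 5 and 20 rows the task asks for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Customer (
    custId TEXT PRIMARY KEY, custName TEXT NOT NULL,
    custAddress TEXT, custPhone TEXT
);
CREATE TABLE Account (
    AccountNumber TEXT PRIMARY KEY,
    custId TEXT NOT NULL REFERENCES Customer(custId),
    TypeOfAccount TEXT NOT NULL
        CHECK (TypeOfAccount IN ('Saving', 'Current', 'Salary', 'Other')),
    Balance REAL NOT NULL
);
CREATE TABLE "Transaction" (
    DateTimeofTransaction TEXT NOT NULL,
    AccountNumber TEXT NOT NULL REFERENCES Account(AccountNumber),
    DebitORCredit TEXT NOT NULL CHECK (DebitORCredit IN ('Debit', 'Credit')),
    Amount REAL NOT NULL,
    PRIMARY KEY (DateTimeofTransaction, AccountNumber)
);
INSERT INTO Customer VALUES
    ('240001', 'Bina', 'Delhi', '9000000001'),
    ('240002', 'Amit', 'Pune', '9000000002'),
    ('240003', 'Bharat', 'Agra', '9000000001');
INSERT INTO Account VALUES
    ('A0054', '240002', 'Saving', 5000),
    ('A0055', '240002', 'Current', 12000),
    ('A0056', '240001', 'Saving', 800);
INSERT INTO "Transaction" VALUES
    ('2023-03-01 10:00:00', 'A0054', 'Debit', 200),
    ('2023-04-02 11:30:00', 'A0054', 'Debit', 300),
    ('2023-05-05 09:15:00', 'A0054', 'Credit', 1000),
    ('2022-12-31 17:00:00', 'A0056', 'Credit', 400);
""")

def q(sql, params=()):
    """Run one query and return all of its rows."""
    return conn.execute(sql, params).fetchall()

# (iii) Saving accounts ordered by customer name.
saving = q("""SELECT a.AccountNumber, c.custName, c.custPhone, a.Balance
              FROM Account a JOIN Customer c ON c.custId = a.custId
              WHERE a.TypeOfAccount = 'Saving' ORDER BY c.custName""")
# (iv) Transactions of customer 240002.
txns = q("""SELECT t.* FROM "Transaction" t JOIN Account a
            ON a.AccountNumber = t.AccountNumber WHERE a.custId = ?""", ("240002",))
# (v) Customers with more than one account.
multi = q("""SELECT c.custId, c.custName FROM Customer c JOIN Account a
             ON a.custId = c.custId GROUP BY c.custId, c.custName HAVING COUNT(*) > 1""")
# (vi) Total of debit transactions on A0054.
debit_total = q("""SELECT SUM(Amount) FROM "Transaction"
                   WHERE AccountNumber = 'A0054' AND DebitORCredit = 'Debit'""")[0][0]
# (vii) Total balance per customer.
totals = q("SELECT custId, SUM(Balance) FROM Account GROUP BY custId ORDER BY custId")
# (viii) Names starting with B.
b_names = q("SELECT custName FROM Customer WHERE custName LIKE 'B%' ORDER BY custName")
# (ix) Pairs of customers sharing a phone number (each pair listed once).
pairs = q("""SELECT c1.custName, c2.custName FROM Customer c1 JOIN Customer c2
             ON c1.custPhone = c2.custPhone AND c1.custId < c2.custId""")
# (x) Customers with no credit transaction since 1 January 2023.
no_credit = q("""SELECT c.custId FROM Customer c WHERE NOT EXISTS (
                   SELECT 1 FROM Account a JOIN "Transaction" t
                     ON t.AccountNumber = a.AccountNumber
                   WHERE a.custId = c.custId AND t.DebitORCredit = 'Credit'
                     AND t.DateTimeofTransaction >= '2023-01-01')
                 ORDER BY c.custId""")
print(saving, multi, debit_total, pairs, no_credit)
```

Dates are stored as ISO-8601 text so that string comparison orders them correctly. Query (x) also returns customers who have no account at all, since they have made no credit transaction either.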

Question 3: (Covers Block 3) (2+4+2+4+2+3+3=20 Marks)


(a) Explain the ACID properties of transactions with the help of an example of each.
Ans. The ACID properties ensure reliable transactions in databases:

1. Atomicity: This property ensures that a transaction is all-or-nothing. For example, in a
bank transfer, either both debit and credit operations occur, or neither does. If a failure occurs
during the transfer, the system rolls back to the initial state, ensuring no partial updates.

2. Consistency: This property ensures that a transaction brings the database from one valid
state to another. For instance, if a rule requires that a transfer leaves the combined balance
of the two accounts unchanged, a transaction that transfers funds must debit one account by
exactly the amount it credits to the other, preserving database integrity before and after
the transaction.

3. Isolation: This property ensures that concurrent transactions do not interfere with each
other. For example, if two transactions simultaneously attempt to update the same account
balance, isolation ensures each transaction is executed in a way that they do not affect each
other's results.

4. Durability: This property ensures that once a transaction is committed, its changes persist
even in the case of a system crash. For example, once a payment is processed and committed,
it remains in the database even if the system shuts down immediately afterward.
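
Atomicity of a bank transfer can be sketched with Python's sqlite3 module. The Account table, the non-negative balance rule and the amounts are assumptions; the second transfer is made to fail halfway so that the rollback is visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Account (AccountNumber TEXT PRIMARY KEY, Balance REAL CHECK (Balance >= 0))"
)
conn.executemany("INSERT INTO Account VALUES (?, ?)", [("A1", 100.0), ("A2", 50.0)])
conn.commit()

def transfer(src, dst, amount):
    """Move amount from src to dst as one atomic unit; return True on commit."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute("UPDATE Account SET Balance = Balance + ? WHERE AccountNumber = ?",
                         (amount, dst))
            conn.execute("UPDATE Account SET Balance = Balance - ? WHERE AccountNumber = ?",
                         (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer("A1", "A2", 30.0)        # succeeds
failed = transfer("A1", "A2", 500.0)   # debit would go negative, so the credit is undone too
balances = dict(conn.execute("SELECT AccountNumber, Balance FROM Account"))
print(ok, failed, balances)
```

After the failed transfer the balances are the same as after the first one: the credit that had already been applied to A2 is rolled back together with the rejected debit.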

(b) What are the problems that can be encountered, if the three transactions (given in
Figure 1) are run concurrently? Explain with the help of different transaction schedules
of these transactions

Ans. Potential Problems in Concurrent Transactions


When multiple transactions are executed concurrently, there's a risk of inconsistencies in the
database state. This can lead to incorrect results and data corruption. The following problems
can arise:
1. Lost Update Problem
Definition: A lost update occurs when two transactions modify the same data item and one
transaction's update is overwritten by the other.
Example:
• Transaction A reads X.
• Transaction B reads the same value of X before A writes.
• Transaction A updates X.
• Transaction B updates X from the value it read earlier, overwriting Transaction A's update,
so Transaction A's changes are lost.
Schedule (in time order):
Transaction A: Read X
Transaction B: Read X
Transaction A: Update X
Transaction B: Update X (overwrites A's update)
2. Dirty Read (Uncommitted Read) Problem
Definition: A dirty read, also called an uncommitted read, occurs when a transaction reads a
data item that has been modified by another transaction but not yet committed. If the
modifying transaction later aborts, the reading transaction will have used a value that was
never committed.
Example:
• Transaction B modifies X but has not yet committed.
• Transaction A reads the modified value of X and continues with its operations.
• Transaction B aborts, so its change to X is rolled back, and Transaction A's results are
based on incorrect data.
Schedule (in time order):
Transaction B: Update X (not committed)
Transaction A: Read X
Transaction B: Abort
3. Phantom Read Problem
Definition: A phantom read occurs when a transaction reads a set of data items, then another
transaction inserts or deletes data items in the same set, and the original transaction re-reads
the set and finds the changes.
Example:
• Transaction A reads all rows in a table.
• Transaction B inserts a new row into the table.
• Transaction A re-reads the table and finds the new row.
Schedule (in time order):
Transaction A: Read all rows in a table
Transaction B: Insert a new row into the table
Transaction A: Re-read the table (finds the new row)
To prevent these problems, database systems employ concurrency control mechanisms such
as locking, timestamping, and optimistic concurrency control. These mechanisms ensure that
transactions are isolated from each other and that the database remains consistent.
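
The lost update can be reproduced with a tiny schedule simulator. This is a sketch with a single data item X; the schedule format and the amounts are assumptions, and the transactions of Figure 1 are not modelled.

```python
def run(schedule, initial):
    """Execute a schedule of (transaction, action, amount) steps on one item X."""
    x = initial
    local = {}                      # each transaction's private copy of X
    for txn, action, amount in schedule:
        if action == "read":
            local[txn] = x
        elif action == "write":     # write back the private copy plus the change
            x = local[txn] + amount
    return x

# Serial schedule: A finishes before B starts.
serial = [("A", "read", 0), ("A", "write", 10), ("B", "read", 0), ("B", "write", 20)]
# Interleaved schedule: both read the old value before either writes.
interleaved = [("A", "read", 0), ("B", "read", 0), ("A", "write", 10), ("B", "write", 20)]

print(run(serial, 100))        # 130
print(run(interleaved, 100))   # 120, A's +10 is lost
```

The interleaved schedule ends at 120 instead of 130 because B writes back a value computed from the copy of X it read before A's update.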

(c) What is 2-Phase locking? Lock and unlock various data items of the transactions,
given in Figure 1, using 2-Phase locking such that no concurrency related problem
occurs, when the transactions A, B and C are executed concurrently.
Ans. 2-Phase Locking Protocol:

The 2-Phase Locking (2PL) protocol is a concurrency control mechanism that ensures
serializability of transactions by dividing the locking process into two distinct phases:

1. Growing Phase: A transaction can acquire any number of locks but cannot release any
locks.
2. Shrinking Phase: A transaction can release locks but cannot acquire any more.

Example Application of 2-Phase Locking:

Consider three transactions A, B, and C, and data items X, Y, and Z.

- Transaction A requires locks on X and Y.
- Transaction B requires locks on Y and Z.
- Transaction C requires locks on X and Z.

Execution with 2-Phase Locking:


1. Transaction A:
- Growing Phase: Lock X, then lock Y.
- Shrinking Phase: Release Y, then release X.

2. Transaction B:
- Growing Phase: Lock Y, then lock Z.
- Shrinking Phase: Release Z, then release Y.

3. Transaction C:
- Growing Phase: Lock X, then lock Z.
- Shrinking Phase: Release Z, then release X.

Locking Order:
1. Transaction A locks X, then Y.
2. Transaction B waits until A releases Y before locking Y, then locks Z.
3. Transaction C waits until A releases X before locking X, then locks Z.

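Because every transaction follows the two-phase rule, any resulting schedule is conflict
serializable. 2PL by itself does not prevent deadlock; here deadlock is avoided because every
transaction requests its locks in the same order (X before Y before Z), so no circular wait
can form.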
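
The two-phase rule itself can be sketched as a small Python class. It only checks that a transaction never locks after its first unlock; it does not model waiting or conflicts between transactions, and the class and method names are illustrative.

```python
class TwoPhaseTransaction:
    """Tracks one transaction's locks and enforces the two-phase rule."""

    def __init__(self, name):
        self.name = name
        self.held = set()
        self.shrinking = False      # becomes True at the first unlock

    def lock(self, item):
        if self.shrinking:
            raise RuntimeError(f"{self.name}: cannot lock {item} after unlocking")
        self.held.add(item)

    def unlock(self, item):
        self.shrinking = True
        self.held.discard(item)

# Transaction A from the example: lock X and Y, then release both.
a = TwoPhaseTransaction("A")
a.lock("X")
a.lock("Y")
a.unlock("Y")
a.unlock("X")

# A transaction that locks again after unlocking violates 2PL.
bad = TwoPhaseTransaction("Bad")
bad.lock("X")
bad.unlock("X")
try:
    bad.lock("Y")
    violation_detected = False
except RuntimeError:
    violation_detected = True
print(violation_detected)
```

Transaction A passes through a growing phase and then a shrinking phase, while the second transaction is rejected at the point where it tries to grow again.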

(d) Explain the Log based Recovery with the help of an example. What are Redo and
Undo operations. Why do you need checkpoint? Explain the process of recovery with
check points with the help of an example.
Ans. Log-Based Recovery:

Log-based recovery uses a log file to keep track of all changes made during transactions,
enabling the database to recover to a consistent state after a crash. Each transaction’s
operations are recorded in a log, which includes details about data modifications.

Redo and Undo Operations:
- Redo: Reapplies changes from the log to ensure that committed transactions are reflected in
the database after a crash.
- Undo: Reverts changes from the log to roll back incomplete or aborted transactions,
ensuring that uncommitted changes are not reflected in the database.

Checkpoint:

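A checkpoint is a point at which the DBMS forces the log records and modified buffer blocks
held in memory to stable storage and writes a checkpoint record to the log. Transactions that
committed before the checkpoint need no further work during recovery, so recovery only has to
examine transactions that were active at the checkpoint or started after it. This minimizes
recovery time.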

Recovery with Checkpoints Example:

1. Before Crash: A checkpoint is taken. After it, transactions A and B commit, but transaction
C is still in progress when the system crashes.

2. After Crash:
- Redo: Reapply changes from transactions A and B, as they committed after the checkpoint.
- Undo: Roll back changes from transaction C as it was not committed.

This process ensures that the database recovers to a consistent state efficiently.
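
The redo and undo passes can be sketched in Python. The log format, the data items and the values are assumptions, and the sketch is simplified: it assumes no transaction was active when the checkpoint was taken.

```python
# Log records, oldest first. Update records carry (txn, item, old value, new value).
# The scenario is hypothetical: A and B commit after the checkpoint, C does not.
log = [
    ("checkpoint",),
    ("start", "A"), ("update", "A", "X", 100, 150), ("commit", "A"),
    ("start", "B"), ("update", "B", "Y", 20, 25), ("commit", "B"),
    ("start", "C"), ("update", "C", "Z", 7, 9),
    # crash happens here
]
# Disk state at the crash: some writes reached disk, others did not.
disk = {"X": 100, "Y": 25, "Z": 9}

def recover(log, disk):
    """Redo committed transactions and undo uncommitted ones since the last checkpoint."""
    start = max(i for i, rec in enumerate(log) if rec[0] == "checkpoint")
    tail = log[start + 1:]
    committed = {rec[1] for rec in tail if rec[0] == "commit"}
    for rec in tail:                       # redo pass, forward
        if rec[0] == "update" and rec[1] in committed:
            disk[rec[2]] = rec[4]          # write the new value
    for rec in reversed(tail):             # undo pass, backward
        if rec[0] == "update" and rec[1] not in committed:
            disk[rec[2]] = rec[3]          # restore the old value
    return disk

print(recover(log, dict(disk)))
```

After recovery X holds A's committed value 150, Y keeps B's committed value 25, and Z is restored to 7 because C never committed.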

(e) Explain with the help of an example, how recovery is performed when Deferred
database modification scheme is used.
Ans. Deferred Database Modification Scheme:

In the Deferred Database Modification (DMM) scheme, changes made by a transaction are
not applied to the database until the transaction is committed. This means that updates are
kept in a temporary area and only written to the actual database after the transaction
completes successfully.

Example of Recovery Using Deferred Database Modification:

1. Transaction T starts and makes changes to data item X. The new value is recorded in the
log but is not applied to the database.

2. Transaction T commits. Its commit record is first written to the log on stable storage;
only then are the logged changes applied to the database.

3. In Case of a Crash:
- If Transaction T had not committed, none of its changes reached the database, so its log
records are ignored and the database remains in its state before T started. No undo is needed.
- If Transaction T had committed, the recovery process redoes its writes from the log
to ensure all committed updates are present in the database.

Thus, deferred database modification ensures that only committed transactions affect the
database, so recovery requires redo operations but never undo operations.
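
A minimal sketch of this redo-only recovery is given below. The log record layout, the function name `recover_deferred` and the sample balances are assumptions made for illustration only.

```python
def recover_deferred(log, db):
    """Redo-only recovery for the deferred modification scheme.

    Hypothetical record layouts:
      ("start", txn)
      ("write", txn, item, new_value)   # no old value: undo is never needed
      ("commit", txn)
    """
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    for rec in log:
        # Writes of transactions without a commit record are simply ignored.
        if rec[0] == "write" and rec[1] in committed:
            _, _, item, new_value = rec
            db[item] = new_value
    return db


# T1 committed before the crash; T2 was still running.
deferred_log = [
    ("start", "T1"),
    ("write", "T1", "X", 950),
    ("write", "T1", "Y", 2050),
    ("commit", "T1"),
    ("start", "T2"),
    ("write", "T2", "Z", 600),
]
state = recover_deferred(deferred_log, {"X": 1000, "Y": 2000, "Z": 700})
```

After recovery, X and Y hold the values written by T1, while Z keeps its old value because T2 never committed.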

(f) What is the cost of selection operation, when the index scan method is used? Explain
with the help of an example. Explain the cost of Join operation when Merge-Join
method is used.
Ans. Cost of Selection Operation with Index Scan:

When using an index scan, the cost of a selection operation depends on the index type and the
number of qualifying records. For example, with a B-tree index, the cost includes:

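1. Index Lookup Cost: Traversing the index from the root to the leaf that holds the search
key. This is proportional to the height of the tree and is therefore logarithmic in the number
of entries.
2. Access Cost: Reading the actual data records. This is constant for a single match and
grows linearly with the number of matching records, since with a secondary index each match
may require a separate block read.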

Example: Suppose a B-tree index is used to find records where `age = 30` in a table with
10,000 entries. The index lookup cost is `O(log N)`, where `N` is the number of entries
(10,000), and the access cost depends on the number of records found (e.g., 50 records).
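
The estimate can be worked through numerically. The sketch below counts block reads under assumed parameters that are not part of the question: an index fan-out of 100 entries per node and 10 records per data block. The function names are hypothetical.

```python
def btree_height(num_entries, fanout):
    """Number of index levels needed to address num_entries entries."""
    height = 1
    capacity = fanout
    while capacity < num_entries:
        capacity *= fanout
        height += 1
    return height


def index_scan_cost(num_entries, matching, fanout=100,
                    clustered=False, records_per_block=10):
    """Estimated block reads: index traversal plus record fetches."""
    lookup = btree_height(num_entries, fanout)
    if clustered:
        # Matching records are stored together, so they are read block by block.
        fetch = (matching + records_per_block - 1) // records_per_block
    else:
        # Secondary index: assume each matching record sits in a different block.
        fetch = matching
    return lookup + fetch


secondary_cost = index_scan_cost(10_000, 50)
clustered_cost = index_scan_cost(10_000, 50, clustered=True)
```

Under these assumptions the index has 2 levels, so the `age = 30` query costs about 2 + 50 = 52 block reads through a secondary index, and about 2 + 5 = 7 if the file is ordered on `age`.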

Cost of Join Operation with Merge-Join:

Merge-Join requires both relations to be sorted on the join attribute. The cost involves:
1. Sorting Cost: Sorting both relations, which is `O(N log N)` for one relation and
`O(M log M)` for the other. This step is skipped for a relation that is already sorted on
the join attribute.
2. Merge Cost: Linear scan of both sorted relations to find matching tuples, which is `O(N +
M)`, where `N` and `M` are the sizes of the two relations.

Example: For two relations, A with 5,000 tuples and B with 8,000 tuples, sorting both
relations would be `O(5,000 log 5,000) + O(8,000 log 8,000)`, and merging would be
`O(5,000 + 8,000)`.
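
The two phases can be seen in a small in-memory sketch. The function name `merge_join` and the sample tuples are hypothetical; a real DBMS works on disk blocks rather than Python lists.

```python
def merge_join(left, right, left_key, right_key):
    """Sort both inputs on the join attribute, then merge them in one pass."""
    left = sorted(left, key=left_key)      # sorting phase
    right = sorted(right, key=right_key)
    result = []
    i = j = 0
    while i < len(left) and j < len(right):   # merge phase
        kl, kr = left_key(left[i]), right_key(right[j])
        if kl < kr:
            i += 1
        elif kl > kr:
            j += 1
        else:
            # Find the run of right tuples that share this join value.
            j_end = j
            while j_end < len(right) and right_key(right[j_end]) == kl:
                j_end += 1
            # Pair every left tuple with that value against the whole run.
            while i < len(left) and left_key(left[i]) == kl:
                for r in right[j:j_end]:
                    result.append((left[i], r))
                i += 1
            j = j_end
    return result


rel_a = [(3, "c"), (1, "a"), (2, "b"), (2, "bb")]
rel_b = [(2, "x"), (4, "w"), (2, "y"), (1, "z")]
joined = merge_join(rel_a, rel_b, lambda t: t[0], lambda t: t[0])
```

Each input is scanned once during the merge, except that a run of duplicate join values on one side is revisited for every matching tuple on the other side.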

(g) Make the query tree for the following query (assume the database of problem 2(d)).
SELECT c.custName, c.custId, a.AccountNumber, t.DebitORCredit
FROM Customer c, Account a, Transaction t
WHERE c.custId = a.custId AND a.AccountNumber = t.AccountNumber AND Amount
> 10000;
Ans. To create a query tree for the SQL query, follow these steps:

1. Base Relations: Start with the base relations `Customer (c)`, `Account (a)`, and
`Transaction (t)`.

2. Selection Operation: Apply the selection `Amount > 10000` on `Transaction (t)`. This
filters the records in `Transaction` where the amount is greater than 10,000.

3. Join Operations:
- Perform a join between `Account (a)` and the filtered `Transaction (t)` on the condition
`a.AccountNumber = t.AccountNumber`.
- Next, join the result with `Customer (c)` on `c.custId = a.custId`.

4. Projection Operation: Finally, project the required attributes `c.custName`, `c.custId`,
`a.AccountNumber`, and `t.DebitORCredit` from the result of the join operations.

Query Tree Structure:

```
π (c.custName, c.custId, a.AccountNumber, t.DebitORCredit)
                  |
        ⨝ (c.custId = a.custId)
         /                  \
Customer (c)      ⨝ (a.AccountNumber = t.AccountNumber)
                    /                   \
             Account (a)         σ (Amount > 10000)
                                        |
                                 Transaction (t)
```

In this tree:
- π denotes the projection operation.
- ⨝ denotes the join operation.
- σ denotes the selection operation.
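
The tree is evaluated bottom-up: selection first, then the two joins, then the projection. The sketch below runs these steps on a few made-up rows. The sample data and the helper names `select`, `join` and `project` are hypothetical, and the `Amount` column is assumed to belong to `Transaction`.

```python
def select(rows, predicate):
    """σ: keep only the rows that satisfy the predicate."""
    return [row for row in rows if predicate(row)]


def join(left, right, attribute):
    """⨝: nested-loop equi-join on an attribute common to both inputs."""
    return [{**l, **r} for l in left for r in right
            if l[attribute] == r[attribute]]


def project(rows, columns):
    """π: keep only the listed columns."""
    return [{c: row[c] for c in columns} for row in rows]


customer = [{"custId": 1, "custName": "Asha"},
            {"custId": 2, "custName": "Ravi"}]
account = [{"AccountNumber": "A1", "custId": 1},
           {"AccountNumber": "A2", "custId": 2}]
transaction = [
    {"AccountNumber": "A1", "DebitORCredit": "Debit", "Amount": 15000},
    {"AccountNumber": "A2", "DebitORCredit": "Credit", "Amount": 500},
]

# Evaluate the tree from the leaves upward.
big_txns = select(transaction, lambda t: t["Amount"] > 10000)
acct_txns = join(account, big_txns, "AccountNumber")
cust_txns = join(customer, acct_txns, "custId")
answer = project(cust_txns,
                 ["custName", "custId", "AccountNumber", "DebitORCredit"])
```

Applying the selection before the joins keeps the intermediate results small, which is why the selection is pushed down to the `Transaction` leaf.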

Question 4: (Covers Block 4) (5 Marks each=20 Marks)


(a) What are object-relational databases? Explain the concept of complex data types
used in these databases. Explain the object model in the context of Object Oriented
database systems. How are object-oriented database systems different from
Object-relational database systems?
Ans. Object-Relational Databases (ORDBs):

Object-relational databases extend the relational database model by incorporating features
from object-oriented databases. They support complex data types, enabling the storage and
manipulation of more sophisticated data structures compared to traditional relational
databases.

Complex Data Types in ORDBs:


1. Structured Types: Allow the definition of composite data types with multiple attributes. For
example, a `PERSON` type might include attributes like `name`, `address`, and
`dateOfBirth`.

2. Collections: Support arrays, sets, and lists of data types. For example, a `STUDENT` type
might have a list of `COURSES`.

3. User-Defined Types (UDTs): Allow users to define new data types that encapsulate both
data and methods. For example, a `CIRCLE` type might have attributes like `radius` and
methods like `calculateArea`.

4. Inheritance: Supports hierarchical relationships between types. For instance, a `VEHICLE`
type could be a base type with `CAR` and `TRUCK` as subtypes.
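
These four ideas can be mirrored with Python dataclasses, purely as an analogy. An ORDBMS would declare such types in its own SQL dialect, not in Python, and the field names below are assumptions added for illustration.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass
class Person:
    """Structured type: several attributes grouped under one name."""
    name: str
    address: str
    date_of_birth: str


@dataclass
class Student(Person):
    """Collection attribute: a student holds a list of courses."""
    courses: List[str] = field(default_factory=list)


@dataclass
class Circle:
    """User-defined type: data together with a method."""
    radius: float

    def calculate_area(self) -> float:
        return math.pi * self.radius ** 2


@dataclass
class Vehicle:
    """Base type of a small inheritance hierarchy."""
    registration_no: str


@dataclass
class Car(Vehicle):
    seats: int = 4


@dataclass
class Truck(Vehicle):
    load_capacity_tonnes: float = 0.0


student = Student("Asha", "New Delhi", "2001-05-17", ["MCS-207", "MCS-208"])
car = Car("DL-01-0001")
```

`Car` and `Truck` inherit `registration_no` from `Vehicle`, just as subtypes inherit the attributes of their supertype.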

Object Model in Object-Oriented Databases (OODBMS):

1. Objects: Represent data as entities with both state (attributes) and behavior (methods).
Each object is an instance of a class.

2. Classes and Inheritance: Define types and hierarchies, where classes can inherit attributes
and methods from parent classes.

3. Encapsulation: Combines data and methods within an object, hiding implementation
details and exposing only necessary interfaces.

4. Polymorphism: Allows methods to operate on objects of different classes through a
common interface, enhancing flexibility.

Differences Between OODBMS and ORDBMS:

1. Data Model:
- OODBMS stores data as objects with encapsulated attributes and methods.
- ORDBMS stores data in a relational format with extensions for complex types.

2. Complex Data Handling:
- OODBMS natively supports complex data structures and relationships.
- ORDBMS extends relational databases with support for complex data types but relies on
traditional relational operations.

3. Query Language:
- OODBMS typically uses object-oriented query languages or extensions to SQL.
- ORDBMS primarily uses SQL with extensions for handling complex types and user-
defined types.

4. Inheritance:
- OODBMS supports class hierarchies and inheritance directly.
- ORDBMS uses table inheritance or type extension features to mimic hierarchical
structures.

In summary, ORDBMS enhances relational databases with object-oriented features for
complex data types, while OODBMS adopts a full object-oriented approach to data modeling.

(b) Explain the multi-dimensional data model of a data warehouse. Also, define the
concept of decision tree with the help of an example. List any four applications of data
mining.
Ans. Multi-Dimensional Data Model of a Data Warehouse:

The multi-dimensional data model is used in data warehousing to represent data in a way that
is intuitive for analysis and reporting. It organizes data into a structure that allows users to
analyze data from multiple perspectives.

1. Dimensions: Represent different perspectives or categories for analysis, such as time,
geography, or product. Dimensions are typically hierarchical, allowing drill-down or roll-up
operations. For example, the time dimension might include year, quarter, month, and day.

2. Measures: Represent the quantitative data to be analyzed, such as sales revenue, profit, or
quantity sold. Measures are aggregated along different dimensions.

3. Cubes: The core structure in a multi-dimensional model is the data cube, which stores data
in a multi-dimensional array format. Each cell in the cube contains aggregated data (e.g., total
sales) for a specific combination of dimensions.

Example: A sales data warehouse might have dimensions like `Product`, `Region`, and
`Time`, with measures like `Sales Amount` and `Units Sold`. Users can analyze total sales by
region and product across different time periods.
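
A roll-up over such a cube amounts to grouping the facts by the dimensions that are kept and summing the measure. The following sketch uses invented sales figures; the names `facts`, `DIMENSIONS` and `roll_up` are hypothetical.

```python
from collections import defaultdict

# Each fact: (product, region, time, sales_amount). All figures are made up.
facts = [
    ("Laptop", "North", "Q1", 1000),
    ("Laptop", "South", "Q1", 1500),
    ("Phone", "North", "Q1", 700),
    ("Laptop", "North", "Q2", 1200),
    ("Phone", "South", "Q2", 900),
]
DIMENSIONS = ("product", "region", "time")


def roll_up(rows, keep):
    """Aggregate the sales measure over every dimension not listed in keep."""
    positions = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(int)
    for row in rows:
        cell = tuple(row[p] for p in positions)
        totals[cell] += row[-1]
    return dict(totals)


sales_by_region = roll_up(facts, ("region",))
sales_by_product_time = roll_up(facts, ("product", "time"))
grand_total = roll_up(facts, ())
```

Keeping fewer dimensions corresponds to rolling up, and keeping more corresponds to drilling down into finer cells of the cube.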

Decision Tree:

A decision tree is a supervised learning algorithm used for classification and regression tasks.
It splits data into branches based on feature values to make decisions or predictions.

1. Internal Nodes: Represent decision points that test the value of a feature or attribute.

2. Branches: Represent the outcome of a test and lead to further nodes or leaf nodes.

3. Leaf Nodes: Represent the final outcome or decision.

Example: In a decision tree for classifying whether a person should receive a loan:
- Root Node: "Income Level" (e.g., High, Medium, Low).
- Branch 1: High income leads to "Approved" if other criteria are met.
- Branch 2: Medium income leads to further checks, such as "Credit Score".
- Branch 3: Low income typically leads to "Denied".
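
The loan example can be written as nested tests, one per internal node. This is a hand-written sketch: the credit score threshold of 700 is a hypothetical value, and in practice the tests and thresholds are learned from training data.

```python
def loan_decision(income_level, credit_score):
    """Walk the example tree from the root to a leaf."""
    # Root node: test the income level.
    if income_level == "High":
        return "Approved"
    if income_level == "Medium":
        # Internal node: medium income needs a further check.
        # The cut-off of 700 is an assumed value for illustration.
        if credit_score >= 700:
            return "Approved"
        return "Denied"
    # Remaining branch: low income.
    return "Denied"


decisions = [
    loan_decision("High", 650),
    loan_decision("Medium", 720),
    loan_decision("Medium", 600),
    loan_decision("Low", 800),
]
```

Each call follows exactly one path from the root to a leaf, and the leaf reached is the predicted class.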

Applications of Data Mining:

1. Customer Segmentation: Identifying distinct customer groups for targeted marketing and
personalized services.
2. Fraud Detection: Detecting fraudulent transactions or behavior patterns in financial
services.
3. Market Basket Analysis: Analyzing purchase patterns to identify associations between
products, used in recommendation systems.
4. Predictive Maintenance: Predicting equipment failures or maintenance needs in
manufacturing to minimize downtime and costs.

(c) Explain the need for NoSQL databases. Explain the characteristics of any two types of
NoSQL databases.
Ans. Need for NoSQL Databases:

NoSQL databases address limitations of traditional relational databases, particularly for
modern applications requiring high scalability, flexibility, and performance. They handle
large volumes of diverse, unstructured, or semi-structured data, and offer:

1. Scalability: Scale horizontally by adding more servers, whereas relational databases
have traditionally scaled vertically by moving to a more powerful server.
2. Flexibility: Support dynamic schemas, making it easier to adapt to changes in data
structure.
3. High Performance: Optimize for specific use cases like large-scale data retrieval or high-
speed writes.

Characteristics of Two Types of NoSQL Databases:

1. Document Stores (e.g., MongoDB):

- Schema Flexibility: Store data as JSON-like documents, allowing different structures
within the same collection.
- Indexing and Querying: Support indexing on various fields and complex querying
capabilities.

2. Column-Family Stores (e.g., Apache Cassandra):

- Column-Family Storage: Group related columns into column families. Each row is identified
by a key and can hold a different set of columns, which suits sparse data and access by row key.
- High Availability: Designed for distributed systems with features like replication and
partitioning to ensure data availability and fault tolerance.
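
Schema flexibility and field indexing in a document store can be illustrated with plain Python dictionaries standing in for documents. This is not the API of MongoDB or any other product; the collection contents and the function `build_index` are hypothetical.

```python
from collections import defaultdict

# Documents in one collection need not share the same fields.
collection = [
    {"_id": 1, "name": "Asha", "city": "Delhi"},
    {"_id": 2, "name": "Ravi", "city": "Pune",
     "phones": ["555-0101", "555-0102"]},
    {"_id": 3, "name": "Meera", "email": "meera@example.com"},
]


def build_index(docs, field_name):
    """Map each value of field_name to the ids of documents that contain it."""
    index = defaultdict(list)
    for doc in docs:
        if field_name in doc:   # documents lacking the field are skipped
            index[doc[field_name]].append(doc["_id"])
    return dict(index)


city_index = build_index(collection, "city")
pune_ids = city_index.get("Pune", [])
```

The third document has no `city` field, so it does not appear in the index, yet it is stored in the same collection without any schema change.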

(d) Write short notes on the following database technologies:
(i) Distributed databases
(ii) Blockchain databases
Ans. (i) Distributed Databases:

Distributed databases are systems in which data is stored across multiple physical locations
that are interconnected through a network. They provide several benefits:

- Scalability: Easily scale by adding more nodes, handling increased data volume and traffic.
- Fault Tolerance: Improve reliability by replicating data across multiple sites, ensuring
availability even if some nodes fail.
- Geographic Distribution: Allow data to be closer to users or systems, reducing latency and
improving access speed.

Distributed databases can be homogeneous (same DBMS across nodes) or heterogeneous
(different DBMSs), and they require complex synchronization and consistency mechanisms
to ensure data accuracy and coherence across nodes.

(ii) Blockchain Databases:

Blockchain databases are decentralized, distributed ledgers that use cryptographic techniques
to secure and verify transactions. Key features include:

- Immutability: Once recorded, transactions cannot be altered, ensuring data integrity and
trust.
- Decentralization: Operate across a network of nodes without a central authority, reducing
the risk of a single point of failure.
- Consensus Mechanisms: Use algorithms like Proof of Work or Proof of Stake to agree on
transaction validity, preventing fraud and double-spending.

Blockchain databases are widely used in cryptocurrencies, supply chain management, and
other applications requiring secure, transparent, and tamper-proof record-keeping.
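
The immutability property comes from chaining blocks by their hashes. The sketch below shows only this hash chain on a single machine; it does not model decentralization or consensus, and the block layout and function names are assumptions made for illustration.

```python
import hashlib
import json

GENESIS_PREV_HASH = "0" * 64


def block_hash(index, data, prev_hash):
    """SHA-256 over the block contents, including the previous block's hash."""
    payload = json.dumps(
        {"index": index, "data": data, "prev_hash": prev_hash},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, data):
    """Add a block that points to the hash of the current last block."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV_HASH
    index = len(chain)
    chain.append({
        "index": index,
        "data": data,
        "prev_hash": prev_hash,
        "hash": block_hash(index, data, prev_hash),
    })


def is_valid(chain):
    """Recompute every hash and check every link to the previous block."""
    prev_hash = GENESIS_PREV_HASH
    for block in chain:
        if block["prev_hash"] != prev_hash:
            return False
        expected = block_hash(block["index"], block["data"], block["prev_hash"])
        if block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True


ledger = []
append_block(ledger, "A pays B 10")
append_block(ledger, "B pays C 4")
valid_before = is_valid(ledger)
ledger[0]["data"] = "A pays B 1000"   # tamper with an old block
valid_after = is_valid(ledger)
```

Changing the data of an earlier block changes its hash, so the stored hash no longer matches and the tampering is detected.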
