
PDF Color

Uploaded by

dolar45755
Copyright
© © All Rights Reserved
DBMS NOTES

Database and Database Management System


A database is a collection of related information stored in such a way that it is available to many users for different purposes. It is a computerized record-keeping system that allows easy and efficient storage, retrieval, and modification of data.

"A database is a logically coherent, organized collection of similar data. Similar data refers to the collection of data, which is stored based on the same context."

A Database Management System (DBMS) is a collection of interrelated data and a set of programs to access those data. Its primary goal is to provide a way to store and retrieve database information that is both convenient and efficient.

Components of DBMS
| Component | Description |
| --- | --- |
| Users | Application programmers, end users, and Database Administrators (DBA) |
| Software | DBMS, operating system, metadata repository (if necessary), and application programs |
| Hardware | Computers, storage devices (hard disks), and input/output devices (monitor, printer) |
| Data | Numerical data, non-numerical data, and complex data entities (images, audio) |

Need for DBMS


Historically, information systems employed stand-alone systems for separate applications, each with its own set of files. This led to:

Duplication of data

Inconsistency

Waste of space

Difficulty in maintaining data integrity

A DBMS solves these problems by providing a centralized repository of data that can be accessed and modified efficiently.

Applications of DBMS
Banking: maintaining customer information, accounts, loans, and banking transactions

Universities: maintaining student records, course registration, and grades

Railway Reservation: checking availability of reservations on different trains, tickets, etc.

Airlines: reservation and schedule information

Telecommunication: keeping records of calls made, generating monthly bills, etc.

Finance: storing information about holdings, sales, and purchases of financial instruments

Sales: customer, product, and purchase information

Advantages of DBMS
Reduction in Data Redundancy: reduces duplication of data, and with it the waste of space

Reduction in Inconsistency: ensures consistency of data by controlling redundancy

Sharing of Data: allows multiple applications to use the same data

Enforcement of Standards: enforces standards for data naming, formatting, and security

Improvement in Data Security: provides a centralized system for security checks and enforcement

Maintenance of Data Integrity: ensures accuracy and consistency of data

Better Interaction with Users: provides better service to users through efficient data retrieval and modification

Availability of Up-to-Date Information: because the data is shared, up-to-date information is more readily available, and the DBMS makes it easy to respond to unforeseen information requests.

Efficient System: The content of stored data changes frequently. Such changes are easier to make in a database management system than in a conventional system, because they need not have any impact on application programs. The cost of developing and maintaining systems is also lower.

Disadvantages of DBMS

Problems associated with Centralization

"Centralization increases the security problems and disruption due to downtime and failures."

Cost of Software and Hardware

The cost of the DBMS software, and of the hardware needed to run it, is a major drawback.

Complexity of Backup and Recovery

"DBMS provides the centralization of the data, which requires adequate backups of the data so that in case of failure, the data can be recovered."

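Problems of File Systems That a DBMS Addresses
The following problems belong to file systems rather than to a DBMS, as the comparison in the next section shows:

Atomicity and integrity problems are found.

Security of data is not good.

There is no concurrent access and recovery.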

Difference between Database and File System


| Aspect | Database | File System |
| --- | --- | --- |
| Data redundancy | Does not exist | Exists |
| Data inconsistency | Does not exist | Exists |
| Accessing data | Easier | Comparatively difficult |
| Data isolation | Does not exist | Exists in different formats |
| Atomicity and integrity problems | Not found | Found |
| Security of data | Good | Not good |
| Concurrent access and recovery | Exists | Does not exist |

Advantages and Disadvantages of File-Based Systems

Advantages of File-Based Systems

Helps in overall understanding of design complexity of database systems.

Provides a useful historical perspective on how we handle data.

Results in smooth transition from traditional file-based systems to newer database systems.

Disadvantages of File-Oriented Systems

Data redundancy: Since these systems used a decentralized approach, each department used its own application programs and files, so the same data was often stored more than once.

Poor data control: There was no centralized control over the fields.

Poor data manipulation capabilities: These systems do not provide strong connections between data in different files.

Data dependence: Files and records were described by specific physical formats that were coded into the application programs.

Security Problems

"Not every user of the database system should be allowed to access all data. Since application programs are added to the file-oriented systems in an ad-hoc manner, it was difficult to enforce such security constraints."

Who is a DBA?
A Database Administrator (DBA) is an individual person or a group of persons with an overview of one or more databases, who controls the design and the use of these databases.

Role of DBA
Defining conceptual schema: A DBA creates the conceptual schema corresponding to the abstract-level database design made by the data administrator.

Physical database design: A DBA decides how the data is to be represented in the stored database.

Security and integrity checks: A DBA is responsible for providing the authorization and authentication checks so that no malicious users can access the database.

Defining backup and recovery strategies: A DBA must define and implement an appropriate periodic backup and recovery strategy to recover the database from all types of failures.

Granting access to users: A DBA regulates the usage of specific parts of the database by various users.

Characteristics of Database Approach


Self-describing nature of a database system: A database system is self-describing, meaning that it contains a description of itself, including the structure of the data and the relationships between them.

Insulation between programs and data, and data abstraction: A database system insulates programs from data, which means that changes to how the data is stored do not force changes to the programs that use it.

Support of multiple views of data: A database system supports multiple views of data, which means that different users can see different views of the same data.

Sharing of data and multi-user transaction processing: A database system allows sharing of data and multi-user transaction processing, which means that multiple users can access and update the data simultaneously.

Database Models
A database model is a collection of conceptual
tools for describing data, data relationships, data semantics, and
consistency constraints.

Types of Database Models


| Model | Description |
| --- | --- |
| Hierarchical | Organises data in a tree structure, with each child node having only one parent node |
| Network | Allows for many-to-many relationships between data entities |
| Relational | Organises data into tables with rows and columns, with each row representing a single record |
| Object-Oriented | Uses objects to represent data and relationships between them |

Hierarchical Database Model


Organises data in a tree structure, with each child node having only one parent node

Uses parent-child relationships to link records

Data access is quite predictable in structure, making it efficient for retrieval and updates

Network Database Model

Allows for many-to-many relationships between data entities

Uses a network structure to represent data relationships

Efficient in space utilisation and query execution times, but inflexible and difficult to use

Relational Database Management System (RDBMS)

The Relational Database Management System (RDBMS) is based on the relational model developed by E.F. Codd.
Definition:

"A relational database represents all data in the database as simple two-dimensional tables called relations."

Relations (Tables)
A relation (table) consists of rows (records) and columns (fields).

Table Structure:

| Column | Description |
| --- | --- |
| Author ID | Unique identifier for the author |
| Author Name | Name of the author |
| Author DOB | Date of birth of the author |
| Pub ID | Unique identifier for the publisher |
| Publisher | Name of the publisher |
| Pub Address | Address of the publisher |

Operations
Three basic operations are used to develop useful sets of data:

Selection: Retrieves certain records from a table based on user-specified criteria.

Projection: Extracts fields from a table, allowing the user to create new tables that contain only the required information.

Join: Combines data from multiple tables to create a new table.

These operations are all part of Relational Algebra.
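
The three operations can be sketched in a few lines of Python. This is only an illustration, not how an RDBMS implements them: relations are modelled as lists of dictionaries, and the sample rows and the helper names `select`, `project`, and `join` are assumptions made for this example.

```python
# Each relation is a list of rows; each row is a dict mapping column -> value.
# The sample data below is hypothetical.
authors = [
    {"author_id": 1, "author_name": "Asha", "pub_id": 10},
    {"author_id": 2, "author_name": "Ben", "pub_id": 20},
    {"author_id": 3, "author_name": "Chen", "pub_id": 10},
]
publishers = [
    {"pub_id": 10, "publisher": "North Press"},
    {"pub_id": 20, "publisher": "South Books"},
]


def select(relation, predicate):
    """Selection: keep only the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def project(relation, columns):
    """Projection: keep only the named columns, dropping duplicate rows."""
    seen, result = set(), []
    for row in relation:
        key = tuple(row[c] for c in columns)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(columns, key)))
    return result


def join(left, right, column):
    """Join: pair up rows from both relations that agree on `column`."""
    return [{**l, **r} for l in left for r in right if l[column] == r[column]]


# Names of all authors published by "North Press".
north_authors = select(join(authors, publishers, "pub_id"),
                       lambda row: row["publisher"] == "North Press")
names = project(north_authors, ["author_name"])
print(names)
```

The operations compose: the output of one is itself a relation, so it can be fed straight into the next.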

Object-Oriented Database Model (OODBM)


Definition:

"An object-oriented database stores and maintains objects, which can contain both data and procedures that manipulate the data."

Object-Oriented Database Structure

The Population class is the root of a class hierarchy that includes the Nation class. The Population class is also the root of two sub-classes: Men and Women.
Class Hierarchy:

| Class | Description |
| --- | --- |
| Population | Root class |
| Nation | Sub-class of Population |
| Men | Sub-class of Population |
| Women | Sub-class of Population |

Object Query Language (OQL)

OQL is a query language used to manipulate and retrieve data in an object-oriented database.

Schema and Instances


Schema:

"The overall design of the database is called the database schema."

Instance:

"The collection of information stored in the database at a particular moment is called an instance of the database."

Types of Schema
There are three types of schema:

Physical Schema: Describes the database design at the physical level, specifying additional storage details.

Conceptual Schema: Describes the stored data in terms of the data model of the DBMS.

External Schema: Allows data access to be customized at the level of individual users or groups of users.

Data Independence
Definition:

"The ability to modify a schema definition in one level without affecting a schema definition in the next higher level is called Data Independence."

Data independence is of two types:

Physical Data Independence: Enables changes to be made at the physical level without affecting the conceptual schema.

Logical Data Independence: Enables changes to be made at the conceptual schema level without affecting the external schema.

ER Model Concepts

Relationships
A relationship is a meaningful association between one or more entity types.

ISA relationship: a relationship between two entities where one is a subtype of the other

HASA relationship: a relationship between two entities where one has or owns the other

Relationship Type
A set of meaningful associations between one or more participating entity
types.

Each relationship type is given a name that describes its function.

Relationships with the same attributes fall into one relationship set.

Degree of a Relationship
Defined as the number of entities associated with the relationship.

| Degree | Description |
| --- | --- |
| Unary or Recursive | A relationship between instances of a single entity type |
| Binary | A relationship between two entities |
| Ternary | A relationship between three entities |
| n-ary | A general form for degree n |

Connectivity or Cardinality
Describes the mapping of associated entity instances in a relationship.

One-to-One (1:1): one instance of an entity is associated with at most one instance of another entity

One-to-Many (1:N): one instance of an entity is associated with zero, one, or many instances of another entity

Many-to-Many (M:N): one instance of an entity is associated with zero, one, or many instances of another entity, and vice versa

Attributes
Define the properties of a data object in an entity.

Simple (or Atomic) Attribute: an attribute that is composed of a single component with an independent existence

Composite Attribute: an attribute composed of multiple components, each with an independent existence

Single-Valued Attribute: an attribute that holds a single value for a single entity

Multi-Valued Attribute: an attribute that holds multiple values for a single entity

Derived Attribute: an attribute that represents a value that is derivable from the value of related attributes or a set of attributes

Entity Set
Weak Entity Set: an entity set that does not possess sufficient attributes to
form a primary key

Strong Entity Set: an entity set that has a primary key

Mapping Constraints
One-to-One: an entity in A is associated with at most one entity in B, and an entity in B is associated with at most one entity in A

One-to-Many: an entity in A is associated with any number (zero or more) of entities in B, and an entity in B is associated with at most one entity in A

Many-to-One: an entity in A is associated with at most one entity in B, and an entity in B is associated with any number (zero or more) of entities in A

Many-to-Many: an entity in A is associated with any number (zero or more) of entities in B, and an entity in B is associated with any number (zero or more) of entities in A
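
As a rough illustration, the four constraints can be checked against a concrete set of (A, B) pairs. The function name `classify_mapping` and the sample pairs are assumptions made for this sketch. Note that it reports the tightest constraint that the given instance satisfies, whereas in a real design the constraint is a rule stated in the schema, not something inferred from the current data.

```python
from collections import defaultdict


def classify_mapping(pairs):
    """Return the tightest mapping constraint satisfied by (a, b) pairs."""
    b_per_a = defaultdict(set)  # entities in B linked to each entity in A
    a_per_b = defaultdict(set)  # entities in A linked to each entity in B
    for a, b in pairs:
        b_per_a[a].add(b)
        a_per_b[b].add(a)

    a_has_many = any(len(bs) > 1 for bs in b_per_a.values())
    b_has_many = any(len(as_) > 1 for as_ in a_per_b.values())

    if a_has_many and b_has_many:
        return "many-to-many"
    if a_has_many:
        return "one-to-many"   # some a has several b; every b has one a
    if b_has_many:
        return "many-to-one"   # some b has several a; every a has one b
    return "one-to-one"


print(classify_mapping([("a1", "b1"), ("a1", "b2")]))
```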

Generalization, Specialization, and Aggregation

Generalization

The process of identifying some common characteristics of a collection of entity sets and creating a new entity set that contains entities possessing these common features.

Specialization

The process of identifying subsets of an entity (super-class or super-type) that share some distinguishing characteristic.

Aggregation

The process of forming a new entity set by taking a set of related entities and grouping them together.

Generalization and Specialization

Specialization is a top-down process of defining super-classes and their related sub-classes: we first define a super-class, then its sub-classes, and then their attributes and relationships. Generalization is the reverse, bottom-up process, in which the common characteristics of existing entity sets are collected into a super-class.

Advantages of Generalization:

Enables the entity type to share common attributes among different classes

Allows for the creation of a class that can be refined progressively into finer
sub-classes

Enables the inheritance of common attributes among sub-classes

Inheritance

Inheritance is the process by which one class acquires the properties of one or more other classes.

Key Points:

Involves sharing of attributes and operations or methods among classes based on a hierarchical structure

A class can be created at a broad level and then refined progressively into finer sub-classes

Each sub-class incorporates or inherits all the properties of its super-class, to which it can add its own properties

Superclass and Subclass Entity Set

Superclass:

An entity type that includes distinct subclasses that share common attributes

Can be created at a broad level and then refined progressively into finer sub-classes

Subclass:

An entity type that is a subset of a superclass

Inherits all the properties of its superclass, to which it can add its own properties

Examples:

| Superclass | Subclass | Unique Attributes |
| --- | --- | --- |
| Job | Employee | Employee ID, Employee Name |
| Job | Manager | Manager ID, Manager Name |
| Vehicle | Car | Car Model, Car Year |
| Vehicle | Truck | Truck Model, Truck Year |
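
The Vehicle rows of the examples can be sketched with ordinary class inheritance. The shared attribute `registration_no` and the `describe` method are hypothetical, added only to show something being inherited; the notes do not say which attributes the Vehicle superclass holds.

```python
class Vehicle:
    """Superclass: holds what every vehicle shares (hypothetical attribute)."""

    def __init__(self, registration_no):
        self.registration_no = registration_no

    def describe(self):
        return f"{type(self).__name__} {self.registration_no}"


class Car(Vehicle):
    """Subclass: inherits registration_no and adds its own attributes."""

    def __init__(self, registration_no, car_model, car_year):
        super().__init__(registration_no)
        self.car_model = car_model
        self.car_year = car_year


class Truck(Vehicle):
    """Subclass: inherits registration_no and adds its own attributes."""

    def __init__(self, registration_no, truck_model, truck_year):
        super().__init__(registration_no)
        self.truck_model = truck_model
        self.truck_year = truck_year


car = Car("KA-01", "Hatchback", 2020)
truck = Truck("KA-02", "Hauler", 2018)
print(car.describe(), "|", truck.describe())
```

Both subclasses reuse `describe` without redefining it, while `car_model` exists only on `Car` and `truck_model` only on `Truck`.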

E-R Diagrams

Banking System:
class Bank {
  accounts: [
    {
      accountNumber: string,
      accountHolder: string,
      balance: number
    }
  ]
}

class Account {
  transactions: [
    {
      transactionId: string,
      transactionDate: date,
      amount: number
    }
  ]
}

Hospital System:
class Hospital {
  patients: [
    {
      patientId: string,
      patientName: string,
      address: string
    }
  ]
}

class Patient {
  appointments: [
    {
      appointmentId: string,
      appointmentDate: date,
      doctor: string
    }
  ]
}

Advantages and Disadvantages of Different Models

Hierarchical Model
Advantages:

Simplicity

Data sharing

Data security

Data integrity

Efficiency

Disadvantages:

Implementation complexity

Inflexibility

Operational anomalies

Difficulty in maintaining database and applications

Network Model
Advantages:

Simplicity

Database standards

Disadvantages:

System complexity

Operational anomalies

Not user-friendly

Absence of structural independence

Relational Model
Advantages:

Simplicity

No anomalies

Structural independence

Easier design, implementation, maintenance, and usage

Better query capability

Disadvantages:

Hardware overheads

Ease of design can lead to poorly designed databases

Phenomenon of Information Islands

"Since relational database systems are easy to implement and use, too many people or departments may create their own databases and applications. These are known as information islands, and they prevent the information integration that is essential for the smooth and efficient functioning of the organization."

Database Users
There are different types of users depending on their need and way of
accessing the database:

Application Programmers: They are the developers who interact with the database by means of DML queries. These queries are embedded in application programs written in languages such as C, C++, and Java.

Sophisticated Users: They interact with the database directly by writing queries in a database query language, without going through application programs.

Specialized Users: These users are also sophisticated users, but they write special database application programs to meet complex requirements.

Stand-alone Users: These users have standalone databases designed for a specific purpose.

Naive Users: These are the users who use the existing applications and interfaces.

File Organization
File Organization: A file organization refers to the way the files are physically
arranged on a storage device.

Types of File Organization


| File Organization | Description |
| --- | --- |
| Sequential File Organization | The most basic way to organize a collection of records in a file is to use sequential organization. |
| Relative File Organization | Provides an effective way to access individual records directly. |
| Indexed Sequential File Organization | An effective way of organizing the records when there is a need to access records both sequentially and directly. |

Factors Influencing the Choice of File Organization

Economy of storage

Convenience of updates

Ease of retrieval

Reliability

Security

Integrity

Sequential File Organization

In sequentially organized files, the records are written one after another in order when the file is created, and they can be accessed only in the order in which they were written when the file is used for input.

Advantages:

Easy to handle

Involves no overhead

Can be stored on tapes as well as disks

Well suited for batch-oriented applications

Records in a sequential file can be of varying lengths

Disadvantages:

Records can only be accessed in sequence

Does not support update operations in place

Does not support interactive applications

Relative File Organization

Provides an effective way to access individual records directly. In relative file organization, there is a predictable relationship between the key and the record's location in the file.
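
One common form of that predictable relationship is a fixed-size slot per key, so that the byte offset is simply `(key - 1) * record_size`. The sketch below assumes this scheme; the record layout (an integer key plus a 20-byte name) and the helper names are hypothetical, and an in-memory `io.BytesIO` stands in for a disk file.

```python
import io
import struct

# Hypothetical fixed-length layout: 4-byte int key + 20-byte name = 24 bytes.
RECORD = struct.Struct("<i20s")


def slot_offset(key):
    """Byte offset of the record with relative key `key` (keys start at 1)."""
    return (key - 1) * RECORD.size


def write_record(f, key, name):
    """Write (or overwrite in place) the record stored under `key`."""
    f.seek(slot_offset(key))
    f.write(RECORD.pack(key, name.encode("utf-8")))


def read_record(f, key):
    """Jump straight to the slot for `key` and decode it."""
    f.seek(slot_offset(key))
    stored_key, raw = RECORD.unpack(f.read(RECORD.size))
    return stored_key, raw.rstrip(b"\x00").decode("utf-8")


f = io.BytesIO()               # stands in for a file on disk
write_record(f, 3, "Carol")    # records may be written out of sequence
write_record(f, 1, "Alice")
write_record(f, 3, "Caroline") # update in place: same slot, same offset
print(read_record(f, 3))
```

Because every slot has the same size, the sketch also shows why records in this organization must be of fixed length.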

Advantages:

Records can be accessed out of sequence randomly

Well suited for interactive (on-line) applications

Supports update operations in place

Disadvantages:

Can only be stored on disks

Involves more overhead in the form of maintenance

Handling is complex as compared to sequential files

Records can only be of fixed length

Indexed Sequential File Organization

Definition:
An indexed sequential file organization is a combination of sequential
file and relative file organizations. It provides the benefits of both
access methods, allowing for efficient sequential access and direct
access to individual records.
Structure:
The indexed sequential file organization consists of an index with
pointers to a sequential data file. The index is structured as a binary
search tree, allowing for fast lookup and retrieval of records.
Example:
Consider a credit card billing system with a master file of customer
account information. The account number is used as the index key,
allowing for fast retrieval of individual records. The file can be
accessed in batch mode to generate customer invoices and build summary
reports of accounts activity on a monthly basis.
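
A minimal sketch of the two access paths follows. It swaps in a sorted key list searched with Python's `bisect` module in place of the tree-structured index described above, and the account records are made up for illustration.

```python
import bisect

# Data file: records kept in ascending account-number order (hypothetical data).
data_file = [
    (1001, "Ann", 250.0),
    (1004, "Bob", 80.5),
    (1009, "Cy", 0.0),
    (1015, "Di", 410.2),
]

# Index: sorted keys; entry i points at record i of the data file.
index_keys = [record[0] for record in data_file]


def direct_lookup(account_no):
    """Direct access: binary-search the index, then fetch one record."""
    i = bisect.bisect_left(index_keys, account_no)
    if i < len(index_keys) and index_keys[i] == account_no:
        return data_file[i]
    return None


def sequential_scan():
    """Sequential access: visit every record in key order (batch mode)."""
    yield from data_file


print(direct_lookup(1009))
print(sum(balance for _, _, balance in sequential_scan()))
```

The direct lookup serves interactive queries on a single account, while the sequential scan serves the monthly batch run over all accounts.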

Advantages:

Records can be accessed sequentially and directly.

Supports interactive and batch-oriented applications.

Supports update operations.

Disadvantages:

Can only be stored on disks.

Involves more overhead in maintaining the index.

Records cannot be of variable length.

Multi-Key File Organization


Definition:
A multi-key file organization is a method of organizing files to support multiple keys or access paths. It allows for efficient retrieval of records based on different keys or combinations of keys.
Example:
Consider a banking system with several types of users, including
tellers, loan officers, branch managers, and account holders. Each user
needs to access the same data in different ways. A multi-key file
organization can support multiple access paths, including account ID,
overdraft limit, social security number, and group code.
Approach:
One approach to support multi-key file organization is to use a single
data file and multiple indexes, each providing a different access path
to the data records.
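
The single-data-file, multiple-index approach might look like the sketch below. The field names follow the banking example, but the records themselves and the helpers `build_index` and `lookup` are assumptions made for illustration.

```python
from collections import defaultdict

# One data file (hypothetical records); positions in this list act as pointers.
records = [
    {"account_id": "A1", "ssn": "111", "group_code": "G1", "overdraft_limit": 500},
    {"account_id": "A2", "ssn": "222", "group_code": "G2", "overdraft_limit": 0},
    {"account_id": "A3", "ssn": "333", "group_code": "G1", "overdraft_limit": 500},
]


def build_index(field):
    """Map each value of `field` to the positions of the matching records."""
    index = defaultdict(list)
    for position, record in enumerate(records):
        index[record[field]].append(position)
    return index


def lookup(index, value):
    """Follow the index pointers back into the single data file."""
    return [records[position] for position in index.get(value, [])]


# Several indexes over the same data, one per access path.
by_account = build_index("account_id")
by_group = build_index("group_code")
by_overdraft = build_index("overdraft_limit")

print([r["account_id"] for r in lookup(by_group, "G1")])
```

The data is stored once; each extra access path costs one more index that must be kept up to date when records change.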

Hashing
Definition:
Hashing is a technique of storing and retrieving records in a file
using a hash function. The hash function calculates the address of the
page where the record is stored based on one or more fields in the
record.
Hash Function:
A hash function is a mathematical formula that manipulates the keys in
some way to compute the index for the keys in the hash table.
Example:
Consider a hash function that takes the first two characters of the
staff number, converts them to an integer value, and then adds this
value to the remaining digits of the field. The resulting sum is used as
the address of the disk page where the record is stored.
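
A rough sketch of such a hash function is shown below. The notes do not say how the two characters are converted to an integer, so summing their character codes is an assumption, as are the function name `staff_hash` and the sample staff numbers.

```python
def staff_hash(staff_no):
    """Hash a staff number such as 'SG37' to a page address."""
    prefix, digits = staff_no[:2], staff_no[2:]
    # Assumption: "convert to an integer" means summing the character codes.
    prefix_value = sum(ord(ch) for ch in prefix)
    return prefix_value + int(digits)


print(staff_hash("SG37"))
print(staff_hash("GS37"))
```

With this particular conversion, `SG37` and `GS37` produce the same address, which is exactly the kind of collision that the resolution techniques in the rest of this section deal with.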
Collision Resolution:

Open Addressing: A technique of resolving collisions by probing other slots in the hash table to find an empty slot.

Chained Overflow: A technique of resolving collisions by storing multiple records with the same hash value in a linked list.

Collision Resolution Techniques

| Technique | Description |
| --- | --- |
| Open Addressing | Probes other slots in the hash table to find an empty slot |
| Chained Overflow | Stores multiple records with the same hash value in a linked list |
| Separate Chaining | Uses a linked list to store all records with the same hash value |

Separate Chaining Example

Let's consider the insertion of elements 5, 28, 19, 15, 20, 33, 12, 17, 10 into a chained hash table with 9 slots. The hash function is h(k) = k mod 9.

Initial State:

| T0 | T1 | T2 | T3 | T4 | T5 | T6 | T7 | T8 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| null | null | null | null | null | null | null | null | null |

Insertion of Elements:

h(5) = 5 mod 9 = 5

h(28) = 28 mod 9 = 1

h(19) = 19 mod 9 = 1

After these three insertions:

| T0 | T1 | T2 | T3 | T4 | T5 | T6 | T7 | T8 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| null | 28 -> 19 | null | null | null | 5 | null | null | null |

The remaining keys hash as follows: h(15) = 6, h(20) = 2, h(33) = 6, h(12) = 3, h(17) = 8, h(10) = 1.

Chained Hash Table After Insertion

| T0 | T1 | T2 | T3 | T4 | T5 | T6 | T7 | T8 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| null | 28 -> 19 -> 10 | 20 | 12 | null | 5 | 15 -> 33 | null | 17 |
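
The insertion sequence can be replayed with a short sketch, which is one way to check each chain by hand. Python lists stand in for the linked lists here, new keys are appended at the tail of a chain, and the function name `build_chained_table` is an assumption of this example.

```python
def build_chained_table(keys, m):
    """Insert keys into an m-slot table, chaining colliding keys together."""
    table = [[] for _ in range(m)]   # one (initially empty) chain per slot
    for k in keys:
        table[k % m].append(k)       # h(k) = k mod m, append at the tail
    return table


table = build_chained_table([5, 28, 19, 15, 20, 33, 12, 17, 10], 9)
for slot, chain in enumerate(table):
    print(f"T{slot}:", " -> ".join(map(str, chain)) or "null")
```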

## Collision Resolution by Open Addressing
In open addressing, all elements of the dynamic set are stored in the hash table itself. Each entry of the hash table either contains an element of the dynamic set or a sentinel value indicating that the slot is free.

### Linear Probing

Linear Probing uses the following hash function:

h(k, i) = (k mod m + i) mod m

where m is the size of the hash table, k is the key, and i is the probe number (i = 0, 1, ..., m - 1).

**Example:** Consider inserting the keys 76, 26, 37, 59, 21, 65, 88 into a hash table of size m = 11 using linear probing. Each row lists the slots probed until a free one is found.

| Key | h(k, 0) | h(k, 1) | h(k, 2) | Slot used |
| --- | --- | --- | --- | --- |
| 76 | 10 | | | 10 |
| 26 | 4 | | | 4 |
| 37 | 4 | 5 | | 5 |
| 59 | 4 | 5 | 6 | 6 |
| 21 | 10 | 0 | | 0 |
| 65 | 10 | 0 | 1 | 1 |
| 88 | 0 | 1 | 2 | 2 |

**Primary Clustering:** Linear Probing suffers from primary clustering, where a cluster refers to a block of occupied slots, and primary clustering refers to many such blocks separated by free slots. A key that hashes anywhere into a cluster must probe to the end of it, so clusters tend to grow and searches slow down.
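
The linear probing formula can be exercised with a small sketch. `None` serves as the free-slot sentinel, and the function name `linear_probe_insert` is an assumption of this example.

```python
def linear_probe_insert(table, key):
    """Insert key using h(k, i) = (k mod m + i) mod m; return the slot used."""
    m = len(table)
    for i in range(m):                 # probe numbers 0 .. m-1
        slot = (key % m + i) % m
        if table[slot] is None:        # None marks a free slot
            table[slot] = key
            return slot
    raise RuntimeError("hash table is full")


table = [None] * 11
slots = [linear_probe_insert(table, k) for k in [76, 26, 37, 59, 21, 65, 88]]
print(slots)
print(table)
```

In the final table the keys 21, 65, and 88 sit in adjacent slots 0 to 2 and the keys 26, 37, and 59 in slots 4 to 6, which is the clustering effect described above.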

### Quadratic Probing

Quadratic Probing uses the following hash function:

h(k, i) = (k mod m + c1 \* i + c2 \* i^2) mod m

where m is the size of the hash table, k is the key, i is the probe number, and c1 and c2 are given constants.

**Example:** Consider inserting the keys 76, 26, 37, 59, 21, 65, 88 into a hash table of size m = 11 using quadratic probing with c1 = 1 and c2 = 3. Each row lists the slots probed until a free one is found.

| Key | h(k, 0) | h(k, 1) | h(k, 2) | Slot used |
| --- | --- | --- | --- | --- |
| 76 | 10 | | | 10 |
| 26 | 4 | | | 4 |
| 37 | 4 | 8 | | 8 |
| 59 | 4 | 8 | 7 | 7 |
| 21 | 10 | 3 | | 3 |
| 65 | 10 | 3 | 2 | 2 |
| 88 | 0 | | | 0 |
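
The same keys can be run through a short sketch of the quadratic formula. The function name `quadratic_probe_insert` is an assumption, and the loop gives up after m probes because a quadratic probe sequence is not guaranteed to visit every slot.

```python
def quadratic_probe_insert(table, key, c1=1, c2=3):
    """Insert key using h(k, i) = (k mod m + c1*i + c2*i^2) mod m."""
    m = len(table)
    for i in range(m):
        slot = (key % m + c1 * i + c2 * i * i) % m
        if table[slot] is None:        # None marks a free slot
            table[slot] = key
            return slot
    raise RuntimeError("no free slot found within m probes")


table = [None] * 11
slots = [quadratic_probe_insert(table, k) for k in [76, 26, 37, 59, 21, 65, 88]]
print(slots)
print(table)
```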

### Double Hashing

Double Hashing uses the following hash function:

h(k, i) = (h1(k) + i \* h2(k)) mod m

where m is the size of the hash table, k is the key, i is the probe number, and h1 and h2 are two auxiliary hash functions.

**Example:** Consider inserting the keys 76, 26, 37, 59, 21, 65, 88 into a hash table of size m = 11 using double hashing with h1(k) = k mod 11 and h2(k) = k mod 9. Each row lists the slots probed until a free one is found.

| Key | h1(k) | h2(k) | h(k, 0) | h(k, 1) | Slot used |
| --- | --- | --- | --- | --- | --- |
| 76 | 10 | 4 | 10 | | 10 |
| 26 | 4 | 8 | 4 | | 4 |
| 37 | 4 | 1 | 4 | 5 | 5 |
| 59 | 4 | 5 | 4 | 9 | 9 |
| 21 | 10 | 3 | 10 | 2 | 2 |
| 65 | 10 | 2 | 10 | 1 | 1 |
| 88 | 0 | 7 | 0 | | 0 |
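
A sketch of double hashing with the same two auxiliary functions is given below. The function name `double_hash_insert` is an assumption. One caveat that the sketch makes visible: h2(k) = k mod 9 is 0 for multiples of 9, in which case every probe lands on the same slot, so the loop is bounded and raises an error rather than spinning.

```python
def double_hash_insert(table, key):
    """Insert key using h(k, i) = (h1(k) + i*h2(k)) mod m."""
    m = len(table)
    h1 = key % m      # h1(k) = k mod 11 when m = 11
    h2 = key % 9      # h2(k) = k mod 9; 0 for multiples of 9 (no movement)
    for i in range(m):
        slot = (h1 + i * h2) % m
        if table[slot] is None:        # None marks a free slot
            table[slot] = key
            return slot
    raise RuntimeError("no free slot found within m probes")


table = [None] * 11
slots = [double_hash_insert(table, k) for k in [76, 26, 37, 59, 21, 65, 88]]
print(slots)
print(table)
```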

## B-Tree

A B-tree of order m can be defined as an m-way search tree that satisfies the following properties:

> All leaf nodes are at the same level.

> All non-leaf nodes (except the root node) should have at least ⌈m/2⌉ children.

> All nodes (except the root node) should have at least ⌈m/2⌉ - 1 keys.

> If the root node is a leaf node (the only node in the tree), it will have no children and will have at least one key. If the root node is a non-leaf node, it will have at least 2 children and at least one key.

> A non-leaf node with n - 1 key values should have n children.

**Advantages:**

* A B-tree is perfectly balanced, so fewer nodes are accessed to find a key.
* A B-tree avoids waste of storage space since any node (except the root) is at least half full.

**Example:**

The minimum and maximum number of children in any non-root and non-leaf node of B-trees of different orders are shown in the following table:

| Order (m) | Minimum Children | Maximum Children |
| --- | --- | --- |
| 3 | 2 | 3 |
| 4 | 2 | 4 |
| 5 | 3 | 5 |
| ... | ... | ... |

## B-Trees

A B-tree is a self-balancing search tree that keeps data sorted and allows search, insert, and delete operations in logarithmic time.

### Properties of B-Trees

* **Order**: The maximum number of children a node can have.
* **Minimum Children**: The minimum number of children a node must have.
* **Maximum Children**: The maximum number of children a node can have.

### B-Tree of Order 5

In a B-tree of order 5, the maximum number of permissible keys (MAX) is 4, and the minimum number of permissible keys (MIN) is 2.
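
These limits follow from the order alone, as the small sketch below shows for non-root nodes. The function name `btree_bounds` and the dictionary keys are assumptions of this example; the formulas are the ones stated in the B-tree properties (at least ⌈m/2⌉ children and ⌈m/2⌉ - 1 keys, at most m children and m - 1 keys).

```python
import math


def btree_bounds(m):
    """Child and key limits for a non-root node of a B-tree of order m."""
    min_children = math.ceil(m / 2)
    return {
        "min_children": min_children,   # applies to non-leaf nodes
        "max_children": m,
        "min_keys": min_children - 1,
        "max_keys": m - 1,
    }


for order in (3, 4, 5):
    print(order, btree_bounds(order))
```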

### Example of a B-Tree of Order 5

+---------------+
| 40 |
+---------------+
| / \ |
|/ \ |
+---------------+---------------+
| 14 | 68 |
+---------------+---------------+
| / \ / \ / \ |
|/ \| |/ \| |/ \|
+---------------+---------------+---------------+
| 5 | 15 | 35 | 45 | 55 |
+---------------+---------------+---------------+

B+ Trees
A B+ tree is a variation of a B-tree that is well-suited for disk access.

Properties of B+ Trees
Internal Nodes: Store only keys and child pointers.

Leaf Nodes: Store keys and their corresponding data items.

Linked List: All leaf nodes form a linked list.

Example of a B+ Tree
+---------------+
| 40 |
+---------------+
| / \ |
|/ \ |
+---------------+---------------+
| 14 | 68 |
+---------------+---------------+
| / \ / \ / \ |
|/ \| |/ \| |/ \|
+---------------+---------------+---------------+
| Data | Data | Data | Data | Data |
+---------------+---------------+---------------+

Relational Database Management Systems (RDBMS)

A Relational Database Management System (RDBMS) is a database management system that is based on the relational model.

Properties of Relational Tables
Values are Atomic: Columns in a relational table are not repeating groups or arrays.

Column Values are of the Same Kind: All values in a column come from the same domain.

Each Row is Unique: No two rows in a relational table are identical.

Each Column has a Unique Name: Column names are unique.

The Sequence of Rows is Insignificant: Rows can be retrieved in any order.

The Sequence of Columns is Insignificant: Columns can be retrieved in any order.

RDBMS vs DBMS
| Aspect | RDBMS | DBMS |
| --- | --- | --- |
| Relationship between tables | Specified at table creation | Programmatically specified |
| Client/Server Architecture | Supports | Does not support |
| Distributed Databases | Supports | Does not support |
| Security | Multiple levels of security | Tight security |
| Database | One database with many tables | Many tables with one extension |

Codd's 12 Rules
Dr. E.F. Codd's 12 rules for a Relational Database Management System (RDBMS):

"A relational system must be able to manage databases of arbitrary complexity, and it must be able to do so in a way that is conceptually simple, efficient, and flexible."

Rule 1: The information rule

Rule 2: The guaranteed access rule

Rule 3: Systematic treatment of null values

Rule 4: Active online catalog based on the relational model

Rule 5: Comprehensive data sublanguage rule

Rule 6: View updating rule

Rule 7: High-level insert, update, and delete rule

Rule 8: Physical data independence

Rule 9: Logical data independence

Rule 10: Integrity independence

Rule 11: Distribution independence

Rule 12: Non-subversion rule

Disadvantages of a File Processing System
The disadvantages of a file processing system include:

Data Redundancy: Since these systems use a decentralized approach, each department often uses its own application programs and files, leading to duplication of data.

Poor Data Control: There is no centralized control over the fields, making it difficult to manage and enforce data standards.

Poor Data Manipulation Capabilities: File processing systems do not provide strong connections between data stored in different files, making data retrieval and manipulation cumbersome.

Data Dependence: Files and records are described by specific physical formats that are coded into the application programs, making changes difficult and costly.

Security Problems: Enforcing security measures is challenging because application programs are often added in an ad-hoc manner, leading to inconsistent security practices.

Database Keys
Understanding database keys is crucial for designing and working with
relational databases. Here are definitions of primary key, superkey, and foreign
key:

Primary Key
The primary key is a column (or a set of columns) used to uniquely identify
each row in a table. Each table can have only one primary key.

Superkey
A superkey is a set of one or more columns (attributes) that uniquely
identifies a record in a table. A table can have multiple superkeys; a
minimal superkey is called a candidate key.

Foreign Key
A foreign key is a field (or collection of fields) in one table that refers to
the primary key of another table. It links the two tables and enforces
referential integrity: every non-null foreign key value must match an existing
row in the referenced table.
Suppose we have a table "Students" and another table "Courses". If the
"Courses" table has a primary key called "CourseID", and the "Students" table
has a column called "EnrolledCourse" that holds "CourseID" values, then the
"EnrolledCourse" column is a foreign key in the "Students" table referencing
the "CourseID" primary key in the "Courses" table.
Note that a foreign key in one table points to a primary key (or another
unique key) in another table.
Example:
Students Table

| StudentID (Primary) | StudentName | EnrolledCourse (Foreign Key) |
|---------------------|-------------|------------------------------|
| 1 | John | CS101 |
| 2 | Jane | MTH101 |

Courses Table

| CourseID (Primary) | CourseTitle |
|--------------------|-------------|
| CS101 | CompSci |
| MTH101 | Math |

So, "EnrolledCourse" in the Students table is a foreign key that references
"CourseID" in the Courses table.

Normalization
Normalization is a database design technique that reduces data redundancy
and eliminates undesirable characteristics such as insertion, update, and
deletion anomalies.
The main aim of normalization is to divide a database into two or more tables
and define relationships between them. In brief,

It minimizes redundancy and duplication by dividing a database into two or
more tables.

It facilitates data consistency within the tables.

Classification of Database Languages
Database languages are organized into the following categories:

1. Data Definition Language (DDL)

Used to define the structure of the database.

Includes commands like CREATE, ALTER, and DROP.

2. Data Manipulation Language (DML)

Used to manipulate the data within the structure.

Includes commands like SELECT, UPDATE, INSERT, and DELETE.

3. Data Control Language (DCL)

Used to control access and rights to database data.

Includes commands like GRANT and REVOKE.

4. Transaction Control Language (TCL)

Used to handle transactions within the database.

Includes commands like COMMIT and ROLLBACK.

The Classification of SQL


SQL (Structured Query Language) is considered a combination of the four
types of languages detailed above. It provides functionality for defining,
manipulating, controlling, and handling transactions in a database.

/* Example of SQL commands, one per category.
   app_user is a placeholder account name. */
CREATE TABLE Students (Name VARCHAR(20), Age INT, Major VARCHAR(30)); -- DDL
INSERT INTO Students VALUES ('John Doe', 20, 'Computer Science');     -- DML
SELECT * FROM Students;                                                -- DML
GRANT SELECT ON Students TO app_user;                                  -- DCL
COMMIT;                                                                -- TCL

Given its comprehensive capabilities, SQL falls under all categories making it a
powerful tool for interacting with relational databases.

DDL and DML commands in SQL


In SQL (Structured Query Language), the two most frequently used categories
of commands are DDL (Data Definition Language) and DML (Data Manipulation
Language).

DDL (Data Definition Language)


DDL involves commands that define or modify the structure of a database.
Common DDL commands include:

CREATE : used to create a new table or database.

ALTER : used to modify an existing database object.

DROP : used to delete an entire table or database.

CREATE TABLE Students(StudentID int, StudentName varchar(255));
ALTER TABLE Students ADD Email varchar(255);
DROP TABLE Students;

DML (Data Manipulation Language)


DML involves commands that manipulate data present in the database.
Common DML commands include:

SELECT : used to select data from a database.

INSERT : used to insert data into a table.

UPDATE : used to update existing data within a table.

DELETE : used to delete records from a database table.

SELECT * FROM Students;
INSERT INTO Students (StudentID, StudentName) VALUES (1, 'John');
UPDATE Students SET Email = '[email protected]' WHERE StudentName = 'John';
DELETE FROM Students WHERE StudentID = 1;

Functional Dependency
Functional Dependency is a fundamental concept in the study of database
systems. It refers to the relationship between two sets of attributes in a
database.
To explain briefly:

If we have a relation R and two sets of attributes A and B in R, we
say A functionally determines B if, for every valid instance of R, each
value of A is associated with precisely one value of B. This is denoted
A → B.

This means that if you know the value of A, you can predict the value
of B with certainty. This concept is key in the creation of database
schemas, particularly in the normalization process.
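
The definition above can be checked mechanically against sample data. The following is a minimal sketch, assuming a relation is held as a list of Python dictionaries; the helper name `holds` and the sample rows are illustrative and not part of the notes. A table instance can only refute a dependency; it cannot prove that the dependency holds for every valid instance.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[b] for b in rhs)
        # setdefault stores val the first time key is seen and returns the
        # stored value afterwards, so a mismatch means two rows disagree.
        if seen.setdefault(key, val) != val:
            return False
    return True

# Hypothetical sample data.
students = [
    {"StudentID": 1, "StudentName": "John", "Major": "CS"},
    {"StudentID": 2, "StudentName": "Jane", "Major": "Math"},
    {"StudentID": 3, "StudentName": "John", "Major": "Physics"},
]

print(holds(students, ["StudentID"], ["StudentName"]))  # True
print(holds(students, ["StudentName"], ["Major"]))      # False
```

Here StudentID → StudentName holds, while StudentName → Major fails because two students named John have different majors.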

Pitfalls of Lock-Based Protocols
Lock-based protocols have several limitations. Here are some of the main
pitfalls:

Deadlock: This occurs when two or more transactions each hold a lock that
another one needs, creating a cycle in which every transaction waits for
another to release a lock.

Resource starvation: If lock requests are not scheduled fairly, some
transactions may wait indefinitely while others repeatedly acquire the lock.

Decrease in concurrent processing: The more data is locked, the less work
can proceed concurrently, which can reduce throughput.

Increased overhead: Managing locks causes additional computing
overhead, affecting overall performance.

Note that the severity of these pitfalls may vary depending on the specific lock-
based protocol implemented.
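
Deadlock, the first pitfall above, is commonly detected by looking for a cycle in a wait-for graph, where an edge from T1 to T2 means T1 is waiting for a lock held by T2. The sketch below assumes the graph is given as a dictionary of sets; the function name `has_deadlock` and the transaction names are illustrative only.

```python
def has_deadlock(wait_for):
    """Detect a cycle in a wait-for graph {txn: set of txns it waits on}."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    colour = {}

    def visit(txn):
        colour[txn] = GREY
        for other in wait_for.get(txn, ()):
            state = colour.get(other, WHITE)
            if state == GREY:
                return True  # reached a transaction already on this path
            if state == WHITE and visit(other):
                return True
        colour[txn] = BLACK
        return False

    return any(colour.get(t, WHITE) == WHITE and visit(t) for t in wait_for)

print(has_deadlock({"T1": {"T2"}, "T2": {"T1"}}))  # True
print(has_deadlock({"T1": {"T2"}, "T2": set()}))   # False
```

A real DBMS that finds such a cycle typically aborts one of the transactions in it so the others can proceed.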

Database Keys Definitions
1. Primary Key: A primary key uniquely identifies each record in a table. It
must contain unique values and it cannot contain null values.

For eg: `PRIMARY KEY (id)`

2. NOT NULL Constraint: Strictly a constraint rather than a key, NOT NULL
ensures that a column cannot have a null value.

For eg: `name VARCHAR(50) NOT NULL`

3. Unique Key: A unique key is a set of one or more fields/columns of a table
that uniquely identify a record in a database table. It is like a primary
key, but it can accept null values; how many nulls are allowed depends on
the database system (some permit only one).

For eg: `UNIQUE (email)`

Different Methods of Indexes

Indexes are essential tools to expedite data retrieval in a database. Let's
explore a few common types:

1. B-Tree:
This is the most common type of index. Keys are kept sorted in a balanced
tree, so the database can find a value (or a range of values) by following
a path from the root through the tree's nodes.

2. Bitmap Index:
It's mainly useful in databases where the data in indexed columns has a
limited number of distinct values. Bitmap indexes use bit arrays (commonly
referred to as bitmaps) and answer queries by performing bitwise logical
operations on these bitmaps.

3. Hash Index:
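In a hash index, a hash function maps each key to a numeric value that
identifies the bucket where the matching entries are stored. Different keys
can hash to the same bucket, so the bucket is then searched for the exact
key. Hash indexes suit equality lookups but not range queries.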

4. Clustered Index:
It sorts and stores the data rows in the table or view based on their key
values. There can only be one clustered index per table.

5. Non-Clustered Index:

It's just like a book index where reference is given to the page where the
information is saved but the pages aren't sorted in order.

Remember, indexing methods can significantly impact the speed of data
retrieval, but they also affect the speed of writing data to the database.
It's necessary to balance these factors based on specific use cases.
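
To make the difference between a hash index and an ordered (B-tree style) index concrete, here is a small in-memory sketch. It is not how a DBMS stores indexes on disk: a Python dictionary stands in for the hash index and a sorted list searched with `bisect` stands in for the B-tree. The row data and the function names `lookup` and `range_scan` are made up for illustration.

```python
import bisect

rows = [
    {"id": 7, "name": "Jane"},
    {"id": 3, "name": "John"},
    {"id": 9, "name": "Ravi"},
    {"id": 5, "name": "Mina"},
]

# Hash-style index: key -> list of row positions (equality lookups only).
hash_index = {}
for pos, row in enumerate(rows):
    hash_index.setdefault(row["id"], []).append(pos)

# Sorted index (stand-in for a B-tree): (key, position) pairs kept in order.
sorted_index = sorted((row["id"], pos) for pos, row in enumerate(rows))

def lookup(key):
    """Names of rows whose id equals key, found through the hash index."""
    return [rows[p]["name"] for p in hash_index.get(key, [])]

def range_scan(low, high):
    """Names of rows with low <= id <= high, in key order."""
    start = bisect.bisect_left(sorted_index, (low, -1))
    end = bisect.bisect_right(sorted_index, (high, len(rows)))
    return [rows[p]["name"] for _, p in sorted_index[start:end]]

print(lookup(9))         # ['Ravi']
print(range_scan(4, 8))  # ['Mina', 'Jane']
```

The range query is answered from the sorted structure without scanning every row, which the hash structure cannot do. Both structures must be updated on every insert, which is the write cost mentioned above.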

Characteristics of SQL
SQL (Structured Query Language) is a standardized programming language
used for managing and manipulating databases. Here are some primary
characteristics of SQL:

Declarative: SQL is a declarative language, meaning you state what you
want but not how to get it.

Ubiquitous: SQL is almost universally used. Any application that interacts
with a relational database likely uses SQL.

ACID Compliant: SQL databases are often ACID (atomicity, consistency,
isolation, durability) compliant, which ensures reliable processing of
transactions.

Well-Defined Standards: Standards for SQL have been established by the
American National Standards Institute (ANSI) and the International
Organization for Standardization (ISO).

Schemas: SQL databases use a schema, a structure defined in advance, to
define the data format.

Scalability and Flexibility: SQL can handle a large amount of data and can
be scaled up and down as per requirements.

High Performance: SQL can handle heavy loads and performs well in big
data scenarios.

Characteristics of a Database
A database is an organized collection of data that is stored and accessed
electronically. Databases allow us to preserve, retrieve, and manipulate data
effectively, making them critical for any type of serious computational work.
Here are some key characteristics of a database:

Structured: Data is organized in a structured manner using tables, keys,
indexes, etc.

Efficient Data Access: Databases provide efficient, quick, and secure
access to large amounts of data.

Data Consistency: Committed changes are visible to all users, and integrity
rules keep the stored data consistent.

Security: Databases offer different levels of security to protect the data.

Concurrency Control: Multiple users can access the database concurrently
without conflict.

Backup and Recovery: Databases can be backed up and recovered in case
of data loss.

These are the key features that make databases an essential tool for managing
data in any enterprise application.

Differences between Sequential and Random Access in File Handling
Sequential Access
In sequential access, data is read or written consecutively, from start to
end.

It is less efficient if you want to access data located near the end of the file,
as you have to traverse all the preceding data.

Example: Reading a file record by record from the start until the wanted
record is reached.

Random Access
In random access, data can be read or written in any order. You can
start from any position.

It is more efficient if you want to access data located anywhere in the file,
as you can jump directly to that point.

Example: Moving the file pointer to a record's position and reading it
directly, regardless of where it sits in the file.

Remember that the choice of access depends on the specific use-case in your
program.
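
The contrast can be seen with a file of fixed-size records. The sketch below is only an illustration under assumed details: the 12-byte record layout, the file name `students.dat`, and the sample records are all invented for the example.

```python
import os
import struct
import tempfile

# Assumed layout: 4-byte little-endian int id + 8-byte name = 12 bytes.
RECORD = struct.Struct("<i8s")
records = [(1, b"John"), (2, b"Jane"), (3, b"Ravi")]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "students.dat")
    with open(path, "wb") as f:
        for rec_id, name in records:
            f.write(RECORD.pack(rec_id, name))  # name is padded with zero bytes

    # Sequential access: read every record from the start.
    sequential = []
    with open(path, "rb") as f:
        while chunk := f.read(RECORD.size):
            rec_id, name = RECORD.unpack(chunk)
            sequential.append((rec_id, name.rstrip(b"\x00")))

    # Random access: jump straight to record number 2 (zero-based).
    with open(path, "rb") as f:
        f.seek(2 * RECORD.size)
        rec_id, name = RECORD.unpack(f.read(RECORD.size))
        third = (rec_id, name.rstrip(b"\x00"))

print(sequential)
print(third)  # (3, b'Ravi')
```

Random access works here because every record has the same size, so the byte offset of record n is simply n times the record size.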

Relational Algebra
Relational algebra is a procedural query language, which takes instances of
relations as input and yields instances of relations as output. It is mainly used to
manipulate the data in a relational database.

In simple words, Relational Algebra is a collection of operations on
relations, including:

Selection: picking certain rows,

Projection: picking certain columns,

Union: combining two relations,

Set difference: finding elements in one relation but not in another,

Cartesian product: pairing every row of one relation with every row of
another,

Rename: renaming the relation or its attributes.

Each operation takes one or two relations as input and produces a relation
as output. Here are some of the significant ones:

Relational Algebra Operations
1. Select Operation (σ):

This operation is used to select a subset of tuples from a relation based
on a given predicate condition.

2. Project Operation (π):

This operation is used to extract specific attributes from a relation and
discard the unnecessary attributes.

3. Union Operation (∪):

This operation is used to combine tuples from two relations with the
same attributes into a single relation.

4. Set Difference Operation (−):

This operation is used to find the tuples present in one relation but not
in the second relation.

5. Cartesian Product Operation (×):

This operation pairs every tuple of one relation with every tuple of the
other.

6. Rename Operation (ρ):

This operation is used to rename the attributes or the relation.

7. Join Operation (⋈):

This is an essential operation that combines two relations based on a
matching condition.

8. Intersection Operation (∩):

This operation is used to find the common tuples between two relations.

These operations are fundamental to managing and manipulating data in
relational databases.
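
The operations listed above can be sketched in a few lines of Python, treating a relation as a list of dictionaries with no duplicate rows. This is an illustrative model only; the function names and sample relations are assumptions, and a real DBMS evaluates these operations very differently.

```python
def select(relation, predicate):
    """σ: keep the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]

def project(relation, attrs):
    """π: keep only attrs, dropping duplicate tuples."""
    result = []
    for row in relation:
        reduced = {a: row[a] for a in attrs}
        if reduced not in result:
            result.append(reduced)
    return result

def union(r, s):
    """∪: tuples in r or s (both must have the same attributes)."""
    result = []
    for row in r + s:
        if row not in result:
            result.append(row)
    return result

def difference(r, s):
    """−: tuples in r that are not in s."""
    return [row for row in r if row not in s]

def product(r, s):
    """×: every tuple of r paired with every tuple of s (attribute names must differ)."""
    return [{**a, **b} for a in r for b in s]

def natural_join(r, s):
    """⋈: pairs of tuples that agree on all shared attributes."""
    return [{**a, **b} for a in r for b in s
            if all(a[k] == b[k] for k in a.keys() & b.keys())]

students = [
    {"StudentID": 1, "StudentName": "John", "CourseID": "CS101"},
    {"StudentID": 2, "StudentName": "Jane", "CourseID": "MTH101"},
]
courses = [
    {"CourseID": "CS101", "CourseTitle": "CompSci"},
    {"CourseID": "MTH101", "CourseTitle": "Math"},
]

cs_students = select(students, lambda row: row["CourseID"] == "CS101")
names = project(students, ["StudentName"])
joined = natural_join(students, courses)
print(cs_students)
print(names)
print(joined)
```

Intersection needs no separate function in this model, since r ∩ s equals `difference(r, difference(r, s))`.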

3NF (Third Normal Form)
Rules of 3NF:

The relation is in 2NF.

There are no transitive dependencies of non-prime attributes on the primary
key.

Example:

Student (Student_id, Student_name, Student_addr, Student_age, Class_id, Class_name)

Here Class_name is transitively dependent on the primary key Student_id
(Student_id → Class_id and Class_id → Class_name), which violates third
normal form.

BCNF (Boyce-Codd Normal Form)
Rules of BCNF:

The relation should be in 3NF.

For every non-trivial functional dependency X→Y, X should be a super key.

Example:

Class (Class_id, Class_name, Student_id, Student_name)

Assuming the key of this relation is (Class_id, Student_id), Class_id →
Class_name holds but Class_id alone is not a super key, which violates
BCNF.

Database Integrity Constraints


Integrity constraints are the rules that help maintain the quality and accuracy of
data in the database. There are several types of integrity constraints in a
relational database:

1. Domain Constraint: This ensures that every entry in a column belongs to
that column's domain, i.e. its data type and permitted set of values.

2. Primary Key Constraint: This ensures that all entries are unique and that
there are no null values in the primary key column.

3. Foreign Key Constraint: This ensures that each value in the foreign key
column either matches a primary key value in the referenced table or is null.

4. Unique Constraint: This ensures that all values in a column are unique,
differentiating the records.

5. Not Null Constraint: This ensures that a column cannot have a null value.

6. Check Constraint: This ensures that the value in a column meets a specific
condition.

Note: The DBMS rejects any statement that would violate these constraints,
which protects the integrity of the data.
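
The constraints above can be tried out with Python's built-in `sqlite3` module. The sketch below is one possible demonstration, not a prescribed schema: the column names, email addresses and the course code `MTH999` are invented, and SQLite is used only because it ships with Python. Two SQLite-specific points are assumptions worth noting: foreign keys are enforced only after `PRAGMA foreign_keys = ON`, and SQLite does not strictly enforce column data types, so the domain constraint is not exercised here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE Courses (
    CourseID    TEXT PRIMARY KEY,
    CourseTitle TEXT NOT NULL
);
CREATE TABLE Students (
    StudentID      INTEGER PRIMARY KEY,
    StudentName    TEXT NOT NULL,
    Email          TEXT UNIQUE,
    Age            INTEGER CHECK (Age >= 0),
    EnrolledCourse TEXT REFERENCES Courses(CourseID)
);
INSERT INTO Courses VALUES ('CS101', 'CompSci');
INSERT INTO Students VALUES (1, 'John', 'john@example.com', 20, 'CS101');
""")

# Each row below breaks exactly one constraint.
bad_rows = {
    "primary key": (1, "Jane", "jane@example.com", 21, "CS101"),
    "not null":    (2, None,   "jane@example.com", 21, "CS101"),
    "unique":      (3, "Jane", "john@example.com", 21, "CS101"),
    "check":       (4, "Jane", "jane@example.com", -5, "CS101"),
    "foreign key": (5, "Jane", "jane@example.com", 21, "MTH999"),
}

rejected = []
for name, row in bad_rows.items():
    try:
        conn.execute("INSERT INTO Students VALUES (?, ?, ?, ?, ?)", row)
    except sqlite3.IntegrityError as exc:
        rejected.append(name)
        print(f"{name}: rejected ({exc})")
```

Every one of the five inserts raises `sqlite3.IntegrityError`, so the table still contains only the original row.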

Database Normalization: 4th and 5th Normal Forms
Fourth Normal Form (4NF)
Fourth Normal Form (4NF) states that a table should not have any non-trivial
multi-valued dependency unless its determinant is a super key, which means
that each independent multi-valued fact should be represented in a separate
table.

Fifth Normal Form (5NF)
Fifth Normal Form (5NF), also known as Project-Join Normal Form (PJNF),
reduces redundancy by decomposing a larger table into smaller tables that
can be rejoined without loss of information. It requires that every
non-trivial join dependency in the table is a consequence of the candidate
keys.

DBMS Architecture
The architecture of a Database Management System (DBMS) can be seen as
either 2-tier or 3-tier.

In 2-tier architecture, the client application (and through it the end user)
interacts directly with the database.

3-tier architecture adds an intermediary layer between the user and the
database, often improving performance, manageability, and security.

Definitions
Entity: A distinct, real-world object in an entity-relationship model.

Entity Set: A collection of similar entities.

Strong Entity: An entity that can exist independently of other types of
entities.

Weak Entity: An entity that depends on another entity-type for its existence.

Note: For more detailed information, it would be beneficial to refer to a
specialized DBMS or database design course or textbook.

Normalization in Databases

Normalization is a process in database design that aims to reduce redundant
data, ensure data consistency and enhance the integrity of the database.

Why do we normalize the database?


Normalization is necessary for the following reasons:

Eliminate Redundant Data: By separating data into different tables
according to their relevance and removing duplicate information,
normalization reduces redundancy.

Data Consistency: Normalization ensures any data preserved in the
database remains consistent across the system.

Data Integrity: It enforces integrity constraints, ensuring data follows
certain rules, enhancing the overall logical consistency and correctness of
data.

Please note that while normalization has its benefits, it's not always the optimal
approach. It depends on the specific application and requirements.

Nested SQL Queries


Nested SQL queries, also known as subqueries, are SQL queries embedded
within a main SQL query. They are used to manipulate the data returned from
the database in a more complex and dynamic way.

Syntax Example
Here is a basic example in SQL:

SELECT column_name
FROM table_name
WHERE column_name IN (SELECT column_name FROM table_name WHERE condition);

In the above code snippet:

The query within the parentheses is the subquery or the nested query.

For a non-correlated subquery like this one, the nested query is
conceptually executed first and returns a set of values.

This set of values is then used by the main query to execute and retrieve
the final data.
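
A concrete version of the pattern can be run with Python's built-in `sqlite3` module. The tables, columns and rows below are illustrative assumptions chosen to match the Students and Courses example used earlier in these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Courses  (CourseID TEXT PRIMARY KEY, CourseTitle TEXT);
CREATE TABLE Students (StudentID INTEGER PRIMARY KEY, StudentName TEXT, EnrolledCourse TEXT);
INSERT INTO Courses  VALUES ('CS101', 'CompSci'), ('MTH101', 'Math');
INSERT INTO Students VALUES (1, 'John', 'CS101'), (2, 'Jane', 'MTH101'), (3, 'Ravi', 'CS101');
""")

# The inner SELECT yields the IDs of courses titled 'CompSci';
# the outer SELECT returns the students enrolled in any of them.
rows = conn.execute("""
    SELECT StudentName
    FROM Students
    WHERE EnrolledCourse IN (SELECT CourseID FROM Courses WHERE CourseTitle = 'CompSci')
    ORDER BY StudentID
""").fetchall()
names = [name for (name,) in rows]
print(names)  # ['John', 'Ravi']
```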

Database Normalization
Database Normalization involves organizing the attributes and tables of a
database to minimize redundancy and dependency. It generally encompasses
First Normal Form (1NF), Second Normal Form (2NF), and Third Normal Form
(3NF).

Concept of Functional Dependency


A functional dependency is a constraint between two sets of attributes in a
database. Specifically, a functional dependency exists when one set of
attributes (the determinant) determines another set of attributes.
For instance, if A and B are attributes of a table, B is said to be functionally
dependent on A if, given a particular value of A, we can determine the
exact value of B.

Example:
In a Student table, if `StudentID` determines `StudentName`, then we have
`StudentID -> StudentName`.

Full Functional Dependency


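Full Functional Dependency indicates that if A and B are sets of attributes
of a table and B is functionally dependent on A, then B is not functionally
dependent on any proper subset of A.

Example:
In an Enrollment table with `(StudentID, CourseID) -> Grade`, `Grade` is
fully functionally dependent on `(StudentID, CourseID)` because neither
`StudentID` nor `CourseID` alone determines it. By contrast, in
`(CourseID, ProfessorID) -> CourseName`, `CourseName` would normally be
determined by `CourseID` alone, which makes that dependency partial rather
than full.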

Normal Forms
1NF (First Normal Form): Every column holds atomic (single) values and there
are no repeating groups.

2NF (Second Normal Form): The table is in 1NF and all non-key attributes are
fully functionally dependent on the primary key.

3NF (Third Normal Form): The table is in 2NF and every non-key attribute is
non-transitively dependent on the primary key.

Concept of Keys in Databases


A key in a database is a single field or a combination of fields whose values
identify a record in a database table.

Types of Keys
There are several types of keys in a database:

1. Primary Key: A unique identifier for a record in a table. There can only be
one primary key in a table.

2. Foreign Key: Used to link two tables together. It's a field in one table
that refers to the primary key of another table.

3. Candidate Key: A minimal column, or set of columns, that can uniquely
identify any record in the table; no column can be removed from it without
losing uniqueness.

4. Composite Key: A key that consists of two or more columns to uniquely
identify rows in a table.

5. Secondary Key: A field or set of fields that is not the primary key but is
still used for retrieval purposes; its values need not be unique.

6. Super Key: A set of attributes that together uniquely identify a tuple
(a row) in a relation (a table).

7. Alternate Key: Alternate keys are the candidate keys other than the one
chosen as the primary key.
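
The relationship between super keys and candidate keys can be illustrated by brute force on a small table. The sketch below is an assumption-laden illustration: the function names and the sample enrollment rows are invented, and it inspects only one table instance, whereas real keys are declared for the schema and must hold for every valid instance.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two rows share the same values for attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)

def candidate_keys(rows):
    """Minimal superkeys of this particular table instance."""
    columns = list(rows[0])
    keys = []
    for size in range(1, len(columns) + 1):
        for attrs in combinations(columns, size):
            if any(set(k) <= set(attrs) for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if is_superkey(rows, attrs):
                keys.append(attrs)
    return keys

# Hypothetical sample data.
enrollments = [
    {"StudentID": 1, "CourseID": "CS101", "Email": "john@example.com"},
    {"StudentID": 1, "CourseID": "MTH101", "Email": "john@example.com"},
    {"StudentID": 2, "CourseID": "CS101", "Email": "jane@example.com"},
]
print(is_superkey(enrollments, ["StudentID"]))  # False
print(candidate_keys(enrollments))
```

For this data the candidate keys are (StudentID, CourseID) and (CourseID, Email), both composite. If the first is chosen as the primary key, the second is an alternate key, and all three columns together form a super key that is not minimal.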

Mapping ER Model to Relational Model


Mapping an ER (Entity-Relationship) Model to a relational model involves a
series of stages:

1. Mapping of Regular Entity Types: Each regular entity type in the ER
schema is converted into a separate table in the relational model. The
primary key of the entity in the ER model becomes the primary key of the
new table.

2. Mapping of Weak Entity Types: Weak entity types are the ones that
depend on some other entity type. A separate table is created with a
foreign key that refers to the primary key of its owner entity; the weak
entity's primary key combines that foreign key with its own partial key.

3. Mapping of Composite Attributes: Composite attributes are split up into
individual, simpler attributes.

4. Mapping of Multi-valued Attributes: For each multi-valued attribute, a new
table is created which is linked to the original entity's table via a foreign key.

5. Mapping of Relationships: Depending on the cardinality of the relationship
(one-to-one, one-to-many, many-to-many), appropriate foreign keys are
added to tables or new relational tables are created.

CREATE TABLE Employee (
    Employee_ID INT PRIMARY KEY,
    Name VARCHAR(255),
    Department_ID INT REFERENCES Department(Department_ID)
);

The above SQL snippet illustrates the creation of a new 'Employee' table with a
foreign key 'Department_ID' referring to the 'Department' table, thereby
mapping a relationship between the two entities in the original ER model.
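
The Employee example covers a one-to-many relationship. For a many-to-many relationship, step 5 calls for a new table. The sketch below shows one way this might look, using Python's built-in `sqlite3` module; the Student, Course and Enrollment tables and their rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE Student (Student_ID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Course  (Course_ID TEXT PRIMARY KEY, Title TEXT);
-- Many-to-many relationship: one row per (student, course) pair.
CREATE TABLE Enrollment (
    Student_ID INTEGER REFERENCES Student(Student_ID),
    Course_ID  TEXT    REFERENCES Course(Course_ID),
    PRIMARY KEY (Student_ID, Course_ID)
);
INSERT INTO Student VALUES (1, 'John'), (2, 'Jane');
INSERT INTO Course  VALUES ('CS101', 'CompSci'), ('MTH101', 'Math');
INSERT INTO Enrollment VALUES (1, 'CS101'), (1, 'MTH101'), (2, 'CS101');
""")

pairs = conn.execute("""
    SELECT s.Name, c.Title
    FROM Enrollment e
    JOIN Student s ON s.Student_ID = e.Student_ID
    JOIN Course  c ON c.Course_ID  = e.Course_ID
    ORDER BY s.Name, c.Title
""").fetchall()
print(pairs)
```

The Enrollment table holds two foreign keys, and its composite primary key prevents the same student from being enrolled in the same course twice.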

Lock in Programming:
Used in multithreaded applications to prevent multiple threads from
accessing the same resource concurrently, which could lead to inconsistent
results or corruption.

A thread will "lock" a resource before using it, which prevents other threads
from accessing it until the original thread has "unlocked" it.

Here is a basic example of using a lock in Python:

import threading

# creating a lock
lock = threading.Lock()

# acquiring the lock
lock.acquire()
try:
    pass  # critical section: work with the shared resource here
finally:
    # releasing the lock, even if the critical section raises an error
    lock.release()

Acquiring the lock before the critical section and releasing it afterwards
ensures mutual exclusion, meaning only one thread can access the resource
at a time.

Binary Shared and Exclusive Locks


In databases, binary shared and exclusive locks help manage concurrent
access while maintaining data consistency.

Shared Locks
A shared lock (S lock) allows concurrent transactions to read (SELECT) a
resource but not to write (UPDATE) it.

Multiple S locks can be held on the same resource at once.

Exclusive Locks
An exclusive lock (X lock) prevents other transactions from both reading
(SELECT) and writing (UPDATE) to the resource.

Only a single X lock can be applied on a resource at a time.

In essence, 'Shared' means other transactions can read but not change, while
'Exclusive' means no other transactions can read or change.
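
The rules above amount to a small compatibility table: S is compatible with S, and X is compatible with nothing. The following sketch models a lock manager that applies that table. The class name `LockTable`, the resource name `row1` and the transaction names are hypothetical, and a real lock manager would also queue waiting requests rather than simply refuse them.

```python
# Lock compatibility: can a requested mode be granted alongside a held mode?
COMPATIBLE = {
    ("S", "S"): True,
    ("S", "X"): False,
    ("X", "S"): False,
    ("X", "X"): False,
}

class LockTable:
    """Tracks which transactions hold which lock mode on each resource."""

    def __init__(self):
        self.held = {}  # resource -> {transaction: mode}

    def request(self, txn, resource, mode):
        """Grant the lock and return True, or return False if txn must wait."""
        holders = self.held.setdefault(resource, {})
        for other, other_mode in holders.items():
            if other != txn and not COMPATIBLE[(other_mode, mode)]:
                return False
        # Keep the stronger mode if txn already holds X on this resource.
        if holders.get(txn) != "X":
            holders[txn] = mode
        return True

    def release(self, txn, resource):
        self.held.get(resource, {}).pop(txn, None)

table = LockTable()
print(table.request("T1", "row1", "S"))  # True
print(table.request("T2", "row1", "S"))  # True
print(table.request("T3", "row1", "X"))  # False
```

T3 is refused because two shared locks are still held on `row1`; once T1 and T2 release them, the exclusive request can be granted.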

Anomalies in 1NF (First Normal Form)
Anomalies in 1NF refer to problems that arise when a table satisfies first
normal form but still stores redundant data. Two main types of anomalies
can occur:

1. Update Anomaly: If we have redundant data, changing one copy but not the
others leads to inconsistencies.

2. Deletion Anomaly: Important data could be lost if you have to delete
another piece of data stored in the same row.

A well-designed database should avoid these anomalies. This involves methods
such as normalizing your database to 3rd normal form (3NF) or Boyce-Codd
normal form (BCNF).

Understanding Mapping
Mapping in mathematics refers to a concept where each element of a set,
called the domain, is paired with an element of another set, known as
the range.

This process can also be understood as a function that "maps" one set to
another.

In simpler terms, for every input from the domain, the function assigns
precisely one output in the range.

For instance,
$y = f(x)$
In this equation, the function f is a mapping that maps x (element from the
domain) to y (element from the range).

Different Types of Mapping


Mapping is a fundamental concept in various academic disciplines. Here are
the main types:

1. Mathematical Mappings: Here, a function or a 'map' is a relation from a set
of inputs to a set of possible outputs. So, each input is related to exactly
one output.

2. Geographical Mappings: In terms of physical geography, this refers to
cartographic representations of geographical features.

3. Genomic Mappings: This applies to the field of genetics. It involves
techniques to locate genes or gene sequences on a chromosome.

4. Concept Mappings: Used in teaching and learning, these are graphical
tools to organize and represent knowledge.

5. Texture Mappings: In computer graphics, texture mapping is a technique to
add detail, surface texture, or color to a computer-generated graphic or 3D
model.


ACID Property of Transaction


The ACID (Atomicity, Consistency, Isolation, Durability) properties of a
transaction are a set of properties that guarantee that database transactions
are processed reliably.

1. Atomicity: This ensures that either all operations within a transaction
complete successfully or none of them take effect; if any operation fails,
the transaction is aborted and its changes are rolled back.

Example: If a bank transfer is made from account A to B, it is
essential that both the debit in A and the credit in B happen atomically.

2. Consistency: This guarantees that a transaction brings the database from
one valid state to another.

Example: In a college database, the grades for a student cannot be
entered if he/she is not enrolled in the course, maintaining the
consistency of the database.

3. Isolation: Ensures that the concurrent execution of transactions results in a
system state that would be obtained if the transactions were executed
sequentially.

Example: In an online shopping situation, when two users try to
purchase the last item at the same time, the transactions should behave as
if they were done sequentially.

4. Durability: Once a transaction has been committed, it will remain so, even
in the event of power loss, crashes, or errors.

Example: Once a transaction that saves a document has committed, the
document remains stored even if the computer suddenly crashes.

Each property adds robustness to the system and ensures the data integrity
and reliability of transactions.
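
Atomicity can be demonstrated with the bank transfer example, using Python's built-in `sqlite3` module. This is a sketch under assumptions: the Accounts table, the balances, and the `transfer` and `balances` helpers are invented for the example, and a CHECK constraint is used to make the second transfer fail partway through.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Accounts (Name TEXT PRIMARY KEY, Balance INTEGER CHECK (Balance >= 0))")
conn.execute("INSERT INTO Accounts VALUES ('A', 100), ('B', 50)")
conn.commit()

def transfer(conn, source, target, amount):
    """Move amount from source to target; both updates commit or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute("UPDATE Accounts SET Balance = Balance + ? WHERE Name = ?", (amount, target))
            conn.execute("UPDATE Accounts SET Balance = Balance - ? WHERE Name = ?", (amount, source))
        return True
    except sqlite3.IntegrityError:
        return False

def balances(conn):
    return dict(conn.execute("SELECT Name, Balance FROM Accounts ORDER BY Name"))

print(transfer(conn, "A", "B", 30), balances(conn))   # True {'A': 70, 'B': 80}
print(transfer(conn, "A", "B", 500), balances(conn))  # False {'A': 70, 'B': 80}
```

In the second call the credit to B succeeds first, but the debit would make A's balance negative, so the whole transaction is rolled back and B's credit disappears with it.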

Difference between DDL and DML


Data Definition Language (DDL)

DDL stands for Data Definition Language.

It is used to create and modify the structure of database objects in the
database.

DDL is a set of SQL commands used to create, modify, and delete database
structures but not data.

Examples of DDL commands include CREATE , ALTER , and DROP .

Data Manipulation Language (DML)


DML stands for Data Manipulation Language.

It is used to retrieve, store, modify, delete, insert and update data in the
database.

DML is a set of SQL commands used to manipulate data, not structures.

Examples of DML commands include SELECT , INSERT , UPDATE and DELETE .

In summary, the key difference between DDL and DML is that DDL is used to
manipulate the database structure, while DML is used to manipulate the data
within the database.

Concurrency Control Techniques


Concurrency control techniques are vital for maintaining correctness in
transactions in a database system. Here are a few methods used:

1. Lock-based protocols
A simple yet effective strategy is to lock the part of the database that a
transaction is accessing. For example, when Transaction A is updating
Data Item X, it locks X. Another Transaction B cannot access X until
Transaction A releases the lock.

2. Timestamp-based protocols
In this type of protocol, each transaction is assigned a timestamp when it
starts, and conflicting operations are ordered by timestamp. For instance,
if Transaction A's timestamp is earlier than Transaction B's, A takes
precedence over B when their operations conflict.

3. Validation-based protocols
With this protocol, transactions are executed without checks, but at commit
time, validation is done to prevent incorrect outcomes. When Transaction A
comes to commit, it is validated against overlapping transactions; if A read
data that an overlapping Transaction B has since committed changes to, A
fails validation and is rolled back.

4. Multiversion Concurrency Control (MVCC)
MVCC keeps multiple versions of each data item so that readers do not have
to wait for writers. Example: If a write is performed by Transaction A on
Data Item X, Transaction B can still read the older version of X while A's
operation is yet to commit.

5. Optimistic Concurrency Control
These protocols assume conflicts are rare and proceed without
obtaining locks. However, if a conflict is detected at transaction end, a
rollback will occur, and the transaction is restarted. Validation-based
protocols are one form of optimistic concurrency control.
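
One common way to detect the conflict at commit time is to keep a version number with each data item. The sketch below is a toy model of that idea, not the mechanism of any particular DBMS; the class `VersionedStore`, the key `stock` and the two competing purchases are assumptions made for illustration.

```python
class VersionedStore:
    """Toy key-value store where each item carries a version number."""

    def __init__(self):
        self.data = {}  # key -> (value, version)

    def read(self, key):
        return self.data.get(key, (None, 0))

    def commit(self, key, new_value, version_read):
        """Write only if nobody committed a newer version since we read."""
        _, current = self.read(key)
        if current != version_read:
            return False  # conflict detected: caller must retry
        self.data[key] = (new_value, current + 1)
        return True

store = VersionedStore()
store.commit("stock", 1, 0)  # initial value 1, version becomes 1

# Two transactions read the same version, then both try to buy the last item.
value_a, version_a = store.read("stock")
value_b, version_b = store.read("stock")
print(store.commit("stock", value_a - 1, version_a))  # True
print(store.commit("stock", value_b - 1, version_b))  # False
```

The second commit fails because the version changed after it was read, so that transaction has to restart, re-read the stock, and find that the item is gone.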

Security and Authorization


Security in information technology (IT) refers to the defense of digital
information and IT assets against internal and external, malicious and
accidental threats.
Authorization, on the other hand, is a security measure used to determine
user/client privileges or access levels related to system resources, including
computer programs, files, services, data and application features.

Key Points:
Security encompasses measures and controls that ensure confidentiality,
integrity, and availability of data.

Authorization is mainly about access control mechanisms such as
permission levels and privileges.

Remember, these two terms, although related, represent different aspects of
system protection in the field of IT.

Third Normal Form (3NF)


The Third Normal Form (3NF), introduced by E.F. Codd, is a rule in the
relational database model which ensures data integrity by removing any
transitive dependency.

Key rules for 3NF:

1. The table is in the Second Normal Form (2NF).

2. There are no transitive functional dependencies of non-key attributes on
the key (i.e., no non-key attribute C depends on the key A only through
another non-key attribute B, as in A→B and B→C).

Example:
Consider a table StudentCourseProfessor with the fields:

StudentID

CourseID

ProfessorID

ProfessorName

This table is not in 3NF because the ProfessorName depends on
the ProfessorID, which in turn depends on the CourseID. To make the table
3NF, we'd split it into two:
StudentCourseProfessor

StudentID

CourseID

ProfessorID

Professor

ProfessorID

ProfessorName

After the changes, ProfessorName is stored once per professor and no longer
depends transitively on the key, so the tables
do not have transitive functional dependencies.
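
The split can be checked on sample data: projecting the original rows onto the two new tables and joining them back on ProfessorID should reproduce the original rows. The sketch below assumes made-up rows and a hypothetical helper `project`; it is meant only to illustrate that the decomposition loses no information.

```python
# Hypothetical sample rows for the original StudentCourseProfessor table.
rows = [
    {"StudentID": 1, "CourseID": "CS101", "ProfessorID": 10, "ProfessorName": "Rao"},
    {"StudentID": 2, "CourseID": "CS101", "ProfessorID": 10, "ProfessorName": "Rao"},
    {"StudentID": 1, "CourseID": "MTH101", "ProfessorID": 20, "ProfessorName": "Lee"},
]

def project(relation, attrs):
    """Project onto attrs and drop duplicate tuples."""
    result = []
    for row in relation:
        reduced = {a: row[a] for a in attrs}
        if reduced not in result:
            result.append(reduced)
    return result

student_course_professor = project(rows, ["StudentID", "CourseID", "ProfessorID"])
professor = project(rows, ["ProfessorID", "ProfessorName"])

# Joining the two tables on ProfessorID gives back the original rows.
rejoined = [
    {**scp, **p}
    for scp in student_course_professor
    for p in professor
    if scp["ProfessorID"] == p["ProfessorID"]
]
print(professor)
print(rejoined == rows)  # True
```

In the original table the name "Rao" was stored twice; in the Professor table it appears once, so renaming a professor now means changing a single row.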
