DBMS
Concept of keys
(i) Super key (ii) Candidate key (iii) Primary key (iv) Foreign key
Ans:- Keys
o Keys play an important role in a relational database.
o A key is used to uniquely identify any record or row of data in a
table. Keys are also used to establish and identify relationships between
tables.
Types of keys:
1. Primary key
o The primary key is the key chosen to identify one and only one instance
of an entity. An entity can have several keys, as in the PERSON table;
the key most suitable from that list becomes the primary key.
o In the EMPLOYEE table, ID can be the primary key since it is unique
for each employee. We could instead select License_Number or
Passport_Number as the primary key, since they are also unique.
o For each entity, the choice of primary key depends on the requirements
and on the developers.
2. Candidate key
A candidate key is an attribute, or a minimal set of attributes, that can
uniquely identify a tuple. For example: In the EMPLOYEE table, ID is best
suited for the primary key. The remaining unique attributes, like SSN,
Passport_Number, and License_Number, are also candidate keys.
3. Super Key
A super key is a set of attributes that can uniquely identify a tuple. Every
super key is a superset of some candidate key.
4. Foreign key
o A foreign key is a column (or set of columns) of one table that refers
to the primary key of another table.
o Every employee works in a specific department in a company, and
employee and department are two different entities. So we should not
store the department's information in the employee table. Instead,
we link the two tables through the primary key of one of them.
o We add the primary key of the DEPARTMENT table, Department_Id,
as a new attribute in the EMPLOYEE table.
o In the EMPLOYEE table, Department_Id is the foreign key, and both
the tables are related.
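The EMPLOYEE and DEPARTMENT example above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the Name columns, the sample rows, and the helper insert_employee are assumptions made for the sketch, and SQLite enforces foreign keys only after the PRAGMA shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite checks foreign keys only when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.execute("""
    CREATE TABLE DEPARTMENT (
        Department_Id INTEGER PRIMARY KEY,
        Name TEXT NOT NULL
    )
""")
conn.execute("""
    CREATE TABLE EMPLOYEE (
        ID INTEGER PRIMARY KEY,               -- primary key
        Name TEXT NOT NULL,
        Passport_Number TEXT UNIQUE,          -- another candidate key
        Department_Id INTEGER REFERENCES DEPARTMENT(Department_Id)  -- foreign key
    )
""")

# Sample data (hypothetical values).
conn.execute("INSERT INTO DEPARTMENT VALUES (10, 'Accounts')")
conn.execute("INSERT INTO EMPLOYEE VALUES (1, 'Asha', 'P100', 10)")


def insert_employee(emp_id, name, passport, dept_id):
    """Return True if the row is accepted, False if a key constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?)",
            (emp_id, name, passport, dept_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

A row with a repeated ID, a repeated Passport_Number, or a Department_Id that does not exist in DEPARTMENT is rejected; any other row is accepted.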
Q2.ER-diagram?
Ans:- The ER model is used to model the logical view of a system from a
data perspective. It consists of these components:
Entity, Entity Type, Entity Set –
An Entity may be an object with a physical existence – a particular
person, car, house, or employee – or it may be an object with a
conceptual existence – a company, a job, or a university course.
An entity is an object of an entity type, and the set of all entities of
that type is called an entity set. For example, E1 is an entity of entity
type Student, and the set of all students is the entity set. In an ER
diagram, an entity type is represented by a rectangle.
Attribute(s):
Attributes are the properties which define the entity type. For
example, Roll_No, Name, DOB, Age, Address, and Mobile_No are the
attributes which define the entity type Student. In an ER diagram, an
attribute is represented by an oval.
1. Key Attribute –
The attribute which uniquely identifies each entity in the entity set is
called the key attribute. For example, Roll_No will be unique for each
student. In an ER diagram, a key attribute is represented by an oval with
the attribute name underlined.
2. Composite Attribute –
An attribute composed of several other attributes is called a composite
attribute. For example, the Address attribute of the Student entity type
consists of Street, City, State, and Country. In an ER diagram, a composite
attribute is represented by an oval connected to the ovals of its parts.
3. Multivalued Attribute –
An attribute that can hold more than one value for a given entity. For
example, Phone_No (can be more than one for a given student). In an ER
diagram, a multivalued attribute is represented by a double oval.
4. Derived Attribute –
An attribute which can be derived from other attributes of the entity
type is known as a derived attribute, e.g., Age (can be derived from
DOB). In an ER diagram, a derived attribute is represented by a dashed oval.
The complete entity type Student with its attributes can be represented
as: [Figure: ER diagram of the Student entity type with its attributes]
Q3. DML,DCL and DDL?
Ans:- Structured Query Language (SQL) is the database language with which
we can perform operations on an existing database, and which we can also
use to create a database. SQL uses commands like CREATE, DROP, INSERT,
etc. to carry out the required tasks.
These SQL commands are mainly categorized into four categories as:
1. DDL – Data Definition Language
2. DML – Data Manipulation Language
3. DCL – Data Control Language
4. TCL – Transaction Control Language
DDL
DDL is short for Data Definition Language, which deals with database schemas
and descriptions of how the data should reside in the database.
CREATE - create a database and its objects (tables, indexes, views,
stored procedures, functions, and triggers)
ALTER - alter the structure of the existing database
DROP - delete objects from the database
TRUNCATE - remove all records from a table, including all space allocated
for the records
COMMENT - add comments to the data dictionary
RENAME - rename an object
DML
DML is short for Data Manipulation Language, which deals with data
manipulation and includes the most common SQL statements such as SELECT,
INSERT, UPDATE, DELETE, etc. It is used to store, modify, retrieve,
delete, and update data in a database.
DCL
DCL is short for Data Control Language, which includes commands
such as GRANT and REVOKE and is mostly concerned with rights, permissions,
and other controls of the database system.
TCL
TCL is short for Transaction Control Language, which deals with
transactions within a database, for example COMMIT and ROLLBACK.
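As a rough illustration of three of these categories, the sketch below runs DDL, DML, and TCL statements through Python's built-in sqlite3 module. The student table and its values are made up for the example. DCL is left out because SQLite has no GRANT or REVOKE; those commands belong to server databases with user accounts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define and change the structure of the database.
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")
conn.execute("ALTER TABLE student ADD COLUMN city TEXT")

# DML: store and modify data.
conn.execute("INSERT INTO student VALUES (1, 'Asha', 'Pune')")
conn.execute("UPDATE student SET city = 'Delhi' WHERE roll_no = 1")

# TCL: make the changes so far permanent.
conn.commit()

# DML followed by TCL: the delete is undone by the rollback.
conn.execute("DELETE FROM student")
conn.rollback()

# DML: retrieve data.
rows = conn.execute("SELECT roll_no, name, city FROM student").fetchall()
```

After the rollback the table still holds the committed row with the updated city.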
Q4.Data Abstraction?
Ans:- Data abstraction in a DBMS refers to hiding unnecessary details from the
end user. Database systems have complex data structures and relationships. This
complexity is hidden so that users can readily access the data, and only the
relevant part of the database is made accessible to them through data
abstraction. Let's understand this with an example.
Example: If we want to retrieve an email from Gmail, we don't know where that
data is physically kept, such as in India or the United States, or what data model was
used to store it. These things are not essential to us. Only our email is of interest
to us.
Levels of Data abstractions in DBMS
Dividing a DBMS into levels of abstraction reduces the complexity that users face
and helps make the system efficient.
Now let's look at the levels of data abstraction in a DBMS and discuss them in detail.
1. Physical level
It is the lowest level of abstraction in a DBMS, defining how data is stored, the data
structures used for storing it, and the database access mechanisms.
Developers or database application programmers decide how to store data in the
database. It is the most complex level to understand.
2. Logical level
The logical level is the next higher level or intermediate level. It explains what data is
stored in the database and how those data are related. It seeks to explain the
complete or entire data by describing what tables should be constructed and what
the linkages between those tables should be. It is less complex than the physical
level.
3. View level
This is the top level. There are various views at the view level, with each view
defining only a portion of the total data. It also facilitates user interaction by
providing multiple views of a single database. All users have
access to the view level. This is the simplest level.
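The view level can be illustrated with an SQL view. The sketch below, using Python's sqlite3 module, assumes a hypothetical employee table with a salary column and a view named employee_directory that exposes only part of it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical level: the full table, including a column not every user should see.
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Asha", 50000), (2, "Ravi", 42000)],
)

# View level: a view that exposes only a portion of the data.
conn.execute("CREATE VIEW employee_directory AS SELECT id, name FROM employee")

cur = conn.execute("SELECT * FROM employee_directory ORDER BY id")
view_columns = [d[0] for d in cur.description]  # column names visible through the view
view_rows = cur.fetchall()
```

A user who queries employee_directory sees names and ids but has no access to salary through it, and does not need to know how the table is stored.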
Q5.Specialization,Aggregation,Generalisation?
Ans:- Generalization –
Generalization is the process of extracting common properties from a
set of entities and creating a generalized entity from them. It is a bottom-up
approach in which two or more entities can be generalized to a higher
level entity if they have some attributes in common. For Example,
STUDENT and FACULTY can be generalized to a higher level entity
called PERSON as shown in Figure 1. In this case, common attributes
like P_NAME, P_ADD become part of higher entity (PERSON) and
specialized attributes like S_FEE become part of specialized entity
(STUDENT).
Specialization –
In specialization, an entity is divided into sub-entities based on their
characteristics. It is a top-down approach where higher level entity is
specialized into two or more lower level entities. For Example,
EMPLOYEE entity in an Employee management system can be
specialized into DEVELOPER, TESTER etc. as shown in Figure 2. In
this case, common attributes like E_NAME, E_SAL etc. become part of
higher entity (EMPLOYEE) and specialized attributes like TES_TYPE
become part of specialized entity (TESTER).
Aggregation –
An ER diagram cannot represent a relationship between an
entity and a relationship, which may be required in some scenarios. In
those cases, a relationship with its corresponding entities is aggregated
into a higher level entity. Aggregation is an abstraction through which
we can represent relationships as higher level entity sets.
For Example, an employee working for a project may require some
machinery. So, a REQUIRE relationship is needed between the relationship
WORKS_FOR and the entity MACHINERY. Using aggregation, the
WORKS_FOR relationship with its entities EMPLOYEE and PROJECT
is aggregated into a single entity, and the relationship REQUIRE is created
between the aggregated entity and MACHINERY.
Q6.Strong and weak entity set?
Ans:-
Strong Entity:
A strong entity is not dependent on any other entity in the schema. A
strong entity will always have a primary key. Strong entities are
represented by a single rectangle. The relationship of two strong
entities is represented by a single diamond.
Various strong entities, when combined together, create a strong entity
set.
Weak Entity:
A weak entity is dependent on a strong entity to ensure its existence.
Unlike a strong entity, a weak entity does not have any primary key. It
instead has a partial discriminator key. A weak entity is represented by
a double rectangle.
The relation between one strong and one weak entity is represented by
a double diamond. This relationship is also known as identifying
relationship.
Difference between Strong and Weak Entity:
Ans:-
Data Independence
o Data independence can be explained using the three-schema
architecture.
o Data independence refers to the ability to modify the
schema at one level of the database system without altering the
schema at the next higher level.
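1. Select Operation:
Notation: σ p(r)
Where:
σ is the selection operator, r is the relation, and p is the selection
predicate.
2. Project Operation:
Notation: ∏ A1, A2, ..., An (r)
Where
A1, A2, ..., An are attribute names of the relation r.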
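3. Union Operation:
o Suppose there are two relations R and S. The union operation contains
all the tuples that are either in R or S or both in R & S.
o It eliminates the duplicate tuples. It is denoted by ∪.
Notation: R ∪ S
4. Set Intersection:
o It contains all the tuples that are in both R and S. It is denoted by ∩.
Notation: R ∩ S
5. Set Difference:
o It contains all the tuples that are in R but not in S.
Notation: R - S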
6. Cartesian product
o The Cartesian product is used to combine each row in one table with
each row in the other table. It is also known as a cross product.
o It is denoted by X.
Notation: E X D
7. Rename Operation:
The rename operation is used to rename the output relation. It is denoted
by rho (ρ).
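The operations above can be tried out on small relations. The sketch below is one possible way to model them in Python, treating a relation as a set of tuples; the helper names select, project, and cartesian and the sample relations R and S are assumptions for illustration. Attributes are identified by position here, so the rename operation has no direct counterpart.

```python
from itertools import product


def select(relation, predicate):
    """Selection: keep only the tuples that satisfy the predicate."""
    return {t for t in relation if predicate(t)}


def project(relation, *positions):
    """Projection: keep only the listed columns; the set removes duplicates."""
    return {tuple(t[i] for i in positions) for t in relation}


def cartesian(r, s):
    """Cartesian product: pair every tuple of r with every tuple of s."""
    return {a + b for a, b in product(r, s)}


# Two union-compatible relations with columns (id, name).
R = {(1, "Asha"), (2, "Ravi")}
S = {(2, "Ravi"), (3, "Mina")}

union = R | S          # tuples in R or S or both, without duplicates
intersection = R & S   # tuples in both R and S
difference = R - S     # tuples in R but not in S
```

Because Python sets discard duplicates automatically, union and project behave like their relational counterparts without extra work.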
Q2.SQL command?
As the name suggests, Structured Query Language (SQL) is used when we have
structured data (in the form of tables). Databases that are not relational
(that is, do not use fixed-structure tables to store data) and therefore do
not use SQL are called NoSQL databases. Examples of NoSQL databases are
MongoDB, DynamoDB, Cassandra, etc.
SQL Command        Description
CREATE DATABASE    Creates a new database
The SQL commands above are used often, and every SQL statement starts with
one of these commands.
Q2.Triggers in SQL?
Ans:- A trigger is a statement that the system executes automatically when
there is a modification to the database. In a trigger, we first specify when
the trigger is to be executed and then the action to be performed when the
trigger executes. Triggers are used to specify certain integrity constraints
and referential constraints that cannot be specified using the constraint
mechanism of SQL.
Example –
Suppose we are adding a tuple to the ‘Donors’ table, that is, some person has
donated blood. We can design a trigger that will automatically add the
value of donated blood to the ‘Blood_record’ table.
Types of Triggers –
We can define 6 types of triggers for each table:
1. AFTER INSERT activated after data is inserted into the table.
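The Donors example can be sketched as an AFTER INSERT trigger in SQLite, run here through Python's sqlite3 module. The column names (donor_id, name, units) and the trigger name record_donation are assumptions; the notes only name the two tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Donors (donor_id INTEGER PRIMARY KEY, name TEXT, units INTEGER)"
)
conn.execute("CREATE TABLE Blood_record (donor_id INTEGER, units INTEGER)")

# When: after a row is inserted into Donors.
# Action: copy the donated amount into Blood_record.
conn.execute("""
    CREATE TRIGGER record_donation
    AFTER INSERT ON Donors
    BEGIN
        INSERT INTO Blood_record (donor_id, units)
        VALUES (NEW.donor_id, NEW.units);
    END
""")

# Only Donors is written to directly; the trigger fills Blood_record.
conn.execute("INSERT INTO Donors VALUES (1, 'Asha', 2)")
records = conn.execute("SELECT donor_id, units FROM Blood_record").fetchall()
```

The application never inserts into Blood_record itself; the database does it whenever a donor row is added.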
Q4. Referential integrity?
The table from which the values are derived is known as the master or
referenced table, and the table in which values are inserted accordingly
is known as the child or referencing table. In other words, the table
containing the foreign key is called the child table, and the
table containing the primary key/candidate key is called
the referenced or parent table. In the relational model, a candidate key
is a minimal set of attributes that uniquely identifies a tuple.
Here column Roll is acting as the primary key, which will help in deriving the
value of the foreign key in the child table.
Functional Dependency
The functional dependency is a relationship that exists between two attributes. It
typically exists between the primary key and a non-key attribute within a table.
X → Y
The left side of the FD is known as the determinant, and the right side is known as
the dependent.
For example:
Here Emp_Id attribute can uniquely identify the Emp_Name attribute of employee
because if we know the Emp_Id, we can tell that employee name associated with it.
1. Emp_Id → Emp_Name
Example:
1. ID → Name,
2. Name → DOB
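Whether a dependency X → Y holds in a given set of rows can be checked mechanically: every value of X must always appear with the same value of Y. A minimal sketch follows, with a hypothetical helper holds and made-up employee rows (Emp_City is not in the notes). Sample data can only refute a dependency; whether it holds in general depends on the meaning of the attributes.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)      # value of the determinant
        value = tuple(row[a] for a in rhs)    # value of the dependent
        # The first time a key is seen, remember its value; afterwards compare.
        if seen.setdefault(key, value) != value:
            return False
    return True


employees = [
    {"Emp_Id": 1, "Emp_Name": "Asha", "Emp_City": "Pune"},
    {"Emp_Id": 2, "Emp_Name": "Ravi", "Emp_City": "Pune"},
    {"Emp_Id": 3, "Emp_Name": "Asha", "Emp_City": "Delhi"},
]
```

In this data Emp_Id → Emp_Name holds, while Emp_Name → Emp_City does not, because the name Asha appears with two different cities.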
Multivalued Dependency
o Multivalued dependency occurs when two attributes in a table are
independent of each other but, both depend on a third attribute.
o A multivalued dependency consists of at least two attributes that
are dependent on a third attribute that's why it always requires at
least three attributes.
1. BIKE_MODEL → → MANUF_YEAR
2. BIKE_MODEL → → COLOR
Q2.Normalisation and normal form?
Ans:-
Normalization
A large database defined as a single relation may result in data duplication. This
repetition of data may cause several problems.
So to handle these problems, we should analyze and decompose the relations with
redundant data into smaller, simpler, and well-structured relations that satisfy
desirable properties. Normalization is a process of decomposing the relations into
relations with fewer attributes.
What is Normalization?
o Normalization is the process of organizing the data in the database.
o Normalization is used to minimize the redundancy from a relation or set of relations. It is
also used to eliminate undesirable characteristics like Insertion, Update, and Deletion
Anomalies.
o Normalization divides the larger table into smaller and links them using relationships.
o The normal form is used to reduce redundancy from the database table.
The main reason for normalizing the relations is removing these anomalies. Failure to
eliminate anomalies leads to data redundancy and can cause data integrity and other
problems as the database grows. Normalization consists of a series of guidelines that
help to guide you in creating a good database structure.
o Insertion Anomaly: An insertion anomaly occurs when one cannot insert a new tuple into
a relation due to lack of data.
o Deletion Anomaly: A deletion anomaly occurs when the deletion of
data results in the unintended loss of some other important data.
o Update Anomaly: An update anomaly occurs when an update of a single data value
requires multiple rows of data to be updated.
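To make the idea of decomposition concrete, the sketch below splits one redundant relation into two smaller ones and joins them back. The relation emp_dept and its values are invented for the illustration; it is not a full normalization procedure.

```python
# One wide relation: the department name is repeated for every employee.
emp_dept = [
    (1, "Asha", 10, "Accounts"),
    (2, "Ravi", 10, "Accounts"),
    (3, "Mina", 20, "Sales"),
]

# Decompose into two relations linked by dept_id.
employee = {(emp_id, name, dept_id) for emp_id, name, dept_id, _ in emp_dept}
department = {(dept_id, dept_name) for _, _, dept_id, dept_name in emp_dept}

# Joining the two relations on dept_id gives back the original rows.
rejoined = {
    (emp_id, name, dept_id, dept_name)
    for emp_id, name, dept_id in employee
    for d_id, dept_name in department
    if d_id == dept_id
}
```

After the split each department name is stored once, so renaming a department touches a single row, and a department can be recorded even before it has employees.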
Normal Form        Description
Advantages of Normalization
o Normalization helps to minimize data redundancy.
o Greater overall database organization.
o Data consistency within the database.
o Much more flexible database design.
o Enforces the concept of relational integrity.
Disadvantages of Normalization
o You cannot start building the database before knowing what the user needs.
o The performance degrades when normalizing the relations to higher normal forms, i.e.,
4NF, 5NF.
o It is very time-consuming and difficult to normalize relations of a higher degree.
o Careless decomposition may lead to a bad database design, leading to serious problems.
Unit-4
Q1.Short note
(i)Transaction processing and state diagram
Ans:- (i)Transaction
o A transaction is a set of logically related operations. It contains a
group of tasks.
o A transaction is an action or series of actions performed by a
single user to access the contents of the
database.
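Example: Transferring 800 from X's account to Y's account consists of the
following tasks.
X's Account
1. Open_Account(X)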
2. Old_Balance = X.balance
3. New_Balance = Old_Balance - 800
4. X.balance = New_Balance
5. Close_Account(X)
Y's Account
1. Open_Account(Y)
2. Old_Balance = Y.balance
3. New_Balance = Old_Balance + 800
4. Y.balance = New_Balance
5. Close_Account(Y)
Operations of Transaction:
Following are the main operations of transaction:
Read(X): The read operation reads the value of X from the database
and stores it in a buffer in main memory.
Write(X): The write operation writes the value back to the database
from the buffer.
For example, consider a debit transaction on an account, and assume the
value of X is 4000 before the transaction starts:
1. R(X);
2. X = X - 500;
3. W(X);
o The first operation reads X's value from database and stores it in a
buffer.
o The second operation will decrease the value of X by 500. So buffer
will contain 3500.
o The third operation will write the buffer's value to the database. So
X's final value will be 3500.
However, a hardware, software, or power failure may cause the
transaction to fail before it has finished all the operations in
the set.
For example: If the debit transaction above fails after
executing operation 2, then X's value will remain 4000 in the database,
which is not acceptable to the bank.
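The transfer between X's and Y's accounts can be sketched as a single transaction that is either committed as a whole or rolled back. The example below uses Python's sqlite3 module; the account table, Y's starting balance of 1000, and the fail_after_debit flag (which simulates a crash between the two updates) are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("X", 4000), ("Y", 1000)])
conn.commit()


def transfer(src, dst, amount, fail_after_debit=False):
    """Debit src and credit dst as one unit; undo everything if any step fails."""
    try:
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src)
        )
        if fail_after_debit:
            # Simulated failure between the debit and the credit.
            raise RuntimeError("simulated crash")
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst)
        )
        conn.commit()      # both updates become permanent together
        return True
    except Exception:
        conn.rollback()    # the partial debit is undone
        return False


def balances():
    """Return the current balances as a dictionary."""
    return dict(conn.execute("SELECT name, balance FROM account"))
```

If the failure happens after the debit, the rollback restores X's balance, so the database never shows money that has left X without reaching Y.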
Q3.ACID properties?
Ans:-
In this section, we will learn about the ACID properties: what they stand
for and what each property is used for. We will also understand the ACID
properties with the help of some examples.
ACID Properties
The term ACID stands for: Atomicity, Consistency, Isolation, and Durability.
Q4.Deadlock handling prevention?
Ans:- Deadlock is a situation where a set of processes is blocked because
each process is waiting for a resource that is held by another waiting
process. It is an undesirable state of the system. The following four
conditions must hold simultaneously for a deadlock to occur.
1. Mutual Exclusion – A resource can be used by only one process
at a time. If another process requests for that resource then the
requesting process must be delayed until the resource has been
released.
2. Hold and wait – Some processes must be holding some resources
in non shareable mode and at the same time must be waiting to
acquire some more resources, which are currently held by other
processes in non-shareable mode.
3. No pre-emption – Resources granted to a process can be released
back to the system only as a result of voluntary action of that
process, after the process has completed its task.
4. Circular wait – Deadlocked processes are involved in a circular
chain such that each process holds one or more resources being
requested by the next process in the chain.
Methods of handling deadlocks : There are three approaches to deal with
deadlocks.
1. Deadlock Prevention
2. Deadlock avoidance
3. Deadlock detection
These are explained below.
1. Deadlock Prevention : The strategy of deadlock prevention is
to design the system in such a way that the possibility of
deadlock is excluded. Indirect methods prevent the occurrence of
one of three necessary conditions of deadlock, i.e., mutual
exclusion, no pre-emption, and hold and wait. The direct method
prevents the occurrence of circular wait.
Prevention techniques –
Mutual exclusion – is supported by the OS.
Hold and wait – this condition can be prevented by requiring that a
process requests all its required resources at one time, and blocking
the process until all of its requests can be granted simultaneously.
But this prevention does not yield good results.
Q5. conflict serializability and view serializability?
Ans:-
As discussed in Concurrency control, serial schedules have less resource
utilization and low throughput. To improve it, two or more transactions are
run concurrently. But concurrency of transactions may lead to inconsistency
in database. To avoid this, we need to check whether these concurrent
schedules are serializable or not.
Conflict Serializable: A schedule is called conflict serializable if it can be
transformed into a serial schedule by swapping non-conflicting operations.
Conflicting operations: Two operations are said to be conflicting if all
of the following conditions hold:
o They belong to different transactions
o They operate on the same data item
o At least one of them is a write operation
Example: –
o (R1(A), W2(A)) is a conflicting pair because the operations belong to
two different transactions, act on the same data item A, and one of them
is a write operation.
o Similarly, the pairs (W1(A), W2(A)) and (W1(A), R2(A)) are
also conflicting.
o On the other hand, the pair (R1(A), W2(B)) is non-conflicting because
the operations act on different data items.
o Similarly, the pair (W1(A), W2(B)) is non-conflicting.
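The three conditions translate directly into code. The sketch below also tests a schedule for conflict serializability using a precedence graph, a standard technique that these notes do not describe: an edge Ti → Tj is added whenever an operation of Ti conflicts with a later operation of Tj, and the schedule is conflict serializable exactly when the graph has no cycle. The tuple encoding (transaction, "R" or "W", item) and the function names are assumptions.

```python
def conflicts(op1, op2):
    """Different transactions, same data item, and at least one write."""
    (t1, kind1, item1), (t2, kind2, item2) = op1, op2
    return t1 != t2 and item1 == item2 and "W" in (kind1, kind2)


def is_conflict_serializable(schedule):
    """Build the precedence graph of the schedule and check it for cycles."""
    edges = {}
    for i, op1 in enumerate(schedule):
        for op2 in schedule[i + 1:]:
            if conflicts(op1, op2):
                # op1's transaction must come before op2's in any equivalent serial order.
                edges.setdefault(op1[0], set()).add(op2[0])

    visiting, done = set(), set()

    def has_cycle(node):
        """Depth-first search; reaching a node still being visited means a cycle."""
        if node in done:
            return False
        if node in visiting:
            return True
        visiting.add(node)
        found = any(has_cycle(nxt) for nxt in edges.get(node, ()))
        visiting.discard(node)
        done.add(node)
        return found

    return not any(has_cycle(t) for t in list(edges))
```

For instance, the schedule R1(A), W2(A), W1(A) produces edges T1 → T2 and T2 → T1, so it is not conflict serializable.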
Unit-5
Q1.Concurrency control-timestamp?
Ans:-
2PL locking protocol
Every transaction will lock and unlock the data item in two different
phases.
Growing Phase − Locks are acquired in this phase and no lock
is released. Once the transaction has acquired all the locks
it needs, the second phase (shrinking phase)
starts.
Shrinking phase − No new locks are acquired in this phase. The
changes to data items are stored and then the locks are
released.
The 2PL locking protocol is represented diagrammatically as follows −
In the growing phase the transaction reaches a point where all the locks it
may need have been acquired. This point is called the LOCK POINT.
After the lock point has been reached, the transaction enters the shrinking
phase.
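The two-phase rule can be checked on a transaction's sequence of lock and unlock requests: once any lock has been released, no further lock may be requested. A minimal sketch, assuming each request is written as a pair ("lock" or "unlock", item); the function names are hypothetical.

```python
def follows_two_phase_locking(operations):
    """Return True if no lock is requested after the first unlock."""
    shrinking = False
    for action, _item in operations:
        if action == "unlock":
            shrinking = True            # the shrinking phase has begun
        elif action == "lock" and shrinking:
            return False                # a lock in the shrinking phase breaks 2PL
    return True


def lock_point(operations):
    """Return the index of the last lock request, or None if there is none."""
    positions = [i for i, (action, _item) in enumerate(operations) if action == "lock"]
    return positions[-1] if positions else None


two_phase = [("lock", "A"), ("lock", "B"), ("unlock", "A"), ("unlock", "B")]
not_two_phase = [("lock", "A"), ("unlock", "A"), ("lock", "B"), ("unlock", "B")]
```

In the first sequence the lock point is the request for B; the second sequence releases A before locking B, so it does not follow the protocol.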
Types
Two phase locking is of two types −
Strict two phase locking protocol
A transaction can release a shared lock after the lock point, but it cannot
release any exclusive lock until the transaction commits. This protocol
creates a cascadeless schedule.
Cascading schedule: In this schedule one transaction is dependent on
another transaction, so if one has to roll back then the other has to
roll back too.
Example
Let T1 and T2 be two transactions.
T1=A+B and T2=B+A
T1            T2
Lock-X(A)     Lock-X(B)
Read A;       Read B;
Lock-X(B)     Lock-X(A)
Here,
Lock-X(B) : T1 cannot execute Lock-X(B) since B is locked by T2.
Lock-X(A) : T2 cannot execute Lock-X(A) since A is locked by T1.
In the above situation T1 waits for B and T2 waits for A. The waiting
never ends. Neither transaction can proceed unless one of them
releases its lock voluntarily. This situation is called deadlock.
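A deadlock like this one shows up as a cycle in a wait-for graph, where each waiting transaction points to the transaction holding the lock it needs. The sketch below assumes, for simplicity, that every transaction waits for at most one other; the function name find_deadlock and the dictionary encoding are hypothetical.

```python
def find_deadlock(waits_for):
    """Return the transactions forming a cycle in the wait-for graph, or None."""
    for start in waits_for:
        path = []
        node = start
        # Follow the chain of waits until it ends or revisits a transaction.
        while node in waits_for and node not in path:
            path.append(node)
            node = waits_for[node]
        if node in path:
            return path[path.index(node):]   # the part of the path that loops
    return None


# T1 holds A and waits for B (held by T2); T2 holds B and waits for A (held by T1).
deadlocked = {"T1": "T2", "T2": "T1"}
# T1 waits for T2, but T2 is not waiting for anyone.
no_deadlock = {"T1": "T2"}
```

A system using deadlock detection would run a check like this periodically and roll back one transaction from the cycle it finds.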