DBMS Unit 1
Database Management Systems (DBMS) have been around for several decades, and their
history can be traced back to the early 1960s. In the early days, computer systems were
designed to manage data in a hierarchical or navigational manner, where data was stored in a
tree-like structure. This method of storing data was inefficient and difficult to use, as it
required a lot of manual effort to access and manage the data.
In the early 1960s, Charles Bachman designed the first general-purpose DBMS,
called the Integrated Data Store (IDS), which was based on the network data model.
For this work he received the Turing Award (the most prestigious award in
Computer Science, often described as its equivalent of the Nobel Prize).
In 1970, Edgar Codd proposed a new data representation framework called
the Relational Database Model, and he won the 1981 Turing Award for this
seminal work. This model was based on the concept of a table, with rows representing
individual records and columns representing individual fields within those records. The
relational model allowed for more efficient storage and retrieval of data and was easier to use
than the hierarchical or navigational models.
In the 1970s, IBM developed the Structured Query Language (SQL) for relational
databases, as part of its System R project. This system was designed to manage large
amounts of data and was used primarily in corporate and government applications.
SQL was later adopted as a standard by the American National Standards Institute
(ANSI) and the International Organization for Standardization (ISO).
In the 1980s, several new DBMS products were introduced, including Oracle, Sybase, and
Microsoft SQL Server. These systems were designed to be more user-friendly and to support
more advanced data modeling and query languages.
In the 1990s, object-oriented DBMS (OODBMS) emerged, which were designed to store
and manage complex data structures, such as multimedia and other types of non-traditional
data. These systems were initially popular in research and academic environments, but their
adoption was limited in the commercial sector.
In 1991, Microsoft shipped MS Access, a personal DBMS that displaced most other
personal DBMS products.
In 1997, XML was applied to database processing, and many vendors began to
integrate XML into their DBMS products.
In the 2000s, web-based applications and cloud computing became more popular, and DBMS
systems began to adapt to these new technologies. New DBMS systems were developed to
support distributed and web-based applications, including NoSQL databases such as
MongoDB and Cassandra.
Today, DBMS systems continue to evolve, with an emphasis on scalability, performance, and
support for cloud-based applications. Some of the most popular DBMS systems in use today
include Oracle, Microsoft SQL Server, MySQL, PostgreSQL, and MongoDB.
What is DBMS?
A Database Management System (DBMS) is software for storing and retrieving users' data
while considering appropriate security measures. It consists of a group of programs that
manipulate the database. The DBMS accepts the request for data from an application and
instructs the DBMS engine to provide the specific data. In large systems, a DBMS helps
users and other third-party software to store and retrieve data.
File System vs. Database Systems

File System: A file system is software that manages and organizes files in a storage medium; it controls how data is stored and retrieved.
DBMS: A DBMS is a software application used for accessing, creating, and managing databases.

File System: Exposes the details of data representation and storage of data.
DBMS: Gives an abstract view of data that hides those details.

File System: Storing and retrieving data cannot be done efficiently.
DBMS: Efficient to use, as there is a wide variety of methods to store and retrieve data.

File System: Used where the demand for security constraints is low.
DBMS: Used when security constraints are high.

File System: Defines data in an unstructured manner; data is usually in isolated form.
DBMS: Defines data in a structured manner, with well-defined correlations among the data.

File System: Data inconsistency is higher.
DBMS: Data inconsistency is lower.

File System: The user locates the physical address of a file to access the data.
DBMS: The user does not need to know the physical address of the data.

File System: No ability to access the data concurrently.
DBMS: Data can be accessed concurrently.

File System: Does not provide support for complicated transactions.
DBMS: Complicated transactions are easy to implement.

File System: Does not offer backup and recovery of data if it is lost.
DBMS: Provides backup and recovery of data even if it is lost.
Disadvantages of the file processing system:
Each application has its own data file, so the same data may have to be recorded and
stored many times.
Data dependence: programs in a file processing system are tied to the file format, so a
change in the format requires changes to the programs.
Limited data sharing.
Problems with security.
Time-consuming.
Maintaining the records of a big firm with a large number of items is difficult.
Requires a lot of manual labor.
Types of data models:
1. Hierarchical Model
2. Network Model
3. Entity-Relationship Model
4. Relational Model
5. Object-Based Data Model
6. Semi-structured Data model
1. Hierarchical Model
Hierarchical Model was the first DBMS model. This model organises the data in the
hierarchical tree structure.
The hierarchy starts from the root which has root data and then it expands in the form of a
tree adding child node to the parent node. This model easily represents some of the real-world
relationships like food recipes, sitemap of a website etc.
Depicts a set of one-to-many (1:M) relationships
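The parent-child tree described above can be sketched in a few lines of Python. The college and department names below are made-up illustrative data, not from the text; the point is that every record is reached by navigating down from the root.

```python
# A minimal sketch of the hierarchical model using nested Python dicts.
# "College", "CSE Dept", etc. are invented example data.
root = {
    "name": "College",
    "children": [
        {"name": "CSE Dept", "children": [
            {"name": "Student A", "children": []},
            {"name": "Student B", "children": []},
        ]},
        {"name": "ECE Dept", "children": []},
    ],
}

def find(node, target):
    """Navigate the tree from the root, as hierarchical DBMSs did."""
    if node["name"] == target:
        return node
    for child in node["children"]:
        hit = find(child, target)
        if hit:
            return hit
    return None

print(find(root, "Student B")["name"])  # reachable only via its parent path
```

Note that each child hangs under exactly one parent, which is what makes the model one-to-many (1:M).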
2. Network Model
This model is an extension of the hierarchical model, the only difference is that a record can
have more than one parent. It replaces the hierarchical tree with a graph.
The network model was created to represent complex data relationships more effectively
when compared to hierarchical models, to improve database performance and standards.
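The difference from the tree can be sketched as a graph in which one record lists several parents. The order/customer/product names here are hypothetical illustration, not from the text.

```python
# Sketch of the network model: a record may have more than one parent,
# so the structure is a graph rather than a tree. Invented example data.
parents = {
    "Order#1": ["Customer: Asha", "Product: Pen"],   # two parent records
    "Order#2": ["Customer: Asha", "Product: Book"],
}

# In the hierarchical model "Order#1" could hang under only ONE parent;
# here it is linked from both its customer and its product.
for record, owners in parents.items():
    print(record, "->", owners)
```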
3. Entity-Relationship Model
An ER model is the logical representation of data as objects and relationships among them.
These objects are known as entities, and relationship is an association among these entities.
1. Entities − It is a real-world thing which can be a person, place, or even a concept. For
Example: Department, Admin, Courses, Teachers, Students, Building, etc are some of
the entities of a School Management System.
2. Attributes − An attribute is a real-world property of an entity. For
Example: The entity employee has properties like employee id, salary, age, etc.
3. Relationship − A relationship tells how two entities are associated. For Example:
An employee works for a department.
An entity has a real-world property called attribute and these attributes are defined by a set of
values called domain.
Relational Model
The relational model uses a collection of tables to represent both data and the
relationships among them. Tables are also known as relations. Each column of a table
represents an attribute; attributes are the properties that define a relation. Each row
of the table represents a tuple; a tuple is one piece of information (a record).
Tables: relations are saved in table format. A table has two properties: rows and
columns.
Attribute: each column represents an attribute.
Tuple: each row represents a tuple.
Relation Schema: A relation schema represents the name of the relation with its
attributes.
Degree: The total number of attributes in the relation is called the degree of the
relation.
Cardinality: Total number of rows present in the Table.
Column: The column represents the set of values for a specific attribute.
Relation instance: The set of tuples of a relation at a particular instance of time is
called as relation instance.
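The vocabulary above can be made concrete with a toy relation in plain Python. The schema and employee data are hypothetical examples, not from any real database.

```python
# Relational-model vocabulary on a toy relation, using plain Python.
schema = ("emp_id", "emp_name", "salary")    # relation schema: name + attributes
employee = [                                 # a relation instance: a set of tuples
    (1, "Asha", 50000),
    (2, "Ravi", 62000),
    (3, "Meena", 58000),
]

degree = len(schema)           # number of attributes in the relation
cardinality = len(employee)    # number of tuples (rows) in the instance
salary_column = [row[2] for row in employee]  # a column: values of one attribute

print("degree:", degree)            # 3
print("cardinality:", cardinality)  # 3
```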
Disadvantages of the relational model:
The relational model requires powerful hardware and large data storage devices.
May lead to slower processing time.
Poorly designed systems lead to poor implementation of database systems.
Object-Based Data Model
Complex real-world problems are represented as objects with different attributes. In
the Object-Oriented Data Model, data and their relationships are contained in a single
structure referred to as an object. Objects can have multiple relationships between
them. Basically, it is a combination of Object-Oriented Programming and the
Relational Database Model.
Reduced Maintenance
Real-World Modeling
Improved Reliability and Flexibility
High Code Reusability
Semi-structured Data Model
The semi-structured data model permits the specification of data where individual data
items of the same type may have different sets of attributes. The Extensible Markup
Language (XML) is widely used to represent the semi-structured data model.
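A small XML example makes the idea concrete: two items of the same type carry different attribute sets, and the parser copes with both. The element and attribute names below are invented for illustration.

```python
# Semi-structured data: items of the same type with different attribute sets.
import xml.etree.ElementTree as ET

doc = """
<books>
  <book title="DBMS Notes" year="2021"/>
  <book title="SQL Primer"/>
</books>
"""

for book in ET.fromstring(doc):
    # The second <book> has no 'year' attribute; the model permits this.
    print(book.get("title"), book.get("year", "unknown"))
```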
1. Internal Level
The internal level has an internal schema which describes the physical storage structure
of the database.
The internal schema is also known as a physical schema.
It uses the physical data model and defines how the data will be stored in a block.
The physical level is used to describe complex low-level data structures in detail.
2. Conceptual Level
The conceptual schema describes the design of a database at the conceptual level. The
conceptual level is also known as the logical level.
The conceptual schema describes the structure of the whole database.
The conceptual level describes what data are to be stored in the database and also describes
what relationship exists among those data.
In the conceptual level, internal details such as an implementation of the data structure are
hidden.
Programmers and database administrators work at this level.
3. External Level
The external level hides the unrelated details of the database from the user. There may
be "n" number of external views for each database.
Each external view is defined using an external schema, which consists of definitions of
various types of external record of that specific view.
View level can be used by all users (all levels' users). This level is the least complex and easy
to understand.
An external level is only related to the data which is viewed by specific end users.
This level includes some external schemas.
The external schema level is nearest to the user.
An external schema is also known as view schema.
Each view schema describes the database part that a particular user group is interested
and hides the remaining database from that user group.
The view schema describes the end user interaction with database systems.
The ability to modify the schema definition of a DBMS at one level, without affecting the
schema definition of the next higher level is called data independence.
In addition to the data entered by users, a database system typically holds a large amount of
data. The system holds metadata about data which makes it easier to find and retrieve data.
Once a set of metadata in DBMS has been saved in a database, changing or updating the
metadata is challenging. However, as a database management system (DBMS) grows, it must
evolve to meet the needs of its users. Updating the schema or data would be a time-
consuming and complicated task if all of the data were dependent.
To address the problem with updating metadata, it is organized in a tiered structure, so that
changing data at one level does not affect data at another. This information is self-contained,
however, all this information is linked to one another.
So, data independence aids in the separation of data from the applications that use it.
Now that you know what data independence means, let's discuss its types. This is where your
knowledge of the 3-level architecture is important!
Physical Data Independence
Physical data independence is the ability to modify the physical schema of the database
without causing any changes at the logical/conceptual or view/external level.
Logical Data Independence
Logical data independence is the ability to modify the logical schema without causing
unwanted modifications to the external schema or requiring application programs to be
rewritten.
The logical level stores information about how data is managed within the
database. Logical data independence ensures that if we make modifications to
the table format, the data itself is not affected.
The mapping between the external and conceptual levels will absorb any changes
made.
In other words, to distinguish the external level from the conceptual view, logical data
independence is used. Any modifications to the conceptual representation of the data
will not affect the user's view of the data.
Without rewriting current application programs, logical data independence lets you:
add, modify, or delete an attribute, entity, or relationship;
divide an existing record into two or more records;
merge two records into a single one.
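The idea can be sketched with sqlite3: an application reads through a view (its external schema), and the view keeps working even after the table's logical schema gains a column. The table and view names here are invented for illustration.

```python
# A small sqlite3 sketch of logical data independence.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE employee (emp_id INTEGER, emp_name TEXT)")
con.execute("INSERT INTO employee VALUES (1, 'Asha')")

# External schema: a view the application reads instead of the base table.
con.execute("CREATE VIEW emp_names AS SELECT emp_name FROM employee")

# Logical schema change: add a new attribute to the table.
con.execute("ALTER TABLE employee ADD COLUMN salary INTEGER")

# The application query against the view is unaffected by the change.
print(con.execute("SELECT emp_name FROM emp_names").fetchall())  # [('Asha',)]
```

The view acts as the mapping between the external and conceptual levels that "absorbs" the change.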
Important Fact
Because application programs depend heavily on the conceptual (logical) structure of the
data, logical data independence is more difficult to achieve than physical data
independence. Even a small change to the database's logical structure may require
changes to the applications as well. As a result, achieving logical data independence
can be difficult.
Physical Data Independence: It is concerned with the internal schema of the database.
Logical Data Independence: It is concerned with the conceptual schema of the database.

Physical Data Independence: Changes at the internal level may or may not be required to improve the overall performance of the database.
Logical Data Independence: When the database's logical structure needs to be modified, the changes made at the logical level are crucial.

Physical Data Independence: In most cases, a change at the physical level does not necessitate a change at the application program level.
Logical Data Independence: If new fields are added to or removed from the database, updates must be made in the application software.
The middle two parts (query processor and storage manager) are important components
of the database architecture.
Query processor:
The interactive query processor helps the database system to simplify and facilitate access to
data. It consists of DDL(Data Definition Language) interpreter, DML(Data Manipulation
Language) compiler and query evaluation engine.
The following are various functionalities and components of query processor
DDL interpreter: This is basically a translator which interprets the DDL statements
in data dictionaries.
DML compiler: It translates DML statements in the query language into an evaluation
plan. This plan consists of instructions which the query evaluation engine understands.
Query evaluation engine: It executes the low-level instructions generated by the
DML compiler.
When a user issues a query, the parsed query is presented to a query optimizer, which uses
information about how the data is stored to produce an efficient execution plan for evaluating
the query. An execution plan is a blueprint for evaluating a query. It is evaluated by query
evaluation engine.
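You can peek at an execution plan using sqlite3's EXPLAIN QUERY PLAN statement. This only illustrates the idea; real optimizers and their plan formats differ widely, and the table and index names below are invented examples.

```python
# Asking the query processor how it would evaluate a query (sqlite3).
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, age INTEGER)")
con.execute("CREATE INDEX idx_age ON employee(age)")

# The optimizer uses information about storage (e.g. the index on age)
# to choose an efficient plan for the query.
plan = con.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM employee WHERE age > 20"
).fetchall()
for row in plan:
    print(row)  # typically shows a search using idx_age
```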
Storage manager:
Storage manager is the component of database system that provides interface between the
low level data stored in the database and the application programs and queries submitted to
the system.
The storage manager is responsible for storing, retrieving, and updating data in the database.
The storage manager components include
Authorization and integrity manager: Validates the users who want to access the
data and tests for integrity constraints.
Transaction manager: Ensures that the database remains in a consistent state despite
system failures and that concurrent transaction execution proceeds without conflicts.
File manager: Manages allocation of space on disk storage and representation of the
information on disk.
Buffer manager: Manages the fetching of data from disk storage into main memory.
The buffer manager also decides what data to cache in main memory. Buffer manager
is a crucial part of database system.
The storage manager implements several data structures, such as data files (which store
the database itself), the data dictionary (which stores metadata about the structure of
the database), and indices (which provide fast access to data items).
Steps in database design:
1. Requirement analysis
2. Conceptual database design
3. Logical database design
4. Schema refinement
5. Physical database design
6. Application and security design
1. Requirement analysis
It is necessary to understand what data needs to be stored in the
database, what applications must be built on top of it, and which
operations are most frequently used by the system.
The requirement analysis is an informal process and it requires
proper communication with user groups.
There are several methods for organizing and presenting
information gathered in this step. Some automated tools can also
be used for this purpose.
4. Schema refinement
In this step, relational database schema is analyzed to identify
the potential problems and to refine it.
The schema refinement can be done with the help of normalizing
and restructuring the relations.
ER Diagrams in DBMS
ER model in DBMS is the high-level data model. It stands for the Entity-
relationship model and is used to represent a logical view of the system
from a data perspective. In simple words, the entity relationship diagram
is a blueprint that can used to create a database. E-R diagrams are used
to model real-world objects like a person, a car, a company and the
relation between these real-world objects.
Components of an ER diagram and their symbols:
Entity - rectangle
Weak Entity - double rectangle
Attribute - ellipse (oval)
Key Attribute - ellipse with the attribute name underlined
Composite Attribute - an ellipse connected to its component ellipses
Multivalued Attribute - double ellipse
Derived Attribute - dashed ellipse
Relationship - diamond
Weak Relationship - double diamond
Participation Constraints - total participation shown by a double line, partial participation by a single line
Entity in DBMS
Entity: An entity is anything in the real world, such as an object, class,
person, or place. Objects that physically exist and are logically constructed in
the real world are called entities. An entity is distinguishable from other
entities.
Entity type: The entity type is a collection of the entity having similar
attributes.
Entity set: An entity set is a group of entities of a similar kind. It can contain
entities whose attributes share similar values. Entity sets need not be
disjoint.
Strong Entity: Always has a primary key.
Weak Entity: Has a partial (discriminator) key.

Strong Entity: Not dependent on any other entity.
Weak Entity: Depends on a strong entity.

Strong Entity: A relationship between two strong entities is represented by a single diamond.
Weak Entity: A relationship between a strong and a weak entity is represented by a double diamond.

Strong Entity: May have either total or partial participation.
Weak Entity: Always has total participation.
Relationships in DBMS
Relationship Set
A set of relationships of similar type is called a relationship set. Like entities,
a relationship too can have attributes. These attributes are called descriptive
attributes.
Mapping Cardinalities:
Mapping cardinalities express the number of entities to which another entity can be
associated via a relationship. For binary relationship sets between entity sets A and B,
the mapping cardinality must be one of:
One-to-one: An entity in A is associated with at most one entity in B, and
an entity in B is associated with at most one entity in A.
One-to-many: An entity in A is associated with any number of entities in B,
but an entity in B is associated with at most one entity in A.
Many-to-one: An entity in A is associated with at most one entity in B, but
an entity in B is associated with any number of entities in A.
Many-to-many: An entity in A is associated with any number of entities in B,
and an entity in B is associated with any number of entities in A.
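A one-to-many cardinality can be enforced in a relational schema with a foreign key, as the short sqlite3 sketch below shows. The department/employee tables and names are invented for illustration.

```python
# Enforcing a one-to-many mapping (one department, many employees)
# with a foreign key in sqlite3.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")
con.execute("CREATE TABLE dept (dept_id INTEGER PRIMARY KEY, dname TEXT)")
con.execute("""CREATE TABLE emp (
    emp_id INTEGER PRIMARY KEY,
    ename TEXT,
    dept_id INTEGER REFERENCES dept(dept_id))""")

con.execute("INSERT INTO dept VALUES (10, 'CSE')")
# Many employees may point at the same department (the 'many' side).
con.execute("INSERT INTO emp VALUES (1, 'Asha', 10)")
con.execute("INSERT INTO emp VALUES (2, 'Ravi', 10)")

rows = con.execute(
    "SELECT ename FROM emp WHERE dept_id = 10 ORDER BY emp_id").fetchall()
print(rows)  # [('Asha',), ('Ravi',)]
```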
Participation Constraints
Generalization
Specialization
Aggregation
Some entities have relationships that form hierarchies. For instance, an
employee can be an hourly employee or a contracted employee.
In these relationship hierarchies, some entities act as a superclass and
other entities act as a subclass.
Superclass: An entity type that represents a general concept at a high level,
is called superclass.
Subclass: An entity type that represents a specific concept at lower levels,
is called subclass.
The subclass is said to inherit from superclass. When a subclass inherits from
one or more super classes, it inherits all their attributes. In addition to the
inherited attributes, a subclass can also define its own specific attributes.
The symbol used for specialization/generalization is a triangle labelled "ISA" (is a),
connecting the superclass to its subclasses.
Generalization
Generalization is a process of extracting common properties from a set of
entities and creating a generalized entity from it. It is a bottom-up approach,
and it helps to reduce the size and complexity of the schema.
Example: Let us take two low-level entities as Car and Bus, and these two
will have many common attributes and some specific attributes. And we will
generalize and link the common attributes to the newly formed high-level
entity named Vehicle.
Specialization
Specialization is the opposite of generalization. In this, an entity is divided into
sub-entities based on their characteristics (distinguishing features). It breaks an
entity into multiple entities from a higher level to a lower level. It is a top-down
approach.
Aggregation
Aggregation refers to the process by which entities are combined to form a
single meaningful entity. The specific entities are combined because they do
not make sense on their own. To establish a single entity, aggregation
creates a relationship that combines these entities. The resulting entity
makes sense because it enables the system to function well.
UNIT 2
RELATIONAL MODEL
SQL Commands
There are five types of SQL commands: DDL, DML, DCL, TCL, and DQL.
1. Data Definition Language (DDL)
o DDL changes the structure of a table: creating a table, deleting a table, altering a
table, and so on.
o All DDL commands are auto-committed, which means they permanently save all
changes in the database.
The DDL commands are:
o CREATE
o ALTER
o DROP
o TRUNCATE
a. CREATE: It is used to create a new table in the database.
Syntax:
CREATE TABLE table_name (column1 datatype, column2 datatype,column3 datatype,....);
The column parameters specify the names of the columns of the table.
The datatype parameter specifies the type of data the column can hold (e.g. varchar, integer,
date, etc.).
The following example creates a table called "Persons" that contains five columns: PersonID,
LastName, FirstName, Address, and City:
CREATE TABLE Persons (PersonID int, LastName varchar(255), FirstName varchar(255),
Address varchar(255), City varchar(255));
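The "Persons" schema from the text can be tried directly in sqlite3, with a quick insert/select round trip to confirm it behaves as described; the inserted person is made-up sample data.

```python
# Running a Persons-style CREATE TABLE in sqlite3, then a round trip.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE Persons (
    PersonID int,
    LastName varchar(255),
    FirstName varchar(255),
    Address varchar(255),
    City varchar(255))""")

con.execute("INSERT INTO Persons VALUES (1, 'Rao', 'Asha', 'MG Road', 'Pune')")
row = con.execute("SELECT FirstName, City FROM Persons").fetchone()
print(row)  # ('Asha', 'Pune')
```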
b. DROP: It is used to delete both the structure and the records stored in the table.
Syntax:
DROP TABLE table_name;
c. ALTER: It is used to alter the structure of the database. This change could be either to
modify the characteristics of an existing attribute or to add a new attribute.
Syntax:
ALTER TABLE table_name ADD column_name datatype;
d. TRUNCATE: It is used to delete all the rows from the table and free the space occupied
by the table.
Syntax:
TRUNCATE TABLE table_name;
2. Data Manipulation Language (DML)
DML commands are used to modify the data stored in the database.
o INSERT
o UPDATE
o DELETE
a. INSERT: The INSERT statement is a SQL query. It is used to insert data into the row of a
table.
Syntax:
INSERT INTO TABLE_NAME (col1, col2, col3,.... col N) VALUES (value1, value2, value
3, .... valueN);
b. UPDATE: This command is used to update or modify the value of a column in the table.
Syntax:
UPDATE table_name SET column1 = value1, column2 = value2 WHERE condition;
c. DELETE: It is used to remove one or more rows from a table.
Syntax:
DELETE FROM table_name WHERE condition;
3. Data Control Language (DCL)
DCL commands are used to grant and take back authority from any database user.
o Grant
o Revoke
Example:
GRANT SELECT, UPDATE ON MY_TABLE TO SOME_USER, ANOTHER_USER;
REVOKE SELECT, UPDATE ON MY_TABLE FROM SOME_USER;
4. Transaction Control Language (TCL)
TCL commands can only be used with DML commands like INSERT, DELETE, and
UPDATE.
DDL operations are automatically committed in the database, which is why TCL
commands cannot be used while creating tables or dropping them.
o COMMIT
o ROLLBACK
o SAVEPOINT
a. Commit: Commit command is used to save all the transactions to the database.
Syntax:
COMMIT;
b. ROLLBACK: The Rollback command is used to undo transactions that have not already
been saved to the database.
Syntax:
ROLLBACK;
c. SAVEPOINT: It is used to roll the transaction back to a certain point without rolling
back the entire transaction.
Syntax:
SAVEPOINT SAVEPOINT_NAME;
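The TCL commands above can be exercised with sqlite3, which supports SAVEPOINT and ROLLBACK TO: the savepoint lets us undo part of a transaction while keeping the rest. The accounts table and its data are invented sample data.

```python
# Sketch of TCL in sqlite3: SAVEPOINT and ROLLBACK TO undo part of a
# transaction without discarding all of it.
import sqlite3

con = sqlite3.connect(":memory:")
con.isolation_level = None  # manual transaction control
con.execute("CREATE TABLE accounts (name TEXT, balance INTEGER)")

con.execute("BEGIN")
con.execute("INSERT INTO accounts VALUES ('A', 100)")
con.execute("SAVEPOINT before_b")            # mark a point inside the txn
con.execute("INSERT INTO accounts VALUES ('B', 200)")
con.execute("ROLLBACK TO before_b")          # undo only the second insert
con.execute("COMMIT")                        # save the rest

print(con.execute("SELECT name FROM accounts").fetchall())  # [('A',)]
```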
5. Data Query Language
o SELECT
a. SELECT: This is similar to the projection operation of relational algebra. It is used to
select attributes based on the condition described by the WHERE clause.
Syntax:
SELECT expressions FROM table_name WHERE conditions;
For example: SELECT emp_name FROM employee WHERE age > 20;
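The example query can be run against a tiny sqlite3 table to see the WHERE filter in action; the employee rows below are made-up sample data.

```python
# The SELECT example from the text, run against a small sqlite3 table.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE employee (emp_name TEXT, age INTEGER)")
con.executemany("INSERT INTO employee VALUES (?, ?)",
                [("Asha", 25), ("Ravi", 19), ("Meena", 31)])

rows = con.execute(
    "SELECT emp_name FROM employee WHERE age > 20").fetchall()
print(rows)  # only employees older than 20 are returned
```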