DBMS-1

Database management system CMS unit1

Uploaded by

playerpunju8
DATABASE

MANAGEMENT
SCIENCE
UNIT-I

II-CSD R22
Unit-1
Database System Applications:
A Historical Perspective
File Systems versus a DBMS
The Data Model
Levels of Abstraction in a DBMS
Data Independence
Structure of a DBMS
Introduction to Database Design:
Database Design and ER Diagrams
Entities, Attributes, and Entity Sets
Relationships and Relationship Sets
Additional Features of the ER Model
Conceptual Design With the ER Model
History of database management systems
● The first DBMS was developed in the early 1960s, when Charles Bachman created a navigational (network) DBMS known as the Integrated Data Store (IDS).
● In 1968, IBM developed the Information Management System (IMS), a hierarchical DBMS designed for IBM mainframes that is still used today by many large organizations.
● The DBMS market changed permanently as the relational model for data gained popularity. Edgar Codd of IBM introduced it in 1970 in his seminal paper "A Relational Model of Data for Large Shared Data Banks," and the RDBMS soon became the industry standard. One of the first RDBMSs was Ingres, developed at the University of California, Berkeley by a team led by Michael Stonebraker in the mid-1970s.
● During the same period, IBM was working on its System R project to develop an RDBMS.
● In 1979, Oracle, the first successful commercial RDBMS, was released, followed a few years later by IBM's Db2, Sybase SQL Server, and many others.
● In the 1990s, as object-oriented programming (OOP) became popular, several object-oriented database systems came to market, but they never gained significant market share.
● Later in the 1990s, the term NoSQL was coined.
● Over the next decade, several types of new non-relational DBMS products, including key-value, graph, document, and wide-column stores, were grouped into the NoSQL category.
● Today, the DBMS market is dominated by RDBMSs, but NewSQL and NoSQL database systems continue to grow in popularity.
File Systems versus a DBMS
A file system is a way of arranging files on a storage medium such as a hard disk. It organizes the files and helps retrieve them when they are required. Files are grouped into directories, which can in turn contain other directories and files. The file system performs basic operations such as file management, file naming, and enforcing access rules.
A Database Management System (DBMS) is software that manages a collection of related data. It is used to store data and retrieve it efficiently when needed, and it provides security measures that protect the data from unauthorized access. In a relational DBMS, data is fetched with SQL queries, which are based on relational algebra. A DBMS also provides mechanisms for data backup and recovery.
File System versus DBMS

| Parameter | File System | DBMS |
|---|---|---|
| Definition | A method of storing and organizing data into files in a hierarchical order. | Software that stores, manages, and retrieves data efficiently. |
| Data Organization | Organizes the data in a hierarchical directory. | Organizes the data in a pre-defined structure. |
| Data Retrieval | Searches entire directories and files to find any specified data. | Uses SQL queries to retrieve the data. |
| Data Manipulation | Very limited options are available, such as renaming and copying. | Has advanced functionality to manipulate the data, such as updating, deleting, and inserting records. |
| Scalability | Limited scalability as the size and complexity increase. | Provides scalability, as it can handle extremely large amounts of data. |
| Security | Provides basic file permissions for controlling access to files and directories. | Provides robust security measures for controlling access to data. |
| Backup & Recovery | Requires manual backup and recovery of files and data. | Provides backup and recovery mechanisms for ensuring data availability and preventing data loss. |
| Use Cases | Suitable for simple data storage and retrieval tasks. | Suitable for complex data management tasks such as e-commerce, finance, etc. |
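
The Data Retrieval difference can be sketched roughly as follows. This is only an illustration: the customer records, the table name, and the column names are made up, and SQLite (through Python's standard `sqlite3` module) stands in for a DBMS.

```python
import csv
import io
import sqlite3

# Hypothetical data; in a file-based approach it would live in a CSV file on disk.
CSV_TEXT = "id,name,city\n1,Asha,Hyderabad\n2,Ravi,Chennai\n3,Meena,Hyderabad\n"


def names_from_file(city):
    """File-system style: the program reads and filters every record itself."""
    reader = csv.DictReader(io.StringIO(CSV_TEXT))
    return [row["name"] for row in reader if row["city"] == city]


def names_from_dbms(city):
    """DBMS style: store the rows in a table and describe the result in SQL."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
    rows = [
        (int(r["id"]), r["name"], r["city"])
        for r in csv.DictReader(io.StringIO(CSV_TEXT))
    ]
    conn.executemany("INSERT INTO customer (id, name, city) VALUES (?, ?, ?)", rows)
    result = conn.execute(
        "SELECT name FROM customer WHERE city = ? ORDER BY id", (city,)
    ).fetchall()
    conn.close()
    return [name for (name,) in result]


print(names_from_file("Hyderabad"))
print(names_from_dbms("Hyderabad"))
```

Both functions return the same names. In the first the application does the searching; in the second it only states the condition and the DBMS decides how to find the rows.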
Database Schema Definition
A database schema refers to the logical and visual configuration of an entire relational database. The database objects are often grouped and displayed as tables, functions, and relations. A schema describes the organization and storage of data in a database and defines the relationships between its tables. It includes descriptive details of the database that can be depicted through schema diagrams.
Or
A database schema is a logical representation of data that shows how the data in a database should be stored logically. It shows how the data is organized and how the tables relate to one another.

[Figures: Details of a Customer; Schema of Customer]
Types of Database Schemas
There are 3 types of database schema:
Physical Database Schema
● A physical schema defines how the data is stored physically in the storage system, in the form of files and indices. It covers the actual code or syntax needed to create the structure of the database. Designing a database at the physical level produces its physical schema.
Logical Database Schema
● A logical database schema defines all the logical constraints that need to be applied to the stored data, and also describes tables, views, entity relationships, and integrity constraints.
● The logical schema describes how the data is stored in the form of tables and how the attributes of a table are connected.
● ER modelling is used to capture the relationships between the components of the data.

View Database Schema

● A view schema is a view-level design that defines the interaction between the end user and the database.
● The user interacts with the database through an interface, without needing to know much about how the data is stored.
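
As a small sketch of the three schema types, the snippet below defines a hypothetical `customer` table (logical), an index on it (physical), and a restricted view (view level). The table, column, index, and view names are assumptions chosen for illustration, and SQLite is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical schema: tables, attributes, and integrity constraints.
conn.execute("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        email       TEXT UNIQUE,
        city        TEXT
    )
""")

# Physical schema: storage-level structures such as indices.
conn.execute("CREATE INDEX idx_customer_city ON customer (city)")

# View schema: the part of the data one group of end users works with.
conn.execute("CREATE VIEW customer_public AS SELECT name, city FROM customer")

conn.execute(
    "INSERT INTO customer VALUES (1, 'Asha', 'asha@example.com', 'Hyderabad')"
)
view_rows = conn.execute("SELECT * FROM customer_public").fetchall()
print(view_rows)
```

A user who only queries `customer_public` never sees the `email` column or the index.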
Database Model
A Database model defines the logical design and structure of a database. It defines how data will be stored, accessed, and updated in a database management system.

● You can choose a database model according to your application's requirements.
● The database model sets the rules, relationships, constraints, etc. that define how data is stored in the database.
● It is like creating a blueprint of your database.
● There are different types of database models, and each one has its own set of features.
● You can define how you want to structure the application data using a database model.

Types of Database Models

There are several database model types. Some are old, while others are newer and cater to modern requirements. Here is a list of 7 popular database models:

1. Hierarchical Model
2. Network Model
3. Entity-relationship Model
4. Relational Model
5. Object-oriented Model
6. NoSQL Model
7. Graph Model
Hierarchical Model

● The hierarchical database model organizes data into a tree-like structure, with a single root, to which all the other
data is linked.
● The hierarchy starts from the Root data, and expands like a tree, adding child nodes to the parent nodes.
● In this model, a child node will only have a single parent node.
● This model efficiently describes many real-world relationships like the index of a book, etc.
● IBM's Information Management System (IMS) is based on this model.
● Data is organized into a tree-like structure with one-to-many relationships between different types of data. For example, one department can have many courses, many teachers, and many students.

Advantages/Disadvantages of the Hierarchical Model

Here are a few advantages and disadvantages of the hierarchical database model:

1. Because the relationships are one-to-many and every child has a single parent, fetching data along the hierarchy is easy and fast.
2. However, the hierarchical model is less flexible.
3. It does not support many-to-many relationships.
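
The single-parent rule can be sketched in a few lines of plain Python. The node names follow the department example above; "DBMS" as a child of "Courses" is an added, hypothetical node, and this is an illustration, not how IMS stores data.

```python
# Each child records exactly one parent, which is the defining rule of the model.
parent_of = {
    "Courses": "Department",
    "Teachers": "Department",
    "Students": "Department",
    "DBMS": "Courses",
}


def path_to_root(node):
    """Follow the single parent link upward until the root is reached."""
    path = [node]
    while path[-1] in parent_of:
        path.append(parent_of[path[-1]])
    return path


def children_of(node):
    """One-to-many: a parent may have any number of children."""
    return sorted(child for child, parent in parent_of.items() if parent == node)


print(path_to_root("DBMS"))
print(children_of("Department"))
```

Because `parent_of` maps each child to one value, a node with two parents (a many-to-many situation) cannot be expressed without changing the structure.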
Network Model

● The Network Model is an extension of the Hierarchical model.


● In this model, data is organized more like a graph, and a node is allowed to have more than one parent node.
● More relationships can be established between records than in the hierarchical model.
● Because related records are linked directly, accessing the data is easy and fast.
● This database model supports many-to-many data relationships.
● Integrated Data Store (IDS) is based on this database model.
● This was the most widely used database model before the relational model was introduced.
● The implementation of the network model is complex, and it is difficult to maintain.
● The network model is also difficult to modify.
● It can model the kind of connected data found in social networking applications, although the newer graph database model is generally better suited to such data.

Advantages of the Network Model

1. It supports complex relationships.
2. It allows more flexibility.
Entity-relationship Model

● In this database model, relationships are created by dividing objects of interest into entities and their characteristics into
attributes.
● Different entities are related using relationships.
● ER Models are defined to represent the relationships in pictorial form to make it easier for different stakeholders to
understand.
● This model is good to design a database, which can then be turned into tables in a relational model (explained below).
● For example, to design a School database, Student will be an entity with attributes such as name, age, and address. Because an address is generally complex, Address can be another entity with attributes such as street, pincode, and city, and there will be a relationship between Student and Address.
● Relationships can also be of different types, as described in the later sections on ER diagrams.

Advantages of the ER Model

1. It is easy to understand and design.
2. Using the ER model we can represent data structures easily.
3. The ER model cannot be implemented directly in a database. It is a step toward designing the relational database model.
Relational Model

● In this model, data is organized in two-dimensional tables, and relationships between tables are maintained by storing a common field.
● This model was introduced by E. F. Codd in 1970, and since then it has been the most widely used database model.
● The basic structure of data in the relational model is the table. All the information about a particular type of entity is stored in the rows of one table.
● Tables are also known as relations in the relational model.
● You can design tables, normalize them to reduce data redundancy, and use Structured Query Language (SQL) to access data from the tables.
● Some of the most popular databases are based on this database model, for example Oracle and MySQL.

Advantages of the Relational Model

1. It is simple and easy to implement.
2. Popular database software is available for this database model.
3. It supports SQL, with which you can easily query the data.
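
To illustrate how a common field maintains a relationship between two tables, here is a minimal sketch using SQLite from Python. The `department` and `student` tables and their contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
    CREATE TABLE student (
        roll_no INTEGER PRIMARY KEY,
        name    TEXT,
        dept_id INTEGER REFERENCES department (dept_id)  -- the common field
    );
    INSERT INTO department VALUES (1, 'CSE'), (2, 'ECE');
    INSERT INTO student VALUES (101, 'Asha', 1), (102, 'Ravi', 2), (103, 'Meena', 1);
""")

# The join matches rows of the two tables on the shared dept_id value.
cse_students = conn.execute("""
    SELECT s.name
    FROM student AS s
    JOIN department AS d ON d.dept_id = s.dept_id
    WHERE d.dept_name = 'CSE'
    ORDER BY s.roll_no
""").fetchall()
print(cse_students)
```

The department name is stored once, in `department`, and each student row only carries the `dept_id`, which is the kind of redundancy reduction that normalization aims for.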
Object-oriented Model

● In this model, data is stored in the form of objects.


● The object-oriented database model behaves much like object-oriented programming.
● Examples of object database management systems (ODBMS) include db4o and ObjectDB. MongoDB is sometimes cited as an example, but it is a document-oriented NoSQL database rather than an object database.
● This database model is less mature than the relational database model.

Advantages of the Object-oriented Model

1. It can easily support complex data structures, with relationships.


2. It also supports features like Inheritance, Encapsulation, etc.
NoSQL Model

● The NoSQL database model supports semi-structured and unstructured styles of storing data.
● In document stores, a common kind of NoSQL database, data is stored as documents.
● The documents look like JSON strings or key-value based object representations.
● It provides a flexible schema.
● It provides features such as indexing and relationships between data.
● Support for ad hoc data querying is often more limited than in SQL databases.
● This database model is well-suited for big data applications, real-time analytics, CMS (content management systems), etc.

Advantages of the NoSQL Model

1. This database model is scalable.


2. This database model functions with high performance.
3. The NoSQL database model can handle large volumes of data.
Graph Model

● The graph database model represents relationships in a way that is close to how they occur in the real world.
● Data is represented using nodes (entities).
● The nodes are related using edges.
● The popular database Neo4j is based on the graph database model.
● If your application has simple data requirements, the graph database model is probably unnecessary.
● The graph database model is well-suited to modern applications such as social networks and recommendation systems.

Advantages of the Graph Model

1. It handles complex relationships very well.


2. In the modern world where there is so much data and the data has to be related in different ways,
the graph database model is very useful.
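
A rough idea of nodes and edges, written in plain Python instead of a real graph database. The people and friendships are invented, and this is not Neo4j's API; it only shows why relationship-heavy questions such as "friends of friends" map naturally onto a graph.

```python
from collections import defaultdict

# Edges: each pair is an undirected "is a friend of" relationship between two nodes.
edges = [("Asha", "Ravi"), ("Ravi", "Meena"), ("Meena", "Kiran")]

graph = defaultdict(set)
for a, b in edges:
    graph[a].add(b)
    graph[b].add(a)


def friends_of_friends(person):
    """Nodes exactly two edges away that are not already direct friends."""
    direct = graph[person]
    result = set()
    for friend in direct:
        result |= graph[friend]
    return sorted(result - direct - {person})


print(friends_of_friends("Asha"))
print(friends_of_friends("Ravi"))
```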
Data Abstraction in DBMS
Data abstraction is one of the most important concepts in DBMS. It is the process of hiding unwanted and irrelevant details from the end user. Information is stored in such a way that end users can access the data they need without seeing how that data is stored in the database.

Data abstraction helps keep data secure from unauthorized access and hides all the implementation details.

Levels of Abstraction in DBMS


● Physical or internal level
● logical or conceptual level
● view or external level
Physical or Internal Level
It is the lowest level of data abstraction and defines how data is stored in the database. It defines the data structures used to store data and the methods used to access it. It is very complex to understand and is therefore kept hidden from users. The database administrator decides how and where to store the data. The physical level deals with actual storage details such as data organization, disk space allocation, and data access methods.

Logical or Conceptual Level

It is the intermediate level, directly above the physical level. It defines what data is present in the database and the relationships among that data. It is less complex than the physical level. Programmers generally work at this level, where the structure of tables, their relationships, and their constraints are decided.

View or External Level

It is the highest level of abstraction and the easiest for users to understand. There can be many different views, and each view exposes only the part of the whole data that a particular user requires. Defining several views of the same database simplifies the user's interaction with it.
Data Independence
Data independence is a property of a DBMS that lets you change the database schema at one level of the system without having to change the schema at the next higher level. It helps keep the data separate from the programs that use it.
There are two levels of data independence, arising from the levels of abstraction:
● Physical level data independence
● Logical level data independence
Physical Level Data Independence

It refers to the ability to modify the physical schema without altering the conceptual or logical schema, usually for optimization purposes. For example, the conceptual structure of the database is not affected by a change in the storage size of the database server. Changing from sequential to random access files is another such example. Modifications to the physical structure may include:
● Utilizing new storage devices.
● Modifying data structures used for storage.
● Altering indexes or using alternative file organization techniques, etc.

Logical Level Data Independence

It refers to the ability to modify the logical schema without affecting the external schema or application programs. The user's view of the data is not affected by changes to the conceptual view of the data. These changes may include inserting or deleting attributes and altering the table structures, entities, or relationships in the logical schema.
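
Both kinds of independence can be observed in a small experiment. The sketch below assumes a hypothetical `student` table and `student_names` view in SQLite. It makes one physical change (adding an index) and one logical change (adding a column), then checks that the unchanged query and the unchanged view still return the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, city TEXT);
    CREATE VIEW student_names AS SELECT roll_no, name FROM student;
    INSERT INTO student VALUES (1, 'Asha', 'Hyderabad'), (2, 'Ravi', 'Chennai');
""")

QUERY = "SELECT name FROM student WHERE city = 'Chennai'"
VIEW_QUERY = "SELECT roll_no, name FROM student_names ORDER BY roll_no"

before_query = conn.execute(QUERY).fetchall()
before_view = conn.execute(VIEW_QUERY).fetchall()

# Physical change: add an index. The logical schema and the query are untouched.
conn.execute("CREATE INDEX idx_student_city ON student (city)")
after_index = conn.execute(QUERY).fetchall()

# Logical change: add an attribute. The external view is untouched.
conn.execute("ALTER TABLE student ADD COLUMN mobile_no TEXT")
after_column = conn.execute(VIEW_QUERY).fetchall()

print(before_query == after_index, before_view == after_column)
```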
Structure of Database Management System

Database Management System (DBMS) is software that allows access to data stored in a database
and provides an easy and effective method of –
● Defining the information.
● Storing the information.
● Manipulating the information.
● Protecting the information from system crashes or data theft.
● Differentiating access permissions for different users.

Note: Structure of Database Management System is also referred to as Overall System Structure or
Database Architecture but it is different from the tier architecture of Database.
The database system is divided into three components:

1. Query Processor,
2. Storage Manager, and
3. Disk Storage.

Query Processor: It interprets the requests (queries) received from the end user via an application program and turns them into instructions. It also executes the user request received from the DML compiler.
The query processor contains the following components:
• DML Compiler: It translates DML statements in a query language into low-level instructions that the query evaluation engine understands. It also attempts to transform the user's request into an equivalent but more efficient form.
• Embedded DML Pre-compiler: It converts DML statements embedded in an application program into normal procedure calls in the host language. The pre-compiler must interact with the DML compiler to generate the appropriate code.
• DDL Interpreter: It interprets the DDL statements and records them in a set of tables containing metadata (the data dictionary).
• Query Evaluation Engine: It executes the low-level instructions generated by the DML compiler.
Storage Manager: It provides the interface between the low-level data stored in the database and the application programs and queries submitted to the system.
The storage manager contains the following components:
• Authorization and Integrity Manager: It tests for the satisfaction of integrity constraints and checks the authority of users to access data.
• Transaction Manager: It ensures that the database remains in a consistent state despite system failures and that concurrent transactions execute without conflicting.
• File Manager: It manages the allocation of space on disk storage and the data structures used to represent information stored on disk.
• Buffer Manager: It is responsible for fetching data from disk storage into main memory and deciding what data to cache in memory.
Disk Storage: The following data structures are required as part of the physical system implementation:
• Data Files: They store the database itself.
• Data Dictionary: It stores metadata (data about data) about the structure of the database.
• Indices: They provide fast access to data items that hold particular values.
• Statistical Data: It stores statistical information about the data in the database. The query processor uses this information to select efficient ways to execute a query.
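
Most systems let you look at the data dictionary directly. As a hedged illustration, SQLite keeps its catalog in a table named `sqlite_master`, and `EXPLAIN QUERY PLAN` shows the strategy its query processor picked. The `student` table and index below are hypothetical, and other DBMSs expose their catalogs under different names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE INDEX idx_student_name ON student (name)")

# sqlite_master is SQLite's catalog: metadata about tables and indices.
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY type, name"
).fetchall()
print(catalog)

# Ask the query processor how it intends to run a query.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT roll_no FROM student WHERE name = 'Asha'"
).fetchall()
for row in plan:
    print(row)
```

The exact wording of the plan rows depends on the SQLite version, so it is printed here without assuming a particular text.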
Structured Query Language (SQL) is the database language used to perform operations on an existing database and also to create a database. SQL uses commands such as CREATE, DROP, and INSERT to carry out the required tasks.
SQL commands are instructions to the DBMS. They are used to interact with the database, to perform specific tasks and functions, and to query data. SQL can perform tasks such as creating a table, adding data to tables, modifying a table, dropping a table, and setting permissions for users.
DDL (Data Definition Language)
DDL (Data Definition Language) consists of the SQL commands that are used to define the database schema. It deals with descriptions of the database schema and is used to create and modify the structure of database objects in the database.
DDL commands create, modify, and delete database structures, but not data. They are normally not used by general users, who should access the database through an application.
DML (Data Manipulation Language)
The SQL commands that manipulate the data present in the database belong to DML (Data Manipulation Language), which includes most SQL statements.
The component of SQL that controls access to data and to the database is DCL (Data Control Language). DCL statements are often grouped with DML statements.
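
A short sketch of the DDL/DML split, run against SQLite from Python. The `student` table and its rows are hypothetical. DCL statements such as GRANT and REVOKE are not shown because SQLite does not implement them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define and change structure, not data.
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")
conn.execute("ALTER TABLE student ADD COLUMN age INTEGER")

# DML: insert, update, delete, and read the rows themselves.
conn.execute("INSERT INTO student (roll_no, name, age) VALUES (1, 'Asha', 19)")
conn.execute("INSERT INTO student (roll_no, name, age) VALUES (2, 'Ravi', 20)")
conn.execute("UPDATE student SET age = 21 WHERE roll_no = 2")
conn.execute("DELETE FROM student WHERE roll_no = 1")
remaining = conn.execute("SELECT roll_no, name, age FROM student").fetchall()
print(remaining)

# DDL again: remove the structure entirely.
conn.execute("DROP TABLE student")
tables_left = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()
print(tables_left)
```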
Database Design and ER Diagrams:
Peter Chen developed the ER diagram in 1976. The ER model was created to provide a simple and understandable model for representing the structure and logic of databases. It has since evolved into variations such as the Enhanced ER Model and the Object Relationship Model.
The Entity-Relationship Model identifies the entities to be represented in the database and describes how those entities are related. The ER data model specifies an enterprise schema that represents the overall logical structure of a database graphically.
The Entity-Relationship Diagram shows the relationships among the entities present in the database. ER models are used to model real-world objects such as a person, a car, or a company, and the relationships between these objects. In short, the ER diagram is the structural format of the database.
Why Use ER Diagrams In DBMS?
● ER diagrams represent the E-R model of a database, which makes them easy to convert into relations (tables).
● ER diagrams model real-world objects, which makes them very useful.
● ER diagrams require no technical knowledge and no hardware support.
● These diagrams are easy to understand and easy to create, even for an inexperienced user.
● They give a standard way of visualizing the data logically.
Symbols Used in ER Model
ER Model is used to model the logical view of the system from a data perspective which consists
of these symbols:
● Rectangles: Rectangles represent Entities in the ER Model.
● Ellipses: Ellipses represent Attributes in the ER Model.
● Diamond: Diamonds represent Relationships among Entities.
● Lines: Lines link attributes to entities and entity sets to relationship types.
● Double Ellipse: Double Ellipses represent Multi-Valued Attributes.
● Double Rectangle: Double Rectangle represents a Weak Entity.
Components of ER Diagram
ER Model consists of Entities, Attributes, and Relationships among Entities in a
Database System.
Entity
An Entity may be an object with a physical existence – a particular person, car, house, or employee – or it may
be an object with a conceptual existence – a company, a job, or a university course.

1. Strong Entity
A strong entity is a type of entity that has a key attribute. It does not depend on any other entity in the schema. It has a primary key that identifies it uniquely, and it is represented by a rectangle. Such entity types are called strong entity types.
2. Weak Entity
An entity type normally has a key attribute that uniquely identifies each entity in the entity set. For some entity types, however, no key attribute can be defined. These are called weak entity types.
For example, a company may store information about the dependents (parents, children, spouse) of an employee. The dependents cannot exist in the database without the employee. So Dependent is a weak entity type, and Employee, which is a strong entity type, is the identifying entity type for Dependent.
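
One common way to carry the Employee/Dependent example into tables is sketched below, assuming hypothetical column names. The weak entity's primary key combines the owner's key with a partial key (`dep_name`), and `ON DELETE CASCADE` expresses that a dependent cannot exist without its employee.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dependent (
        emp_id   INTEGER NOT NULL REFERENCES employee (emp_id) ON DELETE CASCADE,
        dep_name TEXT NOT NULL,          -- partial key: unique only per employee
        relation TEXT,
        PRIMARY KEY (emp_id, dep_name)   -- owner key + partial key
    );
    INSERT INTO employee VALUES (1, 'Asha'), (2, 'Ravi');
    INSERT INTO dependent VALUES (1, 'Kiran', 'Child'), (2, 'Kiran', 'Spouse');
""")

# Two dependents share the name 'Kiran'; only the owning employee tells them apart.
conn.execute("DELETE FROM employee WHERE emp_id = 1")
remaining = conn.execute("SELECT emp_id, dep_name FROM dependent").fetchall()
print(remaining)
```

Deleting employee 1 also removes that employee's dependent, while the dependent of employee 2 stays.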
Attributes

Attributes are the properties that define the entity type. For example, Roll_No, Name, DOB, Age,
Address, and Mobile_No are the attributes that define entity type Student. In ER diagram, the
attribute is represented by an oval.
1. Key Attribute
2. Composite Attribute
3. Multivalued Attribute
4. Derived Attribute
Key Attribute
The attribute that uniquely identifies each entity in the entity set is called the key attribute. For example, Roll_No will be unique for each student. In an ER diagram, the key attribute is represented by an oval with the attribute name underlined.
Composite Attribute
An attribute composed of several other attributes is called a composite attribute. For example, the Address attribute of the Student entity type consists of Street, City, State, and Country. In an ER diagram, a composite attribute is represented by an oval connected to the ovals of its component attributes.
Multivalued Attribute
An attribute that can hold more than one value for a given entity is called a multivalued attribute. For example, Phone_No (a student can have more than one). In an ER diagram, a multivalued attribute is represented by a double oval.
Derived Attribute
An attribute that can be derived from other attributes of the entity type is known as a derived attribute, e.g., Age (which can be derived from DOB). In an ER diagram, a derived attribute is represented by a dashed oval.
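
The four attribute kinds can also be mirrored in code. The following Python sketch models the Student entity type with the attributes named above. The class layout and the sample values in the usage lines are assumptions for illustration only.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class Address:
    """Composite attribute: made up of simpler attributes."""
    street: str
    city: str
    state: str
    country: str


@dataclass
class Student:
    roll_no: int                 # key attribute: unique per student
    name: str
    dob: date
    address: Address             # composite attribute
    phone_nos: List[str] = field(default_factory=list)  # multivalued attribute

    def age(self, today: date) -> int:
        """Derived attribute: computed from dob instead of being stored."""
        had_birthday = (today.month, today.day) >= (self.dob.month, self.dob.day)
        return today.year - self.dob.year - (0 if had_birthday else 1)


addr = Address("MG Road", "Hyderabad", "Telangana", "India")
s = Student(1, "Asha", date(2005, 6, 15), addr, ["9000000001", "9000000002"])
print(s.age(date(2024, 6, 14)), s.age(date(2024, 6, 15)))
```

Age is not stored anywhere, so it can never disagree with DOB.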
[Figure: the complete entity type Student with its attributes]
Entity Set:
An entity is an object of an entity type, and the set of all entities of that type is called an entity set. For example, E1 is an entity of entity type Student, and the set of all students is called an entity set. In an ER diagram, an entity type is represented by a rectangle.
Relationship Type
A Relationship Type represents the association between entity types. For
example, ‘Enrolled in’ is a relationship type that exists between entity type
Student and Course. In ER diagram, the relationship type is represented by a
diamond and connecting the entities with lines.
Relationship Set

A set of relationships of the same type is known as a relationship set. For example, a relationship set may record S1 as enrolled in C2, S2 as enrolled in C1, and S3 as enrolled in C3.
Relationships are also classified by the number of entity sets that participate in them:
1. Unary Relationship: When only ONE entity set participates in a relationship, the relationship is called a unary relationship. For example, one person is married to another person, and both belong to the same entity set.

2. Binary Relationship: When TWO entity sets participate in a relationship, the relationship is called a binary relationship. For example, a Student is enrolled in a Course.

3. n-ary Relationship: When n entity sets participate in a relationship, the relationship is called an n-ary relationship. With three entity sets it is called a ternary relationship.
Cardinality
The number of times an entity of an entity set participates in a relationship set is known as cardinality. Cardinality can be of different types:
One-to-One (1:1): When each entity in each entity set can take part only once in the relationship, the cardinality is one-to-one. For example, assume that a male can marry one female and a female can marry one male. The relationship is then one-to-one. The total number of tables that can be used here is 2.
[Figure: one-to-one relationship represented using sets]
One-to-Many (1:N) Relationship

When one item in the first table can be linked to one or many items in the second table, while each item in the second table is linked to only one item in the first, the relationship is one-to-many. It is used to describe situations where one object can be linked to many similar objects.
Many-to-One (N:1): When entities in one entity set can take part only once in the relationship set and entities in the other entity set can take part more than once, the cardinality is many-to-one. For example, assume that a student can take only one course but one course can be taken by many students. The cardinality is then N to 1.
Many-to-Many (N:M) Relationship

When multiple records in one table are associated with multiple records in another table, the relationship is many-to-many. Such a relationship is implemented with a junction table, which holds the links between the two tables.
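
A junction table can be sketched as follows, using the Student and Course entity types and the 'Enrolled in' relationship from earlier. The column names and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE course  (course_id TEXT PRIMARY KEY, title TEXT);
    -- Junction table: one row per (student, course) pair.
    CREATE TABLE enrolled_in (
        roll_no   INTEGER REFERENCES student (roll_no),
        course_id TEXT REFERENCES course (course_id),
        PRIMARY KEY (roll_no, course_id)
    );
    INSERT INTO student VALUES (1, 'Asha'), (2, 'Ravi');
    INSERT INTO course VALUES ('C1', 'DBMS'), ('C2', 'Operating Systems');
    INSERT INTO enrolled_in VALUES (1, 'C1'), (1, 'C2'), (2, 'C1');
""")

# One student, many courses.
courses_of_asha = conn.execute("""
    SELECT c.title FROM course AS c
    JOIN enrolled_in AS e ON e.course_id = c.course_id
    WHERE e.roll_no = 1 ORDER BY c.course_id
""").fetchall()

# One course, many students.
students_in_dbms = conn.execute("""
    SELECT s.name FROM student AS s
    JOIN enrolled_in AS e ON e.roll_no = s.roll_no
    WHERE e.course_id = 'C1' ORDER BY s.roll_no
""").fetchall()

print(courses_of_asha, students_in_dbms)
```

Neither `student` nor `course` stores a list of the other. The many-to-many link lives entirely in `enrolled_in`.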
Additional Features of the ER Model
Using the ER model for larger data sets creates a lot of complexity in database design. To reduce this complexity, Generalization, Specialization, and Aggregation were introduced in the ER model. They are used for data abstraction, in which an abstraction mechanism hides the details of a set of objects. These concepts were added in the Enhanced ER Model:
● Generalization
● Specialization
● Aggregation
Generalization
Generalization is the process of extracting common properties from a set of
entities and creating a generalized entity from it. It is a bottom-up approach in
which two or more entities can be generalized to a higher-level entity if they have
some attributes in common.
[Figure: entities Mobile and Laptop, with attributes including mobile_id, laptop_id, memory, dimension, price, and battery, generalized through an "Is a" relationship into Electronic Device]
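
Generalization maps closely onto class inheritance, which gives a quick way to experiment with it. In the Python sketch below, the entity names come from the figure, but which attributes are shared and which belong to each lower-level entity is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ElectronicDevice:
    """Generalized (higher-level) entity holding the common attributes."""
    price: float  # assumed here to be common to both lower-level entities


@dataclass
class Mobile(ElectronicDevice):
    mobile_id: int
    memory: str


@dataclass
class Laptop(ElectronicDevice):
    laptop_id: int
    battery: str


m = Mobile(price=15000.0, mobile_id=1, memory="8 GB")
l = Laptop(price=60000.0, laptop_id=7, battery="6 cell")

# Every Mobile and every Laptop "is a" ElectronicDevice.
print(isinstance(m, ElectronicDevice), isinstance(l, ElectronicDevice))
```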
Specialization
In specialization, an entity is divided into sub-entities based on its
characteristics. It is a top-down approach where the higher-level entity is
specialized into two or more lower-level entities.
[Figure: entity EMPLOYEE with attributes Emp_name and E_SAL, specialized through an "Is a" relationship into TESTER (with attribute TES_TYPE) and DEVELOPER]
Aggregation
In aggregation, the relation between two entities is treated as a single
entity. In aggregation, relationship with its corresponding entities is
aggregated into a higher level entity.
Conceptual Modeling using the Entity-Relationship Model
Contents
• Basic concepts: entities and entity types, attributes and keys, relationships and relationship
types

• Entity-Relationship schema (aka ER diagram)

• Constraints on relationship types

• Design choices

• Enhanced Entity-Relationship model features

• Steps in designing an ER schema

• Translation of an ER schema to tables


• Entity-Relationship model is used in the conceptual design of a database (☞ conceptual
level, conceptual schema)
• Design is independent of all physical considerations (DBMS, OS, . . . ). Questions that are
addressed during conceptual design:
– What are the entities and relationships of interest (miniworld)?
– What information about entities and relationships among entities needs to be stored in
the database?
– What are the constraints (or business rules) that (must) hold for the entities and
relationships?
• A database schema in the ER model can be represented pictorially (Entity-Relationship
diagram)
Steps in Designing an Entity-Relationship Schema
[Step 1] Identify entity types (entity type vs. attribute)

[Step 2] Identify relationship types

[Step 3] Identify and associate attributes with entity and relationship types

[Step 4] Determine attribute domains

[Step 5] Determine primary key attributes for entity types

[Step 6] Associate (refined) cardinality ratio(s) with relationship types

[Step 7] Design generalization/specialization hierarchies including constraints (includes natural language statements as well)
Translation of ER Schema into Tables
• An ER schema can be represented by a collection of tables which represent contents of the
database (instance).

• Primary keys allow entity types and relationship types to be expressed uniformly as tables.

• For each entity and relationship type, a unique table can be derived which is assigned the
name of the corresponding entity or relationship type.

• Each table has a number of columns that correspond to the attributes and which have
unique names. An attribute of a table has the same domain as the attribute in the ER schema.

• Translating an ER schema into a collection of tables is the basis for deriving a relational
database schema from an ER diagram.
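
The translation rules above can be tried out with a small script. This is a sketch under stated assumptions: the `ER_SCHEMA` dictionary format and the helper `to_create_table` are invented for this example, and the Student/Course/Enrolled_in schema is hypothetical. Each entity or relationship type becomes one table named after it, with one column per attribute and a primary key.

```python
import sqlite3

# Hypothetical ER schema: type name -> (attribute domains, primary key attributes).
ER_SCHEMA = {
    "Student": ({"roll_no": "INTEGER", "name": "TEXT"}, ["roll_no"]),
    "Course": ({"course_id": "TEXT", "title": "TEXT"}, ["course_id"]),
    "Enrolled_in": (
        {"roll_no": "INTEGER", "course_id": "TEXT"},
        ["roll_no", "course_id"],
    ),
}


def to_create_table(name, columns, primary_key):
    """Build one CREATE TABLE statement named after the entity or relationship type."""
    cols = [f"{col} {domain}" for col, domain in columns.items()]
    cols.append(f"PRIMARY KEY ({', '.join(primary_key)})")
    return f"CREATE TABLE {name} ({', '.join(cols)})"


statements = [to_create_table(n, c, pk) for n, (c, pk) in ER_SCHEMA.items()]

conn = sqlite3.connect(":memory:")
for stmt in statements:
    conn.execute(stmt)

table_names = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
print(statements[0])
print(table_names)
```

The relationship type `Enrolled_in` gets its own table whose primary key combines the keys of the entity types it connects.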
