Rdbms Unit I
◦ Manufacturing: For management of the supply chain and for tracking production of items
in factories, inventories of items in warehouses and stores, and orders for items.
◦ Online retailers: For sales data noted above plus online order tracking, generation of
recommendation lists, and maintenance of online product evaluations.
• Banking and Finance
◦ Banking: For customer information, accounts, loans, and banking transactions.
◦ Credit card transactions: For purchases on credit cards and generation of
monthly statements.
◦ Finance: For storing information about holdings, sales, and purchases of financial
instruments such as stocks and bonds; also, for storing real-time market data to enable online
trading by customers and automated trading by the firm.
• Universities: For student information, course registrations, and grades (in addition to
standard enterprise information such as human resources and accounting).
• Airlines: For reservations and schedule information. Airlines were among the first to use
databases in a geographically distributed manner.
• Telecommunication: For keeping records of calls made, generating monthly bills,
maintaining balances on prepaid calling cards, and storing information about the
communication networks.
The view of data in a DBMS describes how the data is presented at each level of data abstraction.
Data abstraction allows developers to keep complex data structures away from the users. The
developers achieve this by hiding the complex data structures behind levels of abstraction.
A second feature to keep in mind is data independence: changing the data schema at one level of
the database should not force a change to the data schema at the next higher level. In this
section, we discuss the view of data in a DBMS in terms of data abstraction, data independence,
instances and schemas, and data models:
1. Data Abstraction
2. Data Independence
3. Instance and Schema
4. Data model
1) Data Abstraction
Data abstraction means hiding the complex data structures in order to simplify the user’s interface
to the system. It is needed because many users of a database system are not trained to understand
the complex data structures that the system uses internally.
To achieve data abstraction, we will discuss a Three-Schema architecture which abstracts the
database at three levels discussed below:
Three-Schema Architecture:
The main objective of this architecture is to have an effective separation between the user
interface and the physical database. So, the user never has to be concerned with the internal
storage of the database and has a simplified interaction with the database system.
The three-schema architecture defines the view of data at three levels:
The physical or the internal level schema describes how the data is stored in the hardware. It also
describes how the data can be accessed. The physical level shows the data abstraction at the lowest
level and it has complex data structures. Only the database administrator operates at this level.
The logical or conceptual level is the level above the physical level. Here, the data is described
in terms of entity sets, entities, their data types, the relationships among the entity sets, the user
operations performed to retrieve or modify the data, and certain constraints on the data. Adding
constraints to the view of data also adds security, because users are restricted from accessing
particular parts of the database. Developers and the database administrator operate at the logical
or conceptual level.
The view level is the highest level of data abstraction and exhibits only a part of the whole
database: the data in which the user is interested. The view level can describe many views of the
same data. Here, the users retrieve information from the database using different applications.
The figure below describes the three-schema architecture of the database:
In the figure above you can clearly distinguish between the three levels of abstraction. To understand
it more clearly let us take an example:
We have to create a database of a college. Now, what entity sets would be involved? Student,
Lecturer, Department, Course and so on…
Now, the entity sets Student, Lecturer, Department, Course will be stored in the storage as
consecutive blocks of memory locations. This is the physical or internal level; it is hidden
from the programmers, but the database administrator is aware of it.
At the logical level, the programmers define the entity sets and the relationships among them
using a language such as SQL. So the programmers work at the logical level, and the
database administrator also operates at this level.
At the view level, the users have the set of applications which they use to retrieve the data they are
interested in.
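The college example can be sketched in code. The following is only a minimal illustration using Python's built-in sqlite3 module; the table, column, and view names (student, fee_due, student_directory) and the sample rows are assumptions made for this sketch and do not come from the notes.

```python
import sqlite3

# Physical level: SQLite decides how pages and blocks are laid out; we never see it.
conn = sqlite3.connect(":memory:")

# Logical level: the programmer defines the entity set, its data types and constraints.
conn.execute(
    "CREATE TABLE student ("
    " roll_no INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " department TEXT NOT NULL,"
    " fee_due REAL NOT NULL)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [(1, "Asha", "Physics", 1200.0), (2, "Ravi", "Maths", 0.0)],
)

# View level: an application sees only the part of the data it is interested in.
conn.execute(
    "CREATE VIEW student_directory AS "
    "SELECT roll_no, name, department FROM student"
)

directory_rows = conn.execute(
    "SELECT * FROM student_directory ORDER BY roll_no"
).fetchall()
print(directory_rows)
```

The view hides the fee_due column, so an application that reads student_directory works with a simplified picture of the same stored data.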
2) Data Independence
Data independence defines the extent to which the data schema can be changed at one level without
modifying the data schema at the next level. Data independence can be classified as shown below:
Logical data independence describes the degree to which the logical or conceptual schema
can be changed without modifying the external schema. Why would the data schema need to
change at the logical or conceptual level? Such changes are made to enlarge or reduce the
database, by adding or deleting entities or entity sets, or by changing the constraints on the data.
Physical data independence describes the extent to which the data schema can be changed at
the physical or internal level without modifying the data schema at the logical and view
levels. The physical schema changes when, for example, we add storage to the system or
reorganize some files to improve the retrieval speed of the records.
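Physical data independence can be observed in a small experiment. The sketch below, which assumes Python's built-in sqlite3 module and a hypothetical course table, makes a physical-level change (adding an index) and checks that the same logical query still returns the same answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE course (course_id INTEGER PRIMARY KEY, title TEXT, credits INTEGER)"
)
conn.executemany(
    "INSERT INTO course VALUES (?, ?, ?)",
    [(1, "Databases", 4), (2, "Networks", 3), (3, "Compilers", 4)],
)

# The logical-level query is written once and never changes.
query = "SELECT title FROM course WHERE credits = 4 ORDER BY title"
before = conn.execute(query).fetchall()

# Physical-level change: add an index to speed up lookups by credits.
conn.execute("CREATE INDEX idx_course_credits ON course (credits)")

after = conn.execute(query).fetchall()
print(before == after)
```

Only the access path changed; the table definition and the query text stayed the same.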
3) Instance and Schema
What is an instance?
We can define an instance as the information stored in the database at a particular point of time. Let
us discuss it with the help of an example.
As discussed above, the database comprises several entity sets and the relationships between
them. The data in the database keeps changing over time as we insert or delete data.
If we retrieve information from the database at a particular time, what we see corresponds to
an instance.
What is a schema?
Developers have to deal with both the definition of the database and the data in it. The
definition of a database describes what data it will contain and what the relationships among
the data will be. This definition is the database schema. Together, these ideas explain the view
of data in the database from the perspectives of users, developers, and the database administrator.
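The difference between schema and instance can be made concrete. In this minimal sketch (Python's built-in sqlite3 module; the lecturer table and the helper names read_schema and read_instance are assumptions for illustration), inserting a row changes the instance but leaves the schema untouched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE lecturer (lecturer_id INTEGER PRIMARY KEY, name TEXT)")

def read_schema(connection):
    """Return the stored table definitions, i.e. the schema."""
    return connection.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table'"
    ).fetchall()

def read_instance(connection):
    """Return the rows stored right now, i.e. the instance at this moment."""
    return connection.execute(
        "SELECT lecturer_id, name FROM lecturer ORDER BY lecturer_id"
    ).fetchall()

schema_before = read_schema(conn)
instance_before = read_instance(conn)      # no rows yet

conn.execute("INSERT INTO lecturer VALUES (1, 'Meena')")

schema_after = read_schema(conn)           # same definition as before
instance_after = read_instance(conn)       # now contains one row
```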
4) Data Models
• A data model is a way of modeling the data description, data semantics, and consistency
constraints of the data.
• It provides the conceptual tools for describing the design of a database at each level of
data abstraction.
• The following data models are used for understanding the structure
of the database.
• Hierarchical database
• Network database
• Relational database
• ER model database
1. Hierarchical DBMS
2. Network Model
The network database model allows each child to have multiple parents. It helps you to
address the need to model more complex relationships, such as the orders/parts many-to-many
relationship. In this model, entities are organized in a graph which can be accessed through
several paths.
3. Relational model
The relational model is the most widely used DBMS model, in part because it is one of the
simplest to understand. It organizes data into tables made up of rows and columns, and the tables
are typically normalized. Data in the relational model is stored in fixed structures and
manipulated using SQL.
4. Entity-Relationship Model
Entity-Relationship (ER) Model is based on the notion of real-world entities and relationships
among them. While formulating real-world scenario into the database model, the ER Model
creates entity set, relationship set, general attributes and constraints.
5. Semi-structured Data Model
The semi-structured data model permits the specification of data where individual data items
of the same type may have different sets of attributes. This is in contrast to the data models
mentioned earlier, where every data item of a particular type must have the same set of
attributes. The Extensible Markup Language (XML) is widely used to represent
semi-structured data.
Historically, the network data model and the hierarchical data model preceded the relational
data model. These models were tied closely to the underlying implementation, and
complicated the task of modeling data.
o DDL stands for Data Definition Language. It is used to define database structure or pattern.
o It is used to create schema, tables, indexes, constraints, etc. in the database.
o Using the DDL statements, you can create the skeleton of the database.
o Data definition language is used to store the information of metadata like the number of tables
and schemas, their names, indexes, columns in each table, constraints, etc.
o A DBMS has appropriate languages and interfaces to express database queries and updates.
o Database languages can be used to read, store and update the data in the database.
DDL commands change the database schema, which is why they are grouped under data definition
language.
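As an illustration of schema-changing statements, the sketch below runs three standard SQL DDL statements (CREATE TABLE, ALTER TABLE, DROP TABLE) through Python's built-in sqlite3 module. The department table and the helper table_names are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def table_names(connection):
    """List the tables recorded in SQLite's catalog (its metadata)."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]

# CREATE: build the skeleton of the database.
conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# ALTER: change the structure of an existing table.
conn.execute("ALTER TABLE department ADD COLUMN building TEXT")
columns = [row[1] for row in conn.execute("PRAGMA table_info(department)")]
tables_after_create = table_names(conn)

# DROP: remove the table definition together with its data.
conn.execute("DROP TABLE department")
tables_after_drop = table_names(conn)
```

None of these statements touches individual rows; each one changes the stored description of the database.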
DML stands for Data Manipulation Language. It is used for accessing and manipulating data in a
database. It handles user requests.
• Procedural DMLs require a user to specify what data are needed and how to get those data.
• Declarative DMLs (also called nonprocedural DMLs) require a user to
specify what data are needed without specifying how to get those data.
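The sketch below shows DML statements at work and contrasts the two styles of data access. It assumes Python's built-in sqlite3 module and a hypothetical account table; the row-by-row loop is only an imitation of the procedural style, written in Python for comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (acc_no INTEGER PRIMARY KEY, owner TEXT, balance REAL)")

# DML: insert, update, and delete rows.
conn.executemany(
    "INSERT INTO account VALUES (?, ?, ?)",
    [(101, "Kiran", 500.0), (102, "Lata", 50.0), (103, "Mohan", 900.0)],
)
conn.execute("UPDATE account SET balance = balance + 100 WHERE acc_no = 102")
conn.execute("DELETE FROM account WHERE acc_no = 103")

# Declarative style: state what is wanted; the DBMS decides how to find it.
declarative = conn.execute(
    "SELECT owner FROM account WHERE balance >= 100 ORDER BY owner"
).fetchall()

# Procedural style: spell out how to get the data, one row at a time.
procedural = []
for owner, balance in conn.execute("SELECT owner, balance FROM account"):
    if balance >= 100:
        procedural.append((owner,))
procedural.sort()
```

Both approaches produce the same rows; the declarative query simply leaves the choice of access strategy to the database system.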
o DCL stands for Data Control Language. It is used to control access to the stored data by
granting and revoking privileges.
o In some systems, DCL execution is transactional and can be rolled back.
(In an Oracle database, however, the execution of data control language statements cannot be
rolled back.)
There are the following operations which have the authorization of Revoke:
TCL stands for Transaction Control Language. It is used to manage the changes made by DML
statements, which can be grouped into logical transactions. Here are some tasks that come under TCL:
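Committing and rolling back a transaction are two such tasks in standard SQL. The following sketch assumes Python's built-in sqlite3 module, whose connection objects expose commit() and rollback(), and a hypothetical account table; it shows that an uncommitted change disappears on rollback while a committed one persists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (acc_no INTEGER PRIMARY KEY, balance REAL)")
conn.execute("INSERT INTO account VALUES (1, 100.0)")
conn.commit()                       # make the insert permanent

# A change that is rolled back never becomes part of the database.
conn.execute("UPDATE account SET balance = balance - 40 WHERE acc_no = 1")
conn.rollback()
balance_after_rollback = conn.execute("SELECT balance FROM account").fetchone()[0]

# The same change, this time committed.
conn.execute("UPDATE account SET balance = balance - 40 WHERE acc_no = 1")
conn.commit()
balance_after_commit = conn.execute("SELECT balance FROM account").fetchone()[0]
```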
DBMS Architecture
o The DBMS design depends upon its architecture. The basic client/server architecture is used to
deal with a large number of PCs, web servers, database servers and other components that are
connected with networks.
o The client/server architecture consists of many PCs and a workstation which are connected via the
network.
o DBMS architecture depends upon how users are connected to the database to get their request
done.
Database architecture can be seen as single-tier or multi-tier. Logically, multi-tier database
architecture is of two types: 2-tier architecture and 3-tier architecture.
1-Tier Architecture
o In this architecture, the database is directly available to the user. It means the user works
directly on the DBMS itself.
o Any changes done here will directly be done on the database itself. It doesn't provide a handy tool
for end users.
o The 1-Tier architecture is used for development of the local application, where programmers can
directly communicate with the database for the quick response.
2-Tier Architecture
o The 2-Tier architecture is the same as the basic client-server architecture. In the two-tier
architecture, applications on the client end can directly communicate with the database at the
server side. For this interaction, APIs such as ODBC and JDBC are used.
o The user interfaces and application programs are run on the client side.
o The server side is responsible for providing functionality such as query processing and
transaction management.
o To communicate with the DBMS, client-side application establishes a connection with the server
side.
3-Tier Architecture
o The 3-Tier architecture contains another layer between the client and the server. In this
architecture, the client cannot directly communicate with the server.
o The application on the client end interacts with an application server, which in turn
communicates with the database system.
o The end user has no idea about the existence of the database beyond the application server. The
database also has no idea about any user beyond the application.
o The 3-Tier architecture is used in the case of large web applications.
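The separation of the three tiers can be sketched with three small classes. This is a toy, single-process illustration, not a real networked deployment; the class names (DatabaseTier, ApplicationServer, Client), the product table, and the reply format are all assumptions made for the example.

```python
import sqlite3

class DatabaseTier:
    """Data tier: owns the database; only the application server talks to it."""
    def __init__(self):
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute("CREATE TABLE product (name TEXT PRIMARY KEY, price REAL)")
        self._conn.execute("INSERT INTO product VALUES ('pen', 10.0)")

    def price_of(self, name):
        row = self._conn.execute(
            "SELECT price FROM product WHERE name = ?", (name,)
        ).fetchone()
        return None if row is None else row[0]

class ApplicationServer:
    """Middle tier: sits between the client and the database."""
    def __init__(self, database):
        self._database = database

    def handle_price_request(self, name):
        price = self._database.price_of(name)
        if price is None:
            return {"status": "not found"}
        return {"status": "ok", "name": name, "price": price}

class Client:
    """Client tier: knows only the application server, never the database."""
    def __init__(self, server):
        self._server = server

    def show_price(self, name):
        reply = self._server.handle_price_request(name)
        if reply["status"] == "ok":
            return f"{reply['name']}: {reply['price']:.2f}"
        return f"{name}: not available"

client = Client(ApplicationServer(DatabaseTier()))
print(client.show_price("pen"))
```

The Client class holds no reference to the database, which mirrors the point that the end user is unaware of anything beyond the application server.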
• Naïve users are unsophisticated users who interact with the system by invoking one of the
application programs that have been written previously.
• For example, a clerk in the university who needs to add a new instructor to department A
invokes a program called new hire. This program asks the clerk for the name of the new
instructor, her new ID, the name of the department (that is, A), and the salary. The typical user
interface for naïve users is a forms interface, where the user can fill in appropriate fields of
the form. Naïve users may also simply read reports generated from the database.
• Application programmers are computer professionals who write application programs.
Application programmers can choose from many tools to develop user interfaces. Rapid
application development (RAD) tools are tools that enable an application programmer to
construct forms and reports with minimal programming effort.
• Sophisticated users interact with the system without writing programs. Instead, they form
their requests either using a database query language or by using tools such as data analysis
software. Analysts who submit queries to explore data in the database fall in this category.
• Specialized users are sophisticated users who write specialized database applications that do
not fit into the traditional data-processing framework. Among these applications are
computer-aided design systems, knowledge-base and expert systems, systems that store data
with complex data types (for example, graphics data and audio data), and
environment-modeling systems.
Database Administrator
One of the main reasons for using DBMSs is to have central control of both the data and the programs
that access those data. A person who has such central control over the system is called a database
administrator (DBA). The functions of a DBA include:
• Schema definition. The DBA creates the original database schema by executing a set of data
definition statements in the DDL.
• Storage structure and access-method definition.
• Schema and physical-organization modification. The DBA carries out changes to the schema and
physical organization to reflect the changing needs of the organization, or to alter the physical
organization to improve performance.
• Granting of authorization for data access. By granting different types of authorization, the
database administrator can regulate which parts of the database various users can access. The
authorization information is kept in a special system structure that the database system consults
whenever someone attempts to access the data in the system.
• Routine maintenance. Examples of the database administrator’s routine maintenance activities are:
◦ Periodically backing up the database, either onto tapes or onto remote servers, to prevent loss of
data in case of disasters such as flooding.
◦ Ensuring that enough free disk space is available for normal operations, and upgrading disk space
as required.
◦ Monitoring jobs running on the database and ensuring that performance is not degraded by very
expensive tasks submitted by some users.
Late 1990s:
Large decision support and data-mining applications
Large multi-terabyte data warehouses
Emergence of Web commerce
Early 2000s:
XML and XQuery standards
Automated database administration
Later 2000s:
Web databases (semi-structured data, XML, complex data types)
Cloud computing
Giant data storage systems (Google BigTable, Yahoo PNuts, Amazon
Web Services, …)
Advanced databases (mainly non-relational (e.g., graph-based, text-based) but also advanced
relational)
1.10 Entity Relationship model:
o ER model stands for an Entity-Relationship model. It is a high-level data model. This model
is used to define the data elements and relationship for a specified system.
o Attributes of an entity, relationship sets, and attributes of relationship sets can be represented
with the help of an ER diagram.
o Entities are represented by means of rectangles. Rectangles are named with the entity set they
represent.
o It develops a conceptual design for the database. It also develops a very simple and easy to
design view of data.
o In ER modeling, the database structure is portrayed as a diagram called an entity-relationship
diagram.
For example, suppose we design a school database. In this database, the student will be an entity
with attributes like address, name, id, age, etc. The address can be another entity with attributes like
city, street name, pin code, etc., and there will be a relationship between them.
Component of ER Diagram
An entity in DBMS (Database management System) is a real-world thing or a real-world object which is
distinguishable from other objects in the real world. For example, a car is an entity.
Entity Type:
1. Strong Entity Type: It is an entity that has its own existence and is independent.
• The entity relationship diagram represents a strong entity type with the help of a single rectangle.
Below is the ERD of the strong entity type:
• In the above example, the "Customer" is the entity type with attributes such as ID, Name, Gender,
and Phone Number. Customer is a strong entity type as it has a unique ID for each customer.
2. Weak Entity Type: It is an entity that does not have its own existence and relies on a strong
entity for its existence.
• The Entity Relationship Diagram represents the weak entity type using double rectangles. Below
is the ERD of the weak entity type:
• In the above example, "Address" is a weak entity type with attributes such as House No., City,
Location, and State.
• The relationship between a strong and a weak entity type is known as an identifying relationship.
• Using a double diamond, the Entity-Relationship Diagram represents a relationship between the
strong and the weak entity type.
• Let us see an example of the relationship between the Strong entity type and weak entity type with
the help of ER Diagram:
• Example of Entity Relationship Diagram representation of the above weak entity set:
Strong entity set | Weak entity set
A strong entity set always has a primary key. | A weak entity set does not have enough attributes to build a primary key.
It is represented by a rectangle symbol. | It is represented by a double rectangle symbol.
It contains a primary key, represented by an underline. | It contains a partial key, represented by a dashed underline.
A member of a strong entity set is called a dominant entity. | A member of a weak entity set is called a subordinate entity.
The primary key is one of its attributes and helps to identify its members. | Its members are identified by combining the primary key of the strong entity set with the partial key of the weak entity set.
The relationship between two strong entity sets is shown by a diamond symbol. | The relationship between a strong and a weak entity set is shown by a double diamond symbol.
The line connecting a strong entity set to the relationship is single. | The line connecting the weak entity set to the identifying relationship is double.
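One common way to carry a weak entity set into a relational schema is to build its key from the owner's primary key plus its own partial key. The sketch below is an illustration under assumptions, not a prescribed mapping: it uses Python's built-in sqlite3 module, treats house_no as the partial key of Address, and uses ON DELETE CASCADE to express that an address cannot exist without its customer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces foreign keys only when enabled

# Strong entity set: has its own primary key.
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# Weak entity set: key = owner's primary key + partial key (house_no).
conn.execute(
    "CREATE TABLE address ("
    " customer_id INTEGER NOT NULL REFERENCES customer(id) ON DELETE CASCADE,"
    " house_no TEXT NOT NULL,"
    " city TEXT,"
    " PRIMARY KEY (customer_id, house_no))"
)
conn.execute("INSERT INTO customer VALUES (1, 'Anil')")
conn.execute("INSERT INTO customer VALUES (2, 'Bina')")
# The same house number may appear under different customers.
conn.executemany(
    "INSERT INTO address VALUES (?, ?, ?)",
    [(1, "12A", "Pune"), (2, "12A", "Delhi")],
)

# An address without an owning customer is rejected.
try:
    conn.execute("INSERT INTO address VALUES (99, '1', 'Goa')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Deleting the owner removes its dependent addresses.
conn.execute("DELETE FROM customer WHERE id = 1")
remaining_addresses = conn.execute(
    "SELECT customer_id, house_no, city FROM address"
).fetchall()
```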
1.10.3 Attributes
1. Key attribute:
A key attribute can uniquely identify an entity in an entity set. For example, a student roll number
can uniquely identify a student in a set of students. A key attribute is represented by an oval, the
same as other attributes, but the text of a key attribute is underlined.
2. Composite attribute:
An attribute that is a combination of other attributes is known as a composite attribute. For example,
in the student entity, the student address is a composite attribute because an address is composed of
other attributes such as pin code, state, and country.
3. Multivalued attribute:
An attribute that can hold multiple values is known as a multivalued attribute. It is represented
with double ovals in an ER diagram. For example, a person can have more than one phone
number, so the phone number attribute is multivalued.
4. Derived attribute:
A derived attribute is one whose value is dynamic and derived from another attribute. It is
represented by dashed oval in an ER Diagram. For example – Person age is a derived attribute as
it changes over time and can be derived from another attribute (Date of birth).
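The four kinds of attribute can be mirrored in a small data structure. This is a minimal sketch using Python dataclasses; the class and field names, the sample values, and the method age_on are assumptions chosen to match the examples in the text.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class Address:
    """Composite attribute: made up of other attributes."""
    pin_code: str
    state: str
    country: str

@dataclass
class Student:
    roll_number: int                       # key attribute
    name: str
    address: Address                       # composite attribute
    date_of_birth: date
    phone_numbers: list = field(default_factory=list)   # multivalued attribute

    def age_on(self, today):
        """Derived attribute: computed from date_of_birth instead of being stored."""
        years = today.year - self.date_of_birth.year
        birthday_pending = (today.month, today.day) < (
            self.date_of_birth.month,
            self.date_of_birth.day,
        )
        return years - 1 if birthday_pending else years

student = Student(
    roll_number=7,
    name="Neha",
    address=Address("411001", "Maharashtra", "India"),
    date_of_birth=date(2004, 9, 15),
    phone_numbers=["555-0101", "555-0102"],
)
age_before_birthday = student.age_on(date(2024, 9, 14))
age_on_birthday = student.age_on(date(2024, 9, 15))
```

Because age is computed on demand, it never goes stale as time passes.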
1.10.4 Keys
Key is an attribute or collection of attributes that uniquely identifies an entity among entity set.
For example, the roll_number of a student makes him/her identifiable among students.
2) Candidate Key - Candidate keys are the sets of fields from which the primary key can be
selected. In other words, candidate keys are the nominees for the primary key. A minimal
super key is called a candidate key. An entity set may have more than one candidate key.
3) Super Key - A super key is a set of one or more attributes that collectively identifies an
entity in an entity set. Every candidate key is a super key, and every super key contains a
candidate key.
4) Alternate Key - The candidate keys that are not selected as the primary key are known as
alternate keys. Alternate keys are also known as secondary keys.
5) Composite Key - A key that consists of two or more attributes that together uniquely identify
an entity occurrence is called a composite key. It is used when no single column uniquely
identifies the record or tuple.
6) Foreign Key - A foreign key is a column or group of columns in a relational database table that
provides a link between data in two tables.
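The relationship between super keys and candidate keys can be checked mechanically on sample data. The sketch below uses hypothetical helper functions (is_super_key, candidate_keys) and made-up student rows. It only tests the rows it is given, whereas a real key is a constraint that must hold for every possible instance, so treat the result as an illustration rather than a way to discover keys.

```python
from itertools import combinations

def is_super_key(rows, attributes):
    """True if no two of the given rows agree on all the given attributes."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attributes)
        if value in seen:
            return False
        seen.add(value)
    return True

def candidate_keys(rows, all_attributes):
    """Minimal super keys for the given rows, smallest first."""
    found = []
    for size in range(1, len(all_attributes) + 1):
        for combo in combinations(all_attributes, size):
            if any(set(key) <= set(combo) for key in found):
                continue                   # contains a smaller key, so not minimal
            if is_super_key(rows, combo):
                found.append(combo)
    return found

students = [
    {"roll_number": 1, "email": "a@example.edu", "name": "Asha", "dept": "CS"},
    {"roll_number": 2, "email": "b@example.edu", "name": "Ravi", "dept": "CS"},
    {"roll_number": 3, "email": "c@example.edu", "name": "Asha", "dept": "CS"},
]
keys = candidate_keys(students, ["roll_number", "email", "name", "dept"])
```

Here roll_number and email are both candidate keys; if roll_number is chosen as the primary key, email becomes an alternate key, and a set such as (roll_number, name) is a super key that is not minimal.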
Relationships are represented by a diamond-shaped box. The name of the relationship is written
inside the diamond. All the entities (rectangles) participating in a relationship are connected to
it by a line.
A weak entity is a type of entity that does not have a key attribute of its own. It can be identified
uniquely only by considering the primary key of another entity. For that reason, a weak entity set
needs to have total participation in its identifying relationship.
1.10.6 Relationship
Degree of Relationship
The number of participating entities in a relationship defines the degree of the relationship.
• Binary = degree 2
• Ternary = degree 3
• n-ary = degree n
(i) One to one relationship: One instance of an entity set is associated with only one instance
of the other entity set.
(ii) One to many relationship: The relationship is said to be one to many, when
one instance of an entity set is related to more than one instance of another
entity set.
Ex.
(a) Father has Children (1 : M)
(b) Teacher teaches Courses (1 : M)
(iii) Many to one relationship: When many instances of an entity set are
associated with at most one instance of another entity set.
Ex. Many students opt for a single college (Students opt for College, M : 1).
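[Figure: ER diagram in which the Customer entity set (Cid, C name, Cstreet) is connected to the
Loan entity set (Loan #, amount) through the Borrower relationship]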
(i) Total participation: Every entity in the entity set participates in at least one
relationship in the relationship set.
Eg. Participation of loan in borrower is total.
(ii) Partial participation: Some entities may not participate in any relationship
in the relationship set.
Ex. Participation of customer in borrower is partial.
[Figure: attributes shown in the ER diagram: cid, C name, Cstreet, and C City for Customer;
Loan # and amount for Loan]
• One-to-many − One entity from entity set A can be associated with more than one entity
of entity set B; however, an entity from entity set B can be associated with at most one
entity of entity set A.
• Many-to-one − More than one entity from entity set A can be associated with at most one
entity of entity set B; however, an entity from entity set B can be associated with more than
one entity from entity set A.
• Many-to-many − One entity from A can be associated with more than one entity from B
and vice versa.
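The mapping cardinalities can be told apart by counting how many partners each entity has. The function below is a hypothetical helper (mapping_cardinality) that classifies a relationship set given as (a, b) pairs; it only describes the pairs it is handed, so it reports the observed cardinality rather than the designer's intended constraint. The sample data reuses the Teacher/Courses and Students/College examples.

```python
from collections import defaultdict

def mapping_cardinality(pairs):
    """Classify a relationship set given as (a, b) pairs of associated entities."""
    partners_of_a = defaultdict(set)
    partners_of_b = defaultdict(set)
    for a, b in pairs:
        partners_of_a[a].add(b)
        partners_of_b[b].add(a)
    a_has_many = any(len(bs) > 1 for bs in partners_of_a.values())
    b_has_many = any(len(a_set) > 1 for a_set in partners_of_b.values())
    if a_has_many and b_has_many:
        return "many-to-many"
    if a_has_many:
        return "one-to-many"
    if b_has_many:
        return "many-to-one"
    return "one-to-one"

teaches = [("T1", "C1"), ("T1", "C2"), ("T2", "C3")]      # a teacher teaches several courses
opts_for = [("S1", "College"), ("S2", "College")]          # many students, one college
enrolls = [("S1", "C1"), ("S1", "C2"), ("S2", "C1")]      # students and courses

teaches_kind = mapping_cardinality(teaches)
opts_for_kind = mapping_cardinality(opts_for)
enrolls_kind = mapping_cardinality(enrolls)
```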
Constraints are used for modeling limitations on the relations between entities.
There are two types of constraints on the Entity Relationship (ER) model −
• One-to-many − When more than one instance of an entity is associated with a relationship,
it is marked as '1:N'. The following image reflects that only one instance of entity on the
left and more than one instance of an entity on the right can be associated with the
relationship. It depicts one-to-many relationship.
• Many-to-one − When more than one instance of entity is associated with the relationship,
it is marked as 'N:1'. The following image reflects that more than one instance of an entity
on the left and only one instance of an entity on the right can be associated with the
relationship. It depicts many-to-one relationship.
• Many-to-many − The following image reflects that more than one instance of an entity on
the left and more than one instance of an entity on the right can be associated with the
relationship. It depicts many-to-many relationship.
Participation Constraints
• Total Participation − Each entity is involved in the relationship. Total participation is
represented by double lines.
• Partial participation − Not all entities are involved in the relationship. Partial participation
is represented by single lines.
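Total and partial participation can be checked by comparing an entity set with the entities that actually appear in the relationship set. The sketch below uses a hypothetical helper (participation) and made-up identifiers for the customer, loan, and borrower example.

```python
def participation(entity_set, relationship_pairs, position):
    """Return 'total' if every entity appears in the relationship, else 'partial'.

    position is 0 for the first member of each pair and 1 for the second.
    """
    participating = {pair[position] for pair in relationship_pairs}
    return "total" if set(entity_set) <= participating else "partial"

customers = {"C1", "C2", "C3"}
loans = {"L1", "L2"}
# borrower relationship set: (customer, loan) pairs
borrower = [("C1", "L1"), ("C2", "L2"), ("C1", "L2")]

loan_participation = participation(loans, borrower, 1)
customer_participation = participation(customers, borrower, 0)
```

Every loan has at least one borrower, so the participation of loan is total; customer C3 has no loan, so the participation of customer is partial.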
Subclasses
A subclass is a class derived from the superclass. It inherits the properties of the superclass and also
contains attributes of its own. An example is:
Car, Truck and Motorcycle are all subclasses of the superclass Vehicle. They all inherit common
attributes from Vehicle, such as speed and colour, while they also have attributes that differ: for
example, the number of wheels is 4 for a Car and 2 for a Motorcycle.
Super classes
A superclass is the class from which many subclasses can be created. The subclasses inherit the
characteristics of a superclass. The superclass is also known as the parent class or base class.
In the above example, Vehicle is the Superclass and its subclasses are Car, Truck and Motorcycle.
Generalization
Generalization is the process of combining entities into a generalized entity that contains the
properties they have in common. In generalization, a number of
entities are brought together into one generalized entity based on their similar characteristics. For
example, pigeon, house sparrow, crow and dove can all be generalized as Birds.
Specialization
Specialization is the opposite of generalization. In specialization, a group of entities is divided into
sub-groups based on their characteristics. Take a group ‘Person’ for example. A person has a name,
date of birth, gender, etc. These properties are common to all persons. But in a
company, persons can be identified as employee, employer, customer, or vendor, based on what
role they play in the company.
Similarly, in a school database, persons can be specialized as teacher, student, or staff, based on
what role they play in the school.
Inheritance
The features of the ER model described above correspond to the way classes of objects are created
in object-oriented programming. The details of entities are generally hidden from the user; this
process is known as abstraction.
Inheritance is an important feature of Generalization and Specialization. It allows lower-level
entities to inherit the attributes of higher-level entities.
For example, the attributes of a Person class such as name, age, and gender can be inherited by
lower-level entities such as Student or Teacher.
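The Person example maps directly onto class inheritance in an object-oriented language. This is a minimal Python sketch; the extra attributes roll_number and subject, and the sample values, are assumptions added to show that lower-level entities also carry attributes of their own.

```python
class Person:
    """Higher-level entity: attributes shared by every person."""
    def __init__(self, name, age, gender):
        self.name = name
        self.age = age
        self.gender = gender

class Student(Person):
    """Lower-level entity: inherits name, age, gender and adds roll_number."""
    def __init__(self, name, age, gender, roll_number):
        super().__init__(name, age, gender)
        self.roll_number = roll_number

class Teacher(Person):
    """Lower-level entity: inherits name, age, gender and adds subject."""
    def __init__(self, name, age, gender, subject):
        super().__init__(name, age, gender)
        self.subject = subject

student = Student("Neha", 19, "F", roll_number=7)
teacher = Teacher("Arun", 41, "M", subject="Databases")
people = [student, teacher]
names = [p.name for p in people]     # both can be treated as Person
```

Going from Person down to Student and Teacher corresponds to specialization; going from Student and Teacher up to Person corresponds to generalization.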