22CS4201 DBMS UNIT 1 - Notes
Introduction
• Definition: A Database Management System (DBMS) is a collection of
interrelated data and various programs that are used to handle that data.
• The primary goal of a DBMS is to provide a way to store and retrieve the
required information from the database in a convenient and efficient manner.
• Managing the data in a database involves two important tasks -
(i) Defining the structure for storage of information.
(ii) Providing mechanisms for manipulation of information.
• In addition, the database system must ensure the safety of the information stored.
Database System Applications
There is a wide range of applications that make use of database systems. Some
of the applications are -
1) Accounting: Database systems are used for maintaining information about
employees, salaries, and payroll taxes.
2) Manufacturing: Database systems are used for managing the supply chain and
tracking the production of items in factories.
3) Sales: Databases are used for maintaining customer, product and purchase
information.
4) Banking: In the banking sector, the DBMS is used for customer information,
accounts, loans and banking transactions.
5) Credit card transactions: Database systems are used for purchases on credit
cards and generation of monthly statements.
6) Universities: Database systems are used in universities for maintaining
student information, course registration, and accounting.
7) Reservation systems: In airline/railway reservation systems, the database is
used to maintain the reservation and schedule information.
8) Telecommunication: In telecommunications for keeping records of the calls
made, generating monthly bills, maintaining balances on prepaid calling cards,
and storing information about communication networks the database systems
are used.
Purpose of Database System
• Early database systems were created to manage commercial data. This data was
typically stored in files. To allow users to manipulate these files, various
programs were written for
1) Addition of new data
2) Updating the data
3) Deleting the data.
• Whenever a new need arose, a separate application program had to be written.
Thus, as time goes by, the system acquires more files and more application
programs.
• This typical file processing system is supported by a conventional operating
system. Thus the file processing system can be described as -
• A system that stores permanent records in files and needs different
application programs to extract or add records.
Before database management systems were introduced, this file processing system
was in use. However, such a system has many drawbacks. Let us discuss them.
Disadvantages of Traditional File Processing System
The traditional file system has the following disadvantages:
1) Data redundancy: Data redundancy means duplication of data at several
places. Since different programmers create different files and these files might
have different structures, the same information may appear repeatedly, in the
same or a different format, at several places.
2) Data inconsistency: Data inconsistency occurs when the various copies of the
same data no longer match. For example, a changed address of an employee
may be reflected in one department while another department still holds the old
address (or no address at all).
3) Difficulty in accessing data: The conventional file system does not allow the
desired data to be retrieved in an efficient and convenient manner.
4) Data isolation: As the data is scattered over several files and the files may be in
different formats, it becomes difficult to retrieve the desired data when writing
a new application.
5) Integrity problems: Data integrity means that data values entered in the database
fall within a specified range and are of the correct format. With the use of several
files, enforcing such constraints on the data becomes difficult.
6) Atomicity problems: Atomicity means that a particular operation must be
carried out entirely or not at all. It is difficult to ensure
atomicity in a conventional file processing system.
7) Concurrent access anomalies: For efficient execution, multiple users may update
data simultaneously; in such a case the updates need to be synchronized. In
traditional file systems, data is distributed over multiple files that are accessed by
many different application programs, so concurrent access is difficult to
supervise and may leave the data inconsistent.
8) Security problems: Not every user should be allowed to access all the data of
the database system. Since application programs in a file system are added in an ad
hoc manner, enforcing such security constraints becomes difficult.
Database systems offer solutions to all the above mentioned problems.
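As an illustration of how a database system addresses the atomicity and integrity problems listed above, the following is a minimal sketch using Python's built-in sqlite3 module. The account table, the account ids and the amounts are assumptions made only for this example; they do not come from the notes.

```python
import sqlite3

# Hypothetical in-memory bank database; table and values are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " id TEXT PRIMARY KEY,"
    " balance INTEGER CHECK (balance >= 0))"  # integrity constraint
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; either both updates happen or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance; the credit
        # to dst made earlier in the same transaction is undone as well.
        return False


ok = transfer(conn, "A", "B", 30)       # succeeds
failed = transfer(conn, "A", "B", 500)  # violates the CHECK, rolled back
balances = dict(conn.execute("SELECT id, balance FROM account"))
```

The second transfer credits account B before the debit fails, yet the rollback leaves both balances as they were after the first transfer. This all-or-nothing behaviour is hard to achieve with plain files.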
Difference between Database System and Conventional File System
Views of Data
• A database system is a collection of interrelated data and a set of programs that
allow users to access or modify the data.
• An abstract view of the system is a view in which the system hides certain details
of how the data are stored and maintained.
• The main purpose of database systems is to provide users with an abstract view of
the data.
• This view of the system helps the user to retrieve data efficiently.
• To simplify the user's interaction with the system, there are several levels of
abstraction: the physical level, the logical level and the view level.
Data Abstraction
Data abstraction: Data abstraction means presenting only the required
information about the system and hiding the background details.
There are several levels of abstraction that simplify the user's interaction with
the system. These are
1) Physical level:
• This is the lowest level.
• This level describes how actually the data are stored.
• This level describes complex low level data structures.
2) Logical level:
• This is the next higher level, which describes what data are stored in the
database.
• This level also describes the relationships among the data.
• The logical level thus describes the entire database in terms of a small number
of relatively simple structures.
• The database administrators use logical level of abstraction for deciding what
information to keep in database.
3) View level:
• This is the highest level of abstraction, which describes only part of the entire
database.
• The view level can provide access to only part of the database.
• This level helps in simplifying the interaction with the system.
• The system can provide multiple views of the same database.
• For example, a clerk at a reservation system can see only part of the database and
can access the required information about a passenger.
Fig. 1.3.1 shows the relationship among the three levels of abstraction.
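The view level can be sketched with an SQL view, here created through Python's built-in sqlite3 module. The passenger table, its columns and the sample row are assumptions for illustration; the notes only say that a reservation clerk sees part of the database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical level: the full passenger record (hypothetical columns).
conn.execute(
    "CREATE TABLE passenger ("
    " pnr TEXT PRIMARY KEY, name TEXT, seat TEXT, card_number TEXT)"
)
conn.execute(
    "INSERT INTO passenger VALUES ('P1', 'Ram', '12A', '4111-0000-0000-0000')"
)

# View level: the clerk's view hides the payment details.
conn.execute("CREATE VIEW clerk_view AS SELECT pnr, name, seat FROM passenger")

cur = conn.execute("SELECT * FROM clerk_view")
clerk_columns = [column[0] for column in cur.description]
clerk_rows = cur.fetchall()
```

A query against clerk_view never returns card_number, although the value is still stored in the underlying table.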
1. Internal level:
• It contains internal schema.
• This schema represents the physical storage structure of database.
• This schema is maintained by the software and user is not allowed to modify
it.
• This level is closest to the physical storage. It typically describes the record
layout of the files and types of files, access paths etc.
2. Conceptual level:
• It contains conceptual schema.
• This schema hides the details of internal level.
• This level is also called the logical level, as it contains the constructs used for
designing the database.
• It contains information such as table names, their columns, indexes, constraints
and database operations.
• A representational data model is used to describe conceptual schema when a
database system is implemented.
3. External level:
• It contains the external schema or user views.
• At this level, the user sees only the data stored in the database, either all
the data values or specific records. The user has no information about how the
data are stored in the database.
• The processes of transforming requests and results between levels are called
mappings.
• In the three schema architecture there are two mappings -
1) External - Conceptual Mapping and
2) Conceptual - Internal Mapping.
Data Independence
Definition: Data independence is the ability to change the schema at one level
without affecting the schema at the next higher level. Here the level can be
physical, conceptual or external.
Data independence is one of the important characteristics of a database
management system.
By this property, the structure of the database or the values stored in it can be
modified easily without changing the application programs.
There are two types of data independence: physical data independence and
logical data independence.
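A small sketch of data independence at the physical level (usually called physical data independence): the storage structure is changed by adding an index, while the application's query text and its result stay the same. The student table, its rows and the index name are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (seat_no INTEGER, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(101, "Ram", "Chennai"), (102, "Sita", "Pune")],
)

# The application program only knows this query.
QUERY = "SELECT name FROM student WHERE city = ? ORDER BY name"

before = conn.execute(QUERY, ("Chennai",)).fetchall()

# Physical-level change: add an access path (index) on city.
conn.execute("CREATE INDEX idx_student_city ON student(city)")

after = conn.execute(QUERY, ("Chennai",)).fetchall()
```

The query is not rewritten after the index is created; only the way the DBMS locates the rows may change.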
Instances: When information is inserted into or deleted from the database, the
database changes. The collection of information stored in the database at a
particular moment is called an instance. For example - following is an instance of
the student database
Types of Schema: The database has several schemas based on the levels of
abstraction.
(1) Physical Schema: The physical schema is a database design described at the
physical level of abstraction.
(2) Logical Schema: The logical schema is a database design at the logical
level of abstraction.
(3) Subschema: A database may have several schemas at the view level, which are
called subschemas.
Data Models
Definition: A data model is a collection of conceptual tools for describing data,
relationships among data, semantics (meaning) of data and consistency constraints.
• The data model is the structure underlying the database.
• The data model provides a way to describe the design of a database at the physical,
logical and view levels.
The following four data models are used for understanding the structure of the
database:
(1) Relational model:
• The relational model consists of a collection of tables which store data and also
represent the relationships among the data.
• A table is also known as a relation.
• A table contains one or more columns and each column has a unique name.
• Each table contains records of a particular type, and each record type defines a
fixed number of fields or attributes.
• For example - The following figure shows the relational model through the
relationship between the Student and Result tables. For example - Student Ram
lives in city Chennai and his marks are 78. The relationship between these
two tables is maintained by the SeatNo column.
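The Student and Result example above can be sketched as two tables joined on the shared SeatNo column, here using Python's built-in sqlite3 module. The column layout and the seat number 101 are assumptions, since the original figure is not available; only the values Ram, Chennai and 78 come from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Student (SeatNo INTEGER PRIMARY KEY, Name TEXT, City TEXT);
    CREATE TABLE Result  (SeatNo INTEGER REFERENCES Student(SeatNo),
                          Marks  INTEGER);
    INSERT INTO Student VALUES (101, 'Ram', 'Chennai');  -- 101 is assumed
    INSERT INTO Result  VALUES (101, 78);
    """
)

# The common SeatNo column relates a student to his or her result.
rows = conn.execute(
    "SELECT s.Name, s.City, r.Marks "
    "FROM Student AS s JOIN Result AS r ON s.SeatNo = r.SeatNo"
).fetchall()
```

The join recovers the combined fact (Ram, Chennai, 78) without storing the name or city a second time in the Result table.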
Advantages:
(i) Structural Independence: Structural independence is the ability to
make changes in one database structure without affecting others. The
relational model has structural independence. Hence making the required
changes in the database is convenient in the relational database model.
(ii) Conceptual Simplicity: The relational model allows the designer to simply
focus on logical design and not on physical design. Hence relational models are
conceptually simple to understand.
(iii) Query Capability: Using a simple query language (such as SQL) the user can
get information from the database or the designer can manipulate the database
structure.
(iv) Easy design, maintenance and usage: Relational models can be
designed logically, hence they are easy to maintain and use.
Disadvantages:
(i) Relational model requires powerful hardware and large data storage devices.
(ii) May lead to slower processing time.
(iii) Poorly designed systems lead to poor implementation of database systems.
(2) Entity relationship model:
• As the name suggests the entity relationship model uses collection of basic
objects called entities and relationships.
• The entity is a thing or object in the real world.
• The entity relationship model is widely used in database design.
• For example - Following is a representation of the Entity Relationship model in
which the relationship works_for is between the entities Employee and
Department.
Advantages:
i) Simple: It is simple to draw ER diagram when we know entities and
relationships.
ii) Easy to understand: The design of ER diagram is very logical and hence
they are easy to design and understand.
iii) Effective: It is effective communication tool.
iv) Integrated: The ER model can be easily integrated with Relational model.
v) Easy conversion: ER model can be converted easily into other type of
models.
Disadvantages:
i) Loss of information: While drawing ER model some information can be
hidden or lost.
ii) Limited relationships: The ER model can represent limited relationships as
compared to other models.
iii) No Representation for data manipulation: It is not possible to represent
data manipulation in ER model.
iv) No industry standard: There is no industry standard for notations of ER
diagram.
(3) Object Based Data Model:
• Object oriented languages like C++, Java and C# have become
dominant in software development.
• This led to the object based data model.
• The object based data model combines object oriented features with the
relational data model.
Advantages:
i) Enriched modelling: The object based data model has capability of
modelling the real world objects.
ii) Reusability: There are certain features of object oriented design such as
inheritance, polymorphism which help in reusability.
iii) Support for schema evolution: There is a tight coupling between data and
applications, hence there is strong support for schema evolution.
iv) Improved performance: Using the object based data model there can be a
significant improvement in performance.
Disadvantages:
i) Lack of universal data model: There is no universally agreed data model for
an object based data model, and most models lack a theoretical foundation.
ii) Lack of experience: In comparison with relational database management, the
use of the object based data model is limited. This model is more dependent on
skilled programmers.
iii) Complex: More functionalities present in object based data model make the
design complex.
(4) Semi-structured data model:
• The semi-structured data model permits the specification of data where
individual data items of same type may have different sets of attributes.
• The Extensible Markup Language (XML) is widely used to represent
semi-structured data.
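The idea that items of the same type may carry different sets of attributes can be sketched with a small XML document parsed by Python's built-in xml.etree.ElementTree module. The element names and values below are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Two <student> items of the same type with different attribute sets:
# the first has a city, the second has two phone numbers instead.
XML_DATA = """
<students>
  <student><name>Ram</name><city>Chennai</city></student>
  <student><name>Sita</name><phone>555-0100</phone><phone>555-0101</phone></student>
</students>
"""

root = ET.fromstring(XML_DATA)

# Collect the set of child tags each student actually carries.
attribute_sets = [
    {child.tag for child in student} for student in root.findall("student")
]
```

No fixed schema forces both students to have the same fields, which is the flexibility (and the source of less efficient querying) that the semi-structured model trades on.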
Advantages
i) Data is not constrained by fixed schema.
ii) It is flexible.
iii) It is portable.
Disadvantages
i) Queries are less efficient than in other types of data model.
ER Diagrams
An E-R diagram can express the overall logical structure of a database
graphically. E-R diagrams are used to model real-world objects like a person, a
car or a company, and the relations between these real-world objects.
Features of ER model
i) E-R diagrams are used to represent the E-R model of a database, which makes
them easy to convert into relations (tables).
ii) E-R diagrams serve the purpose of modeling real-world objects, which
makes them very useful.
iii) E-R diagrams require no technical knowledge and no hardware support.
iv) These diagrams are very easy to understand and easy to create, even for a
naive user.
v) They give a standard way of visualizing the data logically.
Various Components used in ER Model are-
Mapping Cardinality Representation using ER Diagram
There are four types of mapping cardinalities that are considered for key constraints.
i) One to one: When an entity in A is associated with at most one entity in
B, and vice versa, there is a one to one relation. For example - One project
manager manages only one project.
ii) One to many: When one entity in A is associated with more than one entity in
B, there is a one to many relation. For example - One customer places
many orders.
iii) Many to one: When more than one entity in A is associated with only one
entity in B, there is a many to one relation. For example - Many students take a
ComputerSciCourse.
iv) Many to many: When more than one entity in A is associated with more than
one entity in B, there is a many to many relation. For example - Many teachers
can teach many students.
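The four cardinalities can be sketched as a small Python function that inspects the (A, B) pairs of one relationship instance and reports which pattern they show. This only classifies sample data; a real design states the cardinality as a constraint on the schema. The function name and the sample identifiers are assumptions for illustration.

```python
from collections import defaultdict


def mapping_cardinality(pairs):
    """Classify (a, b) pairs as one-to-one, one-to-many,
    many-to-one or many-to-many."""
    b_per_a = defaultdict(set)  # entities of B linked to each entity of A
    a_per_b = defaultdict(set)  # entities of A linked to each entity of B
    for a, b in pairs:
        b_per_a[a].add(b)
        a_per_b[b].add(a)

    a_has_many = any(len(bs) > 1 for bs in b_per_a.values())
    b_has_many = any(len(as_) > 1 for as_ in a_per_b.values())

    if a_has_many and b_has_many:
        return "many-to-many"
    if a_has_many:
        return "one-to-many"
    if b_has_many:
        return "many-to-one"
    return "one-to-one"


manages = [("manager1", "project1"), ("manager2", "project2")]
places = [("customer1", "order1"), ("customer1", "order2")]
takes = [("student1", "ComputerSciCourse"), ("student2", "ComputerSciCourse")]
teaches = [("teacher1", "student1"), ("teacher1", "student2"),
           ("teacher2", "student1")]
```

Calling mapping_cardinality on the four sample lists yields the four categories in the order they are listed above.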
Ternary Relationship
The relationship in which three entities are involved is called ternary
relationship. For example -
Binary and Ternary Relationships
• Although binary relationships seem natural to most of us, in reality it is
sometimes necessary to connect three or more entities. If a relationship connects
three entities, it is called ternary or "3-ary."
• Ternary relationships are required when binary relationships are not sufficient
to accurately describe the semantics of an association among three entities.
• For example - Suppose you have a database for a company that contains the
entities PRODUCT, SUPPLIER, and CUSTOMER. The usual relationship
might be PRODUCT/SUPPLIER, where the company buys products from a
supplier - a normal binary relationship. The intersection attribute for
PRODUCT/SUPPLIER is wholesale_price.
• Now consider the CUSTOMER entity, and that the customer buys products. If
all customers pay the same price for a product, regardless of supplier, then you
have a simple binary relationship between CUSTOMER and PRODUCT. For
the CUSTOMER/PRODUCT relationship, the intersection attribute is
retail_price.
• Single ternary relation: Now consider a different scenario. Suppose the
customer buys products but the price depends not only on the product, but also
on the supplier. Suppose you needed a customerID, a productID, and a
supplierID to identify a price. Now you have an attribute that depends on three
things and hence you have a relationship between three entities (a ternary
relationship) that will have the intersection attribute, price.
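The ternary case can be sketched as a single table whose key combines all three identifiers, so that price depends on customer, product and supplier together. The table name price_list and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One row per (customer, product, supplier) combination; price is the
# intersection attribute of the ternary relationship.
conn.execute(
    "CREATE TABLE price_list ("
    " customer_id TEXT, product_id TEXT, supplier_id TEXT, price REAL,"
    " PRIMARY KEY (customer_id, product_id, supplier_id))"
)
conn.executemany(
    "INSERT INTO price_list VALUES (?, ?, ?, ?)",
    [("C1", "P1", "S1", 10.0), ("C1", "P1", "S2", 12.5)],
)


def lookup_price(customer_id, product_id, supplier_id):
    """Return the price for the given triple, or None if there is none."""
    row = conn.execute(
        "SELECT price FROM price_list "
        "WHERE customer_id = ? AND product_id = ? AND supplier_id = ?",
        (customer_id, product_id, supplier_id),
    ).fetchone()
    return row[0] if row else None
```

The same customer and product map to two different prices here, which two separate binary relationships could not record without losing the link to the supplier.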
Enhanced ER Model
Specialization and Generalization
• Some entities have relationships that form hierarchies. For instance, Employee
can be an hourly employee or contracted employee.
• In these relationship hierarchies, some entities act as superclasses and
other entities act as subclasses.
• Superclass: An entity type that represents a general concept at a high level, is
called superclass.
• Subclass: An entity type that represents a specific concept at lower levels, is
called subclass.
• The subclass is said to inherit from superclass. When a subclass inherits from
one or more superclasses, it inherits all their attributes. In addition to the
inherited attributes, a subclass can also define its own specific attributes.
• The process of making subclasses from a general concept is called
specialization. This is top-down process. In this process, the sub-groups are
identified within an entity set which have attributes that are not shared by all
entities.
• The process of making superclass from subclasses is called generalization.
This is a bottom up process. In this process multiple sets are synthesized into
high level entities.
• The symbol used for specialization/ Generalization is,
• For example - There can be two subclass entities, namely Hourly_Emps and
Contract_Emps, which are subclasses of the Employees class. We might have the
attributes hours_worked and hourly_wage defined for Hourly_Emps and an
attribute contractid defined for Contract_Emps.
Therefore, the attributes defined for an Hourly_Emps entity are the attributes of
Employees plus those of Hourly_Emps. We say that the attributes of the entity set
Employees are inherited by the entity set Hourly_Emps and that Hourly_Emps
ISA (read "is a") Employees. This can be represented by the following Fig. 2.4.1.
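The ISA hierarchy maps naturally onto class inheritance. The sketch below uses Python dataclasses; the superclass attributes ssn and name are assumptions (the notes do not list the attributes of Employees), while hours_worked, hourly_wage and contractid come from the example.

```python
from dataclasses import dataclass, fields


@dataclass
class Employee:
    # Attributes of the superclass (assumed for illustration).
    ssn: str
    name: str


@dataclass
class HourlyEmp(Employee):
    # Subclass-specific attributes from the example.
    hours_worked: float
    hourly_wage: float


@dataclass
class ContractEmp(Employee):
    contractid: str


hourly = HourlyEmp(ssn="111", name="Ram", hours_worked=40, hourly_wage=15.0)

# Inherited attributes come first, followed by the subclass's own.
hourly_attrs = [f.name for f in fields(HourlyEmp)]
```

An HourlyEmp object is also an Employee, so any code written against the superclass accepts it unchanged.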
Constraints on Specialization/Generalization
There are four types of constraints on specialization/generalization relationship.
These are -
1) Membership constraints: This kind of constraint involves
determining which entities can be members of a given lower-level entity set. There
are two types of membership constraints -
i) Condition defined: In condition-defined lower-level entity
sets, membership is evaluated on the basis of whether or not an entity satisfies an
explicit condition or predicate. For example - Consider the high-level entity set
Employee that has the attribute Employee_type. All Employee entities are
evaluated on the defining Employee_type attribute. All entities that satisfy
the condition Employee_type = "ContractEmployee" are included in Contracted
Employee. Since all the lower-level entities are evaluated on the basis of the
same attribute, this type of generalization is said to be attribute-defined.
ii) User defined: In user-defined lower-level entity sets, membership
is assigned manually by the database user.
2) Disjoint constraints: The disjoint constraint only applies when a superclass
has more than one subclass. If the subclasses are disjoint, then an entity
occurrence can be a member of only one of the subclasses. For example, a Student
entity is either a Postgraduate Student or an Undergraduate Student, but not both.
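Condition-defined membership and the disjoint constraint can be sketched in a few lines of Python. The sample entities and the value "HourlyEmployee" are assumptions; only the attribute Employee_type and the value "ContractEmployee" come from the text.

```python
# Hypothetical Employee entities, each carrying the defining attribute.
employees = [
    {"id": 1, "Employee_type": "ContractEmployee"},
    {"id": 2, "Employee_type": "HourlyEmployee"},
    {"id": 3, "Employee_type": "ContractEmployee"},
]


def members(entities, employee_type):
    """Condition-defined membership: an entity belongs to the lower-level
    set exactly when its Employee_type matches the predicate."""
    return {e["id"] for e in entities if e["Employee_type"] == employee_type}


contract = members(employees, "ContractEmployee")
hourly = members(employees, "HourlyEmployee")

# Disjoint constraint: no entity may appear in both subclasses.
is_disjoint = contract.isdisjoint(hourly)
```

Because membership is decided by a single-valued attribute, the two subclasses cannot overlap, so the disjoint constraint holds automatically in this attribute-defined case.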
Aggregation
Aggregation is a feature of the entity relationship model that allows a
relationship set to participate in another relationship set. This is indicated on an
ER diagram by drawing a dashed box around the aggregation.
For example - We treat the relationship set work and the entity sets employee
and project as a higher-level entity set called work.
Examples based on ER Diagram
An E-R diagram can express the overall logical structure of a database
graphically.
Example 2.5.1 Draw the ER diagram for banking systems (home loan
applications).
OR Draw an ER diagram corresponding to customers and loans. AU: May.-14,
Marks 8
OR Write short notes on: E-R diagram for banking system. AU: Dec.-14,
Marks 8
Solution:
Example 2.5.2 Consider the relation schema given in Figure. Design and draw
an ER diagram that captures the information of this schema. AU: May-17, Marks 5
Employee(empno,name,office,age)
Books(isbn,title,authors,publisher)
Loan(empno,isbn,date)
Solution:
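The diagram itself is not reproduced in these notes. As a companion sketch only, the given schema can be written as tables in which Loan acts as the relationship between Employee and Books. The choice of (empno, isbn) as the key of Loan, the column types and the sample rows are assumptions, not part of the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript(
    """
    CREATE TABLE Employee (empno INTEGER PRIMARY KEY, name TEXT,
                           office TEXT, age INTEGER);
    CREATE TABLE Books    (isbn TEXT PRIMARY KEY, title TEXT,
                           authors TEXT, publisher TEXT);
    -- Loan links one employee to one book on a given date.
    CREATE TABLE Loan     (empno INTEGER REFERENCES Employee(empno),
                           isbn  TEXT    REFERENCES Books(isbn),
                           date  TEXT,
                           PRIMARY KEY (empno, isbn));
    """
)

conn.execute("INSERT INTO Employee VALUES (1, 'Ram', 'B-12', 30)")
conn.execute(
    "INSERT INTO Books VALUES "
    "('isbn-1', 'Sample Title', 'Sample Author', 'Sample Publisher')"
)


def try_loan(empno, isbn, date):
    """Insert a loan; return False if a referenced entity does not exist."""
    try:
        conn.execute("INSERT INTO Loan VALUES (?, ?, ?)", (empno, isbn, date))
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_loan(1, "isbn-1", "2024-01-15")
rejected = not try_loan(99, "isbn-1", "2024-01-16")  # employee 99 is unknown
```

The foreign keys mirror the two edges of the Loan relationship in the ER diagram: a loan cannot exist without both an employee and a book.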
Example 2.5.3 Construct an E-R diagram for a car insurance company whose
customers own one or more cars each. Each car has associated with it zero to
any number of recorded accidents. Each insurance policy covers one or more
cars and has one or more premium payments associated with it. Each payment
is for a particular period of time and has an associated due date and the date when
the payment was received.
Solution:
Example 2.5.4 A car rental company maintains a database for all vehicles in
its current fleet. For all vehicles, it includes the vehicle identification number,
license number, manufacturer, model, date of purchase and color. Special data
are included for certain types of vehicles.
Trucks: Cargo capacity
Sports cars: horsepower, renter age requirement
Vans: number of passengers
Off-road vehicles: ground clearance, drivetrain (four-or two-wheel drive)
Construct an ER model for the car rental company database.
Solution:
Example 2.5.5 Draw E-R diagram for the "Restaurant Menu Ordering System",
which will facilitate the food items ordering and services within a restaurant.
The entire restaurant scenario is detailed as follows. The customer is able to
view the food items menu, call the waiter, place orders and obtain the final bill
through the computer kept in their table. The Waiters through their wireless
tablet PC are able to initialize a table for customers, control the table functions
to assist customers, orders, send orders to food preparation staff (chef) and
finalize the customer's bill. The Food preparation staffs (chefs), with their
touch-display interfaces to the system, are able to view orders sent to the
kitchen by waiters. During preparation they are able to let the waiter know the
status of each item, and can send notifications when items are completed. The
system should have full accountability and logging facilities, and should
support supervisor actions to account for exceptional circumstances, such as a
meal being refunded or walked out on.
Solution:
Example 2.5.6 A university registrar's office maintains data about the
following entities:
(1) courses, including number, title, credits, syllabus, and prerequisites;
(2) course offerings, including course number, year, semester, section
number, instructor(s), timings, and classroom;
(3) students, including student-id, name, and program; and
(4) instructors, including identification number, name, department, and title.
Further, the enrollment of students in courses and grades awarded to students
in each course they are enrolled for must be appropriately modeled. Construct
an E-R diagram for the registrar's office. Document all assumptions that you
make about the mapping constraints.
Solution:
Example 2.5.7 What is aggregation in ER model? Develop an ER diagram
using aggregation that captures following information: Employees work for
projects. An employee working for particular project uses various machinery.
Assume necessary attributes. State any assumptions you make. Also discuss
about the ER diagram you have designed.
Solution: Aggregation: Refer to Section 2.4.3.
ER Diagram: The ER diagram for the scenario described above can be drawn as
follows -
We can then create a binary relationship manages between Manager and
(Employee, Project, Machinery).
Example 2.5.8 Construct an E-R diagram for a hospital with a set of patients
and a set of medical doctors. Associate with each patient a log of the various
tests and examinations conducted. AU: Dec.-07, Marks 8
Solution:
a. Weak Entity
An entity that depends on another entity is called a weak entity. The weak
entity doesn't contain any key attribute of its own. The weak entity is
represented by a double rectangle.
2. Attribute
An attribute is used to describe a property of an entity. An ellipse is used
to represent an attribute.
For example, id, age, contact number, name, etc. can be attributes of a
student.
a. Key Attribute
c. Multivalued Attribute
An attribute that can have more than one value is known as
a multivalued attribute. A double oval is used to represent a multivalued
attribute.
For example, a student can have more than one phone number.
d. Derived Attribute
An attribute whose value can be derived from another attribute is known as a
derived attribute.
For example, a person's age changes over time and can be derived from
another attribute like date of birth.
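A derived attribute is computed rather than stored. As a minimal sketch, age can be derived from the date of birth as follows; the function name age_on and its parameters are assumptions for illustration.

```python
from datetime import date


def age_on(date_of_birth, today):
    """Derive age in completed years from the stored date of birth."""
    years = today.year - date_of_birth.year
    # Subtract one if this year's birthday has not yet occurred.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years
```

Storing only the date of birth avoids having to update every person's age as time passes.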
3. Relationship
A relationship is used to describe the relation between entities. Diamond
or rhombus is used to represent the relationship.
For example, in a one-to-one relationship, a female can marry one male, and a
male can marry one female.
b. One-to-many relationship
When only one instance of the entity on the left and more than one
instance of the entity on the right are associated with the relationship, it
is known as a one-to-many relationship.
For example, a scientist can invent many inventions, but each invention is
made by one specific scientist.
c. Many-to-one relationship
When more than one instance of the entity on the left and only one
instance of the entity on the right are associated with the relationship, it is
known as a many-to-one relationship.
For example, a student enrolls for only one course, but a course can have
many students.
d. Many-to-many relationship
When more than one instance of the entity on the left and more than one
instance of the entity on the right are associated with the relationship, it is
known as a many-to-many relationship.
For example, an employee can be assigned to many projects and a project can
have many employees.
Notation of ER diagram
A database can be represented using notations. In an ER diagram, many
notations are used to express the cardinality. These notations are as
follows:
ER model is used to represent real life scenarios as entities. The properties of
these entities are their attributes in the ER diagram and their connections are
shown in the form of relationships.
Some examples of ER model are −
Hospital ER Model
Company ER Model
The entities in this ER model are Employee, Department and Project. These
entities have the following attributes −