DBMS Notes

A database is a collection of related data organized in a systematic way that allows a computer program to consult it to answer questions. A database management system (DBMS) is software that facilitates creating and managing databases and allows for defining, constructing, manipulating, and accessing the data. Using file systems to store data instead of a DBMS has limitations like uncontrolled redundancy, inconsistent data, inflexibility, and poor data sharing and security. DBMSs address these issues through features like controlled redundancy, data consistency, integration, access controls, and flexibility.

Uploaded by

balainsai

INTRODUCTION

A database is a collection of data elements (facts) stored in a computer in such a
systematic way that a computer program can consult it to answer questions. The answers
to those questions become information that can be used to make decisions that could not be
made with the data elements alone. The computer program used to manage and query a
database is known as a database management system (DBMS).
A database is a collection of related data that we can use for
• Defining (specifying types of data)
• Constructing (storing & populating)
• Manipulating (querying, updating, reporting)
A Database Management System (DBMS) is a software package to facilitate the
creation and maintenance of a computerized database. A Database System (DB) is a DBMS
together with the data.
Features of a database
• It is a persistent (stored) collection of related data.
• The data is input (stored) only once.
• The data is organized (in some fashion).
• The data is accessible and can be queried (effectively and efficiently).
FILE SYSTEMS VERSUS DATABASE SYSTEMS
DBMSs are expensive to create in terms of software, hardware, and time invested.
Why should we use them? Why couldn’t we just keep all our data in files, and use
word processors to edit the files appropriately to insert, delete, or update data? We
could even write our own programs to query the data. This solution is called maintaining
data in flat files. However, flat files have the following limitations.
• Uncontrolled redundancy
• Inconsistent data
• Inflexibility
• Limited data sharing
• Poor enforcement of standards
• Low programmer productivity
• Excessive program maintenance
• Excessive data maintenance
Drawbacks of file systems
The drawbacks of using file systems to store data are:
Data redundancy and inconsistency
Because different programs keep their own files, often in different formats, the
same information may be duplicated in several files, and the copies may disagree.
Difficulty in accessing data
To retrieve and use stored data, we need to write a new program to carry out
each new task.
Data isolation
Data is scattered across multiple files in different formats, which makes it
hard to write programs that retrieve the appropriate data.
Integrity problems
Integrity constraints (e.g. account balance > 0) become part of the program
code, which has to be written every time. It is hard to add new constraints or
to change existing ones.
Atomicity of updates
Failures may leave the database in an inconsistent state with only partial
updates carried out.
E.g. a transfer of funds from one account to another should happen either
completely or not at all.
Concurrent access by multiple users
Concurrent access to files is needed for better performance, but uncontrolled
concurrent access can lead to inconsistencies.
E.g. two people reading a balance and updating it at the same time.
Security problems
File systems make it hard to give a user access to some, but not all, of the data.
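The integrity and atomicity points can be illustrated with a small sketch. It assumes Python's built-in sqlite3 module as a stand-in DBMS; the account table, the transfer helper and the amounts are hypothetical and chosen only for illustration. The rule "account balance > 0" is declared once in the schema instead of being repeated in every program, and a transfer that would break the rule is undone as a whole.

```python
import sqlite3

# Hypothetical schema: the integrity constraint lives in the table definition.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance > 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both updates happen or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the CHECK constraint rejected the debit


def balances(conn):
    """Return {account id: balance} for every account."""
    return dict(conn.execute("SELECT id, balance FROM account"))
```

In this sketch a failed transfer leaves both balances exactly as they were, even though the credit to the destination account had already been applied when the debit was rejected.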
Database Management System Approach
Data maintained in a DBMS has the following advantages:
Controlled redundancy
o consistency of data & integrity constraints
Integration of data
o self-contained & represents semantics of application
Data and operation sharing
o multiple interfaces
Services & Controls
o security & privacy controls
o backup & recovery
o enforcement of standards
Flexibility
o data independence
o data accessibility
o reduced program maintenance
Ease of application and development
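Limitations of Database Management System
Compared with a file-based solution, a Database Management System is
• more expensive
• more complex
• more general (not tailored to a single application)
A DBMS may therefore be unnecessary when the application is
• simple
• subject to stringent real-time requirements
• single user
• static (the data and its structure are not expected to change)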

Levels of Abstraction
• Physical level: Also known as the internal schema, it describes data
storage structures and access paths. It typically uses a physical data
model, and its details are hidden from database users. It describes how a
record (e.g. customer) is stored.
• Logical level: Also known as the conceptual schema, it describes the
structure and constraints for the whole database. It uses a conceptual or
implementation model. It describes the data stored in the database and the
relationships among the data.
type customer = record
    street : string;
    city : string;
end;

• View level: Also known as the external schema, it describes various
user views. It usually uses the same data model as the conceptual level. As
shown in Fig 1.1, different views may have different interpretations of
the same data item. Application programs hide details of data types.
Views can also hide information (e.g. salary) for security purposes.
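As a rough illustration of the view level, the sketch below uses Python's built-in sqlite3 module. The employee table, the employee_public view and the sample row are hypothetical names invented for this example. The view exposes the same stored data but leaves out the salary column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical level: the full employee relation (illustrative schema).
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)"
)
conn.execute("INSERT INTO employee VALUES (1, 'Asha', 52000)")

# View level: an external view that omits the salary column.
conn.execute("CREATE VIEW employee_public AS SELECT id, name FROM employee")

cur = conn.execute("SELECT * FROM employee_public")
view_columns = [d[0] for d in cur.description]  # column names seen by the user
view_rows = cur.fetchall()
```

A user who is given only employee_public sees the id and name columns; the salary stays in the underlying table.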

Diagram for Levels of Data abstraction


DATA MODELS
Model
o A structure that demonstrates all the required features of the parts of the
real world that are of interest to the users of the information in the model.
o A representation and reflection of the real world (Universe of Discourse)
Data Model
o A set of concepts that can be used to describe the structure of a database: the
data types, relationships, constraints, semantics and operational behaviour.
o It is a tool for data abstraction
o A collection of tools for describing
• data
• data relationships
• data semantics
• data constraints
o Entity-Relationship model
The Entity-Relationship (ER) model is a conceptual data model, capable of
describing the data requirements for a new information system in a direct and
easy to understand graphical notation.

o Relational model
The Relational Model

The relational model is the most widely used data model and an extremely powerful tool, not
only for storing information but for accessing it as well. Relational databases are organized as
tables. The beauty of a table is that information can be accessed or added without
reorganizing the tables. A table can have many records, and each record can have many fields.

A table is also called a relation. For instance, a company can have a
database called customer orders; within this database there will be several different
tables, or relations, all relating to customer orders. One table can hold customer
information (name, address, contact info, customer number, etc.) and another
table (relation) can hold the orders that the customer previously placed (item
number, item description, payment amount, payment method, etc.).
Every record (group of fields) in a relational table is identified by its
primary key: a field, or combination of fields, whose value is unique for each
record, which makes it easy to identify a record.
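The customer orders example can be sketched with Python's built-in sqlite3 module. The table names, column names and sample rows below are assumptions made for illustration, loosely following the fields listed in the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two relations in a hypothetical "customer orders" database.
conn.execute(
    "CREATE TABLE customer ("
    " customer_number INTEGER PRIMARY KEY, name TEXT, address TEXT)"
)
conn.execute(
    "CREATE TABLE orders ("
    " order_number INTEGER PRIMARY KEY, customer_number INTEGER,"
    " item_description TEXT, payment_amount REAL)"
)
conn.execute("INSERT INTO customer VALUES (1, 'R. Kumar', '12 Lake Road')")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [(10, 1, "keyboard", 25.0), (11, 1, "monitor", 140.0)],
)

# Adding a record does not require reorganizing the existing tables.
conn.execute("INSERT INTO customer VALUES (2, 'S. Iyer', '4 Hill Street')")

# The primary key identifies exactly one record.
row = conn.execute(
    "SELECT name FROM customer WHERE customer_number = ?", (1,)
).fetchone()

# A second record with the same primary key value is rejected.
try:
    conn.execute("INSERT INTO customer VALUES (1, 'Duplicate', 'nowhere')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

order_items = [
    r[0]
    for r in conn.execute(
        "SELECT item_description FROM orders"
        " WHERE customer_number = 1 ORDER BY order_number"
    )
]
```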

o Object based data model:


Object-oriented programming (especially in Java, C++, or C#) has become the dominant
software-development methodology. This led to the development of an object-oriented
data model that can be seen as extending the E-R model with notions of encapsulation,
methods (functions), and object identity. The object-relational data model combines
features of the object-oriented data model and relational data model.

Semistructured Data Model. The semistructured data model permits the specification
of data where individual data items of the same type may have different sets of attributes. This
is in contrast to the data models mentioned earlier, where every data item of a particular type
must have the same set of attributes. The Extensible Markup Language (XML) is widely used to
represent semistructured data.
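A small sketch of semistructured data, using Python's standard xml.etree.ElementTree parser. The instructors document and its element names are made up for this example. Two items of the same type carry different sets of attributes, which a fixed relational schema would not allow without null values.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document: both items are <instructor> elements,
# but each has its own set of child elements.
doc = """
<instructors>
  <instructor><name>Rao</name><phone>555-0101</phone></instructor>
  <instructor><name>Mehta</name><office>B-12</office><email>mehta@example.org</email></instructor>
</instructors>
""".strip()

root = ET.fromstring(doc)

# Collect the attribute (child element) names of each instructor.
attribute_sets = [
    sorted(child.tag for child in item) for item in root.findall("instructor")
]
```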
• Older models: network model and hierarchical model
The Network Model
• Data are represented by collections of records.
• Relationships among data are represented by links.
• The organization is that of an arbitrary graph, represented by a network diagram.
The following figure shows a sample network database that is the equivalent of the
relational database of the previous figure.

The Hierarchical Model


• This model is similar to the network model, and its concepts are derived from the
Information Management System (IMS) and System 2000.
• The records are organized as a collection of trees, rather than arbitrary
graphs.
• The schema is represented by a hierarchical diagram.
o One record type, called the root, does not participate as a child record type.
o Every record type except the root participates as a child record type in
exactly one parent-child relationship.
o A leaf is a record type that does not participate as a parent in any relationship.
o A record can act as a parent for any number of records.
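The rules for a hierarchical schema can be checked on a toy example. The record types below (Department, Employee, Project, Dependent) are hypothetical. Storing each child with a single parent in a Python dict enforces the rule that a non-root record type has exactly one parent.

```python
# Hypothetical hierarchical schema: child record type -> its one parent type.
PARENT_OF = {
    "Employee": "Department",
    "Project": "Department",
    "Dependent": "Employee",
}
RECORD_TYPES = {"Department", "Employee", "Project", "Dependent"}

# The root never appears as a child.
roots = sorted(RECORD_TYPES - set(PARENT_OF))

# Leaves never appear as a parent.
leaves = sorted(RECORD_TYPES - set(PARENT_OF.values()))

# One parent may have any number of child record types.
children_of_department = sorted(
    child for child, parent in PARENT_OF.items() if parent == "Department"
)
```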

Data Independence

Data independence means that the schema at one level of the database system can be
changed without changing the schema at the next higher level. Application programs
therefore do not depend on how the data is physically stored or logically arranged.
As shown in Fig. 1.5, such changes are driven by external factors such as new hardware,
new users, new technology and new linkages to other databases.
o Logical data independence: the ability to change the conceptual schema without
having to change the external schemas
o Physical data independence: the ability to change the internal schema without
having to change the conceptual schema
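Both kinds of data independence can be sketched with Python's built-in sqlite3 module. The customer table, the index name and the customer_names view are assumptions made for this example. Adding an index stands in for a change to the internal schema, and adding a column stands in for a change to the conceptual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1, "Asha", "Pune"), (2, "Ravi", "Delhi"), (3, "Mina", "Pune")],
)

QUERY = "SELECT name FROM customer WHERE city = ? ORDER BY name"

# Physical data independence: a new access path, same query, same answer.
before_index = conn.execute(QUERY, ("Pune",)).fetchall()
conn.execute("CREATE INDEX idx_customer_city ON customer (city)")
after_index = conn.execute(QUERY, ("Pune",)).fetchall()

# Logical data independence: the table gains a column, the external view is unchanged.
conn.execute("CREATE VIEW customer_names AS SELECT id, name FROM customer")
view_before = conn.execute("SELECT * FROM customer_names ORDER BY id").fetchall()
conn.execute("ALTER TABLE customer ADD COLUMN phone TEXT")
view_after = conn.execute("SELECT * FROM customer_names ORDER BY id").fetchall()
```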
DBMS ARCHITECTURE
Overall System Structure

The DBMS architecture comprises the disk storage, the storage manager, the query
processor and the database users, connected through query tools, application
programs and application interfaces.
Database Users
Users are differentiated by the way they expect to interact with the system.
o Application programmers: They interact with the system through DML calls.
o Sophisticated users: They form requests in a database query language.
o Specialized users: They write specialized database applications that do not fit into
the traditional data processing framework.
o Naïve users: They invoke one of the permanent application programs that have
been written previously.
E.g. people accessing a database over the web, bank tellers, clerical staff
Database Administrator
The Database Administrator coordinates all the activities of the database system and
has a good understanding of the enterprise’s information resources and needs.
The database administrator’s duties include:
o Schema definition
o Storage structure and access method definition
o Schema and physical organization modification
o Granting user authority to access the database
o Specifying integrity constraints
o Acting as liaison with users
o Monitoring performance and responding to changes in requirements
Transaction Management within Storage Manager
o A transaction is a collection of operations that performs a single logical
function in a database application
o Transaction-management component ensures that the database remains in a
consistent (correct) state despite system failures (e.g., power failures and
operating system crashes) and transaction failures.
o Concurrency-control manager controls the interaction among the
concurrent transactions, to ensure the consistency of the database.
These functions are performed by a component inside the storage manager.
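Why concurrency control is needed can be shown with a small simulation in plain Python. The Account class and the amounts are hypothetical, and a simple lock stands in for the concurrency-control manager; a real DBMS may use locking or other protocols. The first half replays by hand the interleaving in which two users read the same balance before either writes.

```python
import threading


class Account:
    """Toy account; the lock plays the role of concurrency control."""

    def __init__(self, balance):
        self.balance = balance
        self._lock = threading.Lock()

    def deposit_locked(self, amount):
        # Read and write happen as one unit, so no update can be lost.
        with self._lock:
            current = self.balance
            self.balance = current + amount


# Uncontrolled interleaving: both users read 100 before either writes.
acct = Account(100)
read_by_a = acct.balance
read_by_b = acct.balance
acct.balance = read_by_a + 50
acct.balance = read_by_b + 20  # overwrites user A's update
lost_update_result = acct.balance

# Controlled access: each deposit holds the lock for its read and write.
safe = Account(100)
threads = [
    threading.Thread(target=safe.deposit_locked, args=(amount,))
    for amount in (50, 20)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
controlled_result = safe.balance
```

In the uncontrolled run one deposit disappears, while the controlled run reflects both deposits.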
Storage Manager

o The storage manager is a program module that provides the interface between
the low-level data stored in the database and the application programs and
queries submitted to the system.

o It acts on requests received from the query execution engine.

o It consists of the buffer manager, file manager, authorization and integrity
manager, and transaction manager.

o The storage manager is responsible for the following tasks:

   o Interaction among the various managers

   o Efficient storing, retrieving and updating of data


Disk Storage

Disk storage holds the data in the form of logical tables, indices, the data dictionary and
statistical data. The data dictionary stores data about the data, i.e. its structure. Indices are
used for fast searching in a database. Statistical data is the logged detail of the
various transactions that occur on the database.
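One concrete form of a data dictionary is SQLite's built-in catalog, which can be read through Python's sqlite3 module. The instructor table and index names are hypothetical; sqlite_master and PRAGMA table_info are SQLite's own catalog interfaces, and other DBMSs expose their dictionaries differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instructor (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("CREATE INDEX idx_instructor_name ON instructor (name)")

# The catalog lists every table and index: data about the data.
catalog = conn.execute(
    "SELECT type, name, tbl_name FROM sqlite_master ORDER BY name"
).fetchall()

# Column names and declared types of one table, also read from the catalog.
columns = [
    (row[1], row[2]) for row in conn.execute("PRAGMA table_info(instructor)")
]
```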

Query processor

As described in Fig. 1.6, the query processor consists of the following layers, and the
functionality of each layer is described in Fig. 1.7.

Users submit a query, which passes to the optimizer. The optimizer produces a
physical execution plan, which goes to the execution engine.

The query execution engine passes the request to the index/file/record manager, which in
turn asks the buffer manager to allocate memory to store pages. The buffer manager in
turn sends the pages to the storage manager, which takes care of physical storage.

The resulting data comes back from physical storage in the reverse order. The catalog is the
data dictionary, which contains statistics and the schema. Every query execution that takes
place in the execution engine is logged so that it can be recovered when required.
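The optimizer step can be observed in SQLite through its EXPLAIN QUERY PLAN statement, shown here via Python's sqlite3 module. The customer table and index are hypothetical, and the exact wording of the plan differs between SQLite versions, so the sketch only collects the plan text instead of assuming a fixed output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.execute("CREATE INDEX idx_customer_city ON customer (city)")

# Ask the optimizer for the plan it would hand to the execution engine.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT name FROM customer WHERE city = ?", ("Pune",)
).fetchall()

# The last column of each plan row is a human-readable description.
plan_text = " ".join(str(row[-1]) for row in plan)
```

For this query the plan text names the index on city, which shows that the optimizer chose an index search over a full table scan.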

Three level Schema architecture

A single user may use a simple accounting database on a personal computer to manage a small business. At
the other extreme, a very large company may have databases at numerous locations around the country that are
linked together. There are two generic database architectures, namely centralized and distributed. There
are also applications that run on servers with a two-tier or three-tier architecture.
• Two-tier architecture: E.g. client programs using ODBC/JDBC to
communicate with a database
• Three-tier architecture: E.g. web-based applications, and applications built
using “middleware”

Null value:
The null value is a special value that signifies that the value is unknown or does not exist. For example,
suppose as before that we include the attribute phone number in the instructor relation. It may be that an
instructor does not have a phone number at all, or that the telephone number is unlisted. We would then
have to use the null value to signify that the value is unknown or does not exist.
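The behaviour of the null value can be tried out with Python's built-in sqlite3 module. The instructor table follows the example in the text, while the rows are invented. Python's None is stored as SQL NULL, and a test for a missing value has to use IS NULL, because a comparison with NULL is never true.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instructor (id INTEGER PRIMARY KEY, name TEXT, phone_number TEXT)"
)
conn.executemany(
    "INSERT INTO instructor VALUES (?, ?, ?)",
    [(1, "Rao", "555-0101"), (2, "Mehta", None)],  # None is stored as NULL
)

# IS NULL finds the instructor whose phone number is unknown or missing.
missing = conn.execute(
    "SELECT name FROM instructor WHERE phone_number IS NULL"
).fetchall()

# "= NULL" evaluates to unknown for every row, so nothing is returned.
equals_null = conn.execute(
    "SELECT name FROM instructor WHERE phone_number = NULL"
).fetchall()

stored = conn.execute(
    "SELECT phone_number FROM instructor WHERE id = 2"
).fetchone()
```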

Database administrator functions:

Schema definition.
Storage structure and access-method definition.
Schema and physical-organization modification.
Granting of authorization for data access.
Routine maintenance.

Weak and strong entity set

An entity set that does not have sufficient attributes to form a primary key is termed a weak entity
set. An entity set that has a primary key is termed a strong entity set.

Keys:
A superkey is a set of one or more attributes that, taken together, uniquely identify a tuple in the
relation. A superkey may contain extraneous attributes. For example, the combination of ID and name is a
superkey for the relation instructor. If K is a superkey, then so is any superset of K. We are often
interested in superkeys for which no proper subset is a superkey. Such minimal superkeys are called
candidate keys.

We use the term primary key to denote a candidate key that is chosen by the database designer as the
principal means of identifying tuples within a relation.

A relation, say r1, may include among its attributes the primary key of another relation, say r2. This
attribute is called a foreign key from r1, referencing r2. The relation r1 is also called the
referencing relation of the foreign key dependency, and r2 is called the referenced relation of the
foreign key.
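The definitions of superkey and candidate key can be made concrete with a short sketch in plain Python. The rows below are a hypothetical instance of the instructor relation. Strictly speaking, keys are a property of the schema (they must hold for every legal instance), so a check on one instance can only rule keys out; it is used here purely to illustrate the definitions.

```python
from itertools import combinations

# Hypothetical instance of the instructor relation.
ATTRS = ("ID", "name", "dept_name")
ROWS = [
    {"ID": 1, "name": "Rao", "dept_name": "Physics"},
    {"ID": 2, "name": "Rao", "dept_name": "History"},
    {"ID": 3, "name": "Mehta", "dept_name": "Physics"},
    {"ID": 4, "name": "Rao", "dept_name": "Physics"},
]


def is_superkey(attrs, rows):
    """True if no two rows agree on all of attrs (checked on this instance only)."""
    projected = [tuple(row[a] for a in attrs) for row in rows]
    return len(set(projected)) == len(rows)


def candidate_keys(all_attrs, rows):
    """Superkeys for which no proper subset is also a superkey."""
    keys = []
    # Smaller attribute sets are tried first, so any non-minimal superkey
    # contains a candidate key that has already been found.
    for size in range(1, len(all_attrs) + 1):
        for combo in combinations(all_attrs, size):
            if is_superkey(combo, rows) and not any(
                set(k) < set(combo) for k in keys
            ):
                keys.append(combo)
    return keys
```

On this instance (ID, name) is a superkey but not a candidate key, because its proper subset (ID) already identifies every row.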

Integrity Constraints:

1. primary key (Aj1 , Aj2, . . . , Ajm ): The primary-key specification says that attributes
Aj1 , Aj2, . . . , Ajm form the primary key for the relation. The primary key attributes are required to be
nonnull and unique; that is, no tuple can have a null value for a primary-key attribute, and no two
tuples in the relation can be equal on all the primary-key attributes.

2. foreign key (Ak1 , Ak2, . . . , Akn ) references s: The foreign key specification says that the
values of attributes (Ak1 , Ak2, . . . , Akn ) for any tuple in the relation must correspond to
values of the primary key attributes of some tuple in relation s.

3. not null: The not null constraint on an attribute specifies that the null value is not allowed for
that attribute; in other words, the constraint excludes the null value from the domain of that
attribute. For example, in Figure 3.1, the not null constraint on the name attribute of the
instructor relation ensures that the name of an instructor cannot be null.
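The three constraints can be seen in action with Python's built-in sqlite3 module. The department and instructor tables and their rows are hypothetical, and the rejected helper is defined only for this sketch. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON; other DBMSs enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch
conn.execute("CREATE TABLE department (dept_name TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE instructor ("
    " id INTEGER,"
    " name TEXT NOT NULL,"
    " dept_name TEXT,"
    " PRIMARY KEY (id),"
    " FOREIGN KEY (dept_name) REFERENCES department (dept_name))"
)
conn.execute("INSERT INTO department VALUES ('Physics')")
conn.execute("INSERT INTO instructor VALUES (1, 'Rao', 'Physics')")


def rejected(sql):
    """Run one statement and report whether a constraint refused it."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


# Primary key: two tuples may not agree on the key attribute.
duplicate_key = rejected("INSERT INTO instructor VALUES (1, 'Mehta', 'Physics')")
# Foreign key: 'Music' does not exist in the referenced relation.
dangling_reference = rejected("INSERT INTO instructor VALUES (2, 'Mehta', 'Music')")
# Not null: the name of an instructor cannot be null.
null_name = rejected("INSERT INTO instructor VALUES (3, NULL, 'Physics')")

row_count = conn.execute("SELECT COUNT(*) FROM instructor").fetchone()[0]
```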
