DBMS Notes
Levels of Abstraction
Physical level: Also known as the internal schema, it describes data
storage structures and access paths. It typically uses a physical data
model, and its details are hidden from database users. It describes how a
record (e.g. customer) is stored.
Logical level: Also known as the conceptual schema, it describes the
structure and constraints for the whole database. It uses a conceptual or
implementation model. It describes the data stored in the database and the
relationships among the data.
type customer = record
    street : string;
    city : string;
end;
The Relational Model
The relational model is the most widely used data model and a powerful tool, not
only for storing information but for accessing it as well. A relational database is organized as
tables. The advantage of tables is that information can be accessed or added without
reorganizing them. A table can have many records, and each record can have many fields.
A table is also called a relation. For instance, a company can have a
database called customer orders; within this database there will be several
tables (relations), all relating to customer orders. One table can hold customer
information (name, address, contact info, customer number, etc.) and another
table can hold the orders that each customer has placed (item number,
item description, payment amount, payment method, etc.).
Every table in a relational database has a primary key: a field (or combination
of fields) whose value is unique for each record, which makes it easy to
identify a record.
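The customer orders example can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table names, column names and sample rows are assumptions made for the sketch, not part of the notes.

```python
import sqlite3

# In-memory database; all table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customer (
        customer_number INTEGER PRIMARY KEY,  -- unique for each record
        name            TEXT,
        address         TEXT
    )
""")
conn.execute("""
    CREATE TABLE orders (
        order_number     INTEGER PRIMARY KEY,
        customer_number  INTEGER,
        item_description TEXT,
        payment_amount   REAL
    )
""")

# Records are added without reorganizing the existing tables.
conn.execute("INSERT INTO customer VALUES (1, 'Asha Rao', '12 Lake Road')")
conn.execute("INSERT INTO orders VALUES (100, 1, 'Desk lamp', 24.50)")
conn.execute("INSERT INTO orders VALUES (101, 1, 'Notebook', 3.75)")

# The primary key value identifies exactly one customer record.
customer = conn.execute(
    "SELECT name FROM customer WHERE customer_number = ?", (1,)
).fetchone()

# Orders relate to the customer through the shared customer number.
order_items = [
    row[0]
    for row in conn.execute(
        "SELECT item_description FROM orders "
        "WHERE customer_number = ? ORDER BY order_number",
        (1,),
    )
]
```

Here customer holds the single row found through the primary key, and order_items lists the items of that customer's orders.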
Semistructured Data Model. The semistructured data model permits the specification
of data where individual data items of the same type may have different sets of attributes. This
is in contrast to the data models mentioned earlier, where every data item of a particular type
must have the same set of attributes. The Extensible Markup Language (XML) is widely used to
represent semistructured data.
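As a small illustration, the following sketch parses an XML fragment in which two items of the same type carry different sets of attributes. The element names and values are invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: two <customer> items with different attribute sets.
document = """
<customers>
    <customer>
        <name>Asha Rao</name>
        <city>Pune</city>
    </customer>
    <customer>
        <name>Ben Lee</name>
        <phone>555-0100</phone>
        <email>ben@example.com</email>
    </customer>
</customers>
"""

root = ET.fromstring(document)

# Collect the names of the child elements present in each customer item.
attribute_sets = [
    sorted(child.tag for child in customer)
    for customer in root.findall("customer")
]
```

The two entries of attribute_sets differ, which a relational table with a fixed set of columns would not allow without null values.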
Older models: network model and hierarchical model
The Network Model
Data are represented by collections of records.
Relationships among data are represented by links.
Organization is that of an arbitrary graph, represented by a network diagram.
The following figure shows a sample network database that is the equivalent of the
relational database of the previous figure.
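Separately from the figure, the idea of records connected by links can be sketched in a few lines of Python. The Record class, the link helper and the sample customer and account data are all hypothetical names chosen for this sketch.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Record:
    """A record in a network database; `links` refer to related records."""
    kind: str
    data: dict
    links: list = field(default_factory=list, repr=False)


def link(a, b):
    """Create a link between two records, navigable from either side."""
    a.links.append(b)
    b.links.append(a)


customer = Record("customer", {"name": "Asha Rao"})
account_1 = Record("account", {"number": "A-101", "balance": 500})
account_2 = Record("account", {"number": "A-215", "balance": 700})
link(customer, account_1)
link(customer, account_2)

# Navigation: follow the links from the customer record to its accounts.
account_numbers = [
    r.data["number"] for r in customer.links if r.kind == "account"
]
```

Unlike the relational model, related data is reached by following links from record to record rather than by matching values in tables.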
Data Independence
Data independence means that the data, and the applications that use it, do not
depend on how the data is physically stored or logically arranged in the DBMS.
As shown in Fig. 1.5, changes are driven only by external
factors such as new hardware, new users, new technology, new linkages and linkages to
other databases.
o Logical data independence: the ability to change the conceptual schema
without having to change the external schemas
o Physical data independence: the ability to change the internal schema
without having to change the conceptual schema
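Both kinds of independence can be illustrated with a rough sqlite3 sketch, treating a view as an external schema, the base table as the conceptual schema and an index as part of the internal schema. This mapping and all names in the code are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual schema: the base table.
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)
conn.execute("INSERT INTO customer VALUES (1, 'Asha Rao', 'Pune')")

# External schema: a view that an application queries.
conn.execute("CREATE VIEW customer_names AS SELECT id, name FROM customer")

query = "SELECT name FROM customer_names WHERE id = 1"
before = conn.execute(query).fetchall()

# Logical data independence: the conceptual schema gains a column,
# while the view and the application query stay unchanged.
conn.execute("ALTER TABLE customer ADD COLUMN phone TEXT")
after_logical_change = conn.execute(query).fetchall()

# Physical data independence: a new access path (index) is added,
# while the table definition and the query stay unchanged.
conn.execute("CREATE INDEX idx_customer_name ON customer (name)")
after_physical_change = conn.execute(query).fetchall()
```

The same query returns the same result before and after each change.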
DBMS ARCHITECTURE
Overall System Structure
The DBMS architecture consists of disk storage, the storage manager, the query
processor and the database users, connected through query tools, application
programs and application interfaces.
Database Users
Users are differentiated by the way they expect to interact with the system:
o Application programmers: They interact with the system through DML calls
o Sophisticated users: They form requests in a database query language
o Specialized users: They write specialized database applications that do not fit into
the traditional data processing framework
o Naïve users: They invoke one of the permanent application programs that have
been written previously,
e.g. people accessing a database over the web, bank tellers, clerical staff
Database Administrator
The database administrator (DBA) coordinates all the activities of the database system and
has a good understanding of the enterprise’s information resources and needs.
The database administrator’s duties include:
o Schema definition
o Storage structure and access method definition
o Schema and physical organization modification
o Granting user authority to access the database
o Specifying integrity constraints
o Acting as liaison with users
o Monitoring performance and responding to changes in requirements
Transaction Management within Storage Manager
o A transaction is a collection of operations that performs a single logical
function in a database application
o Transaction-management component ensures that the database remains in a
consistent (correct) state despite system failures (e.g., power failures and
operating system crashes) and transaction failures.
o Concurrency-control manager controls the interaction among the
concurrent transactions, to ensure the consistency of the database.
Transaction management is performed by a subsystem inside the storage manager.
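The notion of a transaction as a single logical function can be illustrated with a funds transfer in sqlite3. This is a minimal sketch: the account table, the transfer helper and the simulated failure are assumptions for the example, and concurrency control is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, source, target, amount, fail_midway=False):
    """Move `amount` between two accounts as one transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, source),
            )
            if fail_midway:
                raise RuntimeError("simulated failure between the updates")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, target),
            )
        return True
    except RuntimeError:
        return False


def balances(conn):
    """Return the current balances as {account id: balance}."""
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))


failed = transfer(conn, 1, 2, 30, fail_midway=True)
balances_after_failure = balances(conn)

succeeded = transfer(conn, 1, 2, 30)
balances_after_success = balances(conn)
```

After the simulated failure the first update is rolled back, so the balances are unchanged and the database stays consistent; only the completed transfer changes both accounts.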
Storage Manager
Disk storage holds the data in the form of logical tables, indices, the data dictionary and
statistical data. The data dictionary stores data about the data (metadata), i.e. its structure, etc. Indices are
used for fast searching in a database. Statistical data is the logged detail about the
various transactions that occur on the database.
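As one concrete case, SQLite exposes its data dictionary as a table named sqlite_master, and the column structure of a table through PRAGMA table_info. The sketch below uses an invented customer table and index; other database systems organize their catalogs differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)
conn.execute("CREATE INDEX idx_customer_city ON customer (city)")

# The data dictionary records every table and index: data about data.
catalog = conn.execute(
    "SELECT type, name, tbl_name FROM sqlite_master ORDER BY type, name"
).fetchall()

# The structure of one table; the column name is the second field of each row.
column_names = [
    row[1] for row in conn.execute("PRAGMA table_info(customer)")
]
```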
Query processor
As shown in Fig. 1.6, the query processor consists of the following layers; the
functionality of each layer is shown in Fig. 1.7.
The user submits a query, which passes to the optimizer where it is optimized; the
resulting physical execution plan goes to the execution engine.
The query execution engine passes the request to the index / file / record manager, which in
turn passes the request to the buffer manager, asking it to allocate memory to store pages.
The buffer manager in turn passes the page requests to the storage manager, which takes care of physical
storage.
The resulting data travels back from physical storage through the same layers in reverse order. The catalog is the
data dictionary, which contains statistics and schema information. Every query execution that takes
place in the execution engine is logged so that it can be recovered when required.
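The plan that an optimizer hands to the execution engine can be inspected in SQLite with EXPLAIN QUERY PLAN. The sketch below is specific to SQLite, uses an invented table and index, and avoids relying on the exact wording of the plan, which differs between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)
conn.execute("CREATE INDEX idx_customer_city ON customer (city)")
conn.executemany(
    "INSERT INTO customer (name, city) VALUES (?, ?)",
    [("Asha Rao", "Pune"), ("Ben Lee", "Delhi")],
)

query = "SELECT name FROM customer WHERE city = ?"

# Ask the optimizer for its plan without running the query.
plan_rows = conn.execute("EXPLAIN QUERY PLAN " + query, ("Pune",)).fetchall()
# The last column of each plan row is a human-readable description.
plan_details = [row[-1] for row in plan_rows]

# Running the query executes that plan and returns the result.
result = conn.execute(query, ("Pune",)).fetchall()
```

The plan description names idx_customer_city, showing that the optimizer chose the index as the access path instead of scanning the whole table.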
A single user may use a simple accounting database on a personal computer to manage a small business. At the
other extreme, a very large company may have databases at numerous locations around the country that are
linked together. There are two generic database architectures, namely centralized and distributed. There
are also applications that run on servers with a two-tier or a three-tier architecture.
Two-tier architecture: E.g. client programs using ODBC/JDBC to
communicate with a database
Three-tier architecture: E.g. web-based applications, and applications built
using “middleware”
Null value:
The null value is a special value that signifies that the value is unknown or does not exist. For example,
suppose as before that we include the attribute phone number in the instructor relation. It may be that an
instructor does not have a phone number at all, or that the telephone number is unlisted. We would then
have to use the null value to signify that the value is unknown or does not exist.
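A short sqlite3 sketch of the phone number example follows. The column names and sample rows are assumptions for illustration. It also shows that a null value must be tested with IS NULL, because a comparison with = NULL is never true.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instructor (ID TEXT PRIMARY KEY, name TEXT, phone_number TEXT)"
)
conn.executemany(
    "INSERT INTO instructor VALUES (?, ?, ?)",
    [
        ("10101", "Srinivasan", "555-0101"),
        ("12121", "Wu", None),  # None is stored as null: unknown or absent
    ],
)

# IS NULL finds the rows whose phone number is null.
without_phone = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM instructor WHERE phone_number IS NULL"
    )
]

# Comparing with = NULL yields unknown rather than true, so no row matches.
equals_null = conn.execute(
    "SELECT name FROM instructor WHERE phone_number = NULL"
).fetchall()
```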
The functions of the database administrator include:
Schema definition.
Storage structure and access-method definition.
Schema and physical-organization modification.
Granting of authorization for data access.
Routine maintenance.
An entity set that does not have sufficient attributes to form a primary key is termed a weak entity
set. An entity set that has a primary key is termed a strong entity set.
Keys:
A superkey is a set of one or more attributes that, taken together, identify a tuple uniquely in a relation.
A superkey may contain extraneous attributes. For example, the combination of ID and name is a
superkey for the relation instructor. If K is a superkey, then so is any superset of K. We are often
interested in superkeys for which no proper subset is a superkey. Such minimal superkeys are called
candidate keys.
The term primary key denotes a candidate key that is chosen by the database designer as the principal
means of identifying tuples within a relation.
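The difference between superkeys and candidate keys can be checked mechanically on a sample of tuples. Strictly, a key is a property of the relation schema (it must hold for every legal instance), so the sketch below only tests one invented instance. The function names and the sample data are assumptions for illustration.

```python
from itertools import combinations


def is_superkey(rows, attributes):
    """True if no two rows agree on all of the given attributes."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attributes)
        if value in seen:
            return False
        seen.add(value)
    return True


def candidate_keys(rows, all_attributes):
    """Superkeys of this instance for which no proper subset is a superkey."""
    keys = []
    # Smaller attribute sets are tried first, so any superkey that contains
    # an already-found key is not minimal and is skipped.
    for size in range(1, len(all_attributes) + 1):
        for attrs in combinations(all_attributes, size):
            contains_smaller_key = any(set(k) <= set(attrs) for k in keys)
            if is_superkey(rows, attrs) and not contains_smaller_key:
                keys.append(attrs)
    return keys


# Hypothetical instance of the instructor relation.
instructor = [
    {"ID": "1", "name": "Wu", "dept_name": "Finance"},
    {"ID": "2", "name": "Wu", "dept_name": "Finance"},
    {"ID": "3", "name": "Kim", "dept_name": "Physics"},
]

id_and_name_is_superkey = is_superkey(instructor, ("ID", "name"))
name_is_superkey = is_superkey(instructor, ("name",))
keys = candidate_keys(instructor, ("ID", "name", "dept_name"))
```

On this instance the combination of ID and name is a superkey, but only ID is a candidate key, since name is an extraneous attribute.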
A relation, say r1, may include among its attributes the primary key of another relation, say r2. This
attribute is called a foreign key from r1, referencing r2. The relation r1 is also called the
referencing relation of the foreign key dependency, and r2 is called the referenced relation of the
foreign key.
Integrity Constraints:
1. primary key (Aj1, Aj2, ..., Ajm): The primary-key specification says that attributes
Aj1, Aj2, ..., Ajm form the primary key for the relation. The primary-key attributes are required to be
nonnull and unique; that is, no tuple can have a null value for a primary-key attribute, and no two
tuples in the relation can be equal on all the primary-key attributes.
2. foreign key (Ak1, Ak2, ..., Akn) references s: The foreign-key specification says that the
values of attributes (Ak1, Ak2, ..., Akn) for any tuple in the relation must correspond to
values of the primary-key attributes of some tuple in relation s.
3. not null: The not null constraint on an attribute specifies that the null value is not allowed for
that attribute; in other words, the constraint excludes the null value from the domain of that
attribute. For example, in Figure 3.1, the not null constraint on the name attribute of the
instructor relation ensures that the name of an instructor cannot be null.
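The three constraints can be tried out in sqlite3. This is a sketch under stated assumptions: the department and instructor tables, their columns and the violates helper are invented for the example. SQLite enforces foreign keys only after PRAGMA foreign_keys = ON, and it does not by itself reject nulls in a non-integer primary key, so NOT NULL is declared explicitly on the key columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.execute(
    "CREATE TABLE department (dept_name TEXT NOT NULL, PRIMARY KEY (dept_name))"
)
conn.execute("""
    CREATE TABLE instructor (
        ID        TEXT NOT NULL,
        name      TEXT NOT NULL,
        dept_name TEXT,
        PRIMARY KEY (ID),
        FOREIGN KEY (dept_name) REFERENCES department (dept_name)
    )
""")
conn.execute("INSERT INTO department VALUES ('Physics')")
conn.execute("INSERT INTO instructor VALUES ('22222', 'Einstein', 'Physics')")


def violates(sql):
    """Return True if executing `sql` is rejected by an integrity constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


# Same ID as an existing tuple: primary-key violation.
duplicate_key = violates(
    "INSERT INTO instructor VALUES ('22222', 'Gold', 'Physics')"
)
# 'History' is not a tuple of department: foreign-key violation.
dangling_reference = violates(
    "INSERT INTO instructor VALUES ('33333', 'Gold', 'History')"
)
# A null name: not-null violation.
missing_name = violates(
    "INSERT INTO instructor VALUES ('44444', NULL, 'Physics')"
)
# A tuple that satisfies all three constraints is accepted.
valid_row_rejected = violates(
    "INSERT INTO instructor VALUES ('55555', 'Gold', 'Physics')"
)
```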