DBMS_UNIT1
Introduction
Basic Concept
- In 1990, at a conference, Silberschatz spoke on database systems.
- The roots of DBMS lie in file-based systems.
Terms of DBMS
1. Data
- It is a known fact that can be recorded and has implicit meaning.
- Data consists of raw or isolated facts from which the required information is produced.
- A data item is an elementary description of things, events, activities and transactions that
are recorded.
- e.g., 45679 on its own is meaningless.
- Other examples are documents, photographic images, video segments etc.
2. Information
- It is defined as a collection of related data that, when put together, produces a useful
(meaningful) message for the recipient.
- Data is converted into a more useful or intelligent form.
- e.g., the marks and roll number of a student form data, whereas the marksheet is the information.
- Other examples are pay slips, schedules, reports, worksheets, invoices etc.
Note: Information may be processed further to form knowledge (wisdom).
# Difference between Data and Information
Data | Information
Data is raw fact and figure, e.g., 57 is data. | Data stored in some form, like Mark: 57, becomes information.
Data does not have significance in business. | Information has significance in business.
3. Metadata/Data dictionary
- It is data about data, also known as the system catalog.
- It is data that describes the objects in the database.
- It describes the database structure, constraints, applications, size of data etc.
- It is an integral tool for information resource management.
- It is used by developers to develop software, queries, controls and procedures to manage and
manipulate the warehouse data.
4. Database
- It is defined as a collection of logically inter-related data, together with a description of
that data, designed to meet the information needs of an organization.
- It is a self-describing collection of integrated records; this nature gives it program-data
independence.
- e.g., a dictionary, a telephone directory, a student record register etc., arranged in some order.
Features of a database:
a) Shared - shared among different users/applications.
b) Persistence - exists permanently, i.e., lives beyond the scope of the process.
c) Validity - should be correct with respect to the real-world entity.
d) Security - data should be protected from unauthorized users.
e) Consistency - values should be consistent with respect to the relationships among them.
f) Non-redundancy - no two data items in the database should represent the same real-world
entity.
g) Independence - data should be abstract, i.e., the different levels should be independent of
each other.
- A DBMS is an intermediate layer between programs and data. Programs access the DBMS, which
then accesses the data.
- A DBMS is a service for accessing a database that also maintains all the required features of
the data, e.g., dBase, FoxPro, MS-Access, Oracle, Sybase etc.
[Figure: the programmer reaches the data in the database through the DBMS.]
8. Distributed database
- It is defined as a collection of data under different DBMSs running on different computers.
- It is a central system connected to remote computers.
- All data can be accessed from each site, so there is not much load on a single central site.
- It can reside on network servers on the WWW, on intranets or on extranets.
Advantages of DBMS
I. Minimization of redundancy/duplication
- Data is not duplicated, because the database administrator has centralized control of the
data.
II. Avoidance of inconsistency of data
- Redundancy causes inconsistency of data, which can be removed by minimizing the
redundancy.
III. Sharing of data
- Stored data is accessible to multiple users, i.e., data can be shared among multiple
applications or users at the same time.
IV. Data privacy and security
a. Authentication - verifying identity, e.g., through a username/password.
b. Authorization - granting permission to read/write.
V. Data integrity
- Data stored in the database is accurate and consistent.
- To ensure data integrity, a number of constraints (rules) that the user can apply are
enforced, e.g., a mobile number must be 10 digits and numeric.
VI. Data independence - physical data independence and logical data independence
- If a change is made at a lower level of the architecture, the upper level is not affected,
i.e., data is independent at each level.
VII. Increased concurrency control
- Transactions can execute simultaneously on the same database.
- Multiple users can access the same data item at the same time.
VIII. Recovery and backup
- Users can back up and recover data from the database using the DBMS.
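The integrity rule mentioned above (a mobile number must be 10 digits and numeric) can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the customer table, its columns and the sample values are hypothetical, and other DBMSs express the same rule with their own constraint syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: the CHECK constraint encodes "10 digits and numeric".
conn.execute("""
    CREATE TABLE customer (
        name   TEXT NOT NULL,
        mobile TEXT NOT NULL
               CHECK (length(mobile) = 10 AND mobile NOT GLOB '*[^0-9]*')
    )
""")

conn.execute("INSERT INTO customer VALUES ('Ram', '9841234567')")  # valid value

rejected = []
for bad in ("98412345", "98412345ab"):  # too short / not numeric
    try:
        conn.execute("INSERT INTO customer VALUES ('Hari', ?)", (bad,))
    except sqlite3.IntegrityError:
        rejected.append(bad)  # the DBMS refuses the invalid value

row_count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
```

The rule lives in the database rather than in each application, so every program that inserts data is checked in the same way.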
Disadvantages of DBMS
I. Complexity of recovery and backup
- Backup and recovery are difficult when data grows to large volumes, in terms of
terabytes, petabytes, zettabytes etc.
- More knowledge and skill are required for backup and recovery.
II. Expensive hardware and software
- Expensive, high-quality hardware and software are used for better performance.
- The hardware and software purchased are costly, e.g., MS SQL Server.
III. Larger memory requirement for DBMS
- Because of its complexity and wide functionality, a DBMS requires more space in main
memory and on hard disk.
IV. Conversion cost
- The cost of conversion from an old file-processing system to a modern DBMS is high in
terms of money and time.
- Extra hardware may even be required.
V. Technically skilled staff is required
- A common user cannot operate a DBMS.
- Additional cost may be incurred to train and update the staff.
• 1980s
- At first the relational model was not used in practice because it could not match the
performance of the existing network and hierarchical database models; during the 1980s
relational systems became competitive with them.
- The fully functional System R prototype led to IBM's first relational database product,
SQL/DS.
- Initial commercial relational database systems appeared, such as IBM DB2, Oracle, Ingres
and DEC Rdb.
- The relational model came to dominate among data models.
- Research was done on parallel and distributed databases, as well as initial work on
object-oriented databases.
• Early 1990s
- SQL was designed primarily for decision-support applications, which are query-intensive,
whereas the mainstay of databases in the 1980s was transaction processing, which is
update-intensive.
- Database vendors introduced parallel database products and also added object-relational
support.
- There was explosive growth of the World Wide Web, where databases were deployed much
more extensively.
- Database systems had to support web interfaces to data.
• 2000s
- In the mid 2000s XML emerged, with its associated query language XQuery, as a new
database technology.
- To minimize system administration, there is growth in "autonomic computing / auto-admin"
techniques.
- There is significant growth in the use of open-source database systems, particularly
PostgreSQL and MySQL.
- Data-mining techniques are widely used; applications include web-based
product-recommendation systems and automatic placement of relevant advertisements on web
pages.
- e.g., Amazon, Facebook, Google, Microsoft, Yahoo.
b) Software
- It is the basic interface or layer between the physical database and the users of the system.
- Most commonly it is known as the Database Management System (DBMS).
- The DBMS shields the database users from complex hardware-level detail.
- Basically, the software component includes:
DBMS software + application programs + OS
c) Data
- From the end user's point of view, data is the most important component of the DBMS
environment.
- Data acts as a bridge between the machine components and the user components.
- The database contains both the operational data and the metadata.
- The database should contain all data needed by the organization.
Note: A database should be designed, built and populated for a particular audience and for a
specific purpose.
d) Users
- These are the people who access/retrieve data on demand using the applications and
interfaces provided by the DBMS.
Database users can be classified as follows:
1. Agile users / Naive users
- These users are unaware of the presence of the database system.
- These users work through a menu-driven application program, where the response is indicated
to the user.
- e.g., an ATM user: the operations performed are very limited and have a precise effect on
the database.
2. Online users
- These users interact with the database via an online terminal, or indirectly via a user
interface and an application program.
- These users are aware of the presence of the database system.
- They may have acquired a certain amount of expertise through limited interaction with the
database.
3. Application programmers
- These programmers write database application programs using some programming language.
- The programs have commands available to manipulate the database.
e) Procedures
- These are the instructions and rules that govern the design and use of the database.
- Users and staff require documented procedures to use and run the system.
- The instructions might cover how to:
✓ Log on to the DBMS.
✓ Use a particular DBMS facility or application program.
✓ Start and stop the DBMS.
✓ Make backup copies of the database.
✓ Handle hardware or software failures.
✓ Change the structure of a table, or reorganize the database across multiple disks to improve
performance.
# Database Administrator (DBA)
• A DBA is an individual person or a group of persons with an overview of one or more
databases, who controls the design and the use of those databases.
• A DBA is often among the highest-paid persons in an organization.
• A DBA provides the necessary technical support for implementing policy decisions about the
database.
• He/she is the central controller of the database system, who manages all resources such as
the database, the DBMS and related software.
• He/she is supported by a team of system programmers and other technical assistants.
Functions of DBA
i. Defining conceptual schema
- The DBA creates the original database schema and the structure of the database.
ii. Physical database design
- The DBA decides how the data is to be stored in the database.
iii. Security and integrity check
- He/she is responsible for providing authorization and authentication.
- He/she must also ensure the integrity of the database.
iv. Backup and recovery strategies
- The DBA must define and implement a periodic backup and recovery strategy to recover the
database from all types of failure.
v. Granting access to users
- A DBA regulates the usage of specific parts of the database by various users.
- He/she grants users access to the database.
# Data Models
Schemas and Instances
Schema
- It is defined as the outline or plan that describes the records existing at a particular
level.
- A DBMS has a three-level architecture, so there are three different types of schemas in the
database, i.e., one schema per level.
- It is the overall description of the database.
- A schema diagram displays only some aspects of a schema, such as the names of record types
and data items and some types of constraints.
- It is specified during database design and is not expected to change frequently.
- It also describes the way in which data elements at one level can be mapped to the
corresponding data elements at the next level.
- Schemas are generally stored in the data dictionary.
Instance
- It is the collection of information stored in the database at a particular moment.
- The database changes over time as information is inserted or deleted.
- For example, an instance of Stud_Address:
Roll_no  Name         Address  Place      Pin
1        Ram prasad   Jnk-10   Janakpur   46500
2        Hari Sharma  Jnk-12   Kathmandu  57912
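The difference between a schema and an instance can be sketched with Python's built-in sqlite3 module. The table mirrors the Stud_Address example; the column types and the helper names schema() and instance() are assumptions made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE stud_address (
        roll_no INTEGER PRIMARY KEY,
        name TEXT, address TEXT, place TEXT, pin TEXT
    )
""")

def schema(conn):
    # Column names as recorded in SQLite's catalog (its data dictionary).
    return [row[1] for row in conn.execute("PRAGMA table_info(stud_address)")]

def instance(conn):
    # The rows stored at this particular moment.
    return conn.execute("SELECT * FROM stud_address ORDER BY roll_no").fetchall()

schema_before = schema(conn)
instance_before = instance(conn)  # no rows yet

conn.executemany("INSERT INTO stud_address VALUES (?, ?, ?, ?, ?)", [
    (1, "Ram prasad", "Jnk-10", "Janakpur", "46500"),
    (2, "Hari Sharma", "Jnk-12", "Kathmandu", "57912"),
])

schema_after = schema(conn)      # unchanged by the inserts
instance_after = instance(conn)  # now holds two rows
```

Inserting rows changes the instance, while the schema stays exactly as it was defined.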
Subschema
- It is defined as an application programmer's or user's view of the data items and record types.
- It is a subset of the schema and inherits the same properties that the schema has.
- Different applications can have different views of the data.
DBMS Architecture / 3-tier Architecture / ANSI-SPARC Architecture
- In 1971, the Conference on Data Systems Languages (CODASYL) appointed the Data Base Task
Group (DBTG), which gave standard terminology and a general architecture for database
systems.
- In 1975, the American National Standards Institute (ANSI) Standards Planning and
Requirements Committee (SPARC) recognized the need for a three-level approach.
- The DBMS architecture has three levels:
a) External or view level
b) Conceptual or logical level
c) Internal or storage/physical level
Note: The internal level covers the physical implementation of the database to achieve optimal
run-time performance and storage utilization.
The figure below shows the specific information available at each level regarding a particular user.
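As a rough illustration of the external and conceptual levels, a SQL view can play the role of one user's external view over a conceptual-level table. The sketch below uses Python's built-in sqlite3 module; the employee table, the employee_directory view and all values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: the full logical description of the data.
conn.execute(
    "CREATE TABLE employee "
    "(emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [(1, "Ram", "Sales", 50000.0), (2, "Hari", "Accounts", 60000.0)],
)

# External level: one user's view, which hides the salary column.
conn.execute(
    "CREATE VIEW employee_directory AS SELECT emp_id, name, dept FROM employee"
)

cursor = conn.execute("SELECT * FROM employee_directory ORDER BY emp_id")
view_columns = [d[0] for d in cursor.description]
view_rows = cursor.fetchall()
```

How the rows are laid out on disk (the internal level) is left entirely to the DBMS and does not appear in either definition.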
# Mapping
- There are two types of mapping between the three different views: external/conceptual
mapping and conceptual/internal mapping.
- The DBMS is responsible for mapping between the three types of schemas.
• The ANSI-SPARC 3-tier architecture provides efficient mapping and more data
independence.
# Data Independence
- The major objective of the ANSI-SPARC 3-tier architecture is to provide data
independence, in which upper levels are unaffected by changes in lower levels.
- There are two types of data independence:
a) Physical data independence
b) Logical data independence
a) Physical data independence
- It indicates that the physical storage structures or devices can be changed without affecting
the conceptual schema.
- The change is absorbed by the mapping between the conceptual and physical/internal
levels.
- This independence is achieved by the presence of the internal level of the database and the
mapping/transformation from the conceptual level of the database to the internal level.
- If there is a need to change the file organization or the physical device used, only the
conceptual/internal mapping between the conceptual and internal levels needs to change.
[Note: Physical data independence requires that the conceptual level does not specify storage
structures or access methods (like indexing, hashing) used to retrieve the data from the physical
storage medium.]
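Physical data independence can be sketched with Python's built-in sqlite3 module: a change at the physical level (here, adding an index) leaves the table definition and the query text untouched, and the query returns the same answer. The student table, the index name and the rows are assumptions made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, place TEXT)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Ram", "Janakpur"), (2, "Shyam", "Pokhara"),
     (3, "Hari", "Dharan"), (4, "Manish", "Janakpur")],
)

# The application only ever sees this query, written against the conceptual schema.
QUERY = "SELECT name FROM student WHERE place = ? ORDER BY roll_no"

before = conn.execute(QUERY, ("Janakpur",)).fetchall()

# Physical-level change: a new access structure is added.
conn.execute("CREATE INDEX idx_student_place ON student (place)")

after = conn.execute(QUERY, ("Janakpur",)).fetchall()
```

The DBMS absorbs the change in its conceptual/internal mapping, so the application needs no modification.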
# Database Language
- A DBMS must provide an appropriate language for each category of users to express
database queries and updates.
- These database languages are used to create and maintain the database on a computer.
6. Host language
- It is the language in which DML commands are embedded.
- Most DBMSs have a facility for embedding SQL queries in high-level programming
languages like VB 6.0, VC# 6.0 etc.
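A minimal sketch of the host-language idea, using Python as the host language and its built-in sqlite3 module in place of the VB/VC# facilities named above. The result table, the function marks_above and the sample marks are hypothetical.

```python
import sqlite3

def marks_above(conn, threshold):
    """Host-language function that embeds a SQL (DML) query."""
    cursor = conn.execute(
        "SELECT roll_no, marks FROM result WHERE marks > ? ORDER BY roll_no",
        (threshold,),  # a host-language variable bound into the query
    )
    return cursor.fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE result (roll_no INTEGER PRIMARY KEY, marks INTEGER)")
conn.executemany("INSERT INTO result VALUES (?, ?)", [(1, 57), (2, 81), (3, 40)])

passed = marks_above(conn, 50)
```

The SQL text carries the data manipulation, while the surrounding Python supplies variables, control flow and further processing of the returned rows.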
# Database Interfaces
- A database management system (DBMS) interface is a user interface that allows queries to be
input to a database without using the query language itself.
- The background detail of the database is unknown to the user.
- A database interface provides a mechanism through which the user interacts with the database
without using a query language (like SQL).
- Generally, the user interacts with the database management system through an interface. The
DBMS does the processing and retrieves the data from the database.
- A database system is divided into two modules:
1. Storage management
2. Query processing
- The data stored in a database may amount to more than a trillion bytes.
- Main memory cannot accommodate such a large amount of data, hence the data is stored on
disk.
- For processing, however, data needs to be transferred from disk to main memory.
- This transfer consumes time, hence the data needs to be arranged so that the amount of data
transferred is not too high.
- This is taken care of by the storage manager.
- The main task of a database system is to provide the user with a simplified view of the data.
- This is achieved by hiding the physical-level implementation details from the user; the user is
provided with only a high-level view.
- For faster processing, however, the operations need to be done at the physical level. The query
processor translates queries at the logical level into an optimal sequence of operations at the
physical level.
1. Storage Management
- Storage management is handled by the storage manager, which is basically a program module.
- This module acts as an interface between the low-level data stored in the database and the
application programs.
- Its components are:
i) Transaction manager
ii) File manager
iii) Buffer manager
iv) Integrity manager
v) Authorization manager
i) Transaction Manager
- It manages transactions so as to ensure that data remains in a consistent state even after
system failures.
- It also enables the execution of concurrent transactions without any conflicts.
v) Authorization Manager
- It checks the authority of users and allows only authorized users to access the data.
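The transaction manager's guarantee that data stays consistent can be sketched with Python's built-in sqlite3 module. The account table, the transfer function and the balances are hypothetical; the point is only that a transfer either happens completely or not at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account "
    "(acc_no INTEGER PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move money as one transaction: both updates happen, or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE acc_no = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE acc_no = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the credit made above has been rolled back

ok = transfer(conn, 1, 2, 30)       # succeeds
failed = transfer(conn, 1, 2, 500)  # would overdraw account 1, so it is undone
balances = conn.execute(
    "SELECT acc_no, balance FROM account ORDER BY acc_no"
).fetchall()
```

After the failed transfer, the credit to account 2 is not left behind, so the total amount of money is unchanged.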
The following are the different data structures used by storage manager.
(a) Data Files
These are the files that contain the database.
(b) Data Dictionary
It maintains metadata regarding the different data structures used in database.
(c) Indices
It provides fast access to the required data items.
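The role of indices can be observed, as a sketch, by asking SQLite for its query plan before and after an index exists. The example uses Python's built-in sqlite3 module; the student table and the index name idx_student_place are assumptions, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, place TEXT)"
)

def plan(conn):
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT name FROM student WHERE place = 'Janakpur'"
    ).fetchall()
    # The last column of each row is the human-readable description of a step.
    return " ".join(row[-1] for row in rows)

plan_without_index = plan(conn)  # the whole table has to be scanned

conn.execute("CREATE INDEX idx_student_place ON student (place)")

plan_with_index = plan(conn)     # the plan now names the index
```

Without the index the DBMS reads every row; with it, the DBMS can jump directly to the matching rows.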
2. Query Processor
i) Interpreter of Data Definition Language Statements
- DDL statements written by DBA to define the schema are interpreted and stored in the data
dictionary.
Data Models
I. Hierarchical model
- It is one of the oldest data models, dating from the late 1950s.
- Data is viewed as a collection of relations (segments) that form a hierarchical relation.
- The hierarchical relations are connected together in a family tree.
- Each segment contains multiple segment instances.
- The segment pointed to by the logical association is referred to as the child segment and the
other segment as the parent segment.
- In general, the segments themselves can form a tree with multiple levels and multiple
branches at each level.
- The segment without a parent is called the root.
- The segments that have no children are called the leaves of the hierarchical model.
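The terms parent, child, root and leaf can be sketched in a few lines of Python. The segment names below are hypothetical and are not taken from any particular hierarchical DBMS; each child segment is mapped to its single parent.

```python
# Hypothetical hierarchy: child segment -> parent segment.
parent_of = {
    "Department": "College",
    "Faculty": "Department",
    "Student": "Department",
    "Course": "Faculty",
}

segments = set(parent_of) | set(parent_of.values())
parents = set(parent_of.values())

roots = sorted(s for s in segments if s not in parent_of)  # no parent
leaves = sorted(s for s in segments if s not in parents)   # no children

def path_to_root(segment):
    """Follow parent links upward, as a hierarchical DBMS navigates the tree."""
    path = [segment]
    while path[-1] in parent_of:
        path.append(parent_of[path[-1]])
    return path

course_path = path_to_root("Course")
```

Because every segment has at most one parent, there is exactly one path from any segment up to the root.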
Example
Let us consider two segments:
Faculty: Name, Department, Course taught
Fig: Network data model (record types A, B, C, D, E and F).
- In the figure above, member B has only one owner, A, whereas member E has two owners, B
and C.
- Each link between two record types represents a 1:M relationship.
- There are lateral as well as top-down connections, hence the model allows 1:1, 1:M and M:M
relations among entities.
- This model supports multiple paths to the same record, hence avoids data redundancy.
Examples:
a) TOTAL by Cincom Systems Inc.
b) IDMS by Cullinane Corp.
c) EDMS by Xerox Corp.
Fig: A relation (INVOICE-LINE).
Here, degree = 4 and tuples = 5, so cardinality = 5.
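Degree and cardinality can be checked mechanically: the degree is the number of attributes and the cardinality is the number of tuples. In the Python sketch below, the attribute names and rows of INVOICE-LINE are invented for illustration, since the original figure is not reproduced here; only the counts match the text.

```python
# Hypothetical INVOICE-LINE relation; attribute names and values are made up.
attributes = ("invoice_no", "line_no", "item", "quantity")
tuples = [
    (101, 1, "Pen", 10),
    (101, 2, "Notebook", 5),
    (102, 1, "Chalk", 20),
    (103, 1, "Duster", 2),
    (103, 2, "Pen", 3),
]

degree = len(attributes)   # number of attributes (columns)
cardinality = len(tuples)  # number of tuples (rows)
```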
6. Keys of a relation
- Normally, all data related to a student is not stored in a single table.
- Data that is permanent, like name, date of birth, address and parent's address, is stored in
one table, referred to as the MASTER or PARENT table.
- The MASTER table will contain only one record for every student.
Student table
Roll_no  Name    Address   Place     Pin
1        Ram     Jnk-1     Janakpur  11002
2        Shyam   Pkr-5     Pokhara   15600
3        Hari    Dharan-9  Dharan    12905
4        Manish  Jnk-10    Janakpur  11001
Student table
Roll_no Subject Date Marks
1 Ram Jnk-1 Janakpur
2 Shyam Pkr-5 Pokhara
3 Hari Dharan-9 Dharan
4 Manish Jnk-10 Janakpur
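The master/parent table idea can be sketched with Python's built-in sqlite3 module: the master table holds one row per student, and a second table refers back to it through the key Roll_no. The marks table, its columns and all mark values are hypothetical and serve only to illustrate the key relationship.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

# MASTER (parent) table: exactly one record per student.
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, "
    "name TEXT, address TEXT, place TEXT, pin TEXT)"
)
# Detail (child) table: many records per student, linked by roll_no.
conn.execute("""
    CREATE TABLE marks (
        roll_no   INTEGER NOT NULL REFERENCES student (roll_no),
        subject   TEXT NOT NULL,
        exam_date TEXT,
        marks     INTEGER,
        PRIMARY KEY (roll_no, subject)
    )
""")

conn.execute("INSERT INTO student VALUES (1, 'Ram', 'Jnk-1', 'Janakpur', '11002')")
conn.executemany(
    "INSERT INTO marks VALUES (?, ?, ?, ?)",
    [(1, "DBMS", "2024-01-10", 57), (1, "Maths", "2024-01-12", 64)],
)

try:
    # Roll number 9 does not exist in the master table.
    conn.execute("INSERT INTO marks VALUES (9, 'DBMS', '2024-01-10', 40)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

marks_for_ram = conn.execute(
    "SELECT COUNT(*) FROM marks WHERE roll_no = 1"
).fetchone()[0]
```

The primary key of the master table identifies each student once, and the foreign key stops the detail table from referring to a student who is not on record.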
2. No anomalies
- Unlike the hierarchical and network models, this model does not suffer from insert, update,
delete and retrieval anomalies.
3. Structural independence
- This model does not depend on a navigational data access system.
- So, changes in the database structure do not affect data access.
2. Ease of design
- Ease of design and use can lead to the development and implementation of poorly designed
databases.
- A poorly designed database will slow the system and result in performance degradation and
data corruption.
4. SQL does not provide an efficient way to browse alphabetically through an index.
1. An ER model maps well to the relational model, i.e., the constructs used in the ER model can
be easily transformed into relational tables.
2. An ER model can be used by the database designer to communicate the design to the end user.
3. An ER model can be used as a design plan by the database developer to implement a data
model in specific DBMS software.
1. Entities
- It is a person, place or thing that exists and is distinguishable from others.
- For example, employees, table, chalk board, pen etc.
- An entity is analogous to a table in the relational model.
a) Independent entity
- It is an entity that does not rely on another for identification.
b) Dependent entity
- It is an entity that relies on another for identification.
- e.g., in an organization scenario, dept is independent and manager is dependent.
3. Relationships
- It is an association between two or more entities. Relationships are classified in terms of
degree, connectivity, cardinality and existence.
- For example, ISA relationship, HASA relationship etc.
4. Degree of a Relationship
- It is defined as the number of entities associated.
- There are three types of degrees of relationships: unary, binary and ternary.
5. Cardinality or Connectivity.
- It describes the mapping of associated entity instances in the relationship.
- The values of connectivity are "one" or "many".
- The actual count of elements associated with the connectivity is called the cardinality of
the relationship connectivity.
- There are three basic types of connectivity for relations, as follows:
c) Many-to-many (M:N)
- It is when, for one instance of entity A, there are zero, one or many instances of entity B,
and for one instance of entity B there are zero, one or many instances of entity A.
- For example, many teachers teach many students.
Note: An M:N relationship cannot be directly translated to relational tables; instead it must be
transformed into two or more one-to-many relationships using associative entities.
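The note above can be sketched with Python's built-in sqlite3 module: the M:N relationship between teachers and students is replaced by an associative table, here called teaches, that sits on the "many" side of two one-to-many relationships. All table names, column names and rows are assumptions made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE teacher (teacher_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE student (roll_no    INTEGER PRIMARY KEY, name TEXT);

    -- Associative entity: one row per (teacher, student) pair.
    CREATE TABLE teaches (
        teacher_id INTEGER REFERENCES teacher (teacher_id),
        roll_no    INTEGER REFERENCES student (roll_no),
        PRIMARY KEY (teacher_id, roll_no)
    );

    INSERT INTO teacher VALUES (1, 'Sita'), (2, 'Gita');
    INSERT INTO student VALUES (1, 'Ram'), (2, 'Hari');
    INSERT INTO teaches VALUES (1, 1), (1, 2), (2, 1);
""")

students_of_sita = [row[0] for row in conn.execute(
    "SELECT s.name FROM teaches t JOIN student s ON s.roll_no = t.roll_no "
    "WHERE t.teacher_id = 1 ORDER BY s.roll_no"
)]
teachers_of_ram = [row[0] for row in conn.execute(
    "SELECT te.name FROM teaches t JOIN teacher te ON te.teacher_id = t.teacher_id "
    "WHERE t.roll_no = 1 ORDER BY te.teacher_id"
)]
```

One teacher has many rows in teaches and one student has many rows in teaches, which together express the original many-to-many relationship.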
6. Attributes
- It defines the properties of a data object or entity; for example, the attributes of a flower
are its color, its name and its texture.
- Similarly, the attributes of a ball are its shape, color, size etc.
- Here flower and ball are the data objects. The classification of attributes is as follows:
[Figure: classification of attributes.]
Note: Modeling an attribute as a simple or composite attribute depends on the user's view of the data.
V. Derived attribute
- It is an attribute whose value is derivable from the value of a related attribute or set of
attributes, not necessarily in the same entity.
- For example, the age attribute can be derived from date of birth, so the two are related.
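A derived attribute is computed rather than stored. The Python sketch below derives age in completed years from a stored date of birth; the function name age_on and the sample dates are hypothetical.

```python
from datetime import date

def age_on(date_of_birth, today):
    """Derive age in completed years from the stored date of birth."""
    had_birthday = (today.month, today.day) >= (date_of_birth.month, date_of_birth.day)
    return today.year - date_of_birth.year - (0 if had_birthday else 1)

dob = date(2000, 6, 15)                       # stored attribute
age_day_before = age_on(dob, date(2024, 6, 14))  # derived attribute
age_on_birthday = age_on(dob, date(2024, 6, 15))
```

Storing only the date of birth avoids having to update a stored age every year.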
# Direction
- The direction of a relationship shows the originating entity of the relationship.
- The entity from which a relationship originates is the parent entity, and the entity where it
terminates is the child entity.
- Directed and undirected lines are used between the relationship set and the entity set.
• Directed line - used to indicate one occurrence.
• Undirected line - used to indicate many occurrences.
Examples:
[Figure: Department and Manager connected by directed and undirected lines.]
[Figure: total and partial participation, shown with Student, taking and Course.]
- If one or more entities of an entity set are left out without participating in the
relationship, it is known as partial participation.
- e.g., courses taken by students: there can be a course which is not taken by any student.
# Notations used in ER-Diagram
# Strong and Weak Entity sets
- An entity set which does not have sufficient attributes to form a primary key is known as a
weak entity set.
- On the other hand, an entity set that has a primary key is known as a strong entity set.
- Each weak entity set must be part of a 1:M relationship set, and this relationship is shown by
a double diamond.
- A member of a strong entity set is called a dominant entity and a member of a weak entity
set is called a subordinate entity.
- The discriminator (or partial key) of a weak entity set is the set of attributes that
distinguishes among all the entities of the weak entity set that depend on one particular
strong entity.
- The primary key of a weak entity set is formed from the primary key of the strong entity set
on which the weak entity set is existence dependent, plus the weak entity set's discriminator.
- A weak entity set is shown by double rectangles.
- The discriminator of a weak entity set is underlined with a dashed line.
[Figure: strong entity set Loan (loanno, Amount) related through Loan payment to a weak entity set with attributes payno and paydate.]
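The way a weak entity set borrows its key can be sketched with Python's built-in sqlite3 module, using the loan and payment attributes that appear in the figure. The table layout, column types and sample rows are assumptions made for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Strong entity set: has its own primary key.
    CREATE TABLE loan (loanno INTEGER PRIMARY KEY, amount REAL);

    -- Weak entity set: payno alone is only a discriminator.
    CREATE TABLE payment (
        loanno  INTEGER NOT NULL REFERENCES loan (loanno),
        payno   INTEGER NOT NULL,
        paydate TEXT,
        PRIMARY KEY (loanno, payno)  -- strong key + discriminator
    );

    INSERT INTO loan VALUES (1, 1000.0), (2, 500.0);
    -- payno 1 may repeat, because it belongs to different loans.
    INSERT INTO payment VALUES (1, 1, '2024-01-05'), (2, 1, '2024-01-07');
""")

try:
    # A second payment number 1 for the same loan is not allowed.
    conn.execute("INSERT INTO payment VALUES (1, 1, '2024-02-05')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

payment_count = conn.execute("SELECT COUNT(*) FROM payment").fetchone()[0]
```

A payment number is meaningful only together with the loan it belongs to, which is exactly what the composite primary key expresses.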
Generalization
- It is the process of identifying some common characteristics of a collection of entity sets
and creating a new entity set that contains the common features.
- It is a form of abstraction which specifies that two or more entities that share common
attributes can be generalized into a higher-level entity called a supertype or generic type.
- The lower-level entities become the subtypes, or dependent entities, of the supertype entity.
- Generalization is a bottom-up process.
- e.g., the sub-entities car, bus and bike can be generalized into one general super class (base
class) named Vehicle.
[Figure: car, bus and bike generalized into Vehicle.]
Specialization
- It is a process of identifying subsets of an entity (super class/super type) that share some
distinguishing characteristics.
- The sub classes are defined on the basis of some distinguishing characteristics of the
entities in the super class.
- Here, we first define a super class, then the sub classes, and then their attributes and
relationships.
- It is a top-down process.
[Figure: specialization of account through an ISA relationship.]
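Generalization and specialization describe the same super class/sub class hierarchy from two directions, and the ISA link behaves much like inheritance in a programming language. The Python sketch below uses the Vehicle example; the attributes registration_no, wheels and doors and the sample values are hypothetical.

```python
class Vehicle:
    """Super class: holds the attributes common to all sub classes."""
    def __init__(self, registration_no, wheels):
        self.registration_no = registration_no
        self.wheels = wheels

class Car(Vehicle):
    """Sub class: a Car ISA Vehicle with an extra attribute of its own."""
    def __init__(self, registration_no, doors):
        super().__init__(registration_no, wheels=4)
        self.doors = doors

class Bike(Vehicle):
    """Sub class: a Bike ISA Vehicle."""
    def __init__(self, registration_no):
        super().__init__(registration_no, wheels=2)

fleet = [Car("BA-1-1234", doors=4), Bike("BA-2-5678")]
all_are_vehicles = all(isinstance(v, Vehicle) for v in fleet)
wheel_counts = [v.wheels for v in fleet]
```

Reading the hierarchy from Car and Bike up to Vehicle is generalization (bottom-up); reading it from Vehicle down to Car and Bike is specialization (top-down).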
Example 2:
An insurance agent sells insurance policies to clients. Policies can be of different types, such as
vehicle insurance, life insurance, accident insurance etc. The agent collects monthly premiums
on the policies in the form of cheques of a local bank. Appropriate attributes must be assumed
for the various entities such as agents, vehicles and policies.
Draw an E-R model for the above system. Your E-R model should take advantage of extended E-R
notation where relevant.
Aggregation
- It is a process of compiling information on an object, thereby abstracting a higher-level
object.
- The E-R model cannot express relationships among relationships; aggregation is a solution to
this problem.
- Aggregation shows a "has a" or "is part of" relationship between entity types, where one
participant represents the "whole" and the other the "part".
- This special kind of relationship is called an aggregation [BOOCH, 1998].
- Let us consider a ternary relationship, work-on, between Employee, Branch and
Manager.
- The best way to model this situation is to use aggregation. So, the relationship set work-on
is treated as a higher-level entity set.
- We can then create a binary relationship, manages, between work-on and manager to
represent who manages that task.