
DBMS Architectures and Features - Lecture 7 - Introduction To Databases (1007156ANR)

This document discusses database management system (DBMS) architectures and components. It describes DBMS components like the DML preprocessor, query compiler, and database manager. It also explains architectures like teleprocessing, file-server, and two-tier client-server architectures.

Introduction to Databases

DBMS Architectures and Features

Prof. Beat Signer

Department of Computer Science


Vrije Universiteit Brussel

beatsigner.com

2 December 2005
DBMS Components
[Figure: Components of a DBMS. Programmers, users and DB admins access the DBMS through application programs, queries and the database schema. The DBMS comprises the DML preprocessor (producing program object code), query compiler, DDL compiler, catalogue manager and database manager (authorisation control, command processor, integrity checker, query optimiser, transaction manager, scheduler, data manager, buffer manager, recovery manager). Access methods and the file manager work on the system buffers and on the database (data, indices and system catalogue). Based on 'Components of a DBMS', Database Systems, T. Connolly and C. Begg, Addison-Wesley 2010]

March 27, 2019 Beat Signer - Department of Computer Science - [email protected] 2


DBMS Components ...
▪ DML preprocessor
▪ transforms embedded SQL statements into statements of the host
language
▪ interacts with the query compiler to generate the appropriate host
language code
▪ Query compiler
▪ transforms queries into a set of low-level instructions (query plan)
which are forwarded to the database manager component
▪ DDL compiler
▪ converts a set of DDL statements into a set of tables
▪ tables and metadata are stored in the system catalogue
(catalogue manager)
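
The path from a DDL statement to a catalogue entry can be sketched as follows. This is a minimal illustration that assumes Python's built-in sqlite3 module as a stand-in DBMS (the lecture does not prescribe a system). The Customer table and its columns are illustrative; SQLite exposes its system catalogue as the sqlite_master table.

```python
import sqlite3

# In-memory SQLite database used as a stand-in DBMS (assumption).
conn = sqlite3.connect(":memory:")

# DDL statement: the DBMS creates the table and records it,
# together with its metadata, in the system catalogue.
conn.execute(
    "CREATE TABLE Customer ("
    "customerID INTEGER PRIMARY KEY, name TEXT, street TEXT)"
)

# SQLite exposes its catalogue as the sqlite_master table.
catalogue = conn.execute(
    "SELECT type, name FROM sqlite_master WHERE type = 'table'"
).fetchall()
print(catalogue)

# Column metadata is kept as well; index 1 of each row is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(Customer)")]
print(columns)
```

Other DBMSs expose the same kind of information through different catalogue tables or views.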



DBMS Components ...
▪ Catalogue manager
▪ provides access and manages the system catalogue
▪ used by most DBMS components
▪ Database manager
▪ processes user-submitted queries
▪ interfaces with application programs
▪ contains a set of components
- query optimiser
- transaction manager
- ...



Database Manager Components
▪ Authorisation control
▪ checks whether a user has the necessary rights to execute a
specific operation
▪ Command processor
▪ executes the steps of a given query plan handed over by the
authorisation control component
▪ Integrity checker
▪ ensures that the operation is not going to violate any integrity
constraints (e.g. key constraints)
▪ Query optimiser
▪ computes an optimal query execution strategy
▪ transforms the initial query plan into the best available sequence
of operations on the given data
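
Two of these components can be observed from the outside. The sketch below again assumes sqlite3 as a stand-in DBMS with an illustrative Customer table: the integrity checker rejects a duplicate key, and EXPLAIN QUERY PLAN (an SQLite-specific statement) shows the strategy chosen by the optimiser. The plan text depends on the SQLite version, so it is only printed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Customer (customerID INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO Customer VALUES (1, 'Max Frisch')")

# Integrity checker: a second row with the same key violates the
# key constraint and is rejected by the DBMS.
try:
    conn.execute("INSERT INTO Customer VALUES (1, 'Another Customer')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print("duplicate key rejected:", rejected)

row_count = conn.execute("SELECT COUNT(*) FROM Customer").fetchone()[0]

# Query optimiser: ask the DBMS which execution strategy it selected.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT name FROM Customer WHERE customerID = 1"
).fetchall()
for step in plan:
    print(step[-1])  # last column holds the human-readable plan step
```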
Database Manager Components ...
▪ Transaction manager
▪ processes any transaction-specific operations
▪ Scheduler
▪ manages the relative order in which transaction operations are
executed
▪ Recovery manager
▪ deals with commits and aborts of transactions
▪ ensures that the database remains in a consistent state in case of
failures
▪ Buffer manager
▪ transfers data between main memory and secondary storage
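
Commit and abort handling can be sketched with a small transfer between two accounts. The example assumes sqlite3 as a stand-in DBMS; the Account table, the CHECK constraint and the transfer function are hypothetical and only serve the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Account ("
    "id INTEGER PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO Account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, source, target, amount):
    """Move amount from source to target; both updates happen or neither."""
    try:
        # The target is credited first on purpose, so that a failing debit
        # leaves a partial update behind that the abort has to undo.
        conn.execute(
            "UPDATE Account SET balance = balance + ? WHERE id = ?",
            (amount, target),
        )
        conn.execute(
            "UPDATE Account SET balance = balance - ? WHERE id = ?",
            (amount, source),
        )
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        conn.rollback()  # abort: the credit above is undone as well
        return False


def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM Account ORDER BY id"))


first = transfer(conn, 1, 2, 30)    # succeeds and is committed
second = transfer(conn, 1, 2, 500)  # would overdraw account 1, so it aborts
final = balances(conn)
print(first, second, final)
```

The aborted transfer leaves no trace in the database, which is the consistency guarantee the recovery manager is responsible for.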



DBMS Architectures
▪ There is a wide variety of different DBMS architectures
▪ Teleprocessing
▪ File-Server Architecture
▪ Two-Tier Client-Server Architecture
▪ Three-Tier Client-Server Architecture
▪ N-Tier Architecture
▪ Peer-to-Peer Architecture
▪ Distributed DBMS
▪ Service-Oriented Architecture
▪ Cloud Architecture
▪ ...



Teleprocessing

[Figure: a central mainframe connected to terminals 1 to n]

▪ Traditional multi-user system architecture
▪ single mainframe and multiple (dumb) terminals
▪ Heavy load on the central mainframe
▪ runs application programs and DBMS
▪ formats data for presentation on terminals



Teleprocessing ...
▪ Tendency to replace expensive mainframes with networks
of personal computers (downsizing)
▪ file-server architectures
▪ client-server architectures
▪ ...



File-Server Architecture
[Figure: workstations 1 to n, each running a DBMS, send data requests over a LAN or WAN to the file-server, which returns files from the database]

▪ A file-server is a computer that is connected to a network and mainly serves as shared storage
▪ e.g. for realising shared access to a database
▪ In a file-server architecture the processing is distributed
over the network
▪ workstations (application and DBMS) request data (files)
File-Server Architecture ...
▪ SQL Request Example
SELECT name, street
FROM Customer, Order
WHERE Order.customerID = Customer.customerID AND name = 'Max Frisch';

▪ Since the file-server is not SQL-aware, the Customer and Order relations (files) have to be transferred to the client
▪ Disadvantages
▪ heavy network traffic
▪ high total cost of ownership (TCO)
- maintain a full instance of the DBMS on each client (workstation)
▪ complex integrity, concurrency and recovery control
- multiple DBMSs may concurrently access the same shared file
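
The network traffic argument can be made concrete with a small simulation. The sketch below is plain Python without a real network: the relations are lists of rows, and "transferred" simply counts the rows that would cross the network. All rows apart from the name 'Max Frisch' are made-up sample data, and the function names are hypothetical. It contrasts the file-server behaviour with an SQL-aware server as used in the client-server architectures.

```python
# Made-up sample relations stored as files on the server.
customers = [
    {"customerID": 1, "name": "Max Frisch", "street": "Street A 1"},
    {"customerID": 2, "name": "Customer B", "street": "Street B 2"},
    {"customerID": 3, "name": "Customer C", "street": "Street C 3"},
]
orders = [
    {"orderID": 10, "customerID": 1},
    {"orderID": 11, "customerID": 2},
    {"orderID": 12, "customerID": 1},
    {"orderID": 13, "customerID": 3},
]


def join_and_select(customers, orders, name):
    """Evaluate the example query: join on customerID, filter on name."""
    return [
        (c["name"], c["street"])
        for c in customers
        for o in orders
        if o["customerID"] == c["customerID"] and c["name"] == name
    ]


def file_server_query(name):
    # The file-server only ships files: both relations cross the network
    # and the DBMS on the workstation evaluates the query.
    transferred = len(customers) + len(orders)
    return join_and_select(customers, orders, name), transferred


def client_server_query(name):
    # An SQL-aware server evaluates the query and ships only the result.
    result = join_and_select(customers, orders, name)
    return result, len(result)


fs_result, fs_rows = file_server_query("Max Frisch")
cs_result, cs_rows = client_server_query("Max Frisch")
print(fs_rows, cs_rows)
```

Both variants return the same answer, but the file-server variant moves every row of both relations, and the gap grows with the size of the relations.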



Two-Tier Client-Server Architecture
[Figure: clients 1 to n send data requests over a LAN or WAN to the server running the DBMS, which returns only the selected data from the database]

▪ Application consists of a client (first tier) and a server (second tier) that might run on different machines
▪ clear separation of concerns between client and server
▪ thin client vs. thick client
- less or more application logic on the client side

▪ Supports decentralised business environments


Two-Tier Client-Server Architecture ...
▪ Client (first tier) tasks
▪ presentation of data (user interface)
▪ business and data processing logic
▪ send database requests to the server and process the results
▪ Server (second tier) tasks
▪ manage (concurrent) database access (data services)
- authorisation, integrity checks, query/update processing, recovery control, ...
▪ business logic (e.g. validation of data)
▪ Different possible client-server topologies
▪ single client and single server
▪ multiple clients and single server
▪ multiple clients and multiple servers



Two-Tier Client-Server Architecture ...
▪ Communication between client and server via inter-process communication (IPC) or over a network
▪ special protocols such as ODBC (JDBC) introduced earlier in the
course when discussing the Call Level Interface (CLI)
▪ Advantages
▪ increased performance
- certain tasks performed in parallel and server can be tuned for DB processing
▪ reduced communication costs
- only selected data is transferred
▪ reduced hardware costs
- only server has to run a DBMS
▪ increased consistency/security through separation of concerns
- constraint checking in a single place (server)



Two-Tier Client-Server Architecture ...
▪ Disadvantages
▪ limitations in terms of enterprise scalability with thousands of
potential clients
- significant client-side administration overhead
• e.g. expensive deployment of new business and data application logic
- thick client requires a considerable amount of resources (CPU, RAM, ...) to run
applications efficiently



Three-Tier Client-Server Architecture
[Figure: clients 1 to n, an application server and a database server with its database, connected via LAN or WAN; data requests are sent and only the selected data is returned]

▪ In the 1990s, the three-tier client-server architecture was introduced to address the enterprise scalability problem
▪ e.g. driven by emerging web applications
▪ Application consists of a presentation tier (client),
a logic tier (application server) and a data tier
(database server) that might run on different platforms
Three-Tier Client-Server Architecture ...
▪ Presentation tier (first tier) tasks
▪ presentation of data (user interface)
▪ basic input validation (thin client)
▪ send requests to the server and visualise results
▪ Logic tier (second tier / middle tier) tasks
▪ business logic
▪ data processing logic
▪ Data tier (third tier) tasks
▪ basic data validation
▪ manage (concurrent) database access (data services)
- authorisation, integrity checks, query/update processing, recovery control, ...
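
The division of tasks between the three tiers can be sketched with one class per tier. This is a minimal single-process illustration that assumes sqlite3 for the data tier; the class names, the Customer table and the uniqueness rule are hypothetical. In a real deployment the tiers would run on separate machines and talk over a network.

```python
import sqlite3


class DataTier:
    """Third tier: owns the database and does basic data validation."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(
            "CREATE TABLE Customer ("
            "customerID INTEGER PRIMARY KEY, name TEXT NOT NULL)"
        )

    def insert_customer(self, name):
        with self.conn:  # commits on success, rolls back on an exception
            cursor = self.conn.execute(
                "INSERT INTO Customer (name) VALUES (?)", (name,)
            )
        return cursor.lastrowid

    def customer_names(self):
        rows = self.conn.execute("SELECT name FROM Customer ORDER BY customerID")
        return [row[0] for row in rows]


class LogicTier:
    """Second tier: business logic running on the application server."""

    def __init__(self, data_tier):
        self.data = data_tier

    def register_customer(self, name):
        # Hypothetical business rule: customer names must be unique.
        if name in self.data.customer_names():
            raise ValueError("customer already registered")
        return self.data.insert_customer(name)


class PresentationTier:
    """First tier: thin client with basic input validation only."""

    def __init__(self, logic_tier):
        self.logic = logic_tier

    def submit_form(self, raw_name):
        name = raw_name.strip()
        if not name:
            return "Please enter a name."
        try:
            customer_id = self.logic.register_customer(name)
        except ValueError as error:
            return f"Rejected: {error}"
        return f"Registered customer {customer_id}."


client = PresentationTier(LogicTier(DataTier()))
messages = [
    client.submit_form("  Max Frisch "),
    client.submit_form("Max Frisch"),
    client.submit_form("   "),
]
print(messages)
```

Since the client only calls the logic tier, the business rule can be changed on the application server without redeploying any client.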



Three-Tier Client-Server Architecture ...
▪ Advantages
▪ reduced costs for thin clients due to lower resource requirements
(CPU, RAM, ...)
- e.g. applications running in a web browser
▪ application logic is centralised in a single application server
- reduces the software distribution problem (updates) that is present in
two-tier client-server architectures
▪ increased modularity
- easier to replace one tier without affecting the other tiers
▪ load balancing becomes easier with a clear separation between
the core business logic and the database functionality



Web Information Systems (WIS)

[Figure: a client exchanges HTTP requests and responses over the Internet with the application server, which accesses the database via the DB server]

▪ The three-tier architecture maps very naturally to web environments
▪ browser as a thin client, application server and database server
▪ The move towards thin browser clients has dramatically
reduced the costs for software deployment



N-Tier Architecture

[Figure: a client exchanges HTTP requests and responses over the Internet with a web server that serves HTML pages; behind the web server are the application server and the DB server with its database]

▪ The three-tier architecture can be extended with additional intermediary tiers for increased flexibility
▪ e.g. separation between web server and application server in the
previous web information system example
- increases the flexibility for load balancing by introducing multiple web servers
- only dynamic content delivered by the application server whereas static
content is directly managed by the web server



Peer-to-Peer Architecture

[Figure: sites 1 to n, each with its own database, connected via LAN or WAN]

▪ Systems exchanging information and services in a peer-to-peer-like manner without a central authority
▪ no global schema → need for schema integration (matching)
▪ Data and service sharing
▪ no dedicated clients and servers
▪ sites may dynamically form new client/server relationships
Middleware
▪ Software that connects (mediates) between software
components or applications
▪ hide complexity of heterogeneous and distributed components
(e.g. servers) and provide a uniform interface
▪ There exist different types of middleware
▪ remote procedure call (RPC)
- Java RMI
- CORBA
- XML RPC
▪ asynchronous publish/subscribe
- subscribe for different types of messages
▪ SQL-oriented data access
- open database connectivity (ODBC), JDBC, ...
▪ ...
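
The publish/subscribe style of middleware can be sketched with a tiny broker. This is an illustrative in-process version with hypothetical names (Broker, the topic strings); for brevity it delivers messages synchronously, whereas real publish/subscribe middleware would typically queue messages and deliver them asynchronously.

```python
from collections import defaultdict


class Broker:
    """Tiny in-process publish/subscribe broker (illustrative only)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        # A component registers interest in one type of message.
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # The publisher does not know who, if anyone, receives the message.
        for callback in self._subscribers[topic]:
            callback(message)
        return len(self._subscribers[topic])


broker = Broker()
received = []
broker.subscribe("order.created", received.append)
broker.subscribe("order.created", lambda message: received.append(message.upper()))

delivered = broker.publish("order.created", "order 42")
ignored = broker.publish("order.cancelled", "order 7")  # nobody subscribed
print(received, delivered, ignored)
```

Publisher and subscribers only share the broker and the topic name, which is the decoupling that middleware is meant to provide.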
Transaction Processing Monitor

[Figure: clients 1 to m access services 1 to r on an application server; a TP monitor connects these services to DB servers 1 to n and their databases]

▪ Complex applications can be built on top of several resource managers
▪ e.g. multiple DBMSs, operating systems, ...
▪ A Transaction Processing Monitor (TP Monitor) is a
middleware component that provides uniform access to
the services of a number of resource managers
Transaction Processing Monitor ...
▪ A TP Monitor offers a number of advantages
▪ transaction routing
- increase scalability by directing transactions to specific DBMSs
▪ distributed transaction management
- manage transactions that require access to multiple heterogeneous DBMSs
- e.g. based on XA standard for distributed transaction processing
▪ load balancing
- balance requests across multiple DBMSs by directing to least loaded server
▪ funnelling
- establish connections with DBMSs and funnel user requests through these
connections thereby reducing the number of required connections
▪ increased reliability
- if a DBMS fails, the TP monitor can resubmit the request to another DBMS or
hold the transaction request until the DBMS becomes available and resubmit
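
Load balancing and resubmission can be sketched in a few lines. The classes DBServer and TPMonitor are hypothetical stand-ins, "load" is simplified to the number of requests a server has handled so far, and a failed DBMS is modelled by raising ConnectionError. Distributed transaction management and funnelling are not covered by this sketch.

```python
class DBServer:
    """Stand-in for a DBMS behind the TP monitor (hypothetical)."""

    def __init__(self, name, available=True):
        self.name = name
        self.available = available
        self.handled = []

    @property
    def load(self):
        # Simplified load measure: requests handled so far.
        return len(self.handled)

    def execute(self, request):
        if not self.available:
            raise ConnectionError(f"{self.name} is down")
        self.handled.append(request)
        return f"{self.name} handled {request}"


class TPMonitor:
    def __init__(self, servers):
        self.servers = servers

    def submit(self, request):
        # Load balancing: try the least loaded server first.
        # Increased reliability: on failure, resubmit to the next server.
        for server in sorted(self.servers, key=lambda s: s.load):
            try:
                return server.execute(request)
            except ConnectionError:
                continue
        raise RuntimeError("no DBMS available")


db1, db2, db3 = DBServer("db1"), DBServer("db2", available=False), DBServer("db3")
monitor = TPMonitor([db1, db2, db3])
results = [monitor.submit(f"t{i}") for i in range(1, 5)]
print(results)
```

The clients only talk to the monitor, so the failed server db2 stays invisible to them and the requests end up spread over db1 and db3.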



Parallel Database Architectures

[Figure: three parallel architectures. Shared memory: processors and disks attached to a common memory. Shared disk: processors with private memory attached to common disks. Shared nothing: nodes that each consist of a processor, memory and disk]

▪ Parallel machines (multiple processors) can be used to speed up the transaction processing
▪ Different models for parallel database architectures
▪ shared memory, shared disk, shared nothing and hierarchical



Parallel Database Architectures ...
▪ Shared memory
▪ processors and disks have access to a shared memory via a bus
▪ very efficient communication between processors
▪ not scalable since memory bus becomes a bottleneck
▪ Shared disk
▪ all processors can access all disks via an interconnection network
▪ each processor has its own memory
▪ certain degree of fault tolerance if processor/memory fails
▪ disks may also provide fault tolerance (e.g. RAID architecture)
▪ interconnection to the disk systems becomes bottleneck



Parallel Database Architectures ...
▪ Shared nothing
▪ each node consists of a processor, memory and one or
more disks
▪ high-speed interconnection network between processors
▪ more scalable than shared memory or shared disk model
▪ increased communication costs for non-local disk access
▪ Hierarchical
▪ combines the different models (composition)
▪ top level is shared nothing between nodes
- each node can be a shared memory or shared disk "subsystem"
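
The shared nothing model can be sketched by spreading rows over nodes that each own their local storage. The placement rule (integer key modulo the number of nodes) is an assumption made for this illustration, as are the class names; the is_local flag marks accesses that would need the interconnection network.

```python
class Node:
    """One shared-nothing node with its own local storage (a dict here)."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.local_disk = {}


class SharedNothingCluster:
    def __init__(self, node_count):
        self.nodes = [Node(i) for i in range(node_count)]

    def _owner(self, key):
        # Assumed placement rule: hash partitioning on an integer key.
        return self.nodes[key % len(self.nodes)]

    def put(self, key, row):
        self._owner(key).local_disk[key] = row

    def get(self, key, from_node):
        owner = self._owner(key)
        # A non-local access has to go over the interconnection network.
        is_local = owner.node_id == from_node
        return owner.local_disk.get(key), is_local


cluster = SharedNothingCluster(node_count=3)
for customer_id in range(1, 7):
    cluster.put(customer_id, {"customerID": customer_id})

sizes = [len(node.local_disk) for node in cluster.nodes]
remote_row, remote_is_local = cluster.get(4, from_node=0)
local_row, local_is_local = cluster.get(3, from_node=0)
print(sizes, remote_is_local, local_is_local)
```

Adding nodes adds processors, memory and disks at the same time, which is why this model scales further than the other two, at the price of remote accesses.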



Distributed DBMS (DDBMS)

[Figure: sites 1 to n, each with its own database, connected via LAN or WAN]

▪ Distributed database
▪ logically related collection of shared data and metadata that is
distributed over a network
▪ Distributed DBMS
▪ software system to manage the distributed database in a
transparent way



Distributed DBMS (DDBMS) ...
▪ Distinction between local and global transactions
▪ local transaction
- accesses only data from the site from which the transaction was initiated
▪ global transaction
- accesses data from several different sites

▪ Reasons for building a distributed DBMS
▪ data sharing
- possibility to access data that resides at other sites
▪ autonomy
- each site retains a certain degree of control over the local data
▪ availability
- if one site fails the other sites may still be able to continue operating
- data might be replicated at several sites for increased availability



Distributed DBMS (DDBMS) ...
▪ Reasons for building a distributed DBMS ...
▪ costs and scalability
- use cluster of PCs instead of large mainframe systems
▪ integration of existing DBMS
- coexistence of legacy systems with new applications
▪ dynamic organisational structure
- mergers and acquisitions

▪ Implementation issues
▪ transactions have to be executed atomically across different
sites (two-phase commit protocol)
- commit decision is left to a single coordinator
▪ distributed concurrency control
- deadlock detection has to be carried out across multiple sites
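
The core idea of the two-phase commit protocol can be sketched as follows. The Participant class and the two_phase_commit function are hypothetical and heavily simplified: there is no logging, no timeout handling and no recovery from coordinator or site failures, all of which a real implementation needs.

```python
class Participant:
    """A site taking part in a global transaction (hypothetical)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "initial"

    def prepare(self):
        # Phase 1: vote on whether the local part can be committed.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Coordinator: commit only if every site votes yes."""
    votes = [site.prepare() for site in participants]  # phase 1: voting
    decision = all(votes)
    for site in participants:                          # phase 2: decision
        if decision:
            site.commit()
        else:
            site.abort()
    return "commit" if decision else "abort"


good_sites = [Participant("site1"), Participant("site2")]
good_outcome = two_phase_commit(good_sites)

mixed_sites = [Participant("site1"), Participant("site2", can_commit=False)]
mixed_outcome = two_phase_commit(mixed_sites)
print(good_outcome, mixed_outcome)
```

A single "no" vote aborts the transaction at every site, so no site ever commits a part of a transaction that another site has aborted.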



Service-Oriented Architectures (SOA)
▪ Architecture that modularises functionality as
interoperable services
▪ loose coupling of services
▪ service encapsulation
▪ interoperability between different operating systems and
programming languages
▪ new services can be defined as a mashup of existing services
▪ Service-oriented database architecture (SODA)
▪ e.g. single SQL Server processes acting as service providers



Service-Oriented Architectures (SOA) ...
▪ Share business logic, data and processes via web
service APIs
▪ Big Web Services
- Universal Description, Discovery and Integration (UDDI)
- Web Services Description Language (WSDL)
- Simple Object Access Protocol (SOAP)
▪ RESTful Web Services
▪ Web services are based on established technologies
such as the Extensible Markup Language (XML)
▪ Special service orchestration languages for the use of
services
▪ e.g. Business Process Execution Language (BPEL)



Cloud Computing

[Figure: clients 1 to n access "The Cloud", with Google, Microsoft, Yahoo and Amazon shown as providers]

▪ Internet-based computing with on-demand and pay-per-use access to shared resources, data and software
▪ Main characteristics
▪ web-based access (e.g. Web Service API or browser)
▪ pay only for the services that are actually used (pay-per-use)
▪ no initial investment (e.g. for resources) required
Cloud Computing ...
▪ Cloud services
▪ infrastructure as a service (IaaS)
- OS virtualisation
- e.g. Amazon EC2, Rackspace, ...
▪ platform as a service (PaaS)
- provide platform to run applications
- e.g. Google App Engine, Windows Azure Platform, ...
▪ software as a service (SaaS)
- provide software as a service over the Internet
- e.g. Google Docs, ...

▪ Cloud service vendor gets some degree of control



Cloud Computing ...
▪ New challenges for database management
in cloud computing
▪ cloud database server might be less reliable
- might become difficult to guarantee a specific quality of service (QoS) for an
application realised in the cloud
▪ backup, replication, ...
▪ Online or web-based databases
▪ store data in the cloud or on servers on the Internet



Cloud Data Service Example
▪ Amazon Simple Storage Service (Amazon S3)
▪ online storage service with unlimited storage space
▪ store objects (up to 5 TB in size) in buckets
▪ Web Service API
▪ Amazon SimpleDB
▪ distributed database written in Erlang
▪ offers a Web Service API
▪ makes use of S3 and EC2
▪ on demand scaling
▪ non-relational data store
- schemaless
- hashtables with set of key value pairs
- eventual consistency
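
What "schemaless" and "eventual consistency" mean for a client can be sketched with a toy store. This is not the SimpleDB API: the classes and method names are hypothetical, items are plain dictionaries of key value pairs, and replication is triggered by hand instead of running in the background.

```python
class Replica:
    def __init__(self):
        self.items = {}  # item name -> dict of attribute/value pairs


class EventuallyConsistentStore:
    """Schemaless key-value items replicated lazily (illustrative only)."""

    def __init__(self, replica_count=2):
        self.replicas = [Replica() for _ in range(replica_count)]
        self.pending = []  # writes not yet propagated to all replicas

    def put(self, item_name, attributes):
        # A write is acknowledged after reaching the first replica only.
        self.replicas[0].items.setdefault(item_name, {}).update(attributes)
        self.pending.append((item_name, dict(attributes)))

    def get(self, item_name, replica=0):
        return dict(self.replicas[replica].items.get(item_name, {}))

    def propagate(self):
        # Stand-in for background replication to the other replicas.
        for item_name, attributes in self.pending:
            for replica in self.replicas[1:]:
                replica.items.setdefault(item_name, {}).update(attributes)
        self.pending.clear()


store = EventuallyConsistentStore()
store.put("customer1", {"name": "Max Frisch"})
# Schemaless: another item may carry a different set of attributes.
store.put("customer2", {"name": "Customer B", "city": "Brussels"})

stale = store.get("customer1", replica=1)  # replica 1 has not caught up yet
store.propagate()
fresh = store.get("customer1", replica=1)
print(stale, fresh)
```

A read directly after a write may miss the new data, but once replication has caught up all replicas return the same item.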



Mobile DBMS
▪ Users want access to information on the move via mobile
devices
▪ tourist information systems
▪ salesperson who is visiting their customers
▪ emergency services
▪ ...
▪ New requirements for mobile DBMSs
▪ small footprint databases that can run on mobile devices with
limited resources
- e.g. db4objects, http://www.db4o.com
▪ location-dependent queries
▪ context-aware queries



Mobile DBMS ...
▪ New requirements for mobile DBMSs ...
▪ communicate with centralised database server via wireless
network or fixed Internet connections
▪ replicate data on a centralised server and on a mobile device
- synchronisation challenges
▪ caching of data and transactions to cope with potential network
connection failures
▪ opportunistic (peer-to-peer based) information exchange with
other mobile DBMSs
- e.g. dynamic P2P Bluetooth connections with other devices in range
(proximity-based information exchange)
▪ security
- which portion of a database can/should be replicated on a mobile device?
▪ ...
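
Caching of updates during a connection failure can be sketched as follows. The MobileClient class is hypothetical, the central database is reduced to a dictionary, and conflicts are ignored: the last update simply wins, which sidesteps the synchronisation challenges mentioned above.

```python
class MobileClient:
    """Caches updates locally while the server is unreachable (sketch)."""

    def __init__(self, server_store):
        self.server_store = server_store  # stand-in for the central database
        self.local_cache = {}
        self.pending = []                 # updates waiting for synchronisation
        self.online = False

    def update(self, key, value):
        self.local_cache[key] = value     # always applied locally first
        self.pending.append((key, value))
        if self.online:
            self.synchronise()

    def synchronise(self):
        # Replay cached updates in order; later updates overwrite earlier ones.
        sent = len(self.pending)
        for key, value in self.pending:
            self.server_store[key] = value
        self.pending.clear()
        return sent


server = {}
client = MobileClient(server)
client.update("visit:42", "planned")  # offline: only cached on the device
client.update("visit:42", "done")
server_before = dict(server)

client.online = True                  # connection is back
sent = client.synchronise()
print(server_before, sent, server)
```

The user can keep working while offline; the server only sees the updates once the connection is available again.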



Homework
▪ Study the following chapter of the
Database System Concepts book
▪ chapter 17
- sections 17.1-17.6
- Database-System Architectures



Exercise 7
▪ Structured Query Language (SQL)
▪ JDBC



References
▪ A. Silberschatz, H. Korth and S. Sudarshan,
Database System Concepts (Sixth Edition),
McGraw-Hill, 2010
▪ T. Connolly and C. Begg, Database Systems: A Practical
Approach to Design, Implementation and Management
(Fifth Edition), Addison-Wesley, 2010



Next Lecture
Storage Management
