Advanced Database Chapter 6 Distributed database

Chapter 6 discusses distributed databases and client-server architectures, covering concepts such as data fragmentation, replication, and allocation. It highlights the advantages of distributed databases, including increased availability and improved performance, while also addressing challenges like concurrency control and recovery. The chapter concludes with an overview of client-server architecture, detailing the roles of clients and servers in managing data distribution and query processing.

Uploaded by

leulz3000

Advanced Database

Chapter 6: Distributed Databases and Client-Server Architectures

Outline
• Distributed Database Concepts
• Data Fragmentation, Replication and Allocation
• Types of Distributed Database Systems
• Query Processing in Distributed Databases
• Concurrency Control and Recovery
• Client-Server Architecture

Distributed Database Concepts
• A distributed database (DDB) is a collection of multiple logically related databases distributed over a computer network.
• A transaction can be executed by multiple networked computers in a unified manner.

Distributed Database System
• Advantages
  - Management of distributed data with different levels of transparency.
  - Distribution transparency: the physical placement of data (files, relations, etc.) is not known to the user.

Distributed Database System (cont.)
• Example: the EMPLOYEE, PROJECT and WORKS_ON tables may be fragmented horizontally and stored at different sites, possibly with replication.
[Figure: fragmented and replicated tables; not reproduced.]

Distributed Database System (cont.)
• Advantages (cont.)
  - Replication transparency: copies of a data item can be stored at multiple sites for better availability, without the user needing to know that the copies exist.
  - Fragmentation transparency: a relation can be fragmented horizontally (a subset of its tuples) or vertically (a subset of its columns), without the user needing to know that the fragments exist.

Distributed Database System (cont.)
• Advantages (cont.)
  - Increased availability: availability is the probability that the system is continuously available (usable or accessible) during a time interval.
  - A distributed database system has multiple nodes (computers); if one fails, the others are still available to do the job.
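
The availability argument above can be made concrete with a small calculation. This is only a sketch under assumptions that are not in the slides: every copy is up with the same probability p, and copies fail independently. The function name `replicated_availability` is hypothetical.

```python
def replicated_availability(p: float, n: int) -> float:
    """Probability that at least one of n copies is reachable.

    Assumes (not stated in the slides) that each copy is up with
    probability p and that failures are independent.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability between 0 and 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    # All n copies are down with probability (1 - p) ** n.
    return 1.0 - (1.0 - p) ** n


single_site = replicated_availability(0.9, 1)
three_sites = replicated_availability(0.9, 3)
```

Under these assumptions, adding copies raises the chance that some copy can serve a read, which is the intuition behind the "if one fails, the others are available" bullet.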

Distributed Database System (cont.)
• Other advantages (cont.)
  - Improved performance: a distributed DBMS fragments the database to keep data closer to where it is needed most. This significantly reduces data management overhead (access and modification time).
  - Easier expansion (scalability): the system can be expanded by adding more data, increasing database sizes, or adding more processors.

Data Fragmentation, Replication and Allocation
• Data fragmentation
  - Split a relation into logically related and correct parts.
  - A relation can be fragmented in three ways: horizontal fragmentation, vertical fragmentation, and mixed (hybrid) fragmentation.
• Horizontal fragmentation
  - A horizontal fragment is a subset of the tuples of a relation: those tuples that satisfy a selection condition.
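
A minimal sketch of horizontal fragmentation, modelling a relation as a list of dictionaries. The sample rows, the `dno` split, and the helper name `horizontal_fragment` are made up for illustration and do not come from the slides.

```python
# Hypothetical EMPLOYEE rows, used only for illustration.
EMPLOYEE = [
    {"ssn": "111", "lname": "Smith", "salary": 30000, "dno": 5},
    {"ssn": "222", "lname": "Wong", "salary": 40000, "dno": 5},
    {"ssn": "333", "lname": "Zelaya", "salary": 25000, "dno": 4},
]


def horizontal_fragment(relation, condition):
    """Keep whole tuples that satisfy the selection condition."""
    return [row for row in relation if condition(row)]


# Two fragments whose conditions are disjoint and together cover every row.
emp_d5 = horizontal_fragment(EMPLOYEE, lambda r: r["dno"] == 5)
emp_other = horizontal_fragment(EMPLOYEE, lambda r: r["dno"] != 5)

# The original relation is recovered by taking the union of the fragments.
reconstructed = sorted(emp_d5 + emp_other, key=lambda r: r["ssn"])
```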

Data Fragmentation, Replication and Allocation (cont.)
• Vertical fragmentation
  - A vertical fragment is a subset of the columns of a relation, so it contains only the values of the selected columns.
  - Because no selection condition is used to create a vertical fragment, each fragment must include the primary key attribute of the parent relation (for example, Employee). In this way all vertical fragments of a relation are connected.
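
The role of the primary key can be sketched as follows: because every vertical fragment carries the key, the fragments can be joined back into the original rows. The sample data and the helper names `vertical_fragment` and `rejoin` are hypothetical.

```python
# Hypothetical EMPLOYEE rows; "ssn" plays the role of the primary key.
EMPLOYEE = [
    {"ssn": "111", "lname": "Smith", "salary": 30000, "dno": 5},
    {"ssn": "222", "lname": "Wong", "salary": 40000, "dno": 4},
]


def vertical_fragment(relation, key, columns):
    """Project each row onto the key plus the chosen columns."""
    keep = [key] + [c for c in columns if c != key]
    return [{c: row[c] for c in keep} for row in relation]


def rejoin(frag_a, frag_b, key):
    """Reconnect two vertical fragments by matching on the shared key."""
    index = {row[key]: row for row in frag_b}
    return [{**row, **index[row[key]]} for row in frag_a]


names = vertical_fragment(EMPLOYEE, "ssn", ["lname"])
pay = vertical_fragment(EMPLOYEE, "ssn", ["salary", "dno"])
restored = rejoin(names, pay, "ssn")
```

Without the key in both fragments there would be no reliable way to tell which salary belongs to which name.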

Data Fragmentation, Replication and Allocation (cont.)
• Mixed (hybrid) fragmentation
  - A combination of vertical fragmentation and horizontal fragmentation.
  - It is achieved by applying SELECT and PROJECT operations together.
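
A mixed fragment can be sketched as a selection followed by a projection. The rows and the helper name `mixed_fragment` below are illustrative assumptions, not part of the slides.

```python
# Hypothetical EMPLOYEE rows, used only for illustration.
EMPLOYEE = [
    {"ssn": "111", "lname": "Smith", "salary": 30000, "dno": 5},
    {"ssn": "222", "lname": "Wong", "salary": 40000, "dno": 5},
    {"ssn": "333", "lname": "Zelaya", "salary": 25000, "dno": 4},
]


def mixed_fragment(relation, condition, columns):
    """SELECT the rows that satisfy condition, then PROJECT onto columns."""
    return [
        {c: row[c] for c in columns}
        for row in relation
        if condition(row)
    ]


# Names (with the key) of the employees in department 5 only.
d5_names = mixed_fragment(EMPLOYEE, lambda r: r["dno"] == 5, ["ssn", "lname"])
```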

Data Replication and Allocation
• Data replication refers to the distribution of the whole database, or part of it, to a number of sites.
• It improves the availability of data and the performance of global queries, since the result of such a query can be obtained from any one site that holds a copy.
• In full replication the entire database is replicated at every site; in partial replication some selected parts are replicated at some of the sites.

Data Replication and Allocation (cont.)
• The disadvantage of full replication is that it can slow down update operations, since a single logical update must be performed on every copy of the database to keep the copies consistent.
• Each fragment must be assigned to a particular site in the distributed system. This process is called data distribution (or data allocation).
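
The update cost of replication and the idea of an allocation can be sketched together. The allocation table, site names, and functions below are hypothetical. The "read one copy, write every copy" policy is one simple way to keep copies consistent, assumed here for illustration.

```python
# Hypothetical allocation: fragment name -> sites that hold a copy.
ALLOCATION = {
    "EMP_D5": ["site1", "site2"],                 # partially replicated
    "EMP_D4": ["site3"],                          # not replicated
    "DEPARTMENT": ["site1", "site2", "site3"],    # fully replicated
}

# Per-site storage: site -> fragment -> {key: value}.
STORES = {
    site: {frag: {} for frag, sites in ALLOCATION.items() if site in sites}
    for site in ("site1", "site2", "site3")
}


def write(fragment, key, value):
    """Apply one logical update to every copy; return the physical write count."""
    sites = ALLOCATION[fragment]
    for site in sites:
        STORES[site][fragment][key] = value
    return len(sites)


def read(fragment, key):
    """Read from any single copy (here simply the first site listed)."""
    site = ALLOCATION[fragment][0]
    return STORES[site][fragment][key]


writes_replicated = write("DEPARTMENT", 5, "Research")   # touches 3 sites
writes_single = write("EMP_D4", "333", "Zelaya")         # touches 1 site
```

One logical update to a fragment held at three sites costs three physical writes, while a read still touches only one site.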

Types of Distributed Database Systems
• Homogeneous
  - All sites of the database system have an identical setup, i.e., the same database system software.
[Figure: five sites connected by a communications network; every site runs Oracle, on Windows, Unix, or Linux.]

Types of Distributed Database Systems (cont.)
• Heterogeneous
  - Federated: each site may run a different database system, but data access is managed through a single conceptual schema.
[Figure: five sites connected by a communications network, running different kinds of DBMS (relational, object-oriented, hierarchical, network) on Unix, Windows, and Linux.]

Types of Distributed Database Systems (cont.)
• The heterogeneity present in federated database systems (FDBSs) may arise from several sources:
  - Differences in data models: relational, object-oriented, hierarchical, network, etc.
  - Differences in constraints: each site may have its own data access and processing constraints.
  - Differences in query languages: even with the same data model, the languages and their versions may vary. SQL, for example, has multiple versions.

Query Processing in Distributed Databases
• Issues
  - The cost of transferring data (files and results) over the network is usually high, so some optimization is necessary.
• Example: suppose the Employee relation is stored at site 1 and the Department relation at site 2.
  - Employee at site 1: 10,000 rows, row size = 100 bytes, so table size = 10,000 * 100 = 1,000,000 (10^6) bytes.
  - Department at site 2: 100 rows, row size = 35 bytes, so table size = 100 * 35 = 3,500 bytes.

Query Processing in Distributed Databases (cont.)
• Issues (cont.)
  - Query Q: for each employee, retrieve the employee name and the name of the department where the employee works.
  - Q: π Fname, Lname, Dname (Employee ⋈ Dno = Dnumber Department)
  - Employee(Fname, Minit, Lname, SSN, Bdate, Address, Sex, Salary, Superssn, Dno)
  - Department(Dname, Dnumber, Mgrssn, Mgrstartdate)
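
What query Q computes can be sketched in a few lines, ignoring distribution for the moment. The sample rows and the function name `query_q` are made up for illustration. Only the attribute names come from the schemas above.

```python
# Hypothetical sample data using attribute names from the slide.
employee = [
    {"Fname": "John", "Lname": "Smith", "Dno": 5},
    {"Fname": "Alicia", "Lname": "Zelaya", "Dno": 4},
]
department = [
    {"Dname": "Research", "Dnumber": 5},
    {"Dname": "Administration", "Dnumber": 4},
]


def query_q(employee, department):
    """Join Employee and Department on Dno = Dnumber, then project the names."""
    dname_by_number = {d["Dnumber"]: d["Dname"] for d in department}
    return [
        (e["Fname"], e["Lname"], dname_by_number[e["Dno"]])
        for e in employee
        if e["Dno"] in dname_by_number
    ]


result = query_q(employee, department)
```

The result has one tuple per employee that has a matching department, which is why the example later assumes 10,000 result tuples.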

Query Processing in Distributed Databases (cont.)
• Result
  - If every employee is related to a department, the result of this query will have 10,000 tuples.
  - Suppose that each result tuple is 40 bytes long. The query is submitted at site 3 and the result is sent to this site: query result size = 40 * 10,000 = 400,000 bytes.
  - Suppose that the Employee and Department relations are not present at site 3.
[Figure: Employee at site 1, Department at site 2, result site 3.]

Query Processing in Distributed Databases (cont.)
• Strategies (available options):
  1. Transfer Employee and Department to site 3. Total bytes transferred = 1,000,000 + 3,500 = 1,003,500 bytes.
  2. Transfer Employee to site 2, execute the join at site 2, and send the result to site 3. Total bytes transferred = 1,000,000 + 400,000 = 1,400,000 bytes.
  3. Transfer Department to site 1, execute the join at site 1, and send the result to site 3. Total bytes transferred = 3,500 + 400,000 = 403,500 bytes.
• Optimization criterion: minimizing data transfer.
• Preferred strategy: strategy 3.

Query Processing in Distributed Databases (cont.)
• Now suppose the result site is site 2.
• Possible strategies:
  1. Transfer Employee to site 2, execute the query there, and present the result to the user at site 2. Total bytes transferred for Q = 1,000,000 bytes.
  2. Transfer Department to site 1, execute the join at site 1, and send the result back to site 2. Total bytes transferred for Q = 3,500 + 400,000 = 403,500 bytes.
• With the same criterion (minimizing data transfer), strategy 2 is preferred.

Concurrency Control and Recovery
• Distributed databases encounter a number of concurrency control and recovery problems that are not present in centralized databases.
• Some of these problems are:
  - Dealing with multiple copies of data items
  - Failure of individual sites
  - Communication link failure
  - Distributed commit
  - Distributed deadlock

Concurrency Control and Recovery (cont.)
• Details
  - Dealing with multiple copies of data items: concurrency control must maintain global consistency among the copies. Likewise, the recovery mechanism must recover all copies and maintain consistency after recovery.
  - Failure of individual sites: database availability must not be affected by the failure of one or two sites, and the recovery scheme must recover failed sites before they are made available for use again.

Concurrency Control and Recovery (cont.)
• Details (cont.)
  - Communication link failure: this failure may create a network partition, which would affect database availability even though all database sites may be running.
  - Distributed commit: problems can arise when a transaction accesses databases stored at multiple sites and some of those sites fail during the commit process. The two-phase commit protocol is used to deal with this problem.
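
The core idea of two-phase commit can be sketched as below. This is a deliberately simplified, in-memory illustration: the class and function names are hypothetical, and the logging, timeouts, and recovery steps that a real protocol needs are left out.

```python
class Participant:
    """One site taking part in a distributed transaction (illustrative only)."""

    def __init__(self, name: str, can_commit: bool = True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self) -> bool:
        # Phase 1: vote yes only if this site is able to commit.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self) -> None:
        self.state = "committed"

    def abort(self) -> None:
        self.state = "aborted"


def two_phase_commit(participants) -> str:
    """Coordinator logic: commit everywhere only if every site votes yes."""
    votes = [p.prepare() for p in participants]   # phase 1: collect votes
    if all(votes):
        for p in participants:                    # phase 2: global commit
            p.commit()
        return "committed"
    for p in participants:                        # phase 2: global abort
        p.abort()
    return "aborted"


ok_sites = [Participant("site1"), Participant("site2")]
mixed_sites = [Participant("site1"), Participant("site2", can_commit=False)]
outcome_ok = two_phase_commit(ok_sites)
outcome_mixed = two_phase_commit(mixed_sites)
```

The point of the two phases is that no site commits until every site has promised that it can, so the transaction ends the same way at all sites.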

Concurrency Control and Recovery (cont.)
• Details (cont.)
  - Distributed deadlock: since transactions are processed at multiple sites, two or more sites may become involved in a deadlock. This must be resolved in a distributed manner.
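
One way to see why a distributed deadlock is hard to spot is to model each site's "transaction A waits for transaction B" edges. The slides do not name a detection method; the sketch below assumes the common wait-for-graph idea and simply merges the local edges before searching for a cycle. All names and sample edges are hypothetical.

```python
def has_deadlock(edges) -> bool:
    """Return True if the wait-for graph built from (waiter, holder) edges has a cycle."""
    graph = {}
    for waiter, holder in edges:
        graph.setdefault(waiter, set()).add(holder)
        graph.setdefault(holder, set())

    visiting, done = set(), set()

    def visit(node) -> bool:
        visiting.add(node)
        for nxt in graph[node]:
            if nxt in visiting:          # back edge: a cycle exists
                return True
            if nxt not in done and visit(nxt):
                return True
        visiting.discard(node)
        done.add(node)
        return False

    return any(visit(n) for n in list(graph) if n not in done)


# At site 1, T1 waits for T2; at site 2, T2 waits for T1.
site1_edges = [("T1", "T2")]
site2_edges = [("T2", "T1")]

local_1 = has_deadlock(site1_edges)
local_2 = has_deadlock(site2_edges)
global_view = has_deadlock(site1_edges + site2_edges)
```

Neither site sees a cycle on its own; the deadlock only appears when the information from both sites is combined, which is why the problem cannot be handled purely locally.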

Concurrency Control and Recovery (cont.)
• Distributed concurrency control
  - Primary site technique: a single site is designated as the primary site, which serves as the coordinator for transaction management.
[Figure: five sites connected by a communications network, with one site marked as the primary site.]

Concurrency Control and Recovery (cont.)
• Transaction management
  - Concurrency control and commit are managed by the primary site.
  - All locks are kept at that site, and all requests for locking or unlocking are sent there.
  - In two-phase locking, this site manages the locking and releasing of data items.
  - If all transactions follow the two-phase policy at all sites, then serializability is guaranteed.
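
A lock table kept at the primary site might look like the sketch below. It assumes shared ("S") and exclusive ("X") lock modes and a transaction that releases all its locks at once when it ends; the class name and methods are hypothetical, and waiting queues are omitted (a refused request simply returns False).

```python
class PrimarySiteLockManager:
    """Single lock table that every site sends its lock requests to."""

    def __init__(self):
        self._locks = {}  # item -> (mode, set of transaction ids)

    def lock(self, txn: str, item: str, mode: str) -> bool:
        held = self._locks.get(item)
        if held is None:
            self._locks[item] = (mode, {txn})
            return True
        held_mode, holders = held
        if mode == "S" and held_mode == "S":
            holders.add(txn)              # shared locks are compatible
            return True
        if holders == {txn}:
            if mode == "X":               # sole holder may upgrade to exclusive
                self._locks[item] = ("X", holders)
            return True
        return False                      # conflict: request is refused

    def unlock_all(self, txn: str) -> None:
        """Release every lock held by txn (its shrinking phase, done at the end)."""
        for item in list(self._locks):
            _, holders = self._locks[item]
            holders.discard(txn)
            if not holders:
                del self._locks[item]


manager = PrimarySiteLockManager()
t1_shared = manager.lock("T1", "x", "S")
t2_shared = manager.lock("T2", "x", "S")
t2_exclusive_early = manager.lock("T2", "x", "X")   # refused: T1 still holds x
manager.unlock_all("T1")
t2_exclusive_late = manager.lock("T2", "x", "X")    # granted after T1 releases
t1_blocked = manager.lock("T1", "x", "S")           # refused: T2 holds x exclusively
```

Because every request passes through this one table, conflicts are decided in a single place, which is what makes the technique a direct extension of centralized two-phase locking.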

Concurrency Control and Recovery (cont.)
• Advantages
  - It is an extension of centralized two-phase locking and hence simple to implement and manage.
  - Data items are locked at only one site, but they can be accessed at any site at which they reside.
• Disadvantages
  - All transaction management activities go to the primary site, which is likely to overload it.
  - If the primary site fails, the entire system is inaccessible.

Concurrency Control and Recovery (cont.)
• Primary site with backup site
  - To aid recovery, a backup site is designated that behaves as a shadow of the primary site.
  - If the primary site fails, the backup site can take over as the primary site.

Client-Server Database Architecture
• It consists of clients running client software, a set of servers that provide all database functionality, and a reliable communication infrastructure.
[Figure: servers 1 to n connected to clients 1 to n.]

Client-Server Database Architecture (cont.)
• Server: responsible for local data management at a site, much like centralized DBMS software.
• Client: responsible for most of the distribution functions; it accesses data distribution information from the DBMS catalog and processes all requests that require access to more than one site.
• Communication software: manages communication among clients and servers.

Client-Server Database Architecture (cont.)
• The processing of an SQL query proceeds as follows:
  - The client parses a user query and decomposes it into a number of independent sub-queries, one for each server involved.
  - Each server processes its sub-query and sends the result to the client.
  - The client combines the results of the sub-queries and produces the final result.
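
The three steps above can be sketched with plain Python objects standing in for the servers and the client. Everything here is a simplification assumed for illustration: the class names, the catalog layout, and the use of a Python predicate in place of a parsed SQL sub-query are all hypothetical.

```python
class Server:
    """Holds one fragment and answers sub-queries against its local data."""

    def __init__(self, rows):
        self.rows = rows

    def run(self, predicate):
        return [row for row in self.rows if predicate(row)]


class Client:
    """Uses a catalog (fragment name -> server) to split and merge queries."""

    def __init__(self, catalog):
        self.catalog = catalog

    def query(self, predicate):
        # Steps 1 and 2: send the same selection to every server holding a fragment.
        partial_results = [srv.run(predicate) for srv in self.catalog.values()]
        # Step 3: combine the partial results into the final answer.
        return [row for part in partial_results for row in part]


# Hypothetical horizontally fragmented EMPLOYEE relation on two servers.
catalog = {
    "EMP_D5": Server([{"lname": "Smith", "salary": 30000, "dno": 5}]),
    "EMP_D4": Server([{"lname": "Zelaya", "salary": 25000, "dno": 4},
                      {"lname": "Wallace", "salary": 43000, "dno": 4}]),
}
client = Client(catalog)
well_paid = client.query(lambda r: r["salary"] >= 30000)
```

The user sees a single answer even though each server only looked at its own fragment.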

• Thank you.
• Questions?
