
Chapter 5

Distributed Databases
Distributed Database Concepts
A distributed computing system is a system which consists of a
number of processing elements, not necessarily homogeneous, that
are interconnected by a computer network, and that cooperate in
performing certain assigned tasks.
A distributed database (DDB) is a collection of multiple logically
interrelated databases distributed over a computer network.
A distributed database management system (DDBMS) is a
software system that manages a distributed database while making
the distribution transparent to the user.

Advantages of Distributed Databases
1. Management of distributed data with different
levels of transparency:
 Data Transparency: refers to the ability of a user to
access data as if it were stored in a single location,
regardless of its physical location within the network.
 Location Transparency: refers to the ability of the
system to hide the physical location of data, and the
user can access data without knowing its exact
location.
 Replication Transparency: refers to the ability of
the system to manage the replication of data across
multiple nodes automatically and transparently to
the user.
Distributed Databases
Failure Transparency: refers to the ability of the
system to transparently handle node failures, allowing
users to continue accessing the data, even if one or
more nodes have failed.
Scalability Transparency: refers to the ability of the
system to transparently handle increasing amounts of
data and user load, allowing the system to scale out as
needed.
These levels of transparency are important in
ensuring that a distributed database operates
seamlessly, effectively and transparently to the user.

Advantages of Distributed Databases
1. Management of distributed data with different levels of
transparency:
 A DBMS should be distribution transparent in the sense of hiding the
details of where each file (table, relation) is physically stored within the
system.
 The physical placement of data (files, relations, etc.) is hidden from the user (distribution transparency).

Advantages…..
The EMPLOYEE, PROJECT, and WORKS_ON tables may be fragmented horizontally and stored, with possible replication, at multiple sites.

Advantages …..
• Types of transparencies
• Distribution or network transparency: users do not have to worry about operational details of
the network.
 Location transparency refers to the freedom to issue commands from any location without affecting how they work.
 Naming transparency allows access to any named object (files, relations, etc.) from any location.
• Replication transparency:
 It allows copies of a data item to be stored at multiple sites.
 This is done to minimize access time to the required data.
 It makes the user unaware of the existence of copies.
• Fragmentation transparency:
 Allows a relation to be fragmented horizontally (creating a subset of the tuples of a relation) or vertically (creating a subset of the columns of a relation).
 Makes the user unaware of the existence of fragments.

Advantages …..
2. Increased reliability and availability:
 Reliability is the probability that a system is running (not down) at a certain time point.
 Availability is the probability that the system is continuously available (usable or accessible) during a time interval.
 A distributed database system has multiple nodes (computers), and if one fails the others are available to do the job.
3. Improved performance:
 A distributed DBMS fragments the database to keep data closer to where it is needed most.
 This reduces data management (access and modification) time significantly.
4. Easier expansion (scalability):
 New nodes (computers) can be added at any time without changing the entire configuration.
Functions of DDBMS
 Keeping track of data: The ability to keep track of the data distribution,
fragmentation, and replication by expanding the DDBMS catalog.
 Distributed query processing: The ability to access remote sites and
transmit queries and data among the various sites via a communication
network.
 Distributed transaction management: The ability to devise execution
strategies for queries and transactions that access data from more than one
site and to synchronize the access to distributed data and maintain integrity
of the overall database.
 Replicated data management: The ability to decide which copy of a
replicated data item to access and to maintain the consistency of copies of a
replicated data item.
Functions …..
 Distributed database recovery: The ability to recover from individual site crashes and from new types of failures, such as the failure of a communication link.
 Security: Distributed transactions must be executed with the proper management of
the security of the data and the authorization/access privileges of users.
 Distributed directory (catalog) management: A directory contains information
(metadata) about data in the database. The directory may be global for the entire
DDB, or local for each site. The placement and distribution of the directory are
design and policy issues.
 At the physical hardware level, the following main factors distinguish a DDBMS
from a centralized system:
 There are multiple computers, called sites or nodes.
 These sites must be connected by some type of communication network to transmit data and commands among sites.
Disadvantages of Distributed Databases
• Complexity- The data replication, failure recovery, network management
…make the system more complex than the central DBMSs.
• Cost- Since a DDBMS needs more people and more hardware, maintaining and running the system can be more expensive than a centralized system.
• Problem of connecting dissimilar machines- Additional layers of operating system software are needed to translate and coordinate the flow of data between machines.
• Data integrity and security problem- Because data maintained by
distributed systems can be accessed at any locations in the network,
controlling the integrity of a database can be difficult.
Data Fragmentation
 There are two approaches to store the relation in the distributed database:
Replication and Fragmentation
 Data Fragmentation: is a technique used to break up the database into logically
related units called fragments.
 A database can be fragmented as:
 Horizontal Fragmentation
 Vertical Fragmentation
 Mixed (Hybrid) Fragmentation

 The main reasons for fragmenting a relation are


 Efficiency- data that is not needed by the local applications is not stored
 Parallelism- a transaction can be divided into several subqueries that operate
on fragments which will increase the degree of concurrency.

Data Fragmentation …..
Horizontal Fragmentation: divides a relation "horizontally" by grouping rows to create subsets of tuples, where each subset has a certain logical meaning.
 It is a horizontal subset of a relation containing the tuples that satisfy a selection condition.
 E.g.: Consider the EMPLOYEE relation with selection condition (DNO = 5). All tuples satisfying this condition form a subset, which is a horizontal fragment of the EMPLOYEE relation.
 A selection condition may be composed of several conditions connected by AND or OR.

Data Fragmentation …..
Vertical Fragmentation: divides a relation "vertically" by columns.
 It is a subset of a relation created from a subset of its columns; thus a vertical fragment of a relation contains the values of the selected columns.
 A vertical fragment can be created by keeping the values of some attributes.
 Each fragment must include the primary key attribute of the parent relation so that the fragments can be connected.
Mixed (Hybrid) Fragmentation: the combination of vertical and horizontal fragmentation.
 This is achieved by a SELECT-PROJECT operation, represented by π_L(σ_C(R)).
Data Fragmentation …..
Representation
 There are three rules that must be followed during fragmentation
 Completeness: if a relation r is decomposed into fragments
r1, r2… rn , each data item that can be found in r must
appear in at least one fragment.
 Reconstruction: it must be possible to define a relation
operation that will reconstruct the relation r from fragments.
 Disjointness: if a data item di appears in fragment ri, then it should not appear in any other fragment.
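The three rules can be checked mechanically for a horizontal fragmentation. A small sketch, where the relation and its fragments are toy values chosen for illustration:

```python
# Check completeness, reconstruction, and disjointness for a
# horizontal fragmentation of a relation r (illustrative data).

r = [(1, "a"), (2, "b"), (3, "c")]
fragments = [[(1, "a"), (3, "c")], [(2, "b")]]

# Completeness: every tuple of r appears in at least one fragment.
complete = all(any(t in f for f in fragments) for t in r)

# Reconstruction: the union of the fragments rebuilds r exactly.
reconstructed = {t for f in fragments for t in f} == set(r)

# Disjointness: no tuple appears in more than one fragment.
all_tuples = [t for f in fragments for t in f]
disjoint = len(all_tuples) == len(set(all_tuples))

print(complete, reconstructed, disjoint)  # True True True
```

For vertical fragmentation the disjointness rule is relaxed: the primary key column is deliberately repeated in every fragment so reconstruction (by join) remains possible.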

Data Replication
In full replication the entire database is replicated at all sites; in partial replication, some selected part is replicated to some of the sites.
Full replication is useful in improving the availability of data, because the system can continue to operate as long as at least one site is up.

Data Replication…..
It also improves the performance of retrieval for global queries, because the result of such a query can be obtained locally from any one site.
The disadvantage of full replication is that it can slow
down update operations.
Each fragment(or each copy of a fragment) must be
assigned to a particular site in the distributed system. This
process is called data distribution (or data allocation).
Types of Distributed Systems
 Homogeneous
 All sites of the database system have identical setup, i.e., same database
system software.
 The system may have little or no local autonomy(not standalone)
 The underlying operating systems can be a mixture of Linux, Windows, Unix, etc.
[Figure: five sites, each running Oracle on Windows, Unix, or Linux, connected by a communications network.]
Types…..
 Heterogeneous
 Federated: Each site may run a different database system, but data access is managed through a single conceptual schema.
 This implies that the degree of local autonomy is minimal: each site must adhere to a centralized access policy. There may be a global schema.
 Each server is an independent and autonomous centralized DBMS that has its own local users, local transactions, and DBA, and hence has a very high degree of local autonomy.
 Multidatabase: There is no global conceptual schema. For data access, a schema is constructed dynamically as needed by the application software.
[Figure: five heterogeneous sites (object-oriented, relational, hierarchical, and network DBMSs on Unix, Windows, and Linux) connected by a communications network.]
Types…..
 Federated Database Management Systems Issues
Differences in data models:

 Relational, Objected oriented, hierarchical, network, etc.


Differences in constraints:
 Each site may have its own data accessing and processing constraints.
Differences in query language:
 Some sites may use SQL-89, some may use SQL-92, and so on.

Query Processing in Distributed Databases
• Issues
 Cost of transferring data (files and results) over the network.
 This cost is usually high, so some optimization is necessary.
 Example: relation Employee at site 1 and relation Department at site 2.
 Employee at site 1: 10,000 rows, row size = 100 bytes, table size = 10^6 bytes.
 Department at site 2: 100 rows, row size = 35 bytes, table size = 3,500 bytes.


 Q: For each employee, retrieve the employee name and the name of the department where the employee works.
 Q: π_{Fname,Lname,Dname} (Employee ⋈_{Dno = Dnumber} Department)
Employee(Fname, Minit, Lname, SSN, Bdate, Address, Sex, Salary, Superssn, Dno)
Department(Dname, Dnumber, Mgrssn, Mgrstartdate)
Query Processing…..
Result
The result of this query will have 10,000 rows, assuming
that every employee is related to a department.
Suppose each result row is 40 bytes long. The query is submitted at site 3 and the result is sent to this site.
Problem: Employee and Department relations are not
present at site 3.

Query Processing…..
 Strategies:
1. Transfer Employee and Department to site 3.
• Total transfer bytes = 1,000,000 + 3500 = 1,003,500 bytes.
2. Transfer Employee to site 2, execute join at site 2 and send the
result to site 3.
• Query result size = 40 * 10,000 = 400,000 bytes. Total transfer
size = 400,000 + 1,000,000 = 1,400,000 bytes.
3. Transfer Department relation to site 1, execute the join at site 1,
and send the result to site 3.
• Total bytes transferred = 400,000 + 3500 = 403,500 bytes.

• Optimization criterion: minimizing data transfer (communication cost).
– Preferred approach: strategy 3.
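The three strategies' costs follow directly from the sizes given above. A small sketch of just the arithmetic (not a real query optimizer):

```python
# Recompute the transfer costs of the three strategies for query Q,
# using the figures above: Employee is 10,000 rows x 100 B at site 1,
# Department is 100 rows x 35 B at site 2, the result is 10,000 rows
# x 40 B, and the query is submitted at site 3.

emp_size = 10_000 * 100    # 1,000,000 bytes
dept_size = 100 * 35       # 3,500 bytes
result_size = 10_000 * 40  # 400,000 bytes

strategies = {
    "1: ship both relations to site 3": emp_size + dept_size,
    "2: ship Employee to site 2, result to site 3": emp_size + result_size,
    "3: ship Department to site 1, result to site 3": dept_size + result_size,
}

best = min(strategies, key=strategies.get)
for name, cost in strategies.items():
    print(f"{name}: {cost:,} bytes")
print("preferred:", best)  # strategy 3: 403,500 bytes
```

The same arithmetic with a 100-row result reproduces the figures for query Q' on the next slide, where strategy 3 drops to 7,500 bytes.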

Query Processing…..
 Consider the query
– Q’: For each department, retrieve the department name and the name of the
department manager
 Relational Algebra expression:
 Q': π_{Fname,Lname,Dname} (Employee ⋈_{Mgrssn = SSN} Department)
 The result of this query will have 100 tuples, assuming that every department has a
manager, the execution strategies are:
1. Transfer Employee and Department to the result site and perform the join at site 3.
• Total bytes transferred = 1,000,000 + 3500 = 1,003,500 bytes.
2. Transfer Employee to site 2, execute join at site 2 and send the result to site 3.
Query result size = 40 * 100 = 4000 bytes.
• Total transfer size = 4000 + 1,000,000 = 1,004,000 bytes.
3. Transfer Department relation to site 1, execute join at site 1 and send the result to
site 3.
• Total transfer size = 4000 + 3500 = 7500 bytes.
– Preferred strategy: Choose strategy 3.
Query Processing…..
 Now suppose the result site is site 2. Possible strategies:
1. Transfer the Employee relation to site 2, execute the query, and present the result to the user at site 2.
 Total transfer size = 1,000,000 bytes for both queries Q and Q'.
2. Transfer the Department relation to site 1, execute the join at site 1, and send the result back to site 2.
 Total transfer size for Q = 400,000 + 3,500 = 403,500 bytes, and for Q' = 4,000 + 3,500 = 7,500 bytes.

Concurrency Control and Recovery
Distributed Databases encounter a number of concurrency
control and recovery problems which are not present in
centralized databases. Some of them are:
Dealing with multiple copies of data items

Failure of individual sites

Communication link failure

Distributed commit

Distributed deadlock

Concurrency Control …..
 Dealing with multiple copies of data items:
 The concurrency control must maintain global consistency. Likewise, the recovery

mechanism must recover all copies and maintain consistency after recovery.
 Global consistency refers to the state in which all nodes in a distributed system

have the same view of the data. It is a property of a distributed system that ensures
that all operations and updates are performed in a consistent and coordinated
manner across all nodes, regardless of their location.

 Failure of individual sites:


 Database availability must not be affected due to the failure of one or two sites and the

recovery scheme must recover them before they are available for use.

Concurrency Control …..
 Communication link failure:
 Communication link failure refers to the inability of two devices to communicate with each
other due to a problem with the communication channel connecting them.

 This failure may create network partition which would affect database availability even
though all database sites may be running.
 Distributed commit:
 Distributed commit is a mechanism for ensuring that a transaction in a distributed system is either fully committed or fully rolled back, ensuring the consistency of data across the system.
 A transaction may be fragmented, and its fragments may be executed by a number of sites. This requires a two- or three-phase commit approach for transaction commit.
 Distributed deadlock:
 Since transactions are processed at multiple sites, two or more sites may get
involved in deadlock. This must be resolved in a distributed manner.
Concurrency Control …..
 The process of distributed commit typically involves the
following steps:
 All nodes participating in the transaction begin by preparing to
commit.
 The coordinator node sends a commit request to all participants.
 Each participant acknowledges its readiness to commit.
 If all participants have acknowledged, the coordinator sends a final
commit message to all participants.
 Each participant then performs the necessary updates and sends a
final confirmation message to the coordinator.
 The coordinator, upon receiving final confirmation messages from all
participants, sends a "commit done" message to all participants.
 Each participant performs any necessary clean up and the transaction
is considered complete.
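The steps above can be sketched as the coordinator's side of a two-phase commit. Participant behaviour is simulated with simple callables, and all names are illustrative rather than a real DDBMS API:

```python
# Minimal two-phase commit sketch: the coordinator collects prepare
# votes, then either commits everywhere or aborts everywhere.

def two_phase_commit(participants):
    # Phase 1: ask every participant to prepare and collect its vote.
    votes = [p["prepare"]() for p in participants]
    if not all(votes):
        for p in participants:       # any "no" vote aborts the whole
            p["abort"]()             # transaction at every site
        return "aborted"
    # Phase 2: all voted yes, so tell every participant to commit.
    for p in participants:
        p["commit"]()
    return "committed"

log = []

def make_participant(name, vote=True):
    # Simulated site: records what it was told to do in a shared log.
    return {
        "prepare": lambda: (log.append(f"{name} prepared"), vote)[1],
        "commit":  lambda: log.append(f"{name} committed"),
        "abort":   lambda: log.append(f"{name} aborted"),
    }

outcome = two_phase_commit([make_participant("site1"), make_participant("site2")])
print(outcome)  # committed
```

A real coordinator would also force-write each decision to a log before sending it, which is what makes recovery from a coordinator crash possible.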

Concurrency Control …..
Distributed deadlock:
 Since transactions are processed at multiple sites, two or more sites may get
involved in deadlock. This must be resolved in a distributed manner.

A distributed deadlock occurs when multiple processes in a


distributed system are each waiting for resources held by the other
processes. This creates a circular dependency that prevents any of the
processes from continuing, resulting in a deadlock situation.

In a distributed system, deadlocks can occur across multiple nodes


and can be more difficult to detect and resolve than in a single-node
system. This is because each node may have its own view of the
resources it holds and the resources it is waiting for, and the coordinator
node responsible for detecting and resolving deadlocks may not have
complete information about the state of the system.

Concurrency Control …..
There are several methods for detecting and resolving
distributed deadlocks, including:
Centralized Deadlock Detection: In this approach,
a central coordinator node periodically examines the
state of the system to detect deadlocks.
Distributed Deadlock Detection: In this approach,
each node in the system periodically exchanges
information with its neighbors to detect deadlocks.
Timeouts: In this approach, each process waits for a
specified amount of time before assuming that a
deadlock has occurred and taking appropriate action.
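Centralized deadlock detection amounts to cycle detection on a global wait-for graph assembled by the coordinator. A sketch, where the dict-of-lists graph representation is an assumption for illustration:

```python
# Centralized deadlock detection sketch: the coordinator builds a
# global wait-for graph (transaction -> transactions it waits on)
# and looks for a cycle with a depth-first search.

def has_deadlock(wait_for):
    visiting, done = set(), set()

    def dfs(t):
        if t in visiting:
            return True            # back edge: a cycle, i.e. a deadlock
        if t in done:
            return False
        visiting.add(t)
        cyclic = any(dfs(u) for u in wait_for.get(t, ()))
        visiting.discard(t)
        done.add(t)
        return cyclic

    return any(dfs(t) for t in wait_for)

# T1 waits for T2 (say, at site A) and T2 waits for T1 (at site B):
# neither local graph has a cycle, but the global graph does.
print(has_deadlock({"T1": ["T2"], "T2": ["T1"]}))  # True
print(has_deadlock({"T1": ["T2"], "T2": []}))      # False
```

The example shows why local detection is not enough: the cycle only appears once the per-site edges are merged into one global graph.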

Concurrency Control …..
Once a deadlock is detected, there are several
strategies for resolving it, including:
Abort one or more of the processes involved in the
deadlock.
Preempt one or more resources held by a process
involved in the deadlock and allocate them to another
process.
Rollback one or more of the transactions involved in
the deadlock.

Concurrency Control…..
 Distributed Concurrency Control Based on a Distinguished Copy of a
Data Item
A distinguished copy of a data item is a specific instance of
a data item that is designated as the authoritative or
primary copy of that data item in a distributed system.

In a distributed system, multiple copies of a data item may


exist across different nodes. When updates are made to
the data item, it is important to ensure that all copies of
the data item are updated consistently and correctly. The
distinguished copy of the data item serves as the source of
truth for the data item, and all updates to the data item
are made to this copy first. The updates are then
propagated to all other copies of the data item.
Concurrency Control…..
 Distributed Concurrency Control Based on a Distinguished Copy of a
Data Item
The idea is to designate a particular copy of each data item as the distinguished copy; lock and update requests for that item are sent to the site that contains this distinguished copy.

 The distinction of a copy of a data item as the


distinguished copy can be based on various factors, such
as the location of the data item, the processing capabilities
of the node where the data item is stored, or the
availability of the data item. The choice of the
distinguished copy of a data item can have significant
impact on the performance, scalability, and reliability of a
distributed system, so it is important to choose an
appropriate approach based on the specific requirements
of the system. 34
Concurrency Control…..
Primary site technique:
 Primary site technique is a method used in distributed systems to
maintain a single authoritative copy of a data item .
 All distinguished copies are kept at the same site.
 A single site is designated as a primary site which serves as a
coordinator for transaction management.
 Concurrency control and commit are managed by this site.
 In two-phase locking, this site manages the locking and releasing of data items.
[Figure: five sites connected by a communications network; one site is designated as the primary site.]
Concurrency Control…..
Primary site technique:
 In a distributed system, multiple copies of a data item may exist across
different nodes. The primary site technique designates one node, the
primary site, as the authoritative source for the data item. All updates
to the data item are made to the primary site first, and then
propagated to all other nodes in the system that store a copy of the
data item.

 The primary site technique is used to ensure consistency and accuracy


of the data item across the system, and to prevent conflicting updates
from multiple nodes. It also allows for efficient updates, as only the
primary site needs to be updated directly, rather than updating
multiple copies individually.

Concurrency Control…..
Primary site technique:

 The choice of the primary site can be based on various factors, such as
the location of the data item, the processing capabilities of the node
where the data item is stored, or the availability of the data item. The
primary site can be designated statically, for example by configuring
the system to always use a specific node as the primary site, or
dynamically, for example by using an algorithm to choose the most
appropriate node based on the current state of the system.

Concurrency Control…..
Primary site technique:

 In the primary site technique, the primary site is responsible for


executing transactions and maintaining the authoritative copy of the
data. To implement two-phase locking at the primary site, the
following steps can be performed:

Lock acquisition: When a transaction requests access to a data item,
it first acquires a lock on that data item. This lock can be a shared lock
for read-only access, or an exclusive lock for write access. The lock is
acquired at the primary site, which is responsible for maintaining the
lock table.
 Lock propagation: Once the lock has been acquired at the primary
site, it is propagated to all other nodes in the system that have a copy
of the data item. This ensures that all nodes have a consistent view of
the locks held on the data item.
Concurrency Control…..
Lock release: When a transaction has completed its
access to a data item, it releases the lock. The lock is
released at the primary site, and then propagated to
all other nodes.
Deadlock detection: To avoid deadlocks, a deadlock
detection algorithm can be used at the primary site. If
a deadlock is detected, one of the transactions
involved in the deadlock is rolled back to release its
locks, allowing the other transactions to proceed
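The acquisition and release steps above can be sketched as a small lock table kept at the primary site. This is a simplified illustration with hypothetical names: propagation to other sites is stubbed out, and blocking/queueing of conflicting requests is omitted:

```python
# Sketch of a primary-site lock table with shared ("S") and
# exclusive ("X") locks, per the two-phase locking steps above.

class PrimarySiteLocks:
    def __init__(self):
        self.table = {}  # item -> (mode, set of holding transactions)

    def acquire(self, txn, item, mode):
        held = self.table.get(item)
        if held is None:
            self.table[item] = (mode, {txn})  # lock granted
            return True
        held_mode, holders = held
        if mode == "S" and held_mode == "S":
            holders.add(txn)                  # shared locks coexist
            return True
        return txn in holders                 # otherwise: conflict

    def release(self, txn, item):
        mode, holders = self.table[item]
        holders.discard(txn)
        if not holders:
            del self.table[item]  # fully released; propagate from here

locks = PrimarySiteLocks()
print(locks.acquire("T1", "x", "S"))  # True
print(locks.acquire("T2", "x", "S"))  # True: shared locks coexist
print(locks.acquire("T3", "x", "X"))  # False: exclusive conflicts
```

A production lock manager would queue the failed request rather than rejecting it, and would propagate each grant and release to the replica sites as the slide describes.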

Concurrency Control…..
Primary site technique…..
 Advantages:
 An extension to the centralized two phase locking so
implementation and management is simple.
 Data items are locked only at one site but they can be accessed
at any site.
 Disadvantages
 All transaction management activities go to primary site which
is likely to overload the site.
 If the primary site fails, the entire system is inaccessible.
 This can limit system reliability and availability.
 To aid recovery, a backup site is designated which behaves as a
shadow of primary site. In case of primary site failure, backup site
can act as primary site.

Concurrency Control…..
The Primary Copy Technique is a method used in
distributed systems to manage data consistency across
multiple nodes. The technique designates one node as the
primary node, and all updates to the data are made on the
primary node first. This node holds the authoritative copy
of the data, and all other nodes in the system have replicas
of the data.

The primary copy technique is used to ensure that all


updates to the data are made in a consistent and orderly
manner, and that all nodes in the system have the same
view of the data. When a transaction wants to update the
data, it first sends a request to the primary node. The
primary node executes the transaction and updates the
authoritative copy of the data. Then the updated data is
propagated to the replica nodes.

Concurrency Control…..
In the primary copy technique, conflicts between
updates made by different transactions can be
resolved by the primary node, which has the complete
information about the state of the data. The primary
copy technique can be used in various types of
distributed systems, including database systems,
cloud computing systems, and file systems. It is an
important aspect of distributed data management and should be carefully designed and implemented to ensure the correct functioning of the system.

Concurrency Control…..
Primary Copy Technique:
 In this approach, instead of a site, a data item partition is designated as
primary copy. To lock a data item just the primary copy of the data item
is locked.
 Advantages:

 Since primary copies are distributed at various sites, a single site is not
overloaded with locking and unlocking requests.
 Disadvantages:

 Identification of a primary copy is complex. A distributed directory


must be maintained, possibly at all sites.

Concurrency Control…..
Recovery from a coordinator failure
Recovery from a coordinator failure in a distributed system
refers to the process of restoring normal functioning of the
system after a failure of the coordinator node. The coordinator
is a special node in a distributed system that is responsible for
coordinating the execution of transactions and ensuring data
consistency across the nodes.

The recovery process can vary depending on the design of the
system and the type of coordinator failure. In some cases, the
recovery process may involve the selection of a new coordinator
node, which takes over the responsibilities of the failed node. In
other cases, the recovery process may involve rolling back any
transactions that were executed by the failed coordinator but
not yet committed, and then resuming normal processing.
Recovery…..
 The recovery process should be designed to minimize the impact of
the coordinator failure on the system, and to ensure that the data
remains consistent and available to users. This can involve
implementing robust data backup and recovery strategies, and
monitoring the system for potential failures and taking corrective
actions as needed.

 In addition, the recovery process should be efficient and fast, to


minimize the downtime of the system and to allow normal processing
to resume as soon as possible. This can involve the use of check
pointing, which is the process of periodically saving the state of the
system so that it can be quickly restored in case of a failure.

 Overall, recovery from a coordinator failure is a critical aspect of the


design and implementation of a distributed system, and should be
carefully planned and tested to ensure its reliability and efficiency.

Recovery…..
Primary site approach with no backup site:
 Aborts and restarts all active transactions at all sites.

 The primary site approach with no backup site refers to a distributed

system architecture where there is only one node designated as the


primary site, and no secondary or backup site. This means that all
updates to the data are made at the primary site, and there is no
fallback option in case of failure of the primary site.

In this architecture, the primary site holds the authoritative copy of
the data and is responsible for executing transactions and maintaining
data consistency. All other nodes in the system have replicas of the
data, but they do not have the ability to make updates.

Recovery…..
Primary site approach with backup site:
 Suspends all active transactions, designates the backup site as the new primary site, and identifies a new backup site. The new primary site receives all transaction management information to resume processing.
 The primary site approach with a backup site refers to a

distributed system architecture where there is a primary site and a


secondary or backup site. The primary site is responsible for
executing transactions and maintaining the authoritative copy of
the data, while the backup site acts as a secondary repository of
the data.
Recovery…..
Primary site approach with backup site:
In this architecture, if the primary site fails, the
backup site takes over as the new primary site and
continues to execute transactions and maintain the
data. This provides a high degree of reliability, as the
system can continue to function even if the primary site
fails.

The primary site and the backup site communicate


with each other regularly to keep their copies of the
data synchronized. This can be achieved through
techniques such as database replication, in which
changes made at the primary site are automatically
reflected at the backup site.
Recovery…..
Recovery from a coordinator failure
 In both approaches a coordinator site or copy may become unavailable. This will require the selection of a new coordinator.
Concurrency Control…..
Primary and backup sites fail or no backup site:
 Use election process to select a new coordinator site.
 Any process Y that repeatedly attempts to communicate with the coordinator site and fails to do so can assume that the coordinator is down, and can start the election process by sending a message to all running sites proposing that Y become the new coordinator. As soon as Y receives a majority of yes votes, Y can declare that it is the new coordinator.

Concurrency Control…..
Primary and backup sites fail, or there is no backup site:
 This describes an election process for selecting a new coordinator
site in a distributed system. The process is triggered when a site,
process Y, fails to communicate with the existing coordinator and
assumes that the coordinator is down.

 Process Y then sends a message to all running sites proposing that it
become the new coordinator. The message requests a yes vote from each
of the other sites. As soon as process Y receives a majority of yes
votes, it can declare itself the new coordinator.

 This election process is used to dynamically select a new coordinator
when the existing coordinator fails. It ensures that a new coordinator
is in place as soon as possible, and that the new coordinator is
selected in a democratic, consensus-based manner. 52
Concurrency Control…..
Primary and backup sites fail, or there is no backup site:

The election process is a critical component of the design of a
distributed system, as it ensures that the system can continue to
function even after a failure of the coordinator. The process should
be carefully designed to be efficient, reliable, and fair.

53
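The election described on the preceding slides can be sketched in Python. This is an in-process simulation under stated assumptions: each site is represented by a (hypothetical) callable that delivers the proposal and returns its vote, whereas a real system would exchange network messages with time-outs.

```python
def run_election(candidate: str, sites: dict) -> bool:
    """Site `candidate` proposes itself as the new coordinator.  Every
    other site votes yes or no; the candidate wins only if it collects
    a majority of yes votes among all sites (it votes for itself)."""
    yes_votes = 1  # the candidate's own vote
    for name, vote_fn in sites.items():
        if name == candidate:
            continue
        try:
            if vote_fn(candidate):  # deliver the proposal, await the vote
                yes_votes += 1
        except ConnectionError:
            pass  # an unreachable site simply contributes no vote
    return yes_votes > len(sites) // 2
```

Requiring a majority of *all* sites (not just reachable ones) prevents two network partitions from each electing their own coordinator, since at most one partition can hold a majority.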
Concurrency Control…..
Distributed Concurrency control based on voting:
Distributed concurrency control based on voting is a
technique for managing concurrent access to shared
resources in a distributed system. In this technique,
the coordinator site collects voting information from
all participating sites before deciding to execute a
transaction.
The voting process ensures that all participating sites
agree on the order of transactions, and that
transactions are executed in a consistent manner
across all sites. This helps to ensure that the data
remains consistent and up-to-date, even in the
presence of concurrent access to the shared resources.
54
Concurrency Control…..
Distributed Concurrency control based on voting:
The coordinator site acts as the central authority,
collecting voting information from all sites, and making
the final decision on the execution of transactions. The
coordinator site also ensures that transactions are executed
in a serializable order, preventing inconsistencies that can
result from concurrent access to shared resources.

Overall, distributed concurrency control based on voting is a useful
technique for managing concurrent access to shared resources in a
distributed system. It keeps the data consistent and up-to-date even
under concurrent access, and ensures that transactions are executed
in a consistent manner across all sites.
55
Concurrency Control…..
 Distributed concurrency control based on voting:
 There is no primary copy or coordinator.
 Each copy maintains its own lock and can grant or deny a request for it.
 If a transaction wants to lock a data item, it sends a lock request to
all the sites that hold a copy of that item.
 If a majority of the sites grant the lock, the requesting transaction
obtains the data item.
 To avoid unacceptably long waits, a time-out period is defined: if the
requesting transaction does not receive a majority of votes within the
time-out, it is aborted.
 The locking decision (granted or denied) is sent to all these sites.
56
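The majority-voting lock protocol above can be sketched as follows. `Replica` and `majority_lock` are illustrative names, and the in-process method calls stand in for the network messages a real system would use (along with the time-out handling noted on the slide).

```python
class Replica:
    """One copy of a data item; each copy maintains its own lock."""
    def __init__(self) -> None:
        self.locked_by = None

    def request_lock(self, txn: str) -> bool:
        """Grant if free, or if this transaction already holds the lock."""
        if self.locked_by in (None, txn):
            self.locked_by = txn
            return True
        return False  # deny: another transaction holds this copy's lock

def majority_lock(txn: str, replicas: list) -> bool:
    """Lock the item only if a majority of its copies grant the request;
    otherwise release any copies that did grant, and report failure."""
    granted = [r for r in replicas if r.request_lock(txn)]
    if len(granted) > len(replicas) // 2:
        return True
    for r in granted:  # no majority: back off so other transactions can win
        r.locked_by = None
    return False
```

Because two transactions can never simultaneously hold majorities over the same copies, at most one of them can own the distributed lock at a time.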
Distributed Recovery
There are two major problems with regard to distributed recovery.
1. It is difficult to determine whether a site is down without exchanging
numerous messages with other sites.
• Suppose that site X sends a message to site Y and expects a response
from Y but does not receive it. There are several possible explanations
for this:
 The message was not delivered to Y because of communication failure.
 Site Y is down and could not respond.
 Site Y is running and sent a response, but the response was not delivered.
2. Distributed commit.
• When a transaction is updating data at several sites, it cannot commit
until it is sure that the effect of the transaction on every site cannot be
lost.

57
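The distributed commit problem above is commonly solved with two-phase commit (2PC): a transaction commits only after every site has durably prepared, so its effects cannot be lost at any site. A minimal in-memory sketch, with a hypothetical `Participant` stub standing in for a remote site:

```python
class Participant:
    """Stand-in for a remote site; `can_commit` simulates whether its
    local sub-transaction can be made durable in the prepare phase."""
    def __init__(self, can_commit: bool) -> None:
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self) -> bool:
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def finish(self, decision: str) -> None:
        self.state = "committed" if decision == "commit" else "aborted"

def two_phase_commit(participants: list) -> str:
    """Coordinator side of 2PC: commit only if every site votes yes in
    phase 1, then broadcast the uniform decision in phase 2."""
    decision = "commit" if all(p.prepare() for p in participants) else "abort"
    for p in participants:
        p.finish(decision)
    return decision
```

Note that `all()` short-circuits: once one site votes no, the coordinator stops asking and simply broadcasts the abort, which is safe because every site still receives the same uniform decision.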
Thank You

Any Questions?

58
