
Big Data Technologies
Sathyavathi.S
Department of Information Technology
Overview of this Lecture
• ZooKeeper
ZooKeeper
A highly available service for coordinating processes of distributed applications.
• Developed at Yahoo! Research
• Started as a sub-project of Hadoop, now a top-level Apache project
• Development is driven by application needs
http://zookeeper.apache.org
Agenda
• SQL vs NoSQL
• Introduction to MongoDB
• MongoDB features
• Replication / high availability
• Sharding / scaling
SQL vs NoSQL
• A NoSQL (often interpreted as "Not only SQL") database provides a mechanism for the storage and retrieval of data that is modeled in means other than the tabular relations used in relational databases.

SQL                                           | NoSQL
Relational Database Management System (RDBMS) | Non-relational or distributed database system
Fixed, static, predefined schema              | Dynamic schema
Best suited for complex queries               | Not well suited for complex queries
Vertically scalable                           | Horizontally scalable
Follows the ACID properties                   | Follows the BASE properties

NoSQL Types
• Graph database
• Document-oriented
• Column family
What is MongoDB?
• MongoDB is an open-source, document-oriented database designed with both scalability and developer agility in mind.
• Instead of storing your data in tables and rows as you would with a relational database, in MongoDB you store JSON-like documents with dynamic schemas (schema-free, schemaless).
{
  "_id" : ObjectId("5114e0bd42…"),
  "FirstName" : "John",
  "LastName" : "Doe",
  "Age" : 39,
  "Interests" : [ "Reading", "Mountain Biking" ],
  "Favorites" : {
    "color" : "Blue",
    "sport" : "Soccer"
  }
}
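As an illustration (not part of the original slides), a minimal sketch of storing and reading back such a document with the PyMongo driver; the database and collection names are placeholders:

from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # assumes a local mongod
people = client["testdb"]["people"]                # illustrative database/collection names

# Documents in the same collection may carry different fields (dynamic schema).
people.insert_one({
    "FirstName": "John",
    "LastName": "Doe",
    "Age": 39,
    "Interests": ["Reading", "Mountain Biking"],
    "Favorites": {"color": "Blue", "sport": "Soccer"},
})
print(people.find_one({"FirstName": "John"}))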
MongoDB is Easy to Use
Schema free: MongoDB does not need any pre-defined data schema; every document in a collection can have different fields, for example:
{name: "will", eyes: "blue", birthplace: "NY", aliases: ["bill", "ben"], loc: [32.7, 63.4], boss: "ben"}
{name: "jeff", eyes: "blue", loc: [40.7, 73.4], boss: "ben"}
{name: "brendan", boss: "will"}
{name: "matt", weight: 60, height: 72, loc: [44.6, 71.3]}
{name: "ben", age: 25}

RDBMS vs MongoDB terminology:
Database  | Database
Table     | Collection
Row       | Document (JSON, BSON)
Column    | Field
Index     | Index
Join      | Embedded Document
Partition | Shard
Features of MongoDB
• Document-oriented storage
• Full index support
• Replication & high availability
• Auto-sharding
• Aggregation
• MongoDB Atlas
• Various language drivers/APIs
  • JavaScript, Python, Ruby, Perl, Java, Scala, C#, C++, Haskell, Erlang
• Community
Replication
• Replication provides redundancy and increases data availability.
• With multiple copies of the data on different database servers, replication provides a level of fault tolerance against the loss of a single database server.
[Diagram: a database replicated to copies on two other servers]
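As a hedged sketch (not from the slides), connecting a PyMongo client to such a replica set; the host names and the replica set name "rs0" are placeholders:

from pymongo import MongoClient

# Hypothetical three-member replica set; writes go to the primary,
# reads may be served by a secondary copy with this read preference.
client = MongoClient(
    "mongodb://db1:27017,db2:27017,db3:27017/?replicaSet=rs0",
    readPreference="secondaryPreferred",
)
client["testdb"]["people"].insert_one({"name": "will"})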
Sharding
• Sharding is a method for distributing data across multiple machines.
• MongoDB uses sharding to support deployments with very large data sets and high-throughput operations.
Sharding Architecture
• Shard: a MongoDB instance (mongod) that handles a subset of the original data.
• Mongos: a query router that routes client requests to the shards.
• Config server: a MongoDB instance that stores the metadata and configuration details of the cluster.
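A minimal sketch of how an application might use this architecture, assuming a running sharded cluster; the host, database, collection names and the hashed shard key are only illustrative choices:

from pymongo import MongoClient

# Applications connect to the mongos query router, not to the shards directly.
client = MongoClient("mongodb://mongos1:27017")

# One-time setup: shard the collection on a hashed _id key.
client.admin.command("enableSharding", "testdb")
client.admin.command("shardCollection", "testdb.people", key={"_id": "hashed"})

# mongos routes this write to the shard that owns the matching chunk.
client["testdb"]["people"].insert_one({"name": "jeff"})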
Sharding / Replication
• Replication keeps copies of the same data on multiple data nodes for high availability.
• Sharding splits data sets across multiple data nodes, scaling horizontally when high throughput is required.
ZooKeeper in the Hadoop ecosystem
[Diagram] Pig (Data Flow), Hive (SQL), Sqoop (Data Transfer), Avro (Serialization), MapReduce (Job Scheduling/Execution), HBase (Column DB) and HDFS, with ZooKeeper (Coordination) running alongside the whole stack.
Coordination
Proper coordination is not easy.
Fallacies of distributed computing
• The network is reliable

• There is no latency

• The topology does not change

• The network is homogeneous

• The bandwidth is infinite

• …
Motivation
• In the past: a single program running on a single computer with a single CPU
• Today: applications consist of independent programs running on a changing set of computers
• Difficulty: coordination of those independent programs
• Developers have to deal with coordination logic and application logic at the same time
ZooKeeper: designed to relieve developers from writing coordination logic code.
Let's think…
Question: how do you elect the leader?
[Diagram] A program that crawls the Web (application logic) runs on a cluster with a few hundred machines; one machine (the leader) should coordinate the effort (coordination logic).
Question: how do you lock a service?
[Diagram] Application logic on a cluster with a few hundred machines must coordinate access (coordination logic) to one database.
Question: how can the configuration be distributed?
[Diagram] A program that crawls the Web (application logic) runs on a cluster with a few hundred machines; every worker should start with the same configuration file (coordination logic).
Solution approaches
• Be specific: develop a particular service for each coordination task
  • Locking service
  • Leader election
  • etc.
• Be general: provide an API to make many services possible
ZooKeeper
• ZooKeeper: an API that enables application developers to implement their own primitives easily.
• The rest: specific primitives are implemented on the server side.
What can a distributed system look like?
[Diagram] One MASTER node coordinating several slave nodes.
+ simple
− coordination performed by the master
− single point of failure
− scalability
What can a distributed system look like?
+ not a single point of failure anymore
− scalability is still an issue
What can a distributed system look like?
+ scalability
What makes distributed system coordination difficult?
Partial failures make application writing difficult.
[Diagram] A sender sends a message; nothing comes back (network failure).
The sender does not know:
• whether the message was received
• whether the receiver's process died before/after processing the message
Typical coordination problems in distributed systems
• Static configuration: a list of operational parameters for the system processes
• Dynamic configuration: parameter changes on the fly
• Group membership: who is alive?
• Leader election: who is in charge, who is a backup?
• Mutually exclusive access to critical resources (locks)
• Barriers (supersteps in Giraph, for instance)
The ZooKeeper API allows us to implement all these coordination tasks easily.
ZooKeeper principles
ZooKeeper's design principles
• The API is wait-free
  • No blocking primitives in ZooKeeper
  • Blocking can be implemented by a client
  • No deadlocks (remember the dining philosophers, forks & deadlocks)
• Guarantees
  • Client requests are processed in FIFO order
  • Writes to ZooKeeper are linearisable
  • Clients receive notifications of changes before the changed data becomes visible
ZooKeeper's strategy to be fast and reliable
• The ZooKeeper service is an ensemble of servers that use replication (high availability)
• Data is cached on the client side.
  Example: a client caches the ID of the current leader instead of probing ZooKeeper every time.
• What if a new leader is elected?
  • Potential solution: polling (not optimal)
  • Watch mechanism: clients can watch for an update of a given data object
ZooKeeper is optimised for read-dominant operations!
ZooKeeper terminology
• Client: user of the ZooKeeper service
• Server: process providing the ZooKeeper service
• znode: in-memory data node in ZooKeeper, organised in a hierarchical namespace (the data tree)
• Update/write: any operation which modifies the state of the data tree
• Clients establish a session when connecting to ZooKeeper
ZooKeeper's data model: filesystem
• znodes are organised in a hierarchical namespace
• znodes can be manipulated by clients through the ZooKeeper API
• znodes are referred to by UNIX-style file system paths:
  /
    /app1
      /app1/p_1   /app1/p_2   /app1/p_3
    /app2
• All znodes store data (file-like) and can have children (directory-like).
znodes
• znodes are not designed for general data storage (they usually store data in the order of kilobytes)
• znodes map to abstractions of the client application
Example, group membership protocol: client process p_i creates znode /app1/p_i under /app1; the znode persists as long as the process is running.
  /
    /app1
      /app1/p_1   /app1/p_2   /app1/p_3
    /app2


znode flags
• Clients manipulate znodes by creating and deleting them
• EPHEMERAL flag: clients create znodes which are deleted at the end of the client's session (ephemeral, Greek: passing, short-lived)
• SEQUENTIAL flag: a monotonically increasing counter is appended to the znode's path; the counter value of a new znode under a parent is always larger than the value of the existing children
Example: create(/app1_5/p_, data, SEQUENTIAL)
  /app1_5
    /app1_5/p_1   /app1_5/p_2   /app1_5/p_3
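As an illustration (not from the original slides), the same flags expressed with the Kazoo Python client; the ensemble address and paths are placeholders:

from kazoo.client import KazooClient

zk = KazooClient(hosts="127.0.0.1:2181")   # placeholder ZooKeeper address
zk.start()
zk.ensure_path("/app1_5")

# EPHEMERAL: deleted automatically when this client's session ends.
# SEQUENTIAL: ZooKeeper appends a monotonically increasing counter to the path.
path = zk.create("/app1_5/p_", b"data", ephemeral=True, sequence=True)
print(path)    # e.g. /app1_5/p_0000000003

zk.stop()      # session ends, so the ephemeral znode disappears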


znodes & the watch flag
• Clients can issue read operations on znodes with a watch flag
• The server notifies the client when the information on the znode has changed
• Watches are one-time triggers associated with a session (unregistered once triggered or once the session closes)
• Watch notifications indicate the change, not the new data
Sessions
• A client connects to ZooKeeper and initiates a session
• Sessions have an associated timeout
• ZooKeeper considers a client faulty if it does not receive anything from its session for more than that timeout
• A session ends when the client is deemed faulty or when the client explicitly ends it
A few implementation details
ZooKeeper data is replicated on each server that composes the service:
• the data tree is replicated across all servers and kept in memory
• updates are first logged to disk; a write-ahead log and snapshots are used for recovery
• a write request requires coordination between the servers
Source: http://bit.ly/13VFohW
A few implementation details
• A ZooKeeper server services clients
• Clients connect to exactly one server to submit requests
  • read requests are served from the local replica
  • write requests are processed by an agreement protocol (an elected server leader initiates processing of the write request)
Let's work through some examples
ZooKeeper API
No partial reads/writes (no open, seek, close or similar methods).
• String create(path, data, flags): creates a znode with path name path, stores data in it and sets the flags (EPHEMERAL, SEQUENTIAL)
• void delete(path, version): deletes the znode if it is at the expected version
• Stat exists(path, watch): the watch flag enables the client to set a watch on the znode
• (data, Stat) getData(path, watch): returns the data and meta-data of the znode
• Stat setData(path, data, version): writes data if the version number is the current version of the znode
• String[] getChildren(path, watch): returns the children of the znode
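For reference, a hedged sketch of the same calls using the Kazoo Python client (the ZooKeeper address and paths are placeholders; Kazoo exposes watches as callbacks rather than boolean flags):

from kazoo.client import KazooClient

zk = KazooClient(hosts="127.0.0.1:2181")
zk.start()

zk.create("/app1", b"hello")                      # create(path, data, flags)
stat = zk.exists("/app1")                         # exists(path, watch)
data, stat = zk.get("/app1")                      # getData(path, watch)
zk.set("/app1", b"world", version=stat.version)   # setData(path, data, version)
children = zk.get_children("/app1")               # getChildren(path, watch)
zk.delete("/app1")                                # delete(path, version)

zk.stop()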
Example: configuration
[the configuration is stored in /app1/config]
Questions:
1. How does a new worker query ZK for a configuration?
2. How does an administrator change the configuration on the fly?
3. How do the workers read the new configuration?

Answers:
1. getData(/app1/config, true)
2. setData(/app1/config, config_data, -1)   [watching clients are notified]
3. getData(/app1/config, true)

Data tree:
  /
    /app1
      /app1/config   /app1/progress
    /app2
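A minimal sketch of this pattern with the Kazoo client (paths and the configuration payload are illustrative; Kazoo's DataWatch re-registers the one-time watch after every notification):

from kazoo.client import KazooClient

zk = KazooClient(hosts="127.0.0.1:2181")
zk.start()
zk.ensure_path("/app1")
if zk.exists("/app1/config") is None:
    zk.create("/app1/config", b"batch_size=100")

# Worker: read the configuration and get notified of every change.
@zk.DataWatch("/app1/config")
def on_config_change(data, stat):
    print("configuration is now:", data)

# Administrator: change the configuration on the fly (version -1 = any version).
zk.set("/app1/config", b"batch_size=500", version=-1)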
Example: group membership
Questions:
1. How can all workers (slaves) of an application register themselves on ZK?
2. How can a process find out about all active workers of an application?

[a znode /app1/workers is designated to store the workers]
Answers:
1. create(/app1/workers/worker, data, EPHEMERAL)
2. getChildren(/app1/workers, true)

Data tree:
  /
    /app1
      /app1/workers
        /app1/workers/worker1   /app1/workers/worker2
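A hedged sketch of the same protocol with Kazoo (the host-name payload and paths are illustrative):

import socket
from kazoo.client import KazooClient

zk = KazooClient(hosts="127.0.0.1:2181")
zk.start()
zk.ensure_path("/app1/workers")

# Worker: register with an EPHEMERAL (and here also SEQUENTIAL) znode;
# the registration vanishes automatically if the worker dies.
zk.create("/app1/workers/worker", socket.gethostname().encode(),
          ephemeral=True, sequence=True)

# Any process: list the active workers and get notified when the group changes.
@zk.ChildrenWatch("/app1/workers")
def on_members_change(children):
    print("active workers:", children)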
Example: simple locks
Question: how can all workers of an application use a single resource through a lock?

create(/app1/lock1, …, EPHEMERAL)
  ok? yes → use the locked resource
  ok? no  → getData(/app1/lock1, true) and wait for the notification, then try again
All processes compete at all times for the lock.

Data tree:
  /
    /app1
      /app1/workers
        /app1/workers/worker1   /app1/workers/worker2
      /app1/lock1
Example: locking without herd effect
Question: how can all workers of an application use a single resource through a lock?

1. id = create(/app1/locks/lock_, SEQUENTIAL | EPHEMERAL)
2. ids = getChildren(/app1/locks/, false)
3. if id == min(ids): exit (use the lock)
4. else: exists(max_id < id, true) [watch the largest id smaller than the own id], wait for the notification and go to step 2

Data tree:
  /
    /app1
      /app1/locks
        /app1/locks/lock_1   /app1/locks/lock_2
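Kazoo ships a Lock recipe that implements this pattern (each contender creates an EPHEMERAL|SEQUENTIAL znode and watches only its predecessor). A minimal sketch, with placeholder paths and identifiers:

from kazoo.client import KazooClient

zk = KazooClient(hosts="127.0.0.1:2181")
zk.start()

lock = zk.Lock("/app1/locks", "worker-1")   # identifier is only for diagnostics
with lock:                                  # blocks until this client holds the lock
    print("using the shared resource")      # critical section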
Example: leader election
Question: how can all workers of an application elect a leader among themselves?

getData(/app1/workers/leader, true)
  ok? yes → follow the current leader
  ok? no  → create(/app1/workers/leader, IP, EPHEMERAL)
            ok? yes → lead
            ok? no  → another worker won the race: follow (start again with getData)
If the leader dies, elect again ("herd effect").

Data tree:
  /
    /app1
      /app1/workers
        /app1/workers/leader   /app1/workers/worker1
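Kazoo also provides an Election recipe built on the non-herd locking idea; a hedged sketch with placeholder paths and identifiers:

from kazoo.client import KazooClient

zk = KazooClient(hosts="127.0.0.1:2181")
zk.start()

def lead():
    # Runs only while this process is the elected leader.
    print("I am the leader now")

election = zk.Election("/app1/election", "worker-1")
election.run(lead)   # blocks: waits until elected, then calls lead()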
Zookeeper Video
• https://youtu.be/AS5a91DOmks
ZooKeeper applications
The Yahoo! fetching service
• The Fetching Service is part of Yahoo!'s crawler infrastructure
• Setup: a master commands page-fetching processes
  • The master provides the fetchers with configuration
  • The fetchers write back information about their status and health
• Main advantages of ZooKeeper:
  • Recovery from master failures
  • Guaranteed availability despite failures
• ZK primitives used: configuration metadata, leader election
Yahoo! message broker
• A distributed publish-subscribe system
• The system manages thousands of topics that clients can publish messages to and receive messages from
• The topics are distributed among a set of servers to provide scalability
• ZK primitives used: configuration metadata (to distribute topics), failure detection and group membership
Yahoo! message broker
[Diagram] A primary and a backup server per topic; topic subscribers are monitored by all servers; ephemeral nodes are used.
Source: http://bit.ly/13VFohW
Throughput
Setup: 250 clients, each client has at least 100 outstanding requests (read/write of 1K data).
[Plot] Throughput for workloads ranging from only read requests to only write requests; the crossing of the curves eventually always happens.
Source: http://bit.ly/13VFohW
Recovery from failure
Setup: 250 clients, each client has at least 100 outstanding requests (read/write of 1K data); 5 ZK machines (1 leader, 4 followers), 30% writes.
[Plot of throughput over time with the following events:]
(1) failure & recovery of a follower
(2) failure & recovery of a different follower
(3) failure of the leader
(4) failure of followers (a, b), recovery at (c)
(5) failure of the leader
(6) recovery of the leader
Source: http://bit.ly/13VFohW
References
• [book] ZooKeeper by Junqueira & Reed, 2013 (available on the TUD campus network)
• [paper] ZooKeeper: Wait-free coordination for Internet-scale systems by Hunt et al., 2010; http://bit.ly/13VFohW
Summary
• A whirlwind tour through ZooKeeper
• Why do we need it?
• The data model of ZooKeeper: znodes
• Example implementations of different coordination tasks