Zookeeper
What is Apache Zookeeper?
[Figure: ZooKeeper within the Hadoop ecosystem, alongside Avro (Serialization), MapReduce (Job Scheduling/Execution), HBase (Column DB) and HDFS]
Motivation for using Zookeeper
• In the past: a single program running on a single computer with a single CPU
• Today: applications consist of independent programs running on a changing set of computers
• Challenge: coordination of independent programs; developers had to deal with coordination logic and
application logic at the same time.
• implemented by a client; no deadlocks
• Guarantees
• Writes to ZooKeeper are linearisable
• Update/write: any operation which modifies the state of the data tree
/app1_5
ephemeral (Greek): passing, short-lived
create(/app1_5/p_, data, SEQUENTIAL)
/app1_5/p_1 /app1_5/p_2 /app1_5/p_3
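The sequential flag can be illustrated with a small in-memory sketch. `ToyTree` is a hypothetical stand-in written for this example, not the ZooKeeper client, and it uses the short suffixes shown on the slide (a real ZooKeeper server appends a zero-padded 10-digit counter instead).

```python
class ToyTree:
    """In-memory stand-in for a znode tree (hypothetical, not the ZooKeeper client)."""

    def __init__(self):
        self.nodes = {}     # path -> data
        self.counters = {}  # parent path -> last sequence number handed out

    def create(self, path, data, sequential=False):
        if sequential:
            # The counter belongs to the parent znode and only ever grows.
            parent = path.rsplit("/", 1)[0]
            n = self.counters.get(parent, 0) + 1
            self.counters[parent] = n
            path = f"{path}{n}"
        if path in self.nodes:
            raise KeyError(f"node exists: {path}")
        self.nodes[path] = data
        return path


tree = ToyTree()
tree.create("/app1_5", b"")
created = [tree.create("/app1_5/p_", b"", sequential=True) for _ in range(3)]
print(created)  # ['/app1_5/p_1', '/app1_5/p_2', '/app1_5/p_3']
```

The caller always passes the same prefix `/app1_5/p_`; the returned string is the only place it learns the final name.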
znodes & watch flag
• Clients can issue read operations on znodes with a watch flag
• Server notifies the client when the information on the znode has changed
• Watches are one-time triggers associated with a session (unregistered once triggered or
session closes)
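The one-time nature of a watch can be sketched as follows. `ToyZnodeStore` and its method names are assumptions made for this illustration; it is a toy model, not the ZooKeeper API, and it ignores sessions entirely.

```python
class ToyZnodeStore:
    """Toy model of one-time watches (hypothetical, not the ZooKeeper client)."""

    def __init__(self):
        self.data = {}
        self.watches = {}  # path -> callbacks waiting for the next change

    def get_data(self, path, watch=None):
        # A read may leave a watch behind, like the watch flag on the slide.
        if watch is not None:
            self.watches.setdefault(path, []).append(watch)
        return self.data.get(path)

    def set_data(self, path, value):
        self.data[path] = value
        # pop() unregisters the watches, so each one fires at most once.
        for callback in self.watches.pop(path, []):
            callback(path)


store = ToyZnodeStore()
store.set_data("/app1/config", "v1")

events = []
store.get_data("/app1/config", watch=events.append)
store.set_data("/app1/config", "v2")  # notifies the client
store.set_data("/app1/config", "v3")  # no notification: the watch is used up
print(events)  # ['/app1/config']
```

A client that wants to keep following the znode has to read it again with the watch flag set after each notification.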
• Each session has a timeout: ZooKeeper considers a client faulty if it does not receive anything from
its session for longer than that timeout
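A minimal sketch of this timeout rule, assuming explicit timestamps are passed in rather than read from a clock. `ToySessionTracker` is a hypothetical helper for illustration and does not reflect how the ZooKeeper server tracks sessions internally.

```python
class ToySessionTracker:
    """Toy session expiry check (hypothetical, not ZooKeeper's implementation)."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_heard = {}  # session id -> time of the last message received

    def touch(self, session_id, now):
        # Any message from the session (request or heartbeat) resets its timer.
        self.last_heard[session_id] = now

    def expire(self, now):
        # A session is faulty once it has been silent for more than the timeout.
        expired = [s for s, t in self.last_heard.items() if now - t > self.timeout]
        for s in expired:
            del self.last_heard[s]
        return expired


tracker = ToySessionTracker(timeout=10.0)
tracker.touch("worker1", now=0.0)
tracker.touch("worker2", now=0.0)
tracker.touch("worker1", now=8.0)    # worker1 is still talking
expired = tracker.expire(now=12.0)   # worker2 has been silent for 12 > 10
print(expired)  # ['worker2']
```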
• Updates are first logged to disk; write-ahead log and snapshot for recovery
• A write request requires coordination between servers
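The log-then-apply idea can be sketched on a single replica. Everything here is a simplification chosen for the example: `ToyReplica` keeps its "disk" in Python variables, stores entries as JSON, and truncates the log on every snapshot, none of which is claimed to match ZooKeeper's on-disk format.

```python
import json


class ToyReplica:
    """Toy write-ahead log plus snapshot (hypothetical, single replica only)."""

    def __init__(self):
        self.state = {}       # in-memory data tree
        self.log = []         # stands in for the on-disk write-ahead log
        self.snapshot = "{}"  # stands in for the on-disk snapshot

    def write(self, path, data):
        self.log.append(json.dumps({"path": path, "data": data}))  # log first
        self.state[path] = data                                    # then apply

    def take_snapshot(self):
        self.snapshot = json.dumps(self.state)
        self.log.clear()  # entries covered by the snapshot are no longer needed

    def recover(self):
        # Start from the snapshot and replay the logged updates in order.
        state = json.loads(self.snapshot)
        for line in self.log:
            entry = json.loads(line)
            state[entry["path"]] = entry["data"]
        return state


replica = ToyReplica()
replica.write("/app1/config", "v1")
replica.take_snapshot()
replica.write("/app1/config", "v2")
replica.write("/app1/progress", "50%")

replica.state = {}              # simulate losing the in-memory tree in a crash
recovered = replica.recover()
print(recovered)  # {'/app1/config': 'v2', '/app1/progress': '50%'}
```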
ZooKeeper API
• String create(path, data, flags)
• creates a znode with path name path, stores data in it and sets flags (ephemeral, sequential)
• No partial reads/writes (no open, seek or close methods)
/app1/config /app1/progress
Example: group membership
• String create(path, data, flags)
• void delete(path, version)
• Stat exists(path, watch)
• (data, Stat) getData(path, watch)
• Stat setData(path, data, version)
• String[] getChildren(path, watch)
Questions:
1. How can all workers (slaves) of an application register themselves on ZK?
2. How can a process find out about all active workers of an application?
[a znode is designated to store workers]
1. create(/app1/workers/worker, data, EPHEMERAL)
2. getChildren(/app1/workers, true)
Znode tree:
/
/app1
/app1/workers
/app1/workers/worker1
/app1/workers/worker2
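The registration and lookup steps above can be sketched with a toy in-memory tree. `ToyZk`, the session ids `s1` and `s2`, and the explicit worker names are assumptions made for this example; the sketch is not the ZooKeeper client and leaves out watches.

```python
class ToyZk:
    """Toy znode tree with ephemeral nodes (hypothetical, not the ZooKeeper client)."""

    def __init__(self):
        self.nodes = {"/": None}  # path -> owning session (None means persistent)

    def create(self, path, session=None):
        if path in self.nodes:
            raise KeyError(f"node exists: {path}")
        self.nodes[path] = session
        return path

    def get_children(self, path):
        prefix = path.rstrip("/") + "/"
        children = []
        for p in self.nodes:
            name = p[len(prefix):]
            if p.startswith(prefix) and name and "/" not in name:
                children.append(name)
        return sorted(children)

    def close_session(self, session):
        # Ephemeral znodes disappear together with the session that created them.
        for p in [p for p, s in self.nodes.items() if s == session]:
            del self.nodes[p]


zk = ToyZk()
zk.create("/app1")
zk.create("/app1/workers")                        # znode designated to store workers
zk.create("/app1/workers/worker1", session="s1")  # each worker registers itself
zk.create("/app1/workers/worker2", session="s2")

before = zk.get_children("/app1/workers")
zk.close_session("s1")                            # worker1 crashes or disconnects
after = zk.get_children("/app1/workers")
print(before, after)  # ['worker1', 'worker2'] ['worker2']
```

Because the member znodes are ephemeral, nobody has to clean up after a failed worker; the membership list corrects itself when the session ends.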
Example: simple locks
Question:
1. How can all workers of an application use a single resource through a lock?
Flow:
1. create(/app1/lock1, …, EPHE.)
2. ok? yes: use locked resource
3. no: getData(/app1/lock1, true) to set a watch, then try again once notified
Znode tree:
/
/app1
/app1/workers
/app1/lock1
/app1/workers/worker1
/app1/workers/worker2
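A single-threaded sketch of this lock, under stated assumptions: `ToyZk` is a hypothetical in-memory model, its `watch` method stands in for `getData(/app1/lock1, true)`, and `delete` stands in for the lock holder releasing the lock or losing its session.

```python
class ToyZk:
    """Toy model of one lock znode (hypothetical, not the ZooKeeper client)."""

    def __init__(self):
        self.nodes = {}    # path -> owning session
        self.watches = {}  # path -> callbacks fired once when the znode is deleted

    def create(self, path, session):
        if path in self.nodes:
            return False   # someone else already holds the lock
        self.nodes[path] = session
        return True

    def watch(self, path, callback):
        self.watches.setdefault(path, []).append(callback)

    def delete(self, path):
        del self.nodes[path]
        for callback in self.watches.pop(path, []):
            callback()


def acquire(zk, session, on_acquired, lock="/app1/lock1"):
    if zk.create(lock, session):
        on_acquired(session)  # ok: use the locked resource
    else:
        # Not ok: wait for the znode to change, then compete again.
        zk.watch(lock, lambda: acquire(zk, session, on_acquired, lock))


zk = ToyZk()
holders = []
for worker in ("worker1", "worker2", "worker3"):
    acquire(zk, worker, holders.append)

zk.delete("/app1/lock1")  # worker1 releases; both waiters are woken up
zk.delete("/app1/lock1")  # worker2 releases
print(holders)  # ['worker1', 'worker2', 'worker3']
```

Note that every release notifies all waiting workers even though only one of them can win the next `create`.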
Question:
1. How can all workers of an application use a single resource through a lock?
Flow:
1. ids = getChildren(/app1/locks/, false)
2. id = min(ids)? yes: exit (use lock)
3. no: exists(max_id<id, true), then wait for notification
Znode tree:
/
/app1
/app1/locks
/app1/locks/lock_1
/app1/locks/lock_2
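A sketch of this second lock variant, assuming each worker first creates its own sequential znode under `/app1/locks` (that step is not shown on the slide). `ToyZk`, `seq` and `try_lock` are hypothetical names for this single-threaded illustration, and the toy `exists` only registers a watch when the znode is present.

```python
class ToyZk:
    """Toy model of sequential lock znodes (hypothetical, not the ZooKeeper client)."""

    def __init__(self):
        self.nodes = set()
        self.next_id = 0
        self.watches = {}  # path -> callbacks fired once when the znode is deleted

    def create_sequential(self, prefix):
        self.next_id += 1
        path = f"{prefix}{self.next_id}"
        self.nodes.add(path)
        return path

    def get_children(self, parent):
        return [p for p in self.nodes if p.startswith(parent + "/")]

    def exists(self, path, watch=None):
        present = path in self.nodes
        if present and watch is not None:
            self.watches.setdefault(path, []).append(watch)
        return present

    def delete(self, path):
        self.nodes.discard(path)
        for callback in self.watches.pop(path, []):
            callback()


def seq(path):
    return int(path.rsplit("_", 1)[1])


def try_lock(zk, me, on_acquired):
    ids = sorted(zk.get_children("/app1/locks"), key=seq)
    if me == ids[0]:
        on_acquired(me)  # smallest id: exit and use the lock
        return
    # Watch only the znode with the largest id smaller than our own.
    predecessor = ids[ids.index(me) - 1]
    if not zk.exists(predecessor, watch=lambda: try_lock(zk, me, on_acquired)):
        try_lock(zk, me, on_acquired)  # predecessor already gone: check again


zk = ToyZk()
holders = []
locks = [zk.create_sequential("/app1/locks/lock_") for _ in range(3)]
for me in locks:
    try_lock(zk, me, holders.append)

watchers_per_node = {path: len(cbs) for path, cbs in zk.watches.items()}
zk.delete(locks[0])  # only the owner of lock_2 is notified
print(holders)       # ['/app1/locks/lock_1', '/app1/locks/lock_2']
```

Each znode has at most one watcher, so a release wakes up exactly one waiting worker.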
Example: leader election
Question:
1. How can all workers of an application elect a leader among themselves?
Flow:
1. getData(/app1/workers/leader, true)
2. ok? yes: follow
3. no: create(/app1/workers/leader, IP, EPHE.)
4. ok? yes: lead; no: go back to step 1
Znode tree:
/
/app1
/app1/workers
/app1/workers/leader
/app1/workers/worker1
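The election flow can be sketched in a single-threaded toy model. `ToyZk`, `elect`, the `roles` dictionary and the IP addresses are all hypothetical and chosen for this example; `delete` stands in for the leader's session ending, which removes its ephemeral znode.

```python
class ToyZk:
    """Toy model of the leader znode (hypothetical, not the ZooKeeper client)."""

    def __init__(self):
        self.data = {}
        self.watches = {}  # path -> callbacks fired once when the znode is deleted

    def get_data(self, path, watch=None):
        if path not in self.data:
            return None    # no leader znode yet
        if watch is not None:
            self.watches.setdefault(path, []).append(watch)
        return self.data[path]

    def create(self, path, data):
        if path in self.data:
            return False
        self.data[path] = data
        return True

    def delete(self, path):
        del self.data[path]
        for callback in self.watches.pop(path, []):
            callback()


LEADER = "/app1/workers/leader"


def elect(zk, my_ip, roles):
    while True:
        leader_ip = zk.get_data(LEADER, watch=lambda: elect(zk, my_ip, roles))
        if leader_ip is not None:
            roles[my_ip] = f"follows {leader_ip}"  # a leader exists: follow it
            return
        if zk.create(LEADER, my_ip):
            roles[my_ip] = "leads"                 # we created the znode: lead
            return
        # create failed because someone else won the race: read again


zk = ToyZk()
roles = {}
for ip in ("10.0.0.1", "10.0.0.2", "10.0.0.3"):
    elect(zk, ip, roles)
first_round = dict(roles)

roles.pop("10.0.0.1")  # the leader fails ...
zk.delete(LEADER)      # ... and its ephemeral znode disappears
print(first_round)
print(roles)
```

After the leader's znode is deleted, the watches set in step 1 trigger a new round among the remaining workers.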
Summary
• A distributed application is an application which can run on multiple systems in a network
• Apache Zookeeper is an open source distributed coordination service that helps you manage a large set of hosts
• It allows for mutual exclusion and cooperation between server processes
• Server, Client, Leader, Follower, Ensemble/Cluster and the ZooKeeper WebUI are important ZooKeeper components
• The three types of znodes are persistent, ephemeral and sequential
• A ZDM watch is a one-time trigger that is sent to the client that set the watch. It fires when the watched data changes
• Zookeeper uses ACLs to control access to its znodes
• ZooKeeper use cases: managing configuration, naming services, leader election, message queuing, managing
the notification system, synchronization, distributed cluster management, etc.
• Yahoo, Facebook, eBay, Twitter, Netflix are some known companies using zookeeper
• The main drawback of the tool is that data loss may occur when adding new ZooKeeper servers
Thank you
Presentation by
Anshumaan Pratap, Arunava Saha, Ayush Gupta, G. Tejarawind
42148 42155 42177
42139