Insights on Oracle & Tech: ZooKeeper: Knowing the Basics

ZooKeeper can be run in two different modes:^[1]

Standalone

is convenient for evaluation, development, and testing

Replicated

should be used in production

In this article, we will focus on running zookeeper in replicated mode and knowing its basics.

ZooKeeper Service

Apache ZooKeeper can be used in distributed applications (e.g., Yahoo! Message Broker) to enable highly reliable distributed coordination. First example is that you can use zookeeper for the high-availability of Spark standalone masters.^[2]A standalone Spark Master can run with recovery mode enabled by using zookeeper and be able to recover state among the available swarm of masters. Another example is that HDFS NameNode HA can be enabled to allow a cluster to be configured such that a NameNode is not a single point of failure.^[10] In these cases, HDFS relies on Zookeeper to manage the details of failover.

ZooKeeper Functionalities

ZooKeeper allows distributed processes to coordinate with each other through a shared hierarchical name space of data registers (or znodes), much like a file system. Here are the high-level descriptions of its functionalities:

Provides similar semantics as Google's Chubby for coordinating distributed systems, and being a consistent and highly available key-value store makes it an ideal cluster configuration store and directory of services
Is a centralized coordination service for

maintaining configuration data:

status information
configuration (e.g., security rules)
location information

naming
providing distributed synchronization
providing group services

Provides the following capabilities

Consensus
Group management
Presence protocols

Provides a variety of client bindings is available for a number of languages

In the release

ships with C, Java, Perl and Python client bindings

From the community

check here for a list of client bindings that are available from the community but not yet included in the release

ZooKeeper Architecture

When run in replicated mode, the zookeeper service comprise of an Ensemble of servers. A Zookeeper cluster, Ensemble, consists of a Leader node and followers. A leader node is chosen by consensus within the ensemble. If the leader fails another node will be elected as leader.

The design for the Ensemble requires that all know about each other. Zookeeper servers maintain an in-memory image of the data tree along with a transaction logs and snapshots in a persistent store. The downside to an in-memory database is that the size of the database that zookeeper can manage is limited by memory.

ZooKeeper Servers

To start ZooKeeper you need a configuration file which governs ZooKeeper's behavior. Here is a sample in conf/zoo.cfg:

clientPort=2181
initLimit=10
autopurge.purgeInterval=24
syncLimit=5
tickTime=2000
dataDir=/data/zookeeper
autopurge.snapRetainCount=30
server.1=zoo1:2888:3888 
server.2=zoo2:2888:3888 
server.3=zoo3:2888:3888

The entries of the form server.X list the servers that make up the ZooKeeper service. When the server starts up, it knows which server it is by looking for the file myid in the data directory (i.e., dataDir). That file contains the server number, in ASCII.

Note the two port numbers after each server name: " 2888" and "3888". Peers use the former port to connect to other peers. Such a connection is necessary so that peers can communicate, for example, to agree upon the order of updates. More specifically, a ZooKeeper server uses this port to connect followers to the leader. When a new leader arises, a follower opens a TCP connection to the leader using this port. Because the default leader election also uses TCP, we currently require another port for leader election. This is the second port in the server entry.

For the details of other configuration parameters, read here.

ZooKeeper Clients

Clients only connect to a single ZooKeeper server. The client maintains a TCP connection through which it sends requests, get responses, gets watch events, and send heartbeats. If the TCP connection to the server breaks, the client will connect to a different server.

In ZooKeeper's configuration file, clientPort (e.g. 2181) is used to specify the port to listen for client connections. There are two ways to connect to ZooKeeper service using:

telnet or nc^[6]
ZooKeeper Command Line Interface (CLI)^[7]

telnet or nc

You can issue the commands to ZooKeeper via telnet or nc, at the client port. Each command is composed of four letters. For example, command ruok can test if server is running in a non-error state. The server will respond with imok if it is running. Otherwise it will not respond at all.

$ echo ruok | nc 127.0.0.1 2181
imok

For the full list of commands, read here.

zkCli.sh

ZooKeeper Command Line Interface (CLI)^[7] is used to interact with the ZooKeeper ensemble for development purpose. It is useful for debugging and working around with different options.

To perform ZooKeeper CLI operations, first turn on your ZooKeeper server (“bin/zkServer.sh start”) and then, ZooKeeper client (“bin/zkCli.sh”).

$ ./current/zookeeper-client/bin/zkCli.sh
Connecting to localhost:2181
<snipped>

[zk: localhost:2181(CONNECTED) 0] help
ZooKeeper -server host:port cmd args
        stat path [watch]
        set path data [version]
        ls path [watch]
        <snipped>


[zk: localhost:2181(CONNECTED) 1]

For information on installing the client side libraries, refer to the Bindings section of the ZooKeeper Programmer's Guide.

Cross Column

Thursday, January 19, 2017

ZooKeeper: Knowing the Basics

ZooKeeper Service

ZooKeeper Functionalities

ZooKeeper Architecture

ZooKeeper Servers

ZooKeeper Clients

References

No comments: