What is Apache Zookeeper?

Apache Zookeeper is a centralized coordination service for distributed applications. It provides a simple interface for distributed synchronization, configuration management, naming, and group services that are essential for building robust distributed systems.

Key Features

• Hierarchical namespace (like a file system)
• Strong consistency guarantees
• Atomic operations and transactions
• Watches for real-time notifications
• High availability through replication
• Sequential consistency for client operations

Zookeeper Ensemble Calculator

Ensemble Nodes: 5 nodes (must be odd)

Total ZNodes: 10,000 znodes

Connected Clients: 1,000 clients

Avg Data per ZNode (KB): 1 KB (max 1 MB per znode)

Ensemble Metrics

Quorum Size: 3 nodes

Max Failures: 2 nodes

Total Storage: 15,000 MB

Throughput: 166,667 ops/sec

Avg Latency: 5 ms

Availability: 99.94%
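
The quorum figures above follow from simple majority arithmetic. A minimal sketch, assuming a plain majority quorum over all voting servers; the class and method names are illustrative and not part of any Zookeeper API:

```java
/** Majority-quorum arithmetic for an ensemble of voting servers (illustrative helper). */
final class QuorumMath {
    private QuorumMath() {}

    /** Smallest number of servers that forms a strict majority. */
    static int quorumSize(int ensembleSize) {
        if (ensembleSize < 1) {
            throw new IllegalArgumentException("ensemble size must be positive");
        }
        return ensembleSize / 2 + 1;
    }

    /** Number of servers that can fail while a majority is still reachable. */
    static int maxFailures(int ensembleSize) {
        return ensembleSize - quorumSize(ensembleSize);
    }
}
```

For 5 nodes this gives a quorum of 3 and 2 tolerated failures. A 4-node ensemble tolerates only 1 failure, the same as 3 nodes, which is why odd sizes are the usual recommendation.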

Common Coordination Patterns

Leader Election

Elect a single leader from multiple nodes to coordinate distributed operations.

• Create sequential ephemeral znodes
• Lowest sequence number becomes leader
• Watch previous node for leadership changes

Distributed Lock

Implement mutual exclusion across distributed processes.

• Create ephemeral sequential znodes
• Check if you have the lowest sequence
• Watch the previous node to wait for turn
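
Leader election and locking both reduce to the same ordering step: sort the children by sequence number, check whether you are first, and otherwise find the node immediately before you. A minimal sketch of that step in plain Java, assuming child names end in the zero-padded counter that Zookeeper appends to sequential znodes; `SequenceOrdering` and its methods are hypothetical helpers, not client API:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Orders sequential child names by their numeric suffix (illustrative helper). */
final class SequenceOrdering {
    private SequenceOrdering() {}

    /** Parses the digits after the last '-', e.g. "lock-0000000007" gives 7. */
    static long sequenceOf(String childName) {
        return Long.parseLong(childName.substring(childName.lastIndexOf('-') + 1));
    }

    /** Returns a copy of the children sorted by sequence number, ignoring any name prefix. */
    static List<String> sortBySequence(List<String> children) {
        List<String> sorted = new ArrayList<>(children);
        sorted.sort(Comparator.comparingLong(SequenceOrdering::sequenceOf));
        return sorted;
    }

    /** True if myName holds the lowest sequence number (leader or lock holder). */
    static boolean isFirst(List<String> children, String myName) {
        List<String> sorted = sortBySequence(children);
        return !sorted.isEmpty() && sorted.get(0).equals(myName);
    }

    /** The child just before myName, i.e. the node to watch; empty if myName is first or absent. */
    static Optional<String> predecessor(List<String> children, String myName) {
        List<String> sorted = sortBySequence(children);
        int index = sorted.indexOf(myName);
        return index > 0 ? Optional.of(sorted.get(index - 1)) : Optional.empty();
    }
}
```

Comparing the numeric suffix rather than the whole name keeps the order correct even when names carry different prefixes, such as a client ID. Watching only the predecessor, rather than the whole directory, means a single deletion wakes a single waiter.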

Service Discovery

Let clients locate the live instances of a service at runtime.

• Services register ephemeral znodes
• Clients watch service directory
• Automatic cleanup when services die
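
The registration lifecycle can be illustrated with a small in-memory model. This is a toy sketch in plain Java, not the Zookeeper client: `ToyServiceRegistry`, its session IDs, and its callbacks are all hypothetical stand-ins for ephemeral znodes, sessions, and watches. One simplification to note: real Zookeeper watches are one-time triggers that must be re-registered, while the callbacks here stay subscribed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Consumer;

/** In-memory toy model of ephemeral service registration; not the Zookeeper client API. */
final class ToyServiceRegistry {
    // Instance address -> ID of the session that registered it
    private final Map<String, String> instances = new TreeMap<>();
    private final List<Consumer<List<String>>> watchers = new ArrayList<>();

    /** Registers an instance owned by a session, like creating an ephemeral znode. */
    synchronized void register(String sessionId, String address) {
        instances.put(address, sessionId);
        notifyWatchers();
    }

    /** Ends a session and drops everything it registered, like ephemeral cleanup. */
    synchronized void expireSession(String sessionId) {
        if (instances.values().removeIf(sessionId::equals)) {
            notifyWatchers();
        }
    }

    /** Subscribes to membership changes, like watching the service directory. */
    synchronized void watch(Consumer<List<String>> watcher) {
        watchers.add(watcher);
    }

    /** Returns the addresses of all currently registered instances. */
    synchronized List<String> liveInstances() {
        return new ArrayList<>(instances.keySet());
    }

    // Sends every watcher a snapshot of the current membership
    private void notifyWatchers() {
        List<String> snapshot = new ArrayList<>(instances.keySet());
        for (Consumer<List<String>> watcher : watchers) {
            watcher.accept(snapshot);
        }
    }
}
```

The point of the model is that nobody deregisters a dead service explicitly: when its session ends, its entries disappear and watchers see the new membership.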

Configuration Management

Centrally manage and distribute configuration changes.

• Store config in persistent znodes
• Applications watch for changes
• Each znode update is applied atomically across the ensemble
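
Config writers usually rely on the version number stored with each znode to avoid overwriting each other's changes. A minimal in-memory sketch of that version check, in plain Java; `ToyConfigNode` is a hypothetical model, not the Zookeeper client. In the real client a stale version makes `setData` throw `KeeperException.BadVersionException` rather than return `false`.

```java
import java.nio.charset.StandardCharsets;

/** In-memory toy model of a versioned config znode; not the Zookeeper client API. */
final class ToyConfigNode {
    private byte[] data = new byte[0];
    private int version = 0;

    /** Current data version; incremented on every successful write. */
    synchronized int version() {
        return version;
    }

    /** Current data decoded as UTF-8 text. */
    synchronized String read() {
        return new String(data, StandardCharsets.UTF_8);
    }

    /**
     * Replaces the data only if expectedVersion matches the current version,
     * then increments the version. Returns false when the version is stale.
     */
    synchronized boolean setData(String newValue, int expectedVersion) {
        if (expectedVersion != version) {
            return false;
        }
        data = newValue.getBytes(StandardCharsets.UTF_8);
        version++;
        return true;
    }
}
```

Two writers that both read version 0 cannot both succeed: the second write is rejected and that writer has to re-read the config before retrying.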

Real-World Examples

Apache Kafka

Kafka versions that predate KRaft mode use Zookeeper for broker coordination, topic configuration, and (with older clients) consumer group management.

• Broker registration and health monitoring
• Partition leader election
• Topic and partition metadata storage

Apache Hadoop

The Hadoop ecosystem uses Zookeeper for NameNode high availability and ResourceManager coordination.

• HDFS NameNode failover coordination
• YARN ResourceManager election
• HBase region server coordination

Netflix

Netflix uses Zookeeper for service discovery and distributed configuration in its microservices architecture.

• 10,000+ microservices coordination
• Dynamic load balancing configuration
• Feature flag management

Basic Operations

Java Client Example

import java.net.InetAddress;
import java.util.Collections;
import java.util.List;

import org.apache.zookeeper.*;
import org.apache.zookeeper.data.Stat;

public class ZookeeperExample implements Watcher {
    private ZooKeeper zooKeeper;
    private static final String CONNECTION_STRING = "localhost:2181";
    private static final int SESSION_TIMEOUT = 3000;

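    public void connect() throws Exception {
        zooKeeper = new ZooKeeper(CONNECTION_STRING, SESSION_TIMEOUT, this);

        // Wait for the session to be established. Re-checking the state in a
        // loop with a timed wait avoids hanging forever if the connection
        // event arrives before this thread starts waiting.
        synchronized (this) {
            while (!zooKeeper.getState().isConnected()) {
                wait(100);
            }
        }
    }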

    // Create a znode
    public void createZnode(String path, String data) throws Exception {
        zooKeeper.create(
            path,
            data.getBytes(),
            ZooDefs.Ids.OPEN_ACL_UNSAFE,
            CreateMode.PERSISTENT
        );
        System.out.println("Created znode: " + path);
    }

    // Read znode data
    public String getData(String path) throws Exception {
        Stat stat = new Stat();
        byte[] data = zooKeeper.getData(path, true, stat);
        return new String(data);
    }

    // Update znode data
    public void setData(String path, String newData) throws Exception {
        Stat stat = zooKeeper.exists(path, true);
        if (stat != null) {
            zooKeeper.setData(path, newData.getBytes(), stat.getVersion());
            System.out.println("Updated znode: " + path);
        }
    }

    // Leader election pattern
    public void electLeader(String electionPath) throws Exception {
        // Create the election parent if it doesn't exist. Another candidate
        // may create it first, so a NodeExistsException is harmless.
        if (zooKeeper.exists(electionPath, false) == null) {
            try {
                zooKeeper.create(electionPath, new byte[0],
                    ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
            } catch (KeeperException.NodeExistsException e) {
                // Parent was created concurrently; nothing to do
            }
        }

        // Create ephemeral sequential znode
        String myPath = zooKeeper.create(
            electionPath + "/candidate-",
            InetAddress.getLocalHost().getHostAddress().getBytes(),
            ZooDefs.Ids.OPEN_ACL_UNSAFE,
            CreateMode.EPHEMERAL_SEQUENTIAL
        );
        String myName = myPath.substring(myPath.lastIndexOf('/') + 1);

        // Check if we're the leader (lowest sequence number)
        List<String> children = zooKeeper.getChildren(electionPath, false);
        Collections.sort(children);

        int myIndex = children.indexOf(myName);
        if (myIndex == 0) {
            System.out.println("I am the leader!");
        } else {
            // Watch the node immediately before us
            String watchPath = electionPath + "/" + children.get(myIndex - 1);
            if (zooKeeper.exists(watchPath, this) == null) {
                // Predecessor vanished before the watch was set;
                // the children list should be re-checked
                System.out.println("Predecessor already gone: " + watchPath);
            } else {
                System.out.println("Watching: " + watchPath);
            }
        }
    }

    // Distributed lock pattern (lockPath must already exist)
    public boolean acquireLock(String lockPath, String clientId) throws Exception {
        // Use a fixed name prefix so that sorting the children orders them by
        // sequence number; a per-client prefix would sort by client ID instead.
        // The client ID is kept in the znode data.
        String myLockPath = zooKeeper.create(
            lockPath + "/lock-",
            clientId.getBytes(),
            ZooDefs.Ids.OPEN_ACL_UNSAFE,
            CreateMode.EPHEMERAL_SEQUENTIAL
        );

        List<String> children = zooKeeper.getChildren(lockPath, false);
        Collections.sort(children);

        String myNode = myLockPath.substring(myLockPath.lastIndexOf('/') + 1);
        int myIndex = children.indexOf(myNode);

        if (myIndex == 0) {
            System.out.println("Lock acquired by: " + clientId);
            return true;
        } else {
            // Watch the node immediately before ours
            String watchPath = lockPath + "/" + children.get(myIndex - 1);
            zooKeeper.exists(watchPath, this);
            System.out.println("Waiting for lock...");
            return false;
        }
    }

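    @Override
    public void process(WatchedEvent event) {
        // Node events also report SyncConnected as their state, so the event
        // type has to be checked before the session state.
        if (event.getType() == Event.EventType.None) {
            // Session-level event: wake up any thread waiting for the connection
            if (event.getState() == Event.KeeperState.SyncConnected) {
                synchronized (this) {
                    notifyAll();
                }
            }
        } else if (event.getType() == Event.EventType.NodeDeleted) {
            // Previous node deleted, we might be next in line
            System.out.println("Node deleted, checking for leadership/lock...");
        }
    }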

    public void close() throws Exception {
        zooKeeper.close();
    }
}

Best Practices

✅ Do

• Use an odd number of nodes (3, 5, 7) for proper quorums
• Keep znode data small (<1 MB, ideally <1 KB)
• Use ephemeral nodes for temporary data
• Implement proper watch handling and reconnection logic
• Monitor ensemble health and performance regularly

❌ Don't

• Use Zookeeper for large data storage
• Create deep hierarchies (performance degrades)
• Ignore session timeouts and connection events
• Use blocking operations in watch callbacks
• Deploy single-node clusters in production
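
The advice against blocking in watch callbacks matters because the Java client delivers callbacks on a single event thread, so one slow handler delays every later notification. One common way to follow it is to hand the event to a worker thread and return immediately. A minimal sketch in plain Java, assuming a single worker is enough to keep events in order; `EventHandoff` is a hypothetical helper, not part of the Zookeeper client:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Forwards events from a callback thread to a worker so the callback returns quickly. */
final class EventHandoff<E> implements AutoCloseable {
    // A single worker thread preserves the order in which events were received
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final Consumer<E> slowHandler;

    EventHandoff(Consumer<E> slowHandler) {
        this.slowHandler = slowHandler;
    }

    /** Call this from the watch callback; it only enqueues the event. */
    void onEvent(E event) {
        worker.execute(() -> slowHandler.accept(event));
    }

    /** Stops accepting events and waits briefly for queued ones to finish. */
    @Override
    public void close() {
        worker.shutdown();
        try {
            worker.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

In a real watcher, `process(WatchedEvent)` would call `onEvent(event)` and the handler would do the slow work, such as re-reading children or re-registering watches.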
