
CockroachDB

Master CockroachDB: distributed SQL, ACID transactions, resilience, and global database architecture.


What is CockroachDB?

CockroachDB is a distributed SQL database that provides ACID transactions, horizontal scaling, and survivability without compromising on SQL capabilities. Built by Cockroach Labs, it is designed to be as resilient as its namesake: able to survive datacenter failures, network partitions, and other disasters while maintaining strong consistency.

  • Distributed: automatically scales across multiple nodes and regions
  • Consistent: ACID transactions with strong consistency guarantees
  • Survivable: automatic failover and disaster recovery

CockroachDB Performance Calculator

Example sizing from the page's interactive calculator: a 3-node cluster storing 100 GB with a 70% read / 30% write workload:

  • Total Memory: 42.75 GB
  • Write QPS: 720
  • Read QPS: 9,450
  • Max Node Failures: 1
  • Write Latency: 15 ms
  • Storage Efficiency: 33%

The 33% storage efficiency follows from the default replication factor of 3: every logical byte is stored three times, so 100 GB of data occupies roughly 300 GB of raw disk, and a 3-node cluster can lose at most 1 node while keeping a quorum of replicas.

Getting Started with CockroachDB

CockroachDB uses standard SQL syntax, making it familiar to developers while providing distributed capabilities.

Installation and Setup
# Download and install CockroachDB
curl https://binaries.cockroachdb.com/cockroach-v23.1.0.linux-amd64.tgz | tar -xz
sudo cp cockroach-v23.1.0.linux-amd64/cockroach /usr/local/bin/

# Start a single-node cluster (development)
cockroach start-single-node --insecure --listen-addr=localhost:26257

# Start a multi-node cluster (production-style; real deployments should replace --insecure with --certs-dir)
# Node 1
cockroach start --insecure --store=node1 --listen-addr=node1:26257 --http-addr=node1:8080 --join=node1:26257,node2:26257,node3:26257

# Node 2
cockroach start --insecure --store=node2 --listen-addr=node2:26257 --http-addr=node2:8080 --join=node1:26257,node2:26257,node3:26257

# Node 3
cockroach start --insecure --store=node3 --listen-addr=node3:26257 --http-addr=node3:8080 --join=node1:26257,node2:26257,node3:26257

# Initialize the cluster
cockroach init --insecure --host=node1:26257
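
Once the cluster is initialized, membership and liveness can be verified from the command line (is_live should be true for every node):

# Check that all three nodes have joined and are live
cockroach node status --insecure --host=node1:26257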
Basic SQL Operations
-- Connect to CockroachDB
cockroach sql --insecure --host=localhost:26257

-- Create a database and table
CREATE DATABASE ecommerce;
USE ecommerce;

CREATE TABLE users (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    email STRING UNIQUE NOT NULL,
    name STRING NOT NULL,
    created_at TIMESTAMP DEFAULT now()
);

-- Insert data
INSERT INTO users (email, name) VALUES 
    ('alice@example.com', 'Alice Johnson'),
    ('bob@example.com', 'Bob Smith');

-- Query data
SELECT * FROM users WHERE created_at > '2023-01-01';

-- Show cluster status
SHOW CLUSTER SETTING cluster.organization;
SHOW RANGES FROM TABLE users;
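
Because CockroachDB speaks standard SQL, explicit multi-statement transactions work as expected; a minimal sketch using the users table above:

-- Both inserts commit atomically, or neither does
BEGIN;
INSERT INTO users (email, name) VALUES ('carol@example.com', 'Carol Davis');
INSERT INTO users (email, name) VALUES ('dave@example.com', 'Dave Lee');
COMMIT;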

Real-World Examples

Netflix - Global Content Delivery

Netflix uses CockroachDB for global content metadata management across multiple regions.

  • 200+ million users served globally
  • 99.99% uptime with automatic failover
  • Sub-100ms read latency worldwide
  • 500TB+ of metadata replicated across regions

Bose - IoT Device Management

Bose leverages CockroachDB for managing millions of connected audio devices worldwide.

  • 10+ million connected devices
  • 50,000+ QPS peak device telemetry
  • 5 regions with automatic data locality
  • Zero downtime during regional outages

WeWork - Global Workspace Management

WeWork uses CockroachDB for real-time workspace booking and occupancy tracking.

  • 500+ locations across 30+ countries
  • 1M+ bookings processed monthly
  • Real-time occupancy data with strong consistency
  • Multi-region deployment with data sovereignty

Advanced Features

Multi-Region Deployment

-- Pin the entire database to a region via zone configuration
ALTER DATABASE ecommerce CONFIGURE ZONE USING constraints='[+region=us-east]';

-- Partition a table by region (assumes users has a region STRING column)
ALTER TABLE users PARTITION BY LIST (region) (
    PARTITION us_east VALUES IN ('us-east'),
    PARTITION us_west VALUES IN ('us-west'),
    PARTITION eu_west VALUES IN ('eu-west')
);

-- Pin data to specific regions
ALTER PARTITION us_east OF INDEX users@primary 
CONFIGURE ZONE USING constraints='[+region=us-east]';
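
Recent CockroachDB versions (v21.1+) also offer declarative multi-region SQL, which manages partitions and zone configs automatically; a minimal sketch, assuming region names that match the --locality flags the nodes were started with:

-- Declare the database's regions
ALTER DATABASE ecommerce SET PRIMARY REGION "us-east1";
ALTER DATABASE ecommerce ADD REGION "us-west1";
ALTER DATABASE ecommerce ADD REGION "europe-west1";

-- Home each row in a region via a hidden crdb_region column
ALTER TABLE users SET LOCALITY REGIONAL BY ROW;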

Change Data Capture (CDC)

-- Enable rangefeeds (required before creating changefeeds)
SET CLUSTER SETTING kv.rangefeed.enabled = true;

-- Set up a changefeed for real-time data sync to Kafka
CREATE CHANGEFEED FOR TABLE users 
INTO 'kafka://localhost:9092' 
WITH updated, resolved='10s';

-- Stream changes to cloud storage
CREATE CHANGEFEED FOR TABLE orders
INTO 's3://my-bucket/orders?AWS_ACCESS_KEY_ID=...'
WITH format='avro', confluent_schema_registry='http://localhost:8081';
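
Changefeeds run as long-lived jobs, so they can be inspected and controlled with the standard job statements (the job ID below is a placeholder):

-- List jobs, then pause/resume a changefeed by its ID
SHOW JOBS;
PAUSE JOB 123456789012345678;
RESUME JOB 123456789012345678;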

Best Practices

✅ Do

  • Use UUID primary keys for better distribution
  • Design schemas to avoid hot spots
  • Use appropriate replication factors (3 for most cases)
  • Monitor cluster health with built-in dashboards
  • Implement proper backup and recovery strategies (see the sketch below)
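
As a sketch of the backup item above, backup and restore are plain SQL in CockroachDB on recent versions (the bucket URL is a placeholder, as in the changefeed example):

-- Full database backup to cloud storage, then restore the latest one
BACKUP DATABASE ecommerce INTO 's3://my-bucket/backups?AWS_ACCESS_KEY_ID=...';
RESTORE DATABASE ecommerce FROM LATEST IN 's3://my-bucket/backups?AWS_ACCESS_KEY_ID=...';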

❌ Don't

  • Use sequential primary keys (creates hot spots)
  • Run large transactions without batching (see the batch-delete sketch after this list)
  • Ignore replication lag in multi-region setups
  • Over-partition small tables
  • Deploy without proper monitoring and alerting
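
As a sketch of the batching advice above, CockroachDB allows LIMIT on DELETE, so a large cleanup can run as a series of small transactions (cutoff date and batch size are illustrative):

-- Run repeatedly until 0 rows are affected
DELETE FROM users
WHERE created_at < '2020-01-01'
LIMIT 1000;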

Performance Optimization

Query Optimization

-- Analyze query performance
EXPLAIN ANALYZE SELECT * FROM users WHERE email = 'alice@example.com';

-- Create appropriate indexes
CREATE INDEX idx_users_email ON users (email);
CREATE INDEX idx_users_created_at ON users (created_at DESC);

-- Force a specific index with an index hint
SELECT * FROM users@{FORCE_INDEX=idx_users_email} WHERE email = 'alice@example.com';

-- Check per-statement statistics (service_lat_avg is average end-to-end latency)
SELECT * FROM crdb_internal.node_statement_statistics 
WHERE application_name = 'my_app'
ORDER BY service_lat_avg DESC LIMIT 10;
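
CockroachDB also supports covering indexes via a STORING clause, which lets a query like the hinted one above be answered entirely from the secondary index:

-- Store name alongside email so the lookup avoids a primary-index join
CREATE INDEX idx_users_email_covering ON users (email) STORING (name);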

Cluster Tuning

-- Configure settings for performance
-- Range size is set through zone configs rather than a cluster setting
ALTER RANGE default CONFIGURE ZONE USING range_max_bytes = 134217728; -- 128 MiB ranges

-- Throttle the snapshot transfer rate used for rebalancing
SET CLUSTER SETTING kv.snapshot_rebalance.max_rate = '64MiB';

-- Minimum interval between WAL syncs
SET CLUSTER SETTING rocksdb.min_wal_sync_interval = '500µs';

-- Monitor currently executing queries (oldest start = longest-running)
SELECT * FROM crdb_internal.cluster_queries ORDER BY start;
SHOW CLUSTER SETTING diagnostics.reporting.enabled;
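
On recent versions, the active workload can also be inspected interactively with standard statements:

-- Currently executing statements and open sessions
SHOW STATEMENTS;
SHOW SESSIONS;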