
CockroachDB

Master CockroachDB: distributed SQL, ACID transactions, resilience, and global database architecture.


What is CockroachDB?

CockroachDB is a distributed SQL database that provides ACID transactions, horizontal scaling, and survivability without compromising on SQL capabilities. Built by Cockroach Labs, it is designed to be as resilient as its namesake: able to survive datacenter failures, network partitions, and other disasters while maintaining strong consistency.

  • Distributed: automatically scales across multiple nodes and regions
  • Consistent: ACID transactions with strong consistency guarantees
  • Survivable: automatic failover and disaster recovery

CockroachDB Performance Calculator

Example sizing from the page's interactive calculator: a 3-node cluster storing 100 GB with a 70% read / 30% write workload:

  • Total Memory: 42.75 GB
  • Write QPS: 720
  • Read QPS: 9,450
  • Max Node Failures: 1
  • Write Latency: 15 ms
  • Storage Efficiency: 33%

The 33% storage efficiency follows from the default replication factor of 3: every logical byte is stored three times, so 100 GB of data occupies roughly 300 GB of raw disk, and a 3-node cluster can lose at most 1 node while keeping a quorum of replicas.

Getting Started with CockroachDB

CockroachDB uses standard SQL syntax, making it familiar to developers while providing distributed capabilities.

Installation and Setup
# Download and install CockroachDB
curl https://binaries.cockroachdb.com/cockroach-v23.1.0.linux-amd64.tgz | tar -xz
sudo cp cockroach-v23.1.0.linux-amd64/cockroach /usr/local/bin/

# Start a single-node cluster (development)
cockroach start-single-node --insecure --listen-addr=localhost:26257

# Start a multi-node cluster (production-style; real deployments should replace --insecure with --certs-dir)
# Node 1
cockroach start --insecure --store=node1 --listen-addr=node1:26257 --http-addr=node1:8080 --join=node1:26257,node2:26257,node3:26257

# Node 2
cockroach start --insecure --store=node2 --listen-addr=node2:26257 --http-addr=node2:8080 --join=node1:26257,node2:26257,node3:26257

# Node 3
cockroach start --insecure --store=node3 --listen-addr=node3:26257 --http-addr=node3:8080 --join=node1:26257,node2:26257,node3:26257

# Initialize the cluster
cockroach init --insecure --host=node1:26257
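
Once the cluster is initialized, membership and liveness can be verified from the command line (is_live should be true for every node):

# Check that all three nodes have joined and are live
cockroach node status --insecure --host=node1:26257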
Basic SQL Operations
-- Connect to CockroachDB
cockroach sql --insecure --host=localhost:26257

-- Create a database and table
CREATE DATABASE ecommerce;
USE ecommerce;

CREATE TABLE users (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    email STRING UNIQUE NOT NULL,
    name STRING NOT NULL,
    created_at TIMESTAMP DEFAULT now()
);

-- Insert data
INSERT INTO users (email, name) VALUES 
    ('alice@example.com', 'Alice Johnson'),
    ('bob@example.com', 'Bob Smith');

-- Query data
SELECT * FROM users WHERE created_at > '2023-01-01';

-- Show cluster status
SHOW CLUSTER SETTING cluster.organization;
SHOW RANGES FROM TABLE users;
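
Because CockroachDB speaks standard SQL, explicit multi-statement transactions work as expected; a minimal sketch using the users table above:

-- Both inserts commit atomically, or neither does
BEGIN;
INSERT INTO users (email, name) VALUES ('carol@example.com', 'Carol Davis');
INSERT INTO users (email, name) VALUES ('dave@example.com', 'Dave Lee');
COMMIT;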

Real-World Examples

Netflix - Global Content Delivery

Netflix uses CockroachDB for global content metadata management across multiple regions.

  • 200+ million users served globally
  • 99.99% uptime with automatic failover
  • Sub-100ms read latency worldwide
  • 500TB+ of metadata replicated across regions

Bose - IoT Device Management

Bose leverages CockroachDB for managing millions of connected audio devices worldwide.

  • 10+ million connected devices
  • 50,000+ QPS peak device telemetry
  • 5 regions with automatic data locality
  • Zero downtime during regional outages

WeWork - Global Workspace Management

WeWork uses CockroachDB for real-time workspace booking and occupancy tracking.

  • 500+ locations across 30+ countries
  • 1M+ bookings processed monthly
  • Real-time occupancy data with strong consistency
  • Multi-region deployment with data sovereignty

Advanced Features

Multi-Region Deployment

-- Pin the entire database to a region via zone configuration
ALTER DATABASE ecommerce CONFIGURE ZONE USING constraints='[+region=us-east]';

-- Partition a table by region (assumes users has a region STRING column)
ALTER TABLE users PARTITION BY LIST (region) (
    PARTITION us_east VALUES IN ('us-east'),
    PARTITION us_west VALUES IN ('us-west'),
    PARTITION eu_west VALUES IN ('eu-west')
);

-- Pin data to specific regions
ALTER PARTITION us_east OF INDEX users@primary 
CONFIGURE ZONE USING constraints='[+region=us-east]';
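
Recent CockroachDB versions (v21.1+) also offer declarative multi-region SQL, which manages partitions and zone configs automatically; a minimal sketch, assuming region names that match the --locality flags the nodes were started with:

-- Declare the database's regions
ALTER DATABASE ecommerce SET PRIMARY REGION "us-east1";
ALTER DATABASE ecommerce ADD REGION "us-west1";
ALTER DATABASE ecommerce ADD REGION "europe-west1";

-- Home each row in a region via a hidden crdb_region column
ALTER TABLE users SET LOCALITY REGIONAL BY ROW;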

Change Data Capture (CDC)

-- Enable rangefeeds (required before creating changefeeds)
SET CLUSTER SETTING kv.rangefeed.enabled = true;

-- Set up a changefeed for real-time data sync to Kafka
CREATE CHANGEFEED FOR TABLE users 
INTO 'kafka://localhost:9092' 
WITH updated, resolved='10s';

-- Stream changes to cloud storage
CREATE CHANGEFEED FOR TABLE orders
INTO 's3://my-bucket/orders?AWS_ACCESS_KEY_ID=...'
WITH format='avro', confluent_schema_registry='http://localhost:8081';
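
Changefeeds run as long-lived jobs, so they can be inspected and controlled with the standard job statements (the job ID below is a placeholder):

-- List jobs, then pause/resume a changefeed by its ID
SHOW JOBS;
PAUSE JOB 123456789012345678;
RESUME JOB 123456789012345678;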

Best Practices

✅ Do

  • Use UUID primary keys for better distribution
  • Design schemas to avoid hot spots
  • Use appropriate replication factors (3 for most cases)
  • Monitor cluster health with built-in dashboards
  • Implement proper backup and recovery strategies (see the sketch below)
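
As a sketch of the backup item above, backup and restore are plain SQL in CockroachDB on recent versions (the bucket URL is a placeholder, as in the changefeed example):

-- Full database backup to cloud storage, then restore the latest one
BACKUP DATABASE ecommerce INTO 's3://my-bucket/backups?AWS_ACCESS_KEY_ID=...';
RESTORE DATABASE ecommerce FROM LATEST IN 's3://my-bucket/backups?AWS_ACCESS_KEY_ID=...';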

❌ Don't

  • Use sequential primary keys (creates hot spots)
  • Run large transactions without batching (see the batch-delete sketch after this list)
  • Ignore replication lag in multi-region setups
  • Over-partition small tables
  • Deploy without proper monitoring and alerting
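
As a sketch of the batching advice above, CockroachDB allows LIMIT on DELETE, so a large cleanup can run as a series of small transactions (cutoff date and batch size are illustrative):

-- Run repeatedly until 0 rows are affected
DELETE FROM users
WHERE created_at < '2020-01-01'
LIMIT 1000;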

Performance Optimization

Query Optimization

-- Analyze query performance
EXPLAIN ANALYZE SELECT * FROM users WHERE email = 'alice@example.com';

-- Create appropriate indexes
CREATE INDEX idx_users_email ON users (email);
CREATE INDEX idx_users_created_at ON users (created_at DESC);

-- Force a specific index with an index hint
SELECT * FROM users@{FORCE_INDEX=idx_users_email} WHERE email = 'alice@example.com';

-- Check per-statement statistics (service_lat_avg is average end-to-end latency)
SELECT * FROM crdb_internal.node_statement_statistics 
WHERE application_name = 'my_app'
ORDER BY service_lat_avg DESC LIMIT 10;
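
CockroachDB also supports covering indexes via a STORING clause, which lets a query like the hinted one above be answered entirely from the secondary index:

-- Store name alongside email so the lookup avoids a primary-index join
CREATE INDEX idx_users_email_covering ON users (email) STORING (name);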

Cluster Tuning

-- Configure settings for performance
-- Range size is set through zone configs rather than a cluster setting
ALTER RANGE default CONFIGURE ZONE USING range_max_bytes = 134217728; -- 128 MiB ranges

-- Throttle the snapshot transfer rate used for rebalancing
SET CLUSTER SETTING kv.snapshot_rebalance.max_rate = '64MiB';

-- Minimum interval between WAL syncs
SET CLUSTER SETTING rocksdb.min_wal_sync_interval = '500µs';

-- Monitor currently executing queries (oldest start = longest-running)
SELECT * FROM crdb_internal.cluster_queries ORDER BY start;
SHOW CLUSTER SETTING diagnostics.reporting.enabled;
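
On recent versions, the active workload can also be inspected interactively with standard statements:

-- Currently executing statements and open sessions
SHOW STATEMENTS;
SHOW SESSIONS;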