Database Sharding

Horizontal partitioning strategies and trade-offs

Database sharding splits data across multiple servers to achieve horizontal scaling. When a single database can't handle your load, sharding distributes both data and traffic. The challenge lies in choosing the right shard key and managing the complexity of distributed queries and transactions.

The golden rule: shard by the most common access pattern. If users always query their own data, shard by user_id. For multi-tenant apps, shard by tenant_id. Use consistent hashing to minimize resharding pain, and always plan for cross-shard queries: they're inevitable but should be rare.

Sharding Strategies Comparison (lowest vs. highest scoring strategy per dimension)
  • Implementation Complexity: Range 2/5, Consistent Hash 5/5
  • Data Distribution: Range 2/5, Hash 5/5
  • Range Query Performance: Hash 2/5, Range 5/5
  • Resharding Ease: Range 2/5, Consistent Hash 4/5

Horizontal (Range-based)

Split data by ranges of shard key values

Low Complexity
Medium Performance
Pros
  • Simple to understand
  • Easy range queries
  • Predictable data location
Cons
  • Hotspots possible
  • Uneven data distribution
  • Manual rebalancing
Example: User IDs 1-1,000,000 → Shard 1; 1,000,001-2,000,000 → Shard 2
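
A minimal routing sketch for this example; the boundaries and shard names are illustrative, not taken from any specific system:

```python
# Range-based routing: each shard owns a contiguous range of user IDs.
RANGES = [
    (1, 1_000_000, "shard1"),
    (1_000_001, 2_000_000, "shard2"),
]

def route_by_range(user_id: int) -> str:
    for low, high, shard in RANGES:
        if low <= user_id <= high:
            return shard
    raise ValueError(f"no shard covers user_id {user_id}")

# route_by_range(42)        -> "shard1"
# route_by_range(1_500_000) -> "shard2"
```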

Hash-based

Use hash function to determine shard placement

Medium Complexity
High Performance
Pros
  • Even distribution
  • Automatic balancing
  • No hotspots
Cons
  • Complex range queries
  • Difficult resharding
  • Hash function dependency
Example: hash(user_id) % shard_count = target_shard
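
A hedged sketch of modulo-hash routing. A deterministic hash (MD5 here) is used because Python's built-in hash() is not stable across processes; the shard count is illustrative:

```python
import hashlib

SHARD_COUNT = 8  # illustrative

def route_by_hash(user_id: int) -> int:
    # MD5 gives a stable hash across processes; built-in hash() does not.
    digest = hashlib.md5(str(user_id).encode()).hexdigest()
    return int(digest, 16) % SHARD_COUNT

# route_by_hash(123) always lands on the same shard index (0-7), but changing
# SHARD_COUNT remaps almost every key, which is the resharding pain noted above.
```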

Directory-based

Lookup service maps keys to shards

High Complexity
Medium Performance
Pros
  • Flexible routing
  • Dynamic rebalancing
  • Query optimization
Cons
  • Additional complexity
  • Lookup service is a single point of failure (SPOF)
  • Extra network hop
Example: Directory service: user_123 → shard_7
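
A toy sketch of directory-based routing; the in-memory dict stands in for a real lookup service (which would normally live in a replicated store), and the key and shard names are hypothetical:

```python
# In-memory stand-in for the directory service.
directory = {"user_123": "shard_7"}

def route_by_directory(key: str, default_shard: str = "shard_0") -> str:
    # Unmapped keys get an explicit assignment so future lookups are stable.
    if key not in directory:
        directory[key] = default_shard
    return directory[key]

# route_by_directory("user_123") -> "shard_7"
# Rebalancing is just rewriting entries in the directory.
```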

Consistent Hashing

Hash ring distributes load with minimal reshuffling

High Complexity
High Performance
Pros
  • Minimal data movement
  • Handles failures well
  • Elastic scaling
Cons
  • Complex implementation
  • Uneven loads possible
  • Virtual nodes needed
Example: Amazon DynamoDB, Cassandra ring topology
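
A compact hash-ring sketch with virtual nodes, in the spirit of (but not taken from) the Dynamo/Cassandra designs mentioned above; the vnode count and shard names are illustrative:

```python
import bisect
import hashlib

class ConsistentHashRing:
    def __init__(self, nodes, vnodes=100):
        self._keys = []    # sorted hash positions on the ring
        self._owner = {}   # hash position -> physical node
        for node in nodes:
            self.add_node(node, vnodes)

    @staticmethod
    def _hash(value: str) -> int:
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def add_node(self, node: str, vnodes: int = 100) -> None:
        # Each physical node appears many times on the ring (virtual nodes)
        # to smooth out the load it receives.
        for i in range(vnodes):
            pos = self._hash(f"{node}#{i}")
            bisect.insort(self._keys, pos)
            self._owner[pos] = node

    def get_node(self, key: str) -> str:
        # Walk clockwise to the first ring position after the key's hash.
        pos = bisect.bisect_right(self._keys, self._hash(key)) % len(self._keys)
        return self._owner[self._keys[pos]]

# ring = ConsistentHashRing(["shard-a", "shard-b", "shard-c"])
# ring.get_node("user_123") -> one of the shards; adding a fourth shard
# only moves roughly a quarter of the keys.
```
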
Sharding Challenges & Solutions

Sharding introduces complexity. Here are the main challenges and proven approaches to solve them.

Cross-shard Queries

Queries spanning multiple shards are expensive

Solutions:
  • Denormalization: Duplicate data to avoid joins
  • Application-level joins: Fetch and combine results in the application (see the sketch after this list)
  • Read replicas: Aggregate data for reporting
  • Event sourcing: Rebuild views from event stream
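
A minimal sketch of an application-level join, assuming two hypothetical helpers that query the relevant shards (they are not part of any real API):

```python
def application_level_join(fetch_orders, fetch_products, user_id):
    """Join orders (on the user's shard) with products (on other shards)
    in application code instead of in the database.

    fetch_orders(user_id) returns a list of order rows from one shard;
    fetch_products(ids) returns {product_id: row}, batched per owning shard.
    """
    orders = fetch_orders(user_id)                    # single-shard query
    product_ids = {o["product_id"] for o in orders}
    products = fetch_products(product_ids)            # queries to other shards
    return [{**o, "product": products[o["product_id"]]} for o in orders]
```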

Distributed Transactions

ACID guarantees across shards are complex

Solutions:
  • Avoid when possible: Design for single-shard transactions
  • 2PC (Two-Phase Commit): Slow but consistent
  • Saga pattern: Compensating transactions (sketched after this list)
  • Eventually consistent: Accept temporary inconsistency
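
A bare-bones sketch of the saga idea: each local step ships with a compensating action that undoes it if a later step fails. The step functions are placeholders you would supply:

```python
def run_saga(steps):
    """steps: list of (action, compensation) pairs of zero-argument callables."""
    done = []
    try:
        for action, compensation in steps:
            action()
            done.append(compensation)
    except Exception:
        # A later step failed: undo the completed steps in reverse order.
        for compensation in reversed(done):
            compensation()
        raise

# run_saga([
#     (debit_wallet, refund_wallet),            # placeholder shard-local writes
#     (reserve_inventory, release_inventory),
# ])
```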

Shard Key Selection

Wrong shard key leads to hotspots and poor performance

Solutions:
  • High cardinality: Many unique values
  • Even distribution: Avoid skewed access patterns (a quick skew check is sketched after this list)
  • Query alignment: Match common query patterns
  • Immutable keys: Avoid changing shard placement
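
A rough way to sanity-check a candidate shard key before committing to it: hash a sample of real key values and measure how uneven the resulting shard loads are. The acceptable threshold is a judgment call:

```python
import hashlib
from collections import Counter

def shard_load_skew(sample_keys, shard_count):
    """Return max shard load divided by mean shard load for a key sample.
    1.0 means perfectly even; large values warn of hotspots."""
    def shard_of(key):
        return int(hashlib.md5(str(key).encode()).hexdigest(), 16) % shard_count
    loads = Counter(shard_of(k) for k in sample_keys)
    mean = len(sample_keys) / shard_count
    return max(loads.values()) / mean

# shard_load_skew(sample_of_user_ids, 32) close to 1.0 suggests the key
# distributes well; a low-cardinality or timestamp-like key scores far worse.
```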

Resharding

Adding/removing shards requires data migration

Solutions:
  • Pre-shard: Over-provision initially
  • Virtual shards: Multiple virtual shards per physical node (sketched after this list)
  • Live migration: Move data without downtime
  • Consistent hashing: Minimize data movement
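
A sketch of the virtual-shard idea: keys hash into a fixed, deliberately large number of virtual shards, and a separate mutable table maps virtual shards to physical servers, so resharding only rewrites that table and moves whole virtual shards. Names and counts here are illustrative:

```python
import hashlib

VIRTUAL_SHARDS = 1024  # fixed up front, larger than you expect to need

# Hypothetical initial placement: virtual shards split across two servers.
placement = {v: ("db1" if v < 512 else "db2") for v in range(VIRTUAL_SHARDS)}

def physical_shard(user_id) -> str:
    v = int(hashlib.md5(str(user_id).encode()).hexdigest(), 16) % VIRTUAL_SHARDS
    return placement[v]

# Adding a third server later means reassigning some virtual shards and
# migrating only their data, e.g.:
#   for v in range(341, 512):
#       placement[v] = "db3"
```
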
Real-world Sharding Examples

Instagram

400M+ users, 1000s of shards
Approach

User ID-based sharding

PostgreSQL sharded by user_id using Django ORM

Key Lessons
  • Started simple with range sharding
  • Migrated to hash-based for better distribution
  • Pre-allocated shard space for growth

Discord

150M+ users, billions of messages
Approach

Guild-based sharding

Cassandra clusters sharded by guild (server) ID

Key Lessons
  • Guild-based sharding keeps related data together
  • Consistent hashing handles node failures
  • Hot guilds can still cause issues

Uber

100M+ users, global presence
Approach

Geographic + hash sharding

MySQL sharded by city + hash(trip_id)

Key Lessons
  • Geographic sharding reduces latency
  • City boundaries can be problematic
  • Multi-level sharding adds complexity

Pinterest

400M+ users, billions of pins
Approach

Object type-based sharding

HBase/MySQL sharded by object type and ID

Key Lessons
  • Different object types have different access patterns
  • Board data kept together for performance
  • Cross-type queries are expensive

Performance Impact & Metrics

Key performance considerations when implementing database sharding.

Query Performance

  • Single-shard queries: 95%
  • Cross-shard queries: 5%
  • Impact: cross-shard queries are 10-100x slower
  • Mitigation: Design for single-shard queries

Data Movement (during resharding)

  • Range sharding: High
  • Hash sharding: Medium
  • Consistent hashing: Low
  • Mitigation: Use consistent hashing

Hotspot Risk

  • Range sharding: High
  • Hash sharding: Low
  • Directory-based: Medium
  • Mitigation: Monitor shard utilization

Sharding Best Practices

  • Start with vertical scaling first
  • Choose the shard key carefully - it's hard to change
  • Design for single-shard queries when possible
  • Monitor shard utilization and hotspots
  • Plan for resharding from day one
  • Consider read replicas before sharding

Common Mistakes

  • Sharding too early or too late
  • Using a timestamp as the shard key (hotspots)
  • Ignoring cross-shard query performance
  • Not planning for data migration
  • Tight coupling between shards
  • Over-optimizing for edge cases

πŸ“ Test Your Knowledge

6 questions β€’ Progress: 0/6