Database Sharding
Horizontal partitioning strategies and trade-offs
Database sharding splits data across multiple servers to achieve horizontal scaling. When a single database can't handle your load, sharding distributes both data and traffic. The challenge lies in choosing the right shard key and managing the complexity of distributed queries and transactions.
The golden rule: shard by the most common access pattern. If users always query their own data, shard by user_id. For multi-tenant apps, shard by tenant_id. Use consistent hashing to minimize resharding pain, and always plan for cross-shard queries: they're inevitable but should be rare.
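As a concrete illustration, here is a minimal routing sketch in Python, assuming a hypothetical list of shard connection strings (SHARD_DSNS) and simple hash-modulo placement; a real router would usually live in the data-access layer or a proxy.

```python
import hashlib

# Hypothetical shard connection strings; in a real system these come from config.
SHARD_DSNS = [
    "postgresql://db-shard-0/app",
    "postgresql://db-shard-1/app",
    "postgresql://db-shard-2/app",
    "postgresql://db-shard-3/app",
]

def shard_for_user(user_id: int) -> str:
    """Stable hash of the shard key, so the same user always lands on the same shard."""
    digest = hashlib.md5(str(user_id).encode()).hexdigest()
    return SHARD_DSNS[int(digest, 16) % len(SHARD_DSNS)]

print(shard_for_user(42))  # every query for user 42 is routed to one shard
```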
Horizontal (Range-based)
Split data by ranges of shard key values
Pros:
- Simple to understand
- Easy range queries
- Predictable data location
Cons:
- Hotspots possible
- Uneven data distribution
- Manual rebalancing
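For the range-based strategy above, shard lookup can be a binary search over the range boundaries. A minimal sketch, assuming hypothetical upper bounds per shard:

```python
from bisect import bisect_right

# Hypothetical boundaries: shard 0 holds user_id < 1_000_000, shard 1 < 2_000_000, ...
RANGE_UPPER_BOUNDS = [1_000_000, 2_000_000, 3_000_000]  # ids above the last bound go to the last shard

def range_shard(user_id: int) -> int:
    """Binary-search the boundary list for the shard that owns this id."""
    return bisect_right(RANGE_UPPER_BOUNDS, user_id)

assert range_shard(999_999) == 0
assert range_shard(2_500_000) == 2
assert range_shard(9_999_999) == 3  # the catch-all last shard
```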
Hash-based
Use hash function to determine shard placement
Pros:
- Even distribution
- Automatic balancing
- No hotspots
Cons:
- Complex range queries
- Difficult resharding
- Hash function dependency
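The "difficult resharding" drawback is easy to demonstrate: with plain hash-modulo placement, adding one shard reassigns most keys. A small sketch (the hash function and shard counts are illustrative):

```python
import hashlib

def mod_shard(key: int, n_shards: int) -> int:
    digest = hashlib.md5(str(key).encode()).hexdigest()
    return int(digest, 16) % n_shards

# How many keys move when a fifth shard is added to a four-shard cluster?
keys = range(100_000)
moved = sum(1 for k in keys if mod_shard(k, 4) != mod_shard(k, 5))
print(f"{moved / len(keys):.0%} of keys change shards")  # roughly 80%
```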
Directory-based
Lookup service maps keys to shards
Pros:
- Flexible routing
- Dynamic rebalancing
- Query optimization
Cons:
- Additional complexity
- Lookup service is a single point of failure
- Extra network hop
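A directory can be as simple as a key-to-shard map consulted on every request. The sketch below uses an in-memory dict as a stand-in for the lookup service, which in practice would be a replicated store or a small metadata database:

```python
# An in-memory dict stands in for the lookup service; production systems keep this
# mapping in a replicated store (e.g. etcd/ZooKeeper) or a small metadata database.
directory = {
    "tenant_a": "shard-1",
    "tenant_b": "shard-1",
    "tenant_c": "shard-2",
}

def shard_for_tenant(tenant_id: str) -> str:
    return directory[tenant_id]  # one extra lookup (and network hop) per request

# Rebalancing is a directory update plus a data copy for the affected tenant.
directory["tenant_b"] = "shard-3"
print(shard_for_tenant("tenant_b"))  # shard-3
```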
Consistent Hashing
Hash ring distributes load with minimal reshuffling
Pros:
- Minimal data movement
- Handles failures well
- Elastic scaling
Cons:
- Complex implementation
- Uneven loads possible
- Virtual nodes needed
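A minimal consistent-hash ring with virtual nodes might look like the sketch below; the node names, virtual-node count, and MD5 hash are illustrative choices, not a reference implementation.

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Toy hash ring with virtual nodes; illustrative only, not production code."""

    def __init__(self, nodes, vnodes: int = 100):
        self._ring: list[tuple[int, str]] = []  # sorted (hash, node) pairs
        for node in nodes:
            self.add(node, vnodes)

    @staticmethod
    def _hash(value: str) -> int:
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def add(self, node: str, vnodes: int = 100) -> None:
        # Each physical node appears many times on the ring to even out the load.
        self._ring.extend((self._hash(f"{node}#{i}"), node) for i in range(vnodes))
        self._ring.sort()

    def get(self, key: str) -> str:
        # Walk clockwise from the key's position to the first virtual node.
        hashes = [h for h, _ in self._ring]
        idx = bisect.bisect_right(hashes, self._hash(key)) % len(self._ring)
        return self._ring[idx][1]

ring = ConsistentHashRing(["db-1", "db-2", "db-3"])
before = {key: ring.get(key) for key in (f"user:{i}" for i in range(10_000))}
ring.add("db-4")
moved = sum(1 for key, node in before.items() if ring.get(key) != node)
print(f"{moved / len(before):.0%} of keys moved")  # roughly a quarter, not ~80%
```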
Sharding introduces complexity. Here are the main challenges and proven approaches to solve them.
Cross-shard Queries
Queries spanning multiple shards are expensive
- Denormalization: Duplicate data to avoid joins
- Application-level joins: Fetch from each shard and combine in the app (see the scatter-gather sketch after this list)
- Read replicas: Aggregate data for reporting
- Event sourcing: Rebuild views from event stream
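For application-level joins, a common shape is scatter-gather: query every shard in parallel, then merge, re-sort, and re-limit in the application. A sketch under assumed helpers; query_shard is a placeholder for a real driver call such as psycopg.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical shard list; query_shard stands in for a real per-shard query that
# runs the SQL against one shard and returns rows as dicts.
SHARDS = ["postgresql://db-shard-0/app", "postgresql://db-shard-1/app"]

def query_shard(dsn: str, sql: str, params: tuple) -> list[dict]:
    return []  # replace with a real driver call against this shard

def scatter_gather(sql: str, params: tuple = (), limit: int = 50) -> list[dict]:
    """Fan the query out to every shard in parallel, then merge in the application."""
    with ThreadPoolExecutor(max_workers=len(SHARDS)) as pool:
        futures = [pool.submit(query_shard, dsn, sql, params) for dsn in SHARDS]
        rows = [row for future in futures for row in future.result()]
    # ORDER BY / LIMIT must be re-applied here: each shard only sorted its own slice.
    return sorted(rows, key=lambda r: r["created_at"], reverse=True)[:limit]

latest = scatter_gather("SELECT * FROM posts ORDER BY created_at DESC LIMIT 50")
```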
Distributed Transactions
ACID guarantees across shards are complex
- Avoid when possible: Design for single-shard transactions
- 2PC (Two-Phase Commit): Slow but consistent
- Saga pattern: Compensating transactions (a minimal sketch follows this list)
- Eventually consistent: Accept temporary inconsistency
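A saga replaces one distributed transaction with a sequence of local steps, each paired with a compensating action that undoes it if a later step fails. A minimal, illustrative sketch:

```python
def run_saga(steps):
    """Run (action, compensation) pairs; on failure, undo completed steps in reverse."""
    completed = []
    try:
        for action, compensate in steps:
            action()
            completed.append(compensate)
    except Exception:
        for compensate in reversed(completed):
            compensate()  # compensations must be idempotent/retried in a real system
        raise

# Illustrative order flow touching two different shards (or services):
run_saga([
    (lambda: print("reserve inventory on shard A"),
     lambda: print("release inventory on shard A")),
    (lambda: print("charge payment on shard B"),
     lambda: print("refund payment on shard B")),
])
```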
Shard Key Selection
Wrong shard key leads to hotspots and poor performance
- High cardinality: Many unique values
- Even distribution: Avoid skewed access patterns (a quick skew check is sketched after this list)
- Query alignment: Match common query patterns
- Immutable keys: Avoid changing shard placement
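Before committing to a shard key, it can help to replay a sample of real keys (or, better, one key per request) through the candidate placement function and measure skew. A rough sketch with hypothetical sample data:

```python
from collections import Counter
import hashlib

def shard_of(key: str, n_shards: int = 8) -> int:
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % n_shards

# Hypothetical sample: ideally replay one key per *request* from real traffic so
# hot users/tenants show up, not just the raw key population.
sample_keys = [f"user:{i}" for i in range(100_000)]
counts = Counter(shard_of(k) for k in sample_keys)
average = len(sample_keys) / len(counts)
print(f"hottest shard gets {max(counts.values()) / average:.2f}x the average load")
```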
Resharding
Adding/removing shards requires data migration
- Pre-shard: Over-provision initially
- Virtual shards: Multiple virtual shards per physical node (sketched after this list)
- Live migration: Move data without downtime
- Consistent hashing: Minimize data movement
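Virtual (logical) shards decouple the hash space from physical nodes: keys map to a fixed number of logical shards, and only the logical-to-physical assignment changes during resharding. An illustrative sketch (shard counts and node names are hypothetical):

```python
import hashlib

N_LOGICAL = 4096  # fixed up front (illustrative value), well above any expected node count

# Keys hash to logical shards; only this small map changes when nodes are added.
logical_to_physical = {i: f"db-{i % 4}" for i in range(N_LOGICAL)}  # 4 physical nodes today

def shard_for(key: str) -> str:
    logical = int(hashlib.md5(key.encode()).hexdigest(), 16) % N_LOGICAL
    return logical_to_physical[logical]

# Scaling out: reassign a subset of logical shards to the new node and copy only their data.
for logical in range(0, N_LOGICAL, 5):
    logical_to_physical[logical] = "db-4"
```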
Real-world examples show how large systems apply these strategies.
User ID-based sharding
PostgreSQL sharded by user_id using Django ORM
- Started simple with range sharding
- Migrated to hash-based for better distribution
- Pre-allocated shard space for growth
Discord
150M+ users, billions of messages
Guild-based sharding
Cassandra clusters sharded by guild (server) ID
- Guild-based keeps related data together
- Consistent hashing handles node failures
- Hot guilds can still cause issues
Uber
100M+ users, global presence
Geographic + hash sharding
MySQL sharded by city + hash(trip_id)
- Geographic sharding reduces latency
- City boundaries can be problematic
- Multi-level sharding adds complexity
Object type-based sharding
HBase/MySQL sharded by object type and ID
- Different object types have different access patterns
- Board data kept together for performance
- Cross-type queries are expensive
Key performance considerations when implementing database sharding: query performance, data movement, and hotspot risk.
Sharding Best Practices
- Start with vertical scaling first
- Choose shard key carefully; it's hard to change
- Design for single-shard queries when possible
- Monitor shard utilization and hotspots
- Plan for resharding from day one
- Consider read replicas before sharding
Common Mistakes
- Sharding too early or too late
- Using timestamp as shard key (hotspots)
- Ignoring cross-shard query performance
- Not planning for data migration
- Tight coupling between shards
- Over-optimizing for edge cases