SSTables

Master the sorted string table data structure used in LSM-tree storage engines

What are SSTables?

Sorted String Tables (SSTables) are immutable, sorted files that store key-value pairs. They form the foundation of many modern NoSQL databases and storage engines.

Key Properties

  • Immutable: Never modified after creation
  • Sorted: Keys are stored in lexicographic order
  • Compressed: Uses block-level compression
  • Indexed: Sparse index for efficient lookups
  • Self-contained: Includes metadata and bloom filters

Advantages

  • Excellent sequential write performance
  • High compression ratios
  • Efficient range queries
  • Simple crash recovery
  • Predictable performance characteristics

SSTable Internal Structure

📁 File Layout (in on-disk order, footer last)

1. Data Blocks

Compressed blocks containing the actual key-value pairs in sorted order

  • Each block contains multiple records
  • Block size typically 4KB-64KB
  • Compression applied per block
  • Checksum for integrity verification

2. Meta Blocks

Optional blocks for metadata like bloom filters and statistics

  • Bloom filter for existence checks
  • Column family metadata
  • Statistics and histograms
  • Custom application metadata

3. Index Block

Sparse index mapping keys to data block offsets

  • One entry per data block
  • Contains last key of each block
  • Enables binary search for range queries
  • Typically fits in memory

4. Footer

Fixed-size footer containing essential metadata

  • Offset to index block
  • Offset to meta blocks
  • File format version
  • Magic number for validation
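
To make this layout concrete, here is a minimal Python sketch that writes a toy SSTable: per-block compression with checksums, a sparse index with one entry per block, and a fixed-size footer. The format, block size, and magic number are illustrative choices, not the on-disk layout of any real engine.

```python
import json
import struct
import zlib

BLOCK_SIZE = 4096          # target uncompressed block size (illustrative)
MAGIC = 0x55AA55AA         # arbitrary magic number for file validation

def write_sstable(path, items):
    """Write sorted (key, value) string pairs as a toy SSTable:
    [data blocks][index block][footer]."""
    index = []                                  # (last_key, offset, length) per block
    with open(path, "wb") as f:
        block, block_start, block_bytes = [], 0, 0

        def flush_block():
            nonlocal block, block_start, block_bytes
            if not block:
                return
            raw = json.dumps(block).encode()
            comp = zlib.compress(raw)           # compression applied per block
            f.write(comp)
            f.write(struct.pack("<I", zlib.crc32(comp)))   # per-block checksum
            index.append((block[-1][0], block_start, len(comp)))
            block, block_start, block_bytes = [], f.tell(), 0

        for key, value in sorted(items):        # keys must be written in sorted order
            block.append((key, value))
            block_bytes += len(key) + len(value)
            if block_bytes >= BLOCK_SIZE:
                flush_block()
        flush_block()

        index_offset = f.tell()                 # sparse index: last key of each block
        index_bytes = json.dumps(index).encode()
        f.write(index_bytes)
        # fixed-size footer: index offset, index length, magic number
        f.write(struct.pack("<QQI", index_offset, len(index_bytes), MAGIC))

write_sstable("example.sst", [("banana", "yellow"), ("apple", "red"), ("cherry", "dark")])
```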

SSTable Read Operations

🔍 Point Lookups

Step 1: Bloom Filter Check
Quick check if key might exist (eliminates ~99% of negative lookups)
Step 2: Index Search
Binary search in sparse index to find relevant data block
Step 3: Block Read
Read and decompress the target data block
Step 4: Block Search
Binary search within block to find exact key
Time Complexity: O(log B + log(R/B))
B = number of data blocks, R = total records in the SSTable
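
The four steps map directly onto a few lines of code. The sketch below is hedged: it assumes a hypothetical in-memory SSTable handle with a `bloom` filter, a sparse `index` of (last_key, offset, length) entries, and a `read_block(offset, length)` helper returning sorted (key, value) pairs; none of these names come from a real library.

```python
import bisect

def sstable_get(table, key):
    """Point lookup against a hypothetical in-memory SSTable handle."""
    # Step 1: bloom filter check -- a definite "no" avoids any disk I/O
    if not table.bloom.might_contain(key):
        return None
    # Step 2: binary search the sparse index; each entry holds the last
    # key of a data block, so find the first block whose last key >= key
    last_keys = [entry[0] for entry in table.index]
    i = bisect.bisect_left(last_keys, key)
    if i == len(table.index):
        return None                              # key is past the end of the file
    _, offset, length = table.index[i]
    # Step 3: read and decompress the single candidate data block
    records = table.read_block(offset, length)   # sorted (key, value) pairs
    # Step 4: binary search within the block for the exact key
    keys = [k for k, _ in records]
    j = bisect.bisect_left(keys, key)
    if j < len(records) and records[j][0] == key:
        return records[j][1]
    return None                                  # bloom filter false positive
```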

📊 Range Queries

Step 1: Find Start Block
Use index to locate first block containing start key
Step 2: Sequential Scan
Read consecutive blocks until end key is reached
Step 3: Filter Results
Apply additional predicates and collect matching records
Optimization:
Block-level filtering using metadata can skip entire blocks
Performance: Excellent for range scans
Minimal seeks, sequential I/O
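
A range scan uses the same sparse index to locate its starting block and then streams blocks sequentially. A sketch under the same hypothetical table handle as the point-lookup example:

```python
import bisect

def sstable_range(table, start_key, end_key):
    """Yield (key, value) pairs with start_key <= key < end_key."""
    # Step 1: use the sparse index to locate the first candidate block
    last_keys = [entry[0] for entry in table.index]
    i = bisect.bisect_left(last_keys, start_key)
    # Step 2: scan consecutive blocks -- sequential I/O, minimal seeks
    while i < len(table.index):
        _, offset, length = table.index[i]
        for key, value in table.read_block(offset, length):
            if key >= end_key:
                return                   # past the end of the range
            # Step 3: filter within the block
            if key >= start_key:
                yield key, value
        i += 1
```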

LSM Trees and Write Path

🌳 Log-Structured Merge Trees

📝 MemTable (L0)

In-memory buffer
Sorted structure (red-black tree or skip list)
Fast writes
Volatile

💾 SSTables (L1+)

Immutable files
Sorted on disk
Compressed
Persistent

📥 Write Process

1. Writes go to WAL (Write-Ahead Log) for durability
2. Data inserted into in-memory MemTable
3. When MemTable is full, flush to L0 SSTable
4. Background compaction merges SSTables across levels
5. Each level has size limits, triggering cascading compactions
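
A toy version of this write path is sketched below; a plain sorted text file stands in for the real block format, and compaction, crash recovery, and concurrency are omitted.

```python
import os

class MiniLSM:
    """Toy write path: WAL append -> memtable insert -> flush to SSTable."""

    def __init__(self, data_dir, memtable_limit=4 * 1024 * 1024):
        self.data_dir = data_dir
        self.memtable_limit = memtable_limit
        self.memtable = {}               # stands in for a skip list / red-black tree
        self.memtable_bytes = 0
        self.sstable_seq = 0
        os.makedirs(data_dir, exist_ok=True)
        self.wal = open(os.path.join(data_dir, "wal.log"), "ab")

    def put(self, key, value):
        # 1. append to the WAL and fsync for durability
        self.wal.write(f"{key}\t{value}\n".encode())
        self.wal.flush()
        os.fsync(self.wal.fileno())
        # 2. insert into the in-memory memtable
        self.memtable[key] = value
        self.memtable_bytes += len(key) + len(value)
        # 3. flush to an immutable SSTable once the memtable is full
        if self.memtable_bytes >= self.memtable_limit:
            self.flush()

    def flush(self):
        path = os.path.join(self.data_dir, f"sstable-{self.sstable_seq:06d}.sst")
        # write an immutable, sorted file (stand-in for the block format above)
        with open(path, "w") as out:
            for key in sorted(self.memtable):
                out.write(f"{key}\t{self.memtable[key]}\n")
        self.sstable_seq += 1
        self.memtable.clear()
        self.memtable_bytes = 0
        self.wal.truncate(0)             # memtable contents are now durable on disk

db = MiniLSM("toy_db")
db.put("user:1", "alice")
db.put("user:2", "bob")
db.flush()   # force a flush instead of waiting for the size threshold
```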

Compaction Strategies

🎯 Size-Tiered Compaction

Merge SSTables of similar sizes when threshold is reached
Pros:
  • Good for write-heavy workloads
  • Lower write amplification
  • Simpler implementation
Cons:
  • Higher space amplification
  • Temporary space spikes during compaction
  • Less predictable read performance
Used by: Cassandra (default), ScyllaDB

📊 Leveled Compaction

Maintain fixed levels with non-overlapping key ranges
Pros:
  • Better space utilization
  • Predictable read performance
  • Lower space amplification
Cons:
  • Higher write amplification
  • More complex implementation
  • Can hurt write-heavy workloads
Used by: LevelDB, RocksDB, Cassandra (optional)
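
Whichever strategy triggers it, the compaction itself is a k-way merge of sorted runs that keeps only the newest version of each key and drops tombstoned entries. A simplified sketch (the newest-first ordering and the tombstone convention are assumptions):

```python
import heapq

TOMBSTONE = None   # assumed marker for deleted keys

def compact(sstables):
    """Merge sorted runs, keeping only the newest version of each key.

    `sstables` is a list of sorted (key, value) sequences, ordered
    newest-first; tombstoned keys are dropped from the output entirely.
    """
    # Tag each record with its source's age so duplicates sort newest-first.
    runs = [
        [(key, age, value) for key, value in table]
        for age, table in enumerate(sstables)
    ]
    last_key = object()                      # sentinel: matches no real key
    for key, age, value in heapq.merge(*runs):
        if key == last_key:
            continue                         # older duplicate: discard
        last_key = key
        if value is not TOMBSTONE:           # drop deleted keys during compaction
            yield key, value

# Example: newest table first; "banana" is overwritten, "cherry" deleted.
newest = [("banana", "green"), ("cherry", TOMBSTONE)]
oldest = [("apple", "red"), ("banana", "yellow"), ("cherry", "dark")]
print(list(compact([newest, oldest])))
# [('apple', 'red'), ('banana', 'green')]
```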

Bloom Filters in SSTables

🌸 How Bloom Filters Work

Probabilistic Data Structure:
Can definitively say "NOT present" but only "MAYBE present"
Bit Array + Hash Functions:
Multiple hash functions set bits in array for each key
Query Process:
Check if all hash positions are set; if not, key definitely absent
False Positive Rate:
Tunable based on bits per key and hash functions
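
A minimal bloom filter sketch, using double hashing to derive the k bit positions; the hash scheme and sizes are illustrative rather than any engine's exact implementation.

```python
import hashlib

class BloomFilter:
    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Double hashing: derive k positions from two base hash values.
        digest = hashlib.sha256(key.encode()).digest()
        h1 = int.from_bytes(digest[:8], "little")
        h2 = int.from_bytes(digest[8:16], "little")
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # All k bits set  -> "maybe present" (could be a false positive).
        # Any bit unset   -> "definitely not present".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))

bf = BloomFilter(num_bits=10_000, num_hashes=7)   # ~10 bits/key budget for ~1,000 keys
bf.add("user:42")
print(bf.might_contain("user:42"))   # True
print(bf.might_contain("user:99"))   # almost certainly False
```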

📈 Performance Impact

Read Optimization:
Eliminates 90-99% of unnecessary disk I/O
Memory Cost:
~10 bits per key for 1% false positive rate
CPU Cost:
Multiple hash computations per lookup
Trade-off: Small memory cost for massive I/O savings
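
The "~10 bits per key" rule of thumb follows from the standard sizing formulas: bits per key m/n = -ln(p) / (ln 2)^2 and optimal hash count k = (m/n) · ln 2. A quick check:

```python
import math

def bloom_parameters(false_positive_rate):
    """Bits per key and optimal hash count for a target false positive rate."""
    bits_per_key = -math.log(false_positive_rate) / (math.log(2) ** 2)
    num_hashes = bits_per_key * math.log(2)
    return bits_per_key, num_hashes

bits, hashes = bloom_parameters(0.01)
print(f"{bits:.1f} bits/key, {round(hashes)} hash functions")   # 9.6 bits/key, 7 hash functions
```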

SSTable Implementations

🏛️ Apache Cassandra

SSTable Format:
Custom format optimized for wide rows and time-series data
Compaction:
Size-tiered by default, leveled optional
Features:
  • Partition key bloom filters
  • Compression per-block
  • Tombstone markers for deletes
  • TTL support

🪨 RocksDB/LevelDB

SSTable Format:
Highly optimized for key-value workloads
Compaction:
Leveled compaction with non-overlapping levels
Features:
  • Prefix bloom filters
  • Block-level checksums
  • Column families
  • Write-ahead logging

📊 Apache HBase

SSTable Format:
HFile format; HBase layers strong consistency guarantees on top
Compaction:
Region-based compaction with automatic splits
Features:
  • Bloom filters per column family
  • Block cache for hot data
  • Data block encoding
  • HDFS integration

⚡ ScyllaDB

SSTable Format:
Enhanced format with C++ optimizations
Compaction:
Incremental compaction to reduce latency spikes
Features:
  • NUMA-aware processing
  • Seastar async framework
  • Advanced caching
  • Workload conditioning

Performance Characteristics

📝 Write Performance

  • Excellent: Sequential writes only
  • Batching: Amortizes I/O costs
  • Write Amplification: 2-10x due to compaction
  • Latency: Bounded by memtable size

📖 Read Performance

  • Point reads: O(log N) per SSTable; bloom filters skip tables that lack the key
  • Range scans: Very efficient
  • Read Amplification: Multiple SSTable checks
  • Cache friendly: Predictable access patterns

💾 Space Efficiency

  • Compression: 50-90% space savings
  • Space Amplification: 1.1-2x of raw data
  • Overhead: Indexes, bloom filters, metadata
  • Compaction: Requires temporary space
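
As a rough back-of-the-envelope combination of these factors (every number below is an assumed example, not a measurement):

```python
# Back-of-the-envelope sizing for 100 GB of raw key-value data.
raw_gb = 100
compression_ratio = 0.4        # assume block compression keeps 40% of the raw size
space_amplification = 1.3      # assume obsolete versions awaiting compaction
write_amplification = 5        # assume mid-range compaction rewrite overhead

on_disk_gb = raw_gb * compression_ratio * space_amplification
lifetime_writes_gb = raw_gb * compression_ratio * write_amplification

print(f"approx. on-disk footprint: {on_disk_gb:.0f} GB")        # 52 GB
print(f"approx. bytes written:     {lifetime_writes_gb:.0f} GB") # 200 GB
```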

SSTable Best Practices

✅ Optimization Strategies

  • Tune bloom filter false positive rates
  • Choose appropriate block sizes for workload
  • Monitor compaction metrics and adjust strategy
  • Use compression algorithms suited to data type
  • Size MemTables based on available memory
  • Partition data to avoid hotspots
  • Monitor write amplification

⚠️ Common Pitfalls

  • Ignoring compaction strategy for workload
  • Insufficient disk space for compaction
  • Poor key design causing hotspots
  • Bloom filter false positive rate too high
  • Block size not aligned with read patterns
  • Not monitoring write amplification
  • Forgetting to tune compression settings
