
Latency Numbers

CPU, memory, storage, datacenter, and WAN baselines


Latency numbers are the foundation of performance optimization. Every operation has a cost, and understanding these costs helps you optimize at the right level. Reading 1 MB from memory (250 μs) is roughly 80x faster than from spinning disk (20 ms), and a single main-memory reference (100 ns) is about a million times faster than a cross-continent round trip (150 ms). These aren't just numbers; they're the physics constraints that shape system architecture.

The golden rule: optimize where the biggest gaps are. Going from HDD to SSD turns a 20 ms read into about 1 ms. A cache layer can turn a 10 ms database query into a sub-millisecond hit. But moving compute closer to users can cut 100-150 ms of round-trip time, often the biggest win.
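As a sketch of how these costs compose, here is a minimal Python back-of-envelope calculator using the approximate figures from the tables below; the operation names are made up for illustration:

```python
# Back-of-envelope latency budget (all times in nanoseconds).
# The figures are the canonical approximations from the tables below;
# real hardware and networks will differ.
LATENCY_NS = {
    "main_memory_ref": 100,
    "ssd_random_read_4kb": 150_000,
    "hdd_seek": 10_000_000,
    "dc_round_trip": 500_000,
    "cross_continent_rtt": 150_000_000,
}

def budget_ms(ops):
    """Sum the latency of a sequence of named operations, in ms."""
    return sum(LATENCY_NS[op] for op in ops) / 1e6

# A request that does one in-DC hop plus three random SSD reads:
print(budget_ms(["dc_round_trip"] + ["ssd_random_read_4kb"] * 3))  # ~0.95 ms
```

Summing a request's component operations this way quickly shows which term dominates the budget.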

⚡ Quick Decision

Optimize CPU/Memory When:

  • Latency under 1 ms required
  • High-frequency operations
  • Cache hit optimization needed

Optimize Storage When:

  • Database queries slow (>10 ms)
  • Need SSD over HDD (20x faster)
  • I/O-bound workloads

Optimize Network When:

  • Cross-region latency >50 ms
  • Global user distribution
  • Large payload transfers

💡 For implementation guides and code examples, see our technology deep dives: Observability, Redis, PostgreSQL

Latency Numbers Every Programmer Should Know

Key performance numbers across CPU, storage, and network operations.

CPU / Memory

L1 Cache Reference: 0.5 ns (CPU-local; near zero)
Branch Mispredict: 5 ns (pipeline flush)
L2 Cache Reference: 7 ns (shared per core cluster)
Mutex Lock/Unlock: 25 ns (uncontended fast path)
Main Memory Reference: 100 ns (NUMA/DDR; locality matters)
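The uncontended mutex fast path can be sanity-checked with a rough Python timing loop. Interpreter overhead dominates in Python, so expect tens to hundreds of nanoseconds per pair rather than the ~25 ns a bare mutex costs; the point is the measurement technique, not the absolute number:

```python
import threading
import time

# Time many uncontended acquire/release pairs and report the average.
lock = threading.Lock()
N = 200_000

start = time.perf_counter_ns()
for _ in range(N):
    lock.acquire()
    lock.release()
per_pair_ns = (time.perf_counter_ns() - start) / N

print(f"uncontended acquire/release: {per_pair_ns:.0f} ns")
```

Averaging over a large N is what makes sub-microsecond costs measurable at all; a single acquire is far below the timer's useful resolution.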

Storage

Compress 1 KB with Zippy: 3 μs (CPU-bound; varies with level)
Read 4 KB from SSD: 150 μs (random I/O)
Read 1 MB from Memory: 250 μs (sequential memcpy)
Read 1 MB from SSD: 1 ms (NVMe; queue depth matters)
HDD Disk Seek: 10 ms (rotational latency)
Read 1 MB from HDD: 20 ms (sustained read)
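The sequential-vs-random gap can be observed with a quick Python sketch against a temp file. The OS page cache will make both reads far faster than the raw-device numbers above, but the access-pattern gap usually still shows:

```python
import os
import tempfile
import time

MB = 1 << 20

# Create an 8 MB scratch file to read from.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(os.urandom(8 * MB))
    path = f.name

with open(path, "rb") as f:
    t0 = time.perf_counter()
    f.read(MB)                           # one sequential 1 MB read
    seq = time.perf_counter() - t0

    t0 = time.perf_counter()
    for i in range(256):                 # 256 * 4 KB = 1 MB, scattered
        f.seek((i * 31 % (8 * 256)) * 4096)
        f.read(4096)
    rnd = time.perf_counter() - t0

os.unlink(path)
print(f"sequential: {seq * 1e3:.2f} ms  random: {rnd * 1e3:.2f} ms")
```

For numbers closer to the raw device, the file would need to be larger than RAM or the cache dropped between runs.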

Network

Send 1 KB over 1 Gbps: 10 μs (serialization only)
Round trip in datacenter: 500 μs (same DC, multiple hops)
Packet California → Netherlands: 150 ms (approaching the speed-of-light floor)
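Serialization delay and round-trip time add, which a small Python helper (the name `transfer_ms` is made up for illustration) makes concrete using the table's figures:

```python
# Total transfer time = serialization (size / bandwidth) + propagation (RTT).
# Bandwidth is in decimal gigabits per second.
def transfer_ms(size_bytes, bandwidth_gbps, rtt_ms):
    serialization_ms = size_bytes * 8 / (bandwidth_gbps * 1e9) * 1e3
    return serialization_ms + rtt_ms

# 1 KB over 1 Gbps inside a DC: ~0.008 ms serialization + 0.5 ms RTT
print(round(transfer_ms(1024, 1.0, 0.5), 3))      # 0.508
# 1 MB across continents: ~8.4 ms serialization + 150 ms RTT
print(round(transfer_ms(1 << 20, 1.0, 150.0), 1))  # 158.4
```

Note how RTT, not bandwidth, dominates small transfers: for the 1 KB case the serialization term is under 2% of the total.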

Network Round-Trip Time Comparison

Understanding the massive scale differences in network latency.

Intra-DC RTT: 0.5 ms
Cross-Region RTT: 80 ms
Intercontinental RTT: 150 ms
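A corollary of these RTTs: dependent round trips multiply, so chatty protocols that are fine inside a datacenter become ruinous across an ocean. A minimal Python sketch, assuming the work can be batched into a single request:

```python
# N dependent (sequential) round trips cost N * RTT.
def sequential_ms(calls, rtt_ms):
    return calls * rtt_ms

# Batching pays the RTT once for the whole batch.
def batched_ms(rtt_ms):
    return rtt_ms

print(sequential_ms(20, 0.5))  # 20 chatty calls in-DC: 10 ms
print(sequential_ms(20, 150))  # the same calls intercontinentally: 3000 ms
print(batched_ms(150))         # one batched intercontinental request: 150 ms
```

This is why the same API design can feel instant locally and take seconds once a region boundary is crossed.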

🧠 Latency Memory Palace

Visual mnemonics to remember the most critical latency numbers. These stick better than raw microseconds.

L1 Cache: 0.5 ns = lightning speed
🧠 Memory: 100 ns = brain synapse
💿 SSD Read: 150 μs = CD track skip
🌍 Network: 150 ms = blink of an eye

🎯 Optimization Priority Matrix

Where to focus your optimization efforts based on current performance and potential gains.

🔥 High Impact Optimizations

Add CDN: Save 50-150ms on global requests
Cache Layer: Turn 10ms DB calls into 1ms cache hits
HDD → SSD: 20x improvement (10ms → 0.5ms)
Connection Pooling: Eliminate connection overhead

⚠️ Micro-Optimizations (Later)

CPU Cache: Only for high-frequency hot paths
Memory Layout: Premature unless measured bottleneck
Compression: When the bandwidth saved outweighs the CPU cost
Algorithm Tweaks: Profile first, optimize second
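"Profile first" in practice: a minimal cProfile sketch (function names are illustrative) that shows where time actually goes before any micro-optimization:

```python
import cProfile
import io
import pstats

def hot():
    # The actual bottleneck: dominates the runtime.
    return sum(i * i for i in range(100_000))

def cold():
    # Negligible cost; optimizing this first would be wasted effort.
    return len("metadata")

def handler():
    cold()
    return hot()

prof = cProfile.Profile()
prof.enable()
handler()
prof.disable()

out = io.StringIO()
pstats.Stats(prof, stream=out).sort_stats("cumulative").print_stats(5)
print(out.getvalue())  # the report attributes nearly all time to hot()
```

Only after a profile names the hot path does cache layout or branch behavior become worth thinking about.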

📏 Scale Perspective

Understanding the massive scale differences helps prioritize optimization efforts.

1,000,000x
Memory vs Network
100ns → 100ms
20x
SSD vs HDD
0.5ms → 10ms
300x
Same-DC vs Global
0.5ms → 150ms
💡 Insight: The biggest performance gains come from architectural changes (adding cache layers, CDNs) rather than micro-optimizations (CPU cache efficiency). Chase the orders-of-magnitude improvements first.
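These ratios can be checked directly from the raw numbers; a small Python sanity check:

```python
# Express everything in nanoseconds and divide.
NS = 1
US = 1_000 * NS
MS = 1_000_000 * NS

print((100 * MS) // (100 * NS))  # memory ref vs cross-ocean network: 1,000,000
print((10 * MS) // (500 * US))   # HDD seek vs SSD access: 20
print((150 * MS) // (500 * US))  # global RTT vs same-DC RTT: 300
```

Working in a single unit and dividing is the easiest way to avoid the off-by-1000x slips that μs/ms confusion invites.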
These numbers are approximations and vary significantly with hardware, network conditions, and workload patterns. Use them for rough estimates and architectural decisions, but always measure your specific use case.