Communication Patterns


Master the patterns and protocols that enable services to communicate effectively in distributed systems. Learn when to use synchronous vs asynchronous communication and how to handle failures gracefully.

Types of Service Communication

How services communicate fundamentally shapes your system's performance, reliability, and complexity. Each approach has distinct trade-offs that affect user experience and operational requirements.

Communication Design Principles

  • Loose Coupling: Services should depend on contracts, not implementations
  • Fault Tolerance: Assume network calls will fail and design accordingly
  • Performance: Choose patterns that match your latency and throughput requirements
  • Consistency: Decide between strong consistency and availability based on business needs
1. Synchronous (Request-Response)

Direct communication where sender waits for response

Latency: 10-100 ms
Complexity: Low
Reliability: Medium
Protocols: HTTP/REST, gRPC, GraphQL
Best Use Cases
  • User-facing APIs
  • Real-time queries
  • Simple CRUD operations
Common Challenges
  • Blocking calls
  • Cascading failures
  • Tight coupling
2. Asynchronous (Fire-and-Forget)

Sender doesn't wait for response, continues processing

Latency: 1-10 ms (sender side, time to enqueue)
Complexity: Medium
Reliability: High
Protocols: Message Queues, Event Streams, Webhooks
Best Use Cases
  • Background processing
  • Event notifications
  • Data pipelines
Common Challenges
  • Message ordering
  • Duplicate handling
  • Dead letter queues
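As a minimal sketch of fire-and-forget, the example below uses Python's standard `queue.Queue` as a stand-in for a real message broker. The names `send_async` and `start_worker` are hypothetical helpers for illustration, and the job names are made up. The point is only that the sender enqueues and returns without waiting for a result.

```python
import queue
import threading
from typing import List


def start_worker(jobs: queue.Queue, results: List[str]) -> threading.Thread:
    """Consume jobs in the background until a None sentinel arrives."""
    def run() -> None:
        while True:
            job = jobs.get()
            if job is None:  # sentinel: stop the worker
                break
            results.append(f"processed {job}")

    worker = threading.Thread(target=run, daemon=True)
    worker.start()
    return worker


def send_async(jobs: queue.Queue, job: str) -> None:
    """Fire-and-forget: enqueue the job and return without waiting."""
    jobs.put(job)


jobs: queue.Queue = queue.Queue()
results: List[str] = []
worker = start_worker(jobs, results)

for name in ("resize-image", "send-email"):
    send_async(jobs, name)  # returns as soon as the job is enqueued

jobs.put(None)   # ask the worker to stop
worker.join()    # only so this demo can inspect the results
```

In a real system the queue would be durable and the worker would live in another process, which is where the ordering, duplicate, and dead-letter challenges listed above come from.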
3. Event-Driven Architecture

Services communicate through domain events and state changes

Latency: 1-50 ms
Complexity: High
Reliability: Very High
Protocols: Apache Kafka, AWS EventBridge, RabbitMQ
Best Use Cases
  • Microservices coordination
  • Real-time analytics
  • CQRS/Event Sourcing
Common Challenges
  • Event schema evolution
  • Eventual consistency
  • Complex debugging

Performance Characteristics

Different communication patterns have vastly different performance characteristics. Choose based on your system's latency, throughput, and reliability requirements.

Response Time: 100 ms (sync HTTP) vs 5 ms (async queue)
Throughput: 1,000 req/sec (sync HTTP) vs 50,000 req/sec (async queue)
Fault Tolerance: 60% (sync HTTP) vs 95% (async queue)

Synchronous (HTTP/gRPC)

Latency: Medium (network round trip + processing)
Throughput: Limited (blocking calls reduce concurrency)
Debugging: Easy (request/response correlation)

Asynchronous (Queues)

Latency: Low (non-blocking, immediate return)
Throughput: High (parallel processing, buffering)
Debugging: Hard (eventual consistency, correlation)

Event-Driven

Latency: Variable (depends on event processing)
Throughput: Very High (stream processing, parallelism)
Debugging: Very Hard (complex event flows, timing)

Common Messaging Patterns

These patterns provide proven solutions for common distributed system communication challenges. Each pattern addresses specific reliability, scalability, and consistency requirements.

1. Point-to-Point (Queue)

A producer sends messages through a queue, and each message is consumed by exactly one consumer

Example Flow
Order processing: Order Service → Payment Queue → Payment Service
Guarantees
Each message is handled by a single consumer; delivery is typically at-least-once, so consumers should tolerate redelivery
Pros
  • Simple model
  • Load balancing
  • Guaranteed processing
Cons
  • Single consumer bottleneck
  • No broadcast capability
Best for: Background jobs, task processing
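The "Load balancing" advantage above usually comes from running several workers against the same queue, so that each message goes to whichever worker is free. The sketch below illustrates this with an in-process `queue.Queue`. The worker names and order IDs are hypothetical, and a real broker would add acknowledgements and redelivery on top.

```python
import queue
import threading
from typing import List, Tuple


def consume(name: str, q: queue.Queue,
            handled: List[Tuple[str, str]], lock: threading.Lock) -> None:
    """Pull messages until a None sentinel arrives; record who handled what."""
    while True:
        msg = q.get()
        if msg is None:  # sentinel: this worker is done
            return
        with lock:
            handled.append((msg, name))


payment_queue: queue.Queue = queue.Queue()
handled: List[Tuple[str, str]] = []
lock = threading.Lock()

# Three competing consumers share one queue.
workers = [
    threading.Thread(target=consume,
                     args=(f"payment-worker-{i}", payment_queue, handled, lock))
    for i in range(3)
]
for w in workers:
    w.start()

for order_id in range(10):
    payment_queue.put(f"order-{order_id}")

for _ in workers:          # one sentinel per worker
    payment_queue.put(None)
for w in workers:
    w.join()
```

Each order appears once in `handled`, but which worker picked it up varies from run to run.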
2. Publish-Subscribe (Topic)

One producer sends to multiple interested consumers

Example Flow
User registration: User Service → User Created Event → Email, Analytics, Billing
Guarantees
At-least-once delivery to all subscribers
Pros
  • Multiple consumers
  • Loose coupling
  • Easy to add new consumers
Cons
  • Potential duplicate delivery
  • Consumer management complexity
Best for: Event notifications, real-time updates
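A minimal in-memory sketch of the user registration flow above: one published event fans out to every subscriber of a topic. `InMemoryBroker` and the topic name `user.created` are assumptions made for illustration, not the API of any real broker.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Handler = Callable[[dict], None]


class InMemoryBroker:
    """Toy topic broker: every subscriber of a topic receives every event."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> int:
        """Deliver a copy of the event to each subscriber; return the count."""
        handlers = list(self._subscribers.get(topic, []))
        for handler in handlers:
            handler(dict(event))  # copy so subscribers cannot affect each other
        return len(handlers)


broker = InMemoryBroker()
received: Dict[str, List[dict]] = {"email": [], "analytics": [], "billing": []}
for service, inbox in received.items():
    broker.subscribe("user.created", inbox.append)

delivered = broker.publish("user.created", {"user_id": 42})
```

Adding a fourth consumer is one more `subscribe` call; the publisher does not change, which is the loose coupling the pattern is valued for.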
3. Request-Reply with Correlation

Asynchronous request with response correlation ID

Example Flow
Price calculation: Order Service → Pricing Request → Pricing Response
Guarantees
Response matched to original request
Pros
  • Non-blocking
  • Timeout handling
  • Parallel processing
Cons
  • Correlation complexity
  • Response handling
  • State management
Best for: Long-running computations, external API calls
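The sketch below shows one way correlation IDs can match replies to requests, using `asyncio` queues in place of a real broker. The `pricing_service` coroutine, the flat price of 10.0 per unit, and the message field names are all hypothetical.

```python
import asyncio
import uuid
from typing import Dict, List


async def pricing_service(requests: asyncio.Queue, replies: asyncio.Queue) -> None:
    """Hypothetical responder: copies the correlation ID onto its reply."""
    while True:
        msg = await requests.get()
        price = 10.0 * msg["quantity"]  # made-up pricing rule
        await replies.put({"correlation_id": msg["correlation_id"], "price": price})


async def reply_router(replies: asyncio.Queue, pending: Dict[str, asyncio.Future]) -> None:
    """Resolve the waiting future whose correlation ID matches each reply."""
    while True:
        msg = await replies.get()
        future = pending.pop(msg["correlation_id"], None)
        if future is not None and not future.done():
            future.set_result(msg["price"])


async def request_price(requests: asyncio.Queue, pending: Dict[str, asyncio.Future],
                        quantity: int, timeout: float = 1.0) -> float:
    correlation_id = str(uuid.uuid4())
    future = asyncio.get_running_loop().create_future()
    pending[correlation_id] = future
    await requests.put({"correlation_id": correlation_id, "quantity": quantity})
    try:
        return await asyncio.wait_for(future, timeout)
    finally:
        pending.pop(correlation_id, None)  # clean up on success or timeout


async def main() -> List[float]:
    requests: asyncio.Queue = asyncio.Queue()
    replies: asyncio.Queue = asyncio.Queue()
    pending: Dict[str, asyncio.Future] = {}
    background = [
        asyncio.create_task(pricing_service(requests, replies)),
        asyncio.create_task(reply_router(replies, pending)),
    ]
    try:
        # Three requests in flight at once; each gets its own reply.
        return await asyncio.gather(
            *(request_price(requests, pending, q) for q in (1, 2, 3))
        )
    finally:
        for task in background:
            task.cancel()


prices = asyncio.run(main())
```

The `pending` dictionary is the state management cost listed above: it has to be cleaned up on timeout, and in a multi-instance service it has to live wherever the reply will arrive.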
4. Saga Pattern

Coordinate long-running transactions across services

Example Flow
Booking flow: Reserve Hotel → Reserve Flight → Charge Card (with compensations)
Guarantees
Either all steps complete, or the steps that already completed are compensated
Pros
  • Distributed transactions
  • Failure recovery
  • Business process modeling
Cons
  • Complex implementation
  • Compensation logic
  • Debugging difficulty
Best for: Multi-service transactions, business workflows
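As a rough sketch of an orchestrated saga for the booking flow above, the function below runs steps in order and, on failure, runs the compensations of the completed steps in reverse. The step names and the simulated card failure are assumptions for illustration. A production saga would also persist its progress so it can resume after a crash.

```python
from typing import Callable, List, Tuple

# (name, action, compensation)
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_saga(steps: List[Step], log: List[str]) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    completed: List[Step] = []
    for step in steps:
        name, action, _ = step
        try:
            action()
        except Exception:
            log.append(f"failed:{name}")
            for done_name, _, compensate in reversed(completed):
                compensate()
                log.append(f"compensated:{done_name}")
            return False
        log.append(f"done:{name}")
        completed.append(step)
    return True


def charge_card() -> None:
    raise RuntimeError("card declined")  # simulated failure for the demo


def noop() -> None:
    pass  # placeholder for a real reservation or cancellation call


log: List[str] = []
ok = run_saga(
    [
        ("reserve_hotel", noop, noop),
        ("reserve_flight", noop, noop),
        ("charge_card", charge_card, noop),
    ],
    log,
)
```

Here the card charge fails, so the flight and then the hotel reservation are compensated, and `run_saga` returns `False`.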

Real-World Implementations

Learn how major tech companies implement communication patterns at scale. These examples show practical applications and lessons learned from production systems.

1. Netflix

1M+ events/second
Video Streaming Pipeline

Video upload triggers encoding, thumbnail generation, metadata extraction, CDN distribution

Pattern: Event-driven with Kafka
Tech: Apache Kafka, AWS SQS, Hystrix Circuit Breaker
Key Lessons
  • Event schemas matter
  • Dead letter queues essential
  • Monitor everything
2. Uber

100K+ rides/minute
Ride Matching System

Real-time location updates (async) + ride matching logic (sync) + payment processing (async)

Pattern: Hybrid sync/async
Tech: Apache Kafka, Redis Streams, gRPC
Key Lessons
  • Latency vs consistency trade-offs
  • Circuit breakers prevent cascades
  • Idempotency is crucial
3. Slack

10M+ concurrent connections
Message Delivery

Real-time message delivery with guaranteed ordering and offline support

Pattern: WebSocket + Message Queues
Tech: WebSocket, Apache Pulsar, Redis
Key Lessons
  • Connection management is hard
  • Message ordering matters
  • Graceful degradation
4. Shopify

1M+ orders/day
Order Processing

Inventory check → Payment → Fulfillment → Shipping with full audit trail

Pattern: Saga with Event Sourcing
Tech: EventStore, RabbitMQ, GraphQL
Key Lessons
  • Event sourcing enables replay
  • Compensation is complex
  • Testing sagas is critical

Error Handling & Resilience Patterns

Failure Handling

Circuit Breaker

Prevent cascading failures by failing fast when downstream services are unhealthy

Example: After 5 failures in 30s, open circuit for 60s
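A simplified sketch of the rule above (5 failures within 30 s opens the circuit for 60 s). The `CircuitBreaker` class and its parameter names are hypothetical, and the half-open state found in most production breakers is reduced here to "close again once the cool-down has elapsed". The clock is injectable so the demo does not have to wait.

```python
import time
from collections import deque
from typing import Callable, Deque, Optional, TypeVar

T = TypeVar("T")


class CircuitOpenError(Exception):
    """Raised instead of calling downstream while the circuit is open."""


class CircuitBreaker:
    def __init__(self, max_failures: int = 5, window_s: float = 30.0,
                 open_s: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures
        self.window_s = window_s
        self.open_s = open_s
        self.clock = clock
        self._failures: Deque[float] = deque()
        self._opened_at: Optional[float] = None

    def call(self, fn: Callable[[], T]) -> T:
        now = self.clock()
        if self._opened_at is not None:
            if now - self._opened_at < self.open_s:
                raise CircuitOpenError("failing fast")
            self._opened_at = None   # cool-down elapsed: allow calls again
            self._failures.clear()
        try:
            return fn()
        except Exception:
            self._record_failure(now)
            raise

    def _record_failure(self, now: float) -> None:
        self._failures.append(now)
        # Drop failures that fell out of the sliding window.
        while now - self._failures[0] > self.window_s:
            self._failures.popleft()
        if len(self._failures) >= self.max_failures:
            self._opened_at = now


def flaky() -> str:
    raise ConnectionError("downstream unavailable")


now = [0.0]  # fake clock, advanced manually
breaker = CircuitBreaker(clock=lambda: now[0])
outcomes = []
for _ in range(6):
    try:
        breaker.call(flaky)
    except CircuitOpenError:
        outcomes.append("rejected")
    except ConnectionError:
        outcomes.append("failed")
    now[0] += 1.0
```

The first five calls reach the failing service; the sixth is rejected immediately without touching it, which is what stops a slow dependency from tying up callers.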

Retry with Backoff

Automatically retry failed requests with increasing delays

Example: Retry after 1s, 2s, 4s, 8s, then give up
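The 1 s, 2 s, 4 s, 8 s schedule above can be sketched as a small helper. `retry_with_backoff` is a hypothetical name, and the `sleep` parameter is injectable only so the demo runs instantly. Real implementations often add random jitter to the delays and retry only on errors known to be transient.

```python
import time
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def retry_with_backoff(fn: Callable[[], T],
                       delays: Sequence[float] = (1, 2, 4, 8),
                       sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn; after each failure wait the next delay, then try again."""
    for delay in delays:
        try:
            return fn()
        except Exception:
            sleep(delay)
    return fn()  # final attempt: give up by letting the exception propagate


attempts = {"count": 0}


def sometimes_fails() -> str:
    """Demo callable that fails twice, then succeeds."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise TimeoutError("temporary failure")
    return "ok"


slept: List[float] = []
result = retry_with_backoff(sometimes_fails, sleep=slept.append)
```

With the default schedule this makes at most five attempts: the initial call plus one retry after each delay.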

Dead Letter Queue

Route messages that can't be processed to a separate queue for investigation

Example: After 3 failed processing attempts, move to DLQ
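A minimal sketch of the "3 attempts, then DLQ" rule, with plain Python lists standing in for the main queue and the dead letter queue. `process_with_dlq` and `parse_payload` are hypothetical names; most brokers track the attempt count for you through a redelivery counter or a redrive policy.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Tuple


def process_with_dlq(messages: List[str], handler: Callable[[str], None],
                     max_attempts: int = 3) -> Tuple[List[str], List[Dict]]:
    """Retry each failing message up to max_attempts, then dead-letter it."""
    pending: Deque[Tuple[str, int]] = deque((m, 0) for m in messages)
    processed: List[str] = []
    dead_letters: List[Dict] = []
    while pending:
        message, attempts = pending.popleft()
        try:
            handler(message)
            processed.append(message)
        except Exception as exc:
            attempts += 1
            if attempts >= max_attempts:
                # Keep the error alongside the message to help investigation.
                dead_letters.append(
                    {"message": message, "attempts": attempts, "error": str(exc)}
                )
            else:
                pending.append((message, attempts))  # requeue for another try
    return processed, dead_letters


def parse_payload(message: str) -> None:
    """Demo handler that can never process the 'corrupt' message."""
    if message == "corrupt":
        raise ValueError("cannot parse payload")


processed, dead_letters = process_with_dlq(["a", "corrupt", "b"], parse_payload)
```

The poison message ends up in `dead_letters` after its third failure instead of blocking the healthy messages behind it.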

Data Consistency

Idempotency

Ensure operations can be safely retried without causing additional side effects

Example: Use unique request IDs to prevent duplicate processing
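One way to use request IDs for deduplication is to remember the result of each request and return it again on a retry. The `PaymentProcessor` class below is a hypothetical in-memory sketch. A real service would keep the ID-to-result mapping in a durable store, usually with an expiry.

```python
from typing import Dict


class PaymentProcessor:
    """Remembers results by request ID so a retry does not charge twice."""

    def __init__(self) -> None:
        self._results: Dict[str, dict] = {}
        self.charges_made = 0  # counts real side effects, for the demo

    def charge(self, request_id: str, amount_cents: int) -> dict:
        if request_id in self._results:
            # Duplicate request: return the stored result, do nothing else.
            return self._results[request_id]
        self.charges_made += 1  # the side effect happens only once per ID
        result = {
            "request_id": request_id,
            "status": "charged",
            "amount_cents": amount_cents,
        }
        self._results[request_id] = result
        return result


processor = PaymentProcessor()
first = processor.charge("req-123", 2500)
retry = processor.charge("req-123", 2500)  # e.g. the client timed out and retried
```

Both calls return the same result, but only one charge is made.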

Outbox Pattern

Ensure database updates and message publishing happen atomically

Example: Write the message to an outbox table in the same transaction as the data change; a separate process publishes it
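The sketch below illustrates the outbox idea with an in-memory SQLite database. The `orders` and `outbox` table layouts, the `order.created` topic, and the `publish` callback are assumptions made for the example. The key point is that the business row and the outbox row are written in one transaction, and a separate relay step publishes afterwards.

```python
import json
import sqlite3
from typing import Callable, List, Tuple

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, total_cents INTEGER NOT NULL);
    CREATE TABLE outbox (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        topic TEXT NOT NULL,
        payload TEXT NOT NULL,
        published INTEGER NOT NULL DEFAULT 0
    );
""")


def place_order(conn: sqlite3.Connection, order_id: int, total_cents: int) -> None:
    """Write the order and its event in one transaction: both or neither."""
    with conn:  # commits on success, rolls back on exception
        conn.execute("INSERT INTO orders (id, total_cents) VALUES (?, ?)",
                     (order_id, total_cents))
        conn.execute("INSERT INTO outbox (topic, payload) VALUES (?, ?)",
                     ("order.created", json.dumps({"order_id": order_id})))


def relay_outbox(conn: sqlite3.Connection,
                 publish: Callable[[str, dict], None]) -> int:
    """Publish unpublished outbox rows in order, marking each one as sent."""
    rows = conn.execute(
        "SELECT id, topic, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    for row_id, topic, payload in rows:
        publish(topic, json.loads(payload))
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    return len(rows)


published: List[Tuple[str, dict]] = []
place_order(conn, 1, 2500)
sent = relay_outbox(conn, lambda topic, payload: published.append((topic, payload)))
```

Because the relay publishes before it marks a row as sent, a crash between the two steps leads to the message being published again on restart. That gives at-least-once delivery, so consumers still need to be idempotent.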

Eventual Consistency

Accept temporary inconsistency for better availability and performance

Example: User sees their own posts immediately, others see eventually

Communication Technology Comparison

Technology   | Type  | Latency  | Throughput | Ordering      | Best For
HTTP/REST    | Sync  | Medium   | Medium     | N/A           | User-facing APIs, CRUD operations
gRPC         | Sync  | Low      | High       | N/A           | Service-to-service, high performance
Apache Kafka | Async | Low      | Very High  | Per partition | Event streaming, real-time analytics
RabbitMQ     | Async | Medium   | Medium     | Optional      | Task queues, workflow orchestration
WebSocket    | Async | Very Low | Medium     | Strong        | Real-time updates, gaming, chat

Communication Pattern Decision Guide

Use Synchronous When:

  • User is waiting for response (interactive)
  • Strong consistency required
  • Simple request-response pattern
  • Error handling needs to be immediate
  • Low latency is critical

Use Asynchronous When:

  • Background processing acceptable
  • High throughput required
  • Loose coupling preferred
  • Fault tolerance is critical
  • Multiple consumers needed

🎯 Communication Pattern Successes and Failures

Learn from real-world communication pattern decisions and their impact on system reliability

Scenarios

  • WhatsApp Message Delivery Reliability: How WhatsApp ensures message delivery across 2+ billion users
  • Twitter Tweet Fanout Architecture: Twitter's evolution from pull to push-based tweet delivery
  • Netflix Video Streaming Coordination: Netflix's event-driven architecture for video processing pipeline
  • Zoom Video Conferencing During COVID: How Zoom scaled from 10M to 300M daily participants
  • Discord Server and Voice Chat Scale: Discord's real-time communication architecture for gaming communities
  • Airbnb Booking Saga Implementation: Airbnb's distributed transaction handling for booking flow

Context

How WhatsApp ensures message delivery across 2+ billion users

Metrics

Daily Messages: 100+ billion messages
Delivery Success: 99.9% success rate
Pattern Used: Store-and-forward queuing
Offline Support: 30-day message retention

Outcome

Asynchronous message queuing with persistent storage enables reliable delivery even when recipients are offline. Messages are stored until they are delivered successfully.

Key Lessons

  • Store-and-forward queuing essential for mobile messaging reliability
  • Message acknowledgments at multiple levels (sent, delivered, read)
  • Offline-first design: assume network is unreliable
  • End-to-end encryption must work with asynchronous delivery patterns

📝 Communication Patterns Quiz

When would you choose asynchronous communication over synchronous for a user-facing feature?