System Designer - Learn System Design & Software Architecture

What is Amazon Kinesis?

Amazon Kinesis is a platform for streaming data on AWS, offering powerful services to make it easy to collect, process, and analyze real-time, streaming data so you can get timely insights and react quickly to new information. Kinesis enables you to ingest real-time data such as video, audio, application logs, website clickstreams, and IoT telemetry data for machine learning, analytics, and other applications.

With Kinesis, you can process and analyze data as it arrives and respond immediately, rather than waiting until all of the data has been collected. It's used by Netflix for real-time video analytics, by Spotify for music recommendation engines, and by Tesla for processing millions of vehicle telemetry events.

Kinesis Performance & Cost Calculator

Example inputs:

• Records per Second: 1,000
• Average Record Size: 1 KB
• Number of Shards: 2
• Data Retention: 24 hours

Estimated results:

• Monthly Cost: $23.5
• Throughput: 0.98 MB/s
• Shard Utilization: 50%
• Daily Data: 82.4 GB
• Shards Needed: 1 (recommended)
• Storage Cost: $1.9/month
• Total Storage: 82.4 GB
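
The throughput, daily volume, utilization, and shard figures above follow from simple arithmetic. The sketch below reproduces them under two assumptions: a shard accepts up to 1,000 records/sec or 1 MB/sec of writes, and sizes are converted with 1,024 KB per MB. The function name `estimate_capacity` is illustrative. Cost is left out because pricing varies by region and capacity mode.

```python
import math

# Assumed per-shard write limits for Kinesis Data Streams (provisioned mode).
SHARD_RECORDS_PER_SEC = 1_000
SHARD_MB_PER_SEC = 1.0


def estimate_capacity(records_per_sec: float, record_kb: float, shards: int) -> dict:
    """Estimate throughput, daily volume, and shard utilization for a stream."""
    throughput_mb = records_per_sec * record_kb / 1024  # KB/s -> MB/s
    daily_gb = throughput_mb * 86_400 / 1024            # MB/day -> GB/day
    # A shard is limited by whichever ceiling is reached first:
    # record count or bytes written.
    load = max(records_per_sec / SHARD_RECORDS_PER_SEC,
               throughput_mb / SHARD_MB_PER_SEC)
    return {
        "throughput_mb_s": round(throughput_mb, 2),
        "daily_gb": round(daily_gb, 1),
        "utilization_pct": round(100 * load / shards),
        "shards_needed": max(1, math.ceil(load)),
    }


print(estimate_capacity(records_per_sec=1_000, record_kb=1, shards=2))
# {'throughput_mb_s': 0.98, 'daily_gb': 82.4, 'utilization_pct': 50, 'shards_needed': 1}
```

Note that one shard at exactly 1,000 records/sec would run at 100% of its record limit, so the "1 (recommended)" figure leaves no headroom for bursts.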

Kinesis Service Family

Kinesis Data Streams

Real-time data streaming with custom processing applications.

• Sub-second processing latency
• Up to 1,000 records/sec or 1 MB/sec of writes per shard
• Multiple consumer applications
• 24h to 365 days retention
• Enhanced fan-out capability

Kinesis Data Firehose

Fully managed delivery to data lakes and analytics services.

• Near real-time delivery (buffered; buffer interval historically 60s minimum)
• Automatic scaling
• Built-in data transformation
• Format conversion (Parquet/ORC)
• No infrastructure management

Kinesis Data Analytics

SQL-based real-time analytics on streaming data.

• Standard ANSI SQL queries
• Windowing and aggregation
• Pattern matching (MATCH_RECOGNIZE)
• Automatic scaling
• Built-in ML functions
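
To make "windowing and aggregation" concrete, here is a minimal Python sketch of a tumbling (fixed, non-overlapping) window count. It is a stand-in for what a streaming SQL query would express, not the service's own syntax; the event shape `(timestamp_seconds, key)` and the function name are assumptions for illustration.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_sec=60):
    """Count events per (window start, key) over fixed, non-overlapping windows."""
    counts = defaultdict(int)
    for ts, key in events:
        # Every timestamp falls into exactly one window, identified by its start.
        window_start = int(ts // window_sec) * window_sec
        counts[(window_start, key)] += 1
    return dict(counts)


events = [(0, "login"), (15, "click"), (59, "click"), (60, "click"), (125, "login")]
print(tumbling_window_counts(events))
# {(0, 'login'): 1, (0, 'click'): 2, (60, 'click'): 1, (120, 'login'): 1}
```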

Kinesis Video Streams

Secure video streaming for analytics and ML.

• Millions of device streams
• Rekognition integration
• WebRTC support
• Time-indexed storage
• Automatic durability

Real-World Kinesis Implementations

Netflix

Processes 500+ billion events daily for real-time personalization, A/B testing, and operational monitoring.

• 500+ billion events per day
• Real-time recommendation updates
• Video quality optimization
• Operational monitoring and alerting

Spotify

Uses Kinesis for a real-time music recommendation engine and user behavior analytics across 400M+ users.

• Real-time playlist generation
• Music discovery algorithms
• User behavior pattern analysis
• Content recommendation optimization

Tesla

Collects and processes millions of vehicle telemetry events for autonomous driving improvements.

• Vehicle sensor data streaming
• Real-time fleet monitoring
• Autonomous driving training data
• Predictive maintenance alerts

Capital One

Processes financial transactions in real-time for fraud detection and risk management.

• Real-time fraud detection
• Transaction monitoring
• Risk scoring algorithms
• Compliance reporting automation

Common Kinesis Architecture Patterns

Real-time Analytics Pipeline

Stream processing for immediate insights and automated actions.

Analytics Pipeline Flow

Data Sources → Kinesis Data Streams → Lambda/KCL Consumer → Real-time Dashboard
                                   ↓
                          Kinesis Analytics (SQL) → Alerts & Actions
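
The Lambda consumer step in this flow could look roughly like the sketch below. Lambda delivers Kinesis records in batches under `event["Records"]`, with each payload base64-encoded in `kinesis.data`. The JSON payload format and the shape of the returned list are assumptions made for illustration.

```python
import base64
import json


def handler(event, context=None):
    """Decode each Kinesis record in a Lambda batch and return the parsed payloads."""
    parsed = []
    for record in event["Records"]:
        kinesis = record["kinesis"]
        # Record data arrives base64-encoded; this sketch assumes a JSON body.
        payload = base64.b64decode(kinesis["data"])
        parsed.append({
            "partition_key": kinesis["partitionKey"],
            "sequence_number": kinesis["sequenceNumber"],
            "body": json.loads(payload),
        })
    return parsed


# Local usage with a hand-built event (values are made up).
sample = {"Records": [{"kinesis": {
    "data": base64.b64encode(json.dumps({"page": "/home"}).encode()).decode(),
    "partitionKey": "user-42",
    "sequenceNumber": "1",
}}]}
print(handler(sample))
```

In a real pipeline the handler would forward the parsed records to the dashboard's data store instead of returning them.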

Data Lake Ingestion

Continuous delivery of streaming data to data lakes with transformation.

Data Lake Architecture

Applications → Kinesis Data Firehose → S3 Data Lake (Parquet)
                      ↓                    ↓
                  Lambda Transform    Athena/Glue Analytics
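
The "Lambda Transform" box follows a fixed contract: Firehose passes a batch under `event["records"]`, and the function must return every `recordId` with a `result` of `Ok`, `Dropped`, or `ProcessingFailed`. The sketch below shows that contract; the DEBUG filter and the added `processed` field are hypothetical transformation rules.

```python
import base64
import json


def transform(event, context=None):
    """Firehose transformation: tag JSON records, drop DEBUG logs, flag bad input."""
    output = []
    for record in event["records"]:
        record_id = record["recordId"]
        try:
            doc = json.loads(base64.b64decode(record["data"]))
            if not isinstance(doc, dict):
                raise ValueError("expected a JSON object")
        except ValueError:
            # Undecodable or malformed input: return it untouched, marked as failed.
            output.append({"recordId": record_id, "result": "ProcessingFailed",
                           "data": record["data"]})
            continue
        if doc.get("level") == "DEBUG":  # hypothetical filter rule
            output.append({"recordId": record_id, "result": "Dropped",
                           "data": record["data"]})
            continue
        doc["processed"] = True  # hypothetical enrichment
        # A trailing newline keeps records separable once concatenated in S3.
        encoded = base64.b64encode((json.dumps(doc) + "\n").encode()).decode()
        output.append({"recordId": record_id, "result": "Ok", "data": encoded})
    return {"records": output}
```

Returning failed records with their original data lets Firehose route them to its error output instead of losing them silently.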

IoT Data Processing

Scalable ingestion and processing of IoT sensor data with anomaly detection.

IoT Devices → API Gateway → Lambda → Kinesis Data Streams
                                              ↓
                            Kinesis Analytics (Anomaly Detection)
                                              ↓
                                          SNS Alerts
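
To illustrate the anomaly detection stage, the sketch below flags sensor readings that deviate sharply from a rolling window. It uses a simple rolling z-score, which is a deliberately simpler technique than the Random Cut Forest function that Kinesis Analytics offers in SQL; the class name, window size, and threshold are assumptions.

```python
from collections import deque
from statistics import mean, pstdev


class RollingZScore:
    """Flag values that sit far from the mean of the most recent readings."""

    def __init__(self, window=20, threshold=3.0, min_samples=5):
        self.values = deque(maxlen=window)
        self.threshold = threshold
        self.min_samples = min_samples

    def is_anomaly(self, x: float) -> bool:
        anomalous = False
        if len(self.values) >= self.min_samples:
            mu = mean(self.values)
            sigma = pstdev(self.values)
            # A flat history (sigma == 0) gives no scale to judge against.
            anomalous = sigma > 0 and abs(x - mu) / sigma > self.threshold
        self.values.append(x)
        return anomalous


detector = RollingZScore()
readings = [20.0, 20.5, 19.5, 20.2, 19.8, 20.1, 45.0, 20.0]
print([detector.is_anomaly(r) for r in readings])
# [False, False, False, False, False, False, True, False]
```

In the architecture above, a `True` result is the point at which an SNS alert would be published.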

Kinesis Best Practices

✅ Do

• Use meaningful partition keys for even distribution
• Implement exponential backoff for retry logic
• Monitor shard-level metrics and hot shards
• Use Enhanced Fan-Out for multiple consumers
• Batch records when possible (PutRecords accepts up to 500 records per call)
• Set appropriate data retention periods
• Use dead letter queues for failed processing
• Implement idempotent record processing
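
Two of the practices above, batching and exponential backoff, combine naturally in a producer. The sketch below assumes a boto3-style Kinesis client whose `put_records` call returns `FailedRecordCount` plus one result per record in request order; the helper name `put_with_retry` and its defaults are illustrative. It does not enforce the per-request size limit, which a production producer would also need to respect.

```python
import random
import time

MAX_BATCH = 500  # PutRecords accepts at most 500 records per request


def put_with_retry(client, stream_name, records,
                   max_attempts=5, base_delay=0.1, sleep=time.sleep):
    """Send records in batches of up to 500, retrying only the ones that failed."""
    for start in range(0, len(records), MAX_BATCH):
        pending = records[start:start + MAX_BATCH]
        for attempt in range(max_attempts):
            response = client.put_records(StreamName=stream_name, Records=pending)
            if response["FailedRecordCount"] == 0:
                pending = []
                break
            # Results come back in request order; failed entries carry an ErrorCode.
            pending = [rec for rec, res in zip(pending, response["Records"])
                       if "ErrorCode" in res]
            if attempt < max_attempts - 1:
                # Exponential backoff with full jitter: wait up to base * 2^attempt.
                sleep(random.uniform(0, base_delay * 2 ** attempt))
        if pending:
            raise RuntimeError(
                f"{len(pending)} records still failing after {max_attempts} attempts")
```

Because a partially failed batch is resent, the same record can reach the stream more than once, which is why idempotent processing on the consumer side matters.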

❌ Don't

• Use low-cardinality or skewed partition keys (creates hot shards)
• Ignore ProvisionedThroughputExceededException errors
• Create too many small records (high overhead)
• Process records synchronously without parallelism
• Forget to handle duplicate record delivery
• Use Kinesis for small-scale data (under 100 MB/hour)
• Mix different data types in same stream
• Ignore consumer lag monitoring
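
Hot shards are easier to reason about once the routing rule is visible: Kinesis takes the MD5 hash of the partition key as a 128-bit integer and sends the record to the shard whose hash key range contains it. The sketch below assumes the shards split that range evenly, which holds for a freshly created stream but not necessarily after resharding; the function names are illustrative.

```python
import hashlib
from collections import Counter

HASH_SPACE = 2 ** 128  # MD5 produces a 128-bit value


def shard_for_key(partition_key: str, shard_count: int) -> int:
    """Map a partition key to a shard index, assuming evenly split hash ranges."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")
    return hash_key * shard_count // HASH_SPACE


def distribution(keys, shard_count: int) -> Counter:
    """Count how many records each shard would receive."""
    return Counter(shard_for_key(k, shard_count) for k in keys)


# Many distinct keys spread across all shards...
print(distribution((f"device-{i}" for i in range(10_000)), 4))
# ...while a single shared key sends every record to one shard.
print(distribution(("all-devices" for _ in range(10_000)), 4))
```

The second call is the hot shard case: throughput is capped at one shard's limits no matter how many shards the stream has.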

Shard Scaling Guidelines

Scale Up When

• WriteProvisionedThroughputExceeded
• Consistent throttling errors

Optimal Utilization

• 70-80% of capacity
• 700-800 records/sec per shard

Scale Down When

• Low utilization
• Under 200 records/sec per shard
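
These thresholds can be turned into a simple sizing rule. The sketch below looks only at the record rate, using the figures above (scale up past 80% of the 1,000 records/sec limit, scale down under 200 records/sec per shard, aim for 75%); a real policy would also watch throttling metrics such as WriteProvisionedThroughputExceeded and bytes written. The function name and return shape are assumptions.

```python
import math

SHARD_RECORDS_PER_SEC = 1_000  # per-shard write limit in records


def recommend_shards(records_per_sec: float, current_shards: int,
                     target_util: float = 0.75, scale_down_below: float = 200):
    """Return (action, shard_count) based on the per-shard record rate."""
    per_shard = records_per_sec / current_shards
    # Shard count that would put utilization at the target level.
    target = max(1, math.ceil(records_per_sec / (SHARD_RECORDS_PER_SEC * target_util)))
    if per_shard > SHARD_RECORDS_PER_SEC * 0.8:
        return "scale up", target
    if per_shard < scale_down_below and target < current_shards:
        return "scale down", target
    return "hold", current_shards


print(recommend_shards(1_800, 2))  # ('scale up', 3)
print(recommend_shards(300, 4))    # ('scale down', 1)
print(recommend_shards(1_500, 2))  # ('hold', 2)
```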
