Design a Real-time Fraud Detection System
Build a machine learning system that detects fraudulent transactions in real time, keeps false positives low, and scales to more than 100K transactions per second.
System Requirements
Functional Requirements
- Real-time transaction scoring (< 100ms)
- Batch fraud model training and updates
- Multi-layered fraud rules and ML models
- Alert generation and case management
- Historical fraud pattern analysis
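The multi-layered requirement can be read as "cheap deterministic rules first, ML model second". The following is a minimal sketch of that flow, not a reference design. The rule names, the thresholds, and `toy_model` are all hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Transaction = Dict[str, float]


@dataclass
class Rule:
    name: str
    predicate: Callable[[Transaction], bool]  # True means the rule fires
    action: str  # "block" or "review"


def score_transaction(
    txn: Transaction,
    rules: List[Rule],
    model: Callable[[Transaction], float],
    block_threshold: float = 0.9,   # hypothetical threshold
    review_threshold: float = 0.5,  # hypothetical threshold
) -> Dict[str, object]:
    """Run the rule layer first; call the model only if no rule blocks."""
    fired = [r for r in rules if r.predicate(txn)]
    fired_names = [r.name for r in fired]

    # Layer 1: a blocking rule short-circuits and skips the model entirely.
    if any(r.action == "block" for r in fired):
        return {"decision": "block", "score": None, "fired_rules": fired_names}

    # Layer 2: the model returns a fraud probability in [0, 1].
    score: Optional[float] = model(txn)
    if score >= block_threshold:
        decision = "block"
    elif score >= review_threshold or fired:
        # A fired "review" rule escalates even when the model score is low.
        decision = "review"
    else:
        decision = "approve"
    return {"decision": decision, "score": score, "fired_rules": fired_names}


# Hypothetical rules and a toy model, for illustration only.
RULES = [
    Rule("amount_over_limit", lambda t: t["amount"] > 10_000, "block"),
    Rule("new_device", lambda t: t["new_device"] == 1.0, "review"),
]


def toy_model(txn: Transaction) -> float:
    # Stand-in for a trained classifier: larger amounts look riskier.
    return min(1.0, txn["amount"] / 5_000)


if __name__ == "__main__":
    print(score_transaction({"amount": 100.0, "new_device": 1.0}, RULES, toy_model))
```

Short-circuiting on blocking rules keeps the expensive model call off the path for the most obvious cases, which helps with the sub-100ms budget.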
Non-Functional Requirements
- Process 100K+ transactions per second
- Sub-100ms latency for scoring decisions
- 99.9% uptime for real-time scoring
- False positive rate < 1%
- Fraud detection rate > 95%
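The last two targets are easiest to pin down in terms of confusion-matrix counts: detection rate is recall on fraud, and false positive rate is the share of legitimate transactions that get flagged. A small sketch follows. The function names are hypothetical and the counts in the example are made up to illustrate the arithmetic.

```python
from typing import Tuple


def fraud_metrics(tp: int, fp: int, tn: int, fn: int) -> Tuple[float, float]:
    """Return (detection_rate, false_positive_rate) from confusion counts."""
    detection_rate = tp / (tp + fn)        # flagged fraud / all fraud
    false_positive_rate = fp / (fp + tn)   # flagged legitimate / all legitimate
    return detection_rate, false_positive_rate


def meets_targets(
    tp: int, fp: int, tn: int, fn: int,
    min_detection: float = 0.95,  # target from the requirements list
    max_fpr: float = 0.01,        # target from the requirements list
) -> bool:
    detection_rate, fpr = fraud_metrics(tp, fp, tn, fn)
    return detection_rate > min_detection and fpr < max_fpr


# Made-up counts: 1,000,000 transactions, 10,000 of them fraudulent.
EXAMPLE = dict(tp=9_700, fn=300, fp=7_920, tn=982_080)

if __name__ == "__main__":
    print(fraud_metrics(**EXAMPLE), meets_targets(**EXAMPLE))
```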
Capacity Estimation
Transaction Volume & Response Times
- Transaction mix: 99% legitimate, 1% fraud
- Transaction throughput: 120K TPS at peak hour, 50K TPS in an average hour
- Response times: 10 ms rule engine, 80 ms ML models
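As a quick sanity check, the per-second rates can be converted to daily volumes. The sketch below is plain arithmetic on the figures quoted in this section; the only assumption is that a rate holds for a full 24 hours. It shows that the 50K TPS average corresponds to roughly 4.3B transactions per day, while the 10B+ daily figure listed under Performance Metrics matches the 120K TPS peak rate sustained all day.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def daily_volume(tps: int) -> int:
    """Transactions per day if the given rate is sustained for 24 hours."""
    return tps * SECONDS_PER_DAY


AVERAGE_DAILY = daily_volume(50_000)    # 4,320,000,000
PEAK_DAILY = daily_volume(120_000)      # 10,368,000,000
FRAUD_PER_DAY = AVERAGE_DAILY // 100    # 1% fraud share: 43,200,000

if __name__ == "__main__":
    print(f"average-rate day: {AVERAGE_DAILY:,}")
    print(f"peak-rate day:    {PEAK_DAILY:,}")
    print(f"fraud per day:    {FRAUD_PER_DAY:,}")
```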
Performance Metrics
- Daily transactions: 10B+ (peak: 120K TPS)
- Fraud detection rate: 97% (ML + rules combined)
- False positive rate: 0.8% (minimized for UX)
- Response time P99: 85 ms (real-time scoring)
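Because fraud is only 1% of traffic, even a 0.8% false positive rate produces a large share of false alerts. The sketch below applies Bayes' rule to the figures above (1% fraud share, 97% detection rate, 0.8% false positive rate) to estimate alert precision and alert volume at peak. The function names are hypothetical.

```python
def flagged_fraction(fraud_share: float, detection_rate: float, fpr: float) -> float:
    """Share of all transactions that get flagged (true plus false alerts)."""
    true_alerts = fraud_share * detection_rate
    false_alerts = (1.0 - fraud_share) * fpr
    return true_alerts + false_alerts


def alert_precision(fraud_share: float, detection_rate: float, fpr: float) -> float:
    """Probability that a flagged transaction is actually fraud."""
    return (fraud_share * detection_rate) / flagged_fraction(
        fraud_share, detection_rate, fpr
    )


PRECISION = alert_precision(0.01, 0.97, 0.008)                        # about 0.55
ALERTS_PER_SECOND_AT_PEAK = 120_000 * flagged_fraction(0.01, 0.97, 0.008)  # about 2,114

if __name__ == "__main__":
    print(f"precision: {PRECISION:.3f}")
    print(f"alerts/s at 120K TPS: {ALERTS_PER_SECOND_AT_PEAK:.0f}")
```

Under these numbers roughly 45% of alerts would be false alarms, which is worth keeping in mind when sizing case management.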
Infrastructure Requirements
- Scoring service: 500+ servers for 120K TPS
- Feature store: 10 TB, Redis + Cassandra
- ML training: GPU clusters + Spark
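The scoring fleet size can be cross-checked with simple arithmetic and Little's law (requests in flight = arrival rate × time in system). This is a rough sketch: it plugs in the 85 ms P99 from this document as the latency, which overstates the mean and so gives a conservative concurrency estimate. The helper names are hypothetical.

```python
def per_server_tps(total_tps: float, servers: int) -> float:
    """Throughput each server must sustain if load is spread evenly."""
    return total_tps / servers


def in_flight_requests(total_tps: float, latency_seconds: float) -> float:
    """Little's law: L = lambda * W."""
    return total_tps * latency_seconds


PEAK_TPS = 120_000
SERVERS = 500
LATENCY_S = 0.085  # P99 used as a pessimistic stand-in for mean latency

PER_SERVER = per_server_tps(PEAK_TPS, SERVERS)             # 240 TPS per server
IN_FLIGHT = in_flight_requests(PEAK_TPS, LATENCY_S)        # about 10,200 requests
IN_FLIGHT_PER_SERVER = IN_FLIGHT / SERVERS                 # about 20 per server

if __name__ == "__main__":
    print(PER_SERVER, round(IN_FLIGHT), round(IN_FLIGHT_PER_SERVER, 1))
```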
Practice Questions
1. How would you handle model drift in fraud detection? Design an automated retraining pipeline.
2. Design a feature store that serves both real-time and batch ML models with sub-10ms latency.
3. How do you balance fraud detection accuracy vs user experience? Design a feedback loop system.
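As one possible starting point for the model drift question, a retraining trigger can compare the distribution of a feature (or of model scores) at training time against recent traffic. The sketch below uses the Population Stability Index (PSI), which is one common choice among several. The 0.25 threshold is a widely quoted rule of thumb rather than a requirement from this document, and the function names and sample data are hypothetical.

```python
import math
from typing import List, Sequence


def psi(expected: Sequence[float], actual: Sequence[float],
        bins: int = 10, eps: float = 1e-6) -> float:
    """Population Stability Index between a baseline and a recent sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins
    if width == 0:
        width = 1.0  # degenerate baseline: everything lands in one bin

    def bin_shares(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            # Clamp so values outside the baseline range fall in the edge bins.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # Floor at eps so empty bins do not cause log(0) or division by zero.
        return [max(c / len(values), eps) for c in counts]

    e, a = bin_shares(expected), bin_shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


def should_retrain(baseline: Sequence[float], recent: Sequence[float],
                   threshold: float = 0.25) -> bool:
    """Flag retraining when drift exceeds the (assumed) PSI threshold."""
    return psi(baseline, recent) > threshold


# Made-up data: a uniform baseline and a sample shifted upward by 0.3.
BASELINE = [i / 100 for i in range(100)]
SHIFTED = [min(0.99, v + 0.3) for v in BASELINE]

if __name__ == "__main__":
    print(psi(BASELINE, BASELINE), should_retrain(BASELINE, SHIFTED))
```

In a pipeline, a check like this would typically run on a schedule per feature, with a positive result kicking off training and offline evaluation before any model swap.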