Design a Real-time Fraud Detection System

Build a machine learning system that detects fraudulent transactions in real time, keeps false positives low, and sustains more than 100K transactions per second.

System Requirements

Functional Requirements

  • Real-time transaction scoring (< 100ms)
  • Batch fraud model training and updates
  • Multi-layered fraud rules and ML models
  • Alert generation and case management
  • Historical fraud pattern analysis
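The multi-layered requirement above can be sketched as a cheap rule layer that runs first, followed by a model score that decides between approve, review, and block. This is only an illustration: the `Transaction` fields, the two rules, the thresholds, and the hand-written `model_score` stand-in are all hypothetical and are not taken from any specific system.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Transaction:
    # Hypothetical fields chosen for the example.
    amount: float
    country: str
    home_country: str
    txns_last_hour: int


# A rule returns a reason string when it fires, otherwise None.
Rule = Callable[[Transaction], Optional[str]]


def velocity_rule(t: Transaction) -> Optional[str]:
    # Hypothetical threshold: more than 20 transactions in the last hour.
    return "velocity" if t.txns_last_hour > 20 else None


def large_foreign_rule(t: Transaction) -> Optional[str]:
    # Hypothetical threshold: foreign transaction above 5000.
    if t.country != t.home_country and t.amount > 5000:
        return "large_foreign"
    return None


def model_score(t: Transaction) -> float:
    # Placeholder for a trained model: a hand-written score in [0, 1].
    score = 0.1
    if t.country != t.home_country:
        score += 0.3
    score += min(t.amount / 10000.0, 0.5)
    return min(score, 1.0)


def score_transaction(
    t: Transaction,
    rules: List[Rule],
    block_at: float = 0.8,
    review_at: float = 0.5,
) -> Tuple[str, List[str]]:
    # Layer 1: deterministic rules; any hit blocks immediately.
    reasons = []
    for rule in rules:
        reason = rule(t)
        if reason is not None:
            reasons.append(reason)
    if reasons:
        return "block", reasons
    # Layer 2: model score; mid-range scores go to manual review.
    s = model_score(t)
    if s >= block_at:
        return "block", ["model"]
    if s >= review_at:
        return "review", ["model"]
    return "approve", []


RULES: List[Rule] = [velocity_rule, large_foreign_rule]
```

In this sketch a "review" decision is what would feed alert generation and case management, while "block" and "approve" are returned to the caller directly.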

Non-Functional Requirements

  • Process 100K+ transactions per second
  • Sub-100ms latency for scoring decisions
  • 99.9% uptime for real-time scoring
  • False positive rate < 1%
  • Fraud detection rate > 95%
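To make these targets concrete, the short sketch below converts the 99.9% uptime target into a downtime budget and estimates how many legitimate transactions per second would be flagged at the 1% false positive ceiling. The function names are illustrative, and the 1% fraud share passed in the usage lines is the figure from the Capacity Estimation section; the false positive rate is assumed to be measured over legitimate transactions only.

```python
def downtime_budget_minutes(availability: float, period_days: float) -> float:
    """Minutes of allowed downtime over a period for a given availability."""
    return (1.0 - availability) * period_days * 24 * 60


def false_positives_per_second(tps: float, fraud_share: float, fpr: float) -> float:
    """Legitimate transactions flagged per second.

    Assumes fpr is defined as flagged legitimate / all legitimate.
    """
    legit_tps = tps * (1.0 - fraud_share)
    return legit_tps * fpr


# 99.9% uptime over a 30-day month: roughly 43 minutes of downtime.
monthly_budget = downtime_budget_minutes(0.999, 30)

# 100K TPS, 1% fraud, 1% FPR: on the order of 1,000 false alerts per second.
false_alerts = false_positives_per_second(100_000, 0.01, 0.01)
```

Even a false positive rate under 1% produces a large absolute number of flagged legitimate transactions at this volume, which is why the rate matters so much for user experience.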

Capacity Estimation

Transaction Volume & Response Times

Legitimate vs Fraud

  • Fraud: 1%
  • Legitimate: 99%

Daily Transactions

  • Peak hour: 120K TPS
  • Average hour: 50K TPS

Response Times

  • Rule engine: 10ms
  • ML models: 80ms
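The figures above can be checked with simple arithmetic. The sketch below converts the TPS figures into daily volumes and adds up the two response times; it assumes the rule engine and the ML models run one after the other, which the source does not state.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def daily_volume(tps: int) -> int:
    """Transactions per day if the given rate were sustained all day."""
    return tps * SECONDS_PER_DAY


avg_daily = daily_volume(50_000)    # 4,320,000,000
peak_daily = daily_volume(120_000)  # 10,368,000,000

# 1% of traffic is fraud, so at the average rate:
fraud_per_day = avg_daily // 100    # 43,200,000

# Latency budget, assuming the two stages run sequentially.
LATENCY_TARGET_MS = 100
RULE_ENGINE_MS = 10
ML_MODELS_MS = 80
budget_left_ms = LATENCY_TARGET_MS - (RULE_ENGINE_MS + ML_MODELS_MS)  # 10
```

Note that the 10B+ daily figure listed under Performance Metrics matches the 120K TPS peak rate sustained for a full day; at the 50K TPS average, the daily volume is closer to 4.3B. Under the sequential assumption, about 10ms of the 100ms target remains for feature lookup and network overhead.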

Performance Metrics

  • Daily transactions: 10B+ (peak: 120K TPS)
  • Fraud detection rate: 97% (ML + rules combined)
  • False positive rate: 0.8% (minimized for UX)
  • Response time P99: 85ms (real-time scoring)
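Because fraud is rare, a low false positive rate does not by itself mean that most alerts are real fraud. The sketch below applies Bayes' rule to the figures above (1% fraud share, 97% detection rate, 0.8% false positive rate) to estimate alert precision. It assumes the false positive rate is measured over legitimate transactions and the detection rate over fraudulent ones; the function name is illustrative.

```python
def alert_precision(
    fraud_share: float,
    detection_rate: float,
    false_positive_rate: float,
) -> float:
    """Share of flagged transactions that are actually fraudulent."""
    true_alerts = fraud_share * detection_rate
    false_alerts = (1.0 - fraud_share) * false_positive_rate
    return true_alerts / (true_alerts + false_alerts)


# With the figures above, a little over half of all alerts are real fraud.
precision = alert_precision(0.01, 0.97, 0.008)
```

Under these assumptions, nearly half of the flagged transactions would still be legitimate, which is relevant when sizing a case management team.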

Infrastructure Requirements

  • Scoring service: 500+ servers for 120K TPS
  • Feature store: 10TB Redis + Cassandra
  • ML training: GPU clusters + Spark
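The scoring service sizing can be sanity-checked with a back-of-the-envelope calculation. The sketch below derives per-server throughput from the 500-server figure and uses Little's law (in-flight requests = arrival rate × latency) to estimate concurrency per server. The helper names and the headroom parameter are illustrative assumptions, not part of the source sizing.

```python
import math


def per_server_tps(total_tps: float, servers: int) -> float:
    """Average load each server carries if traffic is spread evenly."""
    return total_tps / servers


def in_flight_requests(tps: float, latency_seconds: float) -> float:
    """Little's law: average number of requests in progress at once."""
    return tps * latency_seconds


def servers_needed(
    total_tps: float,
    per_server_capacity_tps: float,
    headroom: float = 0.0,
) -> int:
    """Servers required for a load, with optional fractional headroom."""
    return math.ceil(total_tps * (1.0 + headroom) / per_server_capacity_tps)


load = per_server_tps(120_000, 500)           # 240.0 TPS per server
concurrency = in_flight_requests(load, 0.085)  # about 20 requests in flight
with_headroom = servers_needed(120_000, 240, headroom=0.25)  # 625
```

The 85ms latency used here is the P99 figure, so the concurrency estimate is an upper-leaning approximation rather than a true average.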

Practice Questions

  1. How would you handle model drift in fraud detection? Design an automated retraining pipeline.
  2. Design a feature store that serves both real-time and batch ML models with sub-10ms latency.
  3. How do you balance fraud detection accuracy vs user experience? Design a feedback loop system.
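As one possible starting point for the model drift question, the sketch below computes the Population Stability Index (PSI) between a baseline sample of a feature and a recent sample, and flags retraining when it exceeds a threshold. PSI is only one of several drift measures; the bin edges, the 0.25 threshold (a commonly quoted rule of thumb), and the function names are assumptions made for the example.

```python
import math
from typing import List, Sequence


def _bin_shares(values: Sequence[float], edges: Sequence[float], eps: float) -> List[float]:
    # Count values per bin; a value falls in bin i when it is >= exactly i edges.
    if not values:
        raise ValueError("values must not be empty")
    counts = [0] * (len(edges) + 1)
    for v in values:
        idx = sum(1 for e in edges if v >= e)
        counts[idx] += 1
    # Floor each share at eps so the logarithm below is always defined.
    return [max(c / len(values), eps) for c in counts]


def psi(
    expected: Sequence[float],
    actual: Sequence[float],
    edges: Sequence[float],
    eps: float = 1e-6,
) -> float:
    """Population Stability Index between a baseline and a recent sample."""
    e = _bin_shares(expected, edges, eps)
    a = _bin_shares(actual, edges, eps)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


def needs_retraining(psi_value: float, threshold: float = 0.25) -> bool:
    # Hypothetical trigger; the threshold would be tuned per feature.
    return psi_value > threshold


baseline = [10.0, 20.0, 30.0, 40.0] * 25
shifted = [v + 25.0 for v in baseline]
edges = [15.0, 25.0, 35.0]

stable = needs_retraining(psi(baseline, baseline, edges))  # False
drifted = needs_retraining(psi(baseline, shifted, edges))  # True
```

A check like this could run on a schedule over each model input and over the score distribution itself, with a positive result kicking off the batch training job.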