Design a Search Ranking System

Build a machine learning-powered search system that delivers highly relevant, personalized results at massive scale with real-time indexing and ranking.

System Requirements

Functional Requirements

  • Real-time search with sub-second latency
  • Relevance scoring with multiple signals
  • Personalized search results per user
  • Query auto-completion and suggestions
  • Faceted search and filtering
  • Search analytics and click tracking
  • A/B testing for ranking algorithms
  • Safe search and content filtering

Non-Functional Requirements

  • Handle 100K+ queries per second
  • Index 10B+ documents with real-time updates
  • P95 query response time under 200 ms
  • 99.9% search availability
  • Support 50+ languages and locales
  • Handle typos and fuzzy matching
  • Scale to petabytes of indexed content
  • Maintain relevance quality above 90%, as measured by human evaluation

Ranking Signal Architecture

Text Relevance

TF-IDF, BM25, semantic matching

  • Weight: 40%
  • Key factors: keyword match, title relevance, content quality, language match
  • Implementation: Elasticsearch/Solr + embeddings
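
To make the text relevance signal concrete, here is a minimal sketch of BM25 scoring over tokenized documents. The function name `bm25_scores`, the sample documents, and the parameter defaults `k1=1.2` and `b=0.75` are illustrative assumptions. Engines such as Elasticsearch and Solr compute a similar score internally, so this only shows the shape of the formula, not how those engines are called.

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Return one BM25 score per document (docs is a non-empty list of token lists)."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # IDF variant with +1 inside the log, which keeps the value non-negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization: longer-than-average documents are penalized.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Hypothetical sample data for illustration only.
docs = [
    "cheap flights to paris".split(),
    "paris travel guide and paris hotels".split(),
    "learn python programming".split(),
]
scores = bm25_scores(["paris", "hotels"], docs)
```

With this sample data the second document scores highest because it matches both query terms, and the third scores zero because it matches neither.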

Authority/Quality

Domain authority, page quality, freshness

  • Weight: 25%
  • Key factors: domain authority, content freshness, source credibility, spam detection
  • Implementation: graph algorithms + ML models

User Behavior

Click-through rates, dwell time, bounce rate

  • Weight: 20%
  • Key factors: CTR, dwell time, bounce rate, return visits
  • Implementation: real-time ML pipeline

Personalization

User history, preferences, location

  • Weight: 10%
  • Key factors: search history, location, device, time context
  • Implementation: user embeddings + collaborative filtering

Business Logic

Promoted content, partnerships, compliance

  • Weight: 5%
  • Key factors: sponsored results, regional preferences, content policies, business rules
  • Implementation: rule engine + manual overrides
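
The five weights above sum to 100%, so one simple reading is a linear blend of per-signal scores. The sketch below assumes each signal has already been normalized to the range [0, 1]. The dictionary keys, the function names, and the sample documents are hypothetical. A production ranker would more likely learn the combination with a learning-to-rank model than use fixed weights.

```python
# Weights taken from the signal breakdown above; key names are illustrative.
SIGNAL_WEIGHTS = {
    "text_relevance": 0.40,
    "authority": 0.25,
    "user_behavior": 0.20,
    "personalization": 0.10,
    "business_logic": 0.05,
}

def combined_score(signals):
    """Weighted sum of per-signal scores; missing signals count as 0."""
    return sum(w * signals.get(name, 0.0) for name, w in SIGNAL_WEIGHTS.items())

def rank(candidates):
    """Sort (doc_id, signals) pairs by combined score, best first."""
    return sorted(candidates, key=lambda c: combined_score(c[1]), reverse=True)

# Hypothetical candidates for illustration only.
candidates = [
    ("doc_a", {"text_relevance": 0.9, "authority": 0.3, "user_behavior": 0.5}),
    ("doc_b", {"text_relevance": 0.6, "authority": 0.9, "user_behavior": 0.8,
               "personalization": 0.5}),
]
ranked = rank(candidates)
```

Here `doc_b` ranks above `doc_a` even though its text match is weaker, because its authority and behavior scores outweigh the gap.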

System Architecture Components

Query Processing

  • Query parsing & analysis
  • Intent classification
  • Query expansion
  • Spell correction
  • Auto-completion
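
As one illustration of the spell correction step, the sketch below picks the closest vocabulary word by Levenshtein edit distance and breaks ties by term frequency. The function names, the `max_distance=2` cutoff, and the sample vocabulary are assumptions. Scanning the whole vocabulary is only workable for small dictionaries, so a real system would narrow the candidates first (for example with an n-gram index).

```python
def edit_distance(a, b):
    """Levenshtein distance, computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete a character from a
                curr[j - 1] + 1,     # insert a character into a
                prev[j - 1] + cost,  # substitute, or keep a match
            ))
        prev = curr
    return prev[-1]

def correct(term, vocabulary, max_distance=2):
    """Return the closest vocabulary word (ties go to the more frequent word).

    vocabulary maps word -> frequency. The term is returned unchanged when it
    is already known or when nothing lies within max_distance.
    """
    if term in vocabulary or not vocabulary:
        return term
    dist, _, best = min(
        (edit_distance(term, word), -freq, word) for word, freq in vocabulary.items()
    )
    return best if dist <= max_distance else term

# Hypothetical vocabulary with term frequencies.
vocabulary = {"search": 100, "sears": 5, "reach": 10}
suggestion = correct("serach", vocabulary)
```

In this example both "search" and "reach" are two edits away from "serach", and the frequency tie-break selects "search".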

Search Engine

  • Inverted index
  • Sharding & replication
  • Faceted search
  • Fuzzy matching
  • Caching layer
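
The inverted index is the core structure in this component: it maps each term to the set of documents containing it, so a query only touches the posting lists of its own terms. The following is a minimal in-memory sketch. The class name, the whitespace tokenizer, and the sample documents are assumptions, and it omits the sharding, compression, and positional data a real index would need.

```python
from collections import defaultdict

class InvertedIndex:
    """Toy in-memory index mapping each term to the ids of documents containing it."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        # Naive tokenizer: lowercase and split on whitespace.
        for term in text.lower().split():
            self.postings[term].add(doc_id)

    def search(self, query):
        """AND query: ids of documents that contain every query term."""
        terms = query.lower().split()
        if not terms:
            return set()
        # Start from the smallest posting list to keep intermediate sets small.
        lists = sorted((self.postings.get(t, set()) for t in terms), key=len)
        result = set(lists[0])
        for posting in lists[1:]:
            result &= posting
        return result

# Hypothetical documents for illustration only.
index = InvertedIndex()
index.add(1, "cheap flights to Paris")
index.add(2, "Paris hotel deals")
index.add(3, "cheap hotel deals in Rome")
hits = index.search("cheap hotel")
```

Only document 3 contains both "cheap" and "hotel", so it is the single hit for that query.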

Ranking Engine

  • ML model serving
  • Feature extraction
  • Score combination
  • Personalization
  • A/B testing
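
For the A/B testing item, one common approach is to assign users to ranking variants deterministically by hashing, so a user sees the same ranker on every request without any stored assignment. This sketch assumes a string user id and an equal split across variants. The function name and the experiment label are hypothetical.

```python
import hashlib

def assign_variant(user_id, experiment, variants=("control", "treatment")):
    """Deterministically map a user to one variant of one experiment."""
    # Including the experiment name keeps assignments independent across experiments.
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    bucket = int.from_bytes(digest[:8], "big") % len(variants)
    return variants[bucket]

# Hypothetical experiment name and user ids.
assignments = [assign_variant(f"user-{i}", "ranker-v2") for i in range(10_000)]
treatment_share = assignments.count("treatment") / len(assignments)
```

Hashing on the user id rather than the request id keeps each user's experience stable for the length of the experiment, which matters when measuring effects such as dwell time or return visits.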

Content Indexer

  • Web crawling
  • Content extraction
  • Duplicate detection
  • Quality scoring
  • Real-time updates
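
Duplicate detection can be sketched with word shingles and Jaccard similarity: two pages are treated as near-duplicates when most of their overlapping word windows coincide. The shingle size `k=3`, the `0.8` threshold, and the function names are assumptions. Comparing every pair of documents does not scale to billions of pages, so a production indexer would typically approximate this with MinHash or a similar sketching technique.

```python
def shingles(text, k=3):
    """Set of k-word shingles (overlapping word windows) from the text."""
    words = text.lower().split()
    if len(words) < k:
        # Short texts yield a single shingle; empty texts yield none.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def is_near_duplicate(text_a, text_b, threshold=0.8):
    return jaccard(shingles(text_a), shingles(text_b)) >= threshold

# Hypothetical page texts for illustration only.
page_a = "the quick brown fox jumps over the lazy dog"
page_b = "the quick brown fox jumps over the lazy dog today"
similarity = jaccard(shingles(page_a), shingles(page_b))
```

The two sample pages share 7 of 8 distinct shingles, giving a similarity of 0.875, which is above the assumed threshold.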

Analytics Engine

  • Click tracking
  • Query analytics
  • Performance monitoring
  • User behavior analysis
  • Relevance evaluation

User Profile Service

  • Search history
  • Preference learning
  • User embeddings
  • Privacy controls
  • Personalization

Capacity Estimation

Search Traffic & Performance

Query Types

  • Informational: 70%
  • Transactional: 30%

Result Clicks

  • First page: 92%
  • Beyond page 1: 8%

Search Volume

  • Peak hours: 150K QPS
  • Off-peak: 60K QPS

Performance Metrics

  • Daily queries: 10B+ (peak: 150K QPS)
  • Query latency P95: 180 ms (end-to-end response)
  • Index size: 500TB+ (compressed and sharded)
  • Relevance score: 92% (human evaluation)
  • Cache hit rate: 80% (query result caching)
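
A quick back-of-the-envelope check shows how these traffic figures relate. The calculation below assumes exactly 10B queries per day, that the 80% cache hit rate also holds at peak, and that a cache hit never reaches the search cluster. The variable names are illustrative.

```python
SECONDS_PER_DAY = 86_400

daily_queries = 10e9      # "10B+" daily queries, taken as exactly 10B
peak_qps = 150_000        # peak-hour query rate
cache_hit_rate = 0.80     # share of queries served from the result cache

# Average rate implied by the daily volume: about 115,741 QPS.
avg_qps = daily_queries / SECONDS_PER_DAY

# Peak load that misses the cache and reaches the search cluster: about 30,000 QPS.
backend_peak_qps = peak_qps * (1 - cache_hit_rate)

# Peak is roughly 1.3x the daily average.
peak_to_avg = peak_qps / avg_qps
```

Under these assumptions the search and ranking tiers need to be sized for about 30K uncached queries per second at peak rather than the full 150K.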

Infrastructure Requirements

  • Search cluster: 2000+ nodes, 5PB storage
  • ML serving: 500 GPU nodes for ranking
  • Cache layer: 100TB Redis cluster
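
The storage figures can be sanity-checked the same way. This sketch assumes decimal units (1 PB = 1000 TB), exactly 2000 nodes, and the 500TB compressed index size given under Performance Metrics. How the extra capacity is split between replicas, auxiliary data, and headroom is not stated in the figures, so the ratio is only an upper bound on the replication factor.

```python
TB_PER_PB = 1000  # decimal units, an assumption

cluster_storage_tb = 5 * TB_PER_PB   # 5PB across the search cluster
search_nodes = 2000                  # "2000+" nodes, taken as exactly 2000
index_size_tb = 500                  # "500TB+" compressed index

# Average storage each node must hold: 2.5 TB.
storage_per_node_tb = cluster_storage_tb / search_nodes

# Cluster capacity is 10x the compressed index, leaving room for replicas and headroom.
storage_to_index_ratio = cluster_storage_tb / index_size_tb
```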

Practice Questions

1. Design a learning-to-rank system that incorporates both content features and user behavior signals in real time.

2. How would you handle query expansion and semantic search to improve recall for long-tail queries?

3. Design an A/B testing framework for search ranking algorithms that accounts for position bias and novelty effects.

4. How would you implement personalized search while maintaining user privacy and avoiding filter bubbles?

5. Design a real-time indexing pipeline that can handle billions of document updates while maintaining search consistency.