Database Fundamentals
Master the foundation of data storage and retrieval. Learn when to use SQL vs NoSQL, understand database types, and make informed decisions for your system design.
Database Types & When to Use Them
Different database types excel at different problems. Understanding their strengths and weaknesses helps you choose the right tool for your specific use case.
Relational (SQL)
Strengths
- • ACID compliance
- • Complex queries (JOINs, aggregations)
- • Strong data consistency
- • Mature ecosystem
Weaknesses
- • Vertical scaling limits
- • Schema rigidity
- • JOIN performance on very large or sharded tables
Use cases
- • Financial transactions
- • User accounts
- • Inventory management
- • Reporting systems
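To make the ACID point concrete, here is a minimal sketch of an atomic transfer using Python's built-in sqlite3 module. The accounts table, the names, and the balances are hypothetical, and the credit is deliberately issued before the debit so that the rollback is visible when the debit fails.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as a single all-or-nothing unit."""
    # `with conn` commits if the block succeeds and rolls back if it raises.
    with conn:
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
        )
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
        )

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "id TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

transfer(conn, "alice", "bob", 30)  # succeeds: both updates are committed

try:
    # The debit would make alice's balance negative, so the CHECK constraint
    # fails and the credit already applied to bob is rolled back with it.
    transfer(conn, "alice", "bob", 500)
except sqlite3.IntegrityError:
    pass

balances = dict(conn.execute("SELECT id, balance FROM accounts"))
```

After the failed transfer, neither account reflects a partial update, which is the property that makes relational databases a natural fit for financial data.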
Document (NoSQL)
Strengths
- • Flexible schema
- • JSON-like structure
- • Horizontal scaling
- • Developer friendly
Weaknesses
- • Limited JOINs; queries across documents are harder
- • Data duplication
- • Eventual consistency in many configurations
Use cases
- • Content management
- • Catalogs
- • User profiles
- • Real-time analytics
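The flexible-schema idea can be illustrated without a real document database. The sketch below uses a plain Python list as a stand-in for a collection; the catalog entries and the find helper are hypothetical and do not correspond to any specific product's API.

```python
import json

# Hypothetical in-memory "collection": documents do not share one schema.
catalog = [
    {"_id": 1, "type": "book", "title": "Example Book",
     "author": {"name": "A. Writer"}},
    {"_id": 2, "type": "laptop", "title": "Example Laptop",
     "specs": {"ram_gb": 16, "ports": ["usb-c", "hdmi"]}},
]

def find(collection, **criteria):
    """Return documents whose top-level fields equal every given criterion."""
    return [
        doc for doc in collection
        if all(doc.get(field) == value for field, value in criteria.items())
    ]

books = find(catalog, type="book")

# Each document serializes to JSON as a self-contained unit.
serialized = json.dumps(catalog[1])
```

Nested fields such as specs travel with the document, which is convenient for catalogs but also where data duplication tends to come from.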
Key-Value
Strengths
- • Very fast reads and writes by key
- • Simple model
- • Massive scale
- • High availability
Weaknesses
- • Lookup by key only; no ad hoc queries
- • Limited relationships
- • Limited or no multi-key transactions
Use cases
- • Caching
- • Session storage
- • Shopping carts
- • Real-time recommendations
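As a rough illustration of the caching and session use cases, the following sketch implements a tiny in-memory key-value store with per-key expiry. TTLStore and its injectable clock are assumptions made for this example; a production system would normally rely on a dedicated store rather than a Python dict.

```python
import time

class TTLStore:
    """Tiny in-memory key-value store with per-key expiry (illustrative only)."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._clock = clock  # injectable so expiry can be tested without sleeping

    def set(self, key, value, ttl_seconds):
        self._data[key] = (value, self._clock() + ttl_seconds)

    def get(self, key, default=None):
        entry = self._data.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # lazily evict the expired entry
            return default
        return value

# Usage with a fake clock so the example is deterministic.
now = [0.0]
sessions = TTLStore(clock=lambda: now[0])
sessions.set("session:abc", {"user_id": 42}, ttl_seconds=30)
hit = sessions.get("session:abc")   # still valid at t=0
now[0] = 31.0
miss = sessions.get("session:abc")  # expired at t=31
```

Every operation is a single lookup by key, which is why this model is fast and also why it cannot answer questions such as "all sessions for user 42" without a separate index.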
Graph
Strengths
- • Relationship queries
- • Complex connections
- • Path finding
- • Pattern matching
Weaknesses
- • Complex setup
- • Smaller tooling ecosystem
- • Steep learning curve
Use cases
- • Social networks
- • Fraud detection
- • Recommendation engines
- • Network analysis
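Path finding is the kind of query graph databases are built for. The sketch below shows the underlying idea with a breadth-first search over a plain adjacency list; the follows graph and its names are hypothetical, and a real graph database would express this in its own query language.

```python
from collections import deque

def shortest_path(graph, start, goal):
    """Breadth-first search: returns the shortest path as a list, or None."""
    if start == goal:
        return [start]
    previous = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor in previous:
                continue
            previous[neighbor] = node
            if neighbor == goal:
                # Walk the predecessor links back to the start.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None

# Hypothetical "who follows whom" graph.
follows = {
    "ana": ["ben", "cy"],
    "ben": ["dee"],
    "cy": ["dee"],
    "dee": ["eli"],
    "eli": [],
}
path = shortest_path(follows, "ana", "eli")
no_path = shortest_path(follows, "eli", "ana")
```

In a relational schema the same question would require a recursive query or one self-JOIN per hop, which is where the relationship-query advantage of graph stores comes from.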
CAP Theorem in Practice
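The CAP theorem states that a distributed system can guarantee at most two of three properties: consistency, availability, and partition tolerance. Because network partitions cannot be ruled out in a distributed system, the practical choice is between consistency and availability while a partition is in progress.
Consistency (C)
Every read returns the most recent write, so all nodes appear to hold the same data
Availability (A)
Every request to a non-failing node receives a response, even if it may not reflect the latest write
Partition Tolerance (P)
The system keeps operating when messages between nodes are lost or delayed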
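The trade-off during a partition can be shown with a toy model. The sketch below assumes a two-replica cluster with a partition flag that can be toggled by hand; Cluster and Replica are hypothetical classes, not a model of any real database's replication protocol.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.data = {}

class Cluster:
    """Two replicas joined by a network link that can be partitioned."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.a = Replica("a")
        self.b = Replica("b")
        self.partitioned = False

    def write(self, replica, key, value):
        peer = self.b if replica is self.a else self.a
        if self.partitioned:
            if self.mode == "CP":
                # Consistency first: refuse writes that cannot be replicated.
                raise RuntimeError("unavailable: cannot reach peer replica")
            # Availability first: accept locally and let the replicas diverge.
            replica.data[key] = value
            return
        replica.data[key] = value
        peer.data[key] = value

cp = Cluster("CP")
cp.partitioned = True
try:
    cp.write(cp.a, "x", 1)
    cp_accepted = True
except RuntimeError:
    cp_accepted = False

ap = Cluster("AP")
ap.partitioned = True
ap.write(ap.a, "x", 1)
ap_diverged = ap.a.data != ap.b.data
```

The CP cluster rejects the write and stays consistent, while the AP cluster accepts it and must reconcile the two replicas once the partition heals.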
CP Systems (Consistency + Partition Tolerance)
AP Systems (Availability + Partition Tolerance)
Database Scaling Strategies
Vertical Scaling (Scale Up)
Horizontal Scaling (Scale Out)
Common Horizontal Scaling Techniques
Read Replicas
Create read-only copies of your database to handle read traffic
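A minimal sketch of read/write splitting, assuming one primary and a list of replicas identified by name. ReplicaRouter is hypothetical, and the SELECT prefix check is a deliberately naive way to classify statements.

```python
import itertools

class ReplicaRouter:
    """Send reads to replicas in round-robin order and everything else to the primary."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, sql):
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and self._replicas is not None:
            return next(self._replicas)
        return self.primary

router = ReplicaRouter("primary", ["replica-1", "replica-2"])
targets = [
    router.route(statement)
    for statement in (
        "SELECT * FROM users",
        "UPDATE users SET name = 'x' WHERE id = 1",
        "SELECT 1",
        "select 2",
    )
]
```

Replicas usually lag slightly behind the primary, so a read issued right after a write may need to go to the primary to see that write.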
Sharding
Split data across multiple databases based on a shard key, so each database holds only a subset of the rows
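One common way to map a shard key to a database is hash-based routing. The sketch below assumes four shards and string user IDs as the shard key; shard_for is a hypothetical helper, and a stable hash is used because Python's built-in hash() varies between processes.

```python
import hashlib

def shard_for(shard_key, num_shards):
    """Map a shard key to a shard index using a stable hash."""
    digest = hashlib.sha256(str(shard_key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

NUM_SHARDS = 4  # assumed shard count for this example
user_ids = ["user-1", "user-2", "user-3", "user-4", "user-5"]
placement = {user_id: shard_for(user_id, NUM_SHARDS) for user_id in user_ids}
```

With plain modulo routing, changing the number of shards remaps most keys, which is why schemes such as consistent hashing or directory-based lookup are often preferred when shards are expected to be added later.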
Federation
Split databases by function (users, products, orders)
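Federation has a cost that a short sketch makes visible: once users and orders live in separate databases, a JOIN across them has to happen in application code. The example below uses two in-memory SQLite databases as stand-ins; the table layouts and rows are hypothetical.

```python
import sqlite3

# One database per function, each with its own connection.
databases = {name: sqlite3.connect(":memory:") for name in ("users", "orders")}

databases["users"].execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
databases["users"].execute("INSERT INTO users VALUES (1, 'Ana')")

databases["orders"].execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total INTEGER)"
)
databases["orders"].executemany(
    "INSERT INTO orders VALUES (?, ?, ?)", [(10, 1, 25), (11, 1, 40)]
)

def orders_with_user_name(user_id):
    """Application-level join: one query per database, combined in code."""
    (name,) = databases["users"].execute(
        "SELECT name FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    rows = databases["orders"].execute(
        "SELECT id, total FROM orders WHERE user_id = ? ORDER BY id", (user_id,)
    ).fetchall()
    return [(name, order_id, total) for order_id, total in rows]

result = orders_with_user_name(1)
```

Each database can now be scaled and tuned independently, at the price of losing cross-database JOINs and transactions.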
Database Selection Framework
Use this decision tree to choose the right database type for your specific requirements.
Step 1: Data Structure
Choose SQL if:
- • Complex relationships between entities
- • Need complex queries and aggregations
- • ACID compliance is critical
- • Structured data with fixed schema
Choose NoSQL if:
- • Flexible or evolving schema
- • Simple queries, key-based access
- • Horizontal scaling requirements
- • Semi-structured or unstructured data
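The Step 1 checklist can be sketched as a small helper. This reads the first four criteria above as pointing toward SQL and the last four toward NoSQL, which is an interpretation of the list; the signal names and the simple majority rule are assumptions made for illustration, not a rule from the framework itself.

```python
SQL_SIGNALS = {
    "complex_relationships",
    "complex_queries",
    "acid_critical",
    "fixed_schema",
}
NOSQL_SIGNALS = {
    "flexible_schema",
    "key_based_access",
    "horizontal_scaling",
    "semi_structured_data",
}

def suggest_family(requirements):
    """Suggest 'SQL', 'NoSQL', or 'either' by counting matching signals."""
    reqs = set(requirements)
    unknown = reqs - SQL_SIGNALS - NOSQL_SIGNALS
    if unknown:
        raise ValueError(f"unknown requirements: {sorted(unknown)}")
    sql_score = len(reqs & SQL_SIGNALS)
    nosql_score = len(reqs & NOSQL_SIGNALS)
    if sql_score > nosql_score:
        return "SQL"
    if nosql_score > sql_score:
        return "NoSQL"
    return "either"

payments = suggest_family(["acid_critical", "complex_queries", "horizontal_scaling"])
session_cache = suggest_family(["key_based_access", "flexible_schema"])
```

A tie or a narrow margin is a signal to weigh the individual requirements rather than trust the count; a hard requirement such as ACID compliance may outweigh several softer ones.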
Step 2: Scale Requirements
Small scale: any database will work. Choose based on team expertise.
Medium scale: SQL with proper indexing and read replicas is usually sufficient.
Large scale: consider NoSQL, sharding, or distributed SQL systems.
Database Quick Reference
When in Doubt
- • Start with PostgreSQL (a strong general-purpose default)
- • Add Redis for caching and sessions
- • Consider read replicas before sharding
- • Monitor before optimizing
- • Avoid premature optimization
Red Flags
- • Multiple database types without clear justification
- • Choosing NoSQL only for "web scale"
- • Ignoring data consistency requirements
- • Not planning for growth patterns
- • Choosing unfamiliar technology under pressure
🎯 Database Selection in the Real World
Learn from actual database decisions made by major tech companies
Scenarios
Context
Instagram adopted Cassandra for photo metadata storage, moving that workload off its relational database
Metrics
Outcome
Cassandra's AP properties (availability and partition tolerance) matched Instagram's need for globally available photo storage where eventual consistency is acceptable.
Key Lessons
- • Photo metadata doesn't require strong consistency; eventual consistency is acceptable
- • Cassandra's peer-to-peer architecture eliminated single points of failure
- • Linear scaling allowed Instagram to handle massive growth without complex sharding
- • Trade-off: lost complex query capabilities but gained operational simplicity