What is Azure Cosmos DB?
Azure Cosmos DB is Microsoft's globally distributed, multi-model database service designed for modern applications requiring massive scale, low latency, and high availability. It offers turnkey global distribution across 50+ Azure regions, five well-defined consistency levels, and comprehensive SLAs for availability, latency, throughput, and consistency.
With support for multiple APIs including SQL (now branded as the API for NoSQL), MongoDB, Cassandra, Azure Table, and Gremlin, Cosmos DB lets you use familiar tools and skills while gaining the benefits of a globally distributed database. Storage grows automatically, throughput can scale with demand when autoscale is enabled, and performance is budgeted in Request Units (RUs), a normalized measure of the CPU, memory, and IOPS an operation consumes.
Cosmos DB Performance Calculator (sample values)
Queries: 100/sec, Global Storage: 250 GB, Partitions: 5, RU Utilization: 2.0%
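One way the calculator figures above could be related is sketched below in Python. It assumes the documented per-physical-partition limits of 50 GB of storage and 10,000 RU/s, plus a purely illustrative cost of 10 RU per query. The function name `estimate_utilization` is hypothetical, and real partition counts and RU charges depend on the workload.

```python
import math

MAX_STORAGE_PER_PARTITION_GB = 50   # physical partition storage limit
MAX_RU_PER_PARTITION = 10_000       # physical partition throughput limit (RU/s)
ASSUMED_RU_PER_QUERY = 10           # assumption for illustration only


def estimate_utilization(queries_per_sec, storage_gb,
                         ru_per_query=ASSUMED_RU_PER_QUERY):
    """Rough estimate of partition count and RU utilization."""
    # Storage drives the minimum number of physical partitions.
    partitions = max(1, math.ceil(storage_gb / MAX_STORAGE_PER_PARTITION_GB))
    consumed_ru = queries_per_sec * ru_per_query
    capacity_ru = partitions * MAX_RU_PER_PARTITION
    return {
        "partitions": partitions,
        "consumed_ru_per_sec": consumed_ru,
        "utilization_pct": 100.0 * consumed_ru / capacity_ru,
    }


estimate = estimate_utilization(queries_per_sec=100, storage_gb=250)
print(estimate)
```

Under these assumptions, 100 queries/sec against 250 GB works out to 5 partitions and 2.0% utilization, which matches the sample values.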
Five Consistency Levels
Strong Consistency
Linearizability guarantee - reads always return the most recent committed value. Highest latency, strongest consistency.
Bounded Staleness
Reads lag behind writes by at most K versions or T time interval. Configurable staleness bounds.
Session Consistency (Default)
Read-your-writes and monotonic reads within a client session. A practical balance of consistency and performance for most applications.
Consistent Prefix
Reads see writes in order, but may lag behind. No out-of-order reads, good for collaborative scenarios.
Eventual Consistency
No ordering guarantee for reads; replicas converge over time. Lowest latency and highest availability. Good for counters and non-critical data.
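The difference between strong, session, and eventual reads can be illustrated with a toy model. This is a minimal sketch, not how Cosmos DB is implemented: `ToyAccount`, `Store`, and the single lagging replica are hypothetical, and the integer returned by `write` stands in for a session token.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    lsn: int = 0                                # sequence number of last applied write
    data: dict = field(default_factory=dict)


class ToyAccount:
    """One primary and one lagging replica (illustrative only)."""

    def __init__(self):
        self.primary = Store()
        self.replica = Store()
        self.pending = []                       # writes not yet replicated

    def write(self, key, value):
        self.primary.lsn += 1
        self.primary.data[key] = value
        self.pending.append((self.primary.lsn, key, value))
        return self.primary.lsn                 # acts as a session token

    def replicate(self):
        # Apply pending writes to the replica in commit order.
        for lsn, key, value in self.pending:
            self.replica.data[key] = value
            self.replica.lsn = lsn
        self.pending.clear()

    def read(self, key, level, session_token=0):
        if level == "strong":
            return self.primary.data.get(key)   # always the latest committed value
        if level == "session" and self.replica.lsn < session_token:
            # Replica has not caught up with this session's own writes.
            return self.primary.data.get(key)
        return self.replica.data.get(key)       # may be stale


account = ToyAccount()
account.write("score", 1)
account.replicate()
token = account.write("score", 2)               # not yet replicated

print(account.read("score", "eventual"))        # stale value from the replica
print(account.read("score", "session", token))  # the session sees its own write
print(account.read("score", "strong"))          # latest committed value
```

In this model a second client without the token would still read the stale value under session consistency, which is why the guarantee is described as scoped to a client session.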
Multi-API Support
SQL API (Native)
• JSON document model
• SQL query syntax
• JavaScript stored procedures
• ACID transactions within a single logical partition
MongoDB API
• MongoDB wire protocol
• Existing MongoDB drivers
• Aggregation pipeline
• GridFS support
Cassandra API
• CQL (Cassandra Query Language)
• Wide-column data model
• Existing Cassandra drivers
• Keyspace and table concepts
Azure Table API
• Azure Table Storage compatible
• Key-value data model
• Premium performance
• Global distribution
Gremlin API
• Apache TinkerPop Gremlin
• Graph data model
• Vertices and edges
• Graph traversals
Real-World Cosmos DB Implementations
Xbox Live
Powers gaming profiles, achievements, and social features for 100+ million gamers worldwide.
• Global gaming profile consistency
• Real-time leaderboards and achievements
• Session consistency for gaming sessions
• Multi-region low-latency access
Progressive Insurance
Uses Cosmos DB for real-time insurance quote calculations and customer data management.
• Real-time insurance quotes
• Customer profile management
• Claims processing workflows
• Regulatory compliance across states
Jet.com
Leverages Cosmos DB for e-commerce catalog, pricing, and recommendation systems.
• Product catalog and inventory
• Dynamic pricing algorithms
• Customer recommendation engine
• Order processing and tracking
Symantec
Utilizes Cosmos DB for global threat intelligence and security data analytics.
• Global threat intelligence database
• Real-time security event processing
• Malware signature distribution
• Customer security dashboard analytics
Cosmos DB Code Examples
SQL API Query
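Query documents with SQL syntax. JOIN here is an intra-document self-join: it pairs each user item with the entries of its own orders array, not with a separate container. Sort the grouped rows by total_spent in the client, since ORDER BY on an aggregate alias may be rejected alongside GROUP BY:
SELECT
    u.id,
    u.name,
    u.email,
    COUNT(o.id) AS order_count,
    SUM(o.total) AS total_spent
FROM users u
JOIN o IN u.orders
WHERE u.city = 'Seattle'
    AND o.date >= '2024-01-01'
GROUP BY u.id, u.name, u.email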
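To make the intra-document JOIN concrete, the following Python sketch mimics the query's intent on in-memory items: each user is paired with the entries of its own `orders` array, the pairs are filtered and grouped, and the grouped rows are sorted in the client. The sample data and the function name `order_totals` are hypothetical.

```python
from collections import defaultdict

users = [
    {"id": "u1", "name": "Ada", "email": "ada@example.com", "city": "Seattle",
     "orders": [{"id": "o1", "date": "2023-12-30", "total": 40.0},
                {"id": "o2", "date": "2024-02-01", "total": 25.0},
                {"id": "o3", "date": "2024-03-15", "total": 10.0}]},
    {"id": "u2", "name": "Bo", "email": "bo@example.com", "city": "Seattle",
     "orders": [{"id": "o4", "date": "2024-01-05", "total": 50.0}]},
    {"id": "u3", "name": "Cy", "email": "cy@example.com", "city": "Portland",
     "orders": [{"id": "o5", "date": "2024-01-09", "total": 99.0}]},
]


def order_totals(items, city, since):
    groups = defaultdict(lambda: {"order_count": 0, "total_spent": 0.0})
    for user in items:
        if user["city"] != city:
            continue
        # The "JOIN": iterate over the user's own orders array.
        for order in user.get("orders", []):
            if order["date"] >= since:          # ISO dates compare as strings
                key = (user["id"], user["name"], user["email"])
                groups[key]["order_count"] += 1
                groups[key]["total_spent"] += order["total"]
    rows = [{"id": k[0], "name": k[1], "email": k[2], **v}
            for k, v in groups.items()]
    # Client-side sort by the aggregate, highest spender first.
    return sorted(rows, key=lambda r: r["total_spent"], reverse=True)


for row in order_totals(users, city="Seattle", since="2024-01-01"):
    print(row)
```

Users with no qualifying orders produce no rows, which mirrors how the self-join drops items whose array contributes no matching entries.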
MongoDB API
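Use MongoDB drivers and the aggregation pipeline. The second $match filters individual orders after $unwind, so only orders on or after the date are counted and summed:
db.users.aggregate([
  {
    // Keep Seattle users with at least one order on or after the date
    $match: {
      city: "Seattle",
      "orders.date": { $gte: "2024-01-01" }
    }
  },
  {
    // Emit one document per order
    $unwind: "$orders"
  },
  {
    // Drop the older orders that came along with each matched user
    $match: { "orders.date": { $gte: "2024-01-01" } }
  },
  {
    $group: {
      _id: "$_id",
      name: { $first: "$name" },
      total_spent: { $sum: "$orders.total" },
      order_count: { $sum: 1 }
    }
  },
  { $sort: { total_spent: -1 } }
])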
Gremlin Graph Traversal
Graph traversals for social networks and recommendations:
// Find friends of friends who like at least one product that user123 also likes
g.V('user123')
  .out('follows')
  .out('follows')
  .where(
    out('likes')
      .in('likes')
      .hasId('user123')
  )
  .dedup()
  .values('name')
  .limit(10)
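The traversal's logic can be followed step by step in a plain Python sketch over adjacency dictionaries. The graph data and the function name `fof_with_shared_likes` are hypothetical; the code only illustrates what the Gremlin steps select and is not a Cosmos DB client call.

```python
follows = {"user123": ["a"], "a": ["b", "c"], "b": [], "c": []}
likes = {"user123": {"p1"}, "b": {"p1", "p2"}, "c": {"p3"}}
names = {"user123": "Uma", "a": "Al", "b": "Bea", "c": "Cal"}


def fof_with_shared_likes(user_id, follows, likes, names, limit=10):
    my_likes = likes.get(user_id, set())
    seen, result = set(), []
    for friend in follows.get(user_id, []):         # out('follows')
        for fof in follows.get(friend, []):         # out('follows')
            if fof in seen:                         # dedup()
                continue
            # where(out('likes').in('likes').hasId(user)):
            # keep vertices that like something the user also likes.
            if likes.get(fof, set()) & my_likes:
                seen.add(fof)
                result.append(names[fof])           # values('name')
                if len(result) == limit:            # limit(10)
                    return result
    return result


print(fof_with_shared_likes("user123", follows, likes, names))
```

Here only "b" shares a liked product with user123, so "c" is filtered out by the `where` step.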
Cosmos DB Best Practices
✅ Do
• Choose partition keys with high cardinality
• Use session consistency for most applications
• Implement proper retry logic with backoff
• Monitor RU consumption and optimize queries
• Use autoscale for variable workloads
• Use cross-partition queries sparingly
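The retry-with-backoff recommendation above might look like the following minimal sketch. `ThrottledError` is a hypothetical stand-in for a rate-limited (HTTP 429) response and `with_retries` is an illustrative helper, not an SDK function. The official SDKs include their own retry handling for throttled requests, so a wrapper like this would mainly apply to custom clients or to retries beyond the SDK defaults.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical error representing a throttled (429) request."""

    def __init__(self, retry_after_s=0.0):
        super().__init__("request rate too large")
        self.retry_after_s = retry_after_s


def with_retries(operation, max_attempts=5, base_delay_s=0.1,
                 max_delay_s=5.0, sleep=time.sleep):
    """Call operation(), retrying throttled calls with jittered backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except ThrottledError as err:
            if attempt == max_attempts:
                raise                               # out of attempts
            # Exponential backoff cap for this attempt, with full jitter.
            backoff = min(max_delay_s, base_delay_s * 2 ** (attempt - 1))
            # Never wait less than the server-suggested delay.
            delay = max(err.retry_after_s, random.uniform(0, backoff))
            sleep(delay)


# Usage: an operation that is throttled twice before succeeding.
attempts = {"count": 0}


def flaky_read():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ThrottledError(retry_after_s=0.01)
    return {"id": "item-1"}


waits = []
print(with_retries(flaky_read, sleep=waits.append), len(waits))
```

Passing `sleep` as a parameter keeps the helper testable without real delays.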
❌ Don't
• Use sequential or timestamp partition keys
• Ignore hot partition warnings
• Over-provision RUs for steady-state workloads
• Store large documents (>100KB) without considering RU cost and the 2 MB item size limit
• Use strong consistency unless absolutely required
• Mix transactional and analytical queries
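Why timestamp-style partition keys lead to hot partitions can be shown with a small simulation. This sketch assumes a simple SHA-256 hash modulo the partition count as a stand-in for Cosmos DB's own hash partitioning; the helper names and the sample writes are hypothetical.

```python
import hashlib
from collections import Counter


def partition_for(key, partition_count):
    # Stand-in hash placement; the real service uses its own scheme.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


def load_by_partition(keys, partition_count=5):
    """Count how many writes land on each partition."""
    return Counter(partition_for(k, partition_count) for k in keys)


# 10,000 writes made on the same day by 1,000 distinct users.
writes = [{"user_id": f"user-{i % 1000}", "day": "2024-01-01"}
          for i in range(10_000)]

by_day = load_by_partition(w["day"] for w in writes)
by_user = load_by_partition(w["user_id"] for w in writes)

print("partition key = day:    ", dict(by_day))
print("partition key = user_id:", dict(by_user))
```

With the day as the key, every write for that day lands on a single partition, while the higher-cardinality `user_id` spreads the same writes across all partitions.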