Database per Service Pattern in Microservices - Scalability & System Analysis

| Scale | Services | Databases | Data Volume | Traffic | Complexity |
|---|---|---|---|---|---|
| 100 users | 5-10 | 5-10 small DBs | Low (MBs) | Low (a few QPS) | Simple coordination |
| 10,000 users | 10-20 | 10-20 medium DBs | Medium (GBs) | Medium (hundreds of QPS) | Service discovery, monitoring |
| 1,000,000 users | 20-50 | 20-50 large DBs | High (TBs) | High (thousands of QPS) | Complex data consistency, backups |
| 100,000,000 users | 50+ | 50+ very large DBs | Very high (PBs) | Very high (tens of thousands of QPS) | Advanced sharding, cross-service sync |
At small scale, each service's database handles requests comfortably. As users grow into the thousands or millions, the first bottleneck is typically the database instance behind a service: a single well-tuned instance often tops out at roughly 5,000-10,000 QPS. As traffic grows past that point, the database's CPU, memory, or disk I/O saturates first.
Cross-service data consistency and communication overhead also grow with scale, adding latency and operational complexity. Common strategies for scaling the pattern:
- Horizontal scaling: Add more instances of the service and database replicas to distribute load.
- Read replicas: Use read-only replicas to offload read queries from the primary database.
- Database sharding: Split large databases by user ID or other keys to spread data and queries.
- Caching: Use in-memory caches (like Redis) to reduce database hits for frequent reads.
- Service isolation: Keep databases per service to avoid contention and allow independent scaling.
- Asynchronous communication: Use message queues to decouple services and reduce synchronous DB calls.
- Monitoring and automation: Track DB performance and automate scaling decisions.
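The sharding strategy above hinges on routing each request to the right database by key. A minimal sketch of hash-based shard routing by user ID follows; `NUM_SHARDS` and `shard_for_user` are illustrative names, and real deployments usually map many virtual shards onto fewer physical databases to make resharding easier.

```python
import hashlib

# Hypothetical shard count chosen for illustration; production systems
# typically pick a number with headroom for future resharding.
NUM_SHARDS = 16

def shard_for_user(user_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a user ID to a shard index using a stable hash.

    A stable hash (not Python's built-in hash(), which is salted per
    process) ensures every service instance routes the same user to
    the same shard.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

# The same user always lands on the same shard, so reads and writes
# for that user stay on one database:
assert shard_for_user("user-12345") == shard_for_user("user-12345")
```

Modulo hashing is the simplest scheme; consistent hashing or a lookup table is preferable when shards are added or removed, since plain modulo remaps most keys on any change to `num_shards`.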
Assuming 1 million users generating 10,000 QPS total:
- With 20 services, each service DB averages ~500 QPS (roughly 200-500 QPS given uneven traffic distribution), well within typical single-instance limits.
- Storage per DB: 1 TB for user data and logs.
- Network bandwidth: 10,000 QPS * 1 KB/request = ~10 MB/s total, manageable on 1 Gbps links.
- Cache layer reduces DB load by 30-50%, saving CPU and I/O.
- Adding replicas and shards increases infrastructure cost but improves availability and performance.
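The estimates above can be checked with a quick back-of-envelope calculation. All inputs below are the assumptions stated in the text (10,000 QPS, 20 services, 1 KB requests, a 30-50% cache hit rate), not measured values:

```python
# Back-of-envelope check of the capacity estimates above.
total_qps = 10_000
num_services = 20
request_size_kb = 1
cache_hit_ratio = 0.4        # midpoint of the assumed 30-50% range

qps_per_service = total_qps / num_services             # average per-service load
bandwidth_mb_s = total_qps * request_size_kb / 1024    # aggregate network throughput
db_qps_after_cache = qps_per_service * (1 - cache_hit_ratio)  # DB load behind cache

print(f"Per-service load:   {qps_per_service:.0f} QPS")
print(f"Network bandwidth:  {bandwidth_mb_s:.1f} MB/s")
print(f"DB load with cache: {db_qps_after_cache:.0f} QPS")
```

This confirms ~500 QPS per service, just under 10 MB/s aggregate bandwidth, and ~300 QPS actually reaching each database once the cache absorbs 40% of reads.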
When discussing scalability for database per service pattern, start by explaining the isolation benefits. Then identify the database as the first bottleneck as traffic grows. Discuss how to horizontally scale databases with replicas and sharding. Mention caching and asynchronous communication to reduce load. Finally, highlight monitoring and automation for proactive scaling.
Question: Your database handles 1000 QPS. Traffic grows 10x to 10,000 QPS. What do you do first and why?
Answer: First, check the read/write mix; for a typical read-heavy workload, add read replicas to offload reads from the primary. This is the quickest, lowest-risk change and immediately relieves CPU and I/O pressure. If write volume also grows, shard the database by user ID (or another key) to distribute write load. Adding a cache (e.g. Redis) for frequent reads further cuts database hits. Together these steps address the database bottleneck without re-architecting the service.