| Scale | Users | Groups | Expenses | Key Changes |
|---|---|---|---|---|
| Small | 100 | 20 | 500 | Simple in-memory or single database instance, no caching needed |
| Medium | 10,000 | 2,000 | 50,000 | Database load increases, need read replicas and caching for frequent queries |
| Large | 1,000,000 | 200,000 | 5,000,000 | Database sharding, horizontal scaling of app servers, caching layers, async processing |
| Very Large | 100,000,000 | 20,000,000 | 500,000,000 | Multi-region deployment, advanced sharding, CDN for static data, event-driven architecture |
User, Group, Expense classes in LLD - Scalability & System Analysis
The database is the first bottleneck because every read and write of user, group, and expense data goes through it. At small scale a single instance copes, but as users and expenses grow, the query load and write volume exceed what one instance can serve.
- Database: Add read replicas to handle read-heavy queries like fetching user groups and expenses.
- Caching: Use in-memory caches (e.g., Redis) for frequently accessed data like user profiles and group memberships.
- Sharding: Partition the database by user ID or group ID to distribute write and read load across multiple servers.
- Horizontal Scaling: Add more application servers behind a load balancer to handle increased traffic.
- Async Processing: Use message queues for expense processing to reduce synchronous load.
- CDN: Serve static assets related to users or groups through a CDN to reduce bandwidth and latency.
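Two of the ideas above, sharding by ID and caching frequent reads, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a production design: the shard count, the `CacheAside` class, and the `load_group_members` loader are hypothetical names, and a plain dictionary stands in for Redis.

```python
import hashlib

NUM_SHARDS = 4  # assumed shard count, for illustration only


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a user or group ID to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class CacheAside:
    """Cache-aside reads: check the cache first, fall back to the loader."""

    def __init__(self, loader):
        self._loader = loader  # hypothetical function that queries the database
        self._cache = {}       # plain dict standing in for Redis
        self.misses = 0        # counts how often the loader was called

    def get(self, key):
        if key in self._cache:
            return self._cache[key]
        self.misses += 1
        value = self._loader(key)
        self._cache[key] = value
        return value

    def invalidate(self, key):
        """Drop a cached entry after the underlying data changes."""
        self._cache.pop(key, None)


def load_group_members(group_id: str) -> list:
    # Hypothetical stand-in for a database query.
    return [f"user-{group_id}-{i}" for i in range(3)]


if __name__ == "__main__":
    cache = CacheAside(load_group_members)
    cache.get("g1")
    cache.get("g1")  # second call is served from the cache
    print("loader calls:", cache.misses)
    print("shard for user-42:", shard_for("user-42"))
```

A simple modulo scheme like this remaps most keys when the shard count changes, so it suits a sketch better than a system that expects to add shards often.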
- At 1M users, assuming each user generates 5 expense requests per day, the average load is about 58 requests per second, or roughly 60 (5M requests/day ÷ 86,400 seconds).
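- The database needs to handle ~100 QPS for reads and writes combined. This is an average that a single instance can usually serve, so replicas and sharding are driven by peak traffic and data growth rather than by this figure alone.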
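- Storage: At ~1 KB per expense record, 5M expenses take ~5 GB. This is manageable but grows linearly with the number of expenses; if every one of the 5M daily requests created a new expense, storage would grow by about 5 GB per day.
- Network bandwidth: At 1 KB per request, 60 QPS is about 60 KB/s, which is low but grows with the user base.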
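The estimates above can be reproduced with a short calculation. The sketch below uses the figures from the list (1M users, 5 requests per user per day, 1 KB per request and per record, 5M stored expenses) and assumes 1 KB = 1,000 bytes; the function name `estimate` is illustrative.

```python
SECONDS_PER_DAY = 86_400


def estimate(users, requests_per_user_per_day, bytes_per_request,
             bytes_per_record, records):
    """Return average load, bandwidth, and storage for the given inputs."""
    requests_per_day = users * requests_per_user_per_day
    avg_qps = requests_per_day / SECONDS_PER_DAY
    return {
        "requests_per_day": requests_per_day,
        "avg_qps": avg_qps,
        # Decimal units assumed: 1 KB = 1,000 bytes, 1 GB = 1e9 bytes.
        "bandwidth_kb_per_s": avg_qps * bytes_per_request / 1000,
        "storage_gb": records * bytes_per_record / 1e9,
    }


if __name__ == "__main__":
    result = estimate(
        users=1_000_000,
        requests_per_user_per_day=5,
        bytes_per_request=1000,
        bytes_per_record=1000,
        records=5_000_000,
    )
    for name, value in result.items():
        print(f"{name}: {value:,.2f}")
```

These are averages over a full day. Peak load is typically higher, so the peak-to-average ratio is a separate assumption to state when sizing a system.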
Start by identifying the key entities (User, Group, Expense) and their relationships. Discuss expected load and data growth. Identify the first bottleneck (usually the database). Propose scaling solutions step by step: caching, read replicas, sharding, horizontal scaling. Mention the trade-offs of each step and how you would monitor the system.
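As a starting point for the first step, the entities and their relationships might be sketched as below. The field names, the equal-split rule, and the membership check are assumptions made for illustration; the source does not specify them.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass
class User:
    user_id: str
    name: str


@dataclass
class Expense:
    expense_id: str
    paid_by: str      # user_id of the payer
    amount: Decimal
    split_among: list  # user_ids sharing the cost

    def __post_init__(self):
        if not self.split_among:
            raise ValueError("an expense needs at least one participant")

    def share_per_user(self) -> Decimal:
        """Equal split across participants (an assumed rule for this sketch)."""
        return self.amount / len(self.split_among)


@dataclass
class Group:
    group_id: str
    name: str
    member_ids: set = field(default_factory=set)
    expenses: list = field(default_factory=list)

    def add_member(self, user: User) -> None:
        self.member_ids.add(user.user_id)

    def add_expense(self, expense: Expense) -> None:
        # Everyone involved in the expense must belong to the group.
        participants = set(expense.split_among) | {expense.paid_by}
        if not participants <= self.member_ids:
            raise ValueError("expense involves users outside the group")
        self.expenses.append(expense)


if __name__ == "__main__":
    alice, bob = User("u1", "Alice"), User("u2", "Bob")
    trip = Group("g1", "Trip")
    trip.add_member(alice)
    trip.add_member(bob)
    dinner = Expense("e1", paid_by="u1", amount=Decimal("90.00"),
                     split_among=["u1", "u2"])
    trip.add_expense(dinner)
    print(dinner.share_per_user())
```

Storing user IDs rather than object references keeps the relationships easy to map onto database keys, which matters once the data is partitioned by user ID or group ID.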
Your database handles 1,000 QPS. Traffic grows 10x to 10,000 QPS. What do you do first?
Answer: Add read replicas to distribute read load and implement caching for frequent queries. Then consider sharding the database to handle increased write volume.
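The read-replica part of this answer can be illustrated with a small router that sends writes to the primary and spreads reads across replicas in round-robin order. The class name `ReadWriteRouter` and the node labels are hypothetical, and the sketch ignores replication lag, which a real deployment has to account for.

```python
import itertools


class ReadWriteRouter:
    """Route writes to the primary and reads to replicas in turn."""

    def __init__(self, primary, replicas):
        self._primary = primary
        # With no replicas configured, reads fall back to the primary.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, is_write: bool) -> str:
        if is_write or self._replicas is None:
            return self._primary
        return next(self._replicas)


if __name__ == "__main__":
    router = ReadWriteRouter("primary", ["replica-1", "replica-2"])
    print(router.route(is_write=True))
    for _ in range(3):
        print(router.route(is_write=False))
```

Replicas only relieve read traffic. Every write still lands on the primary, which is why sharding is the follow-up step once write volume grows.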