| Scale | Users | Requests per Second (RPS) | Data Storage | System Changes |
|---|---|---|---|---|
| Small | 100 users | ~10 RPS | Few MBs (policy rules, logs) | Single server, simple DB, no caching |
| Medium | 10,000 users | ~1,000 RPS | GBs (policy versions, user requests) | DB read replicas, caching, load balancer |
| Large | 1,000,000 users | ~50,000 RPS | TBs (logs, audit trails, refunds) | Sharded DB, distributed cache, microservices |
| Very Large | 100,000,000 users | ~5,000,000 RPS | Petabytes (archived data, analytics) | Multi-region deployment, CDN, event-driven architecture |
Cancellation and refund policy in LLD - Scalability & System Analysis
At small to medium scale, the database is the first bottleneck. Every cancellation and refund request needs consistent reads and writes against policy data and transaction records, so the database has to absorb many concurrent queries and updates, especially during peak refund periods. The following measures address that bottleneck, roughly in the order they become necessary:
- Database Scaling: Use read replicas to distribute read load. Implement connection pooling to manage DB connections efficiently.
- Caching: Cache static policy rules and frequently accessed refund statuses to reduce DB hits.
- Horizontal Scaling: Add more application servers behind a load balancer to handle increased request volume.
- Sharding: Partition the database by user ID or region to distribute write load and improve performance.
- Event-Driven Architecture: Use message queues to process refunds asynchronously, reducing synchronous DB load.
- Multi-Region Deployment: Deploy services closer to users to reduce latency and distribute traffic.

Rough capacity estimates:

- At 10,000 users and ~1,000 RPS, assuming each request is 1 KB, the required bandwidth is 1,000 × 1 KB = ~1 MB/s.
- Storage for policy data and logs grows from a few MBs at 100 users to GBs at 10,000 users and TBs at 1,000,000 users.
- Refund processing requires additional compute resources; asynchronous processing reduces peak load.
- If each request issues about one query, the database must handle up to ~1,000 QPS at medium scale; plan for replicas and sharding accordingly.
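
The bandwidth arithmetic above can be repeated for every tier in the scale table. The sketch below is only a back-of-envelope helper: the tier figures come from the table, while applying the 1 KB request size to every tier (the text states it only for the medium tier) and the helper name `bandwidth_mb_per_s` are assumptions made for illustration.

```python
# Back-of-envelope bandwidth estimate per tier.
# ASSUMPTION: every request is 1 KB at every tier; the source only states
# this for the medium tier.
REQUEST_SIZE_KB = 1

# (tier name, users, approximate RPS) copied from the scale table.
TIERS = [
    ("Small", 100, 10),
    ("Medium", 10_000, 1_000),
    ("Large", 1_000_000, 50_000),
    ("Very Large", 100_000_000, 5_000_000),
]


def bandwidth_mb_per_s(rps: int, request_size_kb: float = REQUEST_SIZE_KB) -> float:
    """Return inbound bandwidth in MB/s, using 1 MB = 1,000 KB."""
    return rps * request_size_kb / 1_000


if __name__ == "__main__":
    for name, users, rps in TIERS:
        print(f"{name}: {users:,} users, {rps:,} RPS, "
              f"{bandwidth_mb_per_s(rps):,.2f} MB/s")
```

Under these assumptions the medium tier reproduces the ~1 MB/s figure, and the very large tier comes out at roughly 5 GB/s, which is one reason a single region stops being practical at that scale.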
Start by identifying key components: user requests, policy data, refund transactions. Discuss expected load and data growth. Identify bottlenecks early, usually the database. Propose scaling solutions step-by-step: caching, read replicas, horizontal scaling, sharding, and asynchronous processing. Always justify why each solution fits the bottleneck.
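
Of the scaling steps listed above, sharding by user ID is the one most often asked about in detail. A minimal sketch of hash-based shard routing follows. The shard count and the function name `shard_for_user` are hypothetical, and simple modulo routing is chosen here only for brevity; it is not the only option.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical shard count, chosen only for illustration


def shard_for_user(user_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a user ID to a shard index using a stable hash.

    hashlib is used instead of the built-in hash() because hash() on
    strings is randomized per process, which would route the same user
    to different shards after a restart.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    # Take the first 8 bytes as an unsigned integer, then reduce modulo
    # the number of shards.
    return int.from_bytes(digest[:8], "big") % num_shards


if __name__ == "__main__":
    for uid in ("user-1", "user-2", "user-3"):
        print(uid, "-> shard", shard_for_user(uid))
```

One trade-off worth mentioning in an interview: with modulo routing, changing the number of shards remaps most users, so a design that expects to add shards often may prefer consistent hashing or a lookup table.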
Your database handles 1,000 QPS. Traffic grows 10x to 10,000 QPS. What do you do first?
Answer: Add read replicas to distribute read queries and implement caching for static policy data to reduce database load before considering sharding or more complex solutions.
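
The caching half of that answer can be sketched as a cache-aside wrapper with a time-to-live. This is an illustration under assumptions, not a production design: `PolicyCache`, `load_policy_from_db`, the 300-second TTL, and the sample policy fields are all hypothetical, and a real deployment would more likely use a shared cache than per-process memory.

```python
import time
from typing import Callable, Dict, Tuple


class PolicyCache:
    """Cache-aside wrapper with a TTL around a policy loader (hypothetical)."""

    def __init__(self, loader: Callable[[str], dict],
                 ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic):
        self._loader = loader          # function that reads from the database
        self._ttl = ttl_seconds        # how long an entry stays valid
        self._clock = clock            # injectable so expiry can be tested
        self._entries: Dict[str, Tuple[float, dict]] = {}

    def get(self, policy_id: str) -> dict:
        """Return the cached policy, loading it on a miss or after expiry."""
        now = self._clock()
        entry = self._entries.get(policy_id)
        if entry is not None and entry[0] > now:
            return entry[1]            # cache hit: no database access
        value = self._loader(policy_id)
        self._entries[policy_id] = (now + self._ttl, value)
        return value

    def invalidate(self, policy_id: str) -> None:
        """Drop an entry, for example after a policy version is published."""
        self._entries.pop(policy_id, None)


def load_policy_from_db(policy_id: str) -> dict:
    """Stand-in for a real database read (ideally served by a read replica)."""
    return {"policy_id": policy_id, "refund_window_days": 30}


if __name__ == "__main__":
    cache = PolicyCache(load_policy_from_db)
    print(cache.get("standard"))
```

Because policy rules change rarely, repeated reads within the TTL are served from memory and only the first read after expiry or invalidation reaches the database.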