| Metric | 100 Players | 10,000 Players | 1,000,000 Players | 100,000,000 Players |
|---|---|---|---|---|
| Concurrent Games | ~10-20 games | ~1,000 games | ~100,000 games | ~10,000,000 games |
| Turn Requests per Second | ~50-100 TPS | ~5,000 TPS | ~500,000 TPS | ~50,000,000 TPS |
| State Storage Size | Small (MBs) | Medium (GBs) | Large (TBs) | Very Large (PBs) |
| Latency Requirement | Low (100ms) | Low (100ms) | Very Low (50ms) | Very Low (50ms) |
| System Complexity | Simple queue or lock | Distributed locks, message queues | Sharded state, event sourcing | Global coordination, multi-region sync |

Player turn management in LLD - Scalability & System Analysis
The first bottleneck is the store that holds turn state. At small scale, a single server can handle turn updates and locks in memory. As players and games grow, the database or state store that tracks whose turn it is becomes overwhelmed by concurrent updates and queries. The result is delayed turns and, when two updates race, an inconsistent turn order.
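To make the small-scale case concrete, here is a minimal single-process sketch in which one lock per game guards the turn pointer. The names `Game`, `TurnManager`, and `take_turn` are hypothetical and chosen only for illustration; a real service would also need persistence and timeouts.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class Game:
    players: list[str]
    turn_index: int = 0  # index into players of whoever moves next
    lock: threading.Lock = field(default_factory=threading.Lock)


class TurnManager:
    """In-memory turn tracking for a single server (illustrative only)."""

    def __init__(self) -> None:
        self._games: dict[str, Game] = {}

    def create_game(self, game_id: str, players: list[str]) -> None:
        self._games[game_id] = Game(players=list(players))

    def current_player(self, game_id: str) -> str:
        game = self._games[game_id]
        with game.lock:
            return game.players[game.turn_index]

    def take_turn(self, game_id: str, player_id: str) -> bool:
        """Advance the turn if it is player_id's turn; reject otherwise."""
        game = self._games[game_id]
        with game.lock:
            if game.players[game.turn_index] != player_id:
                return False  # out-of-turn request, state unchanged
            game.turn_index = (game.turn_index + 1) % len(game.players)
            return True
```

Because the lock and the state live in one process, this approach stops working as soon as a second application server handles requests for the same game, which is where the bottleneck described above appears.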
- Horizontal scaling: Add more application servers to handle turn requests concurrently.
- Distributed locking: Use distributed locks or consensus (e.g., Redis Redlock, ZooKeeper) to manage turn order safely across servers.
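- Sharding: Partition games across multiple databases or caches to spread load. Game ID is usually the better shard key for turn state, because it keeps each game's state on one shard; player ID suits per-player data such as profiles.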
- Caching: Use in-memory caches (Redis, Memcached) to quickly read/write turn state and reduce database load.
- Event sourcing: Store turn events in an append-only log to rebuild state and support replay or recovery.
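- CDN and edge computing: Terminate player connections at edge or regional servers so that turn notifications travel a shorter path to players worldwide. A CDN mainly helps with static assets rather than per-game turn state.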
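Two of the ideas in the list above, sharding and event sourcing, can be sketched in a few lines. This is an illustration under assumptions, not a prescribed design: `shard_for_game`, `TurnTaken`, and `replay_current_player` are hypothetical names, the shard key is assumed to be the game ID, and plain modulo hashing is used for brevity (changing the shard count would remap most games, which consistent hashing is typically used to avoid).

```python
import hashlib
from dataclasses import dataclass


def shard_for_game(game_id: str, num_shards: int) -> int:
    """Map a game ID to a shard using a hash that is stable across processes."""
    digest = hashlib.sha256(game_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


@dataclass(frozen=True)
class TurnTaken:
    game_id: str
    player_id: str
    seq: int  # position of this event in the game's append-only log


def replay_current_player(players: list[str], events: list[TurnTaken]) -> str:
    """Rebuild whose turn it is by replaying the event log from the start."""
    turn_index = 0
    for expected_seq, event in enumerate(events):
        if event.seq != expected_seq:
            raise ValueError(f"gap or reorder at seq {event.seq}")
        if event.player_id != players[turn_index]:
            raise ValueError(f"out-of-turn event at seq {event.seq}")
        turn_index = (turn_index + 1) % len(players)
    return players[turn_index]
```

Python's built-in `hash()` is avoided here because string hashes are randomized per process, so different servers would disagree about which shard owns a game.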
- At 10,000 players with ~5,000 TPS, a single Redis instance (handling ~100K ops/sec) can support turn state caching.
- Database writes for turn updates at 5,000 QPS require connection pooling and, where possible, batched writes to avoid overload. Read replicas offload turn-state reads but do not add write capacity.
- Storage for turn history: assuming 1KB per turn event, 5,000 TPS means ~5MB/s or ~432GB/day of data.
- Network bandwidth: 5,000 TPS with 1KB payload = ~5MB/s, well within 1Gbps network capacity.
- At 1M players, sharding and multiple Redis clusters are needed to handle ~500,000 TPS.
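The arithmetic behind these estimates can be reproduced with a short script. It uses the figures stated above (1KB per turn event, ~100K ops/sec per Redis instance); the 50% headroom in `redis_instances_needed` is an added assumption, not a number from the text, and both function names are hypothetical.

```python
import math


def estimate_throughput(tps: int, event_bytes: int = 1_000) -> dict[str, float]:
    """Convert turns per second into storage and bandwidth figures."""
    bytes_per_sec = tps * event_bytes
    return {
        "mb_per_sec": bytes_per_sec / 1e6,
        "gb_per_day": bytes_per_sec * 86_400 / 1e9,  # 86,400 seconds per day
        "mbit_per_sec": bytes_per_sec * 8 / 1e6,
    }


def redis_instances_needed(
    tps: int, ops_per_instance: int = 100_000, headroom: float = 0.5
) -> int:
    """Instances required if each runs at `headroom` of its rated capacity."""
    return math.ceil(tps / (ops_per_instance * headroom))


if __name__ == "__main__":
    for tps in (5_000, 500_000):
        print(tps, estimate_throughput(tps), redis_instances_needed(tps))
```

At 5,000 TPS this gives 5 MB/s, 432 GB/day, and 40 Mbit/s, matching the figures above. At 500,000 TPS the same formulas give 500 MB/s and about 43 TB/day of turn history, which is why sharding becomes necessary at that scale.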
Start by explaining the core challenge: managing turn order consistently and quickly. Then discuss how load grows with players and games. Identify the bottleneck (state storage and locking). Propose scaling solutions step-by-step: caching, distributed locks, sharding. Mention trade-offs like consistency vs latency. Finish with monitoring and fallback plans.
Your database handles 1,000 QPS for turn updates. Traffic grows 10x to 10,000 QPS. What do you do first?
Answer: Cache turn state in memory and add read replicas so that reads stop hitting the primary database. Because turn updates are writes, which replicas do not absorb, also plan to shard the data by game ID to distribute them. Avoid relying on vertical scaling alone, as it has limits.
