| Scale | Board Size | Number of Players | Game State Size | Server Load | Latency Impact |
|---|---|---|---|---|---|
| 100 users | 3x3 to 5x5 | 2-4 players | Small (few KB) | Low, single server sufficient | Minimal |
| 10,000 users | 5x5 to 10x10 | 2-6 players | Medium (tens of KB) | Moderate, multiple servers with load balancer | Noticeable if unoptimized |
| 1,000,000 users | 10x10 to 20x20 | 2-10 players | Large (hundreds of KB) | High, horizontal scaling needed | Needs optimization (caching, async updates) |
| 100,000,000 users | 20x20+ | 10+ players | Very large (MBs per game) | Very high, distributed system with sharding | Critical to optimize network and data flow |
Extensibility (NxN board, multiple players) in LLD - Scalability & System Analysis
The first bottleneck is game state storage and synchronization. An NxN board has N² cells, so the state grows quadratically with board size, and each additional player adds more moves to persist and more clients to notify. This strains the database and network bandwidth, which delays state updates and degrades the player experience.
- Horizontal scaling: Add more game servers behind a load balancer to handle more concurrent games and players.
- State partitioning (sharding): Split game states by game ID or player groups to distribute database load.
- Caching: Use in-memory caches (e.g., Redis) for frequently accessed game state to reduce DB hits.
- Event-driven updates: Use message queues or pub/sub to asynchronously update players, reducing synchronous load.
- Efficient data structures: Store only diffs or changes instead of full board state to reduce data size.
- CDN and edge computing: Serve static assets, and where possible game logic, from locations closer to players to reduce latency.
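As an illustration of the "store only diffs" idea from the list above, here is a minimal sketch. The `Board` class, its method names, and the diff format are hypothetical and chosen only for this example; a real game would add turn order, win detection, and persistence.

```python
from dataclasses import dataclass, field


@dataclass
class Board:
    """Hypothetical NxN board; occupied cells map (row, col) -> player id."""
    size: int
    cells: dict[tuple[int, int], int] = field(default_factory=dict)

    def apply_move(self, row: int, col: int, player_id: int) -> dict:
        """Validate and apply a move, returning only the diff to broadcast."""
        if not (0 <= row < self.size and 0 <= col < self.size):
            raise ValueError("move is outside the board")
        if (row, col) in self.cells:
            raise ValueError("cell is already occupied")
        self.cells[(row, col)] = player_id
        # The diff is one cell, regardless of how large the board is.
        return {"row": row, "col": col, "player": player_id}

    def apply_diff(self, diff: dict) -> None:
        """Replay a diff received from the server on a client-side copy."""
        self.cells[(diff["row"], diff["col"])] = diff["player"]


# The server applies the move; clients receive the small diff, not the board.
server = Board(size=20)
client = Board(size=20)
diff = server.apply_move(3, 4, player_id=1)
client.apply_diff(diff)
```

Because only occupied cells are stored and only the changed cell is sent, the per-move payload stays constant as the board grows from 3x3 to 20x20 and beyond.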
Assuming 1 million users playing simultaneously with an average of 5 players per game:
- Games running concurrently: ~200,000 (1,000,000 / 5)
- Requests per second (QPS): Each player sends ~1 move per 5 seconds -> 200,000 moves/sec total
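- Database QPS: 200,000 updates/sec is far too high for a single DB instance (assuming a ceiling of ~10,000 QPS per instance), so writes would need to be spread across roughly 20 or more instances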
- Bandwidth: Each update ~1 KB -> ~200 MB/s of update traffic; broadcasting each update to the other 4 players in a game would push outbound bandwidth toward ~800 MB/s
- Storage: Each game state ~100 KB, 200,000 games -> 20 GB in memory or fast storage
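The arithmetic above can be reproduced with a short script, which makes it easy to re-run the estimate at other scales. This is a rough sketch: the function name and parameters are made up for illustration, units are decimal (1 MB = 1,000 KB), and the per-instance database ceiling is the same rough assumption used in the list.

```python
import math


def estimate_load(users, players_per_game, seconds_per_move,
                  update_kb, state_kb, db_max_qps):
    """Back-of-the-envelope load estimate; every input is an assumption."""
    games = users // players_per_game
    moves_per_sec = users / seconds_per_move
    return {
        "games": games,
        "moves_per_sec": moves_per_sec,
        # Minimum number of DB instances needed to absorb the write rate.
        "db_instances": math.ceil(moves_per_sec / db_max_qps),
        # Counts one copy of each update; fan-out to several
        # recipients would multiply this figure.
        "update_mb_per_sec": moves_per_sec * update_kb / 1000,
        "state_gb": games * state_kb / 1_000_000,
    }


estimate = estimate_load(
    users=1_000_000,
    players_per_game=5,
    seconds_per_move=5,
    update_kb=1,
    state_kb=100,
    db_max_qps=10_000,
)
print(estimate)
```

With these inputs the script gives 200,000 concurrent games, 200,000 moves/sec, 200 MB/s of update traffic, 20 GB of game state, and a minimum of 20 database instances to absorb the writes.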
Start by defining the core components: game state, players, and board size. Discuss how increasing board size and players affects data size and update frequency. Identify the database and network as bottlenecks early. Propose clear scaling strategies like sharding and caching. Always relate solutions back to the bottleneck you identified.
Your database handles 1,000 QPS. Traffic grows 10x to 10,000 QPS. What do you do first?
Answer: Introduce read replicas and caching to reduce load on the primary database. Also consider sharding the game state by game ID to distribute writes across multiple database instances.
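Sharding by game ID can be sketched as follows. This is an illustrative example only: the shard count, the function names, and the in-memory dictionaries standing in for database instances are all assumptions, not part of any particular database's API.

```python
import hashlib

NUM_SHARDS = 20  # assumed shard count, for illustration only


def shard_for_game(game_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a game ID to a shard index using a stable hash."""
    # hashlib is used instead of the built-in hash(), which is
    # randomized per process for strings and so is not stable.
    digest = hashlib.sha256(game_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# In-memory dicts stand in for separate database instances.
shards = [dict() for _ in range(NUM_SHARDS)]


def save_state(game_id: str, state: dict) -> None:
    """Write a game's state to the shard that owns its ID."""
    shards[shard_for_game(game_id)][game_id] = state


def load_state(game_id: str):
    """Read a game's state back from the same shard."""
    return shards[shard_for_game(game_id)].get(game_id)


save_state("game-42", {"board_size": 10, "turn": 3})
print(load_state("game-42"))
```

All reads and writes for one game land on the same shard, so a move never needs a cross-shard transaction. Note that plain modulo hashing remaps most games when the shard count changes; consistent hashing is the usual alternative when shards are added or removed often.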
