Move validation in LLD - Scalability & System Analysis

Scalability Analysis - Move validation
Growth Table: Move Validation System
| Users | Requests per Second (RPS) | Validation Complexity | Latency Requirements | Storage Needs |
|---|---|---|---|---|
| 100 | ~50 | Simple synchronous validation | Low latency (~100 ms) | Minimal, mostly in-memory |
| 10,000 | ~5,000 | Moderate complexity, caching possible | Low latency (~50-100 ms) | Moderate, some persistent logs |
| 1,000,000 | ~500,000 | High complexity, distributed validation | Very low latency (~20-50 ms) | High, distributed storage and caching |
| 100,000,000 | ~50,000,000 | Very high complexity, sharded and cached | Ultra low latency (~10-20 ms) | Very high, multi-region storage and caching |
First Bottleneck

The first bottleneck is CPU and memory on the application servers that run the validation logic. As the number of users and requests grows, synchronous move validation consumes more CPU cycles and memory, which raises latency and causes requests to queue.

At medium scale (around the 10,000-user, ~5,000 RPS tier), the database or state store that holds the game state needed for validation also becomes a bottleneck because of the high volume of reads and writes.
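The queuing effect described above can be illustrated with a small sketch. It assumes a single validation worker modeled as an M/M/1 queue, which is a simplification and not something the analysis itself specifies; the function name and the example rates are hypothetical. In that model the mean time a request spends in the system is 1 / (service rate - arrival rate), so latency climbs sharply as the server nears saturation.

```python
def mean_response_ms(arrival_rps: float, service_rps: float) -> float:
    """Mean time in system (queueing + service) for an M/M/1 queue, in ms.

    arrival_rps: incoming validation requests per second
    service_rps: requests per second one worker can validate when busy
    Both values are illustrative assumptions, not measurements.
    """
    if arrival_rps >= service_rps:
        # The queue grows without bound once the worker is saturated.
        return float("inf")
    return 1000.0 / (service_rps - arrival_rps)


if __name__ == "__main__":
    # Hypothetical worker that can validate 1,000 moves per second.
    for load in (500, 900, 990, 1000):
        print(load, "RPS ->", mean_response_ms(load, 1000), "ms")
```

Under these assumptions, going from 50% to 99% utilization multiplies mean latency by 50, which is why capacity is usually added well before the servers are fully busy.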

Scaling Solutions
  • Horizontal scaling: Add more application servers behind a load balancer to distribute validation requests.
  • Caching: Cache frequently accessed game state to reduce database hits during validation.
  • Asynchronous validation: For less critical moves, validate asynchronously to reduce latency impact.
  • Sharding: Partition game state by user or game session to distribute load across multiple databases.
  • In-memory data stores: Use Redis or a similar store for fast state access during validation.
  • Optimize validation logic: Simplify or precompute rules to reduce CPU usage.
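Two of the solutions above, caching and sharding by game session, can be sketched together. This is a minimal illustration under stated assumptions, not a definitive design: plain dictionaries stand in for the databases and for a cache such as Redis, and the names `shard_for`, `GameStateStore`, and `NUM_SHARDS` are hypothetical. The reads follow the cache-aside pattern: check the cache first, fall back to the owning shard on a miss, then populate the cache.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical shard count


def shard_for(game_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a game session ID to a shard index with a stable hash.

    hashlib is used instead of the built-in hash(), which is salted per
    process for strings and would route the same ID differently on
    different servers.
    """
    digest = hashlib.sha256(game_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class GameStateStore:
    """Cache-aside reads over sharded storage (dicts stand in for databases)."""

    def __init__(self, num_shards: int = NUM_SHARDS):
        self.shards = [dict() for _ in range(num_shards)]
        self.cache = {}    # stand-in for an in-memory store
        self.db_reads = 0  # counts reads that reached a shard

    def save(self, game_id: str, state: dict) -> None:
        shard = self.shards[shard_for(game_id, len(self.shards))]
        shard[game_id] = state
        # Invalidate so the next read does not return stale state.
        self.cache.pop(game_id, None)

    def load(self, game_id: str):
        if game_id in self.cache:
            return self.cache[game_id]  # cache hit, no database access
        self.db_reads += 1
        shard = self.shards[shard_for(game_id, len(self.shards))]
        state = shard.get(game_id)
        if state is not None:
            self.cache[game_id] = state
        return state
```

Sharding by game session rather than by user keeps all state needed to validate a move on one shard, at the cost of cross-shard lookups for per-user queries. A modulo-based mapping like this one also reshuffles most keys when the shard count changes, so a production system would likely need a more stable scheme.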
Back-of-Envelope Cost Analysis
  • Bandwidth: at 10,000 users (~5,000 RPS), assuming each validation request is ~1 KB, the required bandwidth is 5,000 × 1 KB ≈ 5 MB/s.
  • Database: it must handle ~5,000 QPS, which approaches the limit of a single instance on typical hardware, so read replicas or caching are needed.
  • Application servers: assuming each server sustains ~2,000 requests per second, 5,000 RPS needs at least 3 servers (5,000 / 2,000 = 2.5, rounded up).
  • Storage: logs and game state grow with users; estimate ~10 GB/day at 10,000 users, scaling linearly with user count.
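The arithmetic above can be reproduced with a short script. It is a sketch under the figures quoted in this analysis plus a few labeled assumptions: 0.5 RPS per user (the ratio implied by the growth table), 1 KB taken as 1,000 bytes, and the ~2,000 per-server figure treated as requests per second. The function name `estimate_capacity` and its parameters are hypothetical.

```python
import math


def estimate_capacity(
    users: int,
    rps_per_user: float = 0.5,       # ratio implied by the growth table
    request_bytes: int = 1_000,      # ~1 KB per validation request
    rps_per_server: int = 2_000,     # assumed per-server throughput
    gb_per_day_per_10k_users: float = 10.0,  # storage growth, linear in users
) -> dict:
    """Back-of-envelope sizing for a move validation service."""
    rps = users * rps_per_user
    return {
        "rps": rps,
        "bandwidth_mb_per_s": rps * request_bytes / 1_000_000,
        "app_servers": math.ceil(rps / rps_per_server),
        "storage_gb_per_day": gb_per_day_per_10k_users * users / 10_000,
    }


if __name__ == "__main__":
    for users in (10_000, 1_000_000):
        print(users, estimate_capacity(users))
```

Keeping the inputs as named parameters makes it easy to restate the estimate in an interview when the interviewer changes an assumption, such as request size or per-server throughput.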
Interview Tip

Start by defining the scale and requirements clearly. Identify the critical path for move validation and its latency needs. Discuss bottlenecks in CPU, memory, and database. Propose scaling solutions step-by-step, justifying each with the bottleneck it addresses. Mention trade-offs like consistency vs latency. Use real numbers to show understanding.

Self Check

Your database handles 1,000 QPS. Traffic grows 10x to 10,000 QPS. What do you do first?

Answer: Add read replicas and implement caching to reduce direct database load before scaling vertically or sharding. These steps offload reads, so they help most when the workload is read-heavy.
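To see how far replicas and caching can take the answer above, the load on the primary can be estimated with a small model. This is a rough sketch, not a measured result: it assumes all writes go to the primary, cache misses are spread evenly across the primary and its replicas, and the read fraction and hit ratio are hypothetical inputs. The function name `primary_qps` is also hypothetical.

```python
def primary_qps(
    total_qps: float,
    read_fraction: float,    # share of queries that are reads (assumed)
    cache_hit_ratio: float,  # share of reads served from cache (assumed)
    read_replicas: int,      # replicas that share read traffic
) -> float:
    """Estimate the queries per second that still reach the primary."""
    reads = total_qps * read_fraction
    writes = total_qps - reads
    cache_misses = reads * (1.0 - cache_hit_ratio)
    # Reads that miss the cache are split across primary + replicas.
    reads_on_primary = cache_misses / (read_replicas + 1)
    # Writes are not reduced by caching or by read replicas.
    return writes + reads_on_primary


if __name__ == "__main__":
    # Hypothetical workload: 10,000 QPS, 90% reads, 50% cache hits, 2 replicas.
    print(primary_qps(10_000, 0.9, 0.5, 2))
```

In this model the write traffic sets a floor on primary load, which is why vertical scaling or sharding remains the next step once writes alone exceed what one instance can handle.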

Key Result
Move validation systems first hit CPU and memory limits on the application servers as request volume grows; scaling requires horizontally scaled application servers, caching, and database sharding.