
Order tracking state machine in LLD - Scalability & System Analysis

Growth Table: Order Tracking State Machine
| Users | Orders per Second | State Transitions per Second | Storage Size (Order States) | Latency Requirements |
| --- | --- | --- | --- | --- |
| 100 | 10 | 20 | ~10 MB | Low (seconds) |
| 10,000 | 1,000 | 2,000 | ~1 GB | Medium (sub-second) |
| 1,000,000 | 100,000 | 200,000 | ~100 GB | High (milliseconds) |
| 100,000,000 | 10,000,000 | 20,000,000 | ~10 TB | Very high (milliseconds) |
First Bottleneck

The first bottleneck is the database handling state transitions. As order states update frequently, the database must handle many writes and reads per second. At around 10,000 users, the database write throughput and latency become critical because each order state change requires a write and often a read to confirm the current state.
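To make the write pattern concrete, here is a minimal sketch of an order state machine. The state names and allowed transitions are illustrative assumptions, not a fixed specification; the point is that every call to `transition` corresponds to the read-then-write cycle that hits the database.

```python
# Illustrative order states and the legal moves between them.
VALID_TRANSITIONS = {
    "CREATED":   {"CONFIRMED", "CANCELLED"},
    "CONFIRMED": {"SHIPPED", "CANCELLED"},
    "SHIPPED":   {"DELIVERED"},
    "DELIVERED": set(),   # terminal state
    "CANCELLED": set(),   # terminal state
}

def transition(current: str, target: str) -> str:
    """Validate and apply a state transition; raise on an illegal move.

    In a real system this is where the database read (confirm current
    state) and write (persist new state) would happen.
    """
    if target not in VALID_TRANSITIONS.get(current, set()):
        raise ValueError(f"Illegal transition: {current} -> {target}")
    return target
```

Because every order passes through several such transitions, total write volume grows as a multiple of the order rate, which is what makes the database the first component to saturate.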

Scaling Solutions
  • Database Scaling: Use write-optimized databases or NoSQL stores for fast state updates. Add read replicas to handle read-heavy queries.
  • Caching: Cache current order states in memory (e.g., Redis) to reduce database reads.
  • Horizontal Scaling: Add more application servers behind load balancers to handle increased state transition requests.
  • Sharding: Partition orders by user ID or region to distribute database load.
  • Event Sourcing: Use event logs to track state changes asynchronously, reducing direct database writes.
  • CDN: A CDN offloads static content, but it has minimal impact on state machine scaling, since state transitions are dynamic writes that must reach the database.
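Two of the techniques above can be sketched briefly: cache-aside reads of the current order state, and hash-based sharding by user ID. The shard count and the in-memory dict standing in for Redis are illustrative assumptions.

```python
import hashlib

NUM_SHARDS = 4  # illustrative; real deployments size this to write load

def shard_for_user(user_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Route all of a user's orders to the same database shard."""
    digest = hashlib.md5(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards

_state_cache: dict = {}  # stand-in for a Redis cache of current states

def get_order_state(order_id: str, db_lookup) -> str:
    """Cache-aside read: serve from memory, fall back to the database."""
    if order_id in _state_cache:
        return _state_cache[order_id]
    state = db_lookup(order_id)      # database hit only on a cache miss
    _state_cache[order_id] = state
    return state
```

Hashing the user ID keeps a user's orders on one shard, so per-user queries never fan out across shards; the cache absorbs repeated "where is my order?" reads that would otherwise hit the database.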
Back-of-Envelope Cost Analysis
  • At 10,000 users: ~1,000 orders/sec, ~2,000 state transitions/sec.
  • Database must handle ~2,000 writes/sec and ~3,000 reads/sec (state-confirmation reads plus customer tracking queries).
  • Storage: Each order state record ~1 KB, so 1 million orders ~1 GB storage.
  • Network bandwidth: Assuming 1 KB per state update, 2,000 updates/sec = ~2 MB/s bandwidth.
  • At 1 million users: 100,000 orders/sec, 200,000 state transitions/sec, requiring distributed databases and caching.
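The arithmetic behind these estimates can be written out directly. This assumes decimal units (1 KB = 1,000 bytes) and two state transitions per order, matching the figures above.

```python
def transitions_per_sec(orders_per_sec: int, per_order: int = 2) -> int:
    """Each order generates ~2 state transitions on average."""
    return orders_per_sec * per_order

def storage_gb(num_orders: int, bytes_per_record: int = 1_000) -> float:
    """Total state storage at ~1 KB per order record (decimal GB)."""
    return num_orders * bytes_per_record / 1e9

def bandwidth_mb_per_sec(updates_per_sec: int,
                         bytes_per_update: int = 1_000) -> float:
    """Network bandwidth for state updates at ~1 KB each (decimal MB/s)."""
    return updates_per_sec * bytes_per_update / 1e6
```

Plugging in the 10,000-user tier: 1,000 orders/sec yields 2,000 transitions/sec, 1 million stored orders take ~1 GB, and 2,000 updates/sec consume ~2 MB/s of bandwidth.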
Interview Tip

Start by explaining the order state machine and its transitions. Then discuss expected load and how it grows with users. Identify the database as the first bottleneck due to frequent writes. Propose caching and sharding to reduce load. Mention horizontal scaling of app servers. Always justify why each solution fits the bottleneck.

Self Check

Your database handles 1,000 QPS for order state updates. Traffic grows 10x to 10,000 QPS. What do you do first?

Answer: Add read replicas and implement caching to reduce direct database reads. Consider sharding the database to distribute write load. Also, horizontally scale application servers to handle increased requests.

Key Result
The database handling frequent order state updates is the first bottleneck; scaling requires caching, sharding, and horizontal scaling of app servers.