
Order state machine in LLD - Scalability & System Analysis

Scalability Analysis - Order state machine
Growth Table: Order State Machine
| Users / Orders | 100 Orders/day | 10,000 Orders/day | 1,000,000 Orders/day | 100,000,000 Orders/day |
|---|---|---|---|---|
| Order State Transitions | Simple DB updates, single instance | Increased DB writes, possible queueing | High DB load, need async processing | Massive scale, distributed state management |
| System Components | Single app server, monolithic state logic | Multiple app servers, load balancer | Microservices for order states, event-driven | Global distributed services, CQRS, event sourcing |
| Database | Single relational DB instance | Read replicas, connection pooling | Sharding, partitioning by order ID or region | Multi-region DB clusters, eventual consistency |
| Message Queues | Not required, or a simple queue | Basic queues for async state changes | Robust event queues, retry mechanisms | Distributed event streaming platforms (Kafka, Pulsar) |
| Latency | Low, synchronous updates | Moderate, some async processing | Higher, eventual consistency accepted | Optimized with caching and event sourcing |
First Bottleneck

The database becomes the first bottleneck as order volume grows. Each order state change requires a write operation. At around 10,000 orders per day, the DB write load increases significantly, causing slower response times and potential contention.
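Before discussing scaling, it helps to pin down what a "state change" is. A minimal sketch of the order state machine, with illustrative states and transitions (the exact set will vary by business domain):

```python
# Allowed transitions for an order; each validated transition results in
# one DB write. States and edges here are illustrative assumptions.
ALLOWED_TRANSITIONS = {
    "CREATED":   {"PAID", "CANCELLED"},
    "PAID":      {"SHIPPED", "CANCELLED"},
    "SHIPPED":   {"DELIVERED"},
    "DELIVERED": set(),   # terminal state
    "CANCELLED": set(),   # terminal state
}

def transition(current: str, target: str) -> str:
    """Validate a state change before persisting it."""
    if target not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"Illegal transition: {current} -> {target}")
    return target
```

Keeping the transition table explicit makes illegal updates fail fast in application code instead of surfacing as inconsistent rows in the database.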

Scaling Solutions
  • Read Replicas: Offload read queries to replicas to reduce DB load.
  • Connection Pooling: Efficiently manage DB connections to handle more concurrent requests.
  • Asynchronous Processing: Use message queues to decouple state changes from user requests.
  • Sharding: Partition the database by order ID or region to distribute load.
  • Event Sourcing: Store state changes as events to improve scalability and auditability.
  • Microservices: Separate order state logic into dedicated services for better scaling.
  • CDN and Caching: Cache order status responses where possible to reduce DB hits.
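The asynchronous-processing idea above can be sketched with an in-process queue and a worker; in production the queue would be Kafka, SQS, or similar, and `order_states` would be the orders table (both are stand-ins here):

```python
import queue
import threading

events = queue.Queue()   # stand-in for a durable message queue
order_states = {}        # stand-in for the orders table

def worker():
    # Consumes state-change events and applies the "DB write".
    while True:
        order_id, new_state = events.get()
        order_states[order_id] = new_state
        events.task_done()

threading.Thread(target=worker, daemon=True).start()

def handle_request(order_id: str, new_state: str) -> None:
    # The API enqueues the change and returns immediately,
    # decoupling the user request from the database write.
    events.put((order_id, new_state))

handle_request("order-42", "PAID")
events.join()   # for the demo only; a real worker runs continuously
```

The trade-off is that order status becomes eventually consistent: a read immediately after `handle_request` may still see the old state.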
Back-of-Envelope Cost Analysis

Assuming 1,000,000 orders/day (~11.6 orders/sec):

  • DB writes: ~12 writes/sec for state changes.
  • DB reads: assuming 10 reads per order, ~116 reads/sec (~120 QPS).
  • Storage: each order state event ~1 KB, so ~1 GB of new storage per day.
  • Network bandwidth: assuming 10 KB per order state API call, ~116 KB/s (~0.9 Mbps).
  • Server capacity: One app server can handle ~1000 concurrent connections; multiple servers needed for load balancing.
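The estimates above follow from straightforward arithmetic, reproduced here so each figure can be checked:

```python
# Back-of-envelope numbers for 1,000,000 orders/day.
orders_per_day = 1_000_000
seconds_per_day = 86_400

write_qps = orders_per_day / seconds_per_day        # ~11.6 writes/sec
read_qps = write_qps * 10                           # 10 reads/order -> ~116/sec
storage_gb_per_day = orders_per_day * 1 / 1_000_000 # 1 KB/event -> ~1 GB/day
bandwidth_kb_s = write_qps * 10                     # 10 KB/call -> ~116 KB/s
bandwidth_mbps = bandwidth_kb_s * 8 / 1000          # ~0.9 Mbps
```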
Interview Tip

Start by describing the order state machine and its transitions. Then discuss expected load and identify the first bottleneck (usually the database). Next, explain scaling strategies like asynchronous processing and sharding. Finally, mention trade-offs such as consistency vs latency and how event sourcing can help.

Self Check

Your database handles 1000 QPS. Traffic grows 10x to 10,000 QPS. What do you do first?

Answer: Introduce read replicas and connection pooling to distribute load and reduce contention. Also, implement asynchronous processing with message queues to decouple writes from user requests, preventing DB overload.
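The read-replica part of this answer can be sketched as a hypothetical router that sends writes to the primary and spreads reads across replicas (the connection strings are placeholders):

```python
import random

class ReadWriteRouter:
    """Routes reads to replicas and writes to the primary."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = replicas

    def connection_for(self, query: str):
        # Naive heuristic: SELECTs are reads, everything else is a write.
        if query.lstrip().upper().startswith("SELECT"):
            return random.choice(self.replicas)  # offload read load
        return self.primary                      # all writes hit the primary

router = ReadWriteRouter("primary-db", ["replica-1", "replica-2"])
```

Note that replicas lag the primary slightly, so this again trades strict consistency for throughput.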

Key Result
The database is the first bottleneck as order volume grows; scaling requires read replicas, sharding, and asynchronous event-driven processing to handle high order state transitions efficiently.