| Users (Building Occupants) | Elevators | Requests per Minute | System Changes |
|---|---|---|---|
| 100 | 2-3 | ~50 | Simple scheduling, single controller, minimal queue management |
| 10,000 | 10-20 | ~5,000 | Distributed control, request queueing, basic load balancing |
| 1,000,000 | 100+ | ~500,000 | Hierarchical control, partitioned zones, advanced scheduling algorithms |
| 100,000,000 | 1000+ | ~50,000,000 | Multi-layer distributed system, real-time analytics, predictive scheduling, fault tolerance |
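The request column above follows from a per-occupant rate: a quick sketch, assuming each occupant generates roughly 0.5 elevator requests per minute (an illustrative rate chosen to match the table, not a measured one).

```python
# Reproduce the table's "Requests per Minute" column from user count,
# under the assumed rate of ~0.5 requests per occupant per minute.
REQUESTS_PER_USER_PER_MIN = 0.5  # assumption, chosen to match the table

def estimated_requests_per_minute(users: int) -> int:
    return int(users * REQUESTS_PER_USER_PER_MIN)

for users in (100, 10_000, 1_000_000, 100_000_000):
    print(f"{users:>11,} users -> ~{estimated_requests_per_minute(users):,} req/min")
```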
Multiple elevator coordination in LLD - Scalability & System Analysis
The first bottleneck is the centralized scheduler/controller that manages elevator requests and assigns elevators. As user requests grow, a single controller struggles to process and assign requests quickly, causing delays and inefficient elevator usage.
- Horizontal scaling: Add multiple controllers managing subsets of elevators or building zones to distribute load.
- Partitioning: Divide the building into zones, each with its own scheduler to reduce coordination overhead.
- Caching: Cache recent requests and elevator states to reduce repeated computations.
- Load balancing: Use algorithms to evenly distribute elevator assignments and avoid congestion.
- Predictive scheduling: Use historical data to anticipate demand and pre-position elevators.
- Fault tolerance: Implement fallback controllers and health checks to maintain service during failures.
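The partitioning and load-balancing ideas above can be sketched as a zone-per-scheduler design: each zone owns its elevators and assigns the nearest idle car, so no single controller sees every request. This is a minimal illustration; the names (`Zone`, `Elevator`, `dispatch`) and the nearest-idle-car policy are assumptions, not a prescribed design.

```python
from dataclasses import dataclass, field

@dataclass
class Elevator:
    eid: int
    floor: int
    busy: bool = False

@dataclass
class Zone:
    low: int                       # lowest floor served by this zone
    high: int                      # highest floor served by this zone
    elevators: list = field(default_factory=list)

    def serves(self, floor: int) -> bool:
        return self.low <= floor <= self.high

    def assign(self, floor: int):
        """Pick the nearest idle elevator in this zone, or None if all busy."""
        idle = [e for e in self.elevators if not e.busy]
        if not idle:
            return None
        best = min(idle, key=lambda e: abs(e.floor - floor))
        best.busy = True
        return best

def dispatch(zones, floor):
    """Route a hall call to whichever zone serves its floor."""
    for z in zones:
        if z.serves(floor):
            return z.assign(floor)
    return None

# Usage: a 20-floor building split into two zones, two elevators each.
zones = [
    Zone(1, 10, [Elevator(1, floor=1), Elevator(2, floor=8)]),
    Zone(11, 20, [Elevator(3, floor=15), Elevator(4, floor=20)]),
]
car = dispatch(zones, 9)  # low zone handles it; nearest idle car wins
```

Because each zone only coordinates its own cars, adding a zone (with its own scheduler) scales out horizontally without growing any single scheduler's workload.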
- At 10,000 users with 10 elevators, expect ~5,000 requests per minute (~83 requests/sec).
- Each controller can handle ~1000-5000 requests per second, so on raw throughput a single controller covers 10,000 users (~83 req/s) comfortably; its limit is reached around the 1,000,000-user scale (~8,300 req/s), though latency, zoning, and fault tolerance usually motivate distribution earlier.
- Network bandwidth is minimal since messages are small (elevator commands and status updates).
- Storage needs are low, mainly for logs and historical data (few MBs per day).
- CPU and memory scale with number of requests and complexity of scheduling algorithms.
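The estimates above reduce to simple arithmetic: divide request rate by per-controller capacity to size the controller fleet. A back-of-envelope sketch, using the conservative low end (1,000 req/s) of the stated capacity range:

```python
import math

def controllers_needed(req_per_min: float, capacity_rps: float = 1000.0) -> int:
    """Minimum controllers to absorb a request rate, given per-controller capacity."""
    req_per_sec = req_per_min / 60
    return max(1, math.ceil(req_per_sec / capacity_rps))

print(controllers_needed(5_000))       # 10,000 users: ~83 req/s
print(controllers_needed(50_000_000))  # 100M users: ~833,333 req/s
```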
How to walk through this in an interview:
- Start by defining system scale and user load.
- Identify the main bottleneck (usually the scheduler).
- Discuss how to partition the problem (zones, controllers).
- Explain the trade-offs between centralized and distributed control.
- Mention caching and predictive techniques.
- Always justify your choices with scalability and latency in mind.
Your database handles 1000 QPS. Traffic grows 10x. What do you do first?
Answer: Add read replicas and implement caching to reduce load on the primary database before considering sharding or more complex solutions.
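The caching half of that answer can be illustrated with a read-through cache in front of the primary: reads check the cache first and only fall through to the database on a miss, cutting primary load before any sharding is needed. A minimal sketch, with a dict standing in for the real database:

```python
class ReadThroughCache:
    """Illustrative read-through cache; `db` is a dict standing in for the primary."""

    def __init__(self, db: dict):
        self.db = db
        self.cache: dict = {}
        self.db_reads = 0  # counts how often the primary is actually hit

    def get(self, key):
        if key in self.cache:
            return self.cache[key]   # cache hit: primary untouched
        self.db_reads += 1           # cache miss: load from primary
        value = self.db.get(key)
        self.cache[key] = value
        return value

store = ReadThroughCache({"user:1": "Ada"})
store.get("user:1")  # miss -> one DB read
store.get("user:1")  # hit  -> still one DB read
```

Repeated reads of hot keys never reach the primary, which is why caching (together with read replicas) is the first lever before more invasive changes like sharding.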
