| Scale | Users | Inventory Items | Transactions per Second | Data Size | Key Changes |
|---|---|---|---|---|---|
| Small | 100 | 10,000 | 10 TPS | ~100 MB | Single server, simple DB, no caching |
| Medium | 10,000 | 1,000,000 | 1,000 TPS | ~10 GB | DB indexing, read replicas, caching introduced |
| Large | 1,000,000 | 100,000,000 | 50,000 TPS | ~1 TB | Sharding, distributed cache, load balancers, async processing |
| Very Large | 100,000,000 | 10,000,000,000 | 5,000,000 TPS | ~100 TB+ | Multi-region deployment, advanced sharding, event sourcing, CQRS, CDN for static data |
Inventory management in LLD - Scalability & System Analysis
At small to medium scale, the database is the first bottleneck. Inventory management mixes frequent updates (stock changes, orders) with frequent queries (availability checks). As a rough rule of thumb, a single database instance reliably handles only up to ~5,000 queries per second. As users and transactions grow, its CPU, disk I/O, and connection limits are exhausted before other components saturate.
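To make the bottleneck concrete, here is a minimal sketch that divides each tier's TPS from the scale table by the ~5,000 queries-per-second rule of thumb. The function name `min_db_instances` is hypothetical, and the result is only a lower bound: it assumes load spreads perfectly evenly and ignores replication and caching.

```python
import math

# Rough single-instance ceiling quoted in the text (a rule of thumb, not a benchmark).
SINGLE_DB_QPS = 5_000

# Transactions per second for each tier, taken from the scale table.
TIERS = {
    "Small": 10,
    "Medium": 1_000,
    "Large": 50_000,
    "Very Large": 5_000_000,
}


def min_db_instances(tps: int, per_instance_qps: int = SINGLE_DB_QPS) -> int:
    """Lower bound on DB instances, assuming perfectly even load distribution."""
    return max(1, math.ceil(tps / per_instance_qps))


for name, tps in TIERS.items():
    print(f"{name}: at least {min_db_instances(tps)} DB instance(s)")
```

Under these assumptions the Small and Medium tiers fit on one instance, while the Large tier already needs at least 10, which is where sharding enters the table.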
- Read Replicas: Offload read queries to replicas to reduce load on the primary DB.
- Caching: Use in-memory caches (e.g., Redis) for frequently accessed inventory data to reduce DB hits.
- Sharding: Partition inventory data by product ID or region to distribute load across multiple DB servers.
- Horizontal Scaling: Add more application servers behind load balancers to handle increased user requests.
- Asynchronous Processing: Use message queues for inventory updates to smooth out spikes and improve throughput.
- CDN: For static assets like product images, use a CDN to reduce bandwidth and latency.
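As an illustration of sharding by product ID, the sketch below routes each product to a shard with a stable hash. Everything here is an assumption made for the example: the shard count, the names `shard_for`, `set_stock`, and `get_stock`, and the plain dictionaries that stand in for separate DB servers.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical shard count for the example


def shard_for(product_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a product ID to a shard index using a stable hash.

    hashlib is used instead of the built-in hash(), which is randomized
    per process for strings and would route inconsistently across servers.
    """
    digest = hashlib.sha256(product_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Each dict stands in for the inventory table on one DB server.
shards = [dict() for _ in range(NUM_SHARDS)]


def set_stock(product_id: str, quantity: int) -> None:
    """Write the stock level to the shard that owns this product."""
    shards[shard_for(product_id)][product_id] = quantity


def get_stock(product_id: str) -> int:
    """Read the stock level from the owning shard (0 if unknown)."""
    return shards[shard_for(product_id)].get(product_id, 0)


set_stock("SKU-1001", 25)
print(shard_for("SKU-1001"), get_stock("SKU-1001"))
```

Modulo-based routing is the simplest option, but changing the shard count remaps most keys; consistent hashing is a common alternative when shards are added or removed often.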
- Database load: At 10,000 users and 1,000 TPS, the DB needs to handle ~1,000 reads/writes per second.
- Storage: 1 million items x ~10 KB per item = ~10 GB of data.
- Bandwidth: Assuming 1 KB per request, 1,000 TPS x 1 KB = ~1 MB/s (~8 Mbps).
- Cache memory: A ~1-2 GB Redis instance is enough to hold hot inventory data.
- Network: A 1 Gbps network interface is sufficient at medium scale.
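The medium-scale estimates above can be reproduced with a few lines of arithmetic. This is a sketch under the same assumptions as the list (10 KB per item, 1 KB per request, decimal units); the function name `estimate` and the dictionary keys are hypothetical.

```python
def estimate(items: int, item_kb: float, tps: int, request_kb: float) -> dict:
    """Back-of-envelope sizing using decimal units (1 GB = 1,000,000 KB)."""
    storage_gb = items * item_kb / 1_000_000      # KB -> GB
    bandwidth_mb_s = tps * request_kb / 1_000     # KB/s -> MB/s
    bandwidth_mbps = bandwidth_mb_s * 8           # bytes -> bits
    return {
        "storage_gb": storage_gb,
        "bandwidth_mb_s": bandwidth_mb_s,
        "bandwidth_mbps": bandwidth_mbps,
        # Share of a 1 Gbps (1,000 Mbps) network interface in use.
        "nic_utilization": bandwidth_mbps / 1_000,
    }


medium = estimate(items=1_000_000, item_kb=10, tps=1_000, request_kb=1)
for key, value in medium.items():
    print(f"{key}: {value}")
```

With the medium-scale inputs this gives 10 GB of storage and 8 Mbps of request traffic, under 1% of a 1 Gbps interface. Swapping in another tier's numbers from the scale table shows how quickly each resource grows.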
Start by defining the scale and the key operations (reads vs. writes). Identify the bottleneck (usually the DB). Discuss scaling strategies step by step: caching, read replicas, sharding, async processing. Mention trade-offs such as consistency vs. availability. Use real numbers to justify your choices.
Your database handles 1,000 QPS. Traffic grows 10x to 10,000 QPS. What do you do first and why?
Answer: Introduce read replicas and caching first to reduce load on the primary database. This scales reads horizontally and defers the complexity of sharding. If write traffic grows just as fast, sharding or async processing is the likely next step.
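A minimal cache-aside sketch shows why caching is a cheap first step for read-heavy traffic. The classes `InventoryDB` and `CachedInventory` are hypothetical; a plain dictionary stands in for Redis, and the query counter exists only to make the effect visible.

```python
class InventoryDB:
    """Stand-in for the primary database; counts the queries it serves."""

    def __init__(self, stock: dict):
        self._stock = dict(stock)
        self.queries = 0

    def read(self, product_id: str) -> int:
        self.queries += 1
        return self._stock.get(product_id, 0)

    def write(self, product_id: str, quantity: int) -> None:
        self.queries += 1
        self._stock[product_id] = quantity


class CachedInventory:
    """Cache-aside: read from the cache first, fall back to the DB on a miss."""

    def __init__(self, db: InventoryDB):
        self.db = db
        self.cache = {}  # stand-in for an in-memory cache such as Redis

    def get_stock(self, product_id: str) -> int:
        if product_id in self.cache:
            return self.cache[product_id]      # cache hit: no DB query
        quantity = self.db.read(product_id)    # cache miss: one DB query
        self.cache[product_id] = quantity
        return quantity

    def set_stock(self, product_id: str, quantity: int) -> None:
        self.db.write(product_id, quantity)    # the DB stays the source of truth
        self.cache.pop(product_id, None)       # invalidate the stale entry


db = InventoryDB({"SKU-1": 7})
inventory = CachedInventory(db)
for _ in range(10):
    inventory.get_stock("SKU-1")
print(db.queries)  # 1: only the first read reached the database
```

Invalidating on write keeps the sketch simple, but in a real deployment a concurrent reader can still repopulate the cache with a stale value, so stock checks that guard an order are usually confirmed against the primary database.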