| Scale | Users | System Changes |
|---|---|---|
| Small | 100 users | Simple design, single server, direct DB access, minimal caching |
| Medium | 10,000 users | Load balancers added, read replicas for DB, caching introduced, basic monitoring |
| Large | 1,000,000 users | Horizontal scaling of app servers, sharded databases, distributed caching, CDN usage |
| Very Large | 100,000,000 users | Microservices, multi-region deployment, advanced data partitioning, event-driven architecture, autoscaling |
Why advanced concepts handle production systems in LLD - Scalability Evidence
At small scale, a single server with direct database access is enough. As users grow, the database is usually the first component to struggle because every request hits it directly, and a single server cannot process all requests fast enough, causing slow responses and failures. Without caching or load balancing, the system overloads quickly. Advanced concepts prevent this by distributing load and reducing direct pressure on the database.
- Horizontal Scaling: Add more servers to share the load, preventing any single server from becoming a bottleneck.
- Load Balancing: Distribute user requests evenly across servers to optimize resource use.
- Caching: Store frequent data in fast memory to reduce database hits and speed up responses.
- Database Sharding: Split data into smaller parts so each database handles less data, improving performance.
- Content Delivery Networks (CDN): Serve static content from locations closer to users, reducing latency and bandwidth use.
- Microservices: Break the system into smaller, independent services that can scale and update separately.
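To make the sharding idea above concrete, here is a minimal sketch of hash-based shard routing. The shard names and the `route` helper are hypothetical, chosen only for illustration; real systems often use consistent hashing or a lookup service instead, because plain modulo routing remaps most keys whenever the shard count changes.

```python
import hashlib

# Hypothetical shard names, for illustration only.
SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2", "db-shard-3"]


def shard_index(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash.

    hashlib is used instead of the built-in hash() because hash() is
    randomized per process for strings, which would break routing.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def route(user_id: str) -> str:
    """Return the shard that stores data for the given user."""
    return SHARDS[shard_index(user_id, len(SHARDS))]
```

The same user ID always routes to the same shard, so each database only holds and serves its own slice of the data.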
- At 1,000 users: ~100 requests/sec, 1 GB storage, 10 Mbps bandwidth
- At 10,000 users: ~1,000 requests/sec, 10 GB storage, 100 Mbps bandwidth
- At 1,000,000 users: ~100,000 requests/sec, 1 TB storage, 10 Gbps bandwidth
- At 100,000,000 users: ~10,000,000 requests/sec, 100 TB+ storage, 100+ Gbps bandwidth
Costs rise quickly with scale, so advanced concepts help optimize resource use and control expenses.
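The estimates above follow a roughly linear pattern, which can be sketched as a small calculator. The per-user constants below are not stated in the text; they are assumptions back-derived from the listed figures (about 1 request/sec per 10 users, 1 MB of storage per user, 10 kbps of bandwidth per user).

```python
from dataclasses import dataclass

# Assumed per-user constants, inferred from the figures in the list above.
USERS_PER_REQUEST_PER_SEC = 10   # 1,000 users -> ~100 requests/sec
STORAGE_MB_PER_USER = 1          # 1,000 users -> ~1 GB
BANDWIDTH_KBPS_PER_USER = 10     # 1,000 users -> ~10 Mbps


@dataclass(frozen=True)
class Estimate:
    users: int
    requests_per_sec: float
    storage_gb: float
    bandwidth_mbps: float


def estimate(users: int) -> Estimate:
    """Back-of-the-envelope capacity estimate, assuming linear growth."""
    return Estimate(
        users=users,
        requests_per_sec=users / USERS_PER_REQUEST_PER_SEC,
        storage_gb=users * STORAGE_MB_PER_USER / 1000,
        bandwidth_mbps=users * BANDWIDTH_KBPS_PER_USER / 1000,
    )


if __name__ == "__main__":
    for n in (1_000, 10_000, 1_000_000, 100_000_000):
        print(estimate(n))
```

Under these assumptions, a strictly linear extrapolation puts bandwidth at 100,000,000 users near 1,000 Gbps, which is consistent with the "100+ Gbps" lower bound in the list. Real traffic rarely grows perfectly linearly, so treat the output as a starting point for discussion.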
Start by describing the current system and its limits. Identify the first bottleneck as users grow. Explain how you would apply advanced concepts step-by-step to handle increased load. Use real numbers to show understanding. Finish by discussing trade-offs and cost implications.
Your database handles 1,000 queries per second (QPS). Traffic grows tenfold. What do you do first, and why?
Answer: Add read replicas and implement caching to reduce direct load on the database before scaling servers horizontally.
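A rough model shows why this answer works. The sketch below assumes reads can be served by a cache or by replicas while writes always go to the primary. The 90% read share, 80% cache hit ratio, and two replicas are illustrative assumptions, not figures from the question.

```python
from typing import Tuple


def database_load(
    total_qps: float,
    read_fraction: float,
    cache_hit_ratio: float,
    replicas: int,
) -> Tuple[float, float]:
    """Return (primary_qps, per_replica_qps) under a simple model.

    Assumptions: cache hits never reach a database, cache misses are
    spread evenly across read replicas, and all writes hit the primary.
    With zero replicas, cache misses fall back to the primary.
    """
    reads = total_qps * read_fraction
    writes = total_qps - reads
    read_misses = reads * (1.0 - cache_hit_ratio)
    if replicas == 0:
        return writes + read_misses, 0.0
    return writes, read_misses / replicas


if __name__ == "__main__":
    # 10x traffic on a database that was sized for 1,000 QPS.
    primary, per_replica = database_load(10_000, 0.9, 0.8, replicas=2)
    print(f"primary: {primary:.0f} QPS, each replica: {per_replica:.0f} QPS")
```

With these assumed numbers, the primary sees about 1,000 QPS of writes and each replica about 900 QPS of reads, so no single database exceeds the original capacity. The primary sits right at its limit, which is why write-heavy growth eventually calls for sharding as well.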
Practice
Why do production systems use advanced concepts like caching and load balancing?
Solution
Step 1: Understand the purpose of caching and load balancing
Caching stores data temporarily to reduce repeated work, and load balancing spreads user requests to avoid overload.
Step 2: Connect these concepts to system stability and speed
By reducing load and speeding up responses, these concepts keep the system stable and fast even with many users.
Final Answer:
To keep the system stable and fast under heavy use -> Option D
Quick Check:
Advanced concepts = stability and speed [OK]
- Confusing complexity with usefulness
- Ignoring performance benefits
- Assuming fewer developers means better design
Which of the following notations conventionally describes a load balancer routing requests to servers in a system design diagram?
A) LoadBalancer -> Server1, Server2
B) LoadBalancer = Server1 + Server2
C) LoadBalancer : Server1 & Server2
D) LoadBalancer <-> Server1, Server2
Solution
Step 1: Identify common notation for load balancer connections
Arrows (->) show direction of request flow from load balancer to servers.
Step 2: Evaluate each option's syntax
LoadBalancer -> Server1, Server2 uses arrows correctly; others use symbols not standard for flow diagrams.
Final Answer:
LoadBalancer -> Server1, Server2 -> Option A
Quick Check:
Arrow shows flow = LoadBalancer -> Server1, Server2 [OK]
- Using '=' or ':' which are not flow indicators
- Confusing bidirectional arrows for load balancer
- Ignoring standard diagram conventions
Consider this simplified request flow in a production system:
Client -> LoadBalancer -> Cache -> Database
If the cache has the requested data, what is the expected behavior?
Solution
Step 1: Understand cache role in request flow
Cache stores frequently requested data to serve requests quickly without querying the database.
Step 2: Analyze behavior when cache has data
If cache has data, it returns it directly, skipping the database to save time and resources.
Final Answer:
Request is served from the cache without hitting the database -> Option C
Quick Check:
Cache hit = serve from cache [OK]
- Assuming database is always queried
- Thinking cache sends requests back to client
- Confusing load balancer role
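The cache-hit behavior in this question can be sketched with a small cache-aside wrapper. `FakeDatabase` and `CacheAside` are hypothetical names used only for this illustration, and the sketch deliberately omits expiry and eviction, which a production cache would need.

```python
from typing import Dict


class FakeDatabase:
    """Stand-in for a real database that counts how often it is queried."""

    def __init__(self, rows: Dict[str, str]) -> None:
        self._rows = dict(rows)
        self.queries = 0

    def fetch(self, key: str) -> str:
        self.queries += 1
        return self._rows[key]


class CacheAside:
    """Check the cache first; query the database only on a miss."""

    def __init__(self, db: FakeDatabase) -> None:
        self._db = db
        self._cache: Dict[str, str] = {}

    def get(self, key: str) -> str:
        if key in self._cache:
            # Cache hit: return immediately, the database is skipped.
            return self._cache[key]
        # Cache miss: load from the database and remember the result.
        value = self._db.fetch(key)
        self._cache[key] = value
        return value


if __name__ == "__main__":
    db = FakeDatabase({"user:1": "Ada"})
    store = CacheAside(db)
    store.get("user:1")  # miss, queries the database
    store.get("user:1")  # hit, served from the cache
    print(db.queries)
```

The query counter only increases on the first lookup, which is the behavior the answer describes: a hit is served from the cache without touching the database.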
In a production system, a developer notices that the load balancer is sending all traffic to a single server, causing overload. What is the likely cause?
Solution
Step 1: Identify symptoms of traffic overload on one server
All traffic going to one server suggests load balancer is not distributing requests evenly.
Step 2: Determine cause of uneven traffic distribution
Misconfiguration in load balancer settings can cause it to route all requests to a single server.
Final Answer:
Load balancer is misconfigured to use a single server -> Option B
Quick Check:
Uneven traffic = load balancer misconfig [OK]
- Blaming cache or database for traffic routing
- Assuming client causes server overload
- Ignoring load balancer role
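The symptom in this question is easy to reproduce with a toy round-robin balancer. This is a simplified sketch, not the behavior of any particular load balancer product, and the server names are hypothetical.

```python
from collections import Counter
from itertools import cycle
from typing import Iterable


class RoundRobinBalancer:
    """Hand out servers in rotation from a configured pool."""

    def __init__(self, servers: Iterable[str]) -> None:
        self._servers = list(servers)
        if not self._servers:
            raise ValueError("at least one server is required")
        self._rotation = cycle(self._servers)

    def pick(self) -> str:
        return next(self._rotation)


def distribution(servers: Iterable[str], requests: int) -> Counter:
    """Count how many of `requests` each server receives."""
    balancer = RoundRobinBalancer(servers)
    return Counter(balancer.pick() for _ in range(requests))


if __name__ == "__main__":
    # Correct pool: traffic is spread evenly.
    print(distribution(["app-1", "app-2", "app-3"], 300))
    # Misconfigured pool with a single entry: one server takes everything.
    print(distribution(["app-1"], 300))
```

When the configured pool contains only one server, the balancer is working as written but every request lands on that server, which matches the overload described in the question.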
A production system needs to handle millions of users with minimal downtime. Which combination of advanced concepts best supports this goal?
Solution
Step 1: Identify key needs for high user load and uptime
Handling millions of users requires spreading load, fast responses, and recovery from failures.
Step 2: Match advanced concepts to these needs
Load balancing distributes traffic, caching speeds responses, and failover ensures system stays up if parts fail.
Final Answer:
Load balancing, caching, and failover mechanisms -> Option A
Quick Check:
High scale + uptime = load balancing + caching + failover [OK]
- Choosing single server which can't scale
- Ignoring caching benefits
- Overlooking failover for downtime prevention
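Failover can be sketched as a balancer that skips servers marked unhealthy. The class and method names below are hypothetical, and the sketch assumes some external health check calls `mark_down` and `mark_up`; real systems usually probe servers automatically.

```python
from typing import Iterable, Set


class FailoverBalancer:
    """Round-robin over a pool, skipping servers marked as down."""

    def __init__(self, servers: Iterable[str]) -> None:
        self._servers = list(servers)
        self._healthy: Set[str] = set(self._servers)
        self._index = 0

    def mark_down(self, server: str) -> None:
        self._healthy.discard(server)

    def mark_up(self, server: str) -> None:
        if server in self._servers:
            self._healthy.add(server)

    def pick(self) -> str:
        # Try each server at most once per call, starting after the last pick.
        for _ in range(len(self._servers)):
            server = self._servers[self._index]
            self._index = (self._index + 1) % len(self._servers)
            if server in self._healthy:
                return server
        raise RuntimeError("no healthy servers available")


if __name__ == "__main__":
    lb = FailoverBalancer(["app-1", "app-2", "app-3"])
    lb.mark_down("app-2")
    print([lb.pick() for _ in range(4)])
```

Traffic keeps flowing to the remaining servers while one is down, and the failed server rejoins the rotation once it is marked up again. Only when every server is down does the balancer report an error.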
