| Scale | Users | Requests per Second | Data Volume | Infrastructure Changes |
|---|---|---|---|---|
| Small | 100 users | ~200 RPS | Few GBs of metadata | Single region, few microservices, basic caching |
| Medium | 10,000 users | ~20,000 RPS | TBs of metadata and video indexing | Multiple microservices, regional caching, CDN usage |
| Large | 1,000,000 users | ~2,000,000 RPS | Petabytes of video and metadata | Global CDN, multi-region deployment, microservice scaling, database sharding |
| Very Large | 100,000,000 users | ~200,000,000 RPS | Exabytes of data | Massive global distribution, advanced caching, multi-cloud, AI-driven load balancing |
Netflix architecture overview in Microservices - Scalability & System Analysis
At small to medium scale, the database becomes the first bottleneck. Netflix stores user data, viewing history, and metadata, all of which require fast reads and writes. As the user count grows, the database faces rising query load and storage demand.
At large scale, network bandwidth and content delivery become the bottlenecks because of the massive volume of video streaming traffic. The application servers and microservices also come under CPU and memory pressure when handling requests.
- Database: Use sharding to split data across multiple databases. Employ read replicas to handle read-heavy workloads.
- Caching: Implement multi-layer caching (in-memory caches like Redis, CDN edge caches) to reduce database load and latency.
- Microservices: Scale horizontally by adding more instances behind load balancers. Use container orchestration (e.g., Kubernetes) for management.
- Content Delivery: Use a global CDN to serve video content close to users, reducing bandwidth and latency.
- Network: Optimize streaming protocols and compress data to reduce bandwidth usage.
- Multi-region Deployment: Deploy services in multiple geographic regions for fault tolerance and lower latency.
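The sharding idea from the list above can be sketched as a routing function that maps each user to one database. This is a minimal illustration, not Netflix's actual scheme: the shard names in `SHARDS` and the helpers `shard_for` and `database_for` are hypothetical, and simple hash-modulo routing is assumed.

```python
import hashlib

# Hypothetical shard names, used only for illustration.
SHARDS = ["users-db-0", "users-db-1", "users-db-2", "users-db-3"]


def shard_for(user_id: str, num_shards: int) -> int:
    """Map a user ID to a shard index using a stable hash."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    # Use the first 8 bytes as an integer, then reduce modulo the shard count.
    return int.from_bytes(digest[:8], "big") % num_shards


def database_for(user_id: str) -> str:
    """Return the name of the database that holds this user's rows."""
    return SHARDS[shard_for(user_id, len(SHARDS))]


print(database_for("user-42"))
```

The same user always lands on the same shard, so reads and writes for that user touch a single database. One trade-off of hash-modulo routing is that changing the number of shards remaps most keys; consistent hashing is a common alternative when shards are added often.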
For 1 million users streaming simultaneously:
- Requests per second: ~2 million (assuming 2 requests per second per user)
- Storage: Petabytes of video and metadata (Netflix stores thousands of movies and shows)
- Bandwidth: Video streaming at 3 Mbps per user -> 3 Mbps * 1M = 3 Tbps (~375 GB/s)
- Servers: Thousands of microservice instances and CDN edge servers globally
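The arithmetic above can be reproduced with a short script. It is a back-of-the-envelope sketch using the same assumptions as the estimate (2 requests per second per user, 3 Mbps per stream); the function name `estimate_load` and its parameters are made up for this example.

```python
def estimate_load(concurrent_users: int,
                  requests_per_user_per_sec: float = 2.0,
                  stream_mbps: float = 3.0) -> dict:
    """Estimate request rate and streaming bandwidth for concurrent users."""
    rps = concurrent_users * requests_per_user_per_sec
    total_mbps = concurrent_users * stream_mbps
    return {
        "rps": rps,
        "bandwidth_tbps": total_mbps / 1_000_000,     # 1 Tbps = 1,000,000 Mbps
        "bandwidth_gb_per_s": total_mbps / 8 / 1000,  # Mbps -> MB/s -> GB/s
    }


estimate = estimate_load(1_000_000)
print(estimate)
# {'rps': 2000000.0, 'bandwidth_tbps': 3.0, 'bandwidth_gb_per_s': 375.0}
```

Changing `concurrent_users` to 10,000 or 100,000,000 gives the request rates for the other scales under the same per-user assumptions.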
Structure your scalability discussion by:
- Identifying key components (database, microservices, CDN, network)
- Estimating load and data growth at different scales
- Pinpointing the first bottleneck and why it occurs
- Proposing targeted scaling solutions for each bottleneck
- Considering cost and complexity trade-offs
- Discussing monitoring and fallback strategies
Your database handles 1,000 QPS. Traffic grows 10x to 10,000 QPS. What do you do first?
Answer: Add read replicas to distribute read queries and reduce load on the primary database. Also consider caching frequently accessed data to reduce database hits.
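To make the answer concrete, here is a small sketch that combines both ideas: reads go through a cache first and then to a replica, while writes go to the primary. Plain dictionaries stand in for the databases and the cache, and the class name `ReadRouter` and its methods are hypothetical.

```python
import itertools


class ReadRouter:
    """Cache-aside reads spread across replicas; writes go to the primary."""

    def __init__(self, primary: dict, replicas: list):
        self.primary = primary
        self.replicas = replicas
        self._next_replica = itertools.cycle(replicas)  # round-robin over replicas
        self.cache = {}
        self.db_reads = 0  # counts reads that actually reached a database

    def read(self, key):
        if key in self.cache:              # cache hit: no database work
            return self.cache[key]
        replica = next(self._next_replica)
        self.db_reads += 1
        value = replica.get(key)
        if value is not None:
            self.cache[key] = value        # populate the cache for later reads
        return value

    def write(self, key, value):
        self.primary[key] = value
        for replica in self.replicas:      # stand-in for replication, which is
            replica[key] = value           # usually asynchronous in practice
        self.cache.pop(key, None)          # invalidate the stale cache entry


primary = {"title:1": "Example Show"}
replicas = [dict(primary), dict(primary)]
router = ReadRouter(primary, replicas)
router.read("title:1")
router.read("title:1")
print(router.db_reads)  # prints 1: the second read was served from the cache
```

In a real system the replicas lag behind the primary, so a read shortly after a write may return stale data; whether that is acceptable depends on the feature.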
Practice
Solution
Step 1: Understand microservices purpose
Microservices divide a large system into smaller parts that are easier to manage and update.
Step 2: Relate to Netflix architecture
Netflix uses microservices to handle specific functions separately, improving scalability and maintenance.
Final Answer:
To break down the system into smaller, manageable parts -> Option D
Quick Check:
Microservices = Smaller parts [OK]
- Thinking microservices avoid APIs
- Believing Netflix uses one big database
- Confusing microservices with monolithic design
Solution
Step 1: Identify communication method in microservices
Microservices communicate via APIs, which are defined interfaces for exchanging data.
Step 2: Match with Netflix architecture
Netflix services use APIs to interact, ensuring loose coupling and independent deployment.
Final Answer:
Through APIs that allow services to talk to each other -> Option A
Quick Check:
Microservices communicate via APIs [OK]
- Assuming services share memory
- Thinking services connect directly to databases
- Believing file locks coordinate services
Solution
Step 1: Understand microservice isolation
Each microservice handles a specific function independently, so failure affects only that function.
Step 2: Apply to recommendation service failure
If the recommendation service fails, only recommendations stop working; other features like login or streaming continue.
Final Answer:
Only the recommendation feature will be affected -> Option A
Quick Check:
Microservice failure affects only its feature [OK]
- Assuming entire platform fails
- Confusing recommendation with login or streaming
- Thinking microservices share failure impact
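Failure isolation of this kind usually depends on the caller handling the error rather than propagating it. The sketch below shows one simple way to do that with a fallback; the function names, the always-failing `fetch_recommendations`, and the placeholder titles are all assumptions made for illustration.

```python
# Placeholder fallback data, used when personalized results are unavailable.
POPULAR_TITLES = ["Title A", "Title B", "Title C"]


def fetch_recommendations(user_id: str) -> list:
    """Hypothetical call to the recommendation service; here it always fails."""
    raise ConnectionError("recommendation service unavailable")


def build_home_page(user_id: str) -> dict:
    """Assemble the home page, degrading gracefully if recommendations fail."""
    page = {"user": user_id, "continue_watching": ["Title X"]}
    try:
        page["recommendations"] = fetch_recommendations(user_id)
        page["degraded"] = False
    except ConnectionError:
        # Only this section falls back; the rest of the page is unaffected.
        page["recommendations"] = POPULAR_TITLES
        page["degraded"] = True
    return page


print(build_home_page("user-42"))
```

Production systems typically add timeouts and a circuit breaker around such calls so that a slow dependency cannot tie up the caller's threads.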
Solution
Step 1: Identify tight coupling problem
Tightly coupled services depend directly on each other, causing deployment and scaling problems.
Step 2: Apply microservice best practice
Services should communicate only via APIs to remain independent and deploy separately.
Final Answer:
Refactor services to communicate only via APIs and avoid direct calls -> Option B
Quick Check:
Loose coupling via APIs fixes deployment issues [OK]
- Merging services defeats microservice benefits
- Using shared variables breaks isolation
- Increasing DB size doesn't fix coupling
Solution
Step 1: Understand scaling in microservices
Scaling means running multiple copies of a service to handle more users.
Step 2: Apply to streaming service
Deploying multiple streaming service instances with a load balancer distributes user requests efficiently.
Step 3: Evaluate other options
Merging services or disabling others breaks microservice principles; single DB server is a bottleneck.
Final Answer:
Deploy multiple instances of the streaming service behind a load balancer -> Option C
Quick Check:
Scale by multiple instances + load balancer [OK]
- Merging services reduces flexibility
- Single DB server limits scalability
- Disabling services harms user experience
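As a rough illustration of the winning option, the following sketch distributes requests across several instances in round-robin order. The instance names and the `RoundRobinBalancer` class are hypothetical; real load balancers also track instance health and may weight by current load.

```python
import itertools
from collections import Counter


class RoundRobinBalancer:
    """Hand out service instances in a fixed rotating order."""

    def __init__(self, instances):
        if not instances:
            raise ValueError("at least one instance is required")
        self._cycle = itertools.cycle(list(instances))

    def route(self, request_id):
        """Return the instance that should handle this request."""
        return next(self._cycle)


balancer = RoundRobinBalancer(["streaming-1", "streaming-2", "streaming-3"])
counts = Counter(balancer.route(i) for i in range(9))
print(counts)  # each of the three instances receives 3 of the 9 requests
```

Adding capacity then means adding another name to the instance list, which is the essence of horizontal scaling.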
