| Scale | Users | Requests per Second | Data Volume | Infrastructure Changes |
|---|---|---|---|---|
| Small | 100 users | ~200 RPS | Few GBs of metadata | Single region, few microservices, basic caching |
| Medium | 10,000 users | ~20,000 RPS | TBs of metadata and video indexing | Multiple microservices, regional caching, CDN usage |
| Large | 1,000,000 users | ~2,000,000 RPS | Petabytes of video and metadata | Global CDN, multi-region deployment, microservice scaling, database sharding |
| Very Large | 100,000,000 users | ~200,000,000 RPS | Exabytes of data | Massive global distribution, advanced caching, multi-cloud, AI-driven load balancing |

Netflix architecture overview in Microservices - Scalability & System Analysis
At small to medium scale, the database is typically the first bottleneck. Netflix stores user data, viewing history, and metadata, all of which require fast reads and writes. As the user count grows, the database faces higher query loads and growing storage demands.
At large scale, network bandwidth and content delivery become the bottlenecks because of massive video streaming traffic. The application servers and microservices also come under CPU and memory pressure from handling requests.
- Database: Use sharding to split data across multiple databases. Employ read replicas to handle read-heavy workloads.
- Caching: Implement multi-layer caching (in-memory caches like Redis, CDN edge caches) to reduce database load and latency.
- Microservices: Scale horizontally by adding more instances behind load balancers. Use container orchestration (e.g., Kubernetes) for management.
- Content Delivery: Use a global CDN to serve video content from edge locations close to users, reducing traffic to origin servers and lowering latency.
- Network: Optimize streaming protocols (for example, adaptive bitrate streaming) and compress data to reduce bandwidth usage.
- Multi-region Deployment: Deploy services in multiple geographic regions for fault tolerance and lower latency.
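The database item above (sharding plus read replicas) can be sketched as a small router. This is a minimal illustration, not Netflix's actual implementation: the node names (`db0-primary`, `db0-replica-a`, and so on), the shard count of 4, and the choice of user ID as the shard key are all assumptions made for the example.

```python
import hashlib
import itertools
from dataclasses import dataclass, field


@dataclass
class Shard:
    """One logical shard: a primary for writes plus replicas for reads."""
    primary: str
    replicas: list[str]
    _next_replica: itertools.cycle = field(init=False, repr=False)

    def __post_init__(self) -> None:
        # Round-robin over replicas; fall back to the primary if there are none.
        self._next_replica = itertools.cycle(self.replicas or [self.primary])

    def node_for(self, is_write: bool) -> str:
        # Writes always go to the primary; reads rotate across replicas.
        return self.primary if is_write else next(self._next_replica)


def shard_index(user_id: str, num_shards: int) -> int:
    """Map a user ID to a shard using a stable hash (not Python's salted hash())."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def route(user_id: str, shards: list[Shard], is_write: bool) -> str:
    """Return the name of the database node that should serve this request."""
    return shards[shard_index(user_id, len(shards))].node_for(is_write)


# Hypothetical topology: 4 shards, each with one primary and two replicas.
SHARDS = [
    Shard(primary=f"db{i}-primary", replicas=[f"db{i}-replica-a", f"db{i}-replica-b"])
    for i in range(4)
]

if __name__ == "__main__":
    print(route("user-42", SHARDS, is_write=True))   # the shard's primary
    print(route("user-42", SHARDS, is_write=False))  # one of the shard's replicas
```

Hash-modulo routing keeps the sketch short, but changing the number of shards remaps most keys; consistent hashing is a common alternative when shards are added or removed often.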
For 1 million users streaming simultaneously:
- Requests per second: ~2 million (assuming 2 requests per second per user)
- Storage: Petabytes of video and metadata (Netflix stores thousands of movies and shows)
- Bandwidth: Video streaming at 3 Mbps per user -> 3 Mbps * 1M = 3 Tbps (~375 GB/s)
- Servers: Thousands of microservice instances and CDN edge servers globally
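The arithmetic behind these estimates can be reproduced with a few lines of code. The sketch below uses the same assumptions as the figures above (2 requests per second per user, 3 Mbps per stream); the function name `estimate_load` and its parameter names are made up for illustration.

```python
def estimate_load(concurrent_users: int,
                  requests_per_user_per_sec: float = 2.0,
                  stream_mbps: float = 3.0) -> dict[str, float]:
    """Back-of-envelope request and bandwidth estimate for concurrent streamers."""
    requests_per_sec = concurrent_users * requests_per_user_per_sec
    total_mbps = concurrent_users * stream_mbps
    return {
        "requests_per_sec": requests_per_sec,
        # 1 Tbps = 1,000,000 Mbps
        "bandwidth_tbps": total_mbps / 1_000_000,
        # Mbps -> MB/s (divide by 8 bits per byte) -> GB/s (divide by 1,000)
        "bandwidth_gb_per_sec": total_mbps / 8 / 1_000,
    }


if __name__ == "__main__":
    for users in (100, 10_000, 1_000_000, 100_000_000):
        est = estimate_load(users)
        print(f"{users:>11,} users: "
              f"{est['requests_per_sec']:,.0f} RPS, "
              f"{est['bandwidth_tbps']:.4f} Tbps, "
              f"{est['bandwidth_gb_per_sec']:,.3f} GB/s")
```

For 1,000,000 concurrent users this gives 2,000,000 RPS, 3 Tbps, and 375 GB/s, matching the estimate above. Changing the per-user assumptions shows how sensitive the totals are to them.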
Structure your scalability discussion by:
- Identifying key components (database, microservices, CDN, network)
- Estimating load and data growth at different scales
- Pinpointing the first bottleneck and why it occurs
- Proposing targeted scaling solutions for each bottleneck
- Considering cost and complexity trade-offs
- Discussing monitoring and fallback strategies
Your database handles 1000 QPS. Traffic grows 10x to 10,000 QPS. What do you do first?
Answer: Assuming the workload is read-heavy, add read replicas to distribute read queries and reduce load on the primary database. Also consider caching frequently accessed data to reduce database hits.
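The caching half of this answer is often implemented as a cache-aside pattern: check the cache first and fall through to the database only on a miss. The sketch below is a rough illustration under stated assumptions: an in-process dictionary stands in for a real cache such as Redis, and `fake_replica_read`, the key `title:123`, and the 30-second TTL are hypothetical.

```python
import time
from typing import Callable


class CacheAside:
    """Tiny in-process TTL cache standing in for an external cache like Redis."""

    def __init__(self, load_from_db: Callable[[str], str],
                 ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._load = load_from_db
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: dict[str, tuple[float, str]] = {}  # key -> (expires_at, value)
        self.db_reads = 0  # counts how many reads actually reached the database

    def get(self, key: str) -> str:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: no database query
        # Cache miss or expired entry: read from the database and store the result.
        self.db_reads += 1
        value = self._load(key)
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Drop a key after a write so the next read fetches fresh data."""
        self._entries.pop(key, None)


def fake_replica_read(key: str) -> str:
    # Hypothetical stand-in for a query against a read replica.
    return f"row-for-{key}"


cache = CacheAside(fake_replica_read, ttl_seconds=30.0)
for _ in range(1000):
    cache.get("title:123")
print(cache.db_reads)  # 1: only the first read reached the database
```

In this toy run, 1,000 reads of the same key produce a single database query. Real hit rates depend on how skewed the access pattern is and on how aggressively entries must be invalidated after writes.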