| Users | Media Storage | CDN Usage | Network Traffic | Latency |
|---|---|---|---|---|
| 100 users | Single storage server, local disk | Minimal or no CDN | Low, direct fetch from storage | Low latency, direct access |
| 10,000 users | Distributed storage cluster, object storage | Basic CDN with few edge nodes | Moderate, some caching | Improved latency via CDN edges |
| 1,000,000 users | Highly scalable object storage (S3-like), multi-region | Global CDN with many edge locations | High, CDN offloads origin | Low latency globally |
| 100,000,000 users | Multi-cloud, geo-redundant storage, sharded data | Advanced CDN with dynamic content optimization | Very high, optimized delivery | Consistent low latency worldwide |
Media storage and CDN in HLD - Scalability & System Analysis
At small scale (~100 users), the single storage server's disk I/O and network bandwidth limit throughput.
At medium scale (~10K users), origin storage bandwidth and read latency become the bottlenecks.
At large scale (1M+ users), CDN edge cache capacity and the cache miss rate determine performance, because every miss falls through to the origin.
Without a CDN, traffic spikes hit the origin servers directly and can overwhelm them.
- Horizontal scaling: Add more storage nodes and CDN edge servers to distribute load.
- Caching: Use CDN to cache media close to users, reducing origin load.
- Sharding: Partition media storage by region or content type to improve access speed.
- Multi-region replication: Store copies of media in multiple geographic locations.
- Compression and optimization: Reduce media size for faster delivery.
- Load balancing: Distribute requests evenly across storage and CDN nodes.
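To make the caching point concrete, here is a minimal sketch of a single CDN edge node modeled as an LRU cache in front of an origin. The `EdgeCache` class, the `fetch_from_origin` callback, and the capacity of 2 are hypothetical names and values chosen for illustration; real CDNs use more elaborate eviction and TTL policies.

```python
from collections import OrderedDict


class EdgeCache:
    """Toy LRU cache standing in for one CDN edge node (illustrative only)."""

    def __init__(self, capacity, fetch_from_origin):
        self.capacity = capacity                    # max number of cached objects
        self.fetch_from_origin = fetch_from_origin  # called only on a cache miss
        self._store = OrderedDict()                 # key -> media bytes, oldest first
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._store:
            # Cache hit: mark as most recently used and serve from the edge.
            self._store.move_to_end(key)
            self.hits += 1
            return self._store[key]
        # Cache miss: fall through to the origin, then cache the result.
        self.misses += 1
        value = self.fetch_from_origin(key)
        self._store[key] = value
        if len(self._store) > self.capacity:
            self._store.popitem(last=False)         # evict least recently used
        return value


# Hypothetical origin that records every request it has to serve.
origin_requests = []


def origin(key):
    origin_requests.append(key)
    return f"media-bytes-for-{key}".encode()


edge = EdgeCache(capacity=2, fetch_from_origin=origin)
for media_id in ["a", "b", "a", "c", "a", "b"]:
    edge.get(media_id)

print(edge.hits, edge.misses, origin_requests)
```

With this request sequence the edge serves 2 of 6 requests itself and the origin sees only `a`, `b`, `c`, `b`. A larger cache or a more skewed popularity distribution raises the hit ratio and lowers origin load further.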
Assuming 1M users, each streaming 1 video per day of 5 MB:
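- Requests per second (QPS): ~12 on average (1M requests / 86,400 seconds); peak load will be higher than this average
- Daily data transfer: 5 TB (1M * 5 MB)
- Bandwidth needed at origin: reduced by the CDN cache hit ratio (e.g., a 90% hit ratio leaves 0.5 TB per day for the origin to serve)
- Storage needed: depends on upload volume and retention rather than on views; as an upper bound, if every daily view were a distinct new 5 MB video kept for 30 days, storage would reach 150 TB (30 * 5 TB)
- Network bandwidth: CDN edges handle most traffic; without caching, origin bandwidth is the bottleneck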
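The arithmetic above can be reproduced with a short script. This is a rough sketch that uses decimal units (1 TB = 1,000,000 MB); the function name `estimate` and its parameter names are illustrative, and the 90% hit ratio and 30-day retention are the example values from the list, not measured figures.

```python
SECONDS_PER_DAY = 86_400


def estimate(users=1_000_000, videos_per_user_per_day=1, video_size_mb=5,
             cache_hit_ratio=0.90, retention_days=30):
    """Back-of-envelope capacity numbers for a media delivery system."""
    requests_per_day = users * videos_per_user_per_day
    daily_transfer_tb = requests_per_day * video_size_mb / 1_000_000
    return {
        # Average request rate; peaks will be higher.
        "avg_qps": requests_per_day / SECONDS_PER_DAY,
        "daily_transfer_tb": daily_transfer_tb,
        # Only cache misses reach the origin.
        "origin_transfer_tb": daily_transfer_tb * (1 - cache_hit_ratio),
        # Upper bound: assumes every view is a distinct new video.
        "storage_upper_bound_tb": daily_transfer_tb * retention_days,
        # Average egress in megabits per second (1 TB = 8e6 megabits).
        "avg_egress_mbps": daily_transfer_tb * 8e6 / SECONDS_PER_DAY,
    }


for name, value in estimate().items():
    print(f"{name}: {value:,.2f}")
```

Averaged over a day, 5 TB corresponds to roughly 463 Mbit/s of total egress, of which about 46 Mbit/s reaches the origin at a 90% hit ratio. Changing `cache_hit_ratio` shows how sensitive origin load is to cache effectiveness.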
Start by defining user scale and media size.
Identify origin storage limits and CDN role early.
Discuss caching strategies and geographic distribution.
Explain how to handle cache misses and data consistency.
Always mention cost and latency trade-offs.
Your media storage origin handles 1000 QPS. Traffic grows 10x. What do you do first?
Answer: Deploy or increase CDN edge caching to offload origin servers and reduce direct requests, preventing origin overload.
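A quick calculation supports this answer. The sketch below assumes the origin's capacity is roughly its current load of 1,000 QPS (the question does not state the actual headroom) and that total traffic becomes 10,000 QPS after the 10x growth; the function names are hypothetical.

```python
def origin_qps(total_qps, cache_hit_ratio):
    """Requests per second that miss the CDN and reach the origin."""
    return total_qps * (1.0 - cache_hit_ratio)


def required_hit_ratio(total_qps, origin_capacity_qps):
    """Minimum CDN hit ratio that keeps origin load within its capacity."""
    if total_qps <= origin_capacity_qps:
        return 0.0  # origin can absorb all traffic without a cache
    return 1.0 - origin_capacity_qps / total_qps


total_after_growth = 1_000 * 10   # 10x growth on 1,000 QPS
capacity = 1_000                  # assumed: origin capacity equals current load

needed = required_hit_ratio(total_after_growth, capacity)
print(f"required hit ratio: {needed:.0%}")
for ratio in (0.50, 0.90, 0.95):
    print(f"hit ratio {ratio:.0%} -> origin sees "
          f"{origin_qps(total_after_growth, ratio):,.0f} QPS")
```

Under these assumptions, a hit ratio of about 90% keeps the origin at its current load, which is why adding or expanding CDN caching is usually the first step before scaling the origin itself.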
