Why video streaming handles massive data in HLD - Scalability Evidence

Scalability Analysis - Why video streaming handles massive data
Growth Table: Video Streaming Data Handling
Users | Data Volume | Network Traffic | Storage Needs | Infrastructure Changes
100 | Low (a few GB/day) | Low (a few Mbps) | Small (hundreds of GB) | Single server, simple CDN
10,000 | Medium (TB/day) | High (Gbps) | Large (tens of TB) | Multiple servers, CDN expansion, caching
1,000,000 | Very high (PB/month) | Very high (hundreds of Gbps) | Very large (PB scale) | Distributed storage, multi-region CDN, load balancing
100,000,000 | Extreme (exabytes/year) | Extreme (Tbps) | Massive (multi-exabyte) | Global CDN, sharded storage, edge computing
First Bottleneck: Network Bandwidth and Storage I/O

As the user count grows, the biggest challenge is moving large video files to many users simultaneously and fast enough to avoid buffering. Network bandwidth caps how much data can be sent at once, and storage I/O throughput caps how quickly video files can be read and served. Serving already-encoded video is mostly a matter of copying bytes, so these two resources saturate before CPU or memory does.
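
To make the bottleneck concrete, here is a minimal sketch that estimates how many concurrent streams a single server can sustain before its network link or its disk read throughput runs out. The 10 Gbps NIC and 500 MB/s disk figures are illustrative assumptions, not numbers from this article, and the model ignores in-memory caching of popular files.

```python
def stream_limits(nic_gbps: float, disk_read_mb_per_s: float, bitrate_mbps: float) -> dict:
    """Estimate concurrent streams per server for each resource (rough model)."""
    # Network: NIC capacity in Mbps divided by the per-stream bitrate.
    nic_limit = int(nic_gbps * 1000 // bitrate_mbps)
    # Storage: disk read throughput in MB/s converted to Mbps (x8), then divided.
    disk_limit = int(disk_read_mb_per_s * 8 // bitrate_mbps)
    bottleneck = "network" if nic_limit <= disk_limit else "storage I/O"
    return {"network": nic_limit, "storage I/O": disk_limit, "bottleneck": bottleneck}


# Assumed hardware: 10 Gbps NIC, 500 MB/s disk reads, 2 Mbps streams.
print(stream_limits(nic_gbps=10, disk_read_mb_per_s=500, bitrate_mbps=2))
```

Under these assumptions the server tops out at a few thousand streams because of I/O limits, long before CPU becomes a concern.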

Scaling Solutions
  • Content Delivery Network (CDN): Distribute video copies closer to users worldwide to reduce bandwidth load on origin servers and lower latency.
  • Video Compression and Adaptive Streaming: Use efficient codecs and adjust video quality based on user bandwidth to reduce data size.
  • Horizontal Scaling: Add more streaming servers behind load balancers to handle more concurrent connections.
  • Distributed Storage: Use sharded and replicated storage systems to handle massive video data and high read throughput.
  • Edge Computing: Process and cache video data at network edges to reduce central server load and improve speed.
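
The adaptive streaming idea above can be sketched as a simple selection rule: the player measures its recent throughput and picks the highest quality level that fits. The bitrate ladder and the 0.8 safety factor below are hypothetical values chosen for illustration; real players use more elaborate heuristics such as buffer level and throughput history.

```python
# Hypothetical bitrate ladder in kbps; real services tune these per title and codec.
LADDER_KBPS = [400, 1000, 2500, 5000]


def pick_bitrate(measured_kbps: float, ladder=LADDER_KBPS, safety: float = 0.8) -> int:
    """Return the highest rung that fits within a safety fraction of measured throughput."""
    budget = measured_kbps * safety
    fitting = [rung for rung in sorted(ladder) if rung <= budget]
    # Fall back to the lowest rung so playback continues on very slow links.
    return fitting[-1] if fitting else min(ladder)


for throughput in (300, 3500, 10000):
    print(throughput, "kbps measured ->", pick_bitrate(throughput), "kbps stream")
```

Choosing a rung below the measured throughput leaves headroom for bandwidth fluctuations, trading a little quality for fewer stalls.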
Back-of-Envelope Cost Analysis

Assuming 1 million users streaming 2 Mbps video simultaneously:

  • Network bandwidth needed: 2 Mbps * 1,000,000 = 2 Tbps (terabits per second)
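  • Data served: 1 hour of HD video is ~3 GB (roughly a 6.7 Mbps stream), so 1 million users each streaming 1 hour = 3 PB (petabytes) transferred. At the 2 Mbps assumed above, 1 hour is ~0.9 GB per user, or ~0.9 PB in total. This is data transferred, not storage: the stored catalog is far smaller because all viewers share the same files.
  • Requests per second: If each user requests a video chunk every 10 seconds, that is 1,000,000 / 10 = 100,000 requests per second in aggregate. With a CDN, most of these are answered at the edge and only cache misses reach the origin servers.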
  • Infrastructure: Requires multi-region CDN, distributed storage clusters, and high bandwidth backbone
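
The estimates above can be reproduced with a few lines of arithmetic. This is a minimal sketch using decimal units (1 PB = 1,000,000 GB); the function names are made up for illustration.

```python
def aggregate_bandwidth_tbps(users: int, bitrate_mbps: float) -> float:
    """Total egress if every user streams at once (1 Tbps = 1,000,000 Mbps)."""
    return users * bitrate_mbps / 1_000_000


def gb_per_hour(bitrate_mbps: float) -> float:
    """Size of one hour of video at a given bitrate: Mbps * 3600 s / 8 bits / 1000."""
    return bitrate_mbps * 3600 / 8 / 1000


def data_served_pb(users: int, gb_per_user: float) -> float:
    """Total data transferred (1 PB = 1,000,000 GB)."""
    return users * gb_per_user / 1_000_000


def chunk_requests_per_second(users: int, chunk_seconds: float) -> float:
    """Aggregate chunk requests across all delivery tiers, before CDN offload."""
    return users / chunk_seconds


users = 1_000_000
print("Bandwidth (Tbps):", aggregate_bandwidth_tbps(users, 2))
print("GB per hour at 2 Mbps:", gb_per_hour(2))
print("Data served at 3 GB/hour (PB):", data_served_pb(users, 3))
print("Chunk requests per second:", chunk_requests_per_second(users, 10))
```

Note that a 2 Mbps stream works out to about 0.9 GB per hour, so the ~3 GB per hour HD figure corresponds to a higher bitrate than the one used for the bandwidth estimate.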
Interview Tip: Structuring Scalability Discussion

Start by identifying key resources (network, storage, CPU). Discuss growth impact on each. Identify first bottleneck (usually bandwidth/storage I/O). Propose targeted solutions like CDN, compression, horizontal scaling. Quantify with rough numbers. Show understanding of trade-offs and cost.

Self Check Question

Your video streaming database handles 1000 QPS. Traffic grows 10x. What do you do first and why?

Answer: Add read replicas and implement caching to reduce load on the main database, because the database is the first bottleneck at increased traffic.
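
To see why caching and replicas are combined, the following sketch estimates how many database nodes are needed after a cache absorbs part of the read traffic. The 80% cache hit rate is an assumed figure for illustration, and the model treats all queries as reads that any replica can serve.

```python
import math


def db_nodes_needed(total_qps: int, cache_hit_percent: int, qps_per_node: int) -> int:
    """Nodes required to serve the queries that miss the cache."""
    # Only cache misses reach the database.
    db_qps = math.ceil(total_qps * (100 - cache_hit_percent) / 100)
    return max(1, math.ceil(db_qps / qps_per_node))


# 1,000 QPS grows 10x to 10,000 QPS; each node handles 1,000 QPS as in the question.
print("No cache:", db_nodes_needed(10_000, 0, 1_000), "nodes")
print("80% cache hit rate (assumed):", db_nodes_needed(10_000, 80, 1_000), "nodes")
```

Without a cache, replicas alone must absorb the full 10x growth; with the assumed hit rate, the database tier only needs to double.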

Key Result
Video streaming handles massive data by distributing content globally via CDNs, compressing video, and scaling storage and network infrastructure to overcome bandwidth and storage I/O bottlenecks.