In a video upload and processing pipeline, which component is primarily responsible for receiving and temporarily storing the uploaded video before processing?
Think about the first step after a user uploads a video file.
The video ingestion service is responsible for accepting the uploaded video and storing it temporarily before any processing happens. Other components, such as the transcoding service, handle later stages.
Your video processing pipeline needs to handle a sudden spike of 10,000 video uploads per hour. Which approach best helps scale the transcoding service to handle this load efficiently?
Consider how to handle many videos at the same time without delay.
Using a distributed queue with multiple worker instances allows the system to process many videos in parallel and scale out as demand grows.
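The queue-plus-workers pattern above can be sketched in a few lines. This is a minimal single-process illustration, not a production design: the in-memory queue.Queue stands in for a real distributed queue, threads stand in for separate worker instances, and transcode and run_workers are hypothetical names introduced only for this example.

```python
import queue
import threading


def transcode(video_id: str) -> str:
    # Hypothetical placeholder for real transcoding work.
    return f"{video_id}:transcoded"


def run_workers(video_ids, num_workers=4):
    """Process all video IDs using num_workers parallel workers."""
    jobs = queue.Queue()   # stands in for a distributed queue (assumption)
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            video_id = jobs.get()
            if video_id is None:      # sentinel: no more work for this worker
                jobs.task_done()
                break
            output = transcode(video_id)
            with lock:                # protect the shared results list
                results.append(output)
            jobs.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for video_id in video_ids:
        jobs.put(video_id)
    for _ in threads:
        jobs.put(None)                # one sentinel per worker
    jobs.join()                       # wait until every job is marked done
    for t in threads:
        t.join()
    return results
```

Raising num_workers is the in-process analogue of adding worker instances: the producer side does not change, which is what lets this design scale out as demand grows.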
What is the main tradeoff when deciding to process video uploads synchronously (immediate processing) versus asynchronously (delayed processing via queue)?
Think about user wait times and system load.
Synchronous processing makes users wait until processing finishes, which limits throughput. Asynchronous processing queues the work and processes it later, which improves scalability but delays when the result becomes available.
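The difference can be made concrete with a small sketch. It assumes a hypothetical process function standing in for real video processing, and the names handle_upload_sync and AsyncPipeline are invented for illustration only.

```python
from collections import deque
import itertools


def process(video: str) -> str:
    # Hypothetical stand-in for slow video processing.
    return video.upper()


def handle_upload_sync(video: str) -> dict:
    # The caller blocks until processing is finished.
    return {"status": "done", "result": process(video)}


class AsyncPipeline:
    """Accepts uploads immediately and processes them later."""

    def __init__(self):
        self.pending = deque()        # queued (job_id, video) pairs
        self.results = {}             # job_id -> processed output
        self._ids = itertools.count(1)

    def handle_upload(self, video: str) -> dict:
        # Returns right away with a job ID; no processing happens here.
        job_id = next(self._ids)
        self.pending.append((job_id, video))
        return {"status": "queued", "job_id": job_id}

    def drain(self) -> None:
        # Run later, for example by a background worker.
        while self.pending:
            job_id, video = self.pending.popleft()
            self.results[job_id] = process(video)
```

In the asynchronous version the upload call returns before any result exists, so the client has to poll or be notified using the job ID. That extra step is the delay side of the tradeoff.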
Why is chunked video upload commonly used in large video upload pipelines?
Consider what happens if the internet connection drops during a large upload.
Chunked upload breaks the video into smaller parts so that if the upload fails partway through, only the failed chunk needs to be retried, not the entire file.
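A minimal sketch of per-chunk retry follows. It assumes a hypothetical send_chunk(index, chunk) callable that raises ConnectionError when the network drops; the function names and the retry limit are illustrative choices, not part of any particular upload API.

```python
def split_chunks(data: bytes, chunk_size: int):
    """Split data into consecutive chunks of at most chunk_size bytes."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def upload_chunked(data: bytes, chunk_size: int, send_chunk, max_retries: int = 3) -> int:
    """Upload data chunk by chunk, retrying only the chunk that failed.

    send_chunk is a hypothetical callable (assumption) that raises
    ConnectionError on failure. Returns the total number of send attempts.
    """
    attempts = 0
    for index, chunk in enumerate(split_chunks(data, chunk_size)):
        for attempt in range(1, max_retries + 1):
            attempts += 1
            try:
                send_chunk(index, chunk)
                break                  # chunk succeeded, move to the next one
            except ConnectionError:
                if attempt == max_retries:
                    raise              # give up after max_retries failures
    return attempts
```

Chunks that were already sent successfully are never sent again, so a single dropped connection costs one chunk of bandwidth rather than the whole file.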
Your system expects 5,000 video uploads daily. Each raw video averages 500 MB. After transcoding, each video produces 3 different quality versions averaging 100 MB each. Videos are stored for 30 days before deletion. What is the approximate total storage needed for 30 days?
Calculate raw + processed storage per day, then multiply by 30 days.
Daily raw storage: 5,000 * 500 MB = 2,500,000 MB = 2.5 TB
Daily processed storage: 5,000 * 3 * 100 MB = 1,500,000 MB = 1.5 TB
Total daily storage = 2.5 + 1.5 = 4 TB
For 30 days: 4 TB * 30 = 120 TB
Because each day's videos are kept for 30 days, the system holds 30 days of uploads at steady state, so the total is 120 TB (decimal units: 1 TB = 1,000,000 MB).
This matches none of the listed options: D (45 TB) is too low, while B (225 TB), C (675 TB), and A (1.35 PB) are all too high.
C is the intended answer, but it only follows under assumptions the question does not state: a replication factor of 3 for durability gives 120 TB * 3 = 360 TB, and nearly 2x additional overhead would be needed on top of that to reach 675 TB.
From the numbers given, the best estimate is 120 TB of logical data, or 360 TB if 3x replication is assumed.
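The arithmetic above can be checked with a short script. The function name and the replication parameter are illustrative; the script assumes decimal units (1 TB = 1,000,000 MB) and that both raw and transcoded files are kept for the full retention period.

```python
MB_PER_TB = 1_000_000  # decimal units, as used in the calculation above


def retained_storage_tb(uploads_per_day, raw_mb, versions, version_mb,
                        retention_days, replication=1):
    """Steady-state storage in TB for a fixed retention window."""
    per_video_mb = raw_mb + versions * version_mb   # raw + all transcoded versions
    daily_mb = uploads_per_day * per_video_mb
    return daily_mb * retention_days * replication / MB_PER_TB


# Logical data only: 5,000 uploads/day, 500 MB raw, 3 versions of 100 MB, 30 days.
print(retained_storage_tb(5000, 500, 3, 100, 30))                  # 120.0

# Same workload with an assumed replication factor of 3.
print(retained_storage_tb(5000, 500, 3, 100, 30, replication=3))   # 360.0
```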
