Video recommendation system in HLD - System Design Exercise

Design: Video Recommendation System
This design covers user interaction, the recommendation engine, data storage, and API delivery. It does not cover video storage, streaming infrastructure, or content creation.
Functional Requirements
FR1: Provide personalized video recommendations to users based on their watch history and preferences
FR2: Support at least 10 million active users with concurrent access
FR3: Update recommendations in near real-time as users interact with videos
FR4: Allow users to rate videos and provide feedback to improve recommendations
FR5: Support trending and popular video recommendations globally and regionally
FR6: Ensure recommendations are relevant and diverse to keep users engaged
Non-Functional Requirements
NFR1: System should handle 10 million active users with 1 million concurrent requests
NFR2: API response latency for recommendations should be under 200ms (p99)
NFR3: System availability should be at least 99.9%
NFR4: Data freshness for recommendations should be within 5 minutes
NFR5: Scalable to handle growth in users and video content
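To make NFR2 and NFR3 concrete, the arithmetic behind them can be sketched as follows. This is a minimal illustration rather than part of the design: the function names are hypothetical, and the nearest-rank method is one common definition of a percentile that is assumed here, not specified by the requirements.

```python
import math
from typing import Sequence


def downtime_budget_minutes(availability: float, period_days: float) -> float:
    """Minutes of downtime allowed in a period for a given availability target."""
    return (1.0 - availability) * period_days * 24 * 60


def percentile_nearest_rank(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def meets_latency_target(latencies_ms: Sequence[float], target_ms: float = 200.0) -> bool:
    """Check NFR2: p99 latency must be under the target (200 ms assumed from NFR2)."""
    return percentile_nearest_rank(latencies_ms, 99) < target_ms
```

At 99.9% availability the budget works out to roughly 43 minutes of downtime in a 30-day month, which is why NFR3 already rules out long maintenance windows.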
Think Before You Design
Key Components
User profile service
Video metadata store
Watch history database
Recommendation engine
Real-time event processing pipeline
API gateway for serving recommendations
Cache layer for fast response
Analytics and feedback processing
Design Patterns
Batch and real-time hybrid processing
Microservices architecture
Event-driven architecture
Caching strategies (LRU, TTL)
Data partitioning and sharding
Load balancing and rate limiting
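The caching pattern listed above (LRU combined with TTL) can be illustrated with a small in-process sketch. It is an assumption-laden toy, not the production cache: the class name `LruTtlCache` is hypothetical, and a real deployment would rely on Redis or Memcached for eviction and expiry. The 300-second TTL in the usage note is chosen to mirror the 5-minute freshness target in NFR4.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Optional


class LruTtlCache:
    """Bounded cache: entries expire after ttl_seconds, and the least
    recently used entry is evicted when max_entries is exceeded."""

    def __init__(self, max_entries: int, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._max_entries = max_entries
        self._ttl_seconds = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: "OrderedDict[str, tuple]" = OrderedDict()

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Stale entry: drop it and report a miss.
            del self._entries[key]
            return None
        # Mark as most recently used.
        self._entries.move_to_end(key)
        return value

    def put(self, key: str, value: Any) -> None:
        self._entries[key] = (self._clock() + self._ttl_seconds, value)
        self._entries.move_to_end(key)
        # Evict from the least recently used end until within capacity.
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)
```

For example, `LruTtlCache(max_entries=100_000, ttl_seconds=300)` would keep at most 100,000 recommendation lists and treat any list older than five minutes as a miss.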
Reference Architecture
                    +---------------------+
                    |   User Devices      |
                    +----------+----------+
                               |
                               v
                    +----------+----------+
                    |     API Gateway     |
                    +----------+----------+
                               |
          +--------------------+--------------------+
          |                                         |
+---------v---------+                     +---------v---------+
| Recommendation    |                     | User Profile &    |
| Engine Service    |                     | Watch History DB  |
+---------+---------+                     +---------+---------+
          |                                         |
          |                                         |
+---------v---------+                     +---------v---------+
| Video Metadata DB |                     | Event Processing  |
+-------------------+                     | Pipeline (Kafka)  |
                                          +---------+---------+
                                                    |
                                          +---------v---------+
                                          | Feedback & Rating |
                                          | Processing Service|
                                          +-------------------+
Components
API Gateway
Nginx or Envoy
Handles incoming user requests and routes them to appropriate services
Recommendation Engine Service
Python/Java microservice with ML models
Generates personalized video recommendations using hybrid algorithms
User Profile & Watch History DB
NoSQL database like Cassandra or DynamoDB
Stores user preferences, watch history, and interaction data
Video Metadata DB
Relational DB like PostgreSQL or NoSQL like MongoDB
Stores video details such as title, category, tags, and popularity
Event Processing Pipeline
Apache Kafka
Processes real-time user events to update watch history and trigger recommendation updates
Feedback & Rating Processing Service
Microservice with batch and stream processing
Ingests user feedback and ratings to improve recommendation quality
Cache Layer
Redis or Memcached
Caches popular and personalized recommendations for low latency
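The Recommendation Engine Service is described as using hybrid algorithms, without saying which ones. One common reading, assumed here, is a weighted blend of collaborative-filtering scores and content-based scores. The sketch below shows that blend only; the function name `hybrid_rank`, the weight `alpha`, and the score dictionaries are all hypothetical, and the upstream models that would produce the scores are out of scope.

```python
from typing import Dict, List


def hybrid_rank(collaborative: Dict[str, float],
                content_based: Dict[str, float],
                alpha: float = 0.7,
                top_k: int = 10) -> List[str]:
    """Blend two per-video score maps and return the top_k video IDs.

    alpha weights the collaborative score; (1 - alpha) weights the
    content-based score. A video missing from one map scores 0.0 there.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    candidates = set(collaborative) | set(content_based)
    blended = {
        video_id: alpha * collaborative.get(video_id, 0.0)
        + (1.0 - alpha) * content_based.get(video_id, 0.0)
        for video_id in candidates
    }
    # Sort by descending score; break ties by video ID for deterministic output.
    return sorted(blended, key=lambda video_id: (-blended[video_id], video_id))[:top_k]
```

A lower `alpha` leans on content similarity, which may help new users who have little watch history for collaborative filtering to draw on.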
Request Flow
1. User sends request for video recommendations via API Gateway.
2. API Gateway forwards request to Recommendation Engine Service.
3. Recommendation Engine queries User Profile & Watch History DB and Video Metadata DB.
4. Recommendation Engine generates personalized recommendations using ML models.
5. Recommendations are returned to the user and cached in Redis so that repeat requests can be served without recomputation.
6. User watches or interacts with videos; events are sent to Event Processing Pipeline.
7. Event Processing updates User Profile & Watch History DB asynchronously.
8. Feedback & Rating Processing Service consumes events to refine recommendation models.
9. Updated recommendations are refreshed in cache within 5 minutes.
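The request flow above can be sketched as a cache-aside read path plus an event handler. This is a simplified, single-process illustration under several assumptions: plain dictionaries stand in for Redis and the watch history database, the event handler runs synchronously instead of consuming from Kafka, and invalidating the cached entry on each event is just one possible way to keep recommendations fresh. All names (`cache_key`, `get_recommendations`, `handle_watch_event`) are hypothetical.

```python
from typing import Callable, List, MutableMapping


def cache_key(user_id: str) -> str:
    """Key under which a user's recommendation list is cached."""
    return f"recs:{user_id}"


def get_recommendations(user_id: str,
                        cache: MutableMapping[str, List[str]],
                        compute: Callable[[str], List[str]]) -> List[str]:
    """Steps 1 to 5: serve from cache if present, otherwise compute and cache."""
    key = cache_key(user_id)
    cached = cache.get(key)
    if cached is not None:
        return cached
    recommendations = compute(user_id)  # stands in for the ML-backed engine
    cache[key] = recommendations
    return recommendations


def handle_watch_event(user_id: str,
                       video_id: str,
                       watch_history: MutableMapping[str, List[str]],
                       cache: MutableMapping[str, List[str]]) -> None:
    """Steps 6 to 9: record the interaction and drop the stale cached list,
    so the next request recomputes recommendations from updated history."""
    watch_history.setdefault(user_id, []).append(video_id)
    cache.pop(cache_key(user_id), None)
```

Invalidating on every event favors freshness over cache hit rate; relying on a TTL alone would do the opposite, so the two are often combined.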
Database Schema
Entities
User: user_id (PK), name, preferences, region
Video: video_id (PK), title, category, tags, upload_date, popularity_score
WatchHistory: user_id (FK), video_id (FK), watch_timestamp
Feedback: user_id (FK), video_id (FK), rating, comment, feedback_timestamp
Relationships
User to WatchHistory: 1 to many
Video to WatchHistory: 1 to many
User to Feedback: 1 to many
Video to Feedback: 1 to many
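The four entities can be sketched as Python dataclasses to make the fields and the one-to-many relationships easier to see. The field types (string IDs, integer ratings, list-valued tags and preferences) are assumptions, since the schema names the fields but not their types, and fields with defaults are moved to the end because dataclasses require that ordering. The helper `history_for_user` is hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class User:
    user_id: str  # primary key
    name: str
    region: str
    preferences: List[str] = field(default_factory=list)


@dataclass
class Video:
    video_id: str  # primary key
    title: str
    category: str
    upload_date: datetime
    tags: List[str] = field(default_factory=list)
    popularity_score: float = 0.0


@dataclass
class WatchHistory:
    user_id: str   # foreign key to User
    video_id: str  # foreign key to Video
    watch_timestamp: datetime


@dataclass
class Feedback:
    user_id: str   # foreign key to User
    video_id: str  # foreign key to Video
    rating: int
    feedback_timestamp: datetime
    comment: Optional[str] = None


def history_for_user(entries: List[WatchHistory], user_id: str) -> List[WatchHistory]:
    """One user maps to many WatchHistory rows; return them oldest first."""
    rows = [entry for entry in entries if entry.user_id == user_id]
    return sorted(rows, key=lambda entry: entry.watch_timestamp)
```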
Scaling Discussion
Bottlenecks
Recommendation Engine CPU and memory limits when generating recommendations for millions of users
Database read/write throughput for user profiles and watch history
Cache invalidation and freshness for personalized recommendations
Event processing pipeline lag under high user activity
API Gateway handling large concurrent request volume
Solutions
Use distributed recommendation engine with model partitioning and horizontal scaling
Partition and shard databases by user region or user ID to distribute load
Implement smart cache invalidation with TTL and partial updates
Scale Kafka clusters by adding brokers and topic partitions, and run more consumers within each consumer group so that events are processed in parallel
Deploy multiple API Gateway instances behind load balancers with autoscaling
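Sharding by user ID, one of the solutions above, can be sketched with a stable hash. The example is a minimal illustration with a hypothetical function name, `shard_for_user`; it uses SHA-256 from the standard library because Python's built-in `hash()` for strings is randomized per process and would route the same user to different shards after a restart.

```python
import hashlib


def shard_for_user(user_id: str, num_shards: int) -> int:
    """Map a user ID to a shard index in [0, num_shards) using a stable hash."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an unsigned integer, then reduce modulo shards.
    return int.from_bytes(digest[:8], "big") % num_shards
```

Plain modulo hashing remaps most users whenever `num_shards` changes, so a system that expects to add shards often might prefer consistent hashing instead. Sharding by region, the other option mentioned, keeps data close to users but can produce uneven shard sizes.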
Interview Tips
Time: For a 45-minute interview, spend 10 minutes clarifying requirements and constraints, 20 minutes designing architecture and data flow, 10 minutes discussing scaling and trade-offs, and 5 minutes summarizing.
Clarify personalization goals and data availability
Explain choice of hybrid recommendation algorithms
Describe how real-time and batch processing complement each other
Discuss caching strategies to meet latency targets
Highlight database design for scalability and consistency
Address bottlenecks and scaling solutions clearly