
Video recommendation system in HLD - System Design Exercise


Design: Video Recommendation System

This design covers user interaction, the recommendation engine, data storage, and API delivery. It does not cover video storage, streaming infrastructure, or content creation.

Functional Requirements

FR1: Provide personalized video recommendations to users based on their watch history and preferences

FR2: Support at least 10 million active users with concurrent access

FR3: Update recommendations in near real-time as users interact with videos

FR4: Allow users to rate videos and provide feedback to improve recommendations

FR5: Support trending and popular video recommendations globally and regionally

FR6: Ensure recommendations are relevant and diverse to keep users engaged

Non-Functional Requirements

NFR1: System should handle 10 million active users with 1 million concurrent requests

NFR2: API response latency for recommendations should be under 200ms (p99)

NFR3: System availability should be at least 99.9%

NFR4: Data freshness for recommendations should be within 5 minutes

NFR5: Scalable to handle growth in users and video content
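
The targets above translate into concrete budgets with a little arithmetic. The sketch below is a back-of-envelope calculation, not a sizing recommendation: the availability and concurrency figures come from NFR3 and NFR1, while `PER_INSTANCE_CONCURRENCY` is a purely hypothetical number chosen for illustration.

```python
import math

AVAILABILITY_TARGET = 0.999            # NFR3
PEAK_CONCURRENT_REQUESTS = 1_000_000   # NFR1
PER_INSTANCE_CONCURRENCY = 2_000       # ASSUMPTION: illustrative, not from the requirements


def downtime_budget_minutes(availability: float, period_minutes: int) -> float:
    """Minutes of downtime allowed in a period for a given availability target."""
    return (1 - availability) * period_minutes


def instances_needed(peak_concurrent: int, per_instance: int) -> int:
    """Smallest number of instances that can hold the peak concurrent load."""
    return math.ceil(peak_concurrent / per_instance)


yearly_budget = downtime_budget_minutes(AVAILABILITY_TARGET, 365 * 24 * 60)
monthly_budget = downtime_budget_minutes(AVAILABILITY_TARGET, 30 * 24 * 60)
serving_instances = instances_needed(PEAK_CONCURRENT_REQUESTS, PER_INSTANCE_CONCURRENCY)

print(f"Downtime budget: {yearly_budget:.1f} min/year, {monthly_budget:.1f} min/30 days")
print(f"Serving instances at the assumed per-instance capacity: {serving_instances}")
```

A 99.9% target allows about 525.6 minutes of downtime per year (roughly 8.8 hours), or 43.2 minutes per 30 days. The instance count changes directly with the assumed per-instance capacity, so it should be replaced with a measured value.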

Think Before You Design


Key Components

User profile service

Video metadata store

Watch history database

Recommendation engine

Real-time event processing pipeline

API gateway for serving recommendations

Cache layer for fast response

Analytics and feedback processing

Design Patterns

Batch and real-time hybrid processing

Microservices architecture

Event-driven architecture

Caching strategies (LRU eviction, TTL expiry)

Data partitioning and sharding

Load balancing and rate limiting
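
To make the rate limiting pattern concrete, here is a minimal token bucket sketch. Token bucket is one common algorithm for this, not necessarily the one an API gateway such as Nginx or Envoy would be configured with. The class name, parameters, and injectable `clock` are assumptions made for illustration.

```python
import time
from typing import Callable


class TokenBucket:
    """Allows bursts up to `capacity` and a sustained `rate` of requests per second."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock          # injectable so the limiter can be tested without sleeping
        self.tokens = capacity      # start with a full bucket
        self.updated = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Return True and spend tokens if the request fits, otherwise return False."""
        now = self.clock()
        # Refill in proportion to elapsed time, never exceeding capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


if __name__ == "__main__":
    limiter = TokenBucket(rate=5.0, capacity=10.0)
    accepted = sum(limiter.allow() for _ in range(20))
    print(f"Accepted {accepted} of 20 back-to-back requests")
```

In a multi-instance gateway the bucket state would have to live in shared storage or be enforced per instance, which is a trade-off worth raising in the design discussion.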

Reference Architecture

                    +---------------------+
                    |   User Devices      |
                    +----------+----------+
                               |
                               v
                    +----------+----------+
                    |     API Gateway     |
                    +----------+----------+
                               |
          +--------------------+--------------------+
          |                                         |
+---------v---------+                     +---------v---------+
| Recommendation    |                     | User Profile &    |
| Engine Service    |                     | Watch History DB  |
+---------+---------+                     +---------+---------+
          |                                         |
          |                                         |
+---------v---------+                     +---------v---------+
| Video Metadata DB |                     | Event Processing  |
+-------------------+                     | Pipeline (Kafka)  |
                                          +---------+---------+
                                                    |
                                          +---------v---------+
                                          | Feedback & Rating |
                                          | Processing Service|
                                          +-------------------+

Components

API Gateway

Nginx or Envoy

Handles incoming user requests and routes them to appropriate services

Recommendation Engine Service

Python/Java microservice with ML models

Generates personalized video recommendations using hybrid algorithms

User Profile & Watch History DB

NoSQL database like Cassandra or DynamoDB

Stores user preferences, watch history, and interaction data

Video Metadata DB

Relational DB like PostgreSQL or NoSQL like MongoDB

Stores video details such as title, category, tags, and popularity

Event Processing Pipeline

Apache Kafka

Processes real-time user events to update watch history and trigger recommendation updates

Feedback & Rating Processing Service

Microservice with batch and stream processing

Ingests user feedback and ratings to improve recommendation quality

Cache Layer

Redis or Memcached

Caches popular and personalized recommendations for low latency
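
The cache layer's behavior can be illustrated with a small in-process sketch that combines TTL expiry with LRU eviction. This is only a stand-in to show the semantics: in the design above Redis or Memcached would provide expiry and eviction. The class name and the 300-second TTL (chosen to match the 5-minute freshness target) are assumptions.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, Hashable, Optional


class TTLLRUCache:
    """In-memory cache with per-entry TTL expiry and LRU eviction when full."""

    def __init__(self, max_items: int, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_items = max_items
        self.ttl = ttl_seconds
        self.clock = clock
        self._data: "OrderedDict[Hashable, tuple]" = OrderedDict()  # key -> (expires_at, value)

    def get(self, key: Hashable) -> Optional[Any]:
        item = self._data.get(key)
        if item is None:
            return None
        expires_at, value = item
        if self.clock() >= expires_at:
            del self._data[key]          # expired: treat as a miss
            return None
        self._data.move_to_end(key)      # mark as most recently used
        return value

    def put(self, key: Hashable, value: Any) -> None:
        self._data[key] = (self.clock() + self.ttl, value)
        self._data.move_to_end(key)
        while len(self._data) > self.max_items:
            self._data.popitem(last=False)   # evict the least recently used entry


if __name__ == "__main__":
    cache = TTLLRUCache(max_items=2, ttl_seconds=300)
    cache.put("user:1", ["v10", "v11"])
    print(cache.get("user:1"))
```

A short TTL bounds staleness at the cost of more recomputation, while the size limit keeps memory use predictable for the long tail of inactive users.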

Request Flow

1. User sends request for video recommendations via API Gateway.

2. API Gateway forwards the request to the Recommendation Engine Service, which first checks the cache (Redis) and returns the cached recommendations on a hit.

3. On a cache miss, the Recommendation Engine queries the User Profile & Watch History DB and the Video Metadata DB.

4. The Recommendation Engine generates personalized recommendations using ML models.

5. The new recommendations are written to Redis for quick future access.

6. User watches or interacts with videos; events are sent to Event Processing Pipeline.

7. Event Processing updates User Profile & Watch History DB asynchronously.

8. Feedback & Rating Processing Service consumes events to refine recommendation models.

9. Updated recommendations are refreshed in cache within 5 minutes.
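
The flow above can be sketched end to end with in-memory stand-ins. Everything here is hypothetical: the dictionaries replace Redis and the watch history database, `compute` replaces the ML model call, and the event handler is called directly instead of consuming from Kafka. As a simplification, a watch event invalidates the user's cache entry immediately, where the design only requires a refresh within 5 minutes.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class RecommendationFlow:
    # Hypothetical model call: (user_id, watched video ids) -> ranked video ids.
    compute: Callable[[str, List[str]], List[str]]
    history: Dict[str, List[str]] = field(default_factory=dict)  # stand-in for the watch history DB
    cache: Dict[str, List[str]] = field(default_factory=dict)    # stand-in for Redis

    def recommend(self, user_id: str) -> List[str]:
        """Read path: serve from cache, otherwise compute and populate the cache."""
        cached = self.cache.get(user_id)
        if cached is not None:
            return cached
        recs = self.compute(user_id, self.history.get(user_id, []))
        self.cache[user_id] = recs
        return recs

    def on_watch_event(self, user_id: str, video_id: str) -> None:
        """Write path: record the interaction and drop the now-stale cache entry."""
        self.history.setdefault(user_id, []).append(video_id)
        self.cache.pop(user_id, None)


def unwatched_catalog(user_id: str, watched: List[str]) -> List[str]:
    """Toy 'model': recommend catalog videos the user has not watched yet."""
    catalog = ["v1", "v2", "v3"]
    return [v for v in catalog if v not in watched]


if __name__ == "__main__":
    flow = RecommendationFlow(compute=unwatched_catalog)
    print(flow.recommend("u1"))
    flow.on_watch_event("u1", "v2")
    print(flow.recommend("u1"))
```

Separating the read path from the event-driven write path is what lets the API stay within its latency target while history updates happen asynchronously.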

Database Schema

Entities:

- User: user_id (PK), name, preferences, region
- Video: video_id (PK), title, category, tags, upload_date, popularity_score
- WatchHistory: user_id (FK), video_id (FK), watch_timestamp
- Feedback: user_id (FK), video_id (FK), rating, comment, feedback_timestamp

Relationships:

- User to WatchHistory: 1 to many
- Video to WatchHistory: 1 to many
- User to Feedback: 1 to many
- Video to Feedback: 1 to many
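
As an illustration, the entities can be written as Python dataclasses. The field names follow the schema above, but the field types are assumptions, since the schema does not specify them. The helper `videos_watched_by` is hypothetical and only demonstrates the one-to-many relationship from User to WatchHistory.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class User:
    user_id: str                   # PK
    name: str
    preferences: Tuple[str, ...]   # ASSUMPTION: preferences modeled as category labels
    region: str


@dataclass(frozen=True)
class Video:
    video_id: str                  # PK
    title: str
    category: str
    tags: Tuple[str, ...]
    upload_date: datetime
    popularity_score: float


@dataclass(frozen=True)
class WatchHistory:
    user_id: str                   # FK -> User
    video_id: str                  # FK -> Video
    watch_timestamp: datetime


@dataclass(frozen=True)
class Feedback:
    user_id: str                   # FK -> User
    video_id: str                  # FK -> Video
    rating: int                    # ASSUMPTION: integer rating scale
    comment: Optional[str]
    feedback_timestamp: datetime


def videos_watched_by(history: List[WatchHistory], user_id: str) -> List[str]:
    """Return the user's watched video ids, most recent first."""
    rows = sorted((h for h in history if h.user_id == user_id),
                  key=lambda h: h.watch_timestamp, reverse=True)
    return [h.video_id for h in rows]
```

In a wide-column store such as Cassandra, the watch history would typically be partitioned by user_id and ordered by watch_timestamp so that this lookup reads a single partition.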

Scaling Discussion

Bottlenecks

Recommendation Engine CPU and memory limits when generating recommendations for millions of users

Database read/write throughput for user profiles and watch history

Cache invalidation and freshness for personalized recommendations

Event processing pipeline lag under high user activity

API Gateway handling large concurrent request volume

Solutions

Use distributed recommendation engine with model partitioning and horizontal scaling

Partition and shard databases by user region or user ID to distribute load

Implement smart cache invalidation with TTL and partial updates

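Scale the Kafka cluster by adding brokers and partitions, and add consumers within each consumer group to parallelize event processing; use separate consumer groups only for independent downstream consumers, since each group receives every event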

Deploy multiple API Gateway instances behind load balancers with autoscaling
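
Sharding by user ID, mentioned among the solutions above, can be sketched with a stable hash. This is a minimal illustration with assumed names and an arbitrary shard count. Databases such as Cassandra and DynamoDB perform their own partitioning internally, so this logic would only be needed when sharding is handled at the application level.

```python
import hashlib
from collections import Counter


def shard_for_user(user_id: str, num_shards: int) -> int:
    """Map a user id to a shard index using a hash that is stable across processes."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


if __name__ == "__main__":
    # Rough check that synthetic user ids spread evenly over 16 shards.
    counts = Counter(shard_for_user(f"user-{i}", 16) for i in range(10_000))
    print(min(counts.values()), max(counts.values()))
```

Python's built-in `hash()` is avoided here because string hashes are randomized per process. Plain modulo sharding also remaps most keys when the shard count changes, which is why consistent hashing is often preferred when shards are added frequently.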

Interview Tips

Time: Spend 10 minutes clarifying requirements and constraints, 20 minutes designing architecture and data flow, 10 minutes discussing scaling and trade-offs, 5 minutes summarizing.

Clarify personalization goals and data availability

Explain choice of hybrid recommendation algorithms

Describe how real-time and batch processing complement each other

Discuss caching strategies to meet latency targets

Highlight database design for scalability and consistency

Address bottlenecks and scaling solutions clearly
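
When explaining hybrid recommendation algorithms, a small example can help. The sketch below blends a collaborative-filtering score with a content-based score using a fixed weight. This is one simple form of hybridization, offered as an assumption rather than as the method this design prescribes. The function name, the weight `alpha`, and the score dictionaries are hypothetical.

```python
from typing import Dict, List


def hybrid_rank(collab: Dict[str, float], content: Dict[str, float],
                alpha: float = 0.7, k: int = 3) -> List[str]:
    """Rank videos by alpha * collaborative score + (1 - alpha) * content score.

    A video missing from one source scores 0.0 for that source.
    Ties are broken by video id so the output is deterministic.
    """
    candidates = set(collab) | set(content)
    blended = {
        video: alpha * collab.get(video, 0.0) + (1 - alpha) * content.get(video, 0.0)
        for video in candidates
    }
    return sorted(blended, key=lambda video: (-blended[video], video))[:k]


if __name__ == "__main__":
    collab_scores = {"a": 0.9, "b": 0.2}    # e.g. from users with similar watch history
    content_scores = {"b": 1.0, "c": 0.5}   # e.g. from tag or category similarity
    print(hybrid_rank(collab_scores, content_scores, alpha=0.5))
```

A lower `alpha` leans on content similarity, which helps new users and new videos that lack interaction data. A higher `alpha` favors behavior-based signals once enough watch history exists.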