
News feed generation in HLD - System Design Guide


Problem Statement

When millions of users each follow thousands of accounts, generating a personalized news feed in real time becomes a major challenge. Without an efficient system, users face slow load times, stale content, or incomplete feeds, which degrades the user experience and reduces engagement.

Solution

The system precomputes or dynamically generates personalized feeds by aggregating posts from followed users and ranking them based on relevance and freshness. It uses a combination of push and pull models, caching, and distributed storage to serve feeds quickly and keep them updated.
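
To make "ranking based on relevance and freshness" concrete, here is a minimal sketch. The scoring formula (a relevance signal multiplied by an exponential time decay), the `Post` fields, and the six-hour half-life are all assumptions chosen for illustration; the source does not specify how the two factors are combined.

```python
from dataclasses import dataclass


@dataclass
class Post:
    post_id: str
    author_id: str
    created_at: float  # Unix timestamp in seconds
    relevance: float   # hypothetical relevance signal in [0, 1]


def score(post: Post, now: float, half_life_s: float = 6 * 3600) -> float:
    """Assumed formula: relevance scaled by an exponential freshness decay."""
    age_s = max(0.0, now - post.created_at)
    freshness = 0.5 ** (age_s / half_life_s)  # halves every half_life_s seconds
    return post.relevance * freshness


def rank_feed(posts: list[Post], now: float, limit: int = 20) -> list[Post]:
    """Return the highest-scoring posts first, truncated to `limit`."""
    return sorted(posts, key=lambda p: score(p, now), reverse=True)[:limit]
```

With this formula, a fresh post with relevance 0.5 outranks a six-hour-old post with relevance 0.9, because the older post's score has decayed to 0.45. A production ranker would likely use many more signals.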

Architecture

[Diagram: User Posts → Feed Service; Feed Service ↓ Follower Graph]

This diagram shows how user posts flow into the feed service. The feed service consults follower data and ranking logic to generate personalized feeds, stores them in the cache and database, and serves them to users.
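
The read side of this architecture can be sketched as a cache-aside lookup: check the cache first and fall back to the database on a miss. This is only an illustration under stated assumptions. `FeedReader`, the in-memory dictionaries standing in for the cache and database, and the 60-second TTL are hypothetical and not part of the source design.

```python
import time
from typing import Optional


class FeedReader:
    """Cache-aside read path; plain dicts stand in for a real cache and database."""

    def __init__(self, feed_db: dict, ttl_s: float = 60.0):
        self.feed_db = feed_db   # user_id -> list of post IDs (stand-in database)
        self.ttl_s = ttl_s       # assumed cache lifetime in seconds
        self._cache = {}         # user_id -> (expires_at, feed)
        self.db_reads = 0        # counter, kept only to make cache hits visible

    def get_feed(self, user_id: str, now: Optional[float] = None) -> list:
        now = time.time() if now is None else now
        entry = self._cache.get(user_id)
        if entry is not None and entry[0] > now:
            return entry[1]      # cache hit: no database access
        # Cache miss or expired entry: read from the database and repopulate.
        self.db_reads += 1
        feed = list(self.feed_db.get(user_id, []))
        self._cache[user_id] = (now + self.ttl_s, feed)
        return feed
```

A TTL keeps the sketch simple, but it also means a cached feed can lag behind the database for up to `ttl_s` seconds, which is one form of the cache consistency cost.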

Trade-offs

✓ Pros

→ Supports personalized and timely feeds by combining push and pull strategies.

→ Scales horizontally by distributing feed generation and storage.

→ Improves user experience with caching and ranking for relevance.

✗ Cons

→ Complexity increases with maintaining consistency between cache and database.

→ High storage and compute costs for precomputing feeds for millions of users.

→ Latency can increase if feed generation is done fully on-demand.

Use when the user base exceeds hundreds of thousands, content updates are frequent, and users expect personalized feeds.

Avoid if the user base is small (under 10,000) or real-time personalization is not critical, since a simpler pull-only model suffices.
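
A back-of-the-envelope calculation helps show why precomputing feeds gets expensive at scale. Every number below is a made-up assumption for illustration (posts per day, average follower count, feed length, bytes per entry); substitute your own measurements.

```python
def fanout_writes_per_day(posts_per_day: int, avg_followers: int) -> int:
    """Each post is copied into every follower's precomputed feed."""
    return posts_per_day * avg_followers


def feed_storage_bytes(users: int, feed_length: int, bytes_per_entry: int) -> int:
    """Storage needed to keep `feed_length` entries per user."""
    return users * feed_length * bytes_per_entry


# Illustrative inputs only, not figures from any real system.
writes = fanout_writes_per_day(posts_per_day=10_000_000, avg_followers=200)
storage = feed_storage_bytes(users=50_000_000, feed_length=500, bytes_per_entry=16)

print(f"{writes:,} feed writes per day")         # 2,000,000,000 feed writes per day
print(f"{storage / 1e9:.0f} GB of feed entries")  # 400 GB of feed entries
```

Under these assumptions, write volume grows with the product of posting rate and follower count, which is why a small user base rarely justifies precomputation.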

Real World Examples

Facebook

Uses a combination of push and pull models to generate personalized news feeds that rank posts based on user interactions and freshness.

Twitter

Precomputes timelines for active users to serve feeds with low latency, while also supporting on-demand fetching for less active users.

Generates feeds by aggregating posts from connections and ranking them using machine learning models for relevance and engagement.

Alternatives

Pull-based feed generation

Feeds are generated on-demand by querying followed users' posts at request time without precomputation.

Use when: The user base is small or content updates are infrequent, so avoiding precomputation reduces storage and compute overhead.
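
A minimal sketch of the pull model, assuming each author's posts are stored newest-first as `(timestamp, post_id)` tuples. The function name `pull_feed` and the dictionary-based storage are hypothetical stand-ins for real queries.

```python
import heapq
from itertools import islice


def pull_feed(user_id, following, posts_by_author, limit=20):
    """Build the feed at request time by merging followed authors' posts.

    following:       user_id -> iterable of author IDs
    posts_by_author: author_id -> list of (timestamp, post_id), newest first
    """
    timelines = (posts_by_author.get(a, []) for a in following.get(user_id, ()))
    # heapq.merge lazily merges the already-sorted lists, newest first.
    merged = heapq.merge(*timelines, key=lambda p: p[0], reverse=True)
    return [post_id for _, post_id in islice(merged, limit)]
```

Nothing is stored per reader, but every request touches every followed author's posts, so read latency grows with the number of accounts followed.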

Push-based feed generation

Feeds are precomputed and pushed to users' feed storage immediately after content creation.

Use when: Low-latency feed delivery is critical and the user follow graph is relatively stable.
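
A minimal sketch of the push model (fan-out on write), using in-memory structures in place of real feed storage. `PushFeedStore` and the 500-entry feed cap are assumptions for illustration.

```python
from collections import defaultdict, deque


class PushFeedStore:
    """Fan-out on write: each new post is copied into every follower's feed."""

    def __init__(self, max_len=500):
        self.followers = defaultdict(set)  # author_id -> follower IDs
        # Bounded per-user feeds; the oldest entries fall off the right end.
        self.feeds = defaultdict(lambda: deque(maxlen=max_len))

    def follow(self, follower_id, author_id):
        self.followers[author_id].add(follower_id)

    def publish(self, author_id, post_id):
        # Write cost is proportional to the author's follower count.
        for follower_id in self.followers[author_id]:
            self.feeds[follower_id].appendleft(post_id)

    def get_feed(self, user_id, limit=20):
        # Reads are cheap: the feed is already assembled, newest first.
        return list(self.feeds[user_id])[:limit]
```

In this sketch a new follow does not backfill earlier posts, and an unfollow does not remove them. That is one reason the push model fits best when the follow graph changes slowly.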

Hybrid feed generation

Combines push for active users and pull for less active users to balance latency and resource usage.

Use when: User activity varies widely and the system needs to optimize resource usage.
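
The hybrid approach can be sketched as follows, assuming a known set of active users. `HybridFeed`, its fields, and the way "active" is determined are hypothetical; a real system would derive activity from login or request history.

```python
from collections import defaultdict


class HybridFeed:
    """Push to active users at write time; pull for everyone else at read time."""

    def __init__(self, active_users):
        self.active_users = set(active_users)     # assumed to be known up front
        self.following = defaultdict(set)         # user_id -> author IDs
        self.followers = defaultdict(set)         # author_id -> follower IDs
        self.posts_by_author = defaultdict(list)  # author_id -> [(ts, post_id)]
        self.precomputed = defaultdict(list)      # user_id -> [(ts, post_id)]

    def follow(self, user_id, author_id):
        self.following[user_id].add(author_id)
        self.followers[author_id].add(user_id)

    def publish(self, author_id, post_id, ts):
        self.posts_by_author[author_id].append((ts, post_id))
        # Push path: only active followers get a precomputed entry.
        for follower_id in self.followers[author_id]:
            if follower_id in self.active_users:
                self.precomputed[follower_id].append((ts, post_id))

    def get_feed(self, user_id, limit=20):
        if user_id in self.active_users:
            entries = self.precomputed[user_id]
        else:
            # Pull path: assemble the feed from followed authors on demand.
            entries = [e for a in self.following[user_id]
                       for e in self.posts_by_author[a]]
        return [post_id for _, post_id in sorted(entries, reverse=True)[:limit]]
```

Both kinds of user see the same feed contents, but only active users consume feed storage, while less active users pay the assembly cost on their occasional reads.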

Summary

News feed generation solves the challenge of delivering personalized content to millions of users efficiently.

It uses a mix of push and pull models, caching, and ranking to balance latency, freshness, and resource use.

Choosing the right feed generation approach depends on user scale, activity patterns, and latency requirements.