| Scale | Users | Feed Requests/Second | Data Volume | Latency Expectation | System Changes |
|---|---|---|---|---|---|
| Small | 100 | 10-50 | MBs of posts | < 1s | Single server, simple DB queries |
| Medium | 10,000 | 1,000-5,000 | GBs of posts | < 500ms | DB indexing, caching, load balancer |
| Large | 1,000,000 | 50,000-100,000 | TBs of posts | < 300ms | Feed pre-generation, sharded DB, distributed cache |
| Very Large | 100,000,000 | 5,000,000+ | Petabytes of posts | < 200ms | Massive horizontal scaling, CDN, microservices, data partitioning |
## News feed generation in HLD - Scalability & System Analysis
At small scale, per-request database queries dominate feed latency: each feed is built on demand by fetching and sorting the recent posts of every user the requester follows.
At medium scale, database CPU and disk I/O become the bottleneck, because thousands of concurrent feed requests each run this fan-in query against the same instance.
At large scale, network bandwidth and cache-invalidation delays become problems in their own right: shipping tens of GB/s of post data and keeping distributed caches fresh add latency on top of the database.
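The naive "pull" model behind these bottlenecks can be sketched as a single fan-in query per feed request. The schema and table names below are assumptions for illustration, using an in-memory SQLite database:

```python
import sqlite3

# Naive pull model: every feed request joins the follow graph against the
# posts table and sorts on demand. Schema is a minimal assumption.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE follows (follower INT, followee INT);
    CREATE TABLE posts (author INT, body TEXT, created_at INT);
    CREATE INDEX idx_posts_author_time ON posts(author, created_at);
""")
conn.executemany("INSERT INTO follows VALUES (?, ?)", [(1, 2), (1, 3)])
conn.executemany("INSERT INTO posts VALUES (?, ?, ?)",
                 [(2, "hello", 100), (3, "world", 200), (4, "hidden", 300)])

def naive_feed(user_id, limit=50):
    # Join + sort work grows with the follow count, and it repeats on
    # every request -- this is what saturates DB CPU/I/O at scale.
    return conn.execute("""
        SELECT p.author, p.body, p.created_at
        FROM posts p
        JOIN follows f ON f.followee = p.author
        WHERE f.follower = ?
        ORDER BY p.created_at DESC
        LIMIT ?""", (user_id, limit)).fetchall()

print(naive_feed(1))  # [(3, 'world', 200), (2, 'hello', 100)]
```

Indexing helps this query, but the per-request sort cost is why the later mitigations move work out of the read path.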
- Database Optimization: Add indexes, use read replicas to distribute read load.
- Caching: Use in-memory caches (e.g., Redis) to store popular feeds or feed fragments.
- Feed Pre-generation: Generate feeds offline and store them for quick retrieval.
- Sharding: Partition user data across multiple databases to reduce load per instance.
- Horizontal Scaling: Add more application servers behind load balancers.
- Content Delivery Network (CDN): Cache static content and reduce latency globally.
- Microservices: Split feed generation, the user service, and the post service so each can scale and be deployed independently.
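Feed pre-generation from the list above is usually implemented as fan-out on write: a new post is pushed into each follower's precomputed feed so reads become a cheap lookup. A minimal in-memory sketch (production systems would use Redis lists or similar; all names here are assumptions):

```python
from collections import defaultdict, deque

FEED_SIZE = 50  # keep only the newest N entries per user
followers = defaultdict(set)                           # author -> follower ids
feeds = defaultdict(lambda: deque(maxlen=FEED_SIZE))   # user -> newest-first feed

def follow(follower, author):
    followers[author].add(follower)

def publish(author, post):
    # Fan-out on write: pay the cost once at post time so each read
    # is an O(1) fetch of a precomputed feed.
    for f in followers[author]:
        feeds[f].appendleft(post)

def get_feed(user):
    return list(feeds[user])

follow("alice", "bob")
publish("bob", "post-1")
publish("bob", "post-2")
print(get_feed("alice"))  # ['post-2', 'post-1']
```

The trade-off: a celebrity with millions of followers makes writes expensive, which is why real systems mix fan-out on write for ordinary users with pull-on-read for high-follower accounts.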
- At 1M users with 100K feed requests/sec, assuming each feed request reads 50 posts (~10KB each), total data read = 100K * 50 * 10KB = ~50GB/s.
- Storage needed for posts: If each user generates 10 posts/day, 1M users produce 10M posts/day (~100GB/day assuming 10KB/post).
- Network bandwidth: 50GB/s of read traffic is ~400Gbps, i.e. on the order of forty 10Gbps links (before cache hits and CDN offload reduce it).
- CPU: Multiple servers needed to handle sorting and merging posts per feed request.
When explaining this in an interview:
1. Start with the user scale and traffic numbers.
2. Identify the main bottleneck (usually the database).
3. Discuss caching and feed pre-generation to reduce load.
4. Bring in sharding and horizontal scaling for large scale.
5. Justify why each solution addresses the bottleneck you named.
Your database handles 1000 QPS. Traffic grows 10x to 10,000 QPS. What do you do first?
Answer: Add read replicas and a caching layer first, since the database is the bottleneck and reads dominate feed traffic; the stateless application tier is easy to scale out afterwards.
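The answer's read path is the cache-aside pattern: check the cache first, fall back to a read replica on a miss, then populate the cache. A minimal sketch, with the store and query names as assumptions:

```python
import time

cache = {}   # user_id -> (feed, expiry timestamp); stand-in for Redis
TTL = 60     # seconds before a cached feed goes stale

def query_replica(user_id):
    # Stand-in for the real query against a read replica.
    return f"feed-for-{user_id}"

def get_feed(user_id):
    entry = cache.get(user_id)
    if entry and entry[1] > time.time():
        return entry[0]                        # hit: no DB load at all
    feed = query_replica(user_id)              # miss: goes to a replica,
    cache[user_id] = (feed, time.time() + TTL) # never the primary
    return feed

print(get_feed(42))  # → 'feed-for-42' (first call hits the replica)
```

Every cache hit is a query the database never sees, which is why this step comes before adding application servers.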
