Rating and review system in LLD - Scalability & System Analysis

Scalability Analysis - Rating and review system
Growth Table: Rating and Review System
Scale      | Users       | Reviews per day | Storage | Traffic   | System Changes
Small      | 100         | 500             | ~10 MB  | Low       | Single app server, single DB instance
Medium     | 10,000      | 50,000          | ~1 GB   | Moderate  | DB read replicas, caching, load balancer
Large      | 1,000,000   | 5,000,000       | ~100 GB | High      | Sharded DB, CDN for images, distributed cache
Very Large | 100,000,000 | 500,000,000     | ~10 TB  | Very High | Multi-region deployment, microservices, advanced sharding
First Bottleneck

As the system grows from small to medium scale, the database is the first bottleneck because every review write and every review or rating read goes through it. Writes grow with the number of new reviews, and reads grow with the number of users fetching them. Throughput is ultimately limited by the database's CPU and disk I/O.

Scaling Solutions
  • Database Read Replicas: Offload read queries to replicas to reduce load on primary DB.
  • Caching: Use in-memory caches (e.g., Redis) for frequently read data like average ratings.
  • Horizontal Scaling: Add more application servers behind a load balancer to handle more user requests.
  • Sharding: Partition the database by product or user ID to distribute write and read load.
  • CDN: Serve review images and static content via CDN to reduce bandwidth and latency.
  • Asynchronous Processing: Use message queues to handle heavy write operations asynchronously.
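The caching item above can be sketched as a cache-aside read path for average ratings. This is a minimal illustration, not a production design: the class name `AverageRatingCache`, the `load_from_db` callback, and the 60-second TTL are all assumptions, and a plain in-process dictionary stands in for an external cache such as Redis.

```python
import time


class AverageRatingCache:
    """Cache-aside lookup for average ratings with a simple TTL.

    A dict stands in for an external cache; `load_from_db` is a
    hypothetical callback that reads the average from the database.
    """

    def __init__(self, load_from_db, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load_from_db
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # product_id -> (expires_at, average)

    def get(self, product_id):
        now = self._clock()
        entry = self._entries.get(product_id)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: no database read
        # Cache miss or expired entry: read from the DB, then populate the cache.
        value = self._load(product_id)
        self._entries[product_id] = (now + self._ttl, value)
        return value

    def invalidate(self, product_id):
        # Call this after a review is added, edited, or deleted.
        self._entries.pop(product_id, None)
```

Repeated reads for a popular product hit the dictionary instead of the database until the entry expires or is invalidated. The TTL trades freshness for load: a longer TTL removes more reads but lets a stale average be served for longer.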
Back-of-Envelope Cost Analysis
  • Requests per second (RPS): At 1M users with 5M reviews/day, writes are 5,000,000 / 86,400 ≈ 58, or roughly 60 reviews/sec. Assuming 100 reads per review written, reads are ~6,000/sec, for ~6,060 RPS total.
  • Storage: Average review size is ~2 KB (text + metadata). 5M reviews/day -> 10 GB/day. With 10 days of retention -> ~100 GB of storage.
  • Bandwidth: Assuming 100 KB per review fetch (including images), 6,000 reads/sec -> ~600 MB/s (~4.8 Gbps). This requires a CDN and network scaling.
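The estimates above can be reproduced with a few lines of arithmetic. The sketch below uses the same assumptions as the text (100 reads per review, 2 KB per review, 10 days of retention, 100 KB per fetch) and decimal units; the function name `estimate` and its parameter names are illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def estimate(reviews_per_day, reads_per_review=100, review_kb=2,
             retention_days=10, fetch_kb=100):
    """Back-of-envelope load estimate using decimal units (1 GB = 1,000,000 KB)."""
    writes_per_sec = reviews_per_day / SECONDS_PER_DAY
    reads_per_sec = writes_per_sec * reads_per_review
    storage_gb = reviews_per_day * review_kb * retention_days / 1_000_000
    bandwidth_mb_s = reads_per_sec * fetch_kb / 1_000
    return {
        "writes_per_sec": writes_per_sec,
        "reads_per_sec": reads_per_sec,
        "storage_gb": storage_gb,
        "bandwidth_mb_s": bandwidth_mb_s,
        "bandwidth_gbps": bandwidth_mb_s * 8 / 1_000,
    }


if __name__ == "__main__":
    for name, value in estimate(5_000_000).items():
        print(f"{name}: {value:,.1f}")
```

For 5,000,000 reviews/day the unrounded figures come out near 58 writes/sec, about 5,800 reads/sec, 100 GB of storage, and roughly 580 MB/s (about 4.6 Gbps). The text rounds reads up to 6,000/sec, which is where 600 MB/s and 4.8 Gbps come from.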
Interview Tip

Start by clarifying scale and usage patterns. Identify bottlenecks step-by-step: database, app servers, network. Propose solutions matching bottlenecks: caching for reads, sharding for writes, CDN for media. Discuss trade-offs and monitoring strategies.

Self Check

Your database handles 1000 QPS. Traffic grows 10x to 10,000 QPS. What do you do first?

Answer: Assuming the workload is read-heavy, as it typically is for reviews, first add read replicas to offload read queries and add caching to reduce DB load. Consider sharding or more app servers only after that.

Key Result
The database becomes the first bottleneck as user and review volume grows; scaling requires read replicas, caching, and sharding to maintain performance.
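Sharding by product ID, mentioned above, needs a stable mapping from ID to shard. The sketch below is one simple way to do it, assuming string product IDs and a fixed shard count; the function name `shard_for` is hypothetical. It uses SHA-256 rather than Python's built-in `hash()`, because `hash()` on strings is randomized per process and would route the same product differently after a restart.

```python
import hashlib


def shard_for(product_id: str, num_shards: int) -> int:
    """Map a product ID to a shard index in [0, num_shards)."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    digest = hashlib.sha256(product_id.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % num_shards
```

All reviews for one product land on the same shard, so fetching a product's reviews touches a single database. The trade-off of plain modulo routing is that changing `num_shards` remaps most keys; consistent hashing is a common alternative when shards are added often.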

Practice

1. What is the primary purpose of a rating and review system in an online store?
easy
A. To process payment transactions
B. To collect user feedback and calculate average product ratings
C. To manage product inventory levels
D. To store user passwords securely

Solution

  1. Step 1: Understand the system's goal

    A rating and review system is designed to gather user opinions and ratings about products.
  2. Step 2: Identify the main function

    It calculates average ratings to help other users make decisions quickly.
  3. Final Answer:

    To collect user feedback and calculate average product ratings -> Option B
  4. Quick Check:

    Rating system = Collect feedback + average rating [OK]
Hint: Focus on feedback and rating calculation
Common Mistakes:
  • Confusing rating system with payment or inventory systems
  • Thinking it manages user credentials
  • Assuming it handles shipping or delivery
2. Which data structure is best suited to store individual reviews for quick lookup by product ID?
easy
A. Hash map with product ID as key and list of reviews as value
B. Array of reviews without indexing
C. Linked list of all reviews
D. Stack of reviews

Solution

  1. Step 1: Consider lookup efficiency

    Quick lookup by product ID requires a data structure with fast key-based access.
  2. Step 2: Choose appropriate structure

    A hash map (dictionary) allows O(1) average time to find reviews by product ID.
  3. Final Answer:

    Hash map with product ID as key and list of reviews as value -> Option A
  4. Quick Check:

    Fast lookup = Hash map [OK]
Hint: Use hash maps for fast key-based lookup
Common Mistakes:
  • Using arrays without indexing causes slow searches
  • Linked lists have O(n) lookup time
  • Stacks do not support direct lookup by key
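The hash map from Option A can be sketched in a few lines. The `Review` fields and the helper names `add_review` and `get_reviews` are assumptions made for illustration; an in-memory dictionary stands in for whatever store a real system would use.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Review:
    user_id: str
    rating: int
    text: str


# product_id -> list of reviews for that product
reviews_by_product = defaultdict(list)


def add_review(product_id: str, review: Review) -> None:
    reviews_by_product[product_id].append(review)


def get_reviews(product_id: str) -> list:
    # .get() avoids creating an empty entry for products with no reviews.
    return reviews_by_product.get(product_id, [])


add_review("p1", Review("u1", 5, "Great"))
add_review("p1", Review("u2", 3, "Average"))
print(len(get_reviews("p1")), len(get_reviews("p2")))  # 2 0
```

Finding a product's reviews is an average O(1) dictionary lookup, regardless of how many other products exist.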
3. Given the following pseudocode for updating average rating after a new review:
current_avg = 4.0
num_reviews = 5
new_rating = 5
new_avg = (current_avg * num_reviews + new_rating) / (num_reviews + 1)

What is the value of new_avg?
medium
A. 4.17
B. 4.16
C. 4.0
D. 4.5

Solution

  1. Step 1: Calculate total rating sum before new review

    Total sum = current_avg * num_reviews = 4.0 * 5 = 20
  2. Step 2: Add new rating and compute new average

    New sum = 20 + 5 = 25
    New average = 25 / (5 + 1) = 25 / 6 ≈ 4.1667
  3. Final Answer:

    4.17 -> Option A
  4. Quick Check:

    Average update formula ≈ 4.17 [OK]
Hint: Multiply avg by count, add new, divide by count+1
Common Mistakes:
  • Forgetting to add new rating to total sum
  • Dividing by old count instead of count+1
  • Rounding too early causing wrong average
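The update formula from this question can be wrapped in a small function. The name `add_rating` is illustrative; the arithmetic is the same as in the pseudocode, and rounding is left to the caller so that it happens only at display time.

```python
def add_rating(current_avg: float, num_reviews: int, new_rating: float):
    """Return (new_avg, new_count) after one more rating is added."""
    new_count = num_reviews + 1
    new_avg = (current_avg * num_reviews + new_rating) / new_count
    return new_avg, new_count


new_avg, new_count = add_rating(4.0, 5, 5)
print(round(new_avg, 2), new_count)  # 4.17 6
```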
4. A rating system stores average rating and count per product. After deleting a review, the average becomes incorrect. What is the likely cause?
medium
A. Recalculating average using sum of all reviews
B. Using integer division instead of float division
C. Not updating the count of reviews after deletion
D. Storing reviews in a hash map

Solution

  1. Step 1: Understand average calculation

    Average = sum of ratings / count of reviews. Both must be accurate.
  2. Step 2: Identify deletion impact

    If the count is not decremented after a review is deleted, the average is computed by dividing by the wrong count.
  3. Final Answer:

    Not updating the count of reviews after deletion -> Option C
  4. Quick Check:

    Count mismatch causes wrong average [OK]
Hint: Always update count when reviews change
Common Mistakes:
  • Ignoring count update after deletion
  • Assuming recalculation always fixes average
  • Confusing data structure choice with calculation error
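A deletion that keeps the stored average and count consistent might look like the sketch below. The function name `remove_rating` is hypothetical, and returning an average of 0.0 when the last review is removed is an assumption; a real system might store "no rating" instead.

```python
def remove_rating(current_avg: float, num_reviews: int, removed_rating: float):
    """Return (new_avg, new_count) after one rating is deleted."""
    if num_reviews <= 0:
        raise ValueError("no reviews to remove")
    new_count = num_reviews - 1
    if new_count == 0:
        return 0.0, 0  # assumption: an empty product reports 0.0
    # Subtract the removed rating from the total AND divide by the new count.
    new_avg = (current_avg * num_reviews - removed_rating) / new_count
    return new_avg, new_count
```

For a product with ratings summing to 25 over 6 reviews, deleting a 5-star review should leave 20 / 5 = 4.0. Subtracting the rating but still dividing by the old count of 6 gives about 3.33, which is the bug described in Option C.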
5. You want to design a scalable rating and review system for millions of products and users. Which approach best balances fast average rating queries and frequent review updates?
hard
A. Store all reviews and compute average on each query
B. Use a single database table without indexes
C. Cache only the latest review per product
D. Maintain precomputed average and count, update incrementally on review changes

Solution

  1. Step 1: Consider query and update load

    Millions of products and users mean many queries and updates.
  2. Step 2: Choose efficient strategy

    Precomputing average and count and updating them incrementally avoids scanning all reviews each time.
  3. Step 3: Evaluate other options

    Computing the average on every query is slow, a table without indexes makes lookups slow, and caching only the latest review loses the rest of the rating information.
  4. Final Answer:

    Maintain precomputed average and count, update incrementally on review changes -> Option D
  5. Quick Check:

    Precompute + incremental update = scalable [OK]
Hint: Precompute averages, update on changes for scale
Common Mistakes:
  • Recomputing averages on every query
  • Ignoring indexing and caching strategies
  • Caching incomplete data causing stale info
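Option D can be sketched as a small in-memory aggregate store. The class name `RatingAggregates` and its methods are assumptions for illustration. It keeps a rating sum and a count per product rather than the average itself, which is one reasonable variant: with integer ratings the sum stays exact, and the average is derived on read.

```python
class RatingAggregates:
    """Per-product rating sum and count, updated incrementally."""

    def __init__(self):
        self._totals = {}  # product_id -> [rating_sum, rating_count]

    def add(self, product_id, rating):
        totals = self._totals.setdefault(product_id, [0, 0])
        totals[0] += rating
        totals[1] += 1

    def update(self, product_id, old_rating, new_rating):
        # An edited review changes the sum but not the count.
        self._totals[product_id][0] += new_rating - old_rating

    def remove(self, product_id, rating):
        totals = self._totals[product_id]
        totals[0] -= rating
        totals[1] -= 1
        if totals[1] == 0:
            del self._totals[product_id]

    def average(self, product_id):
        totals = self._totals.get(product_id)
        if totals is None:
            return None  # no reviews yet
        return totals[0] / totals[1]
```

Every operation is O(1) per review change, and reading the average never scans the review list. In a real deployment the two counters would live in the database or cache and be updated in the same transaction as the review itself.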