
Why Caching Reduces Latency in HLD

Problem Statement
When every request must fetch data from a slow or distant source like a database or external API, users experience long wait times. This delay causes poor user experience and can overload the backend systems with repeated identical requests.
Solution
Caching stores frequently requested data closer to the user or application, so future requests can be served instantly without reaching the slow source. This reduces the time spent waiting for data retrieval and lowers the load on backend systems.
Architecture
┌─────────────┐       ┌───────────────┐       ┌───────────────┐
│   Client    │──────▶│    Cache      │──────▶│   Database    │
└─────────────┘       └───────────────┘       └───────────────┘
        │                    │                      │
        │                    │                      │
        │◀───────────────────┘                      │
        │                                           │
        │◀──────────────────────────────────────────┘

This diagram shows the client first checking the cache for data. If the cache has it (cache hit), the data returns immediately. If not (cache miss), the cache fetches data from the database and returns it to the client.
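The flow in the diagram can be sketched in Python as a cache-aside lookup. The in-memory dict and `fetch_from_database` are illustrative stand-ins (in a real system the cache would be something like Redis or Memcached), not a specific library API:

```python
import time

# Stand-in for a slow backing store; the data and delay are assumptions.
DATABASE = {"user:1": {"name": "Ada"}, "user:2": {"name": "Grace"}}

def fetch_from_database(key):
    time.sleep(0.05)  # simulate network round trip + query latency
    return DATABASE.get(key)

cache = {}  # in production: Redis, Memcached, or similar

def get(key):
    if key in cache:
        return cache[key]             # cache hit: served from memory
    value = fetch_from_database(key)  # cache miss: pay the slow path once
    cache[key] = value                # populate so repeat requests are fast
    return value
```

The first `get("user:1")` pays the simulated database latency; every subsequent call for the same key returns from the dict without touching the database, which is exactly the latency and load reduction described above.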

Trade-offs
✓ Pros
Significantly reduces response time for repeated requests by avoiding slow data sources.
Decreases load on backend databases or services, improving overall system stability.
Improves user experience with faster data access.
✗ Cons
Cache data can become stale if not updated properly, leading to outdated responses.
Adds complexity to the system with cache invalidation and consistency management.
Requires additional memory or storage resources to hold cached data.
Use caching when read requests are frequent and data changes relatively slowly, especially if backend data sources have high latency or limited capacity.
Avoid caching when data changes very rapidly or must always be fresh, or when the system handles very low traffic where caching overhead outweighs benefits.
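The staleness risk noted in the cons is commonly bounded with a time-to-live (TTL): each entry expires after a fixed window, so outdated data is refetched rather than served forever. A minimal sketch, assuming a 60-second staleness window is acceptable (the `loader` callback and `now` parameter are illustrative):

```python
import time

TTL_SECONDS = 60  # assumption: acceptable staleness window

cache = {}  # key -> (value, expires_at)

def get_with_ttl(key, loader, now=time.time):
    entry = cache.get(key)
    if entry is not None:
        value, expires_at = entry
        if now() < expires_at:
            return value      # fresh hit: serve from cache
        del cache[key]        # expired: evict and fall through to refetch
    value = loader(key)       # slow path: load from the source of truth
    cache[key] = (value, now() + TTL_SECONDS)
    return value
```

Choosing the TTL is the trade-off in miniature: a short TTL keeps data fresher but sends more traffic to the backend, while a long TTL maximizes hit rate at the cost of serving older data.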
Real World Examples
Netflix
Caches user viewing history and recommendations at edge servers to deliver instant responses without querying central databases.
Amazon
Uses caching for product details and pricing to reduce database load during high traffic sales events.
Twitter
Caches user timelines and tweets to serve millions of read requests quickly without hitting the main database every time.
Alternatives
CDN (Content Delivery Network)
Caches static content like images and videos geographically closer to users rather than dynamic data.
Use when: the main latency issue is delivering large static files rather than database query speed.
Database Replication
Creates multiple copies of the database to distribute read load instead of caching data separately.
Use when: data freshness is critical and caching complexity is undesirable.
Summary
Caching stores frequently accessed data closer to the user to reduce wait times.
It lowers backend load and improves user experience by serving repeated requests quickly.
Proper cache management is essential to avoid stale data and system complexity.