
Multi-level caching in HLD - Architecture Diagram

System Overview - Multi-level caching

This system uses multiple cache layers to speed up data access and reduce load on the primary database. A read first checks a fast local cache on the service instance, then a shared distributed cache, and only queries the database when both layers miss. This design improves response time and system scalability.
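The lookup order described above can be sketched as a small read-through helper. This is a minimal illustration, not a production implementation: plain Python dicts stand in for the real local cache, the distributed cache (e.g. a Redis-like store), and the database client.

```python
class MultiLevelCache:
    """Read path: local cache -> distributed cache -> database."""

    def __init__(self, local, distributed, db):
        self.local = local              # per-instance in-memory store
        self.distributed = distributed  # shared store across instances
        self.db = db                    # persistent source of truth

    def get(self, key):
        # 1. Fastest: in-process local cache.
        value = self.local.get(key)
        if value is not None:
            return value
        # 2. Shared distributed cache; promote hits into the local cache.
        value = self.distributed.get(key)
        if value is not None:
            self.local[key] = value
            return value
        # 3. Fall back to the database, then populate both cache layers.
        value = self.db.get(key)
        if value is not None:
            self.distributed[key] = value
            self.local[key] = value
        return value

# Usage (plain dicts stand in for the real clients):
db = {"user:1": "Alice"}
cache = MultiLevelCache(local={}, distributed={}, db=db)
print(cache.get("user:1"))  # served from the database, then cached at both levels
```

Promoting distributed-cache hits into the local cache keeps hot keys close to each instance without an extra database round trip.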

Architecture Diagram
User
  |
  v
Load Balancer
  |
  v
API Gateway
  |
  v
Service
  |
  +--> Local Cache
  |
  +--> Distributed Cache
  |
  v
Database
Components

User (client): Initiates requests to the system
Load Balancer (load_balancer): Distributes incoming requests evenly across service instances
API Gateway (api_gateway): Handles request routing, authentication, and throttling
Service (service): Processes requests and manages cache lookups and database queries
Local Cache (cache): Fast in-memory cache local to the service instance for quick data retrieval
Distributed Cache (cache): Shared cache across services to reduce database load and improve data availability
Database (database): Persistent storage for all data
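The Local Cache component is typically a small in-memory store with a time-to-live so stale entries age out. A minimal sketch, assuming lazy expiry on read (the TTL value and class name here are illustrative, not from the original design):

```python
import time

class TTLLocalCache:
    """In-process cache with per-entry time-to-live and lazy expiry."""

    def __init__(self, ttl_seconds=60.0):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (value, expiry timestamp)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() > expires_at:
            del self._store[key]  # expired: drop it on read
            return None
        return value

    def set(self, key, value):
        self._store[key] = (value, time.monotonic() + self.ttl)
```

Lazy expiry avoids a background sweeper thread; the trade-off is that expired entries linger in memory until the next read touches them.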
Request Flow - 14 Hops
1. User -> Load Balancer
2. Load Balancer -> API Gateway
3. API Gateway -> Service
4. Service -> Local Cache (lookup: miss)
5. Local Cache -> Service
6. Service -> Distributed Cache (lookup: miss)
7. Distributed Cache -> Service
8. Service -> Database (read)
9. Database -> Service
10. Service -> Distributed Cache (populate)
11. Service -> Local Cache (populate)
12. Service -> API Gateway (response)
13. API Gateway -> Load Balancer
14. Load Balancer -> User
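The cache-lookup hops in this flow (Service to Local Cache, Distributed Cache, and Database, with populate steps on the way back) can be traced with a small sketch. The `trace` list and hop labels are illustrative; dicts stand in for the real stores.

```python
def cache_aside_read(key, local, distributed, db, trace):
    """Cold-read path: both cache levels miss, database answers, caches are populated."""
    trace.append("Service -> Local Cache")
    value = local.get(key)
    trace.append("Local Cache -> Service")
    if value is not None:
        return value
    trace.append("Service -> Distributed Cache")
    value = distributed.get(key)
    trace.append("Distributed Cache -> Service")
    if value is not None:
        local[key] = value  # promote shared-cache hit to local
        return value
    trace.append("Service -> Database")
    value = db.get(key)
    trace.append("Database -> Service")
    if value is not None:
        trace.append("Service -> Distributed Cache")
        distributed[key] = value  # populate shared layer
        trace.append("Service -> Local Cache")
        local[key] = value        # populate local layer
    return value
```

On a fully cold read the trace contains eight Service-side hops, matching the middle of the flow above; warm reads short-circuit after the first hit.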
Failure Scenario
Component Fails: Distributed Cache
Impact: Cache misses increase, causing more database queries and higher latency. The local cache still serves some requests, but overall load on the database rises.
Mitigation: The system continues to operate using the local cache and the database. Adding cache replication and fallback strategies can further improve availability.
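One common fallback strategy is to treat a distributed-cache outage as a cache miss rather than a request failure. A hedged sketch, where the exception type and helper name are assumptions for illustration:

```python
class DistributedCacheDown(Exception):
    """Stand-in for a connection or timeout error from the shared cache."""

def get_with_fallback(key, local, distributed, db):
    # Local cache is unaffected by the outage and still shields hot keys.
    value = local.get(key)
    if value is not None:
        return value
    try:
        value = distributed.get(key)
    except DistributedCacheDown:
        value = None  # degrade gracefully: treat the outage as a miss
    if value is None:
        value = db.get(key)  # database absorbs the extra load
        if value is not None:
            local[key] = value  # repopulate the surviving cache layer
    return value
```

In production this try/except would usually be wrapped in a circuit breaker with a short timeout, so a slow or dead cache does not add latency to every request.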
Architecture Quiz - 3 Questions
Test your understanding
Which component does the service check first to find cached data?
ADatabase
BDistributed Cache
CLocal Cache
DAPI Gateway
Design Principle
Multi-level caching uses a fast local cache for immediate hits and a shared distributed cache for broader data availability. This layered approach reduces database load and improves response times; cache expiry and invalidation keep the layers reasonably consistent as the system scales.
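On the write side, one common way to keep the layers consistent is write-through with local invalidation: persist first, refresh the shared cache, and invalidate the instance-local copy. A minimal sketch under those assumptions (dicts stand in for the real stores, and the helper name is illustrative):

```python
def write_through(key, value, local, distributed, db):
    db[key] = value           # persist first: the database is the source of truth
    distributed[key] = value  # refresh the shared layer for all instances
    local.pop(key, None)      # invalidate this instance's stale local copy
```

Note that other instances' local caches still hold stale values until their TTLs expire, which is why local-cache TTLs are usually kept short relative to the distributed cache.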