Multi-level caching reduces latency by placing caches at several layers (e.g., CPU cache, application cache, distributed cache), so a request is served from the fastest layer that holds the data and falls through to slower layers only on a miss.
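The lookup order described above can be sketched as a small read-through cache. This is a minimal illustration, not a production design: the class name `TwoLevelCache`, the `load_from_store` callback, and the use of plain dicts for both levels are assumptions made for the example, and eviction is left out entirely.

```python
from typing import Callable, Dict


class TwoLevelCache:
    """Hypothetical two-level read-through cache (no eviction, not thread-safe)."""

    def __init__(self, load_from_store: Callable[[str], str]) -> None:
        self.l1: Dict[str, str] = {}  # stands in for a small, fast cache
        self.l2: Dict[str, str] = {}  # stands in for a larger, slower cache
        self.load_from_store = load_from_store  # slowest tier, e.g. a database

    def get(self, key: str) -> str:
        # 1. Fastest layer first.
        if key in self.l1:
            return self.l1[key]
        # 2. On an L1 miss, try L2 and promote the value into L1.
        if key in self.l2:
            value = self.l2[key]
            self.l1[key] = value
            return value
        # 3. On a miss in both caches, go to the backing store and fill both.
        value = self.load_from_store(key)
        self.l2[key] = value
        self.l1[key] = value
        return value
```

Promoting L2 hits into L1 is one possible policy; a real system would also bound the size of each level.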
Persistent storage (SSD or HDD) is the slowest tier but offers the largest capacity compared with caches and RAM.
Request coalescing or locking ensures only one request fetches data on a cache miss, preventing many simultaneous database hits (cache stampede).
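One way to picture the locking variant is a per-key lock with a second cache check after the lock is acquired. The sketch below is an assumption-laden illustration for a single process using threads: `CoalescingCache` and `loader` are hypothetical names, and a distributed cache would need a distributed lock or a different coalescing mechanism.

```python
import threading
from typing import Callable, Dict


class CoalescingCache:
    """Hypothetical in-process cache that lets one thread per key load on a miss."""

    def __init__(self, loader: Callable[[str], str]) -> None:
        self._loader = loader                      # e.g. a database query
        self._values: Dict[str, str] = {}
        self._key_locks: Dict[str, threading.Lock] = {}
        self._guard = threading.Lock()             # protects _key_locks

    def _lock_for(self, key: str) -> threading.Lock:
        with self._guard:
            return self._key_locks.setdefault(key, threading.Lock())

    def get(self, key: str) -> str:
        # Fast path: a cache hit needs no per-key lock.
        try:
            return self._values[key]
        except KeyError:
            pass
        # Slow path: only one thread per key runs the loader.
        with self._lock_for(key):
            # Re-check: another thread may have filled the entry while we waited.
            if key in self._values:
                return self._values[key]
            value = self._loader(key)
            self._values[key] = value
            return value
```

Threads that arrive during the load block on the per-key lock and then read the cached value instead of querying the database again.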
Allowing some stale data improves performance by reducing synchronization overhead but sacrifices strong consistency.
Overall hit rate = L1 hit rate + (L1 miss rate × L2 hit rate) = 0.9 + (0.1 × 0.8) = 0.9 + 0.08 = 0.98 (98%)
Check: the question defines the L2 hit rate as the fraction of L1 misses that L2 serves, so the L2 term is weighted by the L1 miss rate (0.1), and the combined hit rate is 98%.
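The same arithmetic can be checked with a few lines of code. The function name `overall_hit_rate` is made up for this example, and it assumes, as the question does, that the L2 hit rate is measured only over requests that missed L1.

```python
def overall_hit_rate(l1_hit_rate: float, l2_hit_rate: float) -> float:
    """Combined hit rate when L2 only sees requests that missed L1."""
    l1_miss_rate = 1.0 - l1_hit_rate
    return l1_hit_rate + l1_miss_rate * l2_hit_rate


# Values from the question: L1 = 90%, L2 = 80% of L1 misses.
print(round(overall_hit_rate(0.9, 0.8), 2))  # 0.98
```

Equivalently, the overall miss rate is the product of the two miss rates, 0.1 × 0.2 = 0.02, which gives the same 98%.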