
Circuit breaker pattern in HLD - System Design Guide

Problem Statement
When a downstream service or resource becomes slow or unresponsive, continuing to send it requests causes long delays and resource exhaustion. Callers tie up threads and connections while they wait on the failing component, so the slowdown spreads upstream and can cascade until the entire system slows down or crashes.
Solution
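The circuit breaker pattern monitors the success and failure of requests to a service. While failures stay below a threshold, the breaker is closed and requests pass through normally. When failures exceed the threshold, the breaker opens: it temporarily stops sending requests to that service and returns errors immediately. After a cooldown period, it moves to a half-open state and lets a trial request through to see whether the service has recovered. A successful trial closes the breaker again, while a failed one reopens it. This prevents system overload and improves overall resilience.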
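The behavior described above can be sketched in a few lines of Python. This is a minimal, single-threaded illustration rather than a production implementation. The names `CircuitBreaker`, `CircuitOpenError`, `failure_threshold`, and `cooldown_seconds` are hypothetical, and the choice to count consecutive failures (resetting on any success) is an assumption; real libraries often use a rolling window or failure rate instead.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without contacting the service."""


class CircuitBreaker:
    """Hypothetical sketch: opens after N consecutive failures, retests after a cooldown."""

    def __init__(self, failure_threshold=3, cooldown_seconds=5.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self.clock = clock          # injectable so the cooldown can be tested without sleeping
        self.failure_count = 0
        self.opened_at = None       # None means the breaker is closed

    @property
    def state(self):
        if self.opened_at is None:
            return "closed"
        if self.clock() - self.opened_at >= self.cooldown_seconds:
            return "half_open"      # cooldown elapsed: a trial call is allowed
        return "open"

    def call(self, func, *args, **kwargs):
        state = self.state
        if state == "open":
            # Fail fast instead of waiting on a service that is known to be failing.
            raise CircuitOpenError("circuit is open; failing fast")
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failure_count += 1
            # A failed trial call, or too many failures while closed, (re)opens the breaker.
            if state == "half_open" or self.failure_count >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        # Any success closes the breaker and clears the failure count.
        self.failure_count = 0
        self.opened_at = None
        return result
```

A caller would wrap each downstream request as `breaker.call(fetch_profile, user_id)` (where `fetch_profile` stands in for any function that contacts the service) and treat `CircuitOpenError` as a signal to return a fallback or an immediate error.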
Architecture
[Diagram: Client -> Circuit Breaker (tracks failure count) -> Downstream Service]

This diagram shows the client sending requests through the circuit breaker to the downstream service. The circuit breaker tracks failures and decides whether to allow or block requests.

Trade-offs
✓ Pros
Prevents system overload by stopping requests to failing services.
Improves system responsiveness by failing fast instead of waiting for timeouts.
Allows automatic recovery by periodically testing the service after failures.
✗ Cons
Adds complexity to client or service communication logic.
Requires tuning of thresholds and timeout durations for optimal performance.
May cause temporary denial of service if the breaker trips too aggressively.
Use when your system depends on unreliable or slow external services and you want to avoid cascading failures, especially at sustained loads of hundreds of requests per second or more, where blocked calls quickly exhaust threads and connections.
Avoid when your system has very low traffic (roughly under 100 requests per second) or when downstream services are highly reliable and fast, as the added complexity may not justify the benefits.
Real World Examples
Netflix
Netflix uses circuit breakers to isolate failures in its microservices, preventing one failing service from causing widespread outages.
Amazon
Amazon applies circuit breakers to manage calls to external payment gateways, avoiding delays when those services are down.
Uber
Uber uses circuit breakers to handle unreliable third-party APIs, ensuring the main app remains responsive even if external services fail.
Alternatives
Retry pattern
Retries failed requests a fixed number of times before giving up, without stopping requests entirely.
Use when: failures are transient and a retry is likely to succeed. Avoid it when failures are prolonged or when retries would feed a cascading failure.
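For comparison, a retry helper might look like the sketch below. The function name `retry` and its parameters are hypothetical. The exponential backoff between attempts is an assumption added here, since the description above only specifies a fixed number of attempts.

```python
import time


def retry(func, max_attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call func up to max_attempts times, doubling the wait between attempts."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            # Assumed backoff schedule: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

Unlike a circuit breaker, this helper keeps sending requests to the service on every call, which is why it suits short-lived faults better than prolonged outages.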
Bulkhead pattern
Isolates resources into separate pools to contain failures, rather than stopping requests entirely.
Use when: you want to limit the impact of a failure through resource isolation rather than by blocking requests.
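One common way to approximate a bulkhead is to cap the number of concurrent calls per dependency with a semaphore. The sketch below assumes that approach. The names `Bulkhead`, `BulkheadFullError`, and `max_concurrent` are hypothetical, and rejecting immediately when the pool is full (rather than queueing) is a design choice made for this example.

```python
import threading


class BulkheadFullError(Exception):
    """Raised when every slot reserved for this dependency is already in use."""


class Bulkhead:
    """Hypothetical sketch: limits concurrent calls to one dependency."""

    def __init__(self, max_concurrent):
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def call(self, func, *args, **kwargs):
        # Non-blocking acquire: reject at once instead of queueing behind slow calls.
        if not self._slots.acquire(blocking=False):
            raise BulkheadFullError("no free slot in this pool")
        try:
            return func(*args, **kwargs)
        finally:
            self._slots.release()  # always return the slot, even if func raised
```

Giving each downstream dependency its own `Bulkhead` instance means a slow dependency can occupy at most its own slots, leaving capacity for calls to the others.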
Summary
The circuit breaker pattern prevents system overload by stopping requests to failing services.
It improves system resilience by failing fast and allowing recovery testing.
It is essential for systems relying on unreliable or slow external services at scale.