
Circuit breaker pattern in Microservices - System Design Exercise

Design: Circuit Breaker Pattern Implementation in Microservices
Design a circuit breaker component that sits on the communication path between microservices. Out of scope: detailed microservice business logic and deployment infrastructure.
Functional Requirements
FR1: Prevent cascading failures when a dependent microservice is down or slow
FR2: Detect failures and stop requests to the failing service temporarily
FR3: Automatically probe the dependent service with trial requests after a cooldown period
FR4: Provide fallback responses when the dependent service is unavailable
FR5: Monitor and log circuit breaker state changes for observability
Non-Functional Requirements
NFR1: Handle up to 10,000 requests per second
NFR2: Fail fast with p99 latency under 100ms for circuit breaker checks
NFR3: Ensure 99.9% availability of the overall system
NFR4: Minimal added latency when circuit breaker is closed (normal operation)
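To make NFR3 concrete, the following sketch converts an availability target into a downtime budget and shows how availability compounds across serial dependencies when nothing contains their failures. The function names and the figure of five dependencies are illustrative assumptions, not part of the exercise.

```python
YEAR_SECONDS = 365 * 24 * 3600
MONTH_SECONDS = 30 * 24 * 3600  # assumes a 30-day month


def downtime_budget_seconds(availability: float, period_seconds: float) -> float:
    """Allowed downtime within a period for a given availability target."""
    return (1.0 - availability) * period_seconds


def serial_availability(per_service: float, dependencies: int) -> float:
    """Availability of a call chain in which every dependency must succeed."""
    return per_service ** dependencies


# 99.9% leaves about 8.76 hours per year, or 43.2 minutes per 30-day month.
hours_per_year = downtime_budget_seconds(0.999, YEAR_SECONDS) / 3600
minutes_per_month = downtime_budget_seconds(0.999, MONTH_SECONDS) / 60

# Five serial dependencies at 99.9% each give roughly 99.5% end to end.
chain = serial_availability(0.999, 5)
```

The second calculation is why fallbacks matter for this target: without them, each additional hard dependency lowers the availability of the whole request path.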
Think Before You Design
Key Components
Circuit breaker middleware or client library
Health check and failure detection logic
Fallback handler for degraded responses
Metrics and logging system
Configuration management for thresholds and timeouts
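As an illustration of the configuration component, here is a minimal sketch of per-dependency breaker settings. The field names, default values, and the "payments" dependency are hypothetical; a real system would likely load them from a config service or file.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class BreakerConfig:
    """Hypothetical settings for one circuit breaker; defaults are placeholders."""
    failure_threshold: int = 5            # failures before the circuit opens
    cooldown_seconds: float = 30.0        # time spent Open before probing
    request_timeout_seconds: float = 1.0  # slow calls past this count as failures
    half_open_max_calls: int = 1          # trial requests allowed in Half-Open

    def __post_init__(self) -> None:
        # Reject values that would make the breaker meaningless.
        if self.failure_threshold < 1:
            raise ValueError("failure_threshold must be at least 1")
        if self.cooldown_seconds <= 0 or self.request_timeout_seconds <= 0:
            raise ValueError("cooldown and timeout must be positive")
        if self.half_open_max_calls < 1:
            raise ValueError("half_open_max_calls must be at least 1")


DEFAULT_CONFIG = BreakerConfig()

# Per-dependency overrides; "payments" is an illustrative name.
OVERRIDES: Dict[str, BreakerConfig] = {
    "payments": BreakerConfig(failure_threshold=3, cooldown_seconds=60.0),
}


def config_for(dependency: str) -> BreakerConfig:
    """Return the override for a dependency, or the defaults."""
    return OVERRIDES.get(dependency, DEFAULT_CONFIG)
```

Validating at construction time means a bad threshold fails at startup rather than silently disabling protection at runtime.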
Design Patterns
State machine for circuit breaker states (Closed, Open, Half-Open)
Timeout and retry policies
Bulkhead pattern to isolate failures
Fallback pattern for graceful degradation
Observer pattern for monitoring state changes
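Of the patterns listed, the bulkhead is often the least familiar, so a small sketch may help. It caps the number of concurrent calls to one dependency with a semaphore and rejects excess calls immediately instead of queueing them. The class and exception names are assumptions made for illustration.

```python
import threading
from typing import Callable, TypeVar

T = TypeVar("T")


class BulkheadFullError(Exception):
    """Raised when every slot for a dependency is in use (hypothetical name)."""


class Bulkhead:
    """Limits concurrent calls to a single dependency."""

    def __init__(self, max_concurrent: int) -> None:
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def run(self, request: Callable[[], T]) -> T:
        # Non-blocking acquire: reject right away rather than wait for a slot,
        # so a slow dependency cannot tie up all caller threads.
        if not self._slots.acquire(blocking=False):
            raise BulkheadFullError("no free slot for this dependency")
        try:
            return request()
        finally:
            self._slots.release()
```

One bulkhead per dependency keeps a slow service from exhausting resources that calls to healthy services need. A caller would typically treat BulkheadFullError like any other failure and route to the fallback.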
Reference Architecture
Client Service
   |
   |---> Circuit Breaker Middleware ---+---> Dependent Service
                                       |
                                       +---> Fallback Handler

Monitoring System <--- Circuit Breaker Logs & Metrics
Components
Circuit Breaker Middleware
   Technology: Custom library or framework integration (e.g., Resilience4j, Hystrix)
   Role: Intercepts calls to dependent services, tracks failures, and controls request flow based on circuit state
Dependent Service
   Technology: Any microservice (REST/gRPC)
   Role: Service that may fail or become slow, triggering the circuit breaker
Fallback Handler
   Technology: Code module or service
   Role: Provides alternative responses when the circuit breaker is open
Monitoring System
   Technology: Prometheus + Grafana or ELK stack
   Role: Collects metrics and logs from the circuit breaker for alerting and analysis
Request Flow
1. Client Service sends request to Dependent Service through Circuit Breaker Middleware.
2. Circuit Breaker Middleware checks current state: if Closed, forwards request.
3. If request fails or times out, middleware increments failure count.
4. If failures exceed threshold, circuit breaker state changes to Open; further requests are blocked.
5. When Open, requests are immediately sent to Fallback Handler for alternative response.
6. After cooldown period, circuit breaker enters Half-Open state and allows limited test requests.
7. If test requests succeed, circuit breaker closes; if they fail, it reopens.
8. All state changes and metrics are logged and sent to Monitoring System.
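The request flow above can be sketched as a small state machine. This is a minimal, single-threaded illustration under several assumptions: failures are counted consecutively (a success resets the count), the circuit trips once the count reaches the threshold, exactly one trial request is allowed in Half-Open, and timeouts surface as exceptions from the request callable. All names and defaults are hypothetical, and a production version would need locking.

```python
import time
from enum import Enum
from typing import Callable, TypeVar

T = TypeVar("T")


class State(Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half_open"


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 5,       # hypothetical default
        cooldown_seconds: float = 30.0,   # hypothetical default
        clock: Callable[[], float] = time.monotonic,  # injectable for tests
        on_state_change: Callable[[State, State], None] = lambda old, new: None,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self.clock = clock
        self.on_state_change = on_state_change
        self.state = State.CLOSED
        self.failures = 0
        self.opened_at = 0.0

    def call(self, request: Callable[[], T], fallback: Callable[[], T]) -> T:
        if self.state is State.OPEN:
            if self.clock() - self.opened_at >= self.cooldown_seconds:
                self._transition(State.HALF_OPEN)  # step 6: allow a trial request
            else:
                return fallback()                  # step 5: fail fast while Open
        try:
            result = request()                     # step 2: forward the request
        except Exception:
            self._on_failure()                     # step 3: count the failure
            return fallback()
        self._on_success()
        return result

    def _on_success(self) -> None:
        self.failures = 0
        self._transition(State.CLOSED)             # step 7: trial succeeded

    def _on_failure(self) -> None:
        self.failures += 1
        # Step 4 (threshold reached) or step 7 (trial request failed).
        if self.state is State.HALF_OPEN or self.failures >= self.failure_threshold:
            self.opened_at = self.clock()
            self._transition(State.OPEN)

    def _transition(self, new_state: State) -> None:
        if new_state is not self.state:
            old, self.state = self.state, new_state
            self.on_state_change(old, new_state)   # step 8: report the change
```

Passing the clock in as a parameter keeps the cooldown logic testable without sleeping, and the on_state_change hook is where logging or metrics for step 8 would attach.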
Database Schema
No persistent database is required. Circuit breaker state is kept in memory per instance, or shared through a distributed cache (e.g., Redis) when cluster-wide state is needed.
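One way to keep both options open is to hide the state behind a small interface. The sketch below shows a hypothetical StateStore protocol with an in-memory implementation. A cache-backed store could implement the same two methods, but that variant is not shown here and its details would depend on the client library used.

```python
from typing import Dict, Optional, Protocol


class StateStore(Protocol):
    """Hypothetical interface for reading and writing breaker state."""

    def get(self, breaker_name: str) -> Optional[str]: ...

    def set(self, breaker_name: str, state: str) -> None: ...


class InMemoryStateStore:
    """Per-instance store: fast, but each service instance sees only its own state."""

    def __init__(self) -> None:
        self._states: Dict[str, str] = {}

    def get(self, breaker_name: str) -> Optional[str]:
        return self._states.get(breaker_name)

    def set(self, breaker_name: str, state: str) -> None:
        self._states[breaker_name] = state


def current_state(store: StateStore, breaker_name: str) -> str:
    """Treat a breaker with no recorded state as closed."""
    return store.get(breaker_name) or "closed"
```

The trade-off is latency versus consistency: an in-memory lookup adds almost nothing to the request path, while a shared store adds a network round trip to every state check unless it is cached locally.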
Scaling Discussion
Bottlenecks
Per-instance circuit breaker state leads to inconsistent behavior across instances in multi-instance deployments
High request volume causing overhead in failure tracking and state management
Monitoring system overwhelmed by large volume of circuit breaker events
Solutions
Use distributed cache or coordination service (e.g., Redis, ZooKeeper) to share circuit breaker state across instances
Keep the circuit breaker check on the request path lightweight, and track failures asynchronously
Aggregate and sample metrics before sending to monitoring to reduce load
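The metrics aggregation idea can be sketched as a counter that batches events locally and emits one summary per batch. The class name, the emit callback, and the count-based flush trigger are assumptions; a real deployment would more likely flush on a timer and hand the batch to its metrics client.

```python
from collections import Counter
from typing import Callable, Dict, Tuple

Key = Tuple[str, str]  # (dependency name, event name)


class MetricsAggregator:
    """Counts events in memory and emits one summary per batch."""

    def __init__(self, emit: Callable[[Dict[Key, int]], None],
                 flush_every: int = 1000) -> None:
        self._emit = emit                # hypothetical sink, e.g. a metrics client
        self._flush_every = flush_every  # events per batch (illustrative default)
        self._counts: Counter = Counter()
        self._pending = 0

    def record(self, dependency: str, event: str) -> None:
        self._counts[(dependency, event)] += 1
        self._pending += 1
        if self._pending >= self._flush_every:
            self.flush()

    def flush(self) -> None:
        # Send a single aggregated payload instead of one message per event.
        if self._pending:
            self._emit(dict(self._counts))
            self._counts.clear()
            self._pending = 0
```

At 10,000 requests per second, a batch size of 1,000 would reduce monitoring traffic from this instance to roughly ten messages per second, at the cost of slightly delayed metrics.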
Interview Tips
Time: 10 minutes to clarify requirements and constraints, 20 minutes to design architecture and data flow, 10 minutes to discuss scaling and trade-offs, 5 minutes for questions
Explain the problem of cascading failures and how circuit breaker prevents them
Describe the three states of the circuit breaker and transitions
Discuss fallback strategies and their importance for user experience
Mention how to monitor and alert on circuit breaker events
Address scaling challenges and solutions for distributed microservices