Design: Microservices Health Check System
This design covers the implementation of the health check pattern in a microservices environment, including health endpoints, status aggregation, and monitoring integration. Detailed alerting rules and remediation automation are out of scope.
Functional Requirements
FR1: Each microservice must expose a health check endpoint.
FR2: Health checks should verify service dependencies such as databases and external APIs.
FR3: The system should aggregate health status of all microservices.
FR4: Health status must be accessible for monitoring tools and alerting systems.
FR5: Health checks should be lightweight and fast so that they add negligible overhead to the service being checked.
FR6: Each microservice should support both readiness and liveness probes for container orchestration.
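The split between liveness and readiness (FR6) and the dependency verification (FR2) can be sketched as below. This is a minimal, framework-free illustration, not a prescribed implementation: the function names `liveness` and `readiness`, the `UP`/`DOWN` labels, and the response shape are assumptions made for the example. Each dependency check is assumed to enforce its own short timeout so the probe stays fast (FR5).

```python
import time
from typing import Callable, Dict, Tuple

# A dependency check is any zero-argument callable returning True when healthy.
# (Hypothetical convention for this sketch; each check should enforce its own
# short timeout, since nothing here interrupts a hung check.)
Check = Callable[[], bool]


def liveness() -> Tuple[int, dict]:
    """Liveness: the process is running. No dependency calls are made."""
    return 200, {"status": "UP"}


def readiness(checks: Dict[str, Check]) -> Tuple[int, dict]:
    """Readiness: run every dependency check; report 503 if any check fails."""
    started = time.monotonic()
    details = {}
    for name, check in checks.items():
        try:
            ok = bool(check())
        except Exception:
            # A check that raises is treated as a failed dependency.
            ok = False
        details[name] = "UP" if ok else "DOWN"

    healthy = all(state == "UP" for state in details.values())
    body = {
        "status": "UP" if healthy else "DOWN",
        "checks": details,
        "elapsed_ms": round((time.monotonic() - started) * 1000, 2),
    }
    return (200 if healthy else 503), body
```

A web framework handler would map `liveness()` and `readiness(...)` to two separate routes and return the status code and body as JSON. Keeping dependency checks out of the liveness probe avoids restarting a container merely because a downstream database is unavailable.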
Non-Functional Requirements
NFR1: The system should handle up to 100 microservices.
NFR2: Health check response time should be under 200 ms at the 99th percentile (p99).
NFR3: System availability target is 99.9% uptime.
NFR4: Health check endpoints must not cause side effects or heavy load.
NFR5: Health endpoints must support secure access to prevent unauthorized use.
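The aggregation requirement (FR3) at the scale given in NFR1 can be sketched as follows. This is an illustrative sketch under stated assumptions: the `aggregate` function, the `Fetcher` type, the `UNKNOWN` and `DEGRADED` labels, and the worker count of 20 are all hypothetical choices for the example. Each fetcher is assumed to wrap an authenticated HTTP call with its own timeout, which is omitted here to keep the block self-contained.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

# A fetcher returns "UP" or "DOWN" for one service (hypothetical convention).
# In practice it would call that service's health endpoint with a timeout.
Fetcher = Callable[[], str]


def _safe_fetch(fetch: Fetcher) -> str:
    """Run one fetcher; map errors and unexpected values to UNKNOWN."""
    try:
        status = fetch()
    except Exception:
        return "UNKNOWN"
    return status if status in ("UP", "DOWN") else "UNKNOWN"


def aggregate(fetchers: Dict[str, Fetcher], max_workers: int = 20) -> dict:
    """Poll all services concurrently and derive an overall status."""
    names = list(fetchers)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with names.
        statuses = list(pool.map(_safe_fetch, (fetchers[n] for n in names)))

    if all(s == "UP" for s in statuses):
        overall = "UP"
    elif any(s == "UP" for s in statuses):
        overall = "DEGRADED"  # some, but not all, services are healthy
    else:
        overall = "DOWN"
    return {"status": overall, "services": dict(zip(names, statuses))}
```

Polling concurrently keeps the total aggregation time close to the slowest single response rather than the sum of all responses, which matters when roughly 100 services each have a 200 ms budget.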