
Heartbeat mechanism in HLD - System Design Exercise

Design: Heartbeat Mechanism System
In scope: sending, receiving, and monitoring heartbeats, and alerting on missed ones. Out of scope: client implementation details and the specifics of the dashboard UI design.
Functional Requirements
FR1: Detect whether a client or server is alive from the periodic heartbeat signals it sends
FR2: Support up to 10,000 concurrent clients sending heartbeats
FR3: Trigger an alert if no heartbeat is received from a client for more than 30 seconds
FR4: Provide a dashboard to show live status of all clients
FR5: Ensure minimal network overhead for heartbeat messages
Non-Functional Requirements
NFR1: Heartbeat interval must be configurable, with a default of 10 seconds
NFR2: System must handle network delays and temporary outages gracefully
NFR3: A missed heartbeat should be detected within 5 seconds of the timeout expiring
NFR4: System availability target is 99.9% uptime
NFR5: Heartbeat messages should be lightweight (under 1 KB)
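To make FR5 and NFR5 concrete, here is a minimal sketch of a compact binary heartbeat message. The exercise does not specify a wire format, so the field layout (version, client UUID, sequence number, send time) and the names `encode_heartbeat` and `decode_heartbeat` are assumptions chosen for illustration.

```python
import struct
import uuid

# Hypothetical wire layout (an assumption, not part of the exercise):
# 1-byte version, 16-byte client UUID, 4-byte sequence number,
# 8-byte send time in milliseconds since the epoch, network byte order.
HEARTBEAT_FORMAT = "!B16sIQ"
HEARTBEAT_SIZE = struct.calcsize(HEARTBEAT_FORMAT)  # 1 + 16 + 4 + 8 = 29 bytes
PROTOCOL_VERSION = 1


def encode_heartbeat(client_id: uuid.UUID, sequence: int, sent_at_ms: int) -> bytes:
    """Pack one heartbeat into a fixed-size payload."""
    return struct.pack(
        HEARTBEAT_FORMAT, PROTOCOL_VERSION, client_id.bytes, sequence, sent_at_ms
    )


def decode_heartbeat(payload: bytes) -> tuple[uuid.UUID, int, int]:
    """Validate and unpack a payload; raise ValueError on malformed input."""
    if len(payload) != HEARTBEAT_SIZE:
        raise ValueError(f"expected {HEARTBEAT_SIZE} bytes, got {len(payload)}")
    version, raw_id, sequence, sent_at_ms = struct.unpack(HEARTBEAT_FORMAT, payload)
    if version != PROTOCOL_VERSION:
        raise ValueError(f"unsupported protocol version {version}")
    return uuid.UUID(bytes=raw_id), sequence, sent_at_ms
```

At 29 bytes of payload, a message like this stays far below the 1 KB limit. A JSON body over HTTP would also fit, at the cost of larger headers per request.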
Think Before You Design
Key Components
Heartbeat sender (client or server component)
Heartbeat receiver service
State store or cache to track last heartbeat timestamps
Alerting and notification system
Dashboard or monitoring UI
Load balancer or API gateway
Design Patterns
Polling vs push-based heartbeat
Timeout and retry mechanisms
Circuit breaker pattern for unhealthy clients
Event-driven architecture for alerting
Caching for fast heartbeat status lookup
Reference Architecture
  +------------+       Heartbeat       +----------------+
  |  Clients   | --------------------> | Heartbeat      |
  |  (10,000)  |                       | Receiver       |
  +------------+                       +----------------+
                                         |       |
                                         |       | Updates last
                                         |       | heartbeat time
                                         v       v
                                   +---------------------+
                                   | State Store (Redis) |
                                   +---------------------+
                                         |       |
                                         |       | Triggers alerts
                                         |       | if timeout
                                         v       v
                                   +---------------------+
                                   | Alerting Service    |
                                   +---------------------+
                                         |
                                         v
                                   +---------------------+
                                   | Monitoring Dashboard|
                                   +---------------------+
Components
Clients (any client platform): Send periodic heartbeat messages to indicate they are alive
Heartbeat Receiver (stateless REST API or TCP server): Receive heartbeat messages and update client status
State Store (Redis or in-memory key-value store): Store last heartbeat timestamp per client for quick lookup
Alerting Service (event-driven microservice): Detect missed heartbeats and send alerts/notifications
Monitoring Dashboard (web UI with real-time updates): Display live status of all clients and alerts
Load Balancer (Nginx or cloud LB): Distribute heartbeat requests across receiver instances
Request Flow
1. Each client sends a heartbeat message to the Heartbeat Receiver every 10 seconds.
2. The Heartbeat Receiver validates the message and updates the client's last heartbeat timestamp in the State Store.
3. The Alerting Service periodically scans the State Store for clients whose last heartbeat is more than 30 seconds old. To meet NFR3, the scan interval plus the scan duration must stay under 5 seconds.
4. If a missed heartbeat is detected, the Alerting Service triggers notifications.
5. The Monitoring Dashboard queries the State Store to show live client statuses and alerts.
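The request flow above can be sketched in a few lines. This is an illustration only: a plain dictionary stands in for the State Store, the class name `HeartbeatMonitor` and its methods are hypothetical, and timestamps are passed in explicitly so the behavior is easy to check.

```python
from dataclasses import dataclass, field

TIMEOUT_SECONDS = 30.0  # FR3: alert after more than 30 seconds of silence


@dataclass
class HeartbeatMonitor:
    """In-memory stand-in for the receiver, state store, and alerting scan."""

    timeout: float = TIMEOUT_SECONDS
    last_seen: dict[str, float] = field(default_factory=dict)
    alerted: set[str] = field(default_factory=set)

    def record_heartbeat(self, client_id: str, now: float) -> None:
        """Step 2: store the latest timestamp; a returning client clears its alert."""
        self.last_seen[client_id] = now
        self.alerted.discard(client_id)

    def scan(self, now: float) -> list[str]:
        """Steps 3 and 4: return clients that newly exceeded the timeout."""
        newly_missed = sorted(
            client_id
            for client_id, seen in self.last_seen.items()
            if now - seen > self.timeout and client_id not in self.alerted
        )
        self.alerted.update(newly_missed)  # alert once per outage, not per scan
        return newly_missed

    def status(self, now: float) -> dict[str, str]:
        """Step 5: the view a dashboard would query."""
        return {
            client_id: "alive" if now - seen <= self.timeout else "missing"
            for client_id, seen in self.last_seen.items()
        }
```

Tracking which clients have already been alerted keeps a long outage from producing a new notification on every scan. In a real deployment the scan would run on a timer and read from the shared store rather than from local memory.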
Database Schema
Entities:
- Client: client_id (PK), metadata
- HeartbeatRecord: client_id (FK), last_heartbeat_timestamp
Relationships:
- One-to-one between Client and HeartbeatRecord
- HeartbeatRecord updated on each heartbeat received
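The entities and their one-to-one relationship can be made concrete with a small relational sketch. The design itself suggests Redis for the State Store; SQLite is used here only because it ships with Python and shows the keys clearly. Table names, column types, and the helper functions are assumptions.

```python
import sqlite3

# Making client_id both the primary key and a foreign key of heartbeat_record
# enforces the one-to-one relationship with client.
SCHEMA = """
CREATE TABLE client (
    client_id TEXT PRIMARY KEY,
    metadata  TEXT
);
CREATE TABLE heartbeat_record (
    client_id                TEXT PRIMARY KEY REFERENCES client(client_id),
    last_heartbeat_timestamp REAL NOT NULL
);
"""


def open_store() -> sqlite3.Connection:
    """Create an in-memory database with foreign key checks enabled."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def register_client(conn: sqlite3.Connection, client_id: str, metadata: str = "") -> None:
    conn.execute(
        "INSERT INTO client (client_id, metadata) VALUES (?, ?)",
        (client_id, metadata),
    )


def record_heartbeat(conn: sqlite3.Connection, client_id: str, timestamp: float) -> None:
    """Overwrite the single record for this client on each heartbeat."""
    conn.execute(
        "INSERT OR REPLACE INTO heartbeat_record "
        "(client_id, last_heartbeat_timestamp) VALUES (?, ?)",
        (client_id, timestamp),
    )
```

With a key-value store the same shape collapses to one key per client holding the latest timestamp, which is why the lookup stays fast.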
Scaling Discussion
Bottlenecks
Heartbeat Receiver can become overwhelmed: 10,000 clients at the default 10-second interval produce about 1,000 heartbeats per second, and bursts are larger if clients send in sync
State Store may face high read/write load for heartbeat timestamps
Alerting Service scanning large datasets may cause latency
Network bandwidth could be stressed by frequent heartbeat messages
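A quick back-of-the-envelope estimate helps size these bottlenecks. The sketch below uses only the numbers from the requirements; it assumes heartbeats are spread evenly over the interval and counts payload bytes only, ignoring TCP or HTTP overhead. The function name `estimate_load` is hypothetical.

```python
def estimate_load(
    clients: int = 10_000,          # FR2
    interval_s: int = 10,           # NFR1 default
    message_bytes: int = 1024,      # NFR5 upper bound
    timeout_s: int = 30,            # FR3
    detection_budget_s: int = 5,    # NFR3
) -> dict[str, float]:
    """Derive steady-state load figures from the stated requirements."""
    heartbeats_per_second = clients / interval_s
    ingress_bytes_per_second = heartbeats_per_second * message_bytes
    return {
        # Steady-state write rate seen by the receiver and the state store.
        "heartbeats_per_second": heartbeats_per_second,
        # Upper bound on payload bandwidth, in megabits per second.
        "ingress_mbit_per_second": ingress_bytes_per_second * 8 / 1_000_000,
        # Whole heartbeat intervals that fit inside the alert timeout.
        "intervals_per_timeout": timeout_s // interval_s,
        # Longest allowed gap between the last heartbeat and the alert.
        "max_seconds_from_last_heartbeat_to_alert": timeout_s + detection_budget_s,
        # Worst-case burst if every client sends at the same instant.
        "burst_if_synchronized": clients,
    }
```

With the defaults this gives 1,000 heartbeats per second and at most about 8.2 Mbit/s of payload, so the write rate and synchronized bursts are likelier to hurt than raw bandwidth.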
Solutions
Use multiple Heartbeat Receiver instances behind a load balancer to distribute load
Use a highly performant in-memory store like Redis with sharding for State Store
Implement incremental or event-driven alerting instead of full scans
Optimize heartbeat message size and interval; consider adaptive heartbeat frequency
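One way to replace full scans with incremental alerting is to keep clients ordered by their expiry deadline, so each check touches only the entries that are due. The sketch below uses a min-heap with lazy updates; it is one possible approach rather than the method the exercise prescribes, and the class name `DeadlineTracker` is hypothetical.

```python
import heapq


class DeadlineTracker:
    """Track heartbeat deadlines in a min-heap instead of scanning every client."""

    def __init__(self, timeout: float = 30.0) -> None:
        self.timeout = timeout
        self._last_seen: dict[str, float] = {}
        self._scheduled: set[str] = set()         # clients with an entry in the heap
        self._heap: list[tuple[float, str]] = []  # (deadline, client_id)

    def record_heartbeat(self, client_id: str, now: float) -> None:
        """Update the timestamp; push a heap entry only if none exists yet."""
        self._last_seen[client_id] = now
        if client_id not in self._scheduled:
            self._scheduled.add(client_id)
            heapq.heappush(self._heap, (now + self.timeout, client_id))

    def pop_expired(self, now: float) -> list[str]:
        """Return clients silent for more than the timeout, in deadline order."""
        expired = []
        while self._heap and self._heap[0][0] < now:
            _, client_id = heapq.heappop(self._heap)
            deadline = self._last_seen[client_id] + self.timeout
            if deadline < now:
                # No heartbeat arrived since the entry was scheduled.
                expired.append(client_id)
                self._scheduled.discard(client_id)
            else:
                # The client sent newer heartbeats; reschedule at the real deadline.
                heapq.heappush(self._heap, (deadline, client_id))
        return expired
```

The heap holds at most one entry per client, and a check only does work for entries whose stored deadline has passed, so the cost follows the number of due clients rather than the total client count.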
Interview Tips
Time: In a 45-minute interview, spend 10 minutes clarifying requirements and constraints, 20 minutes designing the architecture and data flow, 10 minutes discussing scaling and trade-offs, and 5 minutes summarizing.
Explain why heartbeat is needed and how it helps detect failures
Discuss trade-offs in heartbeat interval and message size
Describe components and their responsibilities clearly
Highlight how the system handles scale and failure scenarios
Mention the importance of monitoring and alerting for operational health