Problem Statement
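When a service or server fails silently or becomes unresponsive, nothing tells the rest of the system to stop using it. Callers and load balancers keep sending it traffic, and those requests hang until they time out or fail outright, causing slow responses or complete outages. Without a way to detect unhealthy components, the system cannot reroute traffic to healthy instances or trigger recovery actions such as a restart or failover, so a single failed component can degrade the user experience and cause downtime.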
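The failure mode described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the `Backend` class, its `alive` flag, and the `route_round_robin` function are hypothetical names invented for illustration, and a raised `TimeoutError` stands in for a request that never gets a response. It shows a round-robin router with no health awareness continuing to send requests to a dead backend.

```python
from itertools import cycle


class Backend:
    """Hypothetical backend; `alive` stands in for the server's real state."""

    def __init__(self, name, alive=True):
        self.name = name
        self.alive = alive

    def handle(self, request):
        if not self.alive:
            # Silent failure: the router is never told the backend is down,
            # each request sent to it simply never gets a response.
            raise TimeoutError(f"{self.name} did not respond")
        return f"{self.name} served {request}"


def route_round_robin(backends, requests):
    """Send each request to the next backend in turn, with no health checks."""
    results = []
    rotation = cycle(backends)
    for request in requests:
        backend = next(rotation)
        try:
            results.append(backend.handle(request))
        except TimeoutError:
            # The router has no way to learn from this failure, so the dead
            # backend stays in the rotation and keeps receiving traffic.
            results.append(f"{request} FAILED on {backend.name}")
    return results


# Backend "b" has failed silently, but it is still in the rotation.
backends = [Backend("a"), Backend("b", alive=False), Backend("c")]
results = route_round_robin(backends, [f"req{i}" for i in range(6)])
failed = [r for r in results if "FAILED" in r]

for line in results:
    print(line)
```

In this run, two of the six requests (every third one) land on the failed backend and are lost, and that rate does not improve over time because the router never removes the backend or triggers any recovery.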