Problem Statement
Without a systematic way to collect metrics, it becomes impossible to understand system health or performance. Failures go unnoticed until users complain, and diagnosing issues takes much longer, increasing downtime and reducing reliability.