Challenge - 5 Problems
Monitoring Master
💻 Command Output
intermediate
What is the output of this Prometheus query?
Given the following Prometheus query, what does it return?
rate(http_requests_total[5m])
💡 Hint
Think about what the 'rate' function calculates over a time range.
Explanation
The rate() function returns the per-second average rate of increase of a counter, computed for each time series over the specified window (here, the last 5 minutes). It also accounts for counter resets within the window.
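As a rough illustration of that calculation, here is a minimal Python sketch. It is a simplification, not Prometheus's actual implementation: it ignores the extrapolation to window boundaries that Prometheus performs, and the function name `simple_rate` and the sample values are hypothetical.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, counter value)


def simple_rate(samples: List[Sample]) -> float:
    """Approximate per-second rate of increase of a counter over a window.

    Simplified sketch: no extrapolation to the window boundaries.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples in the window")
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        raise ValueError("samples must span a positive time interval")

    increase = 0.0
    prev = samples[0][1]
    for _, value in samples[1:]:
        # A drop in value is treated as a counter reset: counting restarted at zero.
        increase += value if value < prev else value - prev
        prev = value
    return increase / elapsed


# Hypothetical http_requests_total samples, scraped every 60 s over 5 minutes.
window = [(0, 100), (60, 160), (120, 220), (180, 280), (240, 340), (300, 400)]
print(simple_rate(window))  # 1.0 (requests per second)
```

The counter grew by 300 over 300 seconds, so the average rate is 1 request per second, regardless of how the increase was distributed inside the window.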
🧠 Conceptual
intermediate
Which alerting condition triggers a PagerDuty notification?
You have an alert rule:
IF cpu_usage > 90% FOR 10m THEN alert
Which option best describes when PagerDuty will be notified?
💡 Hint
Consider the meaning of the 'FOR 10m' clause in alert rules.
Explanation
The 'FOR 10m' clause means the condition must hold continuously for 10 minutes before the alert fires and PagerDuty is notified. If cpu_usage drops to 90% or below at any point during that time, the timer resets.
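The behavior of a 'FOR' clause can be sketched as a small state machine. The Python below is an illustrative assumption about how such a rule is evaluated, not the code of any particular alerting system; the function name `alert_states`, the state names, and the sample values are hypothetical.

```python
from typing import Iterable, List, Optional, Tuple


def alert_states(
    samples: Iterable[Tuple[float, float]],
    threshold: float,
    for_seconds: float,
) -> List[str]:
    """Return the alert state at each evaluation: inactive, pending, or firing."""
    states: List[str] = []
    active_since: Optional[float] = None  # when the condition first became true
    for timestamp, value in samples:
        if value > threshold:
            if active_since is None:
                active_since = timestamp
            # Fire only once the condition has held for the whole duration.
            held_for = timestamp - active_since
            states.append("firing" if held_for >= for_seconds else "pending")
        else:
            # Any evaluation below the threshold resets the timer.
            active_since = None
            states.append("inactive")
    return states


# Hypothetical cpu_usage samples evaluated every 5 minutes (timestamps in seconds).
cpu = [(0, 95), (300, 96), (600, 97), (900, 50), (1200, 95)]
print(alert_states(cpu, threshold=90, for_seconds=600))
# ['pending', 'pending', 'firing', 'inactive', 'pending']
```

In this sketch a notification would be sent only on the transition to "firing", which is why a short breach never reaches PagerDuty.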
❓ Troubleshoot
advanced
Why does this Grafana alert never fire?
You created a Grafana alert with this query:
memory_usage[5m] > 80
The alert never fires even when memory usage is high. What is the likely cause?
💡 Hint
Check the data types returned by Prometheus functions and how comparisons work.
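Explanation
memory_usage[5m] is a range vector, but PromQL comparison operators are defined only between instant vectors and scalars, so the expression cannot be evaluated as an alert condition. Compare an instant vector instead, such as memory_usage > 80, or reduce the range first, for example avg_over_time(memory_usage[5m]) > 80.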
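To make the distinction concrete, the following Python sketch models a range vector as a list of samples per series and an instant vector as a single value per series. The data structures, series names, and helper functions are hypothetical stand-ins for illustration only; they are not how Prometheus represents data internally.

```python
from statistics import mean
from typing import Dict, List, Tuple

# Range vector: every series carries several (timestamp, value) samples.
RangeVector = Dict[str, List[Tuple[float, float]]]
# Instant vector: every series carries exactly one value.
InstantVector = Dict[str, float]


def avg_over_time(rv: RangeVector) -> InstantVector:
    """Collapse each series' samples into one averaged value."""
    return {series: mean(v for _, v in samples) for series, samples in rv.items()}


def greater_than(iv: InstantVector, threshold: float) -> InstantVector:
    """Keep only series whose single value exceeds the threshold."""
    return {series: value for series, value in iv.items() if value > threshold}


# Hypothetical memory_usage samples for two hosts over the window.
memory_usage_5m: RangeVector = {
    "host-a": [(0, 70.0), (60, 85.0), (120, 91.0)],
    "host-b": [(0, 40.0), (60, 42.0), (120, 41.0)],
}

# The comparison is only meaningful after reducing to one value per series.
print(greater_than(avg_over_time(memory_usage_5m), 80))  # {'host-a': 82.0}
```

A threshold comparison needs one value per series, which is what the reduction step provides; a list of samples has no single value to compare.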
🔀 Workflow
advanced
Order the steps to set up a basic alerting pipeline
Arrange these steps in the correct order to set up monitoring and alerting for a web service.
💡 Hint
Think about what must be ready before alert rules and notifications.
Explanation
First, the service must expose metrics (1). Next, the monitoring system must collect them (4). Alert rules can then be defined on the collected data (2), and finally notifications are configured so that firing alerts reach someone (3).
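The dependency between these steps can be sketched in a few lines of Python. This is a toy model using only the standard library, not a real monitoring stack: the function names, the metric name `cpu_usage_percent`, and the simplified "name value" text format are all assumptions made for illustration.

```python
from typing import Dict, List


def expose_metrics(values: Dict[str, float]) -> str:
    """Step 1: the service renders its metrics as simple 'name value' lines."""
    return "\n".join(f"{name} {value}" for name, value in values.items())


def scrape(payload: str) -> Dict[str, float]:
    """Step 2: the monitoring system collects and parses the exposed text."""
    metrics: Dict[str, float] = {}
    for line in payload.splitlines():
        name, raw_value = line.rsplit(" ", 1)
        metrics[name] = float(raw_value)
    return metrics


def evaluate_rule(metrics: Dict[str, float], name: str, threshold: float) -> bool:
    """Step 3: an alert rule is evaluated against the collected metrics."""
    return metrics.get(name, 0.0) > threshold


def notify(sent: List[str], message: str) -> None:
    """Step 4: a notification is delivered (here, appended to a list)."""
    sent.append(message)


sent: List[str] = []
payload = expose_metrics({"cpu_usage_percent": 95.0})
collected = scrape(payload)
if evaluate_rule(collected, "cpu_usage_percent", 90.0):
    notify(sent, "cpu_usage_percent above 90")
print(sent)  # ['cpu_usage_percent above 90']
```

Each function consumes the output of the previous one, which is why the order cannot be rearranged: a rule has nothing to evaluate until metrics have been exposed and collected.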
✅ Best Practice
expert
Which practice improves alert reliability and reduces noise?
You notice many alerts firing for brief spikes in CPU usage. Which practice best reduces false alerts without missing real issues?
💡 Hint
Consider how to avoid alerts on short-lived spikes.
Explanation
Setting a 'for' duration on the alert means the condition must hold continuously for a set time before the alert fires. This filters out brief spikes while still catching sustained problems.