Challenge - 5 Problems

💻 Command Output

intermediate

What is the output of this Prometheus query?

Given the Prometheus query rate(http_requests_total[5m]), what does it return?

rate(http_requests_total[5m])

A. The current number of HTTP requests being processed.

B. The total number of HTTP requests since the server started.

C. The maximum number of HTTP requests in any 5-minute window.

D. The average number of HTTP requests per second over the last 5 minutes.
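
For intuition about what rate() computes over a counter, here is a minimal Python sketch. It is a simplification, not Prometheus's implementation: it handles counter resets but skips the extrapolation to the window boundaries that Prometheus performs. The function name simple_rate and the sample data are hypothetical.

```python
from typing import List, Tuple

# (timestamp in seconds, counter value) as one scraped sample
Sample = Tuple[float, float]


def simple_rate(samples: List[Sample]) -> float:
    """Per-second average increase of a counter across one window.

    Simplified model: counter resets are handled, boundary
    extrapolation is not.
    """
    if len(samples) < 2:
        # Assumption: this sketch raises; it needs two points to form a rate.
        raise ValueError("need at least two samples in the window")

    increase = 0.0
    for (_, prev), (_, curr) in zip(samples, samples[1:]):
        if curr >= prev:
            increase += curr - prev
        else:
            # The counter dropped, so it was reset; count from zero.
            increase += curr

    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        raise ValueError("samples must span a positive time interval")
    return increase / elapsed


# Hypothetical 5-minute window, scraped every 60 seconds.
window = [(0, 1000), (60, 1060), (120, 1120),
          (180, 1180), (240, 1240), (300, 1300)]
print(simple_rate(window))  # 300 requests over 300 seconds
```

The result is a per-second figure, not a request count: the counter's growth is divided by the time the samples span.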

🧠 Conceptual

intermediate

Which alerting condition triggers a PagerDuty notification?

You have an alert rule: IF cpu_usage > 90% FOR 10m THEN alert. Which option best describes when PagerDuty will be notified?

A. PagerDuty is notified immediately when CPU usage exceeds 90%.

B. PagerDuty is notified only if CPU usage stays above 90% continuously for 10 minutes.

C. PagerDuty is notified if CPU usage exceeds 90% at any point during 10 minutes, even briefly.

D. PagerDuty is notified after CPU usage drops below 90% following a spike.
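
The behavior of a FOR clause can be sketched as a small state machine. The Python below assumes Prometheus-style semantics, where a rule becomes pending on the first breach and fires only after the condition has held at every evaluation for the full duration. The function name firing_times, the evaluation interval, and the sample values are hypothetical, and the sketch treats "firing" as the moment a notification would be sent.

```python
from typing import List, Optional, Tuple


def firing_times(samples: List[Tuple[float, float]],
                 threshold: float,
                 for_seconds: float) -> List[float]:
    """Return the evaluation timestamps at which the alert is firing."""
    pending_since: Optional[float] = None
    firing: List[float] = []
    for ts, value in samples:
        if value > threshold:
            if pending_since is None:
                pending_since = ts  # first breach: alert becomes pending
            if ts - pending_since >= for_seconds:
                firing.append(ts)   # condition held long enough
        else:
            pending_since = None    # any dip resets the timer
    return firing


# Evaluated every 60 seconds; FOR 10m is 600 seconds.
brief_spike = [(0, 95), (60, 96), (120, 50), (180, 40)]
sustained = [(t, 95) for t in range(0, 721, 60)]

print(firing_times(brief_spike, 90, 600))
print(firing_times(sustained, 90, 600))
```

In this model the brief spike never fires, while the sustained breach starts firing once it has lasted the full 600 seconds.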

❓ Troubleshoot

advanced

Why does this Grafana alert never fire?

You created a Grafana alert with this query: memory_usage[5m] > 80. The alert never fires even when memory usage is high. What is the likely cause?

memory_usage[5m] > 80

A. The memory_usage[5m] selector returns a range vector, which cannot be directly compared to a scalar.

B. The alert condition compares a vector to a scalar without proper aggregation.

C. The query uses a wrong function; it should be avg(memory_usage) > 80.

D. The threshold 80 is too high; memory usage never reaches that value.
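
PromQL distinguishes instant vectors (one value per series) from range vectors (a window of values per series), and comparison operators are defined only for the former. The toy Python model below illustrates that type distinction. It is not how Prometheus or Grafana is implemented; the helper greater_than and the host names are hypothetical, and avg_over_time merely mimics the PromQL function of the same name.

```python
from statistics import mean
from typing import Dict, List

InstantVector = Dict[str, float]      # one value per series
RangeVector = Dict[str, List[float]]  # a window of values per series


def avg_over_time(rv: RangeVector) -> InstantVector:
    """Collapse each series' window into a single averaged value."""
    return {series: mean(values) for series, values in rv.items()}


def greater_than(iv: InstantVector, threshold: float) -> InstantVector:
    """Keep only the series whose value exceeds the threshold."""
    for series, value in iv.items():
        if isinstance(value, list):
            # Mirrors the type error raised for `metric[5m] > 80`.
            raise TypeError(
                f"{series}: cannot compare a range vector to a scalar")
    return {s: v for s, v in iv.items() if v > threshold}


# Hypothetical 5-minute windows for two hosts.
memory_usage_5m: RangeVector = {
    "host-a": [70.0, 85.0, 91.0],
    "host-b": [40.0, 50.0, 60.0],
}

# Reducing the window first yields something comparable.
print(greater_than(avg_over_time(memory_usage_5m), 80))
```

Passing memory_usage_5m straight to greater_than raises the TypeError, while reducing it first with avg_over_time produces a comparable instant vector.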

🔀 Workflow

advanced

Order the steps to set up a basic alerting pipeline

Arrange these steps in the correct order to set up monitoring and alerting for a web service.

A. 1, 2, 4, 3

B. 4, 1, 2, 3

C. 1, 4, 2, 3

D. 4, 2, 1, 3

✅ Best Practice

expert

Which practice improves alert reliability and reduces noise?

You notice many alerts firing for brief spikes in CPU usage. Which practice best reduces false alerts without missing real issues?

A. Use the alert's 'for' duration to require the condition to persist before firing.

B. Set alert thresholds very high to avoid triggering on spikes.

C. Disable alerts during peak traffic hours to reduce noise.

D. Send alerts to multiple channels to ensure visibility.
