Which statement best describes the purpose of an SLA in platform observability?
Think about what agreements between users and providers usually include.
SLAs set clear expectations for uptime and performance, helping both parties understand service quality.
Given the following Prometheus query output for error rate over 5 minutes, what is the error rate percentage?
error_rate{service="ml-model"} 0.02Remember to convert decimal to percentage by multiplying by 100.
The value 0.02 means 2% error rate because 0.02 * 100 = 2%.
Which sequence of steps correctly sets up an alert in a monitoring system when the platform's uptime falls below 99.9%?
Think about logical order: metric first, then alert, then notification, then test.
You must first define what to monitor, then create alert rules, set notifications, and finally test.
You notice that your observability dashboard shows no data for the last hour. Which is the most likely cause?
Consider what collects and sends data to the dashboard.
If the data collection agent stops, no new data reaches the dashboard, causing missing data.
Which metric is the best choice to monitor to ensure compliance with a 99.9% uptime SLA for an ML platform?
Think about what directly reflects availability and success rate.
Percentage of successful requests directly measures uptime and availability, matching SLA goals.