
Platform observability and SLAs in MLOps - Commands & Configuration

Introduction
When you run machine learning models in production, you need to watch how the system behaves and make sure it meets its promised performance levels. Platform observability lets you see inside the system; SLAs (service-level agreements) set clear targets for uptime and response times. Use this pattern:
When you want to track if your ML model is running without errors in production
When you need to know if your prediction service is responding quickly enough for users
When you want to get alerts if the system is down or behaving badly
When you want to share clear performance promises with your team or customers
When you want to improve your ML system by understanding its behavior over time
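An availability SLA translates directly into an error budget you can compute. A minimal sketch, where the 99.9% target and 30-day window are illustrative assumptions, not MLflow defaults:

```python
# Sketch: convert an availability SLA into a monthly downtime budget.
# The 99.9% target and 30-day month are illustrative assumptions.

def downtime_budget_minutes(availability: float, days: int = 30) -> float:
    """Minutes of allowed downtime for a given availability target."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - availability)

budget = downtime_budget_minutes(0.999)  # "three nines"
print(f"{budget:.1f} minutes/month")     # 43.2 minutes/month
```

Knowing the budget up front tells you how aggressive your alert thresholds need to be.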
Commands
Start the MLflow tracking server to collect metrics and logs from your ML models. This helps observe model performance and system health.
Terminal
mlflow server --host 0.0.0.0 --port 5000
Expected Output
2024/06/01 12:00:00 INFO mlflow.server: Starting MLflow tracking server at http://0.0.0.0:5000
--host - Bind the server to all network interfaces so it can be accessed remotely
--port - Set the port number where the server listens
Run your ML project which logs metrics and parameters to the MLflow server for observability.
Terminal
mlflow run .
Expected Output
2024/06/01 12:01:00 INFO mlflow.projects: Running ML project
2024/06/01 12:01:10 INFO mlflow.projects: Run completed successfully
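For `mlflow run .` to work, the project directory needs an MLproject file describing its entry point. A minimal sketch; the project name and the `train.py` script are illustrative, not from this tutorial:

```yaml
# Minimal MLproject file; name and command are illustrative.
name: sla-demo
entry_points:
  main:
    command: "python train.py"
```

The script named in `command` is where calls like `mlflow.log_metric` record the metrics you will query below.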
Fetch the recorded values of a metric for a specific MLflow run to check model performance and system behavior.
Terminal
curl "http://localhost:5000/api/2.0/mlflow/metrics/get-history?run_id=12345&metric_key=accuracy"
Expected Output
{"metrics": [{"key": "accuracy", "value": 0.92, "timestamp": 1685610000, "step": 0}]}
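Once the metrics payload is back, a small script can check it against your SLA threshold. A minimal sketch, assuming the JSON shape shown above and an illustrative 0.90 accuracy floor; in practice you would fetch the payload from the tracking server rather than hard-code it:

```python
import json

# Illustrative payload matching the response shape above; in practice,
# fetch this from the MLflow tracking server's REST API.
payload = '{"metrics": [{"key": "accuracy", "value": 0.92, "timestamp": 1685610000}]}'

SLA_ACCURACY_FLOOR = 0.90  # assumed threshold, not an MLflow default

metrics = {m["key"]: m["value"] for m in json.loads(payload)["metrics"]}
meets_sla = metrics["accuracy"] >= SLA_ACCURACY_FLOOR
print("SLA met" if meets_sla else "SLA violated")  # SLA met
```

A check like this can run on a schedule and feed the alerting rules described next.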
Create a simple alert rule to fire when prediction latency exceeds 200 milliseconds, helping you defend the SLA. Note that MLflow itself does not ship an alerting system, so the rule must target an external monitoring stack; the example below uses the Prometheus rule format, and the metric name `prediction_latency_seconds` is an assumption about what your serving layer exports.
Terminal
cat > alert_rule.yaml <<'EOF'
groups:
  - name: sla
    rules:
      - alert: HighPredictionLatency
        expr: prediction_latency_seconds > 0.2
        for: 5m
        annotations:
          summary: Prediction latency above 200 ms
EOF
Expected Output
No output (command runs silently)
Validate the rule file and load it into your monitoring system (here, Prometheus) to enforce the SLA condition and get notified on issues.
Terminal
promtool check rules alert_rule.yaml
Expected Output
Checking alert_rule.yaml
  SUCCESS: 1 rules found
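The same latency condition can be prototyped offline before wiring up a monitoring stack. A minimal sketch; the sample latencies and the 5% breach-rate tolerance are illustrative assumptions:

```python
# Sketch: evaluate the 200 ms latency SLA against recent measurements.
# The sample latencies and 5% tolerance are illustrative assumptions.

latencies_ms = [120, 180, 150, 90, 210, 160, 140, 130]
SLA_LATENCY_MS = 200
MAX_BREACH_RATE = 0.05

breaches = sum(1 for lat in latencies_ms if lat > SLA_LATENCY_MS)
breach_rate = breaches / len(latencies_ms)

if breach_rate > MAX_BREACH_RATE:
    print(f"ALERT: {breach_rate:.0%} of requests over {SLA_LATENCY_MS} ms")
else:
    print("latency within SLA")
```

Tracking a breach *rate* rather than single spikes keeps the alert from firing on one slow request.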
Key Concept

If you remember nothing else from this pattern, remember: observability means collecting clear data about your ML system so you can meet and prove your SLAs.

Common Mistakes
Not starting the MLflow server before running the ML project
Metrics and logs have nowhere to go, so you lose observability data
Always start the MLflow tracking server first to collect data
Ignoring alert rules and not setting thresholds for key metrics
You won't get notified when the system breaks SLA, causing downtime or bad user experience
Define clear alert rules for important metrics like latency and error rates
Fetching metrics without specifying the correct run ID
You get no data or wrong data, making it hard to understand system health
Always use the exact run ID from your MLflow runs when querying metrics
Summary
Start the MLflow tracking server to collect metrics and logs from your ML models.
Run your ML project to log performance data for observability.
Fetch metrics using MLflow API to check if your system meets SLAs.
Create and apply alert rules to get notified when SLAs are violated.