Automated model validation before promotion in MLOps - Commands & Configuration

Introduction
When you train a machine learning model, you need to confirm that it performs well before it handles real traffic. Automated model validation checks the model's quality against defined criteria before you deploy it in an application or share it with others.
It is useful in these situations:
When you want to check if a new model is better than the old one before using it.
When you want to avoid mistakes by automatically testing models after training.
When you want to save time by not checking models manually every time.
When you want to keep a history of model performance to track improvements.
When you want to automatically stop bad models from being used in production.
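The first scenario above, checking a candidate against the model already in use, might be sketched as follows. The function name `should_promote`, the `min_gain` margin, and the accuracy values are illustrative assumptions, not part of MLflow.

```python
def should_promote(candidate_acc: float, production_acc: float, min_gain: float = 0.0) -> bool:
    """Return True only if the candidate beats production by at least min_gain."""
    return candidate_acc - production_acc >= min_gain


# Hypothetical accuracies for a new model and the one currently in production.
print(should_promote(0.87, 0.85, min_gain=0.01))  # True
print(should_promote(0.84, 0.85))                 # False
```

Requiring a small margin rather than any improvement is one way to avoid promoting a model whose gain is within the noise of the test set.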
Commands
This command runs the MLflow project in the current folder with a parameter alpha set to 0.5. It starts the model training and validation process automatically.
Terminal
mlflow run . -P alpha=0.5
Expected Output
2024/06/01 12:00:00 INFO mlflow.projects: === Run (ID=123abc) started ===
2024/06/01 12:00:05 INFO mlflow.projects: Training model with alpha=0.5
2024/06/01 12:00:10 INFO mlflow.projects: Validation accuracy: 0.87
2024/06/01 12:00:10 INFO mlflow.projects: Model passed validation and is ready for promotion
2024/06/01 12:00:10 INFO mlflow.projects: === Run (ID=123abc) succeeded ===
-P - Set a parameter value for the MLflow project run
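`mlflow run` substitutes each `-P` value into the entry-point command defined in the project's MLproject file. The project files are not shown here, so the following is only a sketch of how a training script might receive `alpha`, assuming a hypothetical entry-point command such as `python train.py --alpha {alpha}`.

```python
import argparse


def parse_args(argv=None):
    """Parse the parameters that the entry-point command passes to the script."""
    parser = argparse.ArgumentParser(description="Train and validate a model")
    # The default of 1.0 is an assumption; it applies when --alpha is not passed.
    parser.add_argument("--alpha", type=float, default=1.0, help="regularization strength")
    return parser.parse_args(argv)


# Simulate the arguments produced by: mlflow run . -P alpha=0.5
args = parse_args(["--alpha", "0.5"])
print(f"Training with alpha={args.alpha}")  # Training with alpha=0.5
```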
This command serves the validated model from the run with ID 123abc on port 1234 so you can test it live or use it in your app.
Terminal
mlflow models serve -m runs:/123abc/model -p 1234
Expected Output
2024/06/01 12:01:00 INFO mlflow.models: Starting model server at http://127.0.0.1:1234
2024/06/01 12:01:00 INFO mlflow.models: Model loaded successfully
-m - Specify the model URI to serve
-p - Set the port number for the model server
This command sends a test data point to the running model server to get a prediction and verify the model works as expected.
Terminal
curl -d '{"inputs": [[5.1, 3.5, 1.4, 0.2]]}' -H 'Content-Type: application/json' -X POST http://127.0.0.1:1234/invocations
Expected Output
{"predictions": [0]}
-d - Send JSON data in the request body
-H - Set the content type header to JSON
-X - Use POST method to send data
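The same request can be built from a validation script instead of the terminal. This is a minimal sketch using only the standard library; the helper names are hypothetical, and the `inputs` payload key is the form accepted by MLflow 2.x scoring servers, so adjust it if your server version expects a different format.

```python
import json
import urllib.request

SERVER_URL = "http://127.0.0.1:1234/invocations"  # port from the serve command above


def build_invocation_request(rows, url=SERVER_URL):
    """Build a POST request carrying the rows as a JSON body."""
    body = json.dumps({"inputs": rows}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(rows, url=SERVER_URL, timeout=10):
    """Send the rows to the running model server and return the parsed JSON reply."""
    with urllib.request.urlopen(build_invocation_request(rows, url), timeout=timeout) as resp:
        return json.loads(resp.read())


# Building the request needs no running server; predict() does.
request = build_invocation_request([[5.1, 3.5, 1.4, 0.2]])
print(request.get_method(), request.full_url)
```

Keeping request construction separate from sending makes the payload format easy to check in a unit test before any server is started.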
Key Concept

If you remember nothing else from this pattern, remember: automate testing your model's quality before using it to avoid mistakes and save time.

Code Example
MLOps
import mlflow
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load data
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.2, random_state=42)

# Start MLflow run
with mlflow.start_run() as run:
    # Train model
    model = LogisticRegression(max_iter=200)
    model.fit(X_train, y_train)

    # Predict and validate
    preds = model.predict(X_test)
    acc = accuracy_score(y_test, preds)

    # Log model and metric
    mlflow.log_metric("accuracy", acc)
    mlflow.sklearn.log_model(model, "model")

    # Check if accuracy is good enough
    if acc >= 0.85:
        print(f"Model passed validation with accuracy {acc:.2f}")
    else:
        print(f"Model failed validation with accuracy {acc:.2f}")
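The example above only prints the result, so a pipeline would continue even after a failure. One way to let the check actually block promotion is to turn it into a process exit code, which CI tools treat as pass (0) or fail (non-zero). The function names below are illustrative assumptions; the 0.85 threshold is taken from the example.

```python
import sys

ACCURACY_THRESHOLD = 0.85  # same threshold as in the example above


def validation_exit_code(accuracy: float, threshold: float = ACCURACY_THRESHOLD) -> int:
    """Return 0 when the accuracy meets the threshold, 1 otherwise."""
    if accuracy >= threshold:
        print(f"PASS: accuracy {accuracy:.2f} >= {threshold:.2f}")
        return 0
    print(f"FAIL: accuracy {accuracy:.2f} < {threshold:.2f}")
    return 1


def main(accuracy: float) -> None:
    """Exit the process so that a CI job stops when validation fails."""
    sys.exit(validation_exit_code(accuracy))


# Demonstration without exiting the interpreter.
print(validation_exit_code(0.87))  # prints the PASS line, then 0
```

In a pipeline, the training script would call `main(acc)` as its last step, and the promotion job would run only if that step succeeds.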
Common Mistakes
Skipping the validation step and promoting the model directly after training
This can cause bad models to be used in production, leading to wrong predictions and user problems.
Always run automated validation to check model quality before promotion.
Not specifying parameters correctly when running the MLflow project
The model may train with default or wrong settings, causing poor performance.
Use the -P flag to set parameters explicitly during the run.
Testing the model server with wrong data format or missing headers
The server will reject the request or return errors, making testing fail.
Send JSON data with the correct content-type header and use POST method.
Summary
Run the MLflow project with parameters to train and validate the model automatically.
Serve the validated model to test it live or integrate with applications.
Send test data to the model server to confirm it predicts correctly before promotion.
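The last step, confirming that the server's reply looks right, can also be automated. The sketch below assumes the response shape shown earlier, `{"predictions": [...]}`, and the three Iris class labels 0, 1, and 2; `check_predictions` is a hypothetical helper, not an MLflow function.

```python
import json


def check_predictions(response_body: str, expected_count: int, allowed_labels=(0, 1, 2)) -> bool:
    """Return True if the reply holds the expected number of valid class labels."""
    payload = json.loads(response_body)
    if not isinstance(payload, dict):
        return False
    predictions = payload.get("predictions")
    if not isinstance(predictions, list) or len(predictions) != expected_count:
        return False
    return all(label in allowed_labels for label in predictions)


print(check_predictions('{"predictions": [0]}', expected_count=1))  # True
print(check_predictions('{"error": "bad request"}', expected_count=1))  # False
```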

Practice

1. What is the main purpose of automated model validation before promotion in MLOps?
easy
A. To check if the model meets quality standards before deployment
B. To speed up the training process of the model
C. To manually review the model code for errors
D. To collect more data for training the model

Solution

  1. Step 1: Understand the goal of validation

    Automated model validation is designed to ensure the model performs well and meets quality standards before it is used in production.
  2. Step 2: Differentiate from other tasks

    Speeding training, manual code review, or data collection are separate tasks not directly related to validation before promotion.
  3. Final Answer:

    To check if the model meets quality standards before deployment -> Option A
  4. Quick Check:

    Validation ensures quality before deployment = A [OK]
Hint: Validation means checking quality before use [OK]
Common Mistakes:
  • Confusing validation with training speed
  • Thinking validation is manual code review
  • Mixing validation with data collection
2. Which of the following is a correct way to automate model validation in a CI/CD pipeline?
easy
A. Run a script that tests model accuracy and returns pass/fail status
B. Manually check model predictions after deployment
C. Skip validation to save time during deployment
D. Only validate the model after it is in production

Solution

  1. Step 1: Identify automation in CI/CD

    Automation requires scripts or tools that run tests automatically and give clear pass/fail results.
  2. Step 2: Eliminate manual or delayed checks

    Manual checks or skipping validation do not fit automation principles and risk bad models in production.
  3. Final Answer:

    Run a script that tests model accuracy and returns pass/fail status -> Option A
  4. Quick Check:

    Automated validation uses scripts with pass/fail output = A [OK]
Hint: Automation means scripts with pass/fail results [OK]
Common Mistakes:
  • Choosing manual checks as automation
  • Skipping validation to save time
  • Validating only after deployment
3. Given this Python snippet in a validation script:
accuracy = 0.82
threshold = 0.80
if accuracy >= threshold:
    print('PASS')
else:
    print('FAIL')

What will be the output?
medium
A. FAIL
B. PASS
C. SyntaxError
D. No output

Solution

  1. Step 1: Compare accuracy with threshold

    The accuracy is 0.82, which is greater than or equal to the threshold 0.80.
  2. Step 2: Determine the printed output

    Since 0.82 >= 0.80 is true, the script prints 'PASS'.
  3. Final Answer:

    PASS -> Option B
  4. Quick Check:

    0.82 >= 0.80 means PASS [OK]
Hint: Check if accuracy meets or exceeds threshold [OK]
Common Mistakes:
  • Confusing greater than with less than
  • Thinking 0.82 is less than 0.80
  • Assuming syntax error due to >= symbol
4. A validation script uses this code:
if model_accuracy > threshold
    print('PASS')
else:
    print('FAIL')

What is the error and how to fix it?
medium
A. Wrong comparison operator; replace > with <
B. Incorrect variable name; change model_accuracy to accuracy
C. Indentation error; remove indentation before print
D. Missing colon after if condition; add ':' after threshold

Solution

  1. Step 1: Identify syntax error in if statement

    The if statement is missing a colon ':' at the end of the condition line.
  2. Step 2: Correct the syntax

    Add a colon ':' after 'threshold' to fix the syntax error.
  3. Final Answer:

    Missing colon after if condition; add ':' after threshold -> Option D
  4. Quick Check:

    if statements need ':' at end = D [OK]
Hint: if statements always end with ':' [OK]
Common Mistakes:
  • Ignoring missing colon causing syntax error
  • Changing variable names unnecessarily
  • Misunderstanding indentation rules
5. You want to automate model validation to check multiple metrics before promotion. Which approach is best?
hard
A. Manually review metrics and decide promotion later
B. Promote the model if any one metric passes the threshold
C. Write a script that checks all metrics and returns 'PASS' only if all meet thresholds
D. Ignore metrics and promote based on training completion

Solution

  1. Step 1: Understand multi-metric validation

    For reliable validation, all important metrics should meet their thresholds before promotion.
  2. Step 2: Choose automation that enforces all checks

    A script that returns 'PASS' only if all metrics pass ensures no weak model is promoted.
  3. Final Answer:

    Write a script that checks all metrics and returns 'PASS' only if all meet thresholds -> Option C
  4. Quick Check:

    All metrics must pass for promotion = C [OK]
Hint: All metrics must meet thresholds to pass [OK]
Common Mistakes:
  • Promoting if only one metric passes
  • Relying on manual review instead of automation
  • Ignoring metrics and promoting anyway
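The approach in question 5 might look like the following sketch. The metric names and thresholds are made-up examples, every metric is assumed to be "higher is better", and a metric missing from the results is treated as a failure.

```python
def validate_metrics(metrics: dict, thresholds: dict):
    """Return ('PASS', []) only if every metric meets its minimum, else ('FAIL', failed names)."""
    failures = [
        name
        for name, minimum in thresholds.items()
        if metrics.get(name, float("-inf")) < minimum
    ]
    return ("PASS" if not failures else "FAIL"), failures


# Hypothetical thresholds and results.
thresholds = {"accuracy": 0.85, "f1": 0.75}
print(validate_metrics({"accuracy": 0.90, "f1": 0.80}, thresholds))  # ('PASS', [])
print(validate_metrics({"accuracy": 0.90, "f1": 0.70}, thresholds))  # ('FAIL', ['f1'])
```

Returning the names of the failing metrics alongside the status makes it easier to see from the pipeline log why a model was rejected.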