Logging artifacts and models in MLOps - Commands & Configuration

Introduction
When you train machine learning models, you produce results and files that you will want to reuse or share later. Logging artifacts and models means recording these files with a tracking tool during training, so that each run's outputs are stored together and can be found again.
Logging is useful when you want to:
  • Save your trained model so you can use it later for predictions.
  • Keep track of files such as plots or data samples created during training.
  • Compare model versions by saving each one separately.
  • Share your model and related files with your team.
  • Keep a history of your experiments and their outputs.
Commands
This command installs MLflow, a tool that helps you log models and artifacts easily.
Terminal
pip install mlflow
Expected Output
Collecting mlflow
  Downloading mlflow-2.7.0-py3-none-any.whl (18.7 MB)
Installing collected packages: mlflow
Successfully installed mlflow-2.7.0
This runs a Python script that trains a model and logs the model and an artifact file using MLflow.
Terminal
python log_model.py
Expected Output (timestamps and informational log lines vary with the MLflow version and configuration)
2024/06/01 12:00:00 INFO mlflow.tracking.fluent: Experiment with name 'Default' does not exist. Creating a new experiment.
2024/06/01 12:00:01 INFO mlflow.tracking.fluent: Logging model and artifact
Model and artifact logged successfully
Key Concept

If you remember nothing else from this pattern, remember: logging stores your model and files with the run that produced them, so your work can be found and reused later.

Code Example
MLOps
import mlflow
import mlflow.sklearn
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
import joblib

# Load data
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, random_state=42)

# Train model
model = RandomForestClassifier(n_estimators=10, random_state=42)
model.fit(X_train, y_train)

# Save a sample artifact
joblib.dump(model, 'model.joblib')

# Start MLflow run
with mlflow.start_run():
    # Log model
    mlflow.sklearn.log_model(model, 'random_forest_model')
    # Log artifact file
    mlflow.log_artifact('model.joblib')

print('Model and artifact logged successfully')
Output
Model and artifact logged successfully
Common Mistakes
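Not calling mlflow.start_run() before logging artifacts or models
Every artifact and model belongs to a run. If no run is active, MLflow's fluent API starts one implicitly and leaves it open, so outputs from separate experiments can end up mixed together in a run you did not intend.
Wrap your logging code in mlflow.start_run() used as a context manager, so that each run starts and ends explicitly.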
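Logging files with an incorrect path
mlflow.log_artifact() takes the path of a local file. If that file does not exist, the call raises an error and nothing is logged.
Check that the file exists before calling mlflow.log_artifact(). Note that the second argument of mlflow.sklearn.log_model() is not a local file path but the name under which the model is stored in the run's artifacts.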
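One way to catch the wrong-path mistake early is to check the file before handing it to MLflow. The helper below is a hypothetical sketch, not part of MLflow: `log_artifact_checked` and its `log_fn` parameter are names invented here. `log_fn` falls back to `mlflow.log_artifact` only when nothing else is passed, so the check itself can be tried without a tracking setup.

```python
from pathlib import Path


def log_artifact_checked(path, log_fn=None):
    """Log a file as an artifact, failing early if the file is missing.

    Hypothetical helper: log_fn is any callable that accepts a local path;
    when omitted, mlflow.log_artifact is used.
    """
    file_path = Path(path)
    if not file_path.is_file():
        raise FileNotFoundError(f"Artifact file not found: {file_path}")
    if log_fn is None:
        import mlflow  # imported lazily so the check itself needs no MLflow

        log_fn = mlflow.log_artifact
    log_fn(str(file_path))
    return file_path
```

Inside a run, `log_artifact_checked('model.joblib')` would pass the file on to `mlflow.log_artifact()`, while a typo in the path surfaces as a FileNotFoundError that names the missing file.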
Summary
Install MLflow to enable logging of models and artifacts.
Use mlflow.start_run() to create a logging session.
Log your trained model and any files you want to keep using MLflow functions.
This process helps you save and track your machine learning work automatically.

Practice

1. What is the main purpose of logging artifacts and models in MLOps?
easy
A. To speed up model training
B. To delete old models automatically
C. To create new datasets from artifacts
D. To save files and models for tracking and reuse

Solution

  1. Step 1: Understand the role of logging in MLOps

    Logging artifacts and models helps keep track of work and reuse it later.
  2. Step 2: Identify the correct purpose

    Saving files and models for tracking and reuse matches the main goal of logging.
  3. Final Answer:

    To save files and models for tracking and reuse -> Option D
  4. Quick Check:

    Logging = Save and track work [OK]
Hint: Logging means saving work for later use [OK]
Common Mistakes:
  • Thinking logging deletes models
  • Confusing logging with speeding training
  • Assuming logging creates new data
2. Which of the following is the correct syntax to log a file artifact using MLflow?
easy
A. mlflow.log_model('path/to/file.txt')
B. mlflow.log_artifact('path/to/file.txt')
C. mlflow.log_artifacts('path/to/file.txt')
D. mlflow.log('path/to/file.txt')

Solution

  1. Step 1: Recall MLflow function for logging files

    The correct function is mlflow.log_artifact() for single files.
  2. Step 2: Check syntax correctness

    mlflow.log_artifact('path/to/file.txt') matches the correct syntax.
  3. Final Answer:

    mlflow.log_artifact('path/to/file.txt') -> Option B
  4. Quick Check:

    Single file logging = log_artifact() [OK]
Hint: Use log_artifact() for single files [OK]
Common Mistakes:
  • Using log_model() for files
  • Using plural log_artifacts() incorrectly
  • Using generic log() function
3. Assuming data.csv exists, what happens when this code snippet runs?
import mlflow
with mlflow.start_run():
    mlflow.log_artifact('data.csv')
    mlflow.sklearn.log_model(model, 'model')
print('Run finished')
medium
A. Error because model is not defined
B. Run finished printed; artifacts and model logged in current run
C. No output; code hangs
D. Run finished printed; but nothing logged

Solution

  1. Step 1: Analyze the code snippet

    The code tries to log a file and a model inside a run.
  2. Step 2: Check for errors

    The variable 'model' is never defined, so mlflow.sklearn.log_model(model, 'model') raises a NameError and 'Run finished' is never printed.
  3. Final Answer:

    Error because model is not defined -> Option A
  4. Quick Check:

    Undefined variable causes error [OK]
Hint: Check if variables are defined before logging [OK]
Common Mistakes:
  • Assuming code runs without defining model
  • Thinking print means success
  • Ignoring variable definitions
4. You run this code but no artifacts appear in MLflow UI:
mlflow.log_artifact('output.txt')

What is the most likely reason?
medium
A. log_artifact() only works inside a run
B. The file output.txt does not exist
C. No active MLflow run was started
D. MLflow server is down

Solution

  1. Step 1: Understand MLflow run context

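    Every artifact is stored under a run. Starting the run explicitly determines where the file ends up in the MLflow UI.
  2. Step 2: Identify missing run

    No mlflow.start_run() precedes the call, so the file is not attached to the run you expect. MLflow's fluent API may start a run implicitly in that case, which makes the artifact easy to miss.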
  3. Final Answer:

    No active MLflow run was started -> Option C
  4. Quick Check:

    Logging needs active run [OK]
Hint: Always start a run before logging [OK]
Common Mistakes:
  • Relying on an implicitly started run
  • Ignoring file existence
  • Blaming server without checking run
5. You want to log multiple files and a trained model in one MLflow run. Which code snippet correctly does this?
hard
A. with mlflow.start_run(): mlflow.log_artifact('file1.txt'); mlflow.log_artifact('file2.txt'); mlflow.sklearn.log_model(model, 'model')
B. mlflow.log_artifact(['file1.txt', 'file2.txt']); mlflow.sklearn.log_model(model, 'model')
C. with mlflow.start_run(): mlflow.log_artifacts(['file1.txt', 'file2.txt']); mlflow.sklearn.log_model(model, 'model')
D. mlflow.start_run(); mlflow.log_artifact('file1.txt'); mlflow.log_artifacts('file2.txt'); mlflow.sklearn.log_model(model, 'model'); mlflow.end_run()

Solution

  1. Step 1: Identify correct way to log multiple files

    For multiple individual files, call mlflow.log_artifact() once per file. mlflow.log_artifacts() expects a directory, not a list of files or a single file.
  2. Step 2: Confirm run context and model logging

    Using with mlflow.start_run(): ensures a proper run context; each file and the model are then logged to the same run.
  3. Final Answer:

    with mlflow.start_run(): mlflow.log_artifact('file1.txt'); mlflow.log_artifact('file2.txt'); mlflow.sklearn.log_model(model, 'model') -> Option A
  4. Quick Check:

    Multiple files = log_artifact() for each inside run [OK]
Hint: Call log_artifact() for each file inside a run [OK]
Common Mistakes:
  • Passing list to log_artifacts()
  • Not using a run context
  • Using log_artifacts() on single file
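As an alternative to one `log_artifact()` call per file, `mlflow.log_artifacts()` takes a local directory and logs everything in it. The sketch below copies the files into a temporary directory first and then logs that directory in a single call. `log_files_as_directory` and its `log_dir_fn` parameter are hypothetical names, and `log_dir_fn` falls back to `mlflow.log_artifacts` when omitted.

```python
import shutil
import tempfile
from pathlib import Path


def log_files_as_directory(paths, log_dir_fn=None):
    """Copy the given files into one temporary directory and log it in one call.

    Hypothetical helper: log_dir_fn is any callable that accepts a directory
    path; when omitted, mlflow.log_artifacts is used.
    """
    if log_dir_fn is None:
        import mlflow  # imported lazily; only needed for real logging

        log_dir_fn = mlflow.log_artifacts
    with tempfile.TemporaryDirectory() as staging_dir:
        for path in paths:
            # shutil.copy raises FileNotFoundError if a source file is missing.
            shutil.copy(path, staging_dir)
        log_dir_fn(staging_dir)
    return [Path(p).name for p in paths]
```

In real use this would be called inside `with mlflow.start_run():`. Files that share a base name would overwrite each other in the staging directory, so this sketch suits files with distinct names.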