How to Use MLflow for Tracking Machine Learning Experiments
Use mlflow.start_run() to begin tracking an experiment, then log parameters with mlflow.log_param(), metrics with mlflow.log_metric(), and models with mlflow.sklearn.log_model(). This keeps your model training details and results in one place.
Syntax
MLflow tracking uses a simple pattern to log your machine learning experiments:
- mlflow.start_run(): Starts a new experiment run.
- mlflow.log_param(key, value): Logs a parameter such as a learning rate or number of trees.
- mlflow.log_metric(key, value): Logs a metric such as accuracy or loss.
- mlflow.sklearn.log_model(model, name): Saves the trained model for later use.
- mlflow.end_run(): Ends the current run (optional; runs started with a with block end automatically).
```python
import mlflow

with mlflow.start_run():
    mlflow.log_param("param1", 5)
    mlflow.log_metric("accuracy", 0.85)
    # model training and logging here
    # mlflow.sklearn.log_model(model, "model")
```
Example
This example shows how to track a simple scikit-learn model training with MLflow. It logs parameters, metrics, and the model itself.
```python
import mlflow
import mlflow.sklearn
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load data
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=42
)

# Start MLflow run
with mlflow.start_run():
    # Define and train model
    n_estimators = 100
    model = RandomForestClassifier(n_estimators=n_estimators, random_state=42)
    model.fit(X_train, y_train)

    # Predict and calculate accuracy
    preds = model.predict(X_test)
    acc = accuracy_score(y_test, preds)

    # Log parameters and metrics
    mlflow.log_param("n_estimators", n_estimators)
    mlflow.log_metric("accuracy", acc)

    # Log the model
    mlflow.sklearn.log_model(model, "random_forest_model")

    print(f"Logged model with accuracy: {acc:.4f}")
```
Output
Logged model with accuracy: 1.0000
Common Pitfalls
Common mistakes when using MLflow tracking include:
- Not calling mlflow.start_run(), which causes logs to be ignored.
- Logging parameters or metrics outside the run context.
- Forgetting to log the model after training.
- Overwriting runs by not managing run IDs or experiment names.
Always use with mlflow.start_run(): to ensure logs are saved properly. (Note: some MLflow versions auto-start a run when you log without one, but relying on that makes runs harder to organize.)
```python
import mlflow

# Wrong way: logging outside a run
# mlflow.log_param("param", 10)  # This will raise an error

# Right way:
with mlflow.start_run():
    mlflow.log_param("param", 10)
```
Output
Traceback (most recent call last):
  File "example.py", line 4, in <module>
    mlflow.log_param("param", 10)
  File "/usr/local/lib/python3.8/site-packages/mlflow/tracking/fluent.py", line 456, in log_param
    _get_active_run_or_raise().log_param(key, value)
  File "/usr/local/lib/python3.8/site-packages/mlflow/tracking/fluent.py", line 222, in _get_active_run_or_raise
    raise MlflowException("No active run")
mlflow.exceptions.MlflowException: No active run
Quick Reference
Here is a quick summary of MLflow tracking commands:
| Command | Description |
|---|---|
| mlflow.start_run() | Start a new experiment run context |
| mlflow.log_param(key, value) | Log a parameter (e.g., hyperparameter) |
| mlflow.log_metric(key, value) | Log a metric (e.g., accuracy) |
| mlflow.sklearn.log_model(model, name) | Save a trained scikit-learn model |
| mlflow.end_run() | End the current run (optional) |
Key Takeaways
- Always use mlflow.start_run() to begin tracking an experiment run.
- Log parameters and metrics inside the run context so they are saved properly.
- Use mlflow.sklearn.log_model() to save your trained model for later use.
- Avoid logging outside a run context to prevent errors.
- MLflow helps organize and compare multiple experiment runs easily.