
How to Use MLflow for Tracking Machine Learning Experiments

Use mlflow.start_run() to begin tracking an experiment, then log parameters with mlflow.log_param(), metrics with mlflow.log_metric(), and models with mlflow.sklearn.log_model(). This helps you keep track of your model training details and results in one place.

Syntax

MLflow tracking uses a simple pattern to log your machine learning experiments:

  • mlflow.start_run(): Starts a new experiment run.
  • mlflow.log_param(key, value): Logs a parameter like learning rate or number of trees.
  • mlflow.log_metric(key, value): Logs a metric like accuracy or loss.
  • mlflow.sklearn.log_model(model, name): Saves the trained model for later use.
  • mlflow.end_run(): Ends the current run. You can skip it when start_run() is used in a with block, which ends the run automatically when the block exits.
python
import mlflow

with mlflow.start_run():
    mlflow.log_param("param1", 5)
    mlflow.log_metric("accuracy", 0.85)
    # model training and logging here
    # mlflow.sklearn.log_model(model, "model")

Example

This example shows how to track a simple scikit-learn model training with MLflow. It logs parameters, metrics, and the model itself.

python
import mlflow
import mlflow.sklearn
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load data
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.2, random_state=42)

# Start MLflow run
with mlflow.start_run():
    # Define and train model
    n_estimators = 100
    model = RandomForestClassifier(n_estimators=n_estimators, random_state=42)
    model.fit(X_train, y_train)

    # Predict and calculate accuracy
    preds = model.predict(X_test)
    acc = accuracy_score(y_test, preds)

    # Log parameters and metrics
    mlflow.log_param("n_estimators", n_estimators)
    mlflow.log_metric("accuracy", acc)

    # Log the model
    mlflow.sklearn.log_model(model, "random_forest_model")

    print(f"Logged model with accuracy: {acc:.4f}")
Output
Logged model with accuracy: 1.0000

Common Pitfalls

Common mistakes when using MLflow tracking include:

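  • Calling logging functions without mlflow.start_run(). The logs are not ignored: MLflow silently starts a run for you, and that run stays active until mlflow.end_run() is called or the process exits.
  • Logging parameters or metrics after the with block has exited, which attaches them to a new implicit run instead of the run you intended.
  • Forgetting to log the model after training.
  • Leaving every run in the default experiment by never calling mlflow.set_experiment(), which makes runs hard to find and compare.

Use with mlflow.start_run(): so that every log goes to a run you started on purpose and the run is closed when the block exits.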

python
import mlflow

# Risky: logging without an explicit run
# mlflow.log_param("param", 10)  # No error: MLflow silently starts a run that stays active until mlflow.end_run()

# Right way: scope the run explicitly
with mlflow.start_run():
    mlflow.log_param("param", 10)

As written, this snippet prints nothing. If the commented-out line is enabled, it does not fail. Instead, the later mlflow.start_run() raises an exception, because the implicitly created run is still active.

Quick Reference

Here is a quick summary of MLflow tracking commands:

Command                               | Description
mlflow.start_run()                    | Start a new experiment run context
mlflow.log_param(key, value)          | Log a parameter (e.g., hyperparameter)
mlflow.log_metric(key, value)         | Log a metric (e.g., accuracy)
mlflow.sklearn.log_model(model, name) | Save a trained scikit-learn model
mlflow.end_run()                      | End the current run (optional)

Key Takeaways

  • Always use mlflow.start_run() to begin tracking an experiment run.
  • Log parameters and metrics inside the run context so they are attached to the run you intended.
  • Use mlflow.sklearn.log_model() to save your trained model for later use.
  • Avoid logging outside a run context: MLflow will not raise an error, but it will start an implicit run that stays open until you end it.
  • MLflow helps organize and compare multiple experiment runs easily.