ML Python · ~20 mins

Experiment tracking (MLflow) in ML Python - ML Experiment: Train & Evaluate

Experiment - Experiment tracking (MLflow)
Problem: You have trained a machine learning model but have no organized way to track different runs, parameters, and results. This makes it hard to compare models and find the best one.
Current Metrics: Training accuracy: 92%, validation accuracy: 85%, no experiment tracking used.
Issue:Without experiment tracking, it is difficult to reproduce results or compare different model versions systematically.
Your Task
Set up MLflow to track your machine learning experiments. Log parameters, metrics, and the model itself so you can compare runs easily.
Use MLflow's Python API for tracking.
Log at least parameters, training accuracy, validation accuracy, and the model.
Do not change the model architecture or dataset.
Solution
ML Python
import mlflow
import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

# Load data
iris = load_iris()
X_train, X_val, y_train, y_val = train_test_split(iris.data, iris.target, test_size=0.2, random_state=42)

# Define model parameters
n_estimators = 100
max_depth = 3

# Start MLflow run
with mlflow.start_run():
    # Initialize model
    model = RandomForestClassifier(n_estimators=n_estimators, max_depth=max_depth, random_state=42)
    # Train model
    model.fit(X_train, y_train)
    # Predict
    train_preds = model.predict(X_train)
    val_preds = model.predict(X_val)
    # Calculate accuracy
    train_acc = accuracy_score(y_train, train_preds)
    val_acc = accuracy_score(y_val, val_preds)
    # Log parameters
    mlflow.log_param("n_estimators", n_estimators)
    mlflow.log_param("max_depth", max_depth)
    # Log metrics
    mlflow.log_metric("train_accuracy", train_acc)
    mlflow.log_metric("val_accuracy", val_acc)
    # Log model
    mlflow.sklearn.log_model(model, "random_forest_model")

print(f"Training accuracy: {train_acc:.2f}")
print(f"Validation accuracy: {val_acc:.2f}")
Added MLflow tracking code to log parameters, metrics, and model.
Wrapped training and evaluation inside mlflow.start_run() context.
Logged model hyperparameters and accuracy scores.
Saved the trained model using MLflow's sklearn integration.
Results Interpretation

Before: No experiment tracking, only printed accuracy.
After: Parameters, training and validation accuracy, and model saved in MLflow for easy comparison and reproducibility.

Using MLflow helps organize and track machine learning experiments, making it easier to compare different runs and reproduce results.
Bonus Experiment
Try logging additional metrics such as precision, recall, or F1-score, and compare multiple runs with different hyperparameters in the MLflow UI.
💡 Hint
Calculate extra metrics with sklearn.metrics and log them with mlflow.log_metric(). Run multiple experiments by changing parameters across separate mlflow.start_run() blocks.