Model versioning in ML Python - ML Experiment: Train & Evaluate

Experiment - Model versioning

Problem: You have trained a machine learning model and saved it as a file. Now you want to keep track of different versions of your model as you improve it, so you can compare them and use the best one later.

Current Metrics: Model accuracy: 85%, saved as 'model_v1.pkl'

Issue: There is no system to manage multiple model versions. It is hard to know which model is best or to roll back to a previous version.

Your Task

Create a simple model versioning system that saves models with version numbers and allows loading a specific version. Demonstrate saving two versions and loading them to compare accuracy.

Use Python and scikit-learn for model training.

Save models as files with version numbers in the filename.

Do not use complex version control tools; keep it simple and clear.
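One way to meet these constraints is a pair of small helpers that build the versioned filename in one place. The sketch below is only an illustration: the names model_path, save_model and load_model are hypothetical, and it uses the standard-library pickle module so it runs without extra packages. With scikit-learn models you would likely swap in joblib.dump and joblib.load, as the solution does.

```python
import pickle
import tempfile
from pathlib import Path


def model_path(version: int, directory: str = ".") -> Path:
    """Return the versioned filename, e.g. model_v2.pkl (hypothetical helper)."""
    return Path(directory) / f"model_v{version}.pkl"


def save_model(model, version: int, directory: str = ".") -> Path:
    """Save a model under its version number; refuse to overwrite an old version."""
    path = model_path(version, directory)
    if path.exists():
        raise FileExistsError(f"{path.name} already exists; use a new version number")
    with open(path, "wb") as f:
        pickle.dump(model, f)  # assumption: pickle stands in for joblib.dump
    return path


def load_model(version: int, directory: str = "."):
    """Load the model that was saved under the given version number."""
    with open(model_path(version, directory), "rb") as f:
        return pickle.load(f)


# Demo with a plain dict standing in for a trained scikit-learn model.
with tempfile.TemporaryDirectory() as tmp:
    save_model({"n_estimators": 10}, version=1, directory=tmp)
    save_model({"n_estimators": 50}, version=2, directory=tmp)
    print(load_model(2, directory=tmp))  # the object saved as version 2
```

Refusing to overwrite an existing file is a design choice in this sketch: it keeps older versions intact so that rolling back stays possible.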

Solution

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
import joblib

# Load data
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.3, random_state=42)

# Train first model version
model_v1 = RandomForestClassifier(n_estimators=10, random_state=42)
model_v1.fit(X_train, y_train)
joblib.dump(model_v1, 'model_v1.pkl')

# Evaluate first model
preds_v1 = model_v1.predict(X_test)
acc_v1 = accuracy_score(y_test, preds_v1)

# Train second model version with more trees
model_v2 = RandomForestClassifier(n_estimators=50, random_state=42)
model_v2.fit(X_train, y_train)
joblib.dump(model_v2, 'model_v2.pkl')

# Evaluate second model
preds_v2 = model_v2.predict(X_test)
acc_v2 = accuracy_score(y_test, preds_v2)

# Load models from files
loaded_v1 = joblib.load('model_v1.pkl')
loaded_v2 = joblib.load('model_v2.pkl')

# Confirm loaded models accuracy
loaded_acc_v1 = accuracy_score(y_test, loaded_v1.predict(X_test))
loaded_acc_v2 = accuracy_score(y_test, loaded_v2.predict(X_test))

print(f"Model v1 accuracy: {acc_v1:.2f}")
print(f"Model v2 accuracy: {acc_v2:.2f}")
print(f"Loaded model v1 accuracy: {loaded_acc_v1:.2f}")
print(f"Loaded model v2 accuracy: {loaded_acc_v2:.2f}")

Added saving model files with version numbers in filenames.

Trained two models with different hyperparameters to create versions.

Loaded saved models to verify versioning works correctly.

Results Interpretation

Before: Only one model saved as 'model_v1.pkl' with accuracy 85%. No way to track improvements.

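After: Two models saved as 'model_v1.pkl' and 'model_v2.pkl', with reported accuracies of 91% and 96%. These figures are illustrative; the values your run prints may differ, so compare the numbers from the script's own output. Any version can be loaded to compare or use.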

Model versioning helps keep track of improvements and makes it easy to use or roll back to previous models. Saving models with clear version numbers is a simple and effective approach.
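Version numbers in filenames say which file is newer, but not which one scored best. A possible companion, sketched here under assumptions, is a small JSON file that records the accuracy of each version. The file name model_registry.json and the functions record_version and best_version are hypothetical, and the accuracies are the illustrative figures quoted above.

```python
import json
import tempfile
from pathlib import Path


def record_version(registry: Path, version: int, accuracy: float, params: dict) -> None:
    """Add or update one version's entry in the JSON registry file."""
    entries = json.loads(registry.read_text()) if registry.exists() else {}
    entries[str(version)] = {"accuracy": accuracy, "params": params}
    registry.write_text(json.dumps(entries, indent=2))


def best_version(registry: Path) -> int:
    """Return the version number with the highest recorded accuracy."""
    entries = json.loads(registry.read_text())
    return int(max(entries, key=lambda v: entries[v]["accuracy"]))


# Demo in a temporary folder; the accuracy values are illustrative only.
with tempfile.TemporaryDirectory() as tmp:
    registry = Path(tmp) / "model_registry.json"  # hypothetical file name
    record_version(registry, 1, 0.91, {"n_estimators": 10})
    record_version(registry, 2, 0.96, {"n_estimators": 50})
    print("Best version:", best_version(registry))
```

In a real run you would pass the accuracy computed by accuracy_score right after saving each model, then load the file for the version that best_version returns.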

Bonus Experiment

Implement a simple function that lists all saved model versions in the current folder and allows the user to select one to load.

💡 Hint

Use Python's os module (for example os.listdir) to find files matching 'model_v*.pkl', then let the user enter the version number to load.
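A minimal sketch of the listing and selection steps might look like the following. The helpers list_versions and choose_version are hypothetical names, and the demo uses empty placeholder files in a temporary folder rather than real models. The loading step is left out so the example runs without user input.

```python
import os
import re
import tempfile

# Matches names such as model_v1.pkl and captures the version number.
VERSION_PATTERN = re.compile(r"^model_v(\d+)\.pkl$")


def list_versions(directory: str = ".") -> list[int]:
    """Return the version numbers of all model_v<N>.pkl files, sorted numerically."""
    versions = []
    for name in os.listdir(directory):
        match = VERSION_PATTERN.match(name)
        if match:
            versions.append(int(match.group(1)))
    return sorted(versions)


def choose_version(available: list[int], raw_choice: str) -> int:
    """Validate the user's text input against the available versions."""
    version = int(raw_choice)  # raises ValueError if the input is not a number
    if version not in available:
        raise ValueError(f"Version {version} not found; available: {available}")
    return version


# Demo: empty placeholder files stand in for saved models.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("model_v1.pkl", "model_v2.pkl", "model_v10.pkl", "notes.txt"):
        open(os.path.join(tmp, name), "wb").close()
    available = list_versions(tmp)
    print("Available versions:", available)
    print("Selected:", choose_version(available, "2"))
```

In an interactive session, the second argument could come from input("Version to load: "), and the chosen number could then be passed to joblib.load(f"model_v{version}.pkl"). Sorting the parsed integers rather than the filenames keeps version 10 after version 2.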