How to Save a Model in sklearn: Simple Guide

To save a trained sklearn model, write the model object to a file with joblib.dump() or pickle.dump(). Later, load it back with joblib.load() or pickle.load() and reuse it without retraining.

📐

Syntax

Use joblib.dump(model, filename) to save the model to a file. Use joblib.load(filename) to load it back. Alternatively, use pickle.dump(model, file) and pickle.load(file) for saving and loading.

model: the trained sklearn model object
filename: path of the model file to write or read, for example 'model.joblib'
file: a file object opened in binary mode ('wb' to save, 'rb' to load)

python

import joblib

# Save model
joblib.dump(model, 'model.joblib')

# Load model
model = joblib.load('model.joblib')
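The snippet above shows only the joblib form. As a minimal sketch of both forms side by side, the following uses a tiny placeholder LinearRegression model and a temporary directory (both are assumptions made for illustration, not part of the original guide) so that it runs on its own and leaves no files behind.

```python
import pickle
import tempfile
from pathlib import Path

import joblib
from sklearn.linear_model import LinearRegression

# Tiny placeholder model so the snippet runs on its own
model = LinearRegression().fit([[0.0], [1.0], [2.0]], [0.0, 1.0, 2.0])

with tempfile.TemporaryDirectory() as tmp_dir:
    joblib_path = Path(tmp_dir) / "model.joblib"
    pickle_path = Path(tmp_dir) / "model.pkl"

    # joblib takes a filename and opens/closes the file itself
    joblib.dump(model, joblib_path)
    from_joblib = joblib.load(joblib_path)

    # pickle takes an already-open file object in binary mode
    with open(pickle_path, "wb") as f:
        pickle.dump(model, f)
    with open(pickle_path, "rb") as f:
        from_pickle = pickle.load(f)

print(type(from_joblib).__name__, type(from_pickle).__name__)
```

The practical difference is who manages the file: joblib works from the path, while pickle expects you to open the file yourself.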

💻

Example

This example trains a simple logistic regression model on sample data, saves it using joblib, then loads it back and makes a prediction.

python

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
import joblib

# Load data
X, y = load_iris(return_X_y=True)

# Train model
model = LogisticRegression(max_iter=200)
model.fit(X, y)

# Save model
joblib.dump(model, 'iris_model.joblib')

# Load model
loaded_model = joblib.load('iris_model.joblib')

# Predict
sample = X[0].reshape(1, -1)
prediction = loaded_model.predict(sample)
print('Predicted class:', prediction[0])

Output

Predicted class: 0
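A single prediction does not show that the whole model survived the round trip. One way to check, sketched here under the assumption that comparing predictions and coefficients is enough for your use case, is to compare the loaded model against the original on the full dataset. The temporary directory is only there to keep the sketch free of leftover files.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=200).fit(X, y)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "iris_model.joblib"
    joblib.dump(model, path)
    loaded_model = joblib.load(path)

# The restored model should reproduce the original's predictions and weights
same_predictions = np.array_equal(model.predict(X), loaded_model.predict(X))
same_coefficients = np.array_equal(model.coef_, loaded_model.coef_)

print("Predictions match:", same_predictions)
print("Coefficients match:", same_coefficients)
```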

⚠️

Common Pitfalls

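Opening the file in text mode: With pickle, opening the file with 'w' instead of 'wb' raises a TypeError, because pickle writes bytes rather than text.
Overwriting files unintentionally: Both joblib.dump() and open(..., 'wb') silently replace an existing file, so check the filename before saving to avoid losing a previous model.
Loading incompatible versions: sklearn does not guarantee that a model saved with one version loads or behaves correctly in another, so record the version used for training.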

python

import pickle

# `model` is a trained sklearn estimator, as in the example above

# Wrong way: text mode ('w') raises TypeError because pickle writes bytes
# with open('model.pkl', 'w') as f:
#     pickle.dump(model, f)

# Right way: open file in binary write mode
with open('model.pkl', 'wb') as f:
    pickle.dump(model, f)

# Loading model
with open('model.pkl', 'rb') as f:
    loaded_model = pickle.load(f)
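The overwriting and version pitfalls listed above can be guarded against in a similar way. The sketch below is one possible approach, not an sklearn feature: the helpers save_model_once and load_model_checked are hypothetical names, and storing the version in a dictionary next to the model is an assumption made for illustration.

```python
import pickle
import tempfile
from pathlib import Path

import sklearn
from sklearn.linear_model import LinearRegression


def save_model_once(model, path):
    """Hypothetical helper: save model plus sklearn version, never overwriting."""
    payload = {"sklearn_version": sklearn.__version__, "model": model}
    # 'xb' = exclusive create + binary: raises FileExistsError if path exists
    with open(path, "xb") as f:
        pickle.dump(payload, f)


def load_model_checked(path):
    """Hypothetical helper: load the model and report a version mismatch."""
    with open(path, "rb") as f:
        payload = pickle.load(f)
    if payload["sklearn_version"] != sklearn.__version__:
        print(
            "Warning: saved with sklearn", payload["sklearn_version"],
            "but running", sklearn.__version__,
        )
    return payload["model"]


# Tiny placeholder model so the snippet runs on its own
model = LinearRegression().fit([[0.0], [1.0]], [0.0, 1.0])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "model.pkl"
    save_model_once(model, path)

    try:
        save_model_once(model, path)  # second save to the same path
        overwrite_blocked = False
    except FileExistsError:
        overwrite_blocked = True

    restored = load_model_checked(path)

print("Overwrite blocked:", overwrite_blocked)
```

Note that the version comparison runs only after the file has been unpickled, so it reports a mismatch rather than preventing a failed load.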

📊

Quick Reference

Action	Function	Notes
Save model	joblib.dump(model, 'file.joblib')	Recommended for sklearn models; efficient for objects holding large NumPy arrays
Load model	model = joblib.load('file.joblib')	Restores the saved model object
Save model (pickle)	with open('file.pkl', 'wb') as f: pickle.dump(model, f)	Standard library only; can be slower for large NumPy arrays; file must be opened in binary mode
Load model (pickle)	with open('file.pkl', 'rb') as f: model = pickle.load(f)	File must be opened in binary mode

✅

Key Takeaways

Use joblib.dump() and joblib.load() to save and load sklearn models efficiently.

Always open files in binary mode when using pickle to avoid errors.

Saving models lets you reuse trained models without retraining.

Check filenames carefully to avoid overwriting saved models.

Model files may not be compatible across different sklearn versions.