How to Use Pickle for Model Saving in Python with sklearn
Use pickle.dump() to save a trained sklearn model to a file and pickle.load() to load it back. This lets you store your model on disk and reuse it later without retraining.
Syntax
To save a model, use pickle.dump(model, file). To load it back, use model = pickle.load(file). The file should be opened in binary mode.
- model: your trained sklearn model object
- file: a file object opened with open(filename, 'wb') for saving or open(filename, 'rb') for loading
```python
import pickle

# Save model (model is an already-trained sklearn estimator)
with open('model.pkl', 'wb') as f:
    pickle.dump(model, f)

# Load model
with open('model.pkl', 'rb') as f:
    model = pickle.load(f)
```
Example
This example trains a simple sklearn logistic regression model on sample data, saves it using pickle, then loads it back and makes a prediction.
```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
import pickle

# Load data
iris = load_iris()
X, y = iris.data, iris.target

# Train model
model = LogisticRegression(max_iter=200)
model.fit(X, y)

# Save model
with open('iris_model.pkl', 'wb') as f:
    pickle.dump(model, f)

# Load model
with open('iris_model.pkl', 'rb') as f:
    loaded_model = pickle.load(f)

# Predict with loaded model
sample = X[0].reshape(1, -1)
prediction = loaded_model.predict(sample)
print(f"Predicted class: {prediction[0]}")
```
Output
Predicted class: 0
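A single prediction is a fairly weak check. As a minimal sketch, one way to confirm that the round trip preserved the model is to compare the predictions of the original and the loaded model over the whole dataset. The temporary directory and the variable name predictions_match are choices made for this illustration, not part of the example above.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Train the same kind of model as in the example above.
X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=200).fit(X, y)

# Use a temporary directory so the check leaves no files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = Path(tmp_dir) / "iris_model.pkl"

    with open(model_path, "wb") as f:
        pickle.dump(model, f)

    with open(model_path, "rb") as f:
        loaded_model = pickle.load(f)

# The loaded model should reproduce the original predictions exactly.
predictions_match = np.array_equal(model.predict(X), loaded_model.predict(X))
print(f"Predictions match: {predictions_match}")
```

If the comparison fails, the saved file should not be trusted as a replacement for the trained model.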
Common Pitfalls
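- Forgetting to open the file in binary mode ('wb' for writing, 'rb' for reading) causes errors. In Python 3, pickle.dump() on a file opened in text mode raises a TypeError because pickle writes bytes, not str.
- Pickle files are Python-specific, and loading one can execute arbitrary code, so never load a pickle file from an untrusted source.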
- Changing the model code or sklearn version may cause incompatibility when loading saved models.
- Always test loading and prediction after saving to ensure the model works as expected.
```python
import pickle

# Wrong: opening file in text mode causes error
# with open('model.pkl', 'w') as f:
#     pickle.dump(model, f)  # This will raise an error

# Right way:
with open('model.pkl', 'wb') as f:
    pickle.dump(model, f)
```
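For the version pitfall listed above, one possible approach is to store the sklearn version next to the model and compare it when loading. This is only a sketch: the dictionary layout, the key names, and the load_bundle helper are hypothetical and not part of pickle or sklearn.

```python
import pickle
import tempfile
import warnings
from pathlib import Path

import sklearn
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=200).fit(X, y)

# Hypothetical layout: bundle the model with the version that produced it.
bundle = {"model": model, "sklearn_version": sklearn.__version__}


def load_bundle(path):
    """Load a bundle and warn if it was saved with a different sklearn version."""
    with open(path, "rb") as f:
        loaded = pickle.load(f)
    saved_version = loaded["sklearn_version"]
    if saved_version != sklearn.__version__:
        warnings.warn(
            f"Model saved with sklearn {saved_version}, "
            f"running {sklearn.__version__}; re-test before use."
        )
    return loaded["model"]


with tempfile.TemporaryDirectory() as tmp_dir:
    bundle_path = Path(tmp_dir) / "model_bundle.pkl"
    with open(bundle_path, "wb") as f:
        pickle.dump(bundle, f)
    restored_model = load_bundle(bundle_path)

print(type(restored_model).__name__)
```

This check only helps when unpickling itself succeeds. If the versions are far enough apart, pickle.load() may fail before the comparison runs, so recording the version in a separate text file is a reasonable alternative.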
Quick Reference
Remember these key points when using pickle for model saving:
- Use open(filename, 'wb') to save and open(filename, 'rb') to load.
- Use pickle.dump(model, file) to save the model.
- Use model = pickle.load(file) to load the model.
- Test loaded models before using in production.
Key Takeaways
Always open files in binary mode when saving or loading with pickle.
Use pickle.dump() to save and pickle.load() to load sklearn models.
Test your saved model by loading and predicting to ensure it works.
Avoid loading pickle files from untrusted sources for security reasons.
Model compatibility may break if sklearn or model code changes after saving.
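To see why untrusted pickle files are a risk, the harmless sketch below shows that pickle.loads() calls whatever callable the pickled data names. The Demo class is a hypothetical stand-in for a malicious payload; here it only evaluates a small arithmetic expression.

```python
import pickle


class Demo:
    """Harmless stand-in for a malicious payload."""

    def __reduce__(self):
        # Tells pickle to "rebuild" this object by calling eval("6 * 7").
        # An attacker could name any importable callable here instead.
        return (eval, ("6 * 7",))


payload = pickle.dumps(Demo())

# Loading runs the callable; no Demo object is ever created.
result = pickle.loads(payload)
print(result)  # prints 42
```

The same mechanism applies to a model file: whoever created the file decides what code runs when it is loaded.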