How to Use joblib for Model Saving in Python
Use joblib.dump() to save a trained model to a file and joblib.load() to load it back in Python. This method efficiently stores sklearn models and other Python objects for reuse without retraining.
Syntax
The basic syntax for saving a model is joblib.dump(model, filename), where model is your trained model object and filename is the path to save the file. To load the model back, use model = joblib.load(filename).
- joblib.dump: Saves the model to disk.
- joblib.load: Loads the saved model from disk.
- filename: String path to the file where the model is saved or loaded from.
```python
import joblib

# Save model
joblib.dump(model, 'model_filename.joblib')

# Load model
model = joblib.load('model_filename.joblib')
```
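The same two calls also work for Python objects other than estimators. The following is a minimal sketch, assuming a hypothetical `settings` dictionary and a temporary directory so that no file is left behind. The variable names are illustrative only.

```python
import os
import tempfile

import joblib

# Hypothetical object to persist; any picklable Python object is handled the same way.
settings = {"n_estimators": 100, "features": ["sepal_length", "petal_width"]}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "settings.joblib")

    # dump() returns the list of file names it wrote.
    written = joblib.dump(settings, path)

    # load() rebuilds an equal object from the file.
    restored = joblib.load(path)

print(restored == settings)  # True
```

Writing to a temporary directory keeps the sketch self-contained. In a real project you would normally choose a stable path so the file can be loaded in a later session.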
Example
This example trains a simple sklearn model, saves it with joblib, and loads it back to make predictions.
```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
import joblib

# Load data
iris = load_iris()
X, y = iris.data, iris.target

# Train model
model = RandomForestClassifier(random_state=42)
model.fit(X, y)

# Save model
joblib.dump(model, 'rf_iris_model.joblib')

# Load model
loaded_model = joblib.load('rf_iris_model.joblib')

# Predict with loaded model
predictions = loaded_model.predict(X[:5])
print('Predictions:', predictions)
```
Output
Predictions: [0 0 0 0 0]
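One way to confirm that the round trip preserved the model is to compare the predictions of the original and the loaded object. This is a minimal sketch under the same setup as the example, with one assumption added for convenience: the file goes into a temporary directory rather than the working directory.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Train the same kind of model as in the example.
X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(random_state=42).fit(X, y)

# Save and reload through a temporary file (an assumption of this sketch).
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "rf_iris_model.joblib")
    joblib.dump(model, path)
    loaded_model = joblib.load(path)

# The loaded model should predict exactly what the original predicts.
same_predictions = np.array_equal(model.predict(X), loaded_model.predict(X))
print("Identical predictions:", same_predictions)  # Identical predictions: True
```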
Common Pitfalls
Common mistakes include:
- Saving the model before training it, which writes an untrained model to disk.
- Using inconsistent filenames when saving and loading.
- Not having joblib installed or imported.
- Trying to load a model file that does not exist or is corrupted.
Always ensure the model is trained before saving and use the exact filename when loading.
```python
import joblib
from sklearn.ensemble import RandomForestClassifier

# Wrong: saving before training
# model = RandomForestClassifier()
# joblib.dump(model, 'model.joblib')  # Model not trained yet

# Right: train first, then save
model = RandomForestClassifier()
model.fit([[0, 0], [1, 1]], [0, 1])
joblib.dump(model, 'model.joblib')
```
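Two of the pitfalls above can be guarded against in code. The sketch below is one possible approach, not part of joblib itself: `save_if_fitted` and `load_or_none` are hypothetical helper names, and the fitted check relies on sklearn's `check_is_fitted` utility.

```python
import os
import tempfile

import joblib
from sklearn.ensemble import RandomForestClassifier
from sklearn.exceptions import NotFittedError
from sklearn.utils.validation import check_is_fitted


def save_if_fitted(model, path):
    """Hypothetical helper: save the model only if it has been fitted."""
    try:
        check_is_fitted(model)
    except NotFittedError:
        return False
    joblib.dump(model, path)
    return True


def load_or_none(path):
    """Hypothetical helper: return None instead of failing on a missing file."""
    if not os.path.exists(path):
        return None
    return joblib.load(path)


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model.joblib")

    # An untrained model is rejected and nothing is written.
    saved_untrained = save_if_fitted(RandomForestClassifier(), path)

    # A trained model is saved normally.
    trained = RandomForestClassifier(random_state=0).fit([[0, 0], [1, 1]], [0, 1])
    saved_trained = save_if_fitted(trained, path)

    # A wrong filename yields None; the correct one yields the model.
    missing = load_or_none(os.path.join(tmp_dir, "does_not_exist.joblib"))
    loaded = load_or_none(path)

print(saved_untrained, saved_trained, missing is None, loaded is not None)
```

Returning `False` or `None` is only one design choice. Raising an exception with a clear message may suit larger projects better, and a corrupted file would still raise an error from `joblib.load` in this sketch.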
Quick Reference
| Function | Purpose | Example Usage |
|---|---|---|
| joblib.dump | Save model to file | joblib.dump(model, 'model.joblib') |
| joblib.load | Load model from file | model = joblib.load('model.joblib') |
Key Takeaways
Use joblib.dump() to save trained sklearn models efficiently to disk.
Load saved models with joblib.load() to reuse without retraining.
Always train your model before saving it with joblib.
Keep filenames consistent between saving and loading to avoid errors.
Ensure joblib is installed and imported before using its functions.