Model serialization formats (pickle, ONNX, TorchScript) in MLOps - Step-by-Step Execution

Process Flow - Model serialization formats (pickle, ONNX, TorchScript)
Train Model -> Choose Serialization Format -> Pickle -> Save Model to Disk -> Load Model from Disk -> Use Model for Inference
This flow shows training a model, choosing a serialization format, saving it, loading it back, and using it for predictions.
Execution Sample
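import pickle

# train_model() is a placeholder for your own training routine
model = train_model()

# Save: pickle writes bytes, so open the file in write-binary mode
with open('model.pkl', 'wb') as f:
    pickle.dump(model, f)

# Later: read the same bytes back in read-binary mode
with open('model.pkl', 'rb') as f:
    loaded_model = pickle.load(f)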
This code trains a model, saves it using pickle, then loads it back for use.
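The sample above leaves train_model() undefined. A minimal runnable sketch of the same round trip follows. The train_model() and predict() helpers and the dictionary-based "model" are hypothetical stand-ins, not part of any library, and a temporary directory is used so no file is left behind.

```python
import pickle
import tempfile
from pathlib import Path


def train_model():
    """Hypothetical stand-in for training: fit y = w * x through the origin."""
    xs = [1.0, 2.0, 3.0]
    ys = [2.0, 4.0, 6.0]
    w = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    return {"weight": w}


def predict(model, x):
    """Hypothetical inference helper for the stand-in model."""
    return model["weight"] * x


model = train_model()

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "model.pkl"

    # Serialize: pickle writes bytes, so the file is opened in 'wb' mode.
    with open(path, "wb") as f:
        pickle.dump(model, f)

    # Deserialize: read the same bytes back in 'rb' mode.
    with open(path, "rb") as f:
        loaded_model = pickle.load(f)

# The restored model should give the same prediction as the original.
same_prediction = predict(loaded_model, 4.0) == predict(model, 4.0)
```

In a real pipeline the load step usually runs in a different process from the save step, which is why the file on disk is the only thing the two steps share.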
Process Table
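| Step | Action | Format Used | File Created/Read | Result |
|------|--------|-------------|-------------------|--------|
| 1 | Train model | N/A | N/A | Model object created in memory |
| 2 | Serialize model | pickle | model.pkl (write) | Model saved as binary file |
| 3 | Deserialize model | pickle | model.pkl (read) | Model object restored in memory |
| 4 | Use model | N/A | N/A | Model predicts on new data |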
💡 Model saved and loaded successfully using pickle format
Status Tracker
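| Variable | Start | After Step 1 | After Step 2 | After Step 3 | After Step 4 |
|----------|-------|--------------|--------------|--------------|--------------|
| model | None | Trained model object | Trained model object | None | None |
| file handle | None | None | Open for write | Open for read | None |
| loaded_model | None | None | None | Restored model object | Restored model object |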
Key Moments - 3 Insights
Why do we need to open the file in 'wb' mode when saving the model?
Because pickle writes raw bytes rather than text. 'wb' opens the file in write-binary mode; in text mode ('w'), pickle.dump raises a TypeError (see Step 2 in the Process Table).
What happens if we try to load the model without opening the file in 'rb' mode?
pickle.load expects bytes. A file opened in text mode ('r') returns decoded strings, so loading typically fails with a TypeError or UnicodeDecodeError instead of restoring the model (see Step 3 in the Process Table).
Why can't we use pickle files directly in other frameworks like TorchScript or ONNX?
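A pickle file is a Python-specific byte stream that refers to Python classes by module and name, so only a Python process that can import those classes can read it. ONNX and TorchScript use their own file formats with their own exporters and loaders, and are designed for interoperability and optimized execution.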
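The file-mode questions above can be checked directly. The sketch below uses a plain dictionary as a hypothetical stand-in for a trained model and records which exception each wrong mode produces. The exact exception on the read side depends on the platform's default text encoding, so both likely types are caught.

```python
import os
import pickle
import tempfile

model = {"weight": 2.0}  # hypothetical stand-in for a trained model
errors = {}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.pkl")

    # Text mode on write: pickle produces bytes, but the file accepts only str.
    try:
        with open(path, "w") as f:
            pickle.dump(model, f)
    except TypeError as exc:
        errors["write"] = type(exc).__name__

    # Write the file correctly so there is something to read back.
    with open(path, "wb") as f:
        pickle.dump(model, f)

    # Text mode on read: the file decodes bytes to str before pickle sees them.
    try:
        with open(path, "r") as f:
            pickle.load(f)
    except (TypeError, UnicodeDecodeError) as exc:
        errors["read"] = type(exc).__name__
```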
Visual Quiz - 3 Questions
Test your understanding
Look at the Process Table. At which step is the model restored back into memory?
A. Step 3
B. Step 2
C. Step 1
D. Step 4
💡 Hint
Check the 'Result' column for when the model object is restored in memory.
According to the Status Tracker, what is the state of 'loaded_model' after Step 2?
A. Restored model object
B. None
C. Trained model object
D. File handle open
💡 Hint
Look at the 'loaded_model' row and the column 'After Step 2'.
If we change the serialization format from pickle to ONNX, which step in the Process Table would change?
A. Step 1
B. Step 2
C. Step 3
D. Step 4
💡 Hint
The serialization format is chosen when the model is saved to disk; loading must then use the matching loader.
Concept Snapshot
Model serialization saves trained models to files for later use.
Common formats:
- pickle: Python-specific, saves objects as binary.
- ONNX: Open format for interoperability across frameworks.
- TorchScript: PyTorch's optimized format for deployment.
With pickle, save in write-binary mode ('wb') and load in read-binary mode ('rb').
Choose format based on use case and compatibility.
Full Transcript
This visual execution shows how a machine learning model is trained, saved to disk using a serialization format like pickle, then loaded back for inference. The flow starts with training the model in memory, then choosing a format such as pickle, ONNX, or TorchScript. The model is saved to a file in binary mode and later loaded back by reading the file in binary mode. Variables like 'model' and 'loaded_model' change state as the model is saved and restored. Key moments clarify why file modes matter and compatibility differences between formats. The quiz tests understanding of when the model is restored, variable states, and how changing formats affects steps. The snapshot summarizes key points about serialization formats and usage.

Practice

1. Which model serialization format is Python-specific and not ideal for sharing models across different platforms?
easy
A. Pickle
B. ONNX
C. TorchScript
D. JSON

Solution

  1. Step 1: Understand Pickle's scope

    pickle is a module in Python's standard library that serializes Python objects; its files can only be read from Python.
  2. Step 2: Compare with other formats

    ONNX and TorchScript are designed for cross-platform use, unlike Pickle.
  3. Final Answer:

    Pickle -> Option A
  4. Quick Check:

    Python-only format = Pickle [OK]
Hint: Pickle = Python-only, others are cross-platform [OK]
Common Mistakes:
  • Confusing ONNX as Python-only
  • Thinking TorchScript is Python-specific
  • Selecting JSON which is not a model format
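One way to see why pickle is tied to Python is to look at what ends up in the byte stream. In the sketch below, TinyModel is a hypothetical class used only for illustration. The pickled bytes name the class but do not contain the code of its predict() method, so whoever loads the file needs the same class definition available in Python.

```python
import pickle


class TinyModel:
    """Hypothetical model class used only for illustration."""

    def __init__(self, weight):
        self.weight = weight

    def predict(self, x):
        return self.weight * x


payload = pickle.dumps(TinyModel(weight=3.0))

# The payload refers to the class by name...
names_class = b"TinyModel" in payload
# ...but the method itself is not stored in the bytes.
stores_method_code = b"predict" in payload

# Loading works here only because TinyModel is still defined in this process.
restored = pickle.loads(payload)
```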
2. Which of the following is the correct Python code snippet to save a PyTorch model using TorchScript?
easy
A. onnx.save(model, 'model.pt')
B. torch.save(model, 'model.pt')
C. pickle.dump(model, open('model.pt', 'wb'))
D. torch.jit.save(torch.jit.script(model), 'model.pt')

Solution

  1. Step 1: Identify TorchScript saving method

    TorchScript models are saved using torch.jit.save after scripting the model with torch.jit.script.
  2. Step 2: Check other options

    torch.save(model, 'model.pt') saves a PyTorch model with pickle, not as TorchScript. pickle.dump(model, open('model.pt', 'wb')) also uses pickle. onnx.save(model, 'model.pt') is valid syntax, but onnx.save expects an ONNX ModelProto, not a PyTorch module.
  3. Final Answer:

    torch.jit.save(torch.jit.script(model), 'model.pt') -> Option D
  4. Quick Check:

    TorchScript save = torch.jit.save + torch.jit.script [OK]
Hint: TorchScript save needs torch.jit.script before torch.jit.save [OK]
Common Mistakes:
  • Using torch.save instead of torch.jit.save
  • Passing a PyTorch module to onnx.save, which expects an ONNX ModelProto
  • Using pickle for TorchScript models
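A minimal sketch of the TorchScript save and load path from the correct answer, assuming PyTorch is installed. TinyNet is a hypothetical module defined only for this illustration, and a temporary directory stands in for real model storage.

```python
import os
import tempfile

import torch


class TinyNet(torch.nn.Module):
    """Hypothetical two-input linear model used only for illustration."""

    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(2, 1)

    def forward(self, x):
        return self.linear(x)


model = TinyNet().eval()
example = torch.ones(1, 2)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.pt")

    # Compile the module to TorchScript, then write the archive to disk.
    scripted = torch.jit.script(model)
    torch.jit.save(scripted, path)

    # torch.jit.load is the matching loader for a TorchScript archive.
    loaded = torch.jit.load(path)

    # The loaded module should reproduce the original model's output.
    with torch.no_grad():
        outputs_match = torch.allclose(model(example), loaded(example))
```

torch.jit.script needs access to the module's Python source, so this sketch is meant to be run from a file rather than pasted into an environment that hides the source.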
3. Given the following Python code snippet, what will be the output type of the loaded model?
import torch
import pickle

model = SomePyTorchModel()
# Save with pickle
with open('model.pkl', 'wb') as f:
    pickle.dump(model, f)

# Load model
with open('model.pkl', 'rb') as f:
    loaded_model = pickle.load(f)

print(type(loaded_model))
medium
A. <class 'torch.jit.ScriptModule'>
B. <class '__main__.SomePyTorchModel'>
C. <class 'onnx.ModelProto'>
D. TypeError

Solution

  1. Step 1: Understand pickle serialization

    Pickle reconstructs an instance of the same Python class, so the loaded model keeps the original class type. The class definition must be importable when loading.
  2. Step 2: Analyze output type

    Since model was saved with pickle, loaded_model is a new instance of the same class as the original model. The class is defined in the running script here, so its module is __main__.
  3. Final Answer:

    <class '__main__.SomePyTorchModel'> -> Option B
  4. Quick Check:

    Pickle load returns original Python object type [OK]
Hint: Pickle load returns original Python object type [OK]
Common Mistakes:
  • Confusing TorchScript or ONNX types with pickle load
  • Expecting a TorchScript or ONNX model type
  • Assuming a TypeError occurs on loading
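The type-preservation point can be checked without PyTorch. In this sketch SomeModel is a hypothetical plain Python class, and an in-memory buffer replaces the file on disk. The loaded object has the same class and the same state, but it is a separate object from the original.

```python
import io
import pickle


class SomeModel:
    """Hypothetical model class used only for illustration."""

    def __init__(self, weights):
        self.weights = weights


original = SomeModel([0.5, 1.5])

# An in-memory binary buffer behaves like a file opened in 'wb'/'rb' mode.
buffer = io.BytesIO()
pickle.dump(original, buffer)
buffer.seek(0)
loaded = pickle.load(buffer)

same_class = type(loaded) is SomeModel           # class is preserved
same_object = loaded is original                 # but it is a new instance
same_state = loaded.weights == original.weights  # with equal attributes
```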
4. You tried to load a model saved with TorchScript using pickle.load() and got an error. What is the most likely cause?
medium
A. TorchScript models cannot be loaded with pickle.load()
B. The model file is corrupted
C. pickle.load() requires the model to be saved as ONNX
D. TorchScript models must be loaded with torch.load()

Solution

  1. Step 1: Understand serialization compatibility

    torch.jit.save writes a TorchScript archive (a zip file), not a Python pickle stream, so pickle.load() cannot parse it.
  2. Step 2: Identify correct loading method

    TorchScript models should be loaded with torch.jit.load(), not pickle.load().
  3. Final Answer:

    TorchScript models cannot be loaded with pickle.load() -> Option A
  4. Quick Check:

    pickle.load() incompatible with TorchScript [OK]
Hint: TorchScript needs torch.jit.load(), not pickle.load() [OK]
Common Mistakes:
  • Reaching for torch.load() instead of torch.jit.load() for TorchScript archives
  • Thinking ONNX is required for pickle.load()
  • Blaming file corruption without checking method
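The failure can be reproduced without PyTorch. A TorchScript archive is a zip file, so the sketch below builds a small zip archive in memory as a hypothetical stand-in (the entry name and contents are made up) and passes it to pickle.load(). The load fails because the bytes are not a pickle stream, not because the archive is corrupted.

```python
import io
import pickle
import zipfile

# Hypothetical stand-in for a TorchScript archive: any zip file will do here.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("archive/data.pkl", b"placeholder")

# Zip files start with the two-byte signature "PK".
starts_like_zip = buffer.getvalue()[:2] == b"PK"

buffer.seek(0)
try:
    pickle.load(buffer)
    load_failed = False
except Exception as exc:  # typically pickle.UnpicklingError
    load_failed = True
    error_name = type(exc).__name__

# The archive itself is intact: zipfile can still open and list it.
buffer.seek(0)
archive_is_valid = zipfile.is_zipfile(buffer)
```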
5. You want to deploy a PyTorch model to a production environment that does not have Python installed. Which serialization format should you choose and why?
hard
A. Pickle, because it is simple and fast
B. JSON, because it stores model weights efficiently
C. TorchScript, because it can run independently of Python
D. ONNX, because it is Python-only and easy to use

Solution

  1. Step 1: Identify deployment constraints

    The environment lacks Python, so the model format must run without Python dependencies.
  2. Step 2: Compare serialization formats

    Pickle requires a Python interpreter. ONNX is cross-platform but needs an ONNX-compatible runtime on the target. TorchScript runs without Python through PyTorch's C++ runtime (LibTorch).
  3. Step 3: Choose best fit

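    TorchScript is designed for deployment without Python, making it the best choice here. Option D names a format that could also run without Python, but its stated reason (that ONNX is Python-only) is false.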
  4. Final Answer:

    TorchScript, because it can run independently of Python -> Option C
  5. Quick Check:

    Deploy without Python = TorchScript [OK]
Hint: No Python? Use TorchScript for standalone deployment [OK]
Common Mistakes:
  • Choosing Pickle which needs Python
  • Confusing ONNX as Python-only
  • Selecting JSON which is not a model format