Model serialization formats (pickle, ONNX, TorchScript) in MLOps - Time & Space Complexity
When saving machine learning models, the time it takes depends on the format used. We want to understand how this saving time grows as the model size increases.
How does the time to serialize a model change when the model gets bigger?
Analyze the time complexity of the following code snippet.
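```python
import torch
import pickle

# Assume model is a trained PyTorch model

def save_pickle(model, path):
    # Pickle the whole Python object, including a reference to its class.
    with open(path, 'wb') as f:
        pickle.dump(model, f)

def save_torchscript(model, path):
    # Compile the model to TorchScript, then write the scripted module.
    scripted = torch.jit.script(model)
    scripted.save(path)

def save_onnx(model, path, input_sample):
    # Run the model on a sample input and export the resulting graph as ONNX.
    torch.onnx.export(model, input_sample, path)
```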
This code saves a model using three formats: pickle, TorchScript, and ONNX.
Identify the loops, recursion, and array traversals that repeat.
- Primary operation: Traversing the model's parameters and structure to serialize data.
- How many times: Each parameter and layer is processed once during serialization.
As the model size (number of parameters) grows, the time to save grows roughly in proportion.
| Input Size (number of parameters) | Approx. Operations |
|---|---|
| 10,000 | 10,000 operations |
| 100,000 | 100,000 operations |
| 1,000,000 | 1,000,000 operations |
Pattern observation: Doubling the model size roughly doubles the serialization time.
Time Complexity: O(n)
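Here n is the number of parameters: the time to save the model grows linearly with the model size. TorchScript and ONNX also do some work to compile or export the model's graph, but writing the parameters dominates for large models.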
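To see the linear pattern without a real model, the following minimal sketch pickles plain Python lists of floats as a stand-in for parameter tensors and reports payload size and elapsed time. The helper name `pickled_size` and the list-of-floats stand-in are assumptions for illustration; real frameworks add their own per-format overhead, and timings vary by machine.

```python
import pickle
import time


def pickled_size(num_params):
    """Pickle a list of `num_params` floats and return (bytes, seconds)."""
    # Stand-in for model weights: one Python float per "parameter".
    weights = [float(i) for i in range(num_params)]
    start = time.perf_counter()
    payload = pickle.dumps(weights)
    elapsed = time.perf_counter() - start
    return len(payload), elapsed


if __name__ == "__main__":
    for n in (10_000, 100_000, 1_000_000):
        size, seconds = pickled_size(n)
        print(f"{n:>9} params -> {size:>9} bytes in {seconds:.4f} s")
```

Payload size is the steadier signal here: ten times as many parameters gives roughly ten times as many bytes, and the time to write them follows.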
[X] Wrong: "Serialization time is constant no matter the model size."
[OK] Correct: Larger models have more data to process, so saving takes more time.
Understanding how serialization time grows helps you design efficient model deployment pipelines and choose the right format for your needs.
"What if we used a streaming serialization method that writes data in chunks? How would the time complexity change?"
Practice
Solution
Step 1: Understand Pickle's scope
Pickle is a Python library that serializes Python objects but is limited to Python environments.
Step 2: Compare with other formats
ONNX and TorchScript are designed for cross-platform use, unlike Pickle.
Final Answer:
Pickle -> Option A
Quick Check:
Python-only format = Pickle [OK]
- Mistaking ONNX for a Python-only format
- Thinking TorchScript is Python-specific
- Selecting JSON which is not a model format
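The reason pickle is tied to Python can be seen in the bytes it produces. The sketch below pickles a `collections.OrderedDict` (used here only as a convenient stand-in for a model's saved state; the key names are made up) and checks that the payload names the Python class by its import path, which only a Python interpreter can resolve.

```python
import pickle
from collections import OrderedDict

# Hypothetical stand-in for a model's saved state.
state = OrderedDict([("layer1.weight", [0.1, 0.2]), ("layer1.bias", [0.0])])

payload = pickle.dumps(state)

# The pickle stream refers to the class by module and name rather than
# storing a language-neutral description of the data.
names_class = b"collections" in payload and b"OrderedDict" in payload

if __name__ == "__main__":
    print("payload names its Python class:", names_class)
```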
Solution
Step 1: Identify TorchScript saving method
TorchScript models are saved with torch.jit.save (or the scripted module's own save method) after scripting the model with torch.jit.script.
Step 2: Check other options
torch.save(model, 'model.pt') saves a PyTorch model through pickle, not as TorchScript. pickle.dump(model, open('model.pt', 'wb')) also uses pickle. onnx.save(model, 'model.pt') expects an ONNX ModelProto rather than a PyTorch module, so it does not produce TorchScript either.
Final Answer:
torch.jit.save(torch.jit.script(model), 'model.pt') -> Option D
Quick Check:
TorchScript save = torch.jit.save + torch.jit.script [OK]
- Using torch.save instead of torch.jit.save
- Passing a PyTorch model to onnx.save, which expects an ONNX ModelProto
- Using pickle for TorchScript models
```python
import torch
import pickle

model = SomePyTorchModel()

# Save with pickle
with open('model.pkl', 'wb') as f:
    pickle.dump(model, f)

# Load model
with open('model.pkl', 'rb') as f:
    loaded_model = pickle.load(f)

print(type(loaded_model))
```
Solution
Step 1: Understand pickle serialization
Pickle stores the object's state together with a reference to its class, so unpickling rebuilds an instance of the original class. The class definition must be importable when the file is loaded.
Step 2: Analyze output type
Since model was saved with pickle, loaded_model is the same class as the original model.
Final Answer:
<class '__main__.SomePyTorchModel'> -> Option B
Quick Check:
Pickle load returns original Python object type [OK]
- Confusing TorchScript or ONNX types with pickle load
- Expecting a TorchScript or ONNX model type
- Assuming a TypeError occurs on loading
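The round trip can be tried without PyTorch. This minimal sketch uses `types.SimpleNamespace` as a hypothetical stand-in for a model object (its attribute names are invented for illustration) and checks that unpickling returns a new instance of the same class with the same state.

```python
import pickle
from types import SimpleNamespace

# Hypothetical stand-in for a model object; a real model would be an
# instance of a class defined in your own code.
model = SimpleNamespace(name="demo", weights=[0.1, 0.2, 0.3])

loaded_model = pickle.loads(pickle.dumps(model))

# Unpickling rebuilds an instance of the same class with the same state.
same_class = type(loaded_model) is type(model)
same_state = loaded_model == model

if __name__ == "__main__":
    print(type(loaded_model))
    print("same class:", same_class, "| same state:", same_state)
```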
pickle.load() and got an error. What is the most likely cause?
Solution
Step 1: Understand serialization compatibility
TorchScript models are saved in their own archive format and cannot be loaded by pickle.load(), which expects a Python pickle stream.
Step 2: Identify correct loading method
TorchScript models should be loaded with torch.jit.load(), not pickle.load().
Final Answer:
TorchScript models cannot be loaded with pickle.load() -> Option A
Quick Check:
pickle.load() incompatible with TorchScript [OK]
- Assuming torch.load() works for TorchScript
- Thinking ONNX is required for pickle.load()
- Blaming file corruption without checking method
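The mismatch can be illustrated without PyTorch. The sketch below assumes, as a simplification, that a saved TorchScript file is a zip-based archive and builds a small zip in memory as a stand-in; the entry name `archive/data.pkl` and the helper names are hypothetical. The pickle loader rejects the zip bytes, while a genuine pickle stream loads normally.

```python
import io
import pickle
import zipfile


def can_unpickle(data):
    """Return True if `data` loads as a pickle stream, False otherwise."""
    try:
        pickle.loads(data)
    except Exception:
        # Unpickling can raise UnpicklingError and several other exception
        # types, so any failure is treated as "not a pickle".
        return False
    return True


def make_zip_bytes():
    """Build a tiny in-memory zip archive (stand-in for a saved archive)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("archive/data.pkl", b"placeholder")
    return buffer.getvalue()


if __name__ == "__main__":
    print("zip archive loads with pickle:", can_unpickle(make_zip_bytes()))
    print("real pickle loads with pickle:", can_unpickle(pickle.dumps({"a": 1})))
```

The file on disk is not corrupt in this situation; it is simply in a format that the chosen loader does not read.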
Solution
Step 1: Identify deployment constraints
The environment lacks Python, so the model format must run without Python dependencies.
Step 2: Compare serialization formats
Pickle requires a Python interpreter. ONNX is cross-platform but needs an ONNX runtime on the target. TorchScript can run without Python through PyTorch's C++ runtime (LibTorch).
Step 3: Choose best fit
TorchScript is designed for deployment without Python, making it the best choice here.
Final Answer:
TorchScript, because it can run independently of Python -> Option C
Quick Check:
Deploy without Python = TorchScript [OK]
- Choosing Pickle which needs Python
- Mistaking ONNX for a Python-only format
- Selecting JSON which is not a model format
