
Model serialization formats (pickle, ONNX, TorchScript) in MLOps - Time & Space Complexity

Time Complexity: Model serialization formats (pickle, ONNX, TorchScript)
O(n)
Understanding Time Complexity

When saving machine learning models, the time it takes depends on the format used. We want to understand how this saving time grows as the model size increases.

How does the time to serialize a model change when the model gets bigger?

Scenario Under Consideration

Analyze the time complexity of the following code snippet.


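import pickle

import torch

# Assume model is a trained PyTorch model (an nn.Module instance)

def save_pickle(model, path):
    # Pickle the whole Python object; loading needs the model's class to be importable
    with open(path, 'wb') as f:
        pickle.dump(model, f)

def save_torchscript(model, path):
    # Compile the module to TorchScript, then write it to a single file
    scripted = torch.jit.script(model)
    scripted.save(path)

def save_onnx(model, path, input_sample):
    # Export to ONNX; input_sample is an example input for the model
    torch.onnx.export(model, input_sample, path)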
This code saves a model using three formats: pickle, TorchScript, and ONNX.

Identify Repeating Operations

Identify the loops, recursion, and array traversals that repeat.

  • Primary operation: Traversing the model's parameters and structure to serialize data.
  • How many times: Each parameter and layer is processed once during serialization.
How Execution Grows With Input

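As the model size (number of parameters) grows, the time to save grows roughly in proportion, because every parameter's bytes must be written once. TorchScript and ONNX add compilation or export work that depends on the number of layers and operations rather than on the raw parameter count.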

Input Size (number of parameters) | Approx. Operations
10,000                            | 10,000 operations
100,000                           | 100,000 operations
1,000,000                         | 1,000,000 operations

Pattern observation: Doubling the model size roughly doubles the serialization time.
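
One way to check this pattern without a real model is to pickle a flat array of floats and watch the payload grow. This is a minimal sketch: fake_parameters and measure are hypothetical helpers, and a standard-library array stands in for real model weights (an assumption, since PyTorch tensors are serialized differently in detail).

```python
import pickle
import time
from array import array


def fake_parameters(n):
    """Return n single-precision zeros as a stand-in for model weights (assumption)."""
    return array("f", [0.0]) * n


def measure(n):
    """Pickle n fake parameters; return (payload size in bytes, elapsed seconds)."""
    params = fake_parameters(n)
    start = time.perf_counter()
    payload = pickle.dumps(params)
    elapsed = time.perf_counter() - start
    return len(payload), elapsed


if __name__ == "__main__":
    for n in (10_000, 100_000, 1_000_000):
        size, elapsed = measure(n)
        print(f"{n:>9} parameters: {size:>9} bytes, {elapsed * 1e3:.3f} ms")
```

The payload size is the more reliable signal here. Wall-clock times for small inputs are noisy, so the linear trend in time tends to show up only at the larger sizes.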

Final Time Complexity

Time Complexity: O(n)

This means the time to save the model grows linearly with the model size.

Common Mistake

[X] Wrong: "Serialization time is constant no matter the model size."

[OK] Correct: Larger models have more data to process, so saving takes more time.

Interview Connect

Understanding how serialization time grows helps you design efficient model deployment pipelines and choose the right format for your needs.

Self-Check

"What if we used a streaming serialization method that writes data in chunks? How would the time complexity change?"

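As a starting point for this question, the sketch below writes an already-serialized payload in fixed-size chunks. write_in_chunks is a hypothetical helper, not part of pickle or PyTorch. Every byte is still written exactly once, so the total work stays proportional to the payload size; what chunking changes is how much data is handed to each write call.

```python
import io
import pickle


def write_in_chunks(payload, out, chunk_size=1 << 20):
    """Write payload to out in chunk_size pieces; return the number of writes."""
    view = memoryview(payload)  # slicing a memoryview avoids copying each chunk
    writes = 0
    for start in range(0, len(view), chunk_size):
        out.write(view[start:start + chunk_size])
        writes += 1
    return writes


if __name__ == "__main__":
    payload = pickle.dumps(list(range(100_000)))
    buffer = io.BytesIO()
    writes = write_in_chunks(payload, buffer, chunk_size=64 * 1024)
    print(f"{len(payload)} bytes written in {writes} chunks")
```
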
Practice

1. Which model serialization format is Python-specific and not ideal for sharing models across different platforms?
easy
A. Pickle
B. ONNX
C. TorchScript
D. JSON

Solution

  1. Step 1: Understand Pickle's scope

    Pickle is a Python library that serializes Python objects but is limited to Python environments.
  2. Step 2: Compare with other formats

    ONNX and TorchScript models can be loaded outside Python, for example by ONNX Runtime or by LibTorch in C++, unlike Pickle.
  3. Final Answer:

    Pickle -> Option A
  4. Quick Check:

    Python-only format = Pickle [OK]
Hint: Pickle = Python-only, others are cross-platform [OK]
Common Mistakes:
  • Confusing ONNX as Python-only
  • Thinking TorchScript is Python-specific
  • Selecting JSON which is not a model format
2. Which of the following is the correct Python code snippet to save a PyTorch model using TorchScript?
easy
A. onnx.save(model, 'model.pt')
B. torch.save(model, 'model.pt')
C. pickle.dump(model, open('model.pt', 'wb'))
D. torch.jit.save(torch.jit.script(model), 'model.pt')

Solution

  1. Step 1: Identify TorchScript saving method

    TorchScript models are saved using torch.jit.save after scripting the model with torch.jit.script.
  2. Step 2: Check other options

    torch.save(model, 'model.pt') saves a PyTorch model but not as TorchScript. pickle.dump(model, open('model.pt', 'wb')) uses pickle. onnx.save(model, 'model.pt') is valid syntax but fails here, because onnx.save expects an ONNX ModelProto, not a PyTorch module.
  3. Final Answer:

    torch.jit.save(torch.jit.script(model), 'model.pt') -> Option D
  4. Quick Check:

    TorchScript save = torch.jit.save + torch.jit.script [OK]
Hint: TorchScript save needs torch.jit.script before torch.jit.save [OK]
Common Mistakes:
  • Using torch.save instead of torch.jit.save
  • Passing a PyTorch module to onnx.save, which expects an ONNX ModelProto
  • Using pickle for TorchScript models
3. Given the following Python code snippet, what will be the output type of the loaded model?
import torch
import pickle

model = SomePyTorchModel()
# Save with pickle
with open('model.pkl', 'wb') as f:
    pickle.dump(model, f)

# Load model
with open('model.pkl', 'rb') as f:
    loaded_model = pickle.load(f)

print(type(loaded_model))
medium
A. <class 'torch.jit.ScriptModule'>
B. <class '__main__.SomePyTorchModel'>
C. <class 'onnx.ModelProto'>
D. TypeError

Solution

  1. Step 1: Understand pickle serialization

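    Pickle records a reference to the object's class together with its state, so the loaded model is an instance of the original class. The class definition must be importable when the file is loaded.
  2. Step 2: Analyze output type

    Assuming SomePyTorchModel is defined in the script being run, loaded_model has the same class as the original model and its type prints with the __main__ module prefix.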
  3. Final Answer:

    <class '__main__.SomePyTorchModel'> -> Option B
  4. Quick Check:

    Pickle load returns original Python object type [OK]
Hint: Pickle load returns original Python object type [OK]
Common Mistakes:
  • Confusing TorchScript or ONNX types with pickle load
  • Expecting a TorchScript or ONNX model type
  • Assuming a TypeError occurs on loading
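
The round trip can be tried without PyTorch. In this sketch an OrderedDict of plain lists stands in for a model (an assumption, chosen because it needs only the standard library), and round_trip is a hypothetical helper. It also shows that the pickle stream names the class rather than embedding its code, which is why the class has to be importable at load time.

```python
import pickle
from collections import OrderedDict


def round_trip(obj):
    """Pickle obj to bytes and load it back; return (payload, restored object)."""
    payload = pickle.dumps(obj)
    return payload, pickle.loads(payload)


if __name__ == "__main__":
    # Stand-in for a model's weights (assumption): layer name -> list of floats
    weights = OrderedDict(layer1=[0.1, 0.2], layer2=[0.3])
    payload, restored = round_trip(weights)
    print(type(restored))             # class of the restored object
    print(restored == weights)        # whether the contents survived
    print(b"OrderedDict" in payload)  # whether the stream refers to the class by name
```
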
4. You tried to load a model saved with TorchScript using pickle.load() and got an error. What is the most likely cause?
medium
A. TorchScript models cannot be loaded with pickle.load()
B. The model file is corrupted
C. pickle.load() requires the model to be saved as ONNX
D. TorchScript models must be loaded with torch.load()

Solution

  1. Step 1: Understand serialization compatibility

    torch.jit.save writes a zip archive rather than a raw pickle stream, so pickle.load(), which expects Python pickle format, cannot parse it.
  2. Step 2: Identify correct loading method

    TorchScript models should be loaded with torch.jit.load(), not pickle.load().
  3. Final Answer:

    TorchScript models cannot be loaded with pickle.load() -> Option A
  4. Quick Check:

    pickle.load() incompatible with TorchScript [OK]
Hint: TorchScript needs torch.jit.load(), not pickle.load() [OK]
Common Mistakes:
  • Assuming torch.load() works for TorchScript
  • Thinking ONNX is required for pickle.load()
  • Blaming file corruption without checking method
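
Before blaming corruption, it can help to look at what kind of file is on disk. The sketch below relies on two facts: files written by torch.jit.save are zip archives, and pickle streams using protocol 2 or newer begin with the byte 0x80. sniff_format and make_stand_in_archive are hypothetical helpers, and a zip built with the standard library stands in for a real TorchScript file (an assumption).

```python
import io
import pickle
import zipfile


def sniff_format(data):
    """Guess the container format of serialized bytes from their contents."""
    if zipfile.is_zipfile(io.BytesIO(data)):
        return "zip archive"
    if data[:1] == b"\x80":
        return "pickle stream"
    return "unknown"


def make_stand_in_archive():
    """Build a small in-memory zip as a stand-in for a TorchScript file (assumption)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("model/data.pkl", b"placeholder")
    return buffer.getvalue()


if __name__ == "__main__":
    print(sniff_format(pickle.dumps({"weights": [1.0, 2.0]})))
    print(sniff_format(make_stand_in_archive()))
```

A zip result does not by itself prove the file is TorchScript, since torch.save also writes zip archives in recent PyTorch versions. It only rules out a raw pickle stream, which is enough to explain why pickle.load() fails.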
5. You want to deploy a PyTorch model to a production environment that does not have Python installed. Which serialization format should you choose and why?
hard
A. Pickle, because it is simple and fast
B. JSON, because it stores model weights efficiently
C. TorchScript, because it can run independently of Python
D. ONNX, because it is Python-only and easy to use

Solution

  1. Step 1: Identify deployment constraints

    The environment lacks Python, so the model format must run without Python dependencies.
  2. Step 2: Compare serialization formats

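    Pickle requires Python. ONNX is cross-platform but needs an ONNX-compatible runtime such as ONNX Runtime. TorchScript models can be loaded from C++ through LibTorch, PyTorch's C++ runtime, so no Python interpreter is needed.
  3. Step 3: Choose best fit

    TorchScript is designed for deployment without Python, and option D's claim that ONNX is Python-only is false, so option C is the best choice here.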
  4. Final Answer:

    TorchScript, because it can run independently of Python -> Option C
  5. Quick Check:

    Deploy without Python = TorchScript [OK]
Hint: No Python? Use TorchScript for standalone deployment [OK]
Common Mistakes:
  • Choosing Pickle which needs Python
  • Confusing ONNX as Python-only
  • Selecting JSON which is not a model format