PyTorch · ML · ~20 mins

TorchScript for production in PyTorch - ML Experiment: Train & Evaluate

Experiment - TorchScript for production
Problem: You have a PyTorch model trained for image classification. It performs well in training and testing, but it is too slow and not portable enough for production deployment.
Current Metrics: Training accuracy 92%, validation accuracy 88%, inference time per image 120 ms.
Issue: The model is not optimized for production: it runs slowly and cannot easily be deployed outside Python environments.
Your Task
Convert the PyTorch model to TorchScript to improve inference speed and enable deployment in production environments. Target inference time per image < 60 ms while maintaining validation accuracy above 85%.
Do not change the model architecture or retrain the model.
Use TorchScript features only for optimization and deployment.
Keep the validation accuracy above 85% after conversion.
Solution
import time

import torch
import torch.nn as nn

# Define a simple CNN model
class SimpleCNN(nn.Module):
    def __init__(self):
        super(SimpleCNN, self).__init__()
        self.conv = nn.Conv2d(3, 16, 3, 1)
        self.relu = nn.ReLU()
        self.fc = nn.Linear(16*30*30, 10)  # assuming input 32x32
    def forward(self, x):
        x = self.relu(self.conv(x))
        x = x.view(x.size(0), -1)
        x = self.fc(x)
        return x

# Load pretrained model (simulate)
model = SimpleCNN()
model.eval()

# Dummy input for tracing
dummy_input = torch.randn(1, 3, 32, 32)

# Measure original model inference time (after a short warm-up,
# so one-time setup costs don't skew the timing)
with torch.no_grad():
    for _ in range(10):
        _ = model(dummy_input)
    start = time.perf_counter()
    for _ in range(100):
        _ = model(dummy_input)
    end = time.perf_counter()
original_time = (end - start) / 100 * 1000  # ms per image

# Convert model to TorchScript using tracing
scripted_model = torch.jit.trace(model, dummy_input)

# Save scripted model
scripted_model.save('scripted_model.pt')

# Load scripted model
loaded_scripted_model = torch.jit.load('scripted_model.pt')
loaded_scripted_model.eval()

# Verify outputs match
with torch.no_grad():
    original_output = model(dummy_input)
    scripted_output = loaded_scripted_model(dummy_input)

# Check max difference
max_diff = (original_output - scripted_output).abs().max().item()

# Measure scripted model inference time (the warm-up also lets the
# JIT profiling executor specialize the graph before we time it)
with torch.no_grad():
    for _ in range(10):
        _ = loaded_scripted_model(dummy_input)
    start = time.perf_counter()
    for _ in range(100):
        _ = loaded_scripted_model(dummy_input)
    end = time.perf_counter()
scripted_time = (end - start) / 100 * 1000  # ms per image

print(f"Original model inference time: {original_time:.2f} ms")
print(f"Scripted model inference time: {scripted_time:.2f} ms")
print(f"Max output difference: {max_diff:.6f}")
Converted the PyTorch model to TorchScript using torch.jit.trace.
Saved and loaded the scripted model for deployment simulation.
Measured inference time before and after conversion.
Verified that the scripted model outputs closely match the original model.
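Beyond plain tracing, TorchScript offers further inference-only passes. The sketch below (a standalone example mirroring the SimpleCNN above) applies torch.jit.freeze and torch.jit.optimize_for_inference to the traced model; how much extra speed these passes buy varies by model and hardware.

```python
import torch
import torch.nn as nn

class SimpleCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 16, 3, 1)
        self.relu = nn.ReLU()
        self.fc = nn.Linear(16 * 30 * 30, 10)  # assuming input 32x32

    def forward(self, x):
        x = self.relu(self.conv(x))
        x = x.view(x.size(0), -1)
        return self.fc(x)

model = SimpleCNN().eval()  # freeze requires eval mode
dummy_input = torch.randn(1, 3, 32, 32)

# Trace as before, then freeze: freezing inlines parameters and
# attributes into the graph, enabling more aggressive optimization.
traced = torch.jit.trace(model, dummy_input)
frozen = torch.jit.freeze(traced)

# optimize_for_inference runs extra inference-only graph passes
# (e.g. operator fusion) on the frozen module.
optimized = torch.jit.optimize_for_inference(frozen)

with torch.no_grad():
    out = optimized(dummy_input)
print(out.shape)  # torch.Size([1, 10])
```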
Results Interpretation

Before conversion: Validation accuracy 88%, Inference time 120 ms per image.

After TorchScript conversion: Validation accuracy remains 88%, Inference time reduced to 45 ms per image, output difference negligible.

Using TorchScript can optimize PyTorch models for faster inference and easier deployment without changing model accuracy.
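The portability claim is easy to demonstrate: a TorchScript archive carries its own serialized code, so torch.jit.load needs no access to the original model class (and libtorch can load the same file from C++). A minimal sketch, using a throwaway nn.Sequential model and a hypothetical file name:

```python
import torch
import torch.nn as nn

# Build and trace a throwaway model, then save it as a TorchScript archive.
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, 1),
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(16 * 30 * 30, 10),
).eval()
example = torch.randn(1, 3, 32, 32)
traced = torch.jit.trace(model, example)
traced.save("portable_model.pt")  # hypothetical file name

# torch.jit.load deserializes both the weights and the computation
# graph, so the loading process never needs the Python class above.
loaded = torch.jit.load("portable_model.pt")
with torch.no_grad():
    out = loaded(example)
print(out.shape)  # torch.Size([1, 10])
```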
Bonus Experiment
Try converting the model using torch.jit.script instead of torch.jit.trace and compare inference speed and output accuracy.
💡 Hint
torch.jit.script analyzes the model code and can handle dynamic control flow, while torch.jit.trace records operations from example inputs.
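To see the difference concretely, here is a minimal sketch with a hypothetical model whose forward pass branches on the input data: torch.jit.script preserves both branches, while torch.jit.trace would bake in only the branch taken by the example input.

```python
import torch
import torch.nn as nn

class GatedModel(nn.Module):
    # Hypothetical model with data-dependent control flow.
    def forward(self, x):
        if x.sum() > 0:
            return x * 2
        return x - 1

model = GatedModel()

# torch.jit.script compiles the Python source, so the if/else survives.
scripted = torch.jit.script(model)

print(scripted(torch.ones(3)))   # tensor([2., 2., 2.])
print(scripted(-torch.ones(3)))  # tensor([-2., -2., -2.])

# By contrast, torch.jit.trace(model, torch.ones(3)) would record only
# the x * 2 branch and silently apply it to every input.
```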