Experiment - Why deployment serves predictions

Problem: You have trained a PyTorch model to classify images, and you want to understand why deploying it to serve predictions matters for real users.

Current Metrics: Training accuracy: 95%, validation accuracy: 92%. The model is not deployed yet.

Issue: The model performs well in training and validation, but users cannot reach it because nothing serves its predictions.

Your Task

Deploy the trained PyTorch model to serve predictions through a simple API so users can send images and get classification results.

Use PyTorch for the model.

Use a lightweight web framework like Flask or FastAPI.

Keep the deployment code simple and clear.

Solution

PyTorch

import io

import torch
import torch.nn as nn
import torch.nn.functional as F
from fastapi import FastAPI, File, UploadFile
from PIL import Image
from torchvision import transforms

# Define the model class (example: simple CNN for 10 classes)

class SimpleCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 16, 3, 1)
        self.conv2 = nn.Conv2d(16, 32, 3, 1)
        self.fc1 = nn.Linear(32 * 5 * 5, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.max_pool2d(x, 2)
        x = F.relu(self.conv2(x))
        x = F.max_pool2d(x, 2)
        x = torch.flatten(x, 1)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x

# Load the trained model weights
model = SimpleCNN()
model.load_state_dict(torch.load('model.pth', map_location=torch.device('cpu')))
model.eval()

# Define image transforms
transform = transforms.Compose([
    transforms.Resize((28, 28)),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

app = FastAPI()

@app.post('/predict')
async def predict(file: UploadFile = File(...)):
    image_bytes = await file.read()
    image = Image.open(io.BytesIO(image_bytes)).convert('RGB')
    img_t = transform(image).unsqueeze(0)  # Add batch dimension

    with torch.no_grad():
        outputs = model(img_t)
        _, predicted = torch.max(outputs, 1)

    return {'predicted_class': int(predicted.item())}

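# To run this app, save it as app.py, install the dependencies, and start the server.
# FastAPI needs python-multipart to parse file uploads.
# pip install fastapi uvicorn python-multipart torch torchvision pillow
# uvicorn app:app --reload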
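Once the server is running, a client can send an image to the /predict endpoint and read back the class index. The following is a minimal sketch, not part of the original solution. It assumes the `requests` package is installed and that uvicorn is listening on its default address, 127.0.0.1:8000. The helper name `classify_image` is hypothetical.

```python
import requests

# Assumption: the app was started with `uvicorn app:app` on the default host and port.
API_URL = "http://127.0.0.1:8000/predict"


def classify_image(path, url=API_URL, timeout=10):
    """Upload one image file and return the predicted class index."""
    with open(path, "rb") as f:
        # The form field is named "file" to match the endpoint's parameter name.
        response = requests.post(url, files={"file": f}, timeout=timeout)
    # Raise on 4xx/5xx responses instead of trying to parse an error body.
    response.raise_for_status()
    return response.json()["predicted_class"]
```

With the server up, calling `classify_image("cat.png")` (any local image file; the name here is only an example) should return an integer between 0 and 9 for the 10-class model above.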

Created a simple CNN model class matching the trained model.

Loaded the trained weights with map_location so they load on a CPU-only machine, then set the model to evaluation mode.

Built a FastAPI app with a /predict endpoint to accept image uploads.

Processed uploaded images with transforms before prediction.

Normalized all 3 RGB channels with a per-channel mean and std (0.5 each), rather than the single value a 1-channel grayscale image would use.

Returned the predicted class as JSON to serve predictions.
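Two of the choices listed above can be checked numerically: what the normalization does to pixel values, and why the first fully connected layer expects 32 * 5 * 5 inputs. The sketch below is an illustration rather than part of the solution. It reproduces the Normalize arithmetic by hand and uses freshly initialised layers with the same shapes as SimpleCNN, so no trained weights are needed.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Normalize with mean 0.5 and std 0.5 computes (pixel - mean) / std per channel,
# which maps ToTensor's [0, 1] range onto [-1, 1].
mean = torch.tensor([0.5, 0.5, 0.5]).view(3, 1, 1)
std = torch.tensor([0.5, 0.5, 0.5]).view(3, 1, 1)
pixels = torch.tensor([0.0, 0.25, 0.5, 1.0]).view(1, 2, 2).repeat(3, 1, 1)
normalized = (pixels - mean) / std
print(normalized[0].flatten().tolist())  # [-1.0, -0.5, 0.0, 1.0]

# Trace a 28x28 RGB input through layers shaped like SimpleCNN's to see
# where the 32 * 5 * 5 input size of fc1 comes from.
conv1 = nn.Conv2d(3, 16, 3, 1)
conv2 = nn.Conv2d(16, 32, 3, 1)
x = torch.zeros(1, 3, 28, 28)
x = F.max_pool2d(F.relu(conv1(x)), 2)  # 28 -> 26 (3x3 conv, no padding) -> 13
shape_after_block1 = tuple(x.shape)
x = F.max_pool2d(F.relu(conv2(x)), 2)  # 13 -> 11 -> 5 (pooling rounds down)
shape_after_block2 = tuple(x.shape)
flat_features = torch.flatten(x, 1).shape[1]
print(shape_after_block1, shape_after_block2, flat_features)
# (1, 16, 13, 13) (1, 32, 5, 5) 800
```

If the Resize target changes from 28x28, the flattened size changes too, and fc1 would need to be resized and the model retrained to match.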

Results Interpretation

Before deployment: Model had high accuracy but was not accessible to users.

After deployment: Model serves predictions through an API, making it usable in real applications.

Deploying a trained model to serve predictions is essential to make the model useful for real-world users. It bridges the gap between training and practical use by providing an interface for input and output.

Bonus Experiment

Extend the deployment to accept multiple images at once and return predictions for all.

💡 Hint

Modify the API endpoint to accept a list of files, process each image in a loop, and return a list of predicted classes.