TorchServe turns your PyTorch models into web services, making it simple to share and serve models without writing custom serving code.
TorchServe setup in PyTorch
torch-model-archiver --model-name <model_name> --version <version> --serialized-file <model_file> --handler <handler_file> --extra-files <extra_files>
torchserve --start --model-store <model_store_dir> --models <model_name>=<model_name>.mar
The torch-model-archiver command packages your model into a .mar file for TorchServe.
The torchserve command starts the server and loads your packaged model.
torch-model-archiver --model-name resnet18 --version 1.0 --serialized-file resnet18.pth --handler image_classifier --extra-files index_to_name.json
torchserve --start --model-store model_store --models resnet18=resnet18.mar

torch-model-archiver --model-name mymodel --version 2.1 --serialized-file model.pt --handler custom_handler.py
torchserve --start --model-store model_store --models mymodel=mymodel.mar

This code saves a pretrained ResNet18 model, shows how to package and start TorchServe (commands as comments), and runs a local prediction to demonstrate expected output.
import torch
from torchvision import models, transforms
from PIL import Image
import requests
import json

# Step 1: Download and save a pretrained model
model = models.resnet18(pretrained=True)
model.eval()

# Save the model
torch.save(model.state_dict(), 'resnet18.pth')

# Step 2: Create model archive (run in terminal, shown here as comment)
# torch-model-archiver --model-name resnet18 --version 1.0 --serialized-file resnet18.pth --handler image_classifier --extra-files index_to_name.json

# Step 3: Start TorchServe (run in terminal, shown here as comment)
# torchserve --start --model-store model_store --models resnet18=resnet18.mar

# Step 4: Send a test request to the server
# (Make sure TorchServe is running and model is loaded)

# Download a sample image
url = 'https://pytorch.org/assets/images/dog.jpg'
image = Image.open(requests.get(url, stream=True).raw)

# Prepare image for model
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225])
])
input_tensor = preprocess(image)
input_batch = input_tensor.unsqueeze(0)  # create mini-batch

# Load class index to name mapping
with open('index_to_name.json') as f:
    idx_to_label = json.load(f)

# Run prediction locally (for demo, since server call needs network)
with torch.no_grad():
    output = model(input_batch)
probabilities = torch.nn.functional.softmax(output, dim=1)[0]

# Get top 1 prediction
top_prob, top_catid = torch.topk(probabilities, 1)
print(f"Predicted class: {idx_to_label[str(top_catid.item())]} with probability {top_prob.item():.4f}")
Make sure to install TorchServe and torch-model-archiver via pip before starting.
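For reference, a typical install looks like the following (package names as published on PyPI; note that TorchServe also requires a Java 11+ runtime on the machine):

```shell
# Install the serving runtime and the packaging tool
pip install torchserve torch-model-archiver
```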
The handler defines how input data is processed and output is returned; use built-in handlers or write your own.
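To illustrate the handler contract, here is a minimal standalone sketch of the preprocess → inference → postprocess flow. This toy class and its echo logic are hypothetical; a real custom handler would subclass `ts.torch_handler.base_handler.BaseHandler`, load the model in `initialize`, and run it in `inference`:

```python
import json


class EchoHandler:
    """Toy stand-in for a TorchServe handler (hypothetical, for illustration)."""

    def initialize(self, context=None):
        # A real handler would load the model weights here using the context
        self.initialized = True

    def preprocess(self, data):
        # TorchServe passes a list of requests; each carries a payload under 'body'
        return [json.loads(d["body"]) for d in data]

    def inference(self, inputs):
        # Stand-in for calling the model; here we just uppercase the input text
        return [{"prediction": x["text"].upper()} for x in inputs]

    def postprocess(self, outputs):
        # Must return a list with one entry per incoming request
        return outputs

    def handle(self, data, context=None):
        return self.postprocess(self.inference(self.preprocess(data)))


handler = EchoHandler()
handler.initialize()
print(handler.handle([{"body": '{"text": "hello"}'}]))
```

Built-in handlers such as `image_classifier` implement the same flow for common input types, so you only need a custom handler when your pre/post-processing differs from the defaults.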
Model archive files (.mar) are stored in the model store directory for TorchServe to load.
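Setting up the model store is just creating a directory and placing archives in it (directory and file names below are examples):

```shell
# Create the model store directory TorchServe will read from
mkdir -p model_store
# After torch-model-archiver produces the archive, move it in:
# mv resnet18.mar model_store/
```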
TorchServe packages PyTorch models into .mar files for easy deployment.
Use torch-model-archiver to create the model archive and torchserve to start the server.
Once running, you can send data to the server to get predictions from your model.
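As a sketch of such a request, assuming the server is running locally on TorchServe's default inference port 8080 with the resnet18 model loaded (the image file name dog.jpg is an example):

```shell
# POST an image to the inference API; the response is JSON with class predictions
curl http://localhost:8080/predictions/resnet18 -T dog.jpg
```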