
How to Use TorchServe for Serving PyTorch Models

To use TorchServe, first package your PyTorch model into a .mar archive using torch-model-archiver. Then start the server with torchserve and deploy the model for inference via REST API calls.
📝 Syntax

Using TorchServe involves three main commands:

  • torch-model-archiver: Packages your PyTorch model and handler into a .mar file.
  • torchserve: Starts the model server with the packaged model.
  • curl (or any HTTP client): Sends inference requests to the server.

Basic syntax:

```bash
torch-model-archiver --model-name <name> --version <version> --serialized-file <model_path> --handler <handler_file> --export-path <export_dir>
torchserve --start --model-store <export_dir> --models <name>=<name>.mar
curl -X POST http://127.0.0.1:8080/predictions/<name> -T <input_data>
```

A concrete example:

```bash
torch-model-archiver --model-name mymodel --version 1.0 --serialized-file model.pt --handler image_classifier --export-path model_store
torchserve --start --model-store model_store --models mymodel=mymodel.mar
curl -X POST http://127.0.0.1:8080/predictions/mymodel -T input.jpg
```
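Once the server is running, TorchServe also exposes a health check on the inference port and a management API (port 8081 by default) for listing loaded models. A quick sanity check, assuming the default ports:

```shell
# Health check on the inference API (default port 8080)
curl http://127.0.0.1:8080/ping

# List the models currently loaded (management API, default port 8081)
curl http://127.0.0.1:8081/models

# Stop the server when you are done
torchserve --stop
```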
💻 Example

This example shows how to package a simple PyTorch model, start TorchServe, and send an inference request.

```python
import torch
import torch.nn as nn

# Define a simple model
class SimpleModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(10, 2)

    def forward(self, x):
        return self.linear(x)

# Save the model's weights. For an eager-mode model like this one,
# torch-model-archiver also needs the model class definition, passed
# via --model-file (here, a model.py containing SimpleModel).
model = SimpleModel()
torch.save(model.state_dict(), 'model.pt')

# Package the model (run in terminal). The built-in image_classifier
# handler is shown for illustration; a non-image model like this would
# normally use a custom handler:
# torch-model-archiver --model-name simplemodel --version 1.0 --model-file model.py --serialized-file model.pt --handler image_classifier --export-path model_store

# Start TorchServe (run in terminal):
# torchserve --start --model-store model_store --models simplemodel=simplemodel.mar

# Send inference request (run in terminal):
# curl -X POST http://127.0.0.1:8080/predictions/simplemodel -T input.jpg
```
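Instead of curl, you can send the inference request from Python. The sketch below builds the request with only the standard library; the helper name and payload are illustrative, and actually sending the request requires a running TorchServe instance:

```python
import urllib.request

def build_prediction_request(model_name, payload,
                             host="127.0.0.1", port=8080):
    """Build a POST request to TorchServe's predictions endpoint.

    Hypothetical helper for illustration; the endpoint format
    /predictions/<model_name> matches the curl example above.
    """
    url = f"http://{host}:{port}/predictions/{model_name}"
    return urllib.request.Request(url, data=payload, method="POST")

req = build_prediction_request("simplemodel", b"raw input bytes")
print(req.full_url)      # http://127.0.0.1:8080/predictions/simplemodel
print(req.get_method())  # POST

# To actually send it (requires a running server):
# with urllib.request.urlopen(req) as resp:
#     print(resp.read())
```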
⚠️ Common Pitfalls

  • Not packaging the model correctly with torch-model-archiver causes loading errors.
  • Forgetting to start TorchServe before sending requests leads to connection failures.
  • Using incompatible handler files or missing dependencies can cause inference errors.
  • Incorrect input format in requests results in bad predictions or errors.

Always test your handler and model locally before packaging.
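One cheap local test is to reload the saved weights into a fresh model instance and run a forward pass, mirroring what the handler will do inside TorchServe. A minimal sketch, assuming the SimpleModel from the example above:

```python
import torch
import torch.nn as nn

class SimpleModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(10, 2)

    def forward(self, x):
        return self.linear(x)

# Save the weights, then reload them into a fresh instance -- the same
# round trip the serving handler performs at load time.
torch.save(SimpleModel().state_dict(), 'model.pt')
restored = SimpleModel()
restored.load_state_dict(torch.load('model.pt'))
restored.eval()

# Run a forward pass on a dummy batch to confirm the shapes line up.
with torch.no_grad():
    out = restored(torch.randn(1, 10))
print(out.shape)  # torch.Size([1, 2])
```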

```bash
## Wrong: missing handler file
torch-model-archiver --model-name mymodel --serialized-file model.pt --export-path model_store

## Right: include the handler
torch-model-archiver --model-name mymodel --serialized-file model.pt --handler image_classifier --export-path model_store
```
📊 Quick Reference

| Command | Purpose | Example |
| --- | --- | --- |
| `torch-model-archiver` | Package model into `.mar` file | `torch-model-archiver --model-name mymodel --version 1.0 --serialized-file model.pt --handler image_classifier --export-path model_store` |
| `torchserve` | Start model server | `torchserve --start --model-store model_store --models mymodel=mymodel.mar` |
| `curl` POST | Send inference request | `curl -X POST http://127.0.0.1:8080/predictions/mymodel -T input.jpg` |
✅ Key Takeaways

  • Package your PyTorch model into a .mar file using torch-model-archiver before serving.
  • Start TorchServe with the model store and specify the models to load for inference.
  • Send inference requests to the running TorchServe server via its REST API.
  • Ensure your handler file matches your model type and input format.
  • Test your model and handler locally to avoid common deployment errors.