AWS Marketplace
Using TorchServe to list PyTorch models at scale in AWS Marketplace
Recently, AWS announced the release of TorchServe, an open-source model serving framework for PyTorch developed in collaboration with Facebook. PyTorch is an open-source machine learning framework created by Facebook that is popular among ML researchers and data scientists. Despite its ease of use and “Pythonic” interface, deploying and managing PyTorch models in production is still difficult, because it requires data scientists to write custom prediction APIs and scale them. With TorchServe, data scientists and data engineers can deploy and host their machine learning models without writing custom code.
TorchServe provides a convenient framework for AWS Marketplace sellers to list their products without writing their own endpoint controllers and handlers. Before the release of TorchServe, if you wanted to list a PyTorch machine learning model, you needed to develop custom handlers and build your own Docker image. You also had to figure out how to make correct API calls in and out of the container network and solve other one-off problems in developing the model server. TorchServe simplifies this process, and the whole listing can be completed in fewer than 10 minutes.
We previously published a blog post about how to host a PyTorch machine learning model in Amazon SageMaker. In this blog post, I further explore TorchServe with SageMaker and show how to use TorchServe to list PyTorch models at scale in AWS Marketplace.
Prerequisites
This solution has the following prerequisites:
- An active AWS account
- IAM roles and policies to access AWS services:
  - You need a role with the AmazonSageMakerFullAccess, AmazonS3FullAccess, and AmazonEC2ContainerRegistryFullAccess policies attached
  - For more information, see Adding and removing IAM identity permissions in the AWS Identity and Access Management User Guide
- AWS services: Amazon SageMaker, Amazon S3, and Amazon ECR
Solution overview
I use an end-to-end notebook to demonstrate the listing process. The notebook is available at https://github.com/aws-samples/aws-marketplace-machine-learning. You can clone the repository and use the listing-torchserve-models.ipynb notebook to create your PyTorch products. The notebook comes with a Dockerfile and other configuration and script files to build the Docker image for TorchServe. For more information, see Docker image and Dockerfile reference on the Docker Docs website.
The following steps show you how to:
- install TorchServe
- create a Docker image of TorchServe
- create a model archive format (.mar) file from a PyTorch data format (.pth) file
- create a SageMaker model package with the TorchServe Docker image and model archive file
- validate it and list your product in AWS Marketplace
You can run the steps on a SageMaker notebook instance, an Amazon EC2 instance, or your own computer in a terminal window. If you’re running in your local environment, install and configure the AWS CLI, the AWS SDK for Python (Boto3), and the Amazon SageMaker Python SDK.
I recommend running the steps on a SageMaker notebook instance so that you can use the provided sample notebook. For more information, see Use Amazon SageMaker Notebook Instances in the Amazon SageMaker Developer Guide.
Solution walkthrough
Step 0: Update the AWS CLI, AWS SDK, and Amazon SageMaker SDK
Before you begin, update the AWS CLI, the AWS SDK, and the SageMaker SDK. You can run the following commands in a terminal window or the notebook:
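A minimal sketch of that cell might look like the following:

```python
# Run in a notebook cell; the leading "!" executes a shell command.
# Upgrade the AWS CLI, the AWS SDK for Python (boto3), and the SageMaker Python SDK.
!pip install --upgrade awscli boto3 sagemaker
```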
Step 1: Git clone TorchServe and install the model archiver
As the first step, you need to download and install TorchServe and its model archiver. The model archiver provides the torch-model-archiver tool, which converts model data from a PyTorch data format (.pth) file to a model archive format (.mar) file. TorchServe uses the model archive file to host the model server.
Run the following command in the notebook to clone the serve repository and install the model archiver. The serve/examples/ directory contains different default handlers; use them to launch your model:
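A sketch of that cell, assuming you work from the notebook’s root directory:

```python
# Clone the TorchServe repository and install the torch-model-archiver tool.
!git clone https://github.com/pytorch/serve.git
!pip install ./serve/model-archiver/
```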
Step 2: Build a TorchServe Docker image and push it to Amazon ECR
AWS Marketplace and SageMaker require the seller to provide the machine learning product in a container for better data privacy and intellectual property (IP) protection. In this step, you create your own Docker image of TorchServe and push it to Amazon ECR. You can reuse the same Docker image for various models and listings.
a. Create a boto3 session and get the account
Run the following code in the notebook:
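A minimal sketch of that cell:

```python
import boto3

# Create a boto3 session and look up the current account ID and Region;
# both are needed to address the Amazon ECR registry later.
session = boto3.session.Session()
region = session.region_name
account_id = session.client("sts").get_caller_identity()["Account"]
print(account_id, region)
```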
b. Create an Amazon ECR repository through the AWS CLI
Name the repository torchserve-base because you can use it for different PyTorch model listings. To do that, run the following command in the notebook:
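A sketch of that cell:

```python
# Create the Amazon ECR repository that holds the TorchServe base image.
!aws ecr create-repository --repository-name torchserve-base
```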
c. Build the Docker image and push it to Amazon ECR
i. Name your image label v1 and push it to the registry by running the following command in the notebook:
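A sketch of that cell, assuming the Dockerfile sits in the current directory; the variables build on step 2.a:

```python
registry_name = "torchserve-base"
image_label = "v1"
image_uri = f"{account_id}.dkr.ecr.{region}.amazonaws.com/{registry_name}:{image_label}"

# Authenticate Docker to Amazon ECR, then build, tag, and push the image.
!aws ecr get-login-password --region {region} | docker login --username AWS --password-stdin {account_id}.dkr.ecr.{region}.amazonaws.com
!docker build -t {registry_name}:{image_label} .
!docker tag {registry_name}:{image_label} {image_uri}
!docker push {image_uri}
```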
ii. Scan your Docker image in Amazon ECR after you push the image. To do that, do the following:
- Sign in to the AWS Management Console and navigate to Amazon ECR.
- Choose the repository you created in step 2.b, select the image, and then choose Scan to scan your image.
- The scan status shows as Complete after the scan finishes. The following screenshot shows the TorchServe image with a tag of v1 and a scan status of Complete.
Congratulations! You have created a TorchServe Docker image.
Step 3: Create a TorchServe model archive with a PyTorch model and upload it to Amazon S3
TorchServe needs a specific model archive format (.mar) file to serve the model. The torch-model-archiver tool can convert the model from a .pth file to a .mar file. You don’t need to create a custom handler; instead, you can specify one of the built-in handlers with an option such as --handler image_classifier, and the torch-model-archiver tool sets it up for you. TorchServe supports default handlers for image classification, image segmentation, object detection, and text classification. Most machine learning models fall into these categories. You can also write your own custom handler to serve your model.
After converting the model file, you must compress it into tar format (tar.gz) and upload it to an Amazon S3 bucket.
a. Create a TorchServe archive with a PyTorch model (your own model or a downloaded version)
In this notebook, I recommend downloading a DenseNet-161 model for demonstration.
i. First, download the model from PyTorch.org and name the file densenet161. To do this, run the following command in the notebook:
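A sketch of that cell; the checkpoint URL is the torchvision DenseNet-161 one, so verify it against the current PyTorch docs:

```python
# Download the pretrained DenseNet-161 weights as densenet161.pth.
!wget -q https://download.pytorch.org/models/densenet161-8d451a50.pth -O densenet161.pth
```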
The model file is in .pth format. You can also use your own trained version here instead of the downloaded one, if you prefer.
ii. Second, convert the .pth model file to .mar by running the following code in the notebook:
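A sketch of that cell; the model file and label mapping paths follow the DenseNet example in the cloned serve repository:

```python
# Package the .pth weights into a .mar archive with the built-in image_classifier handler.
!torch-model-archiver --model-name densenet161 --version 1.0 --model-file serve/examples/image_classifier/densenet_161/model.py --serialized-file densenet161.pth --extra-files serve/examples/image_classifier/index_to_name.json --handler image_classifier
```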
iii. To view the created .mar files, run the following command in the notebook:
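For example:

```python
# List the generated model archive files in the working directory.
!ls *.mar
```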
b. Upload the generated .mar archive file to Amazon S3
SageMaker expects models to be in a tar.gz file. You need to compress the .mar file into a tar.gz file and then upload the model to your default SageMaker S3 bucket, in the models directory.
To do this, run the following code in the notebook:
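A sketch of that cell, assuming the densenet161.mar file from the previous step:

```python
import sagemaker

# Wrap the .mar file in a tar.gz archive, as SageMaker expects,
# then upload it to the default SageMaker bucket under models/.
!tar cvzf densenet161.tar.gz densenet161.mar

sagemaker_session = sagemaker.Session()
model_data = sagemaker_session.upload_data(path="densenet161.tar.gz", key_prefix="models")
print(model_data)
```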
Step 4: Deploy an endpoint and make a prediction using Amazon SageMaker SDK
Now you can create a SageMaker model with the TorchServe Docker image and the PyTorch model file that you created. You can then create a real-time endpoint based on the SageMaker model through the SageMaker SDK. This helps you make sure that TorchServe actually works as the base image for model files.
a. Create a SageMaker model with the TorchServe Docker image and model file
Run the following code in the notebook and then get the SageMaker execution role and create a model:
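A sketch of that cell; image_uri and model_data come from steps 2 and 3, and the model name is an arbitrary choice:

```python
from sagemaker import get_execution_role
from sagemaker.model import Model
from sagemaker.predictor import Predictor

# Create a SageMaker model from the TorchServe image and the model archive in S3.
role = get_execution_role()
sm_model = Model(
    image_uri=image_uri,
    model_data=model_data,
    role=role,
    name="torchserve-densenet161",
    sagemaker_session=sagemaker_session,
    predictor_cls=Predictor,  # lets deploy() return a Predictor object
)
```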
b. Deploy an endpoint with the SageMaker model that you created
In your notebook, specify the endpoint name and use SageMaker SDK to deploy an endpoint by executing the following code:
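A sketch of that cell; the instance type is an assumption, so pick one that fits your model:

```python
# Deploy a real-time inference endpoint backed by the SageMaker model.
endpoint_name = "torchserve-densenet161-endpoint"
predictor = sm_model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge",
    endpoint_name=endpoint_name,
)
```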
c. Test the TorchServe hosted endpoint
Use a public image from Amazon S3 to test the endpoint. To do that, enter the following into the notebook:
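A sketch of that cell; the kitten image URL is a commonly used public test image, so substitute any JPEG you like:

```python
import boto3

# Fetch a test image and send it to the endpoint as a binary payload.
!wget -q https://s3.amazonaws.com/model-server/inputs/kitten.jpg
with open("kitten.jpg", "rb") as f:
    payload = f.read()

runtime = boto3.client("sagemaker-runtime")
response = runtime.invoke_endpoint(
    EndpointName=endpoint_name,
    ContentType="application/x-image",
    Body=payload,
)
print(response["Body"].read().decode())
```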
You should receive a JSON response listing the top predicted classes and their probabilities.
d. Delete the endpoint
To avoid unnecessary billing, delete the endpoint that you created by running the following command in the notebook:
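A sketch of that cell:

```python
# Delete the endpoint to stop incurring charges.
predictor.delete_endpoint()
```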
Step 5: Test the batch transform on SageMaker before listing the PyTorch model in AWS Marketplace
After you make sure that the inference endpoint works, you can test the batch transform job on the SageMaker model that you created. SageMaker also conducts batch transform validation as part of creating the model package before you can list your models in AWS Marketplace.
This step helps you get familiar with the batch transform process and detect potential bugs in the model. First, prepare your transform input data and upload it to an Amazon S3 bucket. Then create the batch transform job through the SageMaker SDK.
a. Create the batch transform input folder
To create the batch transform input folder, in the notebook, enter the following:
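A sketch of that cell; it reuses the kitten image from step 4, but any JPEG inputs work:

```python
# Create a local folder with test images for the batch transform job.
!mkdir -p batch_inputs
!cp kitten.jpg batch_inputs/
```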
b. Upload the batch transform input folder to an Amazon S3 bucket
To upload the batch transform input folder to an Amazon S3 bucket, enter the following in the notebook:
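A sketch of that cell:

```python
# Upload the batch input folder to the default SageMaker bucket.
batch_input = sagemaker_session.upload_data(path="batch_inputs", key_prefix="batch_inputs")
print(batch_input)
```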
c. Create the batch transform job in SageMaker
Use the SageMaker SDK to create the transform job. Wait for the transform job to end. To do that, enter the following in the notebook:
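A sketch of that cell; the instance type and strategy are assumptions:

```python
# Create a transform job from the SageMaker model and wait for it to finish.
transformer = sm_model.transformer(
    instance_count=1,
    instance_type="ml.m5.xlarge",
    strategy="SingleRecord",
)
transformer.transform(data=batch_input, content_type="application/x-image")
transformer.wait()
```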
Congratulations! The batch transform succeeded.
Step 6: Create the model package
Now you can start creating your own model package for listing. Before you create it, you need to specify several fields for the inference specification and the model package validation specification. Creating the model package also performs the validation job.
a. Create the model package inference specification
To create the model package inference specification, specify several fields in the pre-defined inference specification template that I provide. To do that, enter the following in the notebook:
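The pre-defined template isn’t reproduced here; the following sketch shows the fields it fills in, with instance types and content types assumed for an image classifier:

```python
# Inference specification for create_model_package.
# image_uri and model_data come from the earlier steps.
inference_specification = {
    "Containers": [
        {
            "Image": image_uri,
            "ModelDataUrl": model_data,
        }
    ],
    "SupportedTransformInstanceTypes": ["ml.m5.xlarge"],
    "SupportedRealtimeInferenceInstanceTypes": ["ml.m5.xlarge"],
    "SupportedContentTypes": ["application/x-image"],
    "SupportedResponseMIMETypes": ["application/json"],
}
```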
b. Create the model package validation specification
Specify several fields in the pre-defined model package validation specification template that I provide. To do that, enter the following in the notebook:
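Again, a sketch of the fields the template fills in; the profile name and output path are assumptions:

```python
# Validation specification: SageMaker runs this batch transform job
# to validate the package before it can be listed.
validation_specification = {
    "ValidationRole": role,
    "ValidationProfiles": [
        {
            "ProfileName": "densenet161-validation",
            "TransformJobDefinition": {
                "TransformInput": {
                    "DataSource": {
                        "S3DataSource": {
                            "S3DataType": "S3Prefix",
                            "S3Uri": batch_input,
                        }
                    },
                    "ContentType": "application/x-image",
                },
                "TransformOutput": {
                    "S3OutputPath": f"s3://{sagemaker_session.default_bucket()}/validation-output",
                },
                "TransformResources": {
                    "InstanceType": "ml.m5.xlarge",
                    "InstanceCount": 1,
                },
            },
        }
    ],
}
```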
c. Create the model package with inference specification and validation specification
To create a model package using SageMaker SDK, run the following code in the notebook:
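A sketch of that cell; the package name is an arbitrary choice:

```python
sm_client = boto3.client("sagemaker")

# CertifyForMarketplace must be True for an AWS Marketplace listing.
sm_client.create_model_package(
    ModelPackageName="torchserve-densenet161",
    InferenceSpecification=inference_specification,
    ValidationSpecification=validation_specification,
    CertifyForMarketplace=True,
)
```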
Creating the model package is an asynchronous process. To check its status, run the following command in the notebook:
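A sketch of that cell:

```python
# The status moves to "Completed" once validation passes.
response = sm_client.describe_model_package(ModelPackageName="torchserve-densenet161")
print(response["ModelPackageStatus"])
```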
Congratulations! You have created a model package for listing in AWS Marketplace.
Step 7: Create another model package
This step demonstrates that the same TorchServe Docker image works for different PyTorch model packages. Download a VGG-11 model and create a corresponding model package. You can also use your own model.
a. Create the .mar file and upload it to Amazon S3
To do this, run the following code in the notebook:
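A sketch of that cell; the checkpoint URL is the torchvision VGG-11 one, and the --model-file path is a placeholder for a small Python file that defines the VGG-11 architecture:

```python
# Download the VGG-11 weights, archive them, and upload the tar.gz to S3.
!wget -q https://download.pytorch.org/models/vgg11-bbd30ac9.pth -O vgg11.pth
!torch-model-archiver --model-name vgg11 --version 1.0 --model-file model.py --serialized-file vgg11.pth --extra-files serve/examples/image_classifier/index_to_name.json --handler image_classifier
!tar cvzf vgg11.tar.gz vgg11.mar
vgg_model_data = sagemaker_session.upload_data(path="vgg11.tar.gz", key_prefix="models")
```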
b. Create a new model and its corresponding inference specification and validation specification
To create the model and the specifications, run the following code in the notebook:
c. Create the model package with inference specification and validation specification
To create the model package using SageMaker SDK, run the following code in the notebook:
To check the creation status, run the following code in the notebook:
Congratulations! You have created another model package for listing.
Step 8: List your model package in the AWS Marketplace Management Portal
After you create your model packages, they appear on the SageMaker console. On the SageMaker console, choose Model packages in the left panel to see the model packages you just created. Select a model package and choose Publish new ML Marketplace listing. In the following screenshot, I selected the torchserve-vgg11 model package and chose Publish new ML Marketplace listing from the Actions menu in the upper right.
You’re redirected to the AWS Marketplace Management Portal. To start publishing your first PyTorch model, follow the instructions in the Management Portal. For more information, see Machine learning products in the AWS Marketplace Seller Guide.
Conclusion
In this blog post, I showed you how to use TorchServe to create two machine learning model package listings (with the same TorchServe Docker image) for AWS Marketplace. TorchServe provides a convenient and flexible way to host an endpoint for PyTorch machine learning models. TorchServe supports a variety of default handlers, and you can also write your own handler to better support your unique model and then update your Docker image.
About the Author
Rick Cao is a machine learning engineer with the AWS Marketplace machine learning group. He enjoys applying cutting-edge technologies and building AI/ML solutions with cloud architecture to solve business needs. Prior to joining AWS, Rick gained over five years of experience in the financial industry. He has a master’s degree in computer science (machine learning major) and a master’s degree in financial engineering.