Deploy a SageMaker Model with Amazon API Gateway

Create a secure, scalable REST API in front of your Amazon SageMaker model endpoint. This guide shows how to use Amazon API Gateway to proxy requests to the endpoint, so clients can call your ML model through a standard HTTPS URL.

Intermediate · 30 min · 5 steps

The play

Prerequisite: Deploy a SageMaker Endpoint
Before creating an API, you need a real-time inference endpoint running in Amazon SageMaker. You can deploy one from a trained model using the AWS Console or SDK. Note the full name of your endpoint (e.g., 'my-model-endpoint-20231027').
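Before moving on, it can help to confirm that the endpoint is actually in service. The following is a minimal sketch, assuming boto3 is installed and AWS credentials are configured. The helper name `endpoint_in_service` is illustrative, and the client is passed in as an argument so the function can be exercised without AWS access.

```python
def endpoint_in_service(sagemaker_client, endpoint_name):
    """Return True if the named SageMaker endpoint reports status 'InService'."""
    description = sagemaker_client.describe_endpoint(EndpointName=endpoint_name)
    return description["EndpointStatus"] == "InService"


# Example usage (requires boto3 and AWS credentials; the names are placeholders):
#   import boto3
#   client = boto3.client("sagemaker", region_name="us-east-1")
#   print(endpoint_in_service(client, "my-model-endpoint-20231027"))
```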
Create an IAM Role for API Gateway
API Gateway needs permission to invoke your SageMaker endpoint. Create a new IAM role. For 'Trusted entity type', select 'AWS service' and choose 'API Gateway'. Attach a permissions policy that allows the 'sagemaker:InvokeEndpoint' action on the ARN of your specific endpoint only. Note the ARN of the role, because the integration setup asks for it.
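The two policy documents behind this role can be sketched as plain JSON. This is an illustration under stated assumptions, not the only valid policy: the function names are hypothetical, and the region, account ID, and endpoint name below are placeholders you would replace with your own values.

```python
import json


def build_trust_policy():
    """Trust policy that lets the API Gateway service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "apigateway.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def build_invoke_policy(region, account_id, endpoint_name):
    """Permissions policy limited to InvokeEndpoint on a single endpoint."""
    endpoint_arn = f"arn:aws:sagemaker:{region}:{account_id}:endpoint/{endpoint_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "sagemaker:InvokeEndpoint",
                "Resource": endpoint_arn,
            }
        ],
    }


# Placeholder values for illustration only.
print(json.dumps(build_trust_policy(), indent=2))
print(json.dumps(
    build_invoke_policy("us-east-1", "123456789012", "my-model-endpoint-20231027"),
    indent=2,
))
```

Scoping 'Resource' to one endpoint ARN, rather than '*', keeps the role from invoking any other endpoint in the account.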
Create a New REST API
In the API Gateway console, click 'Create API' and choose 'REST API' (not 'HTTP API' or 'WebSocket API'). Select 'New API', give it a name like 'SageMakerProxyAPI', and keep the 'Endpoint Type' as 'Regional'.
Configure the Method and Integration
Create a resource (e.g., '/invocations') and add a POST method to it. For 'Integration type', select 'AWS Service'. Set 'AWS Region' to your SageMaker endpoint's region and 'AWS Service' to 'SageMaker Runtime', the service that handles inference requests. Leave 'AWS Subdomain' blank. Set 'HTTP method' to POST. For 'Action Type', choose 'Use path override' and set 'Path override (optional)' to 'endpoints/YOUR_ENDPOINT_NAME/invocations', replacing YOUR_ENDPOINT_NAME with the endpoint name you noted earlier. Finally, in 'Execution role', provide the ARN of the IAM role you created for API Gateway.
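If you later script this integration instead of using the console, the region, service, and path override fields are typically combined into a single integration URI. The sketch below assumes the general API Gateway URI pattern 'arn:aws:apigateway:{region}:{subdomain.service}:path/{service_api}' with 'runtime.sagemaker' as the service. The function names are hypothetical, so check the resulting URI against the API Gateway documentation before relying on it.

```python
def sagemaker_path_override(endpoint_name):
    """Path override value entered in the console for a given endpoint."""
    return f"endpoints/{endpoint_name}/invocations"


def sagemaker_integration_uri(region, endpoint_name):
    """Integration URI, assuming the documented 'path/{service_api}' pattern."""
    path = sagemaker_path_override(endpoint_name)
    return f"arn:aws:apigateway:{region}:runtime.sagemaker:path/{path}"


# Placeholder endpoint name for illustration only.
print(sagemaker_path_override("my-model-endpoint-20231027"))
print(sagemaker_integration_uri("us-east-1", "my-model-endpoint-20231027"))
```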
Deploy and Test the API
From the 'Actions' menu, select 'Deploy API' and create a new deployment stage (e.g., 'v1'). After deploying, the stage shows an 'Invoke URL'. Append your resource path to it (e.g., '/invocations') to get the full public URL of your API. You can now send POST requests to that URL with your model's expected payload and a matching 'Content-Type' header.
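The full request URL follows the same pattern as the placeholder in the starter code: the API ID, the region, the stage name, and then the resource path. A small sketch of assembling it, where `build_invoke_url` is a hypothetical helper and the values passed in are placeholders:

```python
def build_invoke_url(api_id, region, stage, resource_path):
    """Join the stage's Invoke URL with the resource path."""
    base = f"https://{api_id}.execute-api.{region}.amazonaws.com/{stage}"
    # Strip slashes so '/invocations' and 'invocations' produce the same URL.
    return f"{base}/{resource_path.strip('/')}"


# Placeholder values for illustration only.
print(build_invoke_url("YOUR_API_ID", "us-east-1", "v1", "/invocations"))
```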

Starter code

import requests
import json

# 1. Replace with the Invoke URL from your API Gateway deployment stage
API_GATEWAY_URL = "https://YOUR_API_ID.execute-api.us-east-1.amazonaws.com/v1/invocations"

# 2. Define the payload your SageMaker model expects.
# This is just an example for a simple text classification model.
# The content type and body format must match what your SageMaker endpoint requires.
headers = {
    "Content-Type": "application/json"
}

# Example for a model that takes a JSON object with a key 'instances'
payload = {
    "instances": [
        {"features": [1.0, 2.5, 0.5, 1.2]},
        {"features": [0.8, 1.9, 0.4, 1.0]}
    ]
}

# For a model that takes a simple CSV string, you might change the Content-Type and payload:
# headers = {"Content-Type": "text/csv"}
# payload = "1.0,2.5,0.5,1.2"

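def invoke_endpoint(url, payload, headers):
    """Sends a POST request to the API Gateway endpoint."""
    # Serialize dicts and lists to JSON; send strings (e.g., CSV) unchanged
    # so they are not wrapped in JSON quotes.
    body = payload if isinstance(payload, (str, bytes)) else json.dumps(payload)
    try:
        response = requests.post(url, data=body, headers=headers, timeout=30)
        response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)
    except requests.exceptions.RequestException as e:
        print(f"Request failed: {e}")
        return None

    print("Status Code:", response.status_code)
    try:
        result = response.json()
    except ValueError:
        # The body is not valid JSON (e.g., the model returns CSV or plain text).
        print("Response text:", response.text)
        return response.text
    print("Response JSON:", result)
    return result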

if __name__ == "__main__":
    if "YOUR_API_ID" in API_GATEWAY_URL:
        print("Please update the 'API_GATEWAY_URL' variable with your API's invoke URL.")
    else:
        invoke_endpoint(API_GATEWAY_URL, payload, headers)