Article·replicate.com
AI · inference · API · machine-learning · python
Replicate
Quickly run open-source AI models (image, text, audio) via a simple API on the Replicate platform. This Action Pack guides you through setting up your environment and making your first API call to run an AI model.
beginner · 15 min · 5 steps
The play
- Sign Up & Get API Token: Create an account on Replicate.com and open your dashboard to create or copy your API token. Keep this token secret.
- Install Replicate Python Client: Open a terminal or command prompt and install the Replicate Python client with `pip install replicate`.
- Set API Token Environment Variable: Set your Replicate API token as an environment variable named `REPLICATE_API_TOKEN`. The client library reads the token from there, so you never need to hardcode it in your script.
- Choose an AI Model: Browse Replicate.com/explore to find an open-source AI model you want to run. Note its model identifier (e.g., `owner/model-name:version-id`).
- Run Model Inference: Write and run a Python script that calls your chosen model with the `replicate.run()` function. Pass the model identifier and the input parameters the model requires.
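Before calling the API, it can help to check that the identifier noted in step 4 has the expected shape. The following is a minimal sketch, not part of the Replicate client: `ModelRef` and `parse_model_ref` are hypothetical names, and treating the version part as optional is an assumption that goes beyond the `owner/model-name:version-id` format given above.

```python
import re
from typing import NamedTuple, Optional


class ModelRef(NamedTuple):
    """Parts of a model identifier (hypothetical helper type)."""
    owner: str
    name: str
    version: Optional[str]


# owner/model-name with an optional :version-id suffix.
# Assumption: the version may be omitted; the steps above only show the full form.
_MODEL_REF = re.compile(
    r"^(?P<owner>[^/:\s]+)/(?P<name>[^/:\s]+)(?::(?P<version>[^/:\s]+))?$"
)


def parse_model_ref(ref: str) -> ModelRef:
    """Split an identifier into owner, name, and version, or raise ValueError."""
    match = _MODEL_REF.match(ref.strip())
    if match is None:
        raise ValueError(f"expected 'owner/model-name:version-id', got {ref!r}")
    return ModelRef(match["owner"], match["name"], match["version"])


ref = parse_model_ref("owner/model-name:version-id")
print(ref.owner, ref.name, ref.version)  # owner model-name version-id
```

Catching a malformed identifier locally gives a clearer error message than a failed request would, and it costs no API call.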
Starter code
import os

import replicate

# IMPORTANT: Set your REPLICATE_API_TOKEN environment variable first.
# Example (Linux/macOS): export REPLICATE_API_TOKEN="r8_YOUR_API_TOKEN_HERE"
# Example (Windows CMD): set REPLICATE_API_TOKEN=r8_YOUR_API_TOKEN_HERE
# Fail early with a clear message if the token is missing.
if not os.environ.get("REPLICATE_API_TOKEN"):
    raise SystemExit("REPLICATE_API_TOKEN is not set; see the examples above.")

# This example uses the Llama 2 70B Chat model.
# Replace 'meta/llama-2-70b-chat:...' with your desired model and version if needed.
output = replicate.run(
    "meta/llama-2-70b-chat:2c1608e18606fda1fffd2796d76794fd197bb799368365829877405ba472d493",
    input={
        "prompt": "Explain the concept of quantum entanglement in simple terms.",
        "temperature": 0.7,
        "max_new_tokens": 100,
    },
)
# For models that stream output, iterate to collect the full response.
full_response = ""
for item in output:
    full_response += item
print(full_response)
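The collection loop in the starter code assumes the model yields its result as a sequence of text chunks. The sketch below wraps that step in a small helper that also accepts a plain string, in case a model returns its result in one piece. `collect_output` and `fake_stream` are hypothetical names used for illustration. The generator stands in for a real result so the example runs without an API call.

```python
from typing import Iterable, Union


def collect_output(output: Union[str, Iterable[object]]) -> str:
    """Return model output as one string (hypothetical helper, not a Replicate API)."""
    if isinstance(output, str):
        # The result arrived in one piece; nothing to join.
        return output
    # Otherwise assume an iterable of chunks and join them in order.
    return "".join(str(chunk) for chunk in output)


def fake_stream():
    """Stand-in for a streamed result, so the sketch runs offline."""
    yield "Entangled particles "
    yield "share a linked state."


print(collect_output(fake_stream()))  # Entangled particles share a linked state.
print(collect_output("already a string"))  # already a string
```

With the real client, the same helper would be called as `collect_output(output)` on the value returned by `replicate.run()`, assuming that value is a string or an iterable of text chunks.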