Article
argo-rollouts, canary-deployment, ml-deployment, mlops, kubernetes, istio, progressive-delivery, ci-cd
Canary Deploy ML Models on Kubernetes with Argo Rollouts
Safely release new ML model versions using Argo Rollouts. Gradually shift traffic to the new model, automatically measure performance against SLOs (like error rate), and instantly roll back on failure to protect production users.
intermediate · 30 min · 4 steps
The play
- Install Argo Rollouts Controller: You need a Kubernetes cluster with Istio installed. Then install the Argo Rollouts controller, which introduces the `Rollout` Custom Resource Definition (CRD) that you will use to manage deployments.
- Define the Rollout Resource: Instead of a standard Kubernetes `Deployment`, define an Argo `Rollout`. This manifest points to your stable and canary services and specifies the traffic-shifting strategy, such as sending 10% of traffic to the new version for 5 minutes before increasing it (the starter code below uses 25% steps with 2-minute pauses).
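- Create an AnalysisTemplate: Define how to measure success. An `AnalysisTemplate` contains queries for your monitoring system (e.g., Prometheus) and a success condition for each result. A typical check is that the model's prediction success rate stays above 99%; the starter code below instead gates on P95 request latency staying at or below 500 ms. When a metric fails more often than its `failureLimit` allows, the analysis fails and the rollout is automatically aborted, which shifts traffic back to the stable version.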
- Trigger and Monitor the Canary Deployment: To start the canary release, update the container image in your `Rollout` manifest to the new model version and apply it. Then use the Argo Rollouts kubectl plugin to watch the rollout in real time, including weight shifts and analysis results.
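Steps 1 and 4 above come down to a handful of commands. The following is a minimal sketch, assuming the controller is installed from the upstream release manifest into a namespace named `argo-rollouts` and that the Argo Rollouts kubectl plugin is already on your PATH. The function names `install_controller` and `trigger_canary` are hypothetical wrappers introduced here for illustration; they are not part of Argo Rollouts.

```shell
# Sketch of steps 1 and 4. Function names are illustrative, not an official API.

install_controller() {
  # Step 1: install the controller and the Rollout CRD.
  # Assumes Istio is already installed in the cluster.
  kubectl create namespace argo-rollouts
  kubectl apply -n argo-rollouts \
    -f https://github.com/argoproj/argo-rollouts/releases/latest/download/install.yaml
}

trigger_canary() {
  # Step 4: point the Rollout at a new image, then watch it progress.
  # Arguments: rollout name, container name, new image, namespace.
  local rollout="$1"
  local container="$2"
  local image="$3"
  local namespace="$4"
  kubectl argo rollouts set image "$rollout" "$container=$image" -n "$namespace"
  kubectl argo rollouts get rollout "$rollout" -n "$namespace" --watch
}
```

With the starter code's names, a canary would be started with `trigger_canary ml-model-rollout ml-model argoproj/rollouts-demo:yellow ml-canary-demo`. Wrapping the commands in functions keeps the file safe to source without touching the cluster until a function is called.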
Starter code
#!/bin/bash
# This script deploys a full Canary ML Model example using Argo Rollouts.
# Prerequisites: A Kubernetes cluster with Istio and Argo Rollouts installed.
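# Create a namespace for our demo
kubectl create namespace ml-canary-demo
# Assumption: Istio sidecar injection is enabled per namespace via this label.
# Without sidecars, the mesh cannot split traffic or emit the metrics queried below.
kubectl label namespace ml-canary-demo istio-injection=enabled --overwrite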
# Apply all resources to the cluster
cat <<EOF | kubectl apply -n ml-canary-demo -f -
apiVersion: v1
kind: Service
metadata:
  name: ml-model-stable
spec:
  ports:
  - port: 80
    targetPort: 8080
    protocol: TCP
    name: http
  selector:
    app: ml-model
---
apiVersion: v1
kind: Service
metadata:
  name: ml-model-canary
spec:
  ports:
  - port: 80
    targetPort: 8080
    protocol: TCP
    name: http
  selector:
    app: ml-model
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: ml-model-vsvc
spec:
  hosts:
  - ml-model-stable # Istio rejects a "*" host when bound to the mesh gateway
  gateways:
  - mesh # Or your specific gateway
  http:
  - name: primary
    route:
    - destination:
        host: ml-model-stable
      weight: 100
    - destination:
        host: ml-model-canary
      weight: 0
---
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: latency-check
spec:
  args:
  - name: virtual-service
  metrics:
  - name: p95-latency
    # Take 4 measurements, 30s apart. With a single measurement,
    # failureLimit: 2 could never be exceeded and the gate would always pass.
    interval: 30s
    count: 4
    successCondition: result[0] <= 500 # P95 latency must be <= 500ms
    failureLimit: 2
    provider:
      prometheus:
        address: http://prometheus.istio-system.svc.cluster.local:9090
        query: |
          histogram_quantile(0.95, sum(rate(istio_request_duration_milliseconds_bucket{
            reporter="destination",
            destination_workload_namespace="ml-canary-demo"
          }[1m])) by (le))
---
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: ml-model-rollout
spec:
  replicas: 3
  selector:
    matchLabels:
      app: ml-model
  template:
    metadata:
      labels:
        app: ml-model
    spec:
      containers:
      - name: ml-model
        # Start with the stable version (the "blue" demo image)
        image: argoproj/rollouts-demo:blue
        ports:
        - containerPort: 8080
  strategy:
    canary:
      stableService: ml-model-stable
      canaryService: ml-model-canary
      trafficRouting:
        istio:
          virtualService:
            name: ml-model-vsvc
            routes:
            - primary
      steps:
      - setWeight: 25
      - pause: {duration: 2m}
      - analysis:
          templates:
          - templateName: latency-check
          args:
          - name: virtual-service
            value: ml-model-vsvc
      - setWeight: 50
      - pause: {duration: 2m}
      - analysis:
          templates:
          - templateName: latency-check
          # The template declares this argument without a default, so pass it here too
          args:
          - name: virtual-service
            value: ml-model-vsvc
      - setWeight: 75
      - pause: {duration: 2m}
EOF
echo "✅ Base resources created in 'ml-canary-demo' namespace."
echo "👀 Monitor with: kubectl argo rollouts get rollout ml-model-rollout -n ml-canary-demo -w"
echo "🚀 To trigger the canary, run: kubectl argo rollouts set image ml-model-rollout ml-model=argoproj/rollouts-demo:yellow -n ml-canary-demo"
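
The script above gates each step on P95 latency. A success-rate gate, like the one described in the AnalysisTemplate step, could be added along the following lines. This is a sketch under stated assumptions: the template name `success-rate-check` and the wrapper function `apply_success_rate_template` are hypothetical, the 99% threshold comes from the step description, and "prediction success" is approximated by the share of non-5xx HTTP responses in Istio's `istio_requests_total` metric. A model that answers 200 with a wrong prediction would not be caught by this check.

```shell
# Sketch: a success-rate gate as an alternative to the latency check.
# success-rate-check and apply_success_rate_template are illustrative names.
apply_success_rate_template() {
  cat <<'EOF' | kubectl apply -n ml-canary-demo -f -
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: success-rate-check
spec:
  metrics:
  - name: success-rate
    # 4 measurements, 30s apart; any single failure aborts the rollout
    interval: 30s
    count: 4
    successCondition: result[0] >= 0.99 # At least 99% of requests must not be 5xx
    failureLimit: 0
    provider:
      prometheus:
        address: http://prometheus.istio-system.svc.cluster.local:9090
        query: |
          sum(rate(istio_requests_total{
            reporter="destination",
            destination_workload_namespace="ml-canary-demo",
            response_code!~"5.*"
          }[1m]))
          /
          sum(rate(istio_requests_total{
            reporter="destination",
            destination_workload_namespace="ml-canary-demo"
          }[1m]))
EOF
}
```

After calling `apply_success_rate_template`, a Rollout `analysis` step can reference it with `templateName: success-rate-check`; no `args` are needed because this template declares none. Like the latency query, this one covers the whole namespace, so it measures stable and canary traffic together, and it needs live traffic during the canary to return a usable value.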