Canary deployments using ArgoRollout
This document describes how you can configure Argo CD and integrate Argo Rollout in a TSE deployment. It uses TSE GitOps for deployment, and SkyWalking as the metrics provider, to achieve a canary deployment and progressive delivery automation.
Prerequisites:
- Argo CD is installed in your cluster and Argo CD CLI is configured to connect to your Argo CD server
- Argo Rollout is installed in your cluster
These instructions have been known to fail with Argo Rollout version 1.4.0 and later. We've observed the canary rollout to be aborted and immediately rolled back. We're looking into it, so in the interim use an earlier version. Contact us via slack for support, and we'll update this documentation when a fix is available.
Create an Application from a Git repository
Create a sample application using the following command. An example repository containing Istio's bookinfo application and TSE configurations is available at https://github.com/tetrateio/tse-gitops-demo.
You can either use Argo CD CLI or their web UI to import application configurations directly from Git.
argocd app create bookinfo-app --repo https://github.com/tetrateio/tse-gitops-demo.git --path application --dest-server https://kubernetes.default.svc --dest-namespace bookinfo --sync-policy automated --self-heal
Check the status of your application using argocd app get bookinfo-app
:
Name: bookinfo-app
Project: default
Server: https://kubernetes.default.svc
Namespace: bookinfo
URL: https://localhost:8080/applications/bookinfo-app
Repo: https://github.com/tetrateio/tse-gitops-demo.git
Target:
Path: argo/app
SyncWindow: Sync Allowed
Sync Policy: Automated
Sync Status: Synced to (1ba8e2d)
Health Status: Healthy
GROUP KIND NAMESPACE NAME STATUS HEALTH HOOK MESSAGE
Namespace bookinfo bookinfo Running Synced namespace/bookinfo created
ServiceAccount bookinfo bookinfo-details Synced serviceaccount/bookinfo-details created
ServiceAccount bookinfo bookinfo-productpage Synced serviceaccount/bookinfo-productpage created
ServiceAccount bookinfo bookinfo-ratings Synced serviceaccount/bookinfo-ratings created
ServiceAccount bookinfo bookinfo-reviews Synced serviceaccount/bookinfo-reviews created
Service bookinfo productpage Synced Healthy service/productpage created
Service bookinfo details Synced Healthy service/details created
Service bookinfo ratings Synced Healthy service/ratings created
Service bookinfo reviews Synced Healthy service/reviews created
apps Deployment bookinfo ratings-v1 Synced Healthy deployment.apps/ratings-v1 created
apps Deployment bookinfo productpage-v1 Synced Healthy deployment.apps/productpage-v1 created
apps Deployment bookinfo reviews OutOfSync Healthy deployment.apps/reviews created
apps Deployment bookinfo details-v1 Synced Healthy deployment.apps/details-v1 created
Namespace bookinfo Synced
Application Setup
If you already have kubernetes manifests created for deployment and service resource, you can choose to keep the same objects along with Argo Rollout
object for facilitating the canary deployments.
You can make necessary changes to the Rollout
object and the TSE mesh configuration of Istio VirtualService/DestinationRule to achieve the desired result.
TSE Configuration Setup
Argo Rollout's canary deployment strategy convention for Istio requires you to make modifications to Istio's VirtualService
and DestinationRule
objects. You can use TSE's DIRECT
mode configuration to achieve the desired result.
- Following the Argo Rollout convention, 2 subsets named stable and canary should be configured with necessary labels in the TSE direct mode resources
VirtualService
andDestinationRule
to identify canary and stable pods. - Please make sure the version labels eg:
version: canary/stable
have been configured according to Argo's Istio convention. This is necessary for TSE to recognize the subsets and plot the metrics in service dashboard. - When using TSE direct mode resources with GitOps, there is an additional label
istio.io/rev: "tsb"
that must to be added to the resources. Please refer to the Direct Mode notes for more details.
Create a bookinfo-tse-conf app by importing the TSE configurations from tse-gitops-demo/argo/tse/conf.yaml. You can also choose to keep it in the same repo.
argocd app create bookinfo-tse-conf --repo https://github.com/tetrateio/tse-gitops-demo.git --path argo/tsb --dest-server https://kubernetes.default.svc --dest-namespace bookinfo --sync-policy automated --self-heal
Check the status of TSE resources using argocd app get bookinfo-tse-conf
:
Name: bookinfo-tse-conf
Project: default
Server: https://kubernetes.default.svc
Namespace: bookinfo
URL: https://localhost:8080/applications/bookinfo-tse-conf
Repo: https://github.com/tetrateio/tse-gitops-demo.git
Target:
Path: argo/tsb
SyncWindow: Sync Allowed
Sync Policy: Automated
Sync Status: Synced to (1ba8e2d)
Health Status: Healthy
GROUP KIND NAMESPACE NAME STATUS HEALTH HOOK MESSAGE
networking.istio.io VirtualService bookinfo bookinfo Synced virtualservice.networking.istio.io/bookinfo created
tsb.tetrate.io Tenant bookinfo bookinfo Synced tenant.tsb.tetrate.io/bookinfo unchanged
networking.istio.io Gateway bookinfo bookinfo-gateway Synced gateway.networking.istio.io/bookinfo-gateway unchanged
traffic.tsb.tetrate.io Group bookinfo bookinfo-traffic Synced group.traffic.tsb.tetrate.io/bookinfo-traffic unchanged
security.tsb.tetrate.io Group bookinfo bookinfo-security Synced group.security.tsb.tetrate.io/bookinfo-security unchanged
gateway.tsb.tetrate.io Group bookinfo bookinfo-gateway Synced group.gateway.tsb.tetrate.io/bookinfo-gateway unchanged
tsb.tetrate.io Workspace bookinfo bookinfo-ws Synced workspace.tsb.tetrate.io/bookinfo-ws unchanged
networking.istio.io VirtualService bookinfo details Synced virtualservice.networking.istio.io/details unchanged
networking.istio.io DestinationRule bookinfo productpage Synced destinationrule.networking.istio.io/productpage unchanged
networking.istio.io DestinationRule bookinfo details Synced destinationrule.networking.istio.io/details unchanged
networking.istio.io VirtualService bookinfo ratings Synced virtualservice.networking.istio.io/ratings unchanged
networking.istio.io DestinationRule bookinfo reviews Synced destinationrule.networking.istio.io/reviews unchanged
networking.istio.io DestinationRule bookinfo ratings Synced destinationrule.networking.istio.io/ratings unchanged
networking.istio.io VirtualService bookinfo reviews Synced virtualservice.networking.istio.io/reviews unchanged
install.tetrate.io IngressGateway bookinfo tsb-gateway-bookinfo Synced ingressgateway.install.tetrate.io/tsb-gateway-bookinfo unchanged
Verify application
Export the external address of the tsb-gateway-bookinfo
service
export GATEWAY_IP=$(kubectl -n bookinfo get service tsb-gateway-bookinfo -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
Confirm that you can access bookinfo application. As you can see in the response, the currently-deployed review v1 service does not call the ratings service.
curl -v "http://bookinfo.tetrate.com/api/v1/products/1/reviews" \
--resolve "bookinfo.tetrate.com:80:$GATEWAY_IP"
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< content-type: application/json
< content-length: 361
< server: istio-envoy
< date: Mon, 22 Aug 2022 06:36:52 GMT
< x-envoy-upstream-service-time: 782
<
* Connection #0 to host bookinfo.tetrate.com left intact
{"id": "1", "podname": "reviews-rollout-56ff4b868c-74d8t", "clustername": "null", "reviews": [{"reviewer": "Reviewer1", "text": "An extremely entertaining play by Shakespeare. The slapstick humour is refreshing!"}, {"reviewer": "Reviewer2", "text": "Absolutely fun and entertaining. The play lacks thematic depth when compared to other plays by Shakespeare."}]}
Setup Argo Rollout
Argo Rollout provides multiple options to migrate your existing kubernetes deployment object into Argo's Rollout** object. You can either convert an existing k8s deployment object to Rollout or you can refer your existing k8s deployment from a Rollout object using workloadRef.
We will follow the workloadRef approach in this example.
In this example we will make a canary deployment of the reviews service to demonstrate Rollout object configurations, and how to manage traffic shifting between the primary and canary deployments of the reviews service:
- Create a Rollout resource and refer your existing deployment using workloadRef.
- Ensure that the selector matchLabels is configured based on your k8s application deployment manifest.
- Configure strategy as canary with subset level traffic splitting.
- Configure canaryMetadata and stableMetadata to inject labels and annotations on canary and stable pods.
- Ensure the labels of canaryMetadata and stableMetadata are consistent with TSE direct mode configurations here.
- Configure Istio virtualService and destinationRule under trafficRouting based on the TSE direct mode configurations.
- Once the Rollout object is created, it will spin up the required number of pods side-by-side along with the k8s deployment pods.
- Once all the Rollout pods are up and running, you can scale down your existing k8s deployment to 0 by changing the replicas.
- Rollout object won't modify your existing k8s deployment, Traffic would be shifted to the pods managed by Rollout object once the subset is updated in VirtualService.
Example rollout.yaml:
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
name: reviews-rollout
spec:
replicas: 5
selector:
matchLabels:
app: reviews
workloadRef:
apiVersion: apps/v1
kind: Deployment
name: reviews
strategy:
canary:
analysis:
templates:
- templateName: apdex
startingStep: 2
args:
- name: service-name
value: canary|reviews|bookinfo|cp-cluster-1|-
canaryMetadata:
annotations:
version: canary
labels:
version: canary
service.istio.io/canonical-revision: canary
stableMetadata:
annotations:
version: stable
labels:
version: stable
service.istio.io/canonical-revision: stable
trafficRouting:
istio:
virtualService:
name: reviews
destinationRule:
name: reviews
canarySubsetName: canary
stableSubsetName: stable
steps:
- setWeight: 10
- pause: {duration: 10m}
- setWeight: 20
- pause: {duration: 5m}
- setWeight: 40
- pause: {duration: 5m}
- setWeight: 60
- pause: {duration: 5m}
- setWeight: 80
- pause: {duration: 5m}
Configure Canary Analysis Template with SkyWalking
The SkyWalking component in TSE can serve as a metrics provider to support canary deployment analysis, enabling automatic promotion or rollback actions For background, refer to the documents Analysis and Progressive delivery in Argo Rollout and Apache SkyWalking Metrics.
Notes:
- Create the canary AnalysisTemplate using skywalking as the metrics provider to drive auto promotion/rollback based on the deployment analysis.
- SkyWalking metrics can be fetched by connecting to OAP service graphql endpoint
http://oap.istio-system:12800
on the TSE ControlPlane Cluster. - The success condition is derived using Apdex score. Please refer to Apdex score for measuring service mesh health for more details.
- The subset name of the canary deployment needs to be configured as an argument service-name in the analysis template.
- The TSE service name in SkyWalking follows the format
subset|service name|namespace name|cluster name|-
. Since we are using the reviews service, usecanary|reviews|bookinfo|cp-cluster-1|-
as service-name value in Rollout resource. - Configure the same AnalysisTemplate details in the Rollout object canary analysis.
Example analysis.yaml:
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
name: apdex
spec:
args:
- name: service-name
metrics:
- name: apdex
interval: 5m
successCondition: "all(result.service_apdex.values.values, {asFloat(.value) >= 9900})"
failureLimit: 3
provider:
skywalking:
interval: 3m
address: http://oap.istio-system:12800
query: |
query queryData($duration: Duration!) {
service_apdex: readMetricsValues(
condition: { name: "service_apdex", entity: { scope: Service, serviceName: "{{ args.service-name }}", normal: true } },
duration: $duration) {
label values { values { value } }
}
}
Create Rollout
Run the following command to create a rollout app:
argocd app create reviews-rollout --repo https://github.com/tetrateio/tse-gitops-demo.git --path argo/rollout --dest-server https://kubernetes.default.svc --dest-namespace bookinfo --sync-policy automated --self-heal
Check the status: argocd app get reviews-rollout
Name: reviews-rollout
Project: default
Server: https://kubernetes.default.svc
Namespace: bookinfo
URL: https://localhost:8080/applications/reviews-rollout
Repo: https://github.com/tetrateio/tse-gitops-demo.git
Target:
Path: argo/rollout
SyncWindow: Sync Allowed
Sync Policy: Automated
Sync Status: Synced to (1ba8e2d)
Health Status: Healthy
GROUP KIND NAMESPACE NAME STATUS HEALTH HOOK MESSAGE
argoproj.io AnalysisTemplate bookinfo apdex Synced analysistemplate.argoproj.io/apdex created
argoproj.io Rollout bookinfo reviews-rollout Synced Healthy rollout.argoproj.io/reviews-rollout created
Trigger Canary Deployment
Update the reviews service deployment image to the v2
version. This will immediately trigger a canary deployment of reviews v2 and will modify the traffic percentage as 90|10.
kubectl argo rollouts set image reviews-rollout reviews=docker.io/istio/examples-bookinfo-reviews-v2:1.16.4 -n bookinfo
Monitor Canary Deployment
Monitor your canary deployment using kubectl argo rollouts get rollout reviews-rollout --watch -n bookinfo
:
Name: reviews-rollout
Namespace: bookinfo
Status: ॥ Paused
Message: CanaryPauseStep
Strategy: Canary
Step: 1/10
SetWeight: 10
ActualWeight: 10
Images: docker.io/istio/examples-bookinfo-reviews-v1:1.16.4 (stable)
docker.io/istio/examples-bookinfo-reviews-v2:1.16.4 (canary)
Replicas:
Desired: 5
Current: 6
Updated: 1
Ready: 6
Available: 6
NAME KIND STATUS AGE INFO
⟳ reviews-rollout Rollout ॥ Paused 6m13s
├──# revision:2
│ └──⧉ reviews-rollout-867b9c9bcb ReplicaSet ✔ Healthy 21s canary
│ └──□ reviews-rollout-867b9c9bcb-86mbt Pod ✔ Running 19s ready:2/2
└──# revision:1
└──⧉ reviews-rollout-5d9dc876c9 ReplicaSet ✔ Healthy 6m13s stable
├──□ reviews-rollout-5d9dc876c9-27mth Pod ✔ Running 6m12s ready:2/2
├──□ reviews-rollout-5d9dc876c9-8qqpx Pod ✔ Running 6m11s ready:2/2
├──□ reviews-rollout-5d9dc876c9-9bqbv Pod ✔ Running 6m11s ready:2/2
├──□ reviews-rollout-5d9dc876c9-cgxgd Pod ✔ Running 6m11s ready:2/2
└──□ reviews-rollout-5d9dc876c9-d447w Pod ✔ Running 6m11s ready:2/2
Generate Traffic
Generate some traffic to the bookinfo application as follows:
while true; do curl -m 5 -v "http://bookinfo.tetrate.com/api/v1/products/1/reviews" --resolve "bookinfo.tetrate.com:80:$GATEWAY_IP"; sleep 2 ; done
Observe that some of the responses will include ratings data. This happens because the reviews-v2 service calls the ratings service.
> GET /api/v1/products/1/reviews HTTP/1.1
> Host: bookinfo.tetrate.com
> User-Agent: curl/7.79.1
> Accept: */*
> Content-Length: 0
> Content-Type: application/x-www-form-urlencoded
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< content-type: application/json
< content-length: 437
< server: istio-envoy
< date: Mon, 22 Aug 2022 06:53:14 GMT
< x-envoy-upstream-service-time: 45
<
* Connection #0 to host bookinfo.tetrate.com left intact
{"id": "1", "podname": "reviews-66f8dddb8c-84pk6", "clustername": "null", "reviews": [{"reviewer": "Reviewer1", "text": "An extremely entertaining play by Shakespeare. The slapstick humour is refreshing!", "rating": {"stars": 5, "color": "black"}}, {"reviewer": "Reviewer2", "text": "Absolutely fun and entertaining. The play lacks thematic depth when compared to other plays by Shakespeare.", "rating": {"stars": 4, "color": "black"}}]}
Monitor Performance Metrics in TSE
You can monitor the health of each service instance of both canary and stable pods from the TSE service dashboard:
Canary Analysis and Auto Promotion
As we have configured in the rollout object, canary analysis will wait for the first phase to complete in 10 minutes to build some metrics. From the second phase onwards, AnalysisRun (an instantiation of the AnalysisTemplate) will be executed, based on the configured interval. For every completed run, based on the status of successful or failed, argo decides whether to promote/rollback the canary deployment based on the max failureLimit configured in AnalysisTemplate.
During Canary Analysis
Trigger the promotion using kubectl argo rollouts promote reviews-rollout --full -n bookinfo
:
Name: reviews-rollout
Namespace: bookinfo
Status: ॥ Paused
Message: CanaryPauseStep
Strategy: Canary
Step: 5/10
SetWeight: 40
ActualWeight: 40
Images: docker.io/istio/examples-bookinfo-reviews-v1:1.16.4 (stable)
docker.io/istio/examples-bookinfo-reviews-v2:1.16.4 (canary)
Replicas:
Desired: 5
Current: 7
Updated: 2
Ready: 7
Available: 7
NAME KIND STATUS AGE INFO
⟳ reviews-rollout Rollout ॥ Paused 24m
├──# revision:2
│ ├──⧉ reviews-rollout-867b9c9bcb ReplicaSet ✔ Healthy 18m canary
│ │ ├──□ reviews-rollout-867b9c9bcb-86mbt Pod ✔ Running 18m ready:2/2
│ │ └──□ reviews-rollout-867b9c9bcb-9ghh2 Pod ✔ Running 3m4s ready:2/2
│ └──α reviews-rollout-867b9c9bcb-2 AnalysisRun ◌ Running 8m4s ✔ 2
└──# revision:1
└──⧉ reviews-rollout-5d9dc876c9 ReplicaSet ✔ Healthy 24m stable
├──□ reviews-rollout-5d9dc876c9-27mth Pod ✔ Running 24m ready:2/2
├──□ reviews-rollout-5d9dc876c9-8qqpx Pod ✔ Running 24m ready:2/2
├──□ reviews-rollout-5d9dc876c9-9bqbv Pod ✔ Running 24m ready:2/2
├──□ reviews-rollout-5d9dc876c9-cgxgd Pod ✔ Running 24m ready:2/2
└──□ reviews-rollout-5d9dc876c9-d447w Pod ✔ Running 24m ready:2/2
After Successful Promotion
Once all the steps are executed with a successfull
analysis run, argo completes the rollout of the image to version v2
and marks that as stable
.
Monitor this using kubectl argo rollouts get rollout reviews-rollout --watch -n bookinfo
:
Name: reviews-rollout
Namespace: bookinfo
Status: ✔ Healthy
Strategy: Canary
Step: 10/10
SetWeight: 100
ActualWeight: 100
Images: docker.io/istio/examples-bookinfo-reviews-v2:1.16.4 (stable)
Replicas:
Desired: 5
Current: 5
Updated: 5
Ready: 5
Available: 5
NAME KIND STATUS AGE INFO
⟳ reviews-rollout Rollout ✔ Healthy 3d20h
├──# revision:2
│ ├──⧉ reviews-rollout-867b9c9bcb ReplicaSet ✔ Healthy 3d20h stable
│ │ ├──□ reviews-rollout-867b9c9bcb-757hf Pod ✔ Running 3d20h ready:2/2
│ │ ├──□ reviews-rollout-867b9c9bcb-tlt8z Pod ✔ Running 3d20h ready:2/2
│ │ ├──□ reviews-rollout-867b9c9bcb-hwqnd Pod ✔ Running 3d20h ready:2/2
│ │ ├──□ reviews-rollout-867b9c9bcb-hnfzb Pod ✔ Running 3d20h ready:2/2
│ │ └──□ reviews-rollout-867b9c9bcb-h5zrw Pod ✔ Running 3d20h ready:2/2
│ └──α reviews-rollout-867b9c9bcb-2 AnalysisRun ✔ Successful 3d20h ✔ 5
└──# revision:1
└──⧉ reviews-rollout-5d9dc876c9 ReplicaSet • ScaledDown 3d20h
Manual Promotion of Canary Deployment
You can perform a step-by-step promotion. This will proceed through the steps mentioned in the Rollout by changing the traffic weight, and will finish by rolling out the new version completely:
# step-by-step promotion
kubectl argo rollouts promote reviews-rollout -n bookinfo
Alternatively, you can perform a full promotion to the desired version by skipping analysis, pauses, and steps:
# full promotion
kubectl argo rollouts promote reviews-rollout --full -n bookinfo