Non-revisioned to Revisioned
Control plane upgrades are controlled by Istio Isolation Boundaries, which is an Alpha feature and is not recommended for production usage.
Before you continue, make sure you are familiar with the Istio Isolation Boundaries feature.
Before you upgrade
Upgrading from a non-revisioned to a revisioned control plane setup involves enabling the Istio isolation boundary feature. Once it is enabled, you can configure a revision within the isolation boundary for the control plane to upgrade to. Follow the steps in Isolation Boundaries Installation to deploy the control plane with the isolation boundary feature enabled.
Once the Istio isolation boundary feature is enabled, you need to scale down the TSB data plane operator before adding isolation boundaries in the ControlPlane CR. This avoids a race condition in which the TSB data plane operator and the TSB control plane operator both reconcile the same TSB Ingress/Egress/Tier1Gateway resources.
kubectl scale --replicas=0 deployment tsb-operator-data-plane -n istio-gateway
For the same reason we must also scale down the istio-operator in the istio-gateway namespace.
kubectl scale --replicas=0 deployment istio-operator -n istio-gateway
Also delete the webhooks that are created and managed by tsb-operator-data-plane.
kubectl delete validatingwebhookconfiguration tsb-operator-data-plane-egress tsb-operator-data-plane-ingress tsb-operator-data-plane-tier1
kubectl delete mutatingwebhookconfiguration tsb-operator-data-plane-egress tsb-operator-data-plane-ingress tsb-operator-data-plane-tier1
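Before adding isolation boundaries, it can help to confirm that both operators are really scaled to zero. The following is a minimal sketch rather than part of the documented procedure: the helper name `assert_scaled_down` is hypothetical, and it relies only on `kubectl get` with a JSONPath output.

```shell
# Hypothetical helper: succeeds only if the deployment's desired replica count is 0.
# Usage: assert_scaled_down <deployment> <namespace>
assert_scaled_down() {
  deployment="$1"
  namespace="$2"
  # .spec.replicas is the desired replica count set by `kubectl scale`.
  replicas="$(kubectl get deployment "$deployment" -n "$namespace" \
    -o jsonpath='{.spec.replicas}')" || return 1
  if [ "$replicas" = "0" ]; then
    echo "$deployment is scaled down"
  else
    echo "$deployment still has $replicas desired replica(s)" >&2
    return 1
  fi
}

# Example (run against your cluster):
#   assert_scaled_down tsb-operator-data-plane istio-gateway
#   assert_scaled_down istio-operator istio-gateway
```

The check reads the desired replica count only; it does not confirm that the old pods have already terminated.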
TSB supports only canary control plane upgrades for non-revisioned to revisioned upgrades. This means that, at a given point in time, two Istio control planes are deployed: a non-revisioned one and a revisioned one.
Control plane
Configure an isolation boundary in your ControlPlane CR. If you use Helm, you can add the isolation boundary configuration in your Helm values file.
spec:
  hub: <registry-location>
  telemetryStore:
    elastic:
      host: <tsb-address>
      port: <tsb-port>
      version: <elastic-version>
      selfSigned: <is-elastic-use-self-signed-certificate>
  managementPlane:
    host: <tsb-address>
    port: <tsb-port>
    clusterName: <cluster-name-in-tsb>
    selfSigned: <is-mp-use-self-signed-certificate>
  components:
    xcp:
      isolationBoundaries:
      - name: global
        revisions:
        - name: revisioned
      centralAuthMode: 'JWT'
"global" isolation boundary
Once isolation boundary support is enabled, you can deploy multiple revisioned control planes with any boundary name. However, it is recommended to create one isolation boundary named "global" so that existing Workspaces are treated as part of it. Workspaces that are already deployed in the cluster are NOT bound to a specific isolation boundary, so the boundary named "global" provides a fallback for all Workspaces that do not specify their isolation boundary.
Configuring an isolation boundary in the ControlPlane CR sets up a revisioned control plane in the istio-system namespace, as follows:
kubectl get deployment -n istio-system | grep istio-operator
# Output
istio-operator 1/1 1 1 15h
istio-operator-revisioned 1/1 1 1 2m
kubectl get deployment -n istio-system | grep istiod
# Output
istiod 1/1 1 1 15h
istiod-revisioned 1/1 1 1 2m
Note that there is a non-revisioned control plane still deployed which manages existing sidecars and gateways.
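Before moving gateways or workloads, you may want to wait until the revisioned istiod has finished rolling out. This is a sketch under one assumption: the deployment is named `istiod-<revision>`, as the output above suggests for the revision `revisioned`. The helper name `wait_for_revisioned_istiod` and the default timeout are hypothetical.

```shell
# Hypothetical helper: blocks until istiod for the given revision has rolled out.
# Assumes the deployment is named istiod-<revision>, as in the output above.
# Usage: wait_for_revisioned_istiod <revision> [timeout]
wait_for_revisioned_istiod() {
  revision="$1"
  timeout="${2:-300s}"
  # `kubectl rollout status` exits non-zero if the rollout does not finish in time.
  kubectl rollout status "deployment/istiod-${revision}" \
    -n istio-system --timeout="$timeout"
}

# Example (run against your cluster):
#   wait_for_revisioned_istiod revisioned
```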
Gateway upgrade
To upgrade the gateways, add spec.revision to the Ingress/Egress/Tier1Gateway resource. This reconciles the existing gateway pods to connect to the revisioned Istio control plane. By default, TSB configures the Gateway install resources with a RollingUpdate strategy that ensures zero downtime.
You can also add spec.revision by patching the gateway CR.
kubectl patch ingressgateway.install <name> -n <namespace> --type=json --patch '[{"op": "replace","path": "/spec/revision","value": "revisioned"}]'
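If you have several gateways to move, the patch can be wrapped in a small function. The sketch below is illustrative only: the helper name `set_gateway_revision` is hypothetical, and it uses the JSON Patch `add` operation instead of `replace`, because under RFC 6902 `add` sets an object member whether or not it already exists, while `replace` requires the target to exist.

```shell
# Hypothetical helper: points one TSB gateway install resource at an Istio revision.
# Usage: set_gateway_revision <resource-type> <name> <namespace> <revision>
set_gateway_revision() {
  kind="$1"
  name="$2"
  namespace="$3"
  revision="$4"
  # JSON Patch "add" creates /spec/revision, or overwrites it if already present.
  patch="$(printf '[{"op": "add", "path": "/spec/revision", "value": "%s"}]' "$revision")"
  kubectl patch "$kind" "$name" -n "$namespace" --type=json --patch "$patch"
}

# Example (run against your cluster):
#   set_gateway_revision ingressgateway.install <name> <namespace> revisioned
```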
Application upgrade
To upgrade sidecars, remove the istio-injection=enabled label from the workload namespace and add the istio.io/rev label, set to the Istio revision.
kubectl label namespace workload-ns istio-injection- istio.io/rev=revisioned
Then restart application workloads. A rolling update is preferred to avoid traffic disruptions.
kubectl rollout restart deployment -n workload-ns
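When several workload namespaces need to move, the two commands above can be combined into a loop. This is a minimal sketch, with the hypothetical helper name `move_namespaces_to_revision`; it adds `--overwrite` so that an existing `istio.io/rev` value can be replaced, and it stops at the first failing command.

```shell
# Hypothetical helper: switches each namespace to the given revision and
# restarts its deployments. Stops at the first failure.
# Usage: move_namespaces_to_revision <revision> <namespace>...
move_namespaces_to_revision() {
  revision="$1"
  shift
  for ns in "$@"; do
    # The trailing "-" removes the istio-injection label;
    # --overwrite allows replacing an existing istio.io/rev value.
    kubectl label namespace "$ns" istio-injection- \
      "istio.io/rev=${revision}" --overwrite || return 1
    kubectl rollout restart deployment -n "$ns" || return 1
  done
}

# Example (run against your cluster):
#   move_namespaces_to_revision revisioned workload-ns
```

Moving one namespace at a time and checking application health in between keeps the canary upgrade incremental.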
VM workload upgrade
To upgrade a VM workload, download the latest Istio sidecar from your onboarding plane using the revisioned link, then reinstall the Istio sidecar on the VM.
Update the onboarding-agent configuration with the revision value, then restart onboarding-agent. The Istio sidecar will then connect to the revisioned Istio control plane.
Post-upgrade cleanup
Once all sidecars have moved to the revisioned proxy, all application gateways have revisioned gateways running, and you have confirmed that the upgrade is healthy, you can clean up the old non-revisioned resources, which are now stale, from the cluster.
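One quick sanity check before cleanup is to look for namespaces that still carry the old injection label. The sketch below uses the hypothetical helper name `no_namespaces_on_old_injection`. It inspects namespace labels only, so it does not prove that every pod or gateway has already been restarted against the revisioned control plane.

```shell
# Hypothetical pre-cleanup check: fails if any namespace still carries the
# istio-injection=enabled label used by the non-revisioned control plane.
no_namespaces_on_old_injection() {
  remaining="$(kubectl get namespaces -l istio-injection=enabled -o name)" || return 1
  if [ -n "$remaining" ]; then
    echo "Namespaces still labeled istio-injection=enabled:" >&2
    echo "$remaining" >&2
    return 1
  fi
  echo "No namespaces use istio-injection=enabled"
}

# Example (run against your cluster):
#   no_namespaces_on_old_injection && echo "safe to continue with cleanup checks"
```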
- Remember that we scaled down the TSB data plane operator and the non-revisioned istio-operator in the istio-gateway namespace. The istio-gateway namespace itself can now be safely removed, as it is no longer required.
kubectl delete ns istio-gateway
- Delete the IstioOperator resource named tsb-istiocontrolplane from the istio-system namespace using kubectl.
kubectl delete iop tsb-istiocontrolplane -n istio-system
- Ensure that the istiod Deployment has been deleted from the istio-system namespace by the istio-operator deployment. Then delete the Istio operator deployment and its Kubernetes RBAC resources (clusterrole and clusterrolebinding).
kubectl delete clusterrole,clusterrolebinding istio-operator
kubectl delete deployment,sa istio-operator -n istio-system
Rollback from revisioned to non-revisioned
Before post-upgrade cleanup
- Scale up the TSB data plane operator in the istio-gateway namespace.
kubectl scale --replicas=1 deployment tsb-operator-data-plane -n istio-gateway
Also delete the webhooks that are created and managed by the tsb-operator-control-plane.
kubectl delete validatingwebhookconfiguration tsb-operator-control-plane-egress tsb-operator-control-plane-ingress tsb-operator-control-plane-tier1
kubectl delete mutatingwebhookconfiguration tsb-operator-control-plane-egress tsb-operator-control-plane-ingress tsb-operator-control-plane-tier1
- To roll back revisioned gateways, remove spec.revision from the Ingress/Egress/Tier1Gateway TSB gateway install resources. For the gateway deployment, it is preferred to configure a rolling update to avoid traffic disruptions. This can be configured in the Ingress/Egress/Tier1Gateway resource. As a result, gateway pods come up and connect to the older non-revisioned Istio control plane, which is still running.
- Roll back the sidecars by changing the value of the istio.io/rev workload namespace label to default. The --overwrite flag is required because the label already has a value.
kubectl label namespace workload-ns istio.io/rev=default --overwrite
Then restart the application workloads.
kubectl rollout restart deployment -n workload-ns
- Once all data plane components are rolled back to the non-revisioned control plane, remove the isolation boundary from the ControlPlane CR. This removes the revisioned control plane components deployed in the istio-system namespace.
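The gateway rollback step above can also be done with a patch rather than a manual edit. The following is a sketch, not a documented TSB command: the helper name `unset_gateway_revision` is hypothetical, and it uses a standard JSON Patch `remove` operation, which fails if `spec.revision` is not set on the resource.

```shell
# Hypothetical helper: removes spec.revision from a TSB gateway install resource.
# Usage: unset_gateway_revision <resource-type> <name> <namespace>
unset_gateway_revision() {
  kind="$1"
  name="$2"
  namespace="$3"
  # JSON Patch "remove" deletes the member; it errors if the path does not exist.
  kubectl patch "$kind" "$name" -n "$namespace" --type=json \
    --patch '[{"op": "remove", "path": "/spec/revision"}]'
}

# Example (run against your cluster):
#   unset_gateway_revision ingressgateway.install <name> <namespace>
```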
After post-upgrade cleanup
Rolling back gateways from the revisioned to the non-revisioned control plane after the post-upgrade cleanup is done does not guarantee zero downtime.
- First, bring back the non-revisioned control plane. To do so, reinstall the TSB cluster operators with ISTIO_ISOLATION_BOUNDARIES disabled.
tctl install manifest cluster-operators --registry $HUB > clusteroperators.yaml
kubectl apply -f clusteroperators.yaml
Deploying the operators again brings back the TSB data plane operator in the istio-gateway namespace. The non-revisioned TSB control plane operator then reconciles the updated ControlPlane resource to redeploy the non-revisioned Istio control plane. Since isolation boundary support is removed, this also cleans up all revisioned control plane components.
- Edit the existing ControlPlane CR to remove spec.components.xcp.isolationBoundaries.
- To roll back revisioned gateways, remove spec.revision from the Ingress/Egress/Tier1Gateway TSB gateway install resources. For the gateway deployment, it is preferred to configure a rolling update to avoid traffic disruptions. This can be configured in the Ingress/Egress/Tier1Gateway resource. As a result, gateway pods come up and connect to the redeployed non-revisioned Istio control plane.
- Roll back the sidecars by changing the value of the istio.io/rev workload namespace label to default. The --overwrite flag is required because the label already has a value.
kubectl label namespace workload-ns istio.io/rev=default --overwrite
Then restart the application workloads.
kubectl rollout restart deployment -n workload-ns