Revisioned Istio Control Plane
Revisioned Istio control plane is an Alpha feature and is not recommended for production usage.
An Istio deployment on a control plane cluster can be marked with an arbitrary revision. When installed as revisioned, all components of the Istio control plane are identified by the configured revision.
This allows more than one Istio control plane to run side by side on the same cluster, which makes upgrades safer and more controlled.
Installation
To upgrade from a non-revisioned to a revisioned control plane, follow the steps in the Upgrades section.
For a fresh installation, follow the standard steps for onboarding a control plane cluster, with the following changes:
Deploy Operators
Customize the tctl command to specify the revision when generating the cluster operators manifest. There are two ways to do this.
- From the CLI itself, using the --set flag:
The following example uses canary as the revision value. You can change it to any value you want.
tctl install manifest cluster-operators \
--registry <registry-location> \
--set "operator.deployment.env[0].name=CONTROL_PLANE_REVISION" \
--set "operator.deployment.env[0].value=canary" > clusteroperators.yaml
- By passing a values file that you created. As an example:
operator:
  deployment:
    env:
      - name: CONTROL_PLANE_REVISION
        value: canary
And then running:
tctl install manifest cluster-operators \
--registry <registry-location> --values /path/to/values.yaml > clusteroperators.yaml
Apply this using
kubectl apply -f clusteroperators.yaml
The clusteroperators.yaml contains no data plane operator deployment, because a revisioned control plane no longer needs one. Gateway (Ingress/Egress/Tier1) deployments are handled by the TSB control plane operator itself.
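The values-file approach above can be scripted so that the revision is chosen in one place. The following is a minimal sketch, not an official workflow: write_revision_values is a hypothetical helper name, and the tctl and kubectl invocations are left commented out because they need a registry and cluster access.

```shell
# Sketch: write a values file that pins the operator revision.
# write_revision_values is a hypothetical helper, not part of tctl.
write_revision_values() {
  # $1 = revision name (for example canary), $2 = output path
  cat > "$2" <<EOF
operator:
  deployment:
    env:
      - name: CONTROL_PLANE_REVISION
        value: $1
EOF
}

# Example usage (same commands as shown above):
# write_revision_values canary values.yaml
# tctl install manifest cluster-operators \
#   --registry <registry-location> --values values.yaml > clusteroperators.yaml
# kubectl apply -f clusteroperators.yaml
```

Keeping the revision as a single argument makes it harder for the operator manifest and the ControlPlane CR to drift apart, since both must use the same value.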
Gateway Upgrades Approaches
There are two ways to upgrade gateways with a revisioned control plane. You choose between them by setting the ENABLE_INPLACE_GATEWAY_UPGRADE variable for the XCP component in the ControlPlane CR.
- ENABLE_INPLACE_GATEWAY_UPGRADE=true is the default behavior. With an in-place gateway upgrade, the existing gateway deployment is patched with the new proxy image and continues to use the same gateway service, so you don't have to change anything to configure the gateway external IP.
- ENABLE_INPLACE_GATEWAY_UPGRADE=false means that a new gateway service and deployment are created for the canary version, so there are now two services:
  - <gateway-name>, which handles the non-revisioned control plane workload traffic
  - <gateway-name>-canary, which handles the revisioned control plane workload traffic
  A new external IP is allocated to the newly created <gateway-name>-canary service. You can control traffic between the two versions by using an external load balancer or by updating the DNS entry.
Control Plane Installation
A new field, revision, must be set in the ControlPlane custom resource (CR) with the same value that was used when generating clusteroperators.yaml. By default, TSB performs an in-place gateway upgrade. Set ENABLE_INPLACE_GATEWAY_UPGRADE to false if you want a canary deployment and service to be created for the gateway.
In-place gateway upgrade:
apiVersion: install.tetrate.io/v1alpha1
kind: ControlPlane
metadata:
  name: controlplane
  namespace: istio-system
spec:
  hub: <registry-location>
  telemetryStore:
    elastic:
      host: <tsb-address>
      port: <tsb-port>
      version: <elastic-version>
      selfSigned: <is-elastic-use-self-signed-certificate>
  managementPlane:
    host: <tsb-address>
    port: <tsb-port>
    clusterName: <cluster-name-in-tsb>
    selfSigned: <is-mp-use-self-signed-certificate>
  components:
    xcp:
      revision: 'canary' # Revision value. Must match the operator revision value
      centralAuthMode: 'JWT'
Canary gateway upgrade:
apiVersion: install.tetrate.io/v1alpha1
kind: ControlPlane
metadata:
  name: controlplane
  namespace: istio-system
spec:
  hub: <registry-location>
  telemetryStore:
    elastic:
      host: <tsb-address>
      port: <tsb-port>
      version: <elastic-version>
      selfSigned: <is-elastic-use-self-signed-certificate>
  managementPlane:
    host: <tsb-address>
    port: <tsb-port>
    clusterName: <cluster-name-in-tsb>
    selfSigned: <is-mp-use-self-signed-certificate>
  components:
    xcp:
      kubeSpec:
        deployment:
          env:
            - name: ENABLE_INPLACE_GATEWAY_UPGRADE
              value: 'false' # Quoted because Kubernetes env values are strings. Disables in-place upgrade to create a canary deployment and service for the gateway
      revision: 'canary' # Revision value. Must match the operator revision value
      centralAuthMode: 'JWT'
This can then be applied to your Kubernetes cluster:
kubectl apply -f controlplane.yaml
After the installation steps are done, look at the deployments, configmaps and webhooks in the istio-system namespace. All resources that are part of the revisioned Istio control plane have the revision as a suffix in their name.
kubectl get deployment -n istio-system | grep canary
# Output
istio-operator-canary 1/1 1 1 96s
istiod-canary 1/1 1 1 32s
kubectl get configmap -n istio-system | grep canary
# Output
istio-canary 2 105s
istio-sidecar-injector-canary 2 105s
kubectl get validatingwebhookconfiguration -n istio-system | grep canary
# Output
istio-validator-canary-istio-system 1 2m43s
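The manual checks above can be combined into one pass. This is a minimal sketch that assumes the resource names follow the pattern in the sample output (istio-operator-<revision>, istiod-<revision>, istio-<revision>, istio-sidecar-injector-<revision>); check_revisioned_control_plane is a hypothetical helper name. It leaves out the validating webhook, which is cluster-scoped.

```shell
# Sketch: confirm that the revision-suffixed resources listed above exist.
# The function body runs in a subshell, so its variables do not leak.
check_revisioned_control_plane() (
  rev="$1"
  status=0
  for res in \
    "deployment/istio-operator-$rev" \
    "deployment/istiod-$rev" \
    "configmap/istio-$rev" \
    "configmap/istio-sidecar-injector-$rev"
  do
    if kubectl get "$res" -n istio-system >/dev/null 2>&1; then
      echo "found:   $res"
    else
      echo "missing: $res"
      status=1
    fi
  done
  # Non-zero exit status if any resource was missing.
  exit "$status"
)

# Example usage:
# check_revisioned_control_plane canary
```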
Sidecar upgrades
Workload namespaces must be labelled with the matching revision so that sidecar proxies connect to the revisioned control plane. If you are upgrading from a non-revisioned control plane, also remove the istio-injection label. In the following example, replace the bookinfo namespace with your application namespace.
kubectl label namespace bookinfo istio-injection- istio.io/rev=canary --overwrite
Then restart the workload pods so that their sidecar proxies are re-injected by the revisioned control plane. You can use rollout restart.
kubectl rollout restart deployment -n bookinfo
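When several namespaces need to move to the same revision, the two commands above can be wrapped in a loop. This is a minimal sketch under that assumption; migrate_namespaces is a hypothetical helper name, and the namespace payments in the usage line is only a placeholder.

```shell
# Sketch: move several namespaces to a revision and restart their deployments.
# Uses only the kubectl commands shown above; stops at the first failure.
migrate_namespaces() (
  rev="$1"
  shift
  for ns in "$@"; do
    # Drop the legacy injection label and point the namespace at the revision.
    kubectl label namespace "$ns" istio-injection- "istio.io/rev=$rev" --overwrite || exit 1
    # Restart deployments so that new pods are injected by the revisioned control plane.
    kubectl rollout restart deployment -n "$ns" || exit 1
  done
)

# Example usage:
# migrate_namespaces canary bookinfo payments
```

Restarting one namespace at a time keeps the blast radius small if the revisioned control plane turns out to be misconfigured.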
To verify that the sidecar is connected to the intended istiod, use the istioctl command:
istioctl pc bootstrap deploy/details-v1 -n bookinfo -o json | grep -i discovery
# Output
"discoveryAddress": "istiod-canary.istio-system.svc:15012",
Gateway Deployment
For each gateway (Ingress/Egress/Tier1) resource, you must set the matching revision.
For example, in your Ingress gateway deployment manifest:
apiVersion: install.tetrate.io/v1alpha1
kind: IngressGateway
metadata:
  name: tsb-gateway-bookinfo
  namespace: bookinfo
spec:
  revision: canary # Revision value. Must match the operator revision value
Apply this using
kubectl apply -f ingress-gateway.yaml
Once applied, this results in a revisioned gateway deployment.
In-place gateway upgrade
If you're using the in-place gateway upgrade described in Gateway Upgrades Approaches, a new deployment and then a new pod with the same name are created, and they keep using the existing service.
kubectl get deployments -n bookinfo
# Output
tsb-gateway-bookinfo 1/1 1 1 1m12s
kubectl get svc -n bookinfo
# Output
tsb-gateway-bookinfo LoadBalancer 10.255.10.85 172.29.255.157 15443:31159/TCP,8080:31789/TCP,...
You can inspect the deployment, the service or the newly created pod to confirm that they now carry the label istio.io/rev=canary:
kubectl get deployments -n bookinfo --show-labels | grep canary
# Shows all labels of each matching deployment, including the canary label
kubectl get svc -n bookinfo --show-labels | grep canary
# Shows all labels of each matching service, including the canary label
Canary gateway upgrade
If you're using the canary gateway upgrade (i.e. ENABLE_INPLACE_GATEWAY_UPGRADE=false) described in Gateway Upgrades Approaches, a new deployment and then a new pod are created, with names suffixed with the revision value, along with a new service that has a new external IP.
kubectl get deployments -n bookinfo
# Output
tsb-gateway-bookinfo 1/1 1 1 8m12s
tsb-gateway-bookinfo-canary 1/1 1 1 4m19s
kubectl get svc -n bookinfo
# Output
tsb-gateway-bookinfo LoadBalancer 10.255.10.81 172.29.255.151 15443:31159/TCP,8080:31789/TCP,...
tsb-gateway-bookinfo-canary LoadBalancer 10.255.10.85 172.29.255.152 15443:31159/TCP,8080:31789/TCP,...
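Because shifting traffic between the two gateways means updating an external load balancer or DNS entry, it can help to print both external IPs side by side. This is a minimal sketch: gateway_external_ips is a hypothetical helper name, and it assumes the load balancer reports an IP in .status.loadBalancer.ingress[0].ip (some providers report a hostname instead, in which case the jsonpath would need to change).

```shell
# Sketch: print the external IP of the original and the canary gateway service.
gateway_external_ips() (
  name="$1"; ns="$2"; rev="$3"
  for svc in "$name" "$name-$rev"; do
    ip=$(kubectl get svc "$svc" -n "$ns" \
      -o jsonpath='{.status.loadBalancer.ingress[0].ip}') || exit 1
    # An empty value means no IP has been reported for the service yet.
    echo "$svc ${ip:-<pending>}"
  done
)

# Example usage:
# gateway_external_ips tsb-gateway-bookinfo bookinfo canary
```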
Troubleshooting
- Look for ingressdeployment, egressdeployment and tier1deployment resources in the istio-system namespace, corresponding to the TSB IngressGateway, EgressGateway and Tier1Gateway resources respectively.
kubectl get ingressdeployment -n istio-system
# Output
NAME AGE
tsb-gateway-bookinfo 79s
If it is missing, the TSB control plane operator did not reconcile the TSB gateway resource into the corresponding XCP resource. First re-verify that the revision matches between the TSB control plane operator and the gateway resource. Then check the operator logs for hints.
- Look for the corresponding IstioOperator resource in the istio-system namespace. Example:
kubectl get iop -n istio-system | grep canary
# Output
xcpgw-tsb-gateway-bookinfo-canary canary 15m
If it is missing, the xcp-operator-edge logs should give some hint.
- If the two checks above pass and the gateway deployment or services are still not deployed, or do not match the IstioOperator resource, the Istio operator deployment logs should give some hint.
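The three checks above can be run in order for a single ingress gateway. This is a minimal sketch, not a supported tool: diagnose_gateway is a hypothetical helper name, and the xcpgw-<gateway-name>-<revision> naming of the IstioOperator resource is an assumption inferred from the sample output above.

```shell
# Sketch: walk through the troubleshooting checks for one ingress gateway.
# Exit status: 1 = ingressdeployment missing, 2 = IstioOperator missing, 0 = both present.
diagnose_gateway() (
  name="$1"; rev="$2"
  if ! kubectl get ingressdeployment "$name" -n istio-system >/dev/null 2>&1; then
    echo "ingressdeployment missing: verify the revision match, then check the TSB control plane operator logs"
    exit 1
  fi
  # Assumed naming pattern: xcpgw-<gateway-name>-<revision>.
  if ! kubectl get iop "xcpgw-$name-$rev" -n istio-system >/dev/null 2>&1; then
    echo "IstioOperator missing: check the xcp-operator-edge logs"
    exit 2
  fi
  echo "resources present: check the Istio operator deployment logs"
)

# Example usage:
# diagnose_gateway tsb-gateway-bookinfo canary
```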
Upgrades
Non-revisioned to revisioned control plane
- Scale down the TSB data plane operator before starting the upgrade. This avoids a race condition in which the TSB data plane operator and the TSB control plane operator both reconcile the same TSB Ingress/Egress/Tier1Gateway resources.
kubectl scale --replicas=0 deployment tsb-operator-data-plane -n istio-gateway
- Install the revisioned control plane by following the installation instructions.
- To upgrade sidecars, remove the istio-injection=enabled label from the workload namespace and apply the istio.io/rev label, set to the Istio revision, to the workload namespace. Then restart the application workloads.
- To upgrade workloads running on a virtual machine (VM), restart the Envoy sidecar running on the virtual machine.
- To upgrade the gateways, add spec.revision to the Ingress/Egress/Tier1Gateway resource as described in the Gateway Deployment section.
- Support for upgrading gateways running on VMs is a work in progress.
Non-revisioned data plane cleanup
To clean up the non-revisioned Istio data plane after the upgrade is complete, that is, once all sidecars have moved to the revisioned proxy and all application gateways have revisioned gateways running in addition to the non-revisioned ones:
- Delete the IstioOperator resource named tsb-gateways from the istio-gateway namespace using kubectl.
kubectl delete iop tsb-gateways -n istio-gateway
The istio-operator deployment running in the istio-gateway namespace will then clean up all non-revisioned application gateways as it reconciles the deletion of the IstioOperator resource.
- Delete the istio-gateway namespace, because it is no longer needed.
- Delete the TSB data plane operator webhooks:
kubectl delete validatingwebhookconfiguration tsb-operator-data-plane-egress tsb-operator-data-plane-ingress tsb-operator-data-plane-tier1
kubectl delete mutatingwebhookconfiguration tsb-operator-data-plane-egress tsb-operator-data-plane-ingress tsb-operator-data-plane-tier1
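Before running the cleanup above, it can help to confirm that no namespace still carries the legacy injection label. This is a minimal sketch with hypothetical helper names. It only inspects namespace labels, so it cannot tell whether the pods in a relabelled namespace have been restarted yet.

```shell
# Sketch: list namespaces that still use non-revisioned sidecar injection.
legacy_injection_namespaces() {
  kubectl get namespace -l istio-injection=enabled -o name
}

# Succeeds only when no namespace carries istio-injection=enabled.
ready_for_cleanup() {
  [ -z "$(legacy_injection_namespaces)" ]
}

# Example usage:
# ready_for_cleanup || legacy_injection_namespaces
```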
Non-revisioned control plane cleanup
- Delete the IstioOperator resource named tsb-istiocontrolplane from the istio-system namespace using kubectl.
kubectl delete iop tsb-istiocontrolplane -n istio-system
- Delete the Istio operator deployment and Kubernetes RBAC (clusterrole and clusterrolebinding):
kubectl delete clusterrole,clusterrolebinding istio-operator
kubectl delete deployment,sa istio-operator -n istio-system
Rollback from revisioned to non-revisioned
- If the non-revisioned control plane has already been cleaned up, first bring it back.
To restore the older non-revisioned control plane, re-install the TSB cluster operators without a revision. If you're using the in-place gateway upgrade, first scale down istio-operator-<revision>:
kubectl scale --replicas=0 deployment istio-operator-canary -n istio-system
tctl install manifest cluster-operators --registry $HUB > clusteroperators.yaml
kubectl apply -f clusteroperators.yaml
Then edit the existing ControlPlane CR to remove spec.components.xcp.revision.
The non-revisioned TSB control plane operator will then reconcile the non-revisioned ControlPlane resource and redeploy the non-revisioned Istio control plane.
- Sidecars can be rolled back by changing the value of the istio.io/rev workload namespace label to default, followed by a rolling restart of the application deployments.
- Older non-revisioned gateways come back automatically because of the TSB data plane operator, which does not care whether revision is present in the gateway install CRs.
- To clean up revisioned gateways:
  - Remove spec.revision from the Ingress/Egress/Tier1Gateway TSB gateway install resources.
  - Delete the corresponding IstioOperator resources from the istio-system namespace.
- Delete the revisioned control plane IstioOperator resource (xcp-iop-<revision>) from the istio-system namespace using kubectl.
kubectl delete iop xcp-iop-<revision> -n istio-system
- Delete the revisioned Istio operator deployment and Kubernetes RBAC (clusterrole and clusterrolebinding):
kubectl delete sa,deployment istio-operator-<revision> -n istio-system
kubectl delete clusterrole,clusterrolebinding istio-operator-<revision>