Running an Active/Standby Management Plane pair
For situations where High Availability (HA) of the Tetrate Management Plane is essential, you can run an active/standby pair of Management Plane instances. This guide provides an overview of the process; contact Tetrate Technical Support for assistance with this procedure.
This document describes how to set up for failover from one 'active' Management Plane deployment to another 'standby' instance. Although existing client-to-Management-Plane connections will be interrupted, clients will be able to reconnect and use the failover Management Plane instance without explicit reconfiguration.
In particular, the following communication flows will remain resilient:
- Control Planes to Management Plane
- tctl CLI to Management Plane API
- Access to the Management Plane UI
Prepare for Failover
Certificates
To support failover from the active Management Plane instance to the standby instance, first ensure that the internal root CA certificate is shared between both instances.
Do not use an auto-generated (e.g., by cert-manager) Root CA in each Management Plane instance to procure TLS certificates for TSB, XCP, ElasticSearch and Postgres. If you have to use a self-signed CA, provide it explicitly as part of each Management Plane deployment.
If the active and standby Management Planes each use their own auto-generated Root CA, they will eventually get out of sync once one of them rotates its Root CA. As a result, clients will not be able to fail over from the active to the standby Management Plane instance without additional configuration.
If possible, use a well-known 3rd party CA to procure TLS certificates for TSB, XCP, ElasticSearch and Postgres. Ensure that if a CA is rotated, this change is propagated to all Management Planes and Control Planes.
If you don’t follow the above guidelines, limited High Availability is still possible. To ensure seamless client failover, you need to set up an internal procedure to keep the certificate configuration of both k8s clusters in sync.
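For illustration, one way to keep the root CA identical in both clusters is to create the same CA key pair as a Kubernetes TLS secret in each cluster and, if you use cert-manager, issue the TSB, XCP, ElasticSearch and Postgres certificates from a CA Issuer that references that secret. The names tsb-root-ca and tsb-ca-issuer below are illustrative, not required names:
# Run in BOTH the active and the standby cluster, using the same ca.crt / ca.key pair
kubectl create secret tls tsb-root-ca -n tsb --cert=ca.crt --key=ca.key
# Optional cert-manager Issuer that signs certificates with the shared CA
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: tsb-ca-issuer
  namespace: tsb
spec:
  ca:
    secretName: tsb-root-ca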
Configuration
Tetrate Management Plane configuration is stored in a Postgres database. The procedure below assumes that your original 'active' Management Plane instance is configured to use an external, non-Tetrate-managed Postgres instance; when you create the standby Management Plane instance, it will refer to the same Postgres instance.
Alternatively, you can use independent Postgres instances and ensure that the 'active' instance replicates its configuration to the 'standby' one. You can do this replication continually, or you can import a backup on-demand, before triggering the failover.
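If you use separate Postgres instances and import a backup on demand, a minimal sketch of copying the configuration before failover could look like the following (the hostnames, user and database name are illustrative):
# Dump the TSB configuration database from the active Postgres instance (custom format)
pg_dump -h active-postgres.example.com -U tsb -Fc -f tsb_config.dump tsb
# Restore it into the standby Postgres instance, replacing existing objects
pg_restore -h standby-postgres.example.com -U tsb -d tsb --clean --if-exists tsb_config.dump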
Deploy a Replica Management Plane instance
These instructions begin with a single Management Plane instance, which is designated as the 'active' instance. A replica (copy) is deployed, which will act as the 'standby' instance in the event that the 'active' instance becomes unavailable.
Helm-based deployment model
Use the Helm-based deployment model when the active Management Plane was originally deployed using Helm (the recommended method).
In the Kubernetes cluster of the original 'active' Management Plane:
- Take a snapshot of the operational Kubernetes secrets. These secrets were auto-generated on first use:
kubectl get secrets -n tsb -o yaml iam-signing-key > source_mp_operational_secrets.yaml
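If you no longer have the values file that was used for the original installation, you can usually recover the user-supplied values from the active release (the release name mp and namespace tsb below are assumptions; adjust them to your installation):
helm get values mp -n tsb -o yaml > source_mp_values.yaml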
In the Kubernetes cluster intended for the new 'standby' Management Plane:
- Create a k8s namespace for the replica MP:
kubectl create ns tsb
- Apply the operational secrets from the original Management Plane instance:
kubectl apply -n tsb -f source_mp_operational_secrets.yaml
- Install the replica Management Plane using the same Helm values that were used for the original Management Plane:
helm install mp tetrate-tsb-helm/managementplane \
--version <tsb-version> \
--namespace tsb \
--values source_mp_values.yaml \
--timeout 10m \
--set image.registry=<registry-location>
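As a sanity check before relying on the replica as a standby, you can verify that the release installed and its workloads are running (release name mp as above):
helm status mp -n tsb
kubectl get pods -n tsb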
Generic deployment model
Use the 'generic' deployment method when the active Management Plane was originally deployed using the tctl CLI.
In the Kubernetes cluster of the original 'active' Management Plane:
- Take a snapshot of the configurational and operational Kubernetes secrets. These secrets were auto-generated on first use:
kubectl get secrets -n tsb -o yaml admin-credentials azure-credentials custom-host-ca elastic-credentials es-certs iam-oidc-client-secret ldap-credentials postgres-credentials tsb-certs xcp-central-cert iam-signing-key > source_mp_all_secrets.yaml
In the Kubernetes cluster intended for the new 'standby' Management Plane:
- Create a k8s namespace for the replica MP:
kubectl create ns tsb
- Apply the secrets from the original Management Plane instance:
kubectl apply -n tsb -f source_mp_all_secrets.yaml
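Optionally, confirm that the copied secrets now exist in the standby cluster before installing:
kubectl get secrets -n tsb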
- Install the replica Management Plane using helm:
helm install mp tetrate-tsb-helm/managementplane \
--version <tsb-version> \
--namespace tsb \
--values dr_mp_values.yaml \
--timeout 10m \
--set image.registry=<registry-location>
... where dr_mp_values.yaml:
- Should include the spec field
- Should NOT include the secrets field (as secrets were installed in the previous step)
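For illustration only, dr_mp_values.yaml might be laid out as follows; the fields shown under spec are examples, and you should copy the actual spec from your original deployment:
# dr_mp_values.yaml (illustrative skeleton)
spec:
  hub: <registry-location>
  organization: <organization-name>
  # ...remainder of the spec copied from the original 'active' deployment
# No 'secrets:' section; the secrets were applied in the previous step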
Failing over from the Active to the Standby instance
If you ever encounter a situation where you need to fail over from the 'active' Management Plane to the 'standby' instance:
- If necessary, take a backup of the 'active' configuration (the Postgres database) and import it into the 'standby' instance. This is only necessary if the instances use separate databases, and you have not arranged for continual replication from 'active' to 'standby'.
- Update the DNS record used for the Management Plane, pointing it to the 'standby' instance.
- Shut down the original Management Plane to force all clients to reconnect to the replica (see the sketch below).
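A minimal sketch of shutting down the original Management Plane workloads, assuming the Helm-based deployment described above (if an operator deployment recreates scaled-down workloads, scale that deployment to zero first):
# Option 1: scale all Management Plane deployments in the tsb namespace to zero replicas
kubectl scale deployment --all -n tsb --replicas=0
# Option 2: remove the Helm release entirely (release name 'mp' is assumed)
helm uninstall mp -n tsb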