Tetrate Service Bridge (Version: next)

Best Practices - Preparing for High Availability

Best practices to follow when preparing for a High Availability (HA) deployment.

This document describes the measures to take when preparing a failover Tetrate Management Plane deployment. You may wish to contact Tetrate Technical Support for assistance with this procedure.

Prepare for HA

To prepare for a High Availability deployment, follow these best practices.

  1. Take regular backups of critical basic configuration

    Back up the IAM signing key:

    kubectl get secrets -n tsb -o yaml iam-signing-key > source_mp_operational_secrets.yaml

    When necessary, you can restore this with kubectl apply -n tsb -f source_mp_operational_secrets.yaml

    Helm install: Back up the values used to install the Management Plane:

    helm get values <release-name>

    TCTL install: Back up the additional secrets used to install the Management Plane:

    kubectl get secrets -n tsb -o yaml admin-credentials azure-credentials custom-host-ca elastic-credentials es-certs iam-oidc-client-secret ldap-credentials postgres-credentials tsb-certs xcp-central-cert iam-signing-key > source_mp_all_secrets.yaml

    When necessary, you can restore these with kubectl apply -n tsb -f source_mp_all_secrets.yaml

    Store these backups in a secure location, away from the Management Plane cluster.
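
The backup commands above can be wrapped in small helper functions so that they are easy to run on a schedule. The following is a minimal sketch, not an official Tetrate tool: the function names `backup_mp_secrets` and `backup_helm_values`, and all file names, are assumptions made for illustration. It queries each secret separately, on the assumption that not every secret in the list exists in every installation (for example, `ldap-credentials` when LDAP is not used), and skips the ones that are missing.

```shell
# Illustrative helpers; function names and file names are assumptions.

# backup_mp_secrets NAMESPACE OUTFILE SECRET...
# Writes each secret that exists to OUTFILE as a multi-document YAML file.
backup_mp_secrets() {
  ns="$1"; out="$2"; shift 2
  : > "$out"          # create or truncate the output file
  chmod 600 "$out"    # the file contains credentials; restrict access
  for name in "$@"; do
    if kubectl get secret "$name" -n "$ns" >/dev/null 2>&1; then
      echo "---" >> "$out"
      kubectl get secret "$name" -n "$ns" -o yaml >> "$out"
    else
      echo "skipping missing secret: $name" >&2
    fi
  done
}

# backup_helm_values RELEASE NAMESPACE OUTFILE
# Saves the user-supplied values of a Helm release to OUTFILE.
backup_helm_values() {
  helm get values "$1" -n "$2" -o yaml > "$3"
}
```

For example, `backup_mp_secrets tsb source_mp_all_secrets.yaml iam-signing-key tsb-certs xcp-central-cert` produces a file that can be restored with the same `kubectl apply -n tsb -f` command shown above. Copy the resulting files off-cluster once they are written.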

  2. Use an easy-to-modify DNS name to identify your Management Plane endpoints

    Control Plane clusters connect to the front-envoy endpoint on your Management Plane. If the Management Plane fails, you will need to point these clusters to a standby or new Management Plane instance.

    The easiest way to achieve this is to configure the Control Plane clusters to locate the Management Plane using a well-known, modifiable DNS address (for example, by making the address a CNAME for the specific MP instance). To perform a failover, you then just need to update that address to point to the alternate Management Plane.

    Alternatively, you can reconfigure each Control Plane one by one to update the managementPlane address.
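
Before and after a failover, it can be useful to confirm which Management Plane instance the well-known DNS name currently resolves to. The sketch below assumes the CNAME approach described above and uses the standard `dig` utility. The function name `check_mp_dns` and the host names in the usage example are hypothetical.

```shell
# check_mp_dns NAME EXPECTED
# Succeeds if NAME is a CNAME for EXPECTED; otherwise reports the mismatch.
# Illustrative helper; not part of TSB.
check_mp_dns() {
  actual=$(dig +short "$1" CNAME)
  actual=${actual%.}      # dig prints the target with a trailing dot
  expected=${2%.}
  if [ "$actual" = "$expected" ]; then
    echo "OK: $1 -> $actual"
  else
    echo "MISMATCH: $1 -> ${actual:-none} (expected $expected)"
    return 1
  fi
}
```

For example, `check_mp_dns tsb.example.com mp-standby.example.com` succeeds only once the DNS change has become visible to the resolver in use. Cached records may delay this by up to the record's TTL, so a short TTL on the well-known name makes failover faster.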

  3. Back up the Postgres Service Configuration database

    There are several ways to operate the Postgres database that stores service configuration, for example as an external database or as the embedded database. Whichever method you choose, take regular (e.g. nightly or hourly) backups of the service configuration so that the database can be restored if needed:

    • An external Postgres database can be backed up using pg_dump. To save space, you may wish to trim the audit logs stored in the database, as they are generally not critical when restoring the database.
    • The embedded Postgres database is automatically backed up every 24 hours; a backup can be initiated at any time. Backups are stored in a local PVC and should be copied off-cluster for resilience.
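
For an external database, a scheduled `pg_dump` run might look like the following sketch. The function name `backup_tsb_postgres`, the connection string and the output directory are assumptions. The optional third argument uses the `--exclude-table-data` option of `pg_dump` to leave one table's rows out of the dump. This is an alternative to trimming the audit logs in the database itself, and the table name must be supplied by you, because the audit log table is not named in this document.

```shell
# backup_tsb_postgres CONNINFO OUTDIR [TABLE_PATTERN]
# Writes a timestamped custom-format dump to OUTDIR and prints its path.
# If TABLE_PATTERN is given, the data of matching tables is left out.
# Illustrative helper; not part of TSB.
backup_tsb_postgres() {
  conn="$1"; dir="$2"; skip_data="${3:-}"
  out="$dir/tsb-config-$(date -u +%Y%m%dT%H%M%SZ).dump"
  if [ -n "$skip_data" ]; then
    pg_dump --format=custom --file="$out" \
      --exclude-table-data="$skip_data" --dbname="$conn"
  else
    pg_dump --format=custom --file="$out" --dbname="$conn"
  fi && echo "$out"
}
```

A custom-format dump is restored with `pg_restore` rather than `psql`. As with the other backups, copy the dump file off-cluster after it is written.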

  4. Use an external, highly available ElasticSearch database for metrics

    TSB uses an external ElasticSearch database to store metrics, logs and traces.

    The database can become very large, and it is updated frequently. In general, it is not critical to preserve the database during a failover operation; TSB will tolerate the loss of historical metrics.

    Take appropriate measures to maintain the ElasticSearch database in a resilient, highly available manner.
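
One simple measure is to monitor the health that the database cluster reports. The sketch below queries the standard `_cluster/health` endpoint and prints the reported status (`green`, `yellow` or `red`). The function name `es_cluster_status` and the URL in the usage example are hypothetical, and the sketch assumes an endpoint that is reachable without extra credentials or custom CA options. Add the relevant `curl` options if your deployment requires them.

```shell
# es_cluster_status BASE_URL
# Prints the cluster status reported by the _cluster/health endpoint.
# Illustrative helper; not part of TSB.
es_cluster_status() {
  curl -fsS "$1/_cluster/health" |
    sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p'
}
```

For example, `es_cluster_status https://es.example.com:9200` can be run from a monitoring job that raises an alert whenever the result is not `green`.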