Tetrate Service Bridge Version: 1.12.x

Troubleshoot the Tetrate Management Plane

How to troubleshoot the Tetrate Management Plane, and actions to recover in the event of a failure.

This document explains how you can troubleshoot a failing Tetrate Management Plane, and suggests possible recovery steps. If it proves impossible to recover the failed Management Plane, you can then start your chosen failover process. You may wish to refer to Tetrate Technical Support for assistance with this procedure.

Control Plane Troubleshooting

If your Management Plane appears to be functioning correctly, but one or more of your Workload clusters cannot connect, check out the Control Plane Troubleshooting guide.

Troubleshooting Problems with the Tetrate Management Plane

If the Tetrate Management Plane appears to have failed, you should first consider the following steps to attempt to recover it:

Check that the Management Plane components appear to be running
Verify that the Management Plane cluster is running, and that you can access the tsb namespace:
```
kubectl get pods -n tsb
```
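In a busy namespace it can help to list only the pods that need attention. The following is a minimal sketch rather than part of the TSB tooling: `filter_unhealthy_pods` is a hypothetical helper name, and it assumes the default column layout of `kubectl get pods --no-headers` (NAME, READY, STATUS, ...).

```shell
# Hypothetical helper: reads `kubectl get pods --no-headers` output on stdin
# and prints NAME, READY and STATUS for pods that are not fully ready.
filter_unhealthy_pods() {
  awk '{
    # READY looks like "1/2": ready containers / total containers
    split($2, ready, "/")
    # Finished job pods are expected to show "Completed"; skip them
    if ($3 == "Completed") next
    if ($3 != "Running" || ready[1] != ready[2]) print $1, $2, $3
  }'
}

# Usage (requires access to the Management Plane cluster):
#   kubectl get pods -n tsb --no-headers | filter_unhealthy_pods
```

An empty result suggests that every pod in the namespace is running with all containers ready; it does not prove that the components are working correctly.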
Check the logs from the tsb-operator, which is responsible for deploying and configuring the TSB Management Plane components:
```
kubectl logs -n tsb deployment/tsb-operator-management-plane
```
Check Access to the Management Plane endpoint
The Management Plane will use a well-known front-Envoy endpoint, listening for HTTPS traffic (UI and API) on port 443:
```
kubectl get svc envoy -n tsb
```
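To use the external address in later checks, it can be extracted from the service status. This sketch assumes that the envoy service is of type LoadBalancer and that the provider publishes either an IP address or a hostname; `envoy_external_address` is a hypothetical helper name.

```shell
# Hypothetical helper: print the external IP (or hostname) of the
# front-Envoy service, or fail if none has been assigned.
envoy_external_address() {
  # Some providers publish .ip, others .hostname; print whichever is set
  addr=$(kubectl get svc envoy -n tsb \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}{.status.loadBalancer.ingress[0].hostname}' \
    2>/dev/null) || addr=""
  if [ -z "$addr" ]; then
    echo "no external address assigned to service envoy" >&2
    return 1
  fi
  echo "$addr"
}

# Usage:
#   envoy_external_address
```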
Verify that the external IP address is reachable, and that the FQDN for the Management Plane resolves to it. The FQDN appears in the tctl configuration (run tctl ui to see it), and in each Workload Cluster's controlplane CR. Run the following command against a Workload Cluster (not the Management Cluster):
```
kubectl get controlplane -o json -n istio-system | jq ".items[0].spec|.managementPlane,.telemetryStore"
```
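Once you know the FQDN, a quick probe can help distinguish DNS problems from connectivity or certificate problems. The sketch below relies on curl's documented exit codes (6, 7, 28 and 60); `check_mp_endpoint` is a hypothetical helper name, and the FQDN passed to it is whatever your own configuration reports.

```shell
# Hypothetical helper: probe https://<fqdn>:443/ and classify the failure.
check_mp_endpoint() {
  host="$1"
  if [ -z "$host" ]; then
    echo "usage: check_mp_endpoint <fqdn>" >&2
    return 2
  fi
  rc=0
  # Without --fail, curl exits 0 for any HTTP response, including 4xx/5xx
  curl --silent --show-error --output /dev/null --connect-timeout 5 \
    "https://${host}:443/" || rc=$?
  case "$rc" in
    0)  echo "ok: ${host} resolved and answered over HTTPS" ;;
    6)  echo "dns: could not resolve ${host}" ;;
    7)  echo "connect: could not connect to ${host}:443" ;;
    28) echo "timeout: no answer from ${host}:443 within 5s" ;;
    60) echo "tls: certificate for ${host} was not trusted by curl" ;;
    *)  echo "other: curl exited with code ${rc}" ;;
  esac
  return "$rc"
}

# Usage (replace with the FQDN from your controlplane CR):
#   check_mp_endpoint tsb.example.com
```

A "dns" or "connect" result points at name resolution or firewall issues, while a "tls" result suggests a certificate problem, for example a private CA that the client does not trust.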
Check the logs from the front-Envoy proxy:
```
kubectl logs deployment/envoy -n tsb
```
Check the logs from the Control Plane services
If the Management Plane is not functioning, then it's possible that the Control Plane services on each Workload Cluster cannot connect to the Management Plane. Check the logs from various services, looking for errors regarding connections to the Management Plane or errors regarding token validation:
```
kubectl logs deploy/edge -n istio-system -f
```
If necessary, delete the existing tokens on the Control Plane, then verify that they are regenerated:
```
kubectl delete secret otel-token oap-token ngac-token xcp-edge-central-auth-token -n istio-system
sleep 60
kubectl get secrets otel-token oap-token ngac-token xcp-edge-central-auth-token -n istio-system
```
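As an alternative to a fixed 60-second sleep, you could poll until the tokens reappear. This is a sketch only: `wait_for_tokens` is a hypothetical helper, and the default timeout and interval are arbitrary assumptions rather than values recommended by Tetrate.

```shell
# Hypothetical helper: wait until all four token secrets exist again.
# Arguments: namespace (default istio-system), timeout and poll interval
# in seconds (defaults 120 and 5).
wait_for_tokens() {
  ns="${1:-istio-system}"
  timeout="${2:-120}"
  interval="${3:-5}"
  elapsed=0
  while :; do
    missing=""
    for s in otel-token oap-token ngac-token xcp-edge-central-auth-token; do
      kubectl get secret "$s" -n "$ns" >/dev/null 2>&1 || missing="$missing $s"
    done
    if [ -z "$missing" ]; then
      echo "all tokens present"
      return 0
    fi
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "still missing:$missing" >&2
      return 1
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
}

# Usage, after deleting the secrets:
#   wait_for_tokens istio-system 120 5
```

If the function times out, the names it reports as missing indicate which tokens the Control Plane has not been able to regenerate.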
Check for certificate errors, as described in the Control Plane troubleshooting instructions.
Check the logs from the Management Plane IAM service
All requests through the front-Envoy are authenticated by the Management Plane IAM service:
```
kubectl logs deployment/iam -n tsb
```
You can try restarting the IAM service if you see unexpected errors:
```
kubectl delete pod -n tsb -lapp=iam
kubectl logs -f deployment/iam -n tsb
```
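If you prefer to wait for the replacement pod before reading its logs, the restart can be wrapped as follows. This is a sketch under the assumption that the iam deployment recreates the deleted pod; `restart_iam` is a hypothetical helper name and the 120-second timeout is an arbitrary choice.

```shell
# Hypothetical helper: delete the IAM pod(s), then block until the
# deployment reports that the replacement is available.
restart_iam() {
  kubectl delete pod -n tsb -l app=iam || return 1
  kubectl rollout status deployment/iam -n tsb --timeout=120s
}

# Usage:
#   restart_iam && kubectl logs deployment/iam -n tsb --since=5m
```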
Further Analysis
You can continue to check logs from the deployments of other TSB Management Plane components:
- tsb hosts the Management Plane API service
- web hosts the Management Plane UI
- oap hosts the Management Plane observability analysis platform (based on Apache SkyWalking)
- If you are using the embedded Postgres database, kubegres-controller-manager hosts the Kubegres operator which manages the Postgres instances.
Look particularly for errors relating to connection problems (indicating firewall issues) and authentication problems (indicating certificate or token problems). You can safely delete any TSB component pod that is managed by the TSB Management Plane Operator; the deployment created by the operator will start a replacement pod.
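To decide which component to look at first, you could count suspicious log lines per deployment. The sketch below is illustrative only: `scan_mp_logs` is a hypothetical helper, the search pattern is an assumption about what connection and authentication errors tend to look like, and the default list covers only the tsb-namespace deployments named in this document.

```shell
# Hypothetical helper: for each deployment in the tsb namespace, print its
# name and the number of log lines from the last hour that look like
# connection or authentication errors.
scan_mp_logs() {
  # Default to the deployments named in this guide if none are given
  [ "$#" -gt 0 ] || set -- tsb web oap iam envoy
  for d in "$@"; do
    # grep -c prints 0 and exits non-zero when nothing matches
    n=$(kubectl logs "deployment/$d" -n tsb --since=1h 2>/dev/null \
          | grep -ciE 'error|refused|timed out|x509|unauthorized' || true)
    echo "$d ${n:-0}"
  done
}

# Usage:
#   scan_mp_logs            # all default deployments
#   scan_mp_logs iam envoy  # a subset
```

A high count is only a hint about where to start reading; the full logs are still needed to understand the cause.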

Next Steps

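If you cannot quickly restore your Tetrate Management Plane, you can start your chosen failover process, referring to Tetrate Technical Support for assistance if needed.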
