Tetrate Service Bridge Premium Enterprise Support Best Practices
Tetrate Customer Support is on standby 24/7 to help you resolve your issues. However, we need your cooperation to achieve the best outcome.
This guide is a fast 10-minute read and is an essential tutorial on how to make the most of your Tetrate Service Bridge (TSB) Premium Enterprise Support.
We recommend that you:
- Learn the concepts and get the tools listed in the "Pre-requisite" section before you deploy TSB in production or take over operational responsibilities for TSB.
- Bookmark this page and be prepared to go through the steps in "Self-service troubleshooting", so you are ready to self-diagnose and provide essential troubleshooting information to Tetrate's support team to reduce your resolution time.
Tools you need — Get tctl
tctl is a command-line utility for troubleshooting TSB. It runs its commands against the current kubecontext.
You must install tctl and be familiar with its commands before deploying TSB in production. See the installation and usage instructions.
For every new ticket, Tetrate Customer Support will require the output of tctl collect from both the Management Plane and each cluster involved in the issue.
For example, if you are having issues with an app that runs in a multi-cluster setup and is deployed in clusters CP1 and CP2, you will need to:
- Switch your kubecontext to the Management Plane cluster and run tctl collect.
- Then, do the same for the CP1 and CP2 clusters.
- Finally, attach the 3 generated archives to the ticket.
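The steps above can be sketched as a small shell helper. This is only an illustration: the context names are placeholders, the `run` and `collect_from_clusters` helpers are hypothetical, and the script defaults to a dry run that prints the commands instead of executing them.

```shell
#!/usr/bin/env bash
# Sketch: run `tctl collect` once per cluster involved in the issue.
# ASSUMPTION: the context names you pass in exist in your kubeconfig.
# DRY_RUN=1 (the default here) prints each command instead of running it.

run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

collect_from_clusters() {
  local ctx
  for ctx in "$@"; do
    run kubectl config use-context "$ctx"   # point tctl at this cluster
    run tctl collect                        # one archive per cluster
  done
}

# Example (placeholder context names):
#   DRY_RUN=0 collect_from_clusters mp-cluster cp1 cp2
```

Review the printed commands first, then set `DRY_RUN=0` to execute them and attach the resulting archives to the ticket.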
Knowledge you need — TSB and Service Mesh concepts
We recommend that you read through the TSB basic concepts (30-min read, free) and take Tetrate's 60-min Istio 0-60 workshop (free) as soon as you install TSB, well before you put TSB into production.
Additional recommended training
Additionally, we recommend that you take the self-paced Istio course (free) offered by Tetrate so you are familiar with Istio concepts.
For instructor-led Istio and TSB training, contact your Tetrate account executive.
Self-service troubleshooting
The fastest way to resolve your problem is to go through this checklist as soon as an error occurs and gather the information the Tetrate Customer Support team needs for further troubleshooting. Performing these steps before engaging Tetrate support, and including the results in your support ticket, lets Tetrate focus on resolving the issue.
- The first step must always be to restore the environment to a working state.
- If you are deploying a change, roll it back.
- If you have high availability, fall back to the working system.
Identify the affected area and troubleshoot it. The issue can be one of the following:
Configuration propagation: issues that cause your configuration to not be properly propagated from the Management Plane to the Control Planes. If a configuration has not been created or updated in the Control Plane cluster, it's most likely an XCP issue. Troubleshoot as follows:
Verify the new configuration has been applied in TSB. Gather and check the configuration via tctl or the web UI to rule out any Management Plane issue.
Check its status. If it has been applied and the change is not reflected in the application cluster, use the tctl x status command or check the status in the UI.
Check the MPC logs. Gather the mpc pod logs from the tsb namespace to quickly find out whether there has been any configuration processing issue. You can find a few examples on our troubleshooting page.
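As a rough illustration of the MPC log check, the sketch below pulls recent logs and keeps only lines that look like problems. It assumes the MPC workload is a Deployment named `mpc` in the `tsb` namespace, which may not match your installation; the helper names and the keyword filter are hypothetical and deliberately crude.

```shell
#!/usr/bin/env bash
# Sketch: scan recent MPC logs for lines that hint at processing problems.
# ASSUMPTION: the MPC workload is a Deployment named "mpc" in namespace "tsb".

# Keep lines mentioning errors, warnings or failures (case-insensitive).
# "|| true" keeps the pipeline from failing when nothing matches.
filter_problems() {
  grep -iE 'error|warn|fail' || true
}

# Print suspicious MPC log lines from the given time window (default: 1h).
mpc_problems() {
  kubectl logs --namespace tsb deployment/mpc --since="${1:-1h}" | filter_problems
}

# Example: mpc_problems 30m
```

An empty result does not prove the configuration was processed correctly; it only means none of the keywords appeared in that window.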
If none of the steps above has helped you to identify the root cause, create a new ticket using the Tetrate Customer Support portal with the following information:
- The output of tctl x status of the object.
- The configuration that you're trying to apply.
- Logs from the mpc pod in the tsb namespace in the MP cluster.
- Logs from the XCP central pod in the tsb namespace in the MP cluster.
- Logs from the XCP edge pods in the istio-system namespace in the application cluster.
- The output of
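The logs listed above could be saved into one directory before opening the ticket, roughly as follows. This is a sketch under assumptions: the Deployment names `mpc`, `central` and `edge` are guesses at how the workloads are named in your install, the context names are placeholders, and both helper functions are hypothetical. It defaults to a dry run.

```shell
#!/usr/bin/env bash
# Sketch: save MPC, XCP central and XCP edge logs to files for the ticket.
# ASSUMPTIONS: Deployment names (mpc, central, edge) and context names are
# illustrative; adjust them to your installation.
# DRY_RUN=1 (the default here) prints each command instead of running it.

save_output() {
  local out_file="$1"
  shift
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $* > $out_file"
  else
    mkdir -p "$(dirname "$out_file")"
    "$@" > "$out_file"
  fi
}

gather_propagation_logs() {
  local mp_ctx="$1" app_ctx="$2" out_dir="${3:-tsb-ticket-logs}"
  # Management Plane cluster, tsb namespace
  save_output "$out_dir/mpc.log"     kubectl --context "$mp_ctx" logs -n tsb deployment/mpc
  save_output "$out_dir/central.log" kubectl --context "$mp_ctx" logs -n tsb deployment/central
  # Application cluster, istio-system namespace.
  # kubectl reads from a single pod of a Deployment; repeat per pod if needed.
  save_output "$out_dir/edge.log"    kubectl --context "$app_ctx" logs -n istio-system deployment/edge
}

# Example: DRY_RUN=0 gather_propagation_logs mp-cluster cp1
```

Using `--context` on each command avoids switching the active kubecontext while gathering logs from two clusters.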
Data Plane: if the configuration is correct and has been created or updated in the application cluster, it's most likely a Data Plane issue. Troubleshoot as follows:
If the pod is not able to start, check its logs. Possible causes include a certificate issue that prevents the configuration from being propagated from Istiod, or a duplicated listener.
If you can't find the cause of the issue, create a new ticket using the Tetrate Customer Support portal with the following information:
- Logs from the ingressgateway pod.
- Configurations (GW, VS, DR, SE) related to the route not working.
- [Optional] Application traffic flow.
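One possible way to gather the gateway logs and the related Istio configuration is sketched below. The namespace, context, and gateway Deployment name are placeholders you must supply, and the helper functions are hypothetical. The script dumps every Gateway, VirtualService, DestinationRule and ServiceEntry in the namespace, so trim the result to the objects related to the failing route. It defaults to a dry run.

```shell
#!/usr/bin/env bash
# Sketch: capture ingress gateway logs and Istio config (GW, VS, DR, SE).
# ASSUMPTIONS: the gateway runs as a Deployment whose name you pass in;
# namespace and context are placeholders.
# DRY_RUN=1 (the default here) prints each command instead of running it.

save_output() {
  local out_file="$1"
  shift
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $* > $out_file"
  else
    mkdir -p "$(dirname "$out_file")"
    "$@" > "$out_file"
  fi
}

gather_data_plane_info() {
  local ctx="$1" ns="$2" gw_deploy="$3" out_dir="${4:-tsb-ticket-dataplane}"
  # Fully qualified Istio networking resource names, comma-separated.
  local istio_types="gateways.networking.istio.io,virtualservices.networking.istio.io,destinationrules.networking.istio.io,serviceentries.networking.istio.io"
  save_output "$out_dir/ingressgateway.log" kubectl --context "$ctx" logs -n "$ns" "deployment/$gw_deploy"
  save_output "$out_dir/istio-config.yaml"  kubectl --context "$ctx" get "$istio_types" -n "$ns" -o yaml
}

# Example: DRY_RUN=0 gather_data_plane_info cp1 my-namespace my-ingressgateway
```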
How to file a ticket for the fastest resolution
Filing a support ticket requires the information collected during self-diagnosis. If you do not have this information, please go through the steps outlined earlier before filing a ticket. Including it in your ticket lets Tetrate focus on the resolution and shortens your resolution time.
File a new support ticket
All tickets must be officially initiated via Tetrate's support portal (JIRA), even if you are in touch with a Tetrate employee via other means of communication (e.g., phone or Slack). Officially filing a ticket will ensure the visibility and response of the entire Tetrate support and engineering team, resulting in a faster and better response.
Additionally, even though a Tetrate Customer Engineer (CE) might be your day-to-day contact for a specific professional services engagement, break/fix support will be done by Tetrate's Customer Support team, so you should always file a ticket via Tetrate's JIRA portal even if your Customer Engineer is aware of and helping you with the issue.
Step 1: Write a detailed description with proper context
The description must provide as much context as possible. The more you can tell us in the description, the fewer follow-up questions we will need to ask.
Please be sure to answer the following questions in your description:
- Is the issue caused by a new or an existing application or configuration?
- Did you make any changes to the faulty application?
- Did you make any recent changes on the platform that could have caused the problem? Examples: certificate renewal, networking changes, infrastructure upgrades.
Step 2: Include the key information from self-troubleshooting
Attach the following from the self-troubleshooting checklist above:
- All logs and configurations from the self-troubleshooting steps.
- [Optional] The tctl collect archive from every cluster involved.
Step 3: Choose a priority
Choose one of the following:
- Severity 1: a Production System is severely impacted and completely shut down, or the system operations or mission-critical applications are down, due to a Tetrate software failure.
- Severity 2: a Production System performance is degraded or restricted, but still operational.
- Severity 3: a Non-production System is non-operational or completely shut down.
- Severity 4: a Non-production System performance is degraded or restricted, but still operational.
Step 4: Respond to the Tetrate support team promptly in JIRA
Depending on the severity, a member of the Tetrate Customer Support team will respond to you within Tetrate's SLA.
Please respond to Tetrate's support team promptly to ensure continued support at your designated SLA level, even if your issue has been resolved outside Tetrate support engagement. We can achieve your SLA only with your timely response on tickets.