Tetrate Service Bridge Version: 1.7.x

Tetrate Service Bridge Premium Enterprise Support Best Practices

Tetrate Customer Support is on standby 24/7 to help you resolve your issues. However, we need your cooperation to achieve the best outcome.

This guide is a fast 10-minute read and is an essential tutorial on how to make the most of your Tetrate Service Bridge (TSB) Premium Enterprise Support.

We recommend that you:

Acquire the concepts and tools in the "Pre-requisites" section before you deploy TSB in production or take over operational responsibilities for TSB.
Bookmark this page and be prepared to go through the steps in "Self-diagnostic procedure", so you are ready to self-diagnose and provide essential troubleshooting information to Tetrate's support team, which reduces your resolution time.

Pre-requisites

Tools you need — Get tctl

tctl is a helpful utility for troubleshooting TSB. It runs its commands against the current kubecontext.

You must install tctl and be familiar with its commands before deploying TSB in production. See the installation and usage instructions.

For every new ticket, Tetrate Customer Support requires the output of tctl collect from both the Management Plane cluster and each cluster involved in the issue.

For example, if you are having issues with an app deployed across clusters CP1 and CP2 in a multi-cluster setup, you will need to:

Switch your kubecontext to the Management Plane cluster and run tctl collect.
Then, do the same for the CP1 and CP2 clusters.
Finally, attach the 3 generated archives to the ticket.
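
The collection steps above can be sketched as a small Python helper that lists the commands to run, in order, for each cluster. This is only an illustration: the context names are hypothetical placeholders, and the helper prints the commands rather than executing them. It relies on `kubectl config use-context` to switch the kubecontext and on `tctl collect` as described in this guide.

```python
import shlex


def collect_plan(contexts):
    """Return the commands to run, in order, for each kubecontext."""
    plan = []
    for ctx in contexts:
        # tctl runs against the current kubecontext, so switch first.
        plan.append(["kubectl", "config", "use-context", ctx])
        plan.append(["tctl", "collect"])
    return plan


# Hypothetical context names: the Management Plane cluster first,
# then every cluster involved in the issue.
contexts = ["mp-cluster", "cp1-cluster", "cp2-cluster"]

for command in collect_plan(contexts):
    print(shlex.join(command))
```

Each run of tctl collect produces one archive, so three contexts should leave you with three archives to attach.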

Knowledge you need — TSB and Service Mesh concepts

We recommend that you read through the TSB basic concepts (free, 30-min read) and take Tetrate's Istio 0-60 workshop (free, 60 min) as soon as you install TSB, well before you put TSB into production.

Additional recommended training

Additionally, we recommend that you take the self-paced Istio course (free) offered by Tetrate so you are familiar with Istio concepts.

For instructor-led Istio and TSB training, contact your Tetrate account executive.

Self-diagnostic procedure

The fastest way to resolve your problem is to go through this checklist as soon as an error occurs and gather the information the Tetrate Customer Support team needs for further troubleshooting. Your issue will be resolved faster if you perform these procedures before engaging Tetrate support and include the results in your support ticket, so Tetrate can focus on issue resolution.

Restore
1. The first step must always be to restore the environment to a working state.
2. If you are deploying a change, roll it back.
3. If you have high availability, fall back to the working system.
Identify the area and troubleshoot. The issue can be in one of the following areas:
1. Configuration propagation: issues that prevent your configuration from being properly propagated from the Management Plane to the Control Planes. If a configuration has not been created or updated in the Control Plane cluster, it's most likely an XCP issue. Troubleshoot as follows:
  1. Verify the new configuration has been applied in TSB. Retrieve and check the configuration via tctl or the web UI to rule out a Management Plane issue.
  2. Check its status. If it has been applied but the change is not reflected in the application cluster, use the tctl x status command or check the status in the UI.
  3. Check the MPC logs. Gather the mpc pod logs from the tsb namespace to find out quickly whether there was a configuration processing issue. You can find a few examples on our troubleshooting page.
  4. If none of the steps above has helped you to identify the root cause, create a new ticket using the Tetrate Customer Support portal with the following information:
    - The output of tctl x status of the object.
    - The configuration that you're trying to apply.
    - Logs from the mpc pod in the tsb namespace in the MP cluster.
    - Logs from the XCP central pod in the tsb namespace in the MP cluster.
    - Logs from the XCP edge pods in the istio-system namespace in the application cluster.
2. Data Plane. If the configuration is correct and has been created/updated in the application cluster, it's most likely a Data Plane issue. Troubleshoot as follows:
  1. If the pod is not able to start, check its logs. The cause could be a certificate issue that prevents the configuration from being propagated from Istiod, or a duplicated listener.
  2. If the pod is able to start, check its logs to understand whether the issue is due to a misconfiguration or an unavailable backend. See Gateway Troubleshooting and Troubleshooting.
  3. If you can't find the cause of the issue, create a new ticket using the Tetrate Customer Support portal with the following information:
    - Logs from the ingressgateway pod.
    - Configurations (Gateway, VirtualService, DestinationRule, ServiceEntry) related to the route that is not working.
    - [Optional] Application traffic flow.
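
The decision flow above can be summarized in a short Python sketch. It is an illustration under simplifying assumptions, not a diagnostic tool: the two boolean inputs stand for checks you perform yourself (via tctl, the web UI, or the cluster), and the function and constant names are hypothetical. The evidence lists mirror the ticket information listed above; this guide lists no specific evidence for a Management Plane issue, so that case returns an empty list.

```python
from enum import Enum


class Area(Enum):
    MANAGEMENT_PLANE = "management plane"
    CONFIG_PROPAGATION = "configuration propagation"
    DATA_PLANE = "data plane"


def triage(applied_in_tsb: bool, present_in_cluster: bool) -> Area:
    """Map the two checks from the checklist to the likely problem area."""
    if not applied_in_tsb:
        # The configuration never made it into TSB.
        return Area.MANAGEMENT_PLANE
    if not present_in_cluster:
        # Applied in TSB but not created or updated in the application cluster.
        return Area.CONFIG_PROPAGATION
    # Configuration is correct and present in the application cluster.
    return Area.DATA_PLANE


# Information to attach to the ticket, per area, as listed in this guide.
EVIDENCE = {
    Area.CONFIG_PROPAGATION: [
        "output of tctl x status for the object",
        "the configuration you are trying to apply",
        "mpc pod logs (tsb namespace, MP cluster)",
        "XCP central pod logs (tsb namespace, MP cluster)",
        "XCP edge pod logs (istio-system namespace, application cluster)",
    ],
    Area.DATA_PLANE: [
        "ingressgateway pod logs",
        "Gateway, VirtualService, DestinationRule, ServiceEntry configurations for the failing route",
        "application traffic flow (optional)",
    ],
}


def evidence_for(area: Area) -> list:
    """Return the checklist of items to gather for the given area."""
    return EVIDENCE.get(area, [])


area = triage(applied_in_tsb=True, present_in_cluster=False)
print(area.value)
for item in evidence_for(area):
    print("-", item)
```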

How to file a ticket for the fastest resolution

Filing a support ticket requires the information collected during the self-diagnostic procedure. If you do not have this information, go through the steps outlined above before filing a ticket. Your issue will be resolved faster if you perform these procedures before engaging Tetrate support and include the information in your ticket, so Tetrate can focus on the resolution.

File a new support ticket

All tickets must be officially initiated via Tetrate's support portal (JIRA), even if you are in touch with a Tetrate employee via other means of communication (e.g., phone or Slack). Officially filing a ticket will ensure the visibility and response of the entire Tetrate support and engineering team, resulting in a faster and better response.

Additionally, even though a Tetrate Customer Engineer (CE) might be your day-to-day contact for a specific professional services engagement, break/fix support will be done by Tetrate's Customer Support team, so you should always file a ticket via Tetrate's JIRA portal even if your Customer Engineer is aware of and helping you with the issue.

Step 1: Write a detailed description with proper context

The description must provide as much context as possible. The more you can tell us in the description, the fewer follow-up questions we will need to ask.

Please be sure to answer the following questions in your description:

Whether the issue is caused by a new or an existing application or configuration.
Whether you made any changes to the faulty application.
Whether you made any recent changes on the platform that could have caused the problem. Examples: certificate renewal, networking changes, infrastructure upgrades.

Step 2: Include the key information from self-troubleshooting

Attach the following from the self-diagnostic procedure above:

All logs and configurations from the self-troubleshooting steps.
[Optional] The tctl collect archive from every cluster involved.

Step 3: Choose a priority

Choose one of the following:

Severity 1: a Production System is severely impacted and completely shut down, or the system operations or mission-critical applications are down, due to a Tetrate software failure.
Severity 2: a Production System performance is degraded or restricted, but still operational.
Severity 3: a Non-production System is non-operational or completely shut down.
Severity 4: a Non-production System performance is degraded or restricted, but still operational.
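
The four levels above reduce to two questions: is the affected system in production, and is it completely down or only degraded? A minimal Python sketch of that mapping follows. The function name and parameters are hypothetical, and the sketch ignores the additional Severity 1 condition that the outage is due to a Tetrate software failure.

```python
def severity(production: bool, fully_down: bool) -> int:
    """Return the severity level (1 to 4) for a ticket.

    production: True if a Production System is affected.
    fully_down: True if the system is non-operational or shut down,
                False if it is degraded but still operational.
    """
    if production:
        return 1 if fully_down else 2
    return 3 if fully_down else 4


# A degraded but still operational production system.
print(severity(production=True, fully_down=False))
```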

Step 4: Respond to the Tetrate support team promptly in JIRA

Depending on the severity, a member of the Tetrate Customer Support team will respond to you within Tetrate's SLA.

Please respond to Tetrate's support team promptly to ensure continued support at your designated SLA level, even if your issue has been resolved outside Tetrate support engagement. We can achieve your SLA only with your timely response on tickets.

How to create an effective support ticket

Creating a well-structured support ticket is crucial for fast and efficient issue resolution. Here’s how to make a good support ticket in Jira:

Write a detailed description and findings from your self-diagnosis

Provide a comprehensive description of the issue with as much context as possible. Use the following format for clarity:

Summary: A brief and clear summary of the issue.
Description: Detailed description including the context, changes made, and potential impact.
Steps to Reproduce: A clear list of steps to reproduce the issue.
Expected Result: What you expected to happen.
Actual Result: What actually happened.
Environment: Details about the environment where the issue occurred (e.g., versions, configurations, network setup).

Example:

Summary: Deployment failure in the CP1 cluster after configuration update

Description: We encountered a deployment failure in the CP1 cluster following a recent update to the application configuration. The issue seems to be related to certificate propagation.

New or Existing Application: Existing
Changes Made: Updated the application configuration to include new certificate details.
Recent Platform Changes: Network configuration updates were made last week.

Steps to Reproduce:

Switch to CP1 cluster context.
Apply the new application configuration.
Attempt to deploy the application.

Expected Result: The application should deploy successfully without any errors.

Actual Result: Deployment fails with a certificate propagation error.

Environment:

Tetrate Service Bridge version: 1.9.1
Kubernetes version: 1.26
CP1 cluster running on AKS
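
If you file tickets often, the format above can be kept as a reusable template. The following Python sketch is one possible way to do that; the `Ticket` class and its field names are hypothetical and simply mirror the field list in this section. It renders plain text that you can paste into the ticket description.

```python
from dataclasses import dataclass


@dataclass
class Ticket:
    summary: str
    description: str
    steps_to_reproduce: list
    expected_result: str
    actual_result: str
    environment: list

    def render(self) -> str:
        """Render the ticket fields as plain text, one section per field."""
        steps = "\n".join(
            f"{n}. {step}" for n, step in enumerate(self.steps_to_reproduce, start=1)
        )
        env = "\n".join(f"- {item}" for item in self.environment)
        sections = [
            f"Summary: {self.summary}",
            f"Description: {self.description}",
            f"Steps to Reproduce:\n{steps}",
            f"Expected Result: {self.expected_result}",
            f"Actual Result: {self.actual_result}",
            f"Environment:\n{env}",
        ]
        return "\n\n".join(sections)


# Values taken from the example above.
ticket = Ticket(
    summary="Deployment failure in the CP1 cluster after configuration update",
    description="Deployment failed after updating the application configuration "
    "to include new certificate details.",
    steps_to_reproduce=[
        "Switch to CP1 cluster context.",
        "Apply the new application configuration.",
        "Attempt to deploy the application.",
    ],
    expected_result="The application should deploy successfully without any errors.",
    actual_result="Deployment fails with a certificate propagation error.",
    environment=["Kubernetes version: 1.26", "CP1 cluster running on AKS"],
)
print(ticket.render())
```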

Include key information from self-troubleshooting

Attach the necessary logs and configurations collected during your self-troubleshooting steps. Ensure these are clearly labeled and organized.

On a best-effort basis, please follow these rules:

Do not attach error codes as images.
When sharing config files, export them as YAML files or use the /code option in Jira.
When attaching logs, share them in txt format for easy review.
When attaching a configuration dump, make sure it is complete and contains the relevant information.
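
As a rough pre-flight check, the file-format rules above can be encoded in a few lines of Python. This is a sketch under assumptions: the function name, the `kind` labels ("error", "config", "log"), and the list of image extensions are all hypothetical, and the check only looks at file extensions, not at file contents or completeness.

```python
from pathlib import Path

# Assumed set of common image extensions; extend as needed.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif", ".bmp"}


def attachment_problems(filename: str, kind: str) -> list:
    """Return a list of rule violations for one attachment.

    kind is one of "error", "config", or "log".
    """
    suffix = Path(filename).suffix.lower()
    if suffix in IMAGE_SUFFIXES:
        return [f"{filename}: share {kind} content as text, not as an image"]
    if kind == "config" and suffix not in {".yaml", ".yml"}:
        return [f"{filename}: export configuration as a YAML file"]
    if kind == "log" and suffix != ".txt":
        return [f"{filename}: share logs in txt format"]
    return []


for name, kind in [("error.png", "error"), ("gateway.yaml", "config"), ("mpc.log", "log")]:
    for problem in attachment_problems(name, kind):
        print(problem)
```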

Benefits of Written Communication in Support Tickets

While we understand the desire for direct calls, written communication offers several advantages that significantly enhance the quality of our service to you:

Comprehensive Context: Written communication ensures that all details are documented, which is crucial if there’s a need to escalate the issue to a call.
Expert Collaboration: Having a written record allows us to easily onboard Subject Matter Experts from various fields to address your requirements more effectively.
Reference Documentation: The ticket serves as an archived reference document for both your organization and ours, providing valuable insights for future issues.
Curated Content: Written tickets allow us to provide more thorough and well-curated responses, leading to more effective solutions.

By following these guidelines and using this format, you can create a comprehensive and well-organized support ticket that facilitates faster issue resolution by providing the Tetrate support team with all the necessary information upfront.
