Minimal diagnostic dump (collect-minimal)

When you are chasing an intermittent 503 or a connectivity failure for one specific service, a full tctl collect of the whole cluster gives you far more data than you need and takes longer to produce, share, and read through.

tctl collect-minimal is a focused alternative: it captures a hostname-scoped, correlated, layered diagnostic dump for a single host on a single cluster. It resolves everything that participates in serving that one hostname — the proxies, the Istio and XCP configuration, the backing Kubernetes objects, the istiod replicas, and the XCP edge — and lays the artifacts out in the order an engineer naturally triages them.

Note: tctl collect-minimal complements, and does not replace, tctl collect. When Tetrate Support asks for a full cluster dump, continue to use tctl collect. Reach for collect-minimal when the problem is already narrowed down to a single hostname.

Prerequisites

tctl installed and configured against the cluster you want to inspect. See the tctl installation and usage guide.
Your current kubecontext pointing at the cluster where the affected gateway or workload runs. collect-minimal reads cluster state through that context, exactly like tctl collect.
The fully qualified hostname of the affected service (for example echo.echo.svc.cluster.local) and the namespace it lives in.

Basic usage

Two flags are required — the hostname to scope the dump to, and that hostname's namespace:

tctl collect-minimal --hostname echo.echo.svc.cluster.local --namespace echo

This resolves the scope, runs every collector, and writes a single timestamped tar.gz archive into the current directory. Attach that archive to your support ticket, or unpack it and read it yourself.

To write the files to a directory instead of an archive (handy for local debugging), add --disable-archive:

tctl collect-minimal --hostname echo.echo.svc.cluster.local --namespace echo --disable-archive
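
If you kept the default archive output and want to read it locally, the following is a minimal sketch of unpacking it and locating the triage summary. The archive file name in the usage comment is hypothetical, since the real name is timestamped, and the function assumes nothing about whether the archive wraps its files in a top-level directory.

```shell
# Unpack a collect-minimal archive and print the path of its triage summary.
# Assumption: the archive is a regular tar.gz. Its exact file name and internal
# top-level layout are not specified here, so the summary is located with find
# rather than a hard-coded path.
unpack_dump() {
  archive=$1
  dest=$2
  mkdir -p "$dest" || return 1
  tar -xzf "$archive" -C "$dest" || return 1
  # Print the first 00-summary.md found under the destination directory.
  find "$dest" -type f -name '00-summary.md' | head -n 1
}

# Usage (hypothetical archive name):
#   summary=$(unpack_dump ./collect-minimal-dump.tar.gz ./dump)
#   less "$summary"
```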

Options

Flag	Description
`--hostname`	Required. Hostname to scope the dump to, e.g. `echo.echo.svc.cluster.local`.
`--namespace`	Required. Namespace of the hostname being scoped to.
`-o`, `--output-directory`	Path to write the collected files under. Defaults to a timestamped directory.
`--disable-archive`	Output a directory of files rather than a `tar.gz` tarball.
`--redact-presets`	Comma-separated redaction presets. The `networking` preset replaces every IPv4/IPv6 address with a consistent hash.
`--redact-regexes`	Comma-separated regexes; any match is replaced with a SHA-256 hash of the matched string.
`--rps-limit`	Requests-per-second limit to the Kubernetes API server (default `50`). Raise it to collect faster on a quiet cluster.

See the tctl collect-minimal command reference for the canonical, auto-generated flag list.
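
As an illustration of combining the options above, here is a small wrapper sketch. It uses only flags listed in the table; the function name, the output path, and the rate-limit value of 100 are example choices, and how --output-directory and --disable-archive interact should be confirmed against the command reference.

```shell
# Wrapper that collects one hostname into a chosen directory as plain files.
# Only flags from the options table are used; the --rps-limit value is an
# example, not a recommendation.
collect_host() {
  host=$1
  ns=$2
  out=$3
  tctl collect-minimal \
    --hostname "$host" \
    --namespace "$ns" \
    --output-directory "$out" \
    --disable-archive \
    --rps-limit 100
}

# Usage:
#   collect_host echo.echo.svc.cluster.local echo ./echo-dump
```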

What gets collected

The command first resolves the scope of the hostname: it works out whether this cluster is acting as a tier-1 or tier-2 for the host, then finds the service, its backing pods, the gateway pods, the istiod replicas that program those proxies, and the XCP edge. It then runs a set of collectors and organises their output into six fixed, numbered layers plus two metadata files. The numbering follows the triage order: proxy (Envoy) → config (Istio, then XCP) → Kubernetes state → control plane (istiod) → mesh control plane (XCP edge).

Path	Contents
`00-summary.md`	Human-readable triage guide — start here (see below).
`manifest.json`	Machine-readable record of the resolved scope and per-collector status.
`01-envoy/`	Envoy proxy artifacts for the host's gateway, backing, and client pods: `stats.txt`, `clusters.json`, and `config_dump.json`. The authoritative artifact for the classic `503`.
`02-istio-config/`	The raw Istio CRs that shape routing for this host (VirtualService, DestinationRule, Gateway, ServiceEntry, …) plus the relevant `IstioOperator` CRs.
`03-xcp-config/`	The XCP config CRs (`*.xcp.tetrate.io`) — the Workspace and Group containers plus the XCP gateway/routing intent serving this hostname.
`04-k8s/`	The backing Kubernetes objects (Service, Endpoints, pods, …).
`05-istiod/`	The istiod replicas that program this host's proxies, filtered to the `istio.io/rev` revisions actually in use. Each replica is grouped under a self-contained `<rev>-<pod>/` directory with its logs, debug endpoints, owning Deployment, fronting Service, and served Envoy config dumps.
`06-xcp-edge/`	XCP edge debug endpoints, including `debug-gateways.json` (does the edge know this host?) and `debug-appliedconfigz.json` (the edge's live applied config).

Collectors are best-effort: if one fails, it records the error in the manifest and the run continues. A partial dump is still useful.
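
Because a dump can be partial, it may help to see at a glance which layers made it into an unpacked dump. The sketch below checks only the file and directory names from the table above; it does not parse manifest.json, whose schema is not described here. An empty layer is not necessarily an error (for example, a host with no matching XCP config), so treat the output as a prompt to look at the manifest.

```shell
# Report which documented layers are missing or empty in an unpacked dump.
# The names come from the layout table; only the filesystem is inspected.
check_layers() {
  root=$1
  status=0
  for f in 00-summary.md manifest.json; do
    [ -s "$root/$f" ] || { echo "missing or empty file: $f"; status=1; }
  done
  for d in 01-envoy 02-istio-config 03-xcp-config 04-k8s 05-istiod 06-xcp-edge; do
    if [ ! -d "$root/$d" ]; then
      echo "missing layer: $d"
      status=1
    elif [ -z "$(ls -A "$root/$d")" ]; then
      echo "empty layer: $d"
      status=1
    fi
  done
  return $status
}

# Usage:
#   check_layers ./dump || echo "dump is partial; check manifest.json"
```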

How to read the dump

Open 00-summary.md first. It is written top-to-bottom as a triage guide and contains:

The resolved scope — hostname, namespace, service name, and the detected cluster role (tier-1 / tier-2) with the reason it was chosen.
What was found — counts and names of backing pods, gateway pods, istiod pods, XCP edge pods, and the matched Istio and XCP CRs.
Revisions — which istiod revisions were captured, and which replicas were skipped because their revision is not in use on this hostname's proxies. This is a safety valve: if the proxy you are debugging looks like it is programmed by a skipped revision, re-run after relabelling the pods.
Gold-thread hints — concrete pointers into the layers, for example:
1. 01-envoy/<pod>/stats.txt — a non-zero no_cluster_found points at a missing route/cluster (the classic 503); upstream_cx_connect_fail / upstream_cx_none_healthy point at an L4/TCP upstream failure.
2. 01-envoy/<pod>/clusters.json — an absent destination cluster, or endpoints with health_flags set, explains UH/UF.
3. 01-envoy/<pod>/config_dump.json — route config and listener filter chains.
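
The first hint above can be checked mechanically. This is a rough sketch that assumes stats.txt is in Envoy's plain-text stats format, one `<stat name>: <value>` pair per line; the function name is illustrative. It prints only the counters named in the hint whose value is greater than zero.

```shell
# Print the non-zero failure counters named in the hints from one stats.txt.
# Assumption: each line looks like "<stat name>: <value>".
nonzero_failure_stats() {
  grep -E '(no_cluster_found|upstream_cx_connect_fail|upstream_cx_none_healthy): ' "$1" |
    awk -F': ' '$NF + 0 > 0'
}

# Usage: scan every pod captured in the Envoy layer of an unpacked dump.
#   for f in ./dump/01-envoy/*/stats.txt; do
#     echo "== $f"
#     nonzero_failure_stats "$f"
#   done
```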

Tier-1 cross-cluster note

When the resolved cluster role is tier-1, the summary includes a cross-cluster note. A tier-1 gateway rewrites the HTTP authority, so the hostname you queried is not necessarily the hostname serving traffic on the downstream cluster. Confirm the real tier-2 hostname in the tier-1 pod's config_dump.json (look at route.rewrite.authority and the outbound|... cluster name) before re-running collect-minimal against the tier-2 cluster. TCP/TLS-passthrough is SNI-routed and is not rewritten.
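
To find candidate tier-2 hostnames quickly, you can list the outbound cluster names that a tier-1 pod's config_dump.json references. The sketch below scans the file as text rather than parsing it, and assumes cluster names appear as JSON strings beginning with `outbound|`, as in Istio-generated Envoy config, where the hostname is the last `|`-separated field. It does not inspect the authority rewrite itself, so confirm that in the route config.

```shell
# List the distinct outbound cluster names referenced in a config_dump.json.
# Assumption: names appear as JSON strings starting with "outbound|".
outbound_clusters() {
  grep -o '"outbound|[^"]*"' "$1" | tr -d '"' | LC_ALL=C sort -u
}

# Usage (the pod directory name is a placeholder):
#   outbound_clusters ./dump/01-envoy/<pod>/config_dump.json
```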

Sharing dumps safely

collect-minimal runs through the same redaction pipeline as tctl collect. To obfuscate sensitive data before attaching a dump to a ticket, use the redaction flags:

# Hash every IP address consistently
tctl collect-minimal --hostname echo.echo.svc.cluster.local --namespace echo \
  --redact-presets networking

# Hash anything matching your own patterns
tctl collect-minimal --hostname echo.echo.svc.cluster.local --namespace echo \
  --redact-regexes "<regex-one>,<regex-two>"

The manifest.json and 00-summary.md metadata files carry only hostnames, pod names, and per-collector status — never secret material.
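
Before attaching a dump collected with --redact-presets networking, a quick scan of the unpacked files can confirm that nothing shaped like an IPv4 address remains. This is a rough sketch, not part of tctl: the pattern also matches dotted version strings, and IPv6 addresses are not covered.

```shell
# List files under a directory that still contain something shaped like an
# IPv4 address. Exits non-zero when no file matches (grep's behaviour).
files_with_ipv4() {
  grep -rlE '(^|[^0-9.])([0-9]{1,3}\.){3}[0-9]{1,3}($|[^0-9.])' "$1"
}

# Usage:
#   files_with_ipv4 ./dump || echo "no IPv4-shaped strings found"
```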

Current limitations and roadmap

collect-minimal is being delivered iteratively. The current release captures a single point-in-time snapshot for one hostname on the cluster your kubecontext points at. The following enhancements are planned for future releases:

Windowed collection (--until <duration>) — capture START and END snapshots in a single dump, with mid-window proxy-restart detection, so you can bracket an intermittent failure.
Streamed log capture (--stream) — windowed log streaming with stratified per-second sampling: keep every failure, sample the healthy baseline, and aggregate storms to bound volume without losing data to log rotation.
Reversible proxy diagnostics (tctl debug proxy-stats) — arm the gateways serving a hostname with a connectivity-failure Envoy proxyStatsMatcher profile, fully revertible.
Access-log toggling (tctl debug access-log) — reconfigure access logging for a hostname/namespace, with a revert option.
Richer control-plane capture — istiod deployment and logs for the exact revision the data plane uses, alongside IstioOperator and EdgeXcp CRs.
Cross-cluster client discovery — today client-side envoy logs and data are captured only when the client lives in the same cluster as the first proxy it hits; this will be extended across clusters.
PCAP capture — north/south-bound packet capture.
