Key Metrics
Tetrate Service Bridge collects a large number of metrics. This page is generated from dashboards run internally at Tetrate and will be updated periodically based on best practices learned from operational experience at Tetrate and from user deployments. Each heading represents a different dashboard, and each sub-heading is a panel on that dashboard. For this reason, some metrics appear multiple times.
GitOps Operational Status
Operational metrics to indicate Cluster GitOps health
GitOps Status
Shows the status of the GitOps component for each cluster.
Metric Name | Labels | PromQL Expression |
---|---|---|
gitops_enabled | N/A | gitops_enabled |
Accepted Admission Requests
Accepted admission requests for each cluster. This is the rate at which operations are processed by the GitOps relay and sent to TSB.
Metric Name | Labels | PromQL Expression |
---|---|---|
gitops_admission_count | allowed | sum(rate(gitops_admission_count{allowed="true"}[1h])) by (cluster_name) |
Rejected Admission Requests
Rejected admission requests for each cluster. This is the rate at which operations processed by the GitOps relay are rejected.
A spike in this metric may indicate an increase in invalid TSB resources being applied to the Kubernetes clusters, or errors in the admission webhook processing.
Metric Name | Labels | PromQL Expression |
---|---|---|
gitops_admission_count | allowed | sum(rate(gitops_admission_count{allowed="false"}[1h])) by (cluster_name) |
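If you scrape these metrics with Prometheus, a minimal alerting rule on persistent rejections might look like the following sketch. The group name, alert name, severity, and 15-minute `for` window are illustrative assumptions; the expression is the panel's own.

```yaml
groups:
  - name: tsb-gitops  # hypothetical rule group name
    rules:
      - alert: GitOpsAdmissionRejections
        # Reuses the panel expression above; fires only if rejections
        # persist for 15 minutes in a given cluster.
        expr: sum(rate(gitops_admission_count{allowed="false"}[1h])) by (cluster_name) > 0
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "GitOps admission requests are being rejected in {{ $labels.cluster_name }}"
```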
Admission Review Latency
Admission review latency percentiles grouped by cluster.
The GitOps admission reviews make decisions by forwarding the objects to the Management Plane. This metric helps understand the time it takes to make such decisions.
A spike here may indicate network issues or connectivity issues between the Control Plane and the Management Plane.
Metric Name | Labels | PromQL Expression |
---|---|---|
gitops_admission_duration_bucket | N/A | histogram_quantile(0.99, sum(rate(gitops_admission_duration_bucket[1h])) by (cluster_name, le)) |
gitops_admission_duration_bucket | N/A | histogram_quantile(0.95, sum(rate(gitops_admission_duration_bucket[1h])) by (cluster_name, le)) |
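To catch the latency spikes described above, you could alert on the same p99 expression. This sketch assumes the `gitops_admission_duration` histogram is recorded in seconds; the 5-second threshold is an illustrative starting point, not a TSB recommendation. The rule slots into a group like the one shown earlier.

```yaml
- alert: GitOpsAdmissionLatencyHigh
  # Assumes the duration histogram is in seconds; tune the threshold to
  # your observed baseline between the Control Plane and Management Plane.
  expr: |
    histogram_quantile(0.99,
      sum(rate(gitops_admission_duration_bucket[1h])) by (cluster_name, le)) > 5
  for: 15m
  labels:
    severity: warning
```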
Resources Pushed to TSB
Number of resources pushed to the Management Plane.
In most cases this should match the accepted admission requests, but it also accounts for object pushes made by the background reconcile processes.
Metric Name | Labels | PromQL Expression |
---|---|---|
gitops_push_count | success | sum(rate(gitops_push_count{success="true"}[1h])) by (cluster_name) |
Failed Pushes to TSB
Number of resource pushes to the Management Plane that failed.
In most cases this should match the rejected admission requests, but it also accounts for object pushes made by the background reconcile processes.
Metric Name | Labels | PromQL Expression |
---|---|---|
gitops_push_count | success | sum(rate(gitops_push_count{success="false"}[1h])) by (cluster_name) |
Resource Conversions
Number of Kubernetes resources that have been read from the cluster and successfully converted into TSB objects to be pushed to the Management Plane.
The values for this metric should match the Resources Pushed to TSB. A difference between them likely indicates an issue converting the Kubernetes objects to TSB objects.
Metric Name | Labels | PromQL Expression |
---|---|---|
gitops_convert_count | success | sum(rate(gitops_convert_count{success="true"}[1h])) by (cluster_name) |
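One way to watch for that divergence directly is a recording rule that subtracts the push rate from the conversion rate. The rule name below is a hypothetical choice; a persistently positive value points at conversion or push problems.

```yaml
- record: cluster:gitops_convert_push_gap:rate1h
  # Positive values mean resources were converted but not successfully
  # pushed to the Management Plane.
  expr: |
    sum(rate(gitops_convert_count{success="true"}[1h])) by (cluster_name)
      - sum(rate(gitops_push_count{success="true"}[1h])) by (cluster_name)
```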
Resource Conversion Errors
Number of Kubernetes resources that have been read from the cluster and failed to be converted into TSB objects.
A spike in this metric indicates that Kubernetes objects could not be converted to TSB objects and that those resources were not sent to the Management Plane.
Metric Name | Labels | PromQL Expression |
---|---|---|
gitops_convert_count | success | sum(rate(gitops_convert_count{success="false"}[1h])) by (cluster_name) |
Global Configuration Distribution
These metrics indicate the overall health of Tetrate Service Bridge and should be considered the starting point for any investigation into issues with Tetrate Service Bridge.
Connected Clusters
This details all clusters connected to and receiving configuration from the management plane.
If this number drops below 1, or a given cluster does not appear in this table, the cluster is disconnected. This may happen briefly during upgrades and re-deploys.
Metric Name | Labels | PromQL Expression |
---|---|---|
xcp_central_current_edge_connections | N/A | xcp_central_current_edge_connections |
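Because brief disconnects are expected during upgrades and re-deploys, an alert on this panel should require the condition to persist. A minimal sketch, where the 10-minute `for` window and severity are illustrative assumptions:

```yaml
- alert: ClusterDisconnected
  # Fires when no edge clusters are connected to the management plane
  # for 10 minutes, tolerating brief upgrade/re-deploy blips.
  expr: xcp_central_current_edge_connections < 1
  for: 10m
  labels:
    severity: critical
```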
TSB Error Rate (Humans)
Rate of failed requests to the TSB apiserver from the UI and CLI.
Metric Name | Labels | PromQL Expression |
---|---|---|
grpc_server_handled_total | component grpc_code grpc_method grpc_type | sum(rate(grpc_server_handled_total{component="tsb", grpc_code!="OK", grpc_type="unary", grpc_method!="SendAuditLog"}[1m])) by (grpc_code) OR on() vector(0) |
Istio-Envoy Sync Time (99th Percentile)
Once XCP has synced with the management plane it creates resources for Istio to configure Envoy. Istio usually distributes these within a second.
If this number starts to exceed 10 seconds then you may need to scale out istiod. In small clusters, this number may be too small to be captured by the histogram buckets and can therefore appear as nil.
Metric Name | Labels | PromQL Expression |
---|---|---|
pilot_proxy_convergence_time_bucket | N/A | histogram_quantile(0.99, sum(rate(pilot_proxy_convergence_time_bucket[1m])) by (le, cluster_name)) |
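The 10-second scale-out threshold mentioned above translates directly into an alert. This sketch assumes `pilot_proxy_convergence_time` is recorded in seconds; the 10-minute `for` window is an illustrative choice:

```yaml
- alert: EnvoySyncTimeSlow
  # p99 proxy convergence above the 10s scale-out threshold suggested above.
  expr: |
    histogram_quantile(0.99,
      sum(rate(pilot_proxy_convergence_time_bucket[1m])) by (le, cluster_name)) > 10
  for: 10m
  labels:
    severity: warning
```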
XCP central -> edge Sync Time (99th Percentile)
The MPC component translates TSB configuration into XCP objects. XCP central then sends these objects to every edge connected to it.
This is the time taken, in milliseconds, for XCP central to send the configs to the edges.
Metric Name | Labels | PromQL Expression |
---|---|---|
xcp_central_config_propagation_time_ms_bucket | N/A | histogram_quantile(0.99, sum(rate(xcp_central_config_propagation_time_ms_bucket[1m])) by (le, edge)) |
Istiod Errors
Rate of istiod errors broken down by cluster. This graph helps identify clusters that may be experiencing problems. Typically, there should be no errors. Any non-transient errors should be investigated.
Sometimes this graph will show "No data" or these metrics won't exist. This is because istiod only emits these metrics when errors occur.
Metric Name | Labels | PromQL Expression |
---|---|---|
pilot_total_xds_internal_errors | N/A | sum(rate(pilot_xds_write_timeout[1m])) by (cluster_name) + sum(rate(pilot_total_xds_internal_errors[1m])) by (cluster_name) + sum(rate(pilot_total_xds_rejects[1m])) by (cluster_name) + sum(rate(pilot_xds_expired_nonce[1m])) by (cluster_name) + sum(rate(pilot_xds_push_context_errors[1m])) by (cluster_name) + sum(rate(pilot_xds_pushes{type=~".*_senderr"}[1m])) by (cluster_name) OR on() vector(0) |
pilot_total_xds_rejects | N/A | sum(rate(pilot_xds_write_timeout[1m])) by (cluster_name) + sum(rate(pilot_total_xds_internal_errors[1m])) by (cluster_name) + sum(rate(pilot_total_xds_rejects[1m])) by (cluster_name) + sum(rate(pilot_xds_expired_nonce[1m])) by (cluster_name) + sum(rate(pilot_xds_push_context_errors[1m])) by (cluster_name) + sum(rate(pilot_xds_pushes{type=~".*_senderr"}[1m])) by (cluster_name) OR on() vector(0) |
pilot_xds_expired_nonce | N/A | sum(rate(pilot_xds_write_timeout[1m])) by (cluster_name) + sum(rate(pilot_total_xds_internal_errors[1m])) by (cluster_name) + sum(rate(pilot_total_xds_rejects[1m])) by (cluster_name) + sum(rate(pilot_xds_expired_nonce[1m])) by (cluster_name) + sum(rate(pilot_xds_push_context_errors[1m])) by (cluster_name) + sum(rate(pilot_xds_pushes{type=~".*_senderr"}[1m])) by (cluster_name) OR on() vector(0) |
pilot_xds_push_context_errors | N/A | sum(rate(pilot_xds_write_timeout[1m])) by (cluster_name) + sum(rate(pilot_total_xds_internal_errors[1m])) by (cluster_name) + sum(rate(pilot_total_xds_rejects[1m])) by (cluster_name) + sum(rate(pilot_xds_expired_nonce[1m])) by (cluster_name) + sum(rate(pilot_xds_push_context_errors[1m])) by (cluster_name) + sum(rate(pilot_xds_pushes{type=~".*_senderr"}[1m])) by (cluster_name) OR on() vector(0) |
pilot_xds_pushes | type | sum(rate(pilot_xds_write_timeout[1m])) by (cluster_name) + sum(rate(pilot_total_xds_internal_errors[1m])) by (cluster_name) + sum(rate(pilot_total_xds_rejects[1m])) by (cluster_name) + sum(rate(pilot_xds_expired_nonce[1m])) by (cluster_name) + sum(rate(pilot_xds_push_context_errors[1m])) by (cluster_name) + sum(rate(pilot_xds_pushes{type=~".*_senderr"}[1m])) by (cluster_name) OR on() vector(0) |
pilot_xds_write_timeout | N/A | sum(rate(pilot_xds_write_timeout[1m])) by (cluster_name) + sum(rate(pilot_total_xds_internal_errors[1m])) by (cluster_name) + sum(rate(pilot_total_xds_rejects[1m])) by (cluster_name) + sum(rate(pilot_xds_expired_nonce[1m])) by (cluster_name) + sum(rate(pilot_xds_push_context_errors[1m])) by (cluster_name) + sum(rate(pilot_xds_pushes{type=~".*_senderr"}[1m])) by (cluster_name) OR on() vector(0) |
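Since transient errors are expected and these metrics only appear once errors occur, an alert here should require persistence. A sketch reusing the panel's combined expression, with a 10-minute `for` window as an illustrative assumption:

```yaml
- alert: IstiodErrors
  # Combined istiod error rate per cluster; `for` filters transient errors.
  # Note: if any constituent metric has never been emitted, the sum is
  # empty and the alert stays silent, matching the "No data" panel state.
  expr: |
    sum(rate(pilot_xds_write_timeout[1m])) by (cluster_name)
      + sum(rate(pilot_total_xds_internal_errors[1m])) by (cluster_name)
      + sum(rate(pilot_total_xds_rejects[1m])) by (cluster_name)
      + sum(rate(pilot_xds_expired_nonce[1m])) by (cluster_name)
      + sum(rate(pilot_xds_push_context_errors[1m])) by (cluster_name)
      + sum(rate(pilot_xds_pushes{type=~".*_senderr"}[1m])) by (cluster_name)
      > 0
  for: 10m
  labels:
    severity: warning
```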
Istio Operational Status
Operational metrics for istiod health.
Connected Envoys
Count of Envoys connected to istiod. This should represent the total number of endpoints in the selected cluster.
If this number significantly decreases for longer than 5 minutes without an obvious reason (e.g. a scale-down event) then you should investigate. This may indicate that Envoys have been disconnected from istiod and are unable to reconnect.
Metric Name | Labels | PromQL Expression |
---|---|---|
pilot_xds | cluster_name | sum(pilot_xds{cluster_name="$cluster"}) |
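To encode the "5 minutes without an obvious reason" guidance, one option is to compare the current count against a recent baseline. The 20% drop threshold and one-hour baseline are illustrative assumptions; tune them to your scaling patterns:

```yaml
- alert: ConnectedEnvoysDropped
  # Fires when a cluster's connected-proxy count falls more than 20% below
  # its value one hour earlier and stays there for 5 minutes.
  expr: |
    sum(pilot_xds) by (cluster_name)
      < 0.8 * sum(pilot_xds offset 1h) by (cluster_name)
  for: 5m
  labels:
    severity: warning
```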
Total Error Rate
The total error rate for Istio when configuring Envoy, including generation and transport errors.
Any errors (current and historic) should be investigated using the more detailed split below.
Metric Name | Labels | PromQL Expression |
---|---|---|
pilot_total_xds_internal_errors | cluster_name | sum(rate(pilot_xds_write_timeout{cluster_name="$cluster"}[1m])) + sum(rate(pilot_total_xds_internal_errors{cluster_name="$cluster"}[1m])) + sum(rate(pilot_total_xds_rejects{cluster_name="$cluster"}[1m])) + sum(rate(pilot_xds_expired_nonce{cluster_name="$cluster"}[1m])) + sum(rate(pilot_xds_push_context_errors{cluster_name="$cluster"}[1m])) + sum(rate(pilot_xds_pushes{cluster_name="$cluster", type=~".*_senderr"}[1m])) OR on() vector(0) |
pilot_total_xds_rejects | cluster_name | sum(rate(pilot_xds_write_timeout{cluster_name="$cluster"}[1m])) + sum(rate(pilot_total_xds_internal_errors{cluster_name="$cluster"}[1m])) + sum(rate(pilot_total_xds_rejects{cluster_name="$cluster"}[1m])) + sum(rate(pilot_xds_expired_nonce{cluster_name="$cluster"}[1m])) + sum(rate(pilot_xds_push_context_errors{cluster_name="$cluster"}[1m])) + sum(rate(pilot_xds_pushes{cluster_name="$cluster", type=~".*_senderr"}[1m])) OR on() vector(0) |
pilot_xds_expired_nonce | cluster_name | sum(rate(pilot_xds_write_timeout{cluster_name="$cluster"}[1m])) + sum(rate(pilot_total_xds_internal_errors{cluster_name="$cluster"}[1m])) + sum(rate(pilot_total_xds_rejects{cluster_name="$cluster"}[1m])) + sum(rate(pilot_xds_expired_nonce{cluster_name="$cluster"}[1m])) + sum(rate(pilot_xds_push_context_errors{cluster_name="$cluster"}[1m])) + sum(rate(pilot_xds_pushes{cluster_name="$cluster", type=~".*_senderr"}[1m])) OR on() vector(0) |
pilot_xds_push_context_errors | cluster_name | sum(rate(pilot_xds_write_timeout{cluster_name="$cluster"}[1m])) + sum(rate(pilot_total_xds_internal_errors{cluster_name="$cluster"}[1m])) + sum(rate(pilot_total_xds_rejects{cluster_name="$cluster"}[1m])) + sum(rate(pilot_xds_expired_nonce{cluster_name="$cluster"}[1m])) + sum(rate(pilot_xds_push_context_errors{cluster_name="$cluster"}[1m])) + sum(rate(pilot_xds_pushes{cluster_name="$cluster", type=~".*_senderr"}[1m])) OR on() vector(0) |
pilot_xds_pushes | cluster_name type | sum(rate(pilot_xds_write_timeout{cluster_name="$cluster"}[1m])) + sum(rate(pilot_total_xds_internal_errors{cluster_name="$cluster"}[1m])) + sum(rate(pilot_total_xds_rejects{cluster_name="$cluster"}[1m])) + sum(rate(pilot_xds_expired_nonce{cluster_name="$cluster"}[1m])) + sum(rate(pilot_xds_push_context_errors{cluster_name="$cluster"}[1m])) + sum(rate(pilot_xds_pushes{cluster_name="$cluster", type=~".*_senderr"}[1m])) OR on() vector(0) |
pilot_xds_write_timeout | cluster_name | sum(rate(pilot_xds_write_timeout{cluster_name="$cluster"}[1m])) + sum(rate(pilot_total_xds_internal_errors{cluster_name="$cluster"}[1m])) + sum(rate(pilot_total_xds_rejects{cluster_name="$cluster"}[1m])) + sum(rate(pilot_xds_expired_nonce{cluster_name="$cluster"}[1m])) + sum(rate(pilot_xds_push_context_errors{cluster_name="$cluster"}[1m])) + sum(rate(pilot_xds_pushes{cluster_name="$cluster", type=~".*_senderr"}[1m])) OR on() vector(0) |
Median Proxy Convergence Time
The median (50th percentile) delay between istiod receiving configuration changes and the proxy receiving all required configuration in the selected cluster. This number indicates how stale the proxy configuration is. As this number increases, it may start to impact application traffic.
This number is typically in the hundreds of milliseconds. In small clusters, this number may be zero.
If this number creeps up to 30s for an extended period, istiod likely needs to be scaled out (or up).
Metric Name | Labels | PromQL Expression |
---|---|---|
pilot_proxy_convergence_time_bucket | cluster_name | histogram_quantile(0.5, sum(rate(pilot_proxy_convergence_time_bucket{cluster_name="$cluster"}[1m])) by (le)) |
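The 30-second guidance above can be alerted on directly. This sketch drops the dashboard's `$cluster` variable so the rule covers all clusters; the 15-minute `for` window is an illustrative assumption:

```yaml
- alert: ProxyConfigurationStale
  # Median convergence time at the 30s scale-out threshold noted above.
  expr: |
    histogram_quantile(0.5,
      sum(rate(pilot_proxy_convergence_time_bucket[1m])) by (le, cluster_name)) > 30
  for: 15m
  labels:
    severity: warning
```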
Istiod Push Rate
The rate of istiod pushes to Envoy grouped by discovery service. Istiod pushes clusters (CDS), endpoints (EDS), listeners (LDS) or routes (RDS) any time it receives a configuration change.
Changes are triggered by a user interacting with TSB or by a change in infrastructure, such as the creation of a new endpoint (service instance/pod).
In small relatively static clusters these values can be zero most of the time.
Metric Name | Labels | PromQL Expression |
---|---|---|
pilot_xds_pushes | cluster_name type | sum(irate(pilot_xds_pushes{cluster_name="$cluster", type=~"cds\|eds\|rds\|lds"}[1m])) by (type) |
Istiod Error Rate
The different error rates for Istio during general operations, including the generation and distribution of Envoy configuration.
pilot_xds_write_timeout
Rate of connection timeouts between Envoy and istiod. This number indicates that an Envoy has taken too long to acknowledge a configuration change from Istio. An increase in these errors typically indicates network issues, Envoy resource limits, or istiod resource limits (usually CPU).
pilot_total_xds_internal_errors
Rate of errors thrown inside istiod whilst generating Envoy configuration. Check the istiod logs for more details if you see internal errors.
pilot_total_xds_rejects
Rate of rejected configuration from Envoy. Istio should never produce invalid Envoy configuration, so any errors here warrant investigation, starting with the istiod logs.
pilot_xds_expired_nonce
Rate of expired nonces from Envoys. This number indicates that an Envoy has responded to the wrong request sent from Istio. An increase in these errors typically indicates network issues (saturation or partition), Envoy resource limits, or istiod resource limits (usually CPU).
pilot_xds_push_context_errors
Rate of errors establishing a connection with an Envoy instance. An increase in these errors typically indicates network issues (saturation or partition), Envoy resource limits, or istiod resource limits (usually CPU). Check istiod logs for further details.
pilot_xds_pushes
Rate of transport errors sending configuration to Envoy. An increase in these errors typically indicates network issues (saturation or partition), Envoy resource limits, or istiod resource limits (usually CPU).
Metric Name | Labels | PromQL Expression |
---|---|---|
pilot_total_xds_internal_errors | cluster_name | sum(rate(pilot_total_xds_internal_errors{cluster_name="$cluster"}[1m])) |
pilot_total_xds_rejects | cluster_name | sum(rate(pilot_total_xds_rejects{cluster_name="$cluster"}[1m])) |
pilot_xds_expired_nonce | cluster_name | sum(rate(pilot_xds_expired_nonce{cluster_name="$cluster"}[1m])) |
pilot_xds_push_context_errors | cluster_name | sum(rate(pilot_xds_push_context_errors{cluster_name="$cluster"}[1m])) |
pilot_xds_pushes | cluster_name type | sum(rate(pilot_xds_pushes{cluster_name="$cluster", type=~".*_senderr"}[1m])) by (type) |
pilot_xds_write_timeout | cluster_name | sum(rate(pilot_xds_write_timeout{cluster_name="$cluster"}[1m])) |
Proxy Convergence Time
The delay between istiod receiving configuration changes and a proxy receiving all required configuration in the cluster, broken down by percentiles.
This number indicates how stale the proxy configuration is. As this number increases it may start to affect application traffic.
This number is typically in the hundreds of milliseconds. If it creeps up to 30s for an extended period of time, istiod likely needs to be scaled out (or up), as it is probably pinned against its CPU limits.
Metric Name | Labels | PromQL Expression |
---|---|---|
pilot_proxy_convergence_time_bucket | cluster_name | histogram_quantile(0.5, sum(rate(pilot_proxy_convergence_time_bucket{cluster_name="$cluster"}[1m])) by (le)) |
pilot_proxy_convergence_time_bucket | cluster_name | histogram_quantile(0.90, sum(rate(pilot_proxy_convergence_time_bucket{cluster_name="$cluster"}[1m])) by (le)) |
pilot_proxy_convergence_time_bucket | cluster_name | histogram_quantile(0.99, sum(rate(pilot_proxy_convergence_time_bucket{cluster_name="$cluster"}[1m])) by (le)) |
pilot_proxy_convergence_time_bucket | cluster_name | histogram_quantile(0.999, sum(rate(pilot_proxy_convergence_time_bucket{cluster_name="$cluster"}[1m])) by (le)) |
Configuration Validation
Success and failure rate of Istio configuration validation requests. This is triggered when TSB configuration is created or updated.
Any failures here should be investigated in the istiod and edge logs.
If there are TSB configuration changes being made that affect the selected cluster and the success number is zero, there is an issue with configuration propagation. Check the XCP edge logs to debug further.
Metric Name | Labels | PromQL Expression |
---|---|---|
galley_validation_failed | cluster_name | sum(rate(galley_validation_failed{cluster_name="$cluster"}[1m])) |
galley_validation_passed | cluster_name | sum(rate(galley_validation_passed{cluster_name="$cluster"}[1m])) |
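Since any validation failure warrants investigation, a simple persistence-gated alert works here; the 5-minute `for` window is an illustrative assumption:

```yaml
- alert: IstioValidationFailures
  # Any sustained validation failures should be investigated in the
  # istiod and edge logs.
  expr: sum(rate(galley_validation_failed[1m])) by (cluster_name) > 0
  for: 5m
  labels:
    severity: warning
```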
Sidecar Injection
Rate of sidecar injection requests. Sidecar injection is triggered whenever a new instance/pod is created.
Any errors displayed here should be investigated further by checking the istiod logs.
Metric Name | Labels | PromQL Expression |
---|---|---|
sidecar_injection_failure_total | cluster_name | sum(rate(sidecar_injection_failure_total{cluster_name="$cluster"}[1m])) |
sidecar_injection_success_total | cluster_name | sum(rate(sidecar_injection_success_total{cluster_name="$cluster"}[1m])) |
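Rather than alerting on any single failure, you could alert on the failure ratio across both metrics; the 5% threshold and 10-minute window are illustrative assumptions:

```yaml
- alert: SidecarInjectionFailures
  # Fraction of injection requests that fail; investigate istiod logs
  # if this fires.
  expr: |
    sum(rate(sidecar_injection_failure_total[5m])) by (cluster_name)
      / (sum(rate(sidecar_injection_failure_total[5m])) by (cluster_name)
         + sum(rate(sidecar_injection_success_total[5m])) by (cluster_name))
      > 0.05
  for: 10m
  labels:
    severity: warning
```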
MPC Operational Status
Operational metrics to indicate Management Plane Controller (MPC) health.
Received configs
The number of resources sent from TSB to MPC.
This metric shows the number of objects that are created, updated, and deleted as part of a configuration push from TSB to MPC.
This metric can be used together with the XCP push operations and push duration to understand how the number of resources being pushed to XCP affects the time it takes for the entire configuration push operation to complete.
Metric Name | Labels | PromQL Expression |
---|---|---|
mpc_tsb_config_received_count | resource | mpc_tsb_config_received_count{resource=""} |
Config Processing duration
Time it takes to process an entire config set, broken down into the time spent pre-processing the configurations, converting them to XCP, and pushing them to the Kubernetes cluster.
Metric Name | Labels | PromQL Expression |
---|---|---|
mpc_config_conversion_time | N/A | mpc_config_conversion_time or on() vector(0) |
mpc_config_pre_process_time | N/A | mpc_config_pre_process_time or on() vector(0) |
mpc_config_total_process_time | error | mpc_config_total_process_time{error=""} or on() vector(0) |
mpc_xcp_config_push_time | error | mpc_xcp_config_push_time{error=""} or on() vector(0) |
Received configs by type
Configuration updates received from TSB are processed by MPC and translated into XCP resources. This metric shows the number of objects of each type that MPC will convert.
Metric Name | Labels | PromQL Expression |
---|---|---|
mpc_tsb_config_received_count | resource | mpc_tsb_config_received_count{resource!=""} |
Conversion Time every 5m
Time it takes to convert TSB resources to the XCP APIs.
Metric Name | Labels | PromQL Expression |
---|---|---|
mpc_xcp_conversion_duration_bucket | N/A | histogram_quantile(0.99, sum(rate(mpc_xcp_conversion_duration_bucket[5m])) by (le, resource)) |
MPC to XCP pushed configs
The number of resources that are pushed to XCP.
This metric shows the number of objects that are created, updated, and deleted as part of a configuration push from MPC to XCP. It also shows how many fetch calls are made to the Kubernetes API server.
This metric can be used together with the TSB to MPC received configs and the XCP push operations and push duration to understand how the number of resources being pushed to XCP affects the time it takes for the entire configuration push operation to complete.
Metric Name | Labels | PromQL Expression |
---|---|---|
mpc_xcp_config_create_ops | N/A | sum(mpc_xcp_config_create_ops) |
mpc_xcp_config_delete_ops | N/A | sum(mpc_xcp_config_delete_ops) |
mpc_xcp_config_fetch_ops | N/A | sum(mpc_xcp_config_fetch_ops) |
mpc_xcp_config_update_ops | N/A | sum(mpc_xcp_config_update_ops) |
Updates from TSB every 5m
Configuration and onboarded cluster messages received from TSB.
The number of update messages may increase or decrease based on the time it takes for MPC to fully process the messages. The more time it takes to process, the less frequently config updates will be retrieved.
Metric Name | Labels | PromQL Expression |
---|---|---|
grpc_client_handled_total | component grpc_code grpc_method | sum(increase(grpc_client_handled_total{grpc_code="OK"}[5m])) or on() vector(0) |
grpc_client_handled_total | component grpc_code grpc_method | sum(increase(grpc_client_handled_total{grpc_code="OK"}[5m])) or on() vector(0) |
grpc_client_handled_total | component grpc_code grpc_method | sum(increase(grpc_client_handled_total{grpc_code!="OK"}[5m])) or on() vector(0) |
grpc_client_handled_total | component grpc_code grpc_method | sum(increase(grpc_client_handled_total{grpc_code!="OK"}[5m])) or on() vector(0) |
Conversions by Resource every 5m
Conversions by resource executed in a time period. This can be used to understand the throughput of the MPC conversions.
Metric Name | Labels | PromQL Expression |
---|---|---|
mpc_xcp_conversion_duration_sum | N/A | sum(rate(mpc_xcp_conversion_duration_sum[5m])) by (resource) |
MPC to XCP pushed config errors
The number of resources that failed while pushing to XCP.
This metric shows the number of objects that fail to be created, updated, or deleted as part of a configuration push from MPC to XCP. It also shows the number of failed fetch calls to the Kubernetes API server.
This metric can be used together with the MPC to XCP pushed configs and the XCP push operations and push duration to understand how the number of resources being pushed to XCP affects the time it takes for the entire configuration push operation to complete.
Metric Name | Labels | PromQL Expression |
---|---|---|
mpc_xcp_config_create_ops_err | N/A | sum(mpc_xcp_config_create_ops_err) |
mpc_xcp_config_delete_ops_err | N/A | sum(mpc_xcp_config_delete_ops_err) |
mpc_xcp_config_fetch_ops_err | N/A | sum(mpc_xcp_config_fetch_ops_err) |
mpc_xcp_config_update_ops_err | N/A | sum(mpc_xcp_config_update_ops_err) |
Config Status updates every 5m
Config Status update messages sent over the gRPC streams, from XCP to MPC and from MPC to TSB.
This metric can help you understand how messages queue up in TSB when it is under load. The values of these metrics should normally be the same. If the received-by-TSB metric is lower than the MPC one, TSB is under load and cannot process messages as fast as MPC sends them.
Metric Name | Labels | PromQL Expression |
---|---|---|
grpc_client_msg_received_total | component grpc_method | sum(increase(grpc_client_msg_received_total{component="mpc"}[5m])) or on() vector(0) |
grpc_client_msg_sent_total | component grpc_method | sum(increase(grpc_client_msg_sent_total{component="mpc"}[5m])) or on() vector(0) |
grpc_server_msg_received_total | component grpc_method | sum(increase(grpc_server_msg_received_total{component="tsb"}[5m])) or on() vector(0) |
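To watch for that backlog condition directly, a recording rule can subtract what TSB received from what MPC sent over the same window. The rule name is a hypothetical choice; persistently positive values suggest TSB is under load:

```yaml
- record: tsb:config_status_backlog:increase5m
  # Messages MPC sent minus messages TSB received over 5 minutes.
  # A sustained positive value means TSB is falling behind MPC.
  expr: |
    sum(increase(grpc_client_msg_sent_total{component="mpc"}[5m]))
      - sum(increase(grpc_server_msg_received_total{component="tsb"}[5m]))
```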