Local Rate Limiting
In this document, we will enable a local rate limit in the Ingress Gateway and show how to rate limit based on the client's remote address or on an HTTP request header.
Before you get started, make sure you:
✓ Familiarize yourself with TSB concepts
✓ Install the TSB environment. You can use the TSB demo for a quick install
✓ Complete the TSB usage quickstart. This document assumes you have already created a Tenant and are familiar with Workspaces and Config Groups. You also need to configure tctl to connect to your TSB environment.
Deploy httpbin Service
Deploy a test httpbin service as follows:
NS=httpbin
kubectl create namespace ${NS}
kubectl label namespace ${NS} istio-injection=enabled --overwrite=true
kubectl apply -n ${NS} -f https://raw.githubusercontent.com/istio/istio/master/samples/httpbin/httpbin.yaml
cat <<EOF > ${NS}-ingressgw.yaml
apiVersion: install.tetrate.io/v1alpha1
kind: Gateway
metadata:
  name: ${NS}-ingressgw
  namespace: ${NS}
spec:
  kubeSpec:
    service:
      type: LoadBalancer
EOF
kubectl apply -f ${NS}-ingressgw.yaml
Some cloud providers may require additional annotations. For example, on AWS:
spec:
  kubeSpec:
    service:
      type: LoadBalancer
      annotations:
        service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
Create the TSB resources to contain the configuration
Create a Workspace and Gateway Group to contain the TSB configuration. Set ORG and TEN to the names of your Tetrate organization and tenant:
ORG=tetrate
TEN=tetrate
cat <<EOF > ${NS}-wsconfig.yaml
apiVersion: api.tsb.tetrate.io/v2
kind: Workspace
metadata:
  organization: ${ORG}
  tenant: ${TEN}
  name: ${NS}-ws
spec:
  namespaceSelector:
    names:
      - "*/${NS}"
---
apiVersion: gateway.tsb.tetrate.io/v2
kind: Group
metadata:
  organization: ${ORG}
  tenant: ${TEN}
  workspace: ${NS}-ws
  name: ${NS}-gwgroup
spec:
  namespaceSelector:
    names:
      - "*/${NS}"
EOF
tctl apply -f ${NS}-wsconfig.yaml
Expose the application using a simple Gateway resource
cat <<EOF > ${NS}-gw.yaml
apiVersion: gateway.tsb.tetrate.io/v2
kind: Gateway
metadata:
  organization: ${ORG}
  tenant: ${TEN}
  workspace: ${NS}-ws
  group: ${NS}-gwgroup
  name: ${NS}-gw
spec:
  workloadSelector:
    namespace: ${NS}
    labels:
      app: ${NS}-ingressgw
  http:
    - name: httpbin
      port: 80
      hostname: "httpbin.tetrate.io"
      routing:
        rules:
          - route:
              serviceDestination:
                host: "${NS}/httpbin.${NS}.svc.cluster.local"
                port: 8000
EOF
tctl apply -f ${NS}-gw.yaml
Test the application
Determine the public endpoint (IP address or DNS name) for the gateway:
kubectl get svc -n ${NS} ${NS}-ingressgw
Access the application as follows. Set GW to the public IP address or DNS name:
GW=k8s-httpbin-httpbini-4a722ad4c0-ec822974540ebfb1.elb.eu-west-1.amazonaws.com
curl -H "Host: httpbin.tetrate.io" http://${GW}/
It can take 5-10 minutes for a cloud platform to provision the downstream load balancer instances and DNS (if used), before you can access the service.
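Rather than retrying by hand while the load balancer is being provisioned, a small polling helper can wait for the first successful response. The following is a minimal sketch only: retry is a hypothetical helper name (it is not part of TSB or tctl), and the attempt count and delay are arbitrary choices.

```shell
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND until it succeeds, at most ATTEMPTS times,
# sleeping DELAY_SECONDS between failed attempts.
# Returns 0 on the first success, 1 if every attempt failed.
retry() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Do not sleep after the final failed attempt.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}
```

For example, retry 60 10 curl -sf -o /dev/null -H "Host: httpbin.tetrate.io" "http://${GW}/" would poll every 10 seconds for up to roughly 10 minutes; the -f flag makes curl exit with a non-zero status on HTTP error responses.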
You can send a steady stream of requests using the wrk benchmarking tool:
wrk -c 10 -t 10 -d 10 -H "Host: httpbin.tetrate.io" http://${GW}/
# repeat until wrk fails or you interrupt the loop (Ctrl-C)
while wrk -c 10 -t 10 -d 10 -H "Host: httpbin.tetrate.io" http://${GW}/ ; do : ; done
Apply a local rate limit
Local Rate Limits are applied using the rateLimiting: local parameters.
The following example limits each individual client (dimensions: remoteAddress: value: "*") to 8 requests per second:
cat <<EOF > ${NS}-gw-ratelimit.yaml
apiVersion: gateway.tsb.tetrate.io/v2
kind: Gateway
metadata:
  organization: ${ORG}
  tenant: ${TEN}
  workspace: ${NS}-ws
  group: ${NS}-gwgroup
  name: ${NS}-gw
spec:
  workloadSelector:
    namespace: ${NS}
    labels:
      app: ${NS}-ingressgw
  http:
    - name: httpbin
      port: 80
      hostname: "httpbin.tetrate.io"
      routing:
        rules:
          - route:
              serviceDestination:
                host: "${NS}/httpbin.${NS}.svc.cluster.local"
                port: 8000
      rateLimiting:
        local:
          rules:
            - dimensions:
                - remoteAddress:
                    value: "*"
              tokenBucket:
                maxTokens: 8
                tokensPerFill: 8
                fillInterval: 1s
EOF
tctl apply -f ${NS}-gw-ratelimit.yaml
Any requests that exceed that limit will immediately receive an HTTP 429 response:
HTTP/1.1 429 Too Many Requests
content-length: 18
content-type: text/plain
date: Fri, 08 Aug 2025 14:08:32 GMT
server: istio-envoy

local_rate_limited
Rate limits based on remoteAddress may not be accurate if there are multiple load balancers downstream of the Envoy Gateway, as the requests will appear to originate from these load balancers. Refer to your cloud platform documentation to determine if client IP addresses can be preserved or if they can be obtained from a request header.
Check traces or logs from the Envoy gateway to verify the source IP addresses that it observes, which may be different from the client's source IP addresses.
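To build intuition for how maxTokens, tokensPerFill and fillInterval interact, the token bucket can be approximated with a small simulation. This is a simplified sketch under stated assumptions, not the gateway's implementation: simulate_bucket is a hypothetical function, time is treated as discrete fill intervals, all requests in an interval arrive before the refill, and the bucket is assumed to start full.

```shell
# simulate_bucket MAX_TOKENS TOKENS_PER_FILL REQUESTS_PER_INTERVAL INTERVALS
# Hypothetical, simplified model of a token bucket.
# Prints one line per fill interval: "<allowed> <rejected>".
simulate_bucket() {
  local max="$1" per_fill="$2" rate="$3" intervals="$4"
  local tokens="$max"   # assumption: the bucket starts full
  local n=0 r allowed rejected
  while [ "$n" -lt "$intervals" ]; do
    allowed=0
    rejected=0
    r=0
    while [ "$r" -lt "$rate" ]; do
      if [ "$tokens" -gt 0 ]; then
        # A token is available: the request passes and consumes it.
        tokens=$((tokens - 1))
        allowed=$((allowed + 1))
      else
        # Bucket empty: the request would receive a 429.
        rejected=$((rejected + 1))
      fi
      r=$((r + 1))
    done
    echo "$allowed $rejected"
    # Refill at the end of the interval, never exceeding the maximum.
    tokens=$((tokens + per_fill))
    if [ "$tokens" -gt "$max" ]; then
      tokens="$max"
    fi
    n=$((n + 1))
  done
}
```

With the values from the rule above, simulate_bucket 8 8 10 2 models a client sending 10 requests per second for two seconds: in each interval 8 requests are allowed and 2 are rejected.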
Other Rate Limiting Options
Details for other rate limiting options can be found in the local rate limiting API reference.
For example, the following local rate limit restricts each client token (header Authorization) to a maximum of 4 requests per minute:
cat <<EOF > ${NS}-gw-ratelimit2.yaml
apiVersion: gateway.tsb.tetrate.io/v2
kind: Gateway
metadata:
  organization: ${ORG}
  tenant: ${TEN}
  workspace: ${NS}-ws
  group: ${NS}-gwgroup
  name: ${NS}-gw
spec:
  workloadSelector:
    namespace: ${NS}
    labels:
      app: ${NS}-ingressgw
  http:
    - name: httpbin
      port: 80
      hostname: "httpbin.tetrate.io"
      routing:
        rules:
          - route:
              serviceDestination:
                host: "${NS}/httpbin.${NS}.svc.cluster.local"
                port: 8000
      rateLimiting:
        local:
          rules:
            - dimensions:
                - header:
                    name: "Authorization"
                    # with no value, rate limit on each unique header value
                    # value:
                    #   exact: "bar"
              tokenBucket:
                maxTokens: 4
                tokensPerFill: 4
                fillInterval: 60s
EOF
tctl apply -f ${NS}-gw-ratelimit2.yaml
Make the HTTP request several times within a minute; the first 4 requests should succeed, but subsequent requests fail (429 Too Many Requests) for the remainder of that minute:
for i in {1..5} ; do curl -s -o /dev/null -w "%{http_code} " -H "Host: httpbin.tetrate.io" -H "Authorization: my-code" http://${GW}/ ; done
Try with different values for the Authorization header, and you will see that requests are counted independently for each value: every distinct value is allowed its own 4 requests per minute.
for i in {1..5} ; do for j in {1..4} ; do curl -s -o /dev/null -w "%{http_code} " -H "Host: httpbin.tetrate.io" -H "Authorization: my-code${j}" http://${GW}/ ; done ; echo ; done
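When many requests are sent, a per-status summary is easier to read than a long row of codes. The helper below is a small sketch that counts the status codes printed by the curl loops above; tally_codes is a hypothetical name introduced here for illustration, not part of any tool used in this document.

```shell
# tally_codes: read whitespace-separated HTTP status codes on stdin and
# print "<code> <count>" for each distinct code, sorted by code.
tally_codes() {
  tr -s ' ' '\n' |      # one status code per line
    grep -v '^$' |      # drop any empty line left by leading spaces
    sort |
    uniq -c |           # "<count> <code>"
    awk '{ print $2, $1 }'
}
```

For example, piping the first loop into it, for i in {1..5} ; do curl -s -o /dev/null -w "%{http_code} " -H "Host: httpbin.tetrate.io" -H "Authorization: my-code" http://${GW}/ ; done | tally_codes, prints one line per status code with the number of times it was returned.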