Resource Consumption and Capacity Planning
This document describes conservative guidelines for capacity planning of TIS Plus in the Management Plane. These parameters apply to production installations.
The resource provisioning guidelines described in this document are very conservative.
Also, be aware that the resource provisioning described in this document applies to vertical resource scaling. Multiple replicas of the same TIS Plus component do not share the load with each other, so you cannot expect the combined resources of multiple replicas to have the same effect as a single larger instance. Use replicas of TIS Plus components for high availability only.
Recommended baseline production installation resource requirements
For a baseline installation of TIS Plus with 1 registered cluster and 1 deployed service within that cluster, the following resources are recommended.
To reiterate, the memory amounts described below are very conservative. Also, the actual performance delivered by a given number of vCPUs tends to fluctuate depending on your underlying infrastructure. You are advised to verify the results in your environment.
Component | vCPU # | Memory MiB |
---|---|---|
TIS Plus server (Management Plane) 1 | 2 | 512 |
XCP Central Components 2 | 2 | 128 |
XCP Edge | 1 | 128 |
Front Envoy | 1 | 50 |
IAM | 1 | 128 |
TIS Plus UI | 1 | 256 |
OAP | 4 | 5192 |
OTEL-collector | 2 | 1024 |
1 Including the Kubernetes operator and persistent data reconciliation processes.
2 Including the Kubernetes operator.
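To size the nodes that host a baseline installation, it can help to add up the per-component figures. The following is a minimal sketch that sums the values from the table above; the `BASELINE` dictionary and the `baseline_totals` helper are illustrative names, not part of TIS Plus.

```python
# Baseline values copied from the table above: (vCPUs, memory in MiB).
BASELINE = {
    "TIS Plus server (Management Plane)": (2, 512),
    "XCP Central Components": (2, 128),
    "XCP Edge": (1, 128),
    "Front Envoy": (1, 50),
    "IAM": (1, 128),
    "TIS Plus UI": (1, 256),
    "OAP": (4, 5192),
    "OTEL-collector": (2, 1024),
}


def baseline_totals(components=BASELINE):
    """Return (total vCPUs, total memory in MiB) across all components."""
    total_vcpu = sum(vcpu for vcpu, _ in components.values())
    total_mem_mib = sum(mem for _, mem in components.values())
    return total_vcpu, total_mem_mib


if __name__ == "__main__":
    vcpu, mem_mib = baseline_totals()
    print(f"Baseline total: {vcpu} vCPU, {mem_mib} MiB")
```

The totals cover only the components listed in the table; any capacity needed by the cluster itself or by other workloads comes on top.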
Recommended scaling resource parameters
The TIS Plus stack is mostly CPU-bound. Additional clusters registered with TIS Plus via XCP increase the CPU utilization by ~4%.
The effect of additional registered clusters or additional deployed workload services on memory utilization is almost negligible. Likewise, the effect of additional clusters or workloads on the resource consumption of most TIS Plus components is negligible, with the notable exceptions of TIS Plus, the XCP Central component, TIS Plus UI and IAM.
Components that are part of the visibility stack (e.g. OTel, OAP) have their resource utilization driven by requests, so resource scaling should follow user request rate statistics. As a general rule of thumb, more than 1 vCPU is preferred. It is also important to note that visibility stack performance is largely bound by Elasticsearch performance.
Thus, we recommend vertically scaling the components by 1 vCPU for a number of deployed workloads:
Management Plane
Apart from OAP, no component requires any resource adjustment. These components are architected and tested to support very large clusters.
OAP in the Management Plane requires extra CPU and memory: approximately 100 millicores of CPU and 1024 MiB of RAM per 1000 services. For example, 4000 services aggregated in the TIS Plus Management Plane from all TIS Plus clusters would require approximately 400 millicores of CPU and 4096 MiB of RAM in total.
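The OAP rule of thumb above is a simple linear scaling, which can be sketched as follows. The function name `oap_extra_resources` and its parameters are illustrative; the defaults are the per-1000-services figures quoted in this section, and rounding up is an assumption made here to keep the estimate on the conservative side.

```python
def oap_extra_resources(total_services, cpu_m_per_1000=100, mem_mib_per_1000=1024):
    """Estimate the extra OAP CPU (millicores) and memory (MiB).

    total_services is the number of services aggregated in the
    Management Plane across all clusters. Results are rounded up.
    """
    if total_services < 0:
        raise ValueError("total_services must be non-negative")
    # Integer ceiling division avoids floating-point rounding surprises.
    cpu_m = -(-total_services * cpu_m_per_1000 // 1000)
    mem_mib = -(-total_services * mem_mib_per_1000 // 1000)
    return cpu_m, mem_mib


if __name__ == "__main__":
    cpu_m, mem_mib = oap_extra_resources(4000)
    print(f"4000 services: {cpu_m}m CPU, {mem_mib} MiB")
```

For 4000 services this reproduces the 400 millicores and 4096 MiB from the example in the text.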
Control Plane Resource Requirements
The following table shows typical peak resource utilization for the TIS Plus control plane under the following assumptions:
- 50 services with sidecars
- Traffic on the entire cluster is 500 requests per second
- OAP trace sampling rate is 1% of the traffic
- Metrics are captured for every request at every workload
Note that average CPU utilization would be a fraction of the typical peak value.
Component | Typical Peak CPU (m) | Typical Peak Memory (Mi) |
---|---|---|
Istiod | 300m | 250Mi |
OAP | 2500m | 2500Mi |
XCP Edge | 100m | 100Mi |
XCP Operator | 100m | 100Mi |
TIS Plus Control Plane Operator | 100m | 100Mi |
OTEL Collector | 50m | 100Mi |
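The values in the table use Kubernetes-style quantity suffixes (`m` for millicores, `Mi` for mebibytes). As a rough sketch, they can be parsed and summed to get an upper bound for the whole control plane. The helpers below are illustrative and handle only the two suffixes that appear in this table, not the full Kubernetes quantity syntax. Summing peaks assumes all components peak at the same time, which is a pessimistic assumption.

```python
# Typical peak values copied from the table above: (CPU, memory).
PEAKS = {
    "Istiod": ("300m", "250Mi"),
    "OAP": ("2500m", "2500Mi"),
    "XCP Edge": ("100m", "100Mi"),
    "XCP Operator": ("100m", "100Mi"),
    "TIS Plus Control Plane Operator": ("100m", "100Mi"),
    "OTEL Collector": ("50m", "100Mi"),
}


def parse_millicores(quantity):
    """Parse a CPU quantity such as '300m' into an integer of millicores."""
    if not quantity.endswith("m"):
        raise ValueError(f"unsupported CPU quantity: {quantity!r}")
    return int(quantity[:-1])


def parse_mebibytes(quantity):
    """Parse a memory quantity such as '250Mi' into an integer of MiB."""
    if not quantity.endswith("Mi"):
        raise ValueError(f"unsupported memory quantity: {quantity!r}")
    return int(quantity[:-2])


def peak_totals(peaks=PEAKS):
    """Return (total millicores, total MiB) if every component peaks at once."""
    cpu_m = sum(parse_millicores(cpu) for cpu, _ in peaks.values())
    mem_mi = sum(parse_mebibytes(mem) for _, mem in peaks.values())
    return cpu_m, mem_mi


if __name__ == "__main__":
    cpu_m, mem_mi = peak_totals()
    print(f"Control plane peak total: {cpu_m}m CPU, {mem_mi}Mi memory")
```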
TIS Plus/XCP Operator resource usage per Ingress Gateway
The following table shows the resources used by the TIS Plus Operator and the XCP Operator for a given number of Ingress Gateways.
Ingress Gateways | TIS Plus Operator CPU(m) | TIS Plus Operator Mem(Mi) | XCP Operator CPU(m) | XCP Operator Mem(Mi) |
---|---|---|---|---|
0 | 100m | 50Mi | 10m | 45Mi |
50 | 2600m | 125Mi | 1100m | 120Mi |
100 | 3500m | 200Mi | 1300m | 175Mi |
150 | 3800m | 250Mi | 1400m | 200Mi |
200 | 4000m | 325Mi | 1400m | 250Mi |
250 | 4700m | 325Mi | 1750m | 300Mi |
300 | 5000m | 475Mi | 1750m | 400Mi |
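For a gateway count that falls between two rows of the table, a first approximation is to interpolate linearly between the neighbouring measurements. The sketch below does this with the table data; the list names and the `interpolate` helper are illustrative, and linear behaviour between measured points is an assumption, not something the measurements guarantee.

```python
from bisect import bisect_right

# Columns copied from the table above.
GATEWAYS = [0, 50, 100, 150, 200, 250, 300]
TIS_OPERATOR_CPU_M = [100, 2600, 3500, 3800, 4000, 4700, 5000]
TIS_OPERATOR_MEM_MI = [50, 125, 200, 250, 325, 325, 475]
XCP_OPERATOR_CPU_M = [10, 1100, 1300, 1400, 1400, 1750, 1750]
XCP_OPERATOR_MEM_MI = [45, 120, 175, 200, 250, 300, 400]


def interpolate(gateways, values, counts=GATEWAYS):
    """Linearly interpolate a table column for a given gateway count."""
    if not counts[0] <= gateways <= counts[-1]:
        raise ValueError("gateway count is outside the measured range")
    # Index of the last measured count that is <= gateways.
    i = bisect_right(counts, gateways) - 1
    if i == len(counts) - 1:
        return float(values[-1])
    x0, x1 = counts[i], counts[i + 1]
    y0, y1 = values[i], values[i + 1]
    return y0 + (y1 - y0) * (gateways - x0) / (x1 - x0)


if __name__ == "__main__":
    n = 120
    print(f"{n} gateways: TIS Plus Operator "
          f"{interpolate(n, TIS_OPERATOR_CPU_M):.0f}m CPU, "
          f"{interpolate(n, TIS_OPERATOR_MEM_MI):.0f}Mi memory")
```

The helper refuses to extrapolate beyond 300 gateways, because the table gives no data for larger counts.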
Component resource utilization
The following tables show how the different components of TIS Plus scale up to 4000 services with traffic peaking at 60 rpm. The data is divided into the Management Plane and the Control Plane.
Management Plane
Services | Gateways | Traffic(rpm) | Central CPU(m) | Central Mem(Mi) | MPC CPU(m) | MPC Mem(Mi) | OAP CPU(m) | OAP Mem(Mi) | Otel CPU(m) | Otel Mem(Mi) | TIS Plus CPU(m) | TIS Plus Mem(Mi) |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0 | 0 rpm | 3m | 39Mi | 5m | 30Mi | 37m | 408Mi | 22m | 108Mi | 14m | 57Mi |
400 | 2 | 60 rpm | 4m | 42Mi | 15m | 31Mi | 116m | 736Mi | 24m | 123Mi | 50m | 63Mi |
800 | 4 | 60 rpm | 4m | 54Mi | 24m | 34Mi | 43m | 909Mi | 26m | 127Mi | 85m | 75Mi |
1200 | 6 | 60 rpm | 4m | 59Mi | 32m | 41Mi | 28m | 1141Mi | 27m | 210Mi | 213m | 78Mi |
1600 | 8 | 60 rpm | 5m | 63Mi | 44m | 48Mi | 209m | 1475Mi | 29m | 249Mi | 113m | 86Mi |
2000 | 10 | 60 rpm | 5m | 73Mi | 41m | 51Mi | 51m | 1655Mi | 24m | 319Mi | 211m | 91Mi |
2400 | 12 | 60 rpm | 4m | 84Mi | 72m | 62Mi | 57m | 1910Mi | 29m | 381Mi | 227m | 97Mi |
2800 | 14 | 60 rpm | 5m | 90Mi | 73m | 65Mi | 43m | 2136Mi | 16m | 466Mi | 275m | 104Mi |
3200 | 16 | 60 rpm | 5m | 106Mi | 85m | 78Mi | 89m | 2600Mi | 43m | 574Mi | 382m | 108Mi |
3600 | 18 | 60 rpm | 5m | 123Mi | 94m | 71Mi | 245m | 2772Mi | 37m | 578Mi | 625m | 115Mi |
4000 | 20 | 60 rpm | 5m | 147Mi | 90m | 81Mi | 521m | 3224Mi | 15m | 704Mi | 508m | 122Mi |
IAM peaks at 5m/32Mi, LDAP at 1m/12Mi, and XCP Operator at 3m/23Mi.
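One way to read the Management Plane table is to estimate how much memory each component gains per 1000 services. The sketch below uses a simple end-to-end slope between the first and last rows; the helper name is illustrative, and a two-point slope is a deliberate simplification. It is applied to memory only, since the CPU columns fluctuate too much between rows for a slope to be meaningful.

```python
# Memory columns (Mi) copied from the Management Plane table above.
SERVICES = [0, 400, 800, 1200, 1600, 2000, 2400, 2800, 3200, 3600, 4000]
OAP_MEM_MI = [408, 736, 909, 1141, 1475, 1655, 1910, 2136, 2600, 2772, 3224]
OTEL_MEM_MI = [108, 123, 127, 210, 249, 319, 381, 466, 574, 578, 704]
TIS_PLUS_MEM_MI = [57, 63, 75, 78, 86, 91, 97, 104, 108, 115, 122]


def growth_per_1000_services(values, services=SERVICES):
    """Return the average growth per 1000 services between first and last row."""
    span = services[-1] - services[0]
    if span <= 0:
        raise ValueError("service counts must increase")
    return (values[-1] - values[0]) * 1000 / span


if __name__ == "__main__":
    for name, column in [("OAP", OAP_MEM_MI),
                         ("Otel", OTEL_MEM_MI),
                         ("TIS Plus", TIS_PLUS_MEM_MI)]:
        print(f"{name}: {growth_per_1000_services(column):.2f} Mi per 1000 services")
```

With these measurements, OAP memory grows by about 704 Mi per 1000 services, which sits below the 1024 MiB per 1000 services guideline and is consistent with that guideline being conservative.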