Description:
When Envoy (with HPA) receives load from a small number of clients (e.g. as an L2 proxy behind a CDN), the load is not evenly distributed across pods after scaling. The reason is that the extra load will not always come in the form of new connections, but rather over existing connections with HTTP keep-alive enabled (the default, AFAICT). A similar issue is described in a bit more detail here: istio/istio#27280.
To force load to be redistributed, there are different knobs on the Envoy (server) side (see the config sketch below):
- Via https://www.envoyproxy.io/docs/envoy/latest/api-v3/extensions/filters/network/http_connection_manager/v3/http_connection_manager.proto:
  - `common_http_protocol_options.max_connection_duration`
  - `common_http_protocol_options.max_requests_per_connection`
  - `http1_safe_max_connection_duration`
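
For context, this is roughly what the first two knobs look like when set directly on Envoy's HTTP connection manager filter (a minimal sketch; the values are illustrative, not recommendations):

```yaml
- name: envoy.filters.network.http_connection_manager
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
    stat_prefix: ingress_http
    common_http_protocol_options:
      # Close (drain) a downstream connection after it has lived this long,
      # forcing the client to reconnect and be rebalanced...
      max_connection_duration: 300s
      # ...or after it has served this many requests, whichever comes first.
      max_requests_per_connection: 1000
```
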
But these are currently not configurable via the Gateway API. It would be nice if we could set them, to have control over the lifetime of connections from the server side. This in turn gives the ability to rebalance traffic in the cluster as clients create new connections.
Note: IMHO it seems very reasonable to default `http1_safe_max_connection_duration` to `true` when `max_connection_duration` is used, to avoid a race condition also detailed here: envoyproxy/envoy#13388. That issue is from 2020, but I noticed a reasonably new (2024) option has been added because of envoyproxy/envoy#34356. As an alternative (or on top of this), it would be nice to be able to configure `max_requests_per_connection`.
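
Concretely, the pairing I'm suggesting would look something like this (the duration is illustrative, and the comment is my paraphrase of the field's documented behavior):

```yaml
common_http_protocol_options:
  max_connection_duration: 300s
# For HTTP/1 connections, enforce max_connection_duration "safely": rather
# than draining/closing the connection outright (which can race with a
# request the client has just put on the wire), Envoy waits for the next
# response and adds a "Connection: close" header to it.
http1_safe_max_connection_duration: true
```
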
Relevant Links: