
Manage number of new connections per cycle#12178

Merged
madolson merged 9 commits into redis:unstable from AshMosh:ReduceNumberOfConnectionsAccepted
Jan 2, 2024
Conversation

@AshMosh
Copy link
Contributor

@AshMosh AshMosh commented May 15, 2023

There are situations (especially with TLS) in which the engine gets too occupied managing a large number of new connections. Existing connections may time out while the server is processing the new connections' initial TLS handshakes, which may cause yet more new connections to be established, perpetuating the problem. To better manage the tradeoff between new connection rate and other workloads, this change adds new configs to manage the maximum number of new connections per event loop cycle, instead of using a predetermined number (currently 1000).

This change introduces two new configurations, max-new-connections-per-cycle and max-new-tls-connections-per-cycle. The default for TCP connections is 10 per cycle, and the default for TLS connections is 1 per cycle.
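As a sketch, the two settings described above would appear in redis.conf roughly as follows (values are the defaults stated in this PR; consult the shipped redis.conf for the authoritative comments):

```conf
# Cap how many new plain TCP/Unix connections are accepted per
# event loop cycle (default per this PR: 10).
max-new-connections-per-cycle 10

# Cap how many new TLS connections are accepted per cycle; TLS
# handshakes are expensive, so the default is 1.
max-new-tls-connections-per-cycle 1
```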


Copy link
Contributor

@madolson madolson left a comment


This was originally discussed here: #10524. The main issue we are trying to prevent is TLS connection storms, which aren't being protected by default here. My preference would have just been to update the tlsAcceptHandler to only do 1 per cycle and leave TCP as is. I would like feedback from @yossigo though.

@madolson madolson linked an issue May 15, 2023 that may be closed by this pull request
@JimB123
Copy link
Contributor

JimB123 commented May 16, 2023

I believe the focus here should not be solely on TLS. TLS is quite obvious, but it is definitely not the only consideration.

Please read the discussion here first. All of my discussion/argument is still completely on-point: #10524

This boils down to a prioritization issue. When the event loop is saturated (CPU is at 100% at the micro level), a prioritization is required. Do we delay new connections, or do we delay existing work? It's a binary decision. We must bias towards processing existing work - otherwise we risk timeouts/retries. By increasing the number of connections accepted per cycle beyond 1, we are choosing to bias towards accepting connections - which is the wrong decision.

Side note: it's also valid to question if there are cases where the accept count should be reduced to something LESS THAN 1. This would bias us even more to existing work. In practice I have found this unnecessary and IMO it would add unnecessary complexity.

The issue with accept count is quite obvious in a TLS storm. Each accept implies the need for expensive negotiation messages. When we accept too many, the negotiation can't complete without timeout. Retries happen. Eventually we get to a point where NONE of the connections complete negotiation, and the engine is saturated with connection attempts. By prioritizing the negotiation over new accepts, SOME of the connections will complete without timeout. Over a short period of time, a TLS storm will dissipate.

REAL WORLD SCENARIOS that I have witnessed:

  1. We've been talking about TLS storms. I've personally witnessed TLS storms occurring when the connections-per-cycle has been adjusted down to 16. Once connections-per-cycle is reduced to 1, the problem is effectively eliminated. You could try to "tune" this to a number between 1 and 16, however IMO there's not much value to this. A higher setting does not result in significantly different performance, and just increases risk of storm in unexpected conditions.
  2. KEYS. We know that the KEYS command is dangerous. I witnessed a system where a large number of clients were attaching and immediately issuing a KEYS command. This command is expensive (like TLS negotiation) and once several clients were simultaneously trying to execute KEYS, they all started timing out. The event loop lengthened - to MINUTES. Redis was executing these expensive commands, only to find that the client had timed out before receiving the response. The clients immediately retried resulting in the same type of "storm" situation. By adjusting connections-per-cycle to 1, it allowed the KEYS commands that were in-flight to complete. This immediately made the event loop more responsive. Rather than 100% timeout/failure, the system was now responsive and successfully processing some of the commands.
  3. CLUSTER INFO. Some clients issue the CLUSTER INFO command immediately after connecting to collect topology information. Especially on large clusters, this can be an expensive command. I have witnessed a system where the event loop was saturated with CLUSTER INFO. The resultant behavior was identical to the TLS storm or the KEYS scenario above. The event loop slowed. 100% of the CLUSTER INFO commands timed out. Once connections-per-cycle was adjusted to 1, the system immediately recovered. The event loop became responsive. Connections were established.

In summary, if the event loop isn't saturated, this change makes no difference. The event loop spins quickly and connections are established. It's only in the case that the event loop is saturated, that connections-per-cycle has any impact. The impact is immediate and beneficial.

P.S. If it were up to me, I'd remove the configuration and inner connection loop completely. Over the last 2 years, I've seen no evidence that using a value greater than 1 is ever warranted. But I agree that a tunable parameter is the less controversial option.
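The per-cycle cap being debated can be sketched as follows. This is an illustrative C model with a simulated backlog and hypothetical names (`try_accept`, `accept_cycle`, `MAX_ACCEPTS_PER_CYCLE`), not the actual Redis accept handler:

```c
#include <assert.h>

/* Hypothetical cap on accepts per event-loop cycle. */
#define MAX_ACCEPTS_PER_CYCLE 10

static int pending = 0;  /* simulated TCP backlog depth */

/* Stand-in for accept(2): -1 means the backlog is drained (EWOULDBLOCK). */
static int try_accept(void) {
    if (pending <= 0) return -1;
    pending--;
    return 100 + pending;  /* fake fd */
}

/* One event-loop cycle's accept handler: take at most `cap` connections,
 * then return so reads/writes for existing clients can be served. */
static int accept_cycle(int cap) {
    int accepted = 0;
    while (accepted < cap && try_accept() != -1)
        accepted++;
    return accepted;
}
```

With a cap of 10, a burst of 25 queued connections drains over three cycles instead of monopolizing one cycle, which is the prioritization tradeoff the discussion above is about.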

@madolson
Copy link
Contributor

But I agree that a tunable parameter is the less controversial option.

As a note, this is often a more controversial option here compared to our internal AWS fork since it's a new config we have to expose, document, and maintain.

@yossigo
Copy link
Collaborator

yossigo commented May 17, 2023

In summary, if the event loop isn't saturated, this change makes no difference. The event loop spins quickly and connections are established. It's only in the case that the event loop is saturated, that connections-per-cycle has any impact. The impact is immediate and beneficial.

@JimB123 Can you confirm this is also the case when there's very large spike of inbound connections? I'm concerned the extra syscalls will reduce accept rate and potentially have an impact on the TCP backlog.

@JimB123
Copy link
Contributor

JimB123 commented May 23, 2023

@JimB123 Can you confirm this is also the case when there's very large spike of inbound connections? I'm concerned the extra syscalls will reduce accept rate and potentially have an impact on the TCP backlog.

@yossigo You are correct that there are more syscalls. If accepting connections was the only measure of success, you would be right to be concerned. However, the measure of success should be accepting connections AND doing work. When originally working on this, I did some rough measurements of pure event loop speed and found the event loop to spin much faster than any reasonable connection rate. In the TLS case, the extra overhead is very small in comparison to the TLS negotiation.

If I was to TRY to show a degradation in overall performance/throughput, I'd set up a scenario as follows:

  • No TLS
  • No AUTH
  • Each connection does a single simple GET command and then disconnects

I think in this case, there might be some measurable performance delta. However, I don't think we should optimize for such an anti-pattern.

@yossigo
Copy link
Collaborator

yossigo commented May 24, 2023

@JimB123 I wasn't suggesting to optimize for it, but use it as a test case to understand the worst case impact. Anyway, based on running memtier_benchmark --reconnect-interval 1 -n 5000 -x 3 it seems like we can expect around 2% worst case impact:

unstable:

Totals      13162.03         0.00     11964.29        12.56837        11.26300        29.82300        39.42300       558.46 

This PR:

Totals      12871.28         0.00     11700.00        12.85873        11.58300        30.20700        37.63100       546.12 

@JimB123
Copy link
Contributor

JimB123 commented May 24, 2023

Very nice @yossigo. 2%, worst case, is better than I was expecting. Of course, in practice, we're very unlikely to see any throughput impact. The connection (& disconnection) rate would have to be extremely high, and CPU would have to be saturated.

@madolson
Copy link
Contributor

@yossigo @JimB123 So I'm still going to restate that my preference is we should just update these values to 1 and not have a config.

@yossigo
Copy link
Collaborator

yossigo commented May 24, 2023

@madolson I'd play it safe and leave it configurable for now.

@madolson
Copy link
Contributor

@yossigo in that case are we okay with a default of 1? Otherwise people will need to be impacted before they know to fix it.

@yossigo
Copy link
Collaborator

yossigo commented May 28, 2023

@madolson Yes.

@madolson madolson added the state:major-decision (Requires core team consensus) and approval-needed (Waiting for core team approval to be merged) labels May 30, 2023
@AshMosh AshMosh force-pushed the ReduceNumberOfConnectionsAccepted branch 2 times, most recently from 087b5cb to d7d748d Compare June 1, 2023 07:11
@AshMosh AshMosh requested a review from JimB123 June 1, 2023 16:19
@AshMosh AshMosh requested a review from madolson June 4, 2023 06:02
Copy link
Contributor

@madolson madolson left a comment


@redis/core-team A small change to help prevent connection storms in certain circumstances.

@oranagra
Copy link
Member

i have one more concern.
during loading, we process events only once in a while (e.g. every 2mb of rdb data); when that happens, we spin the event loop 4 times:

 * It calls the event loop in order to process a few events. Specifically we
 * try to call the event loop 4 times as long as we receive acknowledge that
 * some event was processed, in order to go forward with the accept, read,
 * write, close sequence needed to serve a client.

this is necessary in order to remain responsive to PING and INFO.
with this change, it means we'll only be responsive to one (or actually two) newly connected clients.
i.e. if we have more than one script running redis-cli ping to watch us, some can hang / fail and conclude redis is dead (maybe some kind of watchdog that will try to restart redis).

this isn't normally a problem since 2mb of rdb data is probably processed quite fast, and we'll get many of these calls per second, but there are cases where the rate of such calls to processEventsWhileBlocked drops.
let's imagine for a second that it drops to one call per 10 seconds.
so maybe we want to limit the rate of connections per second, rather than per event loop cycle?
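The per-second alternative suggested here could look roughly like a fixed-window limiter. This is a minimal sketch with hypothetical names (`accept_limiter`, `limiter_allow`), not anything from the Redis source:

```c
#include <assert.h>

/* Sketch of a per-second accept limiter, independent of how often the
 * event loop (or processEventsWhileBlocked) happens to run. */
typedef struct {
    long window_start_ms; /* start of the current 1-second window */
    int  accepted;        /* accepts so far in this window */
    int  limit;           /* max accepts per second */
} accept_limiter;

/* Returns 1 if one more connection may be accepted at time now_ms. */
static int limiter_allow(accept_limiter *l, long now_ms) {
    if (now_ms - l->window_start_ms >= 1000) { /* window rolled over */
        l->window_start_ms = now_ms;
        l->accepted = 0;
    }
    if (l->accepted >= l->limit) return 0;
    l->accepted++;
    return 1;
}
```

Because the budget is tied to wall-clock time rather than loop iterations, a slow processEventsWhileBlocked cadence would not starve new clients the way a per-cycle cap can.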

@soloestoy
Copy link
Contributor

@JimB123 I wasn't suggesting to optimize for it, but use it as a test case to understand the worst case impact. Anyway, based on running memtier_benchmark --reconnect-interval 1 -n 5000 -x 3 it seems like we can expect around 2% worst case impact:

unstable:

Totals      13162.03         0.00     11964.29        12.56837        11.26300        29.82300        39.42300       558.46 

This PR:

Totals      12871.28         0.00     11700.00        12.85873        11.58300        30.20700        37.63100       546.12 

I think we need to re-benchmark it; 13162.03 qps on unstable is too slow for Redis.

@oranagra
Copy link
Member

we discussed this in a core-team meeting.

we dismissed the concern of processEventsWhileBlocked, considering it an edge case that would only cause issues when several types of abuse are combined (a monitoring software that re-connects each time, several of them, and a very slow rate of processEventsWhileBlocked calls).

we conceptually approved this pending on a re-benchmark (on a better hardware?) of the impact on clients that re-connect for each command they send.

@madolson
Copy link
Contributor

@AshMosh Can you help with the performance testing above? Ideally we want to test a more real-world scenario than the laptop run Yossi did.

@murphyjacob4
Copy link

With max-new-connections-per-cycle=5, reconnect-interval=10:

| | QPS | P50 Latency | P95 Latency | P99 Latency | P99.9 Latency | P99.99 Latency | P99.999 Latency |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Client with no reconnections | 53874 | 1.14300 | 1.70300 | 2.25500 | 3.27900 | 9.02300 | 36.60700 |
| Client with heavy reconnections | 49322 | 1.15900 | 1.96700 | 2.55900 | 4.44700 | 9.08700 | 36.86300 |

With max-new-connections-per-cycle=10, reconnect-interval=10:

| | QPS | P50 Latency | P95 Latency | P99 Latency | P99.9 Latency | P99.99 Latency | P99.999 Latency |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Client with no reconnections | 55293 | 1.11100 | 1.69500 | 2.20700 | 3.29500 | 9.59900 | 36.86300 |
| Client with heavy reconnections | 50382 | 1.12700 | 1.90300 | 2.41500 | 3.85500 | 9.79100 | 36.86300 |

I also reran with reconnect-interval = 1 for a more extreme connection storm simulation:

With max-new-connections-per-cycle=5, reconnect-interval=1:

| | QPS | P50 Latency | P95 Latency | P99 Latency | P99.9 Latency | P99.99 Latency | P99.999 Latency |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Client with no reconnections | 72802 | 0.83900 | 1.29500 | 1.59900 | 3.13500 | 16.51100 | 53.50300 |
| Client with heavy reconnections | 11564 | 4.73500 | 7.16700 | 8.95900 | 15.80700 | 40.44700 | 55.80700 |

With max-new-connections-per-cycle=10, reconnect-interval=1:

| | QPS | P50 Latency | P95 Latency | P99 Latency | P99.9 Latency | P99.99 Latency | P99.999 Latency |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Client with no reconnections | 65281 | 1.02300 | 1.42300 | 2.11100 | 3.75900 | 22.52700 | 53.50300 |
| Client with heavy reconnections | 14730 | 2.97500 | 5.59900 | 11.90300 | 18.81500 | 39.16700 | 70.65500 |

With max-new-connections-per-cycle=1000, reconnect-interval=1:

| | QPS | P50 Latency | P95 Latency | P99 Latency | P99.9 Latency | P99.99 Latency | P99.999 Latency |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Client with no reconnections | 57948 | 0.85500 | 2.35100 | 3.37500 | 5.05500 | 22.91100 | 47.10300 |
| Client with heavy reconnections | 15155 | 2.35100 | 5.47100 | 13.31100 | 19.83900 | 37.88700 | 69.11900 |

@madolson
Copy link
Contributor

So I am a bit inclined to update my position and say we should update the PR and use a default value of 10. It seems to still provide most of the protective benefit against TLS-based connection storms while causing minimal detriment to workloads that open a lot of connections. @soloestoy Thoughts?

@JimB123
Copy link
Contributor

JimB123 commented Nov 16, 2023

Anything larger than 1 risks timeouts on your real workload and a death loop.
We WANT connections to timeout/retry under these conditions. We don't want regular commands to start failing.

Set up a scenario with a high connection rate, using TLS, and a workload with small timeouts for regular commands. Make sure you drive CPU to 100%.

@madolson
Copy link
Contributor

madolson commented Dec 12, 2023

Decision for now is to split into two separate configs:

max-new-tcp-connections-per-cycle
max-new-tls-connections-per-cycle

With two different default values. Default of 1 for TLS and 10 for TCP.

The motivation was that we wanted to be as defensive as possible about not introducing a performance regression for TCP-based workloads that use a single connection + single command pattern (like with PHP).

@madolson madolson force-pushed the ReduceNumberOfConnectionsAccepted branch from 63b85da to 4b50a2f Compare December 13, 2023 00:51
Copy link
Contributor

@madolson madolson left a comment


@redis/core-team Any other concerns? I've updated the top comment as well.

@soloestoy
Copy link
Contributor

I have reconsidered, and if we are concerned about the potential time consumption of accept, perhaps we can set a time threshold for accept, such as 1000 microseconds. This way, we don't need to differentiate between TLS and TCP or worry about the number of accepted connections. Using time as a limit can be a better approach.
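The time-threshold idea can be modeled with a simulated clock; `accept_with_budget`, `fake_clock_us`, and the per-accept costs below are all hypothetical, chosen only to illustrate the behavior:

```c
#include <assert.h>

static long fake_clock_us = 0;  /* simulated monotonic clock */
static long now_us(void) { return fake_clock_us; }

/* Accept from a backlog of `backlog` connections, where each accept
 * (including any handshake work) costs `cost_us` simulated microseconds,
 * stopping once `budget_us` has elapsed. Returns the number accepted. */
static int accept_with_budget(long budget_us, long cost_us, int backlog) {
    long start = now_us();
    int accepted = 0;
    while (backlog > 0 && now_us() - start < budget_us) {
        fake_clock_us += cost_us; /* simulate the accept/handshake work */
        accepted++;
        backlog--;
    }
    return accepted;
}
```

With a 1000 us budget, expensive TLS-like accepts at 100 us each stop after 10, while cheap TCP-like accepts at 10 us each can drain a 50-deep backlog, which is the appeal of a time limit over a fixed count: it self-tunes to the actual cost without distinguishing TLS from TCP.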

@soloestoy
Copy link
Contributor

#12178 (comment) @murphyjacob4 are the results based on this PR (limiting to 1 or 10) or on unstable?

@soloestoy
Copy link
Contributor

Additionally, Redis initially accepted only one connection per cycle, which was later increased to 1000 starting from version 2.8; see more details in f3d3c60. Does anyone have knowledge about this history?

As mentioned in the commit description, this change was made to optimize scenarios where a connect-disconnect pattern is frequently used. In this scenario, accepting a new connection should be also considered a high-priority event, even if the CPU usage is at 100%. Therefore, if we intend to reduce the default value now, we need to exercise caution and carefully consider the implications.

@oranagra
Copy link
Member

personally, i don't like the use of plaintext. none of the other configs we have that differ between TLS and plain TCP have a plaintext marking (not to mention that it's not necessarily "text" that is transmitted, so i don't like that term at all for plain TCP).
i'd go with max-new-connections-per-cycle and max-new-tls-connections-per-cycle-tls.

@madolson
Copy link
Contributor

madolson commented Dec 13, 2023

I have reconsidered, and if we are concerned about the potential time consumption of accept, perhaps we can set a time threshold for accept, such as 1000 microseconds. This way, we don't need to differentiate between TLS and TCP or worry about the number of accepted connections. Using time as a limit can be a better approach.

The problem isn't in the accept per se, it's that we're signing up for future work with the TLS negotiation. I think a better alternative, which I've mentioned before, is to limit the number of TLS connections in the handshaking state. That bottlenecks incoming TCP connections much less, since they complete accept immediately.

@oranagra I didn't like tcp either, since it also applies to unix, so I'm fine just differentiating it on tls. Your suggestion has tls twice, I still prefer the original name max-new-tls-connections-per-cycle as opposed to adding tls as a suffix though.
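The handshake-state limit mentioned above could be sketched as a simple counter gate. All names here (`MAX_HANDSHAKING`, `tls_handshake_try_start`, `tls_handshake_done`) are hypothetical, not Redis source:

```c
#include <assert.h>

#define MAX_HANDSHAKING 8   /* illustrative concurrent-handshake limit */

static int handshaking = 0; /* connections currently mid-handshake */

/* A newly accepted TLS connection asks to begin its handshake.
 * Returns 1 if it may proceed now, 0 if it must stay queued. */
static int tls_handshake_try_start(void) {
    if (handshaking >= MAX_HANDSHAKING) return 0;
    handshaking++;
    return 1;
}

/* A handshake finished (successfully or not), freeing a slot. */
static void tls_handshake_done(void) {
    if (handshaking > 0) handshaking--;
}
```

Under this scheme accept(2) itself is never throttled; only the expensive negotiation phase is bounded, so plain TCP clients are unaffected by a TLS storm.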

@soloestoy soloestoy added the release-notes (indication that this issue needs to be mentioned in the release notes) label Jan 2, 2024
@madolson madolson merged commit c3f8b54 into redis:unstable Jan 2, 2024
@madolson madolson removed the approval-needed (Waiting for core team approval to be merged) label Jan 2, 2024
roggervalf pushed a commit to roggervalf/redis that referenced this pull request Feb 11, 2024
funny-dog pushed a commit to funny-dog/redis that referenced this pull request Sep 17, 2025

Labels

release-notes (indication that this issue needs to be mentioned in the release notes), state:major-decision (Requires core team consensus)

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

Better limit the rate of establishing connections at a given time

9 participants