upstream: allow configuration of connection pool limits#6298
upstream: allow configuration of connection pool limits#6298mattklein123 merged 20 commits intoenvoyproxy:masterfrom
Conversation
We want to limit the number of connection pools per cluster. Add it to the circut breaker thresholds so we can do it per priority. Signed-off-by: Kyle Larose <[email protected]>
Signed-off-by: Kyle Larose <[email protected]>
Signed-off-by: Kyle Larose <[email protected]>
Signed-off-by: Kyle Larose <[email protected]>
Signed-off-by: Kyle Larose <[email protected]>
Signed-off-by: Kyle Larose <[email protected]>
|
There are a few TODOs here for my next steps. First, though, I want to make sure my approach is good. Does everyone agree with using circuit breakers to limit the number of connection pools?
Following that, I think I can move on from the connection pool map work and add an http plugin to infer the original source from the XFF. Finally, snooping through the open PRs, I noticed: #6269. There were some concerns raised there about increasing memory use by adding more stats. Does my circuit breaker approach exacerbate the problem/does it have a similar problem? Ultimately, I think any solution I take that is per cluster will need a similar number of stats per cluster. I guess the question is: how can I prevent excess resource consumption if no limits are necessary? |
|
@mattklein123 Hey Matt, I finally got around to implementing the next phase of this. Sorry it took so long: flu + conference == not much time for other stuff. PTAL! I'm not sure my approach is any good (nobody replied to my slack question about it. :)) So, any feedback is appreciated. |
snowp
left a comment
There was a problem hiding this comment.
Thanks, this approach makes sense to me.
It would probably be good to document the intended use for this vs the max connections circuit breaker. And yes, a release note seems in order since we're adding a new circuit breaker.
|
@snowp Thanks for the quick feedback. I'll address your comments today. |
- Fixed some nits and typos - Removed priority from the http conn pool hash key Signed-off-by: Kyle Larose <[email protected]>
Signed-off-by: Kyle Larose <[email protected]>
Signed-off-by: Kyle Larose <[email protected]>
Signed-off-by: Kyle Larose <[email protected]>
Somehow missed 'Guage` despite ample proper examples around it... Signed-off-by: Kyle Larose <[email protected]>
snowp
left a comment
There was a problem hiding this comment.
One comment but I think this looks good
mattklein123
left a comment
There was a problem hiding this comment.
Thanks, cool stuff. A bunch of random comments but generally LGTM.
/wait
Signed-off-by: Kyle Larose <[email protected]>
Signed-off-by: Kyle Larose <[email protected]>
Signed-off-by: Kyle Larose <[email protected]>
Merge in new connection pool limits with changes to track resources remaining. Signed-off-by: Kyle Larose <[email protected]>
|
@mattklein123 Thanks for the review. I've addressed most of your comments. I've left two open -- they need a bit more feedback. |
mattklein123
left a comment
There was a problem hiding this comment.
Thanks for the follow ups. I dropped a comment on the deferred delete issue and flushing a comment here about the TODO. Thank you!
/wait
This reverts commit 3a7dce7. Signed-off-by: Kyle Larose <[email protected]>
- Add a todo re: resource counters - Go back to old method of bypassing deferredDelete Signed-off-by: Kyle Larose <[email protected]>
Signed-off-by: Kyle Larose <[email protected]>
|
I've addressed the last few comments. Thanks. |
mattklein123
left a comment
There was a problem hiding this comment.
Thanks, LGTM. I found one nit if you don't mind fixing.
/wait
Signed-off-by: Kyle Larose <[email protected]>
Signed-off-by: Kyle Larose <[email protected]>
|
/retest |
|
🙀 failed invoking rebuild of |
Description: We recently added a connection pool map class which maps keys to connection pools. A new key means a new connection pool. To prevent unbounded growth of connection pools, we placed a limit on the pool. Allocating beyond the limit causes the map to attempt to reclaim unused pools prior to allocating a new one. If none can be reclaimed, the map returns a failure.
We did not add the ability to configure the limit; its default was unlimited. This commit adds the ability to configure a limit by extending the cluster circuit breaker thresholds with a new optional configuration field: max_connection_pools.
Worth noting:
Risk Level: Medium
We're in the main code path for http. But, the default behaviour should not have changed, so there shouldn't be any major impact.
Testing:
Unit testing + integration testing
I spun up a local envoy instance, showing that:
Docs Changes: Protobuf docs + additions to the circuit breaking section in the overview.
Release Notes: Added a release not for the new circuit breaker.