[RFC] cluster: introduce futureThreadLocalCluster interface to cluster manager#15053
[RFC] cluster: introduce futureThreadLocalCluster interface to cluster manager#15053lambdai wants to merge 1 commit intoenvoyproxy:mainfrom
Conversation
Signed-off-by: Yuchen Dai <[email protected]> DelayedFutureCluster and refactor tcp proxy tests Signed-off-by: Yuchen Dai <[email protected]> fix LocalClosedBeforeClusterIsReady Signed-off-by: Yuchen Dai <[email protected]> fix TcpProxyFutureClusterTest Signed-off-by: Yuchen Dai <[email protected]>
|
This looks like what we'd need for our use case. Since some details are missing, I presume the final implementation would have a failure pathway of sorts, but otherwise it appears that its goal is to provide a somewhat simple interface for establishing a callback for when the cluster resolves. |
|
At a very high level this looks right to me, however, I think like VHDS the actual lazy loading should be in its own HTTP or network filter that runs before the other filters. I think this would be a cleaner (and less risky design). My advice would be to first build the future API into CM, and then we can make the filters that use it? |
|
+1 on separating the lazy load to a filter. By default, CM should not have to make decisions and we should have non-lazy EDS. |
|
@bryce-anderson Thank you for the feedback! The goal of this PR is to explain the necessary changes to the data path including CM beyond http request. I absolutely agree tthat the eventual on-demand CDS/EDS needs a fallback strategy. |
|
@htuch @mattklein123
|
|
@lambdai I think one thing that we would want to make sure is possible is receiving a CDS update, storing the cluster config locally in proto but not instantiating a full |
|
re: (1), yes, agreed, let's add this functionality and test it as an independent PR. It's not very complicated as all the parts are already there. I would note that we don't explicitly need this to be part of the cluster manager API, since with the callback mechanisms in place I think this could be implemented on top. I don't feel strongly about it but you might consider a wrapper class. re: (2) for HTTP I agree this should be a filter. For TCP, I'm not crazy about making the tcp_proxy filter even more complicated, so it might be worth it to see if we can refactor somehow to allow TCP route selection to be done by a previous filter? I think maybe this is possible? It's worth investigating. re: (3) I agree with what @htuch said, but this part is complicated enough that I would recommend doing an independent design doc on this portion to make sure we are all on the same page. |
|
Thank you for the feedback! I am drafting and I will post in the linked issue |
|
I think we can move lazy local expansion to a separate issue, but the API (and ideally implementation) should be forward thinking with this use case in mind, as it's something likely to come up in MT scenarios where on-demand doesn't work for reliability reasons. |
|
This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions! |
|
This pull request has been automatically closed because it has not had activity in the last 37 days. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions! |
Partially address #15026
This is a strawman that explains how ClusterManager embraces on-demand cluster by adding
futureThreadLocalCluster()method . The ClusterManager user(e.g. the downstream terminated filter) need to switch fromgetThreadLocalCluster(). I rewrite the tcp proxy filter and add the basic test.Note that tcp proxy does not need the details of why FutureCluster is not ready.
In the
TcpProxyFutureClusterTestI introduced aDelayedFutureClusterto emulate the ondemand cluster config update.In the traditional TcpProxyTest I introduced a
ReadyFutureClusterto explain how to migrate tofutureThreadLocalClusterand fromgetThreadLocalCluster()Notes for reviewers:
I refactor the tcp proxy test to reduce the time on compile and run test. I can create a separate PR to land it in master if you like.
The newly added test live in
test/common/tcp_proxy/tcp_proxy_cluster_test.cc.Commit Message:
Additional Description:
Risk Level:
Testing:
Docs Changes:
Release Notes:
Platform Specific Features:
[Optional Runtime guard:]
[Optional Fixes #Issue]
[Optional Deprecated:]
[Optional API Considerations:]