Skip to content

api: add cluster_specifier_plugin to RouteAction#16944

Merged
mattklein123 merged 6 commits intoenvoyproxy:mainfrom
dfawley:cluster_specifier_plugin
Jun 29, 2021
Merged

api: add cluster_specifier_plugin to RouteAction#16944
mattklein123 merged 6 commits intoenvoyproxy:mainfrom
dfawley:cluster_specifier_plugin

Conversation

@dfawley
Copy link
Copy Markdown
Member

@dfawley dfawley commented Jun 11, 2021

Commit Message: api: add cluster_specifier_plugin to RouteAction
Additional Description:

This is a new extension point that will allow for dynamic cluster selection.
The plugin configured by it will be responsible for returning a cluster to
Envoy through its API (to be defined). Since the cluster returned is dynamic,
a new field is added to Route to list all the clusters it may return. This
allows for proxy implementations that are not using wildcard CDS queries to
pre-fetch clusters.

This feature may ultimately replace the FilterAction mechanism, as it provides
the same functionality for known use cases, but is simpler to implement and
use.

Risk Level: Low
Testing: N/A
Docs Changes: Included in PR.
Release Notes: N/A
Platform Specific Features: N/A

cc @htuch, @markdroth

Related to #8953, #8956

This is a new extension point that will allow for dynamic cluster selection.
The plugin configured by it will be responsible for returning a cluster to
Envoy through its API (to be defined).  Since the cluster returned is dynamic,
a new field is added to Route to list all the clusters it may return.  This
allows for proxy implementations that are not using wildcard CDS queries to
pre-fetch clusters.

This feature may ultimately replace the FilterAction mechanism, as it provides
the same functionality for known use cases, but is simpler to implement and
use.

Signed-off-by: Doug Fawley <[email protected]>
@repokitteh-read-only
Copy link
Copy Markdown

CC @envoyproxy/api-shepherds: Your approval is needed for changes made to api/envoy/.
envoyproxy/api-shepherds assignee is @markdroth
CC @envoyproxy/api-watchers: FYI only for changes made to api/envoy/.

🐱

Caused by: #16944 was opened by dfawley.

see: more, trace.

Copy link
Copy Markdown
Contributor

@markdroth markdroth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for putting this together!

// [#not-implemented-hide:]
// Additional clusters which are not referenced directly by a RouteAction
// cluster_specifier, but may be selected by a cluster_specifier_plugin.
repeated string additional_clusters = 19;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this field should be in RouteConfiguration, not in Route. That way, if there are many routes that use the same cluster_specifier_plugin, we don't need to specify this list of additional clusters multiple times.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, right, that makes more sense. Done.


// [#not-implemented-hide:]
// Additional clusters which are not referenced directly by a RouteAction
// cluster_specifier, but may be selected by a cluster_specifier_plugin.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be a good idea to explicitly state that this is intended for clients (like gRPC) that fetch the specific set of cluster resources referred to in the RouteConfiguration rather than making a wildcard CDS subscription.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good; done.

@markdroth
Copy link
Copy Markdown
Contributor

This looks great to me!

@mattklein123, can you provide a non-Google review here? For context, the reason why we're coming back to this approach instead of using FilterAction is that, while FilterAction is a more general solution, it also turns out to be much more complicated to implement. We have several use-cases right now that need to dynamically determine the cluster, but we don't have any use-cases that I'm aware of that actually need the additional generality of FilterAction. Given all of that, we'd prefer to go with this approach.

Please let us know if you have any questions. Thanks!

Copy link
Copy Markdown
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW I'm good with this PR, but @mattklein123 should provide non-Google coverage as next step.


// [#not-implemented-hide:]
// Additional clusters which are not referenced directly by a RouteAction
// cluster_specifier, but may be selected by a cluster_specifier_plugin.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: add a RST link to cluster_specifier, *RouteAction* etc.

@mattklein123 mattklein123 self-assigned this Jun 16, 2021
Copy link
Copy Markdown
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general this makes sense to me. One thought.

/wait-any

// This is provided for xDS clients like gRPC that fetch the set of cluster
// resources referenced by the RouteConfiguration, and not by a wildcard CDS
// subscription.
repeated string additional_clusters = 12;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thought: WDYT about colocating this with cluster_specifier_plugin perhaps in a wrapper message? When the client loads the route it could just as easily keep a set of additional_clusters that it comes across? Another option would be to elide this config entirely and just have the additional clusters concept be part of the plugin API that is read during load, though I understand that is probably not desirable given the desire to separate code from config.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm definitely open to adding a wrapper message, and optionally including this field in it. (Incidentally, I like the idea of a wrapper message anyway for a possible extension point, but I'll defer to whatever the convention is for Envoy.) However, the reason this is in RouteConfiguration is to avoid the situation where many different routes all reference the same plugin, which is capable of returning a large set of clusters (I've found it's ~1.5kb in practice for one data point). This could lead to config bloat, so we decided to move it here to be able to deduplicate.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could lead to config bloat, so we decided to move it here to be able to deduplicate.

Fair enough, but then do you need the cluster_specifier_plugin on each route? Could it all just live at the route level and we can add vhost/route overrides later if needed?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that putting the actual plugin config at the top level would work, because you might need different plugins for different routes.

If the plugin config itself gets too large and we don't want to repeat it in multiple routes, we can move the bulk of the config to an ECDS resource, so that the only thing we need to include in each route is a ConfigSource pointing to the ECDS resource.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think my point is that saying that one part of this might get too big for the route while the other one might not sounds strange to me. Why not just have a wrapper message and put it at the top level, and then also allow overriding it at the vhost or route level? This is how we do most other things.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think my point is that saying that one part of this might get too big for the route while the other one might not sounds strange to me.

One key difference is that the config of the plugin is opaque to xDS clients, but the additional clusters needs to be understood.

Another option would be:

  • In RouteConfiguration, a map from string (specifier plugin instance name) to ClusterSpecifierPluginConfig which includes a field for additional_clusters.
  • Make cluster_specifier_plugin a string that references that map.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think ^ would work. I would definitely do the wrapper message. What I'm saying though is to do the following:

  1. ClusterSpecifierPluginConfig at route config level.
  2. ClusterSpecifierPluginConfig at individual route level (and maybe vhost level)
  3. Most specific ClusterSpecifierPluginConfig wins. This is probably generally wire efficient enough for almost all uses cases without the indirection.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Matt, I think your point about not treating these two pieces of data separately is well taken. As it happens, we just realized that we don't actually need the list of clusters anyway, so we're now back to just having the plugin config. However, I agree that it would be nice to have a way to avoid duplicating the plugin config in each route without having to use ECDS. So I think you're right that we should find a way to move that to the top level of the RouteConfiguration.

That having been said, I am thinking that we should design this in a way that makes it efficient to have multiple plugins (we don't need this now but might in the future), each of which is used in multiple routes. For example, consider a case where we have two plugins defined, plugin1 and plugin2, and we have the following list of routes:

  1. cluster=A
  2. cluster_specifier_plugin=plugin1
  3. cluster_specifier_plugin=plugin2
  4. cluster_specifier_plugin=plugin1
  5. cluster_specifier_plugin=plugin2
  6. cluster_specifier_plugin=plugin1
  7. cluster_specifier_plugin=plugin2

If we do this as a single ClusterSpecifierPluginConfig per RouteConfiguration that can be overridden in individual routes, then we would need to set the top-level ClusterSpecifierPluginConfig to refer to plugin1, and then we'd need to duplicate the ClusterSpecifierPluginConfig for plugin2 in the three routes that need to use plugin2 instead of plugin1. Also, even for the routes that do use plugin1, we'd still need a new field in the cluster_specifier oneof in the individual routes to indicate that it should use the top-level ClusterPerSpecifierPluginConfig instead of specifying a particular cluster. So we'd wind up adding two new fields in each route, like so:

oneof cluster_specifier {
  // Existing fields.
  string cluster = N;
  string cluster_header = N;
  WeightedClusters weighted_clusters = N;

  // New field to indicate using the top-level ClusterSpecifierPluginConfig.
  bool use_plugin = N;
  // New field to allow overriding the ClusterSpecifierPluginConfig.
  ClusterSpecifierPluginConfig plugin_config = N;
}

And the resulting RouteConfiguration would look like this, with the plugin2 config duplicated three times (textproto format):

// Top-level field in RouteConfig
cluster_specifier_plugin: { name: "plugin1" typed_config: { [envoy.extensions.cluster_specifier.plugin1.Plugin1] { ... } } }

// ...

// Route 1
cluster: "A"
// Route 2
use_plugin: true
// Route 3
plugin_config: { name: "plugin2" typed_config: { [envoy.extensions.cluster_specifier.plugin2.Plugin2] { ... } } }
// Route 4
use_plugin: true
// Route 5
plugin_config: { name: "plugin2" typed_config: { [envoy.extensions.cluster_specifier.plugin2.Plugin2] { ... } } }
// Route 6
use_plugin: true
// Route 7
plugin_config: { name: "plugin2" typed_config: { [envoy.extensions.cluster_specifier.plugin2.Plugin2] { ... } } }

This duplicates the plugin2 config 3 times, which is both inefficient and somewhat counter-intuitive, especially since there will generally be only one actual instance of plugin2.

I think we can avoid a lot of that complexity by specifying the list of plugins at the top-level, keyed by instance name, and then referring to those plugins by name in the individual routes. The idea would be to have something like this in the top-level of the RouteConfiguration:

// A list of cluster_specifier plugin instances.  All names in the list must be unique.
// Note: Ideally, this would be a map, but the map key would just duplicate the name field in TypedExtensionConfig.
repeated core.v3.TypedExtensionConfig cluster_specifier_plugin_instances = N;

And then all we need in the cluster_specifier oneof is a single new field:

// Refers to the plugins defined in the RouteConfiguration top-level cluster_specifier_plugin_instances field.
string cluster_specifier_plugin_name = N;

That way, we can configure multiple plugins like this (textproto format):

cluster_specifier_plugin_instances: [
  { name: "plugin1", typed_config: { [envoy.extensions.cluster_specifier.plugin1.Plugin1] { ... } } },
  { name: "plugin2", typed_config: { [envoy.extensions.cluster_specifier.plugin2.Plugin2] { ... } } }
]

And then individual routes could specify either "plugin1" or "plugin2", and there could be multiple routes referring to each of the two plugins.

WDYT?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup ^ sounds great. Let's do it!

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done with minor style changes, and included a wrapper message for future proofing - PTAL.

Copy link
Copy Markdown
Contributor

@markdroth markdroth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great! Just one minor nit in the doc comment to address.

@markdroth
Copy link
Copy Markdown
Contributor

I'll let @mattklein123 do the approval here, but this looks great!

@alyssawilk
Copy link
Copy Markdown
Contributor

hey @markdroth Matt is on leave this week, so would you be up for approving or throwing over to another shephard if you wanted a second pair of eyes?

@markdroth
Copy link
Copy Markdown
Contributor

The impetus for this comes from here at Google, so I'd prefer to have a non-Google API shepherd give the final approval.

@lizan, can you please take a look? Thanks!

Copy link
Copy Markdown
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks LGTM with small nit.

/wait

Comment on lines +844 to +846
// Name of the cluster specifier plugin to use to determine the cluster for
// requests on this route. The plugin name must be defined in the associated
// :ref:`envoy_v3_api_msg_config.route.v3.RouteConfiguration.cluster_specifier_plugins`.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: pedantically, can you mention that this string refers to the name inside the TypedExtensionConfig? It's not completely clear to the reader what this string refers to.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW I'm not 100% sure about the refs -- it looks like envoy_v3_api_msg_config is used to link to a message and envoy_v3_api_field_config is used to link to a field within a message. Is that right?

Also I'm having some trouble running fix_format, so this should not be merged until I've worked that out. (I assume the presubmits will fail anyway.)

mattklein123
mattklein123 previously approved these changes Jun 28, 2021
Copy link
Copy Markdown
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@dfawley
Copy link
Copy Markdown
Member Author

dfawley commented Jun 29, 2021

envoy-presubmit (check linux_x64 coverage) Failing after 72m — check linux_x64 coverage failed

This doesn't seem legit -- is it? If not, is there a way to re-run it without invalidating the approval?

@mattklein123
Copy link
Copy Markdown
Member

This doesn't seem legit -- is it? If not, is there a way to re-run it without invalidating the approval?

Seems like a flake. I will force merge.

@mattklein123 mattklein123 merged commit 6dc4092 into envoyproxy:main Jun 29, 2021
baojr added a commit to baojr/envoy that referenced this pull request Jul 1, 2021
* main: (51 commits)
  listener: add filter chain match support for direct source address (envoyproxy#17118)
  Increase common/common coverage (envoyproxy#17193)
  crash_dump: Added local_end_stream_ to crash dump for H2. (envoyproxy#17199)
  codeql: improve Ubuntu dependency installation (envoyproxy#16556)
  ci: Move tooling tests to tooling job (envoyproxy#17071)
  Fix issue with Windows container image (envoyproxy#17113)
  fix filter linking urls (envoyproxy#17185)
  bug fix: fix bug that check_format.py will check files which are ignored (envoyproxy#17195)
  tls: moving the server name into SocketAddressProvider (envoyproxy#16574)
  network: Use std::make_unique and std::make_shared in source/common/network instead of bare new() (envoyproxy#17177)
  Revert "alpha matching: support generic action factory context (envoyproxy#17025)" (envoyproxy#17191)
  ci: Dont clone filter example where not required (envoyproxy#17182)
  alpha matching: support generic action factory context (envoyproxy#17025)
  xds: Clarify comment for RouteMatch.case_sensitive field. (envoyproxy#17176)
  ci: Only publish the required docker image (envoyproxy#17080)
  coverage: fixing flake (envoyproxy#17190)
  api: add cluster_specifier_plugin to RouteAction (envoyproxy#16944)
  wasm: update V8 to v9.2.230.13. (envoyproxy#17183)
  wasm: update Proxy-Wasm C++ Host and SDK to latest (2021-06-24). (envoyproxy#17174)
  owners: add Piotr as senior extension maintainer (envoyproxy#17175)
  ...

Signed-off-by: Garrett Bourg <[email protected]>
chrisxrepo pushed a commit to chrisxrepo/envoy that referenced this pull request Jul 8, 2021
This is a new extension point that will allow for dynamic cluster selection.
The plugin configured by it will be responsible for returning a cluster to
Envoy through its API (to be defined).  Since the cluster returned is dynamic,
a new field is added to Route to list all the clusters it may return.  This
allows for proxy implementations that are not using wildcard CDS queries to
pre-fetch clusters.

This feature may ultimately replace the FilterAction mechanism, as it provides
the same functionality for known use cases, but is simpler to implement and
use.

Signed-off-by: Doug Fawley <[email protected]>
Signed-off-by: chris.xin <[email protected]>
leyao-daily pushed a commit to leyao-daily/envoy that referenced this pull request Sep 30, 2021
This is a new extension point that will allow for dynamic cluster selection.
The plugin configured by it will be responsible for returning a cluster to
Envoy through its API (to be defined).  Since the cluster returned is dynamic,
a new field is added to Route to list all the clusters it may return.  This
allows for proxy implementations that are not using wildcard CDS queries to
pre-fetch clusters.

This feature may ultimately replace the FilterAction mechanism, as it provides
the same functionality for known use cases, but is simpler to implement and
use.

Signed-off-by: Doug Fawley <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants