Skip to content

upstream: subset load balancer#1857

Merged
htuch merged 17 commits intoenvoyproxy:masterfrom
turbinelabs:slb/lb-4-rebased
Oct 24, 2017
Merged

upstream: subset load balancer#1857
htuch merged 17 commits intoenvoyproxy:masterfrom
turbinelabs:slb/lb-4-rebased

Conversation

@zuercher
Copy link
Copy Markdown
Member

Fixes #1279, and is the final part of splitting #1735.

Provides the actual load balancer for subsets with basic stats for the number of subsets active, created, removed, selected (used for host selection), and fallbacks (used fallback path).

I took a pass at updating the architecture overview and cluster docs -- I'm not sure if what I've done in the cluster docs is a good idea.

Signed-off-by: Stephan Zuercher [email protected]

@mattklein123
Copy link
Copy Markdown
Member

@zuercher I haven't reviewed yet, but in general I'm majorly in favor of how you did the docs here for the v2 stuff, and I think we should be doing this in other v2 places also. At this point I feel that trying to wait for proto tooling to make docs is going to be more trouble than it is worth. @htuch @envoyproxy/maintainers what do you think?

@htuch
Copy link
Copy Markdown
Member

htuch commented Oct 16, 2017

@mattklein123 I can't say I'm super excited to hand write the docs for v2 (but thanks @zuercher for kicking this off). I think I could write a Python script (yes, I know) to do some simple autogenerated docs without a huge amount of work, and that way we have data-plane-api as the canonical source for the APIs and docs. Let's try something quick-and-dirty like this after constraints are in, while we wait for the proto tools. IMHO, it was quite fragile maintaining separate docs and JSON schema in v1, there were places where stuff got stale and out of sync.

@mattklein123
Copy link
Copy Markdown
Member

@htuch FWIW, I really don't think the hand written stuff is that bad. The config stuff does not change that often and the custom formatting that can be done in sphinx is often very useful. It's going to be difficult to write a tool that can make things as flexible and as nice. (Ultimately the tool will need to output RST and allow for customization). With that said, such a thing sounds great if it did exist. :)

Copy link
Copy Markdown
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great, thanks for the final PR. I have some initial feedback, I think the meat of the LB subset creation is still fairly hairy, not sure what could be done to improve beyond more comments as requested.

Load balancer subset statistics
-------------------------------

Statistics for monitoring load balancer subset decisions. Stats are rooted at *cluster.<name>.* and
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you hyperlink back to the subset LB docs here?

.. _config_cluster_manager_cluster_v2_tcp_protocol_options:

tcp_protocol_options
*(optional, object)* TCP protocol options. See the protobuf IDL for details.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A high level suggestion: if we are going to put together a solution in the interim without tooling for docs, I'm wondering why we don't invert this, and have users start by looking at the .proto and then link back to v1 docs on various subjects. What are the tradeoffs here? Are protos+comments too scary for people to look at?

There's going to end up being a lot of redundancy here too once we add constraints. The constraint annotations (https://github.com/lyft/protoc-gen-validate) will be specifying optional vs. required inline with the proto definitions, for example.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I think from the discussion in slack, the plan is to try to wait for automated doc generation.

I can switch the reference from the architecture overview on load balancers to point at the data-plane-api repo for now and remove the changes I made under docs/configuration.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is the last open comment. Should I go ahead and do this?

is a subset with the exact keys and values specified by the route, the subset is used for load
balancing. Otherwise, the fallback policy is used. The cluster's subset configuration must,
therefore, contain a definition that has the same keys as a given route in order for subset load
balancing to occur.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to link back to the design doc here, as that has some useful examples to make the above concrete.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


LoadBalancer* newLoadBalancer(const SubsetPtr& subset);

static const HostSetImpl empty_host_set_;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non-POD statics aren't allowed in Envoy due to the static initialization order fiasco (see https://github.com/envoyproxy/envoy/blob/master/STYLE.md#deviations-from-google-c-style-guidelines). We have a macro to help here:

#define CONSTRUCT_ON_FIRST_USE(type, ...) \

fallback_policy_(subsets.fallbackPolicy()), default_subset_(subsets.defaultSubset()),
subset_keys_(subsets.subsetKeys()), original_host_set_(host_set),
original_local_host_set_(local_host_set) {
RELEASE_ASSERT(subsets.isEnabled());
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is just a regular ASSERT, i.e. a code invariant, not something that could be influenced by external environment.

// the host has a value for each key. The kvs vector is cleared before
// use.
bool SubsetLoadBalancer::extractSubsetMetadata(const std::set<std::string>& subset_keys,
const HostSharedPtr& host, SubsetMetadata* kvs) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Envoy style has output params as reference type rather than pointer.

// recursively finds the matching LbSubsetEntryPtr. If it does not
// exist and allow_create is true, create the new subset.
SubsetLoadBalancer::LbSubsetEntryPtr
SubsetLoadBalancer::findOrCreateSubset(LbSubsetMap& subsets, const SubsetMetadata& kvs, size_t idx,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again with the size_t vs uint32_t nit.

SubsetLoadBalancer::LbSubsetEntryPtr
SubsetLoadBalancer::findOrCreateSubset(LbSubsetMap& subsets, const SubsetMetadata& kvs, size_t idx,
bool allow_create) {
const std::string& name = kvs[idx].first;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add an ASSERT for the bounds check?

void SubsetLoadBalancer::HostSubsetImpl::update(const std::vector<HostSharedPtr>& hosts_added,
const std::vector<HostSharedPtr>& hosts_removed) {
for (const auto host : hosts_added) {
int locality_index = findLocalityIndex(host);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: const.


// Find the index in which the original HostSet's hostsPerLocality vector
// contains the given host.
int SubsetLoadBalancer::HostSubsetImpl::findLocalityIndex(const HostSharedPtr& host) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Independent of this PR, I think the HostSet locality relationship could be improved to provide more direct mapping between a host and its locality equivalence class. Right now, it's just this vector of HostSets, but it could be a map from Locality to HostSet, and each Host knows its own Locality to index with.

Signed-off-by: Stephan Zuercher <[email protected]>
Signed-off-by: Stephan Zuercher <[email protected]>
Signed-off-by: Stephan Zuercher <[email protected]>
@zuercher
Copy link
Copy Markdown
Member Author

Removed most of the docs/configuration changes (left the stats) and pointed the architecture overview at the protobuf def. I pushed the old version to a branch on our fork: turbinelabs/envoy@ecbde3c.

Signed-off-by: Stephan Zuercher <[email protected]>
Copy link
Copy Markdown
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @zuercher this is a super complicated change and I'm going to need more time to look through. I will look again in the morning. I have some questions/comments to get started.

therefore, contain a definition that has the same keys as a given route in order for subset load
balancing to occur.

The `design document
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In addition to linking to the doc, would it be better to just move some of the examples inline here? It seems like it would be better for doc flow. I can go either way.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IOU this change as well.

"cluster: LB type 'original_dst_lb' may only be used with cluser type 'original_dst'");
}

TEST_F(ClusterManagerImplTest, SubsetLoadBalancerRestriction) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make sure we have a cluster manager test that loads a subset LB? I don't think we have that covered?

condition described above.

If subsets are :ref:`configured
<https://github.com/envoyproxy/data-plane-api/blob/master/api/cds.proto#L237>` and a route specifies
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: it would be best to link to a specific SHA here so things don't move.

@@ -0,0 +1,1009 @@
#include <algorithm>
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you look at the coverage build in CI and make sure we are at 100% coverage for the subset code? There are a ton of edge cases and I want to make sure everything is covered.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 lines are not covered. One is a NOT_REACHED macro. The other is unreachable due to the way that overlapping subsets are created, but would look like a mistake if I removed it.

// Find the index in which the original HostSet's hostsPerLocality vector
// contains the given host.
Optional<uint32_t>
SubsetLoadBalancer::HostSubsetImpl::findLocalityIndex(const HostSharedPtr& host) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I would make this take a const Host&

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left this one for now. I'm hopeful that simplifying the local host set support will make this whole function go away.

}
}

bool SubsetLoadBalancer::hostMatchesDefaultSubset(const HostSharedPtr& host) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: const Host&

// the host has a value for each key. The kvs vector is cleared before
// use.
bool SubsetLoadBalancer::extractSubsetMetadata(const std::set<std::string>& subset_keys,
const HostSharedPtr& host, SubsetMetadata& kvs) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const Host&

// No need to invoke update with local = true: subset creation handles the local hosts.
local_host_set->addMemberUpdateCb([this](const std::vector<HostSharedPtr>& hosts_added,
const std::vector<HostSharedPtr>& hosts_removed) {
update(hosts_added, hosts_removed, true);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little confused about the complex handling around the local host set. Since we only use the local host set to determine how many sendings hosts we have, and across what localities, can't we basically just use a global copy and do something similar to regenerateZoneRouting... for each target subset when it updates?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have to say this whole part of the load balancing code mystified me. I don't think I even fully understand what the zone-aware load balancing is meant to do (even ignoring subsets). In the end I figured it was safest to treat it the same way as the main host set, hence the complexity. Is the code just cheating by using a HostSetImpl for local host set?

return true;
}

bool SubsetLoadBalancer::hostMatches(const SubsetMetadata& kvs, const HostSharedPtr& host) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const Host&

condition described above.

If subsets are `configured
<https://github.com/envoyproxy/data-plane-api/blob/master/api/cds.proto#L237>` and a route specifies
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: better to link to a SHA so that it doesn't move.

Signed-off-by: Stephan Zuercher <[email protected]>
Signed-off-by: Stephan Zuercher <[email protected]>
@zuercher
Copy link
Copy Markdown
Member Author

Removing tracking of the local host set does simplify things a bit. I also remove the Subset nested class (it held the filtered host set, filtered local host set, and lb for a subset) since it wasn't really providing any benefit.

Copy link
Copy Markdown
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still mid-way through review, but a few comments so far. Thanks for all the great comments and docs, very helpful.


::

{
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Vague preference for YAML for v2 examples, it's more human friendly for docs.

HostConstSharedPtr chooseHost(LoadBalancerContext* context) override;

private:
LoadBalancerType lb_type_;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: const (here and other places below that make sense, e.g. fallback_policy_).


HostConstSharedPtr SubsetLoadBalancer::chooseHost(LoadBalancerContext* context) {
if (context) {
bool host_chosen;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason that we can't use nullptr on tryChooseHostFromContext to indicate !host_chosen? I think it's because if the underlying LB policy returns nullptr that you don't want to fallback (is there an example where this happens?). If so, can you add a short one-liner here to explain this relationship?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My reasoning is that if we have an empty subset, we treat it the same as no subset use fallback policy. If the subset has any hosts, we return the underlying LB's response, even if it's nullptr. The underlying LB tries very hard never to return nullptr, but I think it's possible if someone were to set the global panic threshold to 0. I'll add a comment.

fallback_subset_.reset(new LbSubsetEntry());
fallback_subset_->initLoadBalancer(*this, predicate);
} else {
if (!hosts_added.empty() || !hosts_removed.empty()) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why doesn't the ANY_ENDPOINT update above have this same guard? Also, what about guarding the update in the subset modification below?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a holdover from when the code was trying to avoid invoking update when no hosts in the subset changed. I'll take it out.

if (!kvs.empty()) {
// The host has metadata for each key, find or create its subset.
LbSubsetEntryPtr entry = findOrCreateSubset(subsets_, kvs, 0);
if (entry != nullptr) {
Copy link
Copy Markdown
Member

@htuch htuch Oct 21, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Give we say above that the plan is to "find or create its subset", how can this now be nullptr? I'd recommend a one-liner explaining how it could be this way, or an ASSERT stating it can't.

// corresponding to the key from the host.
SubsetMetadata kvs = extractSubsetMetadata(keys, *host);
if (!kvs.empty()) {
// The host has metadata for each key, find it's subset.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: s/it's/its/

Signed-off-by: Stephan Zuercher <[email protected]>
Copy link
Copy Markdown
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks this is way easier to understand with the local host set stuff gone. some random comments then I will take a final pass.

host metadata. Finally, the ``DEFAULT_SUBSET`` causes fallback to load balance among hosts that
match a specific set of metadata.

.. _configured: https://github.com/envoyproxy/data-plane-api/blob/9897e3f/api/cds.proto#L237
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not familiar with this construct in sphinx. What does this render?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An external link at configured_ in the previous paragraph. The construct for internal links wasn't actually rendering correctly. That said, looking at more than just the RST cheatsheet, it looks like there's a way to inline it. I'll see if I can get that t work.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's fine either way. Was just curious.

HostConstSharedPtr chooseHost(LoadBalancerContext* context) override;

private:
const LoadBalancerType lb_type_;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Can we put all variables together. In the rest of the code we do in each visibility section:

  1. struct/inner class definitions
  2. method definitions
  3. variables

});
}

SubsetLoadBalancer::~SubsetLoadBalancer() {}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: del empty destructor impl/definition.

fallback_subset_->initLoadBalancer(*this, predicate);
} else {
// Subsequent updates: add/remove hosts
fallback_subset_->host_subset_->update(hosts_added, hosts_removed, predicate);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Given duplication here and at 150, is it possible to make the logic flow so that we always call update() on the fallback_subset_ and just have an initialize step if it doesn't yet exist? That might make the logic easier to quickly grok. If this is a big deal feel free to ignore.

if (!kvs.empty()) {
// The host has metadata for each key, find or create its subset.
LbSubsetEntryPtr entry = findOrCreateSubset(subsets_, kvs, 0);
ASSERT(entry != nullptr);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: This will crash in an obvious way at 167 so I would remove.

return true;
}

bool SubsetLoadBalancer::hostMatches(const SubsetMetadata& kvs, const Host& host) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be a template function and merge with hostMatchesDefaultSubset ? Feel free to ignore if not worth it.

Copy link
Copy Markdown
Member Author

@zuercher zuercher Oct 23, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Writing a template function for this is easy. But it seems that the use of std::bind makes it so that I have to explicitly call out the specialization, which is pretty hard to read.

For example (post fix-format):

predicate = std::bind(&SubsetLoadBalancer::hostMatchesMetadata<
                          Protobuf::Map<ProtobufTypes::String, ProtobufWkt::Value>>,
                      this, std::placeholders::_1, default_subset_.fields());

I'm inclined to leave it as is, but I'll keep the change stashed if you prefer the template function.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure just drop it.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had the same reaction, but don't see a better solution.

}
}

// not found, but don't want creation
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: start with capital, period end of sentence (same elsewhere if applicable, haven't been paying full attention)

return findOrCreateSubset(entry->children_, kvs, idx);
}

// Initialize a new HostSubsetImpl and LoadBalancer from the
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: OK to flow comments out to 100 cols

Copy link
Copy Markdown
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick follow up from my last comment. Will take another full pass later or tomorrow morning latest.

void initLoadBalancer(const SubsetLoadBalancer& subset_lb, HostPredicate predicate);
};

const LoadBalancerType lb_type_;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

local vars go last (after method definitions). (Apologize if I mistyped this. I can't remember).

Copy link
Copy Markdown
Member Author

@zuercher zuercher Oct 23, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You didn't. I'm just having one of those days.

Signed-off-by: Stephan Zuercher <[email protected]>
Copy link
Copy Markdown
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few more random comments. Thank you for being patient @zuercher this is a super complicated change and I'm focusing mostly on readability now.

@htuch I think it would be good for you to take another pass now also if you are planning on reviewing further.


private:
const HostSet& original_host_set_;
bool empty_;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this var? Can we just look inside HostSetImpl hosts() to see if it's empty?

public:
LbSubsetEntry() {}

LbSubsetMap children_;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can we reorder vars/methods here also

const std::vector<HostSharedPtr>& hosts_removed) {
updateFallbackSubset(hosts_added, hosts_removed);

std::unordered_set<LbSubsetEntryPtr> subsets_modified;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

random thought: 151-157 is the same in the added/removed case. Could have some type of function which takes a lambda which then gets called back with with host/subset pairs to act on. Might make the actual logic in this function a bit easier to read. Take it or leave it.

const auto vs_it = value_subset_map.find(value);
if (vs_it != value_subset_map.end()) {
LbSubsetEntryPtr entry = vs_it->second;
idx++;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This entire flow is duplicated here. Can we handle uninitialize case and then do the same flow for both cases?

- remove empty_, fix var/function ordering
- reorganize findOrCreateSubset to remove code duplication
- refactor update to separate finding modified subsets from the actual updating

Signed-off-by: Stephan Zuercher <[email protected]>
Copy link
Copy Markdown
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few more random things but LGTM. @htuch do you plan on looking at this again?

updateFallbackSubset(hosts_added, hosts_removed);

processSubsets(hosts_added, hosts_removed,
[&](LbSubsetEntryPtr entry, HostPredicate predicate, bool addingHost) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: adding_host

std::unordered_set<LbSubsetEntryPtr> subsets_modified;
bool adding = true;

for (const auto& hosts : {hosts_added, hosts_removed}) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q: Do you know if this copies hosts_added/hosts_removed? It might. Not sure if compiler is smart enough here to deal with this unless you take address and then deal with pointers.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, it's copying the vectors. I fixed this (and hopefully made the hacky bit you mentioned above better) by creating a variable to hold the array.

const std::vector<HostSharedPtr>& hosts_added, const std::vector<HostSharedPtr>& hosts_removed,
std::function<void(LbSubsetEntryPtr, HostPredicate, bool)> cb) {
std::unordered_set<LbSubsetEntryPtr> subsets_modified;
bool adding = true;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit hacky but I can live with it. I would just drop a comment that says something like "true on first iteration, false on second".

processSubsets(hosts_added, hosts_removed,
[&](LbSubsetEntryPtr entry, HostPredicate predicate, bool addingHost) {
if (entry->initialized()) {
bool active_before = entry->active();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const

@htuch
Copy link
Copy Markdown
Member

htuch commented Oct 24, 2017

I will give it another pass later today..

Copy link
Copy Markdown
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM pending any @htuch comments.

Copy link
Copy Markdown
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, I really like the comments, docs and how simple some parts of this now look. Thanks for the contribution.

} else {
// Subsequent updates: add/remove hosts.
fallback_subset_->host_subset_->update(hosts_added, hosts_removed, predicate);
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like how simple this method has become.

return true;
}

bool SubsetLoadBalancer::hostMatches(const SubsetMetadata& kvs, const Host& host) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had the same reaction, but don't see a better solution.

@htuch htuch merged commit e8a6881 into envoyproxy:master Oct 24, 2017
@zuercher zuercher deleted the slb/lb-4-rebased branch October 25, 2017 22:37
rshriram pushed a commit to rshriram/envoy that referenced this pull request Oct 30, 2018
* Update Envoy SHA to latest.

Signed-off-by: Piotr Sikora <[email protected]>

* review: install pkg-config on CircleCI.

Signed-off-by: Piotr Sikora <[email protected]>

* review: install pkg-config on build image.

Signed-off-by: Piotr Sikora <[email protected]>
mathetake pushed a commit that referenced this pull request Mar 3, 2026
…out inprogress (#1860)

**Description**
There have been sequences when the routes have been deleted and created.
The extProc sidecar would not be injected. We noticed this due to a
condition when the rollout is in progress, and a few pods are in a
terminating state but have an extProc sidecar in them. It does not
inject the sidecar in some cases (e.g., with multiple replicas).

To fix this issue, we add a check to look for deployment/daemonset
status if they are in progress. Requeue with a 5-second timeout.
Additionally, check whether the sidecar is present in each pod. This is
to ensure that, in multi-replica cases, there are no scenarios with
inconsistent state (some with sidecar and some without). If there is an
inconsistent state, we force rollout.

Changes:
- Skip terminating pods from checks and Consolidated checking of pods
with sidecar into `checkPodHasSideCar()`
- Included `isRolloutInProgress()` for checking inprogress


**Related Issues/PRs (if applicable)**
Fixes: #1857 

**Special notes for reviewers (if applicable)**

Tested this change by running a sequence of creation -> deletion of
routes. After the sequence was complete. The replicas were inconsistent
state which triggered a force rollout

```
INFO	controller.gateway	pods are inconsistent while rollout is stable, forcing rollout	{"podsWithSidecarSeen": true, "podsWithoutSidecarSeen": true}
```

Requesting thoughts/feedback on this approach

---------

Signed-off-by: Gavrish Prabhu <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants