round_robin: periodically retry connecting to failed subchannels #11643

@c-bb

As discussed in #11578, the round_robin load-balancer should periodically attempt to reconnect to failed subchannels.

As it stands, if a service lives on n remote servers, all of which eventually undergo downtime or maintenance, a long-lived client gradually loses connectivity to more and more of them until only one is left. Only when that last server goes down does the client reconnect to all of them. Not especially great for load balancing.
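To make the requested behaviour concrete, here is a minimal toy sketch in Python. It is not gRPC's implementation or API: `Subchannel`, `RoundRobinPicker`, `mark_failed`, `retry_failed`, and the `retry_interval` parameter are all hypothetical names, and the fixed retry interval is an assumption made for illustration. The idea is only that failed subchannels are retried on a timer instead of waiting for the last healthy one to fail.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Subchannel:
    """Hypothetical stand-in for a connection to one backend address."""
    address: str
    ready: bool = True
    next_retry_at: float = 0.0


class RoundRobinPicker:
    """Toy round-robin picker that periodically retries failed subchannels."""

    def __init__(self, addresses: List[str], retry_interval: float = 5.0):
        self.subchannels = [Subchannel(a) for a in addresses]
        self.retry_interval = retry_interval  # assumed fixed interval, in seconds
        self._next = 0

    def mark_failed(self, address: str, now: float) -> None:
        # Take the subchannel out of rotation and schedule a reconnect attempt.
        for sc in self.subchannels:
            if sc.address == address:
                sc.ready = False
                sc.next_retry_at = now + self.retry_interval

    def retry_failed(self, now: float, can_connect: Callable[[str], bool]) -> None:
        # Retry every failed subchannel whose retry time has arrived,
        # regardless of how many other subchannels are still healthy.
        for sc in self.subchannels:
            if not sc.ready and now >= sc.next_retry_at:
                if can_connect(sc.address):
                    sc.ready = True
                else:
                    sc.next_retry_at = now + self.retry_interval

    def pick(self) -> Optional[str]:
        # Rotate over the subchannels that are currently ready.
        ready = [sc for sc in self.subchannels if sc.ready]
        if not ready:
            return None
        sc = ready[self._next % len(ready)]
        self._next += 1
        return sc.address
```

With three backends and one marked as failed, `pick()` rotates over the remaining two. Once `retry_failed` is called after the interval has elapsed and the connection attempt succeeds, the third backend rejoins the rotation without any other backend having to fail first.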

I'm creating this issue for tracking, as advised by @dgquintas. Hopefully this can get fixed for 1.5.
