orca: create ORCA producer for LB policies to use to receive OOB load reports#5669
orca: create ORCA producer for LB policies to use to receive OOB load reports#5669dfawley merged 9 commits intogrpc:masterfrom
Conversation
1b369f2 to
1cf3c60
Compare
| // avoid polluting the godoc of the top-level orca package. | ||
| package internal | ||
|
|
||
| import ibackoff "google.golang.org/grpc/internal/backoff" |
There was a problem hiding this comment.
Why do we need this import to be renamed?
There was a problem hiding this comment.
We don't; I just like to keep things consistent, and we have a backoff package and an internal/backoff package, and it's confusing if we reference both using the same name in different files.
There was a problem hiding this comment.
Package names should ideally be self contained and self explanatory. If we are always going to rename this import, we should ideally rename the package itself.
| // A Producer is a type shared among potentially many consumers. It is | ||
| // associated with a SubConn, and an implementation will typically contain | ||
| // other methods to provide additional functionality, e.g. configuration or | ||
| // subscription registration. |
There was a problem hiding this comment.
This description feels very vague to me, in the sense that I can't get a good idea of when to use a Producer, how to use one etc.
There was a problem hiding this comment.
Could you suggest something better after our discussion today?
There was a problem hiding this comment.
Ack. The first sentence is fine, but I don't get much technical understanding from: "and an implementation will typically contain other methods to provide additional functionality, e.g. configuration or subscription registration." Additional functionality with respect to what? There's no functionality defined before this. Do we expect this to ever be implemented by another concrete type? This comment is very specific to type producer in orca/producer. If so, why have this interface?
There was a problem hiding this comment.
Additional functionality with respect to what? There's no functionality defined before this.
In addition to the builder that created it, I suppose, and the close function the builder returns.
Do we expect this to ever be implemented by another concrete type?
Yes, definitely. This is intended to be generic and able to be implemented by anything. The other example we already have is the health checks, which are implemented in the other languages as a generic wrapper, but hard-coded into gRPC in Go. I'd like to move to a similar model there by using this mechanism.
This comment is very specific to type producer in orca/producer.
Not at all. If it was, it would talk about ORCA configuration. ORCA should not be part of grpc-go directly, but instead should only be a plugin which uses this.
easwars
left a comment
There was a problem hiding this comment.
I think you might have missed pushing a commit. I have more comments which I will add soon.
zasweq
left a comment
There was a problem hiding this comment.
Cool PR bossman. Some comments.
| // A Producer is a type shared among potentially many consumers. It is | ||
| // associated with a SubConn, and an implementation will typically contain | ||
| // other methods to provide additional functionality, e.g. configuration or | ||
| // subscription registration. |
There was a problem hiding this comment.
Ack. The first sentence is fine, but I don't get much technical understanding from: "and an implementation will typically contain other methods to provide additional functionality, e.g. configuration or subscription registration." Additional functionality with respect to what? There's no functionality defined before this. Do we expect this to ever be implemented by another concrete type? This comment is very specific to type producer in orca/producer. If so, why have this interface?
1cf3c60 to
4971070
Compare
dfawley
left a comment
There was a problem hiding this comment.
Updated; sorry for the force push; I messed up the synchronization between my laptop & workstation.
| // A Producer is a type shared among potentially many consumers. It is | ||
| // associated with a SubConn, and an implementation will typically contain | ||
| // other methods to provide additional functionality, e.g. configuration or | ||
| // subscription registration. |
There was a problem hiding this comment.
Additional functionality with respect to what? There's no functionality defined before this.
In addition to the builder that created it, I suppose, and the close function the builder returns.
Do we expect this to ever be implemented by another concrete type?
Yes, definitely. This is intended to be generic and able to be implemented by anything. The other example we already have is the health checks, which are implemented in the other languages as a generic wrapper, but hard-coded into gRPC in Go. I'd like to move to a similar model there by using this mechanism.
This comment is very specific to type producer in orca/producer.
Not at all. If it was, it would talk about ORCA configuration. ORCA should not be part of grpc-go directly, but instead should only be a plugin which uses this.
| }) | ||
| if err != nil { | ||
| return resetBackoff, err | ||
| return false, err |
There was a problem hiding this comment.
You liked this rather than using default value of bools more?
There was a problem hiding this comment.
This was changed due to an earlier comment from @easwars. I said in my reply there I didn't feel strongly either way. I still don't.
| // to avoid overloading a server experiencing problems. The attempt count | ||
| // is incremented when stream errors occur and is reset when the stream | ||
| // reports a result. |
There was a problem hiding this comment.
I don't think the last part of this comment is correct/like this comment. First of all, reset has two semantical meanings for this field, one for the timer, one for the attempt count, which gets plumbed into Exponential Backoff to determine what the backoff timer gets reset to I'm assuming your reset refers to attempt count. Also, there is no linkage or even local variable named request count. I see backoffAttempt, which isn't a field here, but the int in this function variable. The attempt count is not reset when the stream reports a result. Right now, the attempt count is reset on either a. the minInterval has changed after processing the receive from the stream compared to what was the stream was configured with. b. If the second+ stream.Recv() returns an error, this also feels wrong, I will leave a comment there.
| return resetBackoff, err | ||
| } | ||
| resetBackoff = true |
There was a problem hiding this comment.
Is this correct? I don't get why it matters whether the error comes from the second+ stream.Recv() or the first one with respect to resetting backoff. Shouldn't it be io.EOF the conditional for a successful stream where we can infer the server is working properly?
easwars
left a comment
There was a problem hiding this comment.
LGTM.
Deep down though, I would have still preferred if we could have added the methods required to make RPCs on the SubChannel interface. I feel that would have been a more intuitive change.
Like, currently, the RegisterOOBListener function in the orca package calls GetOrBuildProducer on the SubConn and passes a builder defined locally. This seems quite roundabout. But, if we want the orca package to handle everything locally, then we would need the functionality to make RPCs on the subConns, and also the functionality to uniquely identify subConns based on their channelz ID. If we don't think adding this functionality to the subConn interface is a good long term option, I agree with the approach taken in this PR.
| // Stop listener 3. This makes the interval longer, with stream recreation | ||
| // delayed until the next report is received. Reports only go to listener | ||
| // 1 now. |
There was a problem hiding this comment.
Reports should only go to listener 1 now. Also explain why the interval is now longer, and perhaps tie that idea of a longer interval to stream recreation.
There was a problem hiding this comment.
ping about stream recreation.
There was a problem hiding this comment.
I think a very high level understanding of the feature (ORCA OOB reporting) plus this comment as-is is sufficient to understand what's happening here. I could make a longer comment to make things painfully obvious, but sometimes less is more.
|
|
||
| lisOpts := orca.OOBListenerOptions{ReportInterval: 50 * time.Millisecond} | ||
| lao := &listenerAndOptions{listener: oobLis, opts: lisOpts} | ||
| r.InitialState(resolver.State{Addresses: []resolver.Address{setListenerAndOptions(resolver.Address{Addr: lis.Addr().String()}, lao)}}) |
There was a problem hiding this comment.
I don't see this split up still.
| // Provide a convenient way to expect backoff calls and return a minimal | ||
| // value. | ||
| expectedBackoff := -1 // -1 indicates any value is allowed. | ||
| const backoffShouldNotBeCalled = 9999 // Use to assert backoff function is not called. |
There was a problem hiding this comment.
There was a problem hiding this comment.
I don't get that explanation, but it's probably just me.
There was a problem hiding this comment.
I explain in comments why we don't want the backoff function to be called when backoffShouldNotBeCalled is assigned to expectedBackoff below. (It's because the last stream was successful.)
|
|
||
| // Register our fake ORCA service | ||
| s := grpc.NewServer() | ||
| fake := newFakeORCAService() |
| // Stop listener 2. This does not affect the interval. The next update | ||
| // goes to listeners 1 and 3. | ||
| // Stop listener 2. This does not affect the interval as listener 3 is | ||
| // still the shortest. The next update goes to listeners 1 and 3. |
There was a problem hiding this comment.
and still has the shortest interval
There was a problem hiding this comment.
This level of review isn't important. The meaning of this comment is very clear, and it's an internal comment in a test, not in external documentation.
There was a problem hiding this comment.
Wouldn't hurt though. But I see what you're saying
zasweq
left a comment
There was a problem hiding this comment.
LGTM outside my final responses, but again don't feel strongly about any of my nits.
|
You can go ahead and merge btw. |
RELEASE NOTES: