Update gRPC measures and views from spec#689
Update gRPC measures and views from spec#689semistrict merged 5 commits intocensus-instrumentation:masterfrom
Conversation
c531a42 to
d07bc02
Compare
| ClientRequestCount = stats.Int64("grpc.io/client/request_count", "Number of client RPC request messages", stats.UnitNone) | ||
| ClientResponseCount = stats.Int64("grpc.io/client/response_count", "Number of client RPC response messages", stats.UnitNone) | ||
| ClientRoundTripLatency = stats.Float64("grpc.io/client/roundtrip_latency", "RPC roundtrip latency in msecs", stats.UnitMilliseconds) | ||
| ClientSentMessagesPerRPC = stats.Int64("grpc.io/client/sent_messages_per_rpc", "Number of messages sent in the RPC (always 1 for non-streaming RPCs).", stats.UnitNone) |
There was a problem hiding this comment.
This unit is wrong, please update it to "1" as the I believe the spec says in https://github.com/census-instrumentation/opencensus-specs/blob/master/stats/gRPC.md#units

There was a problem hiding this comment.
Oh, I guess I hadn't seen the definition of "stats.UnitNone" which is a dimensionless unit, but it was confusing initially, "UnitNone" to me would be "" instead of "1" whereas perhaps we could use "stats.UnitDimensionless"
| ClientRoundTripLatency = stats.Float64("grpc.io/client/roundtrip_latency", "RPC roundtrip latency in msecs", stats.UnitMilliseconds) | ||
| ClientSentMessagesPerRPC = stats.Int64("grpc.io/client/sent_messages_per_rpc", "Number of messages sent in the RPC (always 1 for non-streaming RPCs).", stats.UnitNone) | ||
| ClientSentBytesPerRPC = stats.Int64("grpc.io/client/sent_bytes_per_rpc", "Total bytes sent across all request messages per RPC.", stats.UnitBytes) | ||
| ClientReceivedMessagesPerRPC = stats.Int64("grpc.io/client/received_messages_per_rpc", "Number of response messages received per RPC (always 1 for non-streaming RPCs).", stats.UnitNone) |
There was a problem hiding this comment.
Ditto about using "1" as the spec requires.
| if got, want := m.Description(), defn.desc; got != want { | ||
| t.Errorf("%q: Description = %q; want %q", defn.name, got, want) | ||
| } | ||
| } |
| t.Fatalf("len(gotMeasures) = %d; want %d", got, want) | ||
| } | ||
|
|
||
| for i, v := range views { |
| case codes.OK: | ||
| return "OK" | ||
| case codes.Canceled: | ||
| return "CANCELLED" |
There was a problem hiding this comment.
Aha, it's funny we have "Canceled" as the enumeration but "CANCELLED" as the string as defined in https://godoc.org/google.golang.org/grpc/codes vs https://github.com/grpc/grpc/blob/master/doc/statuscodes.md
There was a problem hiding this comment.
The enum type stringifies them differently btw in gRPC Go: https://github.com/grpc/grpc-go/blob/master/codes/code_string.go#L23.
There was a problem hiding this comment.
Baby Jesus, perhaps let's follow what grpc-go is doing? Thanks for posting that @rakyll.
There was a problem hiding this comment.
Go codes look like just the names of the vars, highly unlikely to be consistent across languages. The examples in the OpenCensus spec are consistent with the gRPC doc (all upper case, CANCELLED with two "l", UPPER_SNAKE_CASE) which is the most official source I could find. Also explicitly added to the spec to avoid future confusion: census-instrumentation/opencensus-specs#97
| var debug bool | ||
|
|
||
| func init() { | ||
| debug = os.Getenv("OC_STATS_DEBUG") != "" |
There was a problem hiding this comment.
Don't think it should be a documented feature of the library so I removed it instead.
| } | ||
| for _, tag := range row.Tags { | ||
| if tag.Key == ocgrpc.KeyClientStatus && tag.Value != "OK" { | ||
| s.ErrorsTotal += int(count) |
There was a problem hiding this comment.
Although we removed the measure and view for viewing the total number of gRPC errors.
There was a problem hiding this comment.
Yes, so this is why I had to compute it from the CompletedRPCs measures
There was a problem hiding this comment.
I have a naive question: how are we then going to view and track the number of errors?
|
Thank you for this PR @Ramonza, it's great work, I like the rigor with the spec adherence. It is quite a large PR though, so I'll need to do a couple of takes on reviewing it. |
6a0e34b to
50c3735
Compare
| view.RegisterExporter(&exporter.PrintExporter{}) | ||
|
|
||
| // Register the view to collect client request count. | ||
| if err := view.Register(ocgrpc.ClientErrorCountView); err != nil { |
There was a problem hiding this comment.
I have a naive question, how are now going to be able to track the number of errors in either the client or server?
There was a problem hiding this comment.
We now break down the requests by status, so you can get the total errors by adding up all the non-success statuses like we do in rpcz.
TestCumulativenessFromHistograms failed on AppVeyor. Looks likely to be timing related. Without rewriting the test to avoid timing issues, I am just increasing the wait times in the hopes that this will resolve the issue.
50c3735 to
b66f142
Compare
|
|
||
| func statusCodeToString(s *status.Status) string { | ||
| // see https://github.com/grpc/grpc/blob/master/doc/statuscodes.md | ||
| switch c := s.Code(); c { |
There was a problem hiding this comment.
Usually switch statements are very slow. Probably better to define a static map<Code, String>
There was a problem hiding this comment.
Curious what the intuition behind that is? Did a benchmark and turns out that the switch is actually significantly faster, at least on my laptop:
name time/op
StatusCodeToString_OK-8 2.66ns ± 0%
StatusCodeToString_Unauthenticated-8 2.88ns ± 0%
MapAlternativeImpl_OK-8 7.56ns ± 1%
There was a problem hiding this comment.
Then keep the switch statement. the map search seems to me very slow, maybe is a Go thing with maps. The way how the switch statements are now implemented is using a 2-level jump table (https://www.codeproject.com/Articles/100473/Something-You-May-Not-Know-About-the-Switch-Statem) so it should be one extra jump, but probably the way how the map is implemented in Go is also 2-level jump (does not store the values directly in the map, instead it uses pointers).
93f04e2 to
a0fc081
Compare
| @@ -0,0 +1,52 @@ | |||
| package ocgrpc | |||
| func BenchmarkMapAlternativeImpl_OK(b *testing.B) { | ||
| st := status.New(codes.OK, "OK") | ||
| for i := 0; i < b.N; i++ { | ||
| s := codeToStringMap[st.Code()] |
f0341f8 to
b90813a
Compare
OpenCensus gRPC views are being updated to reflect the latest spec. Remove a reference to an old view that will not longer exist after the update, and instead subscribe to the default set of client-side views. OpenCensus PR: census-instrumentation/opencensus-go#689 Change-Id: I3308ec22b71c8548e3c7fd39cfad597cab65d241 Reviewed-on: https://code-review.googlesource.com/26370 Reviewed-by: Jonathan Amsterdam <[email protected]>
Fixes: #687