This repository was archived by the owner on Jul 31, 2023. It is now read-only.

Improve performance of stats recording #1265

@howardjohn

Description


Is your feature request related to a problem? Please describe.
Yes. Using opencensus-go to record metrics carries substantial overhead. In a real-world application, we have seen OC account for well over 10% of our memory allocations, even though we generate GBs of protobufs per minute, so metrics recording should be negligible by comparison. This has led us to do things we really shouldn't have to think about, like adding a caching layer on top of the library.

Describe the solution you'd like
Improve the performance of the library, in particular its memory allocations.

Describe alternatives you've considered
Adding a caching layer above the library; using a different library.

Additional context

I wrote some benchmarks comparing against the Prometheus client. There are two variants of each: one with the label/tag precomputed outside the loop, and one with it computed inside the loop:

package stats_test

import (
	"context"
	"testing"

	"github.com/prometheus/client_golang/prometheus"
	"go.opencensus.io/stats"
	"go.opencensus.io/stats/view"
	"go.opencensus.io/tag"
)

func BenchmarkMetrics(b *testing.B) {
	b.Run("oc", func(b *testing.B) {
		mLineLengths := stats.Float64("test", "my-benchmark", stats.UnitDimensionless)
		key := tag.MustNewKey("key")
		v := &view.View{
			Measure:     mLineLengths,
			TagKeys:     []tag.Key{key},
			Aggregation: view.Sum(),
		}
		if err := view.Register(v); err != nil {
			b.Fatal(err)
		}

		for n := 0; n < b.N; n++ {
			allTags := []tag.Mutator{tag.Upsert(key, "val")}
			if err := stats.RecordWithTags(context.Background(), allTags, mLineLengths.M(1)); err != nil {
				b.Fatal(err)
			}
		}
	})
	b.Run("oc-fixed", func(b *testing.B) {
		mLineLengths := stats.Float64("test", "my-benchmark", stats.UnitDimensionless)
		key := tag.MustNewKey("key")
		v := &view.View{
			Measure:     mLineLengths,
			TagKeys:     []tag.Key{key},
			Aggregation: view.Sum(),
		}
		if err := view.Register(v); err != nil {
			b.Fatal(err)
		}

		allTags := []tag.Mutator{tag.Upsert(key, "val")}
		for n := 0; n < b.N; n++ {
			if err := stats.RecordWithTags(context.Background(), allTags, mLineLengths.M(1)); err != nil {
				b.Fatal(err)
			}
		}
	})
	b.Run("prom", func(b *testing.B) {
		g := prometheus.NewGaugeVec(prometheus.GaugeOpts{
			Namespace: "tests",
			Name:      "test",
		}, []string{"key"})
		prometheus.Register(g)
		for n := 0; n < b.N; n++ {
			g.With(prometheus.Labels{"key": "value"}).Add(1)
		}
	})
	b.Run("prom-fixed", func(b *testing.B) {
		g := prometheus.NewGaugeVec(prometheus.GaugeOpts{
			Namespace: "tests",
			Name:      "test",
		}, []string{"key"})
		prometheus.Register(g)
		l := prometheus.Labels{"key": "value"}
		for n := 0; n < b.N; n++ {
			g.With(l).Add(1)
		}
	})
}

Results:

BenchmarkMetrics
BenchmarkMetrics/oc
BenchmarkMetrics/oc-6             864436              1234 ns/op             768 B/op         14 allocs/op
BenchmarkMetrics/oc-6            1208937              1011 ns/op             768 B/op         14 allocs/op
BenchmarkMetrics/oc-6            1000000              1016 ns/op             768 B/op         14 allocs/op
BenchmarkMetrics/oc-fixed
BenchmarkMetrics/oc-fixed-6      1264486               919.2 ns/op           680 B/op         11 allocs/op
BenchmarkMetrics/oc-fixed-6      1284253               956.4 ns/op           680 B/op         11 allocs/op
BenchmarkMetrics/oc-fixed-6      1279734               961.2 ns/op           680 B/op         11 allocs/op
BenchmarkMetrics/prom
BenchmarkMetrics/prom-6          3083409               371.8 ns/op           336 B/op          2 allocs/op
BenchmarkMetrics/prom-6          3202328               385.6 ns/op           336 B/op          2 allocs/op
BenchmarkMetrics/prom-6          3208323               388.9 ns/op           336 B/op          2 allocs/op
BenchmarkMetrics/prom-fixed
BenchmarkMetrics/prom-fixed-6   12074671                95.92 ns/op            0 B/op          0 allocs/op
BenchmarkMetrics/prom-fixed-6   12057554                89.15 ns/op            0 B/op          0 allocs/op
BenchmarkMetrics/prom-fixed-6   13738635                88.36 ns/op            0 B/op          0 allocs/op

So the Prometheus counterpart actually performs zero allocations once the label is created, and it is roughly 10x faster. Even ignoring GC overhead, which is substantial, that means that on the machine above I can record about 1M metrics/s with OC versus about 10M with Prometheus. Of course, in the real world, metrics recording should be a tiny portion of the CPU used by the process.
