Use LazySeriesIterator with fuzzy metric name queries by aaron7 · Pull Request #442 · cortexproject/cortex

aaron7 · 2017-05-25T07:00:30Z

Step towards fixing #416

Create v8Schema with series index (#430)

This index will be used to fetch series from the index. It will replace the metric name index and we will make the switch after the next steps are complete.

Decided to encode the model.Metric as JSON because it already implements the json.Unmarshaler interface.

Used sha256 to identify the series because the variant of fnv64a used for fingerprints is not unique enough for indexing. Using sha256 also keeps it simple (we don't have to deal with collision logic).

Move iterators inside chunk store using MergeSeriesIterator (#438)

Move iterators inside the chunk store
Create MergeSeriesIterator which is created from multiple iterators and implements SeriesIterator.
Move iterator code to util as it is used across packages and cyclic imports would exist in some cases if it was kept in querier.

Use LazySeriesIterator with fuzzy metric name queries

There is no opportunity for our iterator to return an error if values are requested, so we are just returning no values for now.

Test plan:

count({__name__=~".+"}) by (__name__) ran locally

tomwilkie · 2017-06-05T11:59:12Z

pkg/chunk/chunk_store.go

 		return nil, fmt.Errorf("invalid query, through < from (%d < %d)", through, from)
 	}

 	filters, matchers := util.SplitFiltersAndMatchers(allMatchers)


I wonder if this should be pushed into getMetricNameIterators and getFuzzyMetricLazySeriesIterators.

tomwilkie · 2017-06-05T12:00:30Z

pkg/chunk/chunk_store.go

-		return c.lookupChunksByMetricName(ctx, from, through, matchers, metricNameMatcher.Value)
-	}
-
+func (c *Store) getFuzzyMetricLazySeriesIterators(ctx context.Context, from, through model.Time, filters []*metric.LabelMatcher, matchers []*metric.LabelMatcher, metricNameMatcher *metric.LabelMatcher) ([]local.SeriesIterator, error) {


Probably better named getSeriesIterators?

tomwilkie · 2017-06-05T12:03:11Z

pkg/chunk/chunk_store.go

-		if ok && !metricNameMatcher.Match(metricName) {
-			skippedMetricNames++
+		// Apply metricNameMatcher filter
+		if metricNameMatcher != nil && !metricNameMatcher.Match(metric[metricNameMatcher.Name]) {


Is this not redundant with the checks below?

Also, if you push the util.SplitFiltersAndMatchers into getMetricNameIterators, this section will just become:

for _, matcher := range matchers { if !... {...} }

And you won't need all three checks.
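
A minimal sketch of the single-loop filter suggested above. The matcher type here is a hypothetical, simplified stand-in for metric.LabelMatcher, and matchesAll is an illustrative name, not a function from the PR. It assumes the metric name matcher is treated as just another entry in the slice.

```go
package main

import "fmt"

// matcher is a hypothetical stand-in for a label matcher: it names a
// label and decides whether that label's value is acceptable.
type matcher struct {
	Name  string
	Match func(value string) bool
}

// matchesAll reports whether the metric satisfies every matcher.
// A missing label is passed to the matcher as the empty string.
func matchesAll(metric map[string]string, matchers []matcher) bool {
	for _, m := range matchers {
		if !m.Match(metric[m.Name]) {
			return false
		}
	}
	return true
}

func main() {
	metric := map[string]string{"__name__": "up", "job": "cortex"}
	matchers := []matcher{
		// Behaves like the regex matcher =~".+": any non-empty name.
		{Name: "__name__", Match: func(v string) bool { return v != "" }},
		{Name: "job", Match: func(v string) bool { return v == "cortex" }},
	}
	fmt.Println(matchesAll(metric, matchers)) // prints "true"
}
```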

tomwilkie · 2017-06-05T12:21:48Z

pkg/chunk/schema_test.go

@@ -236,6 +237,7 @@ func TestSchemaHashKeys(t *testing.T) {
 const (
 	MetricNameRangeValue = iota + 1


typically you do

const ( _ = iota; foo; bar )

If you want to start at 1.
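
A small sketch of that idiom, with hypothetical constant names rather than the ones in schema_test.go: assigning iota to the blank identifier discards the zero value, so the first named constant is 1.

```go
package main

import "fmt"

// Hypothetical constants; the names are illustrative only.
const (
	_ = iota // discard 0 so the first named constant is 1
	kindMetricName
	kindSeries
)

func main() {
	fmt.Println(kindMetricName, kindSeries) // prints "1 2"
}
```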

tomwilkie · 2017-06-05T12:32:56Z

pkg/chunk/schema_util_test.go

+				"flip": "flop",
+			},
+			"KrbXMezYneba+o7wfEdtzOdAWhbfWcDrlVfs1uOCX3M",
+		},


Can you (for my sanity) add a test case for the same metric, with the pairs in a different order?
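
To illustrate what such a test guards against, here is a self-contained sketch in which the ID is derived from a canonical (sorted) rendering of the label pairs. The labelPair type and seriesID function are hypothetical and do not reproduce the PR's metricSeriesID, which relies on model.Metric's own string form. The sketch assumes label names are unique.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"fmt"
	"sort"
	"strings"
)

// labelPair is a hypothetical stand-in for one name/value pair of a metric.
type labelPair struct{ Name, Value string }

// seriesID sorts the pairs by name before hashing, so the order in
// which the caller supplies them cannot influence the result.
func seriesID(pairs []labelPair) string {
	sorted := append([]labelPair(nil), pairs...) // copy; leave the input untouched
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Name < sorted[j].Name })

	var b strings.Builder
	for _, p := range sorted {
		fmt.Fprintf(&b, "%s=%q,", p.Name, p.Value)
	}
	h := sha256.Sum256([]byte(b.String()))
	return base64.RawStdEncoding.EncodeToString(h[:])
}

func main() {
	a := seriesID([]labelPair{{"__name__", "foo"}, {"bar", "baz"}, {"flip", "flop"}})
	b := seriesID([]labelPair{{"flip", "flop"}, {"__name__", "foo"}, {"bar", "baz"}})
	fmt.Println(a == b) // prints "true"
}
```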

tomwilkie · 2017-06-05T12:33:57Z

pkg/distributor/distributor.go

 					fpToSampleStream[fp] = mss
 				}
-				mss.Values = util.MergeSamples(mss.Values, ss.Values)
+				mss.Values = util.MergeSampleSets(mss.Values, ss.Values)


In such large PRs I generally try and avoid renamings.

We changed it to MergeSampleSets to keep it consistent with MergeNSampleSets and it's only used in 2 places. I believe this was in a previous PR but we have now agreed to merge all PRs together.

tomwilkie · 2017-06-05T12:38:22Z

pkg/util/iterator.go

@@ -0,0 +1,142 @@
+package util


Add a comment that this is copied from the prometheus code, and the PR to make it public?

tomwilkie · 2017-06-05T12:40:22Z

Let's close #430 and #438 and put all the changes in one PR (this one). That way we won't miss any comments.

Done the first pass - looks good! No obvious errors, just a bunch of nits.

Have you tested locally? I didn't see any changes to the checked-in k8s config to enable this.

tomwilkie · 2017-06-09T14:13:09Z

Is this gtg now? I'll review this weekend if so.


marccarre · 2017-06-09T16:06:31Z

pkg/chunk/schema.go

+	return nil, ErrNoMetricNameNotSupported
+}
+
+// v8Entries is the same as v7Entries however with a series index instead of a metric name index


The comment "v8Entries is the same as v7Entries" does not seem consistent with:
type v8Entries struct { v6Entries }
Which one is correct? Is it just a typo in the comment?

v8Entries embedding v6Entries is correct. The comment says it's the same as v7 because we are replacing the index v7 adds. However this comment is confusing because v8Entries GetWriteEntries is completely different.

I will update the comment - thanks

marccarre · 2017-06-09T16:49:15Z

pkg/chunk/schema_util.go


+func metricSeriesID(m model.Metric) string {
+	h := sha256.Sum256([]byte(m.String()))
+	return string(encodeBase64Bytes(h[:]))


Out of curiosity, is there a special reason we use base64 here vs., for example, hex.EncodeToString which:

is more natural for such hash functions, and

could be more convenient to use when debugging? (or e.g. printing https://play.golang.org/p/IeHqPaaNk4)

No special reason. It's slightly more efficient to store (4 chars per 3 bytes instead of 2 chars per byte) and is seen more as a UID than a hash. Base64 is human readable and we also encode to base64 for the dynamo api (http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Programming.LowLevelAPI.html).
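
For a concrete sense of the size difference, the sketch below encodes one SHA-256 digest both ways. The encodedLengths helper is hypothetical, and the use of unpadded standard base64 is an assumption, chosen because it yields 43 characters, the same length as the ID in the schema_util_test.go fixture.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"encoding/hex"
	"fmt"
)

// encodedLengths hashes data with SHA-256 and returns the length of the
// digest in unpadded base64 and in hex.
func encodedLengths(data []byte) (b64Len, hexLen int) {
	h := sha256.Sum256(data)
	b64 := base64.RawStdEncoding.EncodeToString(h[:]) // unpadded: an assumption
	hx := hex.EncodeToString(h[:])
	return len(b64), len(hx)
}

func main() {
	// Arbitrary input; every 32-byte digest gives the same lengths.
	b64Len, hexLen := encodedLengths([]byte(`foo{bar="baz"}`))
	fmt.Println(b64Len, hexLen) // prints "43 64"
}
```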

marccarre · 2017-06-09T17:26:35Z

pkg/util/merger.go

+func MergeNSamples(sampleSets ...[]model.SamplePair) []model.SamplePair {
+	l := len(sampleSets)
+	switch l {
+	case 0:


Unless I've missed it, and even though that's trivial, I don't think there is a unit test covering this case statement.
If I understood well, we'd need this additional test case:

{ sampleSets: [][]model.SamplePair{}, expected: []model.SamplePair{}, },

marccarre · 2017-06-09T17:44:52Z

pkg/util/merger.go

+}
+
+// MergeNSampleSets merges and dedupes n sets of already sorted sample pairs.
+func MergeNSampleSets(sampleSets ...[]model.SamplePair) []model.SamplePair {


Are values within a []model.SamplePair guaranteed to be sorted? (the comment above seems to indicate so)

If so, would it be worth implementing:

an n-way merge, instead of

this 2-way merge for which we then enqueue results (and repeat)?

It feels like we would re-allocate quite a few arrays and process model.SamplePairs more than once for a large number of sampleSets.

In case this could be useful, here is an example of doing the n-way merge using:

a "dedup iterator" on

a "merging iterator" on

a collection of "iterators".

Lack of generics makes this pretty ugly in Go (feedback welcome as I'm a beginner at Go), however:

a similar approach could be done with pure arrays/slices;

the "dedup" logic and "merging" logic could be merged for better performance.

The thing about it is that it is nicely composable and the overall complexity is O(n.log(k)),

k being the number of iterators merged, and

n the total number of items.

As discussed offline, we are going to keep with the merge approach for now in this PR because the most common case is only a few sample sets and the code is simpler.
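
For reference, the n-way alternative discussed in this thread could look roughly like the sketch below. It is not the code from the PR or from the linked example: sample is a simplified stand-in for model.SamplePair, mergeN and the cursor types are hypothetical names, and the dedupe rule (keep the first sample seen for a timestamp) is an assumption. It uses container/heap so each output sample costs O(log k) for k inputs.

```go
package main

import (
	"container/heap"
	"fmt"
)

// sample is a simplified stand-in for model.SamplePair.
type sample struct {
	Timestamp int64
	Value     float64
}

// cursor tracks the read position within one already-sorted input set.
type cursor struct {
	set []sample
	pos int
}

// cursorHeap orders cursors by the timestamp of their current sample.
type cursorHeap []*cursor

func (h cursorHeap) Len() int { return len(h) }
func (h cursorHeap) Less(i, j int) bool {
	return h[i].set[h[i].pos].Timestamp < h[j].set[h[j].pos].Timestamp
}
func (h cursorHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *cursorHeap) Push(x interface{}) { *h = append(*h, x.(*cursor)) }
func (h *cursorHeap) Pop() interface{} {
	old := *h
	n := len(old)
	c := old[n-1]
	*h = old[:n-1]
	return c
}

// mergeN merges already-sorted sets in one pass and drops any sample
// whose timestamp repeats the previously emitted one.
func mergeN(sets ...[]sample) []sample {
	h := &cursorHeap{}
	total := 0
	for _, s := range sets {
		if len(s) > 0 { // skip empty inputs so Less never indexes past the end
			*h = append(*h, &cursor{set: s})
			total += len(s)
		}
	}
	heap.Init(h)

	out := make([]sample, 0, total)
	for h.Len() > 0 {
		c := (*h)[0] // cursor holding the smallest current timestamp
		s := c.set[c.pos]
		if len(out) == 0 || out[len(out)-1].Timestamp != s.Timestamp {
			out = append(out, s)
		}
		c.pos++
		if c.pos < len(c.set) {
			heap.Fix(h, 0) // current sample changed; restore heap order
		} else {
			heap.Pop(h) // this input is exhausted
		}
	}
	return out
}

func main() {
	merged := mergeN(
		[]sample{{1, 1}, {3, 3}, {5, 5}},
		[]sample{{2, 2}, {3, 3}, {6, 6}},
		nil,
	)
	for _, s := range merged {
		fmt.Print(s.Timestamp, " ") // prints "1 2 3 5 6 "
	}
	fmt.Println()
}
```

With zero inputs this returns an empty, non-nil slice, which matches the empty-input test case suggested earlier in the review.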

marccarre · 2017-06-13T14:32:36Z

pkg/chunk/iterator.go

+		err = it.createSampleSeriesIterator()
+	})
+	if err != nil {
+		// TODO: Handle error.


Is there a plan to handle this error before this PR gets merged? (I am just asking given the TODO)

Not in this PR. I would like to discuss this - I'm unsure how to approach this.

Unsure as well right now... and it looks like it could fail in many ways 🙁
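
One way to picture the constraint: the iterator interface has no error return, so a lazy load that fails can only yield no values. The sketch below is not the PR's LazySeriesIterator. The lazyValues type, its fields, and the Err method are hypothetical, and keeping the error around for later inspection is an assumption about one possible follow-up.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errLoad is a placeholder error used only by this example.
var errLoad = errors.New("load failed")

// lazyValues defers an expensive load until values are first requested.
type lazyValues struct {
	load   func() ([]float64, error) // the deferred fetch
	once   sync.Once
	values []float64
	err    error
}

// Values triggers the load at most once. Its signature has no error
// return, so a failed load yields no values; the error is kept for Err.
func (l *lazyValues) Values() []float64 {
	l.once.Do(func() {
		l.values, l.err = l.load()
	})
	if l.err != nil {
		return nil
	}
	return l.values
}

// Err exposes the load error, if any, once Values has been called.
func (l *lazyValues) Err() error { return l.err }

func main() {
	calls := 0
	it := &lazyValues{load: func() ([]float64, error) {
		calls++
		return []float64{1, 2, 3}, nil
	}}
	fmt.Println(calls)            // nothing loaded yet
	fmt.Println(len(it.Values())) // first access triggers the load
	it.Values()
	fmt.Println(calls) // still a single load

	failing := &lazyValues{load: func() ([]float64, error) { return nil, errLoad }}
	fmt.Println(failing.Values() == nil, failing.Err())
}
```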

marccarre · 2017-06-13T14:37:22Z

pkg/querier/querier.go

+			for _, it := range iterators {
+				ss := &model.SampleStream{
+					Metric: it.Metric().Metric,
+					Values: it.RangeValues(in),


Just to point out: RangeValues could return nil. Does it matter? What do we do with these SampleStreams afterwards?

It looks like we only pass these to ToQueryResponse but we don't check for nil there, so we could potentially "panic".

I've had a look into this and it doesn't look like it will panic because we are ranging over a nil slice which does 0 iterations. https://github.com/golang/go/blob/master/doc/go_spec.html#L5011
https://play.golang.org/p/4vvbbHMKUo
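
The behaviour relied on here can be checked with a few lines. countValues is a hypothetical helper written only for this illustration.

```go
package main

import "fmt"

// countValues ranges over a slice that may be nil and counts iterations.
func countValues(values []float64) int {
	n := 0
	for range values {
		n++
	}
	return n
}

func main() {
	var values []float64 // nil slice, as a nil RangeValues result would be
	fmt.Println(values == nil, len(values), countValues(values)) // prints "true 0 0"
}
```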

marccarre · 2017-06-13T15:07:03Z

pkg/util/iterator.go

+
+// ValueAtOrBeforeTime implements the SeriesIterator interface.
+func (it SampleStreamIterator) ValueAtOrBeforeTime(ts model.Time) model.SamplePair {
+	// TODO: This is a naive inefficient approach - in reality, queries go mostly


Is this something to resolve as part of this PR?

This was moved from pkg/querier/iterator.go. We still use SampleStreamIterator in some cases and the comment is still relevant. We still need to improve the iterators to iterate through chunks - this PR focuses on allowing queries which do not fetch samples to execute efficiently.
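
As a point of comparison only, and separate from the chunk-level iteration mentioned above, a lookup over an in-memory, time-sorted slice can use a binary search instead of a linear scan. The sample type and valueAtOrBefore function are hypothetical and are not the SampleStreamIterator code.

```go
package main

import (
	"fmt"
	"sort"
)

// sample is a simplified stand-in for model.SamplePair.
type sample struct {
	Timestamp int64
	Value     float64
}

// valueAtOrBefore returns the latest sample with Timestamp <= ts.
// It assumes samples are sorted by Timestamp; ok is false if none qualifies.
func valueAtOrBefore(samples []sample, ts int64) (s sample, ok bool) {
	// Index of the first sample strictly after ts.
	i := sort.Search(len(samples), func(i int) bool {
		return samples[i].Timestamp > ts
	})
	if i == 0 {
		return sample{}, false
	}
	return samples[i-1], true
}

func main() {
	samples := []sample{{10, 1}, {20, 2}, {30, 3}}
	s, ok := valueAtOrBefore(samples, 25)
	fmt.Println(s.Timestamp, ok) // prints "20 true"
}
```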

aaron7 · 2017-06-13T16:42:30Z

@tomwilkie

Is this gtg now? I'll review this weekend if so.

Yes, it's ready for review if you get some time. Thanks :)

tomwilkie · 2017-06-13T16:43:14Z

Great! I'll look at it tomorrow afternoon.

tomwilkie · 2017-06-20T10:27:44Z

pkg/chunk/chunk.go

+}
+
 // ChunksToMatrix converts a slice of chunks into a model.Matrix.
 func ChunksToMatrix(chunks []Chunk) (model.Matrix, error) {


Only used in tests now, could be moved into them.

tomwilkie · 2017-06-20T10:29:49Z

pkg/chunk/chunk_store.go

-	// Fetch chunk descriptors (just ID really) from storage
-	chunks, err := c.lookupChunksByMatchers(ctx, from, through, matchers)
+func (c *Store) getMetricNameIterators(ctx context.Context, from, through model.Time, allMatchers []*metric.LabelMatcher, metricName model.LabelValue) ([]local.SeriesIterator, error) {
+	chunks, err := c.getMetricNameChunks(ctx, from, through, allMatchers, metricName)


My preference would be to move the contents of getMetricNameChunks into here, as this function seems overly short. I see that it is used from the tests, so you'd need to do some work there, in which case it might not be worth the effort.

Yep, once getMetricNameIterators converts the chunks from getMetricNameChunks into iterators, we cannot access the chunks for testing through the iterator interface. We would like access to these to test whether the right chunks are fetched or not.

We can't test at the interval level as we are concerned about which chunks are fetched to create the iterators. We could expose the chunks, but it's a prometheus interface and it doesn't seem like the right thing to do, exposing something that is not related. This currently seems like the best way to get around testing constraints.

tomwilkie · 2017-06-20T10:33:58Z

pkg/chunk/iterator.go

+func (it *LazySeriesIterator) createSampleSeriesIterator() error {
+	metricName, ok := it.metric[model.MetricNameLabel]
+	if !ok {
+		return fmt.Errorf("series does not have a metric name")


Can you push this error into the creation of the LazySeriesIterator?
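
A sketch of what moving the check into construction might look like. The lazyIterator type, newLazyIterator, and the plain map used for the metric are hypothetical simplifications, not the PR's types; only the error message is taken from the diff above.

```go
package main

import (
	"errors"
	"fmt"
)

// metricNameLabel mirrors the reserved Prometheus label for metric names.
const metricNameLabel = "__name__"

var errNoMetricName = errors.New("series does not have a metric name")

// lazyIterator is a hypothetical, stripped-down lazy iterator.
type lazyIterator struct {
	metric     map[string]string
	metricName string // validated and cached at construction
}

// newLazyIterator rejects metrics without a name up front, so the later
// lazy load never has to report this particular failure.
func newLazyIterator(metric map[string]string) (*lazyIterator, error) {
	name, ok := metric[metricNameLabel]
	if !ok {
		return nil, errNoMetricName
	}
	return &lazyIterator{metric: metric, metricName: name}, nil
}

func main() {
	it, err := newLazyIterator(map[string]string{"__name__": "up", "job": "cortex"})
	fmt.Println(it.metricName, err)

	_, err = newLazyIterator(map[string]string{"job": "cortex"})
	fmt.Println(err)
}
```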

tomwilkie · 2017-06-20T10:34:45Z

pkg/chunk/iterator.go

+		return fmt.Errorf("series does not have a metric name")
+	}
+
+	ctx := context.Background()


Can you open a ticket upstream to add contexts and error returns to iterators? It will probably have to wait until post v2 though.

aaron7 · 2017-07-11T13:19:10Z

Rebased - will test before merging. Thanks for the review @tomwilkie .

aaron7 · 2017-07-12T16:04:58Z

Tested locally with http_request_duration_microseconds, up and count({__name__=~".+"}) by (__name__).

We're seeing Error: QueryPages error: table=cortex, err=SerializationError: failed decoding JSON RPC response caused by: header line too long errors locally for the fuzzy queries when we have a few chunks in DynamoDB, however we believe this could be a problem with localstack. The agreed plan is to roll out to dev and test against DynamoDB and file an issue with localstack if it is inconsistent.

I will also be adding some more metrics to get information about how many pages we are seeing for each query.

bboreham · 2018-03-04T17:32:57Z

It will replace the metric name index

how?

EDIT: I think this refers to the v7 schema which had an index "Userid:day - hash(metric name) -> metric name". See #416 (comment)

aaron7 force-pushed the 416-lazy-series-when-no-metric-name branch from 7163bcc to abe573d Compare May 25, 2017 10:11

aaron7 changed the title ~~[WIP] Use LazySeriesIterator with fuzzy metric name queries~~ Use LazySeriesIterator with fuzzy metric name queries May 25, 2017

aaron7 requested a review from tomwilkie May 25, 2017 16:30

aaron7 force-pushed the 416-lazy-series-when-no-metric-name branch from 249b65e to affeca8 Compare May 30, 2017 13:30

tomwilkie reviewed Jun 5, 2017

aaron7 mentioned this pull request Jun 6, 2017

Lazily fetch series chunks #456

Closed

aaron7 force-pushed the 416-lazy-series-when-no-metric-name branch from 883c061 to 940291e Compare June 6, 2017 16:20

This was referenced Jun 9, 2017

Move iterators inside chunk store using MergeSeriesIterator #438

Closed

Create v8Schema with series index #430

Closed

marccarre reviewed Jun 9, 2017

marccarre reviewed Jun 13, 2017

tomwilkie reviewed Jun 20, 2017

aaron7 added 18 commits July 11, 2017 14:18

Add LazySeriesIterator tests

3c99bee

Move LazySeriesIterator to chunk package

8f6e773

Lookup chunks when accessing samples

ef08626

Create and use sampleSeriesIterator in lazySeriesIterator

e74b82b

Add dynamodb.v8-schema-from to local k8s

40cfbee

Move filter splitter inside get iterator methods

c12be5d

Rename getFuzzyMetricLazySeriesIterators to getSeriesIterators

22631d2

Remove unnecessary splitting of matchers

151742f

Use _ = iota to start at 1

58b5855

Update tests

a28ff45

Fix TestChunkStore_Get_lazy tests

dedfe6a

Add comment about mergeIterator from prometheus fanin

98e880f

Improve schema comment

74b944a

Move chunksToMatrix to ingester tests

59874c0

Move metric name check into constructor

fca2d55

Create matchers based on metric inside LazySeriesIterator

5da7521

Use recursive merger

c9cf222

Create context inside lazy iterator

d4adeeb

aaron7 force-pushed the 416-lazy-series-when-no-metric-name branch from 20f2e43 to 8b31b36 Compare July 11, 2017 13:18

aaron7 added 2 commits July 11, 2017 15:52

Cache metricName on iterator

008ab72

Fix ingester import

aa37aa9

aaron7 force-pushed the 416-lazy-series-when-no-metric-name branch from 8b31b36 to aa37aa9 Compare July 11, 2017 14:55

aaron7 merged commit aea9137 into master Jul 12, 2017

aaron7 deleted the 416-lazy-series-when-no-metric-name branch July 12, 2017 16:05

tomwilkie mentioned this pull request Nov 21, 2017

Reduce duplication when writing #607

Closed

bboreham mentioned this pull request Dec 26, 2017

Optimize merging of many small lists of samples #81

Closed

bboreham mentioned this pull request Mar 4, 2018

Hotspotting on keys #733

Closed
