Skip to content

Conversation

@colega
Copy link
Contributor

@colega colega commented Oct 10, 2024

When references are not added in order, we are sorting in place the tail of the postings list held by MemPostings, however the array backing that slice might already being used by some readers.

Corresponds to #15317.

We can't modify the postings list that are held in MemPostings as they
might already be in use by some readers.

Signed-off-by: Oleg Zaytsev <[email protected]>
@colega colega requested a review from jesusvazquez as a code owner October 10, 2024 10:38
I used `require.Equal` instead of `require.Len` on purpose, as when the
slice is getting bigger, the error message that `require.Len` prints is
somtimes too verbose.

However that makes linter unhappy, linter thinks that `require.Len` is
always better.

Signed-off-by: Oleg Zaytsev <[email protected]>
@colega
Copy link
Contributor Author

colega commented Oct 10, 2024

Running some benchmarks:

goos: darwin
goarch: arm64
pkg: github.com/prometheus/prometheus/tsdb
cpu: Apple M3 Pro
                                             │ HeadStripeSeriesCreate.main │     HeadStripeSeriesCreate.fix     │
                                             │           sec/op            │   sec/op     vs base               │
HeadStripeSeriesCreate-12                                      728.0n ± 8%   705.0n ± 4%  -3.16% (p=0.015 n=10)
HeadStripeSeriesCreateParallel-12                              686.8n ± 5%   678.1n ± 2%       ~ (p=0.579 n=10)
HeadStripeSeriesCreate_PreCreationFailure-12                   70.61n ± 6%   70.01n ± 0%  -0.85% (p=0.025 n=10)
geomean                                                        328.0n        322.3n       -1.76%

bboreham
bboreham previously approved these changes Oct 10, 2024
Copy link
Member

@bboreham bboreham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change seems fine; couple of thoughts below.

Comment on lines 420 to 422
old := list
list = make([]storage.SeriesRef, len(old), cap(old))
copy(list, old)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we use slices.Clone ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked, and slices.Clone doesn't create a slice of same capacity, which is what I wanted here.

Comment on lines 425 to 430
for i := len(list) - 1; i >= 1; i-- {
if list[i] >= list[i-1] {
break
}
list[i], list[i-1] = list[i-1], list[i]
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Idea: this loop could be modified to find the insertion point, then copy the head, insert new value, copy the tail.
(and not copy on line 422).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that's better given that we expect this to happen few times and very few last elements would need to be swapped. Can we experiment with that in a follow up?


// We don't read the value of the postings here,
// this is tested in TestMemPostings_Unordered_Add_Get where it's easier to achieve the determinism.
// This test just checks that there's no data race.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reminded me of

copy(bb, b) // This copying of chunk bytes detects any race.

aknuds1
aknuds1 previously approved these changes Oct 10, 2024
Copy link
Contributor

@aknuds1 aknuds1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, should there be a changelog entry though?

@bboreham
Copy link
Member

It's not typical to add changelog entries in Prometheus.
Put it in the PR title is easiest.

jesusvazquez
jesusvazquez previously approved these changes Oct 10, 2024
Copy link
Member

@jesusvazquez jesusvazquez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@aknuds1
Copy link
Contributor

aknuds1 commented Oct 10, 2024

It's not typical to add changelog entries in Prometheus.

@bboreham so what's the convention about when to put e.g. bug fixes in the changelog? This make me uncertain about when to do so, I thought when making user facing fixes they should generally go in there.

@colega colega dismissed stale reviews from jesusvazquez, aknuds1, and bboreham via 512cb16 October 10, 2024 13:21
@colega
Copy link
Contributor Author

colega commented Oct 10, 2024

I pushed a commit changing the logic to avoid copying again when we just copied for appending.

If there are no common labels on the series, we don't excercise the
ordering part of MemSeries, as we're just creating slices of one element
for each label value.

Signed-off-by: Oleg Zaytsev <[email protected]>
@aknuds1
Copy link
Contributor

aknuds1 commented Oct 11, 2024

Thanks!

@aknuds1 aknuds1 merged commit 50ef0dc into prometheus:main Oct 11, 2024
bboreham added a commit to bboreham/prometheus that referenced this pull request Nov 3, 2024
…heus#15141)"

This reverts commit 50ef0dc.

Memory allocation goes so high in Prombench that the system is unusable.
bboreham added a commit to bboreham/prometheus that referenced this pull request Nov 3, 2024
…heus#15141)"

This reverts commit 50ef0dc.

Memory allocation goes so high in Prombench that the system is unusable.
bboreham added a commit to bboreham/prometheus that referenced this pull request Nov 3, 2024
…heus#15141)"

This reverts commit 50ef0dc.

Memory allocation goes so high in Prombench that the system is unusable.

Signed-off-by: Bryan Boreham <[email protected]>
@bboreham bboreham mentioned this pull request Nov 3, 2024
@bboreham
Copy link
Member

bboreham commented Nov 3, 2024

Apparently the case where we copy is quite common when running Prombench, and therefore I think this will impact real users.

I have proposed to revert at #15316.

@bboreham
Copy link
Member

bboreham commented Nov 3, 2024

Maybe, instead of adding all the burden to MemPostings.Add(), we could make ListPostings deal with partially-sorted lists?

Is there an issue detailing how this problem was found?

@aknuds1
Copy link
Contributor

aknuds1 commented Nov 3, 2024

Is there an issue detailing how this problem was found?

@bboreham A public facing issue wasn't added, I can make one.

bboreham added a commit to bboreham/prometheus that referenced this pull request Nov 3, 2024
…heus#15141)"

This reverts commit 50ef0dc.

Memory allocation goes so high in Prombench that the system is unusable.

Signed-off-by: Bryan Boreham <[email protected]>
@aknuds1
Copy link
Contributor

aknuds1 commented Nov 3, 2024

@bboreham I made an issue.

bboreham added a commit that referenced this pull request Nov 3, 2024
Revert "Fix `MemPostings.Add` and `MemPostings.Get` data race (#15141)"
julienduchesne pushed a commit to julienduchesne/prometheus that referenced this pull request Dec 13, 2024
* Tests for Mempostings.{Add,Get} data race
* Fix MemPostings.{Add,Get} data race

We can't modify the postings list that are held in MemPostings as they
might already be in use by some readers.

* Modify BenchmarkHeadStripeSeriesCreate to have common labels

If there are no common labels on the series, we don't excercise the
ordering part of MemSeries, as we're just creating slices of one element
for each label value.

---------

Signed-off-by: Oleg Zaytsev <[email protected]>
julienduchesne pushed a commit to julienduchesne/prometheus that referenced this pull request Dec 13, 2024
…heus#15141)"

This reverts commit b6285be.

Memory allocation went through the roof.
julienduchesne pushed a commit to julienduchesne/prometheus that referenced this pull request Dec 13, 2024
Revert "Fix `MemPostings.Add` and `MemPostings.Get` data race (prometheus#15141)"
julienduchesne pushed a commit to julienduchesne/prometheus that referenced this pull request Dec 13, 2024
@kushalShukla-web
Copy link
Contributor

Maybe, instead of adding all the burden to MemPostings.Add(), we could make ListPostings deal with partially-sorted lists?

Hi @bboreham, if we remove the sorting part from addFor, it means we’ll need to sort during retrieval, which I believe could increase the time complexity. Wouldn’t that make Get() slower?

@colega
Copy link
Contributor Author

colega commented Jul 14, 2025

@kushalShukla-web I was thinking that we could involve isolation here, and committing to the sorted slice once all related transactions are complete, while keeping an unsorted tail for the series created in the ongoing transactions.

This would require having isolation enabled everywhere though.

@kushalShukla-web
Copy link
Contributor

kushalShukla-web commented Oct 15, 2025

@colega In that case, we need to change m map[string]map[string][]storage.SeriesRef into two maps: committed map[string]map[string][]storage.SeriesRef and tails map[string]map[string][]storage.SeriesRef. we can avoid the copying part if we do in this way.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants