Fix TSDB creation conflicting with transfer#1818
Fix TSDB creation conflicting with transfer#1818jtlisi merged 10 commits intocortexproject:masterfrom
Conversation
3494f29 to
6f4253e
Compare
codesome
left a comment
There was a problem hiding this comment.
Didn't have a chance to look at the tests, but other changes LGTM
pkg/ingester/ingester_v2.go
Outdated
| // a transfer in occurs. | ||
| if !force { | ||
| ingesterState := i.lifecycler.GetState() | ||
| if ingesterState != ring.ACTIVE { |
There was a problem hiding this comment.
Nit: you can combine these 2, also combine it with the above if !force check to be single if statement.
bb44c41 to
d8a317e
Compare
pkg/ring/model.go
Outdated
There was a problem hiding this comment.
IIRC pending ingesters shouldn't have any tokens; how did you trigger this?
There was a problem hiding this comment.
It should occur in the paths where we use forAllIngesters(). I think one path is:
- Querier receive a
LabelNamesrequest - Querier forwards the request to the distributor
- The distributor runs
client.LabelNames()indistributor.forAllIngesters() distributor.forAllIngesters()callsring.GetAll()ring.GetAll()returns all ingesters which satisfy theIsHealthy()
There was a problem hiding this comment.
@bboreham What's your take on this? If this change is a problem, I could rollback it in order to move one with this PR.
There was a problem hiding this comment.
Thanks for enumerating these.
pkg/ingester/ingester_v2_test.go
Outdated
748dcbb to
7a81a93
Compare
|
Nice work! I appreciate the refactoring on the read path ( |
7a81a93 to
6e4635a
Compare
|
You got some merge conflicts now |
1495357 to
fadbbdc
Compare
|
@gouthamve I've fixed the conflicts after rebasing. May you help me moving forward with this PR, please? |
|
Active development going on here :) you got more conflicts |
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
…n JOINING state Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
…ll require a dedicated discussion Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
Signed-off-by: Marco Pracucci <[email protected]>
fadbbdc to
772ed25
Compare
What this PR does:
In this PR:
Suggest to enable "Hide whitespace changes" when reviewing for an easier diff.
Why?
I found an issue leading to data loss and/or TSDB wal corruption, which should be fixed by this PR. What happens:
PENDINGstate, so a query request can hit an ingester even if theACTIVEstate has not been reached yet (not desired, fixed)If a query hit a new ingester in the
PENDINGstate before the transfer starts (it switches toJOININGthen), the query creates a TSDB opening the wal segment00000000. The subsequent transfer will overwrite the wal segments, while the00000000file reference is still kept by the opened TSDB.Moreover, if it get transferred a wal where a checkpoint already occurred, the
00000000segment is not part of the transfer but the segment 0 is created by the opened TSDB which leads to an inconsistent segments numbering on disk (ie. 0, checkpoint.6, 7, 8, 9) which cause TSDB to fail a subsequent re-opening.Which issue(s) this PR fixes:
No reported issue.
Checklist
CHANGELOG.mdupdated - the order of entries should be[CHANGE],[FEATURE],[ENHANCEMENT],[BUGFIX]