Skip to content

Fix: getting rid of EncOOOXOR from chunk encoding#12111

Merged
codesome merged 5 commits intoprometheus:mainfrom
infracloudio:remove-encoooxor
Mar 16, 2023
Merged

Fix: getting rid of EncOOOXOR from chunk encoding#12111
codesome merged 5 commits intoprometheus:mainfrom
infracloudio:remove-encoooxor

Conversation

@mabhi
Copy link
Copy Markdown
Contributor

@mabhi mabhi commented Mar 9, 2023

Explanation

Getting rid of EncOOOXOR from chunk encoding, as this is not an encoding but a flag to mark OOO encoded chunks

Related Issue:

Fixes #11835

Proposed Changes

  • Refactor the code to get rid of OOOXORChunk struct
  • Enhance the ChunkDiskMapper with the responsibility to verify for OOOChunks, apply OOO header and restore original encoding
  • The ChunkDiskMapper uses the first bit to know if the chunk is from out-of-order or not, and uses the last 7 bits of the byte to get the chunk encoding, by assuming that the chunk encoding won't cross 7 bits anytime in near/far future.
  • During chunk writes either directly or via job queue sending the isOOO flag to persist the extra 1st bit and that would be used during reads to check for OOO chunks and restore the original encoding for further processing.

Proof Manifests

@mabhi mabhi force-pushed the remove-encoooxor branch from dfa4fcf to 1aa88c6 Compare March 10, 2023 11:05
@mabhi mabhi marked this pull request as ready for review March 10, 2023 14:07
@mabhi mabhi requested a review from codesome as a code owner March 10, 2023 14:07
Copy link
Copy Markdown
Member

@jesusvazquez jesusvazquez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So far the work looks good 💪

I left one nitpick about the naming of a couple of functions.

The most important comment is the one in the head where the head is still aware of ooo chunk encodings. I think we can fix this refactoring the function by adding a boolean saying if the chunk is ooo but lets wait for @codesome 's comments.

Great work!

RestoreOutOfOrderMask = 0b01111111
)

func (cdm *ChunkDiskMapper) OutOfOrderEncoding(sourceEncoding chunkenc.Encoding) chunkenc.Encoding {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rename this to ApplyOutOfOrderEncoding or something like that since its actually applying a mask over the giving encoding.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, will rename it as suggested.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ApplyOutOfOrderMask is my suggestion. We are not changing encoding here. Reminder: we are not considering this bit as a part of encoding. For ChunkDiskMapper, only the last 7 bits are the encoding.

Copy link
Copy Markdown
Contributor Author

@mabhi mabhi Mar 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

return (e & OutOfOrderMask) != 0
}

func (cdm *ChunkDiskMapper) OutOfOrderDecoding(sourceEncoding chunkenc.Encoding) chunkenc.Encoding {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above, we're applying a mask to decode

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, will rename it as suggested.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After getting rid of RestoreOutOfOrderMask as suggested, this can be renamed as UnapplyOutOfOrderMask.

Copy link
Copy Markdown
Contributor Author

@mabhi mabhi Mar 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

tsdb/head.go Outdated
isOOO := h.chunkDiskMapper.IsOutOfOrderChunk(encoding)
// flip the first bit to restore the original encoding
if isOOO {
encoding = h.chunkDiskMapper.OutOfOrderDecoding(encoding)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm I wonder what could be done here, it seems here the head is aware of OOO chunk encodings and this is precisely what we wanted to hide in the chunk disk mapper.

Maybe we could change the function to return a boolean saying if its OOO instead of the encoding and then if its OOO we add it to oooMmappedChunks and if its not we add it to mmappedChunks

What do you think @codesome

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jesusvazquez Can you point out as to which function you want to return bool.
The current function is part of the load chunks from initial read. I have tried to put all the logic in CDM. Do you want me to pass this bool as part of the callback function inside the IterateAllChunks ?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right! I believe I added the encoding to that function when we added the OOO chunks. If you notice the usages I think it should be fairly easy to change.

func (cdm *ChunkDiskMapper) IterateAllChunks(f func(seriesRef HeadSeriesRef, chunkRef ChunkDiskMapperRef, mint, maxt int64, numSamples uint16, encoding chunkenc.Encoding) error) (err error) {

I would change that signature to something like
func (cdm *ChunkDiskMapper) IterateAllChunks(f func(seriesRef HeadSeriesRef, chunkRef ChunkDiskMapperRef, mint, maxt int64, numSamples uint16, isOOO bool) error) (err error) {

That will force you to refactor the call in

if err := h.chunkDiskMapper.IterateAllChunks(func(seriesRef chunks.HeadSeriesRef, chunkRef chunks.ChunkDiskMapperRef, mint, maxt int64, numSamples uint16, encoding chunkenc.Encoding) error {
and remove the information about encodings in the head.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 with what @jesusvazquez said. Let's change the function signature to return the isOOO boolean such that head doesn't need to infer it from the chunk encoding. The encoding coming out of ChunkDiskMapper here should have that ooo bit removed.

Copy link
Copy Markdown
Contributor Author

@mabhi mabhi Mar 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines +279 to +280
OutOfOrderMask = 0b10000000
RestoreOutOfOrderMask = 0b01111111
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can use a single variable and get the RestoreOutOfOrderMask as ^OutOfOrderMask (note the uint8, else it can be buggy) https://goplay.tools/snippet/dCb5_qkFCbL

Suggested change
OutOfOrderMask = 0b10000000
RestoreOutOfOrderMask = 0b01111111
OutOfOrderMask = uint8(0b10000000)

Copy link
Copy Markdown
Contributor Author

@mabhi mabhi Mar 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

chkEnc := mmapFile.byteSlice.Range(chkStart, chkStart+ChunkEncodingSize)[0]

sourceChkEnc := chunkenc.Encoding(chkEnc)
if cdm.IsOutOfOrderChunk(sourceChkEnc) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need this if check. You can simply just apply the bit mask everytime.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

understood

RestoreOutOfOrderMask = 0b01111111
)

func (cdm *ChunkDiskMapper) OutOfOrderEncoding(sourceEncoding chunkenc.Encoding) chunkenc.Encoding {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ApplyOutOfOrderMask is my suggestion. We are not changing encoding here. Reminder: we are not considering this bit as a part of encoding. For ChunkDiskMapper, only the last 7 bits are the encoding.

return (e & OutOfOrderMask) != 0
}

func (cdm *ChunkDiskMapper) OutOfOrderDecoding(sourceEncoding chunkenc.Encoding) chunkenc.Encoding {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After getting rid of RestoreOutOfOrderMask as suggested, this can be renamed as UnapplyOutOfOrderMask.

Copy link
Copy Markdown
Member

@jesusvazquez jesusvazquez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 💪 💪

@mabhi
Copy link
Copy Markdown
Contributor Author

mabhi commented Mar 15, 2023

@codesome / @jesusvazquez any other changes need in this PR. Do let me know.

Copy link
Copy Markdown
Member

@codesome codesome left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Last nits

// Chunk encodings for out-of-order chunks.
// These encodings must be only used by the Head block for its internal bookkeeping.
const (
OutOfOrderMask = 0b10000000
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be uint8 to begin with and don't need typecasting elsewhere, else it can lead buggy code if not used correctly.

Suggested change
OutOfOrderMask = 0b10000000
OutOfOrderMask = uint8(0b10000000)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

chkEnc := mmapFile.byteSlice.Range(chkStart, chkStart+ChunkEncodingSize)[0]

sourceChkEnc := chunkenc.Encoding(chkEnc)
// update the chunk encoding by restoring the original encoding, if needed
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than the reword, in Prometheus we start comments with capital letters where applicable and end with a full stop.

Suggested change
// update the chunk encoding by restoring the original encoding, if needed
// Extract the encoding from the byte. ChunkDiskMapper uses only the last 7 bits for the encoding.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines +891 to +894
// flip the first bit to restore the original encoding
if isOOO {
chkEnc = cdm.UnapplyOutOfOrderMask(chkEnc)
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same reword as the other comment + don't need the if block.

Suggested change
// flip the first bit to restore the original encoding
if isOOO {
chkEnc = cdm.UnapplyOutOfOrderMask(chkEnc)
}
// Extract the encoding from the byte. ChunkDiskMapper uses only the last 7 bits for the encoding.
chkEnc = cdm.UnapplyOutOfOrderMask(chkEnc)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

return (e & OutOfOrderMask) != 0
}

func (cdm *ChunkDiskMapper) UnapplyOutOfOrderMask(sourceEncoding chunkenc.Encoding) chunkenc.Encoding {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On another thought, I think this is better. Making it less specific to OOO such that it can be extended later if needed.

Suggested change
func (cdm *ChunkDiskMapper) UnapplyOutOfOrderMask(sourceEncoding chunkenc.Encoding) chunkenc.Encoding {
func (cdm *ChunkDiskMapper) RemoveMasks(sourceEncoding chunkenc.Encoding) chunkenc.Encoding {

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@mabhi mabhi force-pushed the remove-encoooxor branch from b43f037 to d3c41d0 Compare March 15, 2023 13:01
@mabhi mabhi requested a review from codesome March 16, 2023 04:59
Copy link
Copy Markdown
Member

@codesome codesome left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Last comment and we can merge. LGTM!

@codesome codesome merged commit 8f6d5dc into prometheus:main Mar 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Get rid of the EncOOOXOR from the chunk encodings

3 participants