Reduce RowHash's tag space size by x2#3543
Conversation
terrelln
left a comment
There was a problem hiding this comment.
Generally looks good!
I did a quick benchmark and see just about neutral performance, +- a few percent based on the compiler. Also a small ratio loss, but it seems reasonable. I'll let you do the full benchmarks, just wanted to verify that the speed looked good on my machine too.
lib/compress/zstd_lazy.c
Outdated
| @@ -804,9 +803,10 @@ U16 ZSTD_rotateRight_U16(U16 const value, U32 count) { | |||
| * value to reflect the update. Essentially cycles backwards from [0, {entries per row}) | |||
There was a problem hiding this comment.
You should update this comment
There was a problem hiding this comment.
I thought I did, thank you for noticing!
lib/compress/zstd_lazy.c
Outdated
| { | ||
| U32* const hashTable = ms->hashTable; | ||
| U16* const tagTable = ms->tagTable; | ||
| U8* const tagTable = ms->tagTable; |
There was a problem hiding this comment.
You should use BYTE here and elsewhere
lib/compress/zstd_lazy.c
Outdated
|
|
||
| assert(hash == ZSTD_hashPtr(base + updateStartIdx, hashLog + ZSTD_ROW_HASH_TAG_BITS, mls)); | ||
| ((BYTE*)tagRow)[pos + ZSTD_ROW_HASH_TAG_OFFSET] = hash & ZSTD_ROW_HASH_TAG_MASK; | ||
| ((BYTE*)tagRow)[pos] = hash & ZSTD_ROW_HASH_TAG_MASK; |
There was a problem hiding this comment.
Can get rid of the BYTE* cast now.
|
@terrelln - did you benchmark on a mac or on an x86 server? |
cc96945 to
faf5f15
Compare
|
Fixed CR comments and added a spreadsheet for benchmarks. I've tried reducing this regression, but don't have a good solution at the moment and unless it's critical I'll not continue. |
Allocate half the memory for tag space, which means that we get one less slot for an actual tag (needs to be used for next position index). In turn, we slash the memory usage for slightly worse compression ratio or better ratio if we use the same memory size with a higher hashLog.
faf5f15 to
69be424
Compare
x86 |
…ys invalid facebook#3543 decreases the size of the tagTable by a factor of 2, which requires using the first tag position in each row for head position instead of a tag. Although position 0 stopped being a valid match, it still persisted in mask calculation resulting in the matches loops possibly terminating before it should have. The fix skips position 0 to solve this problem.
…ys invalid facebook#3543 decreases the size of the tagTable by a factor of 2, which requires using the first tag position in each row for head position instead of a tag. Although position 0 stopped being a valid match, it still persisted in mask calculation resulting in the matches loops possibly terminating before it should have. The fix skips position 0 to solve this problem.
#3543 decreases the size of the tagTable by a factor of 2, which requires using the first tag position in each row for head position instead of a tag. Although position 0 stopped being a valid match, it still persisted in mask calculation resulting in the matches loops possibly terminating before it should have. The fix skips position 0 to solve this problem.
|
Hi, can this be reflected in the comment at https://github.com/facebook/zstd/blame/25822342be59d831bad65426ae51f5cc22157b09/lib/compress/zstd_lazy.c#L1116-L1118 Thanks! |
Allocate half the memory for tag space, which means that we get one less slot for an actual tag (needs to be used for next position index).
The results is a slight loss in compression ratio (up to 0.2%) and some regressions/improvements to speed depending on level and sample. In turn, we get to save 16% of the hash table's space (5 bytes per entry instead of 6 bytes per entry).
Update:
CR comments fixed.
Benchmark in spreadsheet.
Update 2:
Another interesting benchmark is
fullbench -b41 -l9 enwik8.1kwhich repeatedly usesZSTD_compressStreamto recompress 1024 bytes out of enwik8. Since #3528 hasn't been deployed yet, we can observe that this PR makes the benchmark twice as fast for small data sizes as the majority of time is spent in the tag space intiailization.