Fix string embedding decision with keys#2087
Conversation
zuiderkwast
left a comment
There was a problem hiding this comment.
Thanks, you're right!
So what's the impact of this bug? We sometimes use embedded strings when the total allocation size needed is 65 or 66?
|
That's right. And when we need 65 or 66 bytes, we actually get 80 bytes as this is the next jemalloc bucket. |
ranshid
left a comment
There was a problem hiding this comment.
Thank you Vadym!
The change LGTM. added some minor addition which IMO makes the code more clear.
src/object.c
Outdated
| size += sdsReqSize(key_len, sdsReqType(key_len)) + 1; /* 1 byte for prefixed sds hdr size */ | ||
| } | ||
| size += (expire != -1) * sizeof(long long); | ||
| size += 4 + len; /* embstr header (3) + nullterm (1) */ |
There was a problem hiding this comment.
Although no problem here, I would love to see this part changed as well:
size += sdsReqSize(len, SDS_TYPE_8)
| size += 4 + len; /* embstr header (3) + nullterm (1) */ | |
| size += sdsReqSize(len, SDS_TYPE_8); /* embstr header (3) + nullterm (1) */ |
There was a problem hiding this comment.
Yes, and in that case, we don't need the comment on the right.
There was a problem hiding this comment.
Agree. Let met update the PR.
This commit fixes the size calculation for embedded strings with keys in createStringObjectWithKeyAndExpire(). The previous implementation used a fixed size of 3 bytes for the key's header, which was incorrect. The new implementation properly calculates the required size based on the key's length and SDS type. Added a unit test (test_embedded_string_with_key) to verify that: 1. Short values with long keys are properly encoded as embedded strings 2. Values that would cause the total size to exceed 64 bytes are encoded as raw Signed-off-by: Vadym Khoptynets <[email protected]>
|
@zuiderkwast I think this should be backported right? |
We can if you want, but I don't think it's very important. We save 8 bytes in very few specific cases. It's not a regression because 8.1 saves memory anyway compared too earlier versions. Only when key >= 32, value < 7 and key + value < 48 (or 40 if there is an expire), we save 8 bytes here. In this case we get a 64 bytes allocation jemalloc bucket for the robj and 8 bytes for the value. When the value is >= 7, we need a 16 bytes bucket for it, so the sum is 80 bytes again. |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## unstable #2087 +/- ##
============================================
+ Coverage 71.05% 71.22% +0.17%
============================================
Files 122 122
Lines 66143 66024 -119
============================================
+ Hits 46997 47025 +28
+ Misses 19146 18999 -147
🚀 New features to boost your workflow:
|
| TEST_ASSERT(embstr_obj->encoding == OBJ_ENCODING_EMBSTR); | ||
|
|
||
| robj *raw_obj = objectSetKeyAndExpire(longer_val_obj, key, -1); | ||
| TEST_ASSERT(raw_obj->encoding == OBJ_ENCODING_RAW); |
There was a problem hiding this comment.
This test case fails in the daily runs for 32-bit:
- https://github.com/valkey-io/valkey/actions/runs/15125828365/job/42517576263#step:10:349
- https://github.com/valkey-io/valkey/actions/runs/15101473588/job/42442959926#step:10:344
@poiuj Can you look into it?
There was a problem hiding this comment.
A pointer is 32 bits so the robj struct is smaller.
|
Bump. This PR broke the daily build. |
|
I have a fix here: |
|
@zuiderkwast thanks for the fix! |
This commit fixes the size calculation for embedded strings with keys in createStringObjectWithKeyAndExpire(). The previous implementation used a fixed size of 3 bytes for the key's header, which was incorrect. The new implementation properly calculates the required size based on the key's length and SDS type. Added a unit test (test_embedded_string_with_key) to verify that: 1. Short values with long keys are properly encoded as embedded strings 2. Values that would cause the total size to exceed 64 bytes are encoded as raw Signed-off-by: Vadym Khoptynets <[email protected]> Signed-off-by: shanwan1 <[email protected]>
This commit fixes the size calculation for embedded strings with keys in createStringObjectWithKeyAndExpire(). The previous implementation used a fixed size of 3 bytes for the key's header, which was incorrect. The new implementation properly calculates the required size based on the key's length and SDS type.
Added a unit test (test_embedded_string_with_key) to verify that: