Fix zero length hash creation with HSETEX#2998
Merged
ranshid merged 1 commit intovalkey-io:unstablefrom Jan 4, 2026
Merged
Conversation
Signed-off-by: Madelyn Olson <[email protected]>
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## unstable #2998 +/- ##
============================================
+ Coverage 74.11% 74.33% +0.22%
============================================
Files 129 129
Lines 70894 70894
============================================
+ Hits 52540 52697 +157
+ Misses 18354 18197 -157
🚀 New features to boost your workflow:
|
enjoy-binbin
approved these changes
Jan 3, 2026
Member
|
ranshid
approved these changes
Jan 3, 2026
Member
@madolson before merging this, do you feel that this change will introduce "some odd behavior w.r.t. to keyspace notifications" or were you referring to the rest of the keyspace events? |
Member
Author
It won't change any existing keyspace notifications. I guess I was commenting that we could have changed the keyspace notifications. |
jdheyburn
pushed a commit
to jdheyburn/valkey
that referenced
this pull request
Jan 8, 2026
If you send an `HSETEX key EXAT 0 FIELDS 1 foo bar`, it will create an empty length string. This is not good behavior as other places in the code will crash if an empty hash is accessed. This change just makes it so that the key is freed if it ends up being empty. This has some odd behavior w.r.t. to keyspace notifications though. Today, if you do `HSETEX key EXAT 0 FIELDS 1 foo bar` with a hash that exists, it won't send an `HSET` notification or an `HEXPIRE` notification unless foo is already present. It will also send a `DEL` notification if foo is last the field, effectively deleting the key. This isn't consistent with Redis, which will still send an `HSET` and `HDEL` (*NOTE* not `hexpire`) notification if a key is added past the expiration. Likewise, in the case where the key doesn't exist and `HSETEX key EXAT 0 FIELDS 1 foo bar` is sent, Redis will send an HSET, HDEL, and DEL notification in that order. I think it makes sense to keep consistency with what we have today (given that it's been out for awhile), but document the behavior. Basically, if you send a command with an expiration in the past, we will basically ignore the input unless it's "deleting" a key. Signed-off-by: Madelyn Olson <[email protected]>
PingXie
added a commit
to PingXie/valkey
that referenced
this pull request
Jan 27, 2026
Functional improvements:
1. Add merge commit filter to refresh_redis_prs.py
- Skip merge commits ('merge' + 'into' in title)
- Skip release branches
- Skip large PRs (>50 files changed)
- Filtered 35 PRs from database (961 → 926)
2. Raise LAYER2_SIMILARITY_THRESHOLD from 0.75 to 0.85
- Reduces false positives from structural similarity
- Semantic normalization preserves enough info for 85% threshold
- All true positives remain detected (87-100% similarity)
3. Add hybrid SimHash + Patch-ID Layer 1 candidate generation
- Candidate if SimHash >= 0.80 (high confidence)
- Candidate if SimHash >= 0.70 AND patch-id matches
- Reduces false positives while maintaining zero false negatives
Messaging improvements:
- Replace 'FAILED' with 'REQUIRES REVIEW' for neutral tone
- Change 'Potential matches' to 'Similar content detected'
- Update backtest: 'Copied from' → 'Similar to Redis PR(s)'
- Add guidance about reviewing for proper attribution
- Remove all trailing whitespace
Results:
- Backtest accuracy: 88.5% → 95.9% (+7.4pp)
- False positive rate: 65% → 16.7% (-48pp)
- Flagged PRs: 17/212 (8.0%) → 6/212 (2.8%)
- Only false positive: PR valkey-io#2998 (maintenance of adopted feature)
Rationale:
The system detects high similarity but cannot determine intent.
Neutral language requests human review rather than making
accusations. Some matches are acceptable (e.g., bug fixes in
previously-adopted features).
Signed-off-by: Ping Xie <[email protected]>
PingXie
added a commit
to PingXie/valkey
that referenced
this pull request
Jan 27, 2026
Functional improvements:
1. Add merge commit filter to refresh_redis_prs.py
- Skip merge commits ('merge' + 'into' in title)
- Skip release branches
- Skip large PRs (>50 files changed)
- Filtered 35 PRs from database (961 → 926)
2. Raise LAYER2_SIMILARITY_THRESHOLD from 0.75 to 0.85
- Reduces false positives from structural similarity
- Semantic normalization preserves enough info for 85% threshold
- All true positives remain detected (87-100% similarity)
3. Add hybrid SimHash + Patch-ID Layer 1 candidate generation
- Candidate if SimHash >= 0.80 (high confidence)
- Candidate if SimHash >= 0.70 AND patch-id matches
- Reduces false positives while maintaining zero false negatives
Messaging improvements:
- Replace 'FAILED' with 'REQUIRES REVIEW' for neutral tone
- Change 'Potential matches' to 'Similar content detected'
- Update backtest: 'Copied from' → 'Similar to Redis PR(s)'
- Add guidance about reviewing for proper attribution
- Remove all trailing whitespace
Results:
- Backtest accuracy: 88.5% → 95.9% (+7.4pp)
- False positive rate: 65% → 16.7% (-48pp)
- Flagged PRs: 17/212 (8.0%) → 6/212 (2.8%)
- Only false positive: PR valkey-io#2998 (maintenance of adopted feature)
Rationale:
The system detects high similarity but cannot determine intent.
Neutral language requests human review rather than making
accusations. Some matches are acceptable (e.g., bug fixes in
previously-adopted features).
Signed-off-by: Ping Xie <[email protected]>
PingXie
added a commit
to PingXie/valkey
that referenced
this pull request
Jan 27, 2026
Functional improvements:
1. Add merge commit filter to refresh_redis_prs.py
- Skip merge commits ('merge' + 'into' in title)
- Skip release branches
- Skip large PRs (>50 files changed)
- Filtered 35 PRs from database (961 → 926)
2. Raise LAYER2_SIMILARITY_THRESHOLD from 0.75 to 0.85
- Reduces false positives from structural similarity
- Semantic normalization preserves enough info for 85% threshold
- All true positives remain detected (87-100% similarity)
3. Add hybrid SimHash + Patch-ID Layer 1 candidate generation
- Candidate if SimHash >= 0.80 (high confidence)
- Candidate if SimHash >= 0.70 AND patch-id matches
- Reduces false positives while maintaining zero false negatives
Messaging improvements:
- Replace 'FAILED' with 'REQUIRES REVIEW' for neutral tone
- Change 'Potential matches' to 'Similar content detected'
- Update backtest: 'Copied from' → 'Similar to Redis PR(s)'
- Add guidance about reviewing for proper attribution
- Remove all trailing whitespace
Results:
- Backtest accuracy: 88.5% → 95.9% (+7.4pp)
- False positive rate: 65% → 16.7% (-48pp)
- Flagged PRs: 17/212 (8.0%) → 6/212 (2.8%)
- Only false positive: PR valkey-io#2998 (maintenance of adopted feature)
Rationale:
The system detects high similarity but cannot determine intent.
Neutral language requests human review rather than making
accusations. Some matches are acceptable (e.g., bug fixes in
previously-adopted features).
Signed-off-by: Ping Xie <[email protected]>
PingXie
added a commit
to PingXie/valkey
that referenced
this pull request
Jan 27, 2026
Problem: PR valkey-io#2998 is flagged as 93% similar to Redis PRs, but it's just moving 4 lines of code from one location to another (100% exact match). This is trivial refactoring, not copying of new content. Solution: Add detect_code_movement() function that analyzes: 1. Net new lines (added - removed) 2. Percentage of added lines that are exact duplicates of removed Filter criteria: - Skip if net new lines < 5 - Skip if >70% of added lines are exact copies of removed lines Example (PR valkey-io#2998): Added: 4 lines Removed: 4 lines Exact matches: 4/4 (100%) Net new: 0 lines Result: SKIP (trivial code movement) Signed-off-by: Ping Xie <[email protected]>
ranshid
pushed a commit
to ranshid/valkey
that referenced
this pull request
Jan 28, 2026
If you send an `HSETEX key EXAT 0 FIELDS 1 foo bar`, it will create an empty length string. This is not good behavior as other places in the code will crash if an empty hash is accessed. This change just makes it so that the key is freed if it ends up being empty. This has some odd behavior w.r.t. to keyspace notifications though. Today, if you do `HSETEX key EXAT 0 FIELDS 1 foo bar` with a hash that exists, it won't send an `HSET` notification or an `HEXPIRE` notification unless foo is already present. It will also send a `DEL` notification if foo is last the field, effectively deleting the key. This isn't consistent with Redis, which will still send an `HSET` and `HDEL` (*NOTE* not `hexpire`) notification if a key is added past the expiration. Likewise, in the case where the key doesn't exist and `HSETEX key EXAT 0 FIELDS 1 foo bar` is sent, Redis will send an HSET, HDEL, and DEL notification in that order. I think it makes sense to keep consistency with what we have today (given that it's been out for awhile), but document the behavior. Basically, if you send a command with an expiration in the past, we will basically ignore the input unless it's "deleting" a key. Signed-off-by: Madelyn Olson <[email protected]>
ranshid
pushed a commit
to ranshid/valkey
that referenced
this pull request
Jan 28, 2026
If you send an `HSETEX key EXAT 0 FIELDS 1 foo bar`, it will create an empty length string. This is not good behavior as other places in the code will crash if an empty hash is accessed. This change just makes it so that the key is freed if it ends up being empty. This has some odd behavior w.r.t. to keyspace notifications though. Today, if you do `HSETEX key EXAT 0 FIELDS 1 foo bar` with a hash that exists, it won't send an `HSET` notification or an `HEXPIRE` notification unless foo is already present. It will also send a `DEL` notification if foo is last the field, effectively deleting the key. This isn't consistent with Redis, which will still send an `HSET` and `HDEL` (*NOTE* not `hexpire`) notification if a key is added past the expiration. Likewise, in the case where the key doesn't exist and `HSETEX key EXAT 0 FIELDS 1 foo bar` is sent, Redis will send an HSET, HDEL, and DEL notification in that order. I think it makes sense to keep consistency with what we have today (given that it's been out for awhile), but document the behavior. Basically, if you send a command with an expiration in the past, we will basically ignore the input unless it's "deleting" a key. Signed-off-by: Madelyn Olson <[email protected]>
ranshid
pushed a commit
that referenced
this pull request
Jan 29, 2026
If you send an `HSETEX key EXAT 0 FIELDS 1 foo bar`, it will create an empty length string. This is not good behavior as other places in the code will crash if an empty hash is accessed. This change just makes it so that the key is freed if it ends up being empty. This has some odd behavior w.r.t. to keyspace notifications though. Today, if you do `HSETEX key EXAT 0 FIELDS 1 foo bar` with a hash that exists, it won't send an `HSET` notification or an `HEXPIRE` notification unless foo is already present. It will also send a `DEL` notification if foo is last the field, effectively deleting the key. This isn't consistent with Redis, which will still send an `HSET` and `HDEL` (*NOTE* not `hexpire`) notification if a key is added past the expiration. Likewise, in the case where the key doesn't exist and `HSETEX key EXAT 0 FIELDS 1 foo bar` is sent, Redis will send an HSET, HDEL, and DEL notification in that order. I think it makes sense to keep consistency with what we have today (given that it's been out for awhile), but document the behavior. Basically, if you send a command with an expiration in the past, we will basically ignore the input unless it's "deleting" a key. Signed-off-by: Madelyn Olson <[email protected]>
ranshid
added a commit
that referenced
this pull request
Jan 29, 2026
In the original implementation of Hash Field Expiration (#2089), the HSETEX command was implemented to report keyspace notifications only for performed changes. This is mostly aligned with other Hash commands (for example, HDEL will also not report `hdel` event for items which does not exist) The HSETEX case is somewhat different and is more like the `HSET` case. During HSETEX, after the command validations pass, items are ALWAYS "added" to the object, even though they might not actually be added. This case is the same for when the hash object is empty or when all the provided fields do not exist in the object (as reported [here](#2998)) This PR changes the way `HSETEX` will report keyspace notifications so that: 1. `hset` notification will ALWAYS be reported if all command validations pass. 2. `hexpire` will be reported in case the command include an expiration time (even past time) 3. `hxpired` will be reported in case the provided expiration time is in the past (or 0) 4. `hdel` will be reported in case the hash exists (or created as part of the command) and following the command execution it was left empty. 5. we will always return '1' as a return value of tHSETEX command which passed all validations. Before that we returned 1 only if we applied the change cross ALL the input fields, so in case some of them did not exist and a past time was set we would return 0. --------- Signed-off-by: Ran Shidlansik <[email protected]> Co-authored-by: Jacob Murphy <[email protected]>
ranshid
added a commit
to ranshid/valkey
that referenced
this pull request
Jan 29, 2026
…-io#3001) In the original implementation of Hash Field Expiration (valkey-io#2089), the HSETEX command was implemented to report keyspace notifications only for performed changes. This is mostly aligned with other Hash commands (for example, HDEL will also not report `hdel` event for items which does not exist) The HSETEX case is somewhat different and is more like the `HSET` case. During HSETEX, after the command validations pass, items are ALWAYS "added" to the object, even though they might not actually be added. This case is the same for when the hash object is empty or when all the provided fields do not exist in the object (as reported [here](valkey-io#2998)) This PR changes the way `HSETEX` will report keyspace notifications so that: 1. `hset` notification will ALWAYS be reported if all command validations pass. 2. `hexpire` will be reported in case the command include an expiration time (even past time) 3. `hxpired` will be reported in case the provided expiration time is in the past (or 0) 4. `hdel` will be reported in case the hash exists (or created as part of the command) and following the command execution it was left empty. 5. we will always return '1' as a return value of tHSETEX command which passed all validations. Before that we returned 1 only if we applied the change cross ALL the input fields, so in case some of them did not exist and a past time was set we would return 0. --------- Signed-off-by: Ran Shidlansik <[email protected]> Co-authored-by: Jacob Murphy <[email protected]>
ranshid
added a commit
that referenced
this pull request
Jan 29, 2026
In the original implementation of Hash Field Expiration (#2089), the HSETEX command was implemented to report keyspace notifications only for performed changes. This is mostly aligned with other Hash commands (for example, HDEL will also not report `hdel` event for items which does not exist) The HSETEX case is somewhat different and is more like the `HSET` case. During HSETEX, after the command validations pass, items are ALWAYS "added" to the object, even though they might not actually be added. This case is the same for when the hash object is empty or when all the provided fields do not exist in the object (as reported [here](#2998)) This PR changes the way `HSETEX` will report keyspace notifications so that: 1. `hset` notification will ALWAYS be reported if all command validations pass. 2. `hexpire` will be reported in case the command include an expiration time (even past time) 3. `hxpired` will be reported in case the provided expiration time is in the past (or 0) 4. `hdel` will be reported in case the hash exists (or created as part of the command) and following the command execution it was left empty. 5. we will always return '1' as a return value of tHSETEX command which passed all validations. Before that we returned 1 only if we applied the change cross ALL the input fields, so in case some of them did not exist and a past time was set we would return 0. --------- Signed-off-by: Ran Shidlansik <[email protected]> Co-authored-by: Jacob Murphy <[email protected]> Signed-off-by: Ran Shidlansik <[email protected]>
zuiderkwast
pushed a commit
that referenced
this pull request
Jan 30, 2026
In the original implementation of Hash Field Expiration (#2089), the HSETEX command was implemented to report keyspace notifications only for performed changes. This is mostly aligned with other Hash commands (for example, HDEL will also not report `hdel` event for items which does not exist) The HSETEX case is somewhat different and is more like the `HSET` case. During HSETEX, after the command validations pass, items are ALWAYS "added" to the object, even though they might not actually be added. This case is the same for when the hash object is empty or when all the provided fields do not exist in the object (as reported [here](#2998)) This PR changes the way `HSETEX` will report keyspace notifications so that: 1. `hset` notification will ALWAYS be reported if all command validations pass. 2. `hexpire` will be reported in case the command include an expiration time (even past time) 3. `hxpired` will be reported in case the provided expiration time is in the past (or 0) 4. `hdel` will be reported in case the hash exists (or created as part of the command) and following the command execution it was left empty. 5. we will always return '1' as a return value of tHSETEX command which passed all validations. Before that we returned 1 only if we applied the change cross ALL the input fields, so in case some of them did not exist and a past time was set we would return 0. --------- Signed-off-by: Ran Shidlansik <[email protected]> Co-authored-by: Jacob Murphy <[email protected]> Signed-off-by: Ran Shidlansik <[email protected]>
harrylin98
pushed a commit
to harrylin98/valkey_forked
that referenced
this pull request
Feb 19, 2026
…-io#3001) In the original implementation of Hash Field Expiration (valkey-io#2089), the HSETEX command was implemented to report keyspace notifications only for performed changes. This is mostly aligned with other Hash commands (for example, HDEL will also not report `hdel` event for items which does not exist) The HSETEX case is somewhat different and is more like the `HSET` case. During HSETEX, after the command validations pass, items are ALWAYS "added" to the object, even though they might not actually be added. This case is the same for when the hash object is empty or when all the provided fields do not exist in the object (as reported [here](valkey-io#2998)) This PR changes the way `HSETEX` will report keyspace notifications so that: 1. `hset` notification will ALWAYS be reported if all command validations pass. 2. `hexpire` will be reported in case the command include an expiration time (even past time) 3. `hxpired` will be reported in case the provided expiration time is in the past (or 0) 4. `hdel` will be reported in case the hash exists (or created as part of the command) and following the command execution it was left empty. 5. we will always return '1' as a return value of tHSETEX command which passed all validations. Before that we returned 1 only if we applied the change cross ALL the input fields, so in case some of them did not exist and a past time was set we would return 0. --------- Signed-off-by: Ran Shidlansik <[email protected]> Co-authored-by: Jacob Murphy <[email protected]>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
If you send an
HSETEX key EXAT 0 FIELDS 1 foo bar, it will create an empty length string. This is not good behavior as other places in the code will crash if an empty hash is accessed.This change just makes it so that the key is freed if it ends up being empty. This has some odd behavior w.r.t. to keyspace notifications though.
Today, if you do
HSETEX key EXAT 0 FIELDS 1 foo barwith a hash that exists, it won't send anHSETnotification or anHEXPIREnotification unless foo is already present. It will also send aDELnotification if foo is last the field, effectively deleting the key.This isn't consistent with Redis, which will still send an
HSETandHDEL(NOTE nothexpire) notification if a key is added past the expiration. Likewise, in the case where the key doesn't exist andHSETEX key EXAT 0 FIELDS 1 foo baris sent, Redis will send an HSET, HDEL, and DEL notification in that order.I think it makes sense to keep consistency with what we have today (given that it's been out for awhile), but document the behavior. Basically, if you send a command with an expiration in the past, we will basically ignore the input unless it's "deleting" a key.