Skip to content

Fix zero length hash creation with HSETEX#2998

Merged
ranshid merged 1 commit intovalkey-io:unstablefrom
madolson:zero_length_hash
Jan 4, 2026
Merged

Fix zero length hash creation with HSETEX#2998
ranshid merged 1 commit intovalkey-io:unstablefrom
madolson:zero_length_hash

Conversation

@madolson
Copy link
Member

@madolson madolson commented Jan 2, 2026

If you send an HSETEX key EXAT 0 FIELDS 1 foo bar, it will create an empty length string. This is not good behavior as other places in the code will crash if an empty hash is accessed.

This change just makes it so that the key is freed if it ends up being empty. This has some odd behavior w.r.t. to keyspace notifications though.

Today, if you do HSETEX key EXAT 0 FIELDS 1 foo bar with a hash that exists, it won't send an HSET notification or an HEXPIRE notification unless foo is already present. It will also send a DEL notification if foo is last the field, effectively deleting the key.

This isn't consistent with Redis, which will still send an HSET and HDEL (NOTE not hexpire) notification if a key is added past the expiration. Likewise, in the case where the key doesn't exist and HSETEX key EXAT 0 FIELDS 1 foo bar is sent, Redis will send an HSET, HDEL, and DEL notification in that order.

I think it makes sense to keep consistency with what we have today (given that it's been out for awhile), but document the behavior. Basically, if you send a command with an expiration in the past, we will basically ignore the input unless it's "deleting" a key.

@codecov
Copy link

codecov bot commented Jan 2, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 74.33%. Comparing base (15e79ef) to head (38ed7db).
⚠️ Report is 4 commits behind head on unstable.

Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #2998      +/-   ##
============================================
+ Coverage     74.11%   74.33%   +0.22%     
============================================
  Files           129      129              
  Lines         70894    70894              
============================================
+ Hits          52540    52697     +157     
+ Misses        18354    18197     -157     
Files with missing lines Coverage Δ
src/t_hash.c 95.00% <100.00%> (+0.08%) ⬆️

... and 24 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@ranshid
Copy link
Member

ranshid commented Jan 3, 2026

  1. I agree about the need for this fix, mainly since the fact we should deleted the object and also it is needed to balance as now have 'new' keyspaceevent and it needs to be followed by 'del'
  2. Regarding the Redis behavior of sending hset and hdel events: I guess I am also fine with that.
    The only case where 'changes == 0' is when the hash object was empty (or when the entire fields input set does not exist), so I am also fine with always doing
    'hset' --> 'hexpire' (if exists) --> 'hexpired' (if past expiration time) --> 'del' (if the object is left empty)
    This will be an interface change though (as you stated) maybe I can open a separate issue for that to track.
  3. Lets fix the top comment to change HEXPIRE -> HSETEX

@ranshid ranshid moved this to To be backported in Valkey 9.0 Jan 3, 2026
@ranshid ranshid added the bug Something isn't working label Jan 3, 2026
@ranshid
Copy link
Member

ranshid commented Jan 3, 2026

This change just makes it so that the key is freed if it ends up being empty. This has some odd behavior w.r.t. to keyspace notifications though.

@madolson before merging this, do you feel that this change will introduce "some odd behavior w.r.t. to keyspace notifications" or were you referring to the rest of the keyspace events?

@madolson
Copy link
Member Author

madolson commented Jan 3, 2026

some odd behavior w.r.t. to keyspace notifications

It won't change any existing keyspace notifications. I guess I was commenting that we could have changed the keyspace notifications.

@ranshid ranshid merged commit 263d9ea into valkey-io:unstable Jan 4, 2026
58 checks passed
@ranshid ranshid added the release-notes This issue should get a line item in the release notes label Jan 4, 2026
jdheyburn pushed a commit to jdheyburn/valkey that referenced this pull request Jan 8, 2026
If you send an `HSETEX key EXAT 0 FIELDS 1 foo bar`, it will create an
empty length string. This is not good behavior as other places in the
code will crash if an empty hash is accessed.

This change just makes it so that the key is freed if it ends up being
empty. This has some odd behavior w.r.t. to keyspace notifications
though.

Today, if you do `HSETEX key EXAT 0 FIELDS 1 foo bar` with a hash that
exists, it won't send an `HSET` notification or an `HEXPIRE`
notification unless foo is already present. It will also send a `DEL`
notification if foo is last the field, effectively deleting the key.

This isn't consistent with Redis, which will still send an `HSET` and
`HDEL` (*NOTE* not `hexpire`) notification if a key is added past the
expiration. Likewise, in the case where the key doesn't exist and
`HSETEX key EXAT 0 FIELDS 1 foo bar` is sent, Redis will send an HSET,
HDEL, and DEL notification in that order.

I think it makes sense to keep consistency with what we have today
(given that it's been out for awhile), but document the behavior.
Basically, if you send a command with an expiration in the past, we will
basically ignore the input unless it's "deleting" a key.

Signed-off-by: Madelyn Olson <[email protected]>
PingXie added a commit to PingXie/valkey that referenced this pull request Jan 27, 2026
Functional improvements:
1. Add merge commit filter to refresh_redis_prs.py
   - Skip merge commits ('merge' + 'into' in title)
   - Skip release branches
   - Skip large PRs (>50 files changed)
   - Filtered 35 PRs from database (961 → 926)

2. Raise LAYER2_SIMILARITY_THRESHOLD from 0.75 to 0.85
   - Reduces false positives from structural similarity
   - Semantic normalization preserves enough info for 85% threshold
   - All true positives remain detected (87-100% similarity)

3. Add hybrid SimHash + Patch-ID Layer 1 candidate generation
   - Candidate if SimHash >= 0.80 (high confidence)
   - Candidate if SimHash >= 0.70 AND patch-id matches
   - Reduces false positives while maintaining zero false negatives

Messaging improvements:
- Replace 'FAILED' with 'REQUIRES REVIEW' for neutral tone
- Change 'Potential matches' to 'Similar content detected'
- Update backtest: 'Copied from' → 'Similar to Redis PR(s)'
- Add guidance about reviewing for proper attribution
- Remove all trailing whitespace

Results:
- Backtest accuracy: 88.5% → 95.9% (+7.4pp)
- False positive rate: 65% → 16.7% (-48pp)
- Flagged PRs: 17/212 (8.0%) → 6/212 (2.8%)
- Only false positive: PR valkey-io#2998 (maintenance of adopted feature)

Rationale:
The system detects high similarity but cannot determine intent.
Neutral language requests human review rather than making
accusations. Some matches are acceptable (e.g., bug fixes in
previously-adopted features).

Signed-off-by: Ping Xie <[email protected]>
PingXie added a commit to PingXie/valkey that referenced this pull request Jan 27, 2026
Functional improvements:
1. Add merge commit filter to refresh_redis_prs.py
   - Skip merge commits ('merge' + 'into' in title)
   - Skip release branches
   - Skip large PRs (>50 files changed)
   - Filtered 35 PRs from database (961 → 926)

2. Raise LAYER2_SIMILARITY_THRESHOLD from 0.75 to 0.85
   - Reduces false positives from structural similarity
   - Semantic normalization preserves enough info for 85% threshold
   - All true positives remain detected (87-100% similarity)

3. Add hybrid SimHash + Patch-ID Layer 1 candidate generation
   - Candidate if SimHash >= 0.80 (high confidence)
   - Candidate if SimHash >= 0.70 AND patch-id matches
   - Reduces false positives while maintaining zero false negatives

Messaging improvements:
- Replace 'FAILED' with 'REQUIRES REVIEW' for neutral tone
- Change 'Potential matches' to 'Similar content detected'
- Update backtest: 'Copied from' → 'Similar to Redis PR(s)'
- Add guidance about reviewing for proper attribution
- Remove all trailing whitespace

Results:
- Backtest accuracy: 88.5% → 95.9% (+7.4pp)
- False positive rate: 65% → 16.7% (-48pp)
- Flagged PRs: 17/212 (8.0%) → 6/212 (2.8%)
- Only false positive: PR valkey-io#2998 (maintenance of adopted feature)

Rationale:
The system detects high similarity but cannot determine intent.
Neutral language requests human review rather than making
accusations. Some matches are acceptable (e.g., bug fixes in
previously-adopted features).

Signed-off-by: Ping Xie <[email protected]>
PingXie added a commit to PingXie/valkey that referenced this pull request Jan 27, 2026
Functional improvements:
1. Add merge commit filter to refresh_redis_prs.py
   - Skip merge commits ('merge' + 'into' in title)
   - Skip release branches
   - Skip large PRs (>50 files changed)
   - Filtered 35 PRs from database (961 → 926)

2. Raise LAYER2_SIMILARITY_THRESHOLD from 0.75 to 0.85
   - Reduces false positives from structural similarity
   - Semantic normalization preserves enough info for 85% threshold
   - All true positives remain detected (87-100% similarity)

3. Add hybrid SimHash + Patch-ID Layer 1 candidate generation
   - Candidate if SimHash >= 0.80 (high confidence)
   - Candidate if SimHash >= 0.70 AND patch-id matches
   - Reduces false positives while maintaining zero false negatives

Messaging improvements:
- Replace 'FAILED' with 'REQUIRES REVIEW' for neutral tone
- Change 'Potential matches' to 'Similar content detected'
- Update backtest: 'Copied from' → 'Similar to Redis PR(s)'
- Add guidance about reviewing for proper attribution
- Remove all trailing whitespace

Results:
- Backtest accuracy: 88.5% → 95.9% (+7.4pp)
- False positive rate: 65% → 16.7% (-48pp)
- Flagged PRs: 17/212 (8.0%) → 6/212 (2.8%)
- Only false positive: PR valkey-io#2998 (maintenance of adopted feature)

Rationale:
The system detects high similarity but cannot determine intent.
Neutral language requests human review rather than making
accusations. Some matches are acceptable (e.g., bug fixes in
previously-adopted features).

Signed-off-by: Ping Xie <[email protected]>
PingXie added a commit to PingXie/valkey that referenced this pull request Jan 27, 2026
Problem:
PR valkey-io#2998 is flagged as 93% similar to Redis PRs, but it's just moving
4 lines of code from one location to another (100% exact match). This
is trivial refactoring, not copying of new content.

Solution:
Add detect_code_movement() function that analyzes:
1. Net new lines (added - removed)
2. Percentage of added lines that are exact duplicates of removed

Filter criteria:
- Skip if net new lines < 5
- Skip if >70% of added lines are exact copies of removed lines

Example (PR valkey-io#2998):
  Added: 4 lines
  Removed: 4 lines
  Exact matches: 4/4 (100%)
  Net new: 0 lines
  Result: SKIP (trivial code movement)

Signed-off-by: Ping Xie <[email protected]>
ranshid pushed a commit to ranshid/valkey that referenced this pull request Jan 28, 2026
If you send an `HSETEX key EXAT 0 FIELDS 1 foo bar`, it will create an
empty length string. This is not good behavior as other places in the
code will crash if an empty hash is accessed.

This change just makes it so that the key is freed if it ends up being
empty. This has some odd behavior w.r.t. to keyspace notifications
though.

Today, if you do `HSETEX key EXAT 0 FIELDS 1 foo bar` with a hash that
exists, it won't send an `HSET` notification or an `HEXPIRE`
notification unless foo is already present. It will also send a `DEL`
notification if foo is last the field, effectively deleting the key.

This isn't consistent with Redis, which will still send an `HSET` and
`HDEL` (*NOTE* not `hexpire`) notification if a key is added past the
expiration. Likewise, in the case where the key doesn't exist and
`HSETEX key EXAT 0 FIELDS 1 foo bar` is sent, Redis will send an HSET,
HDEL, and DEL notification in that order.

I think it makes sense to keep consistency with what we have today
(given that it's been out for awhile), but document the behavior.
Basically, if you send a command with an expiration in the past, we will
basically ignore the input unless it's "deleting" a key.

Signed-off-by: Madelyn Olson <[email protected]>
@zuiderkwast zuiderkwast moved this from To be backported to 9.0.2 WIP in Valkey 9.0 Jan 28, 2026
ranshid pushed a commit to ranshid/valkey that referenced this pull request Jan 28, 2026
If you send an `HSETEX key EXAT 0 FIELDS 1 foo bar`, it will create an
empty length string. This is not good behavior as other places in the
code will crash if an empty hash is accessed.

This change just makes it so that the key is freed if it ends up being
empty. This has some odd behavior w.r.t. to keyspace notifications
though.

Today, if you do `HSETEX key EXAT 0 FIELDS 1 foo bar` with a hash that
exists, it won't send an `HSET` notification or an `HEXPIRE`
notification unless foo is already present. It will also send a `DEL`
notification if foo is last the field, effectively deleting the key.

This isn't consistent with Redis, which will still send an `HSET` and
`HDEL` (*NOTE* not `hexpire`) notification if a key is added past the
expiration. Likewise, in the case where the key doesn't exist and
`HSETEX key EXAT 0 FIELDS 1 foo bar` is sent, Redis will send an HSET,
HDEL, and DEL notification in that order.

I think it makes sense to keep consistency with what we have today
(given that it's been out for awhile), but document the behavior.
Basically, if you send a command with an expiration in the past, we will
basically ignore the input unless it's "deleting" a key.

Signed-off-by: Madelyn Olson <[email protected]>
ranshid pushed a commit that referenced this pull request Jan 29, 2026
If you send an `HSETEX key EXAT 0 FIELDS 1 foo bar`, it will create an
empty length string. This is not good behavior as other places in the
code will crash if an empty hash is accessed.

This change just makes it so that the key is freed if it ends up being
empty. This has some odd behavior w.r.t. to keyspace notifications
though.

Today, if you do `HSETEX key EXAT 0 FIELDS 1 foo bar` with a hash that
exists, it won't send an `HSET` notification or an `HEXPIRE`
notification unless foo is already present. It will also send a `DEL`
notification if foo is last the field, effectively deleting the key.

This isn't consistent with Redis, which will still send an `HSET` and
`HDEL` (*NOTE* not `hexpire`) notification if a key is added past the
expiration. Likewise, in the case where the key doesn't exist and
`HSETEX key EXAT 0 FIELDS 1 foo bar` is sent, Redis will send an HSET,
HDEL, and DEL notification in that order.

I think it makes sense to keep consistency with what we have today
(given that it's been out for awhile), but document the behavior.
Basically, if you send a command with an expiration in the past, we will
basically ignore the input unless it's "deleting" a key.

Signed-off-by: Madelyn Olson <[email protected]>
ranshid added a commit that referenced this pull request Jan 29, 2026
In the original implementation of Hash Field Expiration
(#2089), the HSETEX command was
implemented to report keyspace notifications only for performed changes.
This is mostly aligned with other Hash commands (for example, HDEL will
also not report `hdel` event for items which does not exist)
The HSETEX case is somewhat different and is more like the `HSET` case.
During HSETEX, after the command validations pass, items are ALWAYS
"added" to the object, even though they might not actually be added.
This case is the same for when the hash object is empty or when all the
provided fields do not exist in the object (as reported
[here](#2998))

This PR changes the way `HSETEX` will report keyspace notifications so
that:
1. `hset` notification will ALWAYS be reported if all command
validations pass.
2. `hexpire` will be reported in case the command include an expiration
time (even past time)
3. `hxpired` will be reported in case the provided expiration time is in
the past (or 0)
4. `hdel` will be reported in case the hash exists (or created as part
of the command) and following the command execution it was left empty.
5. we will always return '1' as a return value of tHSETEX command which
passed all validations. Before that we returned 1 only if we applied the
change cross ALL the input fields, so in case some of them did not exist
and a past time was set we would return 0.

---------

Signed-off-by: Ran Shidlansik <[email protected]>
Co-authored-by: Jacob Murphy <[email protected]>
ranshid added a commit to ranshid/valkey that referenced this pull request Jan 29, 2026
…-io#3001)

In the original implementation of Hash Field Expiration
(valkey-io#2089), the HSETEX command was
implemented to report keyspace notifications only for performed changes.
This is mostly aligned with other Hash commands (for example, HDEL will
also not report `hdel` event for items which does not exist)
The HSETEX case is somewhat different and is more like the `HSET` case.
During HSETEX, after the command validations pass, items are ALWAYS
"added" to the object, even though they might not actually be added.
This case is the same for when the hash object is empty or when all the
provided fields do not exist in the object (as reported
[here](valkey-io#2998))

This PR changes the way `HSETEX` will report keyspace notifications so
that:
1. `hset` notification will ALWAYS be reported if all command
validations pass.
2. `hexpire` will be reported in case the command include an expiration
time (even past time)
3. `hxpired` will be reported in case the provided expiration time is in
the past (or 0)
4. `hdel` will be reported in case the hash exists (or created as part
of the command) and following the command execution it was left empty.
5. we will always return '1' as a return value of tHSETEX command which
passed all validations. Before that we returned 1 only if we applied the
change cross ALL the input fields, so in case some of them did not exist
and a past time was set we would return 0.

---------

Signed-off-by: Ran Shidlansik <[email protected]>
Co-authored-by: Jacob Murphy <[email protected]>
ranshid added a commit that referenced this pull request Jan 29, 2026
In the original implementation of Hash Field Expiration
(#2089), the HSETEX command was
implemented to report keyspace notifications only for performed changes.
This is mostly aligned with other Hash commands (for example, HDEL will
also not report `hdel` event for items which does not exist)
The HSETEX case is somewhat different and is more like the `HSET` case.
During HSETEX, after the command validations pass, items are ALWAYS
"added" to the object, even though they might not actually be added.
This case is the same for when the hash object is empty or when all the
provided fields do not exist in the object (as reported
[here](#2998))

This PR changes the way `HSETEX` will report keyspace notifications so
that:
1. `hset` notification will ALWAYS be reported if all command
validations pass.
2. `hexpire` will be reported in case the command include an expiration
time (even past time)
3. `hxpired` will be reported in case the provided expiration time is in
the past (or 0)
4. `hdel` will be reported in case the hash exists (or created as part
of the command) and following the command execution it was left empty.
5. we will always return '1' as a return value of tHSETEX command which
passed all validations. Before that we returned 1 only if we applied the
change cross ALL the input fields, so in case some of them did not exist
and a past time was set we would return 0.

---------

Signed-off-by: Ran Shidlansik <[email protected]>
Co-authored-by: Jacob Murphy <[email protected]>
Signed-off-by: Ran Shidlansik <[email protected]>
zuiderkwast pushed a commit that referenced this pull request Jan 30, 2026
In the original implementation of Hash Field Expiration
(#2089), the HSETEX command was
implemented to report keyspace notifications only for performed changes.
This is mostly aligned with other Hash commands (for example, HDEL will
also not report `hdel` event for items which does not exist)
The HSETEX case is somewhat different and is more like the `HSET` case.
During HSETEX, after the command validations pass, items are ALWAYS
"added" to the object, even though they might not actually be added.
This case is the same for when the hash object is empty or when all the
provided fields do not exist in the object (as reported
[here](#2998))

This PR changes the way `HSETEX` will report keyspace notifications so
that:
1. `hset` notification will ALWAYS be reported if all command
validations pass.
2. `hexpire` will be reported in case the command include an expiration
time (even past time)
3. `hxpired` will be reported in case the provided expiration time is in
the past (or 0)
4. `hdel` will be reported in case the hash exists (or created as part
of the command) and following the command execution it was left empty.
5. we will always return '1' as a return value of tHSETEX command which
passed all validations. Before that we returned 1 only if we applied the
change cross ALL the input fields, so in case some of them did not exist
and a past time was set we would return 0.

---------

Signed-off-by: Ran Shidlansik <[email protected]>
Co-authored-by: Jacob Murphy <[email protected]>
Signed-off-by: Ran Shidlansik <[email protected]>
harrylin98 pushed a commit to harrylin98/valkey_forked that referenced this pull request Feb 19, 2026
…-io#3001)

In the original implementation of Hash Field Expiration
(valkey-io#2089), the HSETEX command was
implemented to report keyspace notifications only for performed changes.
This is mostly aligned with other Hash commands (for example, HDEL will
also not report `hdel` event for items which does not exist)
The HSETEX case is somewhat different and is more like the `HSET` case.
During HSETEX, after the command validations pass, items are ALWAYS
"added" to the object, even though they might not actually be added.
This case is the same for when the hash object is empty or when all the
provided fields do not exist in the object (as reported
[here](valkey-io#2998))

This PR changes the way `HSETEX` will report keyspace notifications so
that:
1. `hset` notification will ALWAYS be reported if all command
validations pass.
2. `hexpire` will be reported in case the command include an expiration
time (even past time)
3. `hxpired` will be reported in case the provided expiration time is in
the past (or 0)
4. `hdel` will be reported in case the hash exists (or created as part
of the command) and following the command execution it was left empty.
5. we will always return '1' as a return value of tHSETEX command which
passed all validations. Before that we returned 1 only if we applied the
change cross ALL the input fields, so in case some of them did not exist
and a past time was set we would return 0.

---------

Signed-off-by: Ran Shidlansik <[email protected]>
Co-authored-by: Jacob Murphy <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working release-notes This issue should get a line item in the release notes

Projects

Status: 9.0.2
Status: Done

Development

Successfully merging this pull request may close these issues.

3 participants