Skip to content

Fix memory corruption in sharded pubsub unsubscribe#2137

Merged
ranshid merged 2 commits intovalkey-io:unstablefrom
uriyage:fix_pubsub_unsubscribe_channel
May 25, 2025
Merged

Fix memory corruption in sharded pubsub unsubscribe#2137
ranshid merged 2 commits intovalkey-io:unstablefrom
uriyage:fix_pubsub_unsubscribe_channel

Conversation

@uriyage
Copy link
Contributor

@uriyage uriyage commented May 25, 2025

This commit fixes two issues in pubsubUnsubscribeChannel that could lead to memory corruption:

  1. When calculating the slot for a channel, we were using getKeySlot() which might use the current_client's slot if available. This is problematic when a client kills another client (e.g., via CLIENT KILL command) as the slot won't match the channel's actual slot.

  2. The found variable was not initialized to NULL, causing the serverAssert to potentially pass incorrectly when the hashtable lookup failed, leading to memory corruption in subsequent operations.

The fix:

  • Calculate the slot directly from the channel name using keyHashSlot() instead of relying on the current client's slot
  • Initialize 'found' to NULL

Added a test case that reproduces the issue by having one client kill another client that is subscribed to a sharded pubsub channel during a transaction.

Crash log (After initializing the variable 'found' to null, without initialization, memory corruption could occur):

VALKEY BUG REPORT START: Cut & paste starting from here ===
59707:M 24 May 2025 23:10:40.429 # === ASSERTION FAILED CLIENT CONTEXT ===
59707:M 24 May 2025 23:10:40.429 # client->flags = 108086391057154048
59707:M 24 May 2025 23:10:40.429 # client->conn = fd=11
59707:M 24 May 2025 23:10:40.429 # client->argc = 0
59707:M 24 May 2025 23:10:40.429 # === RECURSIVE ASSERTION FAILED ===
59707:M 24 May 2025 23:10:40.429 # ==> pubsub.c:348 'found' is not true

------ STACK TRACE ------

Backtrace:
0   valkey-server                       0x0000000104974054 _serverAssertWithInfo + 112
1   valkey-server                       0x000000010496c7fc pubsubUnsubscribeChannel + 268
2   valkey-server                       0x000000010496cea0 pubsubUnsubscribeAllChannelsInternal + 216
3   valkey-server                       0x000000010496c2e0 pubsubUnsubscribeShardAllChannels + 76
4   valkey-server                       0x000000010496c1d4 freeClientPubSubData + 60
5   valkey-server                       0x00000001048f3cbc freeClient + 792
6   valkey-server                       0x0000000104900870 clientKillCommand + 356
7   valkey-server                       0x00000001048d1790 call + 428
8   valkey-server                       0x000000010496ef4c execCommand + 872
9   valkey-server                       0x00000001048d1790 call + 428
10  valkey-server                       0x00000001048d3a44 processCommand + 5056
11  valkey-server                       0x00000001048fdc20 processCommandAndResetClient + 64
12  valkey-server                       0x00000001048fdeac processInputBuffer + 276
13  valkey-server                       0x00000001048f2ff0 readQueryFromClient + 148
14  valkey-server                       0x0000000104a182e8 callHandler + 60
15  valkey-server                       0x0000000104a1731c connSocketEventHandler + 488
16  valkey-server                       0x00000001048b5e80 aeProcessEvents + 820
17  valkey-server                       0x00000001048b6598 aeMain + 64
18  valkey-server                       0x00000001048dcecc main + 4084
19  dyld                                0x0000000186b34274 start + 2840

This commit fixes two issues in pubsubUnsubscribeChannel that could lead to
memory corruption:

1. The 'found' variable was not initialized to NULL, causing the serverAssert
   to potentially pass incorrectly when the hashtable lookup failed, leading
   to memory corruption in subsequent operations.

2. When calculating the slot for a channel, we were using getKeySlot() which
   might use the current_client's slot if available. This is problematic when
   a client kills another client (e.g., via CLIENT KILL command) as the slot
   won't match the channel's actual slot.

The fix:
- Initialize 'found' to NULL
- Calculate the slot directly from the channel name using keyHashSlot()
  instead of relying on the current client's slot

Added a test case that reproduces the issue by having one client kill another
client that is subscribed to a sharded pubsub channel during a transaction.

Signed-off-by: Uri Yagelnik <[email protected]>
@uriyage uriyage force-pushed the fix_pubsub_unsubscribe_channel branch from 5b98c2b to 644c7f2 Compare May 25, 2025 09:16
@codecov
Copy link

codecov bot commented May 25, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 71.03%. Comparing base (053bf9e) to head (5d32f50).
Report is 2 commits behind head on unstable.

Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #2137      +/-   ##
============================================
- Coverage     71.18%   71.03%   -0.16%     
============================================
  Files           122      122              
  Lines         66046    66045       -1     
============================================
- Hits          47013    46912     -101     
- Misses        19033    19133     +100     
Files with missing lines Coverage Δ
src/pubsub.c 96.79% <100.00%> (ø)

... and 12 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Member

@ranshid ranshid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch (maybe state this is related to the fuzzer infrastructure you are about to present soon)

Overall LGTM, minor comments

Signed-off-by: Uri Yagelnik <[email protected]>
@ranshid ranshid merged commit bd5dcb2 into valkey-io:unstable May 25, 2025
50 of 51 checks passed
@ranshid ranshid added the release-notes This issue should get a line item in the release notes label May 25, 2025
@ranshid ranshid removed this from Valkey 7.2 May 25, 2025
@ranshid ranshid moved this to To be backported in Valkey 8.0 May 25, 2025
@ranshid ranshid moved this to To be backported in Valkey 8.1 May 25, 2025
@ranshid ranshid added the bug Something isn't working label May 25, 2025
hpatro pushed a commit to hpatro/valkey that referenced this pull request Jun 4, 2025
This commit fixes two issues in `pubsubUnsubscribeChannel` that could
lead to memory corruption:

1. When calculating the slot for a channel, we were using getKeySlot()
which might use the current_client's slot if available. This is
problematic when a client kills another client (e.g., via CLIENT KILL
command) as the slot won't match the channel's actual slot.

2. The `found` variable was not initialized to `NULL`, causing the
serverAssert to potentially pass incorrectly when the hashtable lookup
failed, leading to memory corruption in subsequent operations.


The fix:
- Calculate the slot directly from the channel name using keyHashSlot()
instead of relying on the current client's slot
- Initialize 'found' to NULL

Added a test case that reproduces the issue by having one client kill
another client that is subscribed to a sharded pubsub channel during a
transaction.


Crash log (After initializing the variable 'found' to null, without
initialization, memory corruption could occur):
```
VALKEY BUG REPORT START: Cut & paste starting from here ===
59707:M 24 May 2025 23:10:40.429 # === ASSERTION FAILED CLIENT CONTEXT ===
59707:M 24 May 2025 23:10:40.429 # client->flags = 108086391057154048
59707:M 24 May 2025 23:10:40.429 # client->conn = fd=11
59707:M 24 May 2025 23:10:40.429 # client->argc = 0
59707:M 24 May 2025 23:10:40.429 # === RECURSIVE ASSERTION FAILED ===
59707:M 24 May 2025 23:10:40.429 # ==> pubsub.c:348 'found' is not true

------ STACK TRACE ------

Backtrace:
0   valkey-server                       0x0000000104974054 _serverAssertWithInfo + 112
1   valkey-server                       0x000000010496c7fc pubsubUnsubscribeChannel + 268
2   valkey-server                       0x000000010496cea0 pubsubUnsubscribeAllChannelsInternal + 216
3   valkey-server                       0x000000010496c2e0 pubsubUnsubscribeShardAllChannels + 76
4   valkey-server                       0x000000010496c1d4 freeClientPubSubData + 60
5   valkey-server                       0x00000001048f3cbc freeClient + 792
6   valkey-server                       0x0000000104900870 clientKillCommand + 356
7   valkey-server                       0x00000001048d1790 call + 428
8   valkey-server                       0x000000010496ef4c execCommand + 872
9   valkey-server                       0x00000001048d1790 call + 428
10  valkey-server                       0x00000001048d3a44 processCommand + 5056
11  valkey-server                       0x00000001048fdc20 processCommandAndResetClient + 64
12  valkey-server                       0x00000001048fdeac processInputBuffer + 276
13  valkey-server                       0x00000001048f2ff0 readQueryFromClient + 148
14  valkey-server                       0x0000000104a182e8 callHandler + 60
15  valkey-server                       0x0000000104a1731c connSocketEventHandler + 488
16  valkey-server                       0x00000001048b5e80 aeProcessEvents + 820
17  valkey-server                       0x00000001048b6598 aeMain + 64
18  valkey-server                       0x00000001048dcecc main + 4084
19  dyld                                0x0000000186b34274 start + 2840
````

---------

Signed-off-by: Uri Yagelnik <[email protected]>
hpatro pushed a commit to hpatro/valkey that referenced this pull request Jun 4, 2025
This commit fixes two issues in `pubsubUnsubscribeChannel` that could
lead to memory corruption:

1. When calculating the slot for a channel, we were using getKeySlot()
which might use the current_client's slot if available. This is
problematic when a client kills another client (e.g., via CLIENT KILL
command) as the slot won't match the channel's actual slot.

2. The `found` variable was not initialized to `NULL`, causing the
serverAssert to potentially pass incorrectly when the hashtable lookup
failed, leading to memory corruption in subsequent operations.

The fix:
- Calculate the slot directly from the channel name using keyHashSlot()
instead of relying on the current client's slot
- Initialize 'found' to NULL

Added a test case that reproduces the issue by having one client kill
another client that is subscribed to a sharded pubsub channel during a
transaction.

Crash log (After initializing the variable 'found' to null, without
initialization, memory corruption could occur):
```
VALKEY BUG REPORT START: Cut & paste starting from here ===
59707:M 24 May 2025 23:10:40.429 # === ASSERTION FAILED CLIENT CONTEXT ===
59707:M 24 May 2025 23:10:40.429 # client->flags = 108086391057154048
59707:M 24 May 2025 23:10:40.429 # client->conn = fd=11
59707:M 24 May 2025 23:10:40.429 # client->argc = 0
59707:M 24 May 2025 23:10:40.429 # === RECURSIVE ASSERTION FAILED ===
59707:M 24 May 2025 23:10:40.429 # ==> pubsub.c:348 'found' is not true

------ STACK TRACE ------

Backtrace:
0   valkey-server                       0x0000000104974054 _serverAssertWithInfo + 112
1   valkey-server                       0x000000010496c7fc pubsubUnsubscribeChannel + 268
2   valkey-server                       0x000000010496cea0 pubsubUnsubscribeAllChannelsInternal + 216
3   valkey-server                       0x000000010496c2e0 pubsubUnsubscribeShardAllChannels + 76
4   valkey-server                       0x000000010496c1d4 freeClientPubSubData + 60
5   valkey-server                       0x00000001048f3cbc freeClient + 792
6   valkey-server                       0x0000000104900870 clientKillCommand + 356
7   valkey-server                       0x00000001048d1790 call + 428
8   valkey-server                       0x000000010496ef4c execCommand + 872
9   valkey-server                       0x00000001048d1790 call + 428
10  valkey-server                       0x00000001048d3a44 processCommand + 5056
11  valkey-server                       0x00000001048fdc20 processCommandAndResetClient + 64
12  valkey-server                       0x00000001048fdeac processInputBuffer + 276
13  valkey-server                       0x00000001048f2ff0 readQueryFromClient + 148
14  valkey-server                       0x0000000104a182e8 callHandler + 60
15  valkey-server                       0x0000000104a1731c connSocketEventHandler + 488
16  valkey-server                       0x00000001048b5e80 aeProcessEvents + 820
17  valkey-server                       0x00000001048b6598 aeMain + 64
18  valkey-server                       0x00000001048dcecc main + 4084
19  dyld                                0x0000000186b34274 start + 2840
````

---------

Signed-off-by: Uri Yagelnik <[email protected]>
Signed-off-by: Harkrishn Patro <[email protected]>
hpatro pushed a commit that referenced this pull request Jun 9, 2025
This commit fixes two issues in `pubsubUnsubscribeChannel` that could
lead to memory corruption:

1. When calculating the slot for a channel, we were using getKeySlot()
which might use the current_client's slot if available. This is
problematic when a client kills another client (e.g., via CLIENT KILL
command) as the slot won't match the channel's actual slot.

2. The `found` variable was not initialized to `NULL`, causing the
serverAssert to potentially pass incorrectly when the hashtable lookup
failed, leading to memory corruption in subsequent operations.

The fix:
- Calculate the slot directly from the channel name using keyHashSlot()
instead of relying on the current client's slot
- Initialize 'found' to NULL

Added a test case that reproduces the issue by having one client kill
another client that is subscribed to a sharded pubsub channel during a
transaction.

Crash log (After initializing the variable 'found' to null, without
initialization, memory corruption could occur):
```
VALKEY BUG REPORT START: Cut & paste starting from here ===
59707:M 24 May 2025 23:10:40.429 # === ASSERTION FAILED CLIENT CONTEXT ===
59707:M 24 May 2025 23:10:40.429 # client->flags = 108086391057154048
59707:M 24 May 2025 23:10:40.429 # client->conn = fd=11
59707:M 24 May 2025 23:10:40.429 # client->argc = 0
59707:M 24 May 2025 23:10:40.429 # === RECURSIVE ASSERTION FAILED ===
59707:M 24 May 2025 23:10:40.429 # ==> pubsub.c:348 'found' is not true

------ STACK TRACE ------

Backtrace:
0   valkey-server                       0x0000000104974054 _serverAssertWithInfo + 112
1   valkey-server                       0x000000010496c7fc pubsubUnsubscribeChannel + 268
2   valkey-server                       0x000000010496cea0 pubsubUnsubscribeAllChannelsInternal + 216
3   valkey-server                       0x000000010496c2e0 pubsubUnsubscribeShardAllChannels + 76
4   valkey-server                       0x000000010496c1d4 freeClientPubSubData + 60
5   valkey-server                       0x00000001048f3cbc freeClient + 792
6   valkey-server                       0x0000000104900870 clientKillCommand + 356
7   valkey-server                       0x00000001048d1790 call + 428
8   valkey-server                       0x000000010496ef4c execCommand + 872
9   valkey-server                       0x00000001048d1790 call + 428
10  valkey-server                       0x00000001048d3a44 processCommand + 5056
11  valkey-server                       0x00000001048fdc20 processCommandAndResetClient + 64
12  valkey-server                       0x00000001048fdeac processInputBuffer + 276
13  valkey-server                       0x00000001048f2ff0 readQueryFromClient + 148
14  valkey-server                       0x0000000104a182e8 callHandler + 60
15  valkey-server                       0x0000000104a1731c connSocketEventHandler + 488
16  valkey-server                       0x00000001048b5e80 aeProcessEvents + 820
17  valkey-server                       0x00000001048b6598 aeMain + 64
18  valkey-server                       0x00000001048dcecc main + 4084
19  dyld                                0x0000000186b34274 start + 2840
````

---------

Signed-off-by: Uri Yagelnik <[email protected]>
Signed-off-by: Harkrishn Patro <[email protected]>
hpatro pushed a commit that referenced this pull request Jun 11, 2025
This commit fixes two issues in `pubsubUnsubscribeChannel` that could
lead to memory corruption:

1. When calculating the slot for a channel, we were using getKeySlot()
which might use the current_client's slot if available. This is
problematic when a client kills another client (e.g., via CLIENT KILL
command) as the slot won't match the channel's actual slot.

2. The `found` variable was not initialized to `NULL`, causing the
serverAssert to potentially pass incorrectly when the hashtable lookup
failed, leading to memory corruption in subsequent operations.

The fix:
- Calculate the slot directly from the channel name using keyHashSlot()
instead of relying on the current client's slot
- Initialize 'found' to NULL

Added a test case that reproduces the issue by having one client kill
another client that is subscribed to a sharded pubsub channel during a
transaction.

Crash log (After initializing the variable 'found' to null, without
initialization, memory corruption could occur):
```
VALKEY BUG REPORT START: Cut & paste starting from here ===
59707:M 24 May 2025 23:10:40.429 # === ASSERTION FAILED CLIENT CONTEXT ===
59707:M 24 May 2025 23:10:40.429 # client->flags = 108086391057154048
59707:M 24 May 2025 23:10:40.429 # client->conn = fd=11
59707:M 24 May 2025 23:10:40.429 # client->argc = 0
59707:M 24 May 2025 23:10:40.429 # === RECURSIVE ASSERTION FAILED ===
59707:M 24 May 2025 23:10:40.429 # ==> pubsub.c:348 'found' is not true

------ STACK TRACE ------

Backtrace:
0   valkey-server                       0x0000000104974054 _serverAssertWithInfo + 112
1   valkey-server                       0x000000010496c7fc pubsubUnsubscribeChannel + 268
2   valkey-server                       0x000000010496cea0 pubsubUnsubscribeAllChannelsInternal + 216
3   valkey-server                       0x000000010496c2e0 pubsubUnsubscribeShardAllChannels + 76
4   valkey-server                       0x000000010496c1d4 freeClientPubSubData + 60
5   valkey-server                       0x00000001048f3cbc freeClient + 792
6   valkey-server                       0x0000000104900870 clientKillCommand + 356
7   valkey-server                       0x00000001048d1790 call + 428
8   valkey-server                       0x000000010496ef4c execCommand + 872
9   valkey-server                       0x00000001048d1790 call + 428
10  valkey-server                       0x00000001048d3a44 processCommand + 5056
11  valkey-server                       0x00000001048fdc20 processCommandAndResetClient + 64
12  valkey-server                       0x00000001048fdeac processInputBuffer + 276
13  valkey-server                       0x00000001048f2ff0 readQueryFromClient + 148
14  valkey-server                       0x0000000104a182e8 callHandler + 60
15  valkey-server                       0x0000000104a1731c connSocketEventHandler + 488
16  valkey-server                       0x00000001048b5e80 aeProcessEvents + 820
17  valkey-server                       0x00000001048b6598 aeMain + 64
18  valkey-server                       0x00000001048dcecc main + 4084
19  dyld                                0x0000000186b34274 start + 2840
````

---------

Signed-off-by: Uri Yagelnik <[email protected]>
Signed-off-by: Harkrishn Patro <[email protected]>
@hpatro hpatro moved this from To be backported to 8.1.2 in Valkey 8.1 Jun 11, 2025
vitarb pushed a commit to vitarb/valkey that referenced this pull request Jun 12, 2025
This commit fixes two issues in `pubsubUnsubscribeChannel` that could
lead to memory corruption:

1. When calculating the slot for a channel, we were using getKeySlot()
which might use the current_client's slot if available. This is
problematic when a client kills another client (e.g., via CLIENT KILL
command) as the slot won't match the channel's actual slot.

2. The `found` variable was not initialized to `NULL`, causing the
serverAssert to potentially pass incorrectly when the hashtable lookup
failed, leading to memory corruption in subsequent operations.

The fix:
- Calculate the slot directly from the channel name using keyHashSlot()
instead of relying on the current client's slot
- Initialize 'found' to NULL

Added a test case that reproduces the issue by having one client kill
another client that is subscribed to a sharded pubsub channel during a
transaction.

Crash log (After initializing the variable 'found' to null, without
initialization, memory corruption could occur):
```
VALKEY BUG REPORT START: Cut & paste starting from here ===
59707:M 24 May 2025 23:10:40.429 # === ASSERTION FAILED CLIENT CONTEXT ===
59707:M 24 May 2025 23:10:40.429 # client->flags = 108086391057154048
59707:M 24 May 2025 23:10:40.429 # client->conn = fd=11
59707:M 24 May 2025 23:10:40.429 # client->argc = 0
59707:M 24 May 2025 23:10:40.429 # === RECURSIVE ASSERTION FAILED ===
59707:M 24 May 2025 23:10:40.429 # ==> pubsub.c:348 'found' is not true

------ STACK TRACE ------

Backtrace:
0   valkey-server                       0x0000000104974054 _serverAssertWithInfo + 112
1   valkey-server                       0x000000010496c7fc pubsubUnsubscribeChannel + 268
2   valkey-server                       0x000000010496cea0 pubsubUnsubscribeAllChannelsInternal + 216
3   valkey-server                       0x000000010496c2e0 pubsubUnsubscribeShardAllChannels + 76
4   valkey-server                       0x000000010496c1d4 freeClientPubSubData + 60
5   valkey-server                       0x00000001048f3cbc freeClient + 792
6   valkey-server                       0x0000000104900870 clientKillCommand + 356
7   valkey-server                       0x00000001048d1790 call + 428
8   valkey-server                       0x000000010496ef4c execCommand + 872
9   valkey-server                       0x00000001048d1790 call + 428
10  valkey-server                       0x00000001048d3a44 processCommand + 5056
11  valkey-server                       0x00000001048fdc20 processCommandAndResetClient + 64
12  valkey-server                       0x00000001048fdeac processInputBuffer + 276
13  valkey-server                       0x00000001048f2ff0 readQueryFromClient + 148
14  valkey-server                       0x0000000104a182e8 callHandler + 60
15  valkey-server                       0x0000000104a1731c connSocketEventHandler + 488
16  valkey-server                       0x00000001048b5e80 aeProcessEvents + 820
17  valkey-server                       0x00000001048b6598 aeMain + 64
18  valkey-server                       0x00000001048dcecc main + 4084
19  dyld                                0x0000000186b34274 start + 2840
````

---------

Signed-off-by: Uri Yagelnik <[email protected]>
(cherry picked from commit bd5dcb2)
@vitarb vitarb mentioned this pull request Jun 13, 2025
vitarb pushed a commit to vitarb/valkey that referenced this pull request Jun 13, 2025
This commit fixes two issues in `pubsubUnsubscribeChannel` that could
lead to memory corruption:

1. When calculating the slot for a channel, we were using getKeySlot()
which might use the current_client's slot if available. This is
problematic when a client kills another client (e.g., via CLIENT KILL
command) as the slot won't match the channel's actual slot.

2. The `found` variable was not initialized to `NULL`, causing the
serverAssert to potentially pass incorrectly when the hashtable lookup
failed, leading to memory corruption in subsequent operations.

The fix:
- Calculate the slot directly from the channel name using keyHashSlot()
instead of relying on the current client's slot
- Initialize 'found' to NULL

Added a test case that reproduces the issue by having one client kill
another client that is subscribed to a sharded pubsub channel during a
transaction.

Crash log (After initializing the variable 'found' to null, without
initialization, memory corruption could occur):
```
VALKEY BUG REPORT START: Cut & paste starting from here ===
59707:M 24 May 2025 23:10:40.429 # === ASSERTION FAILED CLIENT CONTEXT ===
59707:M 24 May 2025 23:10:40.429 # client->flags = 108086391057154048
59707:M 24 May 2025 23:10:40.429 # client->conn = fd=11
59707:M 24 May 2025 23:10:40.429 # client->argc = 0
59707:M 24 May 2025 23:10:40.429 # === RECURSIVE ASSERTION FAILED ===
59707:M 24 May 2025 23:10:40.429 # ==> pubsub.c:348 'found' is not true

------ STACK TRACE ------

Backtrace:
0   valkey-server                       0x0000000104974054 _serverAssertWithInfo + 112
1   valkey-server                       0x000000010496c7fc pubsubUnsubscribeChannel + 268
2   valkey-server                       0x000000010496cea0 pubsubUnsubscribeAllChannelsInternal + 216
3   valkey-server                       0x000000010496c2e0 pubsubUnsubscribeShardAllChannels + 76
4   valkey-server                       0x000000010496c1d4 freeClientPubSubData + 60
5   valkey-server                       0x00000001048f3cbc freeClient + 792
6   valkey-server                       0x0000000104900870 clientKillCommand + 356
7   valkey-server                       0x00000001048d1790 call + 428
8   valkey-server                       0x000000010496ef4c execCommand + 872
9   valkey-server                       0x00000001048d1790 call + 428
10  valkey-server                       0x00000001048d3a44 processCommand + 5056
11  valkey-server                       0x00000001048fdc20 processCommandAndResetClient + 64
12  valkey-server                       0x00000001048fdeac processInputBuffer + 276
13  valkey-server                       0x00000001048f2ff0 readQueryFromClient + 148
14  valkey-server                       0x0000000104a182e8 callHandler + 60
15  valkey-server                       0x0000000104a1731c connSocketEventHandler + 488
16  valkey-server                       0x00000001048b5e80 aeProcessEvents + 820
17  valkey-server                       0x00000001048b6598 aeMain + 64
18  valkey-server                       0x00000001048dcecc main + 4084
19  dyld                                0x0000000186b34274 start + 2840
````

---------

Signed-off-by: Uri Yagelnik <[email protected]>
(cherry picked from commit bd5dcb2)
shanwan1 pushed a commit to shanwan1/valkey that referenced this pull request Jun 13, 2025
This commit fixes two issues in `pubsubUnsubscribeChannel` that could
lead to memory corruption:

1. When calculating the slot for a channel, we were using getKeySlot()
which might use the current_client's slot if available. This is
problematic when a client kills another client (e.g., via CLIENT KILL
command) as the slot won't match the channel's actual slot.

2. The `found` variable was not initialized to `NULL`, causing the
serverAssert to potentially pass incorrectly when the hashtable lookup
failed, leading to memory corruption in subsequent operations.

The fix:
- Calculate the slot directly from the channel name using keyHashSlot()
instead of relying on the current client's slot
- Initialize 'found' to NULL

Added a test case that reproduces the issue by having one client kill
another client that is subscribed to a sharded pubsub channel during a
transaction.

Crash log (After initializing the variable 'found' to null, without
initialization, memory corruption could occur):
```
VALKEY BUG REPORT START: Cut & paste starting from here ===
59707:M 24 May 2025 23:10:40.429 # === ASSERTION FAILED CLIENT CONTEXT ===
59707:M 24 May 2025 23:10:40.429 # client->flags = 108086391057154048
59707:M 24 May 2025 23:10:40.429 # client->conn = fd=11
59707:M 24 May 2025 23:10:40.429 # client->argc = 0
59707:M 24 May 2025 23:10:40.429 # === RECURSIVE ASSERTION FAILED ===
59707:M 24 May 2025 23:10:40.429 # ==> pubsub.c:348 'found' is not true

------ STACK TRACE ------

Backtrace:
0   valkey-server                       0x0000000104974054 _serverAssertWithInfo + 112
1   valkey-server                       0x000000010496c7fc pubsubUnsubscribeChannel + 268
2   valkey-server                       0x000000010496cea0 pubsubUnsubscribeAllChannelsInternal + 216
3   valkey-server                       0x000000010496c2e0 pubsubUnsubscribeShardAllChannels + 76
4   valkey-server                       0x000000010496c1d4 freeClientPubSubData + 60
5   valkey-server                       0x00000001048f3cbc freeClient + 792
6   valkey-server                       0x0000000104900870 clientKillCommand + 356
7   valkey-server                       0x00000001048d1790 call + 428
8   valkey-server                       0x000000010496ef4c execCommand + 872
9   valkey-server                       0x00000001048d1790 call + 428
10  valkey-server                       0x00000001048d3a44 processCommand + 5056
11  valkey-server                       0x00000001048fdc20 processCommandAndResetClient + 64
12  valkey-server                       0x00000001048fdeac processInputBuffer + 276
13  valkey-server                       0x00000001048f2ff0 readQueryFromClient + 148
14  valkey-server                       0x0000000104a182e8 callHandler + 60
15  valkey-server                       0x0000000104a1731c connSocketEventHandler + 488
16  valkey-server                       0x00000001048b5e80 aeProcessEvents + 820
17  valkey-server                       0x00000001048b6598 aeMain + 64
18  valkey-server                       0x00000001048dcecc main + 4084
19  dyld                                0x0000000186b34274 start + 2840
````

---------

Signed-off-by: Uri Yagelnik <[email protected]>
Signed-off-by: shanwan1 <[email protected]>
@uriyage uriyage mentioned this pull request Jul 10, 2025
zuiderkwast pushed a commit to vitarb/valkey that referenced this pull request Aug 15, 2025
This commit fixes two issues in `pubsubUnsubscribeChannel` that could
lead to memory corruption:

1. When calculating the slot for a channel, we were using getKeySlot()
which might use the current_client's slot if available. This is
problematic when a client kills another client (e.g., via CLIENT KILL
command) as the slot won't match the channel's actual slot.

2. The `found` variable was not initialized to `NULL`, causing the
serverAssert to potentially pass incorrectly when the hashtable lookup
failed, leading to memory corruption in subsequent operations.

The fix:
- Calculate the slot directly from the channel name using keyHashSlot()
instead of relying on the current client's slot
- Initialize 'found' to NULL

Added a test case that reproduces the issue by having one client kill
another client that is subscribed to a sharded pubsub channel during a
transaction.

Crash log (After initializing the variable 'found' to null, without
initialization, memory corruption could occur):
```
VALKEY BUG REPORT START: Cut & paste starting from here ===
59707:M 24 May 2025 23:10:40.429 # === ASSERTION FAILED CLIENT CONTEXT ===
59707:M 24 May 2025 23:10:40.429 # client->flags = 108086391057154048
59707:M 24 May 2025 23:10:40.429 # client->conn = fd=11
59707:M 24 May 2025 23:10:40.429 # client->argc = 0
59707:M 24 May 2025 23:10:40.429 # === RECURSIVE ASSERTION FAILED ===
59707:M 24 May 2025 23:10:40.429 # ==> pubsub.c:348 'found' is not true

------ STACK TRACE ------

Backtrace:
0   valkey-server                       0x0000000104974054 _serverAssertWithInfo + 112
1   valkey-server                       0x000000010496c7fc pubsubUnsubscribeChannel + 268
2   valkey-server                       0x000000010496cea0 pubsubUnsubscribeAllChannelsInternal + 216
3   valkey-server                       0x000000010496c2e0 pubsubUnsubscribeShardAllChannels + 76
4   valkey-server                       0x000000010496c1d4 freeClientPubSubData + 60
5   valkey-server                       0x00000001048f3cbc freeClient + 792
6   valkey-server                       0x0000000104900870 clientKillCommand + 356
7   valkey-server                       0x00000001048d1790 call + 428
8   valkey-server                       0x000000010496ef4c execCommand + 872
9   valkey-server                       0x00000001048d1790 call + 428
10  valkey-server                       0x00000001048d3a44 processCommand + 5056
11  valkey-server                       0x00000001048fdc20 processCommandAndResetClient + 64
12  valkey-server                       0x00000001048fdeac processInputBuffer + 276
13  valkey-server                       0x00000001048f2ff0 readQueryFromClient + 148
14  valkey-server                       0x0000000104a182e8 callHandler + 60
15  valkey-server                       0x0000000104a1731c connSocketEventHandler + 488
16  valkey-server                       0x00000001048b5e80 aeProcessEvents + 820
17  valkey-server                       0x00000001048b6598 aeMain + 64
18  valkey-server                       0x00000001048dcecc main + 4084
19  dyld                                0x0000000186b34274 start + 2840
````

---------

Signed-off-by: Uri Yagelnik <[email protected]>
(cherry picked from commit bd5dcb2)
zuiderkwast pushed a commit to vitarb/valkey that referenced this pull request Aug 15, 2025
This commit fixes two issues in `pubsubUnsubscribeChannel` that could
lead to memory corruption:

1. When calculating the slot for a channel, we were using getKeySlot()
which might use the current_client's slot if available. This is
problematic when a client kills another client (e.g., via CLIENT KILL
command) as the slot won't match the channel's actual slot.

2. The `found` variable was not initialized to `NULL`, causing the
serverAssert to potentially pass incorrectly when the hashtable lookup
failed, leading to memory corruption in subsequent operations.

The fix:
- Calculate the slot directly from the channel name using keyHashSlot()
instead of relying on the current client's slot
- Initialize 'found' to NULL

Added a test case that reproduces the issue by having one client kill
another client that is subscribed to a sharded pubsub channel during a
transaction.

Crash log (After initializing the variable 'found' to null, without
initialization, memory corruption could occur):
```
VALKEY BUG REPORT START: Cut & paste starting from here ===
59707:M 24 May 2025 23:10:40.429 # === ASSERTION FAILED CLIENT CONTEXT ===
59707:M 24 May 2025 23:10:40.429 # client->flags = 108086391057154048
59707:M 24 May 2025 23:10:40.429 # client->conn = fd=11
59707:M 24 May 2025 23:10:40.429 # client->argc = 0
59707:M 24 May 2025 23:10:40.429 # === RECURSIVE ASSERTION FAILED ===
59707:M 24 May 2025 23:10:40.429 # ==> pubsub.c:348 'found' is not true

------ STACK TRACE ------

Backtrace:
0   valkey-server                       0x0000000104974054 _serverAssertWithInfo + 112
1   valkey-server                       0x000000010496c7fc pubsubUnsubscribeChannel + 268
2   valkey-server                       0x000000010496cea0 pubsubUnsubscribeAllChannelsInternal + 216
3   valkey-server                       0x000000010496c2e0 pubsubUnsubscribeShardAllChannels + 76
4   valkey-server                       0x000000010496c1d4 freeClientPubSubData + 60
5   valkey-server                       0x00000001048f3cbc freeClient + 792
6   valkey-server                       0x0000000104900870 clientKillCommand + 356
7   valkey-server                       0x00000001048d1790 call + 428
8   valkey-server                       0x000000010496ef4c execCommand + 872
9   valkey-server                       0x00000001048d1790 call + 428
10  valkey-server                       0x00000001048d3a44 processCommand + 5056
11  valkey-server                       0x00000001048fdc20 processCommandAndResetClient + 64
12  valkey-server                       0x00000001048fdeac processInputBuffer + 276
13  valkey-server                       0x00000001048f2ff0 readQueryFromClient + 148
14  valkey-server                       0x0000000104a182e8 callHandler + 60
15  valkey-server                       0x0000000104a1731c connSocketEventHandler + 488
16  valkey-server                       0x00000001048b5e80 aeProcessEvents + 820
17  valkey-server                       0x00000001048b6598 aeMain + 64
18  valkey-server                       0x00000001048dcecc main + 4084
19  dyld                                0x0000000186b34274 start + 2840
````

---------

Signed-off-by: Uri Yagelnik <[email protected]>
(cherry picked from commit bd5dcb2)
Signed-off-by: Viktor Söderqvist <[email protected]>
@zuiderkwast zuiderkwast moved this from To be backported to 8.0.5 in Valkey 8.0 Aug 18, 2025
zuiderkwast pushed a commit to vitarb/valkey that referenced this pull request Aug 21, 2025
This commit fixes two issues in `pubsubUnsubscribeChannel` that could
lead to memory corruption:

1. When calculating the slot for a channel, we were using getKeySlot()
which might use the current_client's slot if available. This is
problematic when a client kills another client (e.g., via CLIENT KILL
command) as the slot won't match the channel's actual slot.

2. The `found` variable was not initialized to `NULL`, causing the
serverAssert to potentially pass incorrectly when the hashtable lookup
failed, leading to memory corruption in subsequent operations.

The fix:
- Calculate the slot directly from the channel name using keyHashSlot()
instead of relying on the current client's slot
- Initialize 'found' to NULL

Added a test case that reproduces the issue by having one client kill
another client that is subscribed to a sharded pubsub channel during a
transaction.

Crash log (After initializing the variable 'found' to null, without
initialization, memory corruption could occur):
```
VALKEY BUG REPORT START: Cut & paste starting from here ===
59707:M 24 May 2025 23:10:40.429 # === ASSERTION FAILED CLIENT CONTEXT ===
59707:M 24 May 2025 23:10:40.429 # client->flags = 108086391057154048
59707:M 24 May 2025 23:10:40.429 # client->conn = fd=11
59707:M 24 May 2025 23:10:40.429 # client->argc = 0
59707:M 24 May 2025 23:10:40.429 # === RECURSIVE ASSERTION FAILED ===
59707:M 24 May 2025 23:10:40.429 # ==> pubsub.c:348 'found' is not true

------ STACK TRACE ------

Backtrace:
0   valkey-server                       0x0000000104974054 _serverAssertWithInfo + 112
1   valkey-server                       0x000000010496c7fc pubsubUnsubscribeChannel + 268
2   valkey-server                       0x000000010496cea0 pubsubUnsubscribeAllChannelsInternal + 216
3   valkey-server                       0x000000010496c2e0 pubsubUnsubscribeShardAllChannels + 76
4   valkey-server                       0x000000010496c1d4 freeClientPubSubData + 60
5   valkey-server                       0x00000001048f3cbc freeClient + 792
6   valkey-server                       0x0000000104900870 clientKillCommand + 356
7   valkey-server                       0x00000001048d1790 call + 428
8   valkey-server                       0x000000010496ef4c execCommand + 872
9   valkey-server                       0x00000001048d1790 call + 428
10  valkey-server                       0x00000001048d3a44 processCommand + 5056
11  valkey-server                       0x00000001048fdc20 processCommandAndResetClient + 64
12  valkey-server                       0x00000001048fdeac processInputBuffer + 276
13  valkey-server                       0x00000001048f2ff0 readQueryFromClient + 148
14  valkey-server                       0x0000000104a182e8 callHandler + 60
15  valkey-server                       0x0000000104a1731c connSocketEventHandler + 488
16  valkey-server                       0x00000001048b5e80 aeProcessEvents + 820
17  valkey-server                       0x00000001048b6598 aeMain + 64
18  valkey-server                       0x00000001048dcecc main + 4084
19  dyld                                0x0000000186b34274 start + 2840
````

---------

Signed-off-by: Uri Yagelnik <[email protected]>
(cherry picked from commit bd5dcb2)
Signed-off-by: Viktor Söderqvist <[email protected]>
zuiderkwast pushed a commit that referenced this pull request Aug 22, 2025
This commit fixes two issues in `pubsubUnsubscribeChannel` that could
lead to memory corruption:

1. When calculating the slot for a channel, we were using getKeySlot()
which might use the current_client's slot if available. This is
problematic when a client kills another client (e.g., via CLIENT KILL
command) as the slot won't match the channel's actual slot.

2. The `found` variable was not initialized to `NULL`, causing the
serverAssert to potentially pass incorrectly when the hashtable lookup
failed, leading to memory corruption in subsequent operations.

The fix:
- Calculate the slot directly from the channel name using keyHashSlot()
instead of relying on the current client's slot
- Initialize 'found' to NULL

Added a test case that reproduces the issue by having one client kill
another client that is subscribed to a sharded pubsub channel during a
transaction.

Crash log (After initializing the variable 'found' to null, without
initialization, memory corruption could occur):
```
VALKEY BUG REPORT START: Cut & paste starting from here ===
59707:M 24 May 2025 23:10:40.429 # === ASSERTION FAILED CLIENT CONTEXT ===
59707:M 24 May 2025 23:10:40.429 # client->flags = 108086391057154048
59707:M 24 May 2025 23:10:40.429 # client->conn = fd=11
59707:M 24 May 2025 23:10:40.429 # client->argc = 0
59707:M 24 May 2025 23:10:40.429 # === RECURSIVE ASSERTION FAILED ===
59707:M 24 May 2025 23:10:40.429 # ==> pubsub.c:348 'found' is not true

------ STACK TRACE ------

Backtrace:
0   valkey-server                       0x0000000104974054 _serverAssertWithInfo + 112
1   valkey-server                       0x000000010496c7fc pubsubUnsubscribeChannel + 268
2   valkey-server                       0x000000010496cea0 pubsubUnsubscribeAllChannelsInternal + 216
3   valkey-server                       0x000000010496c2e0 pubsubUnsubscribeShardAllChannels + 76
4   valkey-server                       0x000000010496c1d4 freeClientPubSubData + 60
5   valkey-server                       0x00000001048f3cbc freeClient + 792
6   valkey-server                       0x0000000104900870 clientKillCommand + 356
7   valkey-server                       0x00000001048d1790 call + 428
8   valkey-server                       0x000000010496ef4c execCommand + 872
9   valkey-server                       0x00000001048d1790 call + 428
10  valkey-server                       0x00000001048d3a44 processCommand + 5056
11  valkey-server                       0x00000001048fdc20 processCommandAndResetClient + 64
12  valkey-server                       0x00000001048fdeac processInputBuffer + 276
13  valkey-server                       0x00000001048f2ff0 readQueryFromClient + 148
14  valkey-server                       0x0000000104a182e8 callHandler + 60
15  valkey-server                       0x0000000104a1731c connSocketEventHandler + 488
16  valkey-server                       0x00000001048b5e80 aeProcessEvents + 820
17  valkey-server                       0x00000001048b6598 aeMain + 64
18  valkey-server                       0x00000001048dcecc main + 4084
19  dyld                                0x0000000186b34274 start + 2840
````

---------

Signed-off-by: Uri Yagelnik <[email protected]>
(cherry picked from commit bd5dcb2)
Signed-off-by: Viktor Söderqvist <[email protected]>
sarthakaggarwal97 pushed a commit to sarthakaggarwal97/valkey that referenced this pull request Sep 16, 2025
This commit fixes two issues in `pubsubUnsubscribeChannel` that could
lead to memory corruption:

1. When calculating the slot for a channel, we were using getKeySlot()
which might use the current_client's slot if available. This is
problematic when a client kills another client (e.g., via CLIENT KILL
command) as the slot won't match the channel's actual slot.

2. The `found` variable was not initialized to `NULL`, causing the
serverAssert to potentially pass incorrectly when the hashtable lookup
failed, leading to memory corruption in subsequent operations.

The fix:
- Calculate the slot directly from the channel name using keyHashSlot()
instead of relying on the current client's slot
- Initialize 'found' to NULL

Added a test case that reproduces the issue by having one client kill
another client that is subscribed to a sharded pubsub channel during a
transaction.

Crash log (After initializing the variable 'found' to null, without
initialization, memory corruption could occur):
```
VALKEY BUG REPORT START: Cut & paste starting from here ===
59707:M 24 May 2025 23:10:40.429 # === ASSERTION FAILED CLIENT CONTEXT ===
59707:M 24 May 2025 23:10:40.429 # client->flags = 108086391057154048
59707:M 24 May 2025 23:10:40.429 # client->conn = fd=11
59707:M 24 May 2025 23:10:40.429 # client->argc = 0
59707:M 24 May 2025 23:10:40.429 # === RECURSIVE ASSERTION FAILED ===
59707:M 24 May 2025 23:10:40.429 # ==> pubsub.c:348 'found' is not true

------ STACK TRACE ------

Backtrace:
0   valkey-server                       0x0000000104974054 _serverAssertWithInfo + 112
1   valkey-server                       0x000000010496c7fc pubsubUnsubscribeChannel + 268
2   valkey-server                       0x000000010496cea0 pubsubUnsubscribeAllChannelsInternal + 216
3   valkey-server                       0x000000010496c2e0 pubsubUnsubscribeShardAllChannels + 76
4   valkey-server                       0x000000010496c1d4 freeClientPubSubData + 60
5   valkey-server                       0x00000001048f3cbc freeClient + 792
6   valkey-server                       0x0000000104900870 clientKillCommand + 356
7   valkey-server                       0x00000001048d1790 call + 428
8   valkey-server                       0x000000010496ef4c execCommand + 872
9   valkey-server                       0x00000001048d1790 call + 428
10  valkey-server                       0x00000001048d3a44 processCommand + 5056
11  valkey-server                       0x00000001048fdc20 processCommandAndResetClient + 64
12  valkey-server                       0x00000001048fdeac processInputBuffer + 276
13  valkey-server                       0x00000001048f2ff0 readQueryFromClient + 148
14  valkey-server                       0x0000000104a182e8 callHandler + 60
15  valkey-server                       0x0000000104a1731c connSocketEventHandler + 488
16  valkey-server                       0x00000001048b5e80 aeProcessEvents + 820
17  valkey-server                       0x00000001048b6598 aeMain + 64
18  valkey-server                       0x00000001048dcecc main + 4084
19  dyld                                0x0000000186b34274 start + 2840
````

---------

Signed-off-by: Uri Yagelnik <[email protected]>
(cherry picked from commit bd5dcb2)
Signed-off-by: Viktor Söderqvist <[email protected]>
ranshid pushed a commit that referenced this pull request Jan 1, 2026
## Add Fuzzing Capability to Valkey

### Overview
This PR adds a fuzzing capability to Valkey, allowing developers and
users to stress test their Valkey deployments with randomly generated
commands. The fuzzer is integrated with the existing valkey-benchmark
tool, making it easy to use without requiring additional dependencies.

### Key Features
• **Command Generator**: Automatically generates Valkey commands by
retrieving command information directly from the server
• **Two Fuzzing Modes**:
- normal: Generates only valid commands, doesn't modify server
configurations
- aggressive: Includes malformed commands and allows CONFIG SET
operations

• **Multi-threaded Testing**: Each client runs in a dedicated thread to
maximize interaction between clients and enable testing of complicated
scenarios

• **Integration with valkey-benchmark**: Uses the existing CLI interface

### Implementation Details
• Added new files:
- `fuzzer_command_generator.h/c`: Dynamically generates valkey commands.
- `fuzzer_client.c`: Orchestrate all the client threads, report test
progress, and handle errors.

• Modified existing files:
  - valkey-benchmark.c: Added fuzzing mode options and integration

### Command Generation Approach
The fuzzer dynamically retrieves command information from the server,
allowing it to adapt to different Valkey versions and custom modules.
Since the command information generated from JSON files is sometimes
limited, not all generated commands will be valid, but approximately 95%
valid command generation is achieved.

It is important to generate valid commands to cover as much code path as
possible and not just the invalid command/args path. The fuzzer
prioritizes generating syntactically and semantically correct commands
to ensure thorough testing of the server's core functionality, while
still including a small percentage of invalid commands in `aggressive`
mode to test error handling paths

#### Config modification
For CONFIG SET command, the situation is more complex as the server
currently provides limited information through CONFIG GET *. Some
hardcoded logic is implemented that will need to be modified in the
future. Ideally, the server should provide self-inspection commands to
retrieve config keys-values with their properties (enum values,
modifiability status, etc.).

### Issue Detection
The fuzzer is designed to identify several types of issues:
• Server crashes
• Server memory corruptions / memory leaks(when compiled with ASAN)
• Server unresponsiveness
• Server malformed replies

For unresponsiveness detection, command timeout limits are implemented
to ensure no command blocks for excessive periods. If a server doesn't
respond within 30 seconds, the fuzzer signals that something is wrong.

### Proven Effectiveness
When running against the latest unstable version, the fuzzer has already
identified several issues, demonstrating its effectiveness:
* #2111
* #2112
* #2109
* #2113
* #2108
* #2137
* #2106
* #2347
* #2973
* #2974
### How to Use
Run the fuzzer using the valkey-benchmark tool with the --fuzz flag:

```bash
# Basic usage (10000 commands 1000 commands per client, 10 clients)
./src/valkey-benchmark --fuzz -h 127.0.0.1 -p 6379 -n 10000 -c 10

# With aggressive fuzzing mode
./src/valkey-benchmark --fuzz --fuzz-level aggressive -h 127.0.0.1 -p 6379 -n 10000 -c 10

# With detailed logging
./src/valkey-benchmark --fuzz --fuzz-log-level debug -h 127.0.0.1 -p 6379 -n 10000 -c 10
```

The fuzzer supports existing valkey-benchmark options, including TLS and
cluster mode configuration.

---------

Signed-off-by: Uri Yagelnik <[email protected]>
jdheyburn pushed a commit to jdheyburn/valkey that referenced this pull request Jan 8, 2026
## Add Fuzzing Capability to Valkey

### Overview
This PR adds a fuzzing capability to Valkey, allowing developers and
users to stress test their Valkey deployments with randomly generated
commands. The fuzzer is integrated with the existing valkey-benchmark
tool, making it easy to use without requiring additional dependencies.

### Key Features
• **Command Generator**: Automatically generates Valkey commands by
retrieving command information directly from the server
• **Two Fuzzing Modes**:
- normal: Generates only valid commands, doesn't modify server
configurations
- aggressive: Includes malformed commands and allows CONFIG SET
operations

• **Multi-threaded Testing**: Each client runs in a dedicated thread to
maximize interaction between clients and enable testing of complicated
scenarios

• **Integration with valkey-benchmark**: Uses the existing CLI interface

### Implementation Details
• Added new files:
- `fuzzer_command_generator.h/c`: Dynamically generates valkey commands.
- `fuzzer_client.c`: Orchestrate all the client threads, report test
progress, and handle errors.

• Modified existing files:
  - valkey-benchmark.c: Added fuzzing mode options and integration

### Command Generation Approach
The fuzzer dynamically retrieves command information from the server,
allowing it to adapt to different Valkey versions and custom modules.
Since the command information generated from JSON files is sometimes
limited, not all generated commands will be valid, but approximately 95%
valid command generation is achieved.

It is important to generate valid commands to cover as much code path as
possible and not just the invalid command/args path. The fuzzer
prioritizes generating syntactically and semantically correct commands
to ensure thorough testing of the server's core functionality, while
still including a small percentage of invalid commands in `aggressive`
mode to test error handling paths

#### Config modification
For CONFIG SET command, the situation is more complex as the server
currently provides limited information through CONFIG GET *. Some
hardcoded logic is implemented that will need to be modified in the
future. Ideally, the server should provide self-inspection commands to
retrieve config keys-values with their properties (enum values,
modifiability status, etc.).

### Issue Detection
The fuzzer is designed to identify several types of issues:
• Server crashes
• Server memory corruptions / memory leaks(when compiled with ASAN)
• Server unresponsiveness
• Server malformed replies

For unresponsiveness detection, command timeout limits are implemented
to ensure no command blocks for excessive periods. If a server doesn't
respond within 30 seconds, the fuzzer signals that something is wrong.

### Proven Effectiveness
When running against the latest unstable version, the fuzzer has already
identified several issues, demonstrating its effectiveness:
* valkey-io#2111
* valkey-io#2112
* valkey-io#2109
* valkey-io#2113
* valkey-io#2108
* valkey-io#2137
* valkey-io#2106
* valkey-io#2347
* valkey-io#2973
* valkey-io#2974
### How to Use
Run the fuzzer using the valkey-benchmark tool with the --fuzz flag:

```bash
# Basic usage (10000 commands 1000 commands per client, 10 clients)
./src/valkey-benchmark --fuzz -h 127.0.0.1 -p 6379 -n 10000 -c 10

# With aggressive fuzzing mode
./src/valkey-benchmark --fuzz --fuzz-level aggressive -h 127.0.0.1 -p 6379 -n 10000 -c 10

# With detailed logging
./src/valkey-benchmark --fuzz --fuzz-log-level debug -h 127.0.0.1 -p 6379 -n 10000 -c 10
```

The fuzzer supports existing valkey-benchmark options, including TLS and
cluster mode configuration.

---------

Signed-off-by: Uri Yagelnik <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working release-notes This issue should get a line item in the release notes

Projects

Status: 8.0.5
Status: 8.1.2

Development

Successfully merging this pull request may close these issues.

2 participants