Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## unstable #2340 +/- ##
============================================
+ Coverage 73.90% 74.37% +0.46%
============================================
Files 125 127 +2
Lines 69355 70756 +1401
============================================
+ Hits 51259 52623 +1364
- Misses 18096 18133 +37
I also thought about this, but I'm unsure about the triage part.
The errors can be divided into two groups: server-side-observable and client-side-observable. On the server side, we can identify crashes, memory corruptions, or memory leaks. For the first two server-side issues, which are the most common, we can simply validate that the server didn't crash after the fuzzer run and that no memory issues were reported (when compiled with ASAN). Client-side issues are more difficult to root cause and require manual work to understand what went wrong.
Thanks, I added it to the TCL tests so it will run with all the variations we currently use for TCL testing. Currently, the test is considered a failure only if the server crashes or becomes unresponsive after the fuzzer run.
This isn't required for 9.0, but I would like us to try to get it merged after the 9.0 rc-1 goes out. |
rainsupreme
left a comment
This is great! This seems quite thoroughly designed, and I like it a lot 😁
A few questions/suggestions:
- Can we add unit tests? Or make AI do it? (actual UTs, in src/unit and written in C++)
- How difficult is it to keep the fuzzer up to date if more commands or arguments are added in the future?
- Could it potentially support fuzzing module commands in the future?
rainsupreme
left a comment
These are all good improvements! In fuzzer_command_generator.c around line 1382 I'd still like to see fixes for the reduced randomness/entropy caused by `int r = rand() % 1000;`, and for the places where it's passed into other functions instead of calling `rand()` again.
Perhaps this is for a future improvement, but do you think it'd be possible for a sequence of commands to be needed to reproduce a failure? If we want to consider that possibility then we'd want to be able to reproduce the sequence of commands that led to the failure. It seems you've been running the fuzzer for a while - have you encountered any failures you haven't been able to reproduce?
src/fuzzer_command_generator.c
Outdated
Why not `int r = rand()`? I'm not sure why we're constraining it to the 0-999 range here, and I'm also not sure why we're passing this `r` value to other functions (where I commented about the misleadingly low entropy) instead of calling `rand()` again. 😕
Thanks @rainsupreme! I've updated the code to avoid reusing `rand()` results, as suggested.
Regarding reproducibility, simply knowing the sequence of commands is usually not enough to reproduce a failure. Since most bugs result from interactions between commands from different clients running on separate threads, the specific order of execution on the server is critical, and that order depends on non-deterministic factors.
In my experience, given enough time, the specific sequence and timing that led to a crash or memory corruption tends to recur and is easy to reproduce. A more effective aid for reproduction might be a server-side logging option, similar to AOF but covering all commands, that captures the exact execution order.
tags {"slow"} {
run_solo {fuzzer} {
I feel this is the wrong way to run the fuzzer test.
I would prefer if we had a dedicated test workflow for the fuzzer tests so we can track issues with correlation to the fuzzer and not as part of some other kind of suite. For example, this failed on a probably correct issue, but failing as part of a code coverage workflow is probably not something we would like to have (+it will make the test coverage non-persistent).
So what's the plan with this then?
We discussed it. At first I thought a dedicated fuzzer run would be better, but there are so many fixture permutations it might be relevant to test the fuzzer with, so I suggested we start like this and monitor the results. We can always move it to a dedicated suite with a subset of supported fixtures.
Signed-off-by: Uri Yagelnik <[email protected]>
…ed PR comments Signed-off-by: Uri Yagelnik <[email protected]>
Signed-off-by: Uri Yagelnik <[email protected]>
Signed-off-by: Uri Yagelnik <[email protected]>
Signed-off-by: Uri Yagelnik <[email protected]>
Signed-off-by: Uri Yagelnik <[email protected]>
I think it all looks good now. Merging.
## Add Fuzzing Capability to Valkey

### Overview
This PR adds a fuzzing capability to Valkey, allowing developers and users to stress test their Valkey deployments with randomly generated commands. The fuzzer is integrated with the existing valkey-benchmark tool, making it easy to use without requiring additional dependencies.

### Key Features
- **Command Generator**: Automatically generates Valkey commands by retrieving command information directly from the server
- **Two Fuzzing Modes**:
  - `normal`: Generates only valid commands and doesn't modify server configurations
  - `aggressive`: Includes malformed commands and allows CONFIG SET operations
- **Multi-threaded Testing**: Each client runs in a dedicated thread to maximize interaction between clients and enable testing of complicated scenarios
- **Integration with valkey-benchmark**: Uses the existing CLI interface

### Implementation Details
- Added new files:
  - `fuzzer_command_generator.h/c`: Dynamically generates Valkey commands.
  - `fuzzer_client.c`: Orchestrates all the client threads, reports test progress, and handles errors.
- Modified existing files:
  - `valkey-benchmark.c`: Added fuzzing mode options and integration

### Command Generation Approach
The fuzzer dynamically retrieves command information from the server, allowing it to adapt to different Valkey versions and custom modules. Since the command information generated from JSON files is sometimes limited, not all generated commands will be valid, but approximately 95% of generated commands are valid. Generating valid commands is important because it covers as many code paths as possible, not just the invalid-command/argument paths.

The fuzzer prioritizes generating syntactically and semantically correct commands to ensure thorough testing of the server's core functionality, while still including a small percentage of invalid commands in `aggressive` mode to test error-handling paths.

#### Config modification
For the CONFIG SET command, the situation is more complex because the server currently provides limited information through `CONFIG GET *`. Some hardcoded logic is implemented that will need to be modified in the future. Ideally, the server should provide self-inspection commands to retrieve config keys and values with their properties (enum values, modifiability status, etc.).

### Issue Detection
The fuzzer is designed to identify several types of issues:
- Server crashes
- Server memory corruptions / memory leaks (when compiled with ASAN)
- Server unresponsiveness
- Server malformed replies

For unresponsiveness detection, command timeout limits are implemented to ensure no command blocks for excessive periods. If the server doesn't respond within 30 seconds, the fuzzer signals that something is wrong.

### Proven Effectiveness
When running against the latest unstable version, the fuzzer has already identified several issues, demonstrating its effectiveness:
- valkey-io#2111
- valkey-io#2112
- valkey-io#2109
- valkey-io#2113
- valkey-io#2108
- valkey-io#2137
- valkey-io#2106
- valkey-io#2347
- valkey-io#2973
- valkey-io#2974

### How to Use
Run the fuzzer using the valkey-benchmark tool with the `--fuzz` flag:

```bash
# Basic usage (10000 commands total, 1000 commands per client, 10 clients)
./src/valkey-benchmark --fuzz -h 127.0.0.1 -p 6379 -n 10000 -c 10

# With aggressive fuzzing mode
./src/valkey-benchmark --fuzz --fuzz-level aggressive -h 127.0.0.1 -p 6379 -n 10000 -c 10

# With detailed logging
./src/valkey-benchmark --fuzz --fuzz-log-level debug -h 127.0.0.1 -p 6379 -n 10000 -c 10
```

The fuzzer supports existing valkey-benchmark options, including TLS and cluster mode configuration.
--------- Signed-off-by: Uri Yagelnik <[email protected]>