Added a first fuzzer as integration into OSS-Fuzz. by DavidKorczynski · Pull Request #8052 · redis/redis

DavidKorczynski · 2020-11-14T19:00:48Z

Dear maintainers,

This PR adds a fuzzer to the project with the goal of setting Redis up to be fuzzed by way of OSS-Fuzz. OSS-Fuzz is a free service run by Google that performs continuous fuzzing of important open source projects. Essentially, OSS-Fuzz will perform the fuzzing for you and then email you bug reports, coverage reports etc. All we need is a set of email addresses that will receive this information.

For cross-referencing, the PR that adds the OSS-Fuzz logic is here: google/oss-fuzz#4643

If you are happy to integrate it then I would be happy to continue supplying more sophisticated fuzzers to the project as well.

madolson · 2020-11-14T20:42:33Z

Hey David, Thanks for starting this conversation.

Can you tell us a bit more about what the expectation is here? Is the expectation that these need to be part of the main project (i.e. why not a separate project), and are we going to be asked to keep writing more fuzz tests over time?

You are fuzzing some rather uninteresting functions, we probably want to be doing more fuzzing along the core Redis command processing path.

Also, in the nicest way possible, is OSS-Fuzz good? There have been several reports over the years where people give us a dump of "possible bugs" that aren't actually possible and we just waste our time verifying that. I don't want this to end up just being noise that is distracting from real problems we need to be addressing.

DavidKorczynski · 2020-11-14T21:19:16Z

In essence there are no expectations. Naturally, if you do not intend to fix the bugs that are reported, then fuzzing would be a waste of resources. However, no one will do anything if you do not fix the bugs, and nothing stops you from opting out at any point.

We can keep the fuzzers in other repositories: if you are most comfortable having them outside the main repository then we can store them in the main oss-fuzz repository: https://github.com/google/oss-fuzz . If you are happy to try it out like this first, then you can provide me with an email and I will set it all up in the main OSS-Fuzz repository, so you can get the experience without merging anything into the Redis main repo. Naturally, fuzzing is really important for ensuring the security of a project, so if you like the results of the fuzzing then we can proceed in the future to get it more integrated into the Redis repo and perhaps CI as well.

You will not be asked to do anything over time: you do not sign up for anything. The only specific caveat is that bugs are made public after a 90 day deadline, i.e. if the fuzzers find a bug and file a bug report, and the bug is not fixed within 90 days, the bug report will become public on the bug tracker.

As for whether OSS-Fuzz is good, I would strongly argue that it is. Check out the list of integrated projects, which contains more than 400 significant open source projects: https://github.com/google/oss-fuzz/tree/master/projects. Fuzzing is a key part of providing security for software, and OSS-Fuzz is an excellent way to do so for open source software.

We can create the fuzzers such that the bugs reported are within the threat model of Redis, which avoids random bugs that are a lower priority than other tasks. In this context it would be very helpful if you could guide me to some APIs, as I am not an expert in the Redis codebase but know a lot about fuzzing. With the bug reports you also get nice sanitizer reports which show you the full stack trace, the inputs that trigger the bug and more introspection, which eases the root-cause analysis process a lot.

oranagra · 2020-11-15T10:53:44Z

not sure i understand what we are aiming for, but here are a few random thoughts (after looking at the PR code changes):

the value of fuzzing a single method of some low level subsystem is not very high IMHO; often the bugs are in edge cases of the whole system (a combination of several subsystems, and the way functions integrate with one another).
if i understand this correctly, it looks very coupled with the internals (fuzzing input for internal functions). this means that if we refactor or change this function, the fuzzer breaks and someone needs to maintain it (this becomes a bigger problem if the fuzzer is hosted in another repo).
many times we actually have bugs in validating user input (for instance, a missing argument of a redis command causes it to crash), so it would be nice to have some smart fuzzer that sends redis commands. obviously it should not be completely random; the input must pass some initial validations to make it to the valuable code (correct syntax, on a key of the right type, etc).
as @madolson pointed out, the current approach can indeed expose a lot of false positives in redis, things that look like bugs but actually aren't (due to how the code is used), and looking into these reports can be time consuming.
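To make the "smart fuzzer that sends redis commands" idea concrete, here is a minimal Python sketch of a generator that picks a command from a small table, fills in arguments of the expected kind, and occasionally drops the last argument to probe arity checks. The command table, the key naming scheme and all function names are assumptions made for illustration; none of it comes from the PR, and a real fuzzer would probably derive the table from Redis's own command metadata.

```python
import random

# Hypothetical, hand-picked subset: command -> (key type it targets, argument kinds).
COMMANDS = {
    "SET":    ("string", ["key", "value"]),
    "GET":    ("string", ["key"]),
    "INCRBY": ("string", ["key", "int"]),
    "LPUSH":  ("list",   ["key", "value"]),
    "LRANGE": ("list",   ["key", "int", "int"]),
    "SADD":   ("set",    ["key", "value"]),
}

def gen_arg(kind, key_type, rng):
    """Produce one argument (as bytes) of the requested kind."""
    if kind == "key":
        # Prefix keys with their type so commands mostly hit keys of the right type.
        return f"{key_type}:{rng.randrange(4)}".encode()
    if kind == "int":
        # Mix small values with boundary values.
        return str(rng.choice([0, 1, -1, 2**31, -2**63, rng.randrange(-10, 10)])).encode()
    # "value": a short random byte string, possibly empty or binary.
    return bytes(rng.randrange(256) for _ in range(rng.randrange(0, 8)))

def gen_command(rng):
    """Return a command as a list of byte strings, e.g. [b"GET", b"string:2"]."""
    name = rng.choice(sorted(COMMANDS))
    key_type, arg_kinds = COMMANDS[name]
    args = [gen_arg(kind, key_type, rng) for kind in arg_kinds]
    # Roughly one time in ten, drop the last argument to exercise arity validation.
    if args and rng.random() < 0.1:
        args.pop()
    return [name.encode()] + args

def encode_resp(parts):
    """Encode a command as a RESP array of bulk strings, the form clients send."""
    out = b"*%d\r\n" % len(parts)
    for part in parts:
        out += b"$%d\r\n%s\r\n" % (len(part), part)
    return out
```

Seeding the generator (`random.Random(seed)`) keeps a failing sequence reproducible; the encoded bytes can then be written to a socket connected to a test server.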

DavidKorczynski · 2020-11-17T14:14:13Z

Thanks for the reply @oranagra

In general I agree with your statements, although fuzzing internal functions can often be a nice way of ensuring no memory corruption exists in them. However, it's not needed to do a ton of fuzzing in these.

In essence, my interest is in integrating fuzzing into the Redis project, and also having it done continuously by way of OSS-Fuzz. If I create the fuzzers such that they produce valid commands and do the maintenance of it, i.e. change the fuzzers when needed, would you be happy to integrate them? In case I can no longer maintain it I can make this clear and also remove the fuzzers if that's what you prefer at that given time.

oranagra · 2020-11-17T16:47:26Z

@DavidKorczynski a fuzzer that generates commands that make sense and have a good chance to pass initial checks and reach valuable code is something we would like very much.
generally commands are always backwards compatible so there's no reason this would break. it would be hard for such a fuzzer to validate correct output, but, it can check for crashes, leaks, and double or missing replies.

i.e. if some error handling code for redis responds with two replies, the next command will consume the second reply of the previous one, and if it doesn't reply the client is hung.
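The double or missing reply check described above could be sketched roughly as follows: count how many complete replies are in the bytes read back and compare that with the number of commands sent. This is only an illustration under stated assumptions. It handles the RESP2 reply types only, the function names are hypothetical, and in a live test a "missing reply" verdict would only make sense after a read timeout, since more data could still arrive.

```python
def parse_reply(buf, pos=0):
    """Return the index just past one complete RESP2 reply starting at pos,
    or None if the buffer ends before the reply is complete."""
    end = buf.find(b"\r\n", pos)
    if end == -1:
        return None
    kind, header = buf[pos:pos + 1], buf[pos + 1:end]
    nxt = end + 2
    if kind in (b"+", b"-", b":"):       # simple string, error, integer
        return nxt
    if kind == b"$":                      # bulk string ($-1 is the null bulk)
        n = int(header)
        if n == -1:
            return nxt
        if len(buf) < nxt + n + 2:
            return None
        return nxt + n + 2
    if kind == b"*":                      # array (*-1 is the null array)
        n = int(header)
        for _ in range(max(n, 0)):
            nxt = parse_reply(buf, nxt)
            if nxt is None:
                return None
        return nxt
    raise ValueError(f"unexpected RESP type byte: {kind!r}")

def count_replies(buf):
    """Return (number of complete top-level replies, bytes consumed)."""
    count, pos = 0, 0
    while pos < len(buf):
        nxt = parse_reply(buf, pos)
        if nxt is None:
            break
        count, pos = count + 1, nxt
    return count, pos

def check_reply_count(sent, buf):
    """Compare the number of commands sent with the replies found in buf."""
    got, _ = count_replies(buf)
    if got > sent:
        return "extra reply"
    if got < sent:
        return "missing reply"
    return "ok"
```

For example, `check_reply_count(1, b"-ERR a\r\n-ERR b\r\n")` flags an extra reply, which is the case where the next command would otherwise consume the second reply of the previous one.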

you can have a quick look at this work in progress PR #7807 in which i tried to write a simple fuzzer for the purpose of catching crashes and leaks (util.tcl).

DavidKorczynski · 2021-07-23T08:59:08Z

@oranagra I was going to do this but simply did not find the time. I think it's probably best to close this one for now and re-open it if I can come up with such a fuzzer as you suggest.

**Bug Fixes:**

* [#7385](RediSearch/RediSearch#7385) Fix high temporary memory consumption when loading multiple search indexes from RDB
* [#7430](RediSearch/RediSearch#7430) Fix a potential deadlock in `FT.HYBRID` in cluster mode during updates
* [#7454](RediSearch/RediSearch#7454) Fix a garbage collection performance regression
* [#7460](RediSearch/RediSearch#7460) Fix potential double-free in Fork GC error paths
* [#7455](RediSearch/RediSearch#7455) Fix internal cursors not being deleted promptly in cluster mode
* [#7667](RediSearch/RediSearch#7667) Fix a cursor logical leak upon dropping the index
* [#7796](RediSearch/RediSearch#7796) Fix a potential use-after-free when removing connections
* [#7792](RediSearch/RediSearch#7792) Fix string comparison for binary data with embedded NULLs in TOLIST reducer in `FT.AGGREGATE`
* [#7704](RediSearch/RediSearch#7704) Use asynchronous jobs for GC in SVS to accelerate execution
* [#7823](RediSearch/RediSearch#7823) Update `FT.HYBRID` to accept vector blobs only via parameters
* [#7903](RediSearch/RediSearch#7903) Fix a memory leak in Hybrid ASM
* [#8052](RediSearch/RediSearch#8052) Fix `FT.HYBRID` behavior when used with `LOAD *`
* [#8082](RediSearch/RediSearch#8082) Fix incorrect FULLTEXT field metric counts
* [#8089](RediSearch/RediSearch#8089) Fix an edge case in `CLUSTERSET` handling
* [#8152](RediSearch/RediSearch#8152) Fix configuration registration issues

**Improvements:**

* [#7427](RediSearch/RediSearch#7427) Enhance `FT.PROFILE` with vector search execution details
* [#7431](RediSearch/RediSearch#7431) Ensure full `FT.PROFILE` output is returned on timeout with RETURN policy
* [#7507](RediSearch/RediSearch#7507) Track timeout warnings and errors in INFO
* [#7576](RediSearch/RediSearch#7576) Track OOM warnings and errors in INFO
* [#7612](RediSearch/RediSearch#7612) Track `maxprefixexpansions` warnings and errors in INFO
* [#7960](RediSearch/RediSearch#7960) Persist query warnings across cursor reads
* [#7551](RediSearch/RediSearch#7551), [#7616](RediSearch/RediSearch#7616), [#7622](RediSearch/RediSearch#7622), [#7625](RediSearch/RediSearch#7625) Add runtime thread and pending-jobs metrics
* [#7589](RediSearch/RediSearch#7589) Support multiple slot ranges in `search.CLUSTERSET`
* [#7707](RediSearch/RediSearch#7707) Add `WITHCOUNT` support to `FT.AGGREGATE`
* [#7862](RediSearch/RediSearch#7862) Add support for subquery `COUNT` in `FT.HYBRID`
* [#8087](RediSearch/RediSearch#8087) Add warnings when cursor results may be affected by ASM and expose ASM warnings in `FT.PROFILE`
* [#8049](RediSearch/RediSearch#8049) Add logging for index-related commands
* [#8150](RediSearch/RediSearch#8150) Fix shard total profile time reporting in `FT.PROFILE`

Added a first fuzzer as integration into OSS-Fuzz.

dc5ceb7

DavidKorczynski closed this Jul 23, 2021

DavidKorczynski wants to merge 1 commit into redis:unstable from DavidKorczynski:unstable
