Prune unnecessary transaction sequences from corpus #625
Merged
anishnaik
reviewed
May 7, 2025
anishnaik
approved these changes
May 20, 2025
This PR adds a job that runs once a minute and prunes unnecessary transaction sequences from the corpus. It removes sequences that no longer contribute any coverage beyond what the rest of the corpus already provides. This can happen when, for example, sequence A adds new coverage and sequence B then builds on sequence A and adds strictly more coverage. Sequence A is now unnecessary, since it no longer "contributes anything" to the corpus.
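To make the "no longer contributes anything" idea concrete, here is a minimal sketch of one way such a check could work. It is not the code from this PR: the `Sequence` type, the `PruneRedundant` function, and the integer coverage points are all hypothetical, and the greedy largest-first ordering is just one reasonable choice.

```go
package main

import (
	"fmt"
	"sort"
)

// Sequence is a hypothetical stand-in for a corpus entry: an ID plus the set
// of coverage points (for example, branch IDs) hit when it is replayed.
type Sequence struct {
	ID       string
	Coverage map[int]struct{}
}

// cov is a small helper that builds a coverage set from a list of points.
func cov(points ...int) map[int]struct{} {
	set := make(map[int]struct{}, len(points))
	for _, p := range points {
		set[p] = struct{}{}
	}
	return set
}

// PruneRedundant returns the sequences worth keeping. It visits sequences
// from the largest coverage set to the smallest and keeps a sequence only if
// it hits at least one point that no previously kept sequence has hit.
// The union of coverage over the kept sequences equals that of the input.
func PruneRedundant(corpus []Sequence) []Sequence {
	sorted := make([]Sequence, len(corpus))
	copy(sorted, corpus)
	sort.SliceStable(sorted, func(i, j int) bool {
		return len(sorted[i].Coverage) > len(sorted[j].Coverage)
	})

	seen := make(map[int]struct{})
	var kept []Sequence
	for _, seq := range sorted {
		contributes := false
		for point := range seq.Coverage {
			if _, ok := seen[point]; !ok {
				contributes = true
				seen[point] = struct{}{}
			}
		}
		if contributes {
			kept = append(kept, seq)
		}
	}
	return kept
}

func main() {
	corpus := []Sequence{
		{ID: "A", Coverage: cov(1, 2)},    // superseded by B
		{ID: "B", Coverage: cov(1, 2, 3)}, // strictly more coverage than A
		{ID: "C", Coverage: cov(4)},       // unrelated coverage, still needed
	}
	for _, seq := range PruneRedundant(corpus) {
		fmt.Println(seq.ID) // prints "B" then "C"
	}
}
```

In this sketch, sequence A is dropped because every point it covers is already covered by B, which matches the example in the description above.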
The job works by:
The job takes about 5-10 seconds, during which no fuzzing can occur.
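As a rough illustration of a periodic job that pauses all fuzzing while it runs, the sketch below uses a `time.Ticker` and a `sync.RWMutex`. Everything here is an assumption made for illustration, not the PR's implementation: the `Fuzzer` type, its methods, and the corpus-size counter standing in for the real corpus are hypothetical.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// Fuzzer is a hypothetical stand-in for shared fuzzer state.
type Fuzzer struct {
	pause      sync.RWMutex // workers hold a read lock; the pruner takes the write lock
	corpusSize int64        // stand-in for the real corpus, accessed atomically
}

// FuzzOnce simulates one worker iteration that adds a corpus entry.
// It blocks while a prune is in progress.
func (f *Fuzzer) FuzzOnce() {
	f.pause.RLock()
	defer f.pause.RUnlock()
	atomic.AddInt64(&f.corpusSize, 1)
}

// PruneOnce runs prune with every worker paused. prune receives the current
// corpus size and returns the size after pruning.
func (f *Fuzzer) PruneOnce(prune func(size int64) int64) {
	f.pause.Lock()
	defer f.pause.Unlock()
	atomic.StoreInt64(&f.corpusSize, prune(atomic.LoadInt64(&f.corpusSize)))
}

// Size reports the current corpus size.
func (f *Fuzzer) Size() int64 {
	return atomic.LoadInt64(&f.corpusSize)
}

// RunPruner calls PruneOnce every interval until ctx is cancelled.
func (f *Fuzzer) RunPruner(ctx context.Context, interval time.Duration, prune func(int64) int64) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			f.PruneOnce(prune)
		}
	}
}

func main() {
	// Short durations keep the demo quick; the PR describes a one-minute interval.
	ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancel()

	f := &Fuzzer{}
	var wg sync.WaitGroup
	wg.Add(1)
	go func() { // a single simulated fuzzing worker
		defer wg.Done()
		for ctx.Err() == nil {
			f.FuzzOnce()
		}
	}()

	// Pretend each prune removes half of the corpus.
	f.RunPruner(ctx, 10*time.Millisecond, func(size int64) int64 { return size / 2 })
	wg.Wait()
	fmt.Println("final corpus size:", f.Size())
}
```

A stop-the-world lock like this is simple to reason about, but it is also why a long-running prune would block all workers for its full duration.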
This seems to be pretty effective at reducing corpus size. In a small test I've been running, it's been removing about 100 entries per minute during the initial phase of fuzzing where the corpus quickly grows to 1000 entries (first 5 minutes) and has now slowed down to removing about 50 entries per minute (15 minutes in, corpus size 1400). Corpus growth is slow at this phase because of pruning; the corpus would probably have 2000-2500 entries by now (rather than 1400) if not for pruning.