Skip to content

Switch from PC-based coverage to branch-based coverage#585

Merged
Xenomega merged 4 commits intomasterfrom
wip/edge-coverage-and-hit-count-3
Mar 10, 2025
Merged

Switch from PC-based coverage to branch-based coverage#585
Xenomega merged 4 commits intomasterfrom
wip/edge-coverage-and-hit-count-3

Conversation

@samalws-tob
Copy link
Copy Markdown
Contributor

This PR switches medusa from using PC-based coverage to branch-based coverage. Coverage is only collected at jump instructions, and at contract entrance/exit, rather than at every instruction. This allows for both more fine-grained coverage information (since we collect both the src and the dst of each jump) and sometimes for increased performance. In benchmark tests, this PR gets consistently better coverage than master branch. On some test contracts it gets better performance (in seq/s and gas/s); on some test contracts it gets worse. I suspect that part of the reduced performance on some is due to the increased corpus size that this PR results in; in one example, I took an example where this PR performs worse than master, fixed the corpus size between the two, and re-ran benchmark, and this PR performed better than master. In any case, the consistently better coverage should be sufficient to justify the change to branch-based coverage.

This PR also switches medusa's log messages from displaying PC coverage to displaying branch coverage (in number of branches hit (technically "markers" hit)). PC coverage can also be viewed if needed by setting Debug log mode; in Debug mode, PC coverage is displayed once a minute, since it can be costly (up to 1 second) to perform the conversion needed to get the PC count.

@0xalpharush
Copy link
Copy Markdown
Contributor

What is ENTER_MARKER_XOR for?

@samalws-tob
Copy link
Copy Markdown
Contributor Author

In order to correctly convert branch coverage to line coverage (needed eg for making coverage reports), we need to get the hit count at the start of contract execution in order to initialize hit counter to the correct value. So whenever a contract begins execution we record it by marking coverage at the first PC executed, using the marker ENTER_MARKER_XOR<<32 ^ pc

Copy link
Copy Markdown
Member

@Xenomega Xenomega left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great to me. I didn't spot any issues with this PR:

  • Manual review showed no issues
  • Running it on a few codebases performed equally or faster.
  • I can't attest to the actual efficacy in benchmarks, but it sounds like you all ran them and feel satisfied, so I'll take your word for it :)

My only thought would've been maybe to consider moving the markers to a custom type we define, with functions to tag them explicitly as reverted/return/etc. It might make the tracer code a bit cleaner/readable, and you can use less of the magic value constant across some of the files. But that's just a CQ suggestion, it's not important, so I'll hit approve.

^^Just thinking out loud since I know we were discussing later revisiting redesigning all tracers and wrapping the logic so we don't have to keep tracking reverted call frames in each one like we were haha.

Anyways, this is an exciting PR, thanks a ton :)

@Xenomega
Copy link
Copy Markdown
Member

Actually one last second catch @samalws-tob , could we change the fmt.Printf statements in source_analysis to use our actual logger before we merge this?

@Xenomega Xenomega merged commit 336cf7d into master Mar 10, 2025
12 checks passed
@Xenomega Xenomega deleted the wip/edge-coverage-and-hit-count-3 branch March 10, 2025 22:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants