Skip to content

feat[venom]: add memory SSA analysis#4555

Merged
charles-cooper merged 66 commits intovyperlang:masterfrom
harkal:feat/venom/memssa
Apr 24, 2025
Merged

feat[venom]: add memory SSA analysis#4555
charles-cooper merged 66 commits intovyperlang:masterfrom
harkal:feat/venom/memssa

Conversation

@harkal
Copy link
Copy Markdown
Collaborator

@harkal harkal commented Apr 4, 2025

What I did

Implement the Memory SSA analysis for Venom

How I did it

How to verify it

Commit message

This commit implements Memory SSA analysis for Venom, based
on LLVM's `MemorySSA` design. This analysis will enable more
sophisticated optimizations passes that require understanding memory
dependencies. The implementation:

- Tracks memory read and write operations in SSA form
- Provides methods to identify memory definitions and uses
- Adds functionality to track memory clobbering relationships
- Supports both memory and storage location types (groundwork)
- Includes debugging facilities for printing Memory SSA information

references:
- https://llvm.org/docs/MemorySSA.html

Description for the changelog

Cute Animal Picture

image-14

@codecov
Copy link
Copy Markdown

codecov bot commented Apr 4, 2025

Codecov Report

Attention: Patch coverage is 79.82456% with 92 lines in your changes missing coverage. Please review.

Project coverage is 92.42%. Comparing base (b396d12) to head (a8dd0fa).
Report is 72 commits behind head on master.

Files with missing lines Patch % Lines
vyper/venom/basicblock.py 46.71% 51 Missing and 22 partials ⚠️
vyper/venom/analysis/mem_ssa.py 93.35% 6 Missing and 11 partials ⚠️
vyper/venom/analysis/mem_alias.py 96.72% 0 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4555      +/-   ##
==========================================
- Coverage   92.70%   92.42%   -0.29%     
==========================================
  Files         123      125       +2     
  Lines       17571    18017     +446     
  Branches     2988     3110     +122     
==========================================
+ Hits        16290    16653     +363     
- Misses        883      931      +48     
- Partials      398      433      +35     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@harkal harkal marked this pull request as ready for review April 7, 2025 13:57
@harkal harkal requested a review from charles-cooper April 7, 2025 13:57

offset: int = 0
size: int = 0
is_volatile: bool = False
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does is_volatile mean? please add a comment

tracking memory definitions and uses explicitly.
"""

VALID_LOCATION_TYPES = {"memory", "storage"}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we use vyper/evm/address_space.py here? it already has metadata like the load opcode corresponding to the location.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think any unification refactoring should come after. I don't have the visibility to judge this atm.

return self == FULL_MEMORY_ACCESS
if self == EMPTY_MEMORY_ACCESS or other == EMPTY_MEMORY_ACCESS:
return False
if self.size <= 0 or other.size <= 0:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

size < 0 means full memory, right? feels a bit weird to handle it separately from FULL_MEMORY_ACCESS. also, what do you think about having a proper sentinel instead of -1? like MemoryAccess.UNBOUNDED_ACCESS = object()

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if self.size <= 0 or other.size <= 0:
assert self.size > 0 and other.size > 0 # guaranteed by previous checks

self.current_def[block] = mem_def
self.inst_to_def[inst] = mem_def

def _insert_phi_nodes(self):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since this doesn't actually insert physical phi instructions, can we call it like compute_phi_nodes or similar?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know... feels like insert to me as we are inserting those into the memory graph (even though the graph is now separate from the other data structures, compared to our "instruction based" variable ssa phi)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the relatedly named _add_phi_nodes function in MakeSSA does add physical instructions, so it's a bit confusing

def _add_phi_nodes(self):
"""
Add phi nodes to the function.
"""

at least add a comment saying that it's adding the nodes to the graph, not instructions to the basic block, otherwise it seems like the function body does not match the name

return True
if other == FULL_MEMORY_ACCESS:
return self == FULL_MEMORY_ACCESS
if self == EMPTY_MEMORY_ACCESS or other == EMPTY_MEMORY_ACCESS:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure this is right. if other == EMPTY_MEMORY_ACCESS we should return true, right?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only if we are FULL_MEMORY_ACCESS, which is handled in the beginning.

elif opcode == "dloadbytes":
return EMPTY_MEMORY_ACCESS
elif opcode == "dload":
return EMPTY_MEMORY_ACCESS
Copy link
Copy Markdown
Member

@charles-cooper charles-cooper Apr 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

technically, dload writes and reads from location 0

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I know, just not taking the special memory location into account for this.

if loc1 == EMPTY_MEMORY_ACCESS or loc2 == EMPTY_MEMORY_ACCESS:
return False

if loc1.size <= 0 or loc2.size <= 0:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe assert loc1.size > 0 and loc2.size > 0

Copy link
Copy Markdown
Member

@charles-cooper charles-cooper left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. i didn't review some details in the clobbered/clobbering functions as carefully since i assume they will evolve with further work. @harkal please draft a commit message and then we can merge this!

@charles-cooper charles-cooper merged commit 2f14f04 into vyperlang:master Apr 24, 2025
163 of 164 checks passed
@charles-cooper charles-cooper changed the title feat[venom]: memory SSA analysis feat[venom]: add memory SSA analysis Apr 24, 2025
@harkal harkal deleted the feat/venom/memssa branch April 24, 2025 16:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants