Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Improve SSA renaming memory usage#15000

Merged
sandreenko merged 8 commits intodotnet:masterfrom
mikedn:ssa-mem-ren
Mar 8, 2019
Merged

Improve SSA renaming memory usage#15000
sandreenko merged 8 commits intodotnet:masterfrom
mikedn:ssa-mem-ren

Conversation

@mikedn
Copy link

@mikedn mikedn commented Nov 12, 2017

This change mainly replaces the use of jitstd::list with a simple singly list based stack that uses less memory. jitstd::list wastes memory both because it's a doubly linked list and because the list object is quite large (head/tail pointers, size, allocators) and there are many of them (basically one for each tracked variable). It also makes memory reuse difficult - removed nodes could be reused but to do that you'd need to make a custom jitstd allocator that maintains a pool of nodes. Doable, but quite a bit of work due to jitstd using old style C++ (pre C++ 11) allocators.

Since this is likely the last large change I'm going to make to SsaRenameState I went an extra mile and cleaned up the whole code - comments, names etc.

This saves ~1% overall used memory, 34% CMK_SSA memory and ~0.1% instructions retired.

Mem diff: https://gist.github.com/mikedn/f0be27a9432e6814245d818e1a0e0b1f
PIN data: https://1drv.ms/x/s!Av4baJYSo5pjgsAnoh04vA3ZEWWD4w

Somewhat strangely, the amount of allocated memory increases a bit - 65 allocator pages. Took me a while to figure out why. Due to the fact that the slab allocator caches pages > 64k you can end up in a situation where the old code was lucky and got a larger page (e.g. 74k in one example I looked at) and after these changes it gets a normal, 64k page and then it needs another page for the rest of 10k it needs.

No x86/x64 FX diffs.

Contributes to #15108

@mikedn mikedn force-pushed the ssa-mem-ren branch 2 times, most recently from 4bd2d68 to 42d3da8 Compare November 21, 2017 21:04
@CarolEidt CarolEidt requested a review from sandreenko April 12, 2018 22:56
@CarolEidt CarolEidt added this to the 2.2.0 milestone Apr 12, 2018
@sandreenko
Copy link

Is this PR ready for review?

@mikedn
Copy link
Author

mikedn commented Sep 13, 2018

Nah, I want to do a bit of cleanup first.

mikedn added 7 commits October 3, 2018 22:17
It's not exactly useful to dump all the stacks after pushing to a stack. Nor is it useful to dump all the stack after popping only some, perhaps none, in PopBlockStacks.

Also dump stack from top to bottom, makes it easier to find the top, which is usually what you care about during SSA renaming.
It makes no difference if the definition is in the "block before any real blocks..." or at the start of the first block, it's just an unnecessary complication.
SsaBuilder already handles that, doing it again in SsaRenameState just duplicates logic.
Worst name ever.

Also use "block" consistently, instead of a mix of "bb" and "block".
- Change SsaRenameState to a class
- Cleanup remaining function comments
- Move SsaRenameStateForBlock & SsaRenameStateLocDef inside SsaRenameState
- Make EnsureStacks private
- Reorder data members
- Use m_ prefix consistently
std::list has a few drawbacks:
- It's a doubly linked list but a singly linked list suffices so every node wastes 8 bytes for an extra pointer.
- The list object itself is relatively large 2 head/tail pointers, node count and memory allocator. There can be hundreds of such objects (one for each local variable) so the smaller the better.

Replace with a simple singly linked, intrusive list based stack.
It's pretty much the same logic (the only difference is that in the memory case "top" can't ever be null so by sharing the code we get a redundant null check).
@mikedn mikedn changed the title [WIP] Improve SSA renaming memory usage Improve SSA renaming memory usage Oct 4, 2018
@AndyAyersMS
Copy link
Member

@mikedn we should start clearing up your backlogged PRs.

Is this one ready for review?

@mikedn
Copy link
Author

mikedn commented Mar 7, 2019

Yes, it's ready. Thanks!

@AndyAyersMS
Copy link
Member

cc @dotnet/jit-contrib

Copy link

@sandreenko sandreenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, it was a pleasure to review these commits split.

Copy link

@briansull briansull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks Good

@AndyAyersMS
Copy link
Member

Might as well trigger a retest, it's been a while.

@AndyAyersMS AndyAyersMS closed this Mar 7, 2019
@AndyAyersMS AndyAyersMS reopened this Mar 7, 2019
@sandreenko sandreenko merged commit fb58386 into dotnet:master Mar 8, 2019
@mikedn mikedn deleted the ssa-mem-ren branch September 28, 2019 19:16
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
* Cleanup DumpStacks

It's not exactly useful to dump all the stacks after pushing to a stack. Nor is it useful to dump all the stack after popping only some, perhaps none, in PopBlockStacks.

Also dump stack from top to bottom, makes it easier to find the top, which is usually what you care about during SSA renaming.

* Stop passing null block to SsaRenameState::Push

It makes no difference if the definition is in the "block before any real blocks..." or at the start of the first block, it's just an unnecessary complication.

* Stop handling byrefStatesMatchGcHeapStates in SsaRenameState

SsaBuilder already handles that, doing it again in SsaRenameState just duplicates logic.

* Stop using "count" as a name for "SSA number"

Worst name ever.

Also use "block" consistently, instead of a mix of "bb" and "block".

* Delete "ConstructedArray", not needed

* Various cleanup

- Change SsaRenameState to a class
- Cleanup remaining function comments
- Move SsaRenameStateForBlock & SsaRenameStateLocDef inside SsaRenameState
- Make EnsureStacks private
- Reorder data members
- Use m_ prefix consistently

* Replace jitstd::list with a custom stack

std::list has a few drawbacks:
- It's a doubly linked list but a singly linked list suffices so every node wastes 8 bytes for an extra pointer.
- The list object itself is relatively large 2 head/tail pointers, node count and memory allocator. There can be hundreds of such objects (one for each local variable) so the smaller the better.

Replace with a simple singly linked, intrusive list based stack.

* Share push code between lclvar and memory

It's pretty much the same logic (the only difference is that in the memory case "top" can't ever be null so by sharing the code we get a redundant null check).


Commit migrated from dotnet/coreclr@fb58386
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants