-
Notifications
You must be signed in to change notification settings - Fork 5.3k
JIT: Create preheaders on new loops and delete old loop canonicalization #96647
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Invalidate the old loop table immediately after old loop finding. Remove `m_oldToNewLoop` and `m_newToOldLoop`. Move `BBF_OLD_LOOP_HEADER_QUIRK` to be set directly by old loop finding.
|
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch Issue Detailsnull
|
|
/azp run runtime-coreclr jitstress, runtime-coreclr libraries-jitstress |
|
Azure Pipelines successfully started running 2 pipeline(s). |
|
cc @dotnet/jit-contrib PTAL @BruceForstall The libraries-jitstress failure is #96715. The jitstress one looks like an infra issue (restore failed on osx-arm64). Diffs. As mentioned above, due to differences in loop finding around ambiguous nested inner loop cases. |
| // recorded in terms of block numbers, so flag it invalid. | ||
| fgDomsComputed = false; | ||
| fgRenumberBlocks(); | ||
| fgUpdateChangedFlowGraph(FlowGraphUpdates::COMPUTE_DOMS); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to recompute the old dominators here since optFindAndScaleGeneralLoopBlocks currently depends on them, but we should be able to get rid of this computation very soon. The current TP regressions are because of cases where we compute the dominators here and then also again after new canonicalization; in those cases only the latter computation is necessary (but it's a little hard to detect, so I just left it in both places for now).
| void Compiler::optFindNewLoops() | ||
| { | ||
| m_loops = FlowGraphNaturalLoops::Find(m_dfsTree); | ||
|
|
||
| if (optCanonicalizeLoops(m_loops)) | ||
| { | ||
| fgUpdateChangedFlowGraph(FlowGraphUpdates::COMPUTE_DOMS); | ||
| m_dfsTree = fgComputeDfs(); | ||
| m_loops = FlowGraphNaturalLoops::Find(m_dfsTree); | ||
| } | ||
|
|
||
| // Starting now, we require all loops to have pre-headers. | ||
| optLoopsRequirePreHeaders = true; | ||
|
|
||
| // Leave a bread crumb for future phases like loop alignment about whether | ||
| // looking for loops makes sense. We generally do not expect phases to | ||
| // introduce new cycles/loops in the flow graph; if they do, they should | ||
| // set this to true themselves. | ||
| // We use more general cycles over "m_loops->NumLoops() > 0" here because | ||
| // future optimizations can easily cause general cycles to become natural | ||
| // loops by removing edges. | ||
| fgMightHaveNaturalLoops = m_dfsTree->HasCycle(); | ||
| assert(fgMightHaveNaturalLoops || (m_loops->NumLoops() == 0)); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know why in the diff this shows up as a big addition... I only added the canonicalization part here.
|
I pushed a minor changes, but this passed jitstress/libraries-jitstress before those. |
BruceForstall
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Co-authored-by: Bruce Forstall <[email protected]>
…ion (dotnet#96647) Delete the old loop canonicalization. Implement canonicalization on new loops; the only canonicalization done is creation of preheaders. Diffs expected, in particular due to ambiguous cases where the old loop finding identifies more loops than the new loop finding. The old loop finding would then canonicalize these extra loops in such a way that new loop finding also found them. That is, the old pipeline: ``` old loop finding -> old loop canonicalization -> new loop finding ``` would result in more loops found than ``` old loop finding -> new loop finding -> new loop canonicalization ``` An example of a program where it is ambiguous how many loops it contains based only on the flow graph is the following: ```csharp [MethodImpl(MethodImplOptions.NoInlining)] private static void Foo(int n) { n *= 2; do { if (n == 3) { n -= 2; continue; } if (n == 4) { n -= 3; continue; } if (n == 5) { n -= 4; continue; } n--; } while (n > 0); } ``` Old loop finding finds 4 loops in this program, while new loop finding finds 1 loop. The backedges (`continue` statements) can either be seen as nested loop backedges or as backedges to the outer loop). Here is another, opposite case: ```csharp [MethodImpl(MethodImplOptions.NoInlining)] private static void Foo(int n) { n *= 2; do { do { n *= 3; } while (n < 5); n -= 10; } while (n != 0); } ``` Old loop finding finds 2 loops but new loop finding finds 1 loop. The differences in loop identification lead to the diffs. Instead of trying to proactively handle them, we are going to see if the perf lab finds any examples that we can gain insight on. One avenue to explore may be to use PGO information to figure out what may be nested loops vs what may just be backedges.
Delete the old loop canonicalization. Implement canonicalization on new loops; the only canonicalization done is creation of preheaders.
Diffs expected; in particular due to ambiguous cases where the old loop finding identifies more loops than the new loop finding. The old loop finding would then canonicalize these extra loops in such a way that new loop finding also found them. That is, the old pipeline:
would result in more loops found than
An example of a program where it is ambiguous how many loops it contains based only on the flow graph is the following:
Old loop finding finds 4 loops in this program, while new loop finding finds 1 loop. The backedges (
continuestatements) can either be seen as nested loop backedges or as backedges to the outer loop). Here is another, opposite case:Old loop finding finds 2 loops but new loop finding finds 1 loop.
The differences in loop identification leads to the diffs. Instead of trying to proactively handle them, we are going to see if the perf lab finds any examples that we can gain insight on. One avenue to explore may be to use PGO information to figure out what may be nested loops vs what may just be backedges.
I validated that diffs are coming from this by writing a small program to go through all contexts < 50 KiB with diffs in libraries_tests.run and verify that the baseline found more loops than the diff:
Nested inner loop backedges is what old loop canonicalization calls the situation where it creates a new "top" block in the above case, that then would cause new loop finding to also recognize the new loop.
The remaining case 26 cases seemed to be cases where new loop canonicalization did not need to create a preheader, but old loop canonicalization did, because of old loop recognition identifying a smaller nested inner loop that did not contain one of the backedges. Example: old, new,
BB05only needs a preheader in "old".