JIT: Create preheaders on new loops and delete old loop canonicalization #96647

jakobbotsch · 2024-01-08T23:05:06Z

Delete the old loop canonicalization. Implement canonicalization on new loops; the only canonicalization done is creation of preheaders.

Diffs expected; in particular due to ambiguous cases where the old loop finding identifies more loops than the new loop finding. The old loop finding would then canonicalize these extra loops in such a way that new loop finding also found them. That is, the old pipeline:

old loop finding -> old loop canonicalization -> new loop finding

would result in more loops found than

old loop finding -> new loop finding -> new loop canonicalization

An example of a program where it is ambiguous how many loops it contains based only on the flow graph is the following:

[MethodImpl(MethodImplOptions.NoInlining)]
private static void Foo(int n)
{
    n *= 2;

    do
    {
        if (n == 3)
        {
            n -= 2;
            continue;
        }

        if (n == 4)
        {
            n -= 3;
            continue;
        }

        if (n == 5)
        {
            n -= 4;
            continue;
        }

        n--;
    } while (n > 0);
}

Old loop finding finds 4 loops in this program, while new loop finding finds 1 loop. The backedges (continue statements) can be seen either as nested loop backedges or as backedges to the outer loop. Here is another, opposite case:

[MethodImpl(MethodImplOptions.NoInlining)]
private static void Foo(int n)
{
    n *= 2;
    do
    {
        do
        {
            n *= 3;
        } while (n < 5);

        n -= 10;
    } while (n != 0);
}

Old loop finding finds 2 loops but new loop finding finds 1 loop.
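
To make the "1 loop" result concrete, here is a minimal sketch of the natural-loop view. It is not the JIT's implementation. It assumes hand-built flow graphs for the two `Foo` examples (the block numbers and the exact edge layout are assumptions, in particular that each `continue` ends up as its own backedge to the loop header), and it keys a natural loop by its header, so backedges that share a header collapse into one loop. The names `FlowGraph`, `findBackEdges`, `countNaturalLoops`, `makeContinueExample` and `makeNestedExample` are hypothetical.

```cpp
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// Hypothetical, simplified flow graph: succs[b] lists the successors of block b.
struct FlowGraph
{
    std::vector<std::vector<int>> succs;
};

using Edge = std::pair<int, int>; // (source block, target block)

// Depth-first walk that records every edge whose target is still on the DFS
// stack. For reducible flow graphs these are the natural loop backedges.
static void visit(const FlowGraph& g, int block, std::vector<int>& state, std::vector<Edge>& backEdges)
{
    state[block] = 1; // on the DFS stack
    for (int succ : g.succs[block])
    {
        if (state[succ] == 1)
        {
            backEdges.push_back({block, succ});
        }
        else if (state[succ] == 0)
        {
            visit(g, succ, state, backEdges);
        }
    }
    state[block] = 2; // finished
}

std::vector<Edge> findBackEdges(const FlowGraph& g, int entry)
{
    std::vector<int>  state(g.succs.size(), 0);
    std::vector<Edge> backEdges;
    visit(g, entry, state, backEdges);
    return backEdges;
}

// A natural loop is keyed by its header here, so backedges that share a
// header are merged into a single loop.
std::size_t countNaturalLoops(const FlowGraph& g, int entry)
{
    std::set<int> headers;
    for (const Edge& edge : findBackEdges(g, entry))
    {
        headers.insert(edge.second);
    }
    return headers.size();
}

// ASSUMED shape of the first example: block 1 is the loop header and each
// "continue" block (2, 4, 6) plus the fallthrough block (7) branches back to it.
FlowGraph makeContinueExample()
{
    FlowGraph g;
    g.succs = {
        {1},    // 0: n *= 2
        {2, 3}, // 1: n == 3 ? (header)
        {1, 8}, // 2: n -= 2, then back to the header or exit
        {4, 5}, // 3: n == 4 ?
        {1, 8}, // 4: n -= 3, then back to the header or exit
        {6, 7}, // 5: n == 5 ?
        {1, 8}, // 6: n -= 4, then back to the header or exit
        {1, 8}, // 7: n--, then back to the header or exit
        {},     // 8: return
    };
    return g;
}

// ASSUMED shape of the second example: the inner and outer do-while loops
// share block 1 as their header.
FlowGraph makeNestedExample()
{
    FlowGraph g;
    g.succs = {
        {1},    // 0: n *= 2
        {1, 2}, // 1: n *= 3; inner backedge while n < 5
        {1, 3}, // 2: n -= 10; outer backedge while n != 0
        {},     // 3: return
    };
    return g;
}
```

Under these assumptions, `findBackEdges(makeContinueExample(), 0)` yields four backedges and `findBackEdges(makeNestedExample(), 0)` yields two, yet `countNaturalLoops` reports a single loop in both cases because every backedge targets the same header. A loop finder that treats each backedge as its own loop would instead report 4 and 2.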

The differences in loop identification lead to the diffs. Instead of trying to proactively handle them, we are going to see if the perf lab finds any examples that we can gain insight from. One avenue to explore may be to use PGO information to figure out what may be nested loops vs. what may just be backedges.

I validated that diffs are coming from this by writing a small program to go through all contexts < 50 KiB with diffs in libraries_tests.run and verify that the baseline found more loops than the diff:

Processed 554/554 contexts, 554 with new loops, 528 due to canonicalizing nested inner loop backedges

"Nested inner loop backedges" is what old loop canonicalization calls the situation in the above case, where it creates a new "top" block that then causes new loop finding to also recognize the new loop.

The remaining 26 cases seemed to be cases where new loop canonicalization did not need to create a preheader, but old loop canonicalization did, because old loop recognition identified a smaller nested inner loop that did not contain one of the backedges. Example (old, new): BB05 only needs a preheader in "old".
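
As an illustration of what "needing a preheader" means, here is a minimal sketch of preheader creation on a toy flow graph. It is not the JIT's code. It assumes that a loop already has a preheader when exactly one block outside the loop enters the header and that block's only successor is the header; the JIT's actual conditions may differ. The names `Graph`, `Block`, `ensurePreheader`, `makeTwoEntryExample` and `makeConditionalEntryExample` are hypothetical.

```cpp
#include <algorithm>
#include <set>
#include <vector>

// Hypothetical, simplified basic block: only successor edges are tracked.
struct Block
{
    std::vector<int> succs;
};

struct Graph
{
    std::vector<Block> blocks;
};

// Returns the index of a block that acts as the preheader of the loop with
// the given header, creating one if needed.
// ASSUMPTION: an existing block qualifies if it is the only block outside
// the loop that enters the header and the header is its only successor.
int ensurePreheader(Graph& g, int header, const std::set<int>& loopBlocks)
{
    // Collect the blocks outside the loop that branch to the header.
    std::vector<int> entryPreds;
    for (int b = 0; b < static_cast<int>(g.blocks.size()); b++)
    {
        if (loopBlocks.count(b) != 0)
        {
            continue; // edges from inside the loop are backedges, not entries
        }
        const std::vector<int>& succs = g.blocks[b].succs;
        if (std::find(succs.begin(), succs.end(), header) != succs.end())
        {
            entryPreds.push_back(b);
        }
    }

    if ((entryPreds.size() == 1) && (g.blocks[entryPreds[0]].succs.size() == 1))
    {
        return entryPreds[0]; // already a dedicated preheader
    }

    // Create a new block that jumps to the header and redirect all entry
    // edges to it. Backedges are left untouched.
    int   preheader = static_cast<int>(g.blocks.size());
    Block newBlock;
    newBlock.succs.push_back(header);
    g.blocks.push_back(newBlock);

    for (int pred : entryPreds)
    {
        std::vector<int>& succs = g.blocks[pred].succs;
        std::replace(succs.begin(), succs.end(), header, preheader);
    }
    return preheader;
}

// Blocks 1 and 2 both enter the self-looping header 3.
Graph makeTwoEntryExample()
{
    Graph g;
    g.blocks = {Block{{1, 2}}, Block{{3}}, Block{{3}}, Block{{3, 4}}, Block{}};
    return g;
}

// Block 0 is the only entry, but it conditionally skips the loop at header 1.
Graph makeConditionalEntryExample()
{
    Graph g;
    g.blocks = {Block{{1, 2}}, Block{{1, 2}}, Block{}};
    return g;
}
```

In the two-entry example, `ensurePreheader(g, 3, {3})` appends a new block and redirects blocks 1 and 2 to it; calling it a second time returns the same block without modifying the graph. In the conditional-entry example a new block is also needed, because the single entry block has a second successor.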

Invalidate the old loop table immediately after old loop finding. Remove `m_oldToNewLoop` and `m_newToOldLoop`. Move `BBF_OLD_LOOP_HEADER_QUIRK` to be set directly by old loop finding.

ghost · 2024-01-08T23:05:16Z

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.


Author:	jakobbotsch
Assignees:	jakobbotsch
Labels:	`area-CodeGen-coreclr`
Milestone:	-


jakobbotsch · 2024-01-09T20:14:31Z

/azp run runtime-coreclr jitstress, runtime-coreclr libraries-jitstress

azure-pipelines · 2024-01-09T20:14:52Z

Azure Pipelines successfully started running 2 pipeline(s).

jakobbotsch · 2024-01-09T22:49:21Z

cc @dotnet/jit-contrib PTAL @BruceForstall

The libraries-jitstress failure is #96715. The jitstress one looks like an infra issue (restore failed on osx-arm64).

Diffs are, as mentioned above, due to differences in loop finding around ambiguous nested inner loop cases.
We will be able to win back the TP regression once we get rid of the old loop compaction code; there is an unnecessary old dominator computation right now because of it.

jakobbotsch · 2024-01-09T22:53:21Z

src/coreclr/jit/optimizer.cpp

-        // recorded in terms of block numbers, so flag it invalid.
-        fgDomsComputed = false;
-        fgRenumberBlocks();
+        fgUpdateChangedFlowGraph(FlowGraphUpdates::COMPUTE_DOMS);


We need to recompute the old dominators here since optFindAndScaleGeneralLoopBlocks currently depends on them, but we should be able to get rid of this computation very soon. The current TP regressions are because of cases where we compute the dominators here and then also again after new canonicalization; in those cases only the latter computation is necessary (but it's a little hard to detect, so I just left it in both places for now).

jakobbotsch · 2024-01-09T22:56:48Z

src/coreclr/jit/optimizer.cpp

+void Compiler::optFindNewLoops()
+{
+    m_loops = FlowGraphNaturalLoops::Find(m_dfsTree);
+
+    if (optCanonicalizeLoops(m_loops))
+    {
+        fgUpdateChangedFlowGraph(FlowGraphUpdates::COMPUTE_DOMS);
+        m_dfsTree = fgComputeDfs();
+        m_loops   = FlowGraphNaturalLoops::Find(m_dfsTree);
+    }
+
+    // Starting now, we require all loops to have pre-headers.
+    optLoopsRequirePreHeaders = true;
+
+    // Leave a bread crumb for future phases like loop alignment about whether
+    // looking for loops makes sense. We generally do not expect phases to
+    // introduce new cycles/loops in the flow graph; if they do, they should
+    // set this to true themselves.
+    // We use more general cycles over "m_loops->NumLoops() > 0" here because
+    // future optimizations can easily cause general cycles to become natural
+    // loops by removing edges.
+    fgMightHaveNaturalLoops = m_dfsTree->HasCycle();
+    assert(fgMightHaveNaturalLoops || (m_loops->NumLoops() == 0));
+}


I don't know why in the diff this shows up as a big addition... I only added the canonicalization part here.

jakobbotsch · 2024-01-09T22:59:18Z

I pushed some minor changes, but this passed jitstress/libraries-jitstress before those.

BruceForstall

LGTM


jakobbotsch added 2 commits January 8, 2024 10:46

JIT: Invalidate old loop table immediately

19269ab

Invalidate the old loop table immediately after old loop finding. Remove `m_oldToNewLoop` and `m_newToOldLoop`. Move `BBF_OLD_LOOP_HEADER_QUIRK` to be set directly by old loop finding.

JIT: Create preheaders on new loops and delete old loop canonicalization

e95b627

ghost assigned jakobbotsch Jan 8, 2024

ghost added the area-CodeGen-coreclr label (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI) Jan 8, 2024

jakobbotsch changed the title ~~Canonicalize new loops~~ JIT: Create preheaders on new loops and delete old loop canonicalization Jan 8, 2024

build-analysis bot mentioned this pull request Jan 9, 2024

Checkout failure: "Git fetch failed with exit code 128" dotnet/arcade#9009

Open

2 tasks

jakobbotsch added 3 commits January 9, 2024 11:42

Merge branch 'main' of github.com:dotnet/runtime into canonicalize-new-loops

0f0fe5d

Reduce diffs

6c7ce50

Fix merge

2e368f2

jakobbotsch marked this pull request as ready for review January 9, 2024 22:37

jakobbotsch requested a review from BruceForstall January 9, 2024 22:49

jakobbotsch commented Jan 9, 2024


Unnest

7839c31

jakobbotsch commented Jan 9, 2024


Add a function header

fa523b9

BruceForstall approved these changes Jan 10, 2024


Update src/coreclr/jit/optimizer.cpp

f953f3a

Co-authored-by: Bruce Forstall <[email protected]>

jakobbotsch merged commit e0ecb3a into dotnet:main Jan 10, 2024

jakobbotsch deleted the canonicalize-new-loops branch January 10, 2024 09:20

build-analysis bot mentioned this pull request Jan 10, 2024

NuGet failing with Response status code does not indicate success: 503 (Service Unavailable) dotnet/arcade#11723

Open

5 tasks

jakobbotsch mentioned this pull request Jan 10, 2024

Improve JIT loop optimizations (.NET 9) #93144

Closed

21 tasks

DrewScoggins mentioned this pull request Jan 16, 2024

[Perf] Linux/x64: 3 Regressions on 1/10/2024 11:23:32 AM dotnet/perf-autofiling-issues#27266

Open

cincuranet mentioned this pull request Jan 18, 2024

[Perf] Windows/arm64: 1 Regression on 1/10/2024 2:32:18 PM dotnet/perf-autofiling-issues#27474

Open

github-actions bot locked and limited conversation to collaborators Feb 10, 2024
