Allow larger than 0x1000 bytes allocation in genStackPointerConstantAdjustment() #47862

echesakov · 2021-02-04T18:34:19Z

Root Cause: The issue #42023 is due to inability to allocate a local heap (localloc) using decrements of sp with immediate offset larger than 0x1000 bytes - this happens if OsPageSize is greater than 0x1000.

I originally suspected the issue was due to prolog allocation/stack probing as I mentioned in #42023 (comment) but after locally reproducing and analyzing the failure I identified the issue to be not related to prolog execution.

Solution: Instead of directly encoding the decrement value in sub sp, sp, #imm instruction immediate the JIT would instead use a temporary register tmpReg to store the value with mov tmpReg, #imm and use sub sp, sp, tmpReg to do the allocation.

Validation: The testing is the most challenging part of the change - given the fact that we don't have a machine with non-4KB page size.
Hence, I try to simulate the larger page size making VM to report 0x10000 as OsPageSize with a hope that a failure would occur earlier during the JIT/AOT compilation test and not during the tests running time (due to not illegal memory access below stack guard page). In fact, I believe that the test run on linux-arm64 should succeed even with the increased OsPageSize value since on Linux the default value of stack_guard_gap= is 256 pages according to Linux kernel documentation.

… size in src/coreclr/vm/jitinterface.cpp

…2 and Arm64 The temporary register is going to be used when sub sp,sp,#spDelta can not be encoded. For example, this happens when spDelta corresponds to OsPageSize and the OsPageSize is larger than 0x1000 bytes. The following code needs to be generated in such cases mov regTmp,#spDelta sub sp,sp,regTmp

echesakov · 2021-02-04T18:40:35Z

I did a smoke test yesterday with the increased reported page size on top of current master and the following are the links to the test runs:

The test originally failing in #42023 was failing in those test runs as well.

…ister to encode amount of sp adjustment in src/coreclr/jit/codegenarm64.cpp

echesakov · 2021-02-05T18:08:48Z

@dotnet/jit-contrib The change is ready to be reviewed, please take a look.

Since the CI testing was non-functioning yesterday I manually validated the change by running superpmi on libraries and tests collections with win_arm64_x64 clrjit. In order to simulate larger page size I made eeGetPageSize() returning 0x10000 (since the VM changes would not affect superpmi replay). I confirmed that without the change there are two assertions in the JIT related to un-encodable immediate value (one is it seen in #42023) and there are no assertions with the change.

I think it might be beneficial to add a complus variable (e.g. COMPlus_ForcePageSize) that would be active during altjits and force the JIT to use a different page size than reported by VM. Having such variable and a corresponding test scenario would've allowed to catch an issue like #42023 earlier in product cycle.

sandreenko

LGTM, thank you.

I think it might be beneficial to add a complus variable (e.g. COMPlus_ForcePageSize) that would be active during altjits and force the JIT to use a different page size than reported by VM.

Sounds like a rare scenario to me and I don't think we have an appropriate pipeline for such testing, because if I understand correctly we can't run code generated with a bigger page size?

src/coreclr/jit/codegenarmarch.cpp

echesakov · 2021-02-08T17:59:12Z

Sounds like a rare scenario to me and I don't think we have an appropriate pipeline for such testing, because if I understand correctly we can't run code generated with a bigger page size?

You are right, we don't run altjit pipelines anymore. But we could try running with an increased page size on Linux anyway. And if we had M1 machines in CI those, presumably, would run with 16KB page size.

Given the recent developments with the SuperPMI collections pipelines I wonder if it's worth adding a pipeline replaying the collections with different altjits that would allow not to only validate the SuperPMI replay function/harness but also validate altjit infrastructure.

…StackPointerConstantAdjustment in src/coreclr/jit/codegenarmarch.cpp

sandreenko · 2021-02-09T01:39:59Z

arm64 mac tests passed (3 failures are expected):

Total tests run    : 10383
Total passing tests: 10380
Total failed tests : 3
Total skipped tests: 0

echesakov · 2021-02-09T03:04:26Z

@sandreenko Thank you for confirming!

sdmaclea · 2021-02-09T17:09:27Z

I am not convinced this is the right fix. Why aren't we just hard coding the probing size to 4K for local alloc. How will we handle when crossgen occurs on a 64K page machine and generates code to run on a 4K page machine.

Is the extra complexity and register usage really worth it?

echesakov · 2021-02-09T18:18:00Z

How will we handle when crossgen occurs on a 64K page machine and generates code to run on a 4K page machine.

The crossgen will always use the hardcoded 4K page - see the changes in src/coreclr/vm/jitinterface.cpp

Is the extra complexity and register usage really worth it?

The temporary register usage would happen at most once. On a 4K page machine the register is never going to be used. On a 16K or 64K page machine allocation of a local heap would be done in a page size decrements (that size can encoded by sub) until the remaining space is smaller than a page. Now, if the remaining space is larger than allowed range of sub instruction immediate then this is when the register is going to be used.

echesakov added 2 commits February 3, 2021 11:47

In AOT scenarios the VM reports to the JIT the minimal supported page…

6c4a8bd

… size in src/coreclr/vm/jitinterface.cpp

ghost added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Feb 4, 2021

echesakov added the arch-arm64 label Feb 4, 2021

dotnet deleted a comment from azure-pipelines bot Feb 4, 2021

echesakov added the NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons) label Feb 4, 2021

dotnet deleted a comment from azure-pipelines bot Feb 4, 2021

Use genInstrWithConstant(INS_sub) and rsGetRsvdReg() as temporary reg…

d0fdc3e

…ister to encode amount of sp adjustment in src/coreclr/jit/codegenarm64.cpp

echesakov force-pushed the Runtime_42023 branch from 837d458 to feffd99 Compare February 5, 2021 03:25

dotnet deleted a comment from azure-pipelines bot Feb 5, 2021

echesakov closed this Feb 5, 2021

echesakov reopened this Feb 5, 2021

echesakov marked this pull request as ready for review February 5, 2021 17:52

echesakov force-pushed the Runtime_42023 branch from feffd99 to d0fdc3e Compare February 5, 2021 23:37

dotnet deleted a comment from azure-pipelines bot Feb 5, 2021

echesakov removed the NO-MERGE The PR is not ready for merge yet (see discussion for detailed reasons) label Feb 5, 2021

dotnet deleted a comment from azure-pipelines bot Feb 5, 2021

sandreenko self-requested a review February 6, 2021 00:04

sandreenko approved these changes Feb 7, 2021

View reviewed changes

src/coreclr/jit/codegenarmarch.cpp Outdated Show resolved Hide resolved

BruceForstall approved these changes Feb 8, 2021

View reviewed changes

JulieLeeMSFT linked an issue Feb 8, 2021 that may be closed by this pull request

arm64 skippage6.sh test fails to JIT code when page size > 4KB #42023

Closed

JulieLeeMSFT added this to the 6.0.0 milestone Feb 8, 2021

JulieLeeMSFT assigned echesakov Feb 8, 2021

Update comment to reflect that REG_NA can be passed for regTmp in gen…

b661ac0

…StackPointerConstantAdjustment in src/coreclr/jit/codegenarmarch.cpp

echesakov merged commit bf9f899 into dotnet:master Feb 9, 2021

echesakov deleted the Runtime_42023 branch February 9, 2021 03:04

echesakov mentioned this pull request Feb 9, 2021

arm64 skippage6.sh test fails to JIT code when page size > 4KB #42023

Closed

ghost locked as resolved and limited conversation to collaborators Mar 11, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Allow larger than 0x1000 bytes allocation in genStackPointerConstantAdjustment() #47862

Allow larger than 0x1000 bytes allocation in genStackPointerConstantAdjustment() #47862

Uh oh!

echesakov commented Feb 4, 2021 •

edited

Loading

Uh oh!

echesakov commented Feb 4, 2021 •

edited

Loading

Uh oh!

echesakov commented Feb 5, 2021 •

edited

Loading

Uh oh!

sandreenko left a comment

Uh oh!

Uh oh!

echesakov commented Feb 8, 2021

Uh oh!

sandreenko commented Feb 9, 2021

Uh oh!

echesakov commented Feb 9, 2021

Uh oh!

sdmaclea commented Feb 9, 2021 •

edited

Loading

Uh oh!

echesakov commented Feb 9, 2021 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

Allow larger than 0x1000 bytes allocation in genStackPointerConstantAdjustment() #47862

Allow larger than 0x1000 bytes allocation in genStackPointerConstantAdjustment() #47862

Uh oh!

Conversation

echesakov commented Feb 4, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

echesakov commented Feb 4, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

echesakov commented Feb 5, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

sandreenko left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

echesakov commented Feb 8, 2021

Uh oh!

sandreenko commented Feb 9, 2021

Uh oh!

echesakov commented Feb 9, 2021

Uh oh!

sdmaclea commented Feb 9, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

echesakov commented Feb 9, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

echesakov commented Feb 4, 2021 •

edited

Loading

echesakov commented Feb 4, 2021 •

edited

Loading

echesakov commented Feb 5, 2021 •

edited

Loading

sdmaclea commented Feb 9, 2021 •

edited

Loading

echesakov commented Feb 9, 2021 •

edited

Loading