Skip to content
This repository was archived by the owner on Jan 23, 2023. It is now read-only.

Do not fast tail call if caller has multislot structs#25885

Merged
jashook merged 5 commits intodotnet:masterfrom
jashook:mitigate_gh_25884
Jul 26, 2019
Merged

Do not fast tail call if caller has multislot structs#25885
jashook merged 5 commits intodotnet:masterfrom
jashook:mitigate_gh_25884

Conversation

@jashook
Copy link

@jashook jashook commented Jul 25, 2019

With more eyes on LowerFastTailCall, @jakobbotsch and I have found another SBC bug. We have verified that this causes silent bad codegen on <= coreclr 3.0. Thank you @jakobbotsch for help on a functioning repro.

@jashook
Copy link
Author

jashook commented Jul 25, 2019

Note this effects unix x64 only.

@jashook
Copy link
Author

jashook commented Jul 25, 2019

/cc @RussKeldorph

Note I discovered that this change should be in #20643 and originally added it there. Then realized this is most likely a bug going back. So this change has been tested already. It is just waiting on reviews. Will hoist it out of #20643.

@jashook
Copy link
Author

jashook commented Jul 25, 2019

Fixes #25884

jashook pushed a commit to jashook/coreclr that referenced this pull request Jul 25, 2019
@jashook
Copy link
Author

jashook commented Jul 25, 2019

/cc @dotnet/jit-contrib

@jashook jashook requested a review from AndyAyersMS July 25, 2019 18:22
Copy link
Member

@AndyAyersMS AndyAyersMS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we see any diffs from this?

@jashook
Copy link
Author

jashook commented Jul 25, 2019

I would argue that this change does not need to run diffs. compHasMultiSlotArgs and varDscInfo->hasMultiSlotStruct are only used in fastTailCall decisions, specifically on unix. So any diff would be a new correct decision not to fast tail call. Regardless we can be almost entirely confident that this fastTailCall decision is not currently being made in our test suite.

That being said, this should be a very quick run and a very easy zero diff example. Will report back.

@AndyAyersMS
Copy link
Member

I would argue that this change does not need to run diffs.

Running diffs should be cheap enough now that we should always do it as a matter of practice.

I am curious about a few things:

  • If we think we should back port this fix into 2.x, we should check if this bug causes bad code in our frameworks, and if there is, why the testing we already do hasn't uncovered this.
  • We might realize this check is overly conservative and blocks tails calls that are perfectly safe. If so we should add a work item to make sure those cases get handled once we've fixed the analysis to understand these types of args.

Copy link

@CarolEidt CarolEidt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with a couple of comment changes (mentioned offline)

@jashook
Copy link
Author

jashook commented Jul 26, 2019

@AndyAyersMS there are diffs with this PR, they effect the SIMD tests. Unfortunately superpmi has a bug in exception handling so the actually diffs from jit-analyze are not interesting. I will investigate further on Monday.

On manual inspection we now will not fast tail call for SIMD types more often. This happens infrequently, but will be fixed with the work @jakobbotsch is doing in LowerFastTail call. In the meantime, it is more important to be correct then continue to fast tail call.

@jashook
Copy link
Author

jashook commented Jul 26, 2019

Found 872 files with textual diffs.

Summary:
(Lower is better)

Total bytes of diff: -142162 (-98.893% of base)
    diff is an improvement.

Top file improvements by size (bytes):
        -231 : 218124.dasm (-100.000% of base)
        -231 : 218190.dasm (-100.000% of base)
        -231 : 218256.dasm (-100.000% of base)
        -231 : 131457.dasm (-100.000% of base)
        -231 : 131525.dasm (-100.000% of base)

872 total files with size differences (872 improved, 0 regressed), 9 unchanged.

Top method improvements by size (bytes):
        -231 (-100.000% of base) : 218124.dasm - JIT.HardwareIntrinsics.X86.SimpleTernaryOpTest__MultiplyAddSingle:ValidateResult(struct,struct,struct,long,ref):this
        -231 (-100.000% of base) : 218190.dasm - JIT.HardwareIntrinsics.X86.SimpleTernaryOpTest__MultiplyAddNegatedSingle:ValidateResult(struct,struct,struct,long,ref):this
        -231 (-100.000% of base) : 218256.dasm - JIT.HardwareIntrinsics.X86.AlternatingTernaryOpTest__MultiplyAddSubtractSingle:ValidateResult(struct,struct,struct,long,ref):this
        -231 (-100.000% of base) : 131457.dasm - JIT.HardwareIntrinsics.X86.SimpleTernaryOpTest__BlendVariableInt16:ValidateResult(struct,struct,struct,long,ref):this
        -231 (-100.000% of base) : 131525.dasm - JIT.HardwareIntrinsics.X86.SimpleTernaryOpTest__BlendVariableInt32:ValidateResult(struct,struct,struct,long,ref):this

Top method improvements by size (percentage):
        -217 (-100.000% of base) : 197444.dasm - JIT.HardwareIntrinsics.X86.AlternatingTernaryOpTest__MultiplyAddSubtractSingle:ValidateResult(struct,struct,struct,long,ref):this
        -176 (-100.000% of base) : 116303.dasm - JIT.HardwareIntrinsics.X86.SimpleBinaryOpTest__AndByte:ValidateResult(struct,struct,long,ref):this
        -176 (-100.000% of base) : 11990.dasm - JIT.HardwareIntrinsics.X86.SimpleBinaryOpTest__XorSingle:ValidateResult(struct,struct,long,ref):this
        -176 (-100.000% of base) : 117007.dasm - JIT.HardwareIntrinsics.X86.SimpleBinaryOpTest__CompareEqualInt32:ValidateResult(struct,struct,long,ref):this
        -127 (-100.000% of base) : 118521.dasm - JIT.HardwareIntrinsics.X86.SimpleUnaryOpConvTest__ConvertToVector128DoubleVector128Single:ValidateResult(struct,long,ref):this

872 total methods with size differences (872 improved, 0 regressed), 9 unchanged.

Diffs, note the percentages are off due to an issue with PAL_TRY exception filtering. The is run over a superpmi diff of a superpmi collection of the test assets.

@jashook
Copy link
Author

jashook commented Jul 26, 2019

Change has been previously validated. Merging.

@jashook jashook merged commit 059604d into dotnet:master Jul 26, 2019
@jashook jashook deleted the mitigate_gh_25884 branch July 26, 2019 23:30
jashook pushed a commit to jashook/coreclr that referenced this pull request Jul 26, 2019
* Do not fast tail call if caller has multislot structs

* apply patch

* Address offline feedback
jashook pushed a commit to jashook/coreclr that referenced this pull request Jul 26, 2019
* Do not fast tail call if caller has multislot structs

* apply patch

* Address offline feedback
jashook pushed a commit to jashook/coreclr that referenced this pull request Jul 27, 2019
jashook pushed a commit that referenced this pull request Jul 28, 2019
* Do not fast tail call if caller has multislot structs

* apply patch

* Address offline feedback
jashook pushed a commit that referenced this pull request Jul 29, 2019
* Use fgInitArgInfo in fgCanFastTailCall

This leverages the work done in #19658 to use fgInitArgInfo in fgCanFastTailCall. This removes much of the duplicate ABI code that was required before #19658. In addition, the change more clearly explains the issues with LowerFastTailCall and more clearly bails on when it can and cannot fastTailCall.

Note:

Part of the old logic would report no to a fastTailCall on Windows AMD64, if the struct was a power of two (engregisterable). This has been removed; however, it is unclear currently whether LowerFastTailCall/Codgen will support the change.

* Address pr feedback for FastTailCallCandidates

* Update variable name based on rebase

* Fix comment headers based on feedback

* More feedback addressed

* Fix rebase error

* Address feedback and correctly bail on byref args

* Restrict to only bail on byref struct args

* Correctly used computations from fgArgInfo

* Correctly re-init arg infor for explicit tailcalls

This is required because morphTailCall modifies the argument list.

It would be nice in the future to go through each argument on subsequent calls to fgInitArgInfo to verify there have been no changes.

* Fix x86 assert

* Address PR feedback related to recomputing initArgs

* Fix comment

* Update fgCanFastTailCall header

* Address PR feedback

* Apply format patch

* Correctly reset arginfo after morphtailcall

* Commit does several small things, and fixes a bug

Addresses PR feedback
Removes dead code
Changes to bail a fast tail call when the caller has a multi slot struct

* Fix windows build

* Address pr feedback

* Fix arm and x86 builds

* apply format patch and remove #25885

* Rebase on master

* Apply format patch
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
…#25885)

* Do not fast tail call if caller has multislot structs

* apply patch

* Address offline feedback


Commit migrated from dotnet/coreclr@059604d
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
* Use fgInitArgInfo in fgCanFastTailCall

This leverages the work done in dotnet/coreclr#19658 to use fgInitArgInfo in fgCanFastTailCall. This removes much of the duplicate ABI code that was required before dotnet/coreclr#19658. In addition, the change more clearly explains the issues with LowerFastTailCall and more clearly bails on when it can and cannot fastTailCall.

Note:

Part of the old logic would report no to a fastTailCall on Windows AMD64, if the struct was a power of two (engregisterable). This has been removed; however, it is unclear currently whether LowerFastTailCall/Codgen will support the change.

* Address pr feedback for FastTailCallCandidates

* Update variable name based on rebase

* Fix comment headers based on feedback

* More feedback addressed

* Fix rebase error

* Address feedback and correctly bail on byref args

* Restrict to only bail on byref struct args

* Correctly used computations from fgArgInfo

* Correctly re-init arg infor for explicit tailcalls

This is required because morphTailCall modifies the argument list.

It would be nice in the future to go through each argument on subsequent calls to fgInitArgInfo to verify there have been no changes.

* Fix x86 assert

* Address PR feedback related to recomputing initArgs

* Fix comment

* Update fgCanFastTailCall header

* Address PR feedback

* Apply format patch

* Correctly reset arginfo after morphtailcall

* Commit does several small things, and fixes a bug

Addresses PR feedback
Removes dead code
Changes to bail a fast tail call when the caller has a multi slot struct

* Fix windows build

* Address pr feedback

* Fix arm and x86 builds

* apply format patch and remove dotnet/coreclr#25885

* Rebase on master

* Apply format patch


Commit migrated from dotnet/coreclr@f8f0c55
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants