
JIT: Enable EH Write Thru by default #35923

@CarolEidt


Today, all variables that are live across exception handling (EH) regions are kept on the stack. This proves expensive for hot methods containing EH code. It is also expensive for certain language constructs such as async, which calls into .NET libraries code that has EH. In the example below, even though the method Calculate doesn't have any EH regions itself, the presence of async introduces an EH region.

public class AsyncTest
{
    private static AsyncTest obj = new AsyncTest();

    [MethodImpl(MethodImplOptions.NoInlining)]
    public static async Task<int> Calculate()
    {
        return await obj.GetResult();
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    private async Task<int> GetResult()
    {
        Console.WriteLine("Calculating...");
        await Task.Delay(1 * 1000);
        return 1;
    }
}

; Assembly listing for method MiniBench.AsyncTest:Calculate():System.Threading.Tasks.Task`1[Int32]
; Emitting BLENDED_CODE for X64 CPU with AVX - Windows

; Lcl frame size = 56

G_M26483_IG01:              ;; offset=0000H
; ...
; ...
G_M26483_IG02:              ;; offset=0015H
       33C9                 xor      rcx, rcx
       48894C2428           mov      gword ptr [rsp+28H], rcx
       C7442420FFFFFFFF     mov      dword ptr [rsp+20H], -1
       488D4C2420           lea      rcx, bword ptr [rsp+20H]
       E842FFFFFF           call     System.Runtime.CompilerServices.AsyncMethodBuilderCore:Start(byref)
       488B442428           mov      rax, gword ptr [rsp+28H]
       4885C0               test     rax, rax
; ...

AsyncMethodBuilderCore:Start contains EH code, and the performance of such methods can degrade because the EH vars are accessed from the stack. We added a workaround specifically inside AsyncMethodBuilderCore:Start() in dotnet/coreclr#15629 to make sure that the EH vars get registers.

We have an implementation for enregistering EH vars, but the analysis in #35534 showed mixed results for enabling EH Write Thru by default. This issue tracks enabling EH Write Thru by default. The expected outcome is that, in addition to storing EH variable values on the stack, we also keep them in a register and use the register (as much as possible) at the uses in non-EH code. Additionally, a register should also be assigned for code that is entirely contained inside an EH region.
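To make "live across EH" concrete, here is a minimal sketch with an explicit try/catch. The names EhLiveExample and ScaledDivide are hypothetical and do not come from the runtime repo. The local `scale` has a single definition and is read both in the try body and in the handler, so it is the kind of EH var discussed here. Whether the JIT actually keeps it in a register depends on the JIT version and configuration.

```csharp
using System;

public static class EhLiveExample
{
    // Hypothetical example: 'scale' is defined once, then used in the try body
    // and in the catch handler, so it is live across the EH boundary.
    public static int ScaledDivide(int value, int divisor)
    {
        int scale = value * 3;          // single definition

        try
        {
            return scale / divisor;     // use in non-handler code
        }
        catch (DivideByZeroException)
        {
            return scale;               // use inside the handler
        }
    }
}
```

With write-thru, the single definition would store `scale` to its stack home once, and the use in the try body could read it from a register instead of reloading it from the stack.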

There are at least a couple of issues, though further analysis should be done to validate that these are the major contributing factors to performance regressions:

  • In situations where there are multiple definitions in a Try clause, each of those will do a store, and may also define a register - requiring an additional move if the value could have been directly defined to memory.
  • In cases of high register pressure, even though the EH vars are considered to have lower spill cost (since they never actually have to be stored, only reloaded at uses), the increase in register pressure due to the EH vars can expose (known) weaknesses in the register allocator's handling of code with high register-pressure.
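The first bullet can be illustrated with a sketch like the following, where one local is redefined repeatedly inside a try. EhMultiDefExample and SumOfQuotients are hypothetical names chosen for illustration only.

```csharp
using System;

public static class EhMultiDefExample
{
    // Hypothetical example: 'sum' is redefined on every loop iteration inside
    // the try and is also read in the handler, so each definition in the try
    // must be visible on the stack if an exception is thrown.
    public static int SumOfQuotients(int[] divisors)
    {
        int sum = 0;                    // definition outside the try

        try
        {
            foreach (int d in divisors)
            {
                sum += 100 / d;         // definition inside the try (throws when d == 0)
            }
        }
        catch (DivideByZeroException)
        {
            return sum;                 // handler reads the partial sum
        }

        return sum;
    }
}
```

Here every update of `sum` in the loop is a definition inside the try, so under write-thru each one implies a store to the stack in addition to defining a register, which is where the extra cost described above would come from.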

One option would be to track the number/ratio of defs and uses. Currently we don't readily have that information until we build Intervals, and at that point it is difficult for the register allocator to change its mind about making something a candidate. That said, it could presumably decide that some of the candidates should never get a register, effectively making it the same as if it were not a candidate, though that would require some tweaking to avoid actually allocating a register when not needed (it does better with RegOptional uses than defs).

It's possible that this tracking could be done prior to register allocation at the time of ref counting.
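As a rough illustration of what a defs/uses check could look like, here is a sketch written in C# for readability (the JIT itself is not written in C#). Everything in it is an assumption: the name PreferWriteThru, the unweighted counts, and the 0.5 threshold are hypothetical and are not the JIT's actual heuristic.

```csharp
using System;

public static class EhWriteThruHeuristicSketch
{
    // Hypothetical heuristic: decide whether an EH var looks like a good
    // write-thru candidate from its (unweighted) def and use counts.
    public static bool PreferWriteThru(int defCount, int useCount, double maxDefsPerUse = 0.5)
    {
        if (defCount < 0 || useCount < 0)
        {
            throw new ArgumentOutOfRangeException();
        }

        if (useCount == 0)
        {
            return false;               // no uses: a register copy buys nothing
        }

        if (defCount <= 1)
        {
            return true;                // single def: at most one store to the stack
        }

        // Many defs relative to uses means many stores for few register reads.
        return (double)defCount / useCount <= maxDefsPerUse;
    }
}
```

For example, under these assumed settings a local with 2 defs and 8 uses would be accepted, while one with 4 defs and 2 uses would be rejected.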

For now, we will take an incremental approach:

Future work

  • Enable EH write thru for local vars having multiple defs. This has the following dependencies:
    • Tune register allocation heuristics to prefer spilling lower-weight values rather than assigning a bad free register. (Captured in [LSRA][RyuJIT] Tune register selection heuristics #43318)
    • Further, explore whether the EH write thru decision can depend on the ratio of defs/uses and, if so, include that heuristic.

Related issue:

Workaround added so far: dotnet/coreclr#15629

category:cq
theme:register-allocator
skill-level:expert
cost:medium

Labels: Bottom Up Work, User Story, area-CodeGen-coreclr
