JIT doesn't allow method prologues to have more than one instruction group #104585

The JIT currently restricts the method prologue to a single instruction group (IG). This is unlike funclets or the method epilogue, which can extend across several.

This is normally not problematic. However, there are many scenarios in which the method prologue can exceed the limits of a single group, since a group can only hold a finite number of instructions.

An example of this is the following program:
```csharp
using System.Numerics.Tensors;
using System.Runtime.CompilerServices;

internal class Program
{
    private static void Main(string[] args)
    {
        ReadOnlySpan<ulong> x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16];
        Console.WriteLine(Invoke(x, x));
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    public static ulong Invoke(ReadOnlySpan<ulong> x, ReadOnlySpan<ulong> y)
    {
        return TensorPrimitives.ProductOfDifferences<ulong>(x, y);
    }
}
```

If this is run under a `checked` JIT with `DOTNET_ReadyToRun=0`, `DOTNET_TieredCompilation=0`, and `DOTNET_JitStressRegs=0x80` then it will trigger the following assert: https://github.com/dotnet/runtime/blob/main/src/coreclr/jit/emit.cpp#L9670-L9672
```
    /* Right now we don't allow multi-IG prologs */

    assert(emitCurIG != emitPrologIG);
```

This happens because we set `genUseBlockInit = (genInitStkLclCnt > 4)` and then, in `genZeroInitFrame`, use that to determine whether we zero using SIMD or using what are basically `sizeof(void*)`-sized stores from a native general-purpose register.

That itself seems "bad" from a performance perspective, since it doesn't account for how big these 4 locals are and therefore whether block or scalar zeroing is "better". Independently, though, it means this code path is broken if the total number of store instructions required exceeds the limits of a single IG, as occurs if you have `4x TYP_SIMD64`, for example.
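To make the arithmetic concrete, here is a minimal sketch of why a count-based check says nothing about the number of stores. It is a simplified model, not JIT source: `ScalarZeroStores` and `UseBlockInitByCount` are hypothetical helpers, and it assumes one store per pointer-sized slot with 8-byte slots.

```cpp
#include <cstddef>

// Hypothetical model, not JIT source: number of pointer-sized stores needed to
// zero `localCount` stack locals of `localSize` bytes each, assuming exactly
// one store per slot.
constexpr std::size_t ScalarZeroStores(std::size_t localCount,
                                       std::size_t localSize,
                                       std::size_t slotSize = sizeof(void*))
{
    return localCount * ((localSize + slotSize - 1) / slotSize);
}

// Mirrors the shape of the count-based check quoted above
// (`genInitStkLclCnt > 4`); the size of each local is never consulted.
constexpr bool UseBlockInitByCount(std::size_t localCount)
{
    return localCount > 4;
}

// Four locals take the scalar path regardless of how large they are...
static_assert(!UseBlockInitByCount(4), "4 locals do not trigger block init");
// ...but the number of stores differs by 8x (assuming 8-byte slots).
static_assert(ScalarZeroStores(4, 8, 8) == 4, "4 pointer-sized locals");
static_assert(ScalarZeroStores(4, 64, 8) == 32, "4 locals of 64 bytes each");
```

Under these assumptions, four 64-byte locals already need 32 zeroing stores on top of whatever else the prologue emits.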

The JIT needs to be updated to support prologues that extend past one group, so that we are robust when a prologue needs more than `EMIT_MAX_IG_INS_COUNT` instructions. The effective limit can be lower in practice for large `instrDesc` instructions; in the failure above we hit the limit at 61 instructions out of the maximum 256.
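The note above about large `instrDesc` instructions can be illustrated with a toy capacity model. This is only a sketch under assumptions: it treats a group as having both a maximum instruction count and a fixed-size descriptor buffer, and the buffer size and descriptor sizes used here are made-up numbers, not values from the emitter.

```cpp
#include <cstddef>

// Hypothetical model: a group is full when either the instruction count limit
// or the descriptor buffer is exhausted. `maxInsCount` stands in for
// EMIT_MAX_IG_INS_COUNT; `bufferBytes` and `descBytes` are illustrative only.
// Assumes descBytes > 0.
constexpr std::size_t EffectiveCapacity(std::size_t maxInsCount,
                                        std::size_t bufferBytes,
                                        std::size_t descBytes)
{
    return (bufferBytes / descBytes < maxInsCount) ? bufferBytes / descBytes
                                                   : maxInsCount;
}

// Number of groups needed to hold `insCount` instructions of uniform size.
constexpr std::size_t GroupsNeeded(std::size_t maxInsCount,
                                   std::size_t bufferBytes,
                                   std::size_t descBytes,
                                   std::size_t insCount)
{
    return (insCount + EffectiveCapacity(maxInsCount, bufferBytes, descBytes) - 1) /
           EffectiveCapacity(maxInsCount, bufferBytes, descBytes);
}

// Small descriptors: the count limit (256) is what binds.
static_assert(EffectiveCapacity(256, 1024, 4) == 256, "count-limited");
// Large descriptors: the buffer fills long before 256 instructions.
static_assert(EffectiveCapacity(256, 1024, 16) == 64, "buffer-limited");
// 100 large instructions no longer fit in one group in this model.
static_assert(GroupsNeeded(256, 1024, 16, 100) == 2, "spills into a second group");
```

In this model, a prologue that would fit comfortably under the count limit still needs a second group once its descriptors are large enough, which is the situation the single-IG restriction cannot handle.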

Additionally, it would probably be beneficial to have zeroing pick the optimal strategy based on the number of bytes that need to be zeroed rather than the number of locals.
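A byte-based choice could look roughly like the sketch below. Everything in it is hypothetical: `ZeroStrategy`, `ChooseZeroStrategy`, and the 64-byte threshold are placeholders for illustration, and a real threshold would need to be tuned per target.

```cpp
#include <cstddef>

// Hypothetical names; not taken from the JIT.
enum class ZeroStrategy
{
    ScalarStores, // pointer-sized stores from a general-purpose register
    BlockInit     // SIMD / block zeroing
};

// Placeholder threshold chosen only for illustration.
constexpr std::size_t kBlockInitThresholdBytes = 64;

// Decide on the total number of bytes to zero, not on the number of locals.
constexpr ZeroStrategy ChooseZeroStrategy(std::size_t bytesToZero)
{
    return (bytesToZero > kBlockInitThresholdBytes) ? ZeroStrategy::BlockInit
                                                    : ZeroStrategy::ScalarStores;
}

// Four pointer-sized locals (32 bytes) stay on the scalar path...
static_assert(ChooseZeroStrategy(4 * 8) == ZeroStrategy::ScalarStores, "small frame");
// ...while four 64-byte locals (256 bytes) switch to block init.
static_assert(ChooseZeroStrategy(4 * 64) == ZeroStrategy::BlockInit, "large frame");
```

With a rule of this shape, the `4x TYP_SIMD64` case would be classified by its 256 bytes rather than by its count of 4.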
