[arm64] These benchmarks are faster on Rosetta-x64 than on native arm64 #60616

@EgorBo

The following simple benchmarks run faster under Rosetta (x64 emulation) than natively on arm64, on the same machine (an Apple M1 Mac mini). I think that is a clear sign that something can be improved on the arm64 side:

using System;
using System.Linq;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

public enum MyEnum
{
    A, B, C, D, E, F
}

public class Program
{
    // Lets BenchmarkDotNet pick the benchmarks to run from the command-line arguments.
    static void Main(string[] args) => BenchmarkSwitcher.FromAssembly(typeof(Program).Assembly).Run(args);

    // Formats the last enum member as a string.
    [Benchmark]
    [Arguments(MyEnum.F)]
    public string EnumToString(MyEnum e) => e.ToString();

    // Allocates a 10000-element char[] without zero-initializing it.
    [Benchmark]
    [Arguments(10000)]
    public char[] AllocateUninit(int len) => GC.AllocateUninitializedArray<char>(len);

    private static readonly int[] _array = Enumerable.Range(1, 1000).ToArray();

    // Reverses the shared 1000-element array in place on every invocation.
    [Benchmark]
    public void ArrayReverse() => Array.Reverse(_array);
}

Rosetta (.NET 6.0 rc2 osx-x64):

|          Method |       Mean |     Error |    StdDev |
|---------------- |-----------:|----------:|----------:|
|    ArrayReverse | 176.320 ns | 2.4289 ns | 2.2720 ns |
|  AllocateUninit | 343.028 ns | 1.0766 ns | 0.8405 ns |
|    EnumToString |  23.064 ns | 0.0278 ns | 0.0247 ns |

Native (.NET 6.0 rc2 osx-arm64):

|          Method |       Mean |     Error |    StdDev |
|---------------- |-----------:|----------:|----------:|
|    ArrayReverse | 201.936 ns | 0.0345 ns | 0.0323 ns |
|  AllocateUninit | 524.582 ns | 1.5993 ns | 1.4960 ns |
|    EnumToString |  44.961 ns | 0.1068 ns | 0.0999 ns |
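
To make the gap easier to read, the mean times from the two tables can be turned into native/Rosetta ratios. The snippet below is only an illustrative sketch: the `Slowdown` helper and the `results` array are names made up here, and the numbers are copied from the tables above. It works out to roughly 1.15x for ArrayReverse, 1.53x for AllocateUninit and 1.95x for EnumToString.

```csharp
using System;

// Mean times in nanoseconds, copied from the Rosetta and native tables above.
var results = new (string Method, double RosettaNs, double NativeNs)[]
{
    ("ArrayReverse", 176.320, 201.936),
    ("AllocateUninit", 343.028, 524.582),
    ("EnumToString", 23.064, 44.961),
};

// Hypothetical helper: a ratio above 1 means native arm64 is slower than Rosetta x64.
double Slowdown(double rosettaNs, double nativeNs) => nativeNs / rosettaNs;

foreach (var r in results)
{
    Console.WriteLine($"{r.Method,-15} {Slowdown(r.RosettaNs, r.NativeNs):F2}x");
}
```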

AllocateUninit and EnumToString are most likely GC issues. I'm not sure whether this is the same as #60166, since that function reports 2MB for the LLC size (however, that might be less than the actual size, see https://en.wikipedia.org/wiki/Apple_M1).

PS: I'm sure we'll find more such cases if we run the whole dotnet/performance suite on Rosetta vs. native. I have already launched a script to do so; it should finish in two days.

/cc @dotnet/jit-contrib @dotnet/gc

Labels

- arch-arm64
- area-CodeGen-coreclr: CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
- tenet-performance: Performance related issue
