Description
The following simple benchmarks run faster under Rosetta (x64 emulation) than natively on arm64 on the same machine, an Apple M1 Mac mini. I think that's a clear sign that something can be improved on the arm64 side:
using System;
using System.Linq;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

public enum MyEnum
{
    A, B, C, D, E, F
}

public class Program
{
    static void Main(string[] args) => BenchmarkSwitcher.FromAssembly(typeof(Program).Assembly).Run(args);

    [Benchmark]
    [Arguments(MyEnum.F)]
    public string EnumToString(MyEnum e) => e.ToString();

    [Benchmark]
    [Arguments(10000)]
    public char[] AllocateUninit(int len) => GC.AllocateUninitializedArray<char>(len);

    private static readonly int[] _array = Enumerable.Range(1, 1000).ToArray();

    [Benchmark]
    public void ArrayReverse() => Array.Reverse(_array);
}

Rosetta (.NET 6.0 rc2 osx-x64):
| Method | Mean | Error | StdDev |
|---------------- |-----------:|----------:|----------:|
| ArrayReverse | 176.320 ns | 2.4289 ns | 2.2720 ns |
| AllocateUninit | 343.028 ns | 1.0766 ns | 0.8405 ns |
| EnumToString | 23.064 ns | 0.0278 ns | 0.0247 ns |
Native (.NET 6.0 rc2 osx-arm64):
| Method | Mean | Error | StdDev |
|---------------- |-----------:|----------:|----------:|
| ArrayReverse | 201.936 ns | 0.0345 ns | 0.0323 ns |
| AllocateUninit | 524.582 ns | 1.5993 ns | 1.4960 ns |
| EnumToString | 44.961 ns | 0.1068 ns | 0.0999 ns |
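
To put the gap in numbers, here is a small sketch (not part of the benchmark itself) that divides each native mean by the corresponding Rosetta mean from the two tables above. `Slowdown` is a hypothetical helper introduced only for this illustration, and the values are copied by hand from the tables.

```csharp
using System;

// Mean times in nanoseconds, copied by hand from the two tables above.
var results = new (string Method, double RosettaNs, double NativeNs)[]
{
    ("ArrayReverse",   176.320, 201.936),
    ("AllocateUninit", 343.028, 524.582),
    ("EnumToString",    23.064,  44.961),
};

foreach (var r in results)
{
    // A ratio above 1.0 means native arm64 is slower than Rosetta.
    Console.WriteLine($"{r.Method,-16} {Slowdown(r.NativeNs, r.RosettaNs):F2}x");
}

// Hypothetical helper: how many times slower native arm64 is than Rosetta.
static double Slowdown(double nativeNs, double rosettaNs) => nativeNs / rosettaNs;
```

With these numbers, native arm64 comes out roughly 1.15x slower for ArrayReverse, 1.53x slower for AllocateUninit and 1.95x slower for EnumToString.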
AllocateUninit and EnumToString are most likely GC issues. I'm not sure it's the same problem as #60166, since that function reports 2 MB for the LLC size (although that might be less than the actual size, see https://en.wikipedia.org/wiki/Apple_M1).
PS: I'm sure we'll find more such cases if we run the whole dotnet/performance suite on Rosetta vs. native. I've already launched a script to do so; it should finish in two days.
/cc @dotnet/jit-contrib @dotnet/gc