Commit 43a0918
[MPS] Add benchmark for scan with indices (#156860)
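For context, the op benchmarked in this commit (`cummin`) is a scan that returns both the running minimum and the index where that minimum was found. A minimal pure-Python sketch of that behavior follows. The helper name is hypothetical, the tie-breaking rule is a simplification, and this is not the MPS kernel.

```python
def cummin_with_indices(values):
    """Running minimum plus the position where each minimum was found."""
    mins, idxs = [], []
    best, best_i = None, -1
    for i, v in enumerate(values):
        # Strict "<" keeps the earliest index on ties; a real backend may choose differently.
        if best is None or v < best:
            best, best_i = v, i
        mins.append(best)
        idxs.append(best_i)
    return mins, idxs


print(cummin_with_indices([3, 1, 4, 0, 5]))  # ([3, 1, 1, 0, 0], [0, 1, 1, 3, 3])
```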
Baseline performance on M4 Max 64GB (macOS 15.5):
```
[--------------------------------  --------------------------------]
                                       |    eager |  compile
1 threads: ---------------------------------------------------------
cummin-dim0-32x32 (torch.float16)      |    102.5 |    115.0
cummin-dim0-128x128 (torch.float16)    |    133.6 |    147.8
cummin-dim0-512x512 (torch.float16)    |    233.1 |    243.1
cummin-dim0-1024x1024 (torch.float16)  |    364.2 |    385.2
cummin-dim1-32x32 (torch.float16)      |     94.4 |    109.8
cummin-dim1-128x128 (torch.float16)    |    109.9 |    122.5
cummin-dim1-512x512 (torch.float16)    |    227.0 |    233.8
cummin-dim1-1024x1024 (torch.float16)  |    985.1 |   1010.5
cummin-1d-100 (torch.float16)          |    100.7 |    114.3
cummin-1d-10000 (torch.float16)        |    805.0 |    879.1
cummin-1d-1000000 (torch.float16)      |  70545.6 |  71310.3
cummin-dim0-32x32 (torch.float32)      |    102.7 |    115.5
cummin-dim0-128x128 (torch.float32)    |    137.2 |    143.8
cummin-dim0-512x512 (torch.float32)    |    209.7 |    222.0
cummin-dim0-1024x1024 (torch.float32)  |    340.1 |    389.9
cummin-dim1-32x32 (torch.float32)      |     99.2 |    107.8
cummin-dim1-128x128 (torch.float32)    |    111.9 |    119.3
cummin-dim1-512x512 (torch.float32)    |    250.7 |    255.1
cummin-dim1-1024x1024 (torch.float32)  |    987.9 |   1013.2
cummin-1d-100 (torch.float32)          |    100.6 |    114.6
cummin-1d-10000 (torch.float32)        |    794.7 |    862.2
cummin-1d-1000000 (torch.float32)      |  71995.3 |  71963.5
cummin-dim0-32x32 (torch.bfloat16)     |    105.9 |    113.9
cummin-dim0-128x128 (torch.bfloat16)   |    135.7 |    147.9
cummin-dim0-512x512 (torch.bfloat16)   |    231.9 |    240.7
cummin-dim0-1024x1024 (torch.bfloat16) |    327.7 |    366.9
cummin-dim1-32x32 (torch.bfloat16)     |     91.3 |    103.3
cummin-dim1-128x128 (torch.bfloat16)   |    108.5 |    117.4
cummin-dim1-512x512 (torch.bfloat16)   |    222.0 |    233.6
cummin-dim1-1024x1024 (torch.bfloat16) |    936.9 |    982.5
cummin-1d-100 (torch.bfloat16)         |    106.6 |    112.4
cummin-1d-10000 (torch.bfloat16)       |    795.8 |    819.6
cummin-1d-1000000 (torch.bfloat16)     |  68667.4 |  68557.9

Times are in microseconds (us).
```
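The table layout above resembles the output of `torch.utils.benchmark.Compare`. The following is a minimal sketch of how numbers of this kind could be collected, not the benchmark script changed in this commit. The helper names (`cummin_fn`, `bench_cummin`), the CPU fallback, and the choice of `float32` as the default dtype are assumptions made for illustration.

```python
import timeit

import torch
from torch.utils.benchmark import Compare, Timer

# Assumption: use MPS when present, otherwise fall back to CPU so the sketch still runs.
DEVICE = "mps" if torch.backends.mps.is_available() else "cpu"


def cummin_fn(x, dim):
    return torch.cummin(x, dim)


def bench_cummin(shape, dim, dtype=torch.float32, use_compile=False, min_run_time=0.2):
    """Return a Measurement for torch.cummin on a random tensor (hypothetical helper)."""
    x = torch.rand(*shape, device=DEVICE).to(dtype)
    fn = torch.compile(cummin_fn) if use_compile else cummin_fn
    fn(x, dim)  # warm-up; for the compiled path this triggers compilation
    # MPS work is queued asynchronously, so wait for it inside the timed statement.
    sync = "torch.mps.synchronize()" if DEVICE == "mps" else "pass"
    name = "1d" if len(shape) == 1 else f"dim{dim}"
    size = "x".join(str(s) for s in shape)
    timer = Timer(
        stmt=f"fn(x, dim); {sync}",
        globals={"fn": fn, "x": x, "dim": dim, "torch": torch},
        timer=timeit.default_timer,
        sub_label=f"cummin-{name}-{size} ({dtype})",
        description="compile" if use_compile else "eager",
    )
    return timer.blocked_autorange(min_run_time=min_run_time)


if __name__ == "__main__":
    results = []
    for shape, dim in [((32, 32), 0), ((32, 32), 1), ((100,), 0)]:
        results.append(bench_cummin(shape, dim))
        # Append bench_cummin(shape, dim, use_compile=True) as well to fill a
        # "compile" column; it is omitted here because torch.compile needs a
        # working compiler toolchain.
    Compare(results).print()
```

Passing `dtype=torch.float16` or `dtype=torch.bfloat16` would cover the other dtypes in the table, provided the chosen device supports `cummin` for them.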
Pull Request resolved: #156860
Approved by: https://github.com/malfet
Parent commit: 9fe2d15
1 file changed, 30 additions and 19 deletions.

| Original file lines | New file lines | Added | Removed |
|---|---|---|---|
| 71-76 | 71-85 | 9 | 0 |
| 87-105 | 96-112 | 5 | 7 |
| 116-127 | 123-133 | 5 | 6 |
| 136-147 | 142-152 | 5 | 6 |
| 171-176 | 176-187 | 6 | 0 |