-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Closed
Labels
area-vmUse area-vm for VM related issues, including code coverage, and the AOT and JIT backends.Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends.closed-not-plannedClosed as we don't intend to take action on the reported issueClosed as we don't intend to take action on the reported issuetype-performanceIssue relates to performance or code sizeIssue relates to performance or code size
Description
On IA32, Base64Encoder golem benchmark:
Base64Encoder(RunTime): 47.596634102808196 us.
If _StringBase._createOneByteString is inlined into _StringBase.createFromCharCodes, this slows down benchmark:
Base64Encoder(RunTime): 63.26622740644672 us.
Hot loop in _StringBase._createOneByteString:
0xf68add7b | | 0.95% (25) | cmp esp,[esi+0x24] |
0xf68add7e | | | jna 0xf68ade4e |
0xf68add84 | | 1.68% (44) | cmp ebx,edi |
0xf68add86 | | | jge 0xf68addaa |
0xf68add8c | | 0.38% (10) | mov eax,[ebp+0xc] |
0xf68add8f | | 0.91% (24) | add eax,ebx |
0xf68add91 | | 1.52% (40) | mov edi,eax |
0xf68add93 | | 1.68% (44) | sar edi,1 |
0xf68add95 | | 0.65% (17) | movzxb eax,[ecx+edi*1+0xb] |
0xf68add9a | | 2.48% (65) | mov edi,ebx |
0xf68add9c | | 0.84% (22) | sar edi,1 |
0xf68add9e | | 1.03% (27) | movb [edx+edi*1+0xb],eax |
0xf68adda2 | | 1.60% (42) | add ebx,2 |
0xf68adda5 | | 0.42% (11) | mov edi,[ebp+0x8] |
0xf68adda8 | | 0.72% (19) | jmp 0xf68add7b
The same loop if _StringBase._createOneByteString is inlined:
0xf686f001 | | 2.86% (74) | cmp esp,[esi+0x24] |
0xf686f004 | | | jna 0xf686f4ec |
0xf686f00a | | | cmp eax,ebx |
0xf686f00c | | | jge 0xf686f09c |
0xf686f012 | | | mov ecx,[ebp+0x10] |
0xf686f015 | | 0.04% (1) | add ecx,eax |
0xf686f017 | | 2.74% (71) | mov ebx,ecx |
0xf686f019 | | | sar ebx,1 |
0xf686f01b | | | movzxb ecx,[edi+ebx*1+0xb] |
0xf686f020 | | 0.23% (6) | xchg ecx,eax |
0xf686f022 | | 3.98% (103) | mov ebx,ecx |
0xf686f024 | | | sar ebx,1 |
0xf686f026 | | 2.12% (55) | movb [edx+ebx*1+0xb],eax |
0xf686f02a | | 14.90% (386)| add ecx,2 |
0xf686f02d | | 0.08% (2) | mov eax,ecx |
0xf686f02f | | | mov ebx,[ebp-0xc] |
0xf686f032 | | | jmp 0xf686f001
The following patch can be used to reproduce the problem:
diff --git a/runtime/lib/string_patch.dart b/runtime/lib/string_patch.dart
index 6bf950f911..48c53770bd 100644
--- a/runtime/lib/string_patch.dart
+++ b/runtime/lib/string_patch.dart
@@ -219,6 +219,7 @@ abstract class _StringBase implements String {
return createFromCharCodes(charCodeList, 0, length, bits);
}
+ @pragma("vm:prefer-inline")
static String _createOneByteString(List<int> charCodes, int start, int len) {
// It's always faster to do this in Dart than to call into the runtime.
var s = _OneByteString._allocate(len);
Note that _StringBase._createOneByteString satisfies our inlining_callee_call_sites_threshold heuristic if all integer operations are recognized. Currently it is not inlined because not all integer operations are recognized (there is a remaining unrecognized call). However, with bytecode all integer operations are replaced with BinarySmiOp and _StringBase._createOneByteString is inlined causing performance regression.
Metadata
Metadata
Assignees
Labels
area-vmUse area-vm for VM related issues, including code coverage, and the AOT and JIT backends.Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends.closed-not-plannedClosed as we don't intend to take action on the reported issueClosed as we don't intend to take action on the reported issuetype-performanceIssue relates to performance or code sizeIssue relates to performance or code size