-
Notifications
You must be signed in to change notification settings - Fork 29.7k
Sped up Element._sort #104103
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sped up Element._sort #104103
Conversation
|
It looks like this pull request may not have tests. Please make sure to add tests before merging. If you need an exemption to this rule, contact Hixie on the #hackers channel in Chat (don't just cc him here, he won't see it! He's on Discord!). If you are not sure if you need tests, consider this rule of thumb: the purpose of a test is to make sure someone doesn't accidentally revert the fix. Ask yourself, is there anything in your PR that you feel it is important we not accidentally revert back to how it was before your fix? Reviewers: Read the Tree Hygiene page and make sure this patch meets those guidelines before LGTMing. |
|
I'm not sure if we want a benchmark beyond the ones we already have. A benchmark for hot reload would capture this the best probably. |
jonahwilliams
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How much does this speed things up on the rebuild benchmarks?
I have a number but I'm not sure how much I trust it until I've played around with the new benchmark a bit more. It can be demonstrated that there is no path through this code as it was originally written that does less operations. |
|
I wrote a microbenchmark for this but the results were just showing a 1% improvement. I suspect the slowness is elsewhere in dart (like List.operator[]=, or function invocation). I can add the microbenchmark if you want but I don't think it's worth the effort when we can demonstrate logically that it is faster and that it is actually a hot spot for some usage of the framework. |
|
what if you make Element.depth not late? |
In my microbenchmarks I was comparing without |
|
(added screenshot in description that shows Element._sort showing up in the rebuild benchmark profile) |
|
sorry I'm not really following, did you see a performance improvement from this change? |
Yes, the linked microbenchmark does show an improvement. It is however small to the point where I didn't do statistics to determine a likelihood when we can evaluate the codes performance logically. Is there a doubt that it is faster? Let me know what your concern is and I can address it better. We can look at the generated assembly if you want, or just explain it. |
|
This does reduce the readability of the code a lot. :/ If this is actually proven faster, there should probably be a comment in the code explaining what this is doing. |
|
I've tweaked the rebuild benchmark to avoid scheduling frames and rebuild time goes from |
If there were a Wikipedia URL of what you are using (probably it is not the first time someone has ever used ^ for sorting), maybe could be more useful. |
I added comments. I think that clears it up. It isn't really a trick, it's just that people don't xor often. |
|
I don't really think this sort of change is worth landing. I think its OK if we were speeding up something that is slow, or if we didn't speed something up but made it easier to read/simpler. But in this case we're rewriting code that isn't really on any critical path and that doesn't make it any clearer. What is the goal here? |
|
Talked to @jonahwilliams offline, he's fine with the change as long as @goderbauer is happy and the comments make up for any deficiencies in clarity of the code. |
| // If the `dirty` values are not equal, sort with non-dirty elements being | ||
| // less than dirty elements. | ||
| final bool isBDirty = b.dirty; | ||
| if (a.dirty ^ isBDirty) { | ||
| return isBDirty ? -1 : 1; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sort with non-dirty elements being less than dirty elements.
I can kiiiiind of understand what you mean, but it still kind of hard lol. The sort with non-dirty elements being less than dirty elements then looking at isBDirty and the xor.
BTW, is isBDirty really needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
b.isDirty is a method call so the compiler can't optimized away the second call because it can't know if dirty always returns the same value.
|
I switched the |
|
I have the same questions that Jonah raised in #104103 (comment). Have those been answered somewhere? |
@goderbauer Yes, we had a discussion offline where he agreed it is worth merging as long as your concerns about making it clear are addressed (mentioned here: #104103 (comment)). I hate to talk to him, feel free to correct me if you want @jonahwilliams. I imagine those concerns are even less now that I've removed xor. It doesn't affect clarity at all. This function is called thousands of times (n log n, where n is the number of Elements) when we do hot reload. The work was done to make it faster. It would be a shame to throw out the savings that can be demonstrated by looking at the generated code, counting operations, or running benchmarks, just because it doesn't look how we'd first think about it. |
|
I have a very high tolerance for "unreadable code" so I don't feel comfortable making the call on whether or not this is worth it. I agree with Aaron that lots of small changes do add up, but of course there is always the risk we make something worse too 🤣 |
|
My readability concerns are mostly addressed.
So, there are benchmarks showing that this is faster? What are the numbers? |
@goderbauer with the following microbenchmark it is 27% faster on arm64 ios profile builds. This is the minimum benefit since Element uses polymorphism and this is the fastest possible implementation, just a straight field. I didn't think checking in a microbenchmark in the case was worth the maintenance since it's clearly faster (to my eyes). // Copyright 2014 The Flutter Authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.
import 'dart:developer';
import 'package:flutter/material.dart';
import '../common.dart';
const int _kNumIterations = 100000;
class _Widget extends Widget {
@override
Element createElement() {
throw UnimplementedError();
}
}
class _Element extends Element {
_Element(this.depth, this.dirty) : super(_Widget());
@override
final int depth;
@override
final bool dirty;
@override
bool get debugDoingBuild => throw UnimplementedError();
@override
void performRebuild() {}
}
void main() {
assert(false,
"Don't run benchmarks in debug mode! Use 'flutter run --release'.");
final BenchmarkResultPrinter printer = BenchmarkResultPrinter();
final Element a = _Element(0, false);
final Element b = _Element(1, false);
final Element c = _Element(0, true);
final Element d = _Element(1, true);
final Stopwatch watch = Stopwatch();
int tally = 0;
watch.start();
for (int i = 0; i < _kNumIterations; i += 1) {
tally += Element.sort(a, a);
tally += Element.sort(a, b);
tally += Element.sort(a, c);
tally += Element.sort(a, d);
tally += Element.sort(b, c);
tally += Element.sort(b, d);
tally += Element.sort(c, d);
}
watch.stop();
if (tally < 0) {
print("this shouldn't happen.");
}
printer.addResult(
description: 'Element.sort',
value: watch.elapsedMicroseconds.toDouble() / _kNumIterations,
unit: 'us per iteration',
name: 'element_sort',
);
printer.printToStdout();
} |
goderbauer
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Thanks for getting some numbers for this!
Redid the logic for
Element._sortto reduce operations:This will make the biggest difference with hot reload.
Here is the profile of rebuilding a large app over and over again which is what lead me to looking at Element._sort:

Pre-launch Checklist
///).If you need help, consider asking for advice on the #hackers-new channel on Discord.