Improve UI-thread animation performance on busy screens (~10x reduction depending on your use-case) #138481
Conversation
@gaaclarke really interesting PR; it seems related to some of the small changes you make that radically improve things.
I am a little wary of changing the behavior here. This is used in lots and lots of places. It looks like the change is not failing any customer tests, which is a good start. I'll see if I can run Google's internal tests on this so we can get a better understanding of how breaking this change would be.
I scrolled through the data and I can't see a clear pattern. It seems to be a wash: sometimes the new implementation is faster, sometimes it's slower. I am actually not sure whether this is a clear win.
Appreciated. I agree that we need to respect Hyrum's law here. That said, the implemented semantics seem rather ad-hoc. For multiple registrations it is virtually impossible to employ the semantics correctly, especially when listeners mutate the list of listeners. I'm also not convinced that the updated semantics are great either; I'd say they're about as bad. Personally, I'd argue that multiple registrations should be rejected (at least for the animations use-case). For a listener it's trivial to run code N times w/o having to register itself N times. However, this discussion probably goes beyond the scope of this PR.
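As a minimal illustration of that last point (a Python toy model, not Flutter code; all names here are made up): a listener that wants to run N times per notification doesn't need N registrations, so rejecting duplicates loses no expressive power:

```python
class Notifier:
    """Toy listener registry modeling the old behavior: duplicate
    registrations are allowed and a listener fires once per registration."""

    def __init__(self):
        self._listeners = []

    def add_listener(self, fn):
        self._listeners.append(fn)

    def notify(self):
        # Iterate over a copy so listeners may mutate the registry.
        for fn in list(self._listeners):
            fn()


calls = []
n = Notifier()
listener = lambda: calls.append("tick")
n.add_listener(listener)
n.add_listener(listener)   # duplicate registration
n.notify()                 # fires twice
assert calls == ["tick", "tick"]

# Equivalent effect with a single registration: the listener itself
# simply repeats its work N times.
calls.clear()
m = Notifier()
m.add_listener(lambda: [calls.append("tick") for _ in range(2)])
m.notify()
assert calls == ["tick", "tick"]
```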
Yes and no, but really yes :). It's a bit hard to see given the various dimensions and quite a bit of trial-by-trial variation (I'm going to blame GC and my notebook's thermals :) ). Note that when the new cases are flagged as slower, it's only by a very small percentage. In most cases it's just the same +/- noise; I merely picked a conservative threshold for labeling as SLOWER vs SAME. In these cases, looking at the absolute runtime can also provide a sense of how noisy the measurement may be. I did share the benchmark code, so you can simply run it yourself (the script has zero deps). Let me try to break it down: There are 3 comparisons:
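The SLOWER/SAME/FASTER labeling described above boils down to a runtime ratio with a noise threshold. A sketch in Python (the 5% threshold is an assumption for illustration, not necessarily the one the benchmark script uses), checked against three lines from the reported results:

```python
def classify(old_s, new_s, threshold=0.05):
    """Label a comparison, treating ratios within +/- threshold as noise.
    The 5% threshold is an illustrative assumption."""
    ratio = old_s / new_s
    if ratio > 1 + threshold:
        return "FASTER"
    if ratio < 1 - threshold:
        return "SLOWER"
    return "SAME"


# Values taken from the benchmark output quoted later in this thread:
assert classify(1.373926, 1.376916) == "SAME"     # "takes: 1.0x, saved: -0%"
assert classify(22.614892, 0.536188) == "FASTER"  # "takes: 42.2x, saved: 98%"
assert classify(1.375987, 2.254769) == "SLOWER"   # "takes: 0.6x, saved: -64%"
```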
Then we have two more dimensions. The changes covered by 1. and 2. are strict improvements: they don't change any semantics, which is why I pulled them out into a separate commit. They should be fairly straightforward. I pulled the `toList()` case out into a separate benchmark to make this clearer. (With N=1000 it's probably a lot more noisy than N=10, due to its significantly shorter runtime; I only wanted to spend so much time tweaking the iteration numbers for every combo.) "3." covers the changes within the second commit, i.e. moving from ObserverList to HashedObserverList for animations. This is where the real meat is, i.e. 100+x improvements depending on the number of listeners. At the end of the day it's just array vs hash map, no magic involved. My assumption is that whoever wrote the current implementation assumed that AnimationControllers would only drive a small, static set of animations. In those cases, sure, the array is slightly faster. In the general case (a changing, not necessarily small set of Animations), the array is atrocious.
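The "array vs hash map" point can be sketched with a small Python model (Python stands in for the Dart code here; a dict plays the role of Dart's insertion-ordered LinkedHashSet): draining an array-backed registry costs O(N) per removal, while a hash-backed registry removes in O(1):

```python
import timeit

N = 2000
listeners = [object() for _ in range(N)]

def drain_array_registry():
    regs = []
    for l in listeners:
        regs.append(l)
    # Remove in reverse registration order: each remove() scans the list
    # for the element, so draining the registry is O(N^2) overall.
    for l in reversed(listeners):
        regs.remove(l)

def drain_hash_registry():
    regs = {}  # a dict preserves insertion order, roughly like LinkedHashSet
    for l in listeners:
        regs[l] = None
    for l in reversed(listeners):
        del regs[l]  # O(1) per removal, O(N) overall

t_list = timeit.timeit(drain_array_registry, number=5)
t_hash = timeit.timeit(drain_hash_registry, number=5)
print(f"array: {t_list:.3f}s  hash: {t_hash:.3f}s  ratio: {t_list / t_hash:.0f}x")
assert t_hash < t_list  # the hash-based registry wins as N grows
```

The crossover mentioned elsewhere in this thread (around N of three to five) is where the plain array's smaller constant factor stops paying for its quadratic draining cost.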
(Outside the scope of this change, but the attempted band-aid on ObserverList is dangerous: for certain access patterns it may help, for others it completely destroys performance.) Generally, I don't think we should get too hung up on the performance differences for small N. As a personal data point, my specific use-case that surfaced this issue has hundreds of animations popping on and off screen. These gains are not theoretical for me. At this point, I'm actually shipping a forked version of Flutter incorporating these changes, which significantly reduces frame drops for me.
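The kind of band-aid being criticized can be illustrated with a purely hypothetical Python sketch (not the actual ObserverList source; class and field names are invented): a registry that lazily caches a lookup set and throws it away on every mutation helps or hurts depending entirely on the access pattern:

```python
class CachedContainsList:
    """Illustrative only: an array-backed registry with a lazily built
    set for contains checks, invalidated on every mutation."""

    def __init__(self):
        self._items = []
        self._cache = None

    def add(self, item):
        self._items.append(item)
        self._cache = None            # any mutation invalidates the cache

    def remove(self, item):
        self._items.remove(item)
        self._cache = None

    def __contains__(self, item):
        if self._cache is None:
            self._cache = set(self._items)   # O(N) rebuild
        return item in self._cache


reg = CachedContainsList()
for i in range(500):
    reg.add(i)

# Friendly pattern: many lookups between mutations pay for one O(N) rebuild.
assert all(i in reg for i in range(500))

# Hostile pattern: alternating mutation and lookup rebuilds the cache on
# every step, i.e. O(N^2) overall -- worse than having no cache at all.
for i in range(500):
    reg.remove(i)
    assert i not in reg
```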
This pull request executed golden file tests, but it has not been updated in a while (20+ days). Test results from Gold expire after as many days, so this pull request will need to be updated with a fresh commit in order to get results from Gold. For more guidance, visit Writing a golden file test. Reviewers: Read the Tree Hygiene page and make sure this patch meets those guidelines before LGTMing.
Regarding 1 (changes to ObserverList.remove): In practice, this sounds like a regression to me. I would expect the case where you remove a non-existent listener to be very, very rare (I would almost consider it a bug in the code if somebody does that). So optimizing for that is odd to me, since you will pay for the "optimization" in the far more common case with an additional check.

Regarding 2 (optimizing HashedObserverList.toList): This does seem like a nice improvement, and the benchmarks appear to back that up. We could probably land just this change without much trouble.

Regarding 3 (changing AnimationLocalListenersMixin): This is actually less risky than I initially thought, since it only affects this one mixin. For some reason I initially thought this would affect all ChangeNotifiers. We still need to run this change through our Google testing suite to see if this would be a breaking change. I was finally able to kick off a test run within Google; let's see what it reports back. I'll keep you posted.
Also, can you please resolve the issues reported by the analyze check? Thanks! |
Maybe all use-cases are buggy (in which case, we should probably surface it, e.g. assert or at least log), maybe there are some valid edge cases 🤷♀️. I'm not sure that ignoring it is the best way to deal with it in either case. Meanwhile, it's the typical insurance risk-severity evaluation: the check is basically free and its cost independent of the size, whereas invalidating the internal optimization is non-obvious to users and very costly, especially for larger sets. In terms of cost, given any modern CPU with speculative execution and accurate branch prediction, I would expect this check to take <1ns :handwave:. Personally, I can very much live w/o this "optimization"; I'm merely surprised by your cost argument in the light of the benchmark results. I could certainly buy into your bug argument, however one should probably look into how to surface such bugs (independent of this change). As I was rambling earlier, I think there are other semantics that might be unintentional, like FIFO (rather than LIFO) removals of duplicate listeners. Maybe duplicate listeners in general should be considered a bug? 🤷♀️ There would be nice benefits to that too, e.g. one could use cheaper data structures that don't need to keep track of insertion order.
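On the FIFO observation: Python's `list.remove` has the same first-occurrence semantics and serves as a handy model here (a model only, not the Dart code). Removing a twice-registered listener takes out the oldest registration, not the newest:

```python
a, b = "listener_a", "listener_b"
registrations = [a, b, a]  # 'a' was registered twice

registrations.remove(a)    # removes the FIRST occurrence (FIFO), not the last

assert registrations == [b, a]  # the older registration of 'a' is gone
```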
Agreed, it shouldn't cause any surprises.
Thanks! Independently, it could be interesting to evaluate if other constructs would also benefit from other data structures but ultimately it will always depend on your use case.
Once I get word from you on how we will or won't proceed, I'll urgently address any issues. Thanks!
This is not causing any issues internally at Google. I think we should be able to land this if the reported analyzer issues get resolved.
you forgot the type here, that's the only remaining thing
Thanks. Done
Force-pushed cfc20bb to 60e6080
Happy to hear. Thanks!
This PR is still failing the analyzer check. Can you please take a look? Thanks. |
Force-pushed 02341e1 to a620001
Thanks. The analyzer seems to pass now, and I rebased against latest main.
goderbauer
left a comment
LGTM
LongCatIsLooong
left a comment
LGTM modulo nits. In case someone else wants to make a different trade-off in the future based on their own use-cases, could you add a benchmark for this? It would also be nice to see the performance improvements this PR introduces.
Force-pushed a620001 to 35398fc
I like the idea, even more so for continuous performance monitoring. It's probably less likely that someone would stumble upon a benchmark and then decide to just implement their own AnimationListener. I couldn't find any other micro-benchmarks in this repo, which is also why I hadn't added mine. I did share them above: https://gist.github.com/ignatz/dba6293d0fef41e3c6e1ed3c51364de6. I had hoped that this change would be validated by existing e2e benchmarks, but if you feel adding micro-benchmarks makes sense, we can talk about it. That said, I understand that you folks are busy, and therefore I'd like to avoid expanding this simple change further. It has already been more than an OKR cycle.
If you're looking for microbenchmarks: https://github.com/flutter/flutter/tree/master/dev/benchmarks/microbenchmarks
Added a microbenchmark following the examples linked above. Could you also provide me a link to a dashboard where I can see/monitor the continuous evolution of these microbenchmarks? A Google search didn't bring anything up, and I'm wondering how this setup is being put to use.
Force-pushed 53c3c25 to e6da0bd
I've spent countless hours complying with any remotely reasonable request. I've provided extensive benchmarks from the get-go. We all know and acknowledge that this is a small change with significant performance impact.
This sounds more like an opinion than a best practice. I'm always down for a good argument, just not big on process for the sake of process. AFAICT, there's nothing to be gained from this proposal unless you believe the CI server must execute the benchmarks to be trusted. If you think I should spend a few more weeks on back-and-forths with moving goal posts: fine. If you don't have the bandwidth to spend 1 minute on a cherry-pick: fine. If you don't want to improve Flutter's performance: fine. If you're ok with the opportunity cost for your users: fine. If you're comfortable with disrespecting your contributors' time: fine. If you change your mind: also fine. Otherwise, thank you for your time and good luck! And just for the record
could be misconstrued as gaslighting.
I am sorry if you misinterpreted my statement; that wasn't my intention. And there is no disrespecting of anyone's time. Getting any change landed sometimes requires a couple of feedback cycles and back-and-forth discussions before it is ready to land. That's just part of open-source teamwork. We do appreciate your contribution, but we also expect contributors to get changes over the line and not just "throw them over the fence" for someone else to finish when they are done with the fun bits. I still hope that we can land this with your help.
I'm not sure how to read this. Are you talking generally? Are you talking about this same PR? Maybe some context faded over time? Are you insinuating something? Personally, I didn't think that addressing every request, answering every question, and writing and running benchmarks for hours were the fun bits. I'm sorry you feel like I'm just throwing code over the fence. In any case, thanks for explaining teamwork to me. I wish you had given me a rationale instead 🤷♀️
I'm still not sure what stands to be gained from the extra work but it sure seems important to you. Good luck and thank you for your time |
Hey @ignatz, if you allow me, I have pushed the benchmarks to a different PR so your contribution can be landed as part 1 (the benchmark) and part 2 (the improvements).
Hey @bernaferrari. Thanks for jumping in, much appreciated.
I don't know how to remove the benchmark commit and merge this; could you do that, @ignatz? Else I can reopen the PR without it...
Reverts #143542
Initiated by: goderbauer
Reason for reverting: Failing post-submit, see https://logs.chromium.org/logs/flutter/buildbucket/cr-buildbucket/8755531866181499153/+/u/run_microbenchmarks/stdout
Original PR Author: bernaferrari
Reviewed By: {goderbauer}
This change reverts the following previous change:
Original Description: Part 1 of #138481. Cherry-picking @ignatz's work.
…r small N, by overriding `toList()`, which is critical for `.notifyListeners()`. This change is meant to shift the scales. Previously, `ObserverList` was chosen for generic implementations such as `AnimationController` to optimize for the common case: small N. However, the penalty in the general case, including larger N, is tremendous. Also slightly improves `ObserverList.remove` performance for some use-cases by not always invalidating the set.

Some benchmark results (happy to share the benchmark code as well). I was trying to mimic a realistic ObserverList life-cycle:

2 add/remove SAME ObserverList vs MyObserverList takes: 1.0x, saved: -0% (0:00:01.373926 vs 0:00:01.376916)
2 add/remove SAME HashedObserverList vs MyHashedObserverList takes: 1.0x, saved: -0% (0:00:02.170364 vs 0:00:02.177635)
2 add/remove SLOWER ObserverList vs MyHashedObserverList takes: 0.6x, saved: -64% (0:00:01.375987 vs 0:00:02.254769)
2 notifyListeners SAME ObserverList vs MyObserverList takes: 1.0x, saved: 1% (0:00:01.183753 vs 0:00:01.167889)
2 notifyListeners FASTER HashedObserverList vs MyHashedObserverList takes: 1.6x, saved: 38% (0:00:01.703029 vs 0:00:01.054944)
2 notifyListeners FASTER ObserverList vs MyHashedObserverList takes: 1.1x, saved: 10% (0:00:01.173942 vs 0:00:01.061108)
2 notifyListeners self-mod SAME ObserverList vs MyObserverList takes: 1.0x, saved: 1% (0:00:00.413579 vs 0:00:00.409012)
2 notifyListeners self-mod FASTER HashedObserverList vs MyHashedObserverList takes: 1.2x, saved: 16% (0:00:00.861495 vs 0:00:00.725754)
2 notifyListeners self-mod SLOWER ObserverList vs MyHashedObserverList takes: 0.6x, saved: -79% (0:00:00.410618 vs 0:00:00.734759)
2 mix nl/rm/add SLOWER ObserverList vs MyObserverList takes: 1.0x, saved: -4% (0:00:00.382266 vs 0:00:00.398285)
2 mix nl/rm/add FASTER HashedObserverList vs MyHashedObserverList takes: 1.3x, saved: 25% (0:00:00.696625 vs 0:00:00.522181)
2 mix nl/rm/add SLOWER ObserverList vs MyHashedObserverList takes: 0.8x, saved: -25% (0:00:00.363710 vs 0:00:00.454669)
4 add/remove SAME ObserverList vs MyObserverList takes: 1.0x, saved: -1% (0:00:03.210935 vs 0:00:03.240136)
4 add/remove SAME HashedObserverList vs MyHashedObserverList takes: 1.0x, saved: -2% (0:00:04.000753 vs 0:00:04.088699)
4 add/remove SLOWER ObserverList vs MyHashedObserverList takes: 0.9x, saved: -18% (0:00:03.457464 vs 0:00:04.062586)
4 notifyListeners SLOWER ObserverList vs MyObserverList takes: 0.8x, saved: -22% (0:00:01.857049 vs 0:00:02.266333)
4 notifyListeners FASTER HashedObserverList vs MyHashedObserverList takes: 1.3x, saved: 21% (0:00:03.031229 vs 0:00:02.395746)
4 notifyListeners FASTER ObserverList vs MyHashedObserverList takes: 1.2x, saved: 14% (0:00:03.723434 vs 0:00:03.213373)
4 notifyListeners self-mod SLOWER ObserverList vs MyObserverList takes: 0.8x, saved: -28% (0:00:03.809080 vs 0:00:04.857361)
4 notifyListeners self-mod FASTER HashedObserverList vs MyHashedObserverList takes: 1.3x, saved: 24% (0:00:03.694135 vs 0:00:02.797687)
4 notifyListeners self-mod FASTER ObserverList vs MyHashedObserverList takes: 1.3x, saved: 23% (0:00:03.691153 vs 0:00:02.850335)
4 mix nl/rm/add SLOWER ObserverList vs MyObserverList takes: 0.9x, saved: -6% (0:00:01.665852 vs 0:00:01.759793)
4 mix nl/rm/add SAME HashedObserverList vs MyHashedObserverList takes: 1.0x, saved: 1% (0:00:01.430239 vs 0:00:01.422435)
4 mix nl/rm/add FASTER ObserverList vs MyHashedObserverList takes: 2.6x, saved: 62% (0:00:03.076555 vs 0:00:01.176919)
7 add/remove FASTER ObserverList vs MyObserverList takes: 1.2x, saved: 16% (0:00:08.613052 vs 0:00:07.209422)
7 add/remove SAME HashedObserverList vs MyHashedObserverList takes: 1.0x, saved: 1% (0:00:08.334085 vs 0:00:08.291406)
7 add/remove SAME ObserverList vs MyHashedObserverList takes: 1.0x, saved: -1% (0:00:07.939038 vs 0:00:07.982456)
7 notifyListeners FASTER ObserverList vs MyObserverList takes: 1.1x, saved: 11% (0:00:03.537551 vs 0:00:03.162703)
7 notifyListeners FASTER HashedObserverList vs MyHashedObserverList takes: 1.2x, saved: 19% (0:00:04.266710 vs 0:00:03.441823)
7 notifyListeners SAME ObserverList vs MyHashedObserverList takes: 1.0x, saved: 2% (0:00:03.562112 vs 0:00:03.489668)
7 notifyListeners self-mod FASTER ObserverList vs MyObserverList takes: 1.3x, saved: 20% (0:00:07.683331 vs 0:00:06.141342)
7 notifyListeners self-mod FASTER HashedObserverList vs MyHashedObserverList takes: 1.1x, saved: 6% (0:00:02.777192 vs 0:00:02.607791)
7 notifyListeners self-mod FASTER ObserverList vs MyHashedObserverList takes: 2.3x, saved: 57% (0:00:06.138749 vs 0:00:02.623466)
7 mix nl/rm/add SLOWER ObserverList vs MyObserverList takes: 1.0x, saved: -4% (0:00:01.527072 vs 0:00:01.591417)
7 mix nl/rm/add FASTER HashedObserverList vs MyHashedObserverList takes: 1.1x, saved: 10% (0:00:01.424173 vs 0:00:01.274727)
7 mix nl/rm/add FASTER ObserverList vs MyHashedObserverList takes: 1.8x, saved: 43% (0:00:01.718774 vs 0:00:00.972161)
10 add/remove SAME ObserverList vs MyObserverList takes: 1.0x, saved: 1% (0:00:06.705254 vs 0:00:06.652850)
10 add/remove SLOWER HashedObserverList vs MyHashedObserverList takes: 0.9x, saved: -8% (0:00:06.127039 vs 0:00:06.634606)
10 add/remove FASTER ObserverList vs MyHashedObserverList takes: 1.1x, saved: 6% (0:00:06.621233 vs 0:00:06.242699)
10 notifyListeners SLOWER ObserverList vs MyObserverList takes: 0.8x, saved: -21% (0:00:02.542935 vs 0:00:03.065810)
10 notifyListeners FASTER HashedObserverList vs MyHashedObserverList takes: 1.2x, saved: 20% (0:00:03.038072 vs 0:00:02.441563)
10 notifyListeners FASTER ObserverList vs MyHashedObserverList takes: 1.1x, saved: 10% (0:00:02.708075 vs 0:00:02.429304)
10 notifyListeners self-mod SLOWER ObserverList vs MyObserverList takes: 0.6x, saved: -63% (0:00:04.636833 vs 0:00:07.569618)
10 notifyListeners self-mod SLOWER HashedObserverList vs MyHashedObserverList takes: 0.7x, saved: -44% (0:00:01.926254 vs 0:00:02.768144)
10 notifyListeners self-mod FASTER ObserverList vs MyHashedObserverList takes: 3.9x, saved: 75% (0:00:19.460013 vs 0:00:04.935028)
10 mix nl/rm/add SAME ObserverList vs MyObserverList takes: 1.0x, saved: -3% (0:00:03.122031 vs 0:00:03.207405)
10 mix nl/rm/add FASTER HashedObserverList vs MyHashedObserverList takes: 1.1x, saved: 12% (0:00:02.104239 vs 0:00:01.853132)
10 mix nl/rm/add FASTER ObserverList vs MyHashedObserverList takes: 1.7x, saved: 43% (0:00:03.167834 vs 0:00:01.817996)
100 add/remove FASTER ObserverList vs MyObserverList takes: 2.1x, saved: 53% (0:00:29.435696 vs 0:00:13.740486)
100 add/remove SLOWER HashedObserverList vs MyHashedObserverList takes: 1.0x, saved: -4% (0:00:01.810957 vs 0:00:01.874549)
100 add/remove FASTER ObserverList vs MyHashedObserverList takes: 7.0x, saved: 86% (0:00:13.562779 vs 0:00:01.932154)
100 notifyListeners FASTER ObserverList vs MyObserverList takes: 1.1x, saved: 11% (0:00:00.824788 vs 0:00:00.735041)
100 notifyListeners FASTER HashedObserverList vs MyHashedObserverList takes: 1.1x, saved: 10% (0:00:00.780762 vs 0:00:00.701437)
100 notifyListeners FASTER ObserverList vs MyHashedObserverList takes: 1.1x, saved: 8% (0:00:00.797545 vs 0:00:00.732023)
100 notifyListeners self-mod FASTER ObserverList vs MyObserverList takes: 1.1x, saved: 11% (0:00:24.091372 vs 0:00:21.355631)
100 notifyListeners self-mod FASTER HashedObserverList vs MyHashedObserverList takes: 1.0x, saved: 4% (0:00:00.560706 vs 0:00:00.536952)
100 notifyListeners self-mod FASTER ObserverList vs MyHashedObserverList takes: 42.2x, saved: 98% (0:00:22.614892 vs 0:00:00.536188)
100 mix nl/rm/add SAME ObserverList vs MyObserverList takes: 1.0x, saved: -1% (0:00:00.399696 vs 0:00:00.403067)
100 mix nl/rm/add FASTER HashedObserverList vs MyHashedObserverList takes: 1.1x, saved: 12% (0:00:00.168415 vs 0:00:00.148730)
100 mix nl/rm/add FASTER ObserverList vs MyHashedObserverList takes: 2.7x, saved: 63% (0:00:00.395535 vs 0:00:00.145706)
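For reading these results: "takes" is the old-over-new runtime ratio, and "saved" is the relative runtime saved. A quick Python check reproduces one of the reported lines ("100 notifyListeners self-mod ... takes: 42.2x, saved: 98% (0:00:22.614892 vs 0:00:00.536188)"):

```python
from datetime import timedelta

# Runtimes taken from one entry of the benchmark output above.
old = timedelta(seconds=22, microseconds=614892).total_seconds()
new = timedelta(microseconds=536188).total_seconds()

takes = old / new       # how many times longer the old version runs
saved = 1 - new / old   # fraction of runtime saved by the new version

print(f"takes: {takes:.1f}x, saved: {saved:.0%}")  # takes: 42.2x, saved: 98%
assert round(takes, 1) == 42.2
assert round(saved * 100) == 98
```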
…lly, with `HashedObserverList<T>` now being a good 20% to 30% faster for small N, the inversion point for performance is reached even sooner. For common use-cases, this means N around three to five (see below).

WARNING: This is a potentially breaking change (as witnessed by the test change), due to listeners only being notified once (in first-insertion order) rather than once per registration. I'd be more than happy to make the behavior configurable on the AnimationController, so that folks could opt into the new behavior or construct an AnimationController depending on their needs. I'd also be interested in understanding the motivation for the behavior at hand a bit better. Given that listeners can be mutating, preserving the order seems quite intuitive. Whereas multiple registrations may make sense, I just can't think of a good use-case. I can easily think of unintentional gotchas like advancing an animation multiple times per step/frame.

Context: I'm using a library that draws markers onto a map and then, depending on the zoom level, clusters (collapses/expands) these markers. This often results in tens or hundreds of markers being animated (position & opacity) with a single AnimationController. After the animation, no-longer-visible markers are taken off screen, resulting in large numbers of listeners being `removed`. With the implementation at HEAD, my app is spending the majority of its UI-thread time (tens of milliseconds per frame) just deleting listeners. I'd argue that a generic tool like AnimationController should not micro-optimize for the case N=1. With this change, I'm seeing some ~50x improvements depending on how many markers are on screen.

There's an argument to be made that `HashedObserverList` could subsume `ObserverList` in more cases. However, I'd prefer to let you decide this on a case-by-case basis, given the slightly different iteration semantics.

Here are some benchmark results for typical use-cases (the same results quoted in the revert description above). Happy to share the exact benchmark code.
Force-pushed 143d894 to f969ca8
Of course, done. That's also what I offered before. To be clear, I'm happy to help with virtually any request I can remotely reason about. I merely don't like spending my time on something that has zero tangible value just because someone, unsure why, called it a "best practice". In general, most engineers will be allergic to corporate processes and arguments from authority. It doesn't feel like teamwork; it feels at best like someone wanting to check the "demonstrates leadership at the next level" box, and it triggers my inner Hixie. I love Flutter ❤️

EDIT: I would also like to explicitly express my gratitude to you, Bernardo. You're clearly caught in the crossfire - sorry - but you also genuinely seem to want to help. Thanks!
Reverts #143799
Initiated by: goderbauer
Reason for reverting: The microbenchmark failed to run two times in a row after this was committed (see https://logs.chromium.org/logs/flutter/buildbucket/cr-buildbucket/8755379171959141089/+/u/run_microbenchmarks/stdout).
Original PR Author: bernaferrari
Reviewed By: {goderbauer}
This change reverts the following previous change:
Original Description: Re-lands #143542 (which is part 1 of #138481). The filename was wrong 🫠
Greetings from PR triage, checking in on this. It looks like the benchmark will need to be checked back in before we can land this PR? For reference, #143799 was reverted in #143946. Performance improvements like this are difficult to test, and therefore difficult to catch when they regress. The benchmark is intended to establish a baseline before this lands, so we know (1) it has made an improvement on that baseline as expected, and (2) the improvement persists, so we will know if it regresses in the future. Also, reading through the comments here, just wanted to provide a friendly tone check: we should all be working together respectfully, per our code of conduct. 🙂
I'll reland the benchmarks for the third time. It is hard because they are not checked by the CI, so bugs slid through twice and I got tired/traumatized of getting things reverted. Then I forgot, but I'll do it again shortly. I also have no idea why they failed last time. Will see what happens locally.
Thanks @bernaferrari. I appreciate you volunteering your time to help with this change. 🙏
Since this was closed, I filed #146211 to follow up on this. |
The following PR (#138481) got split in 2, this is part 2. We now have the microbenchmarks to compare this change against (and hopefully see improvements). Close: #146211 Part 1: #153368 --------- Co-authored-by: Nate Wilson <[email protected]>
This (finally) got merged! 🥳 |

This change can significantly reduce UI-thread load for animations when a controller drives many animations that come in and go out of existence. 100x improvements are easily possible, depending on how many things are popping on and off the screen (e.g. hundreds).
WARNING: This is a somewhat breaking change (as shown by the test change), due to listeners only being notified once (in first-insertion order) rather than once per registration. I'd be more than happy to make the behavior configurable on the AnimationController, so that folks could opt into the new behavior or construct an AnimationController depending on what their use-case demands.
Tangentially, I'd also be interested in understanding the motivation for the implemented behavior a bit better. Given that listeners can be mutating, preserving the order seems quite intuitive. Whereas multiple registrations may make sense, I just can't think of a good use-case, and I can easily think of unintentional gotchas like advancing an animation multiple times per step/frame. (EDIT: it's also surprising that despite its ordering it's not a stack, i.e. you push new listeners to the end but remove them from the front, which probably makes multiple registrations conceptually hard to work with 🤷♀️)
CONTEXT: I'm using a library that draws markers onto a map and then, depending on the zoom level, clusters (collapses/expands) these markers. This often results in tens or hundreds of markers being animated (position & opacity) with a single AnimationController. After the animation, no-longer-visible markers are taken off screen, resulting in large numbers of listeners being `removed`. With the implementation at HEAD, my app is spending the majority of its UI-thread time (tens of milliseconds per frame) just deleting listeners. I'd argue that a generic tool like AnimationController should not micro-optimize for the case N=1.

I also did a performance comparison based on N to support the proposal. I tried to model realistic use-cases, i.e. add/remove, `notifyListeners()` w/o any `_set` invalidation, with invalidation, and a cycle of notifyListeners/rm/add. The benchmark code can be found here: https://gist.github.com/ignatz/dba6293d0fef41e3c6e1ed3c51364de6

Pre-launch Checklist
If you need help, consider asking for advice on the #hackers-new channel on Discord.