Skip to content

Conversation

@gaaclarke
Copy link
Member

@gaaclarke gaaclarke commented May 18, 2022

Redid the logic for Element._sort to reduce operations:

before after
branches 4 3
getters 8 5
ops 6 3

This will make the biggest difference with hot reload.

Here is the profile of rebuilding a large app over and over again which is what lead me to looking at Element._sort:
Screen Shot 2022-05-17 at 3 21 40 PM

Pre-launch Checklist

  • I read the Contributor Guide and followed the process outlined there for submitting PRs.
  • I read the Tree Hygiene wiki page, which explains my responsibilities.
  • I read and followed the Flutter Style Guide, including Features we expect every widget to implement.
  • I signed the CLA.
  • I listed at least one issue that this PR fixes in the description above.
  • I updated/added relevant documentation (doc comments with ///).
  • I added new tests to check the change I am making, or this PR is test-exempt.
  • All existing and new tests are passing.

If you need help, consider asking for advice on the #hackers-new channel on Discord.

@flutter-dashboard flutter-dashboard bot added the framework flutter/packages/flutter repository. See also f: labels. label May 18, 2022
@gaaclarke gaaclarke marked this pull request as ready for review May 18, 2022 17:53
@flutter-dashboard
Copy link

It looks like this pull request may not have tests. Please make sure to add tests before merging. If you need an exemption to this rule, contact Hixie on the #hackers channel in Chat (don't just cc him here, he won't see it! He's on Discord!).

If you are not sure if you need tests, consider this rule of thumb: the purpose of a test is to make sure someone doesn't accidentally revert the fix. Ask yourself, is there anything in your PR that you feel it is important we not accidentally revert back to how it was before your fix?

Reviewers: Read the Tree Hygiene page and make sure this patch meets those guidelines before LGTMing.

@gaaclarke
Copy link
Member Author

I'm not sure if we want a benchmark beyond the ones we already have. A benchmark for hot reload would capture this the best probably.

Copy link
Contributor

@jonahwilliams jonahwilliams left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How much does this speed things up on the rebuild benchmarks?

@gaaclarke
Copy link
Member Author

How much does this speed things up on the rebuild benchmarks?

I have a number but I'm not sure how much I trust it until I've played around with the new benchmark a bit more. It can be demonstrated that there is no path through this code as it was originally written that does less operations.

@gaaclarke
Copy link
Member Author

I wrote a microbenchmark for this but the results were just showing a 1% improvement. I suspect the slowness is elsewhere in dart (like List.operator[]=, or function invocation). I can add the microbenchmark if you want but I don't think it's worth the effort when we can demonstrate logically that it is faster and that it is actually a hot spot for some usage of the framework.

@jonahwilliams
Copy link
Contributor

what if you make Element.depth not late?

@gaaclarke
Copy link
Member Author

what if you make Element.depth not late?

In my microbenchmarks I was comparing without _depth being late: https://gist.github.com/gaaclarke/98ab1e38fcea413693afbb2f8a9e0f60

@gaaclarke
Copy link
Member Author

(added screenshot in description that shows Element._sort showing up in the rebuild benchmark profile)

@jonahwilliams
Copy link
Contributor

sorry I'm not really following, did you see a performance improvement from this change?

@gaaclarke
Copy link
Member Author

sorry I'm not really following, did you see a performance improvement from this change?

Yes, the linked microbenchmark does show an improvement. It is however small to the point where I didn't do statistics to determine a likelihood when we can evaluate the codes performance logically. Is there a doubt that it is faster? Let me know what your concern is and I can address it better. We can look at the generated assembly if you want, or just explain it.

@goderbauer
Copy link
Member

This does reduce the readability of the code a lot. :/

If this is actually proven faster, there should probably be a comment in the code explaining what this is doing.

@gaaclarke
Copy link
Member Author

I've tweaked the rebuild benchmark to avoid scheduling frames and rebuild time goes from 5336.5437 us to 5220.361 us, 2% improvement. I'll make a separate CL for the change to the benchmark.

@bernaferrari
Copy link
Contributor

bernaferrari commented May 18, 2022

This does reduce the readability of the code a lot. :/

If there were a Wikipedia URL of what you are using (probably it is not the first time someone has ever used ^ for sorting), maybe could be more useful.

@gaaclarke
Copy link
Member Author

This does reduce the readability of the code a lot. :/

If this is actually proven faster, there should probably be a comment in the code explaining what this is doing.

I added comments. I think that clears it up. It isn't really a trick, it's just that people don't xor often.

@gaaclarke gaaclarke requested a review from jonahwilliams May 19, 2022 17:02
@jonahwilliams
Copy link
Contributor

I don't really think this sort of change is worth landing. I think its OK if we were speeding up something that is slow, or if we didn't speed something up but made it easier to read/simpler. But in this case we're rewriting code that isn't really on any critical path and that doesn't make it any clearer. What is the goal here?

@gaaclarke
Copy link
Member Author

Talked to @jonahwilliams offline, he's fine with the change as long as @goderbauer is happy and the comments make up for any deficiencies in clarity of the code.

Comment on lines 3202 to 3207
// If the `dirty` values are not equal, sort with non-dirty elements being
// less than dirty elements.
final bool isBDirty = b.dirty;
if (a.dirty ^ isBDirty) {
return isBDirty ? -1 : 1;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sort with non-dirty elements being less than dirty elements.

I can kiiiiind of understand what you mean, but it still kind of hard lol. The sort with non-dirty elements being less than dirty elements then looking at isBDirty and the xor.

BTW, is isBDirty really needed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

b.isDirty is a method call so the compiler can't optimized away the second call because it can't know if dirty always returns the same value.

@gaaclarke
Copy link
Member Author

I switched the xor for not equals, they are functionally equivalent and I think removes a lot of the confusion in the code.

@gaaclarke gaaclarke requested a review from goderbauer May 20, 2022 16:08
@goderbauer
Copy link
Member

I have the same questions that Jonah raised in #104103 (comment). Have those been answered somewhere?

@gaaclarke
Copy link
Member Author

I have the same questions that Jonah raised in #104103 (comment). Have those been answered somewhere?

@goderbauer Yes, we had a discussion offline where he agreed it is worth merging as long as your concerns about making it clear are addressed (mentioned here: #104103 (comment)). I hate to talk to him, feel free to correct me if you want @jonahwilliams. I imagine those concerns are even less now that I've removed xor. It doesn't affect clarity at all.

This function is called thousands of times (n log n, where n is the number of Elements) when we do hot reload. The work was done to make it faster. It would be a shame to throw out the savings that can be demonstrated by looking at the generated code, counting operations, or running benchmarks, just because it doesn't look how we'd first think about it.

@jonahwilliams
Copy link
Contributor

I have a very high tolerance for "unreadable code" so I don't feel comfortable making the call on whether or not this is worth it. I agree with Aaron that lots of small changes do add up, but of course there is always the risk we make something worse too 🤣

@goderbauer
Copy link
Member

My readability concerns are mostly addressed.

This function is called thousands of times (n log n, where n is the number of Elements) when we do hot reload. The work was done to make it faster. It would be a shame to throw out the savings that can be demonstrated by looking at the generated code, counting operations, or running benchmarks, just because it doesn't look how we'd first think about it.

So, there are benchmarks showing that this is faster? What are the numbers?

@gaaclarke
Copy link
Member Author

So, there are benchmarks showing that this is faster? What are the numbers?

@goderbauer with the following microbenchmark it is 27% faster on arm64 ios profile builds. This is the minimum benefit since Element uses polymorphism and this is the fastest possible implementation, just a straight field. I didn't think checking in a microbenchmark in the case was worth the maintenance since it's clearly faster (to my eyes).

// Copyright 2014 The Flutter Authors. All rights reserved.
// Use of this source code is governed by a BSD-style license that can be
// found in the LICENSE file.

import 'dart:developer';

import 'package:flutter/material.dart';

import '../common.dart';

const int _kNumIterations = 100000;

class _Widget extends Widget {
  @override
  Element createElement() {
    throw UnimplementedError();
  }
}

class _Element extends Element {
  _Element(this.depth, this.dirty) : super(_Widget());

  @override
  final int depth;

  @override
  final bool dirty;

  @override
  bool get debugDoingBuild => throw UnimplementedError();

  @override
  void performRebuild() {}
}

void main() {
  assert(false,
      "Don't run benchmarks in debug mode! Use 'flutter run --release'.");
  final BenchmarkResultPrinter printer = BenchmarkResultPrinter();

  final Element a = _Element(0, false);
  final Element b = _Element(1, false);
  final Element c = _Element(0, true);
  final Element d = _Element(1, true);

  final Stopwatch watch = Stopwatch();
  int tally = 0;
  watch.start();
  for (int i = 0; i < _kNumIterations; i += 1) {
    tally += Element.sort(a, a);
    tally += Element.sort(a, b);
    tally += Element.sort(a, c);
    tally += Element.sort(a, d);
    tally += Element.sort(b, c);
    tally += Element.sort(b, d);
    tally += Element.sort(c, d);
  }
  watch.stop();
  if (tally < 0) {
    print("this shouldn't happen.");
  }

  printer.addResult(
    description: 'Element.sort',
    value: watch.elapsedMicroseconds.toDouble() / _kNumIterations,
    unit: 'us per iteration',
    name: 'element_sort',
  );

  printer.printToStdout();
}

Copy link
Member

@goderbauer goderbauer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Thanks for getting some numbers for this!

@fluttergithubbot fluttergithubbot merged commit 41c6063 into flutter:master May 23, 2022
engine-flutter-autoroll added a commit to engine-flutter-autoroll/plugins that referenced this pull request May 23, 2022
engine-flutter-autoroll added a commit to engine-flutter-autoroll/packages that referenced this pull request May 23, 2022
engine-flutter-autoroll added a commit to engine-flutter-autoroll/plugins that referenced this pull request May 23, 2022
engine-flutter-autoroll added a commit to engine-flutter-autoroll/plugins that referenced this pull request May 23, 2022
camsim99 pushed a commit to camsim99/flutter that referenced this pull request Aug 10, 2022
engine-flutter-autoroll added a commit to engine-flutter-autoroll/packages that referenced this pull request Aug 30, 2022
engine-flutter-autoroll added a commit to engine-flutter-autoroll/plugins that referenced this pull request Aug 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

framework flutter/packages/flutter repository. See also f: labels.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants