Skip to content

Conversation

@jamesderlin
Copy link
Contributor

Description

Triggering runZoned/runZonedGuarded's error callback does not
necessarily mean that the body stops executing. (Consequently, that
also means that the error callback can be triggered multiple times.)

This allowed some types of crashes to not be properly reported. In
such cases we would undesirably continue along the success path,
which calls exit() and prematurely terminates the process before
the error path completes. (It'd be nice to stop calling exit()
entirely, but I digress.)

Fix this by checking if the zone's error callback has fired before
exiting with a success code. This also prevents prematurely
terminating before the error callback can finish reporting the crash.

Related Issues

#56406

Tests

I added the following tests:

  • I added a test to try to simulate the type of crash that would trigger runZoned's error callback. I verified that the test fails without the rest of my change and passes with it.

Checklist

Before you create this PR confirm that it meets all requirements listed below by checking the relevant checkboxes ([x]). This will ensure a smooth and quick review process.

  • I read the Contributor Guide and followed the process outlined there for submitting PRs.
  • I signed the CLA.
  • I read and followed the Flutter Style Guide, including Features we expect every widget to implement.
  • I read the Tree Hygiene wiki page, which explains my responsibilities.
  • I updated/added relevant documentation (doc comments with ///).
  • All existing and new tests are passing.
  • The analyzer (flutter analyze --flutter-repo) does not report any problems on my PR.
  • I am willing to follow-up on review comments in a timely manner.

Breaking Change

Did any tests fail when you ran them? Please read Handling breaking changes.

…allback

Triggering `runZoned`/`runZonedGuarded`'s error callback does not
necessarily mean that the body stops executing. (Consequently, that
also means that the error callback can be triggered multiple times.)

This allowed some types of crashes to not be properly reported.  In
such cases we would undesirably continue along the success path,
which calls `exit()` and prematurely terminates the process before
the error path completes. (It'd be nice to stop calling `exit()`
entirely, but I digress.)

Fix this by checking if the zone's error callback has fired before
exiting with a success code.  This also prevents prematurely
terminating before the error callback can finish reporting the crash.
@jamesderlin jamesderlin requested a review from zanderso June 2, 2020 09:04
@fluttergithubbot fluttergithubbot added the tool Affects the "flutter" command-line tool. See also t: labels. label Jun 2, 2020
Copy link
Member

@zanderso zanderso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Just a comment about the test.

Timer.run(() => throw error);

// Give the Timer time to fire.
await Future<void>.delayed(const Duration(milliseconds: 100));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might work in this case, but it's generally not a good idea to have explicit timers like this in tests. There are two options. The first option is to use a FakeAsync block. I think that might be overkill here. Instead, it looks like you might be able to pass around some Futures to await and some Completers to complete to achieve the execution order that you're trying to exercise.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've replaced this one with a Completer to wait on. I didn't think that particular case would be controversial since it involves a 0-delay timer and a non-zero delay timer, and I expect that timer callbacks would fire in order.

I really don't like the Timer in SlowCrashReporter. The point of that is to make runner.run's failure path take longer than the success path if they're racing. I don't think there's anything along the success path that I can wait on; I can't add a Completer to the test-implementation of exit() since the fix is to not call exit() in this situation.

I could add an explicit, global, @visibleForTesting-only Completer to runner.dart that will be completed when we continue down the success path. That also feels gross, but I suppose it's better than the hard-coded delay. Is that acceptable?

Remove hard-coded wait durations.

Also adjust some names.
});

await completer.future;
return FlutterCommandResult.success();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does anything go wrong if you do runCompleted.complete() here instead?

Copy link
Contributor Author

@jamesderlin jamesderlin Jun 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to work, but it would be a bit more brittle. If asynchronous work ever occurs after FlutterCommand.runCommand() returns, the test could pass when it should fail.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I'm not quite getting it. Could you spell this out for me a bit more? What's the sequence of events that would be problematic?

Copy link
Contributor Author

@jamesderlin jamesderlin Jun 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If CrashingFlutterCommand.runCommand completed runCompleted instead, the desired sequence of events would be:

  1. CrashingFlutterCommand.runCommand generates an asynchronous error.
  2. The onError callback from runner.run's runZoned call fires. We proceed down the crash reporting path. WaitingCrashReporter waits for runCompleted.
  3. CrashingFlutterCommand.runCommand continues, completes runCompleted, and returns.
  4. runner.run continues in its try block.
  5. WaitingCrashReporter is unblocked and successfully reports the crash.

If execution yields between steps 3 and 4, then step 5 could run first. If step 5 runs before step 4, a regression where runner.run's try block makes its own call to exit() would not be detected.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, thanks. I understand the concern now. Other than a stray call in analyze_continuously.dart, the tool only calls exit() in runner.dart and base/signals.dart. We should guard against proliferating calls to exit with a more on-the-nose test, but not in this PR. For this PR, I think it would be best to move the runCompleted completer from runner.dart into the test.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I filed #59338

Move the command Completer into the test to avoid polluting the tool
code.  Note that this weakens the test and might prevent it from
catching regressions.
@zanderso zanderso mentioned this pull request Jun 12, 2020
@jamesderlin jamesderlin requested a review from zanderso June 12, 2020 21:35
Copy link
Member

@zanderso zanderso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@fluttergithubbot fluttergithubbot merged commit c21b323 into flutter:master Jun 15, 2020
zljj0818 pushed a commit to zljj0818/flutter that referenced this pull request Jun 22, 2020
mingwandroid pushed a commit to mingwandroid/flutter that referenced this pull request Sep 6, 2020
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jul 30, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

tool Affects the "flutter" command-line tool. See also t: labels.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants