☂️ Hot restart integration tests are quite flaky #153049

@andrewkolos

Description

Recently, some of flutter_tools' hot restart integration tests have been flaky on CI, usually due to timeouts. While there may be various causes, I cannot help but think there is some fragility or race-proneness in the hot-reload workflow in the tool code. This is a problem for all the usual reasons test flakiness is, but this issue is especially important since hot reload is a core offering of Flutter. It needs to be reliably testable, and I want to be sure hot reloading is not hanging for users as well.

This is an umbrella issue to group other issues together and serve as a place for meta-discussion across all these issues.

  1. The earliest one I'm personally aware of is Mac tool_integration_tests_3_4 is 2.02% flaky #146879. The flaky test here was test/integration.shard/vmservice_integration_test.dart: Flutter Tool VMService method hotRestart can be called, which was timing out. I added some logging to try to troubleshoot the issue, but I was unable to determine a root cause because the flake stopped appearing (perhaps because some side effect of my logging code "fixed" it, the test moved to a different shard, or some other external factor changed). To summarize what I did find, the tool was getting stuck waiting for an IsolateRunnable event to come back from the VM service (comment with more info).
  2. At the same time, there was Mac tool_integration_tests_2_4 is 2.04% flaky #145812. The flaky test here was test/integration.shard/hot_reload_test.dart: hot restart works without error. At the time I considered this a duplicate of the previously mentioned Mac tool_integration_tests_3_4 is 2.02% flaky #146879, but perhaps this was not the case.
  3. Mac tool_integration_tests_3_4 is 2.02% flaky #152220 seems to be a reappearance of the first, Mac tool_integration_tests_3_4 is 2.02% flaky #146879 (test/integration.shard/vmservice_integration_test.dart: Flutter Tool VMService method hotRestart can be called).
  4. Mac tool_integration_tests_2_5 is extremely flaky #153026 appears to be more recent and severe. The failing tests here are test/integration.shard/hot_reload_test.dart: hot restart works without error and test/integration.shard/hot_reload_with_asset_test.dart: hot restart does not need to sync assets on the first reload. The former would be a reappearance of the second issue, Mac tool_integration_tests_2_4 is 2.04% flaky #145812. The latter is a new issue. The flake rate here appeared to be worsened by [tool] Guard process writes to frontend server in ResidentCompiler #152358, which has since been reverted. That PR should have resulted in no behavior change, just some additional awaits and waiting for stdin sinks to flush between writes to the compiler server process. This suggests there is a race somewhere in or around DefaultResidentCompiler in the tool code.
  5. (Added on 8/14/24): test/integration.shard/overall_experience_test.dart: flutter run can hot reload and hot restart, handle "p" key (logs)

I will be trying to locally reproduce these for better troubleshooting. However, I imagine the root cause of many of these flakes will be quite tricky to figure out, so I will probably need all the help I can get here.
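
As a rough illustration of why "just some additional awaits" (item 4) can still change observable behavior, here is a minimal, language-neutral sketch in Python (flutter_tools itself is written in Dart). Every name in it (`FakeSink`, `send_unflushed`, `send_flushed`, `run_two`) is hypothetical and not taken from the tool code, and it is not a claim about the actual root cause. It only shows that awaiting a flush between writes adds suspension points at which other pending work can run, which can expose an ordering assumption that was previously hidden.

```python
import asyncio


class FakeSink:
    """Hypothetical stand-in for a child process's stdin sink."""

    def __init__(self):
        self.lines = []

    def write(self, line):
        self.lines.append(line)

    async def flush(self):
        # Yield to the event loop once, as a real asynchronous flush would.
        await asyncio.sleep(0)


async def send_unflushed(sink, lines):
    # No await between writes: no other coroutine can run mid-request.
    for line in lines:
        sink.write(line)


async def send_flushed(sink, lines):
    # Awaiting flush after each write lets other coroutines run in between.
    for line in lines:
        sink.write(line)
        await sink.flush()


async def run_two(send):
    # Issue two multi-line requests concurrently against one shared sink.
    sink = FakeSink()
    await asyncio.gather(
        send(sink, ["A1", "A2"]),
        send(sink, ["B1", "B2"]),
    )
    return sink.lines


if __name__ == "__main__":
    print(asyncio.run(run_two(send_unflushed)))  # ['A1', 'A2', 'B1', 'B2']
    print(asyncio.run(run_two(send_flushed)))    # ['A1', 'B1', 'A2', 'B2']
```

In the second run the two requests interleave only because of the added suspension points. If something similar happens in the tool, any caller that implicitly relied on a request being written without interruption would start to misbehave once the awaits are added.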

Metadata

Assignees

Labels

P1 (High-priority issues at the top of the work list), team-tool (Owned by Flutter Tool team), triaged-tool (Triaged by Flutter Tool team)
