-
Notifications
You must be signed in to change notification settings - Fork 3.6k
[ci] Partially migrate macOS-host Pigeon tests to LUCI #3046
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
It looks like these tests are hanging in LUCI, even though they run fine on Cirrus. @godofredoc Is there a way to connect to a machine while it's running to see if we can tell what process is hanging? |
|
@godofredoc Ping on the question above about debugging hangs. This is the last macOS LUCI migration, and I don't know how to make progress on it without infra help. |
|
|
|
It's not 119750; I've been experimenting with Intel in #3149 and it hangs there too. So either there's a real hang, or a process left around that makes LUCI think there's a hang, and I'm still not sure how to debug that further. |
|
Based on my experiments in #3149 it seems to be specific to the macOS tests, so perhaps something related to launching the macOS application. But we're doing that successfully on LUCI machines in flutter/plugins. |
d2b0e8d to
bf885d7
Compare
|
Watching a run over VNC I'm still not sure what's going on, but it looks like despite the missing output the macOS application was built (it's on disk) and probably ran tests (I see test data on disk, although I'm not sure how to inspect it to be sure it's as expected). And since I don't see |
Disabling output streaming (and thus the waits for it to complete) didn't help; it was apparently stuck on |
We can try running the test, getting the list of dart subprocesses and their children and killing one by one of the child subprocesses to see if at some point the main dart subprocess recovers and exits. We've seen this behavior with |
|
This time the tests passed, with the change being that I had added some (I didn't see any obvious subprocesses still running the last time I was connected and watch. I was just looking at the reverse-PID list of active processes though, so maybe I missed something.) |
|
Ah, I think I confused myself about what I had previously tested, and the print statements were unrelated. If I run just the macOS unit tests, everything is fine. If I run that same test after running the iOS tests, it times out, but after the logs show the iOS tests as having run successfully. I'll try just the macOS integration tests and see if that also passes, but it seems like it's a combination of tests that triggers this. |
Yes, that worked too. So any individual test is fine, but either macOS test will hang in LUCI if another test (iOS or macOS) runs first. |
I just tried this again; according to the hierarchical view in Activity Monitor there are no processes under the test process when the hang happens. It really seems like somehow the problem is in the awaits in the Dart code itself rather than with the process being run, but I'm not seeing what it would be. |
|
I replaced all of the code that handles awaits process and process stream awaiting with I've exhausted everything I can think to try at the moment. |
|
@tarrinneal This is ready for review as a hybrid, interim solution that gets the tests that do run successfully on LUCI running there, while leaving the ones that don't on Cirrus for now. Once this lands I'll make a new draft PR to fully switch that we can used to continue debugging flutter/flutter#120231. |
|
Landing on red since once again this PR seems to somehow have affected post-submit tests for the tree before it landed. |
Partially migrates the macOS-host Pigeon tests to LUCI:
While having the tests split is not ideal, it allows us to at least begin the process of moving these tests to LUCI (rather than leaving the failing bringup-mode tasks indefinitely) while we try to figure out why macOS tests are hanging in CI. And this allows us to eliminate the legacy iOS test harness, reducing tech debt.
This does mean we'll double-run other package's custom tests for now, but those other tests take ~30s in CI, so the cost is irrelevant.
Part of the consolidation of test harnesses that is covered by flutter/flutter#111505