GitHub 0.36.5 to dev #21782
Conversation
It looks like the new superstring version is breaking the build.

I was able to put together narrower steps to reproduce this error message. Maybe this submodule needs to be initialized before publishing. Edit: or switch to a more Node-ecosystem-friendly solution.

The iconv submodule should be initialized before publishing to npm. I made a PR: atom/superstring#91

@sadick254 I'm guessing we'll need to cut new releases of superstring, whats-my-line, and atom/github to get this passing 😬

So, this would do it:
@smashwilson Yes, we do. I will be doing the releases in a short while.

Now it looks like the GitHub package tests are timing out the build 😓 It looks like the macOS builds there are running in about 14 minutes, although of course that's on a GitHub Actions runner instead of Azure DevOps, so it might not be completely equivalent. I'm wondering if we should split the GitHub package out completely into its own matrix row there. None of the other bundled packages has nearly that long a test time, so I feel like it's going to overwhelm whatever row it's on. I might also be able to add some internal logic to skip lengthy integration tests and such as part of a full Atom build.
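As a rough illustration of that last idea, here is a minimal sketch of a conditional `describe` that consults an environment variable. The variable name `ATOM_GITHUB_SKIP_INTEGRATION` and the suite shown are illustrative assumptions, not existing code in the github package.

```js
// Illustrative only: a hypothetical ATOM_GITHUB_SKIP_INTEGRATION variable that a full
// Atom build could set to bypass the slow integration suites. Assumes Mocha globals
// (describe/it) as used by the github package's tests.
const describeIntegration = process.env.ATOM_GITHUB_SKIP_INTEGRATION
  ? describe.skip
  : describe;

describeIntegration('integration: staging and committing', function() {
  it('stages a file end-to-end', function() {
    // ...lengthy end-to-end steps elided in this sketch...
  });
});
```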
The separation of the GitHub tests at the CI level has no benefits; they are already taken care of by the test runner (they run in separate child processes). The retrying is because of the timeout trigger I added to the tests (I defined this error as timing out). Disable the following line, and it will not retry them anymore: Line 219 in 629a7a2
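For readers without the file open, here is a hypothetical illustration of the kind of conditional being referred to (this is not the actual script/test code at that line; the field names and regex are assumptions): a failed package run is only re-queued when its output matches the timeout trigger, so removing that condition disables the retry.

```js
// Hypothetical sketch: retry a package's test run only when the failure output
// matches the "timed out" trigger described above.
function shouldRetry(result) {
  return result.exitCode !== 0 && /timed out/i.test(result.output);
}

// Example: the first run would be retried, the second would not.
console.log(shouldRetry({exitCode: 1, output: 'Error: timed out waiting for a DOM update'})); // true
console.log(shouldRetry({exitCode: 1, output: 'AssertionError: expected 2 to equal 3'}));     // false
```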
@aminya Ah! Thanks for the context. I pretty clearly haven't been paying attention here for a while 😅 I'll give it a shot with that line removed from the conditional. I'd thought that error wasn't fatal, but I'm pretty rusty, so I could be mistaken 🤔
Co-Authored-By: Amin Yahyaabadi <[email protected]>
For more context, there were some PRs earlier this year to update script/test.

These tests were flaky before too; that's why I had added that time-out trigger. My instinct says that the offending part is the switch to macOS 10.15, which was caused by this PR. Has macOS changed anything regarding code signing, app signatures, etc.? @smashwilson Could you revert this and this, and run the tests again?

Yah, I remember seeing these Decoration tests hard-crash Electron occasionally for a long time now (although it never crossed the threshold of being worth really getting to the bottom of). But on this branch I've never seen a build that didn't hit it, which leads me to believe that in this case the failure is legitimate. I'm wondering if these are made more likely by the superstring changes (the only related native code changes I'm aware of) or Electron changes (has there been an Electron upgrade on atom/atom master since the last time I released?). The "Error Downloading Update" message is, as far as I know, entirely unrelated. I believe it's written to stderr even when the tests are passing; you just don't see it because we only dump stderr for failing suites.
I have been doing a ton of CI runs to bisect this issue; feel free to take a look as I narrow it down. Branch: https://github.com/DeeDeeG/atom/commits/superstring-up, with commits + CI status icons (which can be clicked through to see the respective runs). Tentative lead: something changed in ...

@DeeDeeG thank you for helping out with this 😍

This PR atom/github#2587 is the only code change I'm seeing. I will put together a CI run with that PR reverted. I know it's an important bug fix, but it might be tripping up Atom's CI for reasons as yet unknown. (No code changed in ...)

@smashwilson reverting that PR (atom/github#2587) allows the macOS "package tests 1" job to finish. Example passing Atom CI: https://dev.azure.com/DeeDeeG/b/_build/results?buildId=920&view=results
Bizarre... Those tests are all green on atom/github's CI. I suppose the next step would be to try to reproduce the failures in atom/github with a local dev build of Atom's master? It's strange that it's a crash instead of a test failure, too.

Not sure if this helps, but I got this when running the Atom "package tests 1" set on Windows: https://gist.github.com/DeeDeeG/91429219c191794c5df52d8797bc2f61. On macOS, for me, it happens at some point during the ...

@smashwilson Please try what I suggested in #21782 (comment). In the GitHub repo, you are running the tests on macOS Mojave, but this repo switched to Catalina recently.
@aminya you may be onto something... The "Nightly" pipeline hasn't had a successful build since November 8. Among other things that have changed since then, the build would have been done on macOS 10.14 Mojave back then.

Two separate questions come to my mind:

- Does building Atom on macOS 10.15 (Catalina) reproduce this CI failure?
- Does running the tests on macOS 10.15 reproduce the CI failure?
- Only the combination of both?

Will try to run enough tests to sort it out.

Related (CI failures affecting not just Nightly but Production/master pipelines): I believe CI should use the GitHub file download API, and do so with authentication. Either it's trying to download "certificates" that don't exist, or GitHub is only telling CI the file doesn't exist because it's trying to download them unauthenticated.
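For context on what "use the GitHub file download API, and do so with authentication" could mean in practice, here is a minimal Node sketch using the GitHub contents API with a token. The repository path, file name, and `GITHUB_TOKEN` variable are assumptions for illustration; this is not Atom's actual CI script.

```js
// Minimal sketch: download a file via the GitHub contents API with a token so that
// private files don't look like 404s. OWNER/REPO and the path are placeholders.
const fs = require('fs');
const https = require('https');

https.get({
  hostname: 'api.github.com',
  path: '/repos/OWNER/REPO/contents/certs/mac-cert.p12', // hypothetical certificate path
  headers: {
    'User-Agent': 'atom-ci',
    'Accept': 'application/vnd.github.v3.raw',
    'Authorization': `token ${process.env.GITHUB_TOKEN}`
  }
}, res => {
  if (res.statusCode !== 200) {
    console.error(`Download failed: HTTP ${res.statusCode}`);
    process.exit(1);
  }
  res.pipe(fs.createWriteStream('mac-cert.p12'));
});
```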
@aminya I'm doing a CI run of this branch with those two commits reverted.

I found the reasons for the failing tests! GitHub's CI needs to be updated, as I mentioned earlier. Here are the details of the errors: ...

@aminya those look like different errors, though? The integration tests that are failing there have always been kind of brittle; they rely on receiving actual filesystem events and on the DOM updating in response. I've debated taking them out altogether because they seem to cause much more build trouble than they identify legitimate problems. The failures that we're seeing in the Atom build here are hard Electron crashes in the Decoration suite for some reason.
When I get time I'll have to split the crashes in here into their own atom/github issue 🤔

setup-atom downloads and sets up the official Atom releases from the different channels, so there is no difference in terms of the environment. If the tests fail, it should be related to the code itself. Maybe we should first get this PR working before trying to fix the Electron failures?
Do you mean get atom/github#2459 working and merged? That's fine, but we'll still need to fix the Electron crashes before we can merge this PR, because that's the only failure that's appearing here.

I'm confused about this, then. The reason this PR is not mergeable is the native crashes. What do you mean by "get this PR working" that isn't fixing those?

@smashwilson Never mind 😄 Definitely, you have more context about the issue. I thought that the test failures in #21782 and #21783 were different from the errors that happen in the Electron PR.
"Crashing" may be the usual quitting behavior for Atom on macOS when not code-signed. In practice, nothing obviously bad is happening in the GUI. I always see a crash event mentioned when closing a window, if running non-code-signed Atom via the command-line with full path to the binary, like this: % out/Atom\ Dev.app/Contents/MacOS/Atom\ Dev
Error Downloading Update: Could not get code signature for running application
Attempting to call a function in a renderer window that has been closed or released.
Function provided here: worker.js:79:22
Remote event names: crashed, destroyed

Perhaps (reading possibly way too far into the scant logging info) there is a worker responsible for getting updates, and perhaps that worker doesn't exit if it can't read a code signature. Then it's forced to exit less-than-gracefully the next time a window closes. Like I mentioned above, this is not a noticeable problem for end users that I know of. (Exiting cleanly would be good, but speaking for myself, I think there may be bigger things to worry about in Atom development than exiting this worker cleanly? In which case this could be seen as a CI issue rather than a production issue. We could try to ignore this particular error message, perhaps re-logging it as info rather than treating it as a full-on CI-failing level of error.)
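One way to "re-log it as info rather than treating it as a full-on CI-failing level of error" would be to filter the known message out of a suite's captured stderr before deciding whether to surface it. This is a hedged sketch with illustrative names, not the actual test-runner code.

```js
// Sketch: strip the known, benign code-signature message from captured stderr so it
// isn't reported alongside real failures. Pattern list and function name are assumptions.
const BENIGN_PATTERNS = [
  /^Error Downloading Update: Could not get code signature for running application/
];

function filterStderr(stderr) {
  return stderr
    .split('\n')
    .filter(line => !BENIGN_PATTERNS.some(pattern => pattern.test(line)))
    .join('\n');
}

const captured = [
  'Error Downloading Update: Could not get code signature for running application',
  'AssertionError: expected the decoration to be destroyed'
].join('\n');

console.log(filterStderr(captured)); // only the real assertion failure remains
```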
Crashes filed as atom/github#2607.
My above comment may be off-base -- I don't think this resembles the usual background noise of "closing a window prints a message about a crash event." With this bug, I see the test output trying to print which test was running, and the output is cut off in the middle of the line. So I think the crash bubbles all the way up to the test runner itself? Or maybe just the test runner is crashing? (As @smashwilson mentioned here: #21782 (comment), this bug/crash/phenomenon is reliably happening in the Decoration suite.)

Here's an example run, linking to the line where the text is cut off part-way through: https://dev.azure.com/DeeDeeG/b/_build/results?buildId=979&view=logs&j=6de74252-975b-522d-7a3d-227ca79ca6c1&t=a70a2b1c-2767-57d7-6509-edd8e9181730&l=145

@smashwilson @aminya I think I've narrowed it down to this test group: https://github.com/atom/github/blob/v0.36.6/test/atom/decoration.test.js#L42-L167 (this is the only part of the decoration suite that contains the text that was cut off).

P.S. These tests are running in parallel if I understand correctly, so I suppose the issue could be in one of the other nearby tests, not necessarily the one whose name is truncated.
I tried this with parallelization in script/test.

Baseline: as a baseline or control to compare against, I used the code and dependencies of the current setup. Result: the "macOS test packages-1" job is passing. Same thing with the updated dependencies.

Actually, I'm mostly seeing the same results.
There is no parallelization in the tests. If you want to run them in parallel, you should change Line 478 in 1e10a0a.
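As a rough illustration of what a concurrency knob like that controls (this is a hedged sketch, not the actual script/test code): with a limit of 1 the package suites run strictly one at a time, and raising the limit is what would introduce parallelism.

```js
// Sketch using the `async` package's parallelLimit: each task stands in for spawning
// one package's test run. concurrency = 1 means no parallelism.
const async = require('async');

const concurrency = 1; // raise this to run suites in parallel

const suites = ['github', 'tree-view', 'settings-view'].map(name => done => {
  console.log(`running ${name} tests...`);
  setTimeout(() => done(null, `${name}: ok`), 100); // stand-in for a real child process
});

async.parallelLimit(suites, concurrency, (err, results) => {
  if (err) throw err;
  results.forEach(result => console.log(result));
});
```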
Okay, thanks. I must have misremembered -- I think greater test parallelization was attempted at some point, but then it got changed back.

I didn't learn a lot from my experiment in that comment... It turns out the result is the same, and still just as mysterious. I am trying everything I can think of to narrow down the cause of this strange issue, or at least get CI to start passing here.
I have found that the github package tests pass when run outside of script/test: just download the built Atom from the CI artifacts and run the tests directly. I think it's another one of those "weird abrupt quitting behavior when run as a child process" situations. There is a downside to running outside of script/test, though.

This error reproduced at least once. Perhaps this new reproduced case can help zero in on the true cause. CI run based on commit 0008c25 (not saying that commit is related, or not related, to this reproducing; I'm not sure. Just documenting this info in case the CI run gets deleted by Azure Pipelines.)
I ran the tests in another environment. Interestingly, when running outside of script/test, I see timeouts in the flaky filesystem-accessing tests. When I patch them to have 60-second timeouts (CI link with 0.36.6 patched to have 60 second timeouts), these timeouts do not happen for me under the otherwise-same testing scenario but in a different configuration. There is an open issue for exactly one out of this group of test failures in the github repo. Edit: Another issue for these flaky filesystem-accessing tests: atom/github#1769
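For reference, the "60 second timeouts" patch being described is the kind of change Mocha supports per suite; the suite and test below are illustrative placeholders, not the actual patched tests.

```js
// Sketch: raising a Mocha suite's timeout to 60 seconds so slow filesystem-backed
// tests aren't reported as timed out. Suite/test names are placeholders.
describe('github package: filesystem-backed views', function() {
  this.timeout(60000); // up from the much shorter default

  it('eventually reflects a change written to the work tree', function(done) {
    setTimeout(done, 10); // stand-in for waiting on a real filesystem event
  });
});
```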
This job runs outside of `script/test`, without using child_processes. It appears to help github's tests pass as of the v0.36.5 release. See atom#21782 for discussion of (and attempts to debug) this issue.
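A hedged sketch of what such a job might invoke (the `ATOM_EXE` variable and the spec directory are assumptions; the real pipeline step may differ): the built Atom binary runs the github package's tests directly via `atom --test`, with no intermediate child process from script/test.

```js
// Sketch: run the github package's tests directly with the built Atom binary.
const {spawnSync} = require('child_process');
const path = require('path');

const atomExe = process.env.ATOM_EXE || 'atom'; // e.g. the binary from the CI build artifacts
const specDir = path.join('node_modules', 'github', 'test'); // assumed location of the package's tests

// `atom --test <dir>` runs that directory's specs with the package's configured runner.
const {status} = spawnSync(atomExe, ['--test', specDir], {stdio: 'inherit'});
process.exit(status === null ? 1 : status);
```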
aminya left a comment:
This should be closed now that #21903 has been merged.

Yup 👍🏻
Upgrade the GitHub package on the nightly channel to the latest release.