@hebasto
Member

@hebasto hebasto commented Sep 19, 2023

During my investigation of #28411 and other similar functional test failures on Windows in CI, I found that

    self.process = subprocess.Popen(self.args + extra_args, env=subp_env, stdout=stdout, stderr=stderr, cwd=cwd, **kwargs)

sometimes fails for reasons unknown to me. By "fails", I mean that the child process does not make any progress.

This PR verifies that the child process makes progress by checking, shortly after launch, that it has created its PID file. If the check fails, the launch is attempted up to two more times.
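For readers following along, the retry idea described above can be sketched roughly as follows. This is a minimal illustration and not the PR's actual diff: the helper names `start_with_pid_check` and `_wait_for_pid_file`, their parameters, and the timeout values are all assumptions made for the example.

```python
import os
import subprocess
import sys
import time


def _wait_for_pid_file(process, pid_file, timeout, poll_interval):
    """Return True once pid_file exists, False if the child exits or the timeout expires first."""
    deadline = time.monotonic() + timeout
    while True:
        # Sample the exit status first so that a file written just before
        # the child exited is still seen by the check below.
        exited = process.poll() is not None
        if os.path.exists(pid_file):
            return True
        if exited or time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)


def start_with_pid_check(args, pid_file, attempts=3, check_timeout=10.0, poll_interval=0.1):
    """Start a child process and retry if it never creates its PID file.

    Hypothetical helper: names, defaults and timeouts are illustrative only.
    """
    for attempt in range(1, attempts + 1):
        process = subprocess.Popen(args)
        if _wait_for_pid_file(process, pid_file, check_timeout, poll_interval):
            return process  # the child made progress
        # No progress: log it, clean up, and try again.
        print(f"attempt {attempt}/{attempts}: no PID file at {pid_file}", file=sys.stderr)
        if process.poll() is None:
            process.kill()
        process.wait()
    raise RuntimeError(f"child process made no progress after {attempts} attempts")
```

With the default `attempts=3`, this corresponds to one initial launch plus two retries. Each failed attempt is logged and the final failure raises, so a stuck launch is reported rather than silently swallowed.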

Although this PR fixes tests on Windows, the new logic is platform-agnostic and increases test robustness.

In several dozen GHA runs in my personal repo, the only intermittent failure that still happens is #28491.

Closes #28411.

@DrahtBot
Contributor

DrahtBot commented Sep 19, 2023

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.
A summary of reviews will appear here.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #28392 (test: Use pathlib over os path by ns-xvrn)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@maflcko
Member

maflcko commented Sep 19, 2023

May be easier to just bump the python version from 3.9 to 3.12 to fix the bug?

@hebasto hebasto marked this pull request as draft September 20, 2023 08:03
@hebasto hebasto marked this pull request as ready for review September 20, 2023 09:49
@hebasto
Member Author

hebasto commented Sep 20, 2023

The CI failure is #28491 and unrelated to this PR.

@hebasto
Member Author

hebasto commented Sep 20, 2023

... just bump the python version from 3.9 to 3.12...

From Python 3.12 Release Schedule:

Expected:

  • 3.12.0 final: Monday, 2023-10-02

The currently available Python versions in the Windows 2022 image:

  • 3.7.9
  • 3.8.10
  • 3.9.13
  • 3.10.11
  • 3.11.5

@maflcko
Member

maflcko commented Sep 20, 2023

Which one are we using right now?

@hebasto
Member Author

hebasto commented Sep 20, 2023

Which one are we using right now?

On Windows, it is 3.11.5.

@maflcko
Member

maflcko commented Sep 20, 2023

Ok, so the issue is probably not due to a too-old Python version.

@fanquake
Member

Concept ~0. A bunch of extra code in the test-framework, to fix a not-yet-identified, Windows only issue.

the new logic is platform-agnostic and increases test robustness.

Can you elaborate on how this increases robustness for non-Windows platforms, if they are already working?

@hebasto
Member Author

hebasto commented Sep 24, 2023

Concept ~0. A bunch of extra code in the test-framework, to fix a not-yet-identified, Windows only issue.

  1. We already have an entire directory with code that serves similar purposes in our CI.

  2. We already have a bunch of platform-specific code in the test-framework.

  3. The issue has been identified (please refer to the PR description), but its cause has not yet been determined.
    Of course, it would be great if someone identifies it. And then this workaround can be dropped.

Can you elaborate on how this increases robustness for non-Windows platforms, if they are already working?

If some similar issues will happen for non-Windows platform in the future, they won't break the tests.

@fanquake
Member

If some similar issues will happen for non-Windows platform in the future, they won't break the tests.

You mean the issues will just be hidden / less-likely to be identified & debugged?

@hebasto
Member Author

hebasto commented Sep 24, 2023

@fanquake

If some similar issues will happen for non-Windows platform in the future, they won't break the tests.

You mean the issues will just be hidden / less-likely to be identified & debugged?

This PR adds additional logging and exceptions.

What do you suggest?

@fanquake
Member

What do you suggest?

I would suggest we figure out why Python doesn't work on Windows, or at least, doesn't work when run in the GitHub CI, and fix it in a targeted way (while reporting the issue upstream), with the intention to drop the workaround as soon as a newer version of Python is available, rather than inject all this new code, into the test framework, where it affects all platforms.

@hebasto
Member Author

hebasto commented Sep 25, 2023

I would suggest we figure out why Python doesn't work on Windows, or at least, doesn't work when run in the GitHub CI...

I have started to think that the issue is specific to GHA CI, as I cannot reproduce it locally.

@maflcko
Member

maflcko commented Sep 29, 2023

Could it make sense to disable the functional tests on Windows for pull requests and only run them on master?

This means that issues will be caught at a later stage only, but I'd suspect they are easy to fixup post-merge.

Overall this may be less work than having someone re-run the CI on all affected pull requests or having people ignore the Windows CI anyway.
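One way such a gate could look, purely as a sketch: a small check that a CI script might call before launching the functional tests. It assumes the standard GitHub Actions environment variables `GITHUB_EVENT_NAME` and `GITHUB_REF`; the function name `should_run_functional_tests` is hypothetical, and the change that was eventually merged may implement the condition differently (for example, directly in the workflow file).

```python
import os


def should_run_functional_tests(env=None):
    """Return True only for pushes to master, so pull request runs skip the tests.

    Hypothetical helper for illustration; relies on the GitHub Actions
    default variables GITHUB_EVENT_NAME and GITHUB_REF being set.
    """
    env = os.environ if env is None else env
    event = env.get("GITHUB_EVENT_NAME", "")
    ref = env.get("GITHUB_REF", "")
    return event == "push" and ref == "refs/heads/master"
```

Passing the environment as a parameter keeps the decision easy to test without touching the real process environment.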

@fanquake
Member

fanquake commented Oct 2, 2023

Yea, I think this might be the right thing to do (for now). Persistent random red CI is pointless, and confusing for contributors. It's a shame that Windows Python doesn't seem to work on GitHub, but we also aren't going to make all the changes here to work around that.

fanquake added a commit to bitcoin-core/gui that referenced this pull request Oct 4, 2023
…windows in master

aba4a58 ci: Only run functional tests on windows in master (Fabian Jahr)

Pull request description:

  This idea was discussed [here](bitcoin/bitcoin#28509 (comment)).

ACKs for top commit:
  hebasto:
    ACK aba4a58

Tree-SHA512: 89fd6352b585bae3538d5350b0404c216a8225fe356d408c1ebe3394e7b9a190d65639f4eef310056e020909928d7a1f2de25585c97d2ac087d1a9f72af281eb
Frank-GER pushed a commit to syscoin/syscoin that referenced this pull request Oct 13, 2023
…in master

aba4a58 ci: Only run functional tests on windows in master (Fabian Jahr)

Pull request description:

  This idea was discussed [here](bitcoin#28509 (comment)).

ACKs for top commit:
  hebasto:
    ACK aba4a58

Tree-SHA512: 89fd6352b585bae3538d5350b0404c216a8225fe356d408c1ebe3394e7b9a190d65639f4eef310056e020909928d7a1f2de25585c97d2ac087d1a9f72af281eb
@bitcoin bitcoin locked and limited conversation to collaborators Oct 1, 2024
Development

Successfully merging this pull request may close these issues.

ci: wallet_listtransactions.py --legacy-wallet failure

4 participants