fix(test-tests): prevent race condition in gentest tests w/xdist #2009

Merged
marioevz merged 2 commits into ethereum:forks/amsterdam from danceratopz:ci/fix-clirunner-race-condition
Jan 12, 2026

@danceratopz danceratopz commented Jan 12, 2026

🗒️ Description

Problem

The test_tx_type tests fail intermittently in CI with the error:

Error formatting code using formatter '.../bin/ruff'

When multiple pytest-xdist workers run in parallel, they all invoke ruff, which reads and writes the same .ruff_cache directory. Concurrent access to that shared directory can cause race conditions.

This is the likely culprit for these failures (https://github.com/ethereum/execution-specs/actions/runs/20916668952/job/60091856756?pr=1803):

=================================== FAILURES ===================================
_______________________________ test_tx_type[0] ________________________________
[gw1] linux -- Python 3.11.14 /opt/actions-runner/_work/execution-specs/execution-specs/.tox/tests_pytest_py3/bin/python3

pytester = <Pytester PosixPath('/opt/actions-runner/_work/execution-specs/execution-specs/.tox/.tmp/pytest/popen-gw1/test_tx_type0')>
tmp_path = PosixPath('/opt/actions-runner/_work/execution-specs/execution-specs/.tox/.tmp/pytest/popen-gw1/test_tx_type_0_0')
monkeypatch = <_pytest.monkeypatch.MonkeyPatch object at 0x7ab64149b550>
tx_type = 0
transaction_hash = '0xa41f343be7a150b740e5c939fa4d89f3a2850dbe21715df96b612fc20d1906be'

    @pytest.mark.parametrize("tx_type", list(transactions_by_type.keys()))
    def test_tx_type(
        pytester: pytest.Pytester,
        tmp_path: Path,
        monkeypatch: Any,
        tx_type: int,
        transaction_hash: str,
    ) -> None:
        """Generates a test case for any transaction type."""
        # This test is run in a CI environment, where connection to a
        # node could be unreliable. Therefore, we mock the RPC request to avoid any
        # network issues. This is done by patching the `get_context` method of the
        # `StateTestProvider`.
        runner = CliRunner()
        tmp_path_tests = tmp_path / "tests"
        tmp_path_tests.mkdir()
        tmp_path_output = tmp_path / "output"
        tmp_path_output.mkdir()
        generated_py_file = str(tmp_path_tests / f"gentest_type_{tx_type}.py")
    
        tx = transactions_by_type[tx_type]
    
        def get_mock_context(self: StateTestProvider) -> dict:
            del self
            return tx
    
        monkeypatch.setattr(StateTestProvider, "get_context", get_mock_context)
    
        ## Generate ##
        gentest_result = runner.invoke(
            generate, [transaction_hash, generated_py_file]
        )
>       assert gentest_result.exit_code == 0
E       assert 1 == 0
E        +  where 1 = <Result Exception("Error formatting code using formatter '/opt/actions-runner/_work/execution-specs/execution-specs/.tox/tests_pytest_py3/bin/ruff'")>.exit_code

/opt/actions-runner/_work/execution-specs/execution-specs/packages/testing/src/execution_testing/cli/gentest/tests/test_cli.py:145: AssertionError

Investigation

The initial hypothesis was that Click's CliRunner might be causing the issue, since its documentation states:

"This only works in single-threaded systems without any concurrency as it changes the global interpreter state."

However, investigation revealed this warning doesn't apply here:

  1. pytest-xdist uses processes, not threads - Each worker (gw0, gw1, etc.) runs in a separate Python process with isolated interpreter state
  2. CliRunner's global state modifications (sys.stdin, sys.stdout, os.environ) are confined to each worker's process
  3. Process isolation does NOT protect shared filesystem resources - All workers share the same .ruff_cache directory

This pointed back to the ruff cache as the likely culprit, but without stderr capture we can't be sure.
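
The distinction drawn above (interpreter state is per-process, the filesystem is not) can be illustrated with a small, self-contained sketch. It does not use pytest-xdist or ruff: `WORKER_SOURCE`, `run_workers`, and the `GENTEST_DEMO_WORKER_ID` variable are hypothetical names chosen for this illustration, and the two child processes merely stand in for workers such as gw0 and gw1.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical worker body: it mutates process-global state (os.environ)
# and then writes a file into a directory shared with the other workers.
WORKER_SOURCE = """
import os, sys
from pathlib import Path
worker_id, shared_dir = sys.argv[1], sys.argv[2]
os.environ["GENTEST_DEMO_WORKER_ID"] = worker_id  # private to this process
Path(shared_dir, f"{worker_id}.txt").write_text(worker_id)
"""


def run_workers(shared_dir: Path, worker_ids: list[str]) -> list[str]:
    """Start one child process per worker id and list the shared directory."""
    procs = [
        subprocess.Popen(
            [sys.executable, "-c", WORKER_SOURCE, worker_id, str(shared_dir)]
        )
        for worker_id in worker_ids
    ]
    for proc in procs:
        if proc.wait() != 0:
            raise RuntimeError("demo worker failed")
    return sorted(path.name for path in shared_dir.iterdir())


with tempfile.TemporaryDirectory() as tmp:
    files_seen = run_workers(Path(tmp), ["gw0", "gw1"])

# The children's os.environ changes never reach the parent process...
parent_env_leaked = "GENTEST_DEMO_WORKER_ID" in os.environ
# ...but every process wrote into the same directory on disk.
print(files_seen, parent_env_leaked)
```

Both workers' files end up in the one shared directory while the parent's environment is untouched, which is the same shape as isolated CliRunner state combined with a shared .ruff_cache.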

🔗 Related Issues or PRs

Sporadic failure seen in:

✅ Checklist

  • All: Ran fast tox checks to avoid unnecessary CI fails, see also Code Standards and Enabling Pre-commit Checks:
    uvx tox -e static
  • All: PR title adheres to the repo standard - it will be used as the squash commit message and should start type(scope):.
  • All: Considered adding an entry to CHANGELOG.md. skipped
  • All: Considered updating the online docs in the ./docs/ directory.
  • All: Set appropriate labels for the changes (only maintainers can apply labels).

Include ruff's stderr output in exceptions when formatting fails,
enabling diagnosis of the root cause in CI failures.
Add --no-cache flag to ruff invocation to prevent potential race
conditions when pytest-xdist runs tests with multiple workers sharing
the same .ruff_cache directory.
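
As a rough sketch of what these two commits describe (not the actual gentest code), a formatter wrapper could include the tool's stderr in the raised exception and pass `--no-cache` to ruff. The names `FormatError`, `format_source`, and `RUFF_CMD` are hypothetical, and the real implementation may invoke ruff on a file path rather than on stdin.

```python
import subprocess
import sys
from typing import Sequence


class FormatError(Exception):
    """Raised when the external formatter exits with a non-zero status."""


def format_source(source: str, formatter_cmd: Sequence[str]) -> str:
    """Pipe `source` through `formatter_cmd` and return the formatted text."""
    result = subprocess.run(
        list(formatter_cmd),
        input=source,
        capture_output=True,  # keep stderr so it can be reported on failure
        text=True,
        check=False,
    )
    if result.returncode != 0:
        # Surfacing stderr is what makes a sporadic CI failure diagnosable.
        raise FormatError(
            f"Error formatting code using formatter {formatter_cmd[0]!r} "
            f"(exit code {result.returncode}): {result.stderr.strip()}"
        )
    return result.stdout


# Assumed invocation: format stdin ("-") without reading or writing
# .ruff_cache, so parallel workers do not contend for the cache directory.
RUFF_CMD = ["ruff", "format", "--no-cache", "-"]

if __name__ == "__main__":
    try:
        print(format_source("x=1\n", RUFF_CMD))
    except (FormatError, FileNotFoundError) as exc:
        print(f"formatting failed: {exc}", file=sys.stderr)
```

Passing the command in as a parameter keeps the helper testable with a stand-in formatter when ruff is not installed.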
@danceratopz added the labels C-bug (Category: this is a bug, deviation, or other problem), A-test-tests (Area: tests for packages/testing), and A-ci (Area: Continuous Integration) on Jan 12, 2026
@danceratopz changed the title from "fix(test-tests): prevent ruff cache race condition in gentest tests w/xdist" to "fix(test-tests): prevent race condition in gentest tests w/xdist" on Jan 12, 2026
codecov bot commented Jan 12, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.33%. Comparing base (f391a0c) to head (f47232a).
⚠️ Report is 1 commits behind head on forks/amsterdam.

Additional details and impacted files
@@               Coverage Diff                @@
##           forks/amsterdam    #2009   +/-   ##
================================================
  Coverage            86.33%   86.33%           
================================================
  Files                  538      538           
  Lines                34557    34557           
  Branches              3222     3222           
================================================
  Hits                 29835    29835           
  Misses                4148     4148           
  Partials               574      574           
Flag Coverage Δ
unittests 86.33% <ø> (ø)

@marioevz marioevz left a comment

LGTM. Thanks!

@marioevz marioevz merged commit 05def83 into ethereum:forks/amsterdam Jan 12, 2026
18 checks passed
fselmo pushed a commit to fselmo/execution-specs that referenced this pull request Jan 12, 2026
…ereum#2009)

* fix(gentest): capture ruff stderr in format error messages

Include ruff's stderr output in exceptions when formatting fails,
enabling diagnosis of the root cause in CI failures.

* fix(gentest): disable ruff cache to prevent parallel test failures

Add --no-cache flag to ruff invocation to prevent potential race
conditions when pytest-xdist runs tests with multiple workers sharing
the same .ruff_cache directory.
@danceratopz danceratopz deleted the ci/fix-clirunner-race-condition branch January 13, 2026 04:03
jsign pushed a commit to jsign/execution-specs that referenced this pull request Jan 20, 2026
)

* refactor(benchmark): remvoe space from test name

* refactor: prevent trailing space
CPerezz pushed a commit to CPerezz/execution-specs that referenced this pull request Feb 27, 2026
Labels

A-ci (Area: Continuous Integration)
A-test-tests (Area: tests for packages/testing)
C-bug (Category: this is a bug, deviation, or other problem)
