ci: speed up schema crash test (CSafeLoader + xdist-safe aggregation)#3873
ci: speed up schema crash test (CSafeLoader + xdist-safe aggregation)#3873
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 417c9ea883
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| notify: | ||
| name: Notify on failure | ||
| needs: [schema_crash_test] | ||
| if: failure() && github.event.pull_request == null |
There was a problem hiding this comment.
Restrict failure issue creation to main-branch runs
The notify job currently runs on any failed workflow_dispatch execution because its condition only checks failure() and github.event.pull_request == null. Since manual dispatch can target non-main branches, a contributor’s failed branch run will open/update the shared "schema crash test failing" issue with a "main branch" title/body, creating false alarms and potentially masking real main failures until a later main success closes it. Add the same github.ref == 'refs/heads/main' guard used by close-on-success so only main-branch failures create this issue.
Useful? React with 👍 / 👎.
The real-world schema crash test ran ~700 provider cases serially, taking 25-40 minutes wall-clock on PRs touching json_schema code. Parallelizing with -n auto uses the runner's cores for 4-8x speedup.
417c9ea to
80bccc6
Compare
The real-world schema crash test was dominated by pure-Python YAML parsing (~80% of wall-clock on AWS/GitHub/Google providers). Switching to libyaml's CSafeLoader gives ~8x speedup per file with identical safety guarantees. Prior to this change, -n auto broke the aggregate baseline check because the module-level _results accumulator only saw results from whichever xdist worker happened to run test_z_aggregate_crash_rate. Each per-provider test now persists its counts to SCHEMA_CRASH_RESULTS_DIR, and aggregation moves to a pytest_sessionfinish hook in conftest.py — that hook runs on the xdist master after all workers finish, giving a correct cross-worker total.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9f4a28ee0b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| RUN_REAL_WORLD_SCHEMA_TEST: "1" | ||
| OPENAPI_DIRECTORY_PATH: /tmp/openapi-directory | ||
| run: uv run pytest tests/utilities/json_schema_type/test_real_world_schemas.py -m integration -v -s --timeout-method=thread | ||
| run: uv run pytest tests/utilities/json_schema_type/test_real_world_schemas.py -m integration -v -n auto --timeout-method=thread |
There was a problem hiding this comment.
Prevent xdist workers from racing on repo setup
Running this workflow with -n auto makes multiple workers call test_provider_schemas concurrently, and each call still executes _ensure_repo() (in tests/utilities/json_schema_type/test_real_world_schemas.py) against the shared OPENAPI_DIRECTORY_PATH (/tmp/openapi-directory). When the checkout is not already at the pinned commit, _ensure_repo() does shutil.rmtree(CLONE_DIR) and reclones, so two workers can delete/clone the same directory at once and intermittently raise git clone/git CalledProcessError, turning the job flaky before any schema assertions run.
Useful? React with 👍 / 👎.
Under -n auto, multiple xdist workers can enter _ensure_repo() concurrently and race on shutil.rmtree + git clone against the shared CLONE_DIR. An fcntl.flock around the rmtree/clone path serializes the reclone, and a lock-free fast path returns early when HEAD already matches the pinned commit (covers the steady state where the workflow's clone step has already prepared the directory).
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ce268f4486
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
|
||
| from __future__ import annotations | ||
|
|
||
| import fcntl |
There was a problem hiding this comment.
Guard fcntl import on non-POSIX platforms
test_real_world_schemas.py now imports fcntl at module import time, but fcntl is unavailable on Windows. Our default CI (.github/workflows/run-tests.yml, matrix includes windows-latest) runs pytest tests -m "not integration ...", which still imports test modules during collection, so this raises ModuleNotFoundError before marker deselection and can fail the whole Windows test job on unrelated PRs. Please make the lock implementation conditional (or defer import) so collection succeeds on Windows.
Useful? React with 👍 / 👎.
Test Failure AnalysisSummary: The Windows unit test job fails during pytest collection because Root Cause: The PR adds Suggested Solution: Guard the import so it doesn't fail on Windows. Since In import fcntlwith: try:
import fcntl as _fcntl
_HAS_FLOCK = True
except ImportError:
_HAS_FLOCK = False # Windows — no file locking availableAnd update with open(lock_path, "w") as lf:
if _HAS_FLOCK:
_fcntl.flock(lf.fileno(), _fcntl.LOCK_EX)
try:
...
finally:
if _HAS_FLOCK:
_fcntl.flock(lf.fileno(), _fcntl.LOCK_UN)This matches the existing precedent in the same file where Detailed AnalysisFailing job: Error log excerpt: The test was introduced/extended as an integration test, but since pytest loads all test modules during collection (to read their markers), the top-level The Related Files
|
fcntl is POSIX-only; importing at module level breaks pytest collection on Windows (the integration test is skipped on Windows but collection still imports the module to read its markers). Move the import inside _ensure_repo, which only runs on POSIX.
|
I think I had two prs that fixed bugs in this area so depending on merge order the type issues may be fixed and we can update the baseline Will look in a bit |
Follow-up to #3826.
The real-world schema crash test was running ~700 provider-level pytest cases serially, taking 25-40 minutes on PRs touching schema-to-type code. Three changes speed it up and make parallel execution correct:
CSafeLoaderinstead ofyaml.safe_load. Profiling showed ~80% of wall-clock was pure-Python YAML parsing —safe_load()doesn't pick up libyaml automatically. Switching to the C loader with the same safety semantics gives ~8x speedup per file.-n autoin the pytest command. GitHub's standard ubuntu runners have 2 cores, so this gives ~2x on top of the YAML win.Disk-backed result persistence +
pytest_sessionfinishaggregation. The previous aggregate test (test_z_aggregate_crash_rate) read a module-level_resultsdict, which broke under xdist because each worker is a separate process. Now each per-provider test writes its counts toSCHEMA_CRASH_RESULTS_DIR, and apytest_sessionfinishhook inconftest.pyreads them all on the xdist master after workers finish. Baseline violations setsession.exitstatusto fail the run.Net: ~3-5 minutes expected in CI (down from ~25-40). One incidental observation for @strawgate — on my local run the
TypeErrorscount dropped from the previous baseline of 388 to 0. The JSON round-trip should produce identical input regardless of YAML loader, so it's worth a second look to confirm nothing silently changed.