Add real-world schema crash test (232K schemas from APIs.guru) by strawgate · Pull Request #3826 · PrefectHQ/fastmcp

strawgate · 2026-04-10T21:06:03Z

Summary

Adds an integration test that runs json_schema_to_type against every schema definition in the APIs.guru openapi-directory — 232,092 schemas from 4,120 real-world OpenAPI specs.

The test snapshots crash counts as regression baselines so future changes can't silently increase the crash rate. As we fix crash patterns, we ratchet the baselines down.

Schemas tested:  232,092
TypeErrors:      2,342  (e.x. datetime serialization in defaults)
SchemaErrors:      273  (e.x. invalid regexes in specs — not our code)
Timeouts:            0
Other errors:        0
Total crashes:   2,615  (1.13%)

Skipped by default — requires a local clone of the openapi-directory repo. Run with:

git clone --depth 1 https://github.com/APIs-guru/openapi-directory /tmp/openapi-directory
uv run pytest tests/utilities/json_schema_type/test_real_world_schemas.py -m integration -s

Takes ~13 minutes on a MacBook Pro. Each schema gets a 5-second SIGALRM timeout to catch infinite loops.

🤖 Generated with Claude Code

marvin-context-protocol · 2026-04-10T21:08:29Z

Test Failure Analysis

(Updated to reflect latest workflow run — previous analysis re: ruff lint violations is no longer current.)

Summary: The Tests: Python 3.10 on windows-latest job failed due to a test timeout in tests/client/test_streamable_http.py. This is a pre-existing, intermittent flaky failure on Windows that is unrelated to this PR.

Root Cause: A test in tests/client/test_streamable_http.py timed out while blocked in the Windows asyncio I/O selector (select.select()). This is a known class of flakiness on Windows Python 3.10 where asyncio's SelectSelector (used instead of Linux's epoll) can hang when a server-side ASGI handler returns without completing the HTTP response (ERROR: ASGI callable returned without completing response). The test process stalls waiting for data that never arrives, eventually hitting the 5-second pytest-timeout.

This PR is not the cause. The most recent push to main (run 24256002384, April 10) had the identical failure pattern — Tests: Python 3.10 on windows-latest timed out in tests/cli/test_run.py for the same reason. The failing test varies between runs, confirming this is intermittent infrastructure-level flakiness, not a regression from this PR.

Suggested Solution: No action needed on this PR. The failure is pre-existing on main and not introduced by these changes. Re-running the CI job will likely pass (or fail in a different test, consistent with the flaky pattern).

Detailed Analysis

Failed job: Tests: Python 3.10 on windows-latest (job ID 70872661074)

Timeout stack trace from this run:

tests\client\test_streamable_http.py +++++++++++++++++++++++++++++++++++ Timeout +++++++++++++++++++++++++++++++++++
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Captured stdout ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
INFO:     127.0.0.1:51818 - "POST /mcp HTTP/1.1" 200 OK
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Captured stderr ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ERROR:    ASGI callable returned without completing response.
...
  File "C:\hostedtoolcache\windows\Python\3.10.11\x64\lib\selectors.py", line 315, in _select
    r, w, x = select.select(r, w, w, timeout)
+++++++++++++++++++++++++++++++++++ Timeout +++++++++++++++++++++++++++++++++++

Comparison with most recent main branch failure (run 24256002384, April 10):

tests\cli\test_run.py ... Timeout
  File "C:\hostedtoolcache\windows\Python\3.10.11\x64\lib\selectors.py", line 315, in _select
    r, w, x = select.select(r, w, w, timeout)

Same root cause, different test — confirming this is flaky infrastructure, not a code regression.

All other jobs (Ubuntu Python 3.10, Ubuntu Python 3.13, integration tests, MCP conformance tests, lowest-direct dependencies) passed cleanly on this PR run.

Related Files

tests/client/test_streamable_http.py — the test that timed out; not modified by this PR
tests/utilities/json_schema_type/test_real_world_schemas.py — the new file added by this PR (skipped by default, requires local openapi-directory clone)

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 49a94be1b9

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-10T21:09:13Z

+    pytest.mark.skipif(
+        not CLONE_DIR.exists(),


Remove skip gate that blocks real-world crash test execution

The module-level skipif(not CLONE_DIR.exists()) causes this test to skip on a fresh checkout, so _ensure_repo() never runs and the repo is never cloned despite the test/docs saying it will. In practice this means CI and most local runs silently skip the regression baseline, so crash-rate regressions are not detected unless the directory is pre-created manually.

Useful? React with 👍 / 👎.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fce60b179a

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-10T21:18:15Z

+    paths:
+      - "src/fastmcp/utilities/json_schema_type.py"
+      - "src/fastmcp/utilities/json_schema.py"
+      - "src/fastmcp/utilities/openapi/**"
+      - "src/fastmcp/server/providers/openapi/**"


Add client mixin path to PR workflow trigger

The new workflow only runs on pull_request when files under the listed paths change, but src/fastmcp/client/mixins/tools.py is included for push and omitted for pull_request. That means a PR that changes this OpenAPI-related client path can merge without running the crash baseline, and regressions are only caught after merge when push runs on main. Include the same path in the PR filter so this gate works pre-merge.

Useful? React with 👍 / 👎.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ff9668fbe1

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-10T21:27:56Z

+    if CLONE_DIR.exists():
+        import shutil
+
+        shutil.rmtree(CLONE_DIR)


Guard removal of arbitrary existing clone path

_ensure_repo() recursively deletes CLONE_DIR whenever it exists but is not already at the pinned commit, without validating that the path is actually an openapi-directory checkout. Since CLONE_DIR comes from OPENAPI_DIRECTORY_PATH, a misconfigured value (for example, a parent directory like /tmp) would cause unrelated files to be deleted during a test run. Please fail safely (or require explicit opt-in) instead of unconditional rmtree on any existing path.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-04-10T21:27:56Z

+                signal.alarm(0)
+                signal.signal(signal.SIGALRM, old_handler)


Preserve pytest timeout alarm when resetting per-schema alarm

The per-schema cleanup calls signal.alarm(0), which cancels any existing process alarm, including the pytest.mark.timeout(1200) guard on POSIX runs that use SIGALRM. Because the code only restores the handler (not the prior timer), the global test timeout can be silently disabled after the first schema, so a later non-interruptible hang may run until the workflow/job timeout instead of failing at 1200s.

Useful? React with 👍 / 👎.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4567d6998a

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-11T00:16:32Z

+
+@pytest.mark.integration
+@pytest.mark.timeout(300, method="thread")
+@pytest.mark.parametrize("provider", _collect_providers())


Defer provider parametrization until repo is ensured

The provider list is computed at collection time via @pytest.mark.parametrize("provider", _collect_providers()), but _ensure_repo() (which clones/checks out the pinned commit) only runs inside the test body. This means the parametrization can be built from the wrong checkout (or from no checkout at all), so in RUN_REAL_WORLD_SCHEMA_TEST=1 fresh-checkout runs pytest collects an empty provider set ([NOTSET]), and with a non-pinned local clone it can miss providers that exist in the pinned dataset. As a result, the per-provider timeout guard is silently skipped or incomplete even though the test is intended to give per-provider coverage/progress.

Useful? React with 👍 / 👎.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fc6bf2816d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-11T00:54:15Z

+# In CI (RUN_REAL_WORLD_SCHEMA_TEST=1), _ensure_repo clones automatically.
+# Locally, skip unless the repo is already cloned to avoid a surprise 200MB download.
+_run_in_ci = os.environ.get("RUN_REAL_WORLD_SCHEMA_TEST") == "1"
+_skip_locally = not _run_in_ci and not CLONE_DIR.exists()


Skip local run when clone path exists but is not repo

_skip_locally only checks CLONE_DIR.exists(), so any existing path at /tmp/openapi-directory (including an unrelated directory or file) disables the skip and runs the test, after which _ensure_repo() raises RuntimeError because the path is not an openapi-directory clone. This turns a test documented as “skipped by default unless cloned” into a hard failure in normal local pytest runs when that path happens to exist, which is a common /tmp collision case.

Useful? React with 👍 / 👎.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 31e99dd3de

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-11T01:59:23Z

+
+
+# Accumulator: per-provider tests store results here, final test reads them.
+_results: dict[str, ProviderResult] = {}


Make aggregate crash baseline xdist-safe

This module aggregates provider results in the process-local _results dict, so when the test is run with pytest -n auto (xdist), each worker builds a separate partial view and test_z_aggregate_crash_rate only sees whatever ran on that single worker. In practice this can either skip the aggregate check (_results empty on that worker) or fail the >200_000 assertion with partial totals, so the regression baseline is not reliable under parallel execution.

Useful? React with 👍 / 👎.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b56818dece

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-11T02:29:38Z

+            str(k) if not isinstance(k, str) else k: _normalize_yaml_types(v)
+            for k, v in obj.items()


Normalize required entries when coercing YAML keys

Converting every dict key to str here changes property names (e.g. YAML 200: becomes '200') but list scalars are left as-is, so required: [200] remains an int. In _create_dataclass, required detection uses prop_name in required, so '200' no longer matches 200 and a required field is silently treated as optional. This weakens validation for YAML-loaded schemas with unquoted numeric keys instead of preserving their JSON semantics.

Useful? React with 👍 / 👎.

Integration test that runs json_schema_to_type against 232K schemas from 4,120 real-world OpenAPI specs (APIs.guru openapi-directory). Snapshots crash counts as regression baselines so future changes can't silently increase the crash rate. Current baseline (openapi-directory@f7207cf0): TypeErrors: 2,342 (datetime serialization) SchemaErrors: 273 (invalid regexes in specs) Timeouts: 0 Other: 0 Skipped unless openapi-directory is cloned locally. Run with: pytest -m integration tests/.../test_real_world_schemas.py 🤖 Generated with Claude Code Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

jlowin

Well-structured crash regression suite — good isolation from the main test run, and the YAML type normalization fix is a nice bonus.

marvin-context-protocol Bot added enhancement Improvement to existing functionality. For issues and smaller PR improvements. tests openapi Related to OpenAPI integration, parsing, or code generation features. labels Apr 10, 2026

chatgpt-codex-connector Bot reviewed Apr 10, 2026

View reviewed changes

strawgate force-pushed the test/real-world-schema-crash-test branch from 49a94be to fce60b1 Compare April 10, 2026 21:16

chatgpt-codex-connector Bot reviewed Apr 10, 2026

View reviewed changes

strawgate force-pushed the test/real-world-schema-crash-test branch from fce60b1 to ff9668f Compare April 10, 2026 21:24

chatgpt-codex-connector Bot reviewed Apr 10, 2026

View reviewed changes

strawgate force-pushed the test/real-world-schema-crash-test branch 5 times, most recently from b08f26c to 4567d69 Compare April 11, 2026 00:12

chatgpt-codex-connector Bot reviewed Apr 11, 2026

View reviewed changes

strawgate force-pushed the test/real-world-schema-crash-test branch from 4567d69 to fc6bf28 Compare April 11, 2026 00:49

chatgpt-codex-connector Bot reviewed Apr 11, 2026

View reviewed changes

strawgate force-pushed the test/real-world-schema-crash-test branch from fc6bf28 to 31e99dd Compare April 11, 2026 01:54

chatgpt-codex-connector Bot reviewed Apr 11, 2026

View reviewed changes

strawgate force-pushed the test/real-world-schema-crash-test branch from 31e99dd to b56818d Compare April 11, 2026 02:24

chatgpt-codex-connector Bot reviewed Apr 11, 2026

View reviewed changes

strawgate force-pushed the test/real-world-schema-crash-test branch from b56818d to 2915cba Compare April 11, 2026 02:53

jlowin approved these changes Apr 11, 2026

View reviewed changes

jlowin merged commit 4685599 into main Apr 11, 2026
10 checks passed

jlowin deleted the test/real-world-schema-crash-test branch April 11, 2026 15:23

jlowin mentioned this pull request Apr 12, 2026

ci: speed up schema crash test (CSafeLoader + xdist-safe aggregation) #3873

Merged



		# Accumulator: per-provider tests store results here, final test reads them.
		_results: dict[str, ProviderResult] = {}

		str(k) if not isinstance(k, str) else k: _normalize_yaml_types(v)
		for k, v in obj.items()

Conversation

strawgate commented Apr 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Uh oh!

marvin-context-protocol Bot commented Apr 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Test Failure Analysis

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot Apr 10, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Apr 11, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Apr 11, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Apr 11, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Apr 11, 2026

Choose a reason for hiding this comment

Uh oh!

jlowin left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

strawgate commented Apr 10, 2026 •

edited

Loading

marvin-context-protocol Bot commented Apr 10, 2026 •

edited

Loading