fix: strip titles from bare-metadata nodes (Gemini 2.5 Flash) by jlowin · Pull Request #3881 · PrefectHQ/fastmcp

jlowin · 2026-04-12T21:11:41Z

Follow-up to #3861.

Two edge cases Codex flagged on the original Gemini title-stripping PR that were left open:

Bare-metadata nodes leak through. Pydantic emits {"title": "X"} for Any-typed fields — no type, no properties, nothing to make the heuristic fire. Gemini 2.5 Flash rejects those with MALFORMED_FUNCTION_CALL, meaning the original fix didn't actually resolve the bug for Pydantic schemas that include any Any-typed field. The heuristic now strips title when every remaining key is a pure JSON Schema metadata keyword (title, description, deprecated, readOnly, writeOnly), which covers the Pydantic-Any case without touching user property dicts.

Literal values get corrupted. The traversal recursed into default, const, examples, and enum unconditionally. A user default shaped like {"title": "My Dashboard", "type": "vis"} could silently have its "title" field interpreted as schema metadata and stripped. These keywords carry literal user data, not sub-schemas, so the traversal now skips them.

# Before (Pydantic Any field): Gemini still crashes
{"properties": {"anyfield": {"title": "Anyfield"}}}  # title survived
# After: title stripped, Gemini accepts

# Before (user default with metadata-shaped dict):
{"default": {"title": "X", "type": "Y"}}  # title silently removed
# After: default preserved as-is

The "property literally named title" case Codex raised is already handled correctly by the existing isinstance(title, str) guard — when properties["title"] is a dict (a sub-schema), it isn't stripped. Covered by an existing regression test (test_title_pruning_preserves_parameter_named_title).

Two follow-ups to #3861's Gemini 2.5 Flash title-stripping fix: 1. Pydantic emits bare {"title": "X"} schemas for Any-typed fields with no sibling type/properties. The original heuristic required a schema keyword before stripping, so those bare nodes leaked through and Gemini still hit MALFORMED_FUNCTION_CALL. Now we also strip when every remaining key is a pure metadata key (title/description/deprecated/readOnly/writeOnly). 2. The traversal no longer recurses into default, const, examples, or enum. Those hold literal user values — a user default of {"title": "X", "type": "Y"} was at risk of having its keys interpreted as schema metadata.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8b2a8db49e

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-12T21:14:44Z

+# Keywords whose values are literal user data, not sub-schemas. Skipping
+# recursion here prevents `default: {"title": "X"}` from losing the "title"
+# data value because it happens to look metadata-shaped.
+_LITERAL_KEYWORDS = frozenset({"default", "const", "examples", "enum"})


Include example in literal-keyword skip list

The new traversal guard skips literal payload keywords via _LITERAL_KEYWORDS, but omitting example means prune_titles=True still descends into example values and strips user data fields named title. In practice, schemas produced with Pydantic json_schema_extra={"example": {"title": "Dashboard", ...}} now have that example mutated during compression, which is a regression from the previous behavior and contradicts the goal of preserving literal values.

Useful? React with 👍 / 👎.

Two follow-ups after the first round: 1. Skipping recursion on literal-keyword keys broke schemas where a user property is literally named 'enum' or 'default' — the traversal stopped at the property name and never reached the sub-schema's $ref, so referenced $defs got pruned as unused (AssertionError in test_all_primitive_field_types). Split arbitrary-key dicts (properties, patternProperties, $defs, definitions, dependentSchemas) into their own branch so their values are always traversed as sub-schemas regardless of key name. 2. Added 'example' (OpenAPI/Swagger 2.0 singular) alongside 'examples' (JSON Schema draft 7+) to the literal-keyword list. Codex flagged that Pydantic's json_schema_extra={'example': {'title': '...'}} was still getting its literal user data stripped. 3. Updated test_lambda snapshot — the whole point of this PR is that bare {'title': 'X'} nodes (Pydantic Any fields) now get pruned, so the snapshot's {'x': {'title': 'X'}} is now {'x': {}}.

marvin-context-protocol · 2026-04-12T21:23:30Z

Test Failure Analysis

Summary: The test test_all_primitive_field_types in tests/client/test_elicitation.py fails across all Python versions because the new _LITERAL_KEYWORDS skip in traverse_and_clean accidentally treats a dataclass property named enum as the JSON Schema enum keyword, preventing its $ref from being tracked — so $defs/DataEnum gets pruned as "unused" while the reference remains.

Root Cause: In src/fastmcp/utilities/json_schema.py, the PR adds:

_LITERAL_KEYWORDS = frozenset({"default", "const", "examples", "enum"})

and then skips recursion into any dict key matching that set:

if key in _LITERAL_KEYWORDS:
    continue

The problem: this check runs while iterating over every dict — including a schema's properties sub-dict where keys are user-defined property names, not JSON Schema keywords. The failing test has a dataclass field enum: DataEnum, so the properties dict contains:

"enum": {"$ref": "#/$defs/DataEnum"}

When traverse_and_clean iterates over that properties dict, it sees the key "enum", hits _LITERAL_KEYWORDS, and skips recursing into {"$ref": "#/$defs/DataEnum"}. The $ref collector never sees that node, "DataEnum" is never added to root_refs, is_def_used("DataEnum") returns False, and the definition is pruned. The surviving $ref then points to nothing, causing the PointerToNowhere error at runtime.

Suggested Solution: In src/fastmcp/utilities/json_schema.py, handle the properties key specially before the _LITERAL_KEYWORDS check so that its values (property sub-schemas) are always recursed into:

for key, value in node.items():
    if skip_defs_section and key == "$defs":
        continue

    # Properties dict keys are user-defined property names, not schema keywords.
    # Recurse into each property sub-schema directly, bypassing keyword checks.
    if key == "properties" and isinstance(value, dict):
        for prop_schema in value.values():
            traverse_and_clean(prop_schema, current_def_name, depth=depth + 1)
        continue

    # Don't descend into keywords that carry literal data, not sub-schemas
    if key in _LITERAL_KEYWORDS:
        continue

    if key in ["allOf", "oneOf", "anyOf"] and isinstance(value, list):
        ...

The same bug would affect properties named "const", "default", or "examples" (any name that collides with _LITERAL_KEYWORDS), so this is the correct structural fix rather than just removing "enum" from the set.

Detailed Analysis

Failing test: tests/client/test_elicitation.py:486 — test_all_primitive_field_types

Error (from job logs):

fastmcp.exceptions.ToolError: PointerToNowhere: '/$defs/DataEnum' does not exist within
{'properties': {'integer': {'type': 'integer'}, ..., 'enum': {'$ref': '#/$defs/DataEnum'}},
 'required': [...], 'type': 'object'}

The schema reaching the validator has a $ref pointing to #/$defs/DataEnum but no $defs key at all — DataEnum was removed by the unused-definition pruner in compress_schema.

Trace through the new code (_single_pass_optimize in json_schema.py, Phase 2):

Top-level schema: {"type": "object", "properties": {..., "enum": {"$ref": "#/$defs/DataEnum"}}, "$defs": {"DataEnum": {...}}}
Traversal visits properties value (the property map).
Inside the property map, key = "enum" (property name), value = {"$ref": "#/$defs/DataEnum"}.
New check: "enum" in _LITERAL_KEYWORDS → True → continue — the sub-schema is never visited.
root_refs never receives "DataEnum".
Phase 4: is_def_used("DataEnum") → False → DataEnum deleted from $defs.
Empty $defs removed from schema.
Schema now has a dangling $ref → PointerToNowhere at validation time.

Jobs that failed:

Tests with lowest-direct dependencies (job 70995129518) — Python 3.10
Tests: Python 3.10 on windows-latest (job 70995129524)
Tests: Python 3.13 on ubuntu-latest (job 70995129530)
Tests: Python 3.10 on ubuntu-latest (job 70995129536)

Related Files

src/fastmcp/utilities/json_schema.py — contains _single_pass_optimize / traverse_and_clean; this is where the fix should be applied (the for key, value in node.items() loop, roughly line 420–465 in the PR branch)
src/fastmcp/server/elicitation.py:324 — get_elicitation_schema calls compress_schema(schema) (without prune_titles, so the title changes in this PR don't directly affect the elicitation path — only the prune_defs logic is in play here)
tests/client/test_elicitation.py:486 — failing test test_all_primitive_field_types

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 055f50ea45

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-12T21:26:06Z

+                and isinstance(node["title"], str)  # type: ignore
+                and (
+                    any(k in node for k in _SCHEMA_KEYWORDS)
+                    or all(k in _METADATA_KEYS for k in node)


Limit metadata-only title stripping to schema contexts

The new all(k in _METADATA_KEYS for k in node) branch removes title from any dict that happens to have metadata-like keys, even when that dict is user extension payload rather than a schema node. For example, with json_schema_extra={"x-ui": {"title": "Dashboard", "description": "..."}}, compress_schema(..., prune_titles=True) now mutates x-ui.title to nothing, which is a regression introduced by this commit (previous logic required structural schema keywords before pruning).

Useful? React with 👍 / 👎.

Codex flagged that the bare-metadata title-strip heuristic fires on any dict whose keys are all metadata — including user extension payloads like json_schema_extra={'x-ui': {'title': 'Dashboard', 'description': '...'}}, where the payload is opaque to us but happens to look metadata-shaped. Fix: thread an in_schema flag through traverse_and_clean. We only apply title/additionalProperties cleanups when we arrived via a known schema position (root, properties/$defs/patternProperties values, items, allOf/anyOf/oneOf/prefixItems elements, or single-schema keywords like if/then/else/contains/propertyNames). Unknown keys (vendor extensions, arbitrary user dicts) still get traversed for $ref collection, but in_schema=False so no cleanups run inside them. Also classified the schema-valued keywords explicitly (SUBSCHEMA_VALUE_KEYS and SUBSCHEMA_LIST_KEYS) to route traversal correctly.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c1d2a27594

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-04-12T21:34:50Z

+                # Unknown keys (user extensions like `x-ui`, vendor
+                # metadata, etc.) — descend for $ref collection but mark
+                # in_schema=False so cleanups don't touch user payloads.
+                traverse_and_clean(
+                    value, current_def_name, depth=depth + 1, in_schema=False


Traverse unlisted schema keywords in schema mode

The fallback path now recurses unknown keys with in_schema=False, which means any valid JSON Schema keyword missing from the hardcoded sets (for example draft-07 dependencies and additionalItems) is treated as opaque payload instead of a subschema. In those branches, compress_schema(..., prune_titles=True) and prune_additional_properties=True stop applying cleanups, whereas the parent implementation still cleaned them, so this introduces a regression for schemas that rely on those standard keywords.

Useful? React with 👍 / 👎.

Codex flagged that unlisted sub-schema keywords now fall into the in_schema=False branch and skip cleanups entirely. Add the missing standard keywords: - dependencies (draft-07): values may be sub-schemas (list values short-circuit in the list branch, so it's safe in _SUBSCHEMA_MAP_KEYS) - additionalItems (draft-07): single sub-schema, precursor to unevaluatedItems - contentSchema (2019-09+): single sub-schema for typed string payloads

marvin-context-protocol Bot added the bug Something isn't working. Reports of errors, unexpected behavior, or broken functionality. label Apr 12, 2026

chatgpt-codex-connector Bot reviewed Apr 12, 2026

View reviewed changes

jlowin merged commit fa9ad6c into main Apr 13, 2026
9 checks passed

jlowin deleted the fix/gemini-title-stripping branch April 13, 2026 01:06

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: strip titles from bare-metadata nodes (Gemini 2.5 Flash)#3881

fix: strip titles from bare-metadata nodes (Gemini 2.5 Flash)#3881
jlowin merged 4 commits intomainfrom
fix/gemini-title-stripping

jlowin commented Apr 12, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot Apr 12, 2026

Uh oh!

marvin-context-protocol Bot commented Apr 12, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot Apr 12, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot Apr 12, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

jlowin commented Apr 12, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Apr 12, 2026

Choose a reason for hiding this comment

Uh oh!

marvin-context-protocol Bot commented Apr 12, 2026

Test Failure Analysis

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Apr 12, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot Apr 12, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant