
Animated banner #3231

Merged
jlowin merged 1 commit into main from animated-banners on Feb 19, 2026
Conversation

@jlowin jlowin commented Feb 19, 2026

No description provided.

@marvin-context-protocol Bot added the documentation label ("Updates to docs, examples, or guides. Primary change is documentation-related.") on Feb 19, 2026
@marvin-context-protocol
Contributor

Test Failure Analysis

Summary: A pre-existing flaky test, unrelated to this docs-only PR, failed on Python 3.10/ubuntu-latest due to non-deterministic timing and fragile protocol-overhead assumptions in the rate limiting integration test.

Root Cause: test_rate_limiting_blocks_rapid_requests (tests/server/middleware/test_rate_limiting.py:307) has two sources of flakiness:

  1. Token refill timing: The middleware is created with max_requests_per_second=10.0 — that's 1 new token every 100ms. On a slower system, elapsed time between requests can cause token refill mid-test, allowing more requests to succeed than expected.

  2. Uncertain overhead request count: The test assumes exactly 3 MCP protocol requests flow through the middleware during Client initialization (e.g. initialize + notifications/initialized + tools/list). If only 2 overhead requests are intercepted (e.g. initialize + tools/list), then with burst_capacity=6, the 4th call_tool becomes the 6th request and consumes the last available token — succeeding instead of failing.

On this particular run, Python 3.10/ubuntu-latest took long enough between requests that the token bucket partially refilled, and/or only 2 overhead requests were counted, leaving the 4th call_tool with a token still available.

Suggested Solution: Two changes to tests/server/middleware/test_rate_limiting.py:

  1. Drastically lower max_requests_per_second (e.g. 0.001) so token refill is negligible regardless of test duration.
  2. Reduce burst_capacity by 1 to add a safety margin against overhead request count variance — or restructure the test to only require it to fail after fewer call_tool attempts so the exact overhead count doesn't matter.
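The margin gained from the first change is easy to quantify. A sketch of the arithmetic, using the 0.001 tokens/second figure suggested above (the 60-second duration is an illustrative upper bound, not a measured value):

```python
# Why a near-zero refill rate removes the timing flakiness:
refill_rate = 0.001      # tokens regenerated per second (suggested value)
test_duration = 60.0     # generous upper bound on wall-clock test time

refilled = refill_rate * test_duration
assert refilled < 1.0    # never enough to hand out one extra request

# Contrast with the original rate, where a single ~100 ms pause
# between requests regenerates a whole token:
assert 10.0 * 0.1 >= 1.0
```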

Example hardened version of the test:

async def test_rate_limiting_blocks_rapid_requests(self, rate_limit_server):
    """Test that rate limiting blocks rapid successive requests."""
    # Use a near-zero refill rate so timing doesn't affect token count
    rate_limit_server.add_middleware(
        RateLimitingMiddleware(max_requests_per_second=0.001, burst_capacity=5)
    )

    async with Client(rate_limit_server) as client:
        # Init overhead consumes 2-3 tokens, so 2-3 call_tools can still
        # succeed; keep calling until the bucket is exhausted (at most 4
        # attempts) so the exact overhead count doesn't matter.
        with pytest.raises(ToolError, match="Rate limit exceeded"):
            for i in range(1, 5):
                await client.call_tool("quick_action", {"message": str(i)})

This is a pre-existing test flakiness issue — the PR's changes (animated banner assets and MDX) have no bearing on the failure.

Detailed Analysis

Failure log excerpt:

FAILED tests/server/middleware/test_rate_limiting.py::TestRateLimitingMiddlewareIntegration::test_rate_limiting_blocks_rapid_requests

    async def test_rate_limiting_blocks_rapid_requests(self, rate_limit_server):
        ...
        rate_limit_server.add_middleware(
            RateLimitingMiddleware(max_requests_per_second=10.0, burst_capacity=6)
        )

        async with Client(rate_limit_server) as client:
            await client.call_tool("quick_action", {"message": "1"})
            await client.call_tool("quick_action", {"message": "2"})
            await client.call_tool("quick_action", {"message": "3"})

>           with pytest.raises(ToolError, match="Rate limit exceeded"):
E           Failed: DID NOT RAISE <class 'fastmcp.exceptions.ToolError'>

tests/server/middleware/test_rate_limiting.py:322: Failed

Token bucket mechanics (src/fastmcp/server/middleware/rate_limiting.py:38-58):

  • consume() refills tokens based on real elapsed time before checking availability
  • With refill_rate=10.0, any ~100ms gap between requests regenerates 1 token
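The mechanics above can be sketched as a minimal token bucket with an injectable clock (illustrative names only, not the actual fastmcp `TokenBucketRateLimiter`; the clock indirection just makes the timing deterministic):

```python
class SketchTokenBucket:
    """Minimal time-based token bucket (illustrative, not fastmcp's class)."""

    def __init__(self, refill_rate: float, capacity: float, clock):
        self.refill_rate = refill_rate  # tokens regenerated per second
        self.capacity = capacity        # burst size; bucket starts full
        self.tokens = float(capacity)
        self.clock = clock              # injectable time source, for testing
        self.last_refill = clock()

    def consume(self) -> bool:
        # Refill from elapsed time *before* checking availability,
        # mirroring the consume() behavior described above.
        now = self.clock()
        self.tokens = min(
            self.capacity,
            self.tokens + (now - self.last_refill) * self.refill_rate,
        )
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# With refill_rate=10.0, a ~100 ms gap between requests regenerates 1 token:
now = [0.0]
bucket = SketchTokenBucket(refill_rate=10.0, capacity=6, clock=lambda: now[0])
assert all(bucket.consume() for _ in range(6))  # burst of 6 succeeds
assert not bucket.consume()                     # 7th immediate request blocked
now[0] += 0.1                                   # simulate a 100 ms pause...
assert bucket.consume()                         # ...and one more slips through
```

This is exactly the failure mode in the flaky run: real wall-clock gaps between requests silently restock the bucket mid-test.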

Request count scenario that causes the failure:

  • burst_capacity=6 → 6 tokens initially
  • 2 overhead requests during init → 4 tokens remain
  • 3 call_tool calls → 1 token remains
  • 4th call_tool → consumes last token, succeeds (test expects it to fail)
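Ignoring refill, the scenario above reduces to simple bookkeeping (a sketch; the overhead counts are the 2-vs-3 figures from the analysis above, and `tokens_left` is an illustrative helper, not part of fastmcp):

```python
def tokens_left(burst_capacity: int, requests_so_far: int) -> int:
    """Tokens remaining with no refill (a bucket never goes below zero)."""
    return max(0, burst_capacity - requests_so_far)

# Expected path: 3 overhead requests + 3 call_tools drain all 6 tokens,
# so the 4th call_tool is rejected.
assert tokens_left(6, 3 + 3) == 0

# Flaky path: only 2 overhead requests leave one token for the 4th
# call_tool, which then succeeds instead of raising ToolError.
assert tokens_left(6, 2 + 3) == 1
```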

Why it passes on other platforms: Python 3.13 runs tests faster (fewer ms between requests = less refill) and/or the overhead request count happened to be 3 there.

Job that failed: Tests: Python 3.10 on ubuntu-latest (ID: 64169092817). Passed on Python 3.13/ubuntu-latest, Python 3.10/windows-latest, and lowest-direct-dependencies.

Related Files
  • tests/server/middleware/test_rate_limiting.py:307-323 — failing test; fragile burst_capacity assumption
  • src/fastmcp/server/middleware/rate_limiting.py:38-58 — TokenBucketRateLimiter.consume() with time-based refill
  • src/fastmcp/server/middleware/rate_limiting.py:152-167 — RateLimitingMiddleware.on_request() — called for every MCP request including protocol overhead

@jlowin jlowin merged commit 35bbf48 into main Feb 19, 2026
9 of 10 checks passed
@jlowin jlowin deleted the animated-banners branch February 19, 2026 16:56