
Animated banner #3231

Merged
jlowin merged 1 commit into main from animated-banners on Feb 19, 2026
Conversation

@jlowin jlowin commented Feb 19, 2026

No description provided.

@marvin-context-protocol Bot added the documentation label ("Updates to docs, examples, or guides. Primary change is documentation-related.") on Feb 19, 2026
@marvin-context-protocol
Contributor

Test Failure Analysis

Summary: A pre-existing flaky test, unrelated to this docs-only PR, failed on Python 3.10/ubuntu-latest due to non-deterministic timing and fragile protocol-overhead assumptions in the rate limiting integration test.

Root Cause: test_rate_limiting_blocks_rapid_requests (tests/server/middleware/test_rate_limiting.py:307) has two sources of flakiness:

  1. Token refill timing: The middleware is created with max_requests_per_second=10.0 — that's 1 new token every 100ms. On a slower system, elapsed time between requests can cause token refill mid-test, allowing more requests to succeed than expected.

  2. Uncertain overhead request count: The test assumes exactly 3 MCP protocol requests flow through the middleware during Client initialization (e.g. initialize + notifications/initialized + tools/list). If only 2 overhead requests are intercepted (e.g. initialize + tools/list), then with burst_capacity=6, the 4th call_tool becomes the 6th request and consumes the last available token — succeeding instead of failing.

On this particular run, Python 3.10/ubuntu-latest took long enough between requests that the token bucket partially refilled, and/or only 2 overhead requests were counted, leaving the 4th call_tool with a token still available.

Suggested Solution: Two changes to tests/server/middleware/test_rate_limiting.py:

  1. Drastically lower max_requests_per_second (e.g. 0.001) so token refill is negligible regardless of test duration.
  2. Reduce burst_capacity by 1 to add a safety margin against overhead request count variance — or restructure the test to only require it to fail after fewer call_tool attempts so the exact overhead count doesn't matter.
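The margin gained from the first change is easy to quantify. A sketch of the arithmetic, using the 0.001 tokens/second figure suggested above (the 60-second duration is an illustrative upper bound, not a measured value):

```python
# Why a near-zero refill rate removes the timing flakiness:
refill_rate = 0.001      # tokens regenerated per second (suggested value)
test_duration = 60.0     # generous upper bound on wall-clock test time

refilled = refill_rate * test_duration
assert refilled < 1.0    # never enough to hand out one extra request

# Contrast with the original rate, where a single ~100 ms pause
# between requests regenerates a whole token:
assert 10.0 * 0.1 >= 1.0
```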

Example hardened version of the test:

async def test_rate_limiting_blocks_rapid_requests(self, rate_limit_server):
    """Test that rate limiting blocks rapid successive requests."""
    # Use a near-zero refill rate so timing doesn't affect token count
    rate_limit_server.add_middleware(
        RateLimitingMiddleware(max_requests_per_second=0.001, burst_capacity=5)
    )

    async with Client(rate_limit_server) as client:
        # Init overhead consumes 2-3 tokens, so 2-3 call_tools can still
        # succeed; keep calling until the bucket is exhausted (at most 4
        # attempts) so the exact overhead count doesn't matter.
        with pytest.raises(ToolError, match="Rate limit exceeded"):
            for i in range(1, 5):
                await client.call_tool("quick_action", {"message": str(i)})

This is a pre-existing test flakiness issue — the PR's changes (animated banner assets and MDX) have no bearing on the failure.

Detailed Analysis

Failure log excerpt:

FAILED tests/server/middleware/test_rate_limiting.py::TestRateLimitingMiddlewareIntegration::test_rate_limiting_blocks_rapid_requests

    async def test_rate_limiting_blocks_rapid_requests(self, rate_limit_server):
        ...
        rate_limit_server.add_middleware(
            RateLimitingMiddleware(max_requests_per_second=10.0, burst_capacity=6)
        )

        async with Client(rate_limit_server) as client:
            await client.call_tool("quick_action", {"message": "1"})
            await client.call_tool("quick_action", {"message": "2"})
            await client.call_tool("quick_action", {"message": "3"})

>           with pytest.raises(ToolError, match="Rate limit exceeded"):
E           Failed: DID NOT RAISE <class 'fastmcp.exceptions.ToolError'>

tests/server/middleware/test_rate_limiting.py:322: Failed

Token bucket mechanics (src/fastmcp/server/middleware/rate_limiting.py:38-58):

  • consume() refills tokens based on real elapsed time before checking availability
  • With refill_rate=10.0, any ~100ms gap between requests regenerates 1 token
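The mechanics above can be sketched as a minimal token bucket with an injectable clock (illustrative names only, not the actual fastmcp `TokenBucketRateLimiter`; the clock indirection just makes the timing deterministic):

```python
class SketchTokenBucket:
    """Minimal time-based token bucket (illustrative, not fastmcp's class)."""

    def __init__(self, refill_rate: float, capacity: float, clock):
        self.refill_rate = refill_rate  # tokens regenerated per second
        self.capacity = capacity        # burst size; bucket starts full
        self.tokens = float(capacity)
        self.clock = clock              # injectable time source, for testing
        self.last_refill = clock()

    def consume(self) -> bool:
        # Refill from elapsed time *before* checking availability,
        # mirroring the consume() behavior described above.
        now = self.clock()
        self.tokens = min(
            self.capacity,
            self.tokens + (now - self.last_refill) * self.refill_rate,
        )
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# With refill_rate=10.0, a ~100 ms gap between requests regenerates 1 token:
now = [0.0]
bucket = SketchTokenBucket(refill_rate=10.0, capacity=6, clock=lambda: now[0])
assert all(bucket.consume() for _ in range(6))  # burst of 6 succeeds
assert not bucket.consume()                     # 7th immediate request blocked
now[0] += 0.1                                   # simulate a 100 ms pause...
assert bucket.consume()                         # ...and one more slips through
```

This is exactly the failure mode in the flaky run: real wall-clock gaps between requests silently restock the bucket mid-test.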

Request count scenario that causes the failure:

  • burst_capacity=6 → 6 tokens initially
  • 2 overhead requests during init → 4 tokens remain
  • 3 call_tool calls → 1 token remains
  • 4th call_tool → consumes last token, succeeds (test expects it to fail)
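Ignoring refill, the scenario above reduces to simple bookkeeping (a sketch; the overhead counts are the 2-vs-3 figures from the analysis above, and `tokens_left` is an illustrative helper, not part of fastmcp):

```python
def tokens_left(burst_capacity: int, requests_so_far: int) -> int:
    """Tokens remaining with no refill (a bucket never goes below zero)."""
    return max(0, burst_capacity - requests_so_far)

# Expected path: 3 overhead requests + 3 call_tools drain all 6 tokens,
# so the 4th call_tool is rejected.
assert tokens_left(6, 3 + 3) == 0

# Flaky path: only 2 overhead requests leave one token for the 4th
# call_tool, which then succeeds instead of raising ToolError.
assert tokens_left(6, 2 + 3) == 1
```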

Why it passes on other platforms: Python 3.13 runs tests faster (fewer ms between requests = less refill) and/or the overhead request count happened to be 3 there.

Job that failed: Tests: Python 3.10 on ubuntu-latest (ID: 64169092817). Passed on Python 3.13/ubuntu-latest, Python 3.10/windows-latest, and lowest-direct-dependencies.

Related Files
  • tests/server/middleware/test_rate_limiting.py:307-323 — failing test; fragile burst_capacity assumption
  • src/fastmcp/server/middleware/rate_limiting.py:38-58 — TokenBucketRateLimiter.consume() with time-based refill
  • src/fastmcp/server/middleware/rate_limiting.py:152-167 — RateLimitingMiddleware.on_request() — called for every MCP request including protocol overhead

@jlowin jlowin merged commit 35bbf48 into main Feb 19, 2026
9 of 10 checks passed
@jlowin jlowin deleted the animated-banners branch February 19, 2026 16:56