feat(server): Add Rate Limiting Middleware #6

@polaz

Description

Summary

Add configurable rate limiting middleware to protect the MCP server and GitLab API from excessive requests.

Motivation

Rate limiting is essential for:

  • Protecting GitLab API quota from aggressive AI agents
  • Preventing accidental infinite loops in agent workflows
  • Managing multi-user deployments (OAuth mode)
  • Ensuring fair resource allocation across sessions
  • Backing off gracefully when approaching GitLab rate limits

Design Considerations

Scope

Rate limiting should apply at multiple levels:

| Level | Scope | Purpose |
| --- | --- | --- |
| Per-session | Individual MCP session | Prevent single client abuse |
| Global | Entire server | Protect GitLab API quota |
| Per-endpoint | Specific tools | Protect expensive operations |

Configuration

Add to config.ts:
```typescript
// Rate limiting configuration
export const RATE_LIMIT_ENABLED = process.env.RATE_LIMIT_ENABLED !== "false";
export const RATE_LIMIT_WINDOW_MS = parseInt(process.env.RATE_LIMIT_WINDOW_MS ?? "60000", 10); // 1 minute
export const RATE_LIMIT_MAX_REQUESTS = parseInt(process.env.RATE_LIMIT_MAX_REQUESTS ?? "60", 10);
export const RATE_LIMIT_GLOBAL_MAX = parseInt(process.env.RATE_LIMIT_GLOBAL_MAX ?? "300", 10);

// Per-tool rate limits (optional, for expensive operations)
export const RATE_LIMIT_TOOL_OVERRIDES = parseToolRateLimits(process.env.RATE_LIMIT_TOOL_OVERRIDES);
```
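One caveat with the `parseInt` calls above: a non-numeric value such as `RATE_LIMIT_MAX_REQUESTS=abc` yields `NaN`, and comparisons against `NaN` are always false, which could silently disable a limit. A minimal sketch of a guard, assuming a hypothetical `envInt` helper that is not part of the issue:

```typescript
// Hypothetical helper (not in the original proposal): parse a positive
// integer from an environment value, falling back when it is missing,
// non-numeric, zero, or negative.
function envInt(value: string | undefined, fallback: number): number {
  const parsed = Number.parseInt(value ?? "", 10);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}

// Possible usage in config.ts:
//   export const RATE_LIMIT_WINDOW_MS = envInt(process.env.RATE_LIMIT_WINDOW_MS, 60000);
//   export const RATE_LIMIT_MAX_REQUESTS = envInt(process.env.RATE_LIMIT_MAX_REQUESTS, 60);
```

Whether an invalid value should fall back to the default or fail startup is a design choice; failing fast may be preferable in production deployments.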

Environment Variables

| Variable | Default | Description |
| --- | --- | --- |
| RATE_LIMIT_ENABLED | true | Enable/disable rate limiting |
| RATE_LIMIT_WINDOW_MS | 60000 | Time window in milliseconds |
| RATE_LIMIT_MAX_REQUESTS | 60 | Max requests per session per window |
| RATE_LIMIT_GLOBAL_MAX | 300 | Max total requests per window |
| RATE_LIMIT_TOOL_OVERRIDES | - | JSON: {"push_files": 10, "create_merge_request": 20} |
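The issue references `parseToolRateLimits` without defining it. A minimal sketch of what it might look like, assuming the variable holds a flat JSON object of tool name to limit, and that malformed input should be ignored rather than crash the server (both are assumptions):

```typescript
// Sketch of the parseToolRateLimits helper named in config.ts.
// Assumption: invalid JSON and non-positive or non-integer limits are dropped.
function parseToolRateLimits(raw: string | undefined): Record<string, number> {
  if (!raw) return {};
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return {}; // Malformed JSON: fall back to no overrides.
  }
  // Only a plain object is accepted; arrays and primitives are rejected.
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return {};
  }
  const overrides: Record<string, number> = {};
  for (const [tool, limit] of Object.entries(parsed as Record<string, unknown>)) {
    if (typeof limit === "number" && Number.isInteger(limit) && limit > 0) {
      overrides[tool] = limit;
    }
  }
  return overrides;
}
```

Logging a warning when entries are dropped would make misconfiguration easier to spot.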

Implementation

Middleware Structure

```
src/middleware/
├── index.ts # Re-exports
├── oauth-auth.ts # Existing OAuth middleware
└── rate-limiter.ts # NEW: Rate limiting middleware
```

Rate Limiter Interface

```typescript
interface RateLimiterConfig {
  windowMs: number;
  maxRequests: number;
  globalMax: number;
  toolOverrides?: Record<string, number>;
}

interface RateLimitInfo {
  remaining: number;
  reset: Date;
  total: number;
}

// Declaration only: `declare` lets the method signatures stand without bodies.
declare class RateLimiter {
  constructor(config: RateLimiterConfig);

  // Check if request is allowed
  isAllowed(sessionId: string, toolName?: string): boolean;

  // Record a request
  recordRequest(sessionId: string, toolName?: string): void;

  // Get current rate limit info
  getInfo(sessionId: string): RateLimitInfo;

  // Cleanup expired entries
  cleanup(): void;
}
```
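One way the interface above could be backed by in-memory storage is a sliding-window log that keeps request timestamps per key. The sketch below is illustrative only: the class name `InMemoryRateLimiter`, the injectable `now` clock, and the choice to count per-tool overrides per session are assumptions, not decisions made in this issue.

```typescript
interface RateLimiterConfig {
  windowMs: number;
  maxRequests: number;
  globalMax: number;
  toolOverrides?: Record<string, number>;
}

interface RateLimitInfo {
  remaining: number;
  reset: Date;
  total: number;
}

class InMemoryRateLimiter {
  private readonly config: RateLimiterConfig;
  private readonly now: () => number;
  private readonly sessionHits = new Map<string, number[]>();
  private readonly toolHits = new Map<string, number[]>();
  private globalHits: number[] = [];

  // `now` is injectable so tests can control time (an assumption of this sketch).
  constructor(config: RateLimiterConfig, now: () => number = () => Date.now()) {
    this.config = config;
    this.now = now;
  }

  isAllowed(sessionId: string, toolName?: string): boolean {
    if (this.live(this.globalHits).length >= this.config.globalMax) return false;
    if (this.live(this.sessionHits.get(sessionId)).length >= this.config.maxRequests) return false;
    const toolLimit = toolName === undefined ? undefined : this.config.toolOverrides?.[toolName];
    if (toolLimit !== undefined) {
      const hits = this.live(this.toolHits.get(`${sessionId}:${toolName}`));
      if (hits.length >= toolLimit) return false;
    }
    return true;
  }

  recordRequest(sessionId: string, toolName?: string): void {
    const t = this.now();
    this.globalHits = [...this.live(this.globalHits), t];
    this.sessionHits.set(sessionId, [...this.live(this.sessionHits.get(sessionId)), t]);
    // Tool hits are only tracked for tools that have an override configured.
    if (toolName !== undefined && this.config.toolOverrides?.[toolName] !== undefined) {
      const key = `${sessionId}:${toolName}`;
      this.toolHits.set(key, [...this.live(this.toolHits.get(key)), t]);
    }
  }

  getInfo(sessionId: string): RateLimitInfo {
    const hits = this.live(this.sessionHits.get(sessionId));
    // The window frees a slot when the oldest live request ages out.
    const oldest = hits[0] ?? this.now();
    return {
      remaining: Math.max(0, this.config.maxRequests - hits.length),
      reset: new Date(oldest + this.config.windowMs),
      total: this.config.maxRequests,
    };
  }

  cleanup(): void {
    this.globalHits = this.live(this.globalHits);
    for (const map of [this.sessionHits, this.toolHits]) {
      for (const [key, hits] of map) {
        const kept = this.live(hits);
        if (kept.length === 0) map.delete(key);
        else map.set(key, kept);
      }
    }
  }

  // Timestamps that are still inside the sliding window.
  private live(hits: number[] = []): number[] {
    const cutoff = this.now() - this.config.windowMs;
    return hits.filter((t) => t > cutoff);
  }
}
```

A timestamp log is simple and gives exact sliding-window behavior, at the cost of memory proportional to the number of requests per window. A counter-based fixed window would use less memory but allows bursts at window boundaries.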

Integration Points

  1. HTTP Transport (SSE/StreamableHTTP)
    • Apply to /messages and /mcp endpoints
    • Return HTTP 429 with Retry-After header
  2. Tool Execution (handlers.ts)
    • Check rate limit before executing tool
    • Return MCP error response if exceeded
  3. GitLab API Client (http-client.ts)
    • Monitor GitLab rate limit headers
    • Automatically back off when approaching limits
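For the third integration point, the back-off decision could be derived from the response headers GitLab documents for rate-limited instances (`RateLimit-Limit`, `RateLimit-Remaining`, `RateLimit-Reset`, `Retry-After`). The sketch below is a rough illustration: the function names and the 10% threshold are assumptions, and header availability depends on the GitLab instance's configuration.

```typescript
// Read a numeric header, returning undefined when absent or non-numeric.
function readNumber(headers: Headers, name: string): number | undefined {
  const raw = headers.get(name);
  if (raw === null || raw.trim() === "") return undefined;
  const value = Number(raw);
  return Number.isFinite(value) ? value : undefined;
}

// Hypothetical helper: how long to pause before the next GitLab call.
// `threshold` (fraction of quota left that triggers a pause) is an assumed default.
function gitlabBackoffMs(
  headers: Headers,
  nowMs: number = Date.now(),
  threshold = 0.1,
): number {
  // A numeric Retry-After (seconds) takes priority; HTTP-date values are ignored here.
  const retryAfter = readNumber(headers, "Retry-After");
  if (retryAfter !== undefined) return Math.max(0, retryAfter * 1000);

  const limit = readNumber(headers, "RateLimit-Limit");
  const remaining = readNumber(headers, "RateLimit-Remaining");
  const resetSec = readNumber(headers, "RateLimit-Reset"); // Unix time in seconds
  if (limit === undefined || remaining === undefined || resetSec === undefined || limit <= 0) {
    return 0; // No usable rate limit information: do not delay.
  }

  // Enough quota left: no pause. Otherwise wait until the counter resets.
  if (remaining / limit > threshold) return 0;
  return Math.max(0, resetSec * 1000 - nowMs);
}
```

The HTTP client could call this after each response and delay the next request by the returned number of milliseconds.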

Response Headers

For HTTP transports, include rate limit headers:
```
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 45
X-RateLimit-Reset: 1640000000
```
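These headers could be derived directly from `RateLimitInfo`. A small sketch, assuming `X-RateLimit-Reset` carries a Unix timestamp in seconds (consistent with the sample value above, but not stated explicitly in this issue) and using a hypothetical `rateLimitHeaders` helper:

```typescript
interface RateLimitInfo {
  remaining: number;
  reset: Date;
  total: number;
}

// Hypothetical helper: map limiter state to the response headers shown above.
function rateLimitHeaders(info: RateLimitInfo): Record<string, string> {
  return {
    "X-RateLimit-Limit": String(info.total),
    "X-RateLimit-Remaining": String(Math.max(0, info.remaining)),
    // Assumption: reset is expressed as Unix time in whole seconds.
    "X-RateLimit-Reset": String(Math.ceil(info.reset.getTime() / 1000)),
  };
}
```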

Error Response

When the rate limit is exceeded, respond with:
```json
{
  "error": "Rate limit exceeded",
  "retryAfter": 30,
  "limit": 60,
  "remaining": 0,
  "resetAt": "2024-01-01T00:01:00Z"
}
```
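The body above could be assembled from `RateLimitInfo` as in the sketch below. The helper name `rateLimitErrorBody` is hypothetical, and treating `retryAfter` as whole seconds until `resetAt` is an assumption inferred from the sample values.

```typescript
interface RateLimitInfo {
  remaining: number;
  reset: Date;
  total: number;
}

// Hypothetical helper: build the error payload for a rejected request.
function rateLimitErrorBody(info: RateLimitInfo, nowMs: number = Date.now()) {
  return {
    error: "Rate limit exceeded",
    // Seconds until the window resets, rounded up and never negative.
    retryAfter: Math.max(0, Math.ceil((info.reset.getTime() - nowMs) / 1000)),
    limit: info.total,
    remaining: Math.max(0, info.remaining),
    // ISO 8601 without milliseconds, matching the sample response.
    resetAt: info.reset.toISOString().replace(/\.\d{3}Z$/, "Z"),
  };
}
```

For HTTP transports, the same `retryAfter` value could be reused for the `Retry-After` header on the 429 response so that the header and body stay consistent.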

Implementation Tasks

  • Create src/middleware/rate-limiter.ts
  • Implement in-memory rate limit storage with cleanup
  • Add configuration to config.ts
  • Integrate with HTTP transports in server.ts
  • Integrate with tool execution in handlers.ts
  • Add GitLab rate limit header monitoring to http-client.ts
  • Add unit tests for rate limiter logic
  • Add integration tests for HTTP 429 responses
  • Update documentation

Advanced Features (Future)

  • Redis-backed rate limiting for distributed deployments
  • Per-user rate limits in OAuth mode
  • Adaptive rate limiting based on GitLab API response times
  • Rate limit dashboard/metrics endpoint

Testing Requirements

  • Unit tests for rate limit calculations
  • Test window sliding behavior
  • Test per-tool overrides
  • Test global vs per-session limits
  • Integration test for HTTP 429 response
  • Test cleanup of expired entries

Acceptance Criteria

  • RATE_LIMIT_ENABLED=false disables all rate limiting
  • Per-session limits work correctly
  • Global limits work correctly
  • Per-tool overrides apply correctly
  • HTTP 429 returned with proper headers
  • MCP error response for tool execution
  • Memory cleanup prevents leaks
  • Clear error messages for agents

Metadata

Labels

enhancement: New feature, new MCP tool, new capability
