feat(server): Add Rate Limiting Middleware #6
Description
Summary
Add configurable rate limiting middleware to protect the MCP server and GitLab API from excessive requests.
Motivation
Rate limiting is essential for:
- Protecting GitLab API quota from aggressive AI agents
- Preventing accidental infinite loops in agent workflows
- Managing multi-user deployments (OAuth mode)
- Ensuring fair resource allocation across sessions
- Graceful handling when approaching GitLab rate limits
Design Considerations
Scope
Rate limiting should apply at multiple levels:
| Level | Scope | Purpose |
|---|---|---|
| Per-session | Individual MCP session | Prevent single client abuse |
| Global | Entire server | Protect GitLab API quota |
| Per-endpoint | Specific tools | Protect expensive operations |
Configuration
Add to config.ts:
```typescript
// Rate limiting configuration
export const RATE_LIMIT_ENABLED = process.env.RATE_LIMIT_ENABLED !== "false";
export const RATE_LIMIT_WINDOW_MS = parseInt(process.env.RATE_LIMIT_WINDOW_MS ?? "60000", 10); // 1 minute
export const RATE_LIMIT_MAX_REQUESTS = parseInt(process.env.RATE_LIMIT_MAX_REQUESTS ?? "60", 10);
export const RATE_LIMIT_GLOBAL_MAX = parseInt(process.env.RATE_LIMIT_GLOBAL_MAX ?? "300", 10);
// Per-tool rate limits (optional, for expensive operations)
export const RATE_LIMIT_TOOL_OVERRIDES = parseToolRateLimits(process.env.RATE_LIMIT_TOOL_OVERRIDES);
```
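The configuration above calls `parseToolRateLimits`, which the issue does not define. A minimal sketch of what it might look like, assuming that missing or malformed input should simply disable overrides and that only positive integer limits are valid (both are assumptions, not requirements stated in the issue):

```typescript
// Hypothetical helper: parses RATE_LIMIT_TOOL_OVERRIDES into a tool -> limit map.
// Assumption: invalid JSON or invalid entries are ignored rather than failing startup.
function parseToolRateLimits(raw: string | undefined): Record<string, number> {
  if (raw === undefined || raw.trim() === "") return {};

  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return {};
  }

  // Only a plain JSON object is accepted; arrays and primitives are rejected.
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) return {};

  const source = parsed as Record<string, unknown>;
  const overrides: Record<string, number> = {};
  for (const tool of Object.keys(source)) {
    const limit = source[tool];
    // Keep only positive integer limits; skip everything else.
    if (typeof limit === "number" && Number.isInteger(limit) && limit > 0) {
      overrides[tool] = limit;
    }
  }
  return overrides;
}
```

A stricter variant could throw on malformed input so that a misconfigured deployment fails fast instead of silently running without overrides.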
Environment Variables
| Variable | Default | Description |
|---|---|---|
| `RATE_LIMIT_ENABLED` | `true` | Enable/disable rate limiting |
| `RATE_LIMIT_WINDOW_MS` | `60000` | Time window in milliseconds |
| `RATE_LIMIT_MAX_REQUESTS` | `60` | Max requests per session per window |
| `RATE_LIMIT_GLOBAL_MAX` | `300` | Max total requests per window |
| `RATE_LIMIT_TOOL_OVERRIDES` | - | JSON: `{"push_files": 10, "create_merge_request": 20}` |
Implementation
Middleware Structure
```
src/middleware/
├── index.ts # Re-exports
├── oauth-auth.ts # Existing OAuth middleware
└── rate-limiter.ts # NEW: Rate limiting middleware
```
Rate Limiter Interface
```typescript
interface RateLimiterConfig {
  windowMs: number;
  maxRequests: number;
  globalMax: number;
  toolOverrides?: Record<string, number>;
}

interface RateLimitInfo {
  remaining: number;
  reset: Date;
  total: number;
}

// Declaration only: method bodies are left to the implementation.
declare class RateLimiter {
  constructor(config: RateLimiterConfig);

  // Check if request is allowed
  isAllowed(sessionId: string, toolName?: string): boolean;

  // Record a request
  recordRequest(sessionId: string, toolName?: string): void;

  // Get current rate limit info
  getInfo(sessionId: string): RateLimitInfo;

  // Cleanup expired entries
  cleanup(): void;
}
```
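One possible in-memory implementation of this interface is a sliding-window log that stores request timestamps per session. The sketch below makes several assumptions not stated in the issue: tool overrides are treated as an additional per-session, per-tool cap; the constructor takes an optional `now` clock so tests can control time; and the helpers `live` and `toolKey` are hypothetical names.

```typescript
interface RateLimiterConfig {
  windowMs: number;
  maxRequests: number;
  globalMax: number;
  toolOverrides?: Record<string, number>;
}

interface RateLimitInfo {
  remaining: number;
  reset: Date;
  total: number;
}

// Returns the timestamps that are still inside the window.
function live(hits: number[] | undefined, cutoff: number): number[] {
  return (hits ?? []).filter((t) => t > cutoff);
}

// Unambiguous composite key for (session, tool) pairs.
function toolKey(sessionId: string, toolName: string): string {
  return JSON.stringify([sessionId, toolName]);
}

class RateLimiter {
  private readonly config: RateLimiterConfig;
  private readonly now: () => number;
  private readonly sessionHits = new Map<string, number[]>();
  private readonly toolHits = new Map<string, number[]>();
  private globalHits: number[] = [];

  // Assumption: `now` is injectable for testing; the issue's constructor takes only `config`.
  constructor(config: RateLimiterConfig, now: () => number = () => Date.now()) {
    this.config = config;
    this.now = now;
  }

  private toolLimit(toolName?: string): number | undefined {
    return toolName === undefined ? undefined : this.config.toolOverrides?.[toolName];
  }

  isAllowed(sessionId: string, toolName?: string): boolean {
    const cutoff = this.now() - this.config.windowMs;
    if (live(this.globalHits, cutoff).length >= this.config.globalMax) return false;
    if (live(this.sessionHits.get(sessionId), cutoff).length >= this.config.maxRequests) return false;
    const limit = this.toolLimit(toolName);
    if (limit !== undefined && toolName !== undefined) {
      const hits = this.toolHits.get(toolKey(sessionId, toolName));
      if (live(hits, cutoff).length >= limit) return false;
    }
    return true;
  }

  recordRequest(sessionId: string, toolName?: string): void {
    const t = this.now();
    this.globalHits.push(t);
    this.sessionHits.set(sessionId, [...(this.sessionHits.get(sessionId) ?? []), t]);
    // Per-tool hits are tracked only for tools that have an override.
    if (toolName !== undefined && this.toolLimit(toolName) !== undefined) {
      const key = toolKey(sessionId, toolName);
      this.toolHits.set(key, [...(this.toolHits.get(key) ?? []), t]);
    }
  }

  getInfo(sessionId: string): RateLimitInfo {
    const t = this.now();
    const hits = live(this.sessionHits.get(sessionId), t - this.config.windowMs);
    // The window frees a slot when the oldest live request expires.
    const resetMs = (hits[0] ?? t) + this.config.windowMs;
    return {
      remaining: Math.max(0, this.config.maxRequests - hits.length),
      reset: new Date(resetMs),
      total: this.config.maxRequests,
    };
  }

  cleanup(): void {
    const cutoff = this.now() - this.config.windowMs;
    this.globalHits = live(this.globalHits, cutoff);
    for (const map of [this.sessionHits, this.toolHits]) {
      map.forEach((hits, key) => {
        const kept = live(hits, cutoff);
        if (kept.length === 0) map.delete(key);
        else map.set(key, kept);
      });
    }
  }
}
```

A timestamp log is simple and exact but uses memory proportional to the number of requests per window; a fixed-window counter or token bucket would be cheaper if approximate limits are acceptable.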
Integration Points
- HTTP Transport (SSE/StreamableHTTP)
  - Apply to `/messages` and `/mcp` endpoints
  - Return HTTP 429 with `Retry-After` header
- Tool Execution (`handlers.ts`)
  - Check rate limit before executing tool
  - Return MCP error response if exceeded
- GitLab API Client (`http-client.ts`)
  - Monitor GitLab rate limit headers
  - Automatically back off when approaching limits
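For the GitLab API client integration, the back-off decision can be sketched as a pure function over the upstream response headers. This assumes GitLab reports its limits via `RateLimit-Limit`, `RateLimit-Remaining`, and `RateLimit-Reset` (Unix seconds); the function name `backoffDelayMs`, the helper `parseHeaderNumber`, and the 10% low-watermark threshold are illustrative choices, not part of the issue.

```typescript
// Parses a numeric header value; returns undefined when absent or not a number.
function parseHeaderNumber(value: string | null): number | undefined {
  if (value === null || value.trim() === "") return undefined;
  const n = Number(value);
  return Number.isFinite(n) ? n : undefined;
}

// Hypothetical helper: how long to wait before the next GitLab request.
// Assumption: header names match GitLab's RateLimit-* response headers.
function backoffDelayMs(
  getHeader: (name: string) => string | null,
  nowMs: number,
  lowWatermark: number = 0.1,
): number {
  const limit = parseHeaderNumber(getHeader("RateLimit-Limit"));
  const remaining = parseHeaderNumber(getHeader("RateLimit-Remaining"));
  const resetSec = parseHeaderNumber(getHeader("RateLimit-Reset"));

  // Without usable headers there is nothing to base a delay on.
  if (limit === undefined || remaining === undefined || resetSec === undefined || limit <= 0) {
    return 0;
  }
  // Plenty of quota left: no delay.
  if (remaining / limit > lowWatermark) return 0;

  // Spread the remaining quota evenly over the time left until reset.
  // With remaining = 0 this waits for the full time until reset.
  const untilResetMs = Math.max(0, resetSec * 1000 - nowMs);
  return Math.ceil(untilResetMs / (Math.max(0, remaining) + 1));
}
```

With a `fetch` response this could be called as `backoffDelayMs((name) => response.headers.get(name), Date.now())`, and the result passed to a timer before issuing the next request.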
Response Headers
For HTTP transports, include rate limit headers:
```
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 45
X-RateLimit-Reset: 1640000000
```
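These headers can be derived directly from a `RateLimitInfo` value. A small sketch, assuming `X-RateLimit-Reset` carries Unix seconds (as the example value suggests) and using a hypothetical function name `rateLimitHeaders`:

```typescript
interface RateLimitInfo {
  remaining: number;
  reset: Date;
  total: number;
}

// Hypothetical helper: maps RateLimitInfo onto the X-RateLimit-* response headers.
function rateLimitHeaders(info: RateLimitInfo): Record<string, string> {
  return {
    "X-RateLimit-Limit": String(info.total),
    "X-RateLimit-Remaining": String(info.remaining),
    // Unix seconds, rounded up so clients never retry too early.
    "X-RateLimit-Reset": String(Math.ceil(info.reset.getTime() / 1000)),
  };
}
```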
Error Response
When rate limit exceeded:
```json
{
"error": "Rate limit exceeded",
"retryAfter": 30,
"limit": 60,
"remaining": 0,
"resetAt": "2024-01-01T00:01:00Z"
}
```
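A sketch of how this body and the matching `Retry-After` header could be built from a `RateLimitInfo` value. The function name `buildRateLimitError` and the returned `{ status, headers, body }` shape are assumptions for illustration; how the result is written to the response depends on the HTTP transport in use.

```typescript
interface RateLimitInfo {
  remaining: number;
  reset: Date;
  total: number;
}

interface RateLimitErrorBody {
  error: string;
  retryAfter: number;
  limit: number;
  remaining: number;
  resetAt: string;
}

// Hypothetical helper: builds the 429 payload described above.
function buildRateLimitError(
  info: RateLimitInfo,
  now: Date,
): { status: number; headers: Record<string, string>; body: RateLimitErrorBody } {
  // Seconds until the window resets, never negative.
  const retryAfter = Math.max(0, Math.ceil((info.reset.getTime() - now.getTime()) / 1000));
  return {
    status: 429,
    headers: { "Retry-After": String(retryAfter) },
    body: {
      error: "Rate limit exceeded",
      retryAfter,
      limit: info.total,
      remaining: info.remaining,
      // Drop milliseconds so the timestamp matches the example format.
      resetAt: info.reset.toISOString().replace(/\.\d{3}Z$/, "Z"),
    },
  };
}
```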
Implementation Tasks
- Create `src/middleware/rate-limiter.ts`
- Implement in-memory rate limit storage with cleanup
- Add configuration to `config.ts`
- Integrate with HTTP transports in `server.ts`
- Integrate with tool execution in `handlers.ts`
- Add GitLab rate limit header monitoring to `http-client.ts`
- Add unit tests for rate limiter logic
- Add integration tests for HTTP 429 responses
- Update documentation
Advanced Features (Future)
- Redis-backed rate limiting for distributed deployments
- Per-user rate limits in OAuth mode
- Adaptive rate limiting based on GitLab API response times
- Rate limit dashboard/metrics endpoint
Testing Requirements
- Unit tests for rate limit calculations
- Test window sliding behavior
- Test per-tool overrides
- Test global vs per-session limits
- Integration test for HTTP 429 response
- Test cleanup of expired entries
Acceptance Criteria
- `RATE_LIMIT_ENABLED=false` disables all rate limiting
- Per-session limits work correctly
- Global limits work correctly
- Per-tool overrides apply correctly
- HTTP 429 returned with proper headers
- MCP error response for tool execution
- Memory cleanup prevents leaks
- Clear error messages for agents