feat: Instance Health Dashboard on GET / endpoint #275

@polaz

Summary

Add a health dashboard endpoint at GET / that displays the current server configuration, registered GitLab instances, their connection status, and real-time metrics. This provides visibility into multi-instance federation state without disrupting MCP protocol clients.

Motivation

With multi-instance federation (#274), administrators need visibility into:

  • Which GitLab instances are configured
  • Connection health of each instance
  • Rate limiting metrics (active/queued requests)
  • Per-instance introspection status
  • Active OAuth sessions count

Currently, the only way to inspect server state is to read the logs.

Dependencies

Implementation

Endpoint Behavior

GET /
Accept: text/html     → HTML dashboard
Accept: application/json → JSON metrics
Accept: */*           → HTML dashboard (default fallback)
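
The mapping above could be implemented with a small negotiation helper. The following is a minimal sketch, not the project's actual code: `negotiateFormat` and `DashboardFormat` are hypothetical names, and honoring `q` weights is an assumption that goes beyond what the issue specifies.

```typescript
type DashboardFormat = "html" | "json";

// Hypothetical helper: choose the response format from the Accept header.
// Falls back to HTML when the header is missing or names nothing we serve.
function negotiateFormat(acceptHeader: string | undefined): DashboardFormat {
  if (!acceptHeader) return "html";

  // Split "type/subtype;q=0.8, ..." into entries with a quality weight.
  const entries = acceptHeader.split(",").map((part, index) => {
    const segments = part.trim().split(";");
    const type = (segments[0] ?? "").trim().toLowerCase();
    let q = 1;
    for (const param of segments.slice(1)) {
      const pieces = param.trim().split("=");
      const key = pieces[0];
      const value = pieces[1];
      if (key === "q" && value !== undefined) {
        const parsed = Number.parseFloat(value);
        if (!Number.isNaN(parsed)) q = parsed;
      }
    }
    return { type, q, index };
  });

  // Highest weight first; ties keep the order the client sent.
  entries.sort((a, b) => b.q - a.q || a.index - b.index);

  for (const entry of entries) {
    if (entry.q <= 0) continue; // q=0 means "not acceptable"
    if (entry.type === "application/json") return "json";
    if (entry.type === "text/html" || entry.type === "*/*") return "html";
  }
  return "html";
}
```

A route handler would call `negotiateFormat(req.headers.accept)` and branch on the result. A typical browser header such as `text/html,application/xhtml+xml,*/*;q=0.8` resolves to HTML, while an explicit `application/json` resolves to JSON.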

MCP Client Compatibility:

  • MCP clients use POST /mcp or SSE endpoints, not GET /
  • Claude Desktop and VS Code MCP extensions do not request GET /
  • Safe to serve dashboard without breaking MCP protocol

HTML Dashboard

┌─────────────────────────────────────────────────────────────────┐
│  GitLab MCP Server v6.52.0                          [Healthy]   │
│  Uptime: 2d 15h 32m | Mode: OAuth | Sessions: 12                │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  REGISTERED INSTANCES                                           │
│  ┌────────────────────────────────────────────────────────────┐ │
│  │ ● gitlab.com                                    [Healthy]  │ │
│  │   Version: 17.2.0 | Tier: Ultimate | Introspected: ✓       │ │
│  │   Requests: 23/100 active | 0 queued | 1,247 total         │ │
│  │   Avg latency: 142ms | Last check: 2m ago                  │ │
│  └────────────────────────────────────────────────────────────┘ │
│  ┌────────────────────────────────────────────────────────────┐ │
│  │ ● git.corp.io                                   [Healthy]  │ │
│  │   Version: 16.8.0 | Tier: Premium | Introspected: ✓        │ │
│  │   Requests: 5/50 active | 0 queued | 423 total             │ │
│  │   Avg latency: 89ms | Last check: 1m ago                   │ │
│  └────────────────────────────────────────────────────────────┘ │
│  ┌────────────────────────────────────────────────────────────┐ │
│  │ ○ gl.dev.net                                   [Degraded]  │ │
│  │   Version: 15.0.0 | Tier: Free | Introspected: ✗           │ │
│  │   Requests: 2/20 active | 3 queued | 56 total              │ │
│  │   Avg latency: 2,341ms | Last check: 5m ago                │ │
│  │   ⚠ High latency detected                                  │ │
│  └────────────────────────────────────────────────────────────┘ │
│                                                                 │
│  CONFIGURATION                                                  │
│  ├─ Auth mode: OAuth 2.1 Device Flow                            │
│  ├─ Read-only: No                                               │
│  ├─ Tools enabled: 44/44                                        │
│  ├─ Session timeout: 30m                                        │
│  └─ Config source: /etc/gitlab-mcp/instances.yaml               │
│                                                                 │
│  ACTIVE SESSIONS (anonymized)                                   │
│  ├─ gitlab.com: 8 sessions                                      │
│  ├─ git.corp.io: 3 sessions                                     │
│  └─ gl.dev.net: 1 session                                       │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
              Auto-refresh: 30s | Last updated: 14:32:15

JSON Metrics Response

{
  "server": {
    "version": "6.52.0",
    "uptime": 228720,
    "mode": "oauth",
    "readOnly": false,
    "toolsEnabled": 44,
    "toolsTotal": 44
  },
  "instances": [
    {
      "url": "https://gitlab.com",
      "label": "GitLab.com",
      "status": "healthy",
      "version": "17.2.0",
      "tier": "ultimate",
      "introspected": true,
      "rateLimit": {
        "activeRequests": 23,
        "maxConcurrent": 100,
        "queuedRequests": 0,
        "queueSize": 500,
        "totalRequests": 1247,
        "rejectedRequests": 0
      },
      "latency": {
        "avgMs": 142,
        "p95Ms": 312,
        "p99Ms": 567
      },
      "lastHealthCheck": "2024-01-15T14:30:15Z"
    }
  ],
  "sessions": {
    "total": 12,
    "byInstance": {
      "https://gitlab.com": 8,
      "https://git.corp.io": 3,
      "https://gl.dev.net": 1
    }
  },
  "config": {
    "source": "/etc/gitlab-mcp/instances.yaml",
    "sessionTimeout": 1800,
    "oauthEnabled": true
  }
}
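
For reference, the payload above can be described with TypeScript types. This is a sketch derived only from the example response: the interface names are hypothetical, and the units noted in comments (seconds for `uptime` and `sessionTimeout`) are inferred from the sample values rather than stated in the issue.

```typescript
type InstanceStatus = "healthy" | "degraded" | "offline";

interface RateLimitMetrics {
  activeRequests: number;
  maxConcurrent: number;
  queuedRequests: number;
  queueSize: number;
  totalRequests: number;
  rejectedRequests: number;
}

interface LatencyMetrics {
  avgMs: number;
  p95Ms: number;
  p99Ms: number;
}

interface InstanceMetrics {
  url: string;
  label: string;
  status: InstanceStatus;
  version: string;
  tier: string;
  introspected: boolean;
  rateLimit: RateLimitMetrics;
  latency: LatencyMetrics;
  lastHealthCheck: string; // ISO 8601 timestamp
}

interface DashboardMetrics {
  server: {
    version: string;
    uptime: number; // assumed to be seconds
    mode: string;
    readOnly: boolean;
    toolsEnabled: number;
    toolsTotal: number;
  };
  instances: InstanceMetrics[];
  sessions: { total: number; byInstance: Record<string, number> };
  config: {
    source: string;
    sessionTimeout: number; // assumed to be seconds (1800 = 30m)
    oauthEnabled: boolean;
  };
}

// Sanity check: per-instance session counts should add up to the total.
function sessionsAreConsistent(sessions: DashboardMetrics["sessions"]): boolean {
  const sum = Object.values(sessions.byInstance).reduce((acc, n) => acc + n, 0);
  return sum === sessions.total;
}
```

In the example response, the per-instance counts 8, 3, and 1 add up to the reported total of 12, so `sessionsAreConsistent` would return `true` for it.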

Health Check Logic

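type InstanceStatus = 'healthy' | 'degraded' | 'offline';

// Assumed shape of the per-instance state read by the health check;
// the field names are taken from the function body below.
interface GitLabInstance {
  lastSuccessfulRequest: number; // epoch milliseconds
  avgLatencyMs: number;
  queuedRequests: number;
  queueSize: number;
  errorRate: number; // fraction in the range 0..1
}

function determineInstanceStatus(instance: GitLabInstance): InstanceStatus {
  // Offline: no successful request in last 5 minutes
  if (instance.lastSuccessfulRequest < Date.now() - 5 * 60 * 1000) {
    return 'offline';
  }

  // Degraded conditions:
  // - Avg latency > 2000ms
  // - Queue > 50% capacity
  // - Error rate > 10%
  if (
    instance.avgLatencyMs > 2000 ||
    instance.queuedRequests > instance.queueSize * 0.5 ||
    instance.errorRate > 0.1
  ) {
    return 'degraded';
  }

  return 'healthy';
}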
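
The health check reads an average latency and an error rate, which have to be derived from recent requests. One way to compute them is sketched below, assuming the collector keeps a list of recent samples. `RequestSample`, `WindowSummary`, `percentile`, and `summarize` are hypothetical names, and the nearest-rank percentile method is a choice made for this sketch, not something the issue prescribes.

```typescript
// Hypothetical record of one completed request to a GitLab instance.
interface RequestSample {
  durationMs: number;
  ok: boolean;
}

interface WindowSummary {
  avgLatencyMs: number;
  p95Ms: number;
  errorRate: number; // fraction in the range 0..1
}

// Nearest-rank percentile over an ascending-sorted array.
function percentile(sortedAsc: number[], p: number): number {
  if (sortedAsc.length === 0) return 0;
  const rank = Math.ceil((p / 100) * sortedAsc.length);
  const index = Math.min(sortedAsc.length - 1, Math.max(0, rank - 1));
  return sortedAsc[index] ?? 0;
}

// Reduce a window of samples to the figures the health check consumes.
function summarize(samples: RequestSample[]): WindowSummary {
  if (samples.length === 0) {
    return { avgLatencyMs: 0, p95Ms: 0, errorRate: 0 };
  }
  const durations = samples.map((s) => s.durationMs).sort((a, b) => a - b);
  const total = durations.reduce((acc, d) => acc + d, 0);
  const failures = samples.filter((s) => !s.ok).length;
  return {
    avgLatencyMs: total / samples.length,
    p95Ms: percentile(durations, 95),
    errorRate: failures / samples.length,
  };
}
```

How the window is bounded (by time or by sample count) is left open here. Either way, the summary only needs the samples that are still inside the window.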

Security Considerations

Concern                    Mitigation
Sensitive data exposure    No tokens, secrets, or user identifiers shown
Session enumeration        Only counts shown, no session IDs or user info
Instance URLs              URLs are already public in config
Rate limit gaming          Metrics are read-only, no control exposed
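
To illustrate the session enumeration mitigation, the sketch below reduces session records to per-instance counts so that no session ID or user identifier reaches the response. The `Session` shape and `countSessionsByInstance` are hypothetical; the real session store may look different.

```typescript
// Hypothetical session record; only instanceUrl is used for the dashboard.
interface Session {
  id: string;
  userId: string;
  instanceUrl: string;
}

// Collapse sessions to counts keyed by instance URL.
// IDs and user identifiers are never copied into the result.
function countSessionsByInstance(sessions: Iterable<Session>): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const session of sessions) {
    counts[session.instanceUrl] = (counts[session.instanceUrl] ?? 0) + 1;
  }
  return counts;
}
```

Because the function builds a fresh object from counts alone, serializing its result cannot leak fields that are added to the session record later.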

Set DASHBOARD_ENABLED=false to disable the endpoint entirely.
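
A minimal sketch of reading this flag follows. `isDashboardEnabled` is a hypothetical name. Defaulting to enabled when the variable is unset, and accepting `0`, `no`, and `off` as synonyms for `false`, are both assumptions; the issue itself only names `DASHBOARD_ENABLED=false`.

```typescript
// Hypothetical helper: decide whether the dashboard route should be registered.
// The environment is passed in so the function is easy to test.
function isDashboardEnabled(env: Record<string, string | undefined>): boolean {
  const raw = env["DASHBOARD_ENABLED"];
  if (raw === undefined) return true; // assumption: enabled unless turned off

  const value = raw.trim().toLowerCase();
  // Assumption: common "falsy" spellings are treated like "false".
  return !(value === "false" || value === "0" || value === "no" || value === "off");
}
```

At startup the server would call `isDashboardEnabled(process.env)` and skip registering the GET / route when it returns `false`.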

Files to Create/Modify

File                                Purpose
src/http/dashboard.ts               Dashboard route handler
src/http/dashboard.html             HTML template with CSS
src/services/MetricsCollector.ts    Collect and aggregate metrics
src/http/server.ts                  Register GET / route
docs/advanced/dashboard.md          Documentation

Acceptance Criteria

  • GET / returns HTML dashboard in browser
  • GET / with Accept: application/json returns JSON metrics
  • Dashboard shows all registered instances
  • Dashboard shows instance health status (healthy/degraded/offline)
  • Dashboard shows rate limit metrics per instance
  • Dashboard shows active session counts (anonymized)
  • Dashboard shows server configuration summary
  • Dashboard auto-refreshes every 30 seconds
  • No sensitive data exposed (tokens, user info, session IDs)
  • DASHBOARD_ENABLED=false disables the endpoint
  • MCP clients unaffected (they use POST/SSE, not GET /)
  • Documentation updated

Time Estimate

Task                        Effort
MetricsCollector service    2h
Dashboard HTML/CSS          2h
JSON endpoint               1h
Route integration           1h
Health check logic          1h
Documentation               1h
Tests                       2h
Total                       10h

Labels

enhancement, dashboard, monitoring
