🐛 fix: reflect 503 backend failures in dashboard health (#8162) #8169

clubanderson merged 1 commit into main
Conversation
Two related problems on console.kubestellar.io:
1. Optional feature probes (/api/kagent/status,
/api/kagenti-provider/status, /api/gadget/status, /api/mcs/status,
/api/persistence/status, /api/self-upgrade/status) fell through
the MSW catch-all and returned 503 plus an "unhandled request"
warning. Add explicit MSW handlers that return 200 with
{ available: false, reason: 'not configured in demo mode' }.
Callers already branch on `available` so semantics are unchanged,
and the DevTools network tab is no longer littered with scary
503s in the hosted demo (a handler sketch follows this description).
2. useDashboardHealth didn't consult useBackendHealth, so the
dashboard health indicator stayed "healthy" even when the
backend /health endpoint was persistently failing (which is
the same signal that drives all those 503s). Wire backendStatus
in as a critical input so the indicator correctly shows
"Backend API unreachable" once useBackendHealth's failure
debounce trips.
Signed-off-by: Andy Anderson <[email protected]>
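A minimal sketch of the handlers described above, assuming MSW v2's `http`/`HttpResponse` API (which the diff excerpts in the review below confirm); the loop over paths is a compaction for illustration, whereas the actual handlers.ts registers each route explicitly:

```ts
// Sketch of the web/src/mocks/handlers.ts additions: explicit 200 handlers for
// the optional feature probes, registered before the /api/* catch-all so that
// they match first and the probes no longer fall through to a 503.
import { http, HttpResponse } from 'msw'

const notConfigured = { available: false, reason: 'not configured in demo mode' }

export const optionalFeatureHandlers = [
  '/api/kagent/status',
  '/api/kagenti-provider/status',
  '/api/gadget/status',
  '/api/mcs/status',
  '/api/persistence/status',
  '/api/self-upgrade/status',
].map((path) => http.get(path, () => HttpResponse.json(notConfigured)))
```

The review comments further down suggest endpoint-specific shapes for the `mcs`, `persistence`, and `self-upgrade` probes rather than this uniform payload.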
✅ Deploy Preview for kubestellarconsole ready!
Pull request overview
Addresses demo/hosted-console UX and correctness issues by preventing noisy MSW 503s for optional integrations and by ensuring the dashboard health indicator reflects persistent backend /health failures.
Changes:
- Add explicit MSW handlers for optional integration probe endpoints to avoid falling through to the `/api/*` 503 catch-all.
- Update `useDashboardHealth` to incorporate `useBackendHealth()` and mark backend disconnects as a critical health signal.
- Extend `useDashboardHealth` tests to cover backend disconnected/connecting scenarios.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| web/src/mocks/handlers.ts | Adds explicit MSW handlers for optional feature probe endpoints to avoid 503 noise in demo mode. |
| web/src/hooks/useDashboardHealth.ts | Treats backend health “disconnected” as a critical input to overall dashboard health. |
| web/src/hooks/__tests__/useDashboardHealth.test.ts | Mocks useBackendHealth and adds test cases for disconnected/connecting backend status. |
```ts
    return HttpResponse.json({ available: false, reason: 'not configured in demo mode' })
  }),
  http.get('/api/mcs/status', () => {
    return HttpResponse.json({ available: false, reason: 'not configured in demo mode' })
```
`/api/mcs/status` is mocked with `{ available: false, reason: ... }`, but the app code expects `MCSStatusResponse` (e.g., `{ clusters: ClusterMCSStatus[] }`). If this endpoint is hit while MSW is active, `useMCSStatus` will read `data.clusters` and break. Mock this route with the correct response shape (e.g., `{ clusters: [] }`) and, if needed, convey "not configured" via an additional field or by returning an empty clusters list.
Suggested change:

```diff
-    return HttpResponse.json({ available: false, reason: 'not configured in demo mode' })
+    return HttpResponse.json({ clusters: [] })
```
```ts
    return HttpResponse.json({ available: false, reason: 'not configured in demo mode' })
  }),
  http.get('/api/persistence/status', () => {
    return HttpResponse.json({ available: false, reason: 'not configured in demo mode' })
```
`/api/persistence/status` is mocked with an `{ available: false, reason: ... }` payload, but `usePersistence` expects a `PersistenceStatus` object (e.g., `{ active, activeCluster, primaryHealth, failoverActive, ... }`). Returning an incompatible shape can lead to UI misbehavior if this endpoint is called under MSW. Mock this with a valid `PersistenceStatus` (e.g., disabled defaults with `active: false` and `primaryHealth: 'unknown'` or similar).
Suggested change:

```diff
-    return HttpResponse.json({ available: false, reason: 'not configured in demo mode' })
+    return HttpResponse.json({
+      active: false,
+      activeCluster: null,
+      primaryHealth: 'unknown',
+      failoverActive: false,
+    })
```
```ts
    return HttpResponse.json({ available: false, reason: 'not configured in demo mode' })
  }),
  http.get('/api/self-upgrade/status', () => {
    return HttpResponse.json({ available: false, reason: 'not configured in demo mode' })
```
`/api/self-upgrade/status` is mocked with only `{ available: false, reason: ... }`, but `SelfUpgradeStatus` includes required fields like `canPatch`, `namespace`, `deploymentName`, `currentImage`, and `releaseName`. If the UI reads these fields under MSW, they will be `undefined`. Return a complete `SelfUpgradeStatus` object with safe defaults (and `available: false`) to match the runtime expectations.
Suggested change:

```diff
-    return HttpResponse.json({ available: false, reason: 'not configured in demo mode' })
+    return HttpResponse.json({
+      available: false,
+      reason: 'not configured in demo mode',
+      canPatch: false,
+      namespace: '',
+      deploymentName: '',
+      currentImage: '',
+      releaseName: '',
+    })
```
Post-merge build verification passed ✅. Both Go and frontend builds compiled successfully against the merge commit.
❌ Post-Merge Verification: failed.
Fixes #8175

The `builds wss: URL when page uses https` test in useExecSession-expand was asserting the pre-phase3d URL shape — `wss://<page-host>/ws/exec` — built from `window.location.protocol`. After #8168 migrated pod exec to kc-agent (#7993), the hook uses the `LOCAL_AGENT_WS_URL` constant (`ws://127.0.0.1:8585/ws`) unconditionally, because the SPDY exec stream now runs under the user's own kubeconfig on the local loopback rather than through the page origin. The page protocol no longer influences the URL, so the old assertion has been stale since #8168 merged.

This is a pre-existing flake that Coverage Suite run #760 happened to surface on top of b828c87 (#8169). #8169 only touched useDashboardHealth, MSW handlers, and useDashboardHealth.test.ts — none of which are related to exec session URL construction. Confirmed by running the test in isolation and together with useDashboardHealth.test.ts: both pass after the fix, and the isolated failure reproduces on unmodified main against useExecSession.ts:235.

Fix: rename the test to "builds kc-agent /ws/exec URL regardless of page protocol" and assert the new, stable target. The rationale is documented inline with a source pointer to useExecSession.ts:235 so future readers don't reintroduce the old assertion.

Signed-off-by: Andy Anderson <[email protected]>
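For illustration, a sketch of the renamed assertion, assuming a Vitest harness; `LOCAL_AGENT_WS_URL` and its `ws://127.0.0.1:8585/ws` value come from the description above, while `buildExecUrl` is a hypothetical stand-in for the URL construction at useExecSession.ts:235 (including how the `/exec` path is appended):

```ts
// Hypothetical test sketch: the exec URL now comes from LOCAL_AGENT_WS_URL
// regardless of the page protocol, per useExecSession.ts:235.
import { describe, it, expect } from 'vitest'

const LOCAL_AGENT_WS_URL = 'ws://127.0.0.1:8585/ws'

// Stand-in for the hook's URL builder; the real logic lives in useExecSession.ts.
function buildExecUrl(): string {
  return `${LOCAL_AGENT_WS_URL}/exec`
}

describe('useExecSession exec URL', () => {
  it('builds kc-agent /ws/exec URL regardless of page protocol', () => {
    // The page origin (http: or https:) must not influence the target.
    expect(buildExecUrl()).toBe('ws://127.0.0.1:8585/ws/exec')
  })
})
```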
Fixes #8162
Summary
Two related problems on console.kubestellar.io, both reported in issue #8162:
1. MSW 503 noise — Optional feature probes (`/api/kagent/status`, `/api/kagenti-provider/status`, and friends) fell through the MSW `/api/*` catch-all and returned `503` plus an `[MSW] Warning: intercepted a request without a matching request handler` warning. Add explicit MSW handlers that return `200` with `{ available: false, reason: 'not configured in demo mode' }`. Callers (`kagentBackend.ts`, `kagentiProviderBackend.ts`, `useKagentBackend`) already branch on `available`, so semantics are unchanged — the DevTools network tab is just no longer littered with scary 503s in the hosted demo.

2. Dashboard health gap — `useDashboardHealth` didn't consult `useBackendHealth`, so the dashboard health indicator stayed healthy even when the backend `/health` endpoint was persistently failing (which is the same signal that drives all those downstream 503s). Wire `backendStatus` in as a critical input so the indicator correctly shows "Backend API unreachable" once `useBackendHealth`'s failure debounce trips (`FAILURE_THRESHOLD = 4` consecutive failures, so this is a persistent-failure signal, not a transient one). A sketch of this wiring follows the Summary.

In demo mode on Netlify, the `/health` MSW handler returns `{ status: 'ok' }`, so the new branch stays inactive and the hosted demo continues to show "healthy" — which is now truthful because the 503 probes were replaced with proper 200 responses.
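As referenced in item 2, a minimal sketch of the wiring, assuming `useBackendHealth()` exposes a `status` field with a `'disconnected'` value and that the hook reduces its inputs to a single health result; the real hook's shape and other inputs may differ:

```ts
// web/src/hooks/useDashboardHealth.ts (sketch, not the actual implementation).
// Assumes useBackendHealth() reports 'disconnected' only after its
// FAILURE_THRESHOLD = 4 debounce, per the description above.
import { useBackendHealth } from './useBackendHealth'

export type DashboardHealth =
  | { status: 'healthy' }
  | { status: 'critical'; reason: string }

export function useDashboardHealth(): DashboardHealth {
  const { status: backendStatus } = useBackendHealth()

  // Backend disconnect is a critical input: it is the same persistent-failure
  // signal that drives the downstream 503s, so surface it first.
  if (backendStatus === 'disconnected') {
    return { status: 'critical', reason: 'Backend API unreachable' }
  }

  // ...other health inputs (omitted in this sketch)...
  return { status: 'healthy' }
}
```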
Changes

- `web/src/mocks/handlers.ts` — 6 new explicit handlers before the `/api/*` catch-all
- `web/src/hooks/useDashboardHealth.ts` — consume `useBackendHealth()`, flag disconnected as critical
- `web/src/hooks/__tests__/useDashboardHealth.test.ts` — mock `useBackendHealth`, two new test cases

Test plan

- `npm run build` green
- `npm run lint` clean on touched files (1117 pre-existing lint problems on main, none from this PR)
- `useDashboardHealth.test.ts` — 9/9 pass (2 new tests for bug: Backend API 503 failures not reflected in dashboard health status #8162)
- `kagentBackend.test.ts`, `useKagentBackend.test.ts`, `useBackendHealth.test.ts` — 69/69 pass
- `200` responses for `/api/kagent/status` etc.

🤖 Generated with Claude Code