Skip to content

🐛 fix: reflect 503 backend failures in dashboard health (#8162)#8169

Merged
clubanderson merged 1 commit intomainfrom
fix/health-503-8162
Apr 15, 2026
Merged

🐛 fix: reflect 503 backend failures in dashboard health (#8162)#8169
clubanderson merged 1 commit intomainfrom
fix/health-503-8162

Conversation

@clubanderson
Copy link
Copy Markdown
Collaborator

Fixes #8162

Summary

Two related problems on console.kubestellar.io, both reported in issue #8162:

  1. MSW 503 noise — Optional feature probes (/api/kagent/status, /api/kagenti-provider/status, and friends) fell through the MSW /api/* catch-all and returned 503 plus an [MSW] Warning: intercepted a request without a matching request handler warning. Add explicit MSW handlers that return 200 with { available: false, reason: 'not configured in demo mode' }. Callers (kagentBackend.ts, kagentiProviderBackend.ts, useKagentBackend) already branch on available so semantics are unchanged — the DevTools network tab is just no longer littered with scary 503s in the hosted demo.

  2. Dashboard health gapuseDashboardHealth didn't consult useBackendHealth, so the dashboard health indicator stayed healthy even when the backend /health endpoint was persistently failing (which is the same signal that drives all those downstream 503s). Wire backendStatus in as a critical input so the indicator correctly shows Backend API unreachable once useBackendHealth's failure debounce trips (FAILURE_THRESHOLD = 4 consecutive failures, so this is a persistent-failure signal, not a transient one).

In demo mode on Netlify, the /health MSW handler returns { status: 'ok' } so the new branch stays inactive and the hosted demo continues to show "healthy" — which is now truthful because the 503 probes were replaced with proper 200 responses.

Changes

  • web/src/mocks/handlers.ts — 6 new explicit handlers before the /api/* catch-all
  • web/src/hooks/useDashboardHealth.ts — consume useBackendHealth(), flag disconnected as critical
  • web/src/hooks/__tests__/useDashboardHealth.test.ts — mock useBackendHealth, two new test cases

Test plan

  • npm run build green
  • npm run lint clean on touched files (1117 pre-existing lint problems on main, none from this PR)
  • useDashboardHealth.test.ts — 9/9 pass (2 new tests for bug: Backend API 503 failures not reflected in dashboard health status #8162)
  • kagentBackend.test.ts, useKagentBackend.test.ts, useBackendHealth.test.ts — 69/69 pass
  • Verify on preview deploy that DevTools network tab shows 200 for /api/kagent/status etc.
  • Verify health indicator flips to critical when backend is disconnected (manual, non-demo)

🤖 Generated with Claude Code

Two related problems on console.kubestellar.io:

1. Optional feature probes (/api/kagent/status,
   /api/kagenti-provider/status, /api/gadget/status, /api/mcs/status,
   /api/persistence/status, /api/self-upgrade/status) fell through
   the MSW catch-all and returned 503 plus an "unhandled request"
   warning. Add explicit MSW handlers that return 200 with
   { available: false, reason: 'not configured in demo mode' }.
   Callers already branch on `available` so semantics are unchanged,
   and the DevTools network tab is no longer littered with scary
   503s in the hosted demo.

2. useDashboardHealth didn't consult useBackendHealth, so the
   dashboard health indicator stayed "healthy" even when the
   backend /health endpoint was persistently failing (which is
   the same signal that drives all those 503s). Wire backendStatus
   in as a critical input so the indicator correctly shows
   "Backend API unreachable" once useBackendHealth's failure
   debounce trips.

Signed-off-by: Andy Anderson <[email protected]>
@clubanderson clubanderson added the ai-generated Pull request generated by AI label Apr 15, 2026
Copilot AI review requested due to automatic review settings April 15, 2026 15:53
@kubestellar-prow kubestellar-prow Bot added the dco-signoff: yes Indicates the PR's author has signed the DCO. label Apr 15, 2026
@kubestellar-prow
Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign clubanderson for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@netlify
Copy link
Copy Markdown

netlify Bot commented Apr 15, 2026

Deploy Preview for kubestellarconsole ready!

Name Link
🔨 Latest commit 33dc327
🔍 Latest deploy log https://app.netlify.com/projects/kubestellarconsole/deploys/69dfb47513b58d00085087ad
😎 Deploy Preview https://deploy-preview-8169.console-deploy-preview.kubestellar.io
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

@kubestellar-prow kubestellar-prow Bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Apr 15, 2026
@github-actions
Copy link
Copy Markdown
Contributor

👋 Hey @clubanderson — thanks for opening this PR!

🤖 This project is developed exclusively using AI coding assistants.

Please do not attempt to code anything for this project manually.
All contributions should be authored using an AI coding tool such as:

This ensures consistency in code style, architecture patterns, test coverage,
and commit quality across the entire codebase.


This is an automated message.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Addresses demo/hosted-console UX and correctness issues by preventing noisy MSW 503s for optional integrations and by ensuring the dashboard health indicator reflects persistent backend /health failures.

Changes:

  • Add explicit MSW handlers for optional integration probe endpoints to avoid falling through to the /api/* 503 catch-all.
  • Update useDashboardHealth to incorporate useBackendHealth() and mark backend disconnects as a critical health signal.
  • Extend useDashboardHealth tests to cover backend disconnected/connecting scenarios.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description
web/src/mocks/handlers.ts Adds explicit MSW handlers for optional feature probe endpoints to avoid 503 noise in demo mode.
web/src/hooks/useDashboardHealth.ts Treats backend health “disconnected” as a critical input to overall dashboard health.
web/src/hooks/tests/useDashboardHealth.test.ts Mocks useBackendHealth and adds test cases for disconnected/connecting backend status.

Comment thread web/src/mocks/handlers.ts
return HttpResponse.json({ available: false, reason: 'not configured in demo mode' })
}),
http.get('/api/mcs/status', () => {
return HttpResponse.json({ available: false, reason: 'not configured in demo mode' })
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/api/mcs/status is mocked with { available: false, reason: ... }, but the app code expects MCSStatusResponse (e.g., { clusters: ClusterMCSStatus[] }). If this endpoint is hit while MSW is active, useMCSStatus will read data.clusters and break. Mock this route with the correct response shape (e.g., { clusters: [] }) and, if needed, convey “not configured” via an additional field or by returning an empty clusters list.

Suggested change
return HttpResponse.json({ available: false, reason: 'not configured in demo mode' })
return HttpResponse.json({ clusters: [] })

Copilot uses AI. Check for mistakes.
Comment thread web/src/mocks/handlers.ts
return HttpResponse.json({ available: false, reason: 'not configured in demo mode' })
}),
http.get('/api/persistence/status', () => {
return HttpResponse.json({ available: false, reason: 'not configured in demo mode' })
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/api/persistence/status is mocked with an { available: false, reason: ... } payload, but usePersistence expects a PersistenceStatus object (e.g., { active, activeCluster, primaryHealth, failoverActive, ... }). Returning an incompatible shape can lead to UI misbehavior if this endpoint is called under MSW. Mock this with a valid PersistenceStatus (e.g., disabled defaults with active: false and primaryHealth: 'unknown' or similar).

Suggested change
return HttpResponse.json({ available: false, reason: 'not configured in demo mode' })
return HttpResponse.json({
active: false,
activeCluster: null,
primaryHealth: 'unknown',
failoverActive: false,
})

Copilot uses AI. Check for mistakes.
Comment thread web/src/mocks/handlers.ts
return HttpResponse.json({ available: false, reason: 'not configured in demo mode' })
}),
http.get('/api/self-upgrade/status', () => {
return HttpResponse.json({ available: false, reason: 'not configured in demo mode' })
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/api/self-upgrade/status is mocked with only { available: false, reason: ... }, but SelfUpgradeStatus includes required fields like canPatch, namespace, deploymentName, currentImage, and releaseName. If the UI reads these fields under MSW, they will be undefined. Return a complete SelfUpgradeStatus object with safe defaults (and available: false) to match the runtime expectations.

Suggested change
return HttpResponse.json({ available: false, reason: 'not configured in demo mode' })
return HttpResponse.json({
available: false,
reason: 'not configured in demo mode',
canPatch: false,
namespace: '',
deploymentName: '',
currentImage: '',
releaseName: '',
})

Copilot uses AI. Check for mistakes.
@clubanderson clubanderson merged commit b828c87 into main Apr 15, 2026
58 of 59 checks passed
@kubestellar-prow kubestellar-prow Bot deleted the fix/health-503-8162 branch April 15, 2026 16:01
@github-actions
Copy link
Copy Markdown
Contributor

Thank you for your contribution! Your PR has been merged.

Check out what's new:

Stay connected: Slack #kubestellar-dev | Multi-Cluster Survey

@github-actions
Copy link
Copy Markdown
Contributor

Post-merge build verification passed

Both Go and frontend builds compiled successfully against merge commit b828c874c7ad0b4cd149aa63ec3228407dbaf2d5.

@github-actions
Copy link
Copy Markdown
Contributor

❌ Post-Merge Verification: failed

Commit: b828c874c7ad0b4cd149aa63ec3228407dbaf2d5
Specs run: smoke.spec.ts
Report: https://github.com/kubestellar/console/actions/runs/24464739949

clubanderson added a commit that referenced this pull request Apr 15, 2026
Fixes #8175

The `builds wss: URL when page uses https` test in useExecSession-expand
was asserting the pre-phase3d URL shape — `wss://<page-host>/ws/exec` —
built from `window.location.protocol`. After #8168 migrated pod exec to
kc-agent (#7993), the hook uses the `LOCAL_AGENT_WS_URL` constant
(`ws://127.0.0.1:8585/ws`) unconditionally, because the SPDY exec stream
now runs under the user's own kubeconfig on the local loopback rather
than through the page origin. The page protocol no longer influences the
URL, so the old assertion has been stale since #8168 merged.

This is a pre-existing flake that Coverage Suite run #760 happened to
surface on top of b828c87 (#8169). #8169 only touched
useDashboardHealth, MSW handlers, and useDashboardHealth.test.ts — none
of which are related to exec session URL construction. Confirmed by
running the test in isolation and together with useDashboardHealth.test.ts
(both pass after the fix; isolated failure reproduces on unmodified
main against useExecSession.ts:235).

Fix: rename the test to "builds kc-agent /ws/exec URL regardless of
page protocol" and assert the new, stable target. Documented the
rationale inline with source pointer to useExecSession.ts:235 so future
readers don't reintroduce the old assertion.

Signed-off-by: Andy Anderson <[email protected]>
clubanderson added a commit that referenced this pull request Apr 15, 2026
Fixes #8175

The `builds wss: URL when page uses https` test in useExecSession-expand
was asserting the pre-phase3d URL shape — `wss://<page-host>/ws/exec` —
built from `window.location.protocol`. After #8168 migrated pod exec to
kc-agent (#7993), the hook uses the `LOCAL_AGENT_WS_URL` constant
(`ws://127.0.0.1:8585/ws`) unconditionally, because the SPDY exec stream
now runs under the user's own kubeconfig on the local loopback rather
than through the page origin. The page protocol no longer influences the
URL, so the old assertion has been stale since #8168 merged.

This is a pre-existing flake that Coverage Suite run #760 happened to
surface on top of b828c87 (#8169). #8169 only touched
useDashboardHealth, MSW handlers, and useDashboardHealth.test.ts — none
of which are related to exec session URL construction. Confirmed by
running the test in isolation and together with useDashboardHealth.test.ts
(both pass after the fix; isolated failure reproduces on unmodified
main against useExecSession.ts:235).

Fix: rename the test to "builds kc-agent /ws/exec URL regardless of
page protocol" and assert the new, stable target. Documented the
rationale inline with source pointer to useExecSession.ts:235 so future
readers don't reintroduce the old assertion.

Signed-off-by: Andy Anderson <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ai-generated Pull request generated by AI dco-signoff: yes Indicates the PR's author has signed the DCO. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

bug: Backend API 503 failures not reflected in dashboard health status

2 participants