Summary
Vigilante reduced the worst redundant PR polling, but the daemon still wastes GitHub REST calls in two important ways: deleted issues are still being re-polled every scan instead of being retired from monitoring, and active sessions still re-fetch the same issue details multiple times within a single scan. This follow-up should make unavailable-issue detection work in production and remove avoidable same-scan issue-detail duplication without introducing a general caching mechanism.
Problem
- The daemon still repeatedly polls sessions whose backing GitHub issues are gone. Recent access-log samples continue to show repeated failed calls for the same deleted issues, including aliengiraffe/vigilante issues 106, 122, 204, 266, and 276.
- The source code now attempts to stop monitoring unavailable issues on 404/410 responses, but the live daemon behavior shows that this path is not actually firing for those sessions.
- Active sessions still fetch issue details multiple times in the same scan. Recent logs still show repeated gh api repos/.../issues/... calls for live sessions like aliengiraffe/vigilante#289 and aliengiraffe/env-manager#215 during the same daemon pass.
- These two remaining behaviors still dominate GitHub traffic after the previous polling reduction work and can continue pushing Vigilante toward REST quota exhaustion.
Context
- Repository: aliengiraffe/vigilante.
- The current source already includes unavailable-issue handling in the daemon flow via ghcli.IsIssueUnavailableError(...) and stopMonitoringUnavailableIssueSession(...).
- The likely mismatch is that the environment runner returns stderr content in command output while wrapping the error separately as exit status 1, so the unavailable-issue detector may be checking the wrong string and therefore never recognizing the 404/410 condition in production.
- The current daemon flow still loads issue details from multiple places in a single scan, including success-session cleanup/finalization, blocked-session resume checks, maintenance-time issue access, and issue-label synchronization.
- Do not solve this with a generalized cache layer, HTTP conditional requests, or ETag support. Keep the fix explicit and local to the daemon/session-management flow.
Desired Outcome
- Sessions whose backing GitHub issues are confirmed unavailable are transitioned out of active monitoring in real daemon runs, not just in theory.
- Repeated scans no longer spend the majority of GitHub traffic re-probing the same deleted issues.
- A single daemon scan does not fetch the same issue details multiple times for one session when the data is already available earlier in that scan.
- The fix preserves current behavior for label sync, session cleanup, blocked-session handling, and PR maintenance while materially reducing GitHub REST usage.
- Out of scope: a generalized cache abstraction, ETag or If-None-Match request handling, or a webhook/event-driven redesign.
Implementation Notes
- Fix unavailable-issue detection against the actual runtime error shape emitted by the command runner. If the HTTP 404/410 signal is present in command output rather than err.Error(), make the detector consume the correct source of truth.
- Audit all daemon paths that call GetIssueDetails(...) and ensure deleted/unavailable issue handling converges the session into a finalized non-polled state from every relevant entry point.
- Reduce same-scan issue-detail duplication by threading already-loaded issue details through downstream logic where practical, especially across maintenance and label-sync paths.
- Keep the implementation targeted. The goal is not to add a broad caching subsystem, but to stop reloading data that the daemon already has in-hand during the current scan.
- If access-log coverage still misses important GitHub polling paths, improve observability enough to validate the fix after rollout.
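One way to read the "thread already-loaded issue details" note is sketched below in Go. This is only an illustration under stated assumptions: IssueDetails, issueFetcher, session, runMaintenance, syncLabels, and scanSession are hypothetical names, and the real daemon signatures may differ. The point is that the scan fetches once and passes the value down, with no cache involved.

```go
package main

import "fmt"

// IssueDetails is a hypothetical, trimmed-down view of a GitHub issue.
type IssueDetails struct {
	Number int
	State  string
	Labels []string
}

// issueFetcher abstracts the issue lookup so calls can be counted.
type issueFetcher interface {
	GetIssueDetails(repo string, number int) (*IssueDetails, error)
}

// countingFetcher is a fake that records how many lookups were made.
type countingFetcher struct{ calls int }

func (f *countingFetcher) GetIssueDetails(repo string, number int) (*IssueDetails, error) {
	f.calls++
	return &IssueDetails{Number: number, State: "open", Labels: []string{"vigilante"}}, nil
}

// session is a hypothetical monitored session.
type session struct {
	Repo  string
	Issue int
}

// runMaintenance accepts the already-loaded details instead of fetching.
func runMaintenance(s session, issue *IssueDetails) string {
	return fmt.Sprintf("%s#%d is %s", s.Repo, s.Issue, issue.State)
}

// syncLabels also reuses the details loaded earlier in the scan.
func syncLabels(s session, issue *IssueDetails) int {
	return len(issue.Labels)
}

// scanSession loads issue details once and threads them downstream.
func scanSession(f issueFetcher, s session) error {
	issue, err := f.GetIssueDetails(s.Repo, s.Issue)
	if err != nil {
		return err
	}
	_ = runMaintenance(s, issue)
	_ = syncLabels(s, issue)
	return nil
}

func main() {
	f := &countingFetcher{}
	s := session{Repo: "aliengiraffe/vigilante", Issue: 289}
	if err := scanSession(f, s); err != nil {
		fmt.Println("scan failed:", err)
		return
	}
	fmt.Println("GetIssueDetails calls:", f.calls) // GetIssueDetails calls: 1
}
```

Passing the value explicitly keeps its lifetime bound to one scan of one session, so there is no invalidation logic to maintain and nothing that resembles a general caching layer.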
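Acceptance Criteria
- When gh reports an issue as unavailable in a real daemon run, Vigilante recognizes that condition and stops actively monitoring that session instead of retrying the same issue lookup every scan.
- Sessions for deleted issues no longer fill access.jsonl with repeated failed gh api repos/.../issues/... calls across successive scans.
- The persisted state in sessions.json makes it clear that monitoring stopped because the issue became unavailable, rather than leaving the session looking actively blocked forever.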
Testing Expectations
- Add regression tests covering the production-shaped unavailable-issue failure path, including the case where gh returns a generic command error while the HTTP 404/410 signal is only visible in command output.
- Add tests proving deleted/unavailable sessions are transitioned out of active polling and are not retried on subsequent scans.
- Add tests covering same-scan issue-detail reuse so maintenance and label-sync logic do not redundantly call GetIssueDetails(...) for one session in one daemon pass.
- Add or update tests for blocked sessions, successful sessions, and label synchronization to confirm the reduced call pattern does not change user-visible behavior incorrectly.
- Add verification around access-log or daemon-log output when needed so the fix can be validated operationally.
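The "not retried on subsequent scans" expectation could be exercised with a fake along these lines. This Go sketch is self-contained and purely illustrative: fakeGitHub, session, scan, errIssueUnavailable, and the status strings are all hypothetical and do not reflect the real session schema or daemon entry points.

```go
package main

import (
	"errors"
	"fmt"
)

// errIssueUnavailable is a hypothetical sentinel for a confirmed 404/410.
var errIssueUnavailable = errors.New("issue unavailable")

// session is a hypothetical persisted session record.
type session struct {
	Issue  int
	Status string // "active" or "issue_unavailable" (assumed values)
}

// fakeGitHub counts lookups and reports every issue in gone as unavailable.
type fakeGitHub struct {
	gone  map[int]bool
	calls map[int]int
}

func (g *fakeGitHub) lookup(issue int) error {
	g.calls[issue]++
	if g.gone[issue] {
		return errIssueUnavailable
	}
	return nil
}

// scan polls only active sessions and retires those whose issue is gone.
func scan(g *fakeGitHub, sessions []*session) {
	for _, s := range sessions {
		if s.Status != "active" {
			continue
		}
		if err := g.lookup(s.Issue); errors.Is(err, errIssueUnavailable) {
			s.Status = "issue_unavailable"
		}
	}
}

func main() {
	g := &fakeGitHub{gone: map[int]bool{106: true}, calls: map[int]int{}}
	sessions := []*session{
		{Issue: 106, Status: "active"},
		{Issue: 289, Status: "active"},
	}
	scan(g, sessions)
	scan(g, sessions)
	fmt.Println(g.calls[106], g.calls[289]) // 1 2
	fmt.Println(sessions[0].Status)         // issue_unavailable
}
```

A real regression test would assert the same two properties against the daemon's own scan function: the deleted issue is looked up exactly once across repeated scans, and the live session keeps being polled.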
Operational / UX Considerations
- Operators should be able to trust that a session repeatedly shown as blocked is still actionable, not just a deleted GitHub issue that Vigilante failed to retire.
- When monitoring stops because the issue is unavailable, that state should remain legible in persisted session data and status output.
- GitHub-usage diagnostics should remain straightforward after the fix; if some polling paths bypass the access log today, tighten that enough to make future regressions easier to catch.