Cache and throttle per-issue GitHub detail lookups in scan and maintenance paths #381

@nicobistolfi

Summary

Reduce repeated per-issue GitHub detail lookups during scans in two ways: skip gh api repos/.../issues/<n> unless a session actually needs fresh issue details, and cache issue-detail results under a controlled refresh strategy. The current scan path fetches the same issue details again on every scan, which materially contributes to GitHub core rate-limit exhaustion.

Problem

  • Vigilante currently performs repeated per-issue GitHub detail lookups for tracked sessions during scan and maintenance workflows.
  • Access-log analysis showed repeated calls such as gh api repos/aliengiraffe/vigilante/issues/357 through .../378 on nearly every scan cycle, even when the underlying issue state was unlikely to have changed.
  • This repeated traffic contributes to exhausting GitHub REST core quota and delays or blocks higher-value operations such as iteration comment handling and normal issue dispatch.

Context

  • Issue detail loading is centralized through loadIssueDetailsForScan(...) in internal/app/app.go, which delegates to IssueTracker.GetWorkItemDetails(...).
  • The current GitHub backend implementation for issue details uses gh api repos/{owner}/{repo}/issues/{n}.
  • Recent access-log analysis showed scan-heavy repetition of per-issue detail calls for open tracked sessions, especially in repositories with many active Vigilante-managed issues.
  • Some paths do need fresh issue details, for example when labels, assignees, issue state, or iteration eligibility must be evaluated safely. This issue should preserve those correctness requirements while eliminating avoidable repeated lookups.
  • The requested refresh policy is intentionally flexible: time-based refresh, event-driven refresh after @vigilanteai comments, and other safe triggers are all acceptable, provided they meaningfully reduce GitHub API usage.

Desired Outcome

  • Vigilante does not call gh api repos/.../issues/<n> on every scan for every tracked session by default.
  • Issue details are reused from cache when no fresh fetch is needed for correctness.
  • Fresh issue-detail fetches still happen when required, using a defined refresh policy such as cache TTL, explicit invalidation, or comment-driven refresh triggers.
  • Iteration, resume, cleanup, label-sync, stale-session recovery, and closed-issue detection continue to behave correctly after the optimization.
  • The change stays focused on issue-detail fetches rather than broad caching of unrelated GitHub API surfaces.

Implementation Notes

  • Prefer a dedicated issue-detail cache boundary around loadIssueDetailsForScan(...) or the GitHub issue-tracker implementation, so the optimization is applied consistently across scan call sites.
  • The cache may be in-memory, persisted, or hybrid, but the chosen approach must materially reduce repeated API calls during normal daemon operation.
  • Refresh policy is flexible, but it must be explicit and testable. Acceptable strategies include:
    • time-based expiry with a conservative TTL
    • invalidation when Vigilante observes a relevant @vigilanteai issue comment
    • invalidation when local session state changes indicate that labels, assignees, or issue state may matter immediately
    • a combination of the above
  • The implementation must not rely exclusively on issue comments for freshness because some correctness-sensitive changes, such as manual issue closure or label edits, may happen without a comment.
  • Preserve existing semantics for unavailable issues, closed issues, assignee checks, and label-based routing.
  • Keep the design compatible with the existing scanIssueDetailsCache concept if that materially reduces duplication, but extend it beyond a single scan cycle if needed.

Acceptance Criteria

  • Repeated daemon scans do not issue gh api repos/.../issues/<n> for every tracked session when no fresh issue details are needed.
  • Issue-detail fetches are cached and refreshed using an explicit policy that is documented in code and exercised by tests.
  • A relevant freshness trigger, such as TTL expiry or a newly observed @vigilanteai comment, causes issue details to refresh when needed.
  • Vigilante still correctly detects issue closure, assignee-sensitive iteration behavior, and label/state changes that materially affect routing or maintenance.
  • The optimization measurably reduces GitHub REST core traffic in the scan path without breaking current issue-management behavior.

Testing Expectations

  • Add or update tests around scan behavior so repeated scans reuse cached issue details rather than refetching every time.
  • Add or update tests for the chosen refresh policy, including at least one case where cached issue details must be refreshed and one case where reuse is expected.
  • Cover regressions around issue closure, unavailable issues, iteration-assignee validation, and label-sensitive routing so stale cache behavior does not silently change semantics.
  • If persisted or cross-scan cache state is introduced, add restart or reload coverage to prove the optimization survives more than one in-memory scan loop.

Operational / UX Considerations

  • Keep access-log visibility sufficient to verify whether issue details were served from cache or refreshed from GitHub.
  • Avoid broadening this issue into a general-purpose GitHub response cache; the scope is limited to repeated issue-detail lookups in the scan and maintenance path.
  • If a TTL-based approach is chosen, keep the window conservative enough that human-driven issue changes are still observed promptly.
  • If state shape changes are needed for persisted caching, keep them backward-compatible with existing local Vigilante state.
