classifyFailures spawns unbounded goroutines with no semaphore for concurrent GitHub API calls #7056

@aashu2006

Description

User Request

Type: bug
Target: Console Application
Submitted by: @aashu2006
Console Request ID: 728f6d23-bbd1-4ce5-8b0b-df66c6a7075d

Description

What happened:
In pkg/api/handlers/nightly_e2e.go:409-422, classifyFailures spawns one goroutine per failed workflow run with no semaphore or other concurrency limit. Each goroutine calls detectGPUFailure, which makes a GitHub API request with no per-call timeout. A burst of failures across 17+ workflows therefore fans out to dozens of concurrent GitHub API requests with no backpressure.

What I expected:
A semaphore (e.g. golang.org/x/sync/semaphore) should limit concurrent detectGPUFailure calls to a reasonable maximum.

Steps to reproduce:

  1. Have 20+ workflow runs fail simultaneously
  2. Observe 20+ concurrent GitHub API requests fired with no backpressure

This issue was automatically created from the KubeStellar Console.

Metadata

Assignees

No one assigned

    Labels

    ai-fix-requested
    ai-processing: AI is currently processing this issue
    kind/bug: Categorizes issue or PR as related to a bug.
    triage/accepted: Indicates an issue or PR is ready to be actively worked on.
    triage/needed: Needs triage
