Skip to content

fix(generic-worker): avoid reclaim creds deadlock#8290

Merged
matt-boris merged 1 commit intomainfrom
matt-boris/fixReclaimDeadlock
Feb 19, 2026
Merged

fix(generic-worker): avoid reclaim creds deadlock#8290
matt-boris merged 1 commit intomainfrom
matt-boris/fixReclaimDeadlock

Conversation

@matt-boris
Copy link
Contributor

Fixes #8289.

Generic Worker: fixes credential expiration during high-volume artifact uploads by narrowing the scope of the queue client lock so that credential refresh is no longer blocked by in-flight HTTP calls.

taskMount.Infof("Downloading %v to %v", ac, file)

var runID int64 = -1 // use the latest run
_, contentLength, err := taskMount.task.Queue.DownloadArtifactToFile(ac.TaskID, runID, ac.Artifact, file)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was previously a data race with the reclaim goroutine that writes taskMount.task.Queue.

return ResourceUnavailable(e)
}

e = artifact.FinishArtifact(resp, task.Queue, task.TaskID, strconv.Itoa(int(task.RunID)), artifact.Base().Name)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was previously a data race with the reclaim goroutine that writes task.Queue.

@matt-boris matt-boris marked this pull request as ready for review February 18, 2026 20:27
@matt-boris matt-boris requested a review from a team as a code owner February 18, 2026 20:27
@matt-boris matt-boris requested review from lotas and petemoore and removed request for a team February 18, 2026 20:27
@matt-boris matt-boris force-pushed the matt-boris/fixReclaimDeadlock branch 2 times, most recently from 429959c to 578120b Compare February 18, 2026 22:29
@matt-boris
Copy link
Contributor Author

matt-boris commented Feb 18, 2026

It's possible we can bump up the concurrent artifact upload goroutines after this lands too, as my previous thoughts look to have been incorrect. I'd like to see how this performs, though first.

Relevant previous PR: #8167.

par := tcqueue.PostArtifactRequest(json.RawMessage(payload))
task.queueMux.RLock()
parsp, err := task.Queue.CreateArtifact(
queue := task.Queue
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there any benefit of introducing this variable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! I'm quickly grabbing the reference to the queue client while holding the lock, then immediately releasing the lock right afterwards. As opposed to holding the lock throughout the entire API call.

Copy link
Contributor

@lotas lotas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but I think Pete should have a closer look :)

@matt-boris matt-boris merged commit 998cc32 into main Feb 19, 2026
85 checks passed
@matt-boris matt-boris deleted the matt-boris/fixReclaimDeadlock branch February 19, 2026 13:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[generic worker] credential refresh blocked by queueMux during artifact uploads

2 participants