Conversation
We had a few "runaway jobs" recently, where the job got stuck, and kept running for 6 hours (in one case even 24 hours, probably due some github outage). Some of those jobs could not be terminated. While running these actions on public repositories doesn't cost us, it's still not desirable to have jobs running for that long (as they can still hold up the queue). This patch adds a blanket "2 hours" time-limit to all jobs that didn't have a limit set. We should look at tweaking those limits to actually expected duration, but having a default at least is a start. Also changed the position of some existing timeouts so that we have a consistent order in which it's set; making it easier to spot locations where no limit is defined. Signed-off-by: Sebastiaan van Stijn <[email protected]> (cherry picked from commit 6b7e278) Signed-off-by: Austin Vazquez <[email protected]>
We had a couple of runs where these jobs got stuck and GitHub Actions didn't allow terminating them, so they were only terminated after 120 minutes. These jobs usually complete in 5 minutes, so let's give them a shorter timeout; 20 minutes should be enough (don't @ me).

Signed-off-by: Sebastiaan van Stijn <[email protected]>
(cherry picked from commit c68c9ae)
Signed-off-by: Austin Vazquez <[email protected]>
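The same key just gets a tighter value for these quick jobs; again a hypothetical sketch, not the actual workflow:

```yaml
jobs:
  validate:
    runs-on: ubuntu-latest
    timeout-minutes: 20  # typically finishes in ~5 minutes; the 120-minute default was far too generous
    steps:
      - uses: actions/checkout@v4
      - run: make validate
```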
I was thinking it would be good to have these in the maintenance branches.
Yes, it is! I thought it wasn't critical, but it's definitely good to have. And ... evidently we need to be even more aggressive; this PR had one of those stuck jobs itself.

Running it again, it completed in less than 4 minutes.

Not sure what the cause of these is, though; they started to show up more recently. Either something changed in the GHA runners, or there's a deadlock somewhere (but I don't think the Docker engine versions changed in GHA).

