Failure to deploy Self-Hosted runners - High Queue Times and canceled jobs #140958

# PyTorch CI exceeding our GitHub API rate limit

## Current Status

*Status: Resolved*

## Error looks like

* GitHub self-hosted runners are terminated unexpectedly.
* GitHub API rate limit is reached.

## Incident timeline (all times Pacific)

* Began around Friday, Nov 15th @ 9 am
* GitHub was notified Monday, Nov 18th @ 8:36 am
* GitHub resolved the issue Monday, Nov 18th @ 1:45 pm

## User impact

GitHub self-hosted runners intermittently terminated mid-job, leading to high queue times and canceled jobs.

## Root cause

A bug was introduced into GitHub's repository-level list runners API that caused pagination logic to be applied twice, so no results were returned beyond the first page. The change containing the bug was intended to be fully feature-flagged and disabled, but the offending logic was accidentally added outside the feature flag block and was missed in review. The API also lacked a test case for pagination.
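
To illustrate why applying pagination twice hides everything past the first page, here is a minimal sketch. It is not GitHub's code: `paginate`, `list_runners_buggy`, `list_runners_fixed`, and `fetch_all` are hypothetical names, and the in-memory list stands in for the real runner store.

```python
def paginate(items, page, per_page):
    """Return the slice of items for a 1-indexed page."""
    start = (page - 1) * per_page
    return items[start:start + per_page]


def list_runners_buggy(runners, page, per_page):
    # Hypothetical reproduction: the page is sliced out once, and then
    # the same page offset is applied again to the already-sliced result.
    once = paginate(runners, page, per_page)
    return paginate(once, page, per_page)


def list_runners_fixed(runners, page, per_page):
    # Pagination applied exactly once.
    return paginate(runners, page, per_page)


def fetch_all(list_fn, runners, per_page):
    """Walk pages from 1 upward until an empty page comes back."""
    collected = []
    page = 1
    while True:
        batch = list_fn(runners, page, per_page)
        if not batch:
            break
        collected.extend(batch)
        page += 1
    return collected
```

With 250 runners and 100 per page, page 1 survives the second slice unchanged, but page 2 is a 100-item list that is then sliced starting at offset 100, which yields nothing. A client walking the pages therefore sees only the first 100 runners. A pagination test that requests more than one page, like the missing test case mentioned above, would catch this.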

## Mitigation

Split infra load between Meta fleet and LF fleet to spread the API usage across both accounts.

## Prevention/followups

This issue was caused by GitHub rolling out changes to their API. We have monitoring in place that can help us troubleshoot this issue and escalate if it happens again.
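
As one illustration of what such monitoring could check, the sketch below reads the rate limit headers that the GitHub REST API returns on responses (`x-ratelimit-limit`, `x-ratelimit-remaining`, `x-ratelimit-reset`) and flags when the remaining budget is low. The function name `rate_limit_status` and the 10% threshold are assumptions made for this example, not a description of the monitoring PyTorch actually runs.

```python
def rate_limit_status(headers, warn_fraction=0.1):
    """Summarize GitHub rate limit headers from one API response.

    `headers` is a mapping of response header names to string values.
    `warn_fraction` is a hypothetical alert threshold: alert when the
    remaining budget is at or below this fraction of the limit.
    """
    # Header names are case-insensitive, so normalize before lookup.
    normalized = {key.lower(): value for key, value in headers.items()}
    limit = int(normalized["x-ratelimit-limit"])
    remaining = int(normalized["x-ratelimit-remaining"])
    reset_epoch = int(normalized["x-ratelimit-reset"])  # UTC epoch seconds
    return {
        "limit": limit,
        "remaining": remaining,
        "reset_epoch": reset_epoch,
        "should_alert": remaining <= limit * warn_fraction,
    }
```

A check like this could run against each account's responses, so that a sudden drop in remaining budget on one fleet is visible before jobs start failing.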

