refactor: support multiple clients in Poller with independent fetch intervals #1311
Adds infrastructure support in the runner for fetching tasks from multiple different Forgejo instances, extending `Poller` to support an array of `Client` interfaces. Each instance can be polled with a separate fetch interval, but otherwise all jobs are executed in the same manner with a common pool of capacity. No user-facing functionality is included in this PR to exploit this capability.
@aahlenst I'm not sure if you've proceeded with any work on multiple runners after we discussed a potential CLI and config file interface for them. I thought I'd take a stab at the backend portion of polling the runners, and we could coordinate next steps from here assuming it isn't stepping on any work you've already started.
@mfenniak wrote in #1311 (comment):
I am working on it. The results so far: a couple of tests, some refactoring, and the realization that my current plan is not good. I'll post my current thinking in the associated feature request soon-ish. I'll also look at this PR in detail.
@ -74,6 +75,7 @@ func ping(cfg *config.Config, reg *config.Registration) error {
> "", "", ver.Version(), time.Second,

I'd prefer a constant with a descriptive name, like `DefaultFetchInterval`. Is `cfg.Runner.FetchInterval` not an option here?

Sure, `cfg.Runner.FetchInterval` can be used here. The value provided doesn't matter today, as it's not actually used in `ping`, but there's no reason we can't provide the right config value.

@ -147,8 +168,8 @@ func (p *poller) fetchTasks(ctx context.Context, availableCapacity int64) ([]*ru
> defer cancel()
> // Load the version value that was in the cache when the request was sent.

I'm unable to figure out what cache the comment is talking about.

Not new in this PR, but I've removed it.
@ -100,2 +88,2 @@
> }
> }
> for i := range p.clients {
>     wg.Go(func() {

If I understand the code correctly, `pollForClient()` is called concurrently for each client. To prevent the invocations from overloading the runner, fetching tasks is protected by a mutex. The remainder of `pollForClient()` only spawns a goroutine for each job, which seems to be cheap. So while it's implemented correctly, it seems overly complicated, because the expensive `fetchTasks()` is in the critical section. Doing it sequentially would be sufficient and make testing much easier.

I think this analysis misses the implementation of the fetch interval. Most of the time in this routine is actually spent on `limiter.Wait(p.pollingCtx)`, which needs to occur independently for each client in order to support separate fetch intervals for each client. A sequential approach, rather than a concurrent one, would be possible if we could "select" against a list of rate limits and suspend the process until the next rate limit expires, but that isn't possible with the rate limiting library that is in use.

Ah. I missed that the rate limiter is used for waiting until the interval has elapsed. Then it makes sense.
Right now, there's only one global setting. To have separate fetch intervals for each client, we would have to add it to each connection, which would make the connection configuration even more awkward.

Yes, there's only one global setting. But even if we maintain only one global setting, the runner has logic to automatically "tune" the fetch interval on Codeberg. (This is an ugly hack, but I don't want to be responsible for removing it. 😉)

```go
func (c *Config) Tune(instanceURL string) {
	if instanceURL == "https://codeberg.org" {
		if c.Runner.FetchInterval < 30*time.Second {
			log.Info("The runner is configured to be used by a public instance, fetch interval is set to 30 seconds.")
			c.Runner.FetchInterval = 30 * time.Second
		}
	}
}
```

I think a likely place we'd end up is a global config, but any Codeberg clients are automatically overridden to a minimum of 30 seconds.
@ -108,6 +98,37 @@ func (p *poller) Poll() {
> close(p.done)
> }
> func (p *poller) pollForClient(limiter *rate.Limiter, client client.Client, capacity int64, fetchMutex chan any, taskVersions, inProgressTasks *atomic.Int64, wg *sync.WaitGroup) {

I'm wondering whether some of the arguments should be struct fields, because they are part of the poller's internal state or derived from it: `capacity`, `fetchMutex`, `inProgressTasks`, and `wg`.

This is tricky to change in a way that won't trigger the Go data race detector -- we can't read and mutate the `poller` struct from multiple different goroutines. Using local storage in the `Poll` method that is referenced in this way avoids that risk. I think it could be a worthwhile change, but it should be isolated from this PR, so that here we're adding functionality, and there we're refactoring with no intent of functional change.

@ -488,0 +528,4 @@
> })
> // invocations of `fetchTasks` are rate limited per configuration
> t.Run("fetchTasks rate limited separate intervals", func(t *testing.T) {

Sounds like the comment should be the test name 😁