feat: use git work trees for remote git actions & workflows #1162
No reviewers
Labels
No labels
FreeBSD
Kind/Breaking
Kind/Bug
Kind/Chore
Kind/DependencyUpdate
Kind/Documentation
Kind/Enhancement
Kind/Feature
Kind/Security
Kind/Testing
Priority
Critical
Priority
High
Priority
Low
Priority
Medium
Reviewed
Confirmed
Reviewed
Duplicate
Reviewed
Invalid
Reviewed
Won't Fix
Status
Abandoned
Status
Blocked
Status
Need More Info
Windows
linux-powerpc64le
linux-riscv64
linux-s390x
run-end-to-end-tests
run-forgejo-tests
No milestone
No project
No assignees
7 participants
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
forgejo/runner!1162
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "mfenniak/forgejo-runner:feat-worktrees"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Breaking: forgejo-runner now requires access to
giton its PATH in order to create efficient work trees for execution of remote git actions or workflows, such asuses: actions/checkout.When you use a remote step action or a remote composite action, forgejo-runner will perform a clone of the remote repo into a temporary directory. For steps, the temporary directory will be named with a SHA of the
uses: ...step text; for reusable workflows, it will be named with a combination of remote org,repo@ref. These are stored in~/.cache/act(or theworkflow_parent, if overridden in the config file).This design has these problems:
uses: ...changes, an independent clone is performed as it causes a different unique directory name. For example, when changing fromuses: actions/checkout@v4to...v5, the existing cached repo that containsv4is not reused in any way; a new cache is made for@v5.git reset --HARDbetween uses, but non-tracked files could be left in the directory between runs polluting from one run to another; there's nogit clean -fxdperformed.forgejo-runnerat once:cloneLock sync.Mutexvar executorLock sync.Mutexfunc newMutexExecutor(executor common.Executor) common.Executor {return func(ctx context.Context) error {executorLock.Lock()defer executorLock.Unlock()return executor(ctx)}}git cloneof a remote, even if they aren't relevant to the reference being fetched. (To be fixed in the future based upon this change.)To address these issues, this PR proposes this design update:
handed over to git, since it is capable of managing multiple fetches and work tree creations from a single bare repotransitioned to a finer grainedMutexMap, allowing higher concurrency but not taking any risks of go-git's unclear concurrency guarantees.This will address the problems described above. It does have a negative side-effect -- each running remote action will require creating a new work tree, increasing on-disk I/O where previously these might have been in-place already. However, this is a trade-off against reducing total disk I/O and usage by sharing a single repo over multiple references to that repo; so long-term usage will likely be lower.
WIP: the core work is done and ready to be put through the test battery to see if any unexpected problems occur, which is why I opened this PR. TODO:
Close()'d~/.cache/actcurrently -- should be stripped1ded1d26668a6c7741b98a6c7741b903aab6a21103aab6a2111fd5c5eb131fd5c5eb138202b3acc18202b3acc1cffdf9d28ccffdf9d28cbf3e51bef0FYI there are no errors server side for the job running the CI on this PR.
@ -148,3 +149,2 @@run: |go test -race ./act/containergo test -race -timeout 30m ./act/runner/...go test -race -v ./act/container 2>&1 > tests.logor
because
&1is not yettests.logThanks for pointing that out... could be why I'm not getting test errors. 🤔
ae878598025ab8171b855ab8171b852028bf61fecascading-pr updated at actions/setup-forgejo#765
WIP: feat: use git work trees for remote git actions & workflowsto feat: use git work trees for remote git actions & workflows2028bf61fedd8e7427cbcascading-pr updated at actions/setup-forgejo#765
git clonefor remote git actions & workflows, don't immediatelygit fetchrepeating same work #1164@ -382,0 +338,4 @@r, err := cloneIfRequired(ctx, refName, input, logger, worktreeDir)if err != nil {return nil, err}s/may be reused optimize future clone/may be reused by future clone/
Thanks, fixed.
dd8e7427cb..40f61a4621@ -382,0 +410,4 @@if hash.String() != input.Ref && len(input.Ref) >= 4 && strings.HasPrefix(hash.String(), input.Ref) {return nil, &Error{err: ErrShortRef,commit: hash.String(),this adds a dependency to the runner
Yes, this is a good point that warrants evaluation. As go-git doesn't support the
worktreecapability (github.com/go-git/go-git@0fd785bbcd/COMPATIBILITY.md (advanced)), shelling out to git is required for this capability.We could manage this in a few ways:
I'd propose to address this in general by adding a prominent release note when this feature is released. It could be argued as a breaking change requiring a major version bump, but considering forgejo-runner's OCI containers already contain the dependency, and it's just git which is widely available, I'd lean towards a prominent release note only.
Same in terms of just documenting it, but breaking version change.
Detect the availability of
gitand use the new capability. In its absence, print a warning message and retain the old approach of using dedicated full-clone repos.Use go-git's capabilities to read the repo and manually create a temporary working directory that contains the target revision's contents. The risk with this approach would be that the new working directory won't be a git repo, but just a bare directory -- if a remote action does something that inspects itself with git tooling, it would fail. It's also tricky to do this right; file perms, symlinks, other complications.
Do 4, but only in the absence of
gitcapable of doing work trees, providing a way to remediate the "action isn't a git repo" problem if it is identified.I'm leaning towards 1 or 2 but I am clearly biased by "it's done", so I appreciate any input and thoughts.
2 seems best: it will break any installation that does not have
gitin the PATH.In favor of adding git as a dependency:
OK, I'll mark this as breaking and ensure a major version bump.
s/[^\w]/-/gwould be saferwhat if the resulting file name is longer than the maximum? wouldn't it be safer to use a checksum?
Yes, I think this is a good idea. I somewhat liked having readable folder names to identify the repos... was useful for manual testing at least... but I've changed it to a hash following the same pattern as the worktree directories.
dd8e7427cb..40f61a4621dd8e7427cb40f61a4621cascading-pr updated at actions/setup-forgejo#765
Ran across https://github.com/go-git/go-git/issues/773 while answering that last comment and it has me concerned that the Mutex removal might not be as safe as it would be if we were using raw git subcommands. I'm going to incorporate the
MutexMapthat I recently added to Forgejo (in https://codeberg.org/forgejo/forgejo/pulls/10127) which can provide fine-grained locking, providing safety without as much performance impact.40f61a46217716bb0ecccascading-pr updated at actions/setup-forgejo#765
7716bb0ecc5c8dcb38e0cascading-pr updated at actions/setup-forgejo#765
@mfenniak wrote in #1162 (comment):
it is one example advocating for replacing one usage of go-git with a git command.
https://forgejo.org/docs/latest/admin/actions/runner-installation/ should also be updated.
Hmm, this causes problems with actions written with nix, such as https://github.com/ajamtli/update-flake-inputs-forgejo - the checkout copied into /var/run/act in the runner environment/namespace has a .git file pointing to the original clone in the runner's cache directory (/var/lib/runner/name/.cache on my system), but the latter directory is not actually accessible. Any ideas for a workaround? And where is the best place to discuss this? :)
@srd424 I think it would be best to open a new issue, ideally with a reproducer that we can run ourselves.
@srd424 please open a new bug issue and add a minimal reproduction
Will do - it's arguably not a bug so much as a new incompatibility, but agree a new issue is probably the best place!
Done in #1369 - I can add more info if required, just shout.