Description
Parallel runs of linuxkit pkg build can trounce each other. Specifically, they can corrupt the linuxkit cache.
The process of saving both blobs and index.json assumes that only one process is accessing the cache at a time. If you run two parallel builds, they might corrupt index.json as they both try to write it.
They also might corrupt actual blob files, but that is less likely.
This issue also documents all of the places that we do writes to the cache, so that we can clean it up.
Everything is in src/cmd/linuxkit/cache.
Very few files actually write to the cache, and we don't care quite as much about parallel reads, which are easy.
The 3 files that actually write to the cache are:
We will distinguish between those that write blobs to blobs/sha256/ and those that write to index.json.
In all of the listed examples, cache, as in "calling cache.Func", refers to the external library that actually writes to the on-disk cache.
Blobs
We are not concerned for now about writing blobs, for several reasons:
- To cause an issue, we would need two or more parallel processes writing the exact same blob, which is very unlikely.
- Fixing it requires an approach like how containerd does it, with an ingest/ dir and giving unique filenames to downloads, e.g. <digest>.<id>, and then doing a single, quick, atomic mv.
- We rely entirely on the underlying ggcr library, which likely does not support what we are trying to do.
If it becomes an issue in the future, we can deal with it then.
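For reference, the containerd-style approach mentioned above could look roughly like the following. This is only a sketch, not what ggcr or containerd actually do: the function name writeBlobAtomic and the exact directory names are assumptions made for illustration. The rename is atomic only when the ingest dir and the blobs dir are on the same filesystem.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"os"
	"path/filepath"
)

// writeBlobAtomic (hypothetical helper) stores data under
// <cacheDir>/blobs/sha256/<digest>. It first writes to a uniquely named
// file in <cacheDir>/ingest and then renames it into place, so readers
// never observe a partially written blob.
func writeBlobAtomic(cacheDir string, data []byte) (string, error) {
	sum := sha256.Sum256(data)
	digest := hex.EncodeToString(sum[:])

	ingest := filepath.Join(cacheDir, "ingest")
	blobs := filepath.Join(cacheDir, "blobs", "sha256")
	for _, d := range []string{ingest, blobs} {
		if err := os.MkdirAll(d, 0o755); err != nil {
			return "", err
		}
	}

	// "<digest>.*" gives every writer its own temp file, so two
	// processes downloading the same blob do not share a file.
	tmp, err := os.CreateTemp(ingest, digest+".*")
	if err != nil {
		return "", err
	}
	// Clean up the temp file on failure; after a successful rename this
	// fails harmlessly because the temp name no longer exists.
	defer os.Remove(tmp.Name())

	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return "", err
	}
	if err := tmp.Sync(); err != nil {
		tmp.Close()
		return "", err
	}
	if err := tmp.Close(); err != nil {
		return "", err
	}

	final := filepath.Join(blobs, digest)
	// Two writers of the same digest both rename identical content onto
	// the same name, so the last rename wins without changing the result.
	if err := os.Rename(tmp.Name(), final); err != nil {
		return "", err
	}
	return final, nil
}

func main() {
	dir, err := os.MkdirTemp("", "cache-sketch")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	defer os.RemoveAll(dir)

	p, err := writeBlobAtomic(dir, []byte("hello"))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("wrote", p)
}
```

Because the blob name is its own digest, the only thing parallel writers need to avoid is sharing a partially written file, which the unique ingest name takes care of.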
References to direct writing of blobs via GGCR:
- pull.go calls cache.WriteIndex
- write.go calls cache.WriteIndex
- write.go calls cache.WriteBlob
- write.go calls cache.WriteBlob
index.json
All of the references in this section call GGCR functions to modify the root index.json. Underlying it, GGCR does the following:
- Read the index.json into memory and convert to a struct
- Make the requested changes
- Convert the structure to json and write it out to index.json
These are all clean, self-contained calls. So if we can wrap each of them in a lock, make the call, unlock sequence, we should eliminate the issues with index.json.
I specifically ignore any that call other parts of this package, since they are wrapped by definition. Only the calls out to the external library matter.
- pull.go calls cache.ReplaceImage
- push.go calls cache.AppendDescriptor
- remove.go calls cache.RemoveDescriptors
- write.go calls cache.ReplaceImage
- write.go calls cache.RemoveDescriptors and then cache.AppendDescriptor
- write.go calls cache.RemoveDescriptors and then cache.AppendDescriptor
- write.go calls cache.RemoveDescriptors and then cache.AppendDescriptor
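To make the failure mode concrete, here is a small sketch of the read, modify, write sequence described above, with two unsynchronized writers replayed step by step so the outcome is deterministic. The index and descriptor types are simplified stand-ins, not GGCR's types, and readIndex, writeIndex and lostUpdate are hypothetical names. It shows a lost update; truly concurrent writes could also interleave and leave invalid JSON behind, which this sketch does not reproduce.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// descriptor and index are simplified stand-ins for the OCI index.json
// structures; only the fields needed for the illustration are present.
type descriptor struct {
	Digest string `json:"digest"`
}

type index struct {
	SchemaVersion int          `json:"schemaVersion"`
	Manifests     []descriptor `json:"manifests"`
}

// readIndex loads index.json into memory and converts it to a struct.
func readIndex(path string) (index, error) {
	var idx index
	b, err := os.ReadFile(path)
	if err != nil {
		return idx, err
	}
	err = json.Unmarshal(b, &idx)
	return idx, err
}

// writeIndex converts the struct to JSON and writes it out.
func writeIndex(path string, idx index) error {
	b, err := json.Marshal(idx)
	if err != nil {
		return err
	}
	return os.WriteFile(path, b, 0o644)
}

// lostUpdate replays the interleaving of two writers, A and B, that
// both read the index before either writes. It returns the digests
// left in the index afterwards.
func lostUpdate(path string) ([]string, error) {
	if err := writeIndex(path, index{SchemaVersion: 2}); err != nil {
		return nil, err
	}
	a, err := readIndex(path) // writer A reads
	if err != nil {
		return nil, err
	}
	b, err := readIndex(path) // writer B reads the same snapshot
	if err != nil {
		return nil, err
	}
	a.Manifests = append(a.Manifests, descriptor{Digest: "sha256:aaaa"})
	b.Manifests = append(b.Manifests, descriptor{Digest: "sha256:bbbb"})
	if err := writeIndex(path, a); err != nil { // A writes
		return nil, err
	}
	if err := writeIndex(path, b); err != nil { // B overwrites A's entry
		return nil, err
	}
	final, err := readIndex(path)
	if err != nil {
		return nil, err
	}
	var digests []string
	for _, m := range final.Manifests {
		digests = append(digests, m.Digest)
	}
	return digests, nil
}

func main() {
	dir, err := os.MkdirTemp("", "index-sketch")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	defer os.RemoveAll(dir)

	digests, err := lostUpdate(filepath.Join(dir, "index.json"))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(digests) // only writer B's entry survives
}
```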
Approach
First, any calls from outside write.go should be moved into write.go, so we have a single chokepoint for everything that modifies the index file. This means the one call each in pull.go, push.go and remove.go.
The remaining calls in write.go, which appear inside 4 funcs - ImagePull, ImageLoad, IndexWrite and DescriptorWrite - should be placed inside a single call that can update the image, and which handles the lock/unlock using fcntl semantics, where available.
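A minimal sketch of what such a lock/unlock wrapper could look like with fcntl record locks, using Go's syscall.FcntlFlock (Unix-only). The names withIndexLock, processMu, bump and runWriters are hypothetical, as is the choice of a separate lock file. The counter file stands in for index.json, and the demo only runs goroutines within one process, so it exercises the wrapper's shape rather than proving cross-process exclusion.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strconv"
	"sync"
	"syscall"
)

// processMu serializes goroutines inside this process. fcntl locks are
// owned by the process, so on their own they do not keep two goroutines
// of the same process apart.
var processMu sync.Mutex

// withIndexLock (hypothetical) runs fn while holding an exclusive fcntl
// lock on lockPath, blocking until the lock is available.
func withIndexLock(lockPath string, fn func() error) error {
	processMu.Lock()
	defer processMu.Unlock()

	f, err := os.OpenFile(lockPath, os.O_RDWR|os.O_CREATE, 0o644)
	if err != nil {
		return err
	}
	defer f.Close()

	// Start and Len left at zero: the lock covers the whole file.
	lk := syscall.Flock_t{Type: syscall.F_WRLCK, Whence: io.SeekStart}
	for {
		err = syscall.FcntlFlock(f.Fd(), syscall.F_SETLKW, &lk)
		if err != syscall.EINTR {
			break // retry only when interrupted by a signal
		}
	}
	if err != nil {
		return err
	}
	defer func() {
		lk.Type = syscall.F_UNLCK
		_ = syscall.FcntlFlock(f.Fd(), syscall.F_SETLK, &lk)
	}()
	return fn()
}

// bump is a stand-in for an index update: read, modify, write.
func bump(path string) error {
	b, err := os.ReadFile(path)
	if err != nil {
		return err
	}
	n, err := strconv.Atoi(string(b))
	if err != nil {
		return err
	}
	return os.WriteFile(path, []byte(strconv.Itoa(n+1)), 0o644)
}

// runWriters starts the given number of writers, each doing one locked
// bump, and returns the final counter value.
func runWriters(dir string, writers int) (int, error) {
	if err := os.MkdirAll(dir, 0o755); err != nil {
		return 0, err
	}
	data := filepath.Join(dir, "counter")
	lock := filepath.Join(dir, "counter.lock")
	if err := os.WriteFile(data, []byte("0"), 0o644); err != nil {
		return 0, err
	}
	var wg sync.WaitGroup
	errs := make(chan error, writers)
	for i := 0; i < writers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			errs <- withIndexLock(lock, func() error { return bump(data) })
		}()
	}
	wg.Wait()
	close(errs)
	for err := range errs {
		if err != nil {
			return 0, err
		}
	}
	b, err := os.ReadFile(data)
	if err != nil {
		return 0, err
	}
	return strconv.Atoi(string(b))
}

func main() {
	dir, err := os.MkdirTemp("", "lock-sketch")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	defer os.RemoveAll(dir)
	n, err := runWriters(dir, 8)
	fmt.Println(n, err)
}
```

The sketch locks a separate lock file rather than index.json itself. With fcntl semantics, a process loses its locks on a file as soon as it closes any descriptor for that file, so if the library opens and closes index.json during the call, a lock taken on index.json directly would be dropped mid-update.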