Parallel pkg build can trounce each other #4131

@deitch

Parallel runs of linuxkit pkg build can trounce each other. Specifically, they can corrupt the linuxkit cache.

The process of saving both blobs and index.json assumes that only one process is accessing the cache at a time. If you run 2 parallel builds, they might corrupt index.json as they both try to write it.

They also might corrupt actual blob files, but that is less likely.

This issue also documents all of the places that we do writes to the cache, so that we can clean it up.

Everything is in src/cmd/linuxkit/cache.

Very few files actually write to the cache, and we don't care quite as much about parallel reads, which are easy.

The 3 files that actually write to the cache are:

We will distinguish between those that write blobs to blobs/sha256/, and those that write to index.json.

In all of the listed examples, cache, as in "calling cache.Func", refers to the external library that actually writes to the on-disk cache.

Blobs

We are not concerned for now about writing blobs, for several reasons:

  • To cause an issue, we would need two or more parallel processes writing the exact same blob, which is very unlikely.
  • Fixing it requires an approach like how containerd does it, with an ingest/ dir and giving unique filenames to downloads, e.g. <digest>.<id>, and then doing a single, quick, atomic mv.
  • We rely entirely on the underlying GGCR (go-containerregistry) library, which likely does not support what we are trying to do.

If it becomes an issue in the future, we can deal with it then.

References to direct writing of blobs via GGCR:

index.json

All of the references in this section call GGCR functions to modify the root index.json. Underlying it, GGCR does the following:

  1. Read the index.json into memory and convert to a struct
  2. Make the requested changes
  3. Convert the structure to json and write it out to index.json

These are all clean, self-contained calls. So if we can isolate each one as lock, then make the call, then unlock, we should eliminate the issues with index.json.

I specifically ignore any that call other parts of this package, since those are wrapped by definition. Only the calls that go outside the package matter.

Approach

First, any calls from outside write.go should be moved into write.go, so we have single chokepoints for everything that modifies the index file. This means the one call each in pull.go, push.go and remove.go.

The remaining calls in write.go, which appear inside 4 funcs (ImagePull, ImageLoad, IndexWrite and DescriptorWrite), should be placed inside a single call that can update the image, and which handles the lock/unlock using fcntl semantics, where available.
