Parallel pkg build can trounce each other #4131

Parallel runs of `linuxkit pkg build` can trounce each other. Specifically, they can corrupt the linuxkit cache.

The process of saving both blobs and the `index.json` assumes that only one process is accessing the cache at a time. If you run two parallel builds, they might corrupt `index.json` as they both try to write it.

They also might corrupt actual blob files, but that is less likely.

This issue also documents every place where we write to the cache, so that we can clean them up.

Everything is in [`src/cmd/linuxkit/cache`](https://github.com/linuxkit/linuxkit/tree/master/src/cmd/linuxkit/cache).

Very few files actually write to the cache. Parallel reads are less of a concern, since they are easy to handle.

The 4 files that actually write to the cache are:

* [pull.go](https://github.com/linuxkit/linuxkit/blob/master/src/cmd/linuxkit/cache/pull.go)
* [push.go](https://github.com/linuxkit/linuxkit/blob/master/src/cmd/linuxkit/cache/push.go)
* [remove.go](https://github.com/linuxkit/linuxkit/blob/master/src/cmd/linuxkit/cache/remove.go)
* [write.go](https://github.com/linuxkit/linuxkit/blob/master/src/cmd/linuxkit/cache/write.go)

We distinguish between calls that write blobs to `blobs/sha256/` and calls that write to `index.json`.

In all of the listed examples, `cache` (as in "calling `cache.Func`") refers to the external library that actually writes to the on-disk cache.

## Blobs

We are not concerned for now about writing blobs, for several reasons:

* To cause an issue, we would need two or more parallel processes writing the exact same blob, which is very unlikely.
* Fixing it requires an approach like how containerd does it, with an `ingest/` dir and giving unique filenames to downloads, e.g. `<digest>.<id>`, and then doing a single, quick, atomic `mv`.
* We rely entirely on the underlying [ggcr library](https://github.com/google/go-containerregistry), which likely does not support what we are trying to do.
 
If it becomes an issue in the future, we can deal with it then.

References to direct writing of blobs via GGCR:

* [pull.go calls cache.WriteIndex](https://github.com/linuxkit/linuxkit/blob/3f54a80824cca1dc00a2c500e13716df09740b95/src/cmd/linuxkit/cache/pull.go#L195)
* [write.go calls cache.WriteIndex](https://github.com/linuxkit/linuxkit/blob/3f54a80824cca1dc00a2c500e13716df09740b95/src/cmd/linuxkit/cache/write.go#L120)
* [write.go calls cache.WriteBlob](https://github.com/linuxkit/linuxkit/blob/3f54a80824cca1dc00a2c500e13716df09740b95/src/cmd/linuxkit/cache/write.go#L195)
* [write.go calls cache.WriteBlob](https://github.com/linuxkit/linuxkit/blob/3f54a80824cca1dc00a2c500e13716df09740b95/src/cmd/linuxkit/cache/write.go#L365)

## `index.json`

All of the references in this section call GGCR functions to modify the root `index.json`. Under the hood, GGCR does the following:

1. Read the `index.json` into memory and convert to a struct
2. Make the requested changes
3. Convert the structure to json and write it out to `index.json`

These are all clean, self-contained calls. So if we can wrap each one as lock, make the call, unlock, we should eliminate the issues with `index.json`.

I specifically ignore any calls that go through other parts of this package, since those end up wrapped by definition. Only the calls out to the external library matter.

* `pull.go` calls [cache.ReplaceImage](https://github.com/linuxkit/linuxkit/blob/3f54a80824cca1dc00a2c500e13716df09740b95/src/cmd/linuxkit/cache/pull.go#L227)
* `push.go` calls [cache.AppendDescriptor](https://github.com/linuxkit/linuxkit/blob/3f54a80824cca1dc00a2c500e13716df09740b95/src/cmd/linuxkit/cache/push.go#L138)
* `remove.go` calls [cache.RemoveDescriptors](https://github.com/linuxkit/linuxkit/blob/3f54a80824cca1dc00a2c500e13716df09740b95/src/cmd/linuxkit/cache/remove.go#L68)
* `write.go` calls [cache.ReplaceImage](https://github.com/linuxkit/linuxkit/blob/3f54a80824cca1dc00a2c500e13716df09740b95/src/cmd/linuxkit/cache/write.go#L134)
* `write.go` calls [cache.RemoveDescriptors](https://github.com/linuxkit/linuxkit/blob/3f54a80824cca1dc00a2c500e13716df09740b95/src/cmd/linuxkit/cache/write.go#L213) and then [cache.AppendDescriptor](https://github.com/linuxkit/linuxkit/blob/3f54a80824cca1dc00a2c500e13716df09740b95/src/cmd/linuxkit/cache/write.go#L223)
* `write.go` calls [cache.RemoveDescriptors](https://github.com/linuxkit/linuxkit/blob/3f54a80824cca1dc00a2c500e13716df09740b95/src/cmd/linuxkit/cache/write.go#L369) and then [cache.AppendDescriptor](https://github.com/linuxkit/linuxkit/blob/3f54a80824cca1dc00a2c500e13716df09740b95/src/cmd/linuxkit/cache/write.go#L380)
* `write.go` calls [cache.RemoveDescriptors](https://github.com/linuxkit/linuxkit/blob/3f54a80824cca1dc00a2c500e13716df09740b95/src/cmd/linuxkit/cache/write.go#L401) and then [cache.AppendDescriptor](https://github.com/linuxkit/linuxkit/blob/3f54a80824cca1dc00a2c500e13716df09740b95/src/cmd/linuxkit/cache/write.go#L405)

## Approach

First, any calls from outside `write.go` should be moved into `write.go`, so that we have a single chokepoint for everything that modifies the index file. This means the one call each in `pull.go`, `push.go` and `remove.go`.

The remaining calls in `write.go`, which appear inside 4 funcs - `ImagePull`, `ImageLoad`, `IndexWrite` and `DescriptorWrite` - should be placed inside a single call that can update the image, and which handles the lock/unlock using fcntl semantics, where available.
