llb.File doesn't preserve timestamp of destination directory (if it was created) #2884

@smira

This was previously reported on Slack.

We're using buildkit with custom LLB, and we try to do reproducible builds (reproducible container image output) as much as possible.

If translated back to Dockerfile syntax, the last step in the build would look like

```
COPY --from=build --chown=0:0 --set-timestamp=<fixed> /toolchain /toolchain
```

If we build twice (without caches), what we see in the final layer diff is the following:

```
├── +++ 42350600bf85b18a38f6d8d692ac7c85cbcdea87630dc1a84baf29d634b46f87/layer.tar
│┄ Files 0% similar despite different names
│ ├── file list
│ │ @@ -1,8 +1,8 @@
│ │ -drwxr-xr-x   0 root         (0) root         (0)        0 2022-05-25 14:10:42.000000 toolchain/
│ │ +drwxr-xr-x   0 root         (0) root         (0)        0 2022-05-25 18:03:00.000000 toolchain/
│ │  drwxr-xr-x   0 root         (0) root         (0)        0 2019-06-02 17:37:45.000000 toolchain/bin/
│ │  -rwxr-xr-x   0 root         (0) root         (0)  1850520 2019-06-02 17:37:45.000000 toolchain/bin/addr2line
│ │  -rwxr-xr-x   0 root         (0) root         (0)  1883096 2019-06-02 17:37:45.000000 toolchain/bin/ar
│ │  -rwxr-xr-x   0 root         (0) root         (0)  2581512 2019-06-02 17:37:45.000000 toolchain/bin/as
│ │  -rwxr-xr-x   0 root         (0) root         (0)  1013912 2019-06-02 17:37:45.000000 toolchain/bin/c++
│ │  -rwxr-xr-x   0 root         (0) root         (0)  1845880 2019-06-02 17:37:45.000000 toolchain/bin/c++filt
│ │  lrwxrwxrwx   0 root         (0) root         (0)        0 2019-06-02 17:37:45.000000 toolchain/bin/cc -> gcc```

All timestamps are preserved except for that of the copy destination directory /toolchain, which has the timestamp of the build itself.

Looking at the code, it seems that when the copy operation is executed, the destination directory is created with the specified timestamp, but as files are copied into it, its timestamp is updated, so it ends up with the build timestamp. (For subdirectories, timestamps are preserved correctly because the timestamp is set once copying is done.)

Is this considered to be a bug or a feature?

The workaround is to add yet another FROM scratch step with COPY / /, which should set the timestamps correctly, but that extra copy is a performance hit.

The conclusion on Slack was that it's a bug and should be fixed: if buildkit creates the destination directory, it should set the timestamp once copying is done.
