Skip to content

Conversation

@thaJeztah
Copy link
Member

@thaJeztah thaJeztah commented Sep 1, 2022

⚠️ Changes compared to that patch:

Details
diff --git a/daemon/containerd/mount.go b/daemon/containerd/mount.go
index b0d4af19db..c5476adc60 100644
--- a/daemon/containerd/mount.go
+++ b/daemon/containerd/mount.go
@@ -13,7 +13,7 @@ import (
 
 // Mount mounts and sets container base filesystem
 func (i *ImageService) Mount(ctx context.Context, container *container.Container) error {
-	snapshotter := i.client.SnapshotService(i.snapshotter)
+	snapshotter := i.client.SnapshotService(container.Driver)
 	mounts, err := snapshotter.Mounts(ctx, container.ID)
 	if err != nil {
 		return err
@@ -35,7 +35,7 @@ func (i *ImageService) Mount(ctx context.Context, container *container.Container
 }
 
 // Unmount unmounts the container base filesystem
-func (i *ImageService) Unmount(ctx context.Context, container *container.Container) error {
+func (i *ImageService) Unmount(_ context.Context, container *container.Container) error {
 	root := container.BaseFS.Path()
 
 	if err := mount.UnmountAll(root, 0); err != nil {
diff --git a/daemon/daemon_unix.go b/daemon/daemon_unix.go
index 6e2b13cd3a..a301055f23 100644
--- a/daemon/daemon_unix.go
+++ b/daemon/daemon_unix.go
@@ -1364,19 +1364,19 @@ func (daemon *Daemon) registerLinks(container *container.Container, hostConfig *
 // conditionalMountOnStart is a platform specific helper function during the
 // container start to call mount.
 func (daemon *Daemon) conditionalMountOnStart(container *container.Container) error {
-	if !daemon.UsesSnapshotter() {
-		return daemon.Mount(container)
+	if daemon.UsesSnapshotter() {
+		return nil
 	}
-	return nil
+	return daemon.Mount(container)
 }
 
 // conditionalUnmountOnCleanup is a platform specific helper function called
 // during the cleanup of a container to unmount.
 func (daemon *Daemon) conditionalUnmountOnCleanup(container *container.Container) error {
-	if !daemon.UsesSnapshotter() {
-		return daemon.Unmount(container)
+	if daemon.UsesSnapshotter() {
+		return nil
 	}
-	return nil
+	return daemon.Unmount(container)
 }
 
 func copyBlkioEntry(entries []*statsV1.BlkIOEntry) []types.BlkioStatEntry {
diff --git a/daemon/images/mount.go b/daemon/images/mount.go
index 383c18e713..04ff19aff0 100644
--- a/daemon/images/mount.go
+++ b/daemon/images/mount.go
@@ -28,11 +28,11 @@ func (i *ImageService) Mount(ctx context.Context, container *container.Container
 		// on non-Windows operating systems.
 		if runtime.GOOS != "windows" {
 			i.Unmount(ctx, container)
-			return fmt.Errorf("Error: driver %s is returning inconsistent paths for container %s ('%s' then '%s')",
-				i.GraphDriverName(), container.ID, container.BaseFS, dir)
+			return fmt.Errorf("driver %s is returning inconsistent paths for container %s ('%s' then '%s')",
+				container.Driver, container.ID, container.BaseFS, dir)
 		}
 	}

This fixes things that were broken due to nil BaseFS like docker cp
and running a container with workdir override.

This is more of a temporary hack than a real solution.
The correct fix would be to refactor the code to make BaseFS and LayerRW
an implementation detail of the old image store implementation and use
the temporary mounts for the c8d implementation instead.
That requires more work though.

- A picture of a cute animal (not mandatory but encouraged)

@thaJeztah thaJeztah added status/2-code-review containerd-integration Issues and PRs related to containerd integration labels Sep 1, 2022
@thaJeztah thaJeztah added this to the v-next milestone Sep 1, 2022
logrus.WithError(err).WithField("dir", root).Error("failed to remove mount temp dir")
}

container.BaseFS = nil
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ I noticed that this was not in the old code

perhaps that was a bug (should it be included as separate commit before "refactor"?) /cc @vvoland @rumpl

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was intended. Since we already unmounted and removed the dir the BaseFS set here no longer makes sense.
It will also make it more visible if there is some code path that tries to use the BaseFS after unmount, because it will panic due to nil pointer instead of trying to perform some funny stuff on no longer existing path.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh! sorry, I think I commented on the wrong file. I meant to comment on the daemon/images/mount.go (so, the graph driver implementation, not the snapshotter)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yeah I think it would make sense to consider lack of resetting this to nil as a bug. In this case the directory may still exist, because the RWLayer.Unmount seems to just decrement the refcount for this mount, so something else may still use this.
But since Unmount was called on this container, I think it shouldn't have an access to this dir any longer.
+1 on extracting it to another PR, someone else from the maintainers may have some input on this so it would be nice not to sneak this change silently with this PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I was wondering if it was used elsewhere (e.g., cleaning up dead containers that failed to be removed)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is interesting actually.

The daemon/images/mount.go Unmount implementation shouldn't set the BaseFS to nil. The difference is that the graph driver's implementation mounts the rootfs to its final destination, but the snapshotter one creates a new mount each time.
So even if you Unmount there is other code that uses the BaseFS of the container. For example when you docker exec we don't mount the rootfs but you look at the container's BaseFS and then look for resources inside (like the UID/GID for example). So if we set (in the graph driver's case) the BaseFS to nil on unmount parts of the code don't work any more.

@thaJeztah
Copy link
Member Author

Bunch of failures that could be related, e.g.;

=== RUN   TestDockerSuite/TestVolumeCLICreateWithOpts
    check_test.go:402: assertion failed: error is not nil: Error response from daemon: remove test: volume has active mounts: failed to remove volume test
    --- FAIL: TestDockerSuite/TestVolumeCLICreateWithOpts (0.48s)

=== RUN   TestDockerSuite/TestVolumeCLIRm
    docker_cli_volume_test.go:197: assertion failed: 
        Command:  /usr/local/cli/docker volume rm test
        ExitCode: 1
        Error:    exit status 1
        Stdout:   
        Stderr:   Error response from daemon: remove test: volume has active mounts
        
        
        Failures:
        ExitCode was 1 expected 0
        Expected no error
    check_test.go:402: assertion failed: error is not nil: Error response from daemon: remove test: volume has active mounts: failed to remove volume test
    --- FAIL: TestDockerSuite/TestVolumeCLIRm (0.05s)

This fixes things that were broken due to nil BaseFS like `docker cp`
and running a container with workdir override.

This is more of a temporary hack than a real solution.
The correct fix would be to refactor the code to make BaseFS and LayerRW
an implementation detail of the old image store implementation and use
the temporary mounts for the c8d implementation instead.
That requires more work though.

Signed-off-by: Paweł Gronowski <[email protected]>
Signed-off-by: Sebastiaan van Stijn <[email protected]>
@thaJeztah thaJeztah force-pushed the c8d_set_basefs_hack branch from 1c1da2e to 9f35efd Compare March 6, 2023 21:07
@thaJeztah
Copy link
Member Author

Rebased this one to look at the remaining diff in here that was not included in #44804

@rumpl
Copy link
Member

rumpl commented Apr 4, 2023

Closing this one, the upstream PR was already upstreamed

@rumpl rumpl closed this Apr 4, 2023
@thaJeztah thaJeztah deleted the c8d_set_basefs_hack branch April 4, 2023 15:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

containerd-integration Issues and PRs related to containerd integration status/2-code-review

Projects

Development

Successfully merging this pull request may close these issues.

3 participants