Docker 20.10.5 overlay2: "no space left on device" during docker cp despite sufficient disk and inode availability #52201

@vijayaramaraju-kalidindi

Description

We are encountering a critical issue in production where docker cp fails with a misleading error indicating lack of disk space, even though the system has sufficient free space and inodes.

Reproduce

[root@br003rcbupf071crjo5 ~]# docker cp /home/upf/apply_optimization_settings.sh upf-dp-cl1-sg2:/tmp
Error response from daemon: mount /dev:/app1/overlay2/2891d87c261aecac33bbe2278cf9684dede4f96845c4db5ac0038eb28530f22d/merged/dev, flags: 0x5000: no space left on device

Expected behavior

docker cp should copy the file from the host into the container, and from the container to the host, without error.

docker version

[root@br003rcbupf071crjo5 ~]# docker version
Client: Docker Engine - Community
Version:           20.10.5
API version:       1.41
Go version:        go1.13.15
Git commit:        55c4c88
Built:             Tue Mar  2 20:17:04 2021
OS/Arch:           linux/amd64
Context:           default
Experimental:      true
 
Server: Docker Engine - Community
Engine:
  Version:          20.10.5
  API version:      1.41 (minimum version 1.12)
  Go version:       go1.13.15
  Git commit:       363e9a8
  Built:            Tue Mar  2 20:15:27 2021
  OS/Arch:          linux/amd64
  Experimental:     false
containerd:
  Version:          1.4.4
  GitCommit:        05f951a3781f4f2c1911b05e61c160e9c30eaa8e
runc:
  Version:          1.0.0-rc93
  GitCommit:        12644e614e25b05da6fd08a38ffa0cfe1903fdec
docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

docker info

[root@br003rcbupf071crjo5 ~]# docker info
Client:
Context:    default
Debug Mode: false
Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Build with BuildKit (Docker Inc., v0.5.1-docker)
 
Server:
Containers: 4
  Running: 4
  Paused: 0
  Stopped: 0
Images: 1
Server Version: 20.10.5
Storage Driver: overlay2
  Backing Filesystem: xfs
  Supports d_type: true
  Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Cgroup Version: 1
Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: active
  NodeID: mbw0tohvpwx9mlr3wqmiz51ra
  Is Manager: false
  Node Address: 2405:200:5213:47:3902:af:3:108
  Manager Addresses:
   [2405:200:5518:0:3497::10]:2377
   [2405:200:5518:0:3497::11]:2377
   [2405:200:5518:0:3497::12]:2377
Runtimes: io.containerd.runtime.v1.linux runc io.containerd.runc.v2
Default Runtime: runc
Init Binary: docker-init
containerd version: 05f951a3781f4f2c1911b05e61c160e9c30eaa8e
runc version: 12644e614e25b05da6fd08a38ffa0cfe1903fdec
init version: de40ad0
Security Options:
  seccomp
   Profile: default
Kernel Version: 4.18.0-513.5.1.el8_9.x86_64
Operating System: Red Hat Enterprise Linux 8.9 (Ootpa)
OSType: linux
Architecture: x86_64
CPUs: 32
Total Memory: 755GiB
Name: br003rcbupf071crjo5
ID: IQM6:M6CZ:WEDE:2WY6:ADPZ:UP7G:RDPG:XWXG:MODO:JUTP:S7WG:J4MH
Docker Root Dir: /app1
Debug Mode: false
Registry: https://index.docker.io/v1/
Labels:
Experimental: false
Insecure Registries:
  production-repo.cn.tn.jionfvmano:5000
  127.0.0.0/8
Live Restore Enabled: false
 
WARNING: API is accessible on http://[2405:0200:5213:47:3902:af:3:108]:2375 without encryption.
         Access to the remote API is equivalent to root access on the host. Refer
         to the 'Docker daemon attack surface' section in the documentation for
         more information: https://docs.docker.com/engine/security/security/#docker-daemon-attack-surface

Additional Info

Environment:

Docker Version: 20.10.5
Storage Driver: overlay2
Host OS: RHEL 8.9
Kernel Version: 4.18.0-513.5.1.el8_9.x86_64
Container uptime: ~8 months

While executing docker cp, the operation fails with the following error:

Error response from daemon: mount /dev/.../overlay2/.../merged/dev: no space left on device

Observations:

  1. Disk space: df -h → ~1.4TB free on /app1
  2. Inodes are not exhausted: df -i → IUse% ~1%
  3. Docker usage is minimal: docker system df → very low usage, no reclaimable space
  4. Overlay mounts are limited:
    overlay2 directories: 21
    active mounts: 4
  5. Deleted file handles are not significant: lsof | grep deleted → ~420 entries
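
The checks above can be collected in one pass with a small script. This is a minimal sketch, not the exact commands used for this report: the helper names count_overlay_mounts and count_subdirs are hypothetical, the Docker root defaults to /app1 (the "Docker Root Dir" from docker info), and counting overlay2 directories as the immediate subdirectories of /app1/overlay2 is an assumption about how the figure of 21 was obtained.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of Docker.
# DOCKER_ROOT defaults to /app1, the "Docker Root Dir" reported by docker info.
DOCKER_ROOT="${DOCKER_ROOT:-/app1}"

# Count entries whose filesystem type is "overlay" in a mounts file.
# Defaults to /proc/mounts; a path argument allows testing with a fixture.
count_overlay_mounts() {
  awk '$3 == "overlay" { n++ } END { print n + 0 }' "${1:-/proc/mounts}"
}

# Count the immediate subdirectories of a directory.
count_subdirs() {
  find "$1" -mindepth 1 -maxdepth 1 -type d | wc -l | tr -d ' '
}

if [ -d "$DOCKER_ROOT" ]; then
  df -h "$DOCKER_ROOT"    # free disk space
  df -i "$DOCKER_ROOT"    # inode usage (IUse%)
  if [ -d "$DOCKER_ROOT/overlay2" ]; then
    echo "overlay2 directories: $(count_subdirs "$DOCKER_ROOT/overlay2")"
  fi
fi

if [ -r /proc/mounts ]; then
  echo "active overlay mounts: $(count_overlay_mounts)"
fi

if command -v docker >/dev/null 2>&1; then
  docker system df || true    # Docker's own view of disk usage
fi

if command -v lsof >/dev/null 2>&1; then
  echo "deleted-but-open files: $(lsof 2>/dev/null | grep -c deleted)"
fi
```

Run it as root on the affected host so that lsof and the overlay2 directory are fully readable. Set DOCKER_ROOT if the data root differs from /app1.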

Key Finding:
The failure occurs specifically during the mount operation for /dev inside the container's overlay2 merged directory (.../merged/dev). The same docker cp operation worked without error until March 20.
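
The flags value in the error message can be decoded to see what kind of mount is failing. The sketch below maps a numeric mount(2) flags value to the MS_* constant names from <sys/mount.h>; for 0x5000 this gives MS_BIND and MS_REC, meaning a recursive bind mount of /dev rather than a new overlay mount. One possibility worth ruling out, which this report has not verified: on kernels that provide the fs.mount-max sysctl, mount(2) fails with ENOSPC when the mount would push the mount namespace past that limit, regardless of free disk space or inodes. The helper names decode_mount_flags and count_mounts are hypothetical.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of Docker.

# Decode a mount(2) flags value into MS_* names from <sys/mount.h>.
# Only a subset of the flag bits is listed here.
decode_mount_flags() (
  flags=$(( $1 ))
  out=""
  for entry in MS_RDONLY:1 MS_NOSUID:2 MS_NODEV:4 MS_NOEXEC:8 \
               MS_REMOUNT:32 MS_BIND:4096 MS_MOVE:8192 MS_REC:16384; do
    name=${entry%%:*}
    bit=${entry##*:}
    if [ $(( flags & bit )) -ne 0 ]; then
      out="${out:+$out }$name"
    fi
  done
  echo "$out"
)

# Count mount entries in a mountinfo file.
# Defaults to the current process's mount namespace.
count_mounts() {
  grep -c '' "${1:-/proc/self/mountinfo}"
}

# Flags value taken from the error message in this report.
decode_mount_flags 0x5000    # prints: MS_BIND MS_REC

# Compare the number of mounts in this namespace with the kernel limit.
if [ -r /proc/sys/fs/mount-max ] && [ -r /proc/self/mountinfo ]; then
  echo "mounts: $(count_mounts) / fs.mount-max: $(cat /proc/sys/fs/mount-max)"
fi
```

The comparison above covers the mount namespace of the shell that runs it. To inspect the daemon's namespace instead, pass its mountinfo file, for example count_mounts "/proc/$(pidof dockerd)/mountinfo", assuming the daemon process is named dockerd. A count close to the limit would point to leaked mounts accumulating over the containers' ~8 months of uptime; a low count would rule this explanation out.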

Metadata

Labels: kind/bug, status/0-triage, version/20.10