--userns=keep-id storage-chown-by-maps kills machine on non-btrfs with large images

**Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)**

/kind bug

**Description**

I am trying to build a bigger container image that contains all my usual desktop applications, and use it with toolbox. With a "fairly large" image, `podman create` already takes some 30 s, and once I add TeX Live to the image, it never finishes and eventually kills the machine.

I stripped off the numerous toolbox options/layers and reduced that to a podman command. The crucial option is `--userns=keep-id`, which sets off some `storage-chown-by-maps` process.

**Steps to reproduce the issue:**

1. This is the "fairly large" image:
```
podman pull ghcr.io/martinpitt/swaypod:latest
time podman create --userns=keep-id ghcr.io/martinpitt/swaypod:latest
```

2. This is the image that adds TeXlive (which makes it a few hundred MB larger):
```
podman pull ghcr.io/martinpitt/swaypod:allpkgs
time podman create --userns=keep-id ghcr.io/martinpitt/swaypod:allpkgs
```

**Describe the results you received:**

Step 1 takes 4 s on a Fedora 37 cloud VM (2 CPUs, 4 GiB RAM) with the default btrfs. On a standard RHEL 9.2 VM with XFS and on my laptop's Fedora 37 VM with /home on ext4, it takes about 20 seconds. In `top` I see a process called "exe" which is taking 100% CPU:

```
PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
972 admin     20   0 1351936  65172  28028 S  96.0   1.7   0:12.33 exe
```

That is really this:
```
admin       1972 95.0  1.3 1351680 49344 pts/0   Sl+  04:04   0:01 storage-chown-by-maps /home/admin/.local/share/containers/storage/overlay/3cc2d72c07248c18a9185b6a5bba0e7932b0ce5c26dbc763e476eb50c2a7ea94/merged
```

With the larger image in step 2, the Fedora 37 btrfs VM takes merely 6 s. However, on both the RHEL 9.2 XFS VM and my bare-metal ext4 Fedora 37 laptop, the `storage-chown-by-maps` process never ends. After maybe half a minute it kills the VM (ssh is dead, and I cannot log into the virsh console either), and my laptop becomes so sluggish that I cannot even start `top` any more. Trying to `kill -9` or even `sudo kill -9` (!) that `storage-chown-by-maps` process does not work either; it is simply unkillable.
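A process that ignores `kill -9` is often in uninterruptible sleep (state `D` in `ps`), where signals are not delivered until the pending kernel operation returns. Whether that is the case here is an assumption; the sketch below only shows one way to check it. The helper name `show_state` is hypothetical, and it matches on the full command line with `pgrep -f` because the process name shown in `top` is just "exe".

```shell
#!/bin/sh
# Sketch: show the scheduler state of processes matching a command-line pattern.
# show_state is a hypothetical helper name, not part of podman.

show_state() {
    pattern=$1
    # pgrep -f matches against the full command line, not just the process name.
    for pid in $(pgrep -f -- "$pattern"); do
        # STAT starting with "D" means uninterruptible sleep;
        # WCHAN names the kernel function the process is waiting in, if known.
        ps -o pid,stat,wchan,args -p "$pid"
    done
}

show_state storage-chown-by-maps
```

If the `STAT` column starts with `D`, that would be consistent with the process being stuck in the kernel rather than spinning in user space.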

**Describe the results you expected:**

The `storage-chown-by-maps` process should finish eventually, and ideally reasonably fast. This is more or less a glorified `chown -R`, no? That shouldn't take more than a few seconds.
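To put a number on the `chown -R` comparison, here is a minimal baseline sketch that times a plain recursive chown over a synthetic tree on an ordinary directory. The file count and layout are illustrative assumptions, not measurements of the images above, and this is not what `storage-chown-by-maps` does internally. One relevant difference: the real operation runs on an overlay `merged` directory, where changing ownership of a lower-layer file triggers a copy-up of that file when metacopy is not in use; whether that explains the hang is not established here.

```shell
#!/bin/sh
# Baseline sketch: time a plain recursive chown on a synthetic tree.
# The file count (2000) and layout are illustrative assumptions.

make_tree() {
    dir=$1
    nfiles=$2
    i=0
    while [ "$i" -lt "$nfiles" ]; do
        sub="$dir/d$((i / 100))"    # 100 files per subdirectory
        mkdir -p "$sub"
        : > "$sub/f$i"              # create an empty file
        i=$((i + 1))
    done
}

scratch=$(mktemp -d)
make_tree "$scratch" 2000

start=$(date +%s)
# Chown to our own uid:gid so the sketch runs without root.
chown -R "$(id -u):$(id -g)" "$scratch"
end=$(date +%s)

count=$(find "$scratch" -type f | wc -l)
echo "chown -R over $count files took $((end - start)) s"

rm -rf "$scratch"
```

Running it once on the btrfs machine and once on the XFS or ext4 machine gives a rough filesystem-level baseline to compare against the `podman create` timings.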

**Additional information you deem important (e.g. issue happens only occasionally):** 100% reproducible, also in a synthetic cloud instance.

**Output of `podman version`:**

From Fedora 37:
```
Client:       Podman Engine
Version:      4.3.1
API Version:  4.3.1
Go Version:   go1.19.2
Built:        Fri Nov 11 16:01:27 2022
OS/Arch:      linux/amd64
```

Current RHEL 9.2 also has podman 4.3.

**Output of `podman info`:**

```
host:
  arch: amd64
  buildahVersion: 1.28.0
  cgroupControllers:
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.5-1.fc37.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.5, commit: '
  cpuUtilization:
    idlePercent: 67.88
    systemPercent: 6.19
    userPercent: 25.92
  cpus: 8
  distribution:
    distribution: fedora
    version: "37"
  eventLogger: journald
  hostname: abakus
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 524288
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 524288
      size: 65536
  kernel: 6.0.12-300.fc37.x86_64
  linkmode: dynamic
  logDriver: journald
  memFree: 11440615424
  memTotal: 16533999616
  networkBackend: netavark
  ociRuntime:
    name: crun
    package: crun-1.7.2-1.fc37.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.7.2
      commit: 0356bf4aff9a133d655dc13b1d9ac9424706cac4
      rundir: /run/user/1000/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  remoteSocket:
    exists: true
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.0-8.fc37.x86_64
    version: |-
      slirp4netns version 1.2.0
      commit: 656041d45cfca7a4176f6b7eed9e4fe6c11e8383
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.3
  swapFree: 0
  swapTotal: 0
  uptime: 94h 16m 47.00s (Approximately 3.92 days)
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  volume:
  - local
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io
  - quay.io
store:
  configFile: /var/home/martin/.config/containers/storage.conf
  containerStore:
    number: 2
    paused: 0
    running: 2
    stopped: 0
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /home/martin/.local/share/containers/storage
  graphRootAllocated: 228501188608
  graphRootUsed: 174334164992
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 2
  runRoot: /run/user/1000/containers
  volumePath: /home/martin/.local/share/containers/storage/volumes
version:
  APIVersion: 4.3.1
  Built: 1668178887
  BuiltTime: Fri Nov 11 16:01:27 2022
  GitCommit: ""
  GoVersion: go1.19.2
  Os: linux
  OsArch: linux/amd64
  Version: 4.3.1
```
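The `Backing Filesystem` field is the detail that separates the fast and slow cases in this report. As a quick cross-check on other machines, the sketch below prints the filesystem type under the rootless storage directory. It assumes GNU `stat` and the default `~/.local/share/containers/storage` location; the helper name `fs_type_of` is hypothetical.

```shell
#!/bin/sh
# Sketch: print the filesystem type backing a path (GNU stat assumed).
# fs_type_of is a hypothetical helper name.

fs_type_of() {
    path=$1
    # Walk up to the nearest existing ancestor so the check also works
    # before the storage directory has been created.
    while [ ! -e "$path" ] && [ "$path" != "/" ]; do
        path=$(dirname "$path")
    done
    # -f queries the filesystem instead of the file; %T is the type name.
    stat -f -c %T "$path"
}

# Default rootless graphRoot; adjust if storage.conf overrides it.
graphroot="$HOME/.local/share/containers/storage"
fs_type_of "$graphroot"
```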

**Package info (e.g. output of `rpm -q podman` or `apt list podman` or `brew info podman`):**

```
podman-4.3.1-1.fc37.x86_64
```

**Have you tested with the latest version of Podman and have you checked [the Podman Troubleshooting Guide](https://github.com/containers/podman/blob/main/troubleshooting.md)?**


Yes -- it's the latest version. The troubleshooting guide even recommends `--userns=keep-id` for some use cases, but that's what is broken.

**Additional environment details (AWS, VirtualBox, physical, etc.):** physical and QEMU.


--userns=keep-id storage-chown-by-maps kills machine on non-btrfs with large images #16830
