Crash in clusterRebalanceServers due to nil authorizer #2808

@ibot3

Description

Is there an existing issue for this?

  • There is no existing issue for this bug

Is this happening on an up to date version of Incus?

  • This is happening on a supported version of Incus

Incus system details

Incus 6.20 on Debian 13

api_status: stable
api_version: "1.0"
auth: trusted
public: false
auth_methods:
- tls
- oidc
auth_user_name: root
auth_user_method: unix
environment:
  architectures:
  - x86_64
  - i686
  driver: lxc | qemu
  driver_version: 6.0.5 | 10.1.3
  firewall: nftables
  kernel: Linux
  kernel_architecture: x86_64
  kernel_features:
    idmapped_mounts: "true"
    netnsid_getifaddrs: "true"
    seccomp_listener: "true"
    seccomp_listener_continue: "true"
    uevent_injection: "true"
    unpriv_binfmt: "true"
    unpriv_fscaps: "true"
  kernel_version: 6.12.57+deb13-amd64
  lxc_features:
    cgroup2: "true"
    core_scheduling: "true"
    devpts_fd: "true"
    idmapped_mounts_v2: "true"
    mount_injection_file: "true"
    network_gateway_device_route: "true"
    network_ipvlan: "true"
    network_l2proxy: "true"
    network_phys_macvlan_mtu: "true"
    network_veth_router: "true"
    pidfd: "true"
    seccomp_allow_deny_syntax: "true"
    seccomp_notify: "true"
    seccomp_proxy_send_notify_fd: "true"
  os_name: Debian GNU/Linux
  os_version: "13"
  project: default
  server: incus
  server_clustered: true
  server_event_mode: full-mesh
  server_name: lovelace
  server_pid: 4192854
  server_version: "6.20"
  storage: ceph
  storage_version: 18.2.7
  storage_supported_drivers:
  - name: lvm
    version: 2.03.31(2) (2025-02-27) / 1.02.205 (2025-02-27) / 4.48.0
    remote: false
  - name: truenas
    version: 0.7.3
    remote: true
  - name: ceph
    version: 18.2.7
    remote: true
  - name: cephfs
    version: 18.2.7
    remote: true
  - name: cephobject
    version: 18.2.7
    remote: true
  - name: dir
    version: "1"
    remote: false

Instance details

No response

Instance log

No response

Current behavior

Incus crashed multiple times within one minute with this error:

Jan 08 09:52:25 node1 incusd[3925000]: panic: runtime error: invalid memory address or nil pointer dereference
Jan 08 09:52:25 node1 incusd[3925000]: [signal SIGSEGV: segmentation violation code=0x1 addr=0xf8 pc=0xba8ca6]
Jan 08 09:52:25 node1 incusd[3925000]: goroutine 593 [running]:
Jan 08 09:52:25 node1 incusd[3925000]: net/http.(*Request).Context(...)
Jan 08 09:52:25 node1 incusd[3925000]:         /usr/local/go/src/net/http/request.go:353
Jan 08 09:52:25 node1 incusd[3925000]: github.com/lxc/incus/v6/internal/server/project.CheckClusterTargetRestriction({0x7fd7bc57a400, 0xc0001de750}, 0x0, 0x1?, {0xc000bbcc40?, 0xb})
Jan 08 09:52:25 node1 incusd[3925000]:         /build/incus/internal/server/project/permissions.go:1615 +0x86
Jan 08 09:52:25 node1 incusd[3925000]: github.com/lxc/incus/v6/internal/server/project.CheckTarget({0x27b1170, 0xc0001d5180}, {0x7fd7bc57a400?, 0xc0001de750?}, 0xc000865430?, 0xc000d72fd0, 0xc0007da1c0, {0xc000bbcc40?, 0xc000865458?}, {0xc000865528, ...})
Jan 08 09:52:25 node1 incusd[3925000]:         /build/incus/internal/server/project/permissions.go:1752 +0xc8
Jan 08 09:52:25 node1 incusd[3925000]: main.clusterRebalanceServers.func1({0x27b1170, 0xc0001d5180}, 0xc000d72fd0)
Jan 08 09:52:25 node1 incusd[3925000]:         /build/incus/cmd/incusd/api_cluster_rebalance.go:144 +0x331
Jan 08 09:52:25 node1 incusd[3925000]: github.com/lxc/incus/v6/internal/server/db.(*Cluster).transaction.func1.1({0x27b1170?, 0xc0001d5180?}, 0xc0001d5180?)
Jan 08 09:52:25 node1 incusd[3925000]:         /build/incus/internal/server/db/db.go:344 +0x45
Jan 08 09:52:25 node1 incusd[3925000]: github.com/lxc/incus/v6/internal/server/db/query.Transaction({0x27b1100?, 0xc000e90870?}, 0xc000ce6410, 0xc000865750)
Jan 08 09:52:25 node1 incusd[3925000]:         /build/incus/internal/server/db/query/transaction.go:30 +0x198
Jan 08 09:52:25 node1 incusd[3925000]: github.com/lxc/incus/v6/internal/server/db.(*Cluster).transaction.func1({0x27b1100, 0xc000e90870})
Jan 08 09:52:25 node1 incusd[3925000]:         /build/incus/internal/server/db/db.go:347 +0x56
Jan 08 09:52:25 node1 incusd[3925000]: github.com/lxc/incus/v6/internal/server/db/query.Retry({0x27b1100, 0xc000e90870}, 0xc000865808)
Jan 08 09:52:25 node1 incusd[3925000]:         /build/incus/internal/server/db/query/retry.go:29 +0xba
Jan 08 09:52:25 node1 incusd[3925000]: github.com/lxc/incus/v6/internal/server/db.(*Cluster).transaction(0xc000a25800, {0x27b1100, 0xc000e90870}, 0xc000865be0)
Jan 08 09:52:25 node1 incusd[3925000]:         /build/incus/internal/server/db/db.go:341 +0x6f
Jan 08 09:52:25 node1 incusd[3925000]: github.com/lxc/incus/v6/internal/server/db.(*Cluster).Transaction(0x1ef8100?, {0x27b1100?, 0xc000e90870?}, 0x9?)
Jan 08 09:52:25 node1 incusd[3925000]:         /build/incus/internal/server/db/db.go:305 +0xa5
Jan 08 09:52:25 node1 incusd[3925000]: main.clusterRebalanceServers({0x27b1100, 0xc000e90870}, 0xc000731b00, 0xc000eee3c0, 0xc0007e0780, 0x3)
Jan 08 09:52:25 node1 incusd[3925000]:         /build/incus/cmd/incusd/api_cluster_rebalance.go:117 +0x15b
Jan 08 09:52:25 node1 incusd[3925000]: main.clusterRebalance({0x27b1100, 0xc000e90870}, 0xc000731b00, 0xc000b3b530)
Jan 08 09:52:25 node1 incusd[3925000]:         /build/incus/cmd/incusd/api_cluster_rebalance.go:303 +0x62a
Jan 08 09:52:25 node1 incusd[3925000]: main.autoRebalanceCluster({0x27b1100, 0xc000e90870}, 0x38383e0?)
Jan 08 09:52:25 node1 incusd[3925000]:         /build/incus/cmd/incusd/api_cluster_rebalance.go:357 +0x24b
Jan 08 09:52:25 node1 incusd[3925000]: main.autoRebalanceClusterTask.func1({0x27b1100, 0xc000e90870})
Jan 08 09:52:25 node1 incusd[3925000]:         /build/incus/cmd/incusd/api_cluster_rebalance.go:384 +0x188
Jan 08 09:52:25 node1 incusd[3925000]: github.com/lxc/incus/v6/internal/server/task.(*Task).loop(0xc000b486a8, {0x27b1100, 0xc000e90870})
Jan 08 09:52:25 node1 incusd[3925000]:         /build/incus/internal/server/task/task.go:76 +0x1a5
Jan 08 09:52:25 node1 incusd[3925000]: github.com/lxc/incus/v6/internal/server/task.(*Group).Start.func1(0x4)
Jan 08 09:52:25 node1 incusd[3925000]:         /build/incus/internal/server/task/group.go:63 +0x65
Jan 08 09:52:25 node1 incusd[3925000]: created by github.com/lxc/incus/v6/internal/server/task.(*Group).Start in goroutine 1
Jan 08 09:52:25 node1 incusd[3925000]:         /build/incus/internal/server/task/group.go:60 +0x2c5

After Incus on node1 had restarted too many times and was no longer restarted by systemd, Incus on node2 also hit a (different) segfault:

Jan 08 09:53:27 node2 incusd[4053042]: panic: runtime error: invalid memory address or nil pointer dereference
Jan 08 09:53:27 node2 incusd[4053042]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x40 pc=0x1bd9f75]
Jan 08 09:53:27 node2 incusd[4053042]: goroutine 226870 [running]:
Jan 08 09:53:27 node2 incusd[4053042]: main.ImageDownload({0x27b0ed0, 0x3d8ad40}, 0xc000b19900, 0xc000fc1500, 0xc0005ee280, 0xc001c45a88)
Jan 08 09:53:27 node2 incusd[4053042]:         /build/incus/cmd/incusd/daemon_images.go:382 +0x2335
Jan 08 09:53:27 node2 incusd[4053042]: main.imgPostRemoteInfo({0x27b0ed0, 0x3d8ad40}, 0xc000fc1500, 0xc000b19900, {{0x0, 0x0, 0x0, {0x0, 0x0, 0x0}, ...}, ...}, ...)
Jan 08 09:53:27 node2 incusd[4053042]:         /build/incus/cmd/incusd/images.go:538 +0x1d0
Jan 08 09:53:27 node2 incusd[4053042]: main.imagesPost.func3(0xc0005ee280)
Jan 08 09:53:27 node2 incusd[4053042]:         /build/incus/cmd/incusd/images.go:1259 +0x16c
Jan 08 09:53:27 node2 incusd[4053042]: github.com/lxc/incus/v6/internal/server/operations.(*Operation).Start.func1(0xc0005ee280)
Jan 08 09:53:27 node2 incusd[4053042]:         /build/incus/internal/server/operations/operations.go:306 +0x26
Jan 08 09:53:27 node2 incusd[4053042]: created by github.com/lxc/incus/v6/internal/server/operations.(*Operation).Start in goroutine 226793
Jan 08 09:53:27 node2 incusd[4053042]:         /build/incus/internal/server/operations/operations.go:305 +0x106

The segfault on node2 only caused one crash.

Expected behavior

No segfaults.

Steps to reproduce

Unknown. I tried to trigger a rebalance, but Incus did not start rebalancing and produced no log messages.
