Description
Submission type
- Bug report
systemd version the issue has been seen with
219
Looking through a diff of the source code for udev-builtin-path_id.c between v219 and v235 suggests that this issue is not fixed in the latest release. As this is happening on a client's server which is in production, I may not be able to test newer releases (but am willing to try).
Used distribution
CentOS 7.4
Bug Description
The Intel SCU on SuperMicro's X9SRi-3F (C606 chipset) is a dual 4-port unit that supports SAS devices.
The kernel populates the /sys filesystem such that two "host#" folders are created under the pci device node like so:
ls -l /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:08.0/0000:03:00.0 | grep host
drwxr-xr-x. 14 root root 0 Oct 17 12:26 host6
drwxr-xr-x. 12 root root 0 Oct 17 12:26 host7
The full list of block devices under the SCU is as follows (only 6 of the 8 possible devices are connected):
/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:08.0/0000:03:00.0/host6/port-6:0/end_device-6:0/target6:0:0/6:0:0:0/block/sda
/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:08.0/0000:03:00.0/host6/port-6:1/end_device-6:1/target6:0:1/6:0:1:0/block/sdb
/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:08.0/0000:03:00.0/host6/port-6:2/end_device-6:2/target6:0:2/6:0:2:0/block/sdc
/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:08.0/0000:03:00.0/host6/port-6:3/end_device-6:3/target6:0:3/6:0:3:0/block/sdd
/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:08.0/0000:03:00.0/host7/port-7:0/end_device-7:0/target7:0:0/7:0:0:0/block/sde
/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:08.0/0000:03:00.0/host7/port-7:1/end_device-7:1/target7:0:1/7:0:1:0/block/sdf
Udev can't tell the difference between the devices being members of host6 vs host7, so when the symlinks are created in /dev/disk/by-path, you get something like the following:
lrwxrwxrwx. 1 root root 9 Oct 18 05:48 pci-0000:03:00.0-sas-0x5fcfffff00000001-lun-0 -> ../../sda
lrwxrwxrwx. 1 root root 9 Oct 18 05:48 pci-0000:03:00.0-sas-0x5fcfffff00000002-lun-0 -> ../../sdb
lrwxrwxrwx. 1 root root 9 Oct 18 05:48 pci-0000:03:00.0-sas-0x5fcfffff00000003-lun-0 -> ../../sdf
lrwxrwxrwx. 1 root root 9 Oct 18 05:49 pci-0000:03:00.0-sas-0x5fcfffff00000004-lun-0 -> ../../sdd
lrwxrwxrwx. 1 root root 9 Oct 18 05:48 pci-0000:03:00.0-sas-phy0-lun-0 -> ../../sda
lrwxrwxrwx. 1 root root 9 Oct 18 05:48 pci-0000:03:00.0-sas-phy1-lun-0 -> ../../sdf
lrwxrwxrwx. 1 root root 9 Oct 18 05:49 pci-0000:03:00.0-sas-phy2-lun-0 -> ../../sdc
lrwxrwxrwx. 1 root root 9 Oct 18 05:48 pci-0000:03:00.0-sas-phy3-lun-0 -> ../../sdd
Note that only four out of the six devices are present.
What is happening here is that once a drive on host7 is recognized (either during boot, when udev rules are reloaded manually, or when hot-plugging a drive), a symlink is created which overwrites the one previously made for the corresponding device on host6 (and vice-versa when applicable).
This makes it impossible to reliably reference devices by their device path in the system.
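The collision can be illustrated with a small model. This is only a sketch, not the actual path_id code: it assumes the by-path name is built from the PCI address, a per-host phy index, and the LUN, with no host component, and it uses the sysfs port index as a stand-in for the phy identifier (both are assumptions made for illustration). The helper names `phy_only_name` and `find_collisions` are hypothetical.

```python
import re
from collections import defaultdict

# Sysfs paths copied from the device listing in this report.
BASE = "/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:08.0/0000:03:00.0"
SYSFS_PATHS = [
    f"{BASE}/host6/port-6:0/end_device-6:0/target6:0:0/6:0:0:0/block/sda",
    f"{BASE}/host6/port-6:1/end_device-6:1/target6:0:1/6:0:1:0/block/sdb",
    f"{BASE}/host6/port-6:2/end_device-6:2/target6:0:2/6:0:2:0/block/sdc",
    f"{BASE}/host6/port-6:3/end_device-6:3/target6:0:3/6:0:3:0/block/sdd",
    f"{BASE}/host7/port-7:0/end_device-7:0/target7:0:0/7:0:0:0/block/sde",
    f"{BASE}/host7/port-7:1/end_device-7:1/target7:0:1/7:0:1:0/block/sdf",
]

# Pull out the PCI address, host number, port index, LUN and block device name.
PATTERN = re.compile(
    r"/(?P<pci>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f])"
    r"/host(?P<host>\d+)/port-\d+:(?P<port>\d+)/.*/"
    r"\d+:\d+:\d+:(?P<lun>\d+)/block/(?P<dev>\w+)$"
)


def phy_only_name(sysfs_path):
    """Return (link_name, device) for a name that ignores the host number.

    Assumption: the port index is used in place of the real phy identifier.
    """
    m = PATTERN.search(sysfs_path)
    if m is None:
        raise ValueError(f"unrecognized sysfs path: {sysfs_path}")
    return f"pci-{m['pci']}-sas-phy{m['port']}-lun-{m['lun']}", m["dev"]


def find_collisions(paths):
    """Group devices by link name and keep only names claimed more than once."""
    by_name = defaultdict(list)
    for path in paths:
        name, dev = phy_only_name(path)
        by_name[name].append(dev)
    return {name: devs for name, devs in by_name.items() if len(devs) > 1}


if __name__ == "__main__":
    for name, devs in find_collisions(SYSFS_PATHS).items():
        print(name, "<-", ", ".join(devs))
```

Under these assumptions, sda and sde compete for the phy0 name and sdb and sdf compete for the phy1 name, so whichever device is processed last owns the symlink.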
Based on observing the code in udev-builtin-path_id.c, I believe this is caused by improper handling of the device path by handle_scsi_sas. I do not believe this would be an issue if the device paths were handled by handle_scsi_default instead.
Ideally, it would be possible to change a config file, or pass an argument to the path_id built-in, that overrides the normal behavior and explicitly uses handle_scsi_default. This would give system administrators an alternative in case the default behavior fails to work as expected.
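To show why a host-aware name would avoid the clash, here is a rough sketch of a host:bus:target:lun style name. It is modeled loosely on my reading of handle_scsi_default, which appears to number the host relative to the lowest host under the same parent device; treat that, the exact name format, and the helper name `scsi_default_style_name` as assumptions rather than the real implementation.

```python
import re

# Matches ".../<pci address>/host<N>/.../<H>:<B>:<T>:<L>/block/<dev>".
HBTL = re.compile(
    r"/(?P<pci>[0-9a-f:.]+)/host(?P<host>\d+)/.*/"
    r"\d+:(?P<bus>\d+):(?P<target>\d+):(?P<lun>\d+)/block/\w+$"
)


def scsi_default_style_name(sysfs_path, sibling_hosts):
    """Build a name that keeps the host, rebased to the lowest sibling host.

    sibling_hosts: host numbers found under the same PCI device, e.g. [6, 7].
    Assumption: the output format mirrors "scsi-H:B:T:L"; not verified
    against any particular systemd release.
    """
    m = HBTL.search(sysfs_path)
    if m is None:
        raise ValueError(f"unrecognized sysfs path: {sysfs_path}")
    rel_host = int(m["host"]) - min(sibling_hosts)
    return f"pci-{m['pci']}-scsi-{rel_host}:{m['bus']}:{m['target']}:{m['lun']}"


if __name__ == "__main__":
    base = "/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:08.0/0000:03:00.0"
    sdb = f"{base}/host6/port-6:1/end_device-6:1/target6:0:1/6:0:1:0/block/sdb"
    sdf = f"{base}/host7/port-7:1/end_device-7:1/target7:0:1/7:0:1:0/block/sdf"
    print(scsi_default_style_name(sdb, [6, 7]))
    print(scsi_default_style_name(sdf, [6, 7]))
```

Because the relative host number is part of the name, the drives on host6 and host7 that share a port index end up with distinct links.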
Before filing this report, I left comments on issue #3943 which seemed at first to be related to my problem.
In case of bug report: Expected behaviour you didn't see
All drives connected to the Intel SCU have correct symlinks in /dev/disk/by-path
In case of bug report: Unexpected behaviour you saw
Symlinks in /dev/disk/by-path are being overwritten due to incorrect naming
In case of bug report: Steps to reproduce the problem
Connect a drive to port 0 on the SCU, then connect a drive to port 4 (or 1/5, 2/6, 3/7, and their reverse pairings). The symlink in /dev/disk/by-path for the first drive will be overwritten by the symlink for the second drive.
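A quick way to confirm the result after following these steps is to check which disks no longer have any link pointing at them. The helper below is a minimal sketch; the function name `devices_missing_links` is hypothetical, and it only compares the final component of each link target, so partition links such as sda1 are counted separately from sda.

```python
import os


def devices_missing_links(by_path_dir, device_names):
    """Return the devices that no symlink in by_path_dir points to."""
    linked = set()
    for entry in os.listdir(by_path_dir):
        full = os.path.join(by_path_dir, entry)
        if os.path.islink(full):
            # Targets look like "../../sda"; keep only the device name.
            linked.add(os.path.basename(os.readlink(full)))
    return sorted(set(device_names) - linked)


if __name__ == "__main__":
    disks = ["sda", "sdb", "sdc", "sdd", "sde", "sdf"]
    print(devices_missing_links("/dev/disk/by-path", disks))
```

Any device name it prints has lost all of its by-path symlinks to another drive.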