udev assigns same ID_PATH and ID_SAS_PATH for disks connected to intel SCU ports on C606 chipset #7157

@dghodgson

Submission type

  • Bug report

systemd version the issue has been seen with

219

Looking through a diff of the source code for udev-builtin-path_id.c between v219 and v235 suggests that this issue is not fixed in the latest release. As this is happening on a client's server which is in production, I may not be able to test newer releases (but am willing to try).

Used distribution

CentOS 7.4

Bug Description

The Intel SCU on SuperMicro's X9SRi-3F (C606 chipset) is a dual 4-port unit that supports SAS devices.

The kernel populates the /sys filesystem such that two "host#" folders are created under the pci device node like so:

ls -l /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:08.0/0000:03:00.0 | grep host
drwxr-xr-x. 14 root root 0 Oct 17 12:26 host6
drwxr-xr-x. 12 root root 0 Oct 17 12:26 host7

The full list of block devices under the SCU is as follows (only 6 of the 8 possible devices are connected):

/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:08.0/0000:03:00.0/host6/port-6:0/end_device-6:0/target6:0:0/6:0:0:0/block/sda
/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:08.0/0000:03:00.0/host6/port-6:1/end_device-6:1/target6:0:1/6:0:1:0/block/sdb
/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:08.0/0000:03:00.0/host6/port-6:2/end_device-6:2/target6:0:2/6:0:2:0/block/sdc
/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:08.0/0000:03:00.0/host6/port-6:3/end_device-6:3/target6:0:3/6:0:3:0/block/sdd
/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:08.0/0000:03:00.0/host7/port-7:0/end_device-7:0/target7:0:0/7:0:0:0/block/sde
/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:08.0/0000:03:00.0/host7/port-7:1/end_device-7:1/target7:0:1/7:0:1:0/block/sdf
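
To make the host split explicit, the paths above can be grouped by SCSI host. This is only a minimal Python sketch: the regular expression and the helper name group_by_host are illustrative assumptions, and the path list is copied from the listing above.

```python
import re
from collections import defaultdict

PCI = "/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:08.0/0000:03:00.0"

# sysfs paths copied from the listing in this report.
SYSFS_PATHS = [
    f"{PCI}/host6/port-6:0/end_device-6:0/target6:0:0/6:0:0:0/block/sda",
    f"{PCI}/host6/port-6:1/end_device-6:1/target6:0:1/6:0:1:0/block/sdb",
    f"{PCI}/host6/port-6:2/end_device-6:2/target6:0:2/6:0:2:0/block/sdc",
    f"{PCI}/host6/port-6:3/end_device-6:3/target6:0:3/6:0:3:0/block/sdd",
    f"{PCI}/host7/port-7:0/end_device-7:0/target7:0:0/7:0:0:0/block/sde",
    f"{PCI}/host7/port-7:1/end_device-7:1/target7:0:1/7:0:1:0/block/sdf",
]

# Captures: host number, port index within that host, block device name.
PATH_RE = re.compile(r"/host(\d+)/port-\d+:(\d+)/.*/block/(\w+)$")


def group_by_host(paths):
    """Return {host_number: [(port_index, block_device), ...]}."""
    groups = defaultdict(list)
    for path in paths:
        match = PATH_RE.search(path)
        if match:
            host, port, dev = match.groups()
            groups[int(host)].append((int(port), dev))
    return dict(groups)


if __name__ == "__main__":
    for host, devs in sorted(group_by_host(SYSFS_PATHS).items()):
        print(f"host{host}: {devs}")
```

Both hosts sit under the same PCI function (0000:03:00.0), and the port index restarts at 0 on each host, so the PCI address plus the port index does not identify a disk uniquely.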

udev does not distinguish devices on host6 from devices on host7, so when the symlinks are created in /dev/disk/by-path, you get something like the following:

lrwxrwxrwx. 1 root root 9 Oct 18 05:48 pci-0000:03:00.0-sas-0x5fcfffff00000001-lun-0 -> ../../sda
lrwxrwxrwx. 1 root root 9 Oct 18 05:48 pci-0000:03:00.0-sas-0x5fcfffff00000002-lun-0 -> ../../sdb
lrwxrwxrwx. 1 root root 9 Oct 18 05:48 pci-0000:03:00.0-sas-0x5fcfffff00000003-lun-0 -> ../../sdf
lrwxrwxrwx. 1 root root 9 Oct 18 05:49 pci-0000:03:00.0-sas-0x5fcfffff00000004-lun-0 -> ../../sdd
lrwxrwxrwx. 1 root root 9 Oct 18 05:48 pci-0000:03:00.0-sas-phy0-lun-0 -> ../../sda
lrwxrwxrwx. 1 root root 9 Oct 18 05:48 pci-0000:03:00.0-sas-phy1-lun-0 -> ../../sdf
lrwxrwxrwx. 1 root root 9 Oct 18 05:49 pci-0000:03:00.0-sas-phy2-lun-0 -> ../../sdc
lrwxrwxrwx. 1 root root 9 Oct 18 05:48 pci-0000:03:00.0-sas-phy3-lun-0 -> ../../sdd

Note that each naming scheme (SAS address and phy) has only four symlinks for the six devices, and sde has no symlink at all.

What is happening here is that once a drive on host7 is recognized (during boot, when udev rules are reloaded manually, or when a drive is hot-plugged), the symlink created for it overwrites the one previously made for the corresponding device on host6 (and vice versa when applicable).
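
The overwrite can be modelled in a few lines of Python. This is a sketch under a stated assumption, not the actual udev logic: it assumes the name is built from the PCI address and the per-host port index only, which matches the phy-style names in the listing above. The processing order here is arbitrary, so the final link targets may differ from the ones observed on the real system.

```python
# (host, port index, block device), taken from the sysfs listing in this report.
DEVICES = [
    (6, 0, "sda"), (6, 1, "sdb"), (6, 2, "sdc"), (6, 3, "sdd"),
    (7, 0, "sde"), (7, 1, "sdf"),
]


def host_blind_name(port, lun=0, pci="0000:03:00.0"):
    """Assumed simplification: the name depends on PCI address and port only."""
    return f"pci-{pci}-sas-phy{port}-lun-{lun}"


def create_symlinks(devices):
    """Simulate symlink creation; a later device replaces an earlier one."""
    links = {}        # symlink name -> device it currently points at
    overwritten = []  # (symlink name, previous device, new device)
    for _host, port, dev in devices:
        name = host_blind_name(port)
        if name in links:
            overwritten.append((name, links[name], dev))
        links[name] = dev
    return links, overwritten


if __name__ == "__main__":
    links, overwritten = create_symlinks(DEVICES)
    for name, dev in sorted(links.items()):
        print(f"{name} -> ../../{dev}")
    for name, old, new in overwritten:
        print(f"collision: {name} moved from {old} to {new}")
```

Six devices produce only four names in this model, because port 0 and port 1 exist on both hosts.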

This makes it impossible to reliably reference devices by their device path in the system.

Based on observing the code in udev-builtin-path_id.c, I believe this is caused by improper handling of the device path by handle_scsi_sas. I do not believe this would be an issue if the device paths were handled by handle_scsi_default instead.

Ideally, a config file setting or an argument to the path_id built-in could override the normal behavior and explicitly select handle_scsi_default. That would give system administrators an alternative when the default behavior fails to work as expected.
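
As a rough illustration of why a host-aware name would avoid the collision, the sketch below builds names from the full SCSI address, with the host number expressed as an offset from the lowest host under the same controller. The name format is hypothetical, loosely modelled on a scsi-H:C:T:L style; the exact string udev would emit through handle_scsi_default is not verified here.

```python
# SCSI addresses (host:channel:target:lun) and block devices from the sysfs listing.
DEVICES = {
    "6:0:0:0": "sda", "6:0:1:0": "sdb", "6:0:2:0": "sdc", "6:0:3:0": "sdd",
    "7:0:0:0": "sde", "7:0:1:0": "sdf",
}


def host_aware_names(devices, pci="0000:03:00.0"):
    """Return {name: device} using a hypothetical host-offset naming scheme."""
    parsed = {
        tuple(int(part) for part in addr.split(":")): dev
        for addr, dev in devices.items()
    }
    # Lowest host number under this controller; offsets are counted from it.
    base_host = min(host for host, *_ in parsed)
    return {
        f"pci-{pci}-scsi-{h - base_host}:{c}:{t}:{l}": dev
        for (h, c, t, l), dev in parsed.items()
    }


if __name__ == "__main__":
    for name, dev in sorted(host_aware_names(DEVICES).items()):
        print(f"{name} -> ../../{dev}")
```

With the host offset included, each of the six devices gets a distinct name.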

Before filing this report, I left comments on issue #3943 which seemed at first to be related to my problem.

In case of bug report: Expected behaviour you didn't see

All drives connected to the Intel SCU have correct symlinks in /dev/disk/by-path.

In case of bug report: Unexpected behaviour you saw

Symlinks in /dev/disk/by-path are being overwritten due to incorrect naming

In case of bug report: Steps to reproduce the problem

Connect a drive to port 0 on the SCU, then connect a drive on port 4 (or 1/5, 2/6, 3/7, and their reverse pairings). The symlink in /dev/disk/by-path for the first drive will be overwritten by the symlink for the second drive.
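
To confirm the collision after reproducing it, the udev properties of each disk can be compared. The Python sketch below parses the KEY=VALUE output of `udevadm info --query=property` and reports values shared by more than one device. The helper names are illustrative, and the device nodes passed in are assumed to be the ones from this report.

```python
import subprocess
from collections import defaultdict


def parse_properties(text):
    """Parse KEY=VALUE lines as printed by `udevadm info --query=property`."""
    props = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            props[key] = value
    return props


def query_properties(devnode):
    """Run udevadm for one device node (needs a system with udev installed)."""
    result = subprocess.run(
        ["udevadm", "info", "--query=property", f"--name={devnode}"],
        check=True, capture_output=True, text=True,
    )
    return parse_properties(result.stdout)


def duplicated(props_by_dev, key="ID_PATH"):
    """Return {value: [devices]} for values of `key` shared by several devices."""
    seen = defaultdict(list)
    for dev, props in props_by_dev.items():
        if key in props:
            seen[props[key]].append(dev)
    return {value: devs for value, devs in seen.items() if len(devs) > 1}
```

On an affected machine, building `props = {n: query_properties(n) for n in ("/dev/sda", "/dev/sde")}` and calling `duplicated(props, "ID_PATH")` or `duplicated(props, "ID_SAS_PATH")` should return a non-empty result if the two disks were given the same value.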
