Describe the bug
Checksum handling appears to be broken when CONFIG_ETH_STM32_HAL_API_V2 is enabled: with some PHYs, Ethernet does not work at all. This is easily observed by pinging an affected board and noting that all ICMP replies carry a checksum of 0. Note that this apparently breaks all network traffic, not just ICMP packets.
We're seeing this on a custom board of ours, where an STM32F767ZI is connected to the management port of a switch IC over RMII, but not on a Nucleo F767ZI development kit. @driechers has apparently observed the same problem on a Nucleo F756ZG, based on his comment on #46596.
Disabling CONFIG_ETH_STM32_HAL_API_V2 avoids the problem, but the driver is less robust in other ways without it, so this is not a solution, at least not for our application.
To Reproduce
Steps to reproduce the behavior:
- Use a board with an STM32F7 (and likely H7) and a problematic PHY (the Nucleo F767ZI, for one, is not affected)
- Enable CONFIG_ETH_STM32_HAL_API_V2
- Flash an Ethernet-enabled application to the board
- Ping the board from a connected PC
- Observe that Wireshark sees the replies, but that the checksum of the ICMP packet is 0:
Frame 968: 74 bytes on wire (592 bits), 74 bytes captured (592 bits) on interface \Device\NPF_{D5F26515-0661-4924-B26D-309CFEEA8792}, id 0
Ethernet II, Src: STMicroe_20:34:38 (00:80:e1:20:34:38), Dst: Private_38:28:c5 (80:6d:97:38:28:c5)
Destination: Private_38:28:c5 (80:6d:97:38:28:c5)
Source: STMicroe_20:34:38 (00:80:e1:20:34:38)
Type: IPv4 (0x0800)
Internet Protocol Version 4, Src: 10.0.37.239, Dst: 10.0.37.1
0100 .... = Version: 4
.... 0101 = Header Length: 20 bytes (5)
Differentiated Services Field: 0x00 (DSCP: CS0, ECN: Not-ECT)
Total Length: 60
Identification: 0x0000 (0)
000. .... = Flags: 0x0
...0 0000 0000 0000 = Fragment Offset: 0
Time to Live: 64
Protocol: ICMP (1)
Header Checksum: 0x1bd2 [validation disabled]
[Header checksum status: Unverified]
Source Address: 10.0.37.239
Destination Address: 10.0.37.1
Internet Control Message Protocol
Type: 0 (Echo (ping) reply)
Code: 0
Checksum: 0x0000 incorrect, should be 0x516a
[Expert Info (Warning/Checksum): Bad checksum [should be 0x516a]]
[Checksum Status: Bad]
Identifier (BE): 1 (0x0001)
Identifier (LE): 256 (0x0100)
Sequence Number (BE): 1009 (0x03f1)
Sequence Number (LE): 61699 (0xf103)
Data (32 bytes)
Data: 6162636465666768696a6b6c6d6e6f7071727374757677616263646566676869
[Length: 32]
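For anyone who wants to double-check Wireshark's "should be 0x516a" without a capture at hand, here is a minimal sketch of the RFC 1071 Internet checksum applied to the ICMP message from the frame above. The names `inet_checksum`, `icmp_reply` and `expected_icmp_checksum` are made up for this illustration and are not part of Zephyr or the STM32 HAL; the bytes are copied from the capture (type 0, code 0, zero checksum, identifier 1, sequence 1009, then the 32 data bytes).

```c
#include <stddef.h>
#include <stdint.h>

/*
 * RFC 1071 Internet checksum over `len` bytes, read as big-endian
 * 16-bit words. Returns the one's-complement of the folded sum.
 * (Illustrative helper, not a Zephyr or HAL function.)
 */
static uint16_t inet_checksum(const uint8_t *buf, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2) {
		sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
	}
	if (i < len) {
		/* Odd trailing byte is padded with a zero low byte. */
		sum += (uint32_t)buf[i] << 8;
	}
	while (sum >> 16) {
		/* Fold the carries back into the low 16 bits. */
		sum = (sum & 0xffff) + (sum >> 16);
	}
	return (uint16_t)~sum;
}

/* ICMP echo reply from frame 968, exactly as seen on the wire. */
static const uint8_t icmp_reply[40] = {
	0x00, 0x00,             /* Type 0 (echo reply), code 0 */
	0x00, 0x00,             /* Checksum as transmitted: 0x0000 */
	0x00, 0x01,             /* Identifier (BE): 1 */
	0x03, 0xf1,             /* Sequence number (BE): 1009 */
	0x61, 0x62, 0x63, 0x64, 0x65, 0x66, 0x67, 0x68,
	0x69, 0x6a, 0x6b, 0x6c, 0x6d, 0x6e, 0x6f, 0x70,
	0x71, 0x72, 0x73, 0x74, 0x75, 0x76, 0x77, 0x61,
	0x62, 0x63, 0x64, 0x65, 0x66, 0x67, 0x68, 0x69,
};

/* Checksum the reply should have carried, in host byte order. */
static uint16_t expected_icmp_checksum(void)
{
	return inet_checksum(icmp_reply, sizeof(icmp_reply));
}
```

`expected_icmp_checksum()` evaluates to 0x516a, matching the value Wireshark expects, so the rest of the message is intact and only the checksum field was left unfilled.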
Expected behavior
Packets should carry a correct checksum, regardless of whether it is calculated in software or in hardware.
Workaround
By disabling hardware checksumming with this patch, things work as expected:
diff --git a/drivers/ethernet/eth_stm32_hal.c b/drivers/ethernet/eth_stm32_hal.c
index 75b48daff5..da1ced69a5 100644
--- a/drivers/ethernet/eth_stm32_hal.c
+++ b/drivers/ethernet/eth_stm32_hal.c
@@ -1202,7 +1202,7 @@ static int eth_initialize(const struct device *dev)
memset(&tx_config, 0, sizeof(ETH_TxPacketConfig));
tx_config.Attributes = ETH_TX_PACKETS_FEATURES_CSUM |
ETH_TX_PACKETS_FEATURES_CRCPAD;
- tx_config.ChecksumCtrl = ETH_CHECKSUM_IPHDR_PAYLOAD_INSERT_PHDR_CALC;
+ tx_config.ChecksumCtrl = ETH_CHECKSUM_DISABLE;
tx_config.CRCPadCtrl = ETH_CRC_PAD_INSERT;
#endif /* CONFIG_SOC_SERIES_STM32H7X || CONFIG_ETH_STM32_HAL_API_V2 */
Impact
With the workaround, things work as expected, so this isn't a showstopper, but it is still a significant issue that needs a proper fix.
Environment
- OS: Windows 11
- Toolchain: Zephyr SDK 0.15.0
- Zephyr: v3.3.0