Skip to content

cpu/stm32/periph_eth: fix error handling in send()#15783

Merged
benpicco merged 2 commits intoRIOT-OS:masterfrom
maribu:stm32_eth_fix_error_handling
Jan 22, 2021
Merged

cpu/stm32/periph_eth: fix error handling in send()#15783
benpicco merged 2 commits intoRIOT-OS:masterfrom
maribu:stm32_eth_fix_error_handling

Conversation

@maribu
Copy link
Copy Markdown
Member

@maribu maribu commented Jan 15, 2021

Contribution description

An earlier version of periph_eth used to always pack the first chunk of the
outgoing frame to the first DMA descriptor by telling the DMA to jump back
to the first descriptor within the last descriptor. This worked fine unless
the frame was send in one chunk (as e.g. lwip does), which resulted due to a
hardware bug in a frame being send out twice. For that reason, the behavior was
changed to cycle throw the linked DMA descriptor list in round-robin fashion.
However, the error checking was not updated accordingly. Hence, the error
check might run over (parts of) unrelated frames and fail to detect errors
correctly.

This commit fixes the issue and also provides proper return codes for errors.

Additionally, an DMA reset is performed on detected errors during RX/TX. I'm
not sure if/when this is needed, as error conditions are neigh impossible to
produce. But better be safe than sorry.

Testing procedure

  1. Connect e.g. a Nucleo-F767ZI to a Linux host via Ethernet
  2. Flash the Nucleo
    • Set ENABLE_DEBUG (but not ENABLE_DEBUG_VERBOSE!) to 1 in cpu/stm32/periph/eth.c to see if errors actually occur.
    • Flash with USEMODULE=stm32_eth_auto make flash term -C examples/gnrc_networking.
    • The module stm32_eth_auto enables auto-negotiation. (We might want to make this the default, btw.)
  3. Provoke errors:
    • On the Linux side, run sudo ethtool -s eth0 advertise 0x001. This result in the Linux only advertising 10 Mbps half duplex.
    • Due to auto-negotation enabled, the RIOT node will ignore whatever speed and duplex was configured and negotiate with the Linux host, which only offers 10 Mbps half duplex - hence this will be chosen.
    • Due to half-duplex, collisions can occur
  4. Test the shit out of it
    • On the Linux side, run ping -I eth0 -s 1024 -i 0.001 <RIOT_IPV6_ADDR>. You might need to run this with sudo to be able to send at 1 kHz
    • On the RIOT size, run ping -c 100 -i 1 -s 1024 <LINUX_IPV6_ADDR>.

You should see reasonable error messages, some packet loss via ping, but connectivity should remain unaffected.

Issues/PRs references

None

@maribu maribu added Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors) Impact: minor The PR is small in size and might only require a quick look of a knowledgeable reviewer Area: drivers Area: Device drivers Area: cpu Area: CPU/MCU ports labels Jan 15, 2021
@maribu maribu requested a review from MrKevinWeiss as a code owner January 15, 2021 21:57
@maribu
Copy link
Copy Markdown
Member Author

maribu commented Jan 15, 2021

I added the minor label as I don't expect that the error handling code will be much executed, if ever.

@maribu maribu force-pushed the stm32_eth_fix_error_handling branch from ff77602 to 1ddacfe Compare January 20, 2021 09:34
An earlier version of periph_eth used to always pack the first chunk of the
outgoing frame to the first DMA descriptor by telling the DMA to jump back
to the first descriptor within the last descriptor. This worked fine unless
the frame was send in one chunk (as e.g. lwip does), which resulted due to a
hardware bug in a frame being send out twice. For that reason, the behavior was
changed to cycle throw the linked DMA descriptor list in round-robin fashion.
However, the error checking was not updated accordingly. Hence, the error
check might run over (parts of) unrelated frames and fail to detect errors
correctly.

This commit fixes the issue and also provides proper return codes for errors.

Additionally, an DMA reset is performed on detected errors during RX/TX. I'm
not sure if/when this is needed, as error conditions are neigh impossible to
produce. But better be safe than sorry.
Add ENABLE_DEBUG_VERBOSE flag, so that the noise during debugging can be
reduced. This is super helpful when testing under load, as otherwise there is
just too much noise in the output.
@maribu maribu force-pushed the stm32_eth_fix_error_handling branch from 1ddacfe to 21264b8 Compare January 20, 2021 09:42
@maribu maribu requested a review from aabadie January 20, 2021 09:42
@benpicco benpicco added the CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR label Jan 20, 2021
@benpicco
Copy link
Copy Markdown
Contributor

Looks good to me!

@benpicco benpicco merged commit 46337ef into RIOT-OS:master Jan 22, 2021
@maribu
Copy link
Copy Markdown
Member Author

maribu commented Jan 22, 2021

Thanks :-)

@maribu maribu deleted the stm32_eth_fix_error_handling branch January 22, 2021 19:30
@kaspar030 kaspar030 added this to the Release 2021.04 milestone Apr 23, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Area: cpu Area: CPU/MCU ports Area: drivers Area: Device drivers CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR Impact: minor The PR is small in size and might only require a quick look of a knowledgeable reviewer Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants