tests: add tests for netdev flooding race-condition #11256
miri64 wants to merge 1 commit into RIOT-OS:master
|
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. If you want me to ignore this issue, please mark it with the "State: don't stale" label. Thank you for your contributions. |
|
note that all radios (and probably network devices) are affected by this issue, because the driver ISR and the send messages are handled by the same thread. EDIT: All radios that share the framebuffer for sending/receiving |
From how I understand the issue here (and please correct me if I'm wrong), this is caused by the shared frame buffer of the at86rf2xx radio. It has a single 128B buffer used for both the TX PDU and the RX PDU. The mrf24j40, cc2420 and the nrf52840 radios all have separate transmit and receive buffers (I didn't check or don't know about the remaining few), so for those radios it is not possible to overwrite the receive buffer with the PDU from the |
If the radio has a different TX and RX framebuffer, then yes... this problem doesn't happen. |
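To make the difference concrete, here is a minimal sketch of the two buffer layouts discussed above. It is a simplified model under stated assumptions, not the at86rf2xx driver: `sim_radio_t` and all `sim_*` functions are hypothetical names, and the real state-machine timing is not modeled. A frame arrives, a TX frame is loaded before the pending RX frame is read out, and then the "received" frame is read.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FRAME_BUF_LEN (128U)    /* single 128-byte frame buffer, as described above */

/* Hypothetical radio model, NOT the at86rf2xx driver's data structures. */
typedef struct {
    uint8_t rx_buf[FRAME_BUF_LEN];
    uint8_t tx_buf[FRAME_BUF_LEN];  /* unused when `shared` is true */
    size_t rx_len;
    bool shared;                    /* true: TX and RX use the same memory */
} sim_radio_t;

static uint8_t *tx_mem(sim_radio_t *r)
{
    return r->shared ? r->rx_buf : r->tx_buf;
}

/* A frame arrives over the air; the RX event is pending but not yet handled. */
void sim_radio_receive(sim_radio_t *r, const uint8_t *frame, size_t len)
{
    if (len > FRAME_BUF_LEN) {
        len = FRAME_BUF_LEN;
    }
    memcpy(r->rx_buf, frame, len);
    r->rx_len = len;
}

/* The thread loads a frame for sending before the pending RX frame is read. */
void sim_radio_load_tx(sim_radio_t *r, const uint8_t *frame, size_t len)
{
    if (len > FRAME_BUF_LEN) {
        len = FRAME_BUF_LEN;
    }
    memcpy(tx_mem(r), frame, len);
}

/* The upper layer finally reads the "received" frame. */
size_t sim_radio_read_rx(const sim_radio_t *r, uint8_t *out, size_t max)
{
    size_t n = (r->rx_len < max) ? r->rx_len : max;
    memcpy(out, r->rx_buf, n);
    return n;
}

/* Returns true if the frame read back equals the frame that was just sent. */
bool sim_reads_own_frame(bool shared)
{
    static const uint8_t incoming[] = { 0xaa, 0xaa, 0xaa, 0xaa };
    static const uint8_t outgoing[] = { 0x55, 0x55, 0x55, 0x55 };
    uint8_t out[sizeof(incoming)];
    sim_radio_t r = { .shared = shared };

    sim_radio_receive(&r, incoming, sizeof(incoming));
    sim_radio_load_tx(&r, outgoing, sizeof(outgoing));
    sim_radio_read_rx(&r, out, sizeof(out));
    return memcmp(out, outgoing, sizeof(outgoing)) == 0;
}
```

In this model, `sim_reads_own_frame(true)` reports the symptom (the node reads what it just sent), while `sim_reads_own_frame(false)` does not, which matches the reasoning that radios with separate TX and RX buffers are not affected.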
|
Side-note: while selective fragment recovery (#12303) works very well for mitigating this bug (still waiting for the high-load results, but when taking it slow I get a 100% success rate even when forwarding), it causes the forwarder to create VRB entries to itself: when a fragment is forwarded, it can happen that the forwarder, due to this bug, receives it itself, determines that it is not the destination of this fragment, and creates a VRB entry for it with its own address as the source address. The only bad thing is that whenever there is an ACK to be forwarded to the same tag that the erroneously read packet was sent to, it is forwarded on the medium to the forwarder itself (so causing energy problems), but other than that: some of the problems of this bug are mitigated. For my experiments I will just merge #11264 into the branch my experiments will be based on and use |
|
@fjmolinas sure! |
|
So when I flash those on two and on the other So I guess it's working? Might also mean the boards are stuck 😉 edit: Now I got But I guess that's just another node sending periodic announcements. |
#if defined(MODULE_AT86RF2XX)
#define NETDEV_ADDR_LEN                 (2U)
#define NETDEV_FLOOD_HDR_SEQ_OFFSET     (2U)
/* IEEE 802.15.4 header */
#define NETDEV_FLOOD_HDR    { 0x71, 0x98, /* FCF */ \
                              0xa1, /* Sequence number 161 */ \
                              0x23, 0x00, /* PAN ID 0x23 */ \
                              0x0e, 0x50, /* 0x500e (start of NETDEV_FLOOD_TARGET) */ \
                              0x0e, 0x12, /* 0x120e (start of NETDEV_FLOOD_SOURCE) */ }
#else
Why is this at86rf2xx specific?
Because this test is trying to point out a problem in the at86rf2xx implementation. For other drivers other data might be needed. See also https://github.com/RIOT-OS/RIOT/pull/11256/files#diff-f29476e871b447f52f062cd754786b75R53-R54
But the 802.15.4 header should be the same for all 802.15.4 devices - what configuration would that be?
The 802.15.4 header, yes. I don't remember if this data was specific to the error case though.
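For reference, a minimal sketch of how the nine header bytes from the diff above decode under the IEEE 802.15.4-2006 frame control field layout. The names `hdr_info_t`, `hdr_parse` and `example_hdr` are hypothetical and not part of the PR. The sketch assumes the layout implied by these bytes: short addresses with PAN ID compression, so the PAN ID directly follows the sequence number. It only illustrates that nothing in the header itself is driver specific.

```c
#include <stdbool.h>
#include <stdint.h>

/* Header bytes copied from NETDEV_FLOOD_HDR in the diff above. */
const uint8_t example_hdr[] = {
    0x71, 0x98,     /* FCF */
    0xa1,           /* sequence number */
    0x23, 0x00,     /* PAN ID */
    0x0e, 0x50,     /* destination short address */
    0x0e, 0x12,     /* source short address */
};

/* Hypothetical helper type for this illustration only. */
typedef struct {
    uint8_t frame_type;     /* 1 = data frame */
    bool ack_req;           /* acknowledgment request bit */
    bool pan_comp;          /* PAN ID compression bit */
    uint8_t dst_mode;       /* 2 = short (16-bit) address */
    uint8_t src_mode;       /* 2 = short (16-bit) address */
    uint8_t seq;
    uint16_t pan_id;
} hdr_info_t;

/* Multi-byte fields are transmitted least significant byte first. */
static uint16_t le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | ((uint16_t)p[1] << 8));
}

hdr_info_t hdr_parse(const uint8_t *hdr)
{
    uint16_t fcf = le16(hdr);
    hdr_info_t info;

    info.frame_type = (uint8_t)(fcf & 0x7);         /* bits 0-2 */
    info.ack_req = ((fcf >> 5) & 0x1) != 0;         /* bit 5 */
    info.pan_comp = ((fcf >> 6) & 0x1) != 0;        /* bit 6 */
    info.dst_mode = (uint8_t)((fcf >> 10) & 0x3);   /* bits 10-11 */
    info.src_mode = (uint8_t)((fcf >> 14) & 0x3);   /* bits 14-15 */
    info.seq = hdr[2];
    info.pan_id = le16(&hdr[3]);
    return info;
}
```

Calling `hdr_parse(example_hdr)` yields a data frame with the ACK request bit set, short source and destination addresses, sequence number 161 and PAN ID 0x23, consistent with the comments in the diff and with `NETDEV_ADDR_LEN` being 2.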
|
Is this still relevant? |
|
I guess, if all 802.15.4 devices are ported to the new |
Contribution description
While working on #11068, I noticed a race condition within the state machine (see p. 51 in the datasheet) of the at86rf2xx device driver.
This PR introduces two accompanying applications that reproduce this race condition:
- netdev_flood_flooder, which sends IEEE 802.15.4 frames periodically every 5 ms
- netdev_flood_replier, which receives those frames and tries to reply to them with different content after a 2 ms delay
If the driver behaved correctly, the netdev_flood_replier application would only receive the frames sent by netdev_flood_flooder. Due to the discovered race condition, however, it may happen that it reads back the data it just sent.
Testing procedure
In general: check the READMEs, they should describe it very well. But here is the rundown.
Compile and flash tests/netdev_flood_flooder first.
Check the output with
Then compile and flash tests/netdev_flood_replier too.
Use the make test target to check the output. If the following message is not shown it is successful (which it shouldn't be in the current master).
Issues/PRs references
The issue demonstrated by these tests was found while working on #11068 and is causing problems for it.