sys/net/gnrc/sock: fix race condition#18949
Conversation
| if (entry->type == GNRC_NETREG_TYPE_MBOX) { | ||
| msg_t msg; | ||
| while (mbox_try_get(entry->target.mbox, &msg)) { | ||
| gnrc_pktbuf_release_error(msg.content.ptr, EBADF); | ||
| if ((msg.type == GNRC_NETAPI_MSG_TYPE_RCV) | ||
| || (msg.type == GNRC_NETAPI_MSG_TYPE_SND)) { | ||
| gnrc_pktbuf_release_error(msg.content.ptr, EBADF); | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
Again: I think it is cleaner, to move this whole block to (the yet to be created) gnrc_sock_close() (which also would call gnrc_netreg_unregister(). From our offline discussion yesterday, I think the content of the mbox is much more use case dependent than previously thought and only gnrc_sock really knows what it all put into it.
There was a problem hiding this comment.
But GNRC_NETAPI_MSG_TYPE_SND/RCV messages aren't exclusive to gnrc_sock, or am I missing something?
There was a problem hiding this comment.
No, but _TIMEOUT_MSG_TYPE and whatever gnrc_sock might throw in here (potentially with packet context) in here, is.
There was a problem hiding this comment.
Alterntaively SND/RCV could be cleaned out here, but there should also be a centralized clean-out within gnrc_sock to properly handle the rest.
There was a problem hiding this comment.
Let's focus on the bug here. We can restructure code and clean up things as follow up.
This is already more complex than I have hoped for as it is, but we need to fix this release-blocking issue.
benpicco
left a comment
There was a problem hiding this comment.
Looks good to me, the forward -> backward iterating over cib is confusing, but only some nitpicks from my side.
| msg_t garbage_bin; | ||
| mbox_try_get_with_type(®->mbox, &garbage_bin, _TIMEOUT_MSG_TYPE); |
There was a problem hiding this comment.
I think this still has potential for racey behavior. What if two threads are waiting with timeout? Then this clears out the timeout of the other thread. IMHO, the most ideal way to fix this to use the PID of the waiter as a value for the timeout message instead of some magic integer and check here, if the removed timeout message is actually the one for this thread. Since this requires to search for both type and value, I am not sure, if having this dedicated function makes that much sense.
There was a problem hiding this comment.
Is it legal to have two threads waiting on the same socket? If so, whats the use case?
On mutli-core systems one could use that to distribute load on multiple cores. I don't see this happen on RIOT.
|
To me shuffling the mbox queue feels like overkill and error prone. And there's now a feature gap between msg and mbox. And this solution is O(n) (even though n is really small here as it's a special case catching a race). And mbox is supposed to be multi-consumer, so multiple threads might be waiting with timeout (this must be broken already). And this is a ztimer quirk that's now fixed in gnrc. ... There was a thread flags based solution for xtimer and There's also a similar issue with |
Not really. The race condition is in GNRC. The ztimer quirk just makes the condition likely to trigger. I will also look into the ztimer quirk, but one bug at a time. |
Note that it is reasonable to assume that the expected message type will be sitting in the front of the queue. I think the fundamental issue here is that msg and mbox are a poor choice for the backbone of a network stack. And the consequence are an increased complexity to handle the lifecycle of network messages passing through the network stack, an increased chance to loosing IPC messages (or rather not being able to send without blocking) and having to drop network messages, and an increased number of corner cases and race conditions. I see no way to rework the architecture of GNRC within a reasonable time frame. The alternative is to add the features needed to fix the race with the use of msg/mbox. A simpler approach to fix the issue as it surfaces in I am open to suggestions on how to fix the issue. But other than moving away from mbox/msg (which certainly is out of scope for fixing a bug in an overdue release), I have none. I am also open to suggestions on how to optimize
I agree that this sucks. To me, the PR contains the option that sucks the least. |
Add a non-blocking function that allows draining the mbox queue from message of a certain type, while keeping the rest of the messages in the mbox intact for later consumption.
`gnrc_sock_recv()` blocks waiting on an `mbox` when receiving. The timeout implementation just sends a magic timeout message to that mbox when the timeout timer fires. The implementation assumed that the head of the mbox will always contain either the received message, or the magic timeout message, but never both. This assumption doesn't hold when the timeout fires after the message was received but before the timer was stopped. This can result in the subsequent call to `gnrc_sock_recv()` to incorrectly fail with a timeout right away (without actually waiting), or to a wild pointer (`_TIMEOUT_MAGIC`) being freed in `gnrc_netreg_unregister()` (whatever function call comes next). This fixes the issue by discarding a stale timeout message (if any) from the mbox while keeping the mbox queue otherwise intact.
When freeing any stale pktsnips from stale messages in the mbox, make sure that the messages actually contains a pktsnip before freeing.
0652d54 to
f988919
Compare
I'm just realizing that gnrc_sock is hand-rolling sth like Anyhow,
If this solution works and fixes the issue, can we just push it to the release branch, and find a proper solution for the future? Or, go with this now (as its already done) and let time solve this differently if pressing enough? (maybe mark the added mbox function somehow so there'll not be users should we want drop it?) |
|
I salvaged the uncontroversial parts. Also, the wild pointer freeing is giving me much more headache than the premature timeout error. After all, code that needs to communicate over network will have some fault tolerance and error handling in it, as in real life network communication things go wrong. A premature timeout due to a race condition is something that is not ideal, but should not break real world applications. I agree that a |
|
Btw., the doc of Lines 468 to 469 in 97bc825 This is not encouraging :-/ |
18632: tests/thread_float: do not overload slow MCUs with IRQs r=benpicco a=maribu
### Contribution description
If the regular context switches are triggered too fast, slow MCUs will be able to spent little time on actually progressing in the test. This will scale the IRQ rate with the CPU clock as a crude way too keep load within limits.
### Testing procedure
The unit test should now pass on the Microduino CoreRF
```
$ make BOARD=microduino-corerf AVRDUDE_PROGRAMMER=dragon_jtag -C tests/thread_float flash test
make: Entering directory '/home/maribu/Repos/software/RIOT/tests/thread_float'
Building application "tests_thread_float" for "microduino-corerf" with MCU "atmega128rfa1".
[...]
text data bss dec hex filename
12834 520 3003 16357 3fe5 /home/maribu/Repos/software/RIOT/tests/thread_float/bin/microduino-corerf/tests_thread_float.elf
avrdude -c dragon_jtag -p m128rfa1 -U flash:w:/home/maribu/Repos/software/RIOT/tests/thread_float/bin/microduino-corerf/tests_thread_float.hex
[...]
Welcome to pyterm!
Type '/exit' to exit.
READY
s
START
main(): This is RIOT! (Version: 2022.10-devel-858-g18566-tests/thread_float)
THREADS CREATED
Context switch every 3125 µs
{ "threads": [{ "name": "idle", "stack_size": 192, "stack_used": 88 }]}
{ "threads": [{ "name": "main", "stack_size": 640, "stack_used": 220 }]}
THREAD t1 start
THREAD t2 start
THREAD t3 start
t1: 141.443770
t3: 141.466810
t1: 141.443770
t3: 141.466810
t1: 141.443770
t3: 141.466810
t1: 141.443770
t3: 141.466810
t1: 141.443770
t3: 141.466810
t1: 141.443770
t3: 141.466810
t1: 141.443770
make: Leaving directory '/home/maribu/Repos/software/RIOT/tests/thread_float'
```
(~~Note: The idle thread exiting is something that should never occur. I guess the culprit may be `cpu_switch_context_exit()` messing things up when the main thread exits. But that is not directly related to what this PR aims to fix. Adding a `thread_sleep()` at the end of `main()` does indeed prevent the idle thread from exiting.~~
Update: That's expected. The idle thread stats are printed on exit of the main thread, the idle thread does not actually exit.)
### Issues/PRs references
Fixes #16908 maybe?
18950: tests/unittests: add unit tests for core_mbox r=benpicco a=maribu
### Contribution description
As the title says
### Testing procedure
The test cases are run on `native` by Murdock anyway.
### Issues/PRs references
Split out of #18949
19030: tests/periph_timer_short_relative_set: improve test r=benpicco a=maribu
### Contribution description
Reduce the number lines to output by only testing for intervals 0..15 to speed up the test.
In addition, run each test case 128 repetitions (it is still faster than before) to give some confidence the short relative set actually succeeded.
### Testing procedure
The test application should consistently fail or succeed, rather than occasionally passing.
### Issues/PRs references
None
19085: makefiles/tests/tests.inc.mk: fix test/available target r=benpicco a=maribu
### Contribution description
`dist/tools/compile_and_test_for_board/compile_and_test_for_board.py` relies on `make test/available` to check if a test if available. However, this so far did not take `TEST_ON_CI_BLACKLIST` and `TEST_ON_CI_WHITELIST` into account, resulting in tests being executed for boards which they are not available. This should fix the issue.
### Testing procedure
#### Expected to fail
```
$ make BOARD=nrf52840dk -C tests/gcoap_fileserver test/available
$ make BOARD=microbit -C tests/log_color test/available
```
(On `master`, they succeed, but fail in this PR.)
#### Expected to succeed
```
$ make BOARD=native -C tests/gcoap_fileserver test/available
$ make BOARD=nrf52840dk -C tests/pkg_edhoc_c test/available
$ make BOARD=nrf52840dk -C tests/log_color test/available
```
(Succeed in both `master` and this PR.)
### Issues/PRs references
None
Co-authored-by: Marian Buschsieweke <[email protected]>
18950: tests/unittests: add unit tests for core_mbox r=benpicco a=maribu ### Contribution description As the title says ### Testing procedure The test cases are run on `native` by Murdock anyway. ### Issues/PRs references Split out of #18949 Co-authored-by: Marian Buschsieweke <[email protected]>
Contribution description
When in
gnrc_sock_recv()a timeout triggers after a message came in but before the timeout timer was cancelled, a stale magic timeout message was left in the mbox message queue. This results ingnrc_pktbuf_release_error()if agnrc_netreg_unregister()is called next on the socket (with the number_TIMEOUT_MAGICinterpreted as a pointer to a pktsnip)gnrc_sock_recv()failing with a timeout error on the next call right away (without actually waiting)This also contains the required infrastructure needed to fix the bug.
Finally, the cleanup code in
gnrc_netreg_unregister()is made more fault tolerant in that it ignores (and drops) unexpected messages in the mbox. This may not be ideal, but it is certainly better than interpreting the content of the message as a pointer to agnrc_pktsnip_tthat is in need of being released.Testing procedure
tests/pkt_edhoc_cin combination with GCC >= 12.x reliably triggered the issue for me on both the nRF52830DK an the Nucleo-F767ZI.Issues/PRs references
None