sys/net/gnrc/sock: fix race condition by maribu · Pull Request #18949 · RIOT-OS/RIOT

maribu · 2022-11-22T12:35:48Z

Contribution description

When in gnrc_sock_recv() a timeout triggers after a message came in but before the timeout timer was cancelled, a stale magic timeout message was left in the mbox message queue. This results in

Passing a wild pointer into gnrc_pktbuf_release_error() if a gnrc_netreg_unregister() is called next on the socket (with the number _TIMEOUT_MAGIC interpreted as a pointer to a pktsnip)
gnrc_sock_recv() failing with a timeout error on the next call right away (without actually waiting)

This also contains the required infrastructure needed to fix the bug.

Finally, the cleanup code in gnrc_netreg_unregister() is made more fault tolerant in that it ignores (and drops) unexpected messages in the mbox. This may not be ideal, but it is certainly better than interpreting the content of the message as a pointer to a gnrc_pktsnip_t that is in need of being released.

Testing procedure

tests/pkt_edhoc_c in combination with GCC >= 12.x reliably triggered the issue for me on both the nRF52830DK an the Nucleo-F767ZI.

Issues/PRs references

None

miri64 · 2022-11-22T12:48:00Z

sys/net/gnrc/netreg/gnrc_netreg.c

    if (entry->type == GNRC_NETREG_TYPE_MBOX) {
        msg_t msg;
        while (mbox_try_get(entry->target.mbox, &msg)) {
-            gnrc_pktbuf_release_error(msg.content.ptr, EBADF);
+            if ((msg.type == GNRC_NETAPI_MSG_TYPE_RCV)
+                    || (msg.type == GNRC_NETAPI_MSG_TYPE_SND)) {
+                gnrc_pktbuf_release_error(msg.content.ptr, EBADF);
+            }
        }
    }


Again: I think it is cleaner, to move this whole block to (the yet to be created) gnrc_sock_close() (which also would call gnrc_netreg_unregister(). From our offline discussion yesterday, I think the content of the mbox is much more use case dependent than previously thought and only gnrc_sock really knows what it all put into it.

But GNRC_NETAPI_MSG_TYPE_SND/RCV messages aren't exclusive to gnrc_sock, or am I missing something?

No, but _TIMEOUT_MSG_TYPE and whatever gnrc_sock might throw in here (potentially with packet context) in here, is.

Alterntaively SND/RCV could be cleaned out here, but there should also be a centralized clean-out within gnrc_sock to properly handle the rest.

Let's focus on the bug here. We can restructure code and clean up things as follow up.

This is already more complex than I have hoped for as it is, but we need to fix this release-blocking issue.

benpicco

Looks good to me, the forward -> backward iterating over cib is confusing, but only some nitpicks from my side.

core/mbox.c

core/include/mbox.h

sys/net/gnrc/netreg/gnrc_netreg.c

miri64 · 2022-11-22T12:59:28Z

sys/net/gnrc/sock/gnrc_sock.c

+        msg_t garbage_bin;
+        mbox_try_get_with_type(&reg->mbox, &garbage_bin, _TIMEOUT_MSG_TYPE);


I think this still has potential for racey behavior. What if two threads are waiting with timeout? Then this clears out the timeout of the other thread. IMHO, the most ideal way to fix this to use the PID of the waiter as a value for the timeout message instead of some magic integer and check here, if the removed timeout message is actually the one for this thread. Since this requires to search for both type and value, I am not sure, if having this dedicated function makes that much sense.

Is it legal to have two threads waiting on the same socket? If so, whats the use case?

On mutli-core systems one could use that to distribute load on multiple cores. I don't see this happen on RIOT.

kaspar030 · 2022-11-22T13:13:31Z

To me shuffling the mbox queue feels like overkill and error prone. And there's now a feature gap between msg and mbox. And this solution is O(n) (even though n is really small here as it's a special case catching a race). And mbox is supposed to be multi-consumer, so multiple threads might be waiting with timeout (this must be broken already). And this is a ztimer quirk that's now fixed in gnrc. ...

There was a thread flags based solution for xtimer and msg_recv_timeout() (#13429). I wasn't following it because it added 200b of core_thread_flags dependency (and closed all xtimer PRs at some point), and there were multiple competing solutions (I don't think any got in).

There's also a similar issue with msg_recv_timeout() that ztimer inherited from xtimer.

maribu · 2022-11-22T14:12:25Z

And this is a ztimer quirk that's now fixed in gnrc. ...

Not really. The race condition is in GNRC. The ztimer quirk just makes the condition likely to trigger.

I will also look into the ztimer quirk, but one bug at a time.

maribu · 2022-11-22T14:35:48Z

And this solution is O(n)

Note that it is reasonable to assume that the expected message type will be sitting in the front of the queue.

I think the fundamental issue here is that msg and mbox are a poor choice for the backbone of a network stack. And the consequence are an increased complexity to handle the lifecycle of network messages passing through the network stack, an increased chance to loosing IPC messages (or rather not being able to send without blocking) and having to drop network messages, and an increased number of corner cases and race conditions.

I see no way to rework the architecture of GNRC within a reasonable time frame. The alternative is to add the features needed to fix the race with the use of msg/mbox.

A simpler approach to fix the issue as it surfaces in tests/pkg_edhoc_c is a simple mbox_try_peek(). However, can we be sure that the next message in the mbox() is the stale timeout? What if another thread preempted the calling one just before ztimer_remove() is called and multiple network messages are received before the stale timer fires and sends a stale timeout message? I think we cannot be sure. So we really have to iterate through the whole mbox message queue and remove stale timeouts.

I am open to suggestions on how to fix the issue. But other than moving away from mbox/msg (which certainly is out of scope for fixing a bug in an overdue release), I have none.

I am also open to suggestions on how to optimize mbox_try_get_with_type().

E.g. we could just move the first element into the position of the found message. The cost would still be O(n) (due to the search) and we may add a footgun by changing the order.
We could just add a magic emtpy flag into messages and fragment the queue space. It would still be O(n) for the search and we may have a situation with a single message in the queue preventing new messages to be stored non-blockingly, regardless of the size of the queue.

I agree that this sucks. To me, the PR contains the option that sucks the least.

Add a non-blocking function that allows draining the mbox queue from message of a certain type, while keeping the rest of the messages in the mbox intact for later consumption.

`gnrc_sock_recv()` blocks waiting on an `mbox` when receiving. The timeout implementation just sends a magic timeout message to that mbox when the timeout timer fires. The implementation assumed that the head of the mbox will always contain either the received message, or the magic timeout message, but never both. This assumption doesn't hold when the timeout fires after the message was received but before the timer was stopped. This can result in the subsequent call to `gnrc_sock_recv()` to incorrectly fail with a timeout right away (without actually waiting), or to a wild pointer (`_TIMEOUT_MAGIC`) being freed in `gnrc_netreg_unregister()` (whatever function call comes next). This fixes the issue by discarding a stale timeout message (if any) from the mbox while keeping the mbox queue otherwise intact.

When freeing any stale pktsnips from stale messages in the mbox, make sure that the messages actually contains a pktsnip before freeing.

kaspar030 · 2022-11-22T15:08:31Z

I am open to suggestions on how to fix the issue. But other than moving away from mbox/msg (which certainly is out of scope for fixing a bug in an overdue release), I have none.

I'm just realizing that gnrc_sock is hand-rolling sth like mbox_recv_timeout(). Would a non-leaking version of that fix the issue?

Anyhow,

not breaking on an unexpected message is fixable (and this PR adds that)
not timing out on a stale timeout message is fixable (with ztimer's msg_receive_timeout() that wouldn't happen as the return value for timeout or stale message is different)
and, a mbox_recv_timeout() that doesn't leak is doable (my preferred solution)

If this solution works and fixes the issue, can we just push it to the release branch, and find a proper solution for the future? Or, go with this now (as its already done) and let time solve this differently if pressing enough?

(maybe mark the added mbox function somehow so there'll not be users should we want drop it?)

maribu · 2022-11-22T17:03:26Z

I salvaged the uncontroversial parts. Also, the wild pointer freeing is giving me much more headache than the premature timeout error. After all, code that needs to communicate over network will have some fault tolerance and error handling in it, as in real life network communication things go wrong. A premature timeout due to a race condition is something that is not ideal, but should not break real world applications.

I agree that a ztimer_mbox_get_timeout() is the best presented solution to fix premature timeouts on subsequent calls.

maribu · 2022-11-22T18:19:27Z

Btw., the doc of ztimer_msg_receive_timeout() says:

RIOT/sys/include/ztimer.h

Lines 468 to 469 in 97bc825

    
            * @note: This might function might leave a message with type MSG_ZTIMER in the 
        
            *        thread's message queue, which must be handled (ignored).

This is not encouraging :-/

18632: tests/thread_float: do not overload slow MCUs with IRQs r=benpicco a=maribu ### Contribution description If the regular context switches are triggered too fast, slow MCUs will be able to spent little time on actually progressing in the test. This will scale the IRQ rate with the CPU clock as a crude way too keep load within limits. ### Testing procedure The unit test should now pass on the Microduino CoreRF ``` $ make BOARD=microduino-corerf AVRDUDE_PROGRAMMER=dragon_jtag -C tests/thread_float flash test make: Entering directory '/home/maribu/Repos/software/RIOT/tests/thread_float' Building application "tests_thread_float" for "microduino-corerf" with MCU "atmega128rfa1". [...] text data bss dec hex filename 12834 520 3003 16357 3fe5 /home/maribu/Repos/software/RIOT/tests/thread_float/bin/microduino-corerf/tests_thread_float.elf avrdude -c dragon_jtag -p m128rfa1 -U flash:w:/home/maribu/Repos/software/RIOT/tests/thread_float/bin/microduino-corerf/tests_thread_float.hex [...] Welcome to pyterm! Type '/exit' to exit. READY s START main(): This is RIOT! (Version: 2022.10-devel-858-g18566-tests/thread_float) THREADS CREATED Context switch every 3125 µs { "threads": [{ "name": "idle", "stack_size": 192, "stack_used": 88 }]} { "threads": [{ "name": "main", "stack_size": 640, "stack_used": 220 }]} THREAD t1 start THREAD t2 start THREAD t3 start t1: 141.443770 t3: 141.466810 t1: 141.443770 t3: 141.466810 t1: 141.443770 t3: 141.466810 t1: 141.443770 t3: 141.466810 t1: 141.443770 t3: 141.466810 t1: 141.443770 t3: 141.466810 t1: 141.443770 make: Leaving directory '/home/maribu/Repos/software/RIOT/tests/thread_float' ``` (~~Note: The idle thread exiting is something that should never occur. I guess the culprit may be `cpu_switch_context_exit()` messing things up when the main thread exits. But that is not directly related to what this PR aims to fix. Adding a `thread_sleep()` at the end of `main()` does indeed prevent the idle thread from exiting.~~ Update: That's expected. The idle thread stats are printed on exit of the main thread, the idle thread does not actually exit.) ### Issues/PRs references Fixes #16908 maybe? 18950: tests/unittests: add unit tests for core_mbox r=benpicco a=maribu ### Contribution description As the title says ### Testing procedure The test cases are run on `native` by Murdock anyway. ### Issues/PRs references Split out of #18949 19030: tests/periph_timer_short_relative_set: improve test r=benpicco a=maribu ### Contribution description Reduce the number lines to output by only testing for intervals 0..15 to speed up the test. In addition, run each test case 128 repetitions (it is still faster than before) to give some confidence the short relative set actually succeeded. ### Testing procedure The test application should consistently fail or succeed, rather than occasionally passing. ### Issues/PRs references None 19085: makefiles/tests/tests.inc.mk: fix test/available target r=benpicco a=maribu ### Contribution description `dist/tools/compile_and_test_for_board/compile_and_test_for_board.py` relies on `make test/available` to check if a test if available. However, this so far did not take `TEST_ON_CI_BLACKLIST` and `TEST_ON_CI_WHITELIST` into account, resulting in tests being executed for boards which they are not available. This should fix the issue. ### Testing procedure #### Expected to fail ``` $ make BOARD=nrf52840dk -C tests/gcoap_fileserver test/available $ make BOARD=microbit -C tests/log_color test/available ``` (On `master`, they succeed, but fail in this PR.) #### Expected to succeed ``` $ make BOARD=native -C tests/gcoap_fileserver test/available $ make BOARD=nrf52840dk -C tests/pkg_edhoc_c test/available $ make BOARD=nrf52840dk -C tests/log_color test/available ``` (Succeed in both `master` and this PR.) ### Issues/PRs references None Co-authored-by: Marian Buschsieweke <[email protected]>

18950: tests/unittests: add unit tests for core_mbox r=benpicco a=maribu ### Contribution description As the title says ### Testing procedure The test cases are run on `native` by Murdock anyway. ### Issues/PRs references Split out of #18949 Co-authored-by: Marian Buschsieweke <[email protected]>

maribu requested review from PeterKietzmann, cgundogan, haukepetersen, kaspar030 and miri64 as code owners November 22, 2022 12:35

github-actions bot added Area: core Area: RIOT kernel. Handle PRs marked with this with care! Area: network Area: Networking Area: sys Area: System Area: tests Area: tests and testing framework labels Nov 22, 2022

miri64 reviewed Nov 22, 2022

View reviewed changes

benpicco reviewed Nov 22, 2022

View reviewed changes

core/mbox.c Show resolved Hide resolved

core/mbox.c Outdated Show resolved Hide resolved

core/include/mbox.h Outdated Show resolved Hide resolved

sys/net/gnrc/netreg/gnrc_netreg.c Outdated Show resolved Hide resolved

miri64 requested changes Nov 22, 2022

View reviewed changes

miri64 added the Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors) label Nov 22, 2022

maribu added 4 commits November 22, 2022 15:37

core/mbox: add mbox_try_get_with_type()

8329acd

Add a non-blocking function that allows draining the mbox queue from message of a certain type, while keeping the rest of the messages in the mbox intact for later consumption.

tests/unittests: add unit tests for core_mbox

e0ea259

sys/net/grnc/netreg: avoid freeing wild pointers

f988919

When freeing any stale pktsnips from stale messages in the mbox, make sure that the messages actually contains a pktsnip before freeing.

maribu force-pushed the sys/net/gnrc/sock/bugfix branch from 0652d54 to f988919 Compare November 22, 2022 14:38

This was referenced Nov 22, 2022

tests/unittests: add unit tests for core_mbox #18950

Merged

sys/net/grnc/netreg: avoid freeing wild pointers #18951

Merged

benpicco mentioned this pull request Nov 23, 2022

sys/net/grnc/netreg: avoid freeing wild pointers [backport 2022.10] #18961

Merged

maribu mentioned this pull request Nov 25, 2022

core/thread_flags: add thread_flags_wait_any_or_mbox() #18977

Open

		msg_t garbage_bin;
		mbox_try_get_with_type(&reg->mbox, &garbage_bin, _TIMEOUT_MSG_TYPE);

Conversation

maribu commented Nov 22, 2022

Contribution description

Testing procedure

Issues/PRs references

Uh oh!

miri64 Nov 22, 2022

Choose a reason for hiding this comment

Uh oh!

benpicco Nov 22, 2022

Choose a reason for hiding this comment

Uh oh!

miri64 Nov 22, 2022

Choose a reason for hiding this comment

Uh oh!

miri64 Nov 22, 2022

Choose a reason for hiding this comment

Uh oh!

maribu Nov 22, 2022

Choose a reason for hiding this comment

Uh oh!

benpicco left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

miri64 Nov 22, 2022

Choose a reason for hiding this comment

Uh oh!

maribu Nov 22, 2022

Choose a reason for hiding this comment

Uh oh!

kaspar030 commented Nov 22, 2022

Uh oh!

maribu commented Nov 22, 2022

Uh oh!

maribu commented Nov 22, 2022

Uh oh!

kaspar030 commented Nov 22, 2022

Uh oh!

maribu commented Nov 22, 2022

Uh oh!

maribu commented Nov 22, 2022

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants