pkg/lwip: fix netdev concurrency issues#18479
Conversation
@miri64 does this assumption holds? |
|
lwIP internally synchronizes all acceses via a global lock and all internal functions assert that the lock is taken when they are called. I guess that lock is implemented via a mutex in RIOT, but I haven dived deep enough into the implementation to say be sure. |
maribu
left a comment
There was a problem hiding this comment.
ACK. Code looks good and I trust your testing.
I believe this can fix a lot of sporadic and hard to reproduce issues.
| static void _event_cb(netdev_t *dev, netdev_event_t event); | ||
| static void *_event_loop(void *arg); | ||
|
|
||
|
|
There was a problem hiding this comment.
There already is an empty above
|
This has been requested for backporting. I don't see the big picture of its complexity yet, but am marking it for "needs backport" if only to make sure I'll have it in the "known issues" section. |
11b7d63 to
a2bf203
Compare
|
amended and squashed direclty |
Contribution description
This PR fixes a concurrency issue in LWIP, as a result of calling
netdevfunctions from different context.Traditionally our netdev driver model assumes implicitly that all
netdevfunctions run on the same thread. As it is now, LWIP calls all send functions from the caller thread (e.gmain), but processes ISR on a higher priority thread. This not only creates concurrency issues in existing devices, but renders other modules such as the SubMAC unusable...To (temporally) solve this, I implemented some recursive mutex lock functions to acquire the device before calling any internal operation. In the future it would be better to schedule all transmissions from a single thread.
Although this won't make the current situation worse, I'm assuming LWIP uses mutex internally to prevent concurrency issues between LWIP functions called from different context. If this doesn't hold, we will need to proceed with the single thread solution.
EDIT: I also adapted the
netdevreturn calls to the newer API. Returning 0 is not considered an error anymore (which is whatnetdev_ieee802154_submacdoes).Testing procedure
Test that LWIP still works as expected (e.g
tests/lwip).Issues/PRs references
Found while porting the KW2XRF to LWIP #18465