gnrc_sixlowpan_frag: check if own message queue is full#10679
gnrc_sixlowpan_frag: check if own message queue is full#10679gschorcht merged 2 commits intoRIOT-OS:masterfrom
Conversation
When issueing the sending of the next fragment the current version of `gnrc_sixlowpan_frag` doesn't check if the queue is full. This leads to leakage of the packet buffer, since when it is full, the package never gets released. This change adds a checks and error exits in case the queue is full.
|
The bug found here #10672 (comment) definitely becomes reproducably fixed. |
|
@gschorcht would be interesting if this already fixes the issue with ESP-now, since the remaining race-condition could also be an issue with native. |
Unfortunately it's still happening for me. Edit: Another observation: if I reboot the faulted node, the other node gets replies again in 50ms, but the now rebooted one still gets replies within 250ms, increasing continously, until this time the other node goes into the error state. |
|
@TimoRoth can you give the output of the |
|
Scratch that, hard-reset the nodes, and it still happened pretty much right away. Seems to be entirely random or related to some unknown factor. pktbuf output
|
|
(next time as a paste bin / gist please) From the patterns I'm seeing this looks very similar to what I am getting on native. |
|
(But I have to analyze it further to be sure, but I'm not even done with my analysis of the native results...) |
|
Those are however, that I am pretty sure of mostly received 6Lo fragments, so independent of the error fixed here. Hints why I think so, the snips are in receive order and the first snip points to something that most definitely looks like a 6Lo fragment. I'll post a picture later to illustrate this. |
|
I have something better than an illustration: a fix! 💃 Will prepare a PR. |
Chunks with ESP-NOW or 6Lo packets below. 6Lo packet with tag
|
| gnrc_pktbuf_release(fragment_msg->pkt); | ||
| fragment_msg->pkt = NULL; | ||
| return; | ||
| goto error; |
There was a problem hiding this comment.
When I learned programming C over 30 years ago, it was an absolute nogo to use goto statements. Our lecturer said, goto breaks the execution structure and if you need the goto statement, there is something wrong with your structure. He said also goto is very bad and something like BASIC. It must not be used.
Has this theory changed?
Even it seems sometimes easier to have a goto, especially for exits on error, I never used it all the years. I know, your intention is to avoid code duplication.
There was a problem hiding this comment.
I can remove them. I believe not using goto is a little bit evangelical, since especially in this one case (handle error case and exit) it is very useful and does not really breaks the execution structure. There are far greater since that can be done with goto where I agree it should not be used ;-).
There was a problem hiding this comment.
It is ok to me to leave the goto statement here. I think it is very comfortable to have a single exit point to avoid code duplication. The alternative would be to have very deep if () ... else if ... structures which doesn't improve the readability of source code.
There was a problem hiding this comment.
Ups. Sorry, it should be just a question no change request.
There was a problem hiding this comment.
Sorry if it sounded like a change request.
There was a problem hiding this comment.
Welp, now I do the same, just without goto ;-)
They are in receive order so they are received. E.g. this part ( and it points to a netif_hdr struct starting I fixed that bug (and also provided a graphical analysis of my dump) in #10680. |
|
Yes, I saw it already and I'm still testing it together with #10680, I will report the results there. |
Contribution description
When issueing the sending of the next fragment the current version of
gnrc_sixlowpan_fragdoesn't check if the queue is full. This leads to leakage of the packet buffer, since when it is full, the package never gets released.This change adds a checks and error exits in case the queue is full.
Testing procedure
See #10672. Packet buffer will still run full though.
Issues/PRs references
Found on the hunt for #10672, but doesn't fix it fully.