Use TAILQ_FOREACH_SAFE whenever we do TAILQ_REMOVE during traversal#11
Use TAILQ_FOREACH_SAFE whenever we do TAILQ_REMOVE during traversal#11pbui wants to merge 1 commit intoNetworkConfiguration:masterfrom
Conversation
We must use TAILQ_FOREACH_SAFE instead of TAILQ_FOREACH if we call TAILQ_REMOVE on the current item and free it. Failure to do so can lead to bad behavior such as infinite loops.
rsmarples
left a comment
There was a problem hiding this comment.
These changes are not necessary.
|
|
||
| /* Remove existing timeout if present. */ | ||
| TAILQ_FOREACH(t, &eloop->timeouts, next) { | ||
| TAILQ_FOREACH_SAFE(t, &eloop->timeouts, next, tt) { |
There was a problem hiding this comment.
This is not needed because we break from the loop immediately after removing the matching item.
|
|
||
| state = IPV4_STATE(addr->iface); | ||
| TAILQ_FOREACH(ap, &state->addrs, next) { | ||
| TAILQ_FOREACH_SAFE(ap, &state->addrs, next, ian) { |
There was a problem hiding this comment.
This is not needed because we break from the loop after removing the matching address.
|
|
||
| state = IPV6_STATE(ia->iface); | ||
| TAILQ_FOREACH(ap, &state->addrs, next) { | ||
| TAILQ_FOREACH_SAFE(ap, &state->addrs, next, ian) { |
There was a problem hiding this comment.
This is not needed because we break from the loop after removing the matching address.
Those commits do not resolve the high CPU usage problem. As you can see in https://github.com/void-linux/void-packages/pull/15858/files, I included those commits in what I thought should fix the bug, but it had no effect as I was still able to experience the high CPU after switching networks. |
|
I actually managed to trigger the bug with these changes... so this is clearly not the fix. However, I was able to produce a backtrace: It looks like it is stuck in this loop in I can close this PR since it does not address the issue, but how should I proceed on the bug itself? |
|
So what appears to happen is that when I transition from wireless to wireless + ethernet (or wireless + ethernet to just wireless), I get stuck in
Because of this, a match is never found and it loops forever... But there are other strange properties. For instance, Moreover, in I'm not sure what to make of this or how to fix it... but I hope this helps in tracking down this bug. Here is some more context (I printed out select values for variables) on that backtrace... |
|
Also, here are the logs from the process: After this, the process hangs with the infinite loop. |
|
This should be fixed in 73ac184. |
|
Looks like that fixed it... Thanks! |
We must use
TAILQ_FOREACH_SAFEinstead ofTAILQ_FOREACHif we callTAILQ_REMOVEon the current item and free it.Failure to do so can lead to bad behavior such as infinite loops.
A few users of Void Linux are currently experiencing high CPU usage due to this bug in 8.1.1: void-linux/void-packages#15858
https://www.reddit.com/r/voidlinux/comments/donf2w/dhcpcd_consuming_100cpu/
It is difficult to reliably trigger this bug, but when it does occur, a backtrace showed that the process was stuck in
ipv4ll_dropwhich callsipv4_deladdr, which in turn had the unsafe traversal where it callsTAILQ_REMOVEinside aTAILQ_FOREACHrather thanTAILQ_FOREACH_SAFE.Further auditing of the code should be performed to ensure the proper loop traversal macro is used to avoid such bugs (I fixed three instances where I believe
TAILQ_FOREACH_SAFEshould be used instead ofTAILQ_FOREACH, but there may be others).