cpu/esp8266: Fix crashes of the ESP8266 when rebooting with or disconnecting from WiFi #22014
Merged
leandrolanzieri merged 2 commits into RIOT-OS:master on Jan 27, 2026
Conversation
With the ESP8266, this function is only called by `ieee80211_sta_new_state`, with interrupts of all levels disabled, for example as a result of executing `esp_wifi_disconnect`. If `ztimer_sleep` is then called with interrupts disabled, for some reason this leads to memory corruption when interrupts are re-enabled afterwards. The only way to avoid this is to re-enable interrupts before calling `ztimer_sleep`. Since debugging ESP8266 code is very limited and sometimes impossible due to there being only one hardware breakpoint and a lot of closed binary code involved when calling `esp_wifi_disconnect`, it was not possible to find the reason for the memory corruption. An exception occurs after executing `esp_wifi_disconnect` in `ztimer_handler` when `_callback_unlock_mutex` is called with a mutex parameter that points to irrelevant read-only memory in IROM. Therefore, enabling the interrupts here is currently the only way to avoid the exception, even though this is only a hack.
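The workaround described above can be sketched on a host machine as follows. This is only a minimal model, not the code from the PR: all `sim_*` helpers and `sketch_vTaskDelay` are hypothetical stand-ins for the RIOT interrupt and `ztimer_sleep` primitives, and the source does not state whether the real fix restores the previous interrupt state afterwards, so the sketch does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical host-side stand-ins for the target primitives. */
static bool sim_irq_enabled = true;          /* simulated global IRQ state */
static bool sim_slept_with_irq_off = false;  /* set if the hazard occurred */

static void sim_irq_disable(void) { sim_irq_enabled = false; }
static void sim_irq_enable(void)  { sim_irq_enabled = true; }

/* Stand-in for ztimer_sleep: it only records whether it was entered
 * with interrupts disabled, which is the situation the PR avoids. */
static void sim_ztimer_sleep_ms(uint32_t ms)
{
    (void)ms;
    if (!sim_irq_enabled) {
        sim_slept_with_irq_off = true;
    }
}

/* Sketch of the workaround: the caller may have disabled interrupts of
 * all levels, so re-enable them before going to sleep. */
void sketch_vTaskDelay(uint32_t ticks, uint32_t ms_per_tick)
{
    sim_irq_enable();
    sim_ztimer_sleep_ms(ticks * ms_per_tick);
}

/* Models the delay being requested from a section that runs with
 * interrupts disabled; returns true if the sleep started with IRQs off. */
bool demo_delay_from_critical_section(void)
{
    sim_irq_enabled = true;
    sim_slept_with_irq_off = false;

    sim_irq_disable();
    assert(!sim_irq_enabled);

    sketch_vTaskDelay(10, 10);
    return sim_slept_with_irq_off;
}
```

In this model, `demo_delay_from_critical_section()` returns `false`, because the sleep is never entered with interrupts disabled. It does not reproduce the memory corruption itself, whose cause remains unknown according to the description.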
Even though `esp_wifi_stop` is implicitly called later in the restart process by `esp_restart`, it is cleaner to stop the WiFi interface here before saving the RTT counters.
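The ordering argued for here can be illustrated with a small host-side sketch. The `sim_*` functions and `sketch_pm_reboot` are hypothetical stand-ins that only record the call order; they are not the actual `pm_reboot` implementation, and the real `esp_restart` would not return.

```c
#include <assert.h>
#include <stddef.h>

/* Steps of the simulated reboot sequence. */
enum step { STEP_WIFI_STOP, STEP_RTT_SAVE, STEP_RESTART };

static enum step trace[8];
static size_t trace_len = 0;

/* Append one step to the trace, ignoring overflow. */
static void record(enum step s)
{
    if (trace_len < sizeof(trace) / sizeof(trace[0])) {
        trace[trace_len++] = s;
    }
}

/* Hypothetical stand-ins: each one only records that it was called. */
static void sim_esp_wifi_stop(void)     { record(STEP_WIFI_STOP); }
static void sim_rtt_save_counters(void) { record(STEP_RTT_SAVE); }
static void sim_esp_restart(void)       { record(STEP_RESTART); }

/* Sketch: stop the WiFi interface explicitly before the RTT counters
 * are saved, instead of relying on the implicit stop during restart. */
void sketch_pm_reboot(void)
{
    sim_esp_wifi_stop();
    sim_rtt_save_counters();
    sim_esp_restart();
}

/* Runs the sketch once and checks the recorded order. */
void check_reboot_order(void)
{
    trace_len = 0;
    sketch_pm_reboot();
    assert(trace_len == 3);
    assert(trace[0] == STEP_WIFI_STOP);
    assert(trace[1] == STEP_RTT_SAVE);
    assert(trace[2] == STEP_RESTART);
}
```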
Force-pushed from a998f55 to e2d0e28
Contributor
@gschorcht should we also backport this?
Contributor
Author
I don't know how often this actually happens. To be honest, I wasn't aware of this problem before. @benpicco What do you think?
Contributor
I just tested the fix and can confirm it works for me.
[Collapsed logs: Without the PR; With this PR]
benpicco approved these changes on Jan 27, 2026
Contributor
Let's go with it then
Contributor
@gschorcht let's also backport this
Contributor
Author
Thanks for reviewing and merging.
Contribution description
This PR fixes the problem described in issue #21558.
With the ESP8266, `vTaskDelay` is only called by `ieee80211_sta_new_state`, with interrupts of all levels disabled, for example as a result of executing `esp_wifi_disconnect`. `vTaskDelay` uses `ztimer_sleep`, and if `ztimer_sleep` is then called with interrupts disabled, it leads to memory corruption for some reason when interrupts are re-enabled afterwards. The only way to avoid this is to re-enable interrupts before calling `ztimer_sleep` in `vTaskDelay`.
Since debugging ESP8266 code is very limited and sometimes impossible due to there being only one hardware breakpoint and a lot of closed binary code involved when calling `esp_wifi_disconnect`, it was not possible to find the reason for the memory corruption. An exception occurs after executing `esp_wifi_disconnect` in `ztimer_handler` when `_callback_unlock_mutex` is called with a mutex parameter that points to a wrong address in IROM.
Therefore, enabling the interrupts in `vTaskDelay` before calling `ztimer_sleep` is currently the only way to avoid the exception, even though this is only a hack.
The PR includes a small improvement of the `pm_reboot` function. Even though `esp_wifi_stop` is implicitly called later in the restart process by `esp_restart`, it is cleaner to stop the WiFi interface here before saving the RTT counters.
Testing procedure
Compile and flash the GNRC networking example for any ESP8266 with WiFi enabled:
Without the PR, using `reboot` after connecting to the WiFi AP will lead to an exception. With the PR, it should work.
Issues/PRs references
Fixes #21558