-
-
Notifications
You must be signed in to change notification settings - Fork 8.6k
esp32: Bump esp_tinyusb component, add component version lock files to source control. #17960
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
esp32: Bump esp_tinyusb component, add component version lock files to source control. #17960
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #17960 +/- ##
=======================================
Coverage 98.38% 98.38%
=======================================
Files 171 171
Lines 22296 22296
=======================================
Hits 21937 21937
Misses 359 359 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Code size report: |
91644fb to
6a61356
Compare
|
@projectgus I built 6a61356, and will test when I have access to the boards tomorrow. FWIW: my steps were
If there is more to it, please kindly let me know, as I am not yet very familiar with the build process. |
Keep in mind that it's not an uncommon occurrence for Espressif to yank published component versions. That can make pro and con arguments both for keeping track of versions manually and for letting ESP-IDF resolve versions on its own so I won't weigh on this :) |
Although I agree there's pros and cons, I still think explicitly tracking dependencies is a win for this kind of scenario. We don't currently have any way to tell if someone is building with a local yanked component when they submit a bug report, but with this change they'd have to have explicitly chosen to do that. Also, if we're building with a yanked component and this PR then CI should fail and we get the hint to go and update it. The more complex case, as I see it, is that we won't automatically build MicroPython with bugfix releases of components, without manually running
Overall I am very interested in your thoughts on this approach, both for this specific TinyUSB problem and generally - so feel free to weigh in if you think it's a step in the wrong direction. |
Thanks @gohai, those are the steps! To flash in the same step you can do |
|
Although the lock files add complexity (at least, in the sense of adding more files) it sounds like it's worth it to keep proper track of which component versions are in use. |
|
@projectgus This works fine on my USB-OTG-connected ESP32-S3-Zero. (This one had issues copying files earlier.) However, I am still seeing reproducible repl hangs on the ESP32S3-S3-DEV-KIT-NXR8. See e.g. below (just starting repl made it hang):
|
|
@gohai So you get this immediately from simply running |
|
@projectgus I'd say I don't have a preference on how the tracking is performed, as long as progress is made :) I think it all boils down on how much of a maintenance burden it can be to track versions manually. From my past experiences I can sum it all up as this (virtually none of these can be MicroPython's concern, I'm dumping them here in the hope they may help make a more informed decision): how often dependencies need to be updated, how much time it takes to validate those changes, and what's the turnaround time to get an emergency fix from Espressif into MicroPython (eg. potential security implications, that's yet another Pandora's box that should better stay shut) or how quickly a yanked component can be gotten rid of in the published repo lockfiles. For emergency changes, if people in charge for releasing the Espressif components you rely on are based out of Shanghai, then you folks being a couple hours ahead of them is not a big deal, otherwise you may want to keep that in mind too. Finally, I wonder if this approach is compatible with https://github.com/agatti/micropython-idf-component. Granted, it's up to me to work around your choices in this case if the end result is not compatible so I'm just talking out loud here. Incidentally, @dpgeorge if you ever plan to release MicroPython as an ESP-IDF component and put it up into Espressif's registry, feel free to take as much code/ideas/whatnot from that repo. I've licensed that as MIT in order to make the process easier on your end if an official component is an option you want to consider. |
Just to be extra sure, does that board with the official 1.26 release firmware and with a firmware with updated IDF components work if you connect to it from Linux, Windows, or a virtualised Linux (so the full USB device is passed through)? |
@projectgus It behaves identical to what I saw on 1.26 with that board... it sometimes hangs at startup, but if it does, running |
|
Failures/crashes were seen here too, but with some differences. I applied this PR to Tests were done with two ESP32-S3 N16R8 modules, one with an empty file system and the other with the file system 90% full. The test consisted in copying a 10 kbyte binary file with mpremote to the microcontroller, always with a different target filename. With the microcontroller with empty file system, I see 3 crashes in 200 attempts. On the USB COM port, I can see many With the ESP32-S3 with a full file system and I see 5 failures every 10 attempts (50%) with the same reboot/guru meditation error, or with this This might account for differences of behaviour between tests or boards. The full file system makes the file access slower, i.e. more flash reads/writes for the same transfer, probably making the occurrence of an error more likely. I can't reproduce the ESP32-S3 crash on the REPL. For example |
|
@agatti Apologies that this is taking longer than I had hoped for. I don't have a PC or Linux machine readily available, but I'll try to get my hands on it this Wednesday or Thursday. |
|
@agatti I managed to grab a Windows 11 machine at school. Here my results with the same board (ESP32S3-S3-DEV-KIT-NXR8).
This is it freezing e.g. after a |
|
Thanks @gohai, that was extremely helpful! I was able to reproduce the problem here, and I believe I understand the root cause. I don't yet have a good fix for it, but looking into it... |
|
These USB-CDC issues seem to be a combination of a few things:
So, the solution is obviously to set the So, we need a new set of behaviour where:
|
6a61356 to
e29b383
Compare
|
I've pushed another fix here for the @gohai Is there any chance that you could please test the latest version of this PR on your original MacOS install again, please? |
|
@projectgus I built e29b383 and will test tomorrow with the two boards on macOS and Windows 11. Would you prefer if I test with, or without the mpremote changes in #17999? |
For MacOS, can test with either. For Windows, can test with either but you might see some dropped bytes if you don't have the mpremote changes included (but you shouldn't see a hang any more). |
|
@projectgus e29b383 fixes all issues I saw on macOS and Windows 11 with the ESP32S3-S3-DEV-KIT-NXR8 and an (USB-OTG) ESP32-S3-Zero. (I haven't done too much testing at this point, but at least none of my reproducers fail anymore.) Tested: repl and file copying on macOS; repl on Windows [edit: this was with mpremote 1.25.0, fwiw] |
|
I applied the changes of this PR to The basic test is copying repeatedly a 10_000 byte binary file to the ESP32-S3 with Should I also apply other changes before testing? |
|
I'm new to micropython but did manage to test e29b383. I'm on MacOS 15.5 on an M1 Macbook with a ESP32-S3-Zero.
|
e29b383 to
5f812c0
Compare
|
Rather than duplicate the lockfile check in both CI and autobuild.sh, I added it to esp32 port CMakeLists.txt. To distinguish between a warning (for normal developer builds) or a fatal error (for CI and nightly builds where it indicates an ESP-IDF version mismatch), I've added an environment variable We might be able to lean on this flag for other things as well (for example, disabling |
ce03f43 to
730f94e
Compare
|
Thank you, when I run ``git describe --tags --dirty```, I get the correct result. IDK if this is relevant, but when I apply the complete PR, the ESP32S3 crashes about once every 50 or 100 However, when I apply the commit e29b383 only, I see no error. |
That's quite unexpected! Can you please copy and paste the contents of file |
|
I could not reproduce a version that does not fail. I did: After recompiling MicroPython MicroPython version reported as File This is not the first time I get inconsistent results. Probably I don't know enough about git commands.... :-( |
Reported to fix issues reported with serial corruption when interacting with MicroPython from macOS hosts. This work was funded through GitHub Sponsors. Signed-off-by: Angus Gratton <[email protected]>
This allows us to have some things which are fatal errors in CI or nightly builds, but warnings in normal developer builds. This work was funded through GitHub Sponsors. Signed-off-by: Angus Gratton <[email protected]>
This is recommended by Espressif, and it's the only way to ensure everyone builds the same set of component versions. The awkward part is that updating the ESP-IDF version will churn a line in each of these files (and possibly other changes). Adds a build-time check for lock file changes, which is either a warning or a hard error depending on the value of MICROPY_MAINTAINER_BUILD flag (introduced in previous commit). This work was funded through GitHub Sponsors. Signed-off-by: Angus Gratton <[email protected]>
730f94e to
9411709
Compare
@bixb922 Thanks for the extra info, the steps look right to me. I don't think this is necessarily your fault, these kind of issues have a lot of moving parts and can be hard to reproduce reliably. I think we should still merge this PR, as it's confirmed to fix some of the issues reported and doesn't seem to introduce any new ones. Will continue to look into ways to debug the remaining issue on MacOS... |
|
Thanks @projectgus for working on this, and everyone who tested it. I did some basic testing of ESP32_GENERIC_S3 on my Arch Linux set up, and it works fine. But, I also tested RPI_PICO2 and there is a regression in the USB CDC performance 😞 That's due to the subtle change in behaviour of I tested using #15909 and find that the "DATA IN" (in to the host PC) rate drops from about 586 kibytes/second down to about 540 kibytes/second. It's definitely noticeable using that test. Doing some quick hacking, I can regain the performance with the following change: --- a/shared/tinyusb/mp_usbd_cdc.c
+++ b/shared/tinyusb/mp_usbd_cdc.c
@@ -102,6 +102,10 @@ mp_uint_t mp_usbd_cdc_tx_strn(const char *str, mp_uint_t len) {
size_t i = 0;
while (i < len && (mp_uint_t)(mp_hal_ticks_ms() - last_write) < MICROPY_HW_USB_CDC_TX_TIMEOUT) {
uint32_t n = len - i;
+ if (n > CFG_TUD_CDC_EP_BUFSIZE) {
+ n = CFG_TUD_CDC_EP_BUFSIZE;
+ mp_event_wait_ms(1);
+ }
if (tud_cdc_connected()) {
// Limit write to available space in tx buffer when connected.I also get back the performance by making that new I suspect waiting a bit before attempting to write to the CDC lets it drain more in certain situations. @projectgus what was the main reason you changed that function? Maybe we can just leave it as it was? |
The reason I changed it was to make two things unconditional each time around the loop:
This isn't strictly necessary, but I thought it makes it less likely that another transient state mismatch leads to an infinite loop (I don't see a sequence of events in the old version of the code that could lead to that now the FIFO issue is patched, but it's hard to guess at every possible sequence of USB state changes.) I'm not surprised there's a small speed regression, but I am a bit surprised at the patch you posted that reverses it. If it's OK with you then I'll have a further try at tweaking the new loop to fix the speed issue, but otherwise it should be safe to take the change out entirely. |
Yes, that's fine. Hopefully you can reproduce the performance measurements using #15909. |
|
Hmm, I wasn't really able to reproduce those findings. I only see a few kibytes/sec variation between different versions (including the two patches mentioned above), it seems random enough to possibly be cache layout. I did make a change that seems like it can't hurt (evaluating the timeout after writing to the FIFO), but the different speeds I was getting were all very similar. Here's my most recent run: (The runs I did from the current master branch are +/-2 kibytes/sec of this.) |
|
I've retested with a different set up: I had two USB hubs in the way (one was Octoprobe) and have now removed those. So now the RPI_PICO2 is connected directly to a Testing with master, this PR before your recent change, and with the most recent change (titled "Possible performance tweak") I get pretty much the same performance in all those three cases: So... I think we are fine with this. (It's interesting you get quite a bit faster rates, I guess it's due to a faster PC. Although that surprises me, the limiting factor should not be the PC!) Do you want to squash down your latest commit? It looks good to me. (Actually, maybe split the changes to |
46674e7 to
b145925
Compare
Yeah, very surprising. I did some more tests and got even odder results, have left a comment at #15909 (comment)
Good idea, done! Left some rationale in the commit message for future reference as well. |
dpgeorge
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for updating, this looks good now.
This flag is in the main branch of TinyUSB, included in Espressif since their v0.18.0~3 component release (but it's not actually in TinyUSB v0.18.0 release). Setting the flag is needed for the USB device not to block waiting for space in the FIFO if the host is disconnected. This work was funded through GitHub Sponsors. Signed-off-by: Angus Gratton <[email protected]>
This is related to the previous commit (where due to the new config flag this loop could end up stuck indefinitely if the USB host was disconnected). The previous loop could maybe still get stuck if the low-level USB state and the high-level USB state got out of sync. (Not clearly possible, but hard to say definitely not possible.) To be "belts and braces" careful: - Always run mp_usbd_task() each time around the loop to progress the state. - Always evaluate the timeout if we fail to write anything to the FIFO. This work was funded through GitHub Sponsors. Signed-off-by: Angus Gratton <[email protected]>
b145925 to
a43e385
Compare


Summary
Three changes: two related to ESP-IDF dependencies, one to TinyUSB:
mpremotesends corrupted fs hook when mounting local directory #17560 (probably).tx_overwritabe_if_not_connectedflag released by Espressif in their TinyUSBv0.18.0~3component (have reported this versioning inconsistency to them here). Updating with this change otherwise silently changes the CDC FIFO behaviour to "block" and can cause hangs if the FIFO is full and TinyUSB thinks it is disconnected (due to DCD not set). Also rewrite this function to make sure TinyUSB task is always called, avoiding the possibility of a race between the USB host state and TinyUSB processing the state change.v0.18.0~3, which seems to be one of the reasons release v1.26 has issues with ESP32-S3 USB.MICROPY_MAINTAINER_BUILDto distinguish these.Windows users may still have issues with lost bytes on ESP32-S3 native USB unless they either roll back to mpremote v1.25 or use #17999
Testing
Trade-offs and Alternatives
0.18.0in December but Espressif have tagged 0.18.1, 0.18.2, 0.18.3, 0.18.4 since then... If we use upstream TinyUSB then it's likely we'll eventually run into issues that Espressif has already fixed in their fork, or that we should report to Espressif but can't report them because we're not using their supported TinyUSB.idf_component.ymlusing==to get specific versions. However, this still doesn't provide any tracking of when the ESP-IDF version has changed from whatever the current recommended version is...This work was funded through GitHub Sponsors.