Skip to content

cpu/native: Gardening/QoL#21283

Merged
benpicco merged 13 commits intoRIOT-OS:masterfrom
carl-tud:native-gardening
Mar 12, 2025
Merged

cpu/native: Gardening/QoL#21283
benpicco merged 13 commits intoRIOT-OS:masterfrom
carl-tud:native-gardening

Conversation

@carl-tud
Copy link
Copy Markdown
Contributor

@carl-tud carl-tud commented Mar 8, 2025

Hi!

It’s spring, so I thought it’s time for some gardening in native/cpu.

I’ve got a few PRs lined up, including one enabling the native board on arm64 (#20739) and one resurrecting macOS support (TBA).
While writing these I’ve found myself unable to really understand what’s going on in the native/cpu code. So I thought if I were to add documentation and name a few things more descriptively, this might help anyone reading it in the future.

Below, I’ll give a brief overview of the motivation behind certain changes.

_GNU_SOURCE/_XOPEN_SOURCE As a matter of ensuring resilience against future modifications to system headers, I set _GNU_SOURCE (or _XOPEN_SOURCE for BSD/Darwin, respectively) for all code in cpu/native.

Previously, these were set inconsistently, potentially leading to redefinitions, especially if you try to set them globally. This technique also proved to be quite dangerous on macOS – take <ucontext.h> as an example:

If _XOPEN_SOURCE isn’t set, then the context struct will lack a storage buffer. [make|swap|set|get]context however will assume there is indeed a storage buffer and write happily to memory far off earth. And said bug is not easy to find if some other system header imports ucontext.h before and you forgot to set _GNU../_XOPEN…. Then your precious _GNU…/_XOPEN… definitions around <ucontext.h> become worthless. 💥

Context switches Previously, code manipulating ucontexts was duplicated; same goes for retrieving the ucontext_t from the stack buffer. I wrapped them in zero-runtime-cost static inline’s (_context_set_fptr, _context_get_fptr). Note that these aren’t called set_pc intentionally. There are platforms where you cannot set PC, i.e., they require another technique of transferring control back to userspace. Stay tuned for arm64 ;)
makecontext madness Pointers passed as thread arguments were previously truncated to 32 bits. This should crash, don’t really know why it didn’t already.

So on 64-bit platforms, I do not use makecontext’s mechanism of supplying thread arguments. Instead, we first go to _start_task_func64 and pass the 64-bit argument in a register. I made sure that register isn’t mutated in glibc/libplatform.

Then in _start_task_func64, we invoke the real thread function and move the 64-bit argument into the proper register according to the arch’s ABI. No more casting void* to int.

Documentation The doxygen markup added *encloses* symbols in *named sections*. Everytime you see an @name followed by @{, that’s a section. To make finding sections easier, I added corresponding marks (MARK: -) which show up in editors in the minimap (e.g., VS Code, Xcode). Also fixed incorrect uses of @ingroup.
Logging While (trying) to debug native/cpu, it became apparent that understanding logs without knowing where they come from is hard. Prefixing logs with the enclosing function’s name does not help, unless you already know what said function *does*. So, for each *component*— say CPU, IRQ implementation, startup code — I prefixed logs with the rough equivalent of the filename.

@carl-tud carl-tud requested a review from kaspar030 as a code owner March 8, 2025 16:35
@github-actions github-actions bot added Platform: native Platform: This PR/issue effects the native platform Area: build system Area: Build system Area: cpu Area: CPU/MCU ports Area: sys Area: System labels Mar 8, 2025
@carl-tud carl-tud marked this pull request as draft March 8, 2025 16:40
@carl-tud carl-tud force-pushed the native-gardening branch 3 times, most recently from 73f407d to 765cbaf Compare March 9, 2025 14:12
@carl-tud carl-tud marked this pull request as ready for review March 9, 2025 14:16
@carl-tud carl-tud force-pushed the native-gardening branch 2 times, most recently from 4b402b9 to 308a30a Compare March 10, 2025 10:31
@mguetschow mguetschow added the CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR label Mar 10, 2025
@riot-ci
Copy link
Copy Markdown

riot-ci commented Mar 10, 2025

Murdock results

✔️ PASSED

8275208 drivers/pcd8544: rename CHAR_WIDTH

Success Failures Total Runtime
10271 0 10271 13m:55s

Artifacts

Copy link
Copy Markdown
Contributor

@mguetschow mguetschow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for your spring clean-up, and for splitting the commits for easier review!

This will need some more in-depth review, especially for the second half of the commits, preferably by someone more experienced with the native code. Just some brief notes from my side while having a first look.

@mguetschow mguetschow added the CI: no fast fail don't abort PR build after first error label Mar 10, 2025
@carl-tud
Copy link
Copy Markdown
Contributor Author

carl-tud commented Mar 10, 2025

CI fails because of said _GNU_SOURCE issue. I'll have a look

U: fixed

@carl-tud
Copy link
Copy Markdown
Contributor Author

carl-tud commented Mar 10, 2025

CI fails because of said _GNU_SOURCE issue. I'll have a look

Okay, funny. _GNU_SOURCE somehow makes some system header define CHAR_WIDTH. In the PDC8544 driver, someone also defined CHAR_WIDTH. I think this shouldn't have been named that way in the first place.

@mguetschow what do you think? I'd say I rename it to CHAR_PIXEL_WIDTH

@github-actions github-actions bot added the Area: drivers Area: Device drivers label Mar 10, 2025
@mguetschow
Copy link
Copy Markdown
Contributor

@mguetschow what do you think? I'd say I rename it to CHAR_PIXEL_WIDTH

Sounds good to me 👍

Copy link
Copy Markdown
Contributor

@benpicco benpicco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, thank you for cleaning this up!
I still dream of a future where we get away from makecontext() and just use pthreads and signals, so no more assembly required - but that would be a major rework…

I also don't think this will fix the dreaded #495 yet, but it's a nice cleanup overall.

@carl-tud
Copy link
Copy Markdown
Contributor Author

Thanks for the review.

I would also like to get away from makecontext, yet from looking into pthreads briefly, this doesn't seem possible to me (We need extremely tight control over the context switch, and that's something only (to my knowledge) ucontext.h can do... It's basically the same as falling back to __asm("mov %cup, 0xc0ffee") inline assembly, ucontext is just a nice wrapper around that...)

@maribu
Copy link
Copy Markdown
Member

maribu commented Mar 11, 2025

I would also like to get away from makecontext, yet from looking into pthreads briefly, this doesn't seem possible to me

This can actually be done. Each thread has a mutex in thread-local storage that is locked and held by the thread. One signal handler locks that mutex (again, so being deadlocked). Another signal handler unlocks that mutex.

The scheduler then needs to make sure that it dead-locks all the thread except of one. A context switch happens by un-deadlocking the target thread by sending it a signal, then deadlocking itself with locking the mutex in TLS. (Technically the un-deadlocked thread may already start to execute before the currently active thread goes into deadlock, but that should be good enough).

I used a similar approach to implement the infrastructure for scheduling and handed that out as tasks to students. It worked quite good.

@benpicco benpicco added this pull request to the merge queue Mar 12, 2025
Merged via the queue into RIOT-OS:master with commit a042d66 Mar 12, 2025
25 checks passed
@benpicco
Copy link
Copy Markdown
Contributor

wow this even fixed the preemption issue that #19002 uncovered - it's now passing ❤️

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Area: build system Area: Build system Area: cpu Area: CPU/MCU ports Area: drivers Area: Device drivers Area: sys Area: System CI: no fast fail don't abort PR build after first error CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR Platform: native Platform: This PR/issue effects the native platform

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants