Conversation
Force-pushed from 99199f5 to cd93aa7
I also improved the unit test a little bit to cover a few edge cases and to verify that the sorted list is actually sorted.
Force-pushed from f8a1c38 to 689b4cd
This adds a benchmark for clist_sort() and runs it on three different list sizes.
Force-pushed from 689b4cd to 9842631
Run a few warm-up cycles before starting the stopwatch, for fancy CPUs that have a cache and branch predictor that may need to warm up (e.g. Cortex-M7).
Force-pushed from 9842631 to c33cb94
Some benchmarks:
```c
/* This is the worst case for most sorting algorithms: having the
 * list perfectly sorted but in inverse direction. For real time, we
 * only care about worst case and not about best case / average case */
n->value = val--;
```
This is not the worst case for this insertion sort implementation, as using val++ makes sorting much slower.
```diff
-n->value = val--;
+n->value = val++;
```
E.g., on nrf52840dk, for n=256:
before:

```
2025-04-12 23:18:21,904 # clist_sort (#256, worst): 311423us --- 311.423us per call --- 3211 calls per sec
```

after:

```
2025-04-12 23:19:11,954 # clist_sort (#256, worst): 9233345us --- 9233.345us per call --- 108 calls per sec
```
That's a factor of ~30 slower than the claimed "worst case".
The cut-off point where this insertion sort becomes slower than the previous implementation (which does not come from IOCCC btw) is below n=32.
I'm all for a fixed clist sort, but the benchmark seems misleading: the benchmarked "worst case" is actually not the worst case for this insertion sort.
With the "fixed worst case", it becomes noticeable with fewer than 32 list entries. At 32 list entries, the old merge sort is already 50% faster than this insertion sort (worst case). The nrf's memory fits 1000 times 32 list entries. With 16384 list entries, merge sort can sort the list five times per second. (The insertion sort is still running and will take a while; I will update this post when it is done.) This might be a size that is unlikely to come up on our little devices.

This insertion sort is still much faster in the "avg case" of this benchmark. (Which it shouldn't be, actually. 🙂)
- Iterate over different lengths of unsorted data to also cover the corner case of a one-item list
- Actually check that the list is sorted afterwards (and not just that the maximum got sorted in last)
- Actually check that the list was stable-sorted (as the API doc promises a stable sort)
This replaces the previous clist sort implementation, which used a merge sort, with a simple insertion sort implementation. The motivation is as follows:

1. The insertion sort is smaller (code size)
2. The performance benefit of the merge sort would only become noticeable on data sets that would be too large to comfortably handle on the MCUs RIOT targets
3. The previous implementation looked like a good candidate for the IOCCC, while the insertion sort is trivial to understand
4. Static analysis could provide, for two statements, an example control flow path that would trigger a `NULL` pointer dereference. The current code has no such defects.
Force-pushed from c33cb94 to 66f4a66
Contribution description
This replaces the previous clist sort implementation, which used a merge sort, with a simple insertion sort implementation.
The motivation is as follows:

1. The insertion sort is smaller (code size)
2. The performance benefit of the merge sort would only become noticeable on data sets that would be too large to comfortably handle on the MCUs RIOT targets
3. The previous implementation looked like a good candidate for the IOCCC, while the insertion sort is trivial to understand
4. Static analysis could provide, for two statements, an example control flow path that would trigger a `NULL` pointer dereference. The current code has no such defects.

Testing procedure

`-fanalyzer` should now be happy with the code

Issues/PRs references
None