Improve worst case performance when using ARM Neon instructions #970
There seems to be some confusion with the data files. I noticed some were used in the benchmarks and some were not. I also note that the benchmarks list other files not in the data, such as "bytes.15.worstcase". There is a duplicate data file: activitypub.compact.txt and activitypub.compat.txt. If the files are JSON, and they appear to be, then why not give them a ".json" suffix instead of ".txt"? The other changes tentatively look fine, but I'd like to get a better feel for which types of files are slower and which are faster. |
The
Oops... apologies about that. I generated the first by hand then wrote a script... but I guess I had a typo in that first one. I will also add a snippet for how these files were generated to hopefully save someone a few minutes in the future should they ever need to be re-generated.
They are, I'll fix the extension.
Data that is very heavy in bytes that need to be escaped will benefit the most from these changes. Should someone decide to JSON encode binary data, I would expect these changes to help a lot compared to the current ARM Neon implementation. Text-based input likely wouldn't see much difference. |
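To make "heavy in bytes that need to be escaped" concrete, here is a minimal scalar sketch that counts such bytes in a buffer. It is not Oj's code: the function names `needs_json_escape` and `count_escaped_bytes` are hypothetical, and the escape set is only the RFC 8259 minimum (control characters below 0x20, the double quote, and the backslash). Oj's escape modes may treat additional bytes as needing escapes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of Oj.
 * Returns 1 if the byte must be escaped inside a JSON string per RFC 8259:
 * control characters below 0x20, the double quote, and the backslash. */
static int needs_json_escape(uint8_t b) {
    return b < 0x20 || b == '"' || b == '\\';
}

/* Hypothetical helper, not part of Oj.
 * Counts how many bytes in buf[0..len) need escaping. A count close to len
 * corresponds to the escape-heavy inputs described above. */
static size_t count_escaped_bytes(const uint8_t *buf, size_t len) {
    assert(buf != NULL || len == 0);
    size_t n = 0;
    for (size_t i = 0; i < len; i++) {
        n += (size_t)needs_json_escape(buf[i]);
    }
    return n;
}
```

Plain text such as `hello world` yields a count of zero, while arbitrary binary data will typically contain many control bytes, which is the case this PR targets.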
|
All your responses make sense. Once you mark the PR as ready for review I'll approve and merge. Assuming tests pass, of course. :-) |
This PR significantly improves the worst case performance in `oj_dump_cstr` when dumping strings with a significant number of characters that need to be escaped. This PR also simplifies the ARM Neon implementation.
Using the same real-world and worst-case benchmarks as on #967 we can see that real-world performance is effectively the same (as compared to `master`) whereas the worst-case benchmarks are considerably less-bad.
Benchmarks
Compiler: clang
Compiler: gcc-14