Experiment: Enable SLP vectorizer for faster codegen #73957
rschu1ze merged 2 commits into ClickHouse:master
Yes, that's still 20% slower than before LLVM 16, but nobody said that this PR was going to fix all performance issues 😄 The most reasonable explanation for the slowdown is the transition to the new pass manager: https://github.com/ClickHouse/ClickHouse/pull/69654/files#diff-897102eb40b3e08e4997b80459a83b97ac7d555e4534942cd30581e4f92cefe2
The missing SLP vectorizer enablement (this PR) was the obvious gap, but I couldn't find equivalents for `pass_manager_builder.LoopVectorize = true;` and `pass_manager_builder.RerollLoops = true;` in the new infrastructure. I am not an expert in this, though; the code is actually copy-pasted from https://llvm.org/docs/NewPassManager.html. Further thoughts and input are highly welcome ...
After the recent upgrades to LLVM 16 and 18, folks started to complain about slower codegen: #66053 (comment)
Prior to the LLVM 16 upgrade, CHJIT force-enabled the SLP vectorizer (see https://github.com/ClickHouse/ClickHouse/pull/69654/files#diff-897102eb40b3e08e4997b80459a83b97ac7d555e4534942cd30581e4f92cefe2). This PR now does the same using the new LLVM pass manager. Let's see if it helps with performance.
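For context on what this toggles: SLP (superword-level parallelism) vectorization packs independent, similar scalar operations in straight-line code into vector instructions, whereas loop vectorization works across loop iterations. Below is a minimal C++ sketch of the code shape an SLP vectorizer typically looks for. The function name `scaleAdd4` is hypothetical and is not taken from CHJIT. Whether a given compiler actually emits vector instructions for it depends on the target and its cost model.

```cpp
#include <array>

// Hypothetical illustration, not ClickHouse code.
// Four independent multiply-adds on adjacent elements, written without a loop.
// An SLP vectorizer may combine these four scalar statements into a single
// vector multiply and a single vector add, since no statement depends on another.
std::array<float, 4> scaleAdd4(const std::array<float, 4> & a, const std::array<float, 4> & b)
{
    std::array<float, 4> out{};
    out[0] = a[0] * 2.0f + b[0];
    out[1] = a[1] * 2.0f + b[1];
    out[2] = a[2] * 2.0f + b[2];
    out[3] = a[3] * 2.0f + b[3];
    return out;
}
```

JIT-compiled expressions over several columns or fields can produce similar runs of adjacent scalar operations, which is presumably why force-enabling the pass mattered before the LLVM 16 upgrade.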