Investigating LLVM as a general backend for x86, x64 and arm64 code generation #8230
Description
Per our chat on gitter.im/dotnet/corert today, we are investigating whether it makes sense for us to put some work into getting LLVM working as a backend for CoreRT. (For the full context of why we need that, please refer to the gitter conversation, or feel free to contact me directly.)
Thanks to all the great work that's already been done via LLVM for wasm, perhaps the project isn't as crazy as I initially thought.
With help from @MichalStrehovsky and @yowl, and based on Michal's zerosharp code, I got a tiny no-corelib Hello world to compile via LLVM into an x86 Mac executable:
master...christianscheuer:macllvm
To test it, first install the prerequisites (LLVM 10.0 and the macOS 10.13 SDK; see the README for more info). Then build CoreRT in Debug, skipping the tests, with ./build.sh skiptests, go into the tests/src/llvm directory and run ./build.sh.
That script runs CSC to compile to IL, runs the modified ILC from the repo to compile the IL to bitcode (using the WebAssembly LLVM backend with some hardcoded modifications), and then uses the installed LLVM 10 linker to produce a runnable executable. The linker needs to be the same major version of LLVM that was used to produce the bitcode, and we're using libLLVM 10 to produce the bitcode. Finally, it runs the executable, which prints Hello world and returns 42 :)
--
Now all of this raises a bunch of questions that it would be great to get your input on.
-
Assuming that we go down this path, I expect the first step would be to refactor the WebAssembly backend so that a new, generic ILToLLVMImporter can be created as a base class, with ILToWebAssemblyImporter inheriting from it alongside other derived classes for the new x86, x64 and arm64 targets (or some other structure). For example, I suspect many opcode importers, the stack management etc. should be the same no matter whether the "architecture" is wasm or x64 (accounting for different pointer sizes), so there should be a lot of code that can be reused. The shadow stack, however, I don't think we'll need for anything but wasm, if I understand the reasons behind it correctly, so one challenge would be structuring the code so that a shadow-stack-based version and a regular-stack-based version can coexist.
So the question is, with all of this in mind: how would you approach structuring the code for such an endeavor?
Obviously we could do a clean copy-paste into a different directory, but I thought the two implementations could benefit from each other if they were linked...
-
Overall engineering plan
What would be the best way to approach the rest of the bringup, once that initial refactor has been done? What would be the best order for x64-ifying it (i.e. adding support for 8-byte pointer sizes, potentially removing the shadow stack, etc.)?
Any other considerations?
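To make the code-structuring question more concrete, here is one possible shape for the shared base class, as a minimal sketch only. The names ILToLLVMImporter and ILToWebAssemblyImporter come from the text above; ILToX64Importer, EmitLocalAddress, the Import* methods and everything else are hypothetical and do not mirror the real CoreRT sources. To stay self-contained the sketch appends IR-like text lines instead of calling LLVM bindings, and it only handles i32 values.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical sketch: shared opcode handling lives in the base class,
// target-specific policy (pointer size, where locals live) in subclasses.
abstract class ILToLLVMImporter
{
    // IL evaluation stack, shared by every target. Entries are IR value names.
    private readonly Stack<string> _evalStack = new Stack<string>();
    private readonly List<string> _emitted = new List<string>();
    private int _nextTemp;

    // Target-specific: 4 for wasm32, 8 for x64/arm64.
    public abstract int PointerSize { get; }

    // Target-specific: returns the IR name of the address of an IL local,
    // emitting any address computation the target needs.
    protected abstract string EmitLocalAddress(int localIndex);

    public IReadOnlyList<string> Emitted => _emitted;

    protected void Emit(string line) => _emitted.Add(line);

    protected string NewTemp() =>
        "%t" + (_nextTemp++).ToString(CultureInfo.InvariantCulture);

    // Shared importer for ldc.i4: push the constant onto the evaluation stack.
    public void ImportLdcI4(int value) =>
        _evalStack.Push(value.ToString(CultureInfo.InvariantCulture));

    // Shared importer for add: pop two operands, emit one add, push the result.
    public void ImportAdd()
    {
        string right = _evalStack.Pop();
        string left = _evalStack.Pop();
        string result = NewTemp();
        Emit($"{result} = add i32 {left}, {right}");
        _evalStack.Push(result);
    }

    // Shared importer for stloc: only the address computation differs per target.
    public void ImportStLoc(int localIndex)
    {
        string value = _evalStack.Pop();
        string address = EmitLocalAddress(localIndex);
        Emit($"store i32 {value}, i32* {address}");
    }
}

// Shadow-stack flavour: locals are slots relative to an explicit pointer
// (assumed here to be an i32* parameter named %shadowStack).
sealed class ILToWebAssemblyImporter : ILToLLVMImporter
{
    public override int PointerSize => 4;

    protected override string EmitLocalAddress(int localIndex)
    {
        string address = NewTemp();
        Emit($"{address} = getelementptr i32, i32* %shadowStack, i32 {localIndex}");
        return address;
    }
}

// Regular-stack flavour: locals are assumed to be allocas named %localN
// that a (not shown) prologue has already emitted.
sealed class ILToX64Importer : ILToLLVMImporter
{
    public override int PointerSize => 8;

    protected override string EmitLocalAddress(int localIndex) =>
        "%local" + localIndex.ToString(CultureInfo.InvariantCulture);
}
```

Feeding the same IL sequence (ldc.i4 40, ldc.i4 2, add, stloc 1) through both subclasses reuses the same add and store importers; the wasm flavour emits one extra getelementptr line for the shadow-stack slot, while the x64 flavour stores straight to %local1. Whether a single virtual hook like this is enough, or the shadow stack touches too many importers for that, is exactly the open question.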