WIP: Porting to LLVM upstream #531
mdboom wants to merge 1 commit into pyodide:master from mdboom:llvm-upstream
@@ -1,12 +1,11 @@
-export EMSCRIPTEN_VERSION = 1.38.30
+export EMSCRIPTEN_VERSION = 1.38.42
Looks like 1.38.47 is latest now...
|
Thanks for opening the PR @mdboom!
Don't worry about it, we are squashing PRs anyway.. :) |
In particular the CircleCI build error: might have been fixed in 1.38.46 emscripten-core/emscripten#9317 |
|
Nice. I'll probably see what upgrading emscripten does for us now... |
|
With emscripten 1.38.47, compiling sqlite crashes the compiler: |
|
Taking sqlite3 out, it does seem to successfully build the Python interpreter, but then it fails again similarly compiling lz4. |
|
@manumartin Are you trying to build this PR or master? If you updated binaryen you also need to update the path in |
|
Hi, sorry for the noise. I think all of the previous errors were caused by a misconfiguration on my part (I'm still learning how the build system works), so I have removed the earlier comments related to that. I think my build is now on par with @mdboom's.

I have seen the SIGABRT problem when wasm-ld was trying to link while building sqlite, lz4 and also libf2c inside CLAPACK. I managed to build pyodide.asm.js by disabling sqlite and lz4, but then CLAPACK started building and libf2c failed:

I also had to remove these arguments from LDFLAGS: -s USE_FREETYPE=1

It appears that either freetype or libpng depends on zlib, and emscripten was failing to compile its zlib port because it was trying to execute wasm-ld commands with the "-pie" and "-relocatable" arguments at the same time, and it looks like they are incompatible. The -lstdc++ flag failed because the library wasn't found. The emscripten documentation says it should automatically take care of the C++ standard libraries for you, so perhaps this flag isn't really needed? I don't know. |
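Editorial sketch: the LDFLAGS workaround described above amounts to dropping a few flags from a flag string. The helper below is a minimal illustration, not part of the Pyodide build system. The function name `strip_flags` and the sample flag string are assumptions made for the example.

```python
import shlex


def strip_flags(ldflags, unwanted):
    """Return `ldflags` with every entry of `unwanted` removed.

    Emscripten settings are written as two tokens ("-s", "NAME=VALUE"),
    so they are matched as a pair. Plain flags such as "-lstdc++" are
    matched as single tokens.
    """
    tokens = shlex.split(ldflags)
    kept = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "-s" and i + 1 < len(tokens):
            # Treat "-s NAME=VALUE" as one logical flag.
            pair = f"-s {tokens[i + 1]}"
            if pair not in unwanted:
                kept.extend([tok, tokens[i + 1]])
            i += 2
            continue
        if tok not in unwanted:
            kept.append(tok)
        i += 1
    # Re-quote each token so the result is safe to paste into a shell.
    return " ".join(shlex.quote(t) for t in kept)


# Hypothetical flag string, only for illustration.
example = "-O2 -s USE_FREETYPE=1 -s USE_ZLIB=1 -lstdc++"
print(strip_flags(example, {"-s USE_FREETYPE=1", "-lstdc++"}))
# prints: -O2 -s USE_ZLIB=1
```

Matching `-s NAME=VALUE` as a pair avoids leaving a dangling `-s` behind when only the setting is removed.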
|
Someone from the LLVM IRC channel pointed out a possible solution to the GlobalSection::writeBody() issue, here is the full conversation: https://pastebin.com/vqZjTf9j I'm trying to check if it works. |
|
Hi, the problem has been fixed in this commit: https://reviews.llvm.org/rL372779 I still have several problems and several temporary workarounds:
I have a new error on scipy/quadpack: I haven't looked too much into it but seems related to the NUM_PARAMS stuff that @rth mentioned. |
Can you be more specific? Pyodide doesn't actually build these libraries, but uses the pre-built ones from emscripten by using the
Probably fine to move that forward. But eventually we will have to determine whether the functions in those objects are required by Scipy and if so fix the build ordering (or what have you) so this works...
This seems fine as a workaround, since we are explicitly using
On master, we patch this in So, we'll need to update our build script so it downloads emsdk with the sources for LLVM, patches them to increase the |
|
-pie & --relocatable problem:

-lgfortran problem:

NUM_PARAMS problem: I have solved this by downloading binaryen, patching it and compiling it. Sadly, I haven't found a way to use the normal ./emsdk install mechanism, because the install command downloads and builds binaryen all at once and doesn't give you a chance to patch the EmulateFuncCast.cpp file. So my emsdk Makefile now downloads and builds binaryen manually. After building it you need to point the BINARYEN_ROOT variable, which lives in the emsdk/emsdk/.emscripten file, to the binaryen build directory. This is normally done by "./emsdk activate", but again I couldn't use that mechanism because I needed a special version of binaryen which isn't listed among the available ones in "./emsdk list".

The reason I needed a special binaryen version is that I encountered an additional problem. Emscripten relies on another binaryen tool: wasm_emscripten_finalize. In the latest versions of binaryen (1.38.31 and 1.38.32) this tool expects a parameter called --initial-stack-pointer, but emscripten 1.38.47 is not passing it. If the parameter isn't passed, the tool exits with a fatal error and breaks the build. There is a PR on emscripten about this:

At some point they removed the requirement for this parameter in binaryen, but I have only found it fixed in the latest development tag: version_89. So now I'm using this binaryen version and the problem doesn't appear anymore. |
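Editorial sketch: pointing BINARYEN_ROOT (or LLVM_ROOT) at a manually built tree means editing an assignment in the .emscripten file. The snippet below assumes that file uses plain `KEY = 'value'` lines, as it did for emscripten releases of this era. The helper name `set_config_value` and all paths are hypothetical.

```python
import re


def set_config_value(config_text, key, value):
    """Replace a `KEY = ...` line in the config text, or append one."""
    line = f"{key} = {value!r}"
    pattern = re.compile(rf"^{re.escape(key)}\s*=.*$", re.MULTILINE)
    if pattern.search(config_text):
        # Use a function as replacement so backslashes in paths stay literal.
        return pattern.sub(lambda _match: line, config_text)
    sep = "" if config_text == "" or config_text.endswith("\n") else "\n"
    return config_text + sep + line + "\n"


# Hypothetical config contents and build directory.
sample = "LLVM_ROOT = '/old/llvm/bin'\nBINARYEN_ROOT = '/old/binaryen'\n"
updated = set_config_value(sample, "BINARYEN_ROOT", "/home/user/binaryen/build")
print(updated)
```

The same call with `"LLVM_ROOT"` as the key would cover a custom LLVM build. Writing the result back to disk is left out on purpose.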
|
BTW I'm also pointing LLVM_ROOT inside emsdk/emsdk/.emscripten to my build of LLVM that contains the fix the LLVM guy pushed yesterday. I'm now having another problem when building scipy/linalg though, several errors like this are appearing: I have checked the sources and indeed the function definitions are different. and one happens with blas and libf2c:
|
|
Ok, those error messages were somewhat misleading, I had to look into the wasm-ld source (https://github.com/llvm/llvm-project/blob/67b055841f3b64efd1e92bde3ed7aeeb493c1182/lld/wasm/SymbolTable.cpp#L710) to understand them, there are two "function signature mismatch" cases:
Also, I'm starting to think the majority of these warnings are normal, in particular the ones where two functions are found and the mismatch is in the return type (void vs int). I think this happens because the cython wrappers aren't using the same return method as the functions inside clapack; instead they use the first parameter of every function to pass an object by reference where the return value is put. |
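Editorial sketch: one way to triage such warnings is to separate mismatches that differ only in the return type from those where the parameter lists differ. This is a toy model under stated assumptions and does not reproduce wasm-ld's actual checks. The `Signature` type, the `classify_mismatch` function and the wasm-style type names are invented for the example.

```python
from typing import NamedTuple, Tuple


class Signature(NamedTuple):
    params: Tuple[str, ...]  # parameter types, e.g. ("i32", "i32")
    result: str              # "void" when nothing is returned


def classify_mismatch(declared, defined):
    """Describe how two signatures of the same symbol differ."""
    if declared == defined:
        return "match"
    if declared.params == defined.params:
        return "return-type only"
    return "parameters differ"


# Hypothetical pair: a caller that expects an int result,
# and a definition that returns nothing.
caller_view = Signature(params=("i32", "i32"), result="i32")
definition = Signature(params=("i32", "i32"), result="void")
print(classify_mismatch(caller_view, definition))
# prints: return-type only
```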
|
Problem with scipy/sparse/linalg/dsolve: I have seen that @rth modified the setup.py files of other scipy packages, for example scipy/sparse/linalg/isolve, so that these packages would link to LAPACK. My build is breaking on sparse/linalg/dsolve because it does not find a BLAS symbol called "scopy". I don't know why this wasn't failing with binaryen. I have tried modifying the setup.py of sparse/linalg/dsolve in the same way as the other setup.py files, and after doing so this specific issue is solved and the symbol is found.

However, now I have duplicate symbol errors because some functions inside sparse/linalg/dsolve/SuperLU seem to be copy-pasted from blas/libf2c. Again, @rth has solved other duplicate symbol errors by entirely removing object files from LAPACK, but in this case the duplicated functions seem fairly common and I think other stuff could break if I remove them. I don't know how to proceed with this; all of these errors seem to be caused by the static linking. |
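Editorial sketch: a static link fails when two objects define the same symbol, so it can help to list the overlaps before deciding what to remove or mark extern. The helper below works on symbol sets that are assumed to have been collected already, for example from a symbol-listing tool. The object and symbol names are hypothetical.

```python
from collections import defaultdict


def find_duplicate_symbols(defined_by_object):
    """Map each symbol defined in more than one object to its owners."""
    owners = defaultdict(list)
    for obj in sorted(defined_by_object):
        for symbol in defined_by_object[obj]:
            owners[symbol].append(obj)
    return {sym: objs for sym, objs in owners.items() if len(objs) > 1}


# Hypothetical symbol tables, only for illustration.
symbols = {
    "libblas.o": {"scopy_", "saxpy_"},
    "superlu.o": {"scopy_", "sgstrf"},
}
print(find_duplicate_symbols(symbols))
# prints: {'scopy_': ['libblas.o', 'superlu.o']}
```

Only symbols with more than one owner are reported, which keeps the output focused on what the static link would reject.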
|
I declared those functions from SuperLU as extern and now it works, but... more errors: |
|
I have built pyodide with emscripten 1.38.47 and the master LLVM version from a few days ago disabling all packages except numpy. I just wanted to see if it ran properly and I'm getting this error: |
|
I'm not sure what might be causing that. Does tweaking the Since this backtrace is in js, you should be able to add |
|
It seems to be an issue with Chrome dev tools: it only happens when I refresh with the dev tools open. I don't see it on Firefox. I have another error now: I can see a wasm stacktrace, and if I activate debugging symbols / sourcemaps with -g4 I can also see this C++ stacktrace: and it says the problem happens at python/Modules/xxsubtype.c:94, which contains this: That doesn't look at all like the calls in the stacktrace. I have also tried to put a breakpoint there, but because of the out of memory bug I have to restart the dev tools every time I try. On Firefox I can't get the sourcemaps to work. From looking directly at the WebAssembly, it seems there is a "call xxx" instruction without any parameters being set first. xxx has parameters, so it seems reasonable that a runtime error happens when calling it without them. |
|
Yes.. definitely the sourcemaps aren't right.. this is the full stacktrace: |
|
Someone has a very similar stacktrace here google/sanitizers#947 He is also using clang but not to compile to wasm. It looks like the problem is related to libc++ initialization "Now, this seems like an initialization problem (e.g. similar to issue #30) with the C++ standard stream objects (std::cout, etc.), since the reference call in the stack traces is std::ios_base::Init." |
|
Hi, upstream is currently (very?) unstable with -fPIC. There was some improvement on tot-upstream a while ago, but it seems there are regressions. |
|
Hi @pmp-p, thanks for the answer. The thing is, I'm interested in using -fPIC because we want to use dynamic linking. Scipy has several libraries that currently link statically against LAPACK, which generates a very big object with ~5 copies of LAPACK. I'm wondering if we could perhaps link statically but use some of the link-time optimizations that LLVM supports.

Currently I can't even make a C++ hello world work using emscripten-1.38.47 + LLVM master + "-fPIC". It builds and it executes, but it prints weird stuff. When I remove -fPIC it works. On top of that, the DWARF debugging information inside the .wasm file seems wrong or incomplete, which means the sourcemaps emscripten generates are also missing parts, since they are generated from the DWARF information. I don't know if I'm doing something wrong, but I didn't expect to be unable to build, run and debug a simple C++ "hello world". I can currently build pyodide (without packages like scipy) with this setup, but it throws some weird runtime error inside libcxx when running. |
|
For anyone interested I managed to make the sourcemap stuff work by setting the -g4 flag when building the system libraries (libstdc++). I had to edit the emscripten tools/system_libs.py script and change a cflags variable there. It's kind of a hack but I haven't found the proper way of doing this. |
|
Thanks for all your work on this @manumartin !
It's great that you at least managed to make it build. To make things a bit more incremental, I think it could make sense to create a separate branch (e.g.
Yeah, that part is a mess. There are multiple duplicated symbols in general, in scipy & LAPACK/BLAS, with dynamic linking it doesn't seem to matter, but it will error when linked statically. I haven't found a good approach for it short of disabling modules entirely. |
|
Hi @rth, I'm OK with that. However, to make it work I'm currently using: emscripten 1.38.47 Those versions can't be installed by the normal "emsdk install / activate" method, so I have installed them manually. I could temporarily modify emsdk/Makefile so that it downloads and builds those until emsdk supports versions with the needed changes. |
|
@manumartin Thanks again for your work on this! In your attempts what branch did you use? I imagine you have additional changes on top of this PR? I'm trying to give this another go in #637 |
|
Just for future reference, there is a more incremental approach for updating emscripten in #480 |
Ref: #476
This doesn't work yet due to some bugs in emscripten. But hopefully this will be useful to others tackling the same problem.
This is based on work started by @rth, but I had to rearrange the history and his original commits were lost. I intend to come back and give him authorship credit on those commits before this is finally merged.