I got SciPy 1.7.3 working in Pyodide. This was quite difficult, so I wanted to give a summary of our difficulties in case people are interested. Basically all of our issues are Fortran related. My two main questions for people here are:
- are any patches upstreamable?
- does anyone have advice about flang or other LLVM-based Fortran compilers?
Some patches may be interesting
- BLAS detection fails because we have BLAS for Wasm installed but not native BLAS. setup.py detects that native BLAS is missing because it doesn't realize it is being cross compiled. It would be useful to have a flag for setup.py to tell it to skip this detection for cross compilation. We currently just disable it.
- I believe this patch was written by @rth. I don't know what it does, why we need it, or whether it could be appropriate to upstream.
- Some issues with const are fixed in this patch. May be related to cython_blas does not use const in signatures #14262.
- We have a clash in the definition of sasum. For some reason our sasum returns double, which may have to do with our use of f2c. Patched here.
- The make int return values patch. More about this in the next section.
We also have 5 patches which are related to f2c issues and one patch due to a problem with Pyodide's packaging system. These are not suitable to upstream.
int vs void return types
Wasm is very picky about function signatures. If a function is defined with return type int and then imported with return type void, this causes trouble for us. I believe the Fortran ABI has subroutines return an integer to indicate which alternate return was taken, but people mostly ignore it. Anyway, we use sed and manual patching to make all of the affected functions consistently return int. I am not sure whether this could be appropriate to upstream.
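To give a rough idea of what the automated part of that rewrite looks like, here is a minimal Python sketch of a sed-style substitution. It is not the actual patch: the function name make_int_returns, the sample header, and the choice of dscal_ as the routine to fix are all made up for illustration, and it assumes the declarations are simple enough to match with a regular expression.

```python
import re


def make_int_returns(c_source, names):
    """Rewrite `void name(` to `int name(` for the given function names.

    Hypothetical helper: only the return type in front of a listed name is
    touched, everything else in the source text is left alone.
    """
    if not names:
        return c_source
    alternation = "|".join(re.escape(n) for n in sorted(names))
    pattern = re.compile(r"\bvoid(\s+)(" + alternation + r")(\s*\()")
    return pattern.sub(r"int\1\2\3", c_source)


# Illustrative input: one Fortran-style routine declared as void, plus an
# unrelated function that must not be changed.
header = (
    "extern void dscal_(int *n, double *a, double *x, int *incx);\n"
    "void helper(void);\n"
)
fixed = make_int_returns(header, {"dscal_"})
print(fixed)
```

Anything that declares these routines in a less regular way (function pointers, macros, generated Cython signatures) would presumably still need the manual patching mentioned above.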
The compiler
Our biggest difficulty is the compiler. We are currently using f2c, which only works with Fortran 77. It is a bit of a miracle that this works. gfortran and other mature Fortran compilers don't have a Wasm backend. Flang classic seems promising and I have had luck producing Wasm binaries with it, but the version that is distributed on apt is based on LLVM 7. We need to link the object files with Emscripten: Wasm has no standard for dynamic linking, so we rely on the Emscripten linker to create dynamic libraries. But our Emscripten toolchain is based on LLVM 13, and LLVM 7 and LLVM 13 do not use the same object file format, so linking objects produced by flang classic with Emscripten doesn't work. The most recent version of flang classic apparently works with LLVM 10. I haven't yet checked whether it's possible to generate Wasm object files with LLVM 10 and link them with LLVM 13, but if it is, this could be an approach.
We really just need to fix the compiler, but here are some issues caused by f2c:
We are stuck on CLAPACK 3.2
LAPACK 3.3 introduces some dynamically sized arrays and other features which aren't compatible with Fortran 77, so it can't be f2c'd anymore. Trying to build SciPy using LAPACK 3.2, we end up with 36 missing symbols. Conveniently, each LAPACK function is defined in a distinct file, so we can just copy the missing ones into SciPy. The four functions cuncsd, dorcsd, sorcsd, and zuncsd use dynamically sized arrays, so I have to delete them.
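The copy step relies only on the one-routine-per-file layout, so it is easy to script. The sketch below is an assumption about how one might do it, not what the Pyodide build actually runs: copy_missing_routines, the ".f" suffix default, the directory names, and the placeholder routine names are all hypothetical. Only dorcsd comes from the list above.

```python
import shutil
import tempfile
from pathlib import Path


def copy_missing_routines(missing, lapack_src, dest, skip=frozenset(), suffix=".f"):
    """Copy one source file per missing routine; return (copied, not_found).

    Hypothetical helper. Assumes each routine lives in `<name><suffix>`
    directly inside `lapack_src`.
    """
    dest.mkdir(parents=True, exist_ok=True)
    copied, not_found = [], []
    for name in sorted(set(missing) - set(skip)):
        src = lapack_src / f"{name}{suffix}"
        if src.is_file():
            shutil.copy2(src, dest / src.name)
            copied.append(name)
        else:
            not_found.append(name)
    return copied, not_found


# Demo on a throwaway directory tree with placeholder routine names.
with tempfile.TemporaryDirectory() as tmp:
    lapack = Path(tmp) / "lapack"
    lapack.mkdir()
    for routine in ("placeholder_a", "dorcsd"):
        (lapack / f"{routine}.f").write_text("      END\n")
    copied, not_found = copy_missing_routines(
        missing=["placeholder_a", "placeholder_b", "dorcsd"],
        lapack_src=lapack,
        dest=Path(tmp) / "scipy_extra",
        skip={"dorcsd"},  # routines that cannot be f2c'd are left out
    )
print(copied, not_found)
```

The skip set is where routines like the four listed above would go, and not_found flags any symbol whose file name does not match its routine name.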
mvnun and mvnun_weighted also don't work
Again, dynamically sized arrays are the culprit. So we delete them.
f2c output requires patching to fix implicit cast function arguments
A lot of this can be done automatically by collecting the definition signatures and then fixing the declarations so that they agree with the definitions. But implicit casts from character to integer appear in a bunch of places, and these require manual patching to fix up the extra ftnlen arguments that the Fortran ABI has for character * arguments.
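The automatic half of this can be sketched in a few lines of Python. This is a deliberately simplified illustration, not our actual tooling: the two regular expressions, the helper names, and the foo_ and bar sample functions are assumptions, and real f2c output (with comments inside declarations, multi-line argument lists, and so on) would need a more forgiving parser.

```python
import re

# A definition: signature at the start of a line, followed by "{".
DEFINITION = re.compile(
    r"^(?P<ret>\w+)\s+(?P<name>\w+)\s*\((?P<args>[^()]*)\)\s*\{", re.MULTILINE
)
# A declaration: "extern <ret> <name>(<args>);" anywhere in the text.
DECLARATION = re.compile(
    r"\bextern\s+(?P<ret>\w+)\s+(?P<name>\w+)\s*\((?P<args>[^()]*)\)\s*;"
)


def collect_definitions(sources):
    """Map each defined function name to (return type, normalized args)."""
    defs = {}
    for text in sources:
        for m in DEFINITION.finditer(text):
            defs[m["name"]] = (m["ret"], " ".join(m["args"].split()))
    return defs


def fix_declarations(text, defs):
    """Rewrite extern declarations so they agree with the known definitions."""
    def repl(m):
        if m["name"] not in defs:
            return m[0]  # no definition seen, leave the declaration alone
        ret, args = defs[m["name"]]
        return f"extern {ret} {m['name']}({args});"
    return DECLARATION.sub(repl, text)


# Made-up example: the definition takes a character argument plus its
# trailing ftnlen length, but the caller declares it with integer arguments.
definition = "int foo_(char *name, int *n, ftnlen name_len)\n{\n    return 0;\n}\n"
caller = "void bar(void)\n{\n    extern void foo_(int *name, int *n);\n}\n"

defs = collect_definitions([definition, caller])
patched = fix_declarations(caller, defs)
print(patched)
```

This only makes the declarations agree. The call sites that pass an integer where a character pointer and ftnlen length are expected are exactly the part that still has to be patched by hand.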
f2c doesn't handle the common keyword correctly
It leads to duplicate symbols. We have to manually add some externs.