Teach dynamic linking to handle library -> library dependencies#7512
Teach dynamic linking to handle library -> library dependencies#7512kripken merged 2 commits intoemscripten-core:incomingfrom
Conversation
876d284 to
865de96
Compare
|
Patch updated to fix diff --git a/src/support.js b/src/support.js
index 4e5a18f98..0d8efe1b9 100644
--- a/src/support.js
+++ b/src/support.js
@@ -221,7 +221,7 @@ function loadDynamicLibrary(lib, flags) {
if (libModule.dynamicLibraries) {
if (flags.loadAsync) {
return Promise.all(libModule.dynamicLibraries.map(function(dynNeeded) {
- return loadDynamicLibrary(dynNeeed, flags);
+ return loadDynamicLibrary(dynNeeded, flags);
})).then(function() {
return libModule;
});
diff --git a/tests/test_core.py b/tests/test_core.py
index 97f27f737..6cdcd241a 100644
--- a/tests/test_core.py
+++ b/tests/test_core.py
@@ -4096,16 +4096,18 @@ def test_dylink_dso_needed(self):
self.set_setting('TOTAL_MEMORY', 64 * 1024 * 1024) # XXX default 16MB is not enough
so = '.wasm' if self.is_wasm() else '.js'
+
def ccshared(src, linkto=[]):
- run_process([PYTHON, EMCC, src, '-s', 'SIDE_MODULE=1', '-o', os.path.splitext(src)[0]+so] \
- + self.get_emcc_args() + ['-s', 'RUNTIME_LINKED_LIBS='+str(linkto)])
+ cmdv = [PYTHON, EMCC, src, '-s', 'SIDE_MODULE=1', '-o', os.path.splitext(src)[0] + so]
+ cmdv += self.get_emcc_args() + ['-s', 'RUNTIME_LINKED_LIBS=' + str(linkto)]
+ run_process(cmdv)
ccshared('liba.cpp')
- ccshared('libb.cpp', ['liba'+so])
- ccshared('libc.cpp', ['liba'+so])
+ ccshared('libb.cpp', ['liba' + so])
+ ccshared('libc.cpp', ['liba' + so])
self.set_setting('MAIN_MODULE', 1)
- self.set_setting('RUNTIME_LINKED_LIBS', ['libb'+so, 'libc'+so])
+ self.set_setting('RUNTIME_LINKED_LIBS', ['libb' + so, 'libc' + so])
self.do_run(r'''
void bfunc();
void cfunc();
@@ -4115,8 +4117,8 @@ def ccshared(src, linkto=[]):
cfunc();
return 0;
}
- ''',
- 'a: loaded\na: b (prev: (null))\na: c (prev: b)\n')
+ ''',
+ 'a: loaded\na: b (prev: (null))\na: c (prev: b)\n')
self.set_setting('RUNTIME_LINKED_LIBS', [])
self.emcc_args += ['--embed-file', '.@/']
@@ -4145,7 +4147,7 @@ def ccshared(src, linkto=[]):
return 0;
}
''' % locals(),
- 'a: loaded\na: b (prev: (null))\na: c (prev: b)\n')
+ 'a: loaded\na: b (prev: (null))\na: c (prev: b)\n')
@needs_dlfcn
def test_dylink_dot_a(self):Flake8 will still be bad because even pristine incoming has flake8 issues: |
|
( |
|
Thanks! I didn't read through all the details here, but the general idea makes sense. Specifically using |
|
@kripken, thanks for feedback. I guess eventually it should be the same as with gcc - e.g. just if an argument or similarly with That said, for now we already have only |
I'm fine with extending |
|
@sbc100, thanks for feedback. |
|
This generally looks great to me. As a side note, for your use case, can't your solve your problem by simply including the transative closure of the all the SIDE_MODULES's you need in the MAIN_MODULES's RUNTIME_LINKED_LIBS? Or do are you using dlopen to delay loading these modules? In any case it makes sense to all SIDE_MODULES to have dependencies. The addition of DT_NEEDED to the dylink section of the wasm so format is the only part that seems a little contraverial, but seeing as that is an evolving spec, and this information is clearly needed by the rtld I don't see why we shouldn't go aheard with something like this. Can you send a PR to the tool-conventions repo with a detailed spec of the format extension? |
src/library.js
Outdated
| }, | ||
| // char* dlerror(void); | ||
| dlerror__deps: ['$DLFCN'], | ||
| dlerror__deps: ['$LDSO', '$DLFCN'], |
There was a problem hiding this comment.
I don't see LDSO being used below, is it needed?
There was a problem hiding this comment.
LDSO is not used in dlerror, yes, but I decided to universally add $LDSO dependency to every dl* function, so that there is zero uncertainty whether it might break or not even after potential future changes to dl* stuff. After all dl* functionality is strongly related to the dynamic linker and unconditionally depending on it is imho ok.
However if it would be better from project maintainers point of view, I will remove LDSO from here.
tests/test_core.py
Outdated
| # B | ||
| # main < > A | ||
| # C | ||
| open('liba.cpp', 'w').write(r''' |
There was a problem hiding this comment.
For new code can you write this as with open(...) as f:
There was a problem hiding this comment.
Yes, sure. There is even a potential problem with how it is presently written - as it potentially leaks an fd and implicitly relies on GC to finalize/close the opened file. I kept it that way because all the code in test_core.py does it that way.
If we are to change something here, then what you suggest would work
with open('liba.cpp', 'w') as f:
f.write(r'''
...but it is 2 lines and more noise than it should be. I thus suggest to add the following small helpers:
# readfile reads data from file @path.
def readfile(path):
with open(path, "r") as f:
return f.read()
# writefile writes data to file @path.
def writefile(path, data):
with open(path, "w") as f:
f.write(data)and switch to using them universally, e.g.:
writefile('liba.cpp', r'''
...The switch should be then better done globally in one go.
The above is of course minor and for now I change the patch to use with open....
(I keep the style to be old and consistent with the around in browser.test_dynamic_link)
There was a problem hiding this comment.
I'm been transitioning the test code over to "with open().." for the last for months. I agree we should probably do a wholesale switch at some point. In the mean time I'd rather see "with open.." used in new code.
In terms on the long term approach, AFAICT using "with open.." only as a single line extra (not two as you claim), but when we do that large refactor I'd be OK with a helper function I think.
There was a problem hiding this comment.
ok. (the patch now uses "with open()..." for new test)
tests/test_core.py
Outdated
| } | ||
| ''') | ||
|
|
||
| self.set_setting('TOTAL_MEMORY', 64 * 1024 * 1024) # XXX default 16MB is not enough |
There was a problem hiding this comment.
Any idea why its not enough?
There was a problem hiding this comment.
Yes - with the following debug patch:
diff --git a/src/support.js b/src/support.js
index 0d8efe1b9..01847d119 100644
--- a/src/support.js
+++ b/src/support.js
@@ -136,6 +136,7 @@ function fetchBinary(url) {
// flags.global and flags.nodelete are handled every time a load request is made.
// Once a library becomes "global" or "nodelete", it cannot be removed or unloaded.
function loadDynamicLibrary(lib, flags) {
+ console.error("loadDynamicLibrary", lib);
// when loadDynamicLibrary did not have flags, libraries were loaded globally & permanently
if (flags === undefined) {
flags = {global: true, nodelete: true}
@@ -345,6 +346,8 @@ function loadWebAssemblyModule(binary, flags) {
var tableSize = getLEB();
var tableAlign = getLEB();
+ console.error("loadWebAssemblyModule, memorySize:", memorySize);
+
// shared libraries this module needs. We need to load them first, so that
// current module could resolve its imports. (see tools/shared.py
// WebAssembly.make_shared_library() for "dylink" section extension format)
diff --git a/tests/runner.py b/tests/runner.py
index ec25ae364..a2ad9d0e5 100755
--- a/tests/runner.py
+++ b/tests/runner.py
@@ -612,6 +612,7 @@ def run_generated_code(self, engine, filename, args=[], check_timeout=True, outp
os.chdir(cwd)
out = open(stdout, 'r').read()
err = open(stderr, 'r').read()
+ print('stderr:\n', err)
if engine == SPIDERMONKEY_ENGINE and self.get_setting('ASM_JS') == 1:
err = self.validate_asmjs(err)
if output_nicerizer:it becomes clear:
wasm@deco:~/emsdk/emscripten/incoming$ ./tests/runner.py binaryen1.test_dylink_dso_needed
test_dylink_dso_needed (test_core.binaryen1) ... stderr:
loadDynamicLibrary libb.wasm
loadWebAssemblyModule, memorySize: 5242896
loadDynamicLibrary liba.wasm
loadWebAssemblyModule, memorySize: 5242928
loadDynamicLibrary libc.wasm
loadWebAssemblyModule, memorySize: 5242896
loadDynamicLibrary liba.wasm
...
in other words every library is so compiled that it tells the linker to reserve ~ 5MB of memory for every module. For 3 libraries (A,B,C) this gives ~15MB that should be only reserved. Probably the rest 1MB is not enough to keep everything else.
There was a problem hiding this comment.
That seems very strange to me. Maybe I'm missing something but why would these modules need 5Mb of static data? They have almost no data at all in them right?
There was a problem hiding this comment.
Right, that's strange, but this is what emscripten currently emits. I can see that make_shared_library() extracts this from var STATIC_BUMP = ...
but I had not debugged further. For the reference there is a sign that similar problems were there before, e.g. here:
but there it is added without any explanation.
There was a problem hiding this comment.
OK, sounds like this is an existing issie. Perhaps just leave a TODO or FIXME to investigate why these shared libraries need so much memory.
There was a problem hiding this comment.
Yes, I added
# XXX in wasm each lib load currently takes 5MB; default TOTAL_MEMORY=16MB is thus not enough
tools/shared.py
Outdated
| for dyn_needed in needed_dynlibs: | ||
| dyn_needed = asbytes(dyn_needed) | ||
| contents += WebAssembly.lebify(len(dyn_needed)) | ||
| contents += dyn_needed |
There was a problem hiding this comment.
If we land this we will also need to update https://github.com/WebAssembly/tool-conventions/blob/master/DynamicLinking.md
There was a problem hiding this comment.
| loadedLibs: { // handle -> dso [refcount, name, module, global] | ||
| // program itself | ||
| // XXX uglifyjs fails on "[-1]: {" | ||
| '-1': { |
There was a problem hiding this comment.
Why use 0 rather then -1 here?
There was a problem hiding this comment.
The handles are also used as return for dlopen, and 0 dlopen return means "error". This -1 was semantically moved here from dlopen code. I'm adding the following:
--- a/src/support.js
+++ b/src/support.js
@@ -88,6 +88,7 @@ var asm2wasmImports = { // special asm2wasm imports
var $LDSO = {
loadedLibs: { // handle -> dso [refcount, name, module, global]
// program itself
+ // NOTE handle=0 is avoided as it means "error" in dlopen.
// XXX uglifyjs fails on "[-1]: {"
'-1': {
refcount: Infinity, // = nodeleteThere was a problem hiding this comment.
update: the note goes to LDSO.nextHandle definition:
src/support.js
Outdated
| // when loadDynamicLibrary did not have flags, libraries were loaded globally & permanently | ||
| if (flags === undefined) { | ||
| flags = {global: true, nodelete: true} | ||
| } |
There was a problem hiding this comment.
Are you using this default anywhere? if not just remove the above 4 lines?
There was a problem hiding this comment.
Yes, it is still used, e.g. here in tests:
I also wanted to preserve backward compatibility for users, as loadDynamicLibrary is exported and thus could be used from someones code. If it is so, and we drop flags === undefined handling, then flags.global will be false and the loading seamantic will change from previously default global=true.
There was a problem hiding this comment.
(sorry flags.global will just raise, but still it is compatibility breakage because what used to work becomes non-functional)
src/support.js
Outdated
| handle = 1 | ||
| for (var key in $LDSO.loadedLibs) { // not all browsers support Object.keys() | ||
| if ($LDSO.loadedLibs.hasOwnProperty(key)) handle++; | ||
| } |
There was a problem hiding this comment.
Maybe just var handle = $LDSO.nextHandle++ ?
There was a problem hiding this comment.
This code was moved from dlopen:
However looking at it now it is not even correct. Imagine
- dlopen('A') -> 1
- dlopen('B') -> 2
- dlopen('C') -> 3
- dlclose('A') ; 1 is deleted from loadedLibs
- dlopen('D') : should be either 1 or 4 but it gives 3, thus erroneously overwriting C.
maintaining $LDSO.nextHandle seems indeed easier (and note about handle=0 being avoided goes nearby $LDSO.nextHandle definition).
--- a/src/support.js
+++ b/src/support.js
@@ -86,6 +86,10 @@ var asm2wasmImports = { // special asm2wasm imports
#if RELOCATABLE
// dynamic linking
var $LDSO = {
+ // next free handle to use for a loaded dso.
+ // (handle=0 is avoided as it means "error" in dlopen)
+ nextHandle: 1,
+
loadedLibs: { // handle -> dso [refcount, name, module, global]
// program itself
// XXX uglifyjs fails on "[-1]: {"
@@ -165,10 +169,7 @@ function loadDynamicLibrary(lib, flags) {
}
// allocate new DSO & handle
- handle = 1
- for (var key in $LDSO.loadedLibs) { // not all browsers support Object.keys()
- if ($LDSO.loadedLibs.hasOwnProperty(key)) handle++;
- }
+ handle = $LDSO.nextHandle++;
dso = {
refcount: flags.nodelete ? Infinity : 1,
name: lib,| // add symbols into global namespace TODO: weak linking etc. | ||
| for (var sym in libModule) { | ||
| if (!libModule.hasOwnProperty(sym)) { | ||
| continue; |
There was a problem hiding this comment.
I wonder if there is a more JS idomatic way to do this? for (var sym in libModule.getOwnPropertyNames()) ? Showing my JS ignorance here.
There was a problem hiding this comment.
this kind of checking just semantically moved, not created here. This is the original code:
getOwnPropertyNames is indeed there https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Object/getOwnPropertyNames but I'm too new to javascript to tell whethere it will be not good here to return the array instead of just iterating.
I suggest to keep the current way for now and do, js modernization, if any in the next patches step by step.
src/support.js
Outdated
| // a global. the g$ function returns the global address. | ||
| var name = prop.substr(2); // without g$ prefix | ||
| } | ||
| // TODO kill vvv (it will likely be not needed if we require that if A wants |
There was a problem hiding this comment.
ok, it should be kill ↓↓↓ (changing to this way) and it says the the block below could be deleted. The code in question to delete is this:
/cc @kripken
|
Thinking about ways we could potentially split this up: PR1: Unify loadDynamicLibraries and dlopen code. (This would an NFC: Non functional change) PR2: Add support for shared library dependencies. (This would include the wasm format extension and the updated/new test code). Only if that makes sense ... |
The use case is to recreate a Python scientific computing environment, where the user loads only the modules they need at run time. For example, Numpy puts its rarely used linear algebra routines in a separate module from the core stuff. Admittedly, this is pretty different from a standard prepackaged "app". |
This comes from my work to teach Emscripten dynamic linker to support library -> library linkage: emscripten-core/emscripten#7512 The motivation is that without support for DSO -> DSO linking, it becomes problematic in cases when there are several shared libraries that all need to use another should-be shared functionality, while linking that should-be shared functionality to main module is not an option for size reasons. See above pull-request for more details. In order to support library -> library linkages, we have to store somewhere the information about which other libraries, a library needs. This patch extends "dylink" section with such information, which is similar to how ELF handles the same situation with DT_NEEDED entries. /cc @sbc100
865de96 to
097e4c8
Compare
|
@sbc100, thanks for review. Yes, as @mdboom explained, our use-case is to dlopen a library only once user requests related functionality. Linking in everything, even dynamically, to main module is thus not an option as it would download a lot for everyone with most of the bits being likely never used. I've prepared DT_NEEDED specification change here: WebAssembly/tool-conventions#77. I've amended the end result with the following interdiff (whitespace omitted): diff --git a/src/support.js b/src/support.js
index 0d8efe1b9..f67b1df41 100644
--- a/src/support.js
+++ b/src/support.js
@@ -86,6 +86,10 @@ var asm2wasmImports = { // special asm2wasm imports
#if RELOCATABLE
// dynamic linking
var $LDSO = {
+ // next free handle to use for a loaded dso.
+ // (handle=0 is avoided as it means "error" in dlopen)
+ nextHandle: 1,
+
loadedLibs: { // handle -> dso [refcount, name, module, global]
// program itself
// XXX uglifyjs fails on "[-1]: {"
@@ -165,10 +169,7 @@ function loadDynamicLibrary(lib, flags) {
}
// allocate new DSO & handle
- handle = 1
- for (var key in $LDSO.loadedLibs) { // not all browsers support Object.keys()
- if ($LDSO.loadedLibs.hasOwnProperty(key)) handle++;
- }
+ handle = $LDSO.nextHandle++;
dso = {
refcount: flags.nodelete ? Infinity : 1,
name: lib,
@@ -294,7 +295,7 @@ function loadDynamicLibrary(lib, flags) {
}
}
- // module for lib is loaded - update the dso
+ // module for lib is loaded - update the dso & global namespace
function moduleLoaded(libModule) {
if (dso.global) {
mergeLibSymbols(libModule);
@@ -414,7 +415,7 @@ function loadWebAssemblyModule(binary, flags) {
env[x] = Module[x];
}
}
- // TODO kill vvv (it will likely be not needed if we require that if A wants
+ // TODO kill ↓↓↓ (it will likely be not needed if we require that if A wants
// symbols from B it has to link to B explicitly: similarly to -Wl,--no-undefined)
//
// wasm dynamic libraries are pure wasm, so they cannot assist in
diff --git a/tests/test_core.py b/tests/test_core.py
index 6cdcd241a..7c0508e58 100644
--- a/tests/test_core.py
+++ b/tests/test_core.py
@@ -3290,6 +3290,63 @@ def zzztest_dlfcn_exceptions(self): # TODO: make this work. need to forward temp
ok
''')
+ @needs_dlfcn
+ def test_dlfcn_handle_alloc(self):
+ # verify that dlopen does not allocate already used handles
+ dirname = self.get_dir()
+
+ def indir(name):
+ return os.path.join(dirname, name)
+
+ libecho = r'''
+ #include <stdio.h>
+
+ static struct %(libname)s {
+ %(libname)s() {
+ puts("%(libname)s: loaded");
+ }
+ } _;
+ '''
+
+ self.prep_dlfcn_lib()
+ self.build_dlfcn_lib(libecho % {'libname': 'a'}, dirname, indir('a.cpp'))
+ shutil.move(indir('liblib.so'), indir('liba.so'))
+ self.build_dlfcn_lib(libecho % {'libname': 'b'}, dirname, indir('b.cpp'))
+ shutil.move(indir('liblib.so'), indir('libb.so'))
+
+ self.set_setting('MAIN_MODULE', 1)
+ self.set_setting('SIDE_MODULE', 0)
+ self.set_setting('EXPORT_ALL', 1)
+ self.emcc_args += ['--embed-file', '.@/']
+
+ # XXX in wasm each lib load currently takes 5MB; default TOTAL_MEMORY=16MB is thus not enough
+ self.set_setting('TOTAL_MEMORY', 32 * 1024 * 1024)
+
+ src = r'''
+ #include <dlfcn.h>
+ #include <assert.h>
+ #include <stddef.h>
+
+ int main() {
+ void *liba, *libb, *liba2;
+ int err;
+
+ liba = dlopen("liba.so", RTLD_NOW);
+ assert(liba != NULL);
+ libb = dlopen("libb.so", RTLD_NOW);
+ assert(liba != NULL);
+
+ err = dlclose(liba);
+ assert(!err);
+
+ liba2 = dlopen("liba.so", RTLD_NOW);
+ assert(liba2 != libb);
+
+ return 0;
+ }
+ '''
+ self.do_run(src, 'a: loaded\nb: loaded\na: loaded\n')
+
def dylink_test(self, main, side, expected, header=None, main_emcc_args=[], force_c=False, need_reverse=True, auto_load=True):
# shared settings
self.set_setting('EXPORT_ALL', 1)
@@ -4052,7 +4109,8 @@ def test_dylink_dso_needed(self):
# B
# main < > A
# C
- open('liba.cpp', 'w').write(r'''
+ with open('liba.cpp', 'w') as f:
+ f.write(r'''
#include <stdio.h>
#include <emscripten.h>
@@ -4073,7 +4131,8 @@ def test_dylink_dso_needed(self):
static ainit _;
''')
- open('libb.cpp', 'w').write(r'''
+ with open('libb.cpp', 'w') as f:
+ f.write(r'''
#include <emscripten.h>
void afunc(const char *s);
@@ -4083,7 +4142,8 @@ def test_dylink_dso_needed(self):
}
''')
- open('libc.cpp', 'w').write(r'''
+ with open('libc.cpp', 'w') as f:
+ f.write(r'''
#include <emscripten.h>
void afunc(const char *s);
@@ -4093,7 +4153,8 @@ def test_dylink_dso_needed(self):
}
''')
- self.set_setting('TOTAL_MEMORY', 64 * 1024 * 1024) # XXX default 16MB is not enough
+ # XXX in wasm each lib load currently takes 5MB; default TOTAL_MEMORY=16MB is thus not enough
+ self.set_setting('TOTAL_MEMORY', 32 * 1024 * 1024)
so = '.wasm' if self.is_wasm() else '.js'
diff --git a/tools/shared.py b/tools/shared.py
index a6f49f2d1..03cb7b544 100644
--- a/tools/shared.py
+++ b/tools/shared.py
@@ -2994,6 +2994,9 @@ def make_shared_library(js_file, wasm_file, needed_dynlibs):
#
# dynlib_name_len varuint32 ; length of dynlib_name_str in bytes
# dynlib_name_str bytes ; name of a needed dynamic library: valid UTF-8 byte sequence
+ #
+ # a proposal has been filed to include the extension into "dylink" specification:
+ # https://github.com/WebAssembly/tool-conventions/pull/77
contents += WebAssembly.lebify(len(needed_dynlibs))
for dyn_needed in needed_dynlibs:
dyn_needed = asbytes(dyn_needed)I've split the whole change into individual patches each doing its own work. Since the patches have been split, please do not squash when merging, and ideally better merge as topic - not only applying patches - to keep the cover leter in the resulting merge commit. Thanks again, |
|
Thanks for taking that time to split this up! I agree we should not squash these. In general we prefer "rebase" to "merge" to keep the history linear. We could make an exception here, or you could just upload these are individual PRs that we can land one at a time? |
|
@sbc100, thanks for feedback. Could you please clarify - what would uploading patches as individual PR bring? Ability to apply some and not others? If it is required to be linear only, you can do the rebase, yes (but without squashing, I think there is an option on github, whether one wants squashing or not). However personally I find landing in bigger changes consisting of individual steps to be better done as a whole with merge. This way one can see the bigger picture from the cover letter, and then the individual steps are also there as finer-grained patches. Said that, let's please proceed doing whatever is most convenient for the project from its maintainers point of view. |
|
Uploading as individual PRs allows things to land incrementally, be reviewed incrementally and be testing incrementally (over the course of several hours/days/weeks/etc). It means individual changes are easier to review, so the can land faster and get more testing in isolation. I'm not saying this is the only way to work (and doesn't always fir well with the github model where PRs don't stack on top of each other) but its the way we work on chromium and llvm. Another way of saying it: Smaller incremental changes are (almost) always preferred. Having said all that the rules for emscripten are not set in stone and this change isn't super huge so I've OK with landing this as is (rebase but not squash). |
097e4c8 to
42cc7ad
Compare
|
Ok, I've posted separately #7517 and #7518. The other 3 patches could not be applied without those without conflicts and are also dependent on top of each other. Be it e.g. gerrit I could still mail them out, but with github I'm not sure how to do. This way I've just rearranged the patches here so that the first couple come first and then the rest of them. Hope it is ok. |
42cc7ad to
daa6f3c
Compare
|
@sbc100, thanks for applying #7518. Could you please also apply #7517? Without that patch being applied I cannot proceed spawning next "ldso: Teach loadDynamicLibrary to work asynchronously under a flag" as a separate PR without conflicts. I have refreshed patches here. This dropped already applied "ldso: dlopen could allocate a handle that was already in use (#7518)" and rebased the rest of the patches to latest incoming. Thanks beforehand, |
|
( the many and the browser console has many: ) |
|
@navytux - those browser errors you see suggest your browser has SharedArrayBuffer and/or wasm threads disabled. You can enable those in chrome or firefox (and maybe other browsers). However, those should be enabled here in our github CI, so it's odd they fail - might have been a random issue, I restarted those tests. |
|
@kripken, thanks for clarifying. I confirm that with a custom chromium launcher: (chromium+threads) #!/bin/sh
# run chromium with wasm threads on
# SharedArrayBuffer is already on by default
exec chromium --js-flags=--experimental-wasm-threads "$@"there is no more errors in browser console. However the Anyway after you retriggered - all tests passed on CI, so probably the failures were due to some flakiness. |
daa6f3c to
da02f7a
Compare
sbc100
left a comment
There was a problem hiding this comment.
lgtm with a couple of nits. I'll leave to alon to land.
tests/test_core.py
Outdated
| ''', | ||
| 'a: loaded\na: b (prev: (null))\na: c (prev: b)\n') | ||
|
|
||
| self.set_setting('RUNTIME_LINKED_LIBS', []) |
There was a problem hiding this comment.
Hmm, I was hoping that using None here would be the way to unset a setting but it doesn't look like that works yet. I will create patch for that.
There was a problem hiding this comment.
I think, since RUNTIME_LINKED_LIBS is a list, it is more explicit to reset it back to []. However, if project keepers insist on resetting with None, or with e.g. imagined .reset_setting() - that can be done.
tools/shared.py
Outdated
| # https://github.com/WebAssembly/tool-conventions/pull/77 | ||
| contents += WebAssembly.lebify(len(needed_dynlibs)) | ||
| for dyn_needed in needed_dynlibs: | ||
| dyn_needed = asbytes(dyn_needed) |
There was a problem hiding this comment.
We use 2-space indentation in emscripten's python code.
There was a problem hiding this comment.
Sorry, my fault here. Reindenting with 2 spaces.
tests/test_browser.py
Outdated
| #include <stdlib.h> | ||
| #include <string.h> | ||
| char *side2(const char *data); | ||
| char *side2(const char *data) { |
There was a problem hiding this comment.
This doesn't need to be declared twice here.
There was a problem hiding this comment.
I did it this way because it was already the case in this tests:
and I thought it is maybe due to compiler (I recall hearing something long ago from a collegue that a C++ compiler insists to have prototypes for every functions). Anyway - after checking it seems not to be the case, so extra prototype for side2 will be gone.
tests/test_browser.py
Outdated
| ''') | ||
| run_process([PYTHON, EMCC, 'side.cpp', '-s', 'SIDE_MODULE=1', '-O2', '-o', 'side.wasm', '-s', 'EXPORT_ALL=1']) | ||
| run_process([PYTHON, EMCC, 'side2.cpp', '-s', 'SIDE_MODULE=1', '-O2', '-o', 'side2.wasm', '-s', 'EXPORT_ALL=1']) | ||
| run_process([PYTHON, EMCC, 'side.cpp', '-s', 'SIDE_MODULE=1', '-s', "RUNTIME_LINKED_LIBS=['side2.wasm']", '-O2', '-o', 'side.wasm', '-s', 'EXPORT_ALL=1']) |
There was a problem hiding this comment.
Maybe add a comment to this test to say what its doing?
There was a problem hiding this comment.
Makes sense. I'm adding the following to browser.test_dynamic_link as the overview:
--- a/tests/test_browser.py
+++ b/tests/test_browser.py
@@ -3238,6 +3238,9 @@ def test_webidl(self):
@requires_sync_compilation
def test_dynamic_link(self):
+ # verify that dynamic linking works in all-kinds of in-browser environment.
+ #
+ # main -> side -> side2
open('pre.js', 'w').write('''
Module.dynamicLibraries = ['side.wasm'];
''')it should hopefully make looking at those compile commands a bit clearer.
edit: "browser environments" -> "in-browser environments"
319799a to
5326fa5
Compare
|
@sbc100 thanks for feedback. I've amended the patch with the following interdiff diff --git a/tests/test_browser.py b/tests/test_browser.py
index f038f14f1..1e59525ff 100644
--- a/tests/test_browser.py
+++ b/tests/test_browser.py
@@ -3238,6 +3238,9 @@ def test_webidl(self):
@requires_sync_compilation
def test_dynamic_link(self):
+ # verify that dynamic linking works in all kinds of in-browser environments.
+ #
+ # main -> side -> side2
open('pre.js', 'w').write('''
Module.dynamicLibraries = ['side.wasm'];
''')
@@ -3275,7 +3278,6 @@ def test_dynamic_link(self):
open('side2.cpp', 'w').write(r'''
#include <stdlib.h>
#include <string.h>
- char *side2(const char *data);
char *side2(const char *data) {
char *ret = (char*)malloc(strlen(data)+1);
strcpy(ret, data);
diff --git a/tools/shared.py b/tools/shared.py
index bc05815fe..1c7c0e56d 100644
--- a/tools/shared.py
+++ b/tools/shared.py
@@ -2977,9 +2977,9 @@ def make_shared_library(js_file, wasm_file, needed_dynlibs):
# https://github.com/WebAssembly/tool-conventions/pull/77
contents += WebAssembly.lebify(len(needed_dynlibs))
for dyn_needed in needed_dynlibs:
- dyn_needed = asbytes(dyn_needed)
- contents += WebAssembly.lebify(len(dyn_needed))
- contents += dyn_needed
+ dyn_needed = asbytes(dyn_needed)
+ contents += WebAssembly.lebify(len(dyn_needed))
+ contents += dyn_needed
size = len(name) + len(contents)
f.write(WebAssembly.lebify(size))and rebased it on top of incoming. Hopefully it is ok, |
src/support.js
Outdated
| if (prop in obj) { | ||
| return obj[prop]; // already present | ||
|
|
||
| return loadModule(); |
There was a problem hiding this comment.
It's not bad to call this function before it's declared, but I think it's clearer when the call is afterward
There was a problem hiding this comment.
I thought about it initially and weighting all pros and cons decided that it was still clearier when the main logic was in the beginning - not scattered - and loadModule() implementation was following that.
Anyway I'm splitting this now upon your request with the following interdiff:
--- a/src/support.js
+++ b/src/support.js
@@ -356,21 +356,6 @@ function loadWebAssemblyModule(binary, flags) {
var name = UTF8ArrayToString(nameUTF8, 0);
neededDynlibs.push(name);
}
- if (neededDynlibs) {
- if (flags.loadAsync) {
- return Promise.all(neededDynlibs.map(function(dynNeeded) {
- return loadDynamicLibrary(dynNeeded, flags);
- })).then(function() {
- return loadModule();
- });
- }
-
- neededDynlibs.forEach(function(dynNeeded) {
- loadDynamicLibrary(dynNeeded, flags);
- });
- }
-
- return loadModule();
// loadModule loads the wasm module after all its dependencies have been loaded.
// can be called both sync/async.
@@ -532,6 +517,20 @@ function loadWebAssemblyModule(binary, flags) {
return postInstantiation(instance);
}
}
+
+ // now load needed libraries and the module itself.
+ if (flags.loadAsync) {
+ return Promise.all(neededDynlibs.map(function(dynNeeded) {
+ return loadDynamicLibrary(dynNeeded, flags);
+ })).then(function() {
+ return loadModule();
+ });
+ }
+
+ neededDynlibs.forEach(function(dynNeeded) {
+ loadDynamicLibrary(dynNeeded, flags);
+ });
+ return loadModule();
}
Module['loadWebAssemblyModule'] = loadWebAssemblyModule;| @@ -3238,6 +3238,9 @@ def test_webidl(self): | |||
|
|
|||
| @requires_sync_compilation | |||
| def test_dynamic_link(self): | |||
There was a problem hiding this comment.
I'd prefer to not modify the existing simple test, and add a new test (test_dynamic_link_lsdo or some other name) that tests the extra functionality added in this PR. That way a failure is more specific as to what has broken.
There was a problem hiding this comment.
I agree with this. I think the only reason I didn't mention it is because the browser tests tend to be more monolithic and have a higher overhead per-test. But in general I agree.
There was a problem hiding this comment.
Thanks for this comment. I initially considered adding another test in test_browser, but saw that browser.test_dynamic_link carries infrastructure to test in different kinds of in-browser environment, and I did not want to duplicate that. Anyway after you askd I've reworked the patch to drop any changes to that "simple" test and instead unroll the full equivalent of ldso tests from test_core, but in {wasm,asmjs}/{inside main,inside worker} browser environments.
This way it even uncovered an old concurrency bug in loadWebAssemblyModule (see 007608c "ldso: wasm: Don't assume that WebAssembly modules are loaded serially" for the fix), so thanks again for wondering here.
The changes to test_browser.py now look as follows:
diff --git a/tests/test_browser.py b/tests/test_browser.py
index 65507adab..9f0b4ceaa 100644
--- a/tests/test_browser.py
+++ b/tests/test_browser.py
@@ -23,6 +23,7 @@
from tools import system_libs
from tools.shared import PYTHON, EMCC, WINDOWS, FILE_PACKAGER, PIPE, SPIDERMONKEY_ENGINE, JS_ENGINES
from tools.shared import try_delete, Building, run_process, run_js
+import test_core
try:
from http.server import BaseHTTPRequestHandler, HTTPServer
@@ -3299,6 +3300,56 @@ def test_dynamic_link(self):
# same wasm side module works
self.btest(self.in_dir('main.cpp'), '2', args=['-s', 'MAIN_MODULE=1', '-O2', '--pre-js', 'pre.js', '-s', 'WASM=1', '-s', 'EXPORT_ALL=1'])
+
+ # verify that dynamic linking works in all kinds of in-browser environments.
+ # don't mix different kinds in a single test.
+ def test_dylink_dso_needed_wasm(self): self._test_dylink_dso_needed(1, 0)
+ def test_dylink_dso_needed_wasm_inworker(self): self._test_dylink_dso_needed(1, 1)
+ def test_dylink_dso_needed_asmjs(self): self._test_dylink_dso_needed(0, 0)
+ def test_dylink_dso_needed_asmjs_inworker(self): self._test_dylink_dso_needed(0, 1)
+
+ @requires_sync_compilation
+ def _test_dylink_dso_needed(self, wasm, inworker):
+ # here we reuse test_core.test_dylink_dso_needed, but the code is run via browser.
+ print('\n# wasm=%d inworker=%d' % (wasm, inworker))
+ self.set_setting('WASM', wasm)
+ self.emcc_args += ['-O2']
+ t = test_core.binaryen2 if wasm else test_core.asm2
+
+ def do_run(src, expected_output):
+ # XXX there is no infrastructure (yet ?) to retrieve stdout from browser in tests.
+ # -> do the assert about expected output inside browser.
+ #
+ # we have to put the hook into post.js because in main it is too late
+ # (in main we won't be able to catch what static constructors inside
+ # linked dynlibs printed), and in pre.js it is too early (out is not yet
+ # setup by the shell).
+ with open('post.js', 'w') as f:
+ f.write(r'''
+ Module.realPrint = out;
+ out = function(x) {
+ if (!Module.printed) Module.printed = "";
+ Module.printed += x + '\n'; // out is passed str without last \n
+ Module.realPrint(x);
+ };
+ ''')
+ src += r'''
+ int main() {
+ _main();
+ EM_ASM({
+ var expected = %r;
+ assert(Module.printed === expected, ['stdout expected:', expected]);
+ });
+ REPORT_RESULT(0);
+ }
+ ''' % (expected_output,)
+ # --proxy-to-worker only on main
+ if inworker:
+ self.emcc_args += ['--proxy-to-worker']
+ self.btest(src, '0', args=self.get_emcc_args() + ['--post-js', 'post.js'])
+
+ t._test_dylink_dso_needed.__func__(self, do_run)
+
@requires_graphics_hardware
@requires_sync_compilation
def test_dynamic_link_glemu(self):Please see whole interdiff for full details.
fedf83a to
7a31ed6
Compare
tests/test_browser.py
Outdated
| print('\n# wasm=%d inworker=%d' % (wasm, inworker)) | ||
| self.set_setting('WASM', wasm) | ||
| self.emcc_args += ['-O2'] | ||
| t = test_core.binaryen2 if wasm else test_core.asm2 |
There was a problem hiding this comment.
Sorry, I don't like this at all. I wouldn't to re-use parts of test_core like this.
test_core is magic enough already without creeping into test_browser this odd way.
There was a problem hiding this comment.
You can use best's also_asmjs=False perform a test for both asmjs and wasm instead I think.
There was a problem hiding this comment.
The shared parts may also go in the main runner.py (e.g. we have logic there for building freetype and other things various test suites need). I agree with @sbc100 that that would be less magical than internal dependencies between the core and browser test suites directly.
There was a problem hiding this comment.
I tried also_asmjs & other options in btest. It doesn't work here (sorry, I'm now too sleepy to remember exactly how it did not work, but for sure I tried hard first with just using btest infrastructure from test_browser).
This part does not really depend on test_core functionality and t = test_core.binaryen2 if wasm else test_core.asm2 could be dropped - as no settings etc are used from there (setUp and tearDown are not run becuase they cannot without proelyinheriting). I would get t as test_core.TestCoreBase but that symbol is deleted.
What matters here is that all the logic and code from test_dylink_dso_needed is not duplicated, so maybe we should just move it into some shared place, like e.g. test_dylink.py.
There was a problem hiding this comment.
ok, I will move to shared place and remove dependency on test_core.
There was a problem hiding this comment.
I general we worry less about duplication in test code than in production code.
Do we really need this test in test_browser? I wonder if we have faced this similar problem before where certain tests make sense both in core and in browser? If you want to create new shared helper code i I think the best place would be tests/runner.py. Files that start with "test_" tend are normally expected to contain test themselves.
There was a problem hiding this comment.
I general we worry less about duplication in test code than in production code.
Though I understand the would-be reason here (to concentrate more on the
production part), let me please explain why this harms projects in the long
run. Not caring about tests hygiene makes them messy over time. It starts to be
difficult to add another tests properly and it also starts difficult to make
sure that all tricky details are covered in all duplicate-edited replicas.
In turn this makes effort to do tests properly increasing, and in turn the
consequence is that on average new tests becomes of lesser and lesser quality.
Which in the end hurts the production part because it becomes tested less well.
In this particular case, if I did not push myself hard to make sure
test_dylink_dso_needed from core also passes on the browser, and just answered to
request to not change browser.test_dynamic_link and simply splitted my original
simpler changes to that test to e.g. browser.test_dynamic_link_needed, the
concurrency bug in loadWebAssemblyModule would remain undiscovered. And if I just
copy-pasted core.test_dylink_dso_needed to test_browser and then adapted it
there to run under BrowserCore, over time core and browser versions would
tend to diverge, which would harm making sure that dynamic-linker functionality
is properly tested under both node and on the web.
That's why I tried to avoid duplication.
Do we really need this test in test_browser?
Yes, synchronous loading does not work in the browser and this way testing ldso
functionality there adds coverage for loadAsync mode. Also the browser is
number-1 target for many web-projects, including my case, so it is better to
have proper coverage for code-paths that are activated only on that
environment.
I wonder if we have faced this similar problem before where certain tests
make sense both in core and in browser?
Yes, kind of. I'm not sure about the rest, but e.g. testing how dynamic linking
works in browser, e.g.
could be made to be generic tests and tested under both node and web environments.
In general I would suggest to consider reducing divergence between "regular"
and "in-browser" tests where possible, and to be able to run all, or almost
all, core tests inside browser too.
If you want to create new shared helper code i I think the best place would
be tests/runner.py. Files that start with "test_" tend are normally expected to
contain test themselves.
Ok, I proceed with tests/runner.py.
There was a problem hiding this comment.
I general we worry less about duplication in test code than in production code.
Though I understand the would-be reason here (to concentrate more on the
production part), let me please explain why this harms projects in the long
run. Not caring about tests hygiene makes them messy over time. It starts to be
difficult to add another tests properly and it also starts difficult to make
sure that all tricky details are covered in all duplicate-edited replicas.
In turn this makes effort to do tests properly increasing, and in turn the
consequence is that on average new tests becomes of lesser and lesser quality.
Which in the end hurts the production part because it becomes tested less well.
Of course, I understand why test hygiene is important. If you look at the the history over the last year you'll see that I've been doing more than anyone recently to cleanup the existing tests here in emscripten. I was absolutely not saying that we should focus on production code.
My point wasn't that test hygiene isn't important. Only that the principles we focus on in test code can sometimes be different. For example, IMHO, DRY can sometimes be less of priority when writing tests. I might, at times, put readability above DRY in test code. I this case of course there is a lot of case and we wouldn't want to duplicate.
My objection was mostly to that rather strange inter-class mixing.
Thanks for refactoring.
There was a problem hiding this comment.
@sbc100, thanks for feedback. Of course nothing should be taken as absolute. Sometimes it is better to duplicate - e.g. when avoiding the duplication would bring more complexity thus making the whole thing worse overall (but then it applies to production part as well). Just, like you say, in this particular case avoiding the duplication is much better overall.
I agree with the objection and that's why I reworked the patch (hopefully for the better - thanks for review feedback). It was just that I'm completely new to emscripten codebase and was trying to get the test working late at night.
Good to hear that tests are being gardened - thanks for your work on this.
|
@kripken, @sbc100, thanks again for feedback. I've reworked the patch and instead of changing pre-exising Full interdiff from previous version follows: diff --git a/src/support.js b/src/support.js
index 68dbf526e..0c857d0d1 100644
--- a/src/support.js
+++ b/src/support.js
@@ -356,21 +356,6 @@ function loadWebAssemblyModule(binary, flags) {
var name = UTF8ArrayToString(nameUTF8, 0);
neededDynlibs.push(name);
}
- if (neededDynlibs) {
- if (flags.loadAsync) {
- return Promise.all(neededDynlibs.map(function(dynNeeded) {
- return loadDynamicLibrary(dynNeeded, flags);
- })).then(function() {
- return loadModule();
- });
- }
-
- neededDynlibs.forEach(function(dynNeeded) {
- loadDynamicLibrary(dynNeeded, flags);
- });
- }
-
- return loadModule();
// loadModule loads the wasm module after all its dependencies have been loaded.
// can be called both sync/async.
@@ -382,29 +367,27 @@ function loadWebAssemblyModule(binary, flags) {
memoryAlign = Math.max(memoryAlign, STACK_ALIGN); // we at least need stack alignment
assert(tableAlign === 1);
// prepare memory
- var memoryStart = alignMemory(getMemory(memorySize + memoryAlign), memoryAlign); // TODO: add to cleanups
+ var memoryBase = alignMemory(getMemory(memorySize + memoryAlign), memoryAlign); // TODO: add to cleanups
// The static area consists of explicitly initialized data, followed by zero-initialized data.
// The latter may need zeroing out if the MAIN_MODULE has already used this memory area before
// dlopen'ing the SIDE_MODULE. Since we don't know the size of the explicitly initialized data
// here, we just zero the whole thing, which is suboptimal, but should at least resolve bugs
// from uninitialized memory.
- for (var i = memoryStart; i < memoryStart + memorySize; ++i) HEAP8[i] = 0;
+ for (var i = memoryBase; i < memoryBase + memorySize; ++i) HEAP8[i] = 0;
// prepare env imports
var env = Module['asmLibraryArg'];
// TODO: use only __memory_base and __table_base, need to update asm.js backend
var table = Module['wasmTable'];
- var oldTableSize = table.length;
- env['__memory_base'] = env['gb'] = memoryStart;
- env['__table_base'] = env['fb'] = oldTableSize;
+ var tableBase = table.length;
var originalTable = table;
table.grow(tableSize);
assert(table === originalTable);
// zero-initialize memory and table
// TODO: in some cases we can tell it is already zero initialized
- for (var i = env['__memory_base']; i < env['__memory_base'] + memorySize; i++) {
+ for (var i = memoryBase; i < memoryBase + memorySize; i++) {
HEAP8[i] = 0;
}
- for (var i = env['__table_base']; i < env['__table_base'] + tableSize; i++) {
+ for (var i = tableBase; i < tableBase + tableSize; i++) {
table.set(i, null);
}
// copy currently exported symbols so the new module can import them
@@ -413,8 +396,9 @@ function loadWebAssemblyModule(binary, flags) {
env[x] = Module[x];
}
}
- // TODO kill ↓↓↓ (it will likely be not needed if we require that if A wants
- // symbols from B it has to link to B explicitly: similarly to -Wl,--no-undefined)
+ // TODO kill ↓↓↓ (except "symbols local to this module", it will likely be
+ // not needed if we require that if A wants symbols from B it has to link
+ // to B explicitly: similarly to -Wl,--no-undefined)
//
// wasm dynamic libraries are pure wasm, so they cannot assist in
// their own loading. When side module A wants to import something
@@ -425,6 +409,16 @@ function loadWebAssemblyModule(binary, flags) {
// be to inspect the binary directly).
var proxyHandler = {
'get': function(obj, prop) {
+ // symbols that should be local to this module
+ switch (prop) {
+ case '__memory_base':
+ case 'gb':
+ return memoryBase;
+ case '__table_base':
+ case 'fb':
+ return tableBase;
+ }
+
if (prop in obj) {
return obj[prop]; // already present
}
@@ -465,7 +459,7 @@ function loadWebAssemblyModule(binary, flags) {
};
#if ASSERTIONS
var oldTable = [];
- for (var i = 0; i < oldTableSize; i++) {
+ for (var i = 0; i < tableBase; i++) {
oldTable.push(table.get(i));
}
#endif
@@ -480,12 +474,12 @@ function loadWebAssemblyModule(binary, flags) {
assert(table === instance.exports['table']);
}
// the old part of the table should be unchanged
- for (var i = 0; i < oldTableSize; i++) {
+ for (var i = 0; i < tableBase; i++) {
assert(table.get(i) === oldTable[i], 'old table entries must remain the same');
}
// verify that the new table region was filled in
for (var i = 0; i < tableSize; i++) {
- assert(table.get(oldTableSize + i) !== undefined, 'table entry was not filled in');
+ assert(table.get(tableBase + i) !== undefined, 'table entry was not filled in');
}
#endif
for (var e in instance.exports) {
@@ -500,10 +494,10 @@ function loadWebAssemblyModule(binary, flags) {
#if EMULATE_FUNCTION_POINTER_CASTS
// it may be a function pointer
if (e.substr(0, 3) == 'fp$' && typeof instance.exports[e.substr(3)] === 'function') {
- value = value + env['__table_base'];
+ value = value + tableBase;
} else {
#endif
- value = value + env['__memory_base'];
+ value = value + memoryBase;
#if EMULATE_FUNCTION_POINTER_CASTS
}
#endif
@@ -532,6 +526,20 @@ function loadWebAssemblyModule(binary, flags) {
return postInstantiation(instance);
}
}
+
+ // now load needed libraries and the module itself.
+ if (flags.loadAsync) {
+ return Promise.all(neededDynlibs.map(function(dynNeeded) {
+ return loadDynamicLibrary(dynNeeded, flags);
+ })).then(function() {
+ return loadModule();
+ });
+ }
+
+ neededDynlibs.forEach(function(dynNeeded) {
+ loadDynamicLibrary(dynNeeded, flags);
+ });
+ return loadModule();
}
Module['loadWebAssemblyModule'] = loadWebAssemblyModule;
diff --git a/tests/test_browser.py b/tests/test_browser.py
index 7bcfe533a..a435a0f7d 100644
--- a/tests/test_browser.py
+++ b/tests/test_browser.py
@@ -23,6 +23,7 @@
from tools import system_libs
from tools.shared import PYTHON, EMCC, WINDOWS, FILE_PACKAGER, PIPE, SPIDERMONKEY_ENGINE, JS_ENGINES
from tools.shared import try_delete, Building, run_process, run_js
+import test_core
try:
from http.server import BaseHTTPRequestHandler, HTTPServer
@@ -3243,9 +3244,6 @@ def test_webidl(self):
@requires_sync_compilation
def test_dynamic_link(self):
- # verify that dynamic linking works in all kinds of in-browser environments.
- #
- # main -> side -> side2
open('pre.js', 'w').write('''
Module.dynamicLibraries = ['side.wasm'];
''')
@@ -3274,23 +3272,16 @@ def test_dynamic_link(self):
}
''')
open('side.cpp', 'w').write(r'''
- char *side(const char *data);
- char *side2(const char *data);
- char *side(const char *data) {
- return side2(data);
- }
- ''')
- open('side2.cpp', 'w').write(r'''
#include <stdlib.h>
#include <string.h>
- char *side2(const char *data) {
+ char *side(const char *data);
+ char *side(const char *data) {
char *ret = (char*)malloc(strlen(data)+1);
strcpy(ret, data);
return ret;
}
''')
- run_process([PYTHON, EMCC, 'side2.cpp', '-s', 'SIDE_MODULE=1', '-O2', '-o', 'side2.wasm', '-s', 'EXPORT_ALL=1'])
- run_process([PYTHON, EMCC, 'side.cpp', '-s', 'SIDE_MODULE=1', '-s', "RUNTIME_LINKED_LIBS=['side2.wasm']", '-O2', '-o', 'side.wasm', '-s', 'EXPORT_ALL=1'])
+ run_process([PYTHON, EMCC, 'side.cpp', '-s', 'SIDE_MODULE=1', '-O2', '-o', 'side.wasm', '-s', 'EXPORT_ALL=1'])
self.btest(self.in_dir('main.cpp'), '2', args=['-s', 'MAIN_MODULE=1', '-O2', '--pre-js', 'pre.js', '-s', 'EXPORT_ALL=1'])
print('wasm in worker (we can read binary data synchronously there)')
@@ -3298,8 +3289,7 @@ def test_dynamic_link(self):
open('pre.js', 'w').write('''
var Module = { dynamicLibraries: ['side.wasm'] };
''')
- run_process([PYTHON, EMCC, 'side2.cpp', '-s', 'SIDE_MODULE=1', '-O2', '-o', 'side2.wasm', '-s', 'WASM=1', '-s', 'EXPORT_ALL=1'])
- run_process([PYTHON, EMCC, 'side.cpp', '-s', 'SIDE_MODULE=1', '-s', "RUNTIME_LINKED_LIBS=['side2.wasm']", '-O2', '-o', 'side.wasm', '-s', 'WASM=1', '-s', 'EXPORT_ALL=1'])
+ run_process([PYTHON, EMCC, 'side.cpp', '-s', 'SIDE_MODULE=1', '-O2', '-o', 'side.wasm', '-s', 'WASM=1', '-s', 'EXPORT_ALL=1'])
self.btest(self.in_dir('main.cpp'), '2', args=['-s', 'MAIN_MODULE=1', '-O2', '--pre-js', 'pre.js', '-s', 'WASM=1', '--proxy-to-worker', '-s', 'EXPORT_ALL=1'])
print('wasm (will auto-preload since no sync binary reading)')
@@ -3310,6 +3300,62 @@ def test_dynamic_link(self):
# same wasm side module works
self.btest(self.in_dir('main.cpp'), '2', args=['-s', 'MAIN_MODULE=1', '-O2', '--pre-js', 'pre.js', '-s', 'WASM=1', '-s', 'EXPORT_ALL=1'])
+ # verify that dynamic linking works in all kinds of in-browser environments.
+ # don't mix different kinds in a single test.
+ def test_dylink_dso_needed_wasm(self):
+ self._test_dylink_dso_needed(1, 0)
+
+ def test_dylink_dso_needed_wasm_inworker(self):
+ self._test_dylink_dso_needed(1, 1)
+
+ def test_dylink_dso_needed_asmjs(self):
+ self._test_dylink_dso_needed(0, 0)
+
+ def test_dylink_dso_needed_asmjs_inworker(self):
+ self._test_dylink_dso_needed(0, 1)
+
+ @requires_sync_compilation
+ def _test_dylink_dso_needed(self, wasm, inworker):
+ # here we reuse test_core.test_dylink_dso_needed, but the code is run via browser.
+ print('\n# wasm=%d inworker=%d' % (wasm, inworker))
+ self.set_setting('WASM', wasm)
+ self.emcc_args += ['-O2']
+ t = test_core.binaryen2 if wasm else test_core.asm2
+
+ def do_run(src, expected_output):
+ # XXX there is no infrastructure (yet ?) to retrieve stdout from browser in tests.
+ # -> do the assert about expected output inside browser.
+ #
+ # we have to put the hook into post.js because in main it is too late
+ # (in main we won't be able to catch what static constructors inside
+ # linked dynlibs printed), and in pre.js it is too early (out is not yet
+ # setup by the shell).
+ with open('post.js', 'w') as f:
+ f.write(r'''
+ Module.realPrint = out;
+ out = function(x) {
+ if (!Module.printed) Module.printed = "";
+ Module.printed += x + '\n'; // out is passed str without last \n
+ Module.realPrint(x);
+ };
+ ''')
+ src += r'''
+ int main() {
+ _main();
+ EM_ASM({
+ var expected = %r;
+ assert(Module.printed === expected, ['stdout expected:', expected]);
+ });
+ REPORT_RESULT(0);
+ }
+ ''' % (expected_output,)
+ # --proxy-to-worker only on main
+ if inworker:
+ self.emcc_args += ['--proxy-to-worker']
+ self.btest(src, '0', args=self.get_emcc_args() + ['--post-js', 'post.js'])
+
+ t._test_dylink_dso_needed.__func__(self, do_run)
+
@requires_graphics_hardware
@requires_sync_compilation
def test_dynamic_link_glemu(self):
diff --git a/tests/test_core.py b/tests/test_core.py
index f4aa54349..f86511e31 100644
--- a/tests/test_core.py
+++ b/tests/test_core.py
@@ -4107,6 +4107,11 @@ def test_dylink_hyper_dupe(self):
@needs_dlfcn
def test_dylink_dso_needed(self):
+ def do_run(src, expected_output):
+ self.do_run(src + 'int main() { return _main(); }', expected_output)
+ self._test_dylink_dso_needed(do_run)
+
+ def _test_dylink_dso_needed(self, do_run):
# test that linking to shared library B, which is linked to A, loads A as well.
# main is also linked to C, which is also linked to A. A is loaded/initialized only once.
#
@@ -4157,14 +4162,20 @@ def test_dylink_dso_needed(self):
}
''')
+ # _test_dylink_dso_needed can be potentially called several times by a test.
+ # reset dylink-related options first.
+ self.set_setting('MAIN_MODULE', 0)
+ self.set_setting('SIDE_MODULE', 0)
+ self.set_setting('RUNTIME_LINKED_LIBS', [])
+
# XXX in wasm each lib load currently takes 5MB; default TOTAL_MEMORY=16MB is thus not enough
self.set_setting('TOTAL_MEMORY', 32 * 1024 * 1024)
so = '.wasm' if self.is_wasm() else '.js'
def ccshared(src, linkto=[]):
- cmdv = [PYTHON, EMCC, src, '-s', 'SIDE_MODULE=1', '-o', os.path.splitext(src)[0] + so]
- cmdv += self.get_emcc_args() + ['-s', 'RUNTIME_LINKED_LIBS=' + str(linkto)]
+ cmdv = [PYTHON, EMCC, src, '-o', os.path.splitext(src)[0] + so] + self.get_emcc_args()
+ cmdv += ['-s', 'SIDE_MODULE=1', '-s', 'RUNTIME_LINKED_LIBS=' + str(linkto)]
run_process(cmdv)
ccshared('liba.cpp')
@@ -4173,11 +4184,11 @@ def ccshared(src, linkto=[]):
self.set_setting('MAIN_MODULE', 1)
self.set_setting('RUNTIME_LINKED_LIBS', ['libb' + so, 'libc' + so])
- self.do_run(r'''
+ do_run(r'''
void bfunc();
void cfunc();
- int main() {
+ int _main() {
bfunc();
cfunc();
return 0;
@@ -4187,12 +4198,12 @@ def ccshared(src, linkto=[]):
self.set_setting('RUNTIME_LINKED_LIBS', [])
self.emcc_args += ['--embed-file', '.@/']
- self.do_run(r'''
+ do_run(r'''
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>
- int main() {
+ int _main() {
void *bdso, *cdso;
void (*bfunc)(), (*cfunc)(); |
7a31ed6 to
01ded36
Compare
|
@kripken, @sbc100, thanks again for feedback. I've moved dynamic-linker test, which is now shared between test_core and test_browser into tests/runner.py. A section titled it was natual to put the test there. Full interdiff of the change is below. Thanks again, diff --git a/tests/runner.py b/tests/runner.py
index b831bac6a..7e971b72e 100755
--- a/tests/runner.py
+++ b/tests/runner.py
@@ -775,6 +775,126 @@ def setup_runtimelink_test(self):
'''
return (main, supp)
+ # excercise dynamic linker.
+ #
+ # test that linking to shared library B, which is linked to A, loads A as well.
+ # main is also linked to C, which is also linked to A. A is loaded/initialized only once.
+ #
+ # B
+ # main < > A
+ # C
+ #
+ # this test is used by both test_core and test_browser.
+ # when run under broswer it excercises how dynamic linker handles concurrency
+ # - because B and C are loaded in parallel.
+ def _test_dylink_dso_needed(self, do_run):
+ with open('liba.cpp', 'w') as f:
+ f.write(r'''
+ #include <stdio.h>
+ #include <emscripten.h>
+
+ static const char *afunc_prev;
+
+ EMSCRIPTEN_KEEPALIVE void afunc(const char *s);
+ void afunc(const char *s) {
+ printf("a: %s (prev: %s)\n", s, afunc_prev);
+ afunc_prev = s;
+ }
+
+ struct ainit {
+ ainit() {
+ puts("a: loaded");
+ }
+ };
+
+ static ainit _;
+ ''')
+
+ with open('libb.cpp', 'w') as f:
+ f.write(r'''
+ #include <emscripten.h>
+
+ void afunc(const char *s);
+ EMSCRIPTEN_KEEPALIVE void bfunc();
+ void bfunc() {
+ afunc("b");
+ }
+ ''')
+
+ with open('libc.cpp', 'w') as f:
+ f.write(r'''
+ #include <emscripten.h>
+
+ void afunc(const char *s);
+ EMSCRIPTEN_KEEPALIVE void cfunc();
+ void cfunc() {
+ afunc("c");
+ }
+ ''')
+
+ # _test_dylink_dso_needed can be potentially called several times by a test.
+ # reset dylink-related options first.
+ self.set_setting('MAIN_MODULE', 0)
+ self.set_setting('SIDE_MODULE', 0)
+ self.set_setting('RUNTIME_LINKED_LIBS', [])
+
+ # XXX in wasm each lib load currently takes 5MB; default TOTAL_MEMORY=16MB is thus not enough
+ self.set_setting('TOTAL_MEMORY', 32 * 1024 * 1024)
+
+ so = '.wasm' if self.is_wasm() else '.js'
+
+ def ccshared(src, linkto=[]):
+ cmdv = [PYTHON, EMCC, src, '-o', os.path.splitext(src)[0] + so] + self.get_emcc_args()
+ cmdv += ['-s', 'SIDE_MODULE=1', '-s', 'RUNTIME_LINKED_LIBS=' + str(linkto)]
+ run_process(cmdv)
+
+ ccshared('liba.cpp')
+ ccshared('libb.cpp', ['liba' + so])
+ ccshared('libc.cpp', ['liba' + so])
+
+ self.set_setting('MAIN_MODULE', 1)
+ self.set_setting('RUNTIME_LINKED_LIBS', ['libb' + so, 'libc' + so])
+ do_run(r'''
+ void bfunc();
+ void cfunc();
+
+ int _main() {
+ bfunc();
+ cfunc();
+ return 0;
+ }
+ ''',
+ 'a: loaded\na: b (prev: (null))\na: c (prev: b)\n')
+
+ self.set_setting('RUNTIME_LINKED_LIBS', [])
+ self.emcc_args += ['--embed-file', '.@/']
+ do_run(r'''
+ #include <assert.h>
+ #include <dlfcn.h>
+ #include <stddef.h>
+
+ int _main() {
+ void *bdso, *cdso;
+ void (*bfunc)(), (*cfunc)();
+
+ // FIXME for RTLD_LOCAL binding symbols to loaded lib is not currenlty working
+ bdso = dlopen("libb%(so)s", RTLD_GLOBAL);
+ assert(bdso != NULL);
+ cdso = dlopen("libc%(so)s", RTLD_GLOBAL);
+ assert(cdso != NULL);
+
+ bfunc = (void (*)())dlsym(bdso, "_Z5bfuncv");
+ assert(bfunc != NULL);
+ cfunc = (void (*)())dlsym(cdso, "_Z5cfuncv");
+ assert(cfunc != NULL);
+
+ bfunc();
+ cfunc();
+ return 0;
+ }
+ ''' % locals(),
+ 'a: loaded\na: b (prev: (null))\na: c (prev: b)\n')
+
def filtered_js_engines(self, js_engines=None):
if js_engines is None:
js_engines = shared.JS_ENGINES
diff --git a/tests/test_browser.py b/tests/test_browser.py
index a435a0f7d..5c842df86 100644
--- a/tests/test_browser.py
+++ b/tests/test_browser.py
@@ -23,7 +23,6 @@
from tools import system_libs
from tools.shared import PYTHON, EMCC, WINDOWS, FILE_PACKAGER, PIPE, SPIDERMONKEY_ENGINE, JS_ENGINES
from tools.shared import try_delete, Building, run_process, run_js
-import test_core
try:
from http.server import BaseHTTPRequestHandler, HTTPServer
@@ -3316,11 +3315,10 @@ def test_dylink_dso_needed_asmjs_inworker(self):
@requires_sync_compilation
def _test_dylink_dso_needed(self, wasm, inworker):
- # here we reuse test_core.test_dylink_dso_needed, but the code is run via browser.
+ # here we reuse runner._test_dylink_dso_needed, but the code is run via browser.
print('\n# wasm=%d inworker=%d' % (wasm, inworker))
self.set_setting('WASM', wasm)
self.emcc_args += ['-O2']
- t = test_core.binaryen2 if wasm else test_core.asm2
def do_run(src, expected_output):
# XXX there is no infrastructure (yet ?) to retrieve stdout from browser in tests.
@@ -3354,7 +3352,7 @@ def do_run(src, expected_output):
self.emcc_args += ['--proxy-to-worker']
self.btest(src, '0', args=self.get_emcc_args() + ['--post-js', 'post.js'])
- t._test_dylink_dso_needed.__func__(self, do_run)
+ super(browser, self)._test_dylink_dso_needed(do_run)
@requires_graphics_hardware
@requires_sync_compilation
diff --git a/tests/test_core.py b/tests/test_core.py
index f86511e31..29844c18e 100644
--- a/tests/test_core.py
+++ b/tests/test_core.py
@@ -4111,120 +4111,6 @@ def do_run(src, expected_output):
self.do_run(src + 'int main() { return _main(); }', expected_output)
self._test_dylink_dso_needed(do_run)
- def _test_dylink_dso_needed(self, do_run):
- # test that linking to shared library B, which is linked to A, loads A as well.
- # main is also linked to C, which is also linked to A. A is loaded/initialized only once.
- #
- # B
- # main < > A
- # C
- with open('liba.cpp', 'w') as f:
- f.write(r'''
- #include <stdio.h>
- #include <emscripten.h>
-
- static const char *afunc_prev;
-
- EMSCRIPTEN_KEEPALIVE void afunc(const char *s);
- void afunc(const char *s) {
- printf("a: %s (prev: %s)\n", s, afunc_prev);
- afunc_prev = s;
- }
-
- struct ainit {
- ainit() {
- puts("a: loaded");
- }
- };
-
- static ainit _;
- ''')
-
- with open('libb.cpp', 'w') as f:
- f.write(r'''
- #include <emscripten.h>
-
- void afunc(const char *s);
- EMSCRIPTEN_KEEPALIVE void bfunc();
- void bfunc() {
- afunc("b");
- }
- ''')
-
- with open('libc.cpp', 'w') as f:
- f.write(r'''
- #include <emscripten.h>
-
- void afunc(const char *s);
- EMSCRIPTEN_KEEPALIVE void cfunc();
- void cfunc() {
- afunc("c");
- }
- ''')
-
- # _test_dylink_dso_needed can be potentially called several times by a test.
- # reset dylink-related options first.
- self.set_setting('MAIN_MODULE', 0)
- self.set_setting('SIDE_MODULE', 0)
- self.set_setting('RUNTIME_LINKED_LIBS', [])
-
- # XXX in wasm each lib load currently takes 5MB; default TOTAL_MEMORY=16MB is thus not enough
- self.set_setting('TOTAL_MEMORY', 32 * 1024 * 1024)
-
- so = '.wasm' if self.is_wasm() else '.js'
-
- def ccshared(src, linkto=[]):
- cmdv = [PYTHON, EMCC, src, '-o', os.path.splitext(src)[0] + so] + self.get_emcc_args()
- cmdv += ['-s', 'SIDE_MODULE=1', '-s', 'RUNTIME_LINKED_LIBS=' + str(linkto)]
- run_process(cmdv)
-
- ccshared('liba.cpp')
- ccshared('libb.cpp', ['liba' + so])
- ccshared('libc.cpp', ['liba' + so])
-
- self.set_setting('MAIN_MODULE', 1)
- self.set_setting('RUNTIME_LINKED_LIBS', ['libb' + so, 'libc' + so])
- do_run(r'''
- void bfunc();
- void cfunc();
-
- int _main() {
- bfunc();
- cfunc();
- return 0;
- }
- ''',
- 'a: loaded\na: b (prev: (null))\na: c (prev: b)\n')
-
- self.set_setting('RUNTIME_LINKED_LIBS', [])
- self.emcc_args += ['--embed-file', '.@/']
- do_run(r'''
- #include <assert.h>
- #include <dlfcn.h>
- #include <stddef.h>
-
- int _main() {
- void *bdso, *cdso;
- void (*bfunc)(), (*cfunc)();
-
- // FIXME for RTLD_LOCAL binding symbols to loaded lib is not currenlty working
- bdso = dlopen("libb%(so)s", RTLD_GLOBAL);
- assert(bdso != NULL);
- cdso = dlopen("libc%(so)s", RTLD_GLOBAL);
- assert(cdso != NULL);
-
- bfunc = (void (*)())dlsym(bdso, "_Z5bfuncv");
- assert(bfunc != NULL);
- cfunc = (void (*)())dlsym(cdso, "_Z5cfuncv");
- assert(cfunc != NULL);
-
- bfunc();
- cfunc();
- return 0;
- }
- ''' % locals(),
- 'a: loaded\na: b (prev: (null))\na: c (prev: b)\n')
-
@needs_dlfcn
def test_dylink_dot_a(self):
# .a linking must force all .o files inside it, when in a shared module |
tests/test_core.py
Outdated
| def test_dylink_dso_needed(self): | ||
| def do_run(src, expected_output): | ||
| self.do_run(src + 'int main() { return _main(); }', expected_output) | ||
| self._test_dylink_dso_needed(do_run) |
There was a problem hiding this comment.
Sorry, I usually do set ts=8 sts=2 sw=2 et in vim when working on emscripten, but missed this again. Will fix. (btw, a way to automatically check/reformat this would be helpful)
There was a problem hiding this comment.
We use flake8 which should check for indentation but sadly its set of 8 spaces and is not configurable we so had to disable that test completely.
If we could switch to 4 spaces, or find a better tool I'd be happy. Switching at this point would be pretty invasive though.
|
Looks good to me too. Great work, @navytux, thank you! Yeah, let's just figure out the commit message, then land it. |
Working on support for (many) shared libraries and testing that inside browser revealed a problem in loadWebAssemblyModule: it implicitly assumes that modules are loaded serially one after another, and if this invariant is not held programs misbehave by using wrong global data. The problem comes from the fact, that even though loadWebAssemblyModule supports async loading by API, inside it modifies/reads __memory_base & __table_base in process-wide Module['wasmTable']. Be loadWebAssemblyModule just one step inside a promise - that maybe might turn to be not a problem, but even now loadWebAssemblyModule, when running in async-mode, has 2 steps: 1. WebAssembly.instantiate, and then 2. postInstantiation The values for Module['wasmTable']['__memory_base'] & co are set before "1", and are then used inside "2". Given that it might be undetermined time before "1" starts and hands over to "2", and that another WASM module might become loading along that time, at "2" the value for Module['wasmTable']['__memory_base'] might become to be not what it was set to at "1". Also the value for Module['wasmTable']['__memory_base'] might become changed just while "1" is running, if the browser supports asynchronous WebAssembly instantiation. In other words it is kind of race-condition between "threads" that load wasm modules on process-wide Module['wasmTable']['__memory_base']. And the race here is not necessary, because particular __memory_base and __table_base should be applied to one particular module only - as every module should have its own different bases. So we can avoid unrolling async mutexes and instead fix the problem by simply not storing the bases globally and passing them locally only to the imported module. No test in this patch as browser.test_dylink_dso_needed_*, that will be added in the next patch, covers the problem as well. This commit is a preparatory step for the next patch, where dynamic libraries are actually started to be loaded in parallel, and it was prepared as a separate change to raise semantic visibility and ease review.
Currently Emscripten allows to create shared libraries (DSO) and link
them to main module. However a shared library itself cannot be linked to
another shared library.
The lack of support for DSO -> DSO linking becomes problematic in cases when
there are several shared libraries that all need to use another should-be
shared functionality, while linking that should-be shared functionality to main
module is not an option for size reasons. My particular use-case is SciPy
support for Pyodide:
pyodide/pyodide#211,
pyodide/pyodide#240
where several of `*.so` scipy modules need to link to LAPACK. If we link to
LAPACK statically from all those `*.so` - it just blows up compiled size
pyodide/pyodide#211 (comment)
and if we link in LAPACK statically to main module, the main module size is
also increased ~2x, which is not an option, since LAPACK functionality is not
needed by every Pyodide user.
This way we are here to add support for DSO -> DSO linking:
1. similarly to how it is already working for main module -> side module
linking, when building a side module it is now possible to specify
-s RUNTIME_LINKED_LIBS=[...]
with list of shared libraries that side module needs to link to.
2. to store that information, for asm.js, similarly to how it is currently
handled for main module (which always has js part), we transform
RUNTIME_LINKED_LIBS to
libModule.dynamicLibraries = [...]
(see src/preamble_sharedlib.js)
3. for wasm module, in order to store the information about to which libraries
a module links, we could in theory use "module" attribute in wasm imports.
However currently emscripten almost always uses just "env" for that "module"
attribute, e.g.
(import "env" "abortStackOverflow" (func $fimport$0 (param i32)))
(import "env" "_ffunc1" (func $fimport$1))
...
and this way we have to embed the information about required libraries for
the dynamic linker somewhere else.
What I came up with is to extend "dylink" section with information about
which shared libraries a shared library needs. This is similar to DT_NEEDED
entries in ELF.
(see tools/shared.py)
4. then, the dynamic linker (loadDynamicLibrary) is reworked to handle that information:
- for asm.js, after loading a libModule, we check libModule.dynamicLibraries
and post-load them recursively. (it would be better to load needed modules
before the libModule, but for js we currently have to first eval whole
libModule's code to be able to read .dynamicLibraries)
- for wasm the needed libraries are loaded before the wasm module in
question is instantiated.
(see changes to loadWebAssemblyModule for details)
01ded36 to
5173634
Compare
|
@sbc100, @kripken, thanks a lot for feedback and kind words. I've amended the main patch to always use 2 spaces for indent. About commit message: it is already trimmed compared to PR description. We have 2 commits here: 29f45e6 The first fixes found concurrency problem for loading wasm modules. It explains the problem and semantically does only one thing of fixing that particular bug. I thus suggest not to squash it with the main change, where it will become lost in the noise of code churn. For the second patch - its commit message is based on the PR description, but there I removed everything not actually related to the semantic step this main patch is doing. What is left there is motivation and overview of how DSO -> DSO linking is implemented. It is thus ok to be applied with that commit message from my point of view. If that's ok with you, the steps to apply could be:
Please forgive me, if I misunderstood something what you've meant about commit description. I hope it is ok. Thanks again for feedback, ---- 8< ---- (interdiff) diff --git a/tests/test_browser.py b/tests/test_browser.py
index 5c842df86..b454bef2a 100644
--- a/tests/test_browser.py
+++ b/tests/test_browser.py
@@ -3302,16 +3302,16 @@ def test_dynamic_link(self):
# verify that dynamic linking works in all kinds of in-browser environments.
# don't mix different kinds in a single test.
def test_dylink_dso_needed_wasm(self):
- self._test_dylink_dso_needed(1, 0)
+ self._test_dylink_dso_needed(1, 0)
def test_dylink_dso_needed_wasm_inworker(self):
- self._test_dylink_dso_needed(1, 1)
+ self._test_dylink_dso_needed(1, 1)
def test_dylink_dso_needed_asmjs(self):
- self._test_dylink_dso_needed(0, 0)
+ self._test_dylink_dso_needed(0, 0)
def test_dylink_dso_needed_asmjs_inworker(self):
- self._test_dylink_dso_needed(0, 1)
+ self._test_dylink_dso_needed(0, 1)
@requires_sync_compilation
def _test_dylink_dso_needed(self, wasm, inworker):
diff --git a/tests/test_core.py b/tests/test_core.py
index 29844c18e..6f2ae7cc2 100644
--- a/tests/test_core.py
+++ b/tests/test_core.py
@@ -4107,9 +4107,9 @@ def test_dylink_hyper_dupe(self):
@needs_dlfcn
def test_dylink_dso_needed(self):
- def do_run(src, expected_output):
- self.do_run(src + 'int main() { return _main(); }', expected_output)
- self._test_dylink_dso_needed(do_run)
+ def do_run(src, expected_output):
+ self.do_run(src + 'int main() { return _main(); }', expected_output)
+ self._test_dylink_dso_needed(do_run)
@needs_dlfcn
def test_dylink_dot_a(self): |
|
( CI I believe it is unrelated to the change and is just some flakiness. For the reference: |
|
Makes sense, will rebase and merge, keeping the commits separate. The sdl2_timer error fails sometimes, we haven't figured out why yet, but it's unrelated to this PR. Thanks again! |
) This comes from my work to teach Emscripten dynamic linker to support library -> library linkage: emscripten-core/emscripten#7512 The motivation is that without support for DSO -> DSO linking, it becomes problematic in cases when there are several shared libraries that all need to use another should-be shared functionality, while linking that should-be shared functionality to main module is not an option for size reasons. See above pull-request for more details. In order to support library -> library linkages, we have to store somewhere the information about which other libraries, a library needs. This patch extends "dylink" section with such information, which is similar to how ELF handles the same situation with DT_NEEDED entries.
|
Hey guys! Do any of you know how I can import a shared object file like this -> "_ctypes.cpython-37m-x86_64-linux-gnu.so" in pyodide? I already downloaded it on the browser and it is located in /lib/python3.7/lib-dynload. But python can't seem to import it. |
|
|
|
Is that the reason why python can't import it atleast? |
|
Yeah there is no way pyodide could load that file. Only an linux x86 build of python could ever load that file. |
Currently Emscripten allows to create shared libraries (DSO) and link
them to main module. However a shared library itself cannot be linked to
another shared library.
The lack of support for DSO -> DSO linking becomes problematic in cases when
there are several shared libraries that all need to use another should-be
shared functionality, while linking that should-be shared functionality to main
module is not an option for size reasons. My particular use-case is SciPy
support for Pyodide:
pyodide/pyodide#211,
pyodide/pyodide#240
where several of
*.soscipy modules need to link to LAPACK. If we link toLAPACK statically from all those
*.so- it just blows up compiled sizepyodide/pyodide#211 (comment)
and if we link in LAPACK statically to main module, the main module size is
also increased ~2x, which is not an option, since LAPACK functionality is not
needed by every Pyodide user.
This way we are here to add support for DSO -> DSO linking:
similarly to how it is already working for main module -> side module
linking, when building a side module it is now possible to specify
-s RUNTIME_LINKED_LIBS=[...]
with list of shared libraries that side module needs to link to.
to store that information, for asm.js, similarly to how it is currently
handled for main module (which always has js part), we transform
RUNTIME_LINKED_LIBS to
libModule.dynamicLibraries = [...]
(see src/preamble_sharedlib.js)
for wasm module, in order to store the information about to which libraries
a module links, we could in theory use "module" attribute in wasm imports.
However currently emscripten almost always uses just "env" for that "module"
attribute, e.g.
(import "env" "abortStackOverflow" (func $fimport$0 (param i32)))
(import "env" "_ffunc1" (func $fimport$1))
...
and this way we have to embed the information about required libraries for
the dynamic linker somewhere else.
What I came up with is to extend "dylink" section with information about
which shared libraries a shared library needs. This is similar to DT_NEEDED
entries in ELF.
(see tools/shared.py)
then, the dynamic linker (loadDynamicLibrary) is reworked to handle that information:
for asm.js, after loading a libModule, we check libModule.dynamicLibraries
and post-load them recursively. (it would be better to load needed modules
before the libModule, but for js we currently have to first eval whole
libModule's code to be able to read .dynamicLibraries)
for wasm the needed libraries are loaded before the wasm module in
question is instantiated.
(see changes to loadWebAssemblyModule for details)
since we also have to teach dlopen() to handle needed libraries, and since dlopen
was already duplicating loadDynamicLibrary() code in many ways, instead of
adding more duplication, dlopen is now reworked to use loadDynamicLibrary
itself.
This moves functionality to keep track of loaded DSO, their handles,
refcounts, etc into the dynamic linker itself, with loadDynamicLibrary now
accepting various flags (global/nodelete) to handle e.g.
RTLD_LOCAL/RTLD_GLOBAL and RTLD_NODELETE dlopen cases (RTLD_NODELETE
semantic is needed for initially-linked-in libraries).
Also, since dlopen was using FS to read libraries, and loadDynamicLibrary was
previously using Module['read'] and friends, loadDynamicLibrary now also
accepts fs interface, which if provided, is used as FS-like interface to load
library data, and if not - native loading capabilities of the environment
are still used.
Another aspect related to deduplication is that loadDynamicLibrary now also
uses preloaded/precompiled wasm modules, that were previously only used by
dlopen (see a5866a5 "Add preload plugin to compile wasm side modules async
(WIP: Add preload plugin to compile wasm side modules async #6663)").
(see changes to dlopen and loadDynamicLibrary)
The functionality to asynchronously load dynamic libraries is also
integrated into loadDynamicLibrary.
Libraries were asynchronously preloaded for the case when
Module['readBinary'] is absent (browser, see 3446d2a "preload wasm dynamic
libraries when we can't load them synchronously").
Since this codepath was also needed to be taught of DSO -> DSO dependency,
the most straightforward thing to do was to teach loadDynamicLibrary to do
its work asynchronously (under flag) and to switch the preloading to use
loadDynamicLibrary(..., {loadAsync: true})
(see changes to src/preamble.js and loadDynamicLibrary)
A test is added for verifying linking/dlopening a DSO with other needed library.
browser.test_dynamic_link is also amended to verify linking to DSO with
dependencies.
With the patch I've made sure that all core tests (including test_dylink_* and
test_dlfcn_*) are passing for asm{0,1,2} and binaryen{0,1,2}.
However since I cannot get full browser tests to pass even on pristine incoming
(1.38.19-2-g77246e0c1 as of today; there are many failures for both Firefox
63.0.1 and Chromium 70.0.3538.67), I did not tried to verify full browser tests
with my patch. Bit I've made sure that
are passing.
"other" kind of tests also do not pass on pristine incoming for me. This
way I did not tried to verify "other" with my patch.
Thanks beforehand,
Kirill
P.S.
This is my first time I do anything with WebAssembly/Emscripten, and only a
second time with JavaScript, so please forgive me if I missed something.
/cc @kripken, @juj, @sbc100, @max99x, @junjihashimoto, @mdboom, @rth
For historical reference - this landed as:
baf29a3,
b98e4d0,
a2b8aed,
0e4ff7c,
538f958,
8ec4b4a,
6410be8, and
WebAssembly/tool-conventions@864c5c9.