-
-
Notifications
You must be signed in to change notification settings - Fork 2k
Description
Description
The UfuncHelpers class is not thread-safe, and this can occasionally lead to strange errors when using astropy from multi-threaded code (e.g. with Dask).
Expected behavior
Multiple threads should be able to call ufuncs simultaneously.
Actual behavior
Errors like this sporadically occur:
File "./astropy-ufunc-threading.py", line 13, in thread_code
erfa.s2c(1 * u.rad, 2 * u.rad)
File "/home/bmerry/work/sdp/env3/lib/python3.8/site-packages/erfa/core.py", line 19788, in s2c
c = ufunc.s2c(theta, phi)
File "/home/bmerry/src/astropy/astropy/units/quantity.py", line 457, in __array_ufunc__
converters, unit = converters_and_unit(function, method, *inputs)
File "/home/bmerry/src/astropy/astropy/units/quantity_helper/converters.py", line 157, in converters_and_unit
ufunc_helper = UFUNC_HELPERS[function]
File "/home/bmerry/src/astropy/astropy/units/quantity_helper/converters.py", line 86, in __missing__
raise TypeError("unknown ufunc {}. If you believe this ufunc "
TypeError: unknown ufunc s2c. If you believe this ufunc should be supported, please raise an issue on https://github.com/astropy/astropy
People have also seen errors about collections mutating while being iterated, which I suspect is closely related.
Steps to Reproduce
It's a little tricky to reproduce reliably since it relies on thread timing. There's a script below, but I've found the easiest way to make it reliable is to add sleeps into the UfuncHelpers code at strategic points.
#!/usr/bin/env python3
import threading
import time
import astropy.units as u
import erfa
def thread_code(excs):
try:
erfa.s2c(1 * u.rad, 2 * u.rad)
except Exception as exc:
excs.append(exc)
def main():
excs = []
threads = [threading.Thread(target=thread_code, args=(excs,))
for i in range(32)]
for thread in threads:
thread.start()
for thread in threads:
thread.join()
if excs:
raise excs[0]
main()The races I can immediate see are:
- The
modulesproperty has a race between checking if there is a_modulesattribute and returning it. That's probably harmless in most cases since the first registration will likely be done when first importing astropy and before launching threads. - The call to
list(self.modules.items())in__missing__iterates the modules dictionary to construct the list, and at the same time another thread might mutate it (possibly withregister_module, but more likely withimport_modulewhich pops). import_moduleremoves the module from the list before updating the helper dictionary. If another thread calls__missing__in that gap, it won't be able to find a helper (leading to the exception above).__setitem__modifies bothUNSUPPORTEDand the helper dictionary. There are probably some race conditions if__missing__is called in the middle of that process.
Recommendations
I think the following should address most of the issues. It still depends on Python treating accesses to the core dict class as atomic, which as far as I know is an implementation detail of CPython but not guaranteed anywhere:
- Add a lock, that is held during all the member functions. That shouldn't affect the common case (a ufunc for which the helper is already registered) because this
__getitem__won't take the lock. - Give the class a constructor and initialise
modulesthere, instead of doing it lazily with a property.
Let me know if you'd like a PR.
System Details
Linux-5.4.0-58-generic-x86_64-with-glibc2.29
Python 3.8.5 (default, Jul 28 2020, 12:59:40)
[GCC 9.3.0]
Numpy 1.19.1
astropy 4.3.dev439+g4b242e6f5
Scipy 1.5.2
Matplotlib 3.3.0