Description
This problem occurs when a function refers (by attribute access) to a sub-module of a package. Cloudpickle appears to pickle functions not by name, but by code plus (a subset of) globals. So the parent package is injected into the pickle, but the sub-module is not.
def func():
    # import unittest.mock  (uncomment: workaround)
    x = unittest.TestCase
    x = unittest.mock.Mock

import unittest.mock
import cloudpickle as pickle
s = pickle.dumps(func)

# Simulate a fresh process that has imported neither unittest nor unittest.mock
del unittest
import sys
del sys.modules['unittest']
del sys.modules['unittest.mock']

f = pickle.loads(s)
# import unittest.mock as anything  (uncomment: workaround)
f()
AttributeError: module 'unittest' has no attribute 'mock'

This leads to non-intuitive bugs in applications such as cluster computing (e.g. with dask.distributed).
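To illustrate why only the parent package would travel with the function, here is a small sketch that inspects the bytecode of `func` using the standard `dis` module. It assumes CPython, and it shows only what the function's code references. It does not show cloudpickle's actual implementation.

```python
import dis
import unittest.mock  # makes `mock` reachable as an attribute of `unittest`


def func():
    x = unittest.TestCase
    x = unittest.mock.Mock


# Names that the bytecode loads from the module globals.
global_names = {
    ins.argval
    for ins in dis.get_instructions(func)
    if ins.opname == "LOAD_GLOBAL"
}

# Every name the code object mentions, including attribute names.
all_names = func.__code__.co_names

print("globals loaded:", global_names)
print("all names:", all_names)
```

The only global the function loads is `unittest`; `mock` appears merely as an attribute name, so a pickler that records globals by name has no direct hint that `unittest.mock` must be imported too.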
Workarounds:
- perform imports inside functions (contrary to PEP8).
- import sub-modules (or their contents) as globals.
- ensure (somehow) that all relevant sub-modules are automatically imported by respective parent packages (e.g. in __init__.py).
- arrange for the unpickling process to have already done an import of the sub-module (e.g. by first uncloudpickling something else that did refer to the sub-module as a global).
I assume cloudpickle checks whether a global is an imported module, and if so then stores the name (rather than pickling its attributes). Is it practical to also check (via sys.modules.keys()) which sub-modules had previously been imported, and ensure every such module is subsequently initialised?
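As a rough sketch of that idea (not cloudpickle's API; `imported_submodules` and `restore_submodules` are hypothetical helpers), the pickling side could record which sub-modules of a package are present in `sys.modules`, and the unpickling side could import each recorded name:

```python
import importlib
import sys
import types
from typing import Iterable, List


def imported_submodules(package: types.ModuleType) -> List[str]:
    """Return names of already-imported sub-modules of `package`."""
    prefix = package.__name__ + "."
    return sorted(
        name
        for name, module in list(sys.modules.items())
        if name.startswith(prefix) and module is not None
    )


def restore_submodules(names: Iterable[str]) -> None:
    """Import each recorded sub-module, e.g. on the unpickling side."""
    for name in names:
        importlib.import_module(name)


import unittest.mock

recorded = imported_submodules(unittest)  # would be stored in the pickle
restore_submodules(recorded)              # would run when unpickling
print("unittest.mock" in recorded)
```

One trade-off of this approach is that it records every sub-module that happens to be imported, not only those the function uses, so the unpickling side may import more than it needs.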