Implement lazy submodule importing#5101
Conversation
|
Our test runners are unhappy, but I swear it works locally ;) Clearly some details to sort out around what numpy.distutils expects. |
|
Tests passed for the previous commit; now added lazy functions for |
|
3.6 is expected to fail. 3.8 failures are unrelated. Still waiting on the others. |
|
Does PyPy support |
|
I haven't tested PyPy's 3.7 implementation yet, since I've only been using it as part of our CI testing. Fortunately, GH actions will soon support testing with PyPy3.7:
Are you still planning to support Python 3.6 for the next release? Or is the concern mostly related to CI testing? Will this still be an issue once GH actions sorts things out or do you need another CI system to support PyPy3.7? Out of curiosity, do you have many users using PyPy? We try to support it for NetworkX, but we've never had a user ask us anything about it. And none of our developers use it. Do you have a way to tell if scikit-image is being run on PyPy in the wild? If so, I would love to know so I can track it for NetworkX. (I spend a lot of time trying to sort out CI stuff for PyPy and we've spent some time jumping through hoops trying to ensure we mostly work with PyPy. It would be nice to know how many people (if any) are benefiting from the extra work.) |
|
I am pretty sure this requires https://www.python.org/dev/peps/pep-0562/, which is Python 3.7+. |
|
This only works with 3.7 up. Python 3.6 will no longer be supported in the next release. I don't know whether this works on PyPy---I haven't tested. But, I also don't know whether scikit-image works on PyPy. |
rossbar
left a comment
There was a problem hiding this comment.
This looks really promising. The "demo" case in skimage/filters/__init__.py is great for showing how it's a radical departure from the way imports are currently specified (and arguably looks even better).
One thing that came to mind when thinking about how other libraries could adopt this is how modules that already define a __getattr__ could be accommodated. Huge 👍 on the idea and thanks so much for working this up, I'm looking forward to learning more!
|
@rossbar, good question. The easiest way would be to implement |
|
@jarrodmillman just pointed me here, and I'm liking the idea of this PR a lot. I left some comments about the implementation of I'm going to probably spend a day or so working on adding my own variant of this functionality to my mkinit package. I'm hoping to add a formater that generates a EDIT: Note: I could configure mkinit to generate an alternate code path that used explicit imports for testing / Python <3.7 compatibility. That might also be interesting to explore while 3.6 is still widely used. |
Update on mkinitI've started playing around with this in mkinit. I have a working version. I've update the README that walks through a simple example that illustrates what mkinit does: https://github.com/Erotemic/mkinit/tree/dev/0.3.0#the-pitch This ends with a discussion of lazy importing. Note that I haven't finalized this feature yet and its progress is being tracked in Erotemic/mkinit#10 With these new changes I ran: The tests seem to work, but it exposes many more attributes than the current init.py file. This can be controlled. By specifying the I find it slightly weird that the only module currently exposed is General note on
|
|
@Erotemic Thanks for your feedback! I like the optimizations you propose and will integrate them as soon as I get back after the break. Generalizing this for use in other packages is in scope too. |
|
I've released the new version of mkinit 0.3.0 on pypi that contains the option to generate lazy imports. For now, you have to use the in-file boilerplate (i.e. every Again, the current version might expose more names than the existing API, but that can be fixed by specifying the If you only want certain submodules exposed you can specify special global attributes: Each of these global attributes should be a list of strings: (e.g. Effectively, Note, if you use mkinit with With the current __private__ = ['setup']
__protected__ = ['rank']
__submodules__ = [
'rank',
'lpi_filter',
'_gaussian',
'edges',
'_rank_order',
'_gabor',
'thresholding',
'ridges',
'_median',
'_sparse',
'_unsharp_mask',
'_window',
]
# End of custom init code
from ..util.lazy import lazy_import
__getattr__ = lazy_import(
__name__,
submodules={
'rank',
},
submod_attrs={
'_gabor': [
'gabor',
'gabor_kernel',
],
'_gaussian': [
'difference_of_gaussians',
'gaussian',
],
'_median': [
'median',
],
'_rank_order': [
'rank_order',
],
'_sparse': [
'correlate_sparse',
],
'_unsharp_mask': [
'unsharp_mask',
],
'_window': [
'window',
],
'edges': [
'HFARID_WEIGHTS',
'HPREWITT_WEIGHTS',
'HSCHARR_WEIGHTS',
'HSOBEL_WEIGHTS',
'PREWITT_EDGE',
'PREWITT_SMOOTH',
'ROBERTS_ND_WEIGHTS',
'ROBERTS_PD_WEIGHTS',
'SCHARR_EDGE',
'SCHARR_SMOOTH',
'SOBEL_EDGE',
'SOBEL_SMOOTH',
'VFARID_WEIGHTS',
'VPREWITT_WEIGHTS',
'VSCHARR_WEIGHTS',
'VSOBEL_WEIGHTS',
'd1',
'farid',
'farid_h',
'farid_v',
'laplace',
'p',
'prewitt',
'prewitt_h',
'prewitt_v',
'roberts',
'roberts_neg_diag',
'roberts_pos_diag',
'scharr',
'scharr_h',
'scharr_v',
'sobel',
'sobel_h',
'sobel_v',
],
'lpi_filter': [
'LPIFilter2D',
'constrained_least_squares',
'eps',
'forward',
'inverse',
'wiener',
],
'ridges': [
'compute_hessian_eigenvalues',
'frangi',
'hessian',
'meijering',
'sato',
],
'thresholding': [
'apply_hysteresis_threshold',
'threshold_isodata',
'threshold_li',
'threshold_local',
'threshold_mean',
'threshold_minimum',
'threshold_multiotsu',
'threshold_niblack',
'threshold_otsu',
'threshold_sauvola',
'threshold_triangle',
'threshold_yen',
'try_all_threshold',
],
},
)
def __dir__():
return __all__
__all__ = ['HFARID_WEIGHTS', 'HPREWITT_WEIGHTS', 'HSCHARR_WEIGHTS',
'HSOBEL_WEIGHTS', 'LPIFilter2D', 'PREWITT_EDGE', 'PREWITT_SMOOTH',
'ROBERTS_ND_WEIGHTS', 'ROBERTS_PD_WEIGHTS', 'SCHARR_EDGE',
'SCHARR_SMOOTH', 'SOBEL_EDGE', 'SOBEL_SMOOTH', 'VFARID_WEIGHTS',
'VPREWITT_WEIGHTS', 'VSCHARR_WEIGHTS', 'VSOBEL_WEIGHTS',
'apply_hysteresis_threshold', 'compute_hessian_eigenvalues',
'constrained_least_squares', 'correlate_sparse', 'd1',
'difference_of_gaussians', 'eps', 'farid', 'farid_h', 'farid_v',
'forward', 'frangi', 'gabor', 'gabor_kernel', 'gaussian', 'hessian',
'inverse', 'laplace', 'median', 'meijering', 'p', 'prewitt',
'prewitt_h', 'prewitt_v', 'rank', 'rank_order', 'roberts',
'roberts_neg_diag', 'roberts_pos_diag', 'sato', 'scharr',
'scharr_h', 'scharr_v', 'sobel', 'sobel_h', 'sobel_v',
'threshold_isodata', 'threshold_li', 'threshold_local',
'threshold_mean', 'threshold_minimum', 'threshold_multiotsu',
'threshold_niblack', 'threshold_otsu', 'threshold_sauvola',
'threshold_triangle', 'threshold_yen', 'try_all_threshold',
'unsharp_mask', 'wiener', 'window']This is the exact lazy analog of this: __private__ = ['setup']
__protected__ = ['rank']
__submodules__ = [
'rank',
'lpi_filter',
'_gaussian',
'edges',
'_rank_order',
'_gabor',
'thresholding',
'ridges',
'_median',
'_sparse',
'_unsharp_mask',
'_window',
]
# End of custom init code
import rank
from .lpi_filter import (LPIFilter2D, constrained_least_squares, eps, forward,
inverse, wiener,)
from ._gaussian import (difference_of_gaussians, gaussian,)
from .edges import (HFARID_WEIGHTS, HPREWITT_WEIGHTS, HSCHARR_WEIGHTS,
HSOBEL_WEIGHTS, PREWITT_EDGE, PREWITT_SMOOTH,
ROBERTS_ND_WEIGHTS, ROBERTS_PD_WEIGHTS, SCHARR_EDGE,
SCHARR_SMOOTH, SOBEL_EDGE, SOBEL_SMOOTH, VFARID_WEIGHTS,
VPREWITT_WEIGHTS, VSCHARR_WEIGHTS, VSOBEL_WEIGHTS, d1,
farid, farid_h, farid_v, laplace, p, prewitt, prewitt_h,
prewitt_v, roberts, roberts_neg_diag, roberts_pos_diag,
scharr, scharr_h, scharr_v, sobel, sobel_h, sobel_v,)
from ._rank_order import (rank_order,)
from ._gabor import (gabor, gabor_kernel,)
from .thresholding import (apply_hysteresis_threshold, threshold_isodata,
threshold_li, threshold_local, threshold_mean,
threshold_minimum, threshold_multiotsu,
threshold_niblack, threshold_otsu,
threshold_sauvola, threshold_triangle,
threshold_yen, try_all_threshold,)
from .ridges import (compute_hessian_eigenvalues, frangi, hessian, meijering,
sato,)
from ._median import (median,)
from ._sparse import (correlate_sparse,)
from ._unsharp_mask import (unsharp_mask,)
from ._window import (window,)
__all__ = ['HFARID_WEIGHTS', 'HPREWITT_WEIGHTS', 'HSCHARR_WEIGHTS',
'HSOBEL_WEIGHTS', 'LPIFilter2D', 'PREWITT_EDGE', 'PREWITT_SMOOTH',
'ROBERTS_ND_WEIGHTS', 'ROBERTS_PD_WEIGHTS', 'SCHARR_EDGE',
'SCHARR_SMOOTH', 'SOBEL_EDGE', 'SOBEL_SMOOTH', 'VFARID_WEIGHTS',
'VPREWITT_WEIGHTS', 'VSCHARR_WEIGHTS', 'VSOBEL_WEIGHTS',
'apply_hysteresis_threshold', 'compute_hessian_eigenvalues',
'constrained_least_squares', 'correlate_sparse', 'd1',
'difference_of_gaussians', 'eps', 'farid', 'farid_h', 'farid_v',
'forward', 'frangi', 'gabor', 'gabor_kernel', 'gaussian', 'hessian',
'inverse', 'laplace', 'median', 'meijering', 'p', 'prewitt',
'prewitt_h', 'prewitt_v', 'rank', 'rank_order', 'roberts',
'roberts_neg_diag', 'roberts_pos_diag', 'sato', 'scharr',
'scharr_h', 'scharr_v', 'sobel', 'sobel_h', 'sobel_v',
'threshold_isodata', 'threshold_li', 'threshold_local',
'threshold_mean', 'threshold_minimum', 'threshold_multiotsu',
'threshold_niblack', 'threshold_otsu', 'threshold_sauvola',
'threshold_triangle', 'threshold_yen', 'try_all_threshold',
'unsharp_mask', 'wiener', 'window']with this: Given this demo, is there any interest in using mkinit with scikit-learn (or networkx)? If so I can make modifications to improve the generated code based on suggestions, but if not, then I'm glad my prototyping was helpful. |
|
Hello @stefanv! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:
Comment last updated at 2021-10-26 07:02:43 UTC |
ec6eabd to
d97b3e0
Compare
|
@Erotemic Thank you again for your feedback; I've incorporated it, and simplified the importing mechanism to avoid one level of indirection. I'd appreciate it if you could take another look. |
skimage/util/lazy.py
Outdated
|
|
||
| def __getattr__(name): | ||
| if name in submodules: | ||
| return importlib.import_module(f'{module_name}.{name}') |
There was a problem hiding this comment.
In a previous version of this there was a "require" logic. It did stuff like importlib.util.find_spec and making importlib.util.LazyLoader(spec.loader). I'm not sure if that is necessary or not. If its not necessary, great, I like getting rid of superfluous code, but I'm curious why it was there to begin with and what made you remove it?
There was a problem hiding this comment.
I think that was primarily because of the evolution of the idea—I started with lazy module imports, and then added the other layers on top.
With __dir__ in place, you can inspect the module, and only on __getattr__ should the module be loaded. The loaded module has the same setup, so it won't be expensive to import either.
You have a keen eye, I must say!
|
@stefanv I took a look at this and added a few comments. Something else I saw is that you aren't adding the returned attribute to the globals dictionary. I would modify def install_lazy(module_name, submodules=None, submod_attrs=None, globals_=None):
...
def __getattr__(name):
...
attr = importlib.import_module(f'{module_name}.{name}')
...
attr = getattr(submod, name
...
globals_[name] = attr
return attr
...replace the returns by assigning the variable to You could even use the def get_stack_frame(n=0, strict=True):
"""
Gets the current stack frame or any of its ancestors dynamically
Args:
n (int): n=0 means the frame you called this function in.
n=1 is the parent frame.
strict (bool): (default = True)
Returns:
frame: frame_cur
Example:
>>> frame_cur = get_stack_frame(n=0)
>>> print('frame_cur = %r' % (frame_cur,))
>>> assert frame_cur.f_globals['frame_cur'] is frame_cur
"""
frame_cur = inspect.currentframe()
# Use n+1 to always skip the frame of this function
for ix in range(n + 1):
frame_next = frame_cur.f_back
if frame_next is None: # nocover
if strict:
raise AssertionError('Frame level %r is root' % ix)
else:
break
frame_cur = frame_next
return frame_cur
...
if globals_ is None:
globals_ = get_stack_frame(n=1).f_globals(Side note I'm going to release 0.3.1 of mkinit later today Erotemic/mkinit#11 is there any sort of integration with that that we should consider here?) |
|
i think there is a simpler way to get the one benefit you listed. but i need to think about it more. this shouldn't stop others from merging but i would be a little more concerned in propagating this pattern too far. |
|
It might be worth reading through the SPEC for motivation. I am definitely not trying to force this onto scikit-image, but I think the benefits are: a) Easy interactive exploration of the entire API without special imports This all happens transparently to the user (if not entirely to developers who add new submodules). @hmaarrfk The external module importer may well address the pooch import times. @jni I'm pretty happy with this for now. @dschult made some great suggestions which I incorporated. There were some documentation language fixes that I wasn't sure about; maybe you can help me cast an eye over those? |
|
Thanks for the reading material @stefanv very informative! I can't say i love how the file 👍🏾 |
|
This pull request has been mentioned on Image.sc Forum. There might be relevant details there: |
|
Great, this seems ready to merge. @stefanv, do you want to do the Squash+Merge and trim some of the intermediate commit messages at that time? (I could do it, but thought you may have a clearer view of what you prefer to trim/summarize) |
|
Thanks, Greg, sorry that I didn't get to the editing of the commit message quickly enough! |
Note: This is an experimental PR, so that we can discuss the idea and get feedback on its implementation.
It allows scikit-image to be used as follows:
The lazy loading strategy could also be implemented inside submodules,
to speed up imports of that submodule. E.g., we could do:
The linalg module is only loaded once it is used in the code, avoiding the
standard workaround of having to move the import inside of a function
call.
UPDATE: On a suggestion from @jni, I also implemented lazy importing of functions. Now, if you only ever use one function from
skimage.filters, only that function will be imported (instead of all the other functions in theskimage.filterssubmodule). For this proof-of-concept, I've only doneskimage.filters.