Closed
Labels
enhancement (New feature or request), performance (Relating to speed and memory performance)
Description
Importing cfdm is quite slow: on my local computer it takes 1.4 seconds.
Another system I have tried (JASMIN) can take between 6 and 10 seconds!
Having had a look at this, there seem to be two main culprits for the slow import:
- Doc string rewriting
  - At import time, every doc string (of which there are currently 5569) is inspected for doc string substitutions, replacing any that are found
- Importing external modules that themselves have a slow import
  - The main problems here are dask, scipy, s3fs, zarr, h5netcdf, uritools, netCDF4
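For anyone wanting to reproduce this kind of breakdown, here is a minimal sketch that uses CPython's `-X importtime` option in a fresh interpreter and ranks modules by cumulative import time. The helper names `import_times` and `slowest` are made up for illustration, and this is not necessarily how the figures above were obtained.

```python
import subprocess
import sys


def import_times(module):
    """Return {imported name: cumulative microseconds} for `import module`.

    A fresh interpreter is used so that modules already cached in this
    process do not hide the real cost.
    """
    result = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", f"import {module}"],
        capture_output=True,
        text=True,
        check=True,
    )
    times = {}
    # -X importtime writes lines of the form
    #   import time:   self [us] | cumulative | imported package
    # to stderr.
    for line in result.stderr.splitlines():
        if not line.startswith("import time:"):
            continue
        fields = line[len("import time:"):].split("|")
        if len(fields) != 3:
            continue
        try:
            cumulative = int(fields[1])
        except ValueError:
            continue  # the header line has no numbers
        times[fields[2].strip()] = cumulative
    return times


def slowest(module, n=10):
    """Return the n (name, cumulative microseconds) pairs with the largest times."""
    times = import_times(module)
    return sorted(times.items(), key=lambda kv: kv[1], reverse=True)[:n]
```

Calling `slowest("cfdm")` in an environment where cfdm is installed should list the package itself first, followed by whichever dependencies dominate. Cumulative times include nested imports, so a parent package always scores at least as high as its slowest child.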
These can be improved on:
- Doc string rewriting
  - Only apply substitutions when necessary, rather than trying every possible substitution for every doc string. Only 3009 of the doc strings need rewriting, and each one of those only utilises a small number of the 110 possible substitutions.
- Importing external modules that themselves have a slow import
  - Move the imports of dask, scipy, s3fs, zarr, h5netcdf, uritools to run time, rather than import time. Many will never get imported, and when they are, the import time is usually negligible compared to the operation being run.
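The doc string idea could be sketched roughly as follows. This is a sketch under assumptions, not the actual cfdm code: the `{{...}}` placeholder syntax, the two-entry `SUBSTITUTIONS` table and the `rewrite_docstring` name are all hypothetical. The point is that a doc string with no placeholder is skipped after one cheap check, and a doc string with placeholders is scanned once, looking up only the keys it actually contains.

```python
import re

# Hypothetical substitution table; the real one has about 110 entries.
SUBSTITUTIONS = {
    "{{package}}": "cfdm",
    "{{class}}": "Field",
}

# Assumed placeholder syntax: anything between double braces on one line.
PLACEHOLDER = re.compile(r"\{\{.*?\}\}")


def rewrite_docstring(doc, substitutions=SUBSTITUTIONS):
    """Replace only the placeholders that actually occur in `doc`."""
    if not doc or "{{" not in doc:
        # Cheap early exit for doc strings that need no rewriting.
        return doc

    def replace(match):
        key = match.group(0)
        # Unknown placeholders are left untouched.
        return substitutions.get(key, key)

    # A single pass over the text, with one dictionary lookup per
    # placeholder found, instead of one str.replace per table entry.
    return PLACEHOLDER.sub(replace, doc)
```

With this shape, the cost per doc string scales with the number of placeholders it contains rather than with the size of the substitution table.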
Results
By applying these two changes, my local import time reduces to 0.2 seconds (from 1.4 seconds - a factor of 7 speed-up). On the other system I tried, the time reduces to between 1.5 and 2.5 seconds (from between 6 and 10 seconds).
These are good enough improvements for a PR, I think ...
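For the deferred imports, the pattern is simply to move the `import` statement from the top of the module into the function that needs it. A minimal sketch, with the standard library `statistics` module standing in for a heavy dependency such as scipy, and a made-up function name `mean_of`:

```python
def mean_of(values):
    """Return the mean of `values`, importing the backend only when called."""
    # Deferred import: executed on the first call rather than when the
    # enclosing package is imported. `statistics` is a stand-in for a
    # slow-to-import dependency.
    import statistics

    return statistics.mean(values)
```

After the first call the module is cached in `sys.modules`, so repeated calls pay only for a cache lookup.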