Closed
Description
Loading a test file on my local system took about 600 s with the profiler enabled, which is far too slow (it does take significantly less without the profiler).
Here are some excerpts from the profile (the full output is very large):
612.109 <module>  <ipython-input-10-8499de7ed805>:1
└─ 612.109 wrapper  functools.py:927
   └─ 612.109 _load_from_string  /home/stuart/Git/DKIST/dkist/dkist/dataset/loader.py:116
      └─ 612.109 _load_from_path  /home/stuart/Git/DKIST/dkist/dkist/dataset/loader.py:125
         ├─ 612.109 _load_from_asdf  /home/stuart/Git/DKIST/dkist/dkist/dataset/loader.py:158
         │  ├─ 611.866 open_asdf  asdf/_asdf.py:1622
         │  │  ├─ 611.861 AsdfFile._open_impl  asdf/_asdf.py:1006
         │  │  │  └─ 611.861 AsdfFile._open_asdf  asdf/_asdf.py:890
         │  │  │     ├─ 360.544 AsdfFile._validate  asdf/_asdf.py:670
         │  │  │     ├─ 114.634 tagged_tree_to_custom_tree  asdf/yamlutil.py:329
         │  │  │     ├─ 88.697 load_tree  asdf/yamlutil.py:373
         │  │  │     ├─ 39.880 find_references  asdf/reference.py:108
         │  │  │     ├─ 7.834 Manager.read  asdf/_block/manager.py:337
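So the largest share of the time (about 360 s, roughly 59% of the total) is spent validating the file on read. Next comes converting the tagged tree to high-level objects (about 115 s), then parsing the YAML (about 89 s) and finding all the references in it (about 40 s). Actually reading the blocks accounts for only about 8 s.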
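To make the relative costs easier to compare, here is a small sketch that turns the cumulative times from the excerpt above into percentages of the time spent in `AsdfFile._open_asdf`. The numbers are copied from the profile; the `shares` helper is only an illustrative name, not part of asdf or dkist.

```python
# Cumulative times in seconds, copied from the profile excerpt above.
OPEN_ASDF_TOTAL = 611.861  # AsdfFile._open_asdf

CHILDREN = {
    "AsdfFile._validate": 360.544,
    "tagged_tree_to_custom_tree": 114.634,
    "load_tree": 88.697,
    "find_references": 39.880,
    "Manager.read": 7.834,
}


def shares(total, children):
    """Return each child's time as a percentage of the parent's total."""
    return {name: 100.0 * seconds / total for name, seconds in children.items()}


if __name__ == "__main__":
    result = shares(OPEN_ASDF_TOTAL, CHILDREN)
    # Print the most expensive steps first.
    for name, pct in sorted(result.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{name:<28} {pct:5.1f}%")
```

The listed children do not add up to exactly 100% because the excerpt omits the smaller entries of the full profile.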
The obvious win would be to disable validation on read, but we should think about the trade-off more before doing that.
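For reference, a minimal sketch of what disabling validation on read could look like, assuming asdf's `config_context()` and its `validate_on_read` option are available in the installed version. The helper name `open_asdf_without_validation` is hypothetical, and this is not how the dkist loader currently works.

```python
def open_asdf_without_validation(path):
    """Hypothetical helper: open an ASDF file with schema validation skipped.

    Assumes the installed asdf provides ``asdf.config_context()`` with a
    ``validate_on_read`` option. The caller is responsible for closing the
    returned file object.
    """
    # Imported here so that defining the helper does not itself require asdf.
    import asdf

    # The config change only applies inside this block, so other code that
    # opens ASDF files keeps the default behaviour (validation enabled).
    with asdf.config_context() as config:
        config.validate_on_read = False
        return asdf.open(path)
```

Scoping the change with a context manager rather than changing the global config keeps the trade-off local: only files opened through this helper skip validation, so a malformed or schema-violating file would go undetected only on that code path.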