Skip to content

Fixing loading ASDF files from a directory#503

Merged
Cadair merged 13 commits intoDKISTDC:mainfrom
Cadair:asdf_directory_loader
Jan 28, 2025
Merged

Fixing loading ASDF files from a directory#503
Cadair merged 13 commits intoDKISTDC:mainfrom
Cadair:asdf_directory_loader

Conversation

@Cadair
Copy link
Member

@Cadair Cadair commented Jan 22, 2025

fixes #496

I am not super happy with the parsing code, so if anyone has any suggestions for making it better I am very open to improving it.

@Cadair Cadair force-pushed the asdf_directory_loader branch from 6a0d149 to ce6e6fa Compare January 22, 2025 14:25
@Cadair Cadair force-pushed the asdf_directory_loader branch from a36da67 to 211e99d Compare January 27, 2025 13:43
@Cadair Cadair changed the title [WIP] Fixing loading ASDF files from a directory Fixing loading ASDF files from a directory Jan 27, 2025
@Cadair Cadair marked this pull request as ready for review January 27, 2025 14:46
Copy link
Contributor

@eigenbrot eigenbrot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, what the parsing lacks in concision it makes up for in clarity (the good comments help, too!).

if len(candidates) == 1:
asdfs_to_load += candidates
else:
# Now we group by prefix
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small thing, but I think "prefix" in the follow context just means "everything that isn't the suffix", which confused me a bit. Would stem be a better name (taking a hint from pathlib.Path)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stem also feels overloaded because it's everything in the filename which isn't .asdf.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, good point. I'm fine with whatever.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can not think of a good word.

Copy link
Contributor

@SolarDrew SolarDrew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

@Cadair Cadair merged commit 663184f into DKISTDC:main Jan 28, 2025
22 checks passed
@Cadair Cadair deleted the asdf_directory_loader branch January 28, 2025 16:04
Cadair added a commit that referenced this pull request Jan 29, 2025
* Add a regression test and docs for new path loader

[ci skip]

* Oh god what have I done

* Better ordering invariance

* Add a partial match test

* Add a warning for any skipped files

* Add changelog

* This should never be hit

* Add more test cases of multiple patterns

* DKIST dataset IDs are 5 or more uppercase letters

* Dataset IDs must be at least 5 characters

* Add a regex test for <5 char dataset ids

* More review comments

---------

Co-authored-by: Fraser Watson <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Update dataset loader to handle the potential of many asdf files for a single dataset due to naming suffix changes

5 participants