feat: weights implementation for atom#78
Merged
New functions:
- `get_weights(key)`: Retrieve complete weight metadata for a dataset
- `get_weight_names(key)`: Get just the list of weight names
- `get_all_weights_for_release(release_name)`: Get weights for all datasets in a release
- Added a new `GET /weights/dsids/{dsids_str}` endpoint to the PMG Weights API to allow retrieving weights in bulk for multiple datasets at once.
- Refactored `get_all_weights_for_release` in `metadata.py` to batch-process datasets (in chunks of 500) using the new bulk endpoint, significantly reducing API round-trips.
- Updated dataset extraction to correctly poll the `/datasets` API directly instead of strictly relying on local cache, ensuring offline/non-current releases load correctly.
- Added a `tqdm` progress bar to provide visual feedback during bulk weight fetching.
- Expanded the `pytest` mock API handling and added new unit test cases (handling mismatched tag fallbacks, 404 errors, missing e_tags, and malformed datasets) to achieve 100% test coverage for `metadata.py`.
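
The batching described above can be sketched roughly as follows. This is a minimal illustration, not the code in `metadata.py`: the helper names (`chunked`, `bulk_weights_path`, `fetch_all_weights`), the comma-separated form of `dsids_str`, and the assumption that the bulk endpoint returns a JSON object keyed by DSID are all hypothetical. Only the endpoint path and the chunk size of 500 come from this PR.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List

CHUNK_SIZE = 500  # chunk size stated in the PR description


def chunked(items: List[int], size: int) -> Iterator[List[int]]:
    """Yield consecutive slices of `items` holding at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def bulk_weights_path(dsids: Iterable[int]) -> str:
    """Build the bulk endpoint path.

    Assumption: `dsids_str` is a comma-separated list of DSIDs.
    """
    return "/weights/dsids/" + ",".join(str(d) for d in dsids)


def fetch_all_weights(
    dsids: List[int],
    fetch_json: Callable[[str], Dict[str, Any]],
    chunk_size: int = CHUNK_SIZE,
) -> Dict[str, Any]:
    """Fetch weights for every DSID with one request per chunk.

    `fetch_json` is a hypothetical callable that performs the GET for a
    path and returns the decoded JSON body, assumed here to be a dict
    keyed by DSID.
    """
    weights: Dict[str, Any] = {}
    for chunk in chunked(dsids, chunk_size):
        weights.update(fetch_json(bulk_weights_path(chunk)))
    return weights
```

With 1200 DSIDs this issues 3 requests instead of 1200. A progress bar could be added by wrapping the `chunked(...)` iterable in `tqdm`.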
See:
https://gitlab.cern.ch/atlas-outreach-data-tools/pmg-weights-db-api/-/commit/3d66dfb3631c01d1468276983caa53c49c009900
Branch updated: a527656 to de703be
Codecov Report

✅ All modified and coverable lines are covered by tests.

Coverage Diff:

| | main | #78 | +/- |
|---|---|---|---|
| Coverage | 100.00% | 100.00% | |
| Files | 3 | 4 | +1 |
| Lines | 354 | 478 | +124 |
| Hits | 354 | 478 | +124 |
Branch updated: de703be to 94500e4
zlmarshall (Contributor) reviewed on Mar 17, 2026 and left a comment:
Thanks! Just a handful of questions; this looks really good.
Better management of modules, less confusion and shorter files.
In addition, Zach's cosmetic comments
This PR introduces the capability to query and map "weights" and systematic variations (PMG weights) directly through atom.
To support this without choking the API, a new bulk query mechanism was implemented on the backend.
Changes
- Implemented a new `GET /weights/dsids/{dsids_str}` endpoint that processes multiple DSIDs.
- `get_weights`: fetches the weight arrays mapped to a specific dataset ID, explicitly resolving its e_tag if one is not provided.
- `get_weight_names(key, e_tag)`: wrapper that returns a list of the weight name strings.
- `get_all_weights_for_release(release)`: grabs all the weights in a release. A progress bar is implemented to match the caching mechanism in place for datasets. I didn't include it in `fetch_release` because it's not always necessary.
- Added relevant tests. A lot of them.
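
As a rough illustration of how a name-only wrapper can sit on top of the full metadata lookup, here is a minimal sketch. It is not the implementation in this PR: the payload shape (a dict with a `"weights"` list whose entries carry a `"name"` field), the injected `fetch` callable, and the `_sketch` function name are assumptions made for the example.

```python
from typing import Any, Callable, Dict, List, Optional

# Hypothetical payload shape: {"weights": [{"name": "..."}, ...]}
WeightMetadata = Dict[str, Any]


def weight_names_from_metadata(metadata: WeightMetadata) -> List[str]:
    """Extract the weight name strings from a metadata payload."""
    return [entry["name"] for entry in metadata.get("weights", [])]


def get_weight_names_sketch(
    key: str,
    e_tag: Optional[str] = None,
    *,
    fetch: Callable[[str, Optional[str]], WeightMetadata],
) -> List[str]:
    """Return only the weight names for `key`.

    `fetch` stands in for the full metadata lookup; it is expected to
    resolve the e_tag itself when `e_tag` is None.
    """
    return weight_names_from_metadata(fetch(key, e_tag))
```

Keeping the wrapper this thin means tag resolution and error handling live in one place, the metadata lookup, and the name list cannot drift from it.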