feat: add results caching to cohort_diversity_stats #1014

Merged
jonbrenas merged 5 commits into malariagen:master from Sharon-codes:issue-798-cohort-diversity-cache
Mar 5, 2026

Conversation

@Sharon-codes
Contributor

Summary

Adds results-cache support to cohort_diversity_stats() so repeated calls with identical inputs reuse cached output.

Changes

  • Updated malariagen_data/anopheles.py:
    • Added cache key/version name for cohort_diversity_stats.
    • Normalized params for cache usage.
    • Wrapped allele-count + jackknife computation with results_cache_get/set.
  • Added tests/test_anopheles.py:
    • Verifies cache miss computes once.
    • Verifies second call with same params hits cache.
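The get/compute/set pattern and the miss-then-hit test described above can be sketched roughly as follows. This is a minimal, self-contained illustration, not the code from the PR: `ToyResultsCache`, `ToyApi`, `_compute_stats`, the cache name `cohort_diversity_stats_v1`, and the parameter names are all hypothetical stand-ins, and the `results_cache_get`/`results_cache_set` signatures and the `CacheMiss` exception are assumed here rather than taken from the source.

```python
import hashlib
import json


class CacheMiss(Exception):
    """Raised when no cached result exists for a (name, params) pair."""


class ToyResultsCache:
    """Hypothetical in-memory stand-in for a results cache."""

    def __init__(self):
        self._store = {}

    def _key(self, name, params):
        # Hash the JSON-serialised params so identical inputs map to one key.
        payload = json.dumps(params, sort_keys=True).encode()
        return f"{name}/{hashlib.sha256(payload).hexdigest()}"

    def results_cache_get(self, name, params):
        key = self._key(name, params)
        if key not in self._store:
            raise CacheMiss(key)
        return self._store[key]

    def results_cache_set(self, name, params, results):
        self._store[self._key(name, params)] = results


class ToyApi(ToyResultsCache):
    def __init__(self):
        super().__init__()
        self.compute_calls = 0  # counts how often the expensive step runs

    def _compute_stats(self, cohort, region):
        # Placeholder for the expensive allele-count + jackknife computation.
        self.compute_calls += 1
        return {"cohort": cohort, "region": region}

    def cohort_diversity_stats(self, cohort, region):
        name = "cohort_diversity_stats_v1"  # hypothetical versioned cache name
        params = {"cohort": cohort, "region": region}  # normalised, JSON-friendly
        try:
            return self.results_cache_get(name=name, params=params)
        except CacheMiss:
            results = self._compute_stats(**params)
            self.results_cache_set(name=name, params=params, results=results)
            return results


api = ToyApi()
first = api.cohort_diversity_stats("cohort_a", "3L")   # cache miss: computes
second = api.cohort_diversity_stats("cohort_a", "3L")  # cache hit: reuses result
```

Including a version suffix in the cache name is one way to invalidate old entries when the computation changes; whether the PR uses exactly this scheme is not stated in the description.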

Validation

  • poetry run pytest tests/test_anopheles.py -q
  • poetry run ruff check malariagen_data/anopheles.py tests/test_anopheles.py

Closes #798

@Sharon-codes
Contributor Author

@tristanpwdennis Hey Tristan, please check whether everything looks good. If you need me to make any changes, please let me know; otherwise, please accept it. Thanks a lot!

@Sharon-codes
Contributor Author

@jonbrenas This one too, thanks!

Collaborator

@jonbrenas jonbrenas left a comment

Thanks @Sharon-codes. Can you run the tests on the simulated data as is done, for example, for the PCA?

@Sharon-codes
Contributor Author

Hey @jonbrenas, I addressed the review feedback in a follow-up commit.

Changes:

  • moved cache coverage to simulated-data tests (tests/anoph/test_cohort_diversity_stats.py) in the same style as other anoph tests
  • removed the standalone dummy test module
  • fixed a cache serialization issue in cohort_diversity_stats() by converting cached values to numpy arrays before calling results_cache_set
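The comment does not say what the serialization issue was or how the cache stores results. As a rough sketch, assuming the cache persists a dict of values in NumPy's `.npz` format, converting every value to an array up front and round-tripping it could look like this. The `results` dict and its values are made up for illustration, and an in-memory buffer stands in for the cache file.

```python
import io

import numpy as np

# Hypothetical results dict mixing a string, a Python int and a list.
results = {"label": "cohort_a", "n_samples": 10, "values": [0.1, 0.2, 0.3]}

# Convert each value to a NumPy array so the whole dict has a uniform,
# array-only shape before it is handed to the cache.
arrays = {key: np.asarray(value) for key, value in results.items()}

# Write the arrays to an in-memory .npz archive (stand-in for a cache file).
buffer = io.BytesIO()
np.savez(buffer, **arrays)
buffer.seek(0)

# Read the archive back; scalars come back as 0-dimensional arrays.
with np.load(buffer) as loaded:
    restored = {key: loaded[key] for key in loaded.files}
```

Callers that need plain Python values can unwrap the 0-dimensional arrays with `.item()`, for example `restored["n_samples"].item()`.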

Validation run locally:

  • pre-commit
  • mypy (mypy malariagen_data tests --ignore-missing-imports)
  • pytest (tests/anoph/test_cohort_diversity_stats.py)

If everything looks good to you, please merge this PR. Thanks!

@jonbrenas jonbrenas merged commit ea9e24c into malariagen:master Mar 5, 2026
8 checks passed

Development

Successfully merging this pull request may close these issues.

Add caching to diversity functions

2 participants