Faster xarray concatenation #395

Merged
alimanfoo merged 5 commits into malariagen:master from alimanfoo:fast-xarray-concat-2023-05-11
May 11, 2023

Conversation

@alimanfoo
Member

Something odd is going on inside the xarray concat() function that causes a big slowdown when we concatenate datasets, such as the datasets concatenated internally by the snp_calls() and haplotypes() functions. A lot of the time is spent in one particular list comprehension inside concat(), and its logic doesn't look right (possibly pathological for large dimensions?).

In the meantime, this PR hacks together a cut-down routine for concatenating datasets along a dimension, which is much faster (roughly 1000x).
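
For readers who want a feel for what a cut-down concatenation can look like, here is a minimal sketch. It is not the code merged in this PR: the helper name `simple_concat` is hypothetical, and it assumes every input dataset has the same variables, dimension order, and attributes, so none of the alignment or compatibility checks that xarray's concat() performs are needed.

```python
import numpy as np
import xarray as xr


def simple_concat(datasets, dim):
    """Concatenate datasets along `dim` (hypothetical helper, a sketch only).

    Assumes all datasets share the same variables, dims, and attrs, and
    skips every consistency check that xarray.concat() would perform.
    """
    first = datasets[0]
    coord_names = set(first.coords)
    data_vars = {}
    coords = {}
    for name, var in first.variables.items():
        if dim in var.dims:
            # Join the underlying arrays directly along the matching axis.
            axis = var.dims.index(dim)
            data = np.concatenate(
                [ds.variables[name].data for ds in datasets], axis=axis
            )
            new_var = xr.Variable(var.dims, data, attrs=var.attrs)
        else:
            # Variables without `dim` are taken from the first dataset as-is.
            new_var = var
        if name in coord_names:
            coords[name] = new_var
        else:
            data_vars[name] = new_var
    return xr.Dataset(data_vars=data_vars, coords=coords, attrs=first.attrs)
```

The trade-off is that the caller becomes responsible for guaranteeing that the inputs are compatible; mismatched variables or dimension orders would raise an error or silently produce wrong output rather than being reconciled.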

@codecov

codecov bot commented May 11, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 95.79%. Comparing base (29b7023) to head (b8b3b00).
Report is 601 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #395   +/-   ##
=======================================
  Coverage   95.79%   95.79%           
=======================================
  Files           4        4           
  Lines         689      689           
=======================================
  Hits          660      660           
  Misses         29       29           


alimanfoo merged commit 61bb489 into malariagen:master on May 11, 2023
alimanfoo deleted the fast-xarray-concat-2023-05-11 branch on May 11, 2023 19:50
alimanfoo added the label "BMGF-001927 Work supported by BMGF grant INV-001927 (MalariaGEN 2019-2024)." on Dec 4, 2024