gene_cnv_frequencies can return ValueError #667
Closed
While working on #663, I realised that we don't have tests for gene_cnv_frequencies. When I started writing some, I got quite a few unexpected errors. It turns out that these errors are not hard to trigger, even with non-random inputs. For example:
import malariagen_data
af1 = malariagen_data.Af1()
af1.gene_cnv_frequencies(region='X:10_000_000-12_000_000', sample_sets="1229-VO-GH-DADZIE-VMF00095", cohorts="admin2_month")
yields:
ValueError Traceback (most recent call last)
<ipython-input-2-c41477673d75> in <cell line: 1>()
----> 1 af1.gene_cnv_frequencies(region='X:10_000_000-12_000_000', sample_sets="1229-VO-GH-DADZIE-VMF00095", cohorts="admin2_month")
6 frames
/usr/local/lib/python3.10/dist-packages/pandas/core/internals/construction.py in _extract_index(data)
688 f"length {len(index)}"
689 )
--> 690 raise ValueError(msg)
691 else:
692 index = default_index(lengths[0])
ValueError: array length 228 does not match index length 0
I think the issue is that no CNV data is found for any cohort, so pandas ends up trying to create a column of size 0 for max_af. Whatever the cause, a bare pandas ValueError shouldn't be the result of a fairly normal call to a function like this.
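To illustrate the suspected mechanism, here is a minimal sketch in plain pandas that raises the same kind of ValueError. It does not use malariagen_data and is not claimed to be the library's actual code path: `build_frame`, `build_frame_checked`, `EmptyCohortDataError` and the `gene_index` column are hypothetical names made up for this example. It assumes the failure comes from combining a raw array of length 228 with a Series whose index is empty.

```python
import numpy as np
import pandas as pd


def build_frame(n_genes, freq_columns):
    """Combine a raw per-gene array with frequency columns (hypothetical helper)."""
    # A raw NumPy array carries a length but no index.
    data = {"gene_index": np.arange(n_genes)}
    # Series columns carry their own index; here it may be empty.
    data.update(freq_columns)
    return pd.DataFrame(data)


# Hypothetical stand-in for "no CNV data found for any cohort".
empty_freqs = {"max_af": pd.Series([], dtype="float64")}

try:
    build_frame(228, empty_freqs)
except ValueError as err:
    # pandas cannot reconcile the raw array length with the empty index.
    repro_error = err
    print(f"pandas raised: {err}")


class EmptyCohortDataError(Exception):
    """Hypothetical error type for 'no data matched the query'."""


def build_frame_checked(n_genes, freq_columns):
    """Like build_frame, but fail early with a descriptive message."""
    empty = [name for name, col in freq_columns.items() if len(col) == 0]
    if empty:
        raise EmptyCohortDataError(f"no frequency data available for columns: {empty}")
    return build_frame(n_genes, freq_columns)
```

One possible design choice, shown in `build_frame_checked`, is to detect the empty case before the DataFrame is assembled and report it explicitly. Whether the function should raise a descriptive error or return an empty result is left open here.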