gene_cnv_frequencies can raise ValueError #667

@jonbrenas

Description

While working on #663, I realised that we don't have tests for gene_cnv_frequencies. When I started writing some, I got quite a few unexpected errors. It turns out they are not hard to trigger, even with non-random values. For example:

import malariagen_data
af1 = malariagen_data.Af1()
af1.gene_cnv_frequencies(region='X:10_000_000-12_000_000', sample_sets="1229-VO-GH-DADZIE-VMF00095", cohorts="admin2_month")

yields:

ValueError                                Traceback (most recent call last)
<ipython-input-2-c41477673d75> in <cell line: 1>()
----> 1 af1.gene_cnv_frequencies(region='X:10_000_000-12_000_000', sample_sets="1229-VO-GH-DADZIE-VMF00095", cohorts="admin2_month")

6 frames
/usr/local/lib/python3.10/dist-packages/pandas/core/internals/construction.py in _extract_index(data)
    688                     f"length {len(index)}"
    689                 )
--> 690                 raise ValueError(msg)
    691         else:
    692             index = default_index(lengths[0])

ValueError: array length 228 does not match index length 0

I think the issue is that no CNV data is found for any cohort, so pandas ends up trying to create a max_af column of length 0 alongside 228 rows of other data. Whatever the exact cause, a fairly normal call to a function like this shouldn't end in a ValueError.
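
For reference, here is a minimal pandas-only sketch that produces the same error message. It is not the malariagen_data code path: `build_frame`, `gene_id` and the way `max_af` is built are hypothetical stand-ins, chosen only to show how mixing a full-length array with an empty Series in a DataFrame constructor can lead to this ValueError.

```python
import numpy as np
import pandas as pd


def build_frame(n_genes):
    # Hypothetical stand-in, NOT the malariagen_data implementation:
    # a raw array with one entry per gene, plus an empty Series standing in
    # for a max_af column derived from zero cohort frequency columns.
    gene_ids = np.arange(n_genes)
    max_af = pd.Series([], dtype="float64")
    # pandas takes the index from the Series (length 0) and then finds that
    # the raw array (length n_genes) does not fit it.
    return pd.DataFrame({"gene_id": gene_ids, "max_af": max_af})


message = None
try:
    build_frame(228)
except ValueError as e:
    message = str(e)

print(message)
```

If this is indeed what happens, one option might be to check for the "no cohorts with data" case before building the frame and either return an empty result or raise a more descriptive error, but that is a guess until the actual code path is confirmed.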
