Moving `karyotype` to anoph by jonbrenas · Pull Request #702 · malariagen/malariagen-data-python

jonbrenas · 2024-12-11T14:52:07Z

Addresses #698.

The move is done and all tests pass locally but more tests need to be added (for funestus, for example).

jonbrenas · 2024-12-11T16:04:11Z

I added tests to test_ag3.py and test_af1.py to check that errors were indeed raised when expected. I am not really satisfied yet, though.

alimanfoo

Hi @jonbrenas, thanks for this, a few suggestions...

malariagen_data/anoph/karyotype.py

alimanfoo · 2024-12-13T17:20:50Z

+        if not self._inversion_tag_path:
+            raise FileNotFoundError(
+                "The file containing the inversion tags is missing."
+            )


FileNotFoundError isn't quite the right exception class here.

Suggested change

-        if not self._inversion_tag_path:
-            raise FileNotFoundError(
-                "The file containing the inversion tags is missing."
-            )
+        if self._inversion_tag_path is None:
+            raise NotImplementedError(
+                "No inversion tags are available for this data resource."
+            )

jonbrenas · 2025-01-02

I have been of two minds about this one. On one hand, it is true that we have not generated the tags for Af1, and one could argue that NotImplementedError is the more appropriate error for work that is still to be done. On the other hand, the code itself would (I think) work if the file actually existed, so it is less an issue of missing code and more an issue of a missing input. I can also imagine a situation where someone has created tags for an inversion and wants to use their local file. That is not currently possible, because the path is hard-coded in both Ag3 and Af1, but in that case the error would come from the path being incorrect (though an error message referring to the actual path supplied would be more helpful if we want to offer this option).

leehart · 2025-01-31 (edited)

Yes, this is an interesting case, and I have sympathy for both sides. However, I would advise against raising FileNotFoundError when the code has not actually looked for a file and found it missing, as is the case here. We merely expect the file not to be found because the variable has not been set; we haven't technically looked for it. I would personally prefer to raise FileNotFoundError only if the variable had been set to None as a direct consequence of no file being found, rather than, as here, because the variable was never changed from its default, if I'm reading it correctly. I therefore expect that FileNotFoundError would be potentially misleading.

In a situation like this, where a particular value calls for a halt and an exception, I would first think of raising a ValueError, since we want to escape from an invalid state from which we cannot continue on the same path unless the value is changed before rerunning. However, I can appreciate that this technical fact is not going to make for the most useful error for the end user, since they might not know why the value hasn't been set to something valid. I suppose this could be explained in the error message itself. Relatedly, I reckon raising a RuntimeError would be too severe and inappropriate, since we might expect this scenario to happen occasionally.

Raising a NotImplementedError would make sense to me where the variable is None as a direct consequence of the functionality not yet being supported for that particular species group, which seems to be the case here, if I'm reading it correctly. What I'm still chewing on is the error message itself. Personally, I would want the message to be explicit as well as helpful, and not to make too many assumptions. So I might say that the variable has not been set, with a hint that this might mean we do not yet have inversion tags available for this data resource. If you consider that the exception report will probably show the surrounding code and reveal the precise point of failure around `self._inversion_tag_path is None`, then I seem to end up at the same conclusion. I probably wouldn't object to raising a ValueError with a similarly helpful error message instead of NotImplementedError, but this seems more like a "lack of support" scenario than an "invalid input" scenario.
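To make the distinction in this thread concrete, here is a minimal sketch of the two failure modes. The class `KaryotypeSketch` and the method `_check_inversion_tags` are hypothetical names used only for illustration and are not the code in `malariagen_data/anoph/karyotype.py`; only the `_inversion_tag_path` attribute comes from the diff above. It assumes the path defaults to `None` when no tags exist for a resource.

```python
import os
from typing import Optional


class KaryotypeSketch:
    """Hypothetical stand-in for the karyotype class; not the real implementation."""

    def __init__(self, inversion_tag_path: Optional[str] = None):
        # Assumption: None means no tags have been published for this resource.
        self._inversion_tag_path = inversion_tag_path

    def _check_inversion_tags(self) -> str:
        if self._inversion_tag_path is None:
            # No lookup has happened, so this is a lack of support,
            # not a missing file.
            raise NotImplementedError(
                "No inversion tag path has been set, which probably means "
                "inversion tags are not yet available for this data resource."
            )
        if not os.path.exists(self._inversion_tag_path):
            # A lookup really happened and failed, so FileNotFoundError
            # is accurate, and the message can name the offending path.
            raise FileNotFoundError(
                f"Inversion tag file not found: {self._inversion_tag_path!r}"
            )
        return self._inversion_tag_path
```

With this split, a resource without tags reports "not supported", while a path that was set but does not exist (the user-supplied file scenario raised earlier in the thread) reports the actual path that was tried.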

malariagen_data/anoph/karyotype.py

malariagen_data/anoph/karyotype_params.py

alimanfoo · 2024-12-13T17:27:54Z

tests/integration/test_af1.py

+    if inversion == "X_x":
+        with pytest.raises(TypeError):
+            af1.karyotype(
+                inversion=inversion, sample_sets="AG1000G-GH", sample_query=None
+            )
+    else:
+        with pytest.raises(FileNotFoundError):
+            af1.karyotype(
+                inversion=inversion,
+                sample_sets="1229-VO-GH-DADZIE-VMF00095",
+                sample_query=None,
+            )


All inversion parameter values should fail with NotImplementedError I think.
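A rough sketch of what that expectation looks like, written without pytest so it runs on its own. `FakeAf1` is a hypothetical stand-in for the real Af1 client (which needs remote data), and the inversion name `2La` is only an illustrative example alongside the `X_x` value from the test above. In the actual test suite this would presumably be a single `pytest.raises(NotImplementedError)` block covering every parametrized inversion.

```python
class FakeAf1:
    """Hypothetical stand-in for the Af1 client, which has no inversion tags."""

    _inversion_tag_path = None

    def karyotype(self, inversion, sample_sets=None, sample_query=None):
        # The availability check comes before any validation of `inversion`,
        # so every inversion value fails in the same way.
        if self._inversion_tag_path is None:
            raise NotImplementedError(
                "No inversion tags are available for this data resource."
            )
        raise AssertionError("unreachable in this sketch")


def raises_not_implemented(client, inversion):
    """Return True if karyotype() raises NotImplementedError for this inversion."""
    try:
        client.karyotype(
            inversion=inversion,
            sample_sets="1229-VO-GH-DADZIE-VMF00095",
            sample_query=None,
        )
    except NotImplementedError:
        return True
    return False


# A plausible inversion name and a nonsense name should behave identically.
inversions = ["2La", "X_x"]
results = {inv: raises_not_implemented(FakeAf1(), inv) for inv in inversions}
```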

tests/integration/test_ag3.py

alimanfoo · 2024-12-13T17:30:53Z

The other question here is if/how to get unit tests using the simulated data. It could be tricky because the simulated data is not guaranteed to include sites at the tag SNP positions.
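The difficulty can be illustrated with a small sketch. Everything here is made up for illustration: the helper `tag_snp_coverage`, the contig length, the number of simulated sites, and the tag positions are assumptions, not values from the real simulator or the real tag files.

```python
import random


def tag_snp_coverage(simulated_positions, tag_positions):
    """Return the fraction of tag SNP positions present in the simulated data."""
    if not tag_positions:
        raise ValueError("tag_positions must not be empty")
    simulated = set(simulated_positions)
    found = [pos for pos in tag_positions if pos in simulated]
    return len(found) / len(tag_positions)


# Made-up numbers: 1,000 variant sites drawn at random from a 1 Mbp contig.
rng = random.Random(42)
simulated_positions = sorted(rng.sample(range(1, 1_000_001), k=1_000))

# Hypothetical tag SNP positions for one inversion.
tag_positions = [150_000, 420_000, 910_000]

coverage = tag_snp_coverage(simulated_positions, tag_positions)
print(f"Fraction of tag SNPs present in simulated data: {coverage:.2f}")
```

With sparse random sites, any given tag position is unlikely to be present, so a karyotype call on simulated data may have little or nothing to work with unless the simulation is made aware of the tag positions.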

review-notebook-app · 2025-01-02T15:46:07Z

Check out this pull request on ReviewNB.

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

jonbrenas · 2025-01-02T15:48:51Z

Thanks @alimanfoo. I made most of the changes you recommended. As for the tests, I created a separate issue, #700, to deal with them independently of this PR, as I agree that it is not trivial.

leehart · 2025-01-31T11:48:48Z

Thanks @jonbrenas . I reckon I'd be happy to approve this once the FileNotFoundError exception has been changed to NotImplementedError as suggested above. I've attempted to explain my own reasoning, but please feel free to debate it more if you sense it's the wrong direction.

jonbrenas · 2025-01-31T11:59:25Z

Thanks @leehart. I agree with your and Alistair's reasoning.

leehart

Thanks @jonbrenas

Moved karyotype to anoph

cdda0a2

jonbrenas marked this pull request as draft December 11, 2024 14:52

Added tests

19e99bc

jonbrenas added 2 commits December 11, 2024 16:21

Changed the error for Af

64eb5ea

Linting

6649eca

jonbrenas marked this pull request as ready for review December 11, 2024 16:51

alimanfoo reviewed Dec 13, 2024

Addressing comments

f87e0da

Merge branch 'master' into 698-move-karyotype

abcd9f8

jonbrenas mentioned this pull request Jan 3, 2025

Add tests on simulated data for karyotype #700

Open

leehart requested review from alimanfoo and leehart and removed request for alimanfoo January 24, 2025 11:18

Merge branch 'master' into 698-move-karyotype

c1b5985

jonbrenas added 2 commits January 31, 2025 11:56

Updated error

752a63f

Merge branch 'master' into 698-move-karyotype

a40b5f2

jonbrenas added 2 commits January 31, 2025 12:07

I forgot to update the tests.

ef121d1

Merge branch '698-move-karyotype' of github.com:malariagen/malariagen…

dde825d

…-data-python into 698-move-karyotype

leehart approved these changes Jan 31, 2025

leehart merged commit ff10e60 into master Jan 31, 2025

leehart deleted the 698-move-karyotype branch January 31, 2025 14:57

3 participants
