Conversation
|
I added tests to |
alimanfoo
left a comment
There was a problem hiding this comment.
Hi @jonbrenas, thanks for this, a few suggestions...
malariagen_data/anoph/karyotype.py
Outdated
| if not self._inversion_tag_path: | ||
| raise FileNotFoundError( | ||
| "The file containing the inversion tags is missing." | ||
| ) |
There was a problem hiding this comment.
FileNotFoundError isn't quite the right exception class here.
| if not self._inversion_tag_path: | |
| raise FileNotFoundError( | |
| "The file containing the inversion tags is missing." | |
| ) | |
| if self._inversion_tag_path is None: | |
| raise NotImplementedError( | |
| "No inversion tags are available for this data resource." | |
| ) |
There was a problem hiding this comment.
I have been of two minds about this one: on one hand, it is true that we have not generated the tags for Af1 and one could argue that NotImplementedError is a more appropriate error for work that is still to be done; on the other hand, the code itself would (I think) work if the file actually existed and it is thus less an issue of missing code and more an issue of a missing input. Also, I can imagine a situation where someone created tags for an inversion and would want to try to use their local file (it is not currently possible, the path that is used is hard-coded in both Ag3 and Af1) in which case the error would come from the path being incorrect (though, an error message referring to the actual path inputed would be more helpful if we want to offer this option).
There was a problem hiding this comment.
Yes, this is an interesting case, and I have sympathies for both sides. However, I would advise against raising FileNotFoundError in cases where the code has not actually looked for a file and found it missing, such as in this case. In this case, we are merely expecting the file to be not found because the variable has not been set, but we haven't technically even looked for it. I would personally prefer to only raise FileNotFoundError in situations like this if the variable has been set to None as a direct consequence of there being no file found, rather than this case where it has been set to None as a direct consequence of the variable not being changed from its default, if I'm reading it correctly. Therefore, I expect that FileNotFoundError would be potentially misleading.
In this situation, where a particular value is begging a halt and an exception to be raised, I would first think of raising a ValueError, since we want to escape from an invalid state, from which we cannot continue on the same path, unless the value is changed to something else before rerunning. However, I can appreciate that this technical fact is not going to be the most useful error for the end user, since they might not know why the value hasn't been set to something valid. I suppose this might be explained in the actual error message. Relatedly, I reckon raising a RuntimeError would be too severe and inappropriate, since we might expect this scenario to happen occasionally.
Raising a NotImplementedError would make sense to me in a situation where this variable is set to None as a direct consequence of the functionality not yet being supported for that particular species group, which seems to be the case here, if I'm reading it correctly. However, what I'm still chewing on is the error message itself. Personally I would want to be explicit as well as helpful in the message, and not make too many assumptions. So I might say something about the fact that the variable has not been set together with the hint that this might mean that we do not yet have inversion tags available for this data resource. If you consider that the exception report will probably show the surrounding code and reveal the precise point of failure around self._inversion_tag_path is None then I seem to end up at an identical conclusion. I probably wouldn't object to raising a ValueError with a similarly helpful error message, instead of NotImplementedError, but this seems more like a "lack of support" scenario rather than an "invalid input" scenario.
tests/integration/test_af1.py
Outdated
| if inversion == "X_x": | ||
| with pytest.raises(TypeError): | ||
| af1.karyotype( | ||
| inversion=inversion, sample_sets="AG1000G-GH", sample_query=None | ||
| ) | ||
| else: | ||
| with pytest.raises(FileNotFoundError): | ||
| af1.karyotype( | ||
| inversion=inversion, | ||
| sample_sets="1229-VO-GH-DADZIE-VMF00095", | ||
| sample_query=None, | ||
| ) |
There was a problem hiding this comment.
All inversion parameter values should fail with NotImplementedError I think.
|
The other question here is if/how to get unit tests using the simulated data. It could be tricky because the simulated data is not guaranteed to generate data at the tag SNP positions. |
|
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
|
Thanks @alimanfoo. I made most of the changes you recommended. As for the tests, I created another issue #700 to deal with it independently of this as I agree that it is not trivial. |
|
Thanks @jonbrenas . I reckon I'd be happy to approve this once the |
|
Thanks @leehart. I agree with your and Alistair's reasoning. |
…-data-python into 698-move-karyotype
Addresses #698.
The move is done and all tests pass locally but more tests need to be added (for funestus, for example).