Provide clearer error when server provides bad data description XML #1178
Debugging the errors of some dataset unit tests was particularly difficult, because the real reason for their errors (malformed XML) got swallowed as if the dataset were still in preprocessing (hence the change in `test_dataset_functions`). By our own definition, we should only catch `OpenMLServerException` there ("exception for when the result of the server was not 200").

The malformed XML error was still hard to debug, since only the XML parser's `ExpatError` was provided, which gives no information about the XML file. Restructuring the `_get_dataset_description` function has two purposes:

I ran `pytest tests/test_datasets/test_dataset_functions.py::TestOpenMLDataset` locally, and all tests still pass (except those with known server or parquet issues).
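
For illustration, the idea of attaching the file location to an otherwise opaque `ExpatError` can be sketched roughly as below. This is a minimal standalone sketch, not the code in this PR: `MalformedDescriptionError` and `parse_description_file` are hypothetical names, and it uses the standard library's `xml.dom.minidom` so it runs without extra dependencies.

```python
import tempfile
from pathlib import Path
from xml.dom import minidom
from xml.parsers.expat import ExpatError


class MalformedDescriptionError(Exception):
    """Hypothetical error type for this sketch; not part of openml-python."""


def parse_description_file(path: Path) -> minidom.Document:
    """Parse an XML file, naming the file in the error if parsing fails."""
    xml_text = path.read_text(encoding="utf-8")
    try:
        return minidom.parseString(xml_text)
    except ExpatError as e:
        # ExpatError alone only reports a line/column, so add the file path
        # and chain the original exception for the full traceback.
        raise MalformedDescriptionError(
            f"Dataset description XML at '{path}' is malformed: {e}"
        ) from e


def demo() -> str:
    """Write a truncated XML file and return the resulting error message."""
    with tempfile.TemporaryDirectory() as tmp:
        bad = Path(tmp) / "description.xml"
        # The root element is never closed, so the parser must fail.
        bad.write_text("<data_set_description><id>1</id>", encoding="utf-8")
        try:
            parse_description_file(bad)
        except MalformedDescriptionError as e:
            return str(e)
    return ""


if __name__ == "__main__":
    print(demo())
```

Chaining with `from e` keeps the parser's own message available while the outer message points at the file to inspect.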