Conversation

@PGijsbers
Collaborator

It looks like the predictions loaded from an ARFF file are read as floats by the ARFF reader, which results in a different dtype (float vs. int). Because equality of the values is already checked, I figured the dtype is not as important. That said, I am not sure why there are so many redundant comparisons in the first place. Anyway, the difference should be due to pandas' inference behavior, and if that is what we want to test, then we should make a small isolated test case instead of integrating it into every prediction unit test.
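A minimal sketch of what such an isolated test could look like (the test name and the use of `pandas.testing` are illustrative, not the actual openml-python test suite):

```python
# Hypothetical, isolated test sketch -- not the actual openml-python test suite.
# It pins down the observation above: after an ARFF round trip the dtype changes
# (int -> float), but the values themselves are still equal.
import pandas as pd
import pandas.testing as pdt


def test_arff_roundtrip_changes_dtype_but_not_values():
    written = pd.Series([0, 1, 2], name="fold")           # int64, as produced by the for loops
    read_back = pd.Series([0.0, 1.0, 2.0], name="fold")   # float64, as read back from ARFF NUMERIC

    # The dtypes differ ...
    assert written.dtype != read_back.dtype

    # ... but the values are equal, which is what the prediction tests care about.
    pdt.assert_series_equal(written, read_back, check_dtype=False)
```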

Collaborator

@mfeurer mfeurer left a comment


Fine by me, but why did this fail in the first place 😕

@PGijsbers
Collaborator Author

If I remember correctly, the numbers are integers. When openml-python executes the experiments, they are also integers (they come from for loops). When read back from an ARFF file (with the attribute annotated as NUMERIC), they are read as floats, despite really being integers.
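A small sketch of that behaviour, assuming the liac-arff package (the ARFF reader openml-python relies on); the file content is a made-up miniature of a predictions file, not an actual OpenML artifact:

```python
# Sketch assuming the liac-arff package ("arff" on PyPI).
import arff  # liac-arff

content = "\n".join([
    "@RELATION predictions",
    "@ATTRIBUTE repeat NUMERIC",
    "@ATTRIBUTE fold NUMERIC",
    "@DATA",
    "0,0",
    "0,1",
    "1,0",
])

decoded = arff.loads(content)
first_row = decoded["data"][0]
print(first_row)           # [0.0, 0.0]  -- floats, not ints
print(type(first_row[0]))  # <class 'float'>
```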

@mfeurer
Collaborator

mfeurer commented Oct 25, 2022

That description makes sense, but why did this work at some point at all?

The behavior of the ARFF reader is correct, as the ARFF format has no separate integer type. Maybe OpenML should also move away from ARFF for internal data structures.
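For illustration only, a typed columnar format would sidestep this class of problem, since integer dtypes survive the round trip; a quick sketch (assuming pyarrow or fastparquet is installed as the pandas Parquet engine):

```python
# Illustration only: integer dtypes survive a Parquet round trip, unlike ARFF NUMERIC.
import io

import pandas as pd

predictions = pd.DataFrame({"repeat": [0, 0, 1], "fold": [0, 1, 0]})  # int64 columns

buffer = io.BytesIO()
predictions.to_parquet(buffer)
buffer.seek(0)

read_back = pd.read_parquet(buffer)
print(read_back.dtypes)  # repeat    int64
                         # fold      int64
```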

@PGijsbers
Collaborator Author

> That description makes sense, but why did this work at some point at all?

My best guess is a change in pandas' behaviour, but I haven't spent time nailing it down exactly (it's interesting, but didn't seem important).

@PGijsbers PGijsbers merged commit f37ebbe into develop Nov 24, 2022
@PGijsbers PGijsbers deleted the fix_test_runs branch November 24, 2022 18:18
PGijsbers added a commit to Mirkazemi/openml-python that referenced this pull request Feb 23, 2023
It looks like the predictions loaded from an ARFF file are read as floats by the ARFF reader, which results in a different dtype (float vs. int). Because equality of the values is already checked, I figured the dtype is not as important. That said, I am not sure why there are so many redundant comparisons in the first place. Anyway, the difference should be due to pandas' inference behavior, and if that is what we want to test, then we should make a small isolated test case instead of integrating it into every prediction unit test. Finally, over the next year we should move away from ARFF.