-
Notifications
You must be signed in to change notification settings - Fork 1k
i.realDouble joins to x.int again by coercing x.int to double #3846
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## master #3846 +/- ##
==========================================
+ Coverage 99.42% 99.42% +<.01%
==========================================
Files 71 71
Lines 13408 13409 +1
==========================================
+ Hits 13331 13332 +1
Misses 77 77
Continue to review full report at Codecov.
|
|
cc @HughParsonage as |
MichaelChirico
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have the rules about type mismatch written into the documentation anywhere? Maybe a section in ?merge.data.table?
| set(i, j=ic, value=val) | ||
| set(callersi, j=ic, value=val) # change the shallow copy of i up in [.data.table to reflect in the result, too. | ||
| } else { | ||
| if (verbose) cat("Coercing integer column x.",names(x)[xc]," to type double to match type of i.",names(i)[ic]," which contains fractions.\n",sep="") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does setNumericRounding come into play here? If so we should note such (in the verbose message and/or in dox)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it comes into play. But it does anyway for regular double to double type-matching joins too. setNumericRounding is really just to resolve numeric accuracy; more like very-slight-inaccuracy. I'm not sure the wording to use to best convey this, or for the manual page. But yes agree would be good to add. I just got swamped by another 11 revdep fails in the full-rerun so I'll try and get through those as priority. The full-rerun included this PR and all the fails were unrelated, so that's good.
If you have something in mind for the verbose message here and/or manual page, feel free to add in a follow-up PR, or raise an issue to follow-up so we don't forget. Agree.
MarkusBonsch
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I completely agree with the new behaviour that you propose.
Honestly, I wonder, why I implemented it differently since the table that I referenced in my second post in #2592 suggests the behaviour that you now implemented.
Resolves
grattanin #3581In v1.12.2 when joining i.realDouble to x.int, fractions in i.realDouble would be truncated and would succeed in joining to those integers in x. That was widely agreed as wrong behavior. Then in dev we changed that to a type error in just the case when fractions are detected to be present in the i column. This PR finally gets the balance right, I hope, by coercing x.int to double instead so the numbers with fractions will not return a match, and the type and contents of the i column is still preserved in the result. There is a verbose message consistent with the other similar coercions too.
In future we could introduce an option to turn all of the verbose coercion messages in bmerge (not just this one) into warnings or errors. That way users can turn on the strict type-match-required option for production (to avoid wasteful coercions) without getting in the way of revdeps or new users.
The "Coerced" to "Coercing" change in verbose messages is just to make the verbose messages consistent. Chose "Coercing" to convey that the message is printed before the action and isn't past tense yet. If any of the coercions fails with an error for any reason then the verbose message will have just printed in current tense.