Skip to content

Conversation

@mattdowle
Copy link
Member

@mattdowle mattdowle commented Sep 9, 2019

Resolves grattan in #3581

In v1.12.2 when joining i.realDouble to x.int, fractions in i.realDouble would be truncated and would succeed in joining to those integers in x. That was widely agreed as wrong behavior. Then in dev we changed that to a type error in just the case when fractions are detected to be present in the i column. This PR finally gets the balance right, I hope, by coercing x.int to double instead so the numbers with fractions will not return a match, and the type and contents of the i column is still preserved in the result. There is a verbose message consistent with the other similar coercions too.

In future we could introduce an option to turn all of the verbose coercion messages in bmerge (not just this one) into warnings or errors. That way users can turn on the strict type-match-required option for production (to avoid wasteful coercions) without getting in the way of revdeps or new users.

The "Coerced" to "Coercing" change in verbose messages is just to make the verbose messages consistent. Chose "Coercing" to convey that the message is printed before the action and isn't past tense yet. If any of the coercions fails with an error for any reason then the verbose message will have just printed in current tense.

@mattdowle mattdowle added this to the 1.12.4 milestone Sep 9, 2019
@mattdowle mattdowle mentioned this pull request Sep 9, 2019
31 tasks
@codecov
Copy link

codecov bot commented Sep 9, 2019

Codecov Report

Merging #3846 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #3846      +/-   ##
==========================================
+ Coverage   99.42%   99.42%   +<.01%     
==========================================
  Files          71       71              
  Lines       13408    13409       +1     
==========================================
+ Hits        13331    13332       +1     
  Misses         77       77
Impacted Files Coverage Δ
R/bmerge.R 100% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4eb90a1...9881b09. Read the comment docs.

@MichaelChirico
Copy link
Member

cc @HughParsonage as grattan co-author too

Copy link
Member

@MichaelChirico MichaelChirico left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have the rules about type mismatch written into the documentation anywhere? Maybe a section in ?merge.data.table?

set(i, j=ic, value=val)
set(callersi, j=ic, value=val) # change the shallow copy of i up in [.data.table to reflect in the result, too.
} else {
if (verbose) cat("Coercing integer column x.",names(x)[xc]," to type double to match type of i.",names(i)[ic]," which contains fractions.\n",sep="")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does setNumericRounding come into play here? If so we should note such (in the verbose message and/or in dox)

Copy link
Member Author

@mattdowle mattdowle Sep 9, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it comes into play. But it does anyway for regular double to double type-matching joins too. setNumericRounding is really just to resolve numeric accuracy; more like very-slight-inaccuracy. I'm not sure the wording to use to best convey this, or for the manual page. But yes agree would be good to add. I just got swamped by another 11 revdep fails in the full-rerun so I'll try and get through those as priority. The full-rerun included this PR and all the fails were unrelated, so that's good.
If you have something in mind for the verbose message here and/or manual page, feel free to add in a follow-up PR, or raise an issue to follow-up so we don't forget. Agree.

Copy link
Contributor

@MarkusBonsch MarkusBonsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I completely agree with the new behaviour that you propose.
Honestly, I wonder, why I implemented it differently since the table that I referenced in my second post in #2592 suggests the behaviour that you now implemented.

@mattdowle mattdowle merged commit 3bb130d into master Sep 9, 2019
@mattdowle mattdowle deleted the bmerge_numeric branch September 9, 2019 16:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants