Skip to content

Conversation

@tdhock
Copy link
Member

@tdhock tdhock commented Sep 26, 2020

Closes #4455
hi @jangorecki tagging you for a review since this is a fix for #4455 which you labeled.
Turns out the fix was really easy, just had to delete two lines of code.
I had to update the expected values for two existing tests, which I suspect were created based on the empirical output of melt, rather than the theoretically expected values.
In the linked issue the example code which illustrates the problem is:

library(data.table)
wide.dt <- data.table(
  family=1,
  age_child1=NA_real_, sex_child1=NA_character_,
  age_child2=5, sex_child2="m")
measure.vars <- list(age=c(2,4), sex=c(3,5))
(na.rm.T <- melt(wide.dt, measure.vars=measure.vars, na.rm=TRUE))
(na.rm.F <- melt(wide.dt, measure.vars=measure.vars, na.rm=FALSE)[!is.na(age)])

on current master we get results that are inconsistent (na.rm=TRUE gives variable=1, but na.rm=FALSE gives variable=2).

> (na.rm.T <- melt(wide.dt, measure.vars=measure.vars, na.rm=TRUE))

   family variable age sex
1:      1        1   5   m
> (na.rm.F <- melt(wide.dt, measure.vars=measure.vars, na.rm=FALSE)[!is.na(age)])
   family variable age sex
1:      1        2   5   m

This is problematic for downstream applications such as tdhock/nc#11 which need variable to be the corresponding index of the measure.vars list elements. In the example above we need variable=2 in both cases so that we know the data come from child2.

After applying this PR I get

> (na.rm.T <- melt(wide.dt, measure.vars=measure.vars, na.rm=TRUE))
   family variable age sex
1:      1        2   5   m
> (na.rm.F <- melt(wide.dt, measure.vars=measure.vars, na.rm=FALSE)[!is.na(age)])
   family variable age sex
1:      1        2   5   m

All tests pass on my computer (R-4.0.2, Ubuntu 18.04), please code review.

@codecov
Copy link

codecov bot commented Sep 26, 2020

Codecov Report

Merging #4723 (54afdc2) into master (7b4bd7d) will increase coverage by 0.01%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #4723      +/-   ##
==========================================
+ Coverage   99.42%   99.43%   +0.01%     
==========================================
  Files          73       73              
  Lines       14466    14460       -6     
==========================================
- Hits        14383    14379       -4     
+ Misses         83       81       -2     
Impacted Files Coverage Δ
src/fmelt.c 99.42% <100.00%> (+0.37%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7b4bd7d...54afdc2. Read the comment docs.

@tdhock
Copy link
Member Author

tdhock commented Dec 1, 2020

hi again @mattdowle this is a pretty simple fix, any chance you could review and consider including in next release?

@tdhock
Copy link
Member Author

tdhock commented Feb 25, 2021

hi again @mattdowle @arunsrinivasan please consider merging this simple fix for melt for the next release, thanks! (sorry if I'm being annoying, I know there are a lot of other PRs to examine, I just wanted to make clear that my suggestion for prioritizing my melt-related PRs should be first this one, then #4720, then #4731, then #4737)

@mattdowle mattdowle added this to the 1.14.1 milestone May 7, 2021
@mattdowle mattdowle merged commit ebc5bc3 into master May 7, 2021
@mattdowle mattdowle deleted the fix4455 branch May 7, 2021 21:45
@jangorecki jangorecki modified the milestones: 1.14.9, 1.15.0 Oct 29, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

variable should be independent of na.rm when melt into multiple columns

4 participants