-
Notifications
You must be signed in to change notification settings - Fork 1k
Closed
Labels
reshapedcast meltdcast melt
Milestone
Description
In reference to my comments at #2551, using patterns with melt on unbalanced wide or panel data results in an incorrect molten dataset.
Here's a minimal example, where "a" values are missing "beta" times, and "b" values are missing "alpha" times:
library(data.table)
set.seed(2334)
DT <- data.table(
a_alpha = rnorm(3), a_gamma = rnorm(3),
b_beta = rnorm(3), b_gamma = rnorm(3),
id = c(1:3)
)
melt just sees that there are two sets of variables:
melt(DT, measure.vars = patterns("a_", "b_"))
# id variable value1 value2
# 1: 1 1 -0.1183107 1.682018318
# 2: 2 1 1.2370906 -0.573611132
# 3: 3 1 0.8088209 -0.057320032
# 4: 1 2 -0.7656264 -0.706428227
# 5: 2 2 -0.5919939 0.001899857
# 6: 3 2 0.5279071 1.063851211
Expected behaviour:
melt(DT, "id")[, c("stub", "var") := tstrsplit(variable, "_")][, dcast(.SD, id + var ~ stub)]
# id var a b
# 1: 1 alpha -0.1183107 NA
# 2: 1 beta NA 1.682018318
# 3: 1 gamma -0.7656264 -0.706428227
# 4: 2 alpha 1.2370906 NA
# 5: 2 beta NA -0.573611132
# 6: 2 gamma -0.5919939 0.001899857
# 7: 3 alpha 0.8088209 NA
# 8: 3 beta NA -0.057320032
# 9: 3 gamma 0.5279071 1.063851211
ismirsehregal, markfairbanks and mattdowle
Metadata
Metadata
Assignees
Labels
reshapedcast meltdcast melt