issue with melt.data.table and patterns #1739

@steffenheyne

Hi,

I stumbled upon this behavior of melt.data.table:

library(data.table)
library(ggplot2)

tmp = data.frame(val_a = sample(x=10, replace=FALSE), val_b = sample(x=10, replace=FALSE), val_c = sample(x=10, replace=FALSE))

# rank columns are added in the order b, a, c
tmp$rank_b = rank(tmp$val_b)
tmp$rank_a = rank(tmp$val_a)
tmp$rank_c = rank(tmp$val_c)
str(tmp)

tmpT = melt(data.table(tmp), measure.vars=patterns("val_","rank_"), variable.factor=TRUE, value.name=c("myValue","rank"))
str(tmpT)
ggplot(tmpT, aes(x=rank, y=myValue)) + geom_line(size=2) + facet_wrap(~variable) + theme_bw()

[plot: test1]

whereas I was expecting this plot:

library(data.table)
library(ggplot2)

tmp = data.frame(val_a = sample(x=10, replace=FALSE), val_b = sample(x=10, replace=FALSE), val_c = sample(x=10, replace=FALSE))

# rank columns are added in the order a, b, c
tmp$rank_a = rank(tmp$val_a)
tmp$rank_b = rank(tmp$val_b)
tmp$rank_c = rank(tmp$val_c)
str(tmp)

tmpT = melt(data.table(tmp), measure.vars=patterns("val_","rank_"), variable.factor=TRUE, value.name=c("myValue","rank"))
str(tmpT)
ggplot(tmpT, aes(x=rank, y=myValue)) + geom_line(size=2) + facet_wrap(~variable) + theme_bw()

[plot: test]

The only difference between the two examples is the order of the first two rank columns (rank_b/rank_a/rank_c vs. rank_a/rank_b/rank_c).

melt probably processes each pattern separately and then combines the results by column position, so the order of the rank columns no longer matches the order of the value columns.
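A minimal sketch of what I think is going on, using a made-up two-column table (the `dt` and `long` names are just for illustration). Each pattern picks up its columns in table order, and the two groups then seem to be paired by position rather than by suffix:

```r
library(data.table)

# Tiny table with the rank columns deliberately out of order
dt <- data.table(val_a = 10, val_b = 20, rank_b = 2, rank_a = 1)

# Column names each pattern matches, in table order
grep("^val_", names(dt), value = TRUE)   # "val_a"  "val_b"
grep("^rank_", names(dt), value = TRUE)  # "rank_b" "rank_a"

# melt pairs the i-th match of each pattern, so val_a ends up next to rank_b
long <- melt(dt, measure.vars = patterns("^val_", "^rank_"),
             value.name = c("myValue", "rank"))
print(long)
```

In the result, the row with myValue 10 (from val_a) carries rank 2 (from rank_b), which matches the mismatch I see in the first plot.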

What confused me most was that the new factor "variable" looked correct, since it had 3 levels.

Maybe this simply needs a mention in the documentation?

Would it be possible to get the factor levels correct and "joined" in the correct way?
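In the meantime, a possible workaround (a sketch under my own assumptions, not necessarily how it should be solved in melt itself) is to pass measure.vars as a list of explicitly sorted column names, so both groups share the same suffix order, and then relabel the factor by hand. The names `dt`, `val_cols`, `rank_cols` and `fixed` are made up for this example:

```r
library(data.table)

# Small table with the rank columns deliberately out of order
dt <- data.table(val_a = c(10, 20), val_b = c(30, 40),
                 rank_b = c(3, 4), rank_a = c(1, 2))

# Sort the matched names so both groups follow the same suffix order
val_cols  <- sort(grep("^val_",  names(dt), value = TRUE))
rank_cols <- sort(grep("^rank_", names(dt), value = TRUE))

# Pass the groups explicitly instead of relying on column order
fixed <- melt(dt, measure.vars = list(val_cols, rank_cols),
              value.name = c("myValue", "rank"))

# Replace the positional levels with the shared suffixes ("a", "b")
fixed[, variable := factor(variable, labels = sub("^val_", "", val_cols))]
print(fixed)
```

This only works when every val_ column has a rank_ column with the same suffix; otherwise the sorted groups would still be misaligned.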

Thanks for data.table!
steffen
