-
Notifications
You must be signed in to change notification settings - Fork 1k
Open
Labels
joinsUse label:"non-equi joins" for rolling, overlapping, and non-equi joinsUse label:"non-equi joins" for rolling, overlapping, and non-equi joins
Description
With X[!Y], we can subset to where X has rows not matching Y, but there is no analogue for subsetting to where X does match Y. A X[Y, nomatch=0] join will sort the result according to Y and recognize dupe rows in Y, so I need to do something like X[sort(unique(X[Y, nomatch=0, which=TRUE]))] instead (unless I'm forgetting some other way).
library(data.table)
X = data.table(id = c(3L, 1L, 2L, 1L, 1L), g = c("A", "A", "B", "B", "A"), v = (1:5)*10)
Y = data.table(id = c(1L, 1:3), g = "A")
X[Y, on=names(Y), nomatch=0]
# gives id = 1 before 3, contrary to X's ordering
# gives id = 1 twice, reflecting Y, but the goal is to subset X
# id g v
# 1: 1 A 20
# 2: 1 A 50
# 3: 1 A 20
# 4: 1 A 50
# 5: 3 A 10
X[ sort(unique(X[Y, on=names(Y), nomatch=0, which=TRUE])) ]
# desired result
# id g v
# 1: 3 A 10
# 2: 1 A 20
# 3: 1 A 50
X[!Y, on=names(Y)]
# analogous much simpler code for not join
# id g v
# 1: 2 B 30
# 2: 1 B 40
So I'm looking for new syntax to make this less awkward, maybe something like
X[Y, on=names(Y), subset.join = TRUE]
# or
X[subset.join(Y), on=names(Y)]
mt1022
Metadata
Metadata
Assignees
Labels
joinsUse label:"non-equi joins" for rolling, overlapping, and non-equi joinsUse label:"non-equi joins" for rolling, overlapping, and non-equi joins