Closed
Description
When setkey is called on a data.table that already has a sorted attribute, the existing key should be reused where possible.
This matters most in the setkey(setkey(dt, x, y), x) case, but also in slightly more advanced cases such as setkey(setkey(dt, x, y), y, x). If you do not want to trust the sorted attribute per se, then at least check whether a column marked as sorted really is sorted before sorting it again.
-best, Jan
example:
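library(data.table)
# Unsorted table, and a copy keyed by (x, y)
dtu = data.table(x = sample(1E6, 1E8, replace=TRUE),
                 y = sample(1E6, 1E8, replace=TRUE))
dts = setkey(copy(dtu), x, y)
# Re-key by (y, x): from unsorted data, from keyed data, and by reusing the existing key
onUnsorted <- function() setkey(dtu, y, x)
onSorted <- function() setkey(dts, y, x)
# A stable sort by y alone keeps x ordered within ties, so the key can be marked as (y, x)
onSortedSmart <- function() setattr(setkey(dts, y), "sorted", c("y", "x"))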
identical(onUnsorted(), onSorted())
#[1] TRUE
identical(onSorted(), onSortedSmart())
#[1] TRUE
system.time(onUnsorted())
#   user  system elapsed
#   0.47    0.10    0.56
system.time(onSorted())
#   user  system elapsed
#   0.50    0.07    0.58
system.time(onSortedSmart())
#   user  system elapsed
#   0.24    0.06    0.29
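The key-reuse idea requested above could be sketched roughly as follows. This is only an illustration, not how data.table implements setkey: the helper name setkeyv_reuse is hypothetical, and the sketch assumes both that the existing sorted attribute can be trusted and that setkeyv sorts stably (ties keep their previous order), which is what the onSortedSmart example relies on. It looks for the longest tail of the requested key that matches the start of the existing key, and then sorts only by the remaining leading columns.

```r
library(data.table)

# Hypothetical helper, not part of data.table: key `dt` by the character
# vector `cols`, reusing the existing key where the row order already helps.
setkeyv_reuse <- function(dt, cols) {
  old <- key(dt)                      # NULL when dt has no key
  n <- length(cols)
  # Find the longest suffix of `cols` that is a prefix of the existing key.
  k <- 0L
  for (i in rev(seq_len(min(n, length(old))))) {
    if (identical(tail(cols, i), head(old, i))) { k <- i; break }
  }
  if (k == n) {
    # Requested key is a prefix of the existing key: rows are already in order.
    setattr(dt, "sorted", cols)
  } else if (k > 0L) {
    # Assumes a stable sort: sorting by the leading columns alone leaves
    # ties in the order given by the old key, so the full key can be marked.
    setkeyv(dt, head(cols, n - k))
    setattr(dt, "sorted", cols)
  } else {
    setkeyv(dt, cols)                 # nothing to reuse: full sort
  }
  invisible(dt)
}

# Small usage example (much smaller than the benchmark above).
set.seed(1)
dt <- data.table(x  = sample(100, 1e4, replace = TRUE),
                 y  = sample(100, 1e4, replace = TRUE),
                 id = seq_len(1e4))
reused <- setkeyv_reuse(setkey(copy(dt), x, y), c("y", "x"))
fresh  <- setkey(copy(dt), y, x)
prefix <- setkeyv_reuse(setkey(copy(dt), x, y), "x")
```

Note that in the prefix case the rows stay ordered by (x, y), so the order within ties of x can differ from what a fresh setkey(dt, x) on unsorted data would give, even though the table is validly keyed by x either way.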