Issue 1: When working with a data.frame, assignment to a new variable does not affect the original dataset, while assignment to an existing variable (modification/update) does.
A = setDF(list(mpg = c(21, 21, 22.8, 21.4, 18.7),
cyl = c(6, 6, 4, 6, 8),
disp = c(160, 160, 108, 258, 360)))
# create a count column (output not shown)
A |> DT(, count := .N, by=cyl)
print(A) # does not contain the count column
# mpg cyl disp
# 1 21.0 6 160
# 2 21.0 6 160
# 3 22.8 4 108
# 4 21.4 6 258
# 5 18.7 8 360
# modify an existing column
A |> DT(, disp := disp %% 100)
print(A) # disp has been modified
# mpg cyl disp
# 1 21.0 6 60
# 2 21.0 6 60
# 3 22.8 4 8
# 4 21.4 6 58
# 5 18.7 8 60
I think the expectation when calling a data.table query on a data.frame is that both cases should behave the same way; that is, new assignments and modifications should both affect the original data.frame. This is particularly useful because it avoids having to reassign the result every time we use DT on a data.frame. It would also be consistent with what happens when using a data.table rather than a data.frame.
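Until the two cases behave the same, one possible workaround is to convert the data.frame by reference, assign, and convert back. This is only a sketch of that idea using `setDT()` and `setDF()`, not a proposed fix for `DT()` itself; it rebuilds the same toy data as above so that it runs on its own.

```r
library(data.table)

# Same toy data as in the report above.
A <- setDF(list(mpg  = c(21, 21, 22.8, 21.4, 18.7),
                cyl  = c(6, 6, 4, 6, 8),
                disp = c(160, 160, 108, 258, 360)))

setDT(A)                    # convert to data.table by reference
A[, count := .N, by = cyl]  # add the new column by reference
setDF(A)                    # convert back to a plain data.frame by reference

print(A)                    # now contains the count column
```

The cost is two extra calls around each query, which is the reassignment overhead this issue is asking to avoid.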
Issue 2: Naming a data.frame or data.table D leads to errors when it is used with the DT function:
D <- copy(A)
D[1:3,] |> DT(D[4:5,], on="cyl") # error
Error: object of type 'closure' is not subsettable
A[1:3,] |> DT(A[4:5,], on="cyl") # works
D |> DT(, names(D) := lapply(.SD, sort)) # error
Error: LHS of := isn't column names ('character') or positions ('integer' or 'numeric')
A |> DT(, names(A) := lapply(.SD, sort)) # works
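For comparison, the same join written with the ordinary bracket syntax does not seem to be affected by the name `D`. The sketch below assumes the table is created directly as a data.table rather than copied from `A`. The error messages hint that, inside `DT()`, `D` is resolved to something other than the user's table (possibly the `stats::D` function), but that is a guess, not a confirmed diagnosis.

```r
library(data.table)

# A table deliberately named D, with the same columns as in the report.
D <- data.table(mpg  = c(21, 21, 22.8, 21.4, 18.7),
                cyl  = c(6, 6, 4, 6, 8),
                disp = c(160, 160, 108, 258, 360))

# Bracket-syntax equivalent of: D[1:3,] |> DT(D[4:5,], on = "cyl")
res <- D[1:3][D[4:5], on = "cyl"]
print(res)
```

If this runs while the `DT()` form fails, the problem is likely specific to how `DT()` evaluates its arguments, not to the join itself.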
Session info
R version 4.1.0 (2021-05-18)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19043)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] data.table_1.14.1
loaded via a namespace (and not attached):
[1] zoo_1.8-9 compiler_4.1.0 htmltools_0.5.1.1 tools_4.1.0 xts_0.12.1 yaml_2.2.1
[7] rmarkdown_2.10 grid_4.1.0 knitr_1.33 xfun_0.23 digest_0.6.27 rlang_0.4.11
[13] lattice_0.20-44 evaluate_0.14