I'm not sure whether this is the same issue as #3330, because I haven't set the DTthreads parameter to 1.
But I could imagine that the default value is 1.
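For reference, this is roughly how the thread count can be checked rather than guessed. It is only a sketch: `getDTthreads()` and `setDTthreads()` are data.table functions, while the variable names `old` and `pinned` are just mine for illustration.

```r
library(data.table)

# Thread count data.table is currently using in this session.
old <- getDTthreads()
print(old)

# Pin data.table to a single thread to compare timings with and without it.
setDTthreads(1L)
pinned <- getDTthreads()
print(pinned)

# Restore the previous setting afterwards.
setDTthreads(old)
```

Running the slow operations once with the default setting and once pinned to one thread should show whether the thread count matters here.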
This list of operations takes less than 1 second to execute with data.table v1.11.8.
With data.table v1.12.0 it takes more than 4 seconds!
> myDataTable[colJ != "production", colK:=na.omit(colK), by=.(colA)]
> myDataTable[colJ != "production", colH:=na.omit(colH), by=.(colA)]
> myDataTable[colJ != "production", colL:=na.omit(colL), by=.(colA)]
> myDataTable[colJ == "code",colM:=colE, by=.(colA)]
> myDataTable[,colM:=na.omit(colM), by=.(colA)]
> myDataTable[is.na(colM), colM:=""]
> myDataTable[colJ == "test",colN:=colE, by=.(colA)]
> myDataTable[,colN:=na.omit(colN), by=.(colA)]
And there are only 4 rows in my data table:
print(myDataTable)
colA colB colC colD colE colF colG colH colI colJ colK colL colM colN
1 text text text text text text text text text text text text text text
2 text text text text text text text text text text text text text text
3 text text text text text text text text text text text text text text
4 text text text text text text text text text text text text text text
(You can replace "text" with whatever you like.)
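To make this easier to reproduce, here is a minimal sketch that builds a 4-row stand-in table and times the same operations with `system.time()`. The table contents are placeholders, and the `colJ` values ("code", "test", "production", "other") are an assumption I made so that the filters match at least one row; the real data differs.

```r
library(data.table)

# Hypothetical stand-in for the real table: 4 rows, 14 character columns.
cols <- paste0("col", LETTERS[1:14])  # colA ... colN
myDataTable <- as.data.table(
  setNames(replicate(14, rep("text", 4), simplify = FALSE), cols)
)
# Assumed values so that the "code" and "test" filters each match a row.
myDataTable[, colJ := c("code", "test", "production", "other")]

# Time the whole sequence of grouped assignments from the report.
timing <- system.time({
  myDataTable[colJ != "production", colK := na.omit(colK), by = .(colA)]
  myDataTable[colJ != "production", colH := na.omit(colH), by = .(colA)]
  myDataTable[colJ != "production", colL := na.omit(colL), by = .(colA)]
  myDataTable[colJ == "code", colM := colE, by = .(colA)]
  myDataTable[, colM := na.omit(colM), by = .(colA)]
  myDataTable[is.na(colM), colM := ""]
  myDataTable[colJ == "test", colN := colE, by = .(colA)]
  myDataTable[, colN := na.omit(colN), by = .(colA)]
})

# Report the installed version next to the elapsed wall-clock time.
print(packageVersion("data.table"))
print(timing[["elapsed"]])
```

Running the same script under v1.11.8 and v1.12.0 on the same server gives two elapsed times that can be compared directly.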
We see this problem only in our production environment and not in our other environments. The production server is really busy, which could explain why it takes so long. But the test with v1.11.8 was done in the same environment without any performance issue (we ran this test many times with both versions).
We run R in a Docker container based on Ubuntu 16.04.
> sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.5 LTS
Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.6.0
LAPACK: /usr/lib/lapack/liblapack.so.3.6.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] compiler_3.4.4
I hope this will be fixed in a future data.table release, because we will be stuck on v1.11.8 until this issue is resolved.
Thank you for your support.