
Inconsistent copy behavior when subsetting #2254

@renkun-ken

When we subset a data.table, the resulting columns are supposed to be copied. However, whether they are actually copied seems to depend on the number of rows in the original data.table. Here's an example:

library(data.table)
d1 <- data.table(x = c(0, 1), y = c(1, 0))
d2 <- d1[x >= 0]
d1[, address(x)]
d2[, address(x)]

m1 <- data.table(x = 0, y = 1)
m2 <- m1[x >= 0]
m1[, address(x)]
m2[, address(x)]

The following are the results produced on my machine:

> library(data.table)
data.table 1.10.4
  The fastest way to learn (by data.table authors): https://www.datacamp.com/courses/data-analysis-the-data-table-way
  Documentation: ?data.table, example(data.table) and browseVignettes("data.table")
  Release notes, videos and slides: http://r-datatable.com
> d1 <- data.table(x = c(0, 1), y = c(1, 0))
> d2 <- d1[x >= 0]
> d1[, address(x)]
[1] "0x356b878"
> d2[, address(x)] # copied!
[1] "0x31fbf08"
> m1 <- data.table(x = 0, y = 1)
> m2 <- m1[x >= 0]
> m1[, address(x)]
[1] "0x4331ab8"
> m2[, address(x)] # not copied!
[1] "0x4331ab8"

It looks like dt[TRUE], dt[1L] and dt[cond] are not always consistent in their copy behavior, even when they return the same rows.
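To make the comparison easier to repeat, here is a minimal sketch that checks all three subsetting forms against the same one-row table. `shares_column` is a hypothetical helper written for this illustration (it is not part of data.table); it only compares the strings returned by `address()`. The results may differ between data.table versions, so no expected output is shown.

```r
library(data.table)

# Hypothetical helper (not part of data.table): TRUE if column `col` of `sub`
# lives at the same memory address as the same column of `orig`,
# i.e. the column was NOT copied by the subset.
shares_column <- function(orig, sub, col = "x") {
  identical(address(orig[[col]]), address(sub[[col]]))
}

m1 <- data.table(x = 0, y = 1)

# Each subset returns the same single row; only the form of `i` differs.
results <- c(
  "m1[TRUE]"   = shares_column(m1, m1[TRUE]),
  "m1[1L]"     = shares_column(m1, m1[1L]),
  "m1[x >= 0]" = shares_column(m1, m1[x >= 0])
)

# TRUE means the subset shares the column with m1 (not copied).
print(results)
```

If the three entries are not all the same, the forms disagree on whether the column is copied, which is the inconsistency reported here.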

My session info:

R version 3.4.1 (2017-06-30)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.2 LTS

Matrix products: default
BLAS: /usr/lib/openblas-base/libblas.so.3
LAPACK: /usr/lib/libopenblasp-r0.2.18.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] data.table_1.10.4

loaded via a namespace (and not attached):
[1] compiler_3.4.1 tools_3.4.1    yaml_2.1.14   
