Skip to content

[BUG] Cannot print a data.table with column of type rvar from posterior package #5981

@SteveBronder

Description

@SteveBronder

Summary

When printing a data.table that contains an rvar type from the posterior package an empty data.table is printed instead. See the mrp below for a full case. Looking into it more it appears dim.data.table returns a value of zero for the rows where there should be a nonzero value. dim returning a zero causes line 55 of print.data.table to go to the empty data.table message.

I tried looking into the code for dim:

data.table/src/wrappers.c

Lines 101 to 121 in 46ee525

SEXP dim(SEXP x)
{
// fast implementation of dim.data.table
if (TYPEOF(x) != VECSXP) {
error(_("dim.data.table expects a data.table as input (which is a list), but seems to be of type %s"),
type2char(TYPEOF(x)));
}
SEXP ans = PROTECT(allocVector(INTSXP, 2));
if(length(x) == 0) {
INTEGER(ans)[0] = 0;
INTEGER(ans)[1] = 0;
}
else {
INTEGER(ans)[0] = length(VECTOR_ELT(x, 0));
INTEGER(ans)[1] = length(x);
}
UNPROTECT(1);
return ans;
}

The data.table has a length(x) equal to 1 so the last branch is chosen.

Calling dim.data.frame with the data.table gives the correct output below of (4, 1). I think this is because .row_names_info used in dim.data.frame is just asking for the length of the row names via .Internal(shortRowNames(x, type)).

Minimal reproducible example

library(data.table)
library(posterior)
n <- 4   # length of output vector
x_rvar <- rvar(array(rnorm(n*n, mean = 1, sd = 1), dim = c(n, n)))
x_rvar
# rvar<4000>[4] mean ± sd:
#   [1] 0.98 ± 1.00  1.00 ± 1.02  1.00 ± 0.99  0.99 ± 0.99 

# does not work :(
ex_dt = data.table(ex = x_rvar)
ex_dt
# Empty data.table (0 rows and 1 cols): ex

ex_df = data.frame(ex = x_rvar)

# dim for data.frame has wrong rows
dim(ex_dt)
# [1] 0 1
dim(ex_df)
# [1] 4 1

dim.data.frame(ex_dt)
# [1] 4 1

# But `ex` does exist?
ex_dt[, ex]
# rvar<4>[4] mean ± sd:
# [1] 0.025 ± 0.58  0.759 ± 0.69  1.292 ± 1.14  1.601 ± 0.53

ex_dt[, mean(ex)]
[1] 0.02508674 0.75898825 1.29194188 1.60076280

# printing with print.data.frame works
print.data.frame(ex_dt)
# ex
# 1 0.025 ± 0.58
# 2 0.759 ± 0.69
# 3 1.292 ± 1.14
# 4 1.601 ± 0.53

dput(ex_dt)
# structure(list(ex = 
#  structure(list(), draws = structure(
#   c(-0.228122437024597, -0.665473091749799, 0.433053985537569, 0.56088850510324,  
#      1.57671830971838,  -0.104841641013198, 0.817921294855514, 0.746155054023318, 
#      0.778298953156937,  2.20224060437554, -0.078692684532738, 2.26592066385111,
#      1.21605524439755,   1.59716805846671,  2.34552924058062,  1.24429863711927), 
#      dim = c(4L, 4L), dimnames = list(c("1", "2", "3", "4"), NULL)), 
#      nchains = 1L, cache = <environment>, class = c("rvar",  "vctrs_vctr"))),
#   row.names = c(NA, -4L), class = c("data.table", "data.frame"), 
#   .internal.selfref = <pointer: 0x55cc4ae60f60>)

rownames(ex_dt)
# [1] "1" "2" "3" "4"

Looking over the open/closed issues for data.table I could not find anything similar. I don't know enough about R's internals to know why length(VECTOR_ELT(x, 0)) is giving a value of zero, though I found the source for VECTOR_ELThere

# Output of sessionInfo()

R version 4.3.2 (2023-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Rocky Linux 8.9 (Green Obsidian)

Matrix products: default
BLAS/LAPACK: FlexiBLAS /MNT/SW/NIX/STORE/LKQ7SR37F820WRZCLZ0VC9N6FG5ZB3GD-OPENBLAS-0.3.26/LIB/LIBOPENBLAS.SO;  LAPACK version 3.11.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8      
 [8] LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

time zone: America/New_York
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] posterior_1.5.0   data.table_1.15.2 cmdstanr_0.7.1    testthat_3.2.1   

loaded via a namespace (and not attached):
 [1] vctrs_0.6.5          cli_3.6.2            knitr_1.42           rlang_1.1.3          xfun_0.39            processx_3.8.3       pkgload_1.3.4        generics_0.1.3       tensorA_0.36.2.1    
[10] jsonlite_1.8.8       glue_1.7.0           backports_1.4.1      rprojroot_2.0.3      distributional_0.4.0 ps_1.7.6             brio_1.1.3           fansi_1.0.6          tibble_3.2.1        
[19] abind_1.4-5          lifecycle_1.0.4      compiler_4.3.2       dplyr_1.1.2          waldo_0.5.2          pkgconfig_2.0.3      rstudioapi_0.14      R6_2.5.1             tidyselect_1.2.0    
[28] utf8_1.2.4           pillar_1.9.0         magrittr_2.0.3       checkmate_2.3.1      withr_3.0.0          tools_4.3.2          matrixStats_1.2.0    desc_1.4.2    

Metadata

Metadata

Assignees

No one assigned

    Labels

    non-atomic columne.g. list columns, S4 vector columns

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions