Summary
When printing a data.table that contains a column of type rvar (from the posterior package), an empty data.table is printed instead. See the minimal reproducible example below for a full case. Looking into it further, dim.data.table returns zero for the number of rows where it should return a nonzero value. That zero causes line 55 of print.data.table to take the empty data.table branch.
I tried looking into the code for dim:
Lines 101 to 121 in 46ee525
SEXP dim(SEXP x)
{
  // fast implementation of dim.data.table
  if (TYPEOF(x) != VECSXP) {
    error(_("dim.data.table expects a data.table as input (which is a list), but seems to be of type %s"),
          type2char(TYPEOF(x)));
  }
  SEXP ans = PROTECT(allocVector(INTSXP, 2));
  if(length(x) == 0) {
    INTEGER(ans)[0] = 0;
    INTEGER(ans)[1] = 0;
  }
  else {
    INTEGER(ans)[0] = length(VECTOR_ELT(x, 0));
    INTEGER(ans)[1] = length(x);
  }
  UNPROTECT(1);
  return ans;
}
The data.table has length(x) equal to 1, so the else branch is taken and the row count comes from length(VECTOR_ELT(x, 0)).
Calling dim.data.frame on the data.table gives the correct output of (4, 1), as shown in the example below. I think this is because .row_names_info, which dim.data.frame uses, only asks for the length of the row names via .Internal(shortRowNames(x, type)) and never looks at the columns.
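The difference between the two approaches can be sketched in base R alone. The object below is a hand-built stand-in (not a real rvar and not a real data.table): a data.frame whose only column is an empty list, with row names recorded in the compact c(NA, -4L) form that appears in the dput output of this report. It is only meant to show that the row-names route and the first-column route can disagree.

```r
# Hypothetical stand-in: one column that is an empty list, but row names
# stored in compact form claiming 4 rows (as in the dput output).
x <- structure(
  list(ex = list()),
  row.names = c(NA, -4L),
  class = "data.frame"
)

# Row-names route, the one dim.data.frame relies on.
n_from_rownames <- .row_names_info(x, 2L)

# First-column route, analogous to length(VECTOR_ELT(x, 0)) in the C code.
# .subset2() extracts the column without any method dispatch.
n_from_first_col <- length(.subset2(x, 1L))

n_from_rownames   # 4
n_from_first_col  # 0
dim(x)            # 4 1
```

The row names say 4 while the underlying list has length 0, which matches the mismatch between dim.data.frame and dim.data.table reported here.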
Minimal reproducible example
library(data.table)
library(posterior)
n <- 4 # length of output vector
x_rvar <- rvar(array(rnorm(n*n, mean = 1, sd = 1), dim = c(n, n)))
x_rvar
# rvar<4000>[4] mean ± sd:
# [1] 0.98 ± 1.00 1.00 ± 1.02 1.00 ± 0.99 0.99 ± 0.99
# does not work :(
ex_dt = data.table(ex = x_rvar)
ex_dt
# Empty data.table (0 rows and 1 cols): ex
ex_df = data.frame(ex = x_rvar)
# dim for the data.table has the wrong number of rows
dim(ex_dt)
# [1] 0 1
dim(ex_df)
# [1] 4 1
dim.data.frame(ex_dt)
# [1] 4 1
# But `ex` does exist?
ex_dt[, ex]
# rvar<4>[4] mean ± sd:
# [1] 0.025 ± 0.58 0.759 ± 0.69 1.292 ± 1.14 1.601 ± 0.53
ex_dt[, mean(ex)]
# [1] 0.02508674 0.75898825 1.29194188 1.60076280
# printing with print.data.frame works
print.data.frame(ex_dt)
# ex
# 1 0.025 ± 0.58
# 2 0.759 ± 0.69
# 3 1.292 ± 1.14
# 4 1.601 ± 0.53
dput(ex_dt)
# structure(list(ex =
# structure(list(), draws = structure(
# c(-0.228122437024597, -0.665473091749799, 0.433053985537569, 0.56088850510324,
# 1.57671830971838, -0.104841641013198, 0.817921294855514, 0.746155054023318,
# 0.778298953156937, 2.20224060437554, -0.078692684532738, 2.26592066385111,
# 1.21605524439755, 1.59716805846671, 2.34552924058062, 1.24429863711927),
# dim = c(4L, 4L), dimnames = list(c("1", "2", "3", "4"), NULL)),
# nchains = 1L, cache = <environment>, class = c("rvar", "vctrs_vctr"))),
# row.names = c(NA, -4L), class = c("data.table", "data.frame"),
# .internal.selfref = <pointer: 0x55cc4ae60f60>)
rownames(ex_dt)
# [1] "1" "2" "3" "4"
Looking over the open and closed issues for data.table, I could not find anything similar. I don't know enough about R's internals to know why length(VECTOR_ELT(x, 0)) gives a value of zero, though I did find the source for VECTOR_ELT.
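One possible explanation, offered as a sketch rather than a confirmed diagnosis: the dput output shows the rvar stored as structure(list(), draws = ...), i.e. an empty list with the data in an attribute. The C-level length() does not dispatch to S3 methods, so it would see a list of length 0, while R-level length() goes through the class's own method. The class fake_rvar and its length method below are made up for illustration and are not posterior's implementation.

```r
# Hypothetical class mimicking the layout seen in dput(): an empty list
# whose actual data lives in the "draws" attribute.
fake_rvar <- structure(
  list(),
  draws = matrix(seq_len(16), nrow = 4, ncol = 4),
  class = "fake_rvar"
)

# Assumed S3 method: report one element per column of draws.
length.fake_rvar <- function(x) ncol(attr(x, "draws"))

# R-level length() dispatches on the class attribute.
dispatched_len <- length(fake_rvar)

# Dropping the class shows what code that ignores S3 dispatch would see.
underlying_len <- length(unclass(fake_rvar))

dispatched_len  # 4
underlying_len  # 0
```

If this is indeed what happens, any fix would need the row count to come from something other than the raw length of the first column, for example the row.names attribute.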
# Output of sessionInfo()
R version 4.3.2 (2023-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Rocky Linux 8.9 (Green Obsidian)
Matrix products: default
BLAS/LAPACK: FlexiBLAS /MNT/SW/NIX/STORE/LKQ7SR37F820WRZCLZ0VC9N6FG5ZB3GD-OPENBLAS-0.3.26/LIB/LIBOPENBLAS.SO; LAPACK version 3.11.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8
[8] LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
time zone: America/New_York
tzcode source: internal
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] posterior_1.5.0 data.table_1.15.2 cmdstanr_0.7.1 testthat_3.2.1
loaded via a namespace (and not attached):
[1] vctrs_0.6.5 cli_3.6.2 knitr_1.42 rlang_1.1.3 xfun_0.39 processx_3.8.3 pkgload_1.3.4 generics_0.1.3 tensorA_0.36.2.1
[10] jsonlite_1.8.8 glue_1.7.0 backports_1.4.1 rprojroot_2.0.3 distributional_0.4.0 ps_1.7.6 brio_1.1.3 fansi_1.0.6 tibble_3.2.1
[19] abind_1.4-5 lifecycle_1.0.4 compiler_4.3.2 dplyr_1.1.2 waldo_0.5.2 pkgconfig_2.0.3 rstudioapi_0.14 R6_2.5.1 tidyselect_1.2.0
[28] utf8_1.2.4 pillar_1.9.0 magrittr_2.0.3 checkmate_2.3.1 withr_3.0.0 tools_4.3.2 matrixStats_1.2.0 desc_1.4.2