Description
Copying from a discussion chain with @MichaelChirico in #4144 :
In src/init.c the type order is laid out as follows:
static void setSizes() {
  for (int i=0; i<100; ++i) { sizes[i]=0; typeorder[i]=0; }
  // only these types are currently allowed as column types :
  sizes[LGLSXP] = sizeof(int); typeorder[LGLSXP] = 0;
  sizes[RAWSXP] = sizeof(Rbyte); typeorder[RAWSXP] = 1;
  sizes[INTSXP] = sizeof(int); typeorder[INTSXP] = 2; // integer and factor
  sizes[REALSXP] = sizeof(double); typeorder[REALSXP] = 3; // numeric and integer64
  sizes[CPLXSXP] = sizeof(Rcomplex); typeorder[CPLXSXP] = 4;
  sizes[STRSXP] = sizeof(SEXP *); typeorder[STRSXP] = 5;
  sizes[VECSXP] = sizeof(SEXP *); typeorder[VECSXP] = 6; // list column
  if (sizeof(char *)>8) error(_("Pointers are %d bytes, greater than 8. We have not tested on any architecture greater than 64bit yet."), sizeof(char *));
  // One place we need the largest sizeof is the working memory malloc in reorder.c
}
The relative order of LGLSXP and RAWSXP conflicts with R's coercion rules:
c(as.raw(0), TRUE)
# [1] FALSE TRUE
As you can see, the raw value is coerced to logical, not the other way round as the order in src/init.c implies.
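To make the consequence concrete, here is a minimal standalone sketch of how a rank table like typeorder could pick the common type when columns are combined. It assumes the combining code simply takes the highest-ranked type among its inputs, which I have not verified against the rbindlist source. The helpers set_typeorder() and common_type() are hypothetical names, and the SEXPTYPE codes are copied from R's Rinternals.h so the snippet compiles without the R headers.

```c
#include <stddef.h>

/* SEXPTYPE codes as defined in R's Rinternals.h, repeated here so the
   sketch builds without the R headers. */
enum { LGLSXP = 10, INTSXP = 13, REALSXP = 14, CPLXSXP = 15,
       STRSXP = 16, VECSXP = 19, RAWSXP = 24 };

int typeorder[100];

/* Hypothetical helper: fill the rank table. With raw_below_logical == 0 it
   reproduces the ranks currently in src/init.c; with 1 it swaps the ranks
   of LGLSXP and RAWSXP as proposed in this issue. */
void set_typeorder(int raw_below_logical) {
  for (int i = 0; i < 100; ++i) typeorder[i] = 0;
  typeorder[LGLSXP]  = raw_below_logical ? 1 : 0;
  typeorder[RAWSXP]  = raw_below_logical ? 0 : 1;
  typeorder[INTSXP]  = 2;
  typeorder[REALSXP] = 3;
  typeorder[CPLXSXP] = 4;
  typeorder[STRSXP]  = 5;
  typeorder[VECSXP]  = 6;
}

/* Hypothetical helper: return the input type with the highest rank.
   Assumes n >= 1 and that every entry is one of the codes above. */
int common_type(const int *types, size_t n) {
  int best = types[0];
  for (size_t i = 1; i < n; ++i)
    if (typeorder[types[i]] > typeorder[best]) best = types[i];
  return best;
}
```

Under that assumption, combining a raw and a logical column yields RAWSXP after set_typeorder(0) and LGLSXP after set_typeorder(1); only the latter agrees with what c(as.raw(0), TRUE) does in base R.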
I believe the coercion rules work this way because a logical vector may contain NA, whereas raw vectors cannot:
as.raw(NA)
# [1] 00
# Warning message:
# out-of-range values treated as 0 in coercion to raw
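The storage layout shows why raw has no room for NA. R stores a logical as an int and reserves INT_MIN as NA_LOGICAL, while Rbyte is an unsigned char whose 256 values are all ordinary data. The following sketch mimics the two conversions under those facts; the names NA_LOGICAL_SKETCH, Rbyte_sketch, raw_to_logical() and logical_to_raw() are made up for illustration and are not R's internal coercion routines.

```c
#include <limits.h>

/* R defines NA_LOGICAL as INT_MIN; mirrored here under a different name. */
#define NA_LOGICAL_SKETCH INT_MIN

/* R's Rbyte is a typedef for unsigned char. */
typedef unsigned char Rbyte_sketch;

/* raw -> logical: every byte maps to a defined logical value
   (zero is FALSE, anything else is TRUE). */
int raw_to_logical(Rbyte_sketch x) {
  return x != 0;
}

/* logical -> raw: NA has no counterpart, so it collapses to 0,
   matching the as.raw(NA) result of 00 shown above. */
Rbyte_sketch logical_to_raw(int x) {
  if (x == NA_LOGICAL_SKETCH) return 0;
  return (Rbyte_sketch)(x != 0);
}
```

After the logical-to-raw conversion, NA and FALSE are indistinguishable, which is the information loss that coercing in the other direction avoids.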
Changing the order of LGLSXP and RAWSXP in typeorder has minimal impact, breaking only the following tests: 2006.1, 2006.2, and 2129.
These tests all relate to the rbind/rbindlist functionality. Tests 2006.1 and 2006.2 are a simple fix, because they expect the wrong output: once the typeorder is changed, rbindlist(..., fill = TRUE) fills missing raw columns with NA, in line with the behaviour for all other column types, rather than with 00 as in the expected test output. It's not clear to me why 2129 fails, as the input data.tables do not contain raw columns, but the test is for rbind(), so this would take a little debugging.