Description
As seen on CRAN and reproduced locally with R-devel 2022-10-06 r83040.
Running test id 1741.3
Test 1741.3 ran without errors but failed check that x equals y:
> x = x1 <- capture.output(fwrite(DT, dateTimeAs = "write.csv"))
First 6 of 11 (type 'character'):
[1] "A,B,C,D,E"
[2] "1907-10-21,1907-10-21,23:59:59,1907-10-21 16:59:59,1907-10-21 16:59:59"
[3] "1907-10-22,1907-10-22,00:00:00,1907-10-21 17:00:00,1907-10-21 17:00:00"
[4] "1907-10-22,1907-10-22,00:00:01,1907-10-21 17:00:01,1907-10-21 17:00:01"
[5] "1969-12-31,1969-12-31,23:59:58,1969-12-31 16:59:58,1969-12-31 16:59:58"
[6] "1970-01-01,1970-01-01,00:00:00,1969-12-31 17:00:00,1969-12-31 17:00:00"
> y = capture.output(write.csv(DT, row.names = FALSE, quote = FALSE))
First 6 of 11 (type 'character'):
[1] "A,B,C,D,E"
[2] "1907-10-21,1907-10-21,23:59:59,1907-10-21 16:59:59,1907-10-21 16:59:59.999"
[3] "1907-10-22,1907-10-22,00:00:00,1907-10-21 17:00:00,1907-10-21 17:00:00"
[4] "1907-10-22,1907-10-22,00:00:01,1907-10-21 17:00:01,1907-10-21 17:00:01.5"
[5] "1969-12-31,1969-12-31,23:59:58,1969-12-31 16:59:58,1969-12-31 16:59:58.111112"
[6] "1970-01-01,1970-01-01,00:00:00,1969-12-31 17:00:00,1969-12-31 17:00:00.123456"
8 string mismatches
Running test id 1741.4
Test 1741.4 ran without errors but failed check that x equals y:
> x = x2 <- capture.output(fwrite(DT, dateTimeAs = "write.csv"))
First 6 of 11 (type 'character'):
[1] "A,B,C,D,E"
[2] "1907-10-21,1907-10-21,23:59:59,1907-10-21 16:59:59,1907-10-21 16:59:59.999"
[3] "1907-10-22,1907-10-22,00:00:00,1907-10-21 17:00:00,1907-10-21 17:00:00.000"
[4] "1907-10-22,1907-10-22,00:00:01,1907-10-21 17:00:01,1907-10-21 17:00:01.500"
[5] "1969-12-31,1969-12-31,23:59:58,1969-12-31 16:59:58,1969-12-31 16:59:58.111"
[6] "1970-01-01,1970-01-01,00:00:00,1969-12-31 17:00:00,1969-12-31 17:00:00.123"
> y = capture.output(write.csv(DT, row.names = FALSE, quote = FALSE))
First 6 of 11 (type 'character'):
[1] "A,B,C,D,E"
[2] "1907-10-21,1907-10-21,23:59:59,1907-10-21 16:59:59,1907-10-21 16:59:59.999"
[3] "1907-10-22,1907-10-22,00:00:00,1907-10-21 17:00:00,1907-10-21 17:00:00"
[4] "1907-10-22,1907-10-22,00:00:01,1907-10-21 17:00:01,1907-10-21 17:00:01.5"
[5] "1969-12-31,1969-12-31,23:59:58,1969-12-31 16:59:58,1969-12-31 16:59:58.111112"
[6] "1970-01-01,1970-01-01,00:00:00,1969-12-31 17:00:00,1969-12-31 17:00:00.123456"
9 string mismatches
Running test id 1741.5
Test 1741.5 ran without errors but failed check that x equals y:
> x = x3 <- capture.output(fwrite(DT, dateTimeAs = "write.csv"))
First 6 of 11 (type 'character'):
[1] "A,B,C,D,E"
[2] "1907-10-21,1907-10-21,23:59:59,1907-10-21 16:59:59,1907-10-21 16:59:59.999000"
[3] "1907-10-22,1907-10-22,00:00:00,1907-10-21 17:00:00,1907-10-21 17:00:00.000000"
[4] "1907-10-22,1907-10-22,00:00:01,1907-10-21 17:00:01,1907-10-21 17:00:01.500000"
[5] "1969-12-31,1969-12-31,23:59:58,1969-12-31 16:59:58,1969-12-31 16:59:58.111112"
[6] "1970-01-01,1970-01-01,00:00:00,1969-12-31 17:00:00,1969-12-31 17:00:00.123456"
> y = capture.output(write.csv(DT, row.names = FALSE, quote = FALSE))
First 6 of 11 (type 'character'):
[1] "A,B,C,D,E"
[2] "1907-10-21,1907-10-21,23:59:59,1907-10-21 16:59:59,1907-10-21 16:59:59.999"
[3] "1907-10-22,1907-10-22,00:00:00,1907-10-21 17:00:00,1907-10-21 17:00:00"
[4] "1907-10-22,1907-10-22,00:00:01,1907-10-21 17:00:01,1907-10-21 17:00:01.5"
[5] "1969-12-31,1969-12-31,23:59:58,1969-12-31 16:59:58,1969-12-31 16:59:58.111112"
[6] "1970-01-01,1970-01-01,00:00:00,1969-12-31 17:00:00,1969-12-31 17:00:00.123456"
7 string mismatches
Error: 3 error(s) out of 10935. Search tests/tests.Rraw for test number(s) 1741.3, 1741.4, 1741.5
Thanks Martin. Looking.
Firstly an aside really. I noticed that ?options states the default
for digits.secs is 0 :
https://github.com/wch/r-source/blob/trunk/src/library/base/man/options.Rd#L144
However, in both R-release and R-devel the digits.secs option does not
exist by default, so its default value appears to be NULL.
$ R --vanilla
R version 4.2.1 (2022-06-23) -- "Funny-Looking Kid"
> getOption("digits.secs")
NULL
> grep("dig", names(options()), value=TRUE)
[1] "digits"
$ Rdevel --vanilla
R Under development (unstable) (2022-10-06 r83040) -- "Unsuffered Consequences"
> getOption("digits.secs")
NULL
> grep("dig", names(options()), value=TRUE)
[1] "digits"
In the data.table tests that are failing on R-devel, we use write.csv.
Its behavior seems to have changed in that it no longer respects
getOption("digits.secs"). Is that intentional?
$ R --vanilla
R version 4.2.1 (2022-06-23) -- "Funny-Looking Kid"
> DF = data.frame(A=as.POSIXct("2016-09-12 01:23:45.67"))
> write.csv(DF)
"","A"
"1",2016-09-12 01:23:45
> options(digits.secs=1)
> write.csv(DF)
"","A"
"1",2016-09-12 01:23:45.6 # last 6 (1 digit) appears as expected
$ Rdevel --vanilla
R Under development (unstable) (2022-10-06 r83040) -- "Unsuffered Consequences"
> DF = data.frame(A=as.POSIXct("2016-09-12 01:23:45.67"))
> write.csv(DF)
"","A"
"1",2016-09-12 01:23:45.67 # different to R-release
> options(digits.secs=1)
> write.csv(DF)
"","A"
"1",2016-09-12 01:23:45.67 # different to R-release
My reading of ?options w.r.t. digits.secs is that it influences
write.csv and I'm thinking it's reasonable to do so, if I understand
correctly; i.e. that the R-release behavior is reasonable and
consistent with ?options.
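For reference, a minimal sketch of how the digits.secs option interacts with format() on a POSIXct value. It assumes only base R; the variable names are illustrative. A bare %OS consults getOption("digits.secs"), while %OS followed by a digit (here %OS1) fixes the number of decimals explicitly, so its result should not change when the option changes.

```r
# One timestamp with fractional seconds, in a fixed time zone
x <- as.POSIXct("2016-09-12 01:23:45.67", tz = "UTC")

# Remember the current setting so it can be restored afterwards
old <- options(digits.secs = NULL)

# Bare %OS: the number of decimals follows getOption("digits.secs")
os_unset <- format(x, "%Y-%m-%d %H:%M:%OS")
# %OS1: one decimal place requested explicitly
a <- format(x, "%Y-%m-%d %H:%M:%OS1")

options(digits.secs = 3)
os_three <- format(x, "%Y-%m-%d %H:%M:%OS")
b <- format(x, "%Y-%m-%d %H:%M:%OS1")

# Restore the original option value
options(old)

identical(a, b)                # explicit digits: independent of the option
nchar(os_unset) < nchar(os_three)  # bare %OS: output grows with digits.secs
```

Spelling out the digits in the format string is one way to make test expectations independent of both the option and the R version.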
On Thu, Sep 29, 2022 at 3:32 AM Martin Maechler wrote:
Dear Matt and André,
yesterday I sent the following e-mail to the maintainers of 38 CRAN packages;
we only found out later that your two packages were
affected as well:
André Smaniotto <[email protected]>: RColetum
Matt Dowle <[email protected]>: data.table
The gist is that you should replace
as.character(dt, ..)
by format(dt, ..)
for datetime (and date) objects (and not forget in the future
what the semantic difference is and should be).
Dear CRAN package maintainers,
This concerns the CRAN packages
activAnalyzer adobeanalyticsr aoristic apcf arctools attrib bigrquery bitmexr cmsafops cry
CSCNet dendRoAnalyst DEPONS2R digiRhythm Ecfun GGIR GGIRread iNZightPlots JFE jsonlite
lubridate mark nprcgenekeepr padr PAMmisc plotly pointblank reactable readwritesqlite
rebus.datetimes rotor tidyquant timeDate timeSeries timetk track2KBA TrafficBDE trajectories
maintained by one of you:
[.....................]
Some of you have been notified already by us, and others have
already updated their package development version.
Your packages mostly newly show an ERROR (others a WARNING)
with the development version of R,
called "R-devel", since svn revision r82904 (2022-09-24 19:32:52)
as there, I've defined an as.character() method for "POSIXt"
which is quite different than the previous one which basically
simply called format(.).
So (in R-devel), now
tt <- Sys.time()
dput(tt)
structure(1664372962.27539, class = c("POSIXct", "POSIXt"))
format(tt)
[1] "2022-09-28 15:49:22"
as.character(tt)
[1] "2022-09-28 15:49:22.275388717651367"
(and yes, one could ask if it should not show fewer digits after the
decimal, where I'd answer "probably yes", and I think that (number of
digits) will probably decrease before R-devel is released.)
As mentioned, the previous behavior was that for POSIX?t,
as.character() was simply using format(). But that is
"principally wrong" as format() searches for a common format
for all elements (in case of vector with length > 1) whereas
as.character() really must transform to character, i.e., to a
string, each element independently, where I use "must" from
as.character()'s description and its behavior for all atomic
objects.
as.character() being based on format() has another way to be
"principally wrong": format() is typically influenced by
options("digits") or similar options, in our case by
getOption("digits.secs").
So, the plan is to also change as.character.Date()
to no longer use format(), at least not the way it currently does.
A sensible as.character() method implementation should fulfill a few
properties:
- typeof(as.character(x)) == "character"
- length(as.character(x)) == length(x)
- as.character(x[i]) == as.character(x)[i] {for a 'valid' index i }
- is.null(attributes(as.character(x)))
- as.character() {and all other basic as.() methods}
should not depend on any options().
In some sense, '2)' is really important and was not fulfilled for
the POSIX[cl]t classes
(and I think also not for the "Date" class, but less clearly).
Number 3) is also not fulfilled, notably for the "hexmode" and
"octmode" methods in base R and the intent is to change that as well.
Now, as historically as.character(dt) was practically
synonymous to format(dt) for "date time" objects, all of you
in your package code or tests have been using
as.character(<..>) for POSIXct or POSIXlt objects, assuming
you'd always get a well formatted character vector, well
formatted in the sense of format(), defining a common format
for all entries of the (potentially quite long) vector and
also rounding "away" all fractional seconds.
For numbers, "all" R users know that as.character() behaves with
the properties above, and also that R will basically keep full
accuracy of x in as.character(x),
quite differently to print()ing and formatting() where rounding
will happen and often depends on some options() settings.
The difference of as.character() and format() is implemented in
R-devel now, see the simple Sys.time() example above, or e.g.,
(s1 <- seq(as.POSIXlt("2022-03-22"), as.POSIXlt("2022-04-01"), by = "12 hours"))
[1] "2022-03-22 00:00:00 CET" "2022-03-22 12:00:00 CET" "2022-03-23 00:00:00 CET"
[4] "2022-03-23 12:00:00 CET" "2022-03-24 00:00:00 CET" "2022-03-24 12:00:00 CET"
[7] "2022-03-25 00:00:00 CET" "2022-03-25 12:00:00 CET" "2022-03-26 00:00:00 CET"
[10] "2022-03-26 12:00:00 CET" "2022-03-27 00:00:00 CET" "2022-03-27 13:00:00 CEST"
[13] "2022-03-28 01:00:00 CEST" "2022-03-28 13:00:00 CEST" "2022-03-29 01:00:00 CEST"
[16] "2022-03-29 13:00:00 CEST" "2022-03-30 01:00:00 CEST" "2022-03-30 13:00:00 CEST"
[19] "2022-03-31 01:00:00 CEST" "2022-03-31 13:00:00 CEST"
format(s1)
[1] "2022-03-22 00:00:00" "2022-03-22 12:00:00" "2022-03-23 00:00:00" "2022-03-23 12:00:00"
[5] "2022-03-24 00:00:00" "2022-03-24 12:00:00" "2022-03-25 00:00:00" "2022-03-25 12:00:00"
[9] "2022-03-26 00:00:00" "2022-03-26 12:00:00" "2022-03-27 00:00:00" "2022-03-27 13:00:00"
[13] "2022-03-28 01:00:00" "2022-03-28 13:00:00" "2022-03-29 01:00:00" "2022-03-29 13:00:00"
[17] "2022-03-30 01:00:00" "2022-03-30 13:00:00" "2022-03-31 01:00:00" "2022-03-31 13:00:00"
as.character(s1)
[1] "2022-03-22" "2022-03-22 12:00:00" "2022-03-23" "2022-03-23 12:00:00"
[5] "2022-03-24" "2022-03-24 12:00:00" "2022-03-25" "2022-03-25 12:00:00"
[9] "2022-03-26" "2022-03-26 12:00:00" "2022-03-27" "2022-03-27 13:00:00"
[13] "2022-03-28 01:00:00" "2022-03-28 13:00:00" "2022-03-29 01:00:00" "2022-03-29 13:00:00"
[17] "2022-03-30 01:00:00" "2022-03-30 13:00:00" "2022-03-31 01:00:00" "2022-03-31 13:00:00"
for(i in seq_along(s1)) identical(as.character(s1[i]), as.character(s1)[i])
format(s1[1])
[1] "2022-03-22"
format(s1[1:2])
[1] "2022-03-22 00:00:00" "2022-03-22 12:00:00"
I am happy to answer questions or discuss the issues, etc.
Best regards,
Martin
--
Martin Maechler
ETH Zurich and R Core team
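The as.character() properties listed in the email above can be checked mechanically. The following is only a sketch: check_as_character() is a hypothetical helper written for illustration, not a function from base R or data.table, and it covers the first four properties (independence from options() is not tested here).

```r
# Hypothetical helper: checks four of the properties listed in the email
# for a single object x and returns a named logical vector.
check_as_character <- function(x) {
  ch <- as.character(x)
  # Convert each element on its own and compare with the vectorised result
  elementwise <- vapply(
    seq_along(x),
    function(i) identical(as.character(x[i]), ch[i]),
    logical(1)
  )
  c(
    is_character  = identical(typeof(ch), "character"),
    same_length   = length(ch) == length(x),
    elementwise   = all(elementwise),
    no_attributes = is.null(attributes(ch))
  )
}

# Plain numbers: all four checks are expected to hold
num_res <- check_as_character(c(1, 2.5, 1e6))
num_res

# Date-times: the 'elementwise' check is the one the email is about,
# so its result depends on the R version in use
dt <- as.POSIXct(c("2022-03-22 00:00:00", "2022-03-22 12:00:00"), tz = "UTC")
dt_res <- check_as_character(dt)
dt_res
```

Running the date-time check under both R-release and R-devel is a quick way to see which as.character() behavior a given installation has.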