ARROW-13174: [C++][Compute] Add strftime kernel#10647
ARROW-13174: [C++][Compute] Add strftime kernel#10647rok wants to merge 10 commits intoapache:masterfrom
Conversation
2436df1 to
45acfcd
Compare
652d46d to
78d3ffc
Compare
|
@westonpace strftime kernel is almost ready for review. |
cfa23ee to
9e8b775
Compare
westonpace
left a comment
There was a problem hiding this comment.
Thanks for doing this. Looks great.
cpp/src/arrow/compute/api_scalar.cc
Outdated
There was a problem hiding this comment.
Should the default format string include the trailing %z so that it is ISO-8601 compliant?
There was a problem hiding this comment.
I don't think that %z adds a trailing z as in the ISO format. It adds a numeric offset (eg https://www.cplusplus.com/reference/ctime/strftime/):
In [44]: datetime.datetime(2012, 1, 1, tzinfo=datetime.timezone.utc).strftime("%Y-%m-%dT%H:%M:%S%z")
Out[44]: '2012-01-01T00:00:00+0000'
There was a problem hiding this comment.
According to wikipedia these are valid formats:
- 2021-07-15T11:24:54+00:00
- 2021-07-15T11:24:54Z
- 20210715T112454Z
%z will give us option 1. I don't have a strong preference and pandas doesn't seem to have a default.
There was a problem hiding this comment.
If it is significant effort I agree that +00:00 is good enough but I personally prefer Z if there is a quick option to enable it when the time zone is UTC.
There was a problem hiding this comment.
Switched to %Y-%m-%dT%H:%M:%SZ (2.) as default.
There was a problem hiding this comment.
Nit: How precise is this 30? Is it based on the default format string? If I look at 2021-07-12T23:55:41+00:00 I would think you only need ~25, maybe more if subsecond resolution. Could we base this on the resolution? It's not a big deal so if it seems like too much don't worry.
There was a problem hiding this comment.
How about a random date's string size times a fudge factor? :)
int expected_string_size = round(get_timestamp<Duration>(0, &options).size() * 1.1);
RETURN_NOT_OK(string_builder->Reserve(in.length * expected_string_size));
2033695 to
8ce4ab4
Compare
|
Thanks for the review @westonpace! One thing I'm not sure about is if we should be using |
|
Yes, you can. Make it return a |
@pitrou Done. Please review. |
There was a problem hiding this comment.
Please note this is only testing for C locale because that is all that is available in CI at the moment. Ideally we would cover several.
ff49e17 to
4b1d68a
Compare
|
Will merge if green. |
|
Nice! Thanks @westonpace @jorisvandenbossche @pitrou |
|
Sorry for the slow response here, but I think there are still a few behavioural aspects to fix/clarify:
|
|
Thanks for the review @jorisvandenbossche !
|
|
Thanks! |
|
Follow up PR #10998 |
This is to resolve ARROW-13174.