Test int96 Parquet file from Spark #7367
apache/parquet-testing#73 merged, so I think this is ready for review.
alamb
left a comment
Thanks @mbutrovich -- this is a nice contribution. Testing for the WIN!
    })
}

#[test]
It might also be worth adding a test showing what happens when a schema is not supplied for this file (that the data is read out with nanosecond precision).
alamb
left a comment
Thanks again @mbutrovich
let expected = Arc::new(Int64Array::from(vec![
    Some(1704141296123456000), // Reads as nanosecond fine (note 3 extra 0s)
    Some(1704070800000000000), // Reads as nanosecond fine (note 3 extra 0s)
    Some(-4852191831933722624), // Cannot be represented with nanos timestamp (year 9999)
nice
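For readers wondering where a value like `-4852191831933722624` can come from, here is a rough sketch of decoding an INT96 timestamp by hand. It assumes the usual INT96 layout (8 little-endian bytes of nanoseconds within the day, followed by a 4-byte little-endian Julian day) and uses wrapping arithmetic for the nanosecond path. The helper names (`encode_int96`, `split_int96`, `to_micros`, `to_nanos_wrapping`) and the sample instant are hypothetical; none of this is taken from the PR or from the parquet crate.

```rust
// Julian day number of 1970-01-01, the Unix epoch.
const JULIAN_DAY_OF_EPOCH: i64 = 2_440_588;
const MICROS_PER_DAY: i64 = 86_400_000_000;
const NANOS_PER_DAY: i64 = 86_400_000_000_000;

/// Hypothetical helper: pack (julian_day, nanos_of_day) into 12 INT96 bytes.
fn encode_int96(julian_day: i32, nanos_of_day: i64) -> [u8; 12] {
    let mut raw = [0u8; 12];
    raw[0..8].copy_from_slice(&nanos_of_day.to_le_bytes());
    raw[8..12].copy_from_slice(&julian_day.to_le_bytes());
    raw
}

/// Hypothetical helper: unpack 12 INT96 bytes into (julian_day, nanos_of_day).
fn split_int96(raw: [u8; 12]) -> (i64, i64) {
    let mut lo = [0u8; 8];
    lo.copy_from_slice(&raw[0..8]);
    let mut hi = [0u8; 4];
    hi.copy_from_slice(&raw[8..12]);
    (i32::from_le_bytes(hi) as i64, i64::from_le_bytes(lo))
}

/// Microseconds since the Unix epoch; fits in i64 for any year up to 9999.
fn to_micros(raw: [u8; 12]) -> i64 {
    let (day, nanos) = split_int96(raw);
    (day - JULIAN_DAY_OF_EPOCH) * MICROS_PER_DAY + nanos / 1_000
}

/// Nanoseconds since the Unix epoch with wrapping arithmetic: far-future
/// dates silently wrap around instead of panicking.
fn to_nanos_wrapping(raw: [u8; 12]) -> i64 {
    let (day, nanos) = split_int96(raw);
    (day - JULIAN_DAY_OF_EPOCH)
        .wrapping_mul(NANOS_PER_DAY)
        .wrapping_add(nanos)
}

fn main() {
    // Assumed sample: Julian day 5_373_484 (9999-12-31) at 03:00:00.
    let raw = encode_int96(5_373_484, 10_800_000_000_000);
    println!("micros: {}", to_micros(raw));
    println!("nanos (wrapped): {}", to_nanos_wrapping(raw));
}
```

With this assumed input, the microsecond value is an ordinary positive number, while the nanosecond value wraps into a negative one, which is why the expected array in the test holds a negative entry for the year 9999 row.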
Which issue does this PR close?
Rationale for this change
We would like test coverage for a challenging INT96 file generated by Spark, which writes timestamps with microsecond precision. The file is challenging because it includes dates that cannot be represented as a nanosecond timestamp.
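To make the range problem concrete, here is a small sketch using only the Rust standard library. A signed 64-bit count of nanoseconds since the Unix epoch runs out around the year 2262, while the same width in microseconds comfortably covers the year 9999. The function names and the sample instant (`253_402_225_200` seconds, late in the year 9999) are illustrative assumptions, not code from this PR.

```rust
const MICROS_PER_SEC: i64 = 1_000_000;
const NANOS_PER_SEC: i64 = 1_000_000_000;

/// Seconds since the Unix epoch to microseconds, or None on i64 overflow.
fn secs_to_micros(secs: i64) -> Option<i64> {
    secs.checked_mul(MICROS_PER_SEC)
}

/// Seconds since the Unix epoch to nanoseconds, or None on i64 overflow.
fn secs_to_nanos(secs: i64) -> Option<i64> {
    secs.checked_mul(NANOS_PER_SEC)
}

fn main() {
    // 2024-01-01T01:00:00Z: representable at both precisions.
    let recent: i64 = 1_704_070_800;
    // Assumed sample instant late in the year 9999.
    let far_future: i64 = 253_402_225_200;

    println!("recent as nanos:     {:?}", secs_to_nanos(recent));
    println!("far future as micros: {:?}", secs_to_micros(far_future));
    println!("far future as nanos:  {:?}", secs_to_nanos(far_future));
    // Largest whole second that still fits when scaled to nanoseconds.
    println!("last safe second:     {}", i64::MAX / NANOS_PER_SEC);
}
```

Using `checked_mul` here surfaces the overflow as `None`; a reader that uses wrapping arithmetic instead would return a meaningless (possibly negative) value for the far-future row.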
What changes are included in this PR?
Adds a test that depends on apache/parquet-testing#73 being merged first.
Are there any user-facing changes?
No.