Skip to content

Comments

ARROW-13804: [Go] Add Interval type Month, Day, Nano#11310

Closed
zeroshade wants to merge 12 commits intoapache:masterfrom
zeroshade:arrow-13804-month-day-nano
Closed

ARROW-13804: [Go] Add Interval type Month, Day, Nano#11310
zeroshade wants to merge 12 commits intoapache:masterfrom
zeroshade:arrow-13804-month-day-nano

Conversation

@zeroshade
Copy link
Member

@zeroshade zeroshade commented Oct 4, 2021

#10177 Added the interval type Month, Day, Nano to the flatbuffer and C++/Java, this updates the flatbuffer generated files and adds support for the type to Go.

@github-actions
Copy link

github-actions bot commented Oct 4, 2021

@zeroshade zeroshade force-pushed the arrow-13804-month-day-nano branch from f3b9516 to a4d1547 Compare October 5, 2021 15:04
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does int32 error if months or day are out of range?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that int32 will just truncate if the value of months or days is larger than an int32 can hold.

The other areas in the integration testing pieces here didn't check for overflows i don't think so i didn't think it was necessary to do so, but I can add checks to error on over/underflow if that's needed. Thoughts?

@emkornfield
Copy link
Contributor

Thanks @zeroshade it looks like there might have been some large changes outside of flatbuf code unrelated to MonthDayNanoIntervals (it appears that the are formatting?) Would it pay to submit a PR doing the formatting and flatbuf update separately from the actual new logic?

@zeroshade
Copy link
Member Author

zeroshade commented Oct 8, 2021

@emkornfield So the formatting changes happened due to the length of the name MonthDayNanoInterval since I auto run go fmt on the files which will line things up. Since the name is so long, it ends up changing the formatting a bit in order to maintain the alignment of the code, that's all.

If you think that doing the flatbuf update separately from the actual new logic would be beneficial I can do that, mostly I didn't think anyone would be directly reviewing the generated flatbuf code so didn't consider it necessary to break it out as a separate change.

That said, the changes to the enum values to line up with the C++ type ids could be broken out as a separate change to be done first, before the flatbuf and MonthDayNanoInterval logic, which I can do.

@zeroshade
Copy link
Member Author

the failure in the Archery & Crossbow test is unrelated to my changes as far as I can tell.

@zeroshade
Copy link
Member Author

@emkornfield I broke the flatbuf update and the type id enum changes out to #11389 after that gets merged, i'll update this PR accordingly

asfgit pushed a commit that referenced this pull request Oct 12, 2021
Also updates the type ID enums accordingly, adding type ids for as of yet unsupported types to reduce the size of future changes.

Broken out from #11310 to shrink the size of the change to implement MonthDayNano intervals in Go

Closes #11389 from zeroshade/arrow-14296-flatbuf

Lead-authored-by: Matthew Topol <[email protected]>
Co-authored-by: Matt Topol <[email protected]>
Signed-off-by: Matthew Topol <[email protected]>
@zeroshade zeroshade force-pushed the arrow-13804-month-day-nano branch from 39046f2 to 64efd32 Compare October 12, 2021 16:05
@zeroshade

This comment has been minimized.

@zeroshade
Copy link
Member Author

I was able to narrow this down to being related to the application metadata. Unlike the C++ flight integration client, the Java one does not validate the metadata in the response to the DoGet, only in the DoPut, if I modify the Go flight integration server to not send the app metadata field with the DoGet responses, then it succeeds without issue. But when it does send the metadata, it fails claiming a memory leak of the exact number of bytes that were sent via app metadata (1 byte per message).

I don't know why it doesn't have this memory leak when consuming from the C++ flight integration server which appears to be sending the metadata in the same way unless i'm missing something. I'm not particularly well versed with java or the java implementation so any assistance at all in tracking down what's going on and whether it's a Go bug or a Java bug would be extremely helpful. Thanks much!

@zeroshade
Copy link
Member Author

I figured out the cause of the issue, when using WriteWithMetadata from the Go flight library, if the schema message hadn't yet been sent yet it includes the metadata in the schema message in addition to the first record batch message. In the Java, it allocates the memory for the application metadata for the Schema message, but never cleans it up causing the allocator to see it as a memory leak and error out. So adding a check on schema messages to close out the app metadata solves that issue, which I've done here in order to fix the issue with Java as a flight consumer.

@zeroshade zeroshade force-pushed the arrow-13804-month-day-nano branch from 3fcc1de to 4584db1 Compare October 13, 2021 16:37
@zeroshade
Copy link
Member Author

Success! Manually adding a step in the archery.yml to install pygit2 from a binary wheel fixes the missing header issue. Always nice to see all greens :)

@zeroshade
Copy link
Member Author

@lidavidm I was told that I should tag you to check out the small java change here. Could you please take a look and just make sure i'm not doing anything terrible / that should be done a better way? Thanks!

Copy link
Member

@lidavidm lidavidm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for fixing this in the Java client.

Co-authored-by: David Li <[email protected]>
@asfgit asfgit closed this in 12aaf2c Oct 13, 2021
@ursabot
Copy link

ursabot commented Oct 13, 2021

Benchmark runs are scheduled for baseline = e379ee1 and contender = 12aaf2c. 12aaf2c is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️25.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Finished ⬇️0.0% ⬆️0.0%] ursa-i9-9960x
[Finished ⬇️0.22% ⬆️0.0%] ursa-thinkcentre-m75q
Supported benchmarks:
ursa-i9-9960x: langs = Python, R, JavaScript
ursa-thinkcentre-m75q: langs = C++, Java
ec2-t3-xlarge-us-east-2: cloud = True

@zeroshade zeroshade deleted the arrow-13804-month-day-nano branch October 20, 2021 16:45
pribor pushed a commit to GlobalWebIndex/arrow that referenced this pull request Oct 24, 2025
apache#10177 Added the interval type Month, Day, Nano to the flatbuffer and C++/Java, this updates the flatbuffer generated files and adds support for the type to Go.

Closes apache#11310 from zeroshade/arrow-13804-month-day-nano

Authored-by: Matthew Topol <[email protected]>
Signed-off-by: Matthew Topol <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants