GH-37712: [Go][Parquet] Fix ARM64 assembly for bitmap extract bits#37785

Merged
zeroshade merged 16 commits into apache:main from zeroshade:arm64-assembly-funtimes
Oct 5, 2023

Conversation

@zeroshade
Member

@zeroshade zeroshade commented Sep 18, 2023

Rationale for this change

The linked issue discovered a problem with the ARM64 implementation of the extractBits functionality, which had been hidden because, until go1.21, golang.org/x/sys/cpu wasn't properly detecting the ASIMD feature flag on ARM processors, so the pure Go implementation was being used instead. So we fix the assembly and add a test which was able to reproduce the failure, without the fix, on an ARM64 machine.

What changes are included in this PR?

There was a problem with the assembly as it existed: the short-circuit case, where the selection bitmap is 0, wasn't actually setting the 0 value back onto the stack as a return. The assembly which initialized that register to 0 only occurred AFTER the CBZ instruction which would jump to the bottom and return. This means that we actually returned whatever happened to be on the stack at that location (i.e. garbage!). This only occurred if you were using the ARM64-specific assembly implementation, not the amd64 or pure Go versions.

We also include a test to ensure we get the correct values for a variety of bitmaps and selections, while ensuring there's enough data on the stack so that if this problem re-occurred we would catch it in this test.

Are these changes tested?

Test is added.
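For context, extractBits implements PEXT ("parallel bit extract") semantics: the bits of the bitmap at the positions where the selection has a 1 are packed into the low bits of the result. A minimal pure-Go reference (hypothetical name; the real implementation lives in the internal bmi package) that also shows the selection == 0 short-circuit the assembly got wrong:

```go
package main

import "fmt"

// extractBits packs the bits of bitmap at the positions where
// selection has a 1 into the low bits of the result (PEXT semantics).
// Note the selection == 0 case must return 0 -- the broken arm64
// assembly branched to the return without ever storing that 0.
func extractBits(bitmap, selection uint64) uint64 {
	var result uint64
	var outPos uint
	for selection != 0 {
		lowest := selection & (-selection) // isolate lowest set bit
		if bitmap&lowest != 0 {
			result |= 1 << outPos
		}
		outPos++
		selection &= selection - 1 // clear lowest set bit
	}
	return result
}

func main() {
	fmt.Printf("%b\n", extractBits(0b1010, 0b1110)) // bits 1,2,3 of bitmap, packed
	fmt.Println(extractBits(0xFFFF, 0))             // short-circuit case: 0
}
```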

@github-actions

⚠️ GitHub issue #37712 has been automatically assigned in GitHub to PR creator.

@zeroshade
Member Author

Ugh, looks like ALL the NEON assembly is borked; it just wasn't ACTUALLY being run because golang.org/x/sys/cpu wasn't registering the ASIMD bit, but github.com/klauspost/cpuid/v2 DOES correctly recognize it, and now we're seeing it die.

I guess I'll take a stab at fixing and debugging that tomorrow.

@zeroshade
Member Author

Interestingly, it's only failing on linux/arm64, not macOS ARM64... which is, well, odd. Sigh. Why can't things be easy? I'm guessing there's probably a feature flag that I should be checking for that I'm not.

Member

Hmm, so if an invalid value is passed in ARM_ENABLE_EXT, all extensions are silently disabled? It would be more user-friendly to print a warning.

Member Author

technically we aren't actually disabling the extensions on the CPU, we're just telling the library that anything that wants to check for those extensions should consider them disabled and act accordingly. For example, if an invalid value is passed, we behave as if those extensions weren't enabled regardless and default to the pure Go implementation.

That said, I agree it would be more user-friendly to have an explicit "DISABLE" value, and then just print a warning for invalid values. I'll make that change.

Member

technically we aren't actually disabling the extensions on the CPU

Yes, I figured that out :-)

Member Author

Updated so that if the value of the env var is "DISABLE" then we disable things; otherwise we just print any unrecognized options to stderr.

Member

Hmm, how did you come up with this fix? I thought this assembly code wasn't written by hand?

Member Author

The assembly wasn't generated by hand, but this fix was absolutely written by hand (me) by way of looking at the code, the generated assembly, and stepping through the instructions to figure out what was happening incorrectly.

In this case, the problem was that the instructions which zero out the return value got placed after the loop condition (checking if the selection bitmap was zero), so in that scenario we ended up returning whatever happened to be in the next 8 bytes after the memory location where the arguments were placed. I had to look up a bunch of stuff about how Go's Plan9 assembly works, and about ARM64 assembly, to work out that zeroing out the return slot could be done by simply storing from ZR, the zero register.
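A hedged sketch of the shape of the fix in Go's Plan9 arm64 syntax — offsets, register choices, and labels here are illustrative only, not the actual diff. The key point is that the store of zero to the return slot must happen before the CBZ branch, and storing from the zero register ZR avoids needing a scratch register:

```asm
MOVD  bitmap+0(FP), R0     // load arguments from the Go stack frame
MOVD  selection+8(FP), R1
MOVD  ZR, ret+16(FP)       // zero the return slot FIRST, from the zero register
CBZ   R1, done             // selection == 0: return the 0 we just stored
// ... main extraction loop, storing the real result to ret+16(FP) ...
done:
RET
```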

Member

Do you think it's a C/C++ compiler bug? That sounds a bit unlikely to me, given how simple the extract_bits function is.

Also, why are you compiling this code for Neon? I find it unlikely that Neon can be used to speed up your SW emulation of PEXT.

Also, we have a SW emulation in Parquet C++ that processes 5 bits at a time rather than one. You could perhaps try to compile that one (or rewrite it in Go) :-)

Member Author

Without knowing how Yuqi originally generated the arm64 assembly, I'm not sure what the cause of this was. All I know is that this was never correct, but it was hidden by the fact that we were unaware of a bug in the golang.org/x/sys/cpu package, which was failing to detect ARM features correctly, so the ARM builds were just using the pure Go implementation.

Also, why are you compiling this code for Neon? I find it unlikely that Neon can be used to speed up your SW emulation of PEXT.

From what I can tell, ARM doesn't have a PEXT instruction. Again, this C impl was provided by a contributor, but doing a bit of research, it looks like the implementation there is essentially the standard way to implement PEXT for ARM in C, per some Stack Overflow searching. (There are also a couple of variations I came across, but they roughly even out.)

Also, we have a SW emulation in Parquet C++ that processes 5 bits at a time rather than one. You could perhaps try to compile that one (or rewrite it in Go) :-)

The pure go impl already does it! If you look at bmi.go, it's a direct port of the software emulation in level_conversion_inc.h from the C++ Parquet impl. We only go to the assembly / non-pure-Go impl if our CPU check says the processor supports BMI2, so that we can call pext_u64 directly at runtime.
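As a rough illustration of that chunked software-emulation idea — hypothetical names, not the actual bmi.go port — one can process the selection a byte at a time and use popcount to advance the packed output position:

```go
package main

import (
	"fmt"
	"math/bits"
)

// pext8 is a plain bit loop over a single byte.
func pext8(b, s uint8) uint8 {
	var r uint8
	var pos uint
	for s != 0 {
		lowest := s & (-s) // isolate lowest set bit of the selection
		if b&lowest != 0 {
			r |= 1 << pos
		}
		pos++
		s &= s - 1 // clear lowest set bit
	}
	return r
}

// pextBytewise applies pext8 per byte; the popcount of each selection
// byte tells us how far to advance the packed output position.
func pextBytewise(bitmap, selection uint64) uint64 {
	var result uint64
	var outPos uint
	for i := 0; i < 8; i++ {
		b := uint8(bitmap >> (8 * i))
		s := uint8(selection >> (8 * i))
		result |= uint64(pext8(b, s)) << outPos
		outPos += uint(bits.OnesCount8(s))
	}
	return result
}

func main() {
	fmt.Printf("%b\n", pextBytewise(0b1010, 0b1110))
}
```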

Compiling for NEON is to get the benefit of vectorization for the levels_to_bitmap function (i.e. the GreaterThanBitmap SIMD method).
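For reference, the scalar shape of that levels-to-bitmap step, as a hedged sketch (hypothetical signature, not the SIMD kernel): compare each definition level against a threshold and set the matching bit.

```go
package main

import "fmt"

// greaterThanBitmap sets bit i when levels[i] > rhs -- the scalar
// equivalent of the NEON-vectorized GreaterThanBitmap kernel.
// Assumes len(levels) <= 64 so each element maps to one bit.
func greaterThanBitmap(levels []int16, rhs int16) uint64 {
	var mask uint64
	for i, lvl := range levels {
		if lvl > rhs {
			mask |= 1 << uint(i)
		}
	}
	return mask
}

func main() {
	fmt.Printf("%b\n", greaterThanBitmap([]int16{0, 1, 2, 3}, 1))
}
```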

Member

Without knowing how Yuqi originally generated the arm64 assembly, I'm not sure what the cause of this was.

There should be a well-defined way to generate the assembly, so that other people can later participate in maintenance.
(also, is it possible that the original arm64 assembly was simply out of sync with the current C source?)

The pure go impl already does it!

Nice, so you just need to port it to C now :-)

Contributor

The assembly wasn't generated by hand, but this fix was absolutely written by hand (me) by way of looking at the code, the generated assembly, and stepping through the instructions to figure out what was happening incorrectly.

This is cool!

Might be a bug from asm2plan9s, or errors from manually converting/formatting normal ARM assembly to Plan9. I'm not sure of the best approach to convert normal C/assembly to Plan9 assembly; asm2plan9s and c2goasm are archived, so it looks like we should avoid them.

For this PEXT-like function, the Go version (lookup table) should perform better than this one. Maybe we can just delete this assembly implementation?

Member Author

@zeroshade zeroshade Sep 21, 2023

@cyb70289 I'd be fine with removing the ARM assembly implementation of the pext function. I haven't benchmarked the lookup table vs the ARM assembly version yet, but I agree the lookup table would likely be faster.

The other issue I'm having is the bit-packing side of things, though I think I may have figured out a solution to it... and I've learned more about ARM64 assembly in the last few days than I ever intended on learning LOL, as I've needed to do a lot of hand-coding assembly to get this to work. Looks like the issue boils down to needing anything with a label to be explicit in the assembly, rather than using the WORD/BYTE syntax to pass raw instructions to the processor stream, so that the labels get translated to the correct PC locations.

There should be a well-defined way to generate the assembly, so that other people can later participate in maintenance.

I agree! I was a little disappointed that @guyuqi didn't add explicit steps for how he generated the arm64 assembly, but we wanted to benefit from it, so accepting that PR so long ago was my mistake. Currently I'm trying to explicitly define (and automate as much as possible) the generation, based on what I figure out works and solves this issue.

Contributor

Wrote a benchmark; the golang version is indeed faster than the assembly. Tested on Neoverse-N1.

$ go test -bench=.
goos: linux
goarch: arm64
pkg: bmi
BenchmarkGo-80               663           1759920 ns/op
BenchmarkAsm-80              435           2720235 ns/op
PASS
ok      bmi     2.838s
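The benchmark source isn't shown above; a minimal sketch of how such a comparison can be driven from the standard library's testing package (the function here is a stand-in for the lookup-table implementation, not the actual bmi code):

```go
package main

import (
	"fmt"
	"testing"
)

// pureGoExtract is a stand-in for the pure-Go PEXT emulation being
// benchmarked against the (since-removed) assembly version.
func pureGoExtract(bitmap, selection uint64) uint64 {
	var result uint64
	var outPos uint
	for selection != 0 {
		lowest := selection & (-selection)
		if bitmap&lowest != 0 {
			result |= 1 << outPos
		}
		outPos++
		selection &= selection - 1
	}
	return result
}

func main() {
	// testing.Benchmark lets us run a benchmark outside `go test`.
	res := testing.Benchmark(func(b *testing.B) {
		var sink uint64
		for i := 0; i < b.N; i++ {
			sink += pureGoExtract(uint64(i)*0x9E3779B97F4A7C15, 0x00FF00FF00FF00FF)
		}
		_ = sink // keep the result live so the loop isn't optimized away
	})
	fmt.Println(res)
}
```

In the real comparison one would add a second benchmark calling the assembly entry point and compare the ns/op figures, as in the output above.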

Member Author

Well, that settles that :) I'll update this and remove the asm version for ARM 😄 Thanks!

Member Author

Removed the ARM asm version so that it just uses the pure Go lookup-table impl.

@github-actions github-actions bot added the awaiting changes and awaiting change review labels and removed the awaiting committer review, awaiting changes, and awaiting change review labels Sep 20, 2023
@zeroshade
Member Author

@cyb70289 @pitrou holy crap, I got it to work! 😁 Now I just gotta document what I did before I forget, so I never have to do this again lol

@github-actions github-actions bot added the awaiting changes and awaiting change review labels and removed the awaiting change review and awaiting changes labels Sep 22, 2023
@zeroshade
Member Author

@cyb70289 @pitrou this is ready for another look-over and review now! Thanks. Given the complexity, I'm going to punt creating actual automation for the ARM assembly generation to a follow-up PR instead.

Contributor

cpu.ARM64.HasASIMD is false on some arm64 platforms?
It does work correctly on my test machine (Ubuntu 22.05, aarch64, golang 1.21).

Member Author

From what I could tell in my testing, it appears that on golang 1.20 and below it was ending up false on arm64 platforms when it shouldn't. Try testing it on golang 1.20 instead of 1.21 and let me know, because that's what I was observing in my tests (and it explains why we never caught this issue before now).

Contributor

This is strange. I tested earlier golang versions, and also tried x/sys/cpu v0.5.0 (used by Arrow) and the latest v0.12.0. All tests are okay.
Curious what the reason might be. If possible, can you share info about the test env? E.g., whether it's arm64 bare metal or a virtual machine, the Linux kernel version, and the output of "lscpu". Thanks.

Member Author

@zeroshade zeroshade Sep 28, 2023

It was a virtual machine launched via EC2 on AWS as a Debian 12 arm64 machine. That said, the original issue that launched this investigation was on a Mac M1, which appeared to show the same situation: using go1.20, x/sys/cpu returned false for HasASIMD, while go1.21 returned true, hit this issue, and failed horribly.

I'll launch a new one when I get a chance and paste the output of lscpu here later on.

Member Author

@cyb70289 Here is the output of lscpu on the AWS EC2 instance that I've been using for testing:

admin@ip-172-31-78-97:~/arrow/go/parquet/internal/utils$ lscpu
Architecture:           aarch64
  CPU op-mode(s):       32-bit, 64-bit
  Byte Order:           Little Endian
CPU(s):                 1
  On-line CPU(s) list:  0
Vendor ID:              ARM
  Model name:           Neoverse-N1
    Model:              1
    Thread(s) per core: 1
    Core(s) per socket: 1
    Socket(s):          1
    Stepping:           r3p1
    BogoMIPS:           243.75
    Flags:              fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp ssbs
Caches (sum of all):
  L1d:                  64 KiB (1 instance)
  L1i:                  64 KiB (1 instance)
  L2:                   1 MiB (1 instance)
  L3:                   32 MiB (1 instance)
NUMA:
  NUMA node(s):         1
  NUMA node0 CPU(s):    0
Vulnerabilities:
  Gather data sampling: Not affected
  Itlb multihit:        Not affected
  L1tf:                 Not affected
  Mds:                  Not affected
  Meltdown:             Not affected
  Mmio stale data:      Not affected
  Retbleed:             Not affected
  Spec rstack overflow: Not affected
  Spec store bypass:    Mitigation; Speculative Store Bypass disabled via prctl
  Spectre v1:           Mitigation; __user pointer sanitization
  Spectre v2:           Mitigation; CSV2, BHB
  Srbds:                Not affected
  Tsx async abort:      Not affected

Comment on lines 27 to 30
Member

I'm curious, what is this and what is it good for? Doesn't Go provide memcpy/memset replacements?

Member Author

This exists purely to test the assembly versions I provided.

The issue is that the C implementation I'm using to generate the assembly makes calls to memset and memcpy, but C's calling conventions are different from Go's. So where the assembly calls memset/memcpy, I replaced those CALL instructions to call these replacement versions (because I can't call out to libc from inside Go like that; there's no guarantee that the lib is loaded). They are essentially just ported versions of C's memcpy/memset, ensuring that the calling conventions and argument handling are the same.

The underscore prefix there ensures that the methods are not exported from the package, so I'm only using these in clib_test.go to sanity-check the assembly methods.

Member

Yuck, is there a way to avoid the calls to memset and memcpy?

Member Author

Even the original C++ implementations (bpacking_simd128_generated.h) use memset and memcpy. There might be ways to avoid those calls, but they'd likely be much less efficient. We already use amd64 versions of these anyway, so it's good to have arm64 implementations.

Member

By the looks of it, the calls to memcpy/memset could easily be replaced by the equivalent for loop. Given the size is small and constant, I don't think calling a generic version of memcpy/memset would be a significant win.

Member Author

I'll give it a try

Member Author

The latest update has removed the memset/memcpy calls and replaced them with simple fixed-size loops. My local testing showed all the tests passing, so assuming the CI is happy, we should be good here if you're happy with the results, @pitrou. Please let me know when you get a chance.
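The replacement described is roughly the following shape (a sketch; the actual sizes depend on the bit-packing kernel): a constant-length loop instead of a call out to a memcpy shim.

```go
package main

import "fmt"

// copy32 replaces a memcpy(dst, src, 32) call site with a fixed-size
// loop; with a constant trip count there is no call overhead to win
// back, and the compiler is free to unroll it.
func copy32(dst, src *[32]byte) {
	for i := 0; i < 32; i++ {
		dst[i] = src[i]
	}
}

func main() {
	var src, dst [32]byte
	for i := range src {
		src[i] = byte(i)
	}
	copy32(&dst, &src)
	fmt.Println(dst[31]) // 31
}
```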

@pitrou
Member

pitrou commented Sep 25, 2023

Regardless of automation, there should be READMEs explaining the rationale and motivations, and also how to generate those files anew.

@zeroshade
Member Author

@pitrou fair point, I'll update / write the README stuff for this today.

@github-actions github-actions bot added the awaiting change review label and removed the awaiting changes label Oct 2, 2023
@github-actions github-actions bot added the awaiting changes label and removed the awaiting change review label Oct 2, 2023
@zeroshade
Member Author

@tschaub Can you please try testing with the latest version of this branch/PR and confirm for me that everything is still working on your end and there's no performance degradation vs the previous fix? I agree with @pitrou's statement that it's unlikely the generic memset/memcpy were giving us much of a benefit over a simple fixed-length loop, but it'd be nice to confirm that there are no negative effects.

@github-actions github-actions bot added the awaiting change review label and removed the awaiting changes label Oct 2, 2023
@tschaub
Contributor

tschaub commented Oct 3, 2023

@zeroshade - I only tested a couple different cases, but I don't see any negative effect with the latest changes.

Here are some benchmarks transforming a small.parquet file (2400 rows, 20 physical columns, uncompressed):

Benchmark 1: ./gpq-noasm convert small.parquet small-noasm.parquet
  Time (mean ± σ):      72.2 ms ±   2.2 ms    [User: 103.9 ms, System: 25.2 ms]
  Range (min … max):    68.0 ms …  77.4 ms    41 runs
 
Benchmark 2: ./gpq-before convert small.parquet small-before.parquet
  Time (mean ± σ):      68.6 ms ±   2.2 ms    [User: 99.9 ms, System: 25.1 ms]
  Range (min … max):    63.1 ms …  74.9 ms    42 runs
 
Benchmark 3: ./gpq-after convert small.parquet small-after.parquet
  Time (mean ± σ):      68.5 ms ±   2.6 ms    [User: 99.0 ms, System: 24.5 ms]
  Range (min … max):    63.5 ms …  73.5 ms    40 runs
 
Summary
  ./gpq-after convert small.parquet small-after.parquet ran
    1.00 ± 0.05 times faster than ./gpq-before convert small.parquet small-before.parquet
    1.05 ± 0.05 times faster than ./gpq-noasm convert small.parquet small-noasm.parquet

And here are some numbers from a medium.parquet file (162709 rows, 20 physical columns, uncompressed):

Benchmark 1: ./gpq-noasm convert medium.parquet medium-noasm.parquet
  Time (mean ± σ):     986.3 ms ±  12.3 ms    [User: 1456.6 ms, System: 281.5 ms]
  Range (min … max):   965.2 ms … 1007.7 ms    10 runs
 
Benchmark 2: ./gpq-before convert medium.parquet medium-before.parquet
  Time (mean ± σ):     878.9 ms ±   7.2 ms    [User: 1343.1 ms, System: 269.5 ms]
  Range (min … max):   865.4 ms … 888.1 ms    10 runs
 
Benchmark 3: ./gpq-after convert medium.parquet medium-after.parquet
  Time (mean ± σ):     879.1 ms ±   9.7 ms    [User: 1341.6 ms, System: 268.6 ms]
  Range (min … max):   858.7 ms … 891.8 ms    10 runs
 
Summary
  ./gpq-before convert medium.parquet medium-before.parquet ran
    1.00 ± 0.01 times faster than ./gpq-after convert medium.parquet medium-after.parquet
    1.12 ± 0.02 times faster than ./gpq-noasm convert medium.parquet medium-noasm.parquet

So no significant difference in those between the before (c9693c5) and after (9d4c29b) cases. The other case is a build with -tags noasm.

Member

@pitrou pitrou left a comment

Mostly LGTM.

RET

// func _ClibMemcpy(dst, src unsafe.Pointer, n uint) unsafe.Pointer
TEXT ·_ClibMemcpy(SB), NOSPLIT|NOFRAME, $16-24
Member

Hmm, so is this file still useful?

Member Author

I didn't make any changes to the amd64 implementation, which still uses memcpy. Since there are no issues or problems with the amd64 implementation, I didn't want to change or remove anything in it. I just left the added Go function definition there for easier testing in the future if we need it.

Member

Ah, I see. Too bad, perhaps you can create a separate GH issue for it?

@zeroshade
Member Author

Thanks for checking @tschaub! looks great

@github-actions github-actions bot added the awaiting changes label and removed the awaiting change review label Oct 3, 2023
@zeroshade zeroshade requested review from cyb70289 and pitrou October 3, 2023 17:37
Member

@pitrou pitrou left a comment

LGTM, but it would be nice to have @cyb70289's approval as well.

@zeroshade
Member Author

Thanks @pitrou I'll hold off on merging till the end of the week to give @cyb70289 a chance to approve

Contributor

@cyb70289 cyb70289 left a comment

LGTM, thanks @zeroshade !

@zeroshade zeroshade merged commit f7530de into apache:main Oct 5, 2023
@zeroshade zeroshade removed the awaiting changes label Oct 5, 2023
@zeroshade zeroshade deleted the arm64-assembly-funtimes branch October 5, 2023 14:26
@conbench-apache-arrow

After merging your PR, Conbench analyzed the 5 benchmarking runs that have been run so far on merge-commit f7530de.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 2 possible false positives for unstable benchmarks that are known to sometimes produce them.

loicalleyne pushed a commit to loicalleyne/arrow that referenced this pull request Nov 13, 2023
…its (apache#37785)

Lead-authored-by: Matt Topol <[email protected]>
Co-authored-by: Antoine Pitrou <[email protected]>
Signed-off-by: Matt Topol <[email protected]>
dgreiss pushed a commit to dgreiss/arrow that referenced this pull request Feb 19, 2024
…its (apache#37785)


Successfully merging this pull request may close these issues.

[Go][Parquet] Reading a parquet file with a nested field that is null for all records panics when using Go 1.21.1
