fix(go/adbc/driver/snowflake): fix potential deadlocks in reader #3870
Merged
Conversation
Contributor
Will this get ported to the Foundry as well @zeroshade?
Member
Author
@davidhcoe yes, this will get ported to the foundry. Once we complete the shift to the foundry we won't be filing PRs here anymore and all development will shift to the foundry.
lidavidm approved these changes on Jan 9, 2026
Seems like this branch fixes the deadlock issue I was experiencing in #3730. Thank you for looking into this!
Member
Author
@davidhcoe see adbc-drivers/snowflake#60 for the foundry port.
lidavidm pushed a commit to adbc-drivers/snowflake that referenced this pull request on Jan 10, 2026
## What's Changed
## Fix Critical Deadlocks and Race Conditions in Snowflake Record Reader
This PR addresses multiple critical concurrency issues in the Snowflake
driver's `recordReader` that could cause complete application hangs
under ordinary concurrent usage.
### Issues Fixed
*1. Critical Deadlock: `Release()` Blocking Forever*
*Problem*: When `Release()` was called while producer goroutines were
blocked on channel sends, a permanent deadlock occurred:
* `Release()` cancels context and attempts to drain channels
* Producer goroutines blocked on `ch <- rec` cannot see the cancellation
* Channels never close because producers never exit
* `Release()` blocks forever on `for rec := range ch`
*Fix:* Added a `done` channel that signals when all producer goroutines
have completed. `Release()` now waits for this signal before attempting
to drain channels.
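For illustration, a minimal sketch of that pattern follows. The struct layout and field names (`cancelFn`, `chs`, `done`) and the arrow-go import are assumptions chosen to match the snippets in this description, not the driver's exact code:
```go
package example

import (
	"context"

	"github.com/apache/arrow-go/v18/arrow"
)

// Sketch only: this is an illustrative reader, not the driver's actual struct.
type reader struct {
	cancelFn context.CancelFunc
	chs      []chan arrow.RecordBatch
	done     chan struct{} // closed once every producer goroutine has returned
}

func (r *reader) Release() {
	r.cancelFn() // ask the producers to stop
	<-r.done     // wait until all producers have exited and closed their channels
	for _, ch := range r.chs {
		for rec := range ch { // safe to drain now: every channel is closed
			rec.Release()
		}
	}
}
```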
*2. Severe Deadlock: Non-Context-Aware Channel Sends*
*Problem:* Channel send operations at lines 694 and 732 checked the context
before the send, but not while blocked in the send itself:
```go
for rr.Next() && ctx.Err() == nil { // Context checked here
// ...
ch <- rec // But send blocks here without checking context
}
```
*Fix:* Wrapped all channel sends in `select` statements with context
awareness:
```go
select {
case chs[0] <- rec:
// Successfully sent
case <-ctx.Done():
rec.Release()
return ctx.Err()
}
```
*3. Critical Race Condition: Nil Channel Reads*
*Problem:* Channels were created asynchronously in goroutines after
`newRecordReader` returned. If `Next()` was called quickly after
creation, it could read from uninitialized (nil) channels, causing
infinite blocking.
*Fix:* Initialize all channels upfront before starting any goroutines:
```go
chs := make([]chan arrow.RecordBatch, len(batches))
for i := range chs {
chs[i] = make(chan arrow.RecordBatch, bufferSize)
}
```
*4. Goroutine Leaks on Initialization Errors*
*Problem:* Error paths only cleaned up the first channel, potentially
leaking goroutines if initialization failed after starting concurrent
operations.
*Fix:* Moved all error-prone initialization (GetStream, NewReader)
before goroutine creation, and added proper cleanup on errors.
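A rough sketch of that ordering is below; `openBatch`, `recordSource`, `batch`, and `bufferSize` are hypothetical stand-ins for the GetStream/NewReader setup, not real driver or gosnowflake APIs:
```go
package example

import (
	"context"

	"github.com/apache/arrow-go/v18/arrow"
)

// Hypothetical stand-ins for the fallible per-batch setup described above.
type batch struct{}

type recordSource interface{ Release() }

func openBatch(ctx context.Context, b batch) (recordSource, error) {
	return nil, nil // placeholder for the GetStream + NewReader work
}

const bufferSize = 4 // illustrative only

// setUp performs every step that can fail before any goroutine exists, so an
// error only requires releasing readers opened so far, never chasing goroutines.
func setUp(ctx context.Context, batches []batch) ([]recordSource, []chan arrow.RecordBatch, error) {
	readers := make([]recordSource, len(batches))
	for i, b := range batches {
		r, err := openBatch(ctx, b)
		if err != nil {
			for _, prev := range readers[:i] {
				prev.Release() // clean up everything opened before the failure
			}
			return nil, nil, err
		}
		readers[i] = r
	}
	chs := make([]chan arrow.RecordBatch, len(readers))
	for i := range chs {
		chs[i] = make(chan arrow.RecordBatch, bufferSize)
	}
	// Producer goroutines are started only after this point.
	return readers, chs, nil
}
```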
----------------------
#### Changes
* Added `done` channel to `reader` struct to signal goroutine completion
* Initialize all channels upfront to eliminate race conditions
* Use context-aware sends with `select` statements for all channel
operations
* Update `Release()` to wait on `done` channel before draining
* Reorganize initialization to handle errors before starting goroutines
* Signal completion by closing the `done` channel after all producers finish (see the sketch below)
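One way to wire up that completion signal, sketched with illustrative names (`produce` stands in for the per-batch copy loop using the context-aware sends shown earlier):
```go
package example

import (
	"context"
	"sync"

	"github.com/apache/arrow-go/v18/arrow"
)

// startProducers sketches the completion signalling: each producer closes its
// own channel, a WaitGroup tracks them, and done is closed only after every
// producer has finished.
func startProducers(ctx context.Context, chs []chan arrow.RecordBatch,
	produce func(context.Context, chan<- arrow.RecordBatch) error) <-chan struct{} {

	done := make(chan struct{})
	var wg sync.WaitGroup
	for _, ch := range chs {
		wg.Add(1)
		go func(ch chan arrow.RecordBatch) {
			defer wg.Done()
			defer close(ch)      // lets Release()/Next() range over the channel safely
			_ = produce(ctx, ch) // uses the context-aware select sends shown above
		}(ch)
	}
	go func() {
		wg.Wait()
		close(done) // Release() waits on this before draining
	}()
	return done
}
```
Closing each channel from its own producer is what makes the drain loop in `Release()` safe to run unconditionally once `done` is closed.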
#### Reproduction Scenarios Prevented
*Deadlock:*
1. bufferSize = 1, producer generates 2 records quickly
2. Channel becomes full after first record
3. Producer blocks on send
4. Consumer calls Release() before Next()
5. Without fix: permanent deadlock
6. With fix: producer responds to cancellation and Release() completes (a simplified runnable sketch follows)
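The deadlock scenario can be reproduced in miniature. The standalone program below substitutes plain ints for Arrow records; with the context-aware send plus `done` channel it terminates, whereas a bare `ch <- i` send in the producer would hang exactly as described:
```go
package main

import (
	"context"
	"fmt"
)

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	ch := make(chan int, 1) // bufferSize = 1
	done := make(chan struct{})

	// Producer: emits two "records" quickly; the second send finds the channel full.
	go func() {
		defer close(done)
		defer close(ch)
		for i := 0; i < 2; i++ {
			select {
			case ch <- i: // context-aware send (the fix)
			case <-ctx.Done():
				return // producer can exit even though nobody is receiving
			}
		}
	}()

	// Consumer releases before ever calling Next() (step 4 above).
	cancel() // cancel the context
	<-done   // wait for the producer to exit and close its channel
	for range ch {
		// drain whatever was already buffered
	}
	fmt.Println("Release() path completed without deadlock")
}
```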
*Race Condition:*
1. Query returns 3 batches
2. First batch processes quickly
3. Next() advances to second channel
4. Without fix: reads from nil channel, blocks forever
5. With fix: channel already initialized, works correctly
Backport of apache/arrow-adbc#3870.