fix(csharp/src/Drivers/Databricks): Implement FIFO memory acquisition to prevent starvation in CloudFetch #3756

eric-wang-1990 · 2025-11-25T10:22:05Z

Fixes memory starvation issue in CloudFetch download pipeline where newer downloads could win memory allocation over older waiting downloads, causing indefinite delays.

Problem

The CloudFetch downloader was acquiring memory inside parallel download tasks with a polling-based wait (10ms intervals). This non-FIFO behavior caused memory starvation:

Multiple download tasks acquire semaphore slots and start polling for memory
All tasks wake up simultaneously every 10ms to check memory availability
When memory becomes available, any waiting task could acquire it (non-deterministic)
Newer downloads (e.g., file 5, 7, 8) could repeatedly win the race over older downloads (e.g., file 4)
File 4 never gets memory despite waiting the longest → indefinite starvation

Example scenario with 200MB memory, 50MB files, 3 parallel downloads:

Files 1, 2, 3 download (150MB used)
File 1 completes and releases 50MB
Files 4, 5, 6 are all waiting for memory
File 5 wins the race and acquires the 50MB
File 2 completes, file 7 wins the race
File 3 completes, file 8 wins the race
File 4 never gets memory because files 5, 7, 8 keep winning

Solution

Move memory acquisition from inside parallel download tasks to the main sequential loop. This ensures FIFO ordering:

Main loop acquires memory sequentially for each download
Only after memory is acquired does the download task start
Downloads are guaranteed to get memory in the order they were queued
No starvation possible

Additionally, increased default memory buffer from 100MB to 200MB to allow more parallel downloads without hitting memory limits.

Changes

CloudFetchDownloader.cs:
- Moved AcquireMemoryAsync from inside DownloadFileAsync to main loop (line 278-280)
- Ensures FIFO ordering before spawning download tasks
CloudFetchDownloadManager.cs:
- Increased DefaultMemoryBufferSizeMB from 100 to 200
- Better performance for large result sets

Testing

Manually verified on fast VM with Power BI Desktop
Existing CloudFetchDownloader tests still pass
FIFO ordering prevents the starvation scenario

TODO

The memory buffer design have a flaw that it only captures the compressed size, but does not cover decompressed size. There are a couple of options we can take but I wanna land this first since this one actually can get blocked for customer.

CurtHagenlocher · 2025-11-25T14:57:16Z

If the memory counter is released as soon as the chunk is downloaded, then what's to prevent all of memory being consumed by downloaded chunks if there's a slow reader that's still working its way through chunks at the beginning? This seems to be acting more of a download-parallelism limiter now than a memory-usage limiter.

Fixes memory starvation issue where newer downloads could win memory allocation over older waiting downloads, causing indefinite delays. Changes: 1. Move memory acquisition from DownloadFileAsync to main loop for FIFO ordering - Prevents memory starvation where newer downloads win over older waiting downloads - Memory is now acquired sequentially before spawning download tasks 2. Increase default memory buffer from 100MB to 200MB - Allows more parallel downloads without hitting memory limits - Better performance for large result sets 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

jadewang-db

can we add some test ?

- Add mock setup for ActivityTrace to fix NullReferenceException - Test with maxParallelDownloads=3 to simulate realistic contention - Track actual download start order via HTTP handler mock - Verify FIFO ordering at three points: memory acquisition, download start, and resultQueue - Inspect resultQueue directly instead of consuming via GetNextDownloadedFileAsync 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

CurtHagenlocher

Thanks! Assuming I understand this change correctly, this seems a far better approach!

…in CloudFetch (#56) This is cherry pick of this PR: apache/arrow-adbc#3756 Fixes memory starvation issue in CloudFetch download pipeline where newer downloads could win memory allocation over older waiting downloads, causing indefinite delays. ## Problem The CloudFetch downloader was acquiring memory inside parallel download tasks with a polling-based wait (10ms intervals). This non-FIFO behavior caused memory starvation: 1. Multiple download tasks acquire semaphore slots and start polling for memory 2. All tasks wake up simultaneously every 10ms to check memory availability 3. When memory becomes available, any waiting task could acquire it (non-deterministic) 4. Newer downloads (e.g., file 5, 7, 8) could repeatedly win the race over older downloads (e.g., file 4) 5. File 4 never gets memory despite waiting the longest → **indefinite starvation** Example scenario with 200MB memory, 50MB files, 3 parallel downloads: - Files 1, 2, 3 download (150MB used) - File 1 completes and releases 50MB - Files 4, 5, 6 are all waiting for memory - File 5 wins the race and acquires the 50MB - File 2 completes, file 7 wins the race - File 3 completes, file 8 wins the race - File 4 never gets memory because files 5, 7, 8 keep winning ## Solution Move memory acquisition from inside parallel download tasks to the main sequential loop. This ensures FIFO ordering: 1. Main loop acquires memory sequentially for each download 2. Only after memory is acquired does the download task start 3. Downloads are guaranteed to get memory in the order they were queued 4. No starvation possible Additionally, increased default memory buffer from 100MB to 200MB to allow more parallel downloads without hitting memory limits. ## Changes - **CloudFetchDownloader.cs**: - Moved `AcquireMemoryAsync` from inside `DownloadFileAsync` to main loop (line 278-280) - Ensures FIFO ordering before spawning download tasks - **CloudFetchDownloadManager.cs**: - Increased `DefaultMemoryBufferSizeMB` from 100 to 200 - Better performance for large result sets ## Testing - Manually verified on fast VM with Power BI Desktop - Existing CloudFetchDownloader tests still pass - FIFO ordering prevents the starvation scenario ## TODO The memory buffer design have a flaw that it only captures the compressed size, but does not cover decompressed size. There are a couple of options we can take but I wanna land this first since this one actually can get blocked for customer. --------- Co-authored-by: Claude <[email protected]>

eric-wang-1990 requested a review from CurtHagenlocher as a code owner November 25, 2025 10:22

github-actions bot added this to the ADBC Libraries 22 milestone Nov 25, 2025

eric-wang-1990 force-pushed the fix-cloudfetch-memory-deadlock branch from 311b880 to f0b53bb Compare November 25, 2025 10:49

CurtHagenlocher changed the title ~~fix(csharp/databricks): Release CloudFetch memory immediately after download completes~~ fix(csharp/src/Drivers/Databricks): Release CloudFetch memory immediately after download completes Nov 25, 2025

eric-wang-1990 changed the title ~~fix(csharp/src/Drivers/Databricks): Release CloudFetch memory immediately after download completes~~ [On hold]fix(csharp/src/Drivers/Databricks): Release CloudFetch memory immediately after download completes Nov 25, 2025

eric-wang-1990 changed the title ~~[On hold]fix(csharp/src/Drivers/Databricks): Release CloudFetch memory immediately after download completes~~ fix(csharp/src/Drivers/Databricks): Implement FIFO memory acquisition to prevent starvation in CloudFetch Nov 26, 2025

eric-wang-1990 force-pushed the fix-cloudfetch-memory-deadlock branch from 8afca15 to fd1db09 Compare November 26, 2025 01:56

jadewang-db approved these changes Nov 26, 2025

View reviewed changes

eric-wang-1990 mentioned this pull request Nov 26, 2025

fix(csharp): implement FIFO memory acquisition to prevent starvation in CloudFetch adbc-drivers/databricks#56

Merged

CurtHagenlocher approved these changes Dec 2, 2025

View reviewed changes

CurtHagenlocher merged commit 87c49e9 into apache:main Dec 2, 2025
6 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(csharp/src/Drivers/Databricks): Implement FIFO memory acquisition to prevent starvation in CloudFetch #3756

fix(csharp/src/Drivers/Databricks): Implement FIFO memory acquisition to prevent starvation in CloudFetch #3756

eric-wang-1990 commented Nov 25, 2025 •

edited

Loading

Uh oh!

CurtHagenlocher commented Nov 25, 2025 •

edited

Loading

Uh oh!

jadewang-db left a comment

Uh oh!

CurtHagenlocher left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

fix(csharp/src/Drivers/Databricks): Implement FIFO memory acquisition to prevent starvation in CloudFetch #3756

fix(csharp/src/Drivers/Databricks): Implement FIFO memory acquisition to prevent starvation in CloudFetch #3756

Conversation

eric-wang-1990 commented Nov 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Problem

Solution

Changes

Testing

TODO

Uh oh!

CurtHagenlocher commented Nov 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jadewang-db left a comment

Choose a reason for hiding this comment

Uh oh!

CurtHagenlocher left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

eric-wang-1990 commented Nov 25, 2025 •

edited

Loading

CurtHagenlocher commented Nov 25, 2025 •

edited

Loading