fix(csharp): implement FIFO memory acquisition to prevent starvation in CloudFetch #56

eric-wang-1990 · 2025-11-26T22:52:45Z

This is cherry pick of this PR: apache/arrow-adbc#3756

Fixes memory starvation issue in CloudFetch download pipeline where newer downloads could win memory allocation over older waiting downloads, causing indefinite delays.

Problem

The CloudFetch downloader was acquiring memory inside parallel download tasks with a polling-based wait (10ms intervals). This non-FIFO behavior caused memory starvation:

Multiple download tasks acquire semaphore slots and start polling for memory
All tasks wake up simultaneously every 10ms to check memory availability
When memory becomes available, any waiting task could acquire it (non-deterministic)
Newer downloads (e.g., file 5, 7, 8) could repeatedly win the race over older downloads (e.g., file 4)
File 4 never gets memory despite waiting the longest → indefinite starvation

Example scenario with 200MB memory, 50MB files, 3 parallel downloads:

Files 1, 2, 3 download (150MB used)
File 1 completes and releases 50MB
Files 4, 5, 6 are all waiting for memory
File 5 wins the race and acquires the 50MB
File 2 completes, file 7 wins the race
File 3 completes, file 8 wins the race
File 4 never gets memory because files 5, 7, 8 keep winning

Solution

Move memory acquisition from inside parallel download tasks to the main sequential loop. This ensures FIFO ordering:

Main loop acquires memory sequentially for each download
Only after memory is acquired does the download task start
Downloads are guaranteed to get memory in the order they were queued
No starvation possible

Additionally, increased default memory buffer from 100MB to 200MB to allow more parallel downloads without hitting memory limits.

Changes

CloudFetchDownloader.cs:
- Moved AcquireMemoryAsync from inside DownloadFileAsync to main loop (line 278-280)
- Ensures FIFO ordering before spawning download tasks
CloudFetchDownloadManager.cs:
- Increased DefaultMemoryBufferSizeMB from 100 to 200
- Better performance for large result sets

Testing

Manually verified on fast VM with Power BI Desktop
Existing CloudFetchDownloader tests still pass
FIFO ordering prevents the starvation scenario

TODO

The memory buffer design have a flaw that it only captures the compressed size, but does not cover decompressed size. There are a couple of options we can take but I wanna land this first since this one actually can get blocked for customer.

…acquisition This PR fixes a memory deadlock issue in CloudFetch by ensuring FIFO (First-In-First-Out) ordering of memory acquisition. ## Problem Previously, memory was acquired inside DownloadFileAsync() after the download task was spawned. This caused downloads to compete for memory in an unpredictable order, potentially leading to deadlocks when memory is limited. ## Solution 1. Move memory acquisition to before spawning the download task - Memory is now acquired in the sequential main loop (line 283-285) - This ensures FIFO ordering based on when files arrive in the download queue - Removed memory acquisition from inside DownloadFileAsync() (previously line 395) 2. Increase default memory buffer from 100MB to 200MB - Reduces likelihood of memory contention - Better accommodates parallel downloads 3. Add comprehensive FIFO test - Verifies memory acquisition order - Verifies download start order (proves file 3 doesn't start before file 2) - Verifies resultQueue ordering ## Test Results - New FIFO test passes successfully - Validates proper ordering at three verification points 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

Resolved conflict in CloudFetchDownloadManager.cs by accepting main branch refactoring that removed protocol-specific fields.

…ix-cloudfetch-memory-fifo

…Result interface Updated mock setup to use FileUrl, ByteCount, and ExpirationTime properties instead of the removed Link property after interface refactoring.

github-actions · 2025-12-08T21:38:16Z

📊 CloudFetch Benchmark Results

.NET 8.0

Query	Mean (ms)	Baseline	Diff	Rows	Columns
catalog_sales	3768.75	-	-	1441548	34
customer	1227.52	-	-	100000	18
inventory	7583.74	-	-	11745000	5
sales(...)tamps [21]	4842.10	-	-
store_sales_numeric	4794.30	-	-	2880404	16
web_sales	2370.12	-	-	719384	34
wide_sales_analysis	15430.69	-	-	2880404	54

.NET Framework 4.7.2

Query	Mean (ms)	Baseline	Diff	Rows	Columns
catalog_sales	4314.55	-	-	1441548	34
customer	644.95	-	-	100000	18
inventory	45164.12	-	-	11745000	5
sales(...)tamps [21]	16340.93	-	-
store_sales_numeric	12551.49	-	-	2880404	16
web_sales	3206.65	-	-	719384	34
wide_sales_analysis	16555.17	-	-	2880404	54

🟢 Improvement | 🔴 Regression | ⚪ No change | Baseline from latest main branch run

github-actions · 2025-12-08T23:56:20Z

📊 CloudFetch Benchmark Results

.NET 8.0

Query	Mean (ms)	Baseline	Diff	Rows	Columns

.NET Framework 4.7.2

Query	Mean (ms)	Baseline	Diff	Rows	Columns

🟢 Improvement | 🔴 Regression | ⚪ No change | Baseline from latest main branch run

github-actions · 2025-12-09T00:36:02Z

📊 CloudFetch Benchmark Results

.NET 8.0

Query	Mean (ms)	Baseline	Diff	Rows	Columns

.NET Framework 4.7.2

Query	Mean (ms)	Baseline	Diff	Rows	Columns

🟢 Improvement | 🔴 Regression | ⚪ No change | Baseline from latest main branch run

github-actions · 2025-12-09T00:48:51Z

📊 CloudFetch Benchmark Results

.NET 8.0

Query	Mean (ms)	Baseline	Diff	Rows	Columns

.NET Framework 4.7.2

Query	Mean (ms)	Baseline	Diff	Rows	Columns

🟢 Improvement | 🔴 Regression | ⚪ No change | Baseline from latest main branch run

…ix-cloudfetch-memory-fifo

github-actions · 2025-12-10T07:23:31Z

📊 CloudFetch Benchmark Results

.NET 8.0

Query	Mean (ms)	Baseline	Diff	Rows	Columns
catalog_sales	4288.11	2658.09	🔴 +60.0%	1441548	34
customer	1543.11	677.81	🔴 +120.0%	100000	18
inventory	7366.79	6722.08	⚪ ~0%	11745000	5
sales_with_timestamps	6273.87	9658.16	🟢 -30.0%	2880404	13
store_sales_numeric	5023.56	4500.87	🔴 +10.0%	2880404	16
web_sales	2751.84	1507.29	🔴 +80.0%	719384	34
wide_sales_analysis	15776.66	12598.68	🔴 +20.0%	2880404	54

.NET Framework 4.7.2

Query	Mean (ms)	Baseline	Diff	Rows	Columns
catalog_sales	5687.05	6128.61	⚪ ~0%	1441548	34
customer	1365.08	1565.94	🟢 -10.0%	100000	18
inventory	45554.86	45958.73	⚪ ~0%	11745000	5
sales_with_timestamps	14528.96	15106.52	⚪ ~0%	2880404	13
store_sales_numeric	13371.33	13574.09	⚪ ~0%	2880404	16
web_sales	3927.35	4012.90	⚪ ~0%	719384	34
wide_sales_analysis	18225.52	17864.24	⚪ ~0%	2880404	54

🟢 Improvement | 🔴 Regression | ⚪ No change | Baseline from latest main branch run

eric-wang-1990 requested a review from jadewang-db November 26, 2025 22:52

eric-wang-1990 temporarily deployed to azure-prod November 26, 2025 22:52 — with GitHub Actions Inactive

eric-wang-1990 requested a review from msrathore-db November 26, 2025 22:52

eric-wang-1990 changed the title ~~fix(csharp/src/Drivers/Databricks): Implement FIFO memory acquisition to prevent starvation in CloudFetch~~ fix(csharp): Implement FIFO memory acquisition to prevent starvation in CloudFetch Nov 26, 2025

eric-wang-1990 changed the title ~~fix(csharp): Implement FIFO memory acquisition to prevent starvation in CloudFetch~~ fix(csharp): implement FIFO memory acquisition to prevent starvation in CloudFetch Dec 1, 2025

eric-wang-1990 added 3 commits December 8, 2025 13:13

Merge main into eric/fix-cloudfetch-memory-fifo

47cf92e

Resolved conflict in CloudFetchDownloadManager.cs by accepting main branch refactoring that removed protocol-specific fields.

Merge branch 'main' of github.com:adbc-drivers/databricks into eric/f…

4a1cc9f

…ix-cloudfetch-memory-fifo

fix(csharp): update CloudFetchDownloaderTest for refactored IDownload…

7a07bed

…Result interface Updated mock setup to use FileUrl, ByteCount, and ExpirationTime properties instead of the removed Link property after interface refactoring.

eric-wang-1990 temporarily deployed to azure-prod December 8, 2025 21:22 — with GitHub Actions Inactive

eric-wang-1990 added the benchmark Run performance benchmarks on this PR label Dec 8, 2025

eric-wang-1990 temporarily deployed to azure-prod December 8, 2025 21:24 — with GitHub Actions Inactive

jadewang-db approved these changes Dec 8, 2025

View reviewed changes

eric-wang-1990 removed the benchmark Run performance benchmarks on this PR label Dec 8, 2025

Merge branch 'main' into eric/fix-cloudfetch-memory-fifo

b93c2bc

eric-wang-1990 temporarily deployed to azure-prod December 8, 2025 23:43 — with GitHub Actions Inactive

eric-wang-1990 added the benchmark Run performance benchmarks on this PR label Dec 8, 2025

eric-wang-1990 temporarily deployed to azure-prod December 8, 2025 23:44 — with GitHub Actions Inactive

eric-wang-1990 added benchmark Run performance benchmarks on this PR and removed benchmark Run performance benchmarks on this PR labels Dec 9, 2025

eric-wang-1990 temporarily deployed to azure-prod December 9, 2025 00:23 — with GitHub Actions Inactive

eric-wang-1990 added benchmark Run performance benchmarks on this PR and removed benchmark Run performance benchmarks on this PR labels Dec 9, 2025

eric-wang-1990 temporarily deployed to azure-prod December 9, 2025 00:36 — with GitHub Actions Inactive

eric-wang-1990 added benchmark Run performance benchmarks on this PR and removed benchmark Run performance benchmarks on this PR labels Dec 9, 2025

eric-wang-1990 temporarily deployed to azure-prod December 10, 2025 05:19 — with GitHub Actions Inactive

eric-wang-1990 added benchmark Run performance benchmarks on this PR and removed benchmark Run performance benchmarks on this PR labels Dec 10, 2025

chore: trigger benchmark run

4fedbad

eric-wang-1990 added benchmark Run performance benchmarks on this PR and removed benchmark Run performance benchmarks on this PR labels Dec 10, 2025

Merge branch 'main' of github.com:adbc-drivers/databricks into eric/f…

2bac3df

…ix-cloudfetch-memory-fifo

eric-wang-1990 had a problem deploying to azure-prod December 10, 2025 06:14 — with GitHub Actions Error

eric-wang-1990 removed the benchmark Run performance benchmarks on this PR label Dec 10, 2025

msrathore-db approved these changes Dec 10, 2025

View reviewed changes

fix: remove trailing whitespace in CloudFetchDownloader

b2bfc5c

eric-wang-1990 temporarily deployed to azure-prod December 10, 2025 06:20 — with GitHub Actions Inactive

eric-wang-1990 added the benchmark Run performance benchmarks on this PR label Dec 10, 2025

eric-wang-1990 temporarily deployed to azure-prod December 10, 2025 07:09 — with GitHub Actions Inactive

eric-wang-1990 merged commit 77c5d1d into main Dec 10, 2025
13 checks passed

eric-wang-1990 deleted the eric/fix-cloudfetch-memory-fifo branch December 10, 2025 07:46

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(csharp): implement FIFO memory acquisition to prevent starvation in CloudFetch #56

fix(csharp): implement FIFO memory acquisition to prevent starvation in CloudFetch #56

eric-wang-1990 commented Nov 26, 2025

Uh oh!

github-actions bot commented Dec 8, 2025

Uh oh!

github-actions bot commented Dec 8, 2025

Uh oh!

github-actions bot commented Dec 9, 2025

Uh oh!

github-actions bot commented Dec 9, 2025

Uh oh!

github-actions bot commented Dec 10, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

fix(csharp): implement FIFO memory acquisition to prevent starvation in CloudFetch #56

fix(csharp): implement FIFO memory acquisition to prevent starvation in CloudFetch #56

Conversation

eric-wang-1990 commented Nov 26, 2025

Problem

Solution

Changes

Testing

TODO

Uh oh!

github-actions bot commented Dec 8, 2025

📊 CloudFetch Benchmark Results

Uh oh!

github-actions bot commented Dec 8, 2025

📊 CloudFetch Benchmark Results

Uh oh!

github-actions bot commented Dec 9, 2025

📊 CloudFetch Benchmark Results

Uh oh!

github-actions bot commented Dec 9, 2025

📊 CloudFetch Benchmark Results

Uh oh!

github-actions bot commented Dec 10, 2025

📊 CloudFetch Benchmark Results

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants