@jadewang-db
Contributor

Summary

This PR fixes an issue where Power BI would hang when reading CloudFetch results, and it significantly improves logging in the CloudFetch downloader component.

Problem

  1. The CloudFetchReader did not dispose of the download manager after downloads completed; the resulting resource leak caused Power BI to hang.
  2. The CloudFetchDownloader logged with Debug.WriteLine, which is inadequate for production scenarios and does not provide enough diagnostic information.
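
As an illustration of the first problem, here is a minimal sketch of a reader that releases its download manager once the last file has been consumed. It is written in Python for brevity and is not the driver's code (the driver is .NET). `DownloadManager`, `Reader`, `next_file`, and `dispose` are hypothetical names, and the sketch assumes the manager owns a background worker that keeps running until it is explicitly stopped.

```python
import threading


class DownloadManager:
    """Hypothetical stand-in: owns a background worker until disposed."""

    def __init__(self, files):
        self._files = list(files)
        self._stop = threading.Event()
        # The worker blocks until dispose() signals it. If nobody calls
        # dispose(), this thread stays alive for the life of the process.
        self._worker = threading.Thread(target=self._stop.wait)
        self._worker.start()
        self.disposed = False

    def next_file(self):
        """Return the next downloaded file, or None when none are left."""
        return self._files.pop(0) if self._files else None

    def dispose(self):
        """Stop the worker and wait for it to exit. Safe to call twice."""
        if self.disposed:
            return
        self._stop.set()
        self._worker.join()
        self.disposed = True


class Reader:
    """Hypothetical reader that disposes the manager at end of data."""

    def __init__(self, manager):
        self._manager = manager

    def read_next(self):
        if self._manager is None:
            return None  # already finished and cleaned up
        item = self._manager.next_file()
        if item is None:
            # All files processed: release the manager right away
            # instead of relying on the caller to clean up later.
            self._manager.dispose()
            self._manager = None
        return item
```

In this sketch, a consumer that simply reads until `read_next()` returns `None` never has to know about the manager; without the `dispose()` call the worker thread would outlive the reader.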

Solution

  • Fixed resource management in CloudFetchReader by properly disposing the download manager after all files are processed
  • Replaced Debug.WriteLine calls with more comprehensive Trace logging
  • Added detailed performance metrics and diagnostics:
    • Download start/completion timestamps
    • File sizes and throughput calculations
    • Decompression metrics
    • Overall download statistics (total files, success/failure counts)
  • Added URL sanitization for secure logging
  • Added proper error tracking and reporting
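
Two of the items above, URL sanitization and throughput calculation, can be sketched as follows. This is a Python illustration under stated assumptions, not the PR's implementation: it assumes the download URLs carry credentials in the query string (as presigned cloud storage links typically do) and that throughput is reported in MiB per second. `sanitize_url` and `throughput_mb_per_s` are hypothetical helper names.

```python
from urllib.parse import urlsplit, urlunsplit


def sanitize_url(url: str) -> str:
    """Return the URL without userinfo, query string, or fragment.

    Assumption: any signature or token lives in those parts, so the
    scheme, host, port, and path are safe to write to a log.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    if parts.port:
        host += f":{parts.port}"
    return urlunsplit((parts.scheme, host, parts.path, "", ""))


def throughput_mb_per_s(size_bytes: int, elapsed_seconds: float) -> float:
    """Compute download throughput in MiB/s, guarding against zero time."""
    if elapsed_seconds <= 0:
        return 0.0
    return size_bytes / (1024 * 1024) / elapsed_seconds
```

A log line would then combine the two, for example by formatting `sanitize_url(url)` together with the file size and `throughput_mb_per_s(size, elapsed)` once a download completes.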

Testing

  • Enhanced CloudFetchE2ETest to verify that the reader properly completes after all data is consumed
  • Verified that Power BI no longer hangs when reading CloudFetch results

@github-actions bot added this to the ADBC Libraries 18 milestone on Apr 28, 2025
Contributor

@CurtHagenlocher left a comment

Looks fine overall, but I have a question about the tracing.

@CurtHagenlocher merged commit 332e145 into apache:main on Apr 29, 2025
6 checks passed
colin-rogers-dbt pushed a commit to dbt-labs/arrow-adbc that referenced this pull request Jun 10, 2025
…bricks driver (apache#2747)
