fix(sql): always free parquet buffers when executing queries in parallel#6372

Merged
bluestreak01 merged 7 commits into master from puzpuzpuz_parquet_oom
Nov 12, 2025

Conversation

@puzpuzpuz
Contributor

Without this fix, queries like the one below lead to OOM:

SELECT COUNT(*) FROM read_parquet('hits.parquet') WHERE AdvEngineID <> 0;

Here, hits.parquet is the ClickBench test file, which has large row groups (up to 600K rows). Combined with the lack of projections (#5280), this easily leads to OOM errors because we were keeping row group buffers around for each page frame reduce task.
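
The failure mode described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code changed in this PR: `RowGroupBufferSketch`, `MemoryTracker`, `ReduceTask`, `releaseBuffers`, and the 64 MiB row group size are all hypothetical names and numbers, and the tasks run sequentially here for simplicity. The point is that when long-lived reduce tasks each keep their decoded row group buffer, peak memory grows with the number of tasks, whereas freeing the buffer in a `finally` block keeps it near the size of one row group.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names come from the QuestDB code base.
final class RowGroupBufferSketch {

    // Stand-in for native memory accounting of decoded row group buffers.
    static final class MemoryTracker {
        long used;
        long peak;

        void allocate(long bytes) {
            used += bytes;
            peak = Math.max(peak, used);
        }

        void free(long bytes) {
            used -= bytes;
        }
    }

    // Stand-in for a page frame reduce task that owns a decoded row group buffer.
    static final class ReduceTask {
        private final MemoryTracker tracker;
        private long heldBytes;

        ReduceTask(MemoryTracker tracker) {
            this.tracker = tracker;
        }

        void decodeRowGroup(long bytes) {
            tracker.allocate(bytes);
            heldBytes += bytes;
        }

        void releaseBuffers() {
            tracker.free(heldBytes);
            heldBytes = 0;
        }
    }

    // Runs taskCount reduce tasks one after another and returns the peak bytes held.
    static long peakBytes(int taskCount, long rowGroupBytes, boolean freeAfterReduce) {
        MemoryTracker tracker = new MemoryTracker();
        // Tasks stay referenced (as if pooled in a queue), so buffers they hold stay alive too.
        List<ReduceTask> tasks = new ArrayList<>();
        for (int i = 0; i < taskCount; i++) {
            ReduceTask task = new ReduceTask(tracker);
            tasks.add(task);
            try {
                task.decodeRowGroup(rowGroupBytes);
                // ... filtering and aggregation of the frame would happen here ...
            } finally {
                if (freeAfterReduce) {
                    // Always free, even if the reduce step throws.
                    task.releaseBuffers();
                }
            }
        }
        long peak = tracker.peak;
        // Final cleanup so the sketch itself does not leak accounted bytes.
        for (ReduceTask task : tasks) {
            task.releaseBuffers();
        }
        return peak;
    }

    public static void main(String[] args) {
        long rowGroupBytes = 64L * 1024 * 1024; // assumed decoded size of one row group
        System.out.println("buffers retained per task: " + peakBytes(8, rowGroupBytes, false));
        System.out.println("buffers freed after reduce: " + peakBytes(8, rowGroupBytes, true));
    }
}
```

With eight tasks, the retained variant peaks at eight row groups' worth of memory, while the variant that frees after each reduce peaks at one.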

@puzpuzpuz puzpuzpuz self-assigned this Nov 10, 2025
@puzpuzpuz puzpuzpuz added the Bug (Incorrect or unexpected behavior) and SQL (Issues or changes relating to SQL execution) labels Nov 10, 2025
@nwoolmer nwoolmer self-requested a review November 10, 2025 18:43
@puzpuzpuz puzpuzpuz changed the title fix(sql): always release parquet buffers when executing queries in parallel fix(sql): always free parquet buffers when executing queries in parallel Nov 10, 2025
nwoolmer previously approved these changes Nov 11, 2025
Contributor

@nwoolmer nwoolmer left a comment

Looks good to me, will get heavier testing as part of read_parquet upgrades

@nwoolmer
Contributor

(after failed tests are fixed!)

@glasstiger
Contributor

[PR Coverage check]

😍 pass : 117 / 118 (99.15%)

file detail

| path | covered lines | new lines | coverage |
|---|---|---|---|
| 🔵 io/questdb/griffin/engine/table/AsyncTopKRecordCursorFactory.java | 31 | 32 | 96.88% |
| 🔵 io/questdb/cairo/sql/async/PageFrameSequence.java | 1 | 1 | 100.00% |
| 🔵 io/questdb/std/DirectIntList.java | 10 | 10 | 100.00% |
| 🔵 io/questdb/griffin/engine/table/AsyncGroupByNotKeyedRecordCursorFactory.java | 24 | 24 | 100.00% |
| 🔵 io/questdb/griffin/engine/table/AsyncGroupByRecordCursorFactory.java | 31 | 31 | 100.00% |
| 🔵 io/questdb/cairo/sql/async/PageFrameReduceTask.java | 8 | 8 | 100.00% |
| 🔵 io/questdb/cairo/sql/PageFrameMemoryPool.java | 12 | 12 | 100.00% |

@puzpuzpuz
Contributor Author

@nwoolmer it's all green now.

@bluestreak01 bluestreak01 merged commit e4d5543 into master Nov 12, 2025
36 checks passed
@bluestreak01 bluestreak01 deleted the puzpuzpuz_parquet_oom branch November 12, 2025 16:02