Skip to content

fix(sql): potential segfault on vectorized GROUP BY query timeout#6667

Merged
bluestreak01 merged 3 commits intomasterfrom
puzpuzpuz_vect_groupby_timeout_segfault
Jan 19, 2026
Merged

fix(sql): potential segfault on vectorized GROUP BY query timeout#6667
bluestreak01 merged 3 commits intomasterfrom
puzpuzpuz_vect_groupby_timeout_segfault

Conversation

@puzpuzpuz
Copy link
Copy Markdown
Contributor

@puzpuzpuz puzpuzpuz commented Jan 19, 2026

When a timeout occurred in a vectorized parallel GROUP BY query like SELECT sum(price) FROM trades WHERE ts IN today();, a segfault could happen crashing QuestDB process. That's because frame memory pools used by worker threads were released without properly waiting for all tasks to be processed, i.e. concurrently with the aggregation.

@puzpuzpuz puzpuzpuz self-assigned this Jan 19, 2026
@puzpuzpuz puzpuzpuz added Bug Incorrect or unexpected behavior SQL Issues or changes relating to SQL execution labels Jan 19, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Jan 19, 2026

Walkthrough

Memory cleanup in vectorized group-by cursor factories is refactored to move resource deallocation from catch blocks to finally blocks, ensuring consistent cleanup regardless of error state. New timeout tests are added for vectorized group-by scenarios, and certain existing test cases are disabled.

Changes

Cohort / File(s) Summary
Memory cleanup refactoring in vector groupby factories
core/src/main/java/io/questdb/griffin/engine/groupby/vect/GroupByNotKeyedVectorRecordCursorFactory.java, core/src/main/java/io/questdb/griffin/engine/groupby/vect/GroupByRecordCursorFactory.java
Moved per-worker page frame memory release from catch block to finally block to ensure deallocation after processing completes. GroupByRecordCursorFactory also adds circuit breaker check at end of frame processing loop.
New vectorized timeout tests
core/src/test/java/io/questdb/test/cairo/fuzz/ParallelGroupByFuzzTest.java
Added testParallelNonKeyedVectGroupByThrowsOnTimeout and testParallelVectGroupByThrowsOnTimeout to validate timeout behavior in vectorized paths. Removed JIT enablement assumption from existing timeout test. Schema updated to include id INT column in relevant timeout test cases.
Disabled test cases
core/src/test/java/io/questdb/test/cutlass/suspend/TestCases.java
Commented out multiple query test scenarios including subquery, timestamp equality, sorting, distinct, and various group-by related tests across different cursor factories.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

  • bluestreak01
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately describes the main bug fix: preventing a segfault in vectorized GROUP BY queries when a timeout occurs, which aligns with the changeset's core modifications to memory release timing.
Description check ✅ Passed The description directly addresses the changeset by explaining the root cause (premature memory pool release) and the impact (potential segfault), which matches the core changes moving memory cleanup to finally blocks.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@puzpuzpuz
Copy link
Copy Markdown
Contributor Author

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Jan 19, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@glasstiger
Copy link
Copy Markdown
Contributor

[PR Coverage check]

😍 pass : 3 / 3 (100.00%)

file detail

path covered line new line coverage
🔵 io/questdb/griffin/engine/groupby/vect/GroupByNotKeyedVectorRecordCursorFactory.java 1 1 100.00%
🔵 io/questdb/griffin/engine/groupby/vect/GroupByRecordCursorFactory.java 2 2 100.00%

@bluestreak01 bluestreak01 merged commit c3269a5 into master Jan 19, 2026
43 checks passed
@bluestreak01 bluestreak01 deleted the puzpuzpuz_vect_groupby_timeout_segfault branch January 19, 2026 21:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Bug Incorrect or unexpected behavior SQL Issues or changes relating to SQL execution

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants