Skip to content

fix(sql): error on group by with many aggregate functions#6689

Merged
bluestreak01 merged 4 commits intomasterfrom
puzpuzpuz_fix_huge_group_by
Jan 22, 2026
Merged

fix(sql): error on group by with many aggregate functions#6689
bluestreak01 merged 4 commits intomasterfrom
puzpuzpuz_fix_huge_group_by

Conversation

@puzpuzpuz
Copy link
Copy Markdown
Contributor

@puzpuzpuz puzpuzpuz commented Jan 21, 2026

Fixes #3326

This PR fixes bug where GROUP BY queries with many aggregate functions fail with "Bytecode is too long" error, and includes significant performance optimizations for the SQL optimizer to speed up query compilation. We had a number of places with O(n²) complexity.

For instance, the following query:

SELECT avg(value), avg(value + 1), avg(value + 2), ... -- 6K functions in total
FROM my_table;

was taking 4 seconds to compile. With this patch, the compilation takes around 300ms.

List of Changes

1. GroupByFunctionsUpdaterFactory Fallback (core fix)

  • Added SimpleGroupByFunctionUpdater - a loop-based fallback implementation that avoids JVM bytecode limits
  • When aggregate function count exceeds 32, uses the simple loop-based updater instead of generating bytecode
  • Threshold determined via JMH benchmarking: bytecode is faster below 32 functions, loop is faster above

2. O(1) Amortized Alias Generation (performance)

  • Before: Alias generation was O(n²) - for each new alias, it scanned through sequence numbers 1, 2, 3... until finding an unused one
  • After: Added LowerCaseCharSequenceIntHashMap to track the next sequence number for each base alias
  • Added aliasSequenceMap to QueryModel (per-model state) and separate maps for pivot/cursor aliases in SqlOptimiser and SqlParser

3. O(n) Duplicate Aggregate Detection (performance)

  • Before: O(n²) pairwise compareNodesExact() comparison to detect duplicate aggregates
  • After: Hash-based detection using ExpressionNode.deepHashCode() - only compares nodes when hashes collide
  • Added detectDuplicateAggregates() method in SqlOptimiser

4. Optimized findColumnByAst (performance)

  • Moved findColumnByAst() call inside the if (useOuterModel) check
  • Avoids unnecessary O(n) searches when the result won't be used

@puzpuzpuz puzpuzpuz self-assigned this Jan 21, 2026
@puzpuzpuz puzpuzpuz added Bug Incorrect or unexpected behavior SQL Issues or changes relating to SQL execution Performance Performance improvements labels Jan 21, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Jan 21, 2026

Important

Review skipped

Auto reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@puzpuzpuz
Copy link
Copy Markdown
Contributor Author

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Jan 21, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@glasstiger
Copy link
Copy Markdown
Contributor

[PR Coverage check]

😍 pass : 159 / 163 (97.55%)

file detail

path covered line new line coverage
🔵 io/questdb/griffin/engine/groupby/GroupByFunctionsUpdaterFactory.java 23 26 88.46%
🔵 io/questdb/griffin/SqlOptimiser.java 92 93 98.92%
🔵 io/questdb/griffin/SqlParser.java 7 7 100.00%
🔵 io/questdb/griffin/engine/join/AsyncWindowJoinAtom.java 1 1 100.00%
🔵 io/questdb/griffin/SqlCodeGenerator.java 4 4 100.00%
🔵 io/questdb/griffin/engine/table/AsyncGroupByAtom.java 1 1 100.00%
🔵 io/questdb/griffin/model/QueryModel.java 5 5 100.00%
🔵 io/questdb/griffin/engine/table/AsyncGroupByNotKeyedAtom.java 3 3 100.00%
🔵 io/questdb/griffin/SqlUtil.java 23 23 100.00%

Copy link
Copy Markdown
Contributor

@nwoolmer nwoolmer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, this makes it 3/3 places fixed. Let's triple check there's no fourth area we missed!

@bluestreak01 bluestreak01 merged commit fbd2d55 into master Jan 22, 2026
43 checks passed
@bluestreak01 bluestreak01 deleted the puzpuzpuz_fix_huge_group_by branch January 22, 2026 21:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Bug Incorrect or unexpected behavior Performance Performance improvements SQL Issues or changes relating to SQL execution

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Too Much Input Exception When SELECTing Many AVG Columns

4 participants