Skip to content

perf(sql): speed up avg() functions in non-keyed GROUP BY queries#6542

Merged
bluestreak01 merged 7 commits intomasterfrom
puzpuzpuz_faster_non_keyed_avg
Dec 16, 2025
Merged

perf(sql): speed up avg() functions in non-keyed GROUP BY queries#6542
bluestreak01 merged 7 commits intomasterfrom
puzpuzpuz_faster_non_keyed_avg

Conversation

@puzpuzpuz
Copy link
Copy Markdown
Contributor

@puzpuzpuz puzpuzpuz commented Dec 16, 2025

Removes redundant division from avg() function in case of parallel non-keyed GROUP BY, e.g.

SELECT AVG(price) FROM trades;

Native avg*Acc() functions were using avg += (v - avg) / c in each iteration while the outer Avg*VectorAggregateFunction Java function was multiplying the average and the count before storing them in the partial result (sum.add(value * count)).

Benchmarks

Test box: Ryzen 7900x, 64GB RAM, Ubuntu 24.04

CREATE TABLE IF NOT EXISTS 'x' ( 
        d DOUBLE,
        ts TIMESTAMP
) TIMESTAMP(ts) PARTITION BY DAY;

INSERT INTO x SELECT
    rnd_double() * 20 + 10 AS d,
    generate_series as ts
  FROM generate_series('2025-01-01', '2025-01-02', '172u');

-- before: 90ms
-- after: 74ms
SELECT AVG(d) FROM x;

With ClickBench's hits table the difference is:

-- before: 19ms
-- after: 15ms
SELECT AVG(UserID) FROM hits;

@puzpuzpuz puzpuzpuz self-assigned this Dec 16, 2025
@puzpuzpuz puzpuzpuz added SQL Issues or changes relating to SQL execution Performance Performance improvements labels Dec 16, 2025
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Dec 16, 2025

Important

Review skipped

Auto reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch puzpuzpuz_faster_non_keyed_avg

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

GitHub Actions - Rebuild Native Libraries and others added 3 commits December 16, 2025 17:07
bluestreak01
bluestreak01 previously approved these changes Dec 16, 2025
@glasstiger
Copy link
Copy Markdown
Contributor

[PR Coverage check]

😍 pass : 15 / 15 (100.00%)

file detail

path covered line new line coverage
🔵 io/questdb/griffin/engine/groupby/vect/AvgDoubleVectorAggregateFunction.java 5 5 100.00%
🔵 io/questdb/griffin/engine/groupby/vect/AvgIntVectorAggregateFunction.java 5 5 100.00%
🔵 io/questdb/griffin/engine/groupby/vect/AvgLongVectorAggregateFunction.java 5 5 100.00%

@bluestreak01 bluestreak01 merged commit 8021d57 into master Dec 16, 2025
39 checks passed
@bluestreak01 bluestreak01 deleted the puzpuzpuz_faster_non_keyed_avg branch December 16, 2025 19:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Performance Performance improvements SQL Issues or changes relating to SQL execution

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants