feat(sql): weighted average and standard deviation#6457
feat(sql): weighted average and standard deviation#6457bluestreak01 merged 36 commits intomasterfrom
Conversation
|
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the WalkthroughThe PR extends QuestDB's GROUP BY aggregation subsystem by introducing weighted average and weighted standard deviation functions with parallel-merge support. Key changes include adding merge logic to AbstractStdDevGroupByFunction, introducing new weighted aggregate implementations (AbstractWeightedStdDevGroupByFunction, WeightedAvgDoubleGroupByFunction), removing merge logic from Pop/Sample variance functions in favor of centralized implementations, and adding comprehensive test coverage. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes
Possibly related PRs
Suggested labels
Suggested reviewers
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
core/src/test/resources/sqllogictest/test/sql/test_stddev.test (1)
120-120: Potential formatting inconsistency: mixed tab and space separators.Line 120 appears to use a tab character between
NULLvalues (NULL NULL\tNULL) while other lines use spaces. This may cause test failures or maintenance confusion.Verify and align the separator to match the expected format:
-3 NULL NULL NULL +3 NULL NULL NULL
♻️ Duplicate comments (2)
core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/VarSampleGroupByFunctionFactoryTest.java (1)
133-133: Duplicate: Verify long256 test removal.Similar to
VarPopGroupByFunctionFactoryTest, thetestVarSampleLong256Values()test was removed. This is covered by the verification script in the previous file's review comment.The EOF newline addition is appropriate.
core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/VarGroupByFunctionFactoryTest.java (1)
133-133: Duplicate: Verify long256 test removal.The
testVarLong256Values()test was removed, consistent with the pattern inVarPopGroupByFunctionFactoryTestandVarSampleGroupByFunctionFactoryTest. This is covered by the verification script in the first file's review.The EOF newline addition is appropriate.
🧹 Nitpick comments (2)
core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/WeightedStdDevRelGroupByFunctionFactoryTest.java (1)
30-206: Comprehensive test coverage for the new weighted stddev functions.The test suite covers essential scenarios including null handling, edge cases (empty table, single row), type variations (double, float, int), and large values. Both
weighted_stddev_reland its aliasweighted_stddevare validated in parallel.Consider adding a test for negative weights to verify the function's behavior, as negative weights could arise from user data and the current implementation doesn't explicitly reject them.
core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/AvgDoubleGroupByFunctionFactoryTest.java (1)
80-87: SQL column alias formatting could be clearer.The column alias
valappears on a separate line from theCASE...ENDexpression. While this may be valid QuestDB syntax, it's unusual and could be mistaken for a syntax error.Consider making the alias more explicit:
execute(""" CREATE TABLE test2 AS ( SELECT CASE WHEN rnd_double() > 0.6 then 1.0 ELSE 0.0 - END - val FROM long_sequence(100) + END AS val FROM long_sequence(100) )""");
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (23)
core/src/main/java/io/questdb/griffin/ExpressionParser.java(1 hunks)core/src/main/java/io/questdb/griffin/engine/functions/GroupByFunction.java(1 hunks)core/src/main/java/io/questdb/griffin/engine/functions/groupby/AbstractStdDevGroupByFunction.java(2 hunks)core/src/main/java/io/questdb/griffin/engine/functions/groupby/AbstractWeightedStdDevGroupByFunction.java(1 hunks)core/src/main/java/io/questdb/griffin/engine/functions/groupby/StdDevPopGroupByFunctionFactory.java(0 hunks)core/src/main/java/io/questdb/griffin/engine/functions/groupby/StdDevSampleGroupByFunctionFactory.java(0 hunks)core/src/main/java/io/questdb/griffin/engine/functions/groupby/VarSampleGroupByFunctionFactory.java(0 hunks)core/src/main/java/io/questdb/griffin/engine/functions/groupby/WeightedAvgDoubleGroupByFunction.java(1 hunks)core/src/main/java/io/questdb/griffin/engine/functions/groupby/WeightedAvgDoubleGroupByFunctionFactory.java(1 hunks)core/src/main/java/io/questdb/griffin/engine/functions/groupby/WeightedStdDevFrequencyGroupByFunctionFactory.java(1 hunks)core/src/main/java/io/questdb/griffin/engine/functions/groupby/WeightedStdDevGroupByFunctionFactory.java(1 hunks)core/src/main/java/io/questdb/griffin/engine/functions/groupby/WeightedStdDevReliabilityGroupByFunctionFactory.java(1 hunks)core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/AvgDoubleGroupByFunctionFactoryTest.java(2 hunks)core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/StdDevGroupByFunctionFactoryTest.java(1 hunks)core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/StdDevPopGroupByFunctionFactoryTest.java(2 hunks)core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/StdDevSampleGroupByFunctionFactoryTest.java(2 hunks)core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/VarGroupByFunctionFactoryTest.java(1 hunks)core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/VarPopGroupByFunctionFactoryTest.java(1 hunks)core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/VarSampleGroupByFunctionFactoryTest.java(1 hunks)core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/WeightedStdDevFreqGroupByFunctionFactoryTest.java(1 hunks)core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/WeightedStdDevRelGroupByFunctionFactoryTest.java(1 hunks)core/src/test/java/io/questdb/test/sqllogictest/SqlTest.java(2 hunks)core/src/test/resources/sqllogictest/test/sql/test_stddev.test(1 hunks)
💤 Files with no reviewable changes (3)
- core/src/main/java/io/questdb/griffin/engine/functions/groupby/StdDevSampleGroupByFunctionFactory.java
- core/src/main/java/io/questdb/griffin/engine/functions/groupby/StdDevPopGroupByFunctionFactory.java
- core/src/main/java/io/questdb/griffin/engine/functions/groupby/VarSampleGroupByFunctionFactory.java
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-11-19T12:21:00.062Z
Learnt from: jerrinot
Repo: questdb/questdb PR: 6413
File: core/src/test/java/io/questdb/test/cutlass/pgwire/PGJobContextTest.java:11982-12002
Timestamp: 2025-11-19T12:21:00.062Z
Learning: QuestDB Java tests use a deterministic random seed. The test utilities (e.g., io.questdb.test.tools.TestUtils and io.questdb.std.Rnd) produce reproducible sequences, so rnd_* functions (including rnd_uuid4) yield deterministic outputs across runs. Do not flag tests in core/src/test/** that assert against values produced by rnd_* as flaky due to randomness.
Applied to files:
core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/StdDevGroupByFunctionFactoryTest.javacore/src/test/java/io/questdb/test/griffin/engine/functions/groupby/StdDevPopGroupByFunctionFactoryTest.javacore/src/test/java/io/questdb/test/griffin/engine/functions/groupby/StdDevSampleGroupByFunctionFactoryTest.javacore/src/test/java/io/questdb/test/sqllogictest/SqlTest.java
📚 Learning: 2025-08-04T09:16:27.366Z
Learnt from: jerrinot
Repo: questdb/questdb PR: 5996
File: core/src/test/java/io/questdb/test/cutlass/http/line/LineHttpSenderTest.java:350-356
Timestamp: 2025-08-04T09:16:27.366Z
Learning: In QuestDB ILP tests, maximum dimension values like `(1 << 28) - 1` are intentionally hard-coded instead of using constants like `ArrayView.DIM_MAX_LEN` to serve as "canaries" that will cause test failures if the API limits change, ensuring developers are aware of breaking changes to the ILP API.
Applied to files:
core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/StdDevSampleGroupByFunctionFactoryTest.java
🧬 Code graph analysis (3)
core/src/main/java/io/questdb/griffin/engine/functions/groupby/WeightedAvgDoubleGroupByFunctionFactory.java (2)
core/src/main/java/io/questdb/std/IntList.java (1)
IntList(34-410)core/src/main/java/io/questdb/std/ObjList.java (1)
ObjList(34-395)
core/src/main/java/io/questdb/griffin/engine/functions/groupby/AbstractWeightedStdDevGroupByFunction.java (3)
core/src/main/java/io/questdb/cairo/ArrayColumnTypes.java (1)
ArrayColumnTypes(30-62)core/src/main/java/io/questdb/griffin/engine/functions/DoubleFunction.java (1)
DoubleFunction(41-236)core/src/main/java/io/questdb/std/Numbers.java (1)
Numbers(44-3523)
core/src/main/java/io/questdb/griffin/engine/functions/groupby/WeightedAvgDoubleGroupByFunction.java (3)
core/src/main/java/io/questdb/cairo/ArrayColumnTypes.java (1)
ArrayColumnTypes(30-62)core/src/main/java/io/questdb/griffin/engine/functions/DoubleFunction.java (1)
DoubleFunction(41-236)core/src/main/java/io/questdb/std/Numbers.java (1)
Numbers(44-3523)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (33)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-other)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-pgwire)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-cairo-sub)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-cairo-root)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-fuzz2)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-fuzz1)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-griffin-sub)
- GitHub Check: New pull request (SelfHosted Cairo tests on linux-x64-zfs)
- GitHub Check: New pull request (SelfHosted Cairo tests on linux-x86-graal)
- GitHub Check: New pull request (SelfHosted Cairo tests on linux-arm64)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-griffin-root)
- GitHub Check: New pull request (Rust Test and Lint on linux-jdk17)
- GitHub Check: New pull request (Hosted Running tests on windows-other-2)
- GitHub Check: New pull request (Hosted Running tests on windows-other-1)
- GitHub Check: New pull request (Hosted Running tests on windows-pgwire)
- GitHub Check: New pull request (Hosted Running tests on windows-cairo-2)
- GitHub Check: New pull request (Hosted Running tests on windows-cairo-1)
- GitHub Check: New pull request (Hosted Running tests on windows-fuzz2)
- GitHub Check: New pull request (Hosted Running tests on windows-fuzz1)
- GitHub Check: New pull request (Hosted Running tests on windows-griffin-sub)
- GitHub Check: New pull request (Hosted Running tests on windows-griffin-base)
- GitHub Check: New pull request (Hosted Running tests on mac-other)
- GitHub Check: New pull request (Hosted Running tests on mac-pgwire)
- GitHub Check: New pull request (Hosted Running tests on mac-cairo-fuzz)
- GitHub Check: New pull request (Hosted Running tests on mac-cairo)
- GitHub Check: New pull request (SelfHosted Other tests on linux-x64-zfs)
- GitHub Check: New pull request (SelfHosted Other tests on linux-x86-graal)
- GitHub Check: New pull request (SelfHosted Other tests on linux-arm64)
- GitHub Check: New pull request (Hosted Running tests on mac-griffin)
- GitHub Check: New pull request (SelfHosted Griffin tests on linux-arm64)
- GitHub Check: New pull request (SelfHosted Griffin tests on linux-x86-graal)
- GitHub Check: New pull request (SelfHosted Griffin tests on linux-x64-zfs)
- GitHub Check: New pull request (Check Changes Check changes)
🔇 Additional comments (24)
core/src/main/java/io/questdb/griffin/engine/functions/GroupByFunction.java (1)
193-196: Based on my verification, I found that:
The comment is accurate -
setEmptyis indeed called by generated code. TheGroupByFunctionsUpdaterFactoryuses ASM (bytecode manipulation library) to dynamically generate bytecode that invokessetEmptyviaasm.invokeInterface(setEmptyIndex, 1)(line 220).However, the review's supporting claims are misleading - The review states similar comments exist at lines 173, 198, 203, 208, 215, but examination of the file shows those lines contain comments like "// used when doing interpolation" (for setByte, setDecimal128, setDouble, etc.), not "// used by generated code". The
setEmptycomment is unique in documenting generated code usage.Here's my rewritten review comment:
Comment accurately documents generated code usage.
The
setEmptymethod is invoked by generated bytecode created byGroupByFunctionsUpdaterFactoryusing ASM (seeGroupByFunctionsUpdaterFactory.javaline 220:asm.invokeInterface(setEmptyIndex, 1)). The comment correctly distinguishes this method's usage pattern from other interface methods that are called during interpolation.core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/StdDevPopGroupByFunctionFactoryTest.java (2)
82-88: Huge-values stddev_pop test looks correct and valuableThe construction
100_000_000 * xoverlong_sequence(1_000_000)and the asserted result are mathematically consistent with the closed-form population stddev for 1..n scaled by 1e8. This gives good coverage for large magnitudes and checks numerical stability without introducing any apparent overflow or runtime concerns.
92-99: Int-valued stddev_pop coverage is clear and consistentUsing
cast(x as int)fromlong_sequence(100)with the existing expected stddev ensures consistent behavior across double/float/int types and keeps the test straightforward and deterministic. No issues here.core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/VarPopGroupByFunctionFactoryTest.java (1)
133-133: Long256 was never supported for variance functions—test removal is intentional and correct.The
var_popfunction signature isvar_pop(D)(double only), and long256 is not implemented in any variance function factory. The removal oftestVarPopLong256Values()aligns with the actual supported type coverage and is consistent with similar removals in related variance test files. No further action needed.The EOF newline addition is a standard formatting improvement.
core/src/main/java/io/questdb/griffin/ExpressionParser.java (1)
891-897: LGTM - tightening CAST context validation.The change correctly rejects CAST after dot-dereference (e.g.,
x.cast(...)) or after a literal, which are both invalid syntactic contexts. The simplified flow then unconditionally handles valid CAST positioning.core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/StdDevGroupByFunctionFactoryTest.java (1)
133-133: Verify intentional removal oftestStddevLong256Valuestest.The AI summary indicates this test was removed. Please confirm this is intentional and that
long256type support forstddev()is either deprecated/unsupported or covered elsewhere.core/src/main/java/io/questdb/griffin/engine/functions/groupby/WeightedStdDevGroupByFunctionFactory.java (1)
27-31: Clean alias implementation forweighted_stddev()→weighted_stddev_rel().The factory correctly inherits all behavior from the reliability-weighted variant while providing the alias signature.
core/src/main/java/io/questdb/griffin/engine/functions/groupby/AbstractStdDevGroupByFunction.java (2)
100-125: Correct implementation of Chan et al. parallel merge algorithm.The merge logic correctly implements the formula for combining partial Welford aggregates. Edge cases are handled properly:
- When both counts are 0, division by zero produces NaN (acceptable for no-data state)
- When only one partition is empty, the formula correctly degenerates to preserve the non-empty partition's statistics
The inline comment explaining the numerically stable weighted mean choice (lines 114-119) is appreciated.
141-143: Parallelism enabled consistently with merge implementation.core/src/test/resources/sqllogictest/test/sql/test_stddev.test (1)
2-40: Good test coverage for the new weighted stddev functions.The SQL logic tests properly validate both
weighted_stddev_relandweighted_stddev_freqfunctions with scalar, aggregate, and grouped scenarios, including proper NULL handling.core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/StdDevSampleGroupByFunctionFactoryTest.java (1)
82-99: Test refactoring looks correct.The renamed test
testStddevSampHugeValuesproperly validates the function with large values (100M scale over 1M rows), and the repurposedtestStddevSampIntValuestest now focuses on int type validation.core/src/main/java/io/questdb/griffin/engine/functions/groupby/WeightedAvgDoubleGroupByFunctionFactory.java (1)
47-56: Factory implementation follows QuestDB patterns correctly.The signature
weighted_avg(DD)correctly indicates two double arguments, and the factory properly delegates to the implementation class. The use ofgetQuickis safe here as the function signature guarantees exactly two arguments.core/src/main/java/io/questdb/griffin/engine/functions/groupby/AbstractWeightedStdDevGroupByFunction.java (2)
38-91: Excellent documentation with formula references.The class documentation clearly explains the two variance estimators (reliability vs frequency weights) with equivalent SQL expressions and Wikipedia references. This will help future maintainers understand the algorithm.
102-145: Incremental algorithm implementation looks correct.The
computeFirstandcomputeNextmethods correctly implement the weighted incremental variance algorithm. The null/non-finite/zero-weight checks properly skip invalid records while maintaining consistent state.core/src/main/java/io/questdb/griffin/engine/functions/groupby/WeightedStdDevFrequencyGroupByFunctionFactory.java (1)
59-86: Frequency weight implementation is correct.The
getDoublemethod correctly implements the frequency weight formula with divisorwSum - 1(Bessel's correction). Thedivisor <= 0.0guard properly returns NaN when insufficient data exists for a valid calculation.core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/WeightedStdDevFreqGroupByFunctionFactoryTest.java (2)
132-141: Different single-row behavior between freq and rel variants is worth noting.The
testWeightedStddevFreqOneRowtest expects0.0while the equivalentweighted_stddev_reltest expectsnull. This is mathematically correct because:
freq: divisor =wSum - 1=17.2... - 1> 0, so returnssqrt(0/divisor) = 0rel: divisor =wSum - wSum²/wSum= 0, so returns NaNThis behavioral difference may surprise users. Consider documenting this distinction.
30-154: Good test coverage for weighted_stddev_freq.The test suite comprehensively covers various scenarios including type coercion, null handling, and edge cases, mirroring the coverage for the reliability weight variant.
core/src/main/java/io/questdb/griffin/engine/functions/groupby/WeightedStdDevReliabilityGroupByFunctionFactory.java (2)
65-76: Reliability weight formula implementation is correct.The
getDoublemethod correctly implements the reliability weight divisor formulawSum - wSum²/wSum. WhenwSum=0(all null data), the division produces NaN which propagates correctly to the final result.
37-87: Factory implementation follows established patterns.The factory correctly declares the
weighted_stddev_rel(DD)signature and properly instantiates the inner function class with sample and weight arguments.core/src/main/java/io/questdb/griffin/engine/functions/groupby/WeightedAvgDoubleGroupByFunction.java (2)
71-74: Approve: Division-by-zero handling is correct.When
total_weightis zero (no valid samples), dividing by zero producesNaN, which correctly represents a null result for the weighted average. This is consistent with thesetNullmethod behavior and other aggregate functions in the codebase.
123-131: Merge logic is correct for parallel aggregation.The merge operation correctly sums both the weighted sums and total weights from source and destination, enabling proper parallel execution support.
core/src/test/java/io/questdb/test/griffin/engine/functions/groupby/AvgDoubleGroupByFunctionFactoryTest.java (3)
139-161: Excellent test coverage for weighted_avg correctness.The test validates
weighted_avg(x, x)against the equivalent formulasum(x*x)/sum(x), confirming the implementation is mathematically correct. The second assertion with random weights further validates the function handles varying weights properly.
163-193: Good edge case coverage for zero and negative weights.The tests properly validate:
- Zero weight for all samples returns
null- Conditional weighting (only weight=1 for x=7) returns the single weighted value
- Weights that sum to zero (negative + positive) returns
nullThis covers important edge cases for the weighted average implementation.
195-244: Strong formula validation for weighted standard deviation functions.Testing
weighted_stddev_relandweighted_stddev_freqagainst their explicit mathematical formulas provides excellent confidence in the implementation correctness. The use ofround(..., 4)appropriately handles floating-point precision for comparison.
.../java/io/questdb/griffin/engine/functions/groupby/AbstractWeightedStdDevGroupByFunction.java
Show resolved
Hide resolved
.../main/java/io/questdb/griffin/engine/functions/groupby/WeightedAvgDoubleGroupByFunction.java
Show resolved
Hide resolved
[PR Coverage check]😍 pass : 170 / 187 (90.91%) file detail
|
https://github.com/questdb/questdb-enterprise/pull/791
Adds
weighted_avg(),weighted_stddev_rel(),weighted_stddev_freq(). Adds aliasweighted_stddev()toweighted_stddev_rel().According to Wikipedia, there are two variants for the unbiased estimator of population variance based on a subset of it (a sample): one for frequency weights, and another for reliability weights.
A frequency weight represents the number of occurrences of the sample in the dataset. A reliability weight represents the "importance" or "trustworthiness" of the given sample.
We implement both functions, called
weighted_stddev_relfor reliability weights, andweighted_stddev_freqfor frequency weights. We also define the shorthandweighted_stddevforweighted_stddev_rel.These two:
calculate the equivalent of
This one:
calculates the equivalent of
Benchmark
We're benchmarking the markout curve query, which is the motivator for the new aggregation functions.
Data setup:
priceshas 288 million rows.ordershas 86,400 rows.The CROSS JOIN outputs 103,680,000 rows, then we perform two ASOF JOIN lookups per row, for a total of 207,360,000 ASOF JOIN lookups. Then we perform two aggregations (
weighted_avgandweighted_stddev).This takes 42 seconds on EC2,
r7a.4xlarge.