feat(sql): implement EMA window function via avg()#6650
Conversation
|
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the WalkthroughAdds an EMA window function factory and tests, adjusts WindowContext API to expose timestamp type instead of random-access flag, updates SqlExecutionContextImpl callsite, and adds two keyword checks for "alpha" and "weeks". Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested labels
Suggested reviewers
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@CodeRabbit review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
core/src/main/java/io/questdb/griffin/engine/window/WindowContextImpl.java (1)
56-77: Missing reset oftimestampTypeinclear()method.The new
timestampTypefield is not being reset in theclear()method, unlike other similar fields such astimestampIndex(line 74). This inconsistency could lead to stale values persisting if the context is cleared and reused.Proposed fix
this.timestampIndex = -1; this.ignoreNulls = false; this.nullsDescPos = 0; + this.timestampType = ColumnType.UNDEFINED; }
🧹 Nitpick comments (2)
core/src/main/java/io/questdb/griffin/engine/functions/window/EmaDoubleWindowFunctionFactory.java (1)
168-178: CharSequence may be backed by a reusable buffer; convert to String for safe storage.The
kindCharSequence fromkindArg.getStrA(null)is passed directly to the constructor and stored inkindStr. If the underlying buffer is reused during query execution, this could lead to corrupted plan output.Suggested fix
if (mode == MODE_TIME_WEIGHTED) { return new EmaTimeWeightedOverPartitionFunction( map, partitionByRecord, partitionBySink, valueArg, timestampIndex, tau, - kind, + kind.toString(), paramValue );Apply the same fix at line 197 for
EmaTimeWeightedOverUnboundedRowsFrameFunction.core/src/test/java/io/questdb/test/griffin/engine/window/WindowFunctionTest.java (1)
3414-3416: Consider relocating section comment.This section header for "AVG with EMA mode" tests appears at line 3414, but the actual EMA tests are located much earlier (lines 231-1186). Consider moving this comment to immediately precede the EMA tests for better code organization.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
core/src/main/java/io/questdb/griffin/SqlExecutionContextImpl.javacore/src/main/java/io/questdb/griffin/SqlKeywords.javacore/src/main/java/io/questdb/griffin/engine/functions/window/EmaDoubleWindowFunctionFactory.javacore/src/main/java/io/questdb/griffin/engine/window/WindowContext.javacore/src/main/java/io/questdb/griffin/engine/window/WindowContextImpl.javacore/src/test/java/io/questdb/test/griffin/engine/window/WindowFunctionTest.java
💤 Files with no reviewable changes (1)
- core/src/main/java/io/questdb/griffin/SqlExecutionContextImpl.java
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-11-10T14:28:48.329Z
Learnt from: mtopolnik
Repo: questdb/questdb PR: 0
File: :0-0
Timestamp: 2025-11-10T14:28:48.329Z
Learning: In AsOfJoinDenseRecordCursorFactoryBase.java, the `backwardScanExhausted` flag is intentionally NOT reset in `toTop()` because backward scan results are reusable across cursor rewinds. The backward scan caches historical matches that remain valid when the cursor is rewound.
Applied to files:
core/src/main/java/io/questdb/griffin/engine/window/WindowContextImpl.java
📚 Learning: 2025-11-19T12:21:00.062Z
Learnt from: jerrinot
Repo: questdb/questdb PR: 6413
File: core/src/test/java/io/questdb/test/cutlass/pgwire/PGJobContextTest.java:11982-12002
Timestamp: 2025-11-19T12:21:00.062Z
Learning: QuestDB Java tests use a deterministic random seed. The test utilities (e.g., io.questdb.test.tools.TestUtils and io.questdb.std.Rnd) produce reproducible sequences, so rnd_* functions (including rnd_uuid4) yield deterministic outputs across runs. Do not flag tests in core/src/test/** that assert against values produced by rnd_* as flaky due to randomness.
Applied to files:
core/src/test/java/io/questdb/test/griffin/engine/window/WindowFunctionTest.java
🧬 Code graph analysis (2)
core/src/main/java/io/questdb/griffin/engine/functions/window/EmaDoubleWindowFunctionFactory.java (6)
core/src/main/java/io/questdb/cairo/map/MapFactory.java (1)
MapFactory(34-145)core/src/main/java/io/questdb/cairo/sql/VirtualRecord.java (1)
VirtualRecord(40-285)core/src/main/java/io/questdb/griffin/SqlException.java (1)
SqlException(37-273)core/src/main/java/io/questdb/griffin/SqlKeywords.java (1)
SqlKeywords(32-2508)core/src/main/java/io/questdb/std/IntList.java (1)
IntList(34-410)core/src/main/java/io/questdb/std/ObjList.java (1)
ObjList(34-403)
core/src/test/java/io/questdb/test/griffin/engine/window/WindowFunctionTest.java (1)
core/src/main/java/io/questdb/griffin/engine/functions/window/EmaDoubleWindowFunctionFactory.java (1)
EmaDoubleWindowFunctionFactory(70-635)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (35)
- GitHub Check: New pull request (Coverage Report Coverage Report)
- GitHub Check: New pull request (Hosted Running tests on windows-other-2)
- GitHub Check: New pull request (Hosted Running tests on windows-other-1)
- GitHub Check: New pull request (Hosted Running tests on windows-pgwire)
- GitHub Check: New pull request (Hosted Running tests on windows-cairo-2)
- GitHub Check: New pull request (Hosted Running tests on windows-cairo-1)
- GitHub Check: New pull request (Hosted Running tests on windows-fuzz2)
- GitHub Check: New pull request (Hosted Running tests on windows-fuzz1)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-other)
- GitHub Check: New pull request (Hosted Running tests on windows-griffin-sub)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-pgwire)
- GitHub Check: New pull request (Hosted Running tests on windows-griffin-base)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-cairo-sub)
- GitHub Check: New pull request (Hosted Running tests on mac-other)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-cairo-root)
- GitHub Check: New pull request (Hosted Running tests on mac-pgwire)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-fuzz2)
- GitHub Check: New pull request (Hosted Running tests on mac-cairo-fuzz)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-fuzz1)
- GitHub Check: New pull request (Hosted Running tests on mac-cairo)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-griffin-sub)
- GitHub Check: New pull request (Trigger Enterprise CI Trigger Enterprise Pipeline)
- GitHub Check: New pull request (SelfHosted Running tests with cover on linux-griffin-root)
- GitHub Check: New pull request (Rust Test and Lint on linux-jdk17)
- GitHub Check: New pull request (SelfHosted Other tests on linux-x86-graal)
- GitHub Check: New pull request (Hosted Running tests on mac-griffin)
- GitHub Check: New pull request (SelfHosted Griffin tests on linux-x86-graal)
- GitHub Check: New pull request (SelfHosted Other tests on linux-x64-zfs)
- GitHub Check: New pull request (SelfHosted Griffin tests on linux-x64-zfs)
- GitHub Check: New pull request (SelfHosted Griffin tests on linux-arm64)
- GitHub Check: New pull request (SelfHosted Cairo tests on linux-x64-zfs)
- GitHub Check: New pull request (SelfHosted Other tests on linux-arm64)
- GitHub Check: New pull request (SelfHosted Cairo tests on linux-arm64)
- GitHub Check: New pull request (SelfHosted Cairo tests on linux-x86-graal)
- GitHub Check: New pull request (Check Changes Check changes)
🔇 Additional comments (24)
core/src/main/java/io/questdb/griffin/SqlKeywords.java (2)
63-70: LGTM!The
isAlphaKeywordmethod correctly implements case-insensitive detection of "alpha" following the established pattern used throughout this class.
2296-2303: LGTM!The
isWeeksKeywordmethod correctly implements case-insensitive detection of "weeks" and appropriately complements the existingisWeekKeywordmethod (singular form) following the same pattern used for other time-unit keywords (days/day, hours/hour, etc.).core/src/main/java/io/questdb/griffin/engine/window/WindowContext.java (1)
71-72: LGTM!The new
getTimestampType()method is a clean addition to the interface, enabling timestamp precision-aware operations for the EMA window function. This appropriately replaces the removedbaseSupportsRandomAccess()with a more specific API for timestamp type handling.core/src/main/java/io/questdb/griffin/engine/window/WindowContextImpl.java (2)
142-145: LGTM!Standard getter implementation following the established pattern in this class.
175-219: LGTM!The
timestampTypeparameter is correctly used for timestamp driver conversions (lines 205-212) and properly assigned to the instance field. The integration withColumnType.getTimestampDriver()enables proper handling of both microsecond and nanosecond timestamp precision for the EMA window function.core/src/main/java/io/questdb/griffin/engine/functions/window/EmaDoubleWindowFunctionFactory.java (5)
400-410: LGTM on time-weighted alpha calculation.The edge case handling for
dt <= 0(same timestamp or backwards time) withalpha = 1.0is reasonable—it effectively replaces the EMA with the new value when time doesn't advance. The exponential decay formula is correctly implemented.
458-531: LGTM on non-partitioned fixed-alpha EMA.State management with
reset()andtoTop()properly reinitializes the EMA state. NULL handling correctly preserves the previous EMA value.
533-627: LGTM on non-partitioned time-weighted EMA.The implementation is consistent with the partitioned version. State reset in
reset()andtoTop()correctly handlesprevTimestamp. The unconditional timestamp update at line 582 ensures time decay continues across NULL values.
629-634: LGTM on static initializer.The column types correctly match the map value usage in the partitioned
computeNextmethods.
235-336: Map state is properly cleaned up by the base class.The
EmaOverPartitionFunctioncorrectly inherits map cleanup behavior fromBasePartitionedWindowFunction, which implements bothreset()(freeing the map) andtoTop()(clearing map contents). No stale partition data will persist across re-executions.core/src/test/java/io/questdb/test/griffin/engine/window/WindowFunctionTest.java (14)
37-37: LGTM!Import is correctly added and alphabetically ordered with other window function factory imports.
231-254: LGTM!Correctly tests the alpha=1.0 boundary case where EMA equals the current value.
256-284: LGTM!EMA calculations with alpha=0.5 are mathematically verified and correct.
286-317: LGTM!Partition-based EMA calculations are correct, verifying that each partition maintains independent state.
337-445: LGTM!Comprehensive exception handling tests covering all validation paths: alpha bounds, framing restrictions, infinity/NaN, invalid kinds, null checks, constant requirements, and ORDER BY requirements.
447-505: LGTM!Good coverage of EXPLAIN plan output for all EMA modes (alpha, period, time-weighted) with and without partitions.
507-645: LGTM!Comprehensive time unit tests covering all supported units (microsecond, millisecond, second, minute, hour, day, week) with good coverage of singular/plural and case-insensitive variants.
647-652: LGTM!Simple factory validation test ensuring EmaDoubleWindowFunctionFactory is correctly recognized as a window function.
654-668: LGTM!Good stress test for memory handling with 1000 rows across 10 partitions.
670-788: LGTM!Comprehensive NULL handling tests covering: NULL skipping with previous value retention, all-NULL partitions returning NaN, and first-row-NULL scenarios.
790-864: LGTM!Period mode tests correctly verify the alpha = 2/(N+1) conversion, including the edge case of period=1 (alpha=1.0).
866-1052: LGTM!Comprehensive time-weighted mode tests including state reset verification, same-timestamp edge case (dt=0 → alpha=1), single row, and partitioned scenarios with correct dt calculations.
1054-1159: LGTM!Important tests verifying that NULL values in time-weighted mode preserve the previous EMA while still updating the timestamp reference for subsequent dt calculations. This aligns with the implementation behavior documented in the referenced code lines.
1161-1186: LGTM!Correctly tests the dt=0 edge case in partitioned time-weighted mode where identical timestamps result in alpha=1.0.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
...src/main/java/io/questdb/griffin/engine/functions/window/EmaDoubleWindowFunctionFactory.java
Show resolved
Hide resolved
|
@CodeRabbit review again |
|
✅ Actions performedReview triggered.
|
[PR Coverage check]😍 pass : 240 / 246 (97.56%) file detail
|
Summary
avg(value, kind, param)syntax'alpha'- direct smoothing factor (0 < alpha ≤ 1)'period'- N-period EMA where alpha = 2 / (N + 1)'second','minute','hour','day','week') - time-weighted decay using alpha = 1 - exp(-Δt / τ)PARTITION BYfor computing separate EMAs per groupUsage Examples
Test plan
🤖 Generated with Claude Code