fix(sql): fix SAMPLE BY FILL crash with array column aggregates#6811
fix(sql): fix SAMPLE BY FILL crash with array column aggregates#6811bluestreak01 merged 8 commits intomasterfrom
Conversation
Fixes #6810 SAMPLE BY with FILL(NULL), FILL(value), and FILL(PREV) crashed when the query included array column aggregates (e.g., last(arr)). The fill record types did not implement getArray(), and the null/value fill factories did not handle array column types when constructing fill constants. Changes: - SampleByFillRecord: add getArray() override that delegates to the underlying function (covers FILL(NULL), FILL(PREV), FILL(LINEAR)). - SampleByFillNullRecordCursorFactory: handle array types in the null-constant switch by yielding NullConstant.NULL. - SampleByFillValueRecordCursorFactory: handle array types in the value-fill switch by yielding NullConstant.NULL (arrays cannot be filled with scalar values, so null is the correct fallback). - FillRangeRecordCursorFactory.FillRangeRecord: add getArray() override for the ALIGN TO CALENDAR / FROM ... TO code path. - Add tests covering all SAMPLE BY FILL modes with array columns, including ALIGN TO CALENDAR and FROM ... TO ... FILL(NULL) to exercise the FillRange path.
|
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
please @coderabbitai, review this bug fix critically |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: Length of output: 494 🏁 Script executed: Length of output: 12384 🏁 Script executed: Length of output: 3603 🏁 Script executed: Length of output: 0 🏁 Script executed: Length of output: 18620 🏁 Script executed: Length of output: 3004 🏁 Script executed: Length of output: 0 🏁 Script executed: Length of output: 1534 🏁 Script executed: Length of output: 7166 🏁 Script executed: Length of output: 1080 🏁 Script executed: Length of output: 105 🏁 Script executed: Length of output: 1421 🏁 Script executed: Length of output: 0 🏁 Script executed: Length of output: 3501 🏁 Script executed: Length of output: 0 🏁 Script executed: Length of output: 1866 🏁 Script executed: Length of output: 2538 🏁 Script executed: Length of output: 3785 🏁 Script executed: Length of output: 1679 🏁 Script executed: Length of output: 10571 🏁 Script executed: Length of output: 2654 🏁 Script executed: Length of output: 739 🏁 Script executed: Length of output: 1775 🏁 Script executed: Length of output: 0 🏁 Script executed: Length of output: 2040 🏁 Script executed: Length of output: 0 🏁 Script executed: Length of output: 383 🏁 Script executed: Length of output: 121 🏁 Script executed: Length of output: 0 🏁 Script executed: Length of output: 0 🏁 Script executed: Length of output: 781 🏁 Script executed: Length of output: 3890 🏁 Script executed: Length of output: 253
🔴 Critical Issues1. FILL(LINEAR) is NOT actually covered despite PR claimsYour PR description says the fix covers FILL(LINEAR), but
Why no test to verify the actual behavior? // InterpolationGroupByFunction.java:86
`@Override`
public ArrayView getArray(Record rec) {
throw new UnsupportedOperationException(); // ⚠️ Still crashes!
}2. FILL(value) silently returns NULL - terrible UXIn if (ColumnType.isArray(type)) {
yield NullConstant.NULL; // Silent failure!
}This is confusing behavior:
Why not throw a clear error? Other unsupported types throw 3. Test coverage is dangerously narrowYour 7 tests only cover:
What if INT[] or LONG[] arrays behave differently? You're assuming all array types work identically without verification. 4. Missing SampleByFirstLastRecordCursorFactory coverage?I verified that 5. FillRangeRecord.getArray() doesn't validate columnType parameter`@Override`
public ArrayView getArray(int col, int columnType) {
if (gapFilling) {
return getFillFunction(col).getArray(null);
} else {
return baseRecord.getArray(col, columnType);
}
}The 🟡 Design Questions
📋 Recommendations
`@Test`
public void testSampleByFillLinearWithArrayFails() throws Exception {
assertException(
"SELECT ts, last(arr) arr FROM tab SAMPLE BY 10s FILL(LINEAR)",
"linear interpolation is not supported for array columns"
);
}
if (ColumnType.isArray(type)) {
throw SqlException.$(recordFunctionPositions.getQuick(index),
"FILL with scalar values is not supported for array columns");
}
The core fix looks correct for the crash, but the implementation feels incomplete and the UX decisions (silent null returns) are questionable. You're plugging the immediate hole without addressing the broader design issues around how arrays should behave with FILL modes. What's your thinking on the FILL(value) → NULL behavior? Intentional feature or "good enough for now"? 🧠 Learnings used |
When a user writes FILL(42) with an array column aggregate, throw a clear SqlException pointing at the fill value instead of silently substituting NULL. The error message directs users to use FILL(NULL) explicitly. Also add a negative test for FILL(LINEAR) with array columns, which is already rejected by GroupByUtils via the SAMPLE_BY_FILL_LINEAR flag check on the group-by function.
|
@coderabbitai Thanks for the thorough review. Here's what was addressed and what was not: Finding 1 - FILL(LINEAR) not covered: This is already handled at compile time by Finding 2 - FILL(value) silently returns NULL: Agreed, this was wrong. Changed Finding 3 - Test coverage for other array types (INT[], LONG[], etc.): QuestDB currently only supports DOUBLE[] for array group-by functions ( Finding 4 - SampleByFirstLastRecordCursorFactory: This factory is only used for SAMPLE BY without FILL modes, so it does not need Finding 5 - FillRangeRecord.getArray() ignores columnType: This is consistent with all other getter methods in fill records - the fill function provides the value directly and does not need the column type parameter. The |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain
🏁 Script executed: Length of output: 2674
All findings addressed comprehensively:
Excellent fixes. The crash path is fully plugged with proper error handling instead of silent failures. ✏️ Learnings added
🧠 Learnings used |
The SampleByFirstLastRecordCursorFactory optimized path for first()/last() with a single SYMBOL key did not reject array column arguments. The guard function allGroupsFirstLastWithSingleSymbolFilter now checks ColumnType.isArray() on first()/last() arguments, falling back to the general SAMPLE BY path which handles arrays correctly. The defensive check in buildFirstLastIndex now rejects all variable-width types (pow2SizeOf < 0) instead of only checking pow2SizeOf > 3, which missed arrays, strings, varchar, and binary (all have pow2SizeOf = -1). Also fixes the VALUE fill validation in GroupByUtils to only reject functions that lack SAMPLE_BY_FILL_VALUE support when the fill mode is actually VALUE, and adds getSampleByFlags() to LastArrayGroupByFunction declaring support for NONE, NULL, and PREVIOUS fill modes.
Break the default switch arm across two lines to stay within the 120-character limit enforced by CI.
|
The default implementation of If I understand correctly, arrays doesn't support
|
FirstArrayGroupByFunction.getSampleByFlags() excludes SAMPLE_BY_FILL_VALUE, same as LastArrayGroupByFunction. Since FirstNotNullArrayGroupByFunction and LastNotNullArrayGroupByFunction both extend FirstArrayGroupByFunction, all four array group-by functions now consistently reject FILL(value). Adds two negative tests for first_array with FILL(LINEAR) and FILL(value).
SampleByFirstLastRecord does not support variable-width types, so arrays must use the general SAMPLE BY path.
[PR Coverage check]😍 pass : 20 / 21 (95.24%) file detail
|
|
Fixes #6810
Summary
last(arr)). The fill record types did not implementgetArray(), and the null/value fill factories did not handle array column types when constructing fill constants.SampleByFillRecord: addgetArray()override that delegates to the underlying function, covering FILL(NULL), FILL(PREV), and FILL(LINEAR).SampleByFillNullRecordCursorFactory/SampleByFillValueRecordCursorFactory: handle array types by yieldingNullConstant.NULL(arrays cannot be filled with scalar values).FillRangeRecordCursorFactory.FillRangeRecord: addgetArray()override for the ALIGN TO CALENDAR / FROM ... TO code path.Test plan
LastArrayGroupByFunctionFactoryTestcovering all SAMPLE BY FILL modes with array columns: FILL(NONE), FILL(NULL) keyed, FILL(NULL) not-keyed, FILL(PREV), FILL(value), ALIGN TO CALENDAR, and FROM ... TO ... FILL(NULL)FillRangeRecordCursorFactorypath