Skip to content

fix(sql): order-by ignored in sub-queries of aggregation#6414

Merged
ideoma merged 10 commits intomasterfrom
rd_fix_order_by_aggregations
Nov 20, 2025
Merged

fix(sql): order-by ignored in sub-queries of aggregation#6414
ideoma merged 10 commits intomasterfrom
rd_fix_order_by_aggregations

Conversation

@RaphDal
Copy link
Copy Markdown
Contributor

@RaphDal RaphDal commented Nov 19, 2025

Fixes #6405

This PR fixes a bug from the optimizeOrderBy pass that was removing order-by in sub-queries of aggregations.

Functions like first/last/first_not_null/last_not_null needs to receive the rows in the expected order to work properly.

Example

select sym, ts, first(b)
from (
select x.sym, x.ts, y.b
from x
left join y on x.sym = y.sym
order by x.ts, y.ts -- optimizeOrderBy was removing this clause
);

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Nov 19, 2025

Important

Review skipped

Auto reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

This PR fixes an issue where ORDER BY clauses in subqueries were being discarded during GROUP BY operations. The fix forces the order-by mnemonic to ORDER_BY_REQUIRED when the model is a GROUP BY to ensure correct ordering semantics are preserved through aggregation operations.

Changes

Cohort / File(s) Summary
Core Fix
core/src/main/java/io/questdb/griffin/SqlOptimiser.java
Adds a new rule in optimiseOrderBy() method to detect GROUP BY models and force the order-by mnemonic to ORDER_BY_REQUIRED, preventing ORDER BY clauses from being incorrectly discarded in aggregate queries.
Test Coverage
core/src/test/java/io/questdb/test/griffin/OrderByAdviceTest.java
Introduces new test method testOrderByAggregateKeepsOrdering() that validates ORDER BY preservation in queries combining table joins, aggregation functions, and explicit ordering.

Sequence Diagram

sequenceDiagram
    participant Query as Outer Query<br/>(ORDER BY ts)
    participant Optimizer as SqlOptimiser
    participant Subquery as Subquery<br/>(GROUP BY + Aggregate)
    participant InnerOrder as Inner ORDER BY<br/>(t.ts, p.ts)

    Query->>Optimizer: optimiseOrderBy()
    alt Before Fix
        Optimizer->>Subquery: Check model type
        Note over Optimizer: Discards ORDER BY<br/>from subquery
        Subquery->>Query: Returns unordered results
    else After Fix
        Optimizer->>Subquery: Check model type
        Optimizer->>Subquery: Detect GROUP BY model
        Optimizer->>Subquery: Force ORDER_BY_REQUIRED
        Subquery->>InnerOrder: Preserve inner ordering
        InnerOrder->>Query: Returns correctly ordered results
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • SqlOptimiser.java: The logic change is focused but requires verification that the new GROUP BY detection rule doesn't inadvertently affect other order-by optimization scenarios or edge cases.
  • OrderByAdviceTest.java: The test method is straightforward and directly mirrors the bug report scenario, making validation straightforward but important for regression prevention.
  • Key attention areas: Verify the GROUP BY detection logic is correct and complete; ensure no existing tests regress with this change.

Suggested reviewers

  • bluestreak01

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately describes the main fix: preventing order-by from being ignored in sub-queries of aggregation operations, which directly matches the changeset's core objective.
Description check ✅ Passed The description is well-related to the changeset, explaining the bug being fixed, referencing the linked issue, and providing a clear example of the problem scenario.
Linked Issues check ✅ Passed The code changes directly address issue #6405 by modifying SqlOptimiser to prevent ORDER BY removal in GROUP BY sub-queries and adding a test case that validates the fix works correctly.
Out of Scope Changes check ✅ Passed All changes are directly related to fixing the ORDER BY bug in aggregation sub-queries; no unrelated modifications to other components are present.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@RaphDal
Copy link
Copy Markdown
Contributor Author

RaphDal commented Nov 19, 2025

@CodeRabbit review

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Nov 19, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
core/src/main/java/io/questdb/griffin/SqlOptimiser.java (1)

3366-3369: Forcing ORDER_BY_REQUIRED for GROUP BY models correctly preserves nested ordering for order-sensitive aggregates

Setting orderByMnemonic = ORDER_BY_REQUIRED whenever model.getSelectModelType() == SELECT_MODEL_GROUP_BY ensures that, for global aggregates like first(...)/last(...) without an explicit GROUP BY list, the nested sub-query is optimised with topLevelOrderByMnemonic == ORDER_BY_REQUIRED instead of ORDER_BY_INVARIANT. That prevents the nested call to optimiseOrderBy from entering the default branch that clears model.getOrderBy(), so inner ORDER BY clauses are retained and aggregates that depend on row order behave correctly (fixing the reported bug).

This is a conservative choice (it may keep more ordering than strictly necessary for order-insensitive aggregates) but is reasonable for correctness; any further refinement (e.g., distinguishing order-sensitive vs order-insensitive aggregates) would be an optional future optimisation, not required for this fix.

core/src/test/java/io/questdb/test/griffin/OrderByAdviceTest.java (1)

261-326: Consider adding inline documentation to explain the test's purpose.

This test validates a critical fix for issue #6405, where ORDER BY clauses in subqueries were being incorrectly stripped during aggregation. Given the query's complexity (nested ORDER BY, JOIN with range conditions, order-dependent first() aggregate), adding a brief comment would help future maintainers understand:

  1. Why the ORDER BY in the subquery (line 318) is essential for correctness
  2. That first() depends on row order and would return incorrect results without preserved ordering
  3. The specific scenario being tested (aggregation over joined subquery results)

Consider adding a comment like:

@Test
public void testOrderByAggregateKeepsOrdering() throws Exception {
    // Regression test for issue #6405: ORDER BY in subqueries must be preserved
    // when used with order-dependent aggregates (first, last, first_not_null, last_not_null).
    // The optimizer was incorrectly removing ORDER BY clauses during GROUP BY operations,
    // causing first()/last() to return wrong results.
    
    execute("""
        ...
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e88abf7 and efc4bce.

📒 Files selected for processing (2)
  • core/src/main/java/io/questdb/griffin/SqlOptimiser.java (2 hunks)
  • core/src/test/java/io/questdb/test/griffin/OrderByAdviceTest.java (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (35)
  • GitHub Check: New pull request (SelfHosted Running tests with cover on linux-other)
  • GitHub Check: New pull request (SelfHosted Running tests with cover on linux-pgwire)
  • GitHub Check: New pull request (SelfHosted Running tests with cover on linux-cairo-sub)
  • GitHub Check: New pull request (SelfHosted Running tests with cover on linux-cairo-root)
  • GitHub Check: New pull request (SelfHosted Running tests with cover on linux-fuzz2)
  • GitHub Check: New pull request (SelfHosted Running tests with cover on linux-fuzz1)
  • GitHub Check: New pull request (SelfHosted Running tests with cover on linux-griffin-sub)
  • GitHub Check: New pull request (SelfHosted Running tests with cover on linux-griffin-root)
  • GitHub Check: New pull request (SelfHosted Other tests on linux-x64-zfs)
  • GitHub Check: New pull request (SelfHosted Other tests on linux-x86-graal)
  • GitHub Check: New pull request (SelfHosted Griffin tests on linux-x86-graal)
  • GitHub Check: New pull request (Rust Test and Lint on linux-jdk17)
  • GitHub Check: New pull request (Hosted Running tests on windows-other-2)
  • GitHub Check: New pull request (Hosted Running tests on windows-other-1)
  • GitHub Check: New pull request (Hosted Running tests on windows-pgwire)
  • GitHub Check: New pull request (Hosted Running tests on windows-cairo-2)
  • GitHub Check: New pull request (Hosted Running tests on windows-cairo-1)
  • GitHub Check: New pull request (Hosted Running tests on windows-fuzz2)
  • GitHub Check: New pull request (Hosted Running tests on windows-fuzz1)
  • GitHub Check: New pull request (Hosted Running tests on windows-griffin-sub)
  • GitHub Check: New pull request (Hosted Running tests on windows-griffin-base)
  • GitHub Check: New pull request (Hosted Running tests on mac-other)
  • GitHub Check: New pull request (Hosted Running tests on mac-pgwire)
  • GitHub Check: New pull request (Hosted Running tests on mac-cairo-fuzz)
  • GitHub Check: New pull request (Hosted Running tests on mac-cairo)
  • GitHub Check: New pull request (SelfHosted Griffin tests on linux-x64-zfs)
  • GitHub Check: New pull request (SelfHosted Other tests on linux-arm64)
  • GitHub Check: New pull request (Hosted Running tests on mac-griffin)
  • GitHub Check: New pull request (SelfHosted Griffin tests on linux-arm64)
  • GitHub Check: New pull request (SelfHosted Cairo tests on linux-x64-zfs)
  • GitHub Check: New pull request (SelfHosted Cairo tests on linux-x86-graal)
  • GitHub Check: New pull request (SelfHosted Cairo tests on linux-arm64)
  • GitHub Check: New pull request (Check Changes Check changes)
  • GitHub Check: build
  • GitHub Check: build
🔇 Additional comments (1)
core/src/main/java/io/questdb/griffin/SqlOptimiser.java (1)

3312-3320: Comment clarifies existing LIMIT/timestamp behavior; no functional change

The updated comments around the LIMIT and explicit timestamp handling match the existing logic and improve readability without changing semantics. Nothing to fix here.

kafka1991
kafka1991 previously approved these changes Nov 20, 2025
Copy link
Copy Markdown
Collaborator

@kafka1991 kafka1991 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@glasstiger
Copy link
Copy Markdown
Contributor

[PR Coverage check]

😍 pass : 37 / 37 (100.00%)

file detail

path covered line new line coverage
🔵 io/questdb/std/Chars.java 9 9 100.00%
🔵 io/questdb/griffin/SqlCodeGenerator.java 1 1 100.00%
🔵 io/questdb/griffin/SqlOptimiser.java 27 27 100.00%

@ideoma ideoma merged commit 5555187 into master Nov 20, 2025
41 checks passed
@ideoma ideoma deleted the rd_fix_order_by_aggregations branch November 20, 2025 15:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ORDER BY clause is wrongly ignored

5 participants