Skip to content

feat(sql): implement percent_rank() window function#6712

Merged
bluestreak01 merged 5 commits intomasterfrom
vi_percent_rank
Jan 27, 2026
Merged

feat(sql): implement percent_rank() window function#6712
bluestreak01 merged 5 commits intomasterfrom
vi_percent_rank

Conversation

@bluestreak01
Copy link
Copy Markdown
Member

Summary

  • Implement the SQL standard percent_rank() window function
  • Returns the relative rank of the current row: (rank - 1) / (total_rows - 1)
  • Returns 0 if there is only one row in the partition

Implementation

Three inner classes handle different use cases:

  • PercentRankNoOrderFunction (ZERO_PASS): For percent_rank() over () or percent_rank() over (partition by x) without ORDER BY. All rows are peers with rank 1, so percent_rank = 0 for all rows.

  • PercentRankFunction (TWO_PASS): For percent_rank() over (order by x) without PARTITION BY. Uses two passes to first compute ranks, then calculate percent_rank using total row count.

  • PercentRankOverPartitionFunction (TWO_PASS): For percent_rank() over (partition by x order by y). Tracks per-partition row counts in a map to compute percent_rank within each partition.

Test plan

  • percent_rank() over () returns 0 for all rows
  • percent_rank() over (partition by x) returns 0 for all rows
  • percent_rank() over (order by x) computes correct values
  • percent_rank() over (partition by x order by y) computes correct values per partition
  • Single row returns 0 (both with and without partition by)
  • Single row partition returns 0
  • Error when used without OVER clause
  • Error when IGNORE/RESPECT NULLS is used
  • EXPLAIN plans show correct output

🤖 Generated with Claude Code

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Jan 27, 2026

Important

Review skipped

Auto reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

  • 🔍 Trigger a full review

Walkthrough

The PR introduces a new PercentRankFunctionFactory implementing the percent_rank() window function with three execution modes (with partition and order, with order only, with neither), removes deprecated window function factories from the registry, and adds comprehensive test coverage for the new function.

Changes

Cohort / File(s) Summary
PercentRankFunctionFactory implementation
core/src/main/java/io/questdb/griffin/engine/functions/window/PercentRankFunctionFactory.java
Adds new window function factory supporting percent_rank() with three modes: partition-aware (PercentRankOverPartitionFunction), order-by-only (PercentRankFunction), and no-order variants. Implements two-phase pass-based processing for ordered partitions, per-partition rank tracking, proper resource cleanup, and window context validation (rejects NULLS handling, disallows framing, requires ORDER BY for standard mode).
Window function registry cleanup
core/src/main/resources/function_list.txt
Removes 10 deprecated window function factories: RankFunctionFactory, DenseRankFunctionFactory, AvgDoubleWindowFunctionFactory, VwemaDoubleWindowFunctionFactory, FirstValueDoubleWindowFunctionFactory, SumDoubleWindowFunctionFactory, KSumDoubleWindowFunctionFactory, MaxDoubleWindowFunctionFactory, MinDoubleWindowFunctionFactory, and LastValueDoubleWindowFunctionFactory.
PercentRank function test suite
core/src/test/java/io/questdb/test/griffin/engine/window/PercentRankFunctionTest.java
Adds comprehensive test class with 21 test methods covering error handling (no OVER clause, invalid NULLS handling, framing), behavior validation with/without ORDER BY and PARTITION BY, edge cases (empty datasets, single rows, NULLs in ORDER/PARTITION columns), DESC ordering, multiple order columns, explain plan output, and combined usage with other ranking functions.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested labels

New feature, SQL

Suggested reviewers

  • puzpuzpuz
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title 'feat(sql): implement percent_rank() window function' accurately summarizes the main change: introducing a new SQL window function implementation.
Description check ✅ Passed The description clearly explains the percent_rank() window function implementation, its formula, three inner classes for different use cases, and validates against a comprehensive test plan.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@bluestreak01
Copy link
Copy Markdown
Member Author

@CodeRabbit can you review this window function implementation pls?

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Jan 27, 2026

🧩 Analysis chain

@bluestreak01 I'll review the percent_rank() window function implementation. Let me start by examining the code.


🏁 Script executed:

#!/bin/bash
# Read the main implementation file
cat core/src/main/java/io/questdb/griffin/engine/functions/window/PercentRankFunctionFactory.java

Length of output: 16091


🏁 Script executed:

#!/bin/bash
# Find and read test files for percent_rank
fd -e java -x grep -l "percent_rank" {} \; | grep -i test | head -5 | xargs cat

Length of output: 26362


🏁 Script executed:

#!/bin/bash
# Check if there are any other related files or similar window function implementations for comparison
fd "RankFunctionFactory.java" --type f

Length of output: 272


🏁 Script executed:

#!/bin/bash
# Read RankFunctionFactory for comparison
cat core/src/main/java/io/questdb/griffin/engine/functions/window/RankFunctionFactory.java

Length of output: 18703


@bluestreak01 Excellent implementation! I've thoroughly reviewed the percent_rank() window function. Here's my analysis:

Strengths

  1. Correct SQL Standard Implementation: The formula (rank - 1) / (total_rows - 1) is properly implemented with correct edge case handling (returns 0 for single row).

  2. Well-Structured Design: The three-class approach mirrors RankFunctionFactory perfectly:

    • PercentRankNoOrderFunction for no ORDER BY (all peers → percent_rank = 0)
    • PercentRankFunction for ORDER BY without PARTITION BY
    • PercentRankOverPartitionFunction for both PARTITION BY and ORDER BY
  3. Correct TWO_PASS Strategy: Unlike rank() which can use ZERO_PASS, percent_rank() correctly uses TWO_PASS because it needs the total row count. The approach of:

    • Pass1: Store rank temporarily as LONG
    • Pass2: Compute and store percent_rank as DOUBLE using total count

    is elegant and efficient.

  4. Count Tracking Logic: The count management is correct throughout:

    // Pass1: stores count + 1
    mapValue.putLong(2, count + 1);
    
    // Pass2: retrieves and adjusts
    long totalRows = mapValue.getLong(2) - 1;  // Gets actual row count
  5. Comprehensive Test Coverage (93.67%): Your tests are exemplary:

    • Edge cases: single row, empty table, all ties, single-row partitions
    • Multiple ORDER BY columns, DESC order
    • NULLs in ORDER BY and PARTITION BY
    • Combined with other ranking functions
    • Error cases (no OVER, RESPECT/IGNORE NULLS, framing)
    • EXPLAIN plans
    • Large datasets (1000 rows)
  6. Proper Resource Management: close(), reset(), reopen(), and toTop() are all correctly implemented.

📝 Minor Observation

The toPlan() implementations don't include the ORDER BY clause details:

  • PercentRankFunction.toPlan(): shows "() over ()" but doesn't mention ORDER BY
  • PercentRankOverPartitionFunction.toPlan(): shows partition by but not ORDER BY

However, this is consistent with RankFunctionFactory and doesn't affect functionality since the EXPLAIN output gets the full context from the window framework. Your test testPercentRankExplainPlan() confirms the output is correct.

🎯 Verdict

This is production-ready code. The implementation is clean, efficient, follows established patterns, and has excellent test coverage. The logic for handling partitions, peer groups, and edge cases is all correct. Great work! 🚀


🧠 Learnings used
Learnt from: bluestreak01
Repo: questdb/questdb PR: 6643
File: core/src/main/java/io/questdb/griffin/SqlOptimiser.java:8671-8687
Timestamp: 2026-01-20T00:59:19.630Z
Learning: QuestDB’s parser rejects pure window functions (e.g., row_number(), rank()) used without an OVER clause; such invocations are invalid and fail early, so validators don’t need to catch pure-name window functions in window specs.
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@bluestreak01
Copy link
Copy Markdown
Member Author

@CodeRabbit i added "order by" if exists to the plan. Please review.

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Jan 27, 2026

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Read the updated PercentRankFunctionFactory to see the changes
cat core/src/main/java/io/questdb/griffin/engine/functions/window/PercentRankFunctionFactory.java

Length of output: 16446


@bluestreak01 Excellent update! 👍 I've reviewed the changes to the toPlan() methods.

Changes Look Perfect

The ORDER BY clause is now properly included in the plan output:

  1. PercentRankFunction.toPlan() (Line 231-237):

    sink.val(" over (");
    sink.val("order by ");
    sink.val(orderBy);
    sink.val(')');

    Now shows percent_rank() over (order by ...) instead of just percent_rank() over ()

  2. PercentRankOverPartitionFunction.toPlan() (Line 391-399):

    sink.val(" over (");
    sink.val("partition by ");
    sink.val(partitionByRecord.getFunctions());
    sink.val(" order by ");
    sink.val(orderBy);
    sink.val(')');

    Now shows the complete specification: percent_rank() over (partition by ... order by ...)

  3. PercentRankNoOrderFunction.toPlan() correctly omits ORDER BY since it handles the no-order case.

The orderBy field is properly captured during initRecordComparator() and used in the plan output. This makes the EXPLAIN plans much more informative and complete! The implementation is spot-on. 🚀


🧠 Learnings used
Learnt from: bluestreak01
Repo: questdb/questdb PR: 6643
File: core/src/main/java/io/questdb/griffin/SqlOptimiser.java:8671-8687
Timestamp: 2026-01-20T00:59:19.630Z
Learning: QuestDB’s parser rejects pure window functions (e.g., row_number(), rank()) used without an OVER clause; such invocations are invalid and fail early, so validators don’t need to catch pure-name window functions in window specs.

Copy link
Copy Markdown
Contributor

@javier javier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does what it says :)

@glasstiger
Copy link
Copy Markdown
Contributor

[PR Coverage check]

😍 pass : 156 / 166 (93.98%)

file detail

path covered line new line coverage
🔵 io/questdb/griffin/engine/functions/window/PercentRankFunctionFactory.java 156 166 93.98%

@bluestreak01 bluestreak01 merged commit 61e417a into master Jan 27, 2026
42 checks passed
@bluestreak01 bluestreak01 deleted the vi_percent_rank branch January 27, 2026 19:45
javier pushed a commit to questdb/documentation that referenced this pull request Jan 28, 2026
…tion (#350)

## Summary

- Add documentation for the new `percent_rank()` window function
([questdb/questdb#6712](questdb/questdb#6712))
- Comprehensive improvements to all window functions documentation for
better human and LLM comprehension

## Changes

### New Content
- **percent_rank()** documentation with syntax, description, use cases,
and examples
- **Ranking functions comparison table** showing differences between
`row_number`, `rank`, `dense_rank`, and `percent_rank`
- **Practical use cases** for all 14 window functions explaining when
and why to use each

### Documentation Quality Improvements
- Added **return types** to all function signatures (e.g., `double`,
`long`, "same type as input")
- Added **supported data types** for aggregate functions (`short`,
`int`, `long`, `float`, `double`)
- Updated **keywords and descriptions** in frontmatter for better
searchability
- Added note about **unsupported functions** (`ntile()`, `cume_dist()`)

### Example Fixes
- Updated all examples to use **TICK expressions** (`'[$today]'`)
instead of `today()`
- Fixed column names to match **demo schema** (`amount` instead of
`volume`, `BTC-USDT` instead of `BTC-USD`)
- Added **timestamp** column to all SELECT examples for consistency
- Added **demo tags** and WHERE filters to all runnable examples
- Fixed VWAP example CTE reference

## Test plan

- [x] All SQL examples validated against demo.questdb.io (34 queries
tested)
- [x] `yarn build` succeeds
- [x] Preview server renders correctly

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.5 <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants