fix: Aggregate custom scan handles NULL bool values in GROUP BY #4582

Merged
rebasedming merged 5 commits into main from worktree-fix+agg-bool-and-migration
Mar 30, 2026

Conversation

@rebasedming
Collaborator

Ticket(s) Closed

  • Closes #

What

  • `GROUP BY` on a nullable bool column with the aggregate custom scan crashed with: `InvalidArgument("Missing value U64(2) for field ... is not supported for column type Bool")`
  • The cause was that Tantivy's terms aggregation rejects all numeric `Key` variants for Bool columns
  • Instead, use `Key::Str(NULL_SENTINEL_*)` for Bool fields, which routes through Tantivy's `MissingTermAgg` collector
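
The choice described in the bullets above can be sketched in isolation. This is a minimal, hypothetical illustration and not the code from this PR: `Key` is a local stand-in that mirrors a few variants of Tantivy's aggregation key type, while `FieldKind`, `missing_key_for`, and the `EXAMPLE_NULL_SENTINEL_*` values are names invented for the example (the real `NULL_SENTINEL_*` constants are not shown on this page).

```rust
/// Local stand-in for the key type passed as a terms aggregation's
/// "missing" value. Assumption: it mirrors only the variants needed here.
#[derive(Debug, Clone, PartialEq)]
enum Key {
    Str(String),
    U64(u64),
    I64(i64),
}

/// Hypothetical classification of the column being grouped on.
#[derive(Debug, Clone, Copy, PartialEq)]
enum FieldKind {
    Bool,
    U64,
    I64,
    Text,
}

// Hypothetical sentinel values, chosen only for illustration.
const EXAMPLE_NULL_SENTINEL_STR: &str = "__example_null__";
const EXAMPLE_NULL_SENTINEL_U64: u64 = u64::MAX;
const EXAMPLE_NULL_SENTINEL_I64: i64 = i64::MIN;

/// Pick the key used to bucket rows whose grouping column is NULL.
/// Bool columns get a string sentinel, because a numeric key is
/// rejected for that column type (the crash described above).
fn missing_key_for(kind: FieldKind) -> Key {
    match kind {
        FieldKind::Bool | FieldKind::Text => {
            Key::Str(EXAMPLE_NULL_SENTINEL_STR.to_string())
        }
        FieldKind::U64 => Key::U64(EXAMPLE_NULL_SENTINEL_U64),
        FieldKind::I64 => Key::I64(EXAMPLE_NULL_SENTINEL_I64),
    }
}
```

With this shape, `missing_key_for(FieldKind::Bool)` always yields a `Key::Str`, so the missing-value bucket for a bool column never carries a numeric key.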

Why

How

Tests

Added regression test

@rebasedming rebasedming requested a review from a team as a code owner March 30, 2026 20:59
@rebasedming rebasedming added the cherry-pick/0.22.x Request that this PR to `main` should get an automatic cherry-pick PR to `0.22.x` after it lands. label Mar 30, 2026
@rebasedming rebasedming requested a review from stuhood March 30, 2026 20:59
rebasedming and others added 3 commits March 30, 2026 14:00
Use Key::Str sentinel for bool fields in terms aggregation instead of
skipping them. This routes through Tantivy's MissingTermAgg collector,
which properly counts docs missing the field. NULL bools now form their
own group in GROUP BY results.

Co-Authored-By: Claude Opus 4.6 (1M context)
@rebasedming rebasedming force-pushed the worktree-fix+agg-bool-and-migration branch from 390d782 to bffb05f Compare March 30, 2026 21:01
@rebasedming rebasedming force-pushed the worktree-fix+agg-bool-and-migration branch from e5e7833 to 95f4a3b Compare March 30, 2026 21:04
@codecov

codecov bot commented Mar 30, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 80.79%. Comparing base (725708e) to head (9360f38).
⚠️ Report is 4 commits behind head on main.

@@            Coverage Diff             @@
##             main    #4582      +/-   ##
==========================================
- Coverage   81.10%   80.79%   -0.31%     
==========================================
  Files         183      184       +1     
  Lines       45117    45282     +165     
==========================================
- Hits        36593    36587       -6     
- Misses       8524     8695     +171     
Files with missing lines Coverage Δ
pg_search/src/aggregate/mod.rs 91.71% <100.00%> (+0.39%) ⬆️
...earch/src/postgres/customscan/aggregatescan/mod.rs 88.54% <100.00%> (-0.55%) ⬇️

... and 18 files with indirect coverage changes

@mdashti mdashti requested review from mdashti and removed request for stuhood March 30, 2026 22:54
Contributor

@mdashti mdashti left a comment

@rebasedming Thanks for the PR.

Comment thread pg_search/tests/pg_regress/sql/agg-bool-terms.sql Outdated
Comment thread pg_search/tests/pg_regress/expected/agg-bool-terms.out
Comment thread pg_search/src/aggregate/mod.rs
Comment thread pg_search/src/postgres/customscan/aggregatescan/mod.rs
Test 2 was grouping by category (text), testing the non-fast field error
path rather than the bool fix. Nullable bool GROUP BY is already covered
by Test 4b.
@rebasedming rebasedming merged commit 99b6ddf into main Mar 30, 2026
16 checks passed
@rebasedming rebasedming deleted the worktree-fix+agg-bool-and-migration branch March 30, 2026 23:22
paradedb-bot pushed a commit that referenced this pull request Mar 30, 2026
…4582)

# Ticket(s) Closed

- Closes #

## What

- `GROUP BY` on a nullable bool column with the aggregate custom scan
crashed with: `InvalidArgument("Missing value U64(2) for field ... is
not supported for column type Bool")`
- The cause was that Tantivy's terms aggregation rejects all numeric
`Key` variants for Bool columns
- Instead, use `Key::Str(NULL_SENTINEL_*)` for Bool fields, which routes
through Tantivy's `MissingTermAgg` collector

## Why

## How

## Tests

Added regression test

---------

Co-authored-by: Claude Opus 4.6 (1M context)
(cherry picked from commit 99b6ddf)
@paradedb-bot
Contributor

Successfully created backport PR for 0.22.x:

rebasedming added a commit that referenced this pull request Mar 30, 2026
…4582)
rebasedming added a commit that referenced this pull request Mar 30, 2026
…4584)

# Description
Backport of #4582 to `0.22.x`.

Co-authored-by: Ming
Co-authored-by: Claude Opus 4.6 (1M context)
Labels

cherry-pick/0.22.x Request that this PR to `main` should get an automatic cherry-pick PR to `0.22.x` after it lands.

3 participants