test: add regression test for text field aggregation bug by mithuncy · Pull Request #3952 · paradedb/paradedb

mithuncy · 2026-01-20T14:55:28Z

Summary

- Adds a pg_regress regression test for the "unexpected type Str" bug that affected v0.21.2
- Tests metric aggregations (value_count, count) on TEXT fields when they are used as sub-aggregations inside bucket aggregations (histogram, range, terms)
- The bug was fixed in tantivy commit 65b5a1a3
- Includes EXPLAIN output showing the query plans, to verify that the custom scan is being used

Test Cases

1. GROUP BY on text field + ORDER BY count() - High cardinality triggers HashMap path
2. pdb.agg value_count on text field with GROUP BY - Direct value_count on text
3. Histogram + value_count sub-aggregation - Bucket agg with metric sub-agg on text
4. Range + value_count sub-aggregation - Range buckets with text field metric
5. Simple value_count on text field - Top-level metric (baseline, always worked)

Background

The bug occurred when SegmentStatsCollector::collect() received a text field column type (ColumnType::Str) during sub-aggregation collection. Unlike collect_block_with_field(), it lacked the is_number_or_date_type check and panicked at f64_from_fastfield_u64().

This test ensures any future tantivy regressions in this area are caught.
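The failure mode described above can be sketched in a few lines of Rust. This is a simplified, hypothetical model, not tantivy's code and not a reproduction of commit 65b5a1a3: the `ColumnType` enum, the `Stats` struct, `f64_from_raw`, and both `collect_*` functions are stand-ins invented for illustration, and the assumption that text values are still counted but never converted to `f64` is inferred from the description rather than taken from the fix itself.

```rust
// Hypothetical, simplified model of the bug; none of these are tantivy's real definitions.
#[derive(Clone, Copy, Debug, PartialEq)]
enum ColumnType {
    I64,
    U64,
    F64,
    DateTime,
    Str,
}

/// Stand-in for the `is_number_or_date_type` check named in the description.
fn is_number_or_date_type(column_type: ColumnType) -> bool {
    matches!(
        column_type,
        ColumnType::I64 | ColumnType::U64 | ColumnType::F64 | ColumnType::DateTime
    )
}

/// Stand-in for a fast-field-to-f64 conversion (the real encoding is more involved).
/// Text columns have no numeric interpretation, so this panics for them.
fn f64_from_raw(raw: u64, column_type: ColumnType) -> f64 {
    match column_type {
        ColumnType::U64 | ColumnType::DateTime => raw as f64,
        ColumnType::I64 => raw as i64 as f64,
        ColumnType::F64 => f64::from_bits(raw),
        ColumnType::Str => panic!("unexpected type {:?}", column_type),
    }
}

#[derive(Debug, Default)]
struct Stats {
    count: u64,
    sum: f64,
}

/// Unguarded path: converts every value, so a text column panics.
fn collect_unguarded(stats: &mut Stats, raw_values: &[u64], column_type: ColumnType) {
    for &raw in raw_values {
        stats.count += 1;
        stats.sum += f64_from_raw(raw, column_type);
    }
}

/// Guarded path: always counts, but only converts numeric or date values.
fn collect_guarded(stats: &mut Stats, raw_values: &[u64], column_type: ColumnType) {
    let numeric = is_number_or_date_type(column_type);
    for &raw in raw_values {
        stats.count += 1;
        if numeric {
            stats.sum += f64_from_raw(raw, column_type);
        }
    }
}

fn main() {
    // Pretend these are term ordinals of a text column inside one bucket.
    let term_ordinals = [0u64, 1, 2];

    let mut stats = Stats::default();
    collect_guarded(&mut stats, &term_ordinals, ColumnType::Str);
    println!("guarded value_count on text column: {}", stats.count);

    // The unguarded variant panics on the same input.
    let result = std::panic::catch_unwind(|| {
        let mut stats = Stats::default();
        collect_unguarded(&mut stats, &[0u64, 1, 2], ColumnType::Str);
    });
    println!("unguarded path panicked: {}", result.is_err());
}
```

In this model the guard is evaluated once per block rather than once per value, which keeps the hot loop cheap; whether the real fix is structured the same way is not stated in this PR.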

Add pg_regress test for the "unexpected type Str" bug that affected v0.21.2 when using metric aggregations (value_count, count, etc.) on text fields as sub-aggregations inside bucket aggregations. The bug was fixed in tantivy commit 65b5a1a3.

Tests cover:
- GROUP BY on text field + ORDER BY count() (high cardinality)
- pdb.agg value_count on text field with GROUP BY
- Histogram + value_count sub-aggregation on text field
- Range + value_count sub-aggregation on text field
- Simple value_count on text field (top-level, always worked)

Added EXPLAIN (FORMAT TEXT, COSTS OFF, TIMING OFF, VERBOSE) before each test query to show query plans and verify the custom scan is being used.

mdashti

LGTM.


mithuncy added 2 commits January 20, 2026 19:59

test: add EXPLAIN output to text field aggregation regression test

8da384a


mithuncy requested review from mdashti, philippemnoel, rebasedming and stuhood as code owners January 20, 2026 14:55

mithuncy added the cherry-pick/0.23.x label Jan 20, 2026

mdashti approved these changes Jan 20, 2026


mithuncy added the Do Not Cherry Pick label and removed the cherry-pick/0.23.x label Jan 20, 2026

mithuncy merged commit c36317f into main Jan 20, 2026
21 of 23 checks passed

mithuncy deleted the fix/text-field-aggregation-regression-test branch January 20, 2026 16:47

mithuncy merged 2 commits into main from fix/text-field-aggregation-regression-test
