
(Test) Advanced adaptive filter selectivity evaluation #20363

Draft
adriangb wants to merge 4 commits into apache:main from pydantic:filter-pushdown-dynamic-bytes

@adriangb
Contributor

adriangb commented Feb 15, 2026

Which issue does this PR close?

Related to filter pushdown performance optimization work.

Rationale for this change

Currently when pushdown_filters = true, DataFusion pushes all filter predicates into the Parquet reader as row-level filters (ArrowPredicates) unconditionally. This is suboptimal because:

  1. Some filters are expensive relative to their selectivity. A filter that references wide columns but prunes few rows wastes CPU decoding those columns during the row-filter phase, when it would be cheaper to apply the filter post-scan on the already-decoded batch.
  2. The old reorder_filters heuristic was static. It used compressed column size as a proxy for cost and sorted filters by that metric, but never measured actual runtime selectivity or evaluation cost. It could not adapt to data skew or runtime conditions.
  3. Dynamic join filters (e.g., from HashJoinExec) cannot be dropped even when they provide no benefit. Without a way to mark filters as optional, the system was forced to always evaluate them.

This PR introduces an adaptive filter selectivity tracking system that observes filter behavior at runtime and makes data-driven decisions about whether each filter should be pushed down as a row-level predicate or applied post-scan.

What changes are included in this PR?

1. New module: selectivity.rs (1,554 lines)

The core of this PR. Introduces SelectivityTracker, a shared, lock-guarded structure that:

  • Tracks per-filter statistics using Welford's online algorithm for numerically stable streaming mean and variance of filter "effectiveness" (bytes_pruned_per_second_of_eval_time).
  • Implements a filter state machine: each filter transitions through New -> RowFilter | PostScan -> (promoted/demoted/dropped) states based on:
    • Initial placement: uses a byte-ratio heuristic (filter_bytes / projection_bytes) to cheaply decide whether a new filter starts as a row filter or post-scan filter.
    • Promotion (PostScan -> RowFilter): when the confidence interval lower bound on effectiveness exceeds filter_pushdown_min_bytes_per_sec.
    • Demotion (RowFilter -> PostScan): when the confidence interval upper bound drops below the threshold.
    • Dropping (for optional filters only): filters wrapped in OptionalFilterPhysicalExpr can be dropped entirely when ineffective.
  • Detects dynamic filter updates via snapshot_generation(), resetting statistics when a filter's predicate changes (e.g., when a DynamicFilterPhysicalExpr from a hash join updates its value set).
  • Sorts filters by effectiveness within each partition (row-level and post-scan), so the most effective filters are applied first.
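
For readers unfamiliar with Welford's online algorithm, here is a minimal standalone sketch of the streaming mean/variance update and the confidence interval bounds derived from it. The struct and method names (EffectivenessStats, update, confidence_interval) are illustrative assumptions and are not the names used in selectivity.rs; the interval is the usual mean ± z × standard error, which is an assumption about how the bounds are computed.

```rust
/// Streaming mean/variance via Welford's online algorithm.
/// Names are hypothetical; this is not the PR's SelectivityStats type.
#[derive(Debug, Default, Clone)]
pub struct EffectivenessStats {
    count: u64,
    mean: f64,
    /// Running sum of squared deviations from the current mean.
    m2: f64,
}

impl EffectivenessStats {
    /// Fold in one observation, e.g. bytes pruned per second of eval time.
    pub fn update(&mut self, x: f64) {
        self.count += 1;
        let delta = x - self.mean;
        self.mean += delta / self.count as f64;
        let delta2 = x - self.mean;
        self.m2 += delta * delta2;
    }

    pub fn mean(&self) -> f64 {
        self.mean
    }

    /// Sample variance; `None` until at least two observations exist.
    pub fn variance(&self) -> Option<f64> {
        if self.count < 2 {
            None
        } else {
            Some(self.m2 / (self.count - 1) as f64)
        }
    }

    /// (lower, upper) bounds computed as mean ± z * standard error.
    pub fn confidence_interval(&self, z: f64) -> Option<(f64, f64)> {
        let var = self.variance()?;
        let stderr = (var / self.count as f64).sqrt();
        Some((self.mean - z * stderr, self.mean + z * stderr))
    }
}
```

With the default filter_confidence_z of 2.0, the lower bound would be compared against the threshold for promotion and the upper bound for demotion.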

Key types:

  • SelectivityTracker -- cross-file tracker shared by all ParquetOpener instances
  • TrackerConfig -- immutable configuration (built from ParquetOptions)
  • SelectivityStats -- per-filter Welford statistics with confidence interval methods
  • FilterState -- RowFilter | PostScan | Dropped enum
  • PartitionedFilters -- output of partition_filters(), consumed by the opener
  • FilterId -- stable usize identifier assigned by ParquetSource::with_predicate
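
The promotion/demotion/drop rules can be sketched as a pure function over a FilterState-like enum. This is a simplified illustration under stated assumptions: the function name next_state is hypothetical, and the exact point at which an optional filter is dropped (here: whenever the upper bound falls below the threshold) is a guess, not confirmed by the code in this PR.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FilterState {
    RowFilter,
    PostScan,
    Dropped,
}

/// Hypothetical transition function.
/// `ci` is the (lower, upper) confidence interval on effectiveness in
/// bytes/sec, or `None` while there are too few samples to decide.
pub fn next_state(
    current: FilterState,
    ci: Option<(f64, f64)>,
    min_bytes_per_sec: f64,
    optional: bool,
) -> FilterState {
    let (lower, upper) = match ci {
        Some(bounds) => bounds,
        None => return current, // not enough evidence yet
    };
    match current {
        FilterState::Dropped => FilterState::Dropped,
        // Promotion: confidently above the threshold.
        FilterState::PostScan if lower > min_bytes_per_sec => FilterState::RowFilter,
        // Assumption: optional filters are dropped instead of demoted.
        _ if upper < min_bytes_per_sec && optional => FilterState::Dropped,
        // Demotion: confidently below the threshold.
        FilterState::RowFilter if upper < min_bytes_per_sec => FilterState::PostScan,
        // Interval straddles the threshold: keep the current placement.
        other => other,
    }
}
```

Using the interval bounds rather than the mean means a filter only moves when the evidence is one-sided, which should avoid flapping between states on noisy measurements.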

2. New wrapper: OptionalFilterPhysicalExpr (in physical_expr_common)

A transparent PhysicalExpr wrapper that marks a filter as optional -- droppable without affecting query correctness. All PhysicalExpr trait methods delegate to the inner expression. The selectivity tracker detects this via downcast_ref::<OptionalFilterPhysicalExpr>() and can drop the filter entirely when it is ineffective, rather than demoting it to post-scan.

HashJoinExec now wraps its dynamic join filters in OptionalFilterPhysicalExpr before pushing them down. This is why plan output now shows Optional(DynamicFilter [...]) instead of DynamicFilter [...].
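
To illustrate the wrapper-plus-downcast pattern, here is a self-contained sketch. The Expr trait below is a heavily simplified stand-in for PhysicalExpr, and OptionalFilter, GreaterThan, and is_optional are hypothetical names used only for this example; only the use of downcast_ref to detect the wrapper is taken from the description above.

```rust
use std::any::Any;
use std::sync::Arc;

/// Simplified stand-in for a physical expression trait (hypothetical).
pub trait Expr: Any {
    fn evaluate(&self, row: i64) -> bool;
    fn as_any(&self) -> &dyn Any;
}

/// A trivial concrete filter used for the demo.
pub struct GreaterThan(pub i64);

impl Expr for GreaterThan {
    fn evaluate(&self, row: i64) -> bool {
        row > self.0
    }
    fn as_any(&self) -> &dyn Any {
        self
    }
}

/// Transparent wrapper that marks the inner filter as droppable.
pub struct OptionalFilter {
    inner: Arc<dyn Expr>,
}

impl OptionalFilter {
    pub fn new(inner: Arc<dyn Expr>) -> Self {
        Self { inner }
    }
    pub fn inner(&self) -> &Arc<dyn Expr> {
        &self.inner
    }
}

impl Expr for OptionalFilter {
    // Every trait method delegates to the wrapped expression.
    fn evaluate(&self, row: i64) -> bool {
        self.inner.evaluate(row)
    }
    fn as_any(&self) -> &dyn Any {
        self
    }
}

/// Detect the wrapper the same way a tracker could: by downcasting.
pub fn is_optional(expr: &Arc<dyn Expr>) -> bool {
    expr.as_any().downcast_ref::<OptionalFilter>().is_some()
}
```

Because the wrapper delegates evaluation, dropping it can only remove a redundant filter, never change which rows the wrapped expression accepts.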

3. Removal of reorder_filters config option

The old static reorder_filters boolean and its associated heuristic (sort by required_bytes, then can_use_index) are removed entirely. The adaptive system subsumes this:

  • FilterCandidate no longer stores required_bytes or can_use_index fields.
  • The size_of_columns() and columns_sorted() helper functions in row_filter.rs are removed.
  • Filter ordering is now handled by SelectivityTracker::partition_filters() based on measured effectiveness or byte-ratio fallback.

4. Three new configuration options (in ParquetOptions)

| Option | Default | Purpose |
|---|---|---|
| filter_pushdown_min_bytes_per_sec | 52,428,800 (50 MiB/s) | Throughput threshold for promoting a filter to row-level. 0.0 = all promoted, INFINITY = none promoted (feature disabled). |
| filter_collecting_byte_ratio_threshold | 0.15 | Byte-ratio threshold for initial filter placement. Filters whose columns use < 15% of projected bytes start as row filters; otherwise post-scan. |
| filter_confidence_z | 2.0 | Z-score for confidence intervals (~95%). Controls how much evidence is needed before promoting or demoting a filter. |
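
As a rough illustration of how the byte-ratio default feeds initial placement, the sketch below compares filter_bytes / projection_bytes against the threshold. The Placement enum, the constant names, and the handling of a zero-byte projection are assumptions made for this example only.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Placement {
    RowFilter,
    PostScan,
}

/// 50 MiB/s expressed in bytes per second (52,428,800).
pub const DEFAULT_MIN_BYTES_PER_SEC: f64 = 50.0 * 1024.0 * 1024.0;
/// Default for filter_collecting_byte_ratio_threshold.
pub const DEFAULT_BYTE_RATIO_THRESHOLD: f64 = 0.15;

/// Hypothetical initial-placement heuristic for a filter with no stats yet.
pub fn initial_placement(filter_bytes: u64, projection_bytes: u64, threshold: f64) -> Placement {
    if projection_bytes == 0 {
        // Assumption: with nothing projected there is no ratio to compare,
        // so fall back to post-scan.
        return Placement::PostScan;
    }
    let ratio = filter_bytes as f64 / projection_bytes as f64;
    if ratio < threshold {
        Placement::RowFilter
    } else {
        Placement::PostScan
    }
}
```

For example, a filter touching 10 MB of columns out of a 100 MB projection has a ratio of 0.10 and would start as a row filter under the default threshold, while one touching 50 MB would start post-scan.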

5. Changes to ParquetOpener / opener.rs

  • Predicates are now stored as Vec<(FilterId, Arc<dyn PhysicalExpr>)> instead of a single combined Arc<dyn PhysicalExpr>.
  • The opener calls selectivity_tracker.partition_filters() to split filters into row-level vs. post-scan.
  • Row-level filters are built via build_row_filter() (updated signature).
  • Post-scan filters are applied in apply_post_scan_filters_with_stats(), a new function that evaluates each filter individually, reports per-filter timing and selectivity back to the tracker, and combines results into a single boolean mask.
  • The limit is only applied to the Parquet reader when there are no post-scan filters (otherwise limiting would cut off rows before the filter could find matches).
  • The projection mask is expanded to include columns needed by post-scan filters.
  • A new filter_apply_time metric tracks post-scan filter evaluation time.
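
The post-scan path described above can be sketched on plain slices instead of Arrow batches. Everything here is a simplified assumption: Predicate, FilterReport, apply_post_scan_filters, and reader_limit are hypothetical names, and the real function works on RecordBatch and BooleanArray rather than Vec<bool>.

```rust
use std::time::{Duration, Instant};

pub type FilterId = usize;
pub type Predicate = Box<dyn Fn(i64) -> bool>;

/// Per-filter measurements that would be reported back to a tracker.
pub struct FilterReport {
    pub id: FilterId,
    pub rows_in: usize,
    pub rows_matched: usize,
    pub elapsed: Duration,
}

/// Evaluate each filter on the whole batch, time it, and AND the
/// individual results into one combined mask.
pub fn apply_post_scan_filters(
    batch: &[i64],
    filters: &[(FilterId, Predicate)],
) -> (Vec<bool>, Vec<FilterReport>) {
    let mut mask = vec![true; batch.len()];
    let mut reports = Vec::with_capacity(filters.len());
    for (id, pred) in filters {
        let start = Instant::now();
        let result: Vec<bool> = batch.iter().map(|&v| pred(v)).collect();
        let elapsed = start.elapsed();
        let rows_matched = result.iter().filter(|&&b| b).count();
        for (m, r) in mask.iter_mut().zip(&result) {
            *m &= *r;
        }
        reports.push(FilterReport { id: *id, rows_in: batch.len(), rows_matched, elapsed });
    }
    (mask, reports)
}

/// The reader-level limit is only safe when no filtering happens after the
/// scan; otherwise the reader could stop before enough rows match.
pub fn reader_limit(limit: Option<usize>, has_post_scan_filters: bool) -> Option<usize> {
    if has_post_scan_filters { None } else { limit }
}
```

Evaluating every filter against the full batch, rather than short-circuiting on the running mask, keeps each filter's measured selectivity independent of filter order in this sketch.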

6. Changes to ParquetSource / source.rs

  • Internal predicate storage changed from Option<Arc<dyn PhysicalExpr>> to Option<Vec<(FilterId, Arc<dyn PhysicalExpr>)>>.
  • with_predicate() now splits the predicate into conjuncts and assigns stable FilterIds (indices).
  • SelectivityTracker is stored as a shared Arc on ParquetSource and passed to all openers.
  • with_table_parquet_options() now builds a fresh SelectivityTracker from the three new config values.
  • with_reorder_filters() and reorder_filters() methods are removed.
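
Splitting a predicate into conjuncts and numbering them might look like the sketch below. The toy Expr enum and the function names split_conjunction and assign_filter_ids are assumptions for illustration; the actual code operates on Arc<dyn PhysicalExpr>.

```rust
pub type FilterId = usize;

/// Toy expression tree: either an AND node or an opaque predicate.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    And(Box<Expr>, Box<Expr>),
    Pred(String),
}

/// Flatten nested ANDs into a left-to-right list of conjuncts.
pub fn split_conjunction(expr: &Expr, out: &mut Vec<Expr>) {
    match expr {
        Expr::And(left, right) => {
            split_conjunction(left, out);
            split_conjunction(right, out);
        }
        other => out.push(other.clone()),
    }
}

/// Assign each conjunct its index as a stable identifier.
pub fn assign_filter_ids(predicate: &Expr) -> Vec<(FilterId, Expr)> {
    let mut conjuncts = Vec::new();
    split_conjunction(predicate, &mut conjuncts);
    conjuncts.into_iter().enumerate().collect()
}
```

Because the ids are positions in the conjunct list, they stay stable for as long as the same predicate is installed on the source, which is what lets statistics be shared across files.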

7. Changes to build_row_filter() / row_filter.rs

  • Signature changed: takes Vec<(FilterId, Arc<dyn PhysicalExpr>)> + &Arc<SelectivityTracker> instead of &Arc<dyn PhysicalExpr> + reorder_predicates: bool.
  • Returns RowFilterWithMetrics (new struct) containing both the RowFilter and any unbuildable filters that must be applied post-scan.
  • DatafusionArrowPredicate now carries a FilterId and Arc<SelectivityTracker>, reporting per-batch evaluation metrics back to the tracker after each evaluate() call.
  • No reordering is done inside build_row_filter -- filters arrive pre-ordered by the tracker.
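
The reporting flow, where each predicate holds a FilterId plus a shared tracker handle and reports after every evaluation, can be sketched as follows. Tracker, Totals, and TrackedPredicate are hypothetical names, the Mutex<HashMap> layout is only one possible reading of "lock-guarded", and the generation check is a simplified take on the reset-on-snapshot_generation behavior described earlier.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

pub type FilterId = usize;

#[derive(Debug, Default, Clone, PartialEq)]
pub struct Totals {
    pub generation: u64,
    pub rows_in: u64,
    pub rows_matched: u64,
    pub batches: u64,
}

/// Lock-guarded statistics shared by all predicates (illustrative only).
#[derive(Default)]
pub struct Tracker {
    stats: Mutex<HashMap<FilterId, Totals>>,
}

impl Tracker {
    pub fn report(&self, id: FilterId, generation: u64, rows_in: u64, rows_matched: u64) {
        let mut stats = self.stats.lock().unwrap();
        let entry = stats.entry(id).or_default();
        if entry.generation != generation {
            // The predicate changed: old measurements no longer apply.
            *entry = Totals { generation, ..Totals::default() };
        }
        entry.rows_in += rows_in;
        entry.rows_matched += rows_matched;
        entry.batches += 1;
    }

    pub fn snapshot(&self, id: FilterId) -> Option<Totals> {
        self.stats.lock().unwrap().get(&id).cloned()
    }
}

/// A predicate that reports per-batch metrics after each evaluation.
pub struct TrackedPredicate {
    pub id: FilterId,
    pub generation: u64,
    pub threshold: i64,
    pub tracker: Arc<Tracker>,
}

impl TrackedPredicate {
    pub fn evaluate(&self, batch: &[i64]) -> Vec<bool> {
        let mask: Vec<bool> = batch.iter().map(|&v| v > self.threshold).collect();
        let matched = mask.iter().filter(|&&b| b).count() as u64;
        self.tracker.report(self.id, self.generation, batch.len() as u64, matched);
        mask
    }
}
```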

8. Changes to HashJoinExec

  • Dynamic join filters are now wrapped in OptionalFilterPhysicalExpr before being pushed down.
  • When receiving a pushed-down filter back, the join unwraps OptionalFilterPhysicalExpr to find the inner DynamicFilterPhysicalExpr.

9. Protobuf schema updates

  • reorder_filters field (tag 6) marked as reserved in datafusion_common.proto.
  • Three new optional fields added: filter_pushdown_min_bytes_per_sec (tag 35), filter_collecting_byte_ratio_threshold (tag 40), filter_confidence_z (tag 41).
  • Corresponding serialization/deserialization code updated in pbjson.rs, prost.rs, from_proto, to_proto, and file_formats.rs.

10. Test and benchmark updates

  • All references to reorder_filters removed from tests and benchmarks.
  • Existing filter pushdown tests set filter_pushdown_min_bytes_per_sec = 0.0 to preserve deterministic behavior (all filters always pushed down).
  • Snapshot test expectations updated from DynamicFilter [...] to Optional(DynamicFilter [...]).
  • New unit tests in selectivity.rs covering: effectiveness calculation, Welford's algorithm, confidence intervals, state machine transitions (initial placement, promotion, demotion, dropping), dynamic filter generation tracking, filter ordering, and integration lifecycle tests.
  • One expected output change in explain_analyze.rs (output_rows=8 -> output_rows=5) due to the adaptive system now placing some filters as post-scan that were previously row-level, causing slight row count differences in EXPLAIN ANALYZE output.

Are these changes tested?

Yes:

  • Existing tests: All existing pushdown_filters and filter pushdown SLT tests pass (with filter_pushdown_min_bytes_per_sec = 0.0 to force all filters to row-level for deterministic behavior).
  • New unit tests: Comprehensive tests in selectivity.rs (~450 lines of tests) covering the SelectivityStats calculator, TrackerConfig builder, state machine transitions (initial placement, promotion, demotion, dropping, reset on generation change), filter ordering, and full promotion/demotion lifecycle integration tests.
  • Updated snapshot tests: All physical optimizer filter pushdown snapshot tests updated to reflect the Optional(...) wrapper on dynamic filters.
  • Updated SLT tests: dynamic_filter_pushdown_config.slt, information_schema.slt, preserve_file_partitioning.slt, projection_pushdown.slt, push_down_filter.slt, and repartition_subset_satisfaction.slt updated.
  • Benchmark data included: benchmarks/results.txt shows TPC-H (13 faster, 6 slower, 3 unchanged), TPC-DS (33 faster, 31 slower, 35 unchanged, with notable 24x improvement on Q64), and ClickBench (18 faster, 12 slower, 13 unchanged) results.

Are there any user-facing changes?

Yes:

  1. reorder_filters config option removed. This is a breaking change. Users who run SET datafusion.execution.parquet.reorder_filters = true will get an error. The adaptive system replaces this functionality automatically.

  2. Three new config options added under datafusion.execution.parquet:

    • filter_pushdown_min_bytes_per_sec (default: 52428800)
    • filter_collecting_byte_ratio_threshold (default: 0.15)
    • filter_confidence_z (default: 2.0)
  3. Changed default behavior when pushdown_filters = true. Previously, all filters were unconditionally pushed into the Parquet reader. Now, the adaptive system decides per-filter based on byte-ratio thresholds and runtime effectiveness measurements. To restore the old behavior of pushing all filters unconditionally, set filter_pushdown_min_bytes_per_sec = 0.0.

  4. EXPLAIN plan output changes. Dynamic join filters now display as Optional(DynamicFilter [...]) instead of DynamicFilter [...], reflecting their new optional wrapper.

  5. Deprecated predicate() method signature changed. ParquetSource::predicate() now returns Option<Arc<dyn PhysicalExpr>> (owned) instead of Option<&Arc<dyn PhysicalExpr>> (reference). This method was already deprecated in favor of filter().

@github-actions bot added the documentation, physical-expr, core, sqllogictest, common, proto, datasource, and physical-plan labels Feb 15, 2026
@adriangb
Contributor Author

run benchmark tpcds
DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS=true
DATAFUSION_EXECUTION_PARQUET_REORDER_FILTERS=true

@adriangb
Contributor Author

run benchmark clickbench_partitioned
DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS=true
DATAFUSION_EXECUTION_PARQUET_REORDER_FILTERS=true

@alamb-ghbot

🤖 ./gh_compare_branch.sh gh_compare_branch.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing filter-pushdown-dynamic-bytes (dbab02b) to 53b0ffb diff using: clickbench_partitioned
Results will be posted here when complete

@adriangb
Contributor Author

run benchmark tpch
DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS=true
DATAFUSION_EXECUTION_PARQUET_REORDER_FILTERS=true

@alamb-ghbot

🤖: Benchmark completed

Comparing HEAD and filter-pushdown-dynamic-bytes
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃        HEAD ┃ filter-pushdown-dynamic-bytes ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0  │     2.52 ms │                       2.78 ms │  1.10x slower │
│ QQuery 1  │    52.24 ms │                      49.61 ms │ +1.05x faster │
│ QQuery 2  │   132.32 ms │                     136.95 ms │     no change │
│ QQuery 3  │   160.45 ms │                     159.43 ms │     no change │
│ QQuery 4  │  1006.07 ms │                    1001.72 ms │     no change │
│ QQuery 5  │  1264.63 ms │                    1240.53 ms │     no change │
│ QQuery 6  │    17.57 ms │                       6.46 ms │ +2.72x faster │
│ QQuery 7  │    68.54 ms │                      55.34 ms │ +1.24x faster │
│ QQuery 8  │  1432.88 ms │                    1357.09 ms │ +1.06x faster │
│ QQuery 9  │  1783.22 ms │                    1723.13 ms │     no change │
│ QQuery 10 │   469.85 ms │                     339.66 ms │ +1.38x faster │
│ QQuery 11 │   515.70 ms │                     392.95 ms │ +1.31x faster │
│ QQuery 12 │  1424.70 ms │                    1142.60 ms │ +1.25x faster │
│ QQuery 13 │  2086.66 ms │                    1758.57 ms │ +1.19x faster │
│ QQuery 14 │  1465.46 ms │                    1166.84 ms │ +1.26x faster │
│ QQuery 15 │  1205.75 ms │                    1142.22 ms │ +1.06x faster │
│ QQuery 16 │  2455.92 ms │                    2414.92 ms │     no change │
│ QQuery 17 │  2453.39 ms │                    2375.62 ms │     no change │
│ QQuery 18 │  4799.30 ms │                    4693.76 ms │     no change │
│ QQuery 19 │   141.25 ms │                     142.58 ms │     no change │
│ QQuery 20 │  1861.44 ms │                    1845.91 ms │     no change │
│ QQuery 21 │  2312.86 ms │                    2179.66 ms │ +1.06x faster │
│ QQuery 22 │  3956.56 ms │                    4079.52 ms │     no change │
│ QQuery 23 │  1067.61 ms │                    4758.55 ms │  4.46x slower │
│ QQuery 24 │   244.71 ms │                     185.57 ms │ +1.32x faster │
│ QQuery 25 │   635.55 ms │                     447.91 ms │ +1.42x faster │
│ QQuery 26 │   318.23 ms │                     204.37 ms │ +1.56x faster │
│ QQuery 27 │  2945.79 ms │                    2432.80 ms │ +1.21x faster │
│ QQuery 28 │ 23684.76 ms │                   23006.72 ms │     no change │
│ QQuery 29 │   948.88 ms │                     986.96 ms │     no change │
│ QQuery 30 │  1269.24 ms │                    1229.75 ms │     no change │
│ QQuery 31 │  1311.55 ms │                    1343.10 ms │     no change │
│ QQuery 32 │  4161.16 ms │                    3919.03 ms │ +1.06x faster │
│ QQuery 33 │  5017.59 ms │                    5183.95 ms │     no change │
│ QQuery 34 │  5507.17 ms │                    5238.45 ms │     no change │
│ QQuery 35 │  1862.72 ms │                    1797.80 ms │     no change │
│ QQuery 36 │   171.63 ms │                     185.39 ms │  1.08x slower │
│ QQuery 37 │    90.51 ms │                      72.77 ms │ +1.24x faster │
│ QQuery 38 │    85.76 ms │                     111.77 ms │  1.30x slower │
│ QQuery 39 │   286.54 ms │                     332.16 ms │  1.16x slower │
│ QQuery 40 │    56.28 ms │                      39.13 ms │ +1.44x faster │
│ QQuery 41 │    50.29 ms │                      35.28 ms │ +1.43x faster │
│ QQuery 42 │    36.46 ms │                      32.72 ms │ +1.11x faster │
└───────────┴─────────────┴───────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                            ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                            │ 80821.72ms │
│ Total Time (filter-pushdown-dynamic-bytes)   │ 80952.02ms │
│ Average Time (HEAD)                          │  1879.57ms │
│ Average Time (filter-pushdown-dynamic-bytes) │  1882.61ms │
│ Queries Faster                               │         20 │
│ Queries Slower                               │          5 │
│ Queries with No Change                       │         18 │
│ Queries with Failure                         │          0 │
└──────────────────────────────────────────────┴────────────┘

@alamb-ghbot

🤖 ./gh_compare_branch.sh gh_compare_branch.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing filter-pushdown-dynamic-bytes (dbab02b) to 53b0ffb diff using: tpcds
Results will be posted here when complete

@Dandandan
Contributor

show benchmark queue

@alamb-ghbot

🤖 Hi @Dandandan, you asked to view the benchmark queue (#20363 (comment)).

| Job | User | Benchmarks | Comment |
|---|---|---|---|
| 20363_3903261774.sh | adriangb | tpcds (env: DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS=true DATAFUSION_EXECUTION_PARQUET_REORDER_FILTERS=true) | https://github.com/apache/datafusion/pull/20363#issuecomment-3903261774 |
| 20363_3903262814.sh | adriangb | tpch (env: DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS=true DATAFUSION_EXECUTION_PARQUET_REORDER_FILTERS=true) | https://github.com/apache/datafusion/pull/20363#issuecomment-3903262814 |
| 20365_3903537986.sh | Dandandan | tpch tpcds (env: DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS=true DATAFUSION_EXECUTION_PARQUET_REORDER_FILTERS=true) | https://github.com/apache/datafusion/pull/20365#issuecomment-3903537986 |

@Dandandan
Contributor

show benchmark queue

@alamb-ghbot

🤖 Hi @Dandandan, you asked to view the benchmark queue (#20363 (comment)).

| Job | User | Benchmarks | Comment |
|---|---|---|---|
| 20363_3903261774.sh | adriangb | tpcds (env: DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS=true DATAFUSION_EXECUTION_PARQUET_REORDER_FILTERS=true) | https://github.com/apache/datafusion/pull/20363#issuecomment-3903261774 |
| 20363_3903262814.sh | adriangb | tpch (env: DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS=true DATAFUSION_EXECUTION_PARQUET_REORDER_FILTERS=true) | https://github.com/apache/datafusion/pull/20363#issuecomment-3903262814 |
| 20365_3903537986.sh | Dandandan | tpch tpcds (env: DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS=true DATAFUSION_EXECUTION_PARQUET_REORDER_FILTERS=true) | https://github.com/apache/datafusion/pull/20365#issuecomment-3903537986 |
| 20365_3903568877.sh | Dandandan | clickbench_partitioned (env: DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS=true DATAFUSION_EXECUTION_PARQUET_REORDER_FILTERS=true) | https://github.com/apache/datafusion/pull/20365#issuecomment-3903568877 |

@Dandandan
Contributor

Hm it seems stuck again

@Dandandan
Contributor

FYI @alamb

> Hm it seems stuck again

@adriangb
Contributor Author

@Dandandan this is mostly vibe coded, I'm only 50% confident it even makes sense without reviewing the code fwiw

@adriangb force-pushed the filter-pushdown-dynamic-bytes branch from e0240af to 09cdb0b, February 15, 2026 13:11
@adriangb
Contributor Author

show benchmark queue

@alamb-ghbot

🤖 Hi @adriangb, you asked to view the benchmark queue (#20363 (comment)).

| Job | User | Benchmarks | Comment |
|---|---|---|---|
| 20363_3903261774.sh | adriangb | tpcds (env: DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS=true DATAFUSION_EXECUTION_PARQUET_REORDER_FILTERS=true) | https://github.com/apache/datafusion/pull/20363#issuecomment-3903261774 |
| 20363_3903262814.sh | adriangb | tpch (env: DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS=true DATAFUSION_EXECUTION_PARQUET_REORDER_FILTERS=true) | https://github.com/apache/datafusion/pull/20363#issuecomment-3903262814 |
| 20365_3903537986.sh | Dandandan | tpch tpcds (env: DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS=true DATAFUSION_EXECUTION_PARQUET_REORDER_FILTERS=true) | https://github.com/apache/datafusion/pull/20365#issuecomment-3903537986 |
| 20365_3903568877.sh | Dandandan | clickbench_partitioned (env: DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS=true DATAFUSION_EXECUTION_PARQUET_REORDER_FILTERS=true) | https://github.com/apache/datafusion/pull/20365#issuecomment-3903568877 |

@adriangb
Contributor Author

Wonder if I'm infinite looping it or something :(

@Dandandan
Contributor

> Wonder if I'm infinite looping it or something :(

Yes I think previously it got stuck during infinite loops / extremely long running tasks.

@adriangb
Contributor Author

> > Wonder if I'm infinite looping it or something :(
>
> Yes I think previously it got stuck during infinite loops / extremely long running tasks.

My bad I’ll try to add a PR to have timeouts and a cancel command

@adriangb
Contributor Author

show benchmark queue

@alamb-ghbot

🤖 Hi @adriangb, you asked to view the benchmark queue (#20363 (comment)).

| Job | User | Benchmarks | Comment |
|---|---|---|---|
| 20363_3903261774.sh | adriangb | tpcds (env: DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS=true DATAFUSION_EXECUTION_PARQUET_REORDER_FILTERS=true) | https://github.com/apache/datafusion/pull/20363#issuecomment-3903261774 |
| 20363_3903262814.sh | adriangb | tpch (env: DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS=true DATAFUSION_EXECUTION_PARQUET_REORDER_FILTERS=true) | https://github.com/apache/datafusion/pull/20363#issuecomment-3903262814 |
| 20365_3903537986.sh | Dandandan | tpch tpcds (env: DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS=true DATAFUSION_EXECUTION_PARQUET_REORDER_FILTERS=true) | https://github.com/apache/datafusion/pull/20365#issuecomment-3903537986 |
| 20365_3903568877.sh | Dandandan | clickbench_partitioned (env: DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS=true DATAFUSION_EXECUTION_PARQUET_REORDER_FILTERS=true) | https://github.com/apache/datafusion/pull/20365#issuecomment-3903568877 |
| arrow-9414-3904516323.sh | Dandandan | arrow_reader_clickbench | https://github.com/apache/arrow-rs/pull/9414#issuecomment-3904516323 |

@alamb
Contributor

alamb commented Feb 15, 2026

run benchmark tpch

@Dandandan
Contributor

I think it is stuck again 😆

@Dandandan
Contributor

@alamb could you take a look? Somehow the result is also empty.

@adriangb
Contributor Author

Yeah, seems like it’s always tpcds? I don’t think it’s this branch necessarily, it got stuck on your branch earlier and this one has been pretty much completely rewritten since last time it got stuck here.

@Dandandan
Contributor

> Yeah, seems like it’s always tpcds? I don’t think it’s this branch necessarily, it got stuck on your branch earlier and this one has been pretty much completely rewritten since last time it got stuck here.

Hmmm could be...

@adriangb
Contributor Author

run benchmark clickbench_partitioned
DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS=true
DATAFUSION_EXECUTION_PARQUET_REORDER_FILTERS=true

@adriangb
Contributor Author

show benchmark queue

@alamb-ghbot

🤖 Hi @adriangb, you asked to view the benchmark queue (#20363 (comment)).

| Job | User | Benchmarks | Comment |
|---|---|---|---|
| 20534_3955110156.sh | alamb | sql_planner | https://github.com/apache/datafusion/pull/20534#issuecomment-3955110156 |
| 20481_3955122956.sh | alamb | default | https://github.com/apache/datafusion/pull/20481#issuecomment-3955122956 |
| 20481_3955149024.sh | Dandandan | clickbench_partitioned (env: DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS=true DATAFUSION_EXECUTION_PARQUET_REORDER_FILTERS=true) | https://github.com/apache/datafusion/pull/20481#issuecomment-3955149024 |
| 20363_3955225206.sh | adriangb | clickbench_partitioned (env: DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS=true DATAFUSION_EXECUTION_PARQUET_REORDER_FILTERS=true) | https://github.com/apache/datafusion/pull/20363#issuecomment-3955225206 |

@alamb-ghbot

🤖 ./gh_compare_branch.sh gh_compare_branch.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing filter-pushdown-dynamic-bytes (aaa75f2) to b9328b9 diff using: clickbench_partitioned
Results will be posted here when complete

@alamb-ghbot

🤖: Benchmark completed

Comparing HEAD and filter-pushdown-dynamic-bytes
--------------------
Benchmark clickbench_partitioned.json
--------------------

@adriangb
Contributor Author

run benchmark clickbench_partitioned
DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS=true
DATAFUSION_EXECUTION_PARQUET_REORDER_FILTERS=true

@adriangb
Contributor Author

show benchmark queue

@alamb-ghbot

🤖 Hi @adriangb, you asked to view the benchmark queue (#20363 (comment)).

| Job | User | Benchmarks | Comment |
|---|---|---|---|
| 19728_3959360119.sh | alamb | clickbench_partitioned | https://github.com/apache/datafusion/pull/19728#issuecomment-3959360119 |
| 19728_3959359739.sh | alamb | clickbench_partitioned (env: DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS=true DATAFUSION_EXECUTION_PARQUET_REORDER_FILTERS=true) | https://github.com/apache/datafusion/pull/19728#issuecomment-3959359739 |
| 20363_3959693326.sh | adriangb | clickbench_partitioned (env: DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS=true DATAFUSION_EXECUTION_PARQUET_REORDER_FILTERS=true) | https://github.com/apache/datafusion/pull/20363#issuecomment-3959693326 |
| 20481_3960310838.sh | Dandandan | clickbench_partitioned (env: DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS=true DATAFUSION_EXECUTION_PARQUET_REORDER_FILTERS=true) | https://github.com/apache/datafusion/pull/20481#issuecomment-3960310838 |

@alamb-ghbot

🤖 ./gh_compare_branch.sh gh_compare_branch.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing filter-pushdown-dynamic-bytes (aaa75f2) to b9328b9 diff using: clickbench_partitioned
Results will be posted here when complete

@alamb-ghbot

🤖: Benchmark completed

Comparing HEAD and filter-pushdown-dynamic-bytes
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃        HEAD ┃ filter-pushdown-dynamic-bytes ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0  │     2.53 ms │                       2.64 ms │     no change │
│ QQuery 1  │    52.35 ms │                      52.96 ms │     no change │
│ QQuery 2  │   131.00 ms │                     134.25 ms │     no change │
│ QQuery 3  │   154.14 ms │                     151.73 ms │     no change │
│ QQuery 4  │  1010.08 ms │                    1014.38 ms │     no change │
│ QQuery 5  │  1238.25 ms │                    1262.65 ms │     no change │
│ QQuery 6  │    17.37 ms │                       7.24 ms │ +2.40x faster │
│ QQuery 7  │    68.70 ms │                      68.96 ms │     no change │
│ QQuery 8  │  1358.20 ms │                    1408.47 ms │     no change │
│ QQuery 9  │  1734.13 ms │                    1763.40 ms │     no change │
│ QQuery 10 │   482.46 ms │                     487.49 ms │     no change │
│ QQuery 11 │   534.68 ms │                     546.47 ms │     no change │
│ QQuery 12 │  1376.64 ms │                    1400.04 ms │     no change │
│ QQuery 13 │  2090.56 ms │                    2049.19 ms │     no change │
│ QQuery 14 │  1392.07 ms │                    1405.61 ms │     no change │
│ QQuery 15 │  1153.69 ms │                    1188.94 ms │     no change │
│ QQuery 16 │  2420.32 ms │                    2468.34 ms │     no change │
│ QQuery 17 │  2439.14 ms │                    2475.68 ms │     no change │
│ QQuery 18 │  4696.48 ms │                    4827.59 ms │     no change │
│ QQuery 19 │   138.08 ms │                     140.65 ms │     no change │
│ QQuery 20 │  1770.09 ms │                    1777.28 ms │     no change │
│ QQuery 21 │  2183.88 ms │                    2157.75 ms │     no change │
│ QQuery 22 │  3747.64 ms │                    2942.05 ms │ +1.27x faster │
│ QQuery 23 │  1045.27 ms │                    1207.91 ms │  1.16x slower │
│ QQuery 24 │   241.92 ms │                     204.22 ms │ +1.18x faster │
│ QQuery 25 │   608.39 ms │                     628.69 ms │     no change │
│ QQuery 26 │   336.69 ms │                     238.10 ms │ +1.41x faster │
│ QQuery 27 │  2799.20 ms │                    2438.20 ms │ +1.15x faster │
│ QQuery 28 │ 21908.37 ms │                   23904.20 ms │  1.09x slower │
│ QQuery 29 │   998.99 ms │                     952.53 ms │     no change │
│ QQuery 30 │  1265.27 ms │                    1272.97 ms │     no change │
│ QQuery 31 │  1311.55 ms │                    1304.77 ms │     no change │
│ QQuery 32 │  3951.87 ms │                    4131.34 ms │     no change │
│ QQuery 33 │  4967.33 ms │                    5084.78 ms │     no change │
│ QQuery 34 │  5583.07 ms │                    5855.00 ms │     no change │
│ QQuery 35 │  1842.97 ms │                    1874.90 ms │     no change │
│ QQuery 36 │   181.72 ms │                     168.30 ms │ +1.08x faster │
│ QQuery 37 │    86.77 ms │                      86.91 ms │     no change │
│ QQuery 38 │    84.89 ms │                      94.07 ms │  1.11x slower │
│ QQuery 39 │   289.83 ms │                     291.41 ms │     no change │
│ QQuery 40 │    59.72 ms │                      54.82 ms │ +1.09x faster │
│ QQuery 41 │    51.56 ms │                      56.45 ms │  1.09x slower │
│ QQuery 42 │    38.89 ms │                      52.88 ms │  1.36x slower │
└───────────┴─────────────┴───────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                            ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                            │ 77846.73ms │
│ Total Time (filter-pushdown-dynamic-bytes)   │ 79636.21ms │
│ Average Time (HEAD)                          │  1810.39ms │
│ Average Time (filter-pushdown-dynamic-bytes) │  1852.00ms │
│ Queries Faster                               │          7 │
│ Queries Slower                               │          5 │
│ Queries with No Change                       │         31 │
│ Queries with Failure                         │          0 │
└──────────────────────────────────────────────┴────────────┘

@adriangb
Contributor Author

run benchmark clickbench_partitioned
DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS=true
DATAFUSION_EXECUTION_PARQUET_REORDER_FILTERS=true

@alamb-ghbot

🤖 ./gh_compare_branch.sh gh_compare_branch.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing filter-pushdown-dynamic-bytes (3a4511f) to b9328b9 diff using: clickbench_partitioned
Results will be posted here when complete

@alamb-ghbot

🤖: Benchmark completed

Comparing HEAD and filter-pushdown-dynamic-bytes
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃        HEAD ┃ filter-pushdown-dynamic-bytes ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0  │     2.56 ms │                       2.63 ms │     no change │
│ QQuery 1  │    52.46 ms │                      52.89 ms │     no change │
│ QQuery 2  │   131.74 ms │                     130.76 ms │     no change │
│ QQuery 3  │   157.86 ms │                     156.25 ms │     no change │
│ QQuery 4  │   995.97 ms │                    1031.06 ms │     no change │
│ QQuery 5  │  1270.00 ms │                    1264.02 ms │     no change │
│ QQuery 6  │    17.80 ms │                       9.34 ms │ +1.90x faster │
│ QQuery 7  │    68.58 ms │                      65.71 ms │     no change │
│ QQuery 8  │  1365.22 ms │                    1415.75 ms │     no change │
│ QQuery 9  │  1776.90 ms │                    1764.53 ms │     no change │
│ QQuery 10 │   479.01 ms │                     483.21 ms │     no change │
│ QQuery 11 │   528.89 ms │                     530.44 ms │     no change │
│ QQuery 12 │  1371.54 ms │                    1384.82 ms │     no change │
│ QQuery 13 │  2018.88 ms │                    2040.67 ms │     no change │
│ QQuery 14 │  1388.12 ms │                    1403.30 ms │     no change │
│ QQuery 15 │  1146.12 ms │                    1176.75 ms │     no change │
│ QQuery 16 │  2444.29 ms │                    2529.31 ms │     no change │
│ QQuery 17 │  2430.13 ms │                    2482.21 ms │     no change │
│ QQuery 18 │  5264.45 ms │                    4906.84 ms │ +1.07x faster │
│ QQuery 19 │   135.54 ms │                     137.71 ms │     no change │
│ QQuery 20 │  1740.32 ms │                    1750.65 ms │     no change │
│ QQuery 21 │  2208.20 ms │                    2144.18 ms │     no change │
│ QQuery 22 │  3728.19 ms │                    3088.91 ms │ +1.21x faster │
│ QQuery 23 │  1055.02 ms │                    1068.56 ms │     no change │
│ QQuery 24 │   243.01 ms │                     212.54 ms │ +1.14x faster │
│ QQuery 25 │   599.75 ms │                     608.94 ms │     no change │
│ QQuery 26 │   340.35 ms │                     227.70 ms │ +1.49x faster │
│ QQuery 27 │  2769.48 ms │                    2339.84 ms │ +1.18x faster │
│ QQuery 28 │ 22177.26 ms │                   23878.20 ms │  1.08x slower │
│ QQuery 29 │   967.67 ms │                     955.31 ms │     no change │
│ QQuery 30 │  1288.14 ms │                    1231.92 ms │     no change │
│ QQuery 31 │  1341.31 ms │                    1321.31 ms │     no change │
│ QQuery 32 │  4533.90 ms │                    4026.53 ms │ +1.13x faster │
│ QQuery 33 │  5366.53 ms │                    5044.04 ms │ +1.06x faster │
│ QQuery 34 │  5411.59 ms │                    5346.41 ms │     no change │
│ QQuery 35 │  1833.12 ms │                    1857.85 ms │     no change │
│ QQuery 36 │   177.99 ms │                     159.91 ms │ +1.11x faster │
│ QQuery 37 │    89.31 ms │                      72.87 ms │ +1.23x faster │
│ QQuery 38 │    87.03 ms │                      91.05 ms │     no change │
│ QQuery 39 │   278.76 ms │                     281.03 ms │     no change │
│ QQuery 40 │    56.86 ms │                      48.66 ms │ +1.17x faster │
│ QQuery 41 │    49.16 ms │                      33.22 ms │ +1.48x faster │
│ QQuery 42 │    35.87 ms │                      52.40 ms │  1.46x slower │
└───────────┴─────────────┴───────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                            ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                            │ 79424.85ms │
│ Total Time (filter-pushdown-dynamic-bytes)   │ 78810.20ms │
│ Average Time (HEAD)                          │  1847.09ms │
│ Average Time (filter-pushdown-dynamic-bytes) │  1832.80ms │
│ Queries Faster                               │         12 │
│ Queries Slower                               │          2 │
│ Queries with No Change                       │         29 │
│ Queries with Failure                         │          0 │
└──────────────────────────────────────────────┴────────────┘
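A note for reading the Change column: the labels above look consistent with dividing the slower time by the faster one and reporting anything within roughly 5% as "no change". The sketch below reproduces a few rows under that assumption. The function name `classify` and the 5% noise threshold are illustrative and inferred from the table; they are not taken from the benchmark script itself. Because the table shows rounded timings, a ratio recomputed from it can differ in the last digit.

```python
def classify(head_ms: float, branch_ms: float, noise: float = 0.05) -> str:
    """Label a HEAD-vs-branch timing pair in the style of the Change column.

    The 5% noise threshold is an assumption inferred from the rows above,
    not a value read from the benchmark tooling.
    """
    if branch_ms < head_ms:
        # Branch is quicker: report how many times faster it ran.
        ratio = head_ms / branch_ms
        label = f"+{ratio:.2f}x faster"
    else:
        # Branch is slower (or equal): report the slowdown factor.
        ratio = branch_ms / head_ms
        label = f"{ratio:.2f}x slower"
    # Differences below the noise threshold are not reported as a change.
    return "no change" if ratio - 1.0 < noise else label


# Rows copied from the clickbench_partitioned table above.
rows = {
    4: (995.97, 1031.06),
    22: (3728.19, 3088.91),
    28: (22177.26, 23878.20),
}
for query, (head_ms, branch_ms) in rows.items():
    print(f"QQuery {query}: {classify(head_ms, branch_ms)}")
```

Applied to the rows listed in `rows`, this yields the same labels as the table (no change, faster, slower respectively).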

@adriangb
Contributor Author

run benchmark clickbench_extended
DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS=true
DATAFUSION_EXECUTION_PARQUET_REORDER_FILTERS=true

@adriangb
Contributor Author

run benchmark tcph
DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS=true
DATAFUSION_EXECUTION_PARQUET_REORDER_FILTERS=true

@alamb-ghbot

🤖 ./gh_compare_branch.sh gh_compare_branch.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing filter-pushdown-dynamic-bytes (3a4511f) to b9328b9 diff using: clickbench_extended
Results will be posted here when complete

@alamb-ghbot

🤖 Hi @adriangb, thanks for the request (#20363 (comment)).

scrape_comments.py only supports whitelisted benchmarks.

  • Standard: clickbench_1, clickbench_extended, clickbench_partitioned, clickbench_pushdown, external_aggr, tpcds, tpch, tpch10, tpch_mem, tpch_mem10
  • Criterion: aggregate_query_sql, aggregate_vectorized, case_when, character_length, in_list, left, plan_reuse, range_and_generate_series, replace, reset_plan_states, sort, sql_planner, strpos, substr_index, with_hashes

Please choose one or more of these with run benchmark <name> or run benchmark <name1> <name2>...

You can also set environment variables on subsequent lines:

run benchmark tpch_mem
DATAFUSION_RUNTIME_MEMORY_LIMIT=1G

Unsupported benchmarks: tcph.

@adriangb
Contributor Author

run benchmark tpch
DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS=true
DATAFUSION_EXECUTION_PARQUET_REORDER_FILTERS=true

@alamb-ghbot

🤖: Benchmark completed

Details

Comparing HEAD and filter-pushdown-dynamic-bytes
--------------------
Benchmark clickbench_extended.json
--------------------
┏━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓
┃ Query    ┃        HEAD ┃ filter-pushdown-dynamic-bytes ┃         Change ┃
┡━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩
│ QQuery 0 │  2166.07 ms │                    2159.88 ms │      no change │
│ QQuery 1 │   870.73 ms │                     920.67 ms │   1.06x slower │
│ QQuery 2 │  1755.11 ms │                    1764.73 ms │      no change │
│ QQuery 3 │  1034.51 ms │                    1020.70 ms │      no change │
│ QQuery 4 │  2157.67 ms │                    2248.62 ms │      no change │
│ QQuery 5 │ 28430.69 ms │                   28133.92 ms │      no change │
│ QQuery 6 │   109.39 ms │                   11631.36 ms │ 106.33x slower │
│ QQuery 7 │  2637.42 ms │                    2858.60 ms │   1.08x slower │
└──────────┴─────────────┴───────────────────────────────┴────────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                            ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                            │ 39161.59ms │
│ Total Time (filter-pushdown-dynamic-bytes)   │ 50738.48ms │
│ Average Time (HEAD)                          │  4895.20ms │
│ Average Time (filter-pushdown-dynamic-bytes) │  6342.31ms │
│ Queries Faster                               │          0 │
│ Queries Slower                               │          3 │
│ Queries with No Change                       │          5 │
│ Queries with Failure                         │          0 │
└──────────────────────────────────────────────┴────────────┘
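The totals in this summary are dominated by a single query. As a rough check, the sketch below uses the timings copied from the clickbench_extended table above and computes what fraction of the overall slowdown comes from QQuery 6. The helper name `regression_share` is hypothetical and only serves this illustration.

```python
# Timings in milliseconds, copied from the clickbench_extended table above
# (index in the list == query number).
head = [2166.07, 870.73, 1755.11, 1034.51, 2157.67, 28430.69, 109.39, 2637.42]
branch = [2159.88, 920.67, 1764.73, 1020.70, 2248.62, 28133.92, 11631.36, 2858.60]


def regression_share(head, branch, query):
    """Fraction of the total slowdown attributable to one query index."""
    total_delta = sum(branch) - sum(head)
    return (branch[query] - head[query]) / total_delta


share_q6 = regression_share(head, branch, 6)
print(f"Total HEAD:   {sum(head):.2f} ms")
print(f"Total branch: {sum(branch):.2f} ms")
print(f"QQuery 6 share of total slowdown: {share_q6:.1%}")
```

With QQuery 6 set aside, the remaining seven queries sum to nearly the same total on both sides, so the gap in Total Time here reflects that one regression rather than a broad slowdown.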

@alamb-ghbot

🤖 ./gh_compare_branch.sh gh_compare_branch.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing filter-pushdown-dynamic-bytes (3a4511f) to b9328b9 diff using: tpch
Results will be posted here when complete

@alamb-ghbot

🤖: Benchmark completed

Details

Comparing HEAD and filter-pushdown-dynamic-bytes
--------------------
Benchmark tpch_sf1.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃      HEAD ┃ filter-pushdown-dynamic-bytes ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1  │ 184.85 ms │                     175.10 ms │ +1.06x faster │
│ QQuery 2  │  98.02 ms │                      91.78 ms │ +1.07x faster │
│ QQuery 3  │ 167.39 ms │                     129.22 ms │ +1.30x faster │
│ QQuery 4  │ 134.67 ms │                      83.88 ms │ +1.61x faster │
│ QQuery 5  │ 305.28 ms │                     301.44 ms │     no change │
│ QQuery 6  │ 192.92 ms │                      59.22 ms │ +3.26x faster │
│ QQuery 7  │ 274.67 ms │                     282.27 ms │     no change │
│ QQuery 8  │ 326.54 ms │                     327.62 ms │     no change │
│ QQuery 9  │ 408.51 ms │                     530.61 ms │  1.30x slower │
│ QQuery 10 │ 271.23 ms │                     310.94 ms │  1.15x slower │
│ QQuery 11 │  75.72 ms │                      65.50 ms │ +1.16x faster │
│ QQuery 12 │ 256.99 ms │                     162.81 ms │ +1.58x faster │
│ QQuery 13 │ 216.94 ms │                     226.47 ms │     no change │
│ QQuery 14 │ 113.60 ms │                     206.07 ms │  1.81x slower │
│ QQuery 15 │ 191.77 ms │                     131.22 ms │ +1.46x faster │
│ QQuery 16 │  73.43 ms │                      66.58 ms │ +1.10x faster │
│ QQuery 17 │ 223.77 ms │                     236.98 ms │  1.06x slower │
│ QQuery 18 │ 488.89 ms │                     498.84 ms │     no change │
│ QQuery 19 │ 155.51 ms │                     149.29 ms │     no change │
│ QQuery 20 │ 153.22 ms │                     220.46 ms │  1.44x slower │
│ QQuery 21 │ 350.53 ms │                     294.18 ms │ +1.19x faster │
│ QQuery 22 │  65.47 ms │                      59.63 ms │ +1.10x faster │
└───────────┴───────────┴───────────────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary                            ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (HEAD)                            │ 4729.92ms │
│ Total Time (filter-pushdown-dynamic-bytes)   │ 4610.09ms │
│ Average Time (HEAD)                          │  215.00ms │
│ Average Time (filter-pushdown-dynamic-bytes) │  209.55ms │
│ Queries Faster                               │        11 │
│ Queries Slower                               │         5 │
│ Queries with No Change                       │         6 │
│ Queries with Failure                         │         0 │
└──────────────────────────────────────────────┴───────────┘

@adriangb
Contributor Author

run benchmark clickbench_extended
DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS=true
DATAFUSION_EXECUTION_PARQUET_REORDER_FILTERS=true

@alamb-ghbot

🤖 ./gh_compare_branch.sh gh_compare_branch.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing filter-pushdown-dynamic-bytes (3a4511f) to b9328b9 diff using: clickbench_extended
Results will be posted here when complete

@alamb-ghbot

🤖: Benchmark completed

Details

Comparing HEAD and filter-pushdown-dynamic-bytes
--------------------
Benchmark clickbench_extended.json
--------------------
┏━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┓
┃ Query    ┃        HEAD ┃ filter-pushdown-dynamic-bytes ┃         Change ┃
┡━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━┩
│ QQuery 0 │  2239.73 ms │                    2226.99 ms │      no change │
│ QQuery 1 │   895.61 ms │                     884.02 ms │      no change │
│ QQuery 2 │  1698.72 ms │                    1721.93 ms │      no change │
│ QQuery 3 │  1007.64 ms │                    1061.55 ms │   1.05x slower │
│ QQuery 4 │  2246.31 ms │                    2249.39 ms │      no change │
│ QQuery 5 │ 28217.27 ms │                   28120.76 ms │      no change │
│ QQuery 6 │   109.15 ms │                   11511.62 ms │ 105.47x slower │
│ QQuery 7 │  2708.26 ms │                    2673.31 ms │      no change │
└──────────┴─────────────┴───────────────────────────────┴────────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                            ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                            │ 39122.69ms │
│ Total Time (filter-pushdown-dynamic-bytes)   │ 50449.58ms │
│ Average Time (HEAD)                          │  4890.34ms │
│ Average Time (filter-pushdown-dynamic-bytes) │  6306.20ms │
│ Queries Faster                               │          0 │
│ Queries Slower                               │          2 │
│ Queries with No Change                       │          6 │
│ Queries with Failure                         │          0 │
└──────────────────────────────────────────────┴────────────┘

Labels

common Related to common crate core Core DataFusion crate datasource Changes to the datasource crate documentation Improvements or additions to documentation physical-expr Changes to the physical-expr crates physical-plan Changes to the physical-plan crate proto Related to proto crate sqllogictest SQL Logic Tests (.slt)
