[C++][Parquet] Support read by row ranges #39392

@huberylee

Describe the enhancement requested

FileReader should support reading data by specified RowRanges, giving upper-level compute engines a basic filter-pushdown capability. The implementation consists of three parts:

  • A ChunkBufferedInputStream that is initialized with ReadRanges, performs simple IO merging internally, and hides the discontinuity of the underlying IO from the upper layer. ColumnReader can then read the filtered Pages as if they were contiguous, which saves IO.
  • The actual data reading based on RowRanges, including the interface definitions and the full read pipeline.
  • The necessary tests and benchmarks.
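To make the first two parts more concrete, here is a minimal sketch of the idea: pick the pages that overlap the requested row ranges, then merge the byte ranges of neighbouring pages into fewer reads. It is not the code from the proposed patch. `RowRange`, `PageInfo`, `ByteRange`, `SelectPages`, `CoalesceRanges` and the `hole_limit` parameter are hypothetical names used only for illustration, and the sketch assumes that per-page first-row indices and byte offsets are available (for example from a page index).

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical types for illustration only; not the Arrow/Parquet API.
struct RowRange { int64_t first; int64_t last; };  // inclusive row indices
struct PageInfo { int64_t first_row; int64_t offset; int64_t length; };
struct ByteRange { int64_t offset; int64_t length; };

// Returns the pages that overlap at least one requested row range.
// Assumes `pages` is sorted by first_row and `ranges` is sorted and
// non-overlapping; `num_rows` is the row count of the column chunk.
std::vector<PageInfo> SelectPages(const std::vector<PageInfo>& pages,
                                  int64_t num_rows,
                                  const std::vector<RowRange>& ranges) {
  std::vector<PageInfo> hit;
  std::size_t r = 0;
  for (std::size_t p = 0; p < pages.size() && r < ranges.size(); ++p) {
    const int64_t page_first = pages[p].first_row;
    const int64_t page_last =
        (p + 1 < pages.size() ? pages[p + 1].first_row : num_rows) - 1;
    // Skip row ranges that end before this page starts.
    while (r < ranges.size() && ranges[r].last < page_first) ++r;
    // The current range ends at or after page_first, so it overlaps
    // the page exactly when it starts at or before page_last.
    if (r < ranges.size() && ranges[r].first <= page_last) {
      hit.push_back(pages[p]);
    }
  }
  return hit;
}

// Merges byte ranges whose gap is at most `hole_limit` bytes, so that
// nearby pages are fetched with a single read.
std::vector<ByteRange> CoalesceRanges(std::vector<ByteRange> ranges,
                                      int64_t hole_limit) {
  std::sort(ranges.begin(), ranges.end(),
            [](const ByteRange& a, const ByteRange& b) {
              return a.offset < b.offset;
            });
  std::vector<ByteRange> merged;
  for (const ByteRange& r : ranges) {
    if (!merged.empty()) {
      ByteRange& back = merged.back();
      const int64_t back_end = back.offset + back.length;
      if (r.offset - back_end <= hole_limit) {
        // Extend the previous range to cover this one.
        back.length = std::max(back_end, r.offset + r.length) - back.offset;
        continue;
      }
    }
    merged.push_back(r);
  }
  return merged;
}
```

With four pages of 1000 rows each and row ranges [500, 1200] and [3500, 3600], `SelectPages` keeps pages 0, 1 and 3. If their byte ranges are then passed to `CoalesceRanges` with a `hole_limit` of 0, the two adjacent pages become one read and the last page stays separate. A larger `hole_limit` trades some wasted bytes for fewer IO calls.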

According to the benchmark results, page pruning is significantly faster than scanning the entire column chunk. When few rows match, single-column scans improve by 1 to 30 times and multi-column scans by 10 to 16 times. When many rows match, however, scan performance degrades gradually as the number of hit RowRanges grows, and can eventually regress relative to a full scan. Here are some benchmark results:

./build/relwithdebinfo/parquet-arrow-page-pruning-benchmark --benchmark_counters_tabular=true --benchmark_min_warmup_time=1 --benchmark_filter=BM_SingleColumn_NumPages
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Benchmark                                                                                                          Time             CPU   Iterations    HitRows  TotalPage items_per_second
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
BM_SingleColumn_NumPages_PagePruning<false, ::arrow::Int32Type>/hit_page(%):10/hit_rows:1/iterations:50        63511 ns        63500 ns           50          1          8        15.748k/s
BM_SingleColumn_NumPages_PagePruning<false, ::arrow::Int32Type>/hit_page(%):30/hit_rows:1/iterations:50       134182 ns       134180 ns           50          3          8        22.358k/s
BM_SingleColumn_NumPages_PagePruning<false, ::arrow::Int32Type>/hit_page(%):50/hit_rows:1/iterations:50       215335 ns       211120 ns           50          4          8       18.9466k/s
BM_SingleColumn_NumPages_PagePruning<false, ::arrow::Int32Type>/hit_page(%):70/hit_rows:1/iterations:50       496982 ns       432440 ns           50          6          8       13.8748k/s
BM_SingleColumn_NumPages_PagePruning<false, ::arrow::Int32Type>/hit_page(%):100/hit_rows:1/iterations:50      600318 ns       481400 ns           50          8          8       16.6182k/s
BM_SingleColumn_NumPages_PagePruningWithHitAll<false,::arrow::Int32Type>/iterations:50                       1754878 ns      1666260 ns           50         2M          8       1.20029G/s
BM_SingleColumn_NumPages_ReadRowGroup<false,::arrow::Int32Type>/iterations:50                                1328994 ns      1329000 ns           50         2M          8       1.50489G/s
BM_SingleColumn_NumPages_PagePruning<true, ::arrow::Int32Type>/hit_page(%):10/hit_rows:1/iterations:50        402561 ns       378580 ns           50          1          4       2.64145k/s
BM_SingleColumn_NumPages_PagePruning<true, ::arrow::Int32Type>/hit_page(%):30/hit_rows:1/iterations:50        389339 ns       389340 ns           50          2          4        5.1369k/s
BM_SingleColumn_NumPages_PagePruning<true, ::arrow::Int32Type>/hit_page(%):50/hit_rows:1/iterations:50       1056578 ns      1005800 ns           50          2          4       1.98847k/s
BM_SingleColumn_NumPages_PagePruning<true, ::arrow::Int32Type>/hit_page(%):70/hit_rows:1/iterations:50       1252089 ns      1021040 ns           50          3          4       2.93818k/s
BM_SingleColumn_NumPages_PagePruning<true, ::arrow::Int32Type>/hit_page(%):100/hit_rows:1/iterations:50      1146366 ns      1116160 ns           50          4          4       3.58372k/s
BM_SingleColumn_NumPages_PagePruningWithHitAll<true,::arrow::Int32Type>/iterations:50                        9214042 ns      8720260 ns           50         2M          4       229.351M/s
BM_SingleColumn_NumPages_ReadRowGroup<true,::arrow::Int32Type>/iterations:50                                 9003524 ns      8660100 ns           50         2M          4       230.944M/s
BM_SingleColumn_NumPages_PagePruning<false, ::arrow::Int64Type>/hit_page(%):10/hit_rows:1/iterations:50       240033 ns       168320 ns           50          2         16       11.8821k/s
BM_SingleColumn_NumPages_PagePruning<false, ::arrow::Int64Type>/hit_page(%):30/hit_rows:1/iterations:50       309177 ns       273640 ns           50          5         16       18.2722k/s
BM_SingleColumn_NumPages_PagePruning<false, ::arrow::Int64Type>/hit_page(%):50/hit_rows:1/iterations:50       748207 ns       656960 ns           50          8         16       12.1773k/s
BM_SingleColumn_NumPages_PagePruning<false, ::arrow::Int64Type>/hit_page(%):70/hit_rows:1/iterations:50       530234 ns       529200 ns           50         12         16       22.6757k/s
BM_SingleColumn_NumPages_PagePruning<false, ::arrow::Int64Type>/hit_page(%):100/hit_rows:1/iterations:50      795477 ns       689480 ns           50         16         16       23.2059k/s
BM_SingleColumn_NumPages_PagePruningWithHitAll<false,::arrow::Int64Type>/iterations:50                       2168220 ns      1999420 ns           50         2M         16       1000.29M/s
BM_SingleColumn_NumPages_ReadRowGroup<false,::arrow::Int64Type>/iterations:50                                2220696 ns      2015920 ns           50         2M         16       992.103M/s
BM_SingleColumn_NumPages_PagePruning<true, ::arrow::Int64Type>/hit_page(%):10/hit_rows:1/iterations:50        229509 ns       205660 ns           50          1          8       4.86239k/s
BM_SingleColumn_NumPages_PagePruning<true, ::arrow::Int64Type>/hit_page(%):30/hit_rows:1/iterations:50        844970 ns       826700 ns           50          3          8       3.62889k/s
BM_SingleColumn_NumPages_PagePruning<true, ::arrow::Int64Type>/hit_page(%):50/hit_rows:1/iterations:50        533323 ns       533320 ns           50          4          8       7.50019k/s
BM_SingleColumn_NumPages_PagePruning<true, ::arrow::Int64Type>/hit_page(%):70/hit_rows:1/iterations:50       1397721 ns      1262320 ns           50          6          8       4.75315k/s
BM_SingleColumn_NumPages_PagePruning<true, ::arrow::Int64Type>/hit_page(%):100/hit_rows:1/iterations:50      1734829 ns      1595120 ns           50          8          8        5.0153k/s
BM_SingleColumn_NumPages_PagePruningWithHitAll<true,::arrow::Int64Type>/iterations:50                        8717569 ns      8309240 ns           50         2M          8       240.696M/s
BM_SingleColumn_NumPages_ReadRowGroup<true,::arrow::Int64Type>/iterations:50                                 8686948 ns      8002060 ns           50         2M          8       249.936M/s
BM_SingleColumn_NumPages_PagePruning<false, ::arrow::FloatType>/hit_page(%):10/hit_rows:1/iterations:50        89482 ns        82020 ns           50          1          8       12.1921k/s
BM_SingleColumn_NumPages_PagePruning<false, ::arrow::FloatType>/hit_page(%):30/hit_rows:1/iterations:50       273510 ns       188620 ns           50          3          8        15.905k/s
BM_SingleColumn_NumPages_PagePruning<false, ::arrow::FloatType>/hit_page(%):50/hit_rows:1/iterations:50       406551 ns       226700 ns           50          4          8       17.6445k/s
BM_SingleColumn_NumPages_PagePruning<false, ::arrow::FloatType>/hit_page(%):70/hit_rows:1/iterations:50       373658 ns       339440 ns           50          6          8       17.6762k/s
BM_SingleColumn_NumPages_PagePruning<false, ::arrow::FloatType>/hit_page(%):100/hit_rows:1/iterations:50      542894 ns       496540 ns           50          8          8       16.1115k/s
BM_SingleColumn_NumPages_PagePruningWithHitAll<false,::arrow::FloatType>/iterations:50                       1985082 ns      1883040 ns           50         2M          8       1062.11M/s
BM_SingleColumn_NumPages_ReadRowGroup<false,::arrow::FloatType>/iterations:50                                1785856 ns      1625300 ns           50         2M          8       1.23054G/s
BM_SingleColumn_NumPages_PagePruning<true, ::arrow::FloatType>/hit_page(%):10/hit_rows:1/iterations:50        680770 ns       491100 ns           50          1          4       2.03625k/s
BM_SingleColumn_NumPages_PagePruning<true, ::arrow::FloatType>/hit_page(%):30/hit_rows:1/iterations:50       1005213 ns       999940 ns           50          2          4       2.00012k/s
BM_SingleColumn_NumPages_PagePruning<true, ::arrow::FloatType>/hit_page(%):50/hit_rows:1/iterations:50        702304 ns       646740 ns           50          2          4       3.09243k/s
BM_SingleColumn_NumPages_PagePruning<true, ::arrow::FloatType>/hit_page(%):70/hit_rows:1/iterations:50        740994 ns       547620 ns           50          3          4       5.47825k/s
BM_SingleColumn_NumPages_PagePruning<true, ::arrow::FloatType>/hit_page(%):100/hit_rows:1/iterations:50      1838819 ns      1731000 ns           50          4          4        2.3108k/s
BM_SingleColumn_NumPages_PagePruningWithHitAll<true,::arrow::FloatType>/iterations:50                        8214554 ns      8150740 ns           50         2M          4       245.376M/s
BM_SingleColumn_NumPages_ReadRowGroup<true,::arrow::FloatType>/iterations:50                                 9686748 ns      8991060 ns           50         2M          4       222.443M/s
BM_SingleColumn_NumPages_PagePruning<false, ::arrow::DoubleType>/hit_page(%):10/hit_rows:1/iterations:50      128584 ns       128600 ns           50          2         16       15.5521k/s
BM_SingleColumn_NumPages_PagePruning<false, ::arrow::DoubleType>/hit_page(%):30/hit_rows:1/iterations:50      336180 ns       305060 ns           50          5         16       16.3902k/s
BM_SingleColumn_NumPages_PagePruning<false, ::arrow::DoubleType>/hit_page(%):50/hit_rows:1/iterations:50      373698 ns       373680 ns           50          8         16       21.4087k/s
BM_SingleColumn_NumPages_PagePruning<false, ::arrow::DoubleType>/hit_page(%):70/hit_rows:1/iterations:50      590868 ns       545020 ns           50         12         16       22.0175k/s
BM_SingleColumn_NumPages_PagePruning<false, ::arrow::DoubleType>/hit_page(%):100/hit_rows:1/iterations:50     704253 ns       704260 ns           50         16         16       22.7189k/s
BM_SingleColumn_NumPages_PagePruningWithHitAll<false,::arrow::DoubleType>/iterations:50                      1677933 ns      1676200 ns           50         2M         16       1.19318G/s
BM_SingleColumn_NumPages_ReadRowGroup<false,::arrow::DoubleType>/iterations:50                               1903368 ns      1815540 ns           50         2M         16        1.1016G/s
BM_SingleColumn_NumPages_PagePruning<true, ::arrow::DoubleType>/hit_page(%):10/hit_rows:1/iterations:50       119272 ns       119240 ns           50          1          8       8.38645k/s
BM_SingleColumn_NumPages_PagePruning<true, ::arrow::DoubleType>/hit_page(%):30/hit_rows:1/iterations:50       438818 ns       437980 ns           50          3          8       6.84963k/s
BM_SingleColumn_NumPages_PagePruning<true, ::arrow::DoubleType>/hit_page(%):50/hit_rows:1/iterations:50       779543 ns       779540 ns           50          4          8       5.13123k/s
BM_SingleColumn_NumPages_PagePruning<true, ::arrow::DoubleType>/hit_page(%):70/hit_rows:1/iterations:50      1040031 ns      1040020 ns           50          6          8       5.76912k/s
BM_SingleColumn_NumPages_PagePruning<true, ::arrow::DoubleType>/hit_page(%):100/hit_rows:1/iterations:50     1367778 ns      1367420 ns           50          8          8       5.85043k/s
BM_SingleColumn_NumPages_PagePruningWithHitAll<true,::arrow::DoubleType>/iterations:50                       7576036 ns      7575580 ns           50         2M          8       264.006M/s
BM_SingleColumn_NumPages_ReadRowGroup<true,::arrow::DoubleType>/iterations:50                                7514859 ns      7496620 ns           50         2M          8       266.787M/s
BM_SingleColumn_NumPages_PagePruning<false, ::arrow::StringType>/hit_page(%):10/hit_rows:1/iterations:50      298405 ns       298400 ns           50          3         27       10.0536k/s
BM_SingleColumn_NumPages_PagePruning<false, ::arrow::StringType>/hit_page(%):30/hit_rows:1/iterations:50     1201950 ns      1201960 ns           50          9         27       7.48777k/s
BM_SingleColumn_NumPages_PagePruning<false, ::arrow::StringType>/hit_page(%):50/hit_rows:1/iterations:50     1182310 ns      1182300 ns           50         14         27       11.8413k/s
BM_SingleColumn_NumPages_PagePruning<false, ::arrow::StringType>/hit_page(%):70/hit_rows:1/iterations:50     2323919 ns      2217760 ns           50         19         27        8.5672k/s
BM_SingleColumn_NumPages_PagePruning<false, ::arrow::StringType>/hit_page(%):100/hit_rows:1/iterations:50    2412017 ns      2412000 ns           50         27         27        11.194k/s
BM_SingleColumn_NumPages_PagePruningWithHitAll<false,::arrow::StringType>/iterations:50                     23347672 ns     23340280 ns           50         2M         27       85.6888M/s
BM_SingleColumn_NumPages_ReadRowGroup<false,::arrow::StringType>/iterations:50                              22411320 ns     22402560 ns           50         2M         27       89.2755M/s
BM_SingleColumn_NumPages_PagePruning<true, ::arrow::StringType>/hit_page(%):10/hit_rows:1/iterations:50       282815 ns       282800 ns           50          2         14       7.07214k/s
BM_SingleColumn_NumPages_PagePruning<true, ::arrow::StringType>/hit_page(%):30/hit_rows:1/iterations:50      1002259 ns      1002220 ns           50          5         14       4.98892k/s
BM_SingleColumn_NumPages_PagePruning<true, ::arrow::StringType>/hit_page(%):50/hit_rows:1/iterations:50      1343057 ns      1342980 ns           50          7         14       5.21229k/s
BM_SingleColumn_NumPages_PagePruning<true, ::arrow::StringType>/hit_page(%):70/hit_rows:1/iterations:50      1687520 ns      1687520 ns           50         10         14       5.92586k/s
BM_SingleColumn_NumPages_PagePruning<true, ::arrow::StringType>/hit_page(%):100/hit_rows:1/iterations:50     1936757 ns      1930300 ns           50         14         14       7.25276k/s
BM_SingleColumn_NumPages_PagePruningWithHitAll<true,::arrow::StringType>/iterations:50                      16042260 ns     16013700 ns           50         2M         14       124.893M/s
BM_SingleColumn_NumPages_ReadRowGroup<true,::arrow::StringType>/iterations:50                               15811481 ns     15811240 ns           50         2M         14       126.492M/s


./build/relwithdebinfo/parquet-arrow-page-pruning-benchmark --benchmark_counters_tabular=true --benchmark_min_warmup_time=1 --benchmark_filter=BM_MultipleColumns
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Benchmark                                                                        Time             CPU   Iterations    HitRows MaxPageNum MinPageNum  TotalPage items_per_second
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
BM_MultipleColumns_PagePruning/hit_page(%):5/hit_rows:1/iterations:50      6179859 ns      5986020 ns           50          2         31          1        283        334.112/s
BM_MultipleColumns_PagePruning/hit_page(%):10/hit_rows:1/iterations:50    11490077 ns     11259660 ns           50          4         31          1        283        355.251/s
BM_MultipleColumns_PagePruning/hit_page(%):30/hit_rows:1/iterations:50    17940528 ns     17802260 ns           50         10         31          1        283        561.726/s
BM_MultipleColumns_PagePruning/hit_page(%):50/hit_rows:1/iterations:50    25005303 ns     24890520 ns           50         16         31          1        283        642.815/s
BM_MultipleColumns_PagePruning/hit_page(%):70/hit_rows:1/iterations:50    30059767 ns     29288740 ns           50         22         31          1        283        751.142/s
BM_MultipleColumns_PagePruning/hit_page(%):100/hit_rows:1/iterations:50   32490276 ns     32479600 ns           50         31         31          1        283        954.445/s
BM_MultipleColumns_ReadRowGroup/iterations:50                            370906187 ns    370519020 ns           50         2M         31          1        283       5.39783M/s

./build/relwithdebinfo/parquet-arrow-page-pruning-benchmark --benchmark_counters_tabular=true --benchmark_min_warmup_time=1 --benchmark_filter=BM_SingleColumn_NumRanges
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Benchmark                                                                                                                   Time             CPU   Iterations    HitRows  TotalPage items_per_second
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
BM_SingleColumn_NumRanges_PagePruning<false, ::arrow::Int32Type>/hit_page(%):100/hit_ranges:500/iterations:50         1487395 ns      1405880 ns           50       999k          8       710.587M/s
BM_SingleColumn_NumRanges_PagePruning<false, ::arrow::Int32Type>/hit_page(%):100/hit_ranges:1000/iterations:50        1795974 ns      1580800 ns           50       999k          8       631.959M/s
BM_SingleColumn_NumRanges_PagePruning<false, ::arrow::Int32Type>/hit_page(%):100/hit_ranges:10000/iterations:50       5383828 ns      5271540 ns           50       990k          8       187.801M/s
BM_SingleColumn_NumRanges_PagePruning<false, ::arrow::Int32Type>/hit_page(%):100/hit_ranges:100000/iterations:50     44106347 ns     42609500 ns           50   782.496k          8       18.3644M/s
BM_SingleColumn_NumRanges_PagePruning<false, ::arrow::Int32Type>/hit_page(%):100/hit_ranges:1000000/iterations:50    55134386 ns     53774440 ns           50      1000k          8       18.5962M/s
BM_SingleColumn_NumRanges_ReadRowGroup<false,::arrow::Int32Type>/iterations:50                                        1557364 ns      1541500 ns           50         2M          8       1.29744G/s
BM_SingleColumn_NumRanges_PagePruning<true, ::arrow::Int32Type>/hit_page(%):100/hit_ranges:500/iterations:50          6483447 ns      6427080 ns           50     999.5k          4       155.514M/s
BM_SingleColumn_NumRanges_PagePruning<true, ::arrow::Int32Type>/hit_page(%):100/hit_ranges:1000/iterations:50         7577990 ns      7136140 ns           50       999k          4       139.992M/s
BM_SingleColumn_NumRanges_PagePruning<true, ::arrow::Int32Type>/hit_page(%):100/hit_ranges:10000/iterations:50       14655075 ns     14269100 ns           50       990k          4       69.3807M/s
BM_SingleColumn_NumRanges_PagePruning<true, ::arrow::Int32Type>/hit_page(%):100/hit_ranges:100000/iterations:50      52135581 ns     50977440 ns           50       800k          4       15.6932M/s
BM_SingleColumn_NumRanges_PagePruning<true, ::arrow::Int32Type>/hit_page(%):100/hit_ranges:1000000/iterations:50     96627734 ns     94002220 ns           50      1000k          4        10.638M/s
BM_SingleColumn_NumRanges_ReadRowGroup<true,::arrow::Int32Type>/iterations:50                                         9284147 ns      9060400 ns           50         2M          4       220.741M/s
BM_SingleColumn_NumRanges_PagePruning<false, ::arrow::Int64Type>/hit_page(%):100/hit_ranges:500/iterations:50         2780413 ns      2699520 ns           50       999k         16       370.066M/s
BM_SingleColumn_NumRanges_PagePruning<false, ::arrow::Int64Type>/hit_page(%):100/hit_ranges:1000/iterations:50        3117969 ns      3103660 ns           50       991k         16         319.3M/s
BM_SingleColumn_NumRanges_PagePruning<false, ::arrow::Int64Type>/hit_page(%):100/hit_ranges:10000/iterations:50      10857759 ns     10618860 ns           50       910k         16       85.6966M/s
BM_SingleColumn_NumRanges_PagePruning<false, ::arrow::Int64Type>/hit_page(%):100/hit_ranges:100000/iterations:50     52281539 ns     51806180 ns           50      1000k         16       19.3027M/s
BM_SingleColumn_NumRanges_PagePruning<false, ::arrow::Int64Type>/hit_page(%):100/hit_ranges:1000000/iterations:50    53256162 ns     52542200 ns           50      1000k         16       19.0323M/s
BM_SingleColumn_NumRanges_ReadRowGroup<false,::arrow::Int64Type>/iterations:50                                        3036522 ns      3020580 ns           50         2M         16       662.124M/s
BM_SingleColumn_NumRanges_PagePruning<true, ::arrow::Int64Type>/hit_page(%):100/hit_ranges:500/iterations:50          7736525 ns      7564300 ns           50       999k          8       132.068M/s
BM_SingleColumn_NumRanges_PagePruning<true, ::arrow::Int64Type>/hit_page(%):100/hit_ranges:1000/iterations:50         9378083 ns      8917440 ns           50       999k          8       112.028M/s
BM_SingleColumn_NumRanges_PagePruning<true, ::arrow::Int64Type>/hit_page(%):100/hit_ranges:10000/iterations:50       19232368 ns     18976780 ns           50       990k          8        52.169M/s
BM_SingleColumn_NumRanges_PagePruning<true, ::arrow::Int64Type>/hit_page(%):100/hit_ranges:100000/iterations:50      70984532 ns     70568600 ns           50   782.496k          8       11.0884M/s
BM_SingleColumn_NumRanges_PagePruning<true, ::arrow::Int64Type>/hit_page(%):100/hit_ranges:1000000/iterations:50     88546468 ns     88347420 ns           50      1000k          8       11.3189M/s
BM_SingleColumn_NumRanges_ReadRowGroup<true,::arrow::Int64Type>/iterations:50                                         8904477 ns      8872560 ns           50         2M          8       225.414M/s
BM_SingleColumn_NumRanges_PagePruning<false, ::arrow::FloatType>/hit_page(%):100/hit_ranges:500/iterations:50         1342717 ns      1340940 ns           50       999k          8           745M/s
BM_SingleColumn_NumRanges_PagePruning<false, ::arrow::FloatType>/hit_page(%):100/hit_ranges:1000/iterations:50        1480489 ns      1480480 ns           50       999k          8       674.781M/s
BM_SingleColumn_NumRanges_PagePruning<false, ::arrow::FloatType>/hit_page(%):100/hit_ranges:10000/iterations:50       5134953 ns      5134960 ns           50       990k          8       192.796M/s
BM_SingleColumn_NumRanges_PagePruning<false, ::arrow::FloatType>/hit_page(%):100/hit_ranges:100000/iterations:50     41475635 ns     41351500 ns           50   782.496k          8        18.923M/s
BM_SingleColumn_NumRanges_PagePruning<false, ::arrow::FloatType>/hit_page(%):100/hit_ranges:1000000/iterations:50    49940678 ns     49919900 ns           50      1000k          8       20.0321M/s
BM_SingleColumn_NumRanges_ReadRowGroup<false,::arrow::FloatType>/iterations:50                                        1466148 ns      1466140 ns           50         2M          8       1.36413G/s
BM_SingleColumn_NumRanges_PagePruning<true, ::arrow::FloatType>/hit_page(%):100/hit_ranges:500/iterations:50          5810324 ns      5810360 ns           50     999.5k          4        172.02M/s
BM_SingleColumn_NumRanges_PagePruning<true, ::arrow::FloatType>/hit_page(%):100/hit_ranges:1000/iterations:50         6257747 ns      6257740 ns           50       999k          4       159.642M/s
BM_SingleColumn_NumRanges_PagePruning<true, ::arrow::FloatType>/hit_page(%):100/hit_ranges:10000/iterations:50       13449354 ns     13445280 ns           50       990k          4       73.6318M/s
BM_SingleColumn_NumRanges_PagePruning<true, ::arrow::FloatType>/hit_page(%):100/hit_ranges:100000/iterations:50      44484917 ns     44484920 ns           50       800k          4       17.9836M/s
BM_SingleColumn_NumRanges_PagePruning<true, ::arrow::FloatType>/hit_page(%):100/hit_ranges:1000000/iterations:50     87668032 ns     87665320 ns           50      1000k          4        11.407M/s
BM_SingleColumn_NumRanges_ReadRowGroup<true,::arrow::FloatType>/iterations:50                                         8244576 ns      8243860 ns           50         2M          4       242.605M/s
BM_SingleColumn_NumRanges_PagePruning<false, ::arrow::DoubleType>/hit_page(%):100/hit_ranges:500/iterations:50        2941689 ns      2926600 ns           50       999k         16       341.352M/s
BM_SingleColumn_NumRanges_PagePruning<false, ::arrow::DoubleType>/hit_page(%):100/hit_ranges:1000/iterations:50       2758582 ns      2758600 ns           50       991k         16        359.24M/s
BM_SingleColumn_NumRanges_PagePruning<false, ::arrow::DoubleType>/hit_page(%):100/hit_ranges:10000/iterations:50     10177306 ns     10174800 ns           50       910k         16       89.4366M/s
BM_SingleColumn_NumRanges_PagePruning<false, ::arrow::DoubleType>/hit_page(%):100/hit_ranges:100000/iterations:50    49739064 ns     49700420 ns           50      1000k         16       20.1206M/s
BM_SingleColumn_NumRanges_PagePruning<false, ::arrow::DoubleType>/hit_page(%):100/hit_ranges:1000000/iterations:50   52075806 ns     50505760 ns           50      1000k         16       19.7997M/s
BM_SingleColumn_NumRanges_ReadRowGroup<false,::arrow::DoubleType>/iterations:50                                       2653357 ns      2653340 ns           50         2M         16       753.767M/s
BM_SingleColumn_NumRanges_PagePruning<true, ::arrow::DoubleType>/hit_page(%):100/hit_ranges:500/iterations:50         6847587 ns      6795380 ns           50       999k          8       147.012M/s
BM_SingleColumn_NumRanges_PagePruning<true, ::arrow::DoubleType>/hit_page(%):100/hit_ranges:1000/iterations:50        8701575 ns      8615520 ns           50       999k          8       115.954M/s
BM_SingleColumn_NumRanges_PagePruning<true, ::arrow::DoubleType>/hit_page(%):100/hit_ranges:10000/iterations:50      18250828 ns     18235140 ns           50       990k          8       54.2908M/s
BM_SingleColumn_NumRanges_PagePruning<true, ::arrow::DoubleType>/hit_page(%):100/hit_ranges:100000/iterations:50     72778973 ns     70897040 ns           50   782.496k          8       11.0371M/s
BM_SingleColumn_NumRanges_PagePruning<true, ::arrow::DoubleType>/hit_page(%):100/hit_ranges:1000000/iterations:50    89796725 ns     88411400 ns           50      1000k          8       11.3108M/s
BM_SingleColumn_NumRanges_ReadRowGroup<true,::arrow::DoubleType>/iterations:50                                        8722744 ns      8717700 ns           50         2M          8       229.418M/s
BM_SingleColumn_NumRanges_PagePruning<false, ::arrow::StringType>/hit_page(%):100/hit_ranges:500/iterations:50       14747333 ns     14737380 ns           50     989.5k         27       67.1422M/s
BM_SingleColumn_NumRanges_PagePruning<false, ::arrow::StringType>/hit_page(%):100/hit_ranges:1000/iterations:50      15420189 ns     15417820 ns           50       976k         27       63.3034M/s
BM_SingleColumn_NumRanges_PagePruning<false, ::arrow::StringType>/hit_page(%):100/hit_ranges:10000/iterations:50     26830887 ns     26824880 ns           50       790k         27       29.4503M/s
BM_SingleColumn_NumRanges_PagePruning<false, ::arrow::StringType>/hit_page(%):100/hit_ranges:100000/iterations:50    74758644 ns     74637000 ns           50      1000k         27       13.3982M/s
BM_SingleColumn_NumRanges_PagePruning<false, ::arrow::StringType>/hit_page(%):100/hit_ranges:1000000/iterations:50   74697486 ns     74693320 ns           50      1000k         27       13.3881M/s
BM_SingleColumn_NumRanges_ReadRowGroup<false,::arrow::StringType>/iterations:50                                      24560657 ns     24549260 ns           50         2M         27       81.4689M/s
BM_SingleColumn_NumRanges_PagePruning<true, ::arrow::StringType>/hit_page(%):100/hit_ranges:500/iterations:50        29453166 ns     29451740 ns           50     996.5k         14        33.835M/s
BM_SingleColumn_NumRanges_PagePruning<true, ::arrow::StringType>/hit_page(%):100/hit_ranges:1000/iterations:50       47641184 ns     47614640 ns           50       996k         14       20.9179M/s
BM_SingleColumn_NumRanges_PagePruning<true, ::arrow::StringType>/hit_page(%):100/hit_ranges:10000/iterations:50     325770038 ns    325471820 ns           50       930k         14       2.85739M/s
BM_SingleColumn_NumRanges_PagePruning<true, ::arrow::StringType>/hit_page(%):100/hit_ranges:100000/iterations:50   2429500345 ns   2426462860 ns           50      1000k         14       412.123k/s
BM_SingleColumn_NumRanges_PagePruning<true, ::arrow::StringType>/hit_page(%):100/hit_ranges:1000000/iterations:50  5.1448e+11 ns   3944426420 ns           50      1000k         14       253.522k/s
BM_SingleColumn_NumRanges_ReadRowGroup<true,::arrow::StringType>/iterations:50                                       16679447 ns     16597600 ns           50         2M         14       120.499M/s

I would appreciate the community's help in reviewing this and deciding whether it can be merged upstream. If needed, I can split the pull request into several smaller ones. Thank you!

Component(s)

C++, Parquet
