Upgrade DataFusion to arrow-rs/parquet 58.0.0 / object_store 0.13.0#19728

Open
alamb wants to merge 60 commits into apache:main from alamb:alamb/update_arrow_58

Conversation

@alamb
Contributor

@alamb alamb commented Jan 10, 2026

Which issue does this PR close?

Outstanding issues

Rationale for this change

Keep datafusion up to date (and test Arrow using DataFusion tests)

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

@github-actions bot added labels on Jan 10, 2026: sqllogictest (SQL Logic Tests (.slt)), common (Related to common crate), proto (Related to proto crate), functions (Changes to functions implementation), datasource (Changes to the datasource crate), physical-plan (Changes to the physical-plan crate)
| alltypes_plain.parquet | 1851 | 8882 | 2 | page_index=false |
| alltypes_tiny_pages.parquet | 454233 | 269266 | 2 | page_index=true |
| lz4_raw_compressed_larger.parquet | 380836 | 1347 | 2 | page_index=false |
| alltypes_tiny_pages.parquet | 454233 | 269074 | 2 | page_index=true |
Contributor Author

I think this reduction in metadata size is a direct consequence of @WaterWhisperer's PR to improve PageEncoding representation

@Dandandan
Contributor

Run benchmarks

@alamb-ghbot

🤖 ./gh_compare_branch.sh gh_compare_branch.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing alamb/update_arrow_58 (35b97fa) to b9a3b9f diff using: tpch_mem clickbench_partitioned clickbench_extended
Results will be posted here when complete

@alamb-ghbot

🤖: Benchmark completed

Details

Comparing HEAD and alamb_update_arrow_58
--------------------
Benchmark clickbench_extended.json
--------------------
┏━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Query    ┃        HEAD ┃ alamb_update_arrow_58 ┃    Change ┃
┡━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ QQuery 0 │  2354.54 ms │            2406.77 ms │ no change │
│ QQuery 1 │   939.82 ms │             929.76 ms │ no change │
│ QQuery 2 │  1882.93 ms │            1891.80 ms │ no change │
│ QQuery 3 │  1163.47 ms │            1165.13 ms │ no change │
│ QQuery 4 │  2306.07 ms │            2241.55 ms │ no change │
│ QQuery 5 │ 28452.06 ms │           28080.09 ms │ no change │
│ QQuery 6 │  4018.71 ms │            4055.12 ms │ no change │
│ QQuery 7 │  3790.29 ms │            3670.14 ms │ no change │
└──────────┴─────────────┴───────────────────────┴───────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                    ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                    │ 44907.89ms │
│ Total Time (alamb_update_arrow_58)   │ 44440.36ms │
│ Average Time (HEAD)                  │  5613.49ms │
│ Average Time (alamb_update_arrow_58) │  5555.05ms │
│ Queries Faster                       │          0 │
│ Queries Slower                       │          0 │
│ Queries with No Change               │          8 │
│ Queries with Failure                 │          0 │
└──────────────────────────────────────┴────────────┘
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃        HEAD ┃ alamb_update_arrow_58 ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0  │     1.86 ms │               1.94 ms │     no change │
│ QQuery 1  │    50.56 ms │              50.42 ms │     no change │
│ QQuery 2  │   134.60 ms │             134.00 ms │     no change │
│ QQuery 3  │   156.94 ms │             152.15 ms │     no change │
│ QQuery 4  │  1074.92 ms │            1227.05 ms │  1.14x slower │
│ QQuery 5  │  1349.11 ms │            1512.35 ms │  1.12x slower │
│ QQuery 6  │     1.82 ms │               1.86 ms │     no change │
│ QQuery 7  │    56.21 ms │              53.80 ms │     no change │
│ QQuery 8  │  1404.44 ms │            1546.63 ms │  1.10x slower │
│ QQuery 9  │  1744.32 ms │            1884.58 ms │  1.08x slower │
│ QQuery 10 │   344.43 ms │             351.00 ms │     no change │
│ QQuery 11 │   393.88 ms │             399.97 ms │     no change │
│ QQuery 12 │  1259.31 ms │            1442.13 ms │  1.15x slower │
│ QQuery 13 │  1916.02 ms │            2053.79 ms │  1.07x slower │
│ QQuery 14 │  1228.80 ms │            1371.03 ms │  1.12x slower │
│ QQuery 15 │  1223.04 ms │            1359.64 ms │  1.11x slower │
│ QQuery 16 │  2568.94 ms │            2675.51 ms │     no change │
│ QQuery 17 │  2521.47 ms │            2647.03 ms │     no change │
│ QQuery 18 │  6120.19 ms │            5000.68 ms │ +1.22x faster │
│ QQuery 19 │   120.13 ms │             119.44 ms │     no change │
│ QQuery 20 │  1948.33 ms │            1896.06 ms │     no change │
│ QQuery 21 │  2245.14 ms │            2178.01 ms │     no change │
│ QQuery 22 │  3847.87 ms │            3802.23 ms │     no change │
│ QQuery 23 │ 21572.08 ms │           12373.92 ms │ +1.74x faster │
│ QQuery 24 │   225.72 ms │             216.64 ms │     no change │
│ QQuery 25 │   471.41 ms │             478.99 ms │     no change │
│ QQuery 26 │   218.83 ms │             235.13 ms │  1.07x slower │
│ QQuery 27 │  2823.98 ms │            2691.15 ms │     no change │
│ QQuery 28 │ 23860.88 ms │           23263.75 ms │     no change │
│ QQuery 29 │   982.83 ms │             966.29 ms │     no change │
│ QQuery 30 │  1343.19 ms │            1378.01 ms │     no change │
│ QQuery 31 │  1521.78 ms │            1379.20 ms │ +1.10x faster │
│ QQuery 32 │  5104.40 ms │            4890.24 ms │     no change │
│ QQuery 33 │  5837.64 ms │            5506.92 ms │ +1.06x faster │
│ QQuery 34 │  5933.39 ms │            5605.95 ms │ +1.06x faster │
│ QQuery 35 │  1921.56 ms │            1960.63 ms │     no change │
│ QQuery 36 │    65.96 ms │              68.18 ms │     no change │
│ QQuery 37 │    45.85 ms │              45.16 ms │     no change │
│ QQuery 38 │    66.53 ms │              65.89 ms │     no change │
│ QQuery 39 │   102.71 ms │             106.18 ms │     no change │
│ QQuery 40 │    28.23 ms │              25.52 ms │ +1.11x faster │
│ QQuery 41 │    24.10 ms │              23.51 ms │     no change │
│ QQuery 42 │    19.09 ms │              18.80 ms │     no change │
└───────────┴─────────────┴───────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┓
┃ Benchmark Summary                    ┃             ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━┩
│ Total Time (HEAD)                    │ 103882.45ms │
│ Total Time (alamb_update_arrow_58)   │  93161.37ms │
│ Average Time (HEAD)                  │   2415.87ms │
│ Average Time (alamb_update_arrow_58) │   2166.54ms │
│ Queries Faster                       │           6 │
│ Queries Slower                       │           9 │
│ Queries with No Change               │          28 │
│ Queries with Failure                 │           0 │
└──────────────────────────────────────┴─────────────┘
--------------------
Benchmark tpch_mem_sf1.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃      HEAD ┃ alamb_update_arrow_58 ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1  │ 119.70 ms │             118.08 ms │     no change │
│ QQuery 2  │  35.82 ms │              36.60 ms │     no change │
│ QQuery 3  │  40.70 ms │              36.99 ms │ +1.10x faster │
│ QQuery 4  │  30.37 ms │              30.72 ms │     no change │
│ QQuery 5  │  91.59 ms │              90.47 ms │     no change │
│ QQuery 6  │  21.09 ms │              20.76 ms │     no change │
│ QQuery 7  │ 222.34 ms │             225.43 ms │     no change │
│ QQuery 8  │  44.85 ms │              40.29 ms │ +1.11x faster │
│ QQuery 9  │ 122.80 ms │             104.76 ms │ +1.17x faster │
│ QQuery 10 │  81.06 ms │              66.33 ms │ +1.22x faster │
│ QQuery 11 │  23.80 ms │              24.89 ms │     no change │
│ QQuery 12 │  52.22 ms │              52.78 ms │     no change │
│ QQuery 13 │  48.94 ms │              49.75 ms │     no change │
│ QQuery 14 │  14.95 ms │              14.96 ms │     no change │
│ QQuery 15 │  30.50 ms │              30.48 ms │     no change │
│ QQuery 16 │  28.47 ms │              29.06 ms │     no change │
│ QQuery 17 │ 151.61 ms │             154.03 ms │     no change │
│ QQuery 18 │ 287.03 ms │             288.27 ms │     no change │
│ QQuery 19 │  39.57 ms │              39.77 ms │     no change │
│ QQuery 20 │  57.69 ms │              57.36 ms │     no change │
│ QQuery 21 │ 322.81 ms │             318.54 ms │     no change │
│ QQuery 22 │  21.98 ms │              21.61 ms │     no change │
└───────────┴───────────┴───────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary                    ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (HEAD)                    │ 1889.89ms │
│ Total Time (alamb_update_arrow_58)   │ 1851.95ms │
│ Average Time (HEAD)                  │   85.90ms │
│ Average Time (alamb_update_arrow_58) │   84.18ms │
│ Queries Faster                       │         4 │
│ Queries Slower                       │         0 │
│ Queries with No Change               │        18 │
│ Queries with Failure                 │         0 │
└──────────────────────────────────────┴───────────┘
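
For readers skimming the tables above, the Change column can be reproduced from the two timing columns. The sketch below is not the script the bot runs. It assumes a noise threshold of roughly 5%, which is only inferred from the rows above (2521.47 ms vs 2647.03 ms, just under 1.05x, is reported as no change, while 5837.64 ms vs 5506.92 ms, about 1.06x, is reported as faster). The function name `describe_change` is made up for illustration.

```rust
/// Classify a HEAD-vs-branch timing pair in the style of the "Change" column.
/// The 5% threshold is an assumption inferred from the tables, not taken from
/// the benchmark script itself.
fn describe_change(head_ms: f64, branch_ms: f64) -> String {
    const NOISE_THRESHOLD: f64 = 0.05; // assumed

    if head_ms > branch_ms * (1.0 + NOISE_THRESHOLD) {
        // The branch is faster: report how many times faster than HEAD.
        format!("+{:.2}x faster", head_ms / branch_ms)
    } else if branch_ms > head_ms * (1.0 + NOISE_THRESHOLD) {
        // The branch is slower: report how many times slower than HEAD.
        format!("{:.2}x slower", branch_ms / head_ms)
    } else {
        // Differences inside the threshold are treated as noise.
        "no change".to_string()
    }
}
```

With the QQuery 23 row from clickbench_partitioned, `describe_change(21572.08, 12373.92)` gives `+1.74x faster`, which matches the table.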

@Dandandan
Contributor

run benchmarks

@Dandandan
Contributor

run benchmark tpch

@alamb-ghbot

🤖 ./gh_compare_branch.sh gh_compare_branch.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing alamb/update_arrow_58 (35b97fa) to b9a3b9f diff using: tpch_mem clickbench_partitioned clickbench_extended
Results will be posted here when complete

@alamb-ghbot

🤖: Benchmark completed

Details

Comparing HEAD and alamb_update_arrow_58
--------------------
Benchmark clickbench_extended.json
--------------------
┏━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Query    ┃        HEAD ┃ alamb_update_arrow_58 ┃    Change ┃
┡━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ QQuery 0 │  2389.61 ms │            2355.66 ms │ no change │
│ QQuery 1 │   975.65 ms │             969.78 ms │ no change │
│ QQuery 2 │  1875.49 ms │            1871.78 ms │ no change │
│ QQuery 3 │  1174.32 ms │            1145.53 ms │ no change │
│ QQuery 4 │  2258.94 ms │            2223.01 ms │ no change │
│ QQuery 5 │ 27824.57 ms │           28406.72 ms │ no change │
│ QQuery 6 │  4031.59 ms │            4002.29 ms │ no change │
│ QQuery 7 │  3463.56 ms │            3533.17 ms │ no change │
└──────────┴─────────────┴───────────────────────┴───────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                    ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                    │ 43993.73ms │
│ Total Time (alamb_update_arrow_58)   │ 44507.93ms │
│ Average Time (HEAD)                  │  5499.22ms │
│ Average Time (alamb_update_arrow_58) │  5563.49ms │
│ Queries Faster                       │          0 │
│ Queries Slower                       │          0 │
│ Queries with No Change               │          8 │
│ Queries with Failure                 │          0 │
└──────────────────────────────────────┴────────────┘
--------------------
Benchmark clickbench_partitioned.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃        HEAD ┃ alamb_update_arrow_58 ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 0  │     1.90 ms │               1.92 ms │     no change │
│ QQuery 1  │    52.12 ms │              50.35 ms │     no change │
│ QQuery 2  │   142.92 ms │             133.95 ms │ +1.07x faster │
│ QQuery 3  │   154.00 ms │             152.77 ms │     no change │
│ QQuery 4  │  1072.51 ms │            1063.15 ms │     no change │
│ QQuery 5  │  1369.60 ms │            1320.07 ms │     no change │
│ QQuery 6  │     1.82 ms │               1.86 ms │     no change │
│ QQuery 7  │    54.39 ms │              52.98 ms │     no change │
│ QQuery 8  │  1446.73 ms │            1417.75 ms │     no change │
│ QQuery 9  │  1835.01 ms │            1706.51 ms │ +1.08x faster │
│ QQuery 10 │   344.52 ms │             354.29 ms │     no change │
│ QQuery 11 │   392.19 ms │             397.62 ms │     no change │
│ QQuery 12 │  1286.14 ms │            1275.41 ms │     no change │
│ QQuery 13 │  1947.40 ms │            1962.96 ms │     no change │
│ QQuery 14 │  1259.13 ms │            1231.11 ms │     no change │
│ QQuery 15 │  1247.90 ms │            1222.46 ms │     no change │
│ QQuery 16 │  2534.25 ms │            2515.84 ms │     no change │
│ QQuery 17 │  2514.25 ms │            2527.22 ms │     no change │
│ QQuery 18 │  5440.35 ms │            4806.73 ms │ +1.13x faster │
│ QQuery 19 │   119.26 ms │             118.76 ms │     no change │
│ QQuery 20 │  1929.28 ms │            1872.84 ms │     no change │
│ QQuery 21 │  2213.86 ms │            2178.92 ms │     no change │
│ QQuery 22 │  4264.32 ms │            3784.45 ms │ +1.13x faster │
│ QQuery 23 │ 20258.80 ms │           12236.28 ms │ +1.66x faster │
│ QQuery 24 │   215.95 ms │             219.20 ms │     no change │
│ QQuery 25 │   488.91 ms │             481.13 ms │     no change │
│ QQuery 26 │   226.35 ms │             216.76 ms │     no change │
│ QQuery 27 │  2799.99 ms │            2713.67 ms │     no change │
│ QQuery 28 │ 23840.67 ms │           23084.98 ms │     no change │
│ QQuery 29 │   974.43 ms │             971.18 ms │     no change │
│ QQuery 30 │  1351.52 ms │            1299.11 ms │     no change │
│ QQuery 31 │  1399.75 ms │            1421.76 ms │     no change │
│ QQuery 32 │  4761.26 ms │            4490.64 ms │ +1.06x faster │
│ QQuery 33 │  5521.96 ms │            5313.98 ms │     no change │
│ QQuery 34 │  5548.81 ms │            5584.56 ms │     no change │
│ QQuery 35 │  1979.74 ms │            1851.73 ms │ +1.07x faster │
│ QQuery 36 │    66.91 ms │              67.24 ms │     no change │
│ QQuery 37 │    46.16 ms │              47.24 ms │     no change │
│ QQuery 38 │    66.08 ms │              67.97 ms │     no change │
│ QQuery 39 │   102.12 ms │             100.78 ms │     no change │
│ QQuery 40 │    28.61 ms │              27.21 ms │     no change │
│ QQuery 41 │    23.45 ms │              22.02 ms │ +1.06x faster │
│ QQuery 42 │    19.24 ms │              19.38 ms │     no change │
└───────────┴─────────────┴───────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━┓
┃ Benchmark Summary                    ┃             ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━┩
│ Total Time (HEAD)                    │ 101344.52ms │
│ Total Time (alamb_update_arrow_58)   │  90386.74ms │
│ Average Time (HEAD)                  │   2356.85ms │
│ Average Time (alamb_update_arrow_58) │   2102.02ms │
│ Queries Faster                       │           8 │
│ Queries Slower                       │           0 │
│ Queries with No Change               │          35 │
│ Queries with Failure                 │           0 │
└──────────────────────────────────────┴─────────────┘
--------------------
Benchmark tpch_mem_sf1.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃      HEAD ┃ alamb_update_arrow_58 ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1  │ 118.62 ms │             117.02 ms │     no change │
│ QQuery 2  │  37.93 ms │              38.57 ms │     no change │
│ QQuery 3  │  41.97 ms │              37.52 ms │ +1.12x faster │
│ QQuery 4  │  31.39 ms │              30.22 ms │     no change │
│ QQuery 5  │  91.63 ms │              91.98 ms │     no change │
│ QQuery 6  │  21.13 ms │              20.89 ms │     no change │
│ QQuery 7  │ 229.95 ms │             228.06 ms │     no change │
│ QQuery 8  │  39.25 ms │              37.93 ms │     no change │
│ QQuery 9  │ 107.60 ms │             107.66 ms │     no change │
│ QQuery 10 │  69.45 ms │              69.75 ms │     no change │
│ QQuery 11 │  23.16 ms │              24.35 ms │  1.05x slower │
│ QQuery 12 │  53.07 ms │              52.75 ms │     no change │
│ QQuery 13 │  51.17 ms │              48.55 ms │ +1.05x faster │
│ QQuery 14 │  15.28 ms │              15.11 ms │     no change │
│ QQuery 15 │  30.14 ms │              30.73 ms │     no change │
│ QQuery 16 │  31.26 ms │              29.13 ms │ +1.07x faster │
│ QQuery 17 │ 159.46 ms │             154.29 ms │     no change │
│ QQuery 18 │ 287.59 ms │             284.81 ms │     no change │
│ QQuery 19 │  39.20 ms │              39.99 ms │     no change │
│ QQuery 20 │  56.56 ms │              57.03 ms │     no change │
│ QQuery 21 │ 300.14 ms │             325.49 ms │  1.08x slower │
│ QQuery 22 │  22.64 ms │              22.18 ms │     no change │
└───────────┴───────────┴───────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━┓
┃ Benchmark Summary                    ┃           ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━┩
│ Total Time (HEAD)                    │ 1858.59ms │
│ Total Time (alamb_update_arrow_58)   │ 1864.00ms │
│ Average Time (HEAD)                  │   84.48ms │
│ Average Time (alamb_update_arrow_58) │   84.73ms │
│ Queries Faster                       │         3 │
│ Queries Slower                       │         2 │
│ Queries with No Change               │        17 │
│ Queries with Failure                 │         0 │
└──────────────────────────────────────┴───────────┘

@alamb-ghbot

🤖: Benchmark completed

Details

Comparing HEAD and alamb_update_arrow_58
--------------------
Benchmark tpcds_sf1.json
--------------------
┏━━━━━━━━━━━┳━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Query     ┃        HEAD ┃ alamb_update_arrow_58 ┃        Change ┃
┡━━━━━━━━━━━╇━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ QQuery 1  │    74.18 ms │              73.30 ms │     no change │
│ QQuery 2  │   222.59 ms │             238.22 ms │  1.07x slower │
│ QQuery 3  │   157.14 ms │             161.39 ms │     no change │
│ QQuery 4  │  1924.29 ms │            1871.05 ms │     no change │
│ QQuery 5  │   260.93 ms │             267.29 ms │     no change │
│ QQuery 6  │  1458.28 ms │            1429.00 ms │     no change │
│ QQuery 7  │   511.40 ms │             506.59 ms │     no change │
│ QQuery 8  │   171.56 ms │             176.05 ms │     no change │
│ QQuery 9  │   288.60 ms │             312.58 ms │  1.08x slower │
│ QQuery 10 │   173.04 ms │             171.52 ms │     no change │
│ QQuery 11 │  1324.03 ms │            1281.72 ms │     no change │
│ QQuery 12 │    67.76 ms │              67.05 ms │     no change │
│ QQuery 13 │   542.62 ms │             547.72 ms │     no change │
│ QQuery 14 │  1903.07 ms │            1896.90 ms │     no change │
│ QQuery 15 │    27.23 ms │              26.95 ms │     no change │
│ QQuery 16 │    65.54 ms │              65.49 ms │     no change │
│ QQuery 17 │   358.10 ms │             360.27 ms │     no change │
│ QQuery 18 │   191.89 ms │             186.67 ms │     no change │
│ QQuery 19 │   218.75 ms │             221.78 ms │     no change │
│ QQuery 20 │    24.77 ms │              24.35 ms │     no change │
│ QQuery 21 │    37.17 ms │              36.11 ms │     no change │
│ QQuery 22 │   722.89 ms │             741.88 ms │     no change │
│ QQuery 23 │  1785.65 ms │            1721.55 ms │     no change │
│ QQuery 24 │   690.41 ms │             662.43 ms │     no change │
│ QQuery 25 │   528.32 ms │             515.49 ms │     no change │
│ QQuery 26 │   126.65 ms │             121.43 ms │     no change │
│ QQuery 27 │   518.22 ms │             496.54 ms │     no change │
│ QQuery 28 │   298.23 ms │             301.86 ms │     no change │
│ QQuery 29 │   457.34 ms │             455.57 ms │     no change │
│ QQuery 30 │    77.88 ms │              71.19 ms │ +1.09x faster │
│ QQuery 31 │   304.15 ms │             314.99 ms │     no change │
│ QQuery 32 │    82.37 ms │              82.64 ms │     no change │
│ QQuery 33 │   207.40 ms │             215.40 ms │     no change │
│ QQuery 34 │   161.15 ms │             165.21 ms │     no change │
│ QQuery 35 │   175.81 ms │             170.50 ms │     no change │
│ QQuery 36 │   285.30 ms │             288.41 ms │     no change │
│ QQuery 37 │   255.27 ms │             255.72 ms │     no change │
│ QQuery 38 │   153.59 ms │             157.87 ms │     no change │
│ QQuery 39 │   194.54 ms │             193.41 ms │     no change │
│ QQuery 40 │   180.45 ms │             176.72 ms │     no change │
│ QQuery 41 │    24.87 ms │              27.90 ms │  1.12x slower │
│ QQuery 42 │   146.35 ms │             150.88 ms │     no change │
│ QQuery 43 │   131.15 ms │             132.86 ms │     no change │
│ QQuery 44 │    28.19 ms │              28.48 ms │     no change │
│ QQuery 45 │    87.41 ms │              84.35 ms │     no change │
│ QQuery 46 │   322.64 ms │             327.04 ms │     no change │
│ QQuery 47 │  1043.88 ms │            1026.41 ms │     no change │
│ QQuery 48 │   401.68 ms │             410.42 ms │     no change │
│ QQuery 49 │   378.10 ms │             375.18 ms │     no change │
│ QQuery 50 │   331.70 ms │             338.74 ms │     no change │
│ QQuery 51 │   304.94 ms │             304.90 ms │     no change │
│ QQuery 52 │   146.63 ms │             150.21 ms │     no change │
│ QQuery 53 │   144.96 ms │             149.77 ms │     no change │
│ QQuery 54 │   205.68 ms │             210.09 ms │     no change │
│ QQuery 55 │   147.42 ms │             149.73 ms │     no change │
│ QQuery 56 │   208.39 ms │             210.77 ms │     no change │
│ QQuery 57 │   286.78 ms │             284.78 ms │     no change │
│ QQuery 58 │   501.50 ms │             484.30 ms │     no change │
│ QQuery 59 │   293.61 ms │             325.35 ms │  1.11x slower │
│ QQuery 60 │   212.72 ms │             215.12 ms │     no change │
│ QQuery 61 │   245.89 ms │             253.40 ms │     no change │
│ QQuery 62 │  1277.50 ms │            1320.46 ms │     no change │
│ QQuery 63 │   147.21 ms │             149.34 ms │     no change │
│ QQuery 64 │  1143.22 ms │            1121.11 ms │     no change │
│ QQuery 65 │   356.66 ms │             356.05 ms │     no change │
│ QQuery 66 │   382.71 ms │             400.63 ms │     no change │
│ QQuery 67 │   545.92 ms │             537.67 ms │     no change │
│ QQuery 68 │   374.97 ms │             386.80 ms │     no change │
│ QQuery 69 │   170.50 ms │             166.18 ms │     no change │
│ QQuery 70 │   504.80 ms │             513.98 ms │     no change │
│ QQuery 71 │   187.38 ms │             189.85 ms │     no change │
│ QQuery 72 │  2127.42 ms │            2005.80 ms │ +1.06x faster │
│ QQuery 73 │   155.54 ms │             163.62 ms │  1.05x slower │
│ QQuery 74 │   817.04 ms │             814.26 ms │     no change │
│ QQuery 75 │   412.77 ms │             394.61 ms │     no change │
│ QQuery 76 │   185.08 ms │             182.55 ms │     no change │
│ QQuery 77 │   287.69 ms │             283.62 ms │     no change │
│ QQuery 78 │   689.90 ms │             676.11 ms │     no change │
│ QQuery 79 │   326.72 ms │             332.70 ms │     no change │
│ QQuery 80 │   528.27 ms │             517.91 ms │     no change │
│ QQuery 81 │    52.86 ms │              51.71 ms │     no change │
│ QQuery 82 │   284.06 ms │             285.68 ms │     no change │
│ QQuery 83 │    78.63 ms │              76.76 ms │     no change │
│ QQuery 84 │    68.98 ms │              66.64 ms │     no change │
│ QQuery 85 │   210.76 ms │             215.34 ms │     no change │
│ QQuery 86 │    58.77 ms │              58.28 ms │     no change │
│ QQuery 87 │   155.67 ms │             156.86 ms │     no change │
│ QQuery 88 │   266.38 ms │             269.97 ms │     no change │
│ QQuery 89 │   169.20 ms │             167.03 ms │     no change │
│ QQuery 90 │    46.15 ms │              47.07 ms │     no change │
│ QQuery 91 │    98.58 ms │              89.73 ms │ +1.10x faster │
│ QQuery 92 │    80.32 ms │              83.44 ms │     no change │
│ QQuery 93 │   290.74 ms │             282.26 ms │     no change │
│ QQuery 94 │    89.21 ms │              92.03 ms │     no change │
│ QQuery 95 │   248.19 ms │             246.64 ms │     no change │
│ QQuery 96 │   116.70 ms │             119.10 ms │     no change │
│ QQuery 97 │   194.26 ms │             197.63 ms │     no change │
│ QQuery 98 │   217.14 ms │             211.99 ms │     no change │
│ QQuery 99 │ 14106.25 ms │           15687.07 ms │  1.11x slower │
└───────────┴─────────────┴───────────────────────┴───────────────┘
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Benchmark Summary                    ┃            ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ Total Time (HEAD)                    │ 50257.18ms │
│ Total Time (alamb_update_arrow_58)   │ 51589.99ms │
│ Average Time (HEAD)                  │   507.65ms │
│ Average Time (alamb_update_arrow_58) │   521.11ms │
│ Queries Faster                       │          3 │
│ Queries Slower                       │          6 │
│ Queries with No Change               │         90 │
│ Queries with Failure                 │          0 │
└──────────────────────────────────────┴────────────┘

@alamb changed the title from "WIP: Upgrade DataFusion to arrow-rs/parquet 58.0.0 / object_store 13.0.0" to "Upgrade DataFusion to arrow-rs/parquet 58.0.0 / object_store 13.0.0" on Feb 23, 2026
@github-actions bot added the functions (Changes to functions implementation) label Feb 23, 2026
let timestamp = Utc::now();
let range = options.range.clone();

let head = options.head;
Contributor Author

A substantial amount of the changes in this PR are due to the upgrade to object_store 0.13, where several of the trait methods (e.g. get, get_opts, head, etc.) have been consolidated.

You can see the upgrade guide here: https://docs.rs/object_store/latest/object_store/trait.ObjectStore.html#upgrade-guide-for-0130
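
To illustrate the general shape of that kind of consolidation, here is a minimal sketch using only the standard library. It is not the object_store API: the real trait is async and has different signatures, and every name below (`MiniStore`, `MiniStoreExt`, `InMemory`, and the simplified `GetOptions` and `GetResult`) is hypothetical. The idea it shows is that an implementor writes one options-driven method, and the convenience calls are derived from it.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified request options (not the real object_store type).
#[derive(Default)]
struct GetOptions {
    head: bool,
}

/// Hypothetical, simplified result: object size plus payload bytes.
struct GetResult {
    size: usize,
    payload: Vec<u8>,
}

/// The single method an implementor has to provide in this sketch.
trait MiniStore {
    fn get_opts(&self, path: &str, opts: GetOptions) -> Result<GetResult, String>;
}

/// Convenience methods expressed in terms of `get_opts`.
trait MiniStoreExt: MiniStore {
    fn get(&self, path: &str) -> Result<Vec<u8>, String> {
        Ok(self.get_opts(path, GetOptions::default())?.payload)
    }

    fn head(&self, path: &str) -> Result<usize, String> {
        Ok(self.get_opts(path, GetOptions { head: true })?.size)
    }
}

// Every `MiniStore` gets the convenience methods for free.
impl<T: MiniStore + ?Sized> MiniStoreExt for T {}

/// Toy in-memory store used to exercise the traits.
struct InMemory {
    objects: HashMap<String, Vec<u8>>,
}

impl MiniStore for InMemory {
    fn get_opts(&self, path: &str, opts: GetOptions) -> Result<GetResult, String> {
        let data = self
            .objects
            .get(path)
            .ok_or_else(|| format!("not found: {path}"))?;
        // A head request reports the size but carries no content.
        let payload = if opts.head { Vec::new() } else { data.clone() };
        Ok(GetResult { size: data.len(), payload })
    }
}
```

A store that wraps another store then only needs to intercept `get_opts` to observe both plain reads and head requests.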


let props = WriterProperties::builder()
- .set_max_row_group_size(2)
+ .set_max_row_group_row_count(Some(2))
Contributor Author

.location
.parts()
- .last()
+ .next_back()
Contributor Author

clippy told me to do this -- I am not sure why it doesn't do so on main
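
For context, a small standard-library sketch of the two spellings. On a double-ended iterator, `next_back()` takes the final element directly from the back, while `last()` is the generic `Iterator` method that walks from the front. The lint involved is probably clippy's `double_ended_iterator_last`, but that name is an assumption and is not stated in this thread. The helper names are made up, and `str::split` stands in for the `.parts()` iterator in the diff.

```rust
/// Final `/`-separated segment, taken from the back of the iterator.
fn file_name(location: &str) -> Option<&str> {
    // `split` on a `char` yields a DoubleEndedIterator, so `next_back` is available.
    location.split('/').next_back()
}

/// Same result via `last()`, which is the spelling clippy flagged in the diff.
fn file_name_via_last(location: &str) -> Option<&str> {
    location.split('/').last()
}
```

Both helpers return the same segment for the same input; only the way the iterator is consumed differs.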

_opts: PutOptions,
) -> object_store::Result<PutResult> {
- Err(object_store::Error::NotImplemented)
+ unimplemented!()
Contributor Author

the NotImplemented error now contains new fields about operations that are not implemented. I just kept the code simple and used unimplemented instead
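
A rough sketch of the trade-off, using a made-up error type. The variant and its `operation` field below are hypothetical and do not reflect the actual fields of `object_store::Error::NotImplemented`, which are not shown in this thread. Returning the error means filling in those fields. `unimplemented!()` still type-checks in a function that returns `Result`, but it panics if the method is ever called.

```rust
/// Hypothetical error type; the field is illustrative only.
#[derive(Debug, PartialEq)]
enum StoreError {
    NotImplemented { operation: String },
}

/// Option 1: return a structured error, which requires populating the fields.
fn put_returning_error() -> Result<(), StoreError> {
    Err(StoreError::NotImplemented {
        operation: "put_opts".to_string(),
    })
}

/// Option 2: `unimplemented!()` never returns, so it fits any return type,
/// at the cost of panicking when the method is invoked.
fn put_panicking() -> Result<(), StoreError> {
    unimplemented!()
}
```

For a test-only store whose write path is never exercised, the panic is arguably acceptable, which seems to be the reasoning in the comment above.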

Total Requests: 2
- HEAD path=csv_table.csv
- GET path=csv_table.csv
- GET (opts) path=csv_table.csv head=true
Contributor Author

There are many fewer methods in the ObjectStore trait now, and the tests have been updated to reflect that. The actual calls are all still the same

// no content for head requests
GetResultPayload::Stream(stream::empty().boxed())
} else if let Some(range) = options.range {
let GetRange::Bounded(range) = range else {
Contributor Author

I had to inline the implementation of head() and get_range() here
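
The branch structure in the snippet above can be sketched over a plain byte slice as follows. This is an illustration, not the code in the PR. The `GetRange` and `GetOptions` types are simplified stand-ins defined locally, and `read_payload` is a hypothetical helper. It is also an assumption here that unbounded ranges are rejected, since the `else` arm of the `let ... else` in the diff is not visible.

```rust
use std::ops::Range;

/// Simplified stand-in for a range request (hypothetical).
enum GetRange {
    Bounded(Range<usize>),
    Offset(usize),
}

/// Simplified stand-in for the request options (hypothetical).
struct GetOptions {
    head: bool,
    range: Option<GetRange>,
}

/// Serve a head request, a bounded range request, or a full read.
fn read_payload(data: &[u8], options: GetOptions) -> Result<Vec<u8>, String> {
    if options.head {
        // no content for head requests
        Ok(Vec::new())
    } else if let Some(range) = options.range {
        // Assumption: only bounded ranges are handled in this sketch.
        let GetRange::Bounded(range) = range else {
            return Err("only bounded ranges are supported in this sketch".to_string());
        };
        data.get(range.clone())
            .map(|bytes| bytes.to_vec())
            .ok_or_else(|| format!("range {range:?} is out of bounds"))
    } else {
        Ok(data.to_vec())
    }
}
```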


let stop = if !self.include_upper_bound {
- Date32Type::subtract_month_day_nano(stop, step)
+ Date32Type::subtract_month_day_nano_opt(stop, step).ok_or_else(|| {
Contributor Author

due to apache/arrow-rs#9144 from @cht42 (the underlying library now checks for overflow and returns None rather than panic'ing)
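
The same pattern, sketched with the standard library only: a checked operation returns `None` on overflow, and `ok_or_else` turns that into an error instead of a panic. Here `i32::checked_sub` stands in for `Date32Type::subtract_month_day_nano_opt`. The function name `exclusive_stop`, the whole-day step, and the error text are assumptions made for this example.

```rust
/// Step a day count back by `step_days`, reporting overflow as an error.
/// `stop` is treated as days since the Unix epoch, like a Date32 value.
fn exclusive_stop(stop: i32, step_days: i32) -> Result<i32, String> {
    stop.checked_sub(step_days)
        // `None` means the subtraction overflowed; convert it to an error lazily.
        .ok_or_else(|| format!("date arithmetic overflow: {stop} - {step_days}"))
}
```

The closure passed to `ok_or_else` only runs on the overflow path, so the error message is not built for the common case.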

07)│ DataSourceExec │
08)│ -------------------- │
- 09)│ bytes: 1040
+ 09)│ bytes: 1024
Contributor Author

the size of parquet files changes slightly from version to version (as the embedded version changes, etc)

@alamb changed the title from "Upgrade DataFusion to arrow-rs/parquet 58.0.0 / object_store 13.0.0" to "Upgrade DataFusion to arrow-rs/parquet 58.0.0 / object_store 0.13.0" on Feb 23, 2026
@alamb alamb marked this pull request as ready for review February 23, 2026 13:31
@alamb
Contributor Author

alamb commented Feb 23, 2026

Ok, I think this PR is ready to go

@alamb
Contributor Author

alamb commented Feb 23, 2026

run benchmarks

@github-actions bot added the documentation (Improvements or additions to documentation) label Feb 23, 2026
@alamb-ghbot

🤖 ./gh_compare_branch.sh gh_compare_branch.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing alamb/update_arrow_58 (bf17d4a) to 89a8576 diff using: tpch_mem clickbench_partitioned clickbench_extended
Results will be posted here when complete

@alamb-ghbot

🤖: Benchmark completed

Details

Comparing HEAD and alamb_update_arrow_58
--------------------
Benchmark clickbench_extended.json
--------------------

@alamb
Contributor Author

alamb commented Feb 23, 2026

run benchmark clickbench_partitioned

@alamb-ghbot

🤖 ./gh_compare_branch.sh gh_compare_branch.sh Running
Linux aal-dev 6.14.0-1018-gcp #19~24.04.1-Ubuntu SMP Wed Sep 24 23:23:09 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing alamb/update_arrow_58 (dd0464e) to b9328b9 diff using: clickbench_partitioned
Results will be posted here when complete

@alamb-ghbot

🤖: Benchmark completed

Details

Comparing HEAD and alamb_update_arrow_58
--------------------
Benchmark clickbench_partitioned.json
--------------------

Labels

common (Related to common crate), core (Core DataFusion crate), datasource (Changes to the datasource crate), documentation (Improvements or additions to documentation), functions (Changes to functions implementation), spark, sqllogictest (SQL Logic Tests (.slt))

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[datafusion-spark] [SQL] [TEST] IntervalMonthDayNano(0,0,0) give line blank

5 participants