fix: performance regression for filtering ListView by a10y · Pull Request #5390 · vortex-data/vortex

a10y · 2025-11-18T21:45:56Z

In #4946, we forced rebuilding ListView arrays that were being built.

A user reported a 10x performance regression, with much of the time spent inside ZSTD decompression. The flame graph showed that, inside a file scan, we were filtering a ListView array whose elements were ZSTD-compressed. Calling naive_rebuild goes through a slow append_scalar pathway that relies on scalar_at, and scalar_at for ZSTD decompresses a whole frame on each call.

To avoid this, we fully canonicalize the elements, offsets, and sizes in bulk, then stitch a new ListView from the components.
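The difference between the two pathways can be sketched with a toy model. This is not the Vortex implementation: `ToyListView`, `ToyCompressed`, `rebuild_per_scalar`, and `rebuild_bulk` are hypothetical names invented for illustration, and the "compression" is just a counter that records how many times the whole frame gets decoded.

```rust
use std::cell::Cell;

/// Toy stand-in for a ListView: list `i` is
/// `elements[offsets[i]..offsets[i] + sizes[i]]`.
#[derive(Debug, Clone, PartialEq)]
struct ToyListView {
    elements: Vec<i32>,
    offsets: Vec<usize>,
    sizes: Vec<usize>,
}

/// Hypothetical compressed buffer (assumption, not a Vortex type):
/// any access has to decode the entire frame first.
struct ToyCompressed {
    frame: Vec<i32>,
    decodes: Cell<usize>,
}

impl ToyCompressed {
    fn new(frame: Vec<i32>) -> Self {
        Self { frame, decodes: Cell::new(0) }
    }

    /// Decode the whole frame and count the decode.
    fn decompress(&self) -> Vec<i32> {
        self.decodes.set(self.decodes.get() + 1);
        self.frame.clone()
    }

    /// Single-element access still pays for a full frame decode.
    fn scalar_at(&self, idx: usize) -> i32 {
        self.decompress()[idx]
    }
}

/// Per-scalar rebuild: one full decode for every element copied.
fn rebuild_per_scalar(src: &ToyCompressed, offsets: &[usize], sizes: &[usize]) -> ToyListView {
    let mut out = ToyListView { elements: vec![], offsets: vec![], sizes: vec![] };
    for (&off, &size) in offsets.iter().zip(sizes) {
        out.offsets.push(out.elements.len());
        out.sizes.push(size);
        for i in off..off + size {
            out.elements.push(src.scalar_at(i));
        }
    }
    out
}

/// Bulk rebuild: decode once, then stitch the new lists from slices.
fn rebuild_bulk(src: &ToyCompressed, offsets: &[usize], sizes: &[usize]) -> ToyListView {
    let decoded = src.decompress();
    let mut out = ToyListView { elements: vec![], offsets: vec![], sizes: vec![] };
    for (&off, &size) in offsets.iter().zip(sizes) {
        out.offsets.push(out.elements.len());
        out.sizes.push(size);
        out.elements.extend_from_slice(&decoded[off..off + size]);
    }
    out
}
```

Both functions produce the same rebuilt lists; in this toy model the per-scalar version decodes the frame once per copied element, while the bulk version decodes it exactly once.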

This is 10-20x faster than the previous codepath, per the added benchmark.

Before:

Timer precision: 41 ns
listview_rebuild  fastest       │ slowest       │ median        │ mean          │ samples │ iters
╰─ rebuild_naive  1.821 ms      │ 2.535 ms      │ 2.019 ms      │ 2.024 ms      │ 100     │ 100

After:

Timer precision: 41 ns
listview_rebuild  fastest       │ slowest       │ median        │ mean          │ samples │ iters
╰─ rebuild_naive  109.5 µs      │ 192.2 µs      │ 121.1 µs      │ 122 µs        │ 100     │ 100
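As a quick sanity check of the speedup claim, the ratios can be computed directly from the two runs above. The helper names below are illustrative only; the timings are copied from the benchmark output and converted to microseconds.

```rust
/// Timings from the "Before" run, in microseconds.
/// Order: fastest, slowest, median, mean.
const BEFORE_US: [f64; 4] = [1821.0, 2535.0, 2019.0, 2024.0];

/// Timings from the "After" run, in microseconds, same order.
const AFTER_US: [f64; 4] = [109.5, 192.2, 121.1, 122.0];

/// Speedup of `before` over `after` (both in the same unit).
fn speedup(before: f64, after: f64) -> f64 {
    before / after
}

/// Speedups for fastest, slowest, median, and mean, in that order.
fn speedups() -> [f64; 4] {
    let mut out = [0.0; 4];
    for i in 0..4 {
        out[i] = speedup(BEFORE_US[i], AFTER_US[i]);
    }
    out
}
```

On these numbers every column lands between roughly 13x and 17x, which is inside the stated 10-20x range.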

Signed-off-by: Andrew Duffy <[email protected]>


robert3005

We could do a take but this is also fine


codspeed-hq · 2025-11-18T22:24:08Z

CodSpeed Performance Report

Merging #5390 will improve performances by 17.59%

Comparing aduffy/listview-perf (c247bc9) with develop (60492c4)

Summary

⚡ 51 improvements
✅ 1369 untouched
🆕 8 new
⏩ 645 skipped¹
🗄️ 28 archived benchmarks run²

Benchmarks breakdown

	Benchmark	`BASE`	`HEAD`	Change
🆕	`rebuild_naive`	N/A	1.3 ms	N/A
⚡	`chunked_dict_primitive_canonical_into[f32, (1000, 10, 10)]`	104 µs	92.6 µs	+12.39%
⚡	`chunked_dict_primitive_canonical_into[f32, (1000, 10, 100)]`	798 µs	703 µs	+13.52%
⚡	`chunked_dict_primitive_canonical_into[f32, (1000, 100, 10)]`	105.8 µs	94.2 µs	+12.27%
⚡	`chunked_dict_primitive_canonical_into[f32, (1000, 100, 100)]`	827.3 µs	734 µs	+12.72%
⚡	`chunked_dict_primitive_canonical_into[f32, (1000, 1000, 10)]`	122.3 µs	110.6 µs	+10.52%
⚡	`chunked_dict_primitive_canonical_into[f32, (1000, 1000, 100)]`	978.7 µs	883.2 µs	+10.82%
⚡	`chunked_dict_primitive_canonical_into[f64, (1000, 10, 10)]`	132.2 µs	119.9 µs	+10.27%
⚡	`chunked_dict_primitive_canonical_into[f64, (1000, 10, 100)]`	1,060.3 µs	960 µs	+10.45%
⚡	`chunked_dict_primitive_canonical_into[f64, (1000, 100, 10)]`	135.8 µs	123.4 µs	+10.03%
⚡	`chunked_dict_primitive_canonical_into[f64, (1000, 100, 100)]`	1,093.7 µs	993.4 µs	+10.09%
⚡	`chunked_dict_primitive_canonical_into[u32, (1000, 10, 10)]`	102.1 µs	90.6 µs	+12.7%
⚡	`chunked_dict_primitive_canonical_into[u32, (1000, 10, 100)]`	795.2 µs	704 µs	+12.96%
⚡	`chunked_dict_primitive_canonical_into[u32, (1000, 100, 10)]`	105.5 µs	93.9 µs	+12.32%
⚡	`chunked_dict_primitive_canonical_into[u32, (1000, 100, 100)]`	810.7 µs	720.4 µs	+12.54%
⚡	`chunked_dict_primitive_canonical_into[u32, (1000, 1000, 10)]`	120 µs	108.5 µs	+10.63%
⚡	`chunked_dict_primitive_canonical_into[u32, (1000, 1000, 100)]`	978.2 µs	886.2 µs	+10.39%
⚡	`chunked_dict_primitive_into_canonical[f32, (1000, 10, 10)]`	104.5 µs	92.7 µs	+12.79%
⚡	`chunked_dict_primitive_into_canonical[f32, (1000, 10, 100)]`	797.7 µs	702.8 µs	+13.51%
⚡	`chunked_dict_primitive_into_canonical[f32, (1000, 100, 10)]`	106.1 µs	94.3 µs	+12.46%
...	...	...	...	...

ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.

¹ 645 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, they can be archived to remove them from the performance reports.
² 28 benchmarks were run, but are now archived. If they were deleted in another branch, consider rebasing to remove them from the report. If they were instead added back, they can be restored.

codecov · 2025-11-18T22:25:50Z

Codecov Report

❌ Patch coverage is 78.26087% with 10 lines in your changes missing coverage. Please review.
✅ Project coverage is 86.02%. Comparing base (1aac9ca) to head (c247bc9).
⚠️ Report is 13 commits behind head on develop.

Files with missing lines	Patch %	Lines
vortex-array/src/arrays/listview/rebuild.rs	78.26%	10 Missing ⚠️



a10y added 3 commits November 18, 2025 16:36

fix: performance regression for filtering ListView

3594454

Signed-off-by: Andrew Duffy <[email protected]>

implement benchmark

8b8cb1c

Timer precision: 41 ns
listview_rebuild  fastest       │ slowest       │ median        │ mean          │ samples │ iters
╰─ rebuild_naive  109.5 µs      │ 192.2 µs      │ 121.1 µs      │ 122 µs        │ 100     │ 100

Signed-off-by: Andrew Duffy <[email protected]>

remove unnecessary assert in ListViewBuilder

5fc6745

Signed-off-by: Andrew Duffy <[email protected]>

a10y added the changelog/performance label (A performance improvement) Nov 18, 2025

a10y requested review from connortsui20 and gatesn November 18, 2025 21:46

robert3005 approved these changes Nov 18, 2025


a10y added 2 commits November 18, 2025 17:07

fix

90d8b4a

Signed-off-by: Andrew Duffy <[email protected]>

fix harder

c247bc9

Signed-off-by: Andrew Duffy <[email protected]>

a10y enabled auto-merge (squash) November 18, 2025 22:15

a10y merged commit e269163 into develop Nov 18, 2025
38 checks passed

a10y deleted the aduffy/listview-perf branch November 18, 2025 22:25

asubiotto mentioned this pull request Nov 20, 2025

panic: failed to create ListView views to the end of the elements array must have size 0 #5412

Closed
