
Conversation

@etseidl
Contributor

@etseidl etseidl commented Aug 20, 2025

Which issue does this PR close?

Note: this targets a feature branch, not main

We generally require a GitHub issue to be filed for all bug fixes and enhancements and this helps us generate change logs for our releases. You can link an issue to this PR using the GitHub syntax.

Rationale for this change

Add a custom parser for PageLocation as the decoding of this struct is one of several hot spots.

What changes are included in this PR?

This adds a faster means of obtaining struct field ids to ThriftCompactInputProtocol. For a small struct (3 fields), all of them required, we can save a good bit of time by bypassing ThriftCompactInputProtocol::read_field_begin, which is very general and can handle out-of-order fields, among other things. By adding a new function read_field_header, we can avoid the costly branching that occurs when calculating the new field id (as well as the special handling needed for boolean fields). Field validation is then handled on the consuming side while decoding the PageLocation struct.

Note that to obtain the speedup seen, we need to assume the fields will always be in order and that the field ids will all be encoded as field deltas. This is probably a fairly safe assumption, but there does exist the possibility of custom Thrift writers that use absolute field ids. If we encounter such a writer in the wild, this change will need to be reverted.
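To illustrate the idea (this is a hedged sketch with hypothetical names, not the actual arrow-rs code): in the Thrift compact protocol, the short-form field header packs the field-id delta into the upper nibble of a single byte and the element type into the lower nibble, so a delta-only `read_field_header` can be one byte read plus an add, erroring out on the long (absolute-id) form so a general decoder could take over:

```rust
/// Hypothetical sketch of a delta-only field-header read for the Thrift
/// compact protocol. Returns (compact_type, new_field_id), assuming the
/// writer always uses the short form (delta 1..=15 in the upper nibble).
fn read_field_header(buf: &[u8], pos: &mut usize, last_field_id: i16) -> Result<(u8, i16), String> {
    let byte = *buf.get(*pos).ok_or("unexpected end of input")?;
    *pos += 1;
    let delta = (byte >> 4) as i16;
    if delta == 0 {
        // Zero delta nibble means the long form (zigzag varint absolute
        // field id follows); a real decoder would fall back here.
        return Err("absolute field id: fall back to general decoder".to_string());
    }
    Ok((byte & 0x0f, last_field_id + delta))
}

fn main() {
    // 0x15 = delta 1 in the upper nibble, compact type 5 in the lower
    let buf = [0x15u8, 0x26];
    let mut pos = 0;
    assert_eq!(read_field_header(&buf, &mut pos, 0).unwrap(), (0x05, 1));
    // 0x26 = delta 2, type 6; field id advances from 1 to 3
    assert_eq!(read_field_header(&buf, &mut pos, 1).unwrap(), (0x06, 3));
    println!("ok");
}
```

The single-byte decode avoids both the boolean special case and the short-vs-long-form branch that a fully general read_field_begin must handle.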

Are these changes tested?

These changes should be covered by existing tests.

Are there any user-facing changes?

None beyond the changes in this branch.

@github-actions github-actions bot added the parquet Changes to the parquet crate label Aug 20, 2025
@etseidl
Contributor Author

etseidl commented Aug 20, 2025

From #8160 benchmark results (old is the original thrift, new is the reworked decoder)

old
open(page index)        time:   [1.7446 ms 1.7524 ms 1.7639 ms]

new
open(page index)        time:   [698.00 µs 699.75 µs 701.66 µs]

With these changes the reading of the page index is now:

open(page index)        time:   [613.47 µs 614.98 µs 616.68 µs]

Note that the decoding of the column index is ~2/3 of the decoding time, so the speed up in the offset index decoding is considerable.

The question is whether this is worth the added complexity. No changes to the PageLocation struct are on the horizon, so maybe.

@etseidl etseidl changed the title [thrift-remodel] Add custom PageLocation decoder for to speed up decoding of page indexes [thrift-remodel] Add custom PageLocation decoder to speed up decoding of page indexes Aug 23, 2025
Contributor

@alamb alamb left a comment


Thanks @etseidl -- I think this type of code is acceptable assuming:

  1. Bad input won't cause undefined behavior (I think it would result in an error)

I also think this kind of special case might be more palatable if we also included the more general decoder as a fallback, something like

let original_input = prot.clone(); // save original location in input -- is this possible?
match try_parse_fast(prot) {
    Ok(page_location) => Ok(page_location), // successful parse, yay!
    Err(_) => {
        // ignore the error and try to parse with the slow path
        try_parse_slow_but_general(original_input)
    }
}

pub first_row_index: i64,
}

// Note: this will fail if the fields are either out of order, or if a suboptimal
Contributor


Another alternative might be to have a fallback, so that if we encountered an out of order field, it could discard the work so far, and fall back to the more general, but slower implementation 🤔

let first_row_index = prot.read_i64()?;

// This loop slows things down a bit, but it's an acceptable price to allow
// forwards compatibility. We could instead assert the next field is Stop.
Contributor


Likewise, if we are really focused on speed, we could take a fast path without a loop, and then add a fallback that handles it correctly

It would of course add complexity
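A minimal sketch of that shape (hypothetical names and simplified skipping, not the actual arrow-rs API): expect the compact-protocol Stop marker (0x00) immediately after the known fields, and only fall into a field-skipping loop when something unexpected follows:

```rust
const STOP: u8 = 0x00;

/// Slow path: skip unknown trailing fields until the Stop marker.
/// A real implementation would decode each field header and skip the
/// field's full payload; that is elided to a byte step in this sketch.
fn skip_trailing_fields(buf: &[u8], pos: &mut usize) -> Result<(), String> {
    while *buf.get(*pos).ok_or("unexpected end of input")? != STOP {
        *pos += 1; // placeholder for "skip one encoded field"
    }
    *pos += 1; // consume Stop
    Ok(())
}

/// Fast path first: expect Stop immediately, otherwise fall back to the
/// forward-compatible loop.
fn finish_struct(buf: &[u8], pos: &mut usize) -> Result<(), String> {
    if buf.get(*pos) == Some(&STOP) {
        *pos += 1;
        Ok(())
    } else {
        skip_trailing_fields(buf, pos)
    }
}

fn main() {
    let mut pos = 0;
    assert!(finish_struct(&[STOP], &mut pos).is_ok()); // fast path
    assert_eq!(pos, 1);

    let mut pos = 0;
    assert!(finish_struct(&[0x15, 0x02, STOP], &mut pos).is_ok()); // fallback
    assert_eq!(pos, 3);
    println!("ok");
}
```

The common case (no trailing unknown fields) then pays a single comparison rather than a loop iteration per call.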

Contributor Author


This PR definitely needs more benchmarking to justify it. On a Xeon the speedup isn't as dramatic, so I might shelve this one for now. I do like the idea of keeping both paths, especially with the push decoder on the horizon.

@etseidl
Contributor Author

etseidl commented Aug 24, 2025

I still need to test this more, but I followed your suggestion to fall back on error, and along the way did a custom implementation of all of OffsetIndexMetaData...now the benchmark is down to 584 µs.

@alamb
Contributor

alamb commented Aug 26, 2025

...now the benchmark is down to 584 µs.

Given the original description, does that mean that the original time was 1.75 ms (thus this PR is like a 3X improvement)?

open(page index) time: [1.7446 ms 1.7524 ms 1.7639 ms]

@etseidl
Contributor Author

etseidl commented Aug 26, 2025

...now the benchmark is down to 584 µs.

Given the original description, does that mean that the original time was 1.75 ms (thus this PR is like a 3X improvement)?

open(page index) time: [1.7446 ms 1.7524 ms 1.7639 ms]

The 1.75 ms is the time for the decoder currently in main. The first phase of the remodel got that down to 700 µs, and this PR gets the time down to 585 µs (so a further 16% improvement). When combined with the column index changes in #8191 this benchmark is down to under 290 µs.

@etseidl
Contributor Author

etseidl commented Aug 27, 2025

Going to commit this and move on to writes, etc. We can always revert this later if we don't like it.

@etseidl etseidl merged commit db16cb4 into apache:gh5854_thrift_remodel Aug 27, 2025
16 checks passed
@alamb
Contributor

alamb commented Sep 5, 2025

When combined with the column index changes in #8191 this benchmark is down to under 290 µs.


@etseidl etseidl deleted the offset_idx_speedup branch October 10, 2025 14:33