Skip to content

Conversation

@l0rinc
Copy link
Contributor

@l0rinc l0rinc commented Oct 31, 2025

Summary

Many callsites only need to determine the script type and don't use the parsed solutions.
Previously, these callers still incurred the cost of allocating and populating a std::vector<std::vector<unsigned char>>.

Fix

Changing Solver signature from reference to nullable pointer parameter with nullptr default allows calling the method for its result only. Note that we can't call std::optional here because that would copy the values but we need the result to be provided from the outside to avoid the copies.
Inside the Solver we guard all accesses to the solutions parameter.
Callsites were changed to pass &solutions only when they need the data and removed when we don't.

Origin & Alternative

This was originally discovered in https://github.com/bitcoin/bitcoin/pull/32279/files#diff-060e8fd790fc1c3e18c64327a7395bb5b2d6d57db9792cc666bd8d7354a40c0bR1154-R1159 where we needed to test the allocation characteristics of different templates.
This supersedes the vector reuse optimization approach in #33645 (comment) by addressing the problem more fundamentally - avoiding unnecessary work entirely rather than optimizing it. The author of the different change was added as a coauthor.

Make the `vSolutionsRet` parameter of `Solver()` nullable pointer allowing callers that only need the return value to avoid allocating and populating the solutions vector.

Benchmark results show a 25.0% performance improvement (4,240,283 to
5,300,923 operations per second) for callers that don't need solutions.

This approach addresses the performance concern raised in bitcoin#33645 more
fundamentally than the vector reuse optimization, while simplifying the call sites at the same time.

Co-authored-by: Raimo33 <[email protected]>
@DrahtBot
Copy link
Contributor

DrahtBot commented Oct 31, 2025

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/33757.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK Raimo33

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #33645 (refactor: optimize: avoid allocations in script & policy verification by Raimo33)
  • #32729 (test,refactor: extract script template helpers & widen sigop count coverage by l0rinc)
  • #29060 (Policy: Report debug message why inputs are non standard by ismaelsadeeq)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@Raimo33
Copy link
Contributor

Raimo33 commented Oct 31, 2025

Code Review ACK 2726abd

I proposed the same in this private commit: 3ea121d
would like a bit more clarity on the benchmarks referenced. are we talking about CCoinsCaching?

@l0rinc
Copy link
Contributor Author

l0rinc commented Oct 31, 2025

would like a bit more clarity on the benchmarks referenced

Was waiting for corecheck to finish, but it shows only 2.63% speedup for CCoinsCaching.
I'll measure it properly, but in the meantime I'll leave this as a refactoring only since the resulting code is arguably cleaner.

@purpleKarrot
Copy link
Contributor

The proposed approach increases complexity both at the callsites and in the implementation of the function itself.

All the added & at the callsites would be unnecessary, if the PR simply added a single argument overload to the Solver function instead of a default argument.

Further, vSolutionsRet actually seems to be an output argument, not an inout argument. Passing that as the return type improves local reasoning:

auto SolveType(CScript const& scriptPubKey) -> TxoutType;
auto Solve(CScript const& scriptPubKey) -> std::tuple<TxoutType, std::vector<std::vector<unsigned char>>>; 

@Raimo33
Copy link
Contributor

Raimo33 commented Nov 2, 2025

auto SolveType(CScript const& scriptPubKey) -> TxoutType;

I agree on having a separate method for this. It was my initial approach. It involves some code duplication and adding an extra test but might be worth it to avoid all the if (vSolutionsRet) checks.

auto Solve(CScript const& scriptPubKey) -> std::tuple<TxoutType, std::vector<std::vector<unsigned char>>>; 

Wouldn't this increase allocs/copies? I'd say it's perfectly fine to leave it as is with the vector reference as parameter. I'd lean towards returning it if it didn't involve std::tuple and had guaranteed RVO or copy elision.

@sipa
Copy link
Member

sipa commented Nov 2, 2025

If we're going to touch this code, my preference would be to go with @purpleKarrot's approach of two separate functions, moving the vSolutions field into a return pair element.

If the allocation overhead of that is a concern, I think the proper solution is a follow-up to get rid of the vSolutions approach of encoding things, and instead introduce a proper type that encodes it more usefully (possibly an std::variant, like CTxDestination, but with more possibilities). For multisig-like results it wouldn't be entirely allocation-free, but it'd certainly get rid of most allocations in practice.

@l0rinc
Copy link
Contributor Author

l0rinc commented Nov 2, 2025

Thank you for the comments, appreciate the high-level review!

@purpleKarrot, I though of both solutions you have suggested (I also passionately hate output parameters) but since I meant this as a call-site-simplification while eliminating useless work (though the cases where the vSolutions was still needed was indeed slightly more awkward), populating the vector just to discard it immediately didn't seem like it would solve anything.

and instead introduce a proper type that encodes it more usefully

That could work, split the different results by type instead of an enum (Solve would basically be proper parser doing some of the work that's currently done on call-sites) so that they just lazily store minimal data and reconstruct whatever the particular type needs.
But this seems like a big refactor of an area of the code that I'm not actively working on so I'll close the PR - and removed my nack from #33645 (comment) since I can't provide a meaningful alternative anymore.

Thanks!

@l0rinc l0rinc closed this Nov 2, 2025
@l0rinc l0rinc deleted the l0rinc/solutions-vector-optional branch November 2, 2025 16:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants