Safegcd-based modular inverses in MuHash3072 #21590

sipa · 2021-04-04T04:26:05Z

This implements a safegcd-based modular inverse for MuHash3072. It is a fairly straightforward translation of the libsecp256k1 implementation, with the following changes:

Generic for 32-bit and 64-bit
Specialized for the specific MuHash3072 modulus (2^3072 - 1103717).
A bit more C++ish
Far fewer sanity checks

A benchmark is also included for MuHash3072::Finalize. The new implementation is around 100x faster on x86_64 for me (from 5.8 ms to 57 μs); for 32-bit code the factor is likely even larger.

For more information:

Original paper by Daniel J. Bernstein and Bo-Yin Yang
Implementation for libsecp256k1 by Peter Dettman; and the final version
Explanation of the algorithm using Python snippets
Analysis of the maximum number of iterations the algorithm needs
Formal proof in Coq by Russell O'Connor (for the 256-bit version of the algorithm; here we use a 3072-bit one).

sipa · 2021-04-06T23:11:27Z

Using libgmp for inverses is 1.5x-2x faster still, which is somewhat expected - there are several optimizations to safegcd that become more relevant for larger input sizes but aren't useful in the 256-bit code which this is adapted from as well.

I think it's fine to leave those for future improvements, as this already gets hash finalization down to ~1 signature check worth, which is probably far below what we care about.

laanwj · 2021-05-19T14:01:47Z

Concept and high-level review ACK. Did not check the algorithm in detail.

theStack · 2021-06-22T18:56:40Z

Concept ACK

DrahtBot · 2021-10-15T09:07:10Z

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/21590.

Reviews

See the guideline for information on the review process.

Type	Reviewers
ACK	TheCharlatan, dergoegge, achow101
Concept ACK	laanwj
Stale ACK	theStack

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#31507 ([POC] build: Use clang-cl to build on Windows natively by hebasto)
#31308 (ci, iwyu: Treat warnings as errors for specific targets by hebasto)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

sipa · 2021-11-15T16:22:45Z

Rebased.

configure.ac

sipa · 2021-12-01T21:17:11Z

I added additional fuzz tests, which seem to pass.

However, test/functional/feature_coinstatsindex.py now fails, and I haven't figured out why. @fjahr ?

src/test/fuzz/muhash.cpp

2c35a93 Generalize/simplify VectorReader into SpanReader (Pieter Wuille) Pull request description: Originally written for #21590 (safegcd-based MuHash inverses), but then found a better way that removed the need for it, so I'm submitting it independently. ACKs for top commit: MarcoFalke: re-ACK 2c35a93 🖨 shaavan: ACK 2c35a93 Tree-SHA512: 959e3251e0cfe20e13a50639b617c9dc2a561d613a0884d983c93d15dacb6d2305d760aa933d18ba055cef8a1651a344bcb6b3f93051ecf26d3f2efc5779efa4

2c35a93 Generalize/simplify VectorReader into SpanReader (Pieter Wuille) Pull request description: Originally written for bitcoin#21590 (safegcd-based MuHash inverses), but then found a better way that removed the need for it, so I'm submitting it independently. ACKs for top commit: MarcoFalke: re-ACK 2c35a93 🖨 shaavan: ACK 2c35a93 Tree-SHA512: 959e3251e0cfe20e13a50639b617c9dc2a561d613a0884d983c93d15dacb6d2305d760aa933d18ba055cef8a1651a344bcb6b3f93051ecf26d3f2efc5779efa4

fjahr · 2021-12-05T22:05:08Z

I added additional fuzz tests, which seem to pass.

However, test/functional/feature_coinstatsindex.py now fails, and I haven't figured out why. @fjahr ?

This isn't an issue with the implementation here but with me the test just being stupid. More info here #23681.

I will take a closer look at the code here soon!

c055f6b test: Remove false coinstatsindex test (Fabian Jahr) Pull request description: This test never actually tested the behavior that it describes in the comments. This was discovered in #21590 which seems to speed up muhash which lead to the test failing. I can vaguely remember that the described behavior was desired by some reviewers of `coinstatsindex`: That `coinstatsindex` should be aware of stale blocks and able to return statistics on them as well. The index actually does this for blocks that it sees while the index is active, i.e. while running `coinstatsindex` all blocks will be indexed and even when they become stale the index (via `gettxoutsetinfo`) will still return a result for them when given the right hash. But this currently does not work for blocks that the node saw and that became stale _before_ the node activated `coinstatsindex`. While the index syncs initially everything but the active chain is ignored and I don't see any indication that this ever worked differently in the past. Introducing this behavior seems non-trivial at first glance so, while I will give this a shot, I think the test should be removed so it does not confuse users and does not block #21590. Top commit has no ACKs. Tree-SHA512: b05f5dfeea3e453c8bb7c761501d0d896d4412a3f0c08037955951fae9fe388c63402da401792591e18da8fb67734f47f1a297d573cdb66e0ced451698718067

c055f6b test: Remove false coinstatsindex test (Fabian Jahr) Pull request description: This test never actually tested the behavior that it describes in the comments. This was discovered in bitcoin#21590 which seems to speed up muhash which lead to the test failing. I can vaguely remember that the described behavior was desired by some reviewers of `coinstatsindex`: That `coinstatsindex` should be aware of stale blocks and able to return statistics on them as well. The index actually does this for blocks that it sees while the index is active, i.e. while running `coinstatsindex` all blocks will be indexed and even when they become stale the index (via `gettxoutsetinfo`) will still return a result for them when given the right hash. But this currently does not work for blocks that the node saw and that became stale _before_ the node activated `coinstatsindex`. While the index syncs initially everything but the active chain is ignored and I don't see any indication that this ever worked differently in the past. Introducing this behavior seems non-trivial at first glance so, while I will give this a shot, I think the test should be removed so it does not confuse users and does not block bitcoin#21590. Top commit has no ACKs. Tree-SHA512: b05f5dfeea3e453c8bb7c761501d0d896d4412a3f0c08037955951fae9fe388c63402da401792591e18da8fb67734f47f1a297d573cdb66e0ced451698718067

PastaPastaPasta

please change all the c-style casts to functional casts, I gave specific comments on a few instances, but stopped so it's not as spammy

src/crypto/muhash.cpp

sedited

ACK ad67fd2

Just nits, and all of them can be ignored. This was fun to review, the explanations in the safegcd_implementation.md are excellent. I also profited from the two review clubs and their notes that were done on the original MuHash introduction.

src/crypto/muhash.cpp

sedited · 2024-11-07T09:41:07Z

src/crypto/muhash.cpp

+    ce -= (signed_double_limb_t)1103717 * me;
+    /* Verify that the low SIGNED_LIMB_SIZE bits of the computation are indeed zero, and then throw them away. */
+    assert((cd & MAX_SIGNED_LIMB) == 0);
+    assert((ce & MAX_SIGNED_LIMB) == 0);


Just a question: Compared to the code in modinv32_impl.h, there are fewer bound checks done here. Is this on purpose?

Which ones in particular are missing?

I meant these checks:
https://github.com/bitcoin/bitcoin/blob/master/src/secp256k1/src/modinv32_impl.h#L414-L419
https://github.com/bitcoin/bitcoin/blob/master/src/secp256k1/src/modinv32_impl.h#L456-L459

But I guess they are not really that important and it is not worth adding the required utilities?

sedited · 2024-11-07T12:12:36Z

src/crypto/muhash.cpp

+        // Add modulus if this was negative. This brings the range of *this to 1-2^3072..2^3072-1.
+        signed_limb_t cond_add = limbs[SIGNED_LIMBS-1] >> (LIMB_SIZE-1); // -1 if this is negative; 0 otherwise
+        limbs[0] += signed_limb_t(-MAX_PRIME_DIFF) & cond_add;
+        limbs[3072 / SIGNED_LIMB_SIZE] += (signed_limb_t(1) << (3072 % SIGNED_LIMB_SIZE)) & cond_add;


Just a question: Compared to modinv32_normalize, this step seems to skip the inner limbs. Is there a reason why there is a difference between the implementations here? IIUC this works out in the end because of the carry step.

Yes and no. They aren't skipped, the carry step is the equivalent of processing the inner limbs. They're just easier, because the modulus here can be represented as [1 << FINAL_LIMB_MODULUS_BITS, 0, 0, 0, ..., 0, 0, -MAX_PRIME_DIFF]. In the modinv32_impl.h code, the modulus is treated as generic, where any limb can be nonzero.

src/crypto/muhash.cpp

sedited · 2024-11-07T20:19:29Z

src/crypto/muhash.cpp

+    me -= (limb_t(0x70a1421da087d93) * limb_t(ce) + me) & MAX_SIGNED_LIMB;
+    /* Update the beginning of computation for t*[d,e]+modulus*[md,me] now md,me are known. */
+    cd -= (signed_double_limb_t)1103717 * md;
+    ce -= (signed_double_limb_t)1103717 * me;


This tripped me up at first, because something is added instead of subtracted compared to the implementation in modinv32_impl.h. But I think this is correct, because it subtracts the distance (I don't know what the correct terminology is for this) to the modulus instead of adding the modulus, which should work out to the same. Similarly, the operation in the for loop can also be moved to the final step, which I'm guessing is a further nice optimization.

Same comment as above. The modulus here is 2^3072 - MAX_PRIME_DIFF, which is represented in signed-limb representation as [1 << FINAL_LIMB_MODULUS_BITS, 0, 0, 0, ..., 0, -MAX_PRIME_DIFF], so we only need to do something for the bottom limb (where our modulus is negative) and the top limb.

theStack

Fuzz-tested ACK ad67fd2

With the friendly help of @dergoegge I managed to get differential fuzzing running last week and let that ran for the last ~77 hours. Here are the rough instructions for those who also want to give it a try:

Instructions for differential fuzzing using semsan

created branches on top of master and the PR each that add a characterization to the MuHash fuzz test, writing to a shared memory for comparison (see https://github.com/theStack/bitcoin/tree/muhash_characterization_master and https://github.com/theStack/bitcoin/tree/muhash_characterization_pr21590, cherry-picking the commit originally from dergoegge@d327378; note that the environment variable value had to be adapted to SEMSAN_CHARACTERIZATION_SHMEM_ID)
built the afl-clang-... binaries for clang 18:

$ git clone https://github.com/AFLplusplus/AFLplusplus
$ cd AFLplusplus
$ LLVM_CONFIG=llvm-config-18 make

built both branches mentioned in step 1 above using afl-clang-lto/afl-clang-lto++ (built in step 2):

$ cmake -B build_fuzz -DCMAKE_C_COMPILER="/path/to/AFLplusplus/afl-clang-lto" -DCMAKE_CXX_COMPILER="/path/to/AFLplusplus/afl-clang-lto++" -DBUILD_FOR_FUZZING=ON
...
$ cmake --build build_fuzz/
...

(Note that this can take quite a while. Unfortunately, using the -fast binaries didn't work for me and resulted in a linker error.)

cloned the qa-assets repo for the fuzzing seeds

$ git clone --depth=1 https://github.com/bitcoin-core/qa-assets

built the dergoegge's semsan tool and run it with each of the built fuzzing binaries above and the fuzzing seed:

$ https://github.com/dergoegge/semsan
$ cd semsan
$ cargo build --release
$ AFL_DEBUG=1 FUZZ=muhash ./target/release/semsan --debug-children /path/to/master_characterization_branch/build_fuzz/src/test/fuzz/fuzz /path/to/pr21590_characterization_branch/build_fuzz/src/test/fuzz/fuzz fuzz --seeds ~/qa-assets/fuzz_corpora/muhash/ --solutions ./solutions

wait and enjoy 🍻 🥃 🥩 🍨

The latest output looked like this on my machine:

[Client Heartbeat #0] run time: 77h-20m-55s, clients: 1, corpus: 22, objectives: 0, executions: 31816828, exec/sec: 114.3, combined-coverage: 262/563840 (0%), stability: 262/262 (100%)

DrahtBot added Build system Utils/log/libs labels Apr 4, 2021

DrahtBot mentioned this pull request Oct 15, 2021

refactor: Extract RipeMd160 #15294

Merged

DrahtBot mentioned this pull request Nov 10, 2021

build: boring autotools cleanup #23473

Merged

DrahtBot added the Needs rebase label Nov 13, 2021

sipa force-pushed the 202101_muhash_safegcd branch from 498d1df to f1f4eae Compare November 15, 2021 16:09

DrahtBot removed the Needs rebase label Nov 15, 2021

fanquake reviewed Nov 15, 2021

View reviewed changes

configure.ac Outdated Show resolved Hide resolved

sipa force-pushed the 202101_muhash_safegcd branch 2 times, most recently from 0bf165c to 58b108e Compare December 1, 2021 21:13

DrahtBot mentioned this pull request Dec 2, 2021

refactor: Use spans of std::byte in serialize #23438

Merged

maflcko reviewed Dec 2, 2021

View reviewed changes

src/test/fuzz/muhash.cpp Outdated Show resolved Hide resolved

src/test/fuzz/muhash.cpp Outdated Show resolved Hide resolved

sipa force-pushed the 202101_muhash_safegcd branch 2 times, most recently from 3821b14 to 9cb1bc3 Compare December 2, 2021 16:48

sipa mentioned this pull request Dec 2, 2021

Generalize/simplify VectorReader into SpanReader #23653

Merged

fjahr mentioned this pull request Dec 5, 2021

test: Remove false coinstatsindex test #23681

Merged

This was referenced Dec 18, 2021

docs: avoid C-style casts; use modern C++ casts #23810

Merged

Add ASM optimizations for MuHash3072 #19181

Closed

PastaPastaPasta suggested changes Feb 1, 2022

View reviewed changes

src/crypto/muhash.cpp Outdated Show resolved Hide resolved

src/crypto/muhash.cpp Outdated Show resolved Hide resolved

src/crypto/muhash.cpp Outdated Show resolved Hide resolved

src/crypto/muhash.cpp Outdated Show resolved Hide resolved

sipa force-pushed the 202101_muhash_safegcd branch from 02c333c to ad67fd2 Compare October 16, 2024 07:58

DrahtBot removed the CI failed label Oct 16, 2024

sedited approved these changes Nov 7, 2024

View reviewed changes

DrahtBot requested a review from dergoegge November 7, 2024 20:26

theStack approved these changes Nov 18, 2024

View reviewed changes

Safegcd-based modular inverses in MuHash3072 #21590

Safegcd-based modular inverses in MuHash3072 #21590

Uh oh!

Conversation

sipa commented Apr 4, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

sipa commented Apr 6, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

laanwj commented May 19, 2021

Uh oh!

theStack commented Jun 22, 2021

Uh oh!

DrahtBot commented Oct 15, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Coverage & Benchmarks

Reviews

Conflicts

Uh oh!

sipa commented Nov 15, 2021

Uh oh!

Uh oh!

sipa commented Dec 1, 2021

Uh oh!

Uh oh!

Uh oh!

fjahr commented Dec 5, 2021

Uh oh!

PastaPastaPasta left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

sedited left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

sedited Nov 7, 2024

Choose a reason for hiding this comment

Uh oh!

sipa Jan 9, 2025

Choose a reason for hiding this comment

Uh oh!

sedited Jan 9, 2025

Choose a reason for hiding this comment

Uh oh!

sedited Nov 7, 2024

Choose a reason for hiding this comment

Uh oh!

sipa Jan 9, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

sedited Nov 7, 2024

Choose a reason for hiding this comment

Uh oh!

sipa Jan 9, 2025

Choose a reason for hiding this comment

Uh oh!

theStack left a comment

Choose a reason for hiding this comment

Uh oh!

sipa commented Apr 4, 2021 •

edited

Loading

sipa commented Apr 6, 2021 •

edited

Loading

DrahtBot commented Oct 15, 2021 •

edited

Loading