-
Notifications
You must be signed in to change notification settings - Fork 38.6k
Use SipHash-2-4 for various non-cryptographic hashes #8020
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Concept ACK. |
|
Concept ACK |
src/main.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if pnode is 32-bit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch. I've replaced it by using pnode->id.
|
Concept ACK |
|
Here is a comment about using SipHash-1-3 instead of SipHash-2-4: rust-lang/rust#29754 (comment) It would be approximately twice as fast for hashing txids, and the distinguisher mentioned in that comment is not relevant for data that is a multiple of 8 bytes (which is the case for us), as 8 bytes of padding are added in that case. Opinions? |
|
Siphash 2-4 seems to be the defacto 'safe' choice. Siphash 1-3 happens to be secure for our use case right now, but that sounds a tad brittle: as if a small change in our use case, or a small advancement in cryptoanalysis of the function, could mean that the known weakness does suddenly affect us? Not sure how much hashing the txids is a bottleneck in the whole scheme of things around CCoinsCache, it's only 32 bytes after all. Conservatively I'd say only consider 'downgrading' if performance is seriously affected by going from 29 cycles to 49 cycles. On the other hand it's pretty nice if using a hash with better security properties could be faster than the Lookup3/XOR scheme, so I see the attraction in that. |
|
Did some better benchmarks:
|
|
Thanks @sipa 👍 |
src/hash.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd be nice to not break aliasing rules here? Shouldnt the compiler optimize out a manual-shift-or?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
|
Going to stick to SipHash-2-4 for now. We can always switch to something else later. Ready for merge, I think. |
|
utACK 658c6481616f12a3466d1ddeb23eb91788ab2e66 |
|
ACK |
src/uint256.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you make this more readable?
Something like
(uint64_t(ptr[0]) << 0) |
(uint64_t(ptr[1]) << 8) |
(uint64_t(ptr[2]) << 16) |
(uint64_t(ptr[3]) << 24) |
(uint64_t(ptr[4]) << 32) |
(uint64_t(ptr[5]) << 40) |
(uint64_t(ptr[6]) << 48) |
(uint64_t(ptr[7]) << 56))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
This is ~1.7x slower than the Lookup3-of-Xor-with-salt construct we were using before, but it is a primitive designed for exactly this.
|
ACK a68ec21 |
|
I'm trying to work out what problem these siphashes fix. What is the risk that this solves please? |
|
It's fast and cryptographically solid - which reduces DoS risk by attackers remotely predicting the behavior of the hash function. As I quoted above already, pretty much same rationale as for Python: https://www.python.org/dev/peps/pep-0456/ |
For future reference. Command: Before: After: Spec: i7-8700 CPU @ 3.20GHz, 32GB RAM, NVMe SSD. |
cbdd0e1 Use SipHash-2-4 for address relay selection (Pieter Wuille) cf27341 Switch CTxMempool::mapTx to use a hash index for txids (Pieter Wuille) 7e636b2 Use SipHash-2-4 for CCoinsCache index (random-zebra) 15ee581 [Core] implement GetUint64 on uint256 (random-zebra) b2e5c4a Add SipHash-2-4 primitives to hash (Pieter Wuille) Pull request description: Backports bitcoin#8020 > Use SipHash-2-4 for: > > - CCoinsViewCache hashmap (instead of a custom Lookup3/XOR scheme) > - CTxMempool::mapTx txid index (converting from an ordered map) > - Address relay peer selection (instead of SHA256) > > Computing a hash for a txid using this takes around 52 CPU cycles in benchmarks (the Lookup3/XOR based mechanism used for CCoinsViewCache took 31 cycles). I think that's negligible still, while using a much more standard construct designed for such purposes. ACKs for top commit: Fuzzbawls: ACK cbdd0e1 furszy: ACK cbdd0e1 Tree-SHA512: 223f2f119c94991f1e3594afbacc2dd2476a689b9127860c7e8e244052e5e2cddc486ee44f5345370185f1cd337dfd67c2ec9a44e9dcc8f6cc391a011364ce8f
Backport of bitcoin/bitcoin#8020 Cherry-picked from bitcoin/bitcoin#8020 - cherry-pick bitcoin/bitcoin@a68ec21f7e - conflict with bitcoin/bitcoin@957e5d2 - cherry-pick bitcoin/bitcoin@8cc9cfe160 - cherry-pick bitcoin/bitcoin@382c871d28 - replace CCoinsKeyHasher with SaltedTxidHasher - cherry-pick bitcoin/bitcoin@0b1295b066 - self-conflict with previously-applied bitcoin/bitcoin@b8a657936 (#3180)
Use SipHash-2-4 for:
Computing a hash for a txid using this takes around 52 CPU cycles in benchmarks (the Lookup3/XOR based mechanism used for CCoinsViewCache took 31 cycles). I think that's negligible still, while using a much more standard construct designed for such purposes.