-
Notifications
You must be signed in to change notification settings - Fork 38.7k
Follow-ups to Bech32 error detection #23577
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@ryanofsky and @sipa, this addresses most of your review comments in #16807 so you may like to take a look, thanks! |
Wow, this makes the compiler generate the table? Agree this is better for understandability, just wonder, does this have any noticeable impact on compilation time? |
It shouldn't do, each element of the table only takes a few bitwise operations on 16 bit integers and the table is only 1024 elements. EDIT: a quick test compiling that file using clang takes 0.24 on master and 0.28 with this PR so it is pretty negligible overall. |
|
I checked the assembly output from clang 13.0.0 and gcc 9.3.0 that the tables are indeed a) generated constants in Full values for both tables can be found here: https://0bin.net/paste/Lp6hKmVD#EwTuc1KFIDzy57g0g3EvtQiLIj3HWD4e+OGPWUJTwQ+ (I did not have anything to check them against). |
Looks like the |
|
I will review in detail shortly, but I suspect that @meshcollider perhaps included the table changes corresponding to sipa/bech32#64 here? There is Sage code there that generates the tables too, if someone wants something to compare against. |
Both tables are. From the commit:
I began this test with the intent to compare if they were the same, then noticed they were not. But it's expected (they use a different encoding). Edit: I'll check against @sipa's sage output. |
|
I'm happy to split that commit into two, the first deriving the existing table and then the second modifying to the new table, if that would help review? The length difference (1024 vs 1023) is just because the sage code also includes (a)^1023 = 1 = (a)^0. |
|
ACK 98124a63946d6f7d665864adbef35a863423ca47, with or without nits. |
|
Updated to address all of @sipa's nits, and split the GF1024 constexpr commit into two:
|
src/bech32.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After the recent push I cannot find GF1024_EXP and GF1024_LOG as symbols anymore in the compiled output. I think that makes sense as they're references to tables now, which is fine, just need to adapt my comparison script.
|
Thanks, re-checked:
Code review ACK 1e24b1d385d64892389985afc058c83410c44311 |
|
ACK 1e24b1d |
|
Verified the generated tables by adding the assertions generated by this Sage code: # Binary field
B.<b> = GF(2)
# Polynomials over the binary field
BP.<bp> = B[]
# Encoder field GF(32)
F.<f> = GF(32, modulus=bp^5 + bp^3 + 1, repr='int')
# Polynomials over the encoder field
FP.<fp> = F[]
# Decoder field GF(1024)
E.<e> = F.extension(modulus=fp^2 + F.fetch_int(9)*fp + F.fetch_int(23))
# Convert an E field element to an integer 0..1023.
def gf_to_int(v):
return v[0].integer_representation() + 32*v[1].integer_representation()
# Convert an integer 0..1023 to an E field element.
def int_to_gf(v):
return F.fetch_int(v & 31) + e*F.fetch_int(v >> 5)
GF1024_EXP = [gf_to_int(e**i) for i in range(1023)]
GF1024_LOG = [-1] + [discrete_log(int_to_gf(i), e, ord=1023, operation="*") for i in range(1, 1024)]
for i in range(1023):
print("static_assert(GF1024_EXP[%i] == %i);" % (i, GF1024_EXP[i]))
for i in range(1024):
print("static_assert(GF1024_LOG[%i] == %i);" % (i, GF1024_LOG[i]))
SYNDROME_CONSTS = [None for _ in range(25)]
a1, a2, a3 = e**997, e**998, e**999
for k in range(1, 6):
for b in range(5):
c = (gf_to_int(a1**k * int_to_gf(2)**b) |
gf_to_int(a2**k * int_to_gf(2)**b) << 10 |
gf_to_int(a3**k * int_to_gf(2)**b) << 20)
SYNDROME_CONSTS[(k-1)*5 + b] = c
for i in range(25):
print("static_assert(SYNDROME_CONSTS[%i] == %i);" % (i, SYNDROME_CONSTS[i])) |
This follows PR 64 of the sipa/bech32 repo.
|
Only change: - constexpr const std::array<uint32_t, 25>& SYNDROME_CONSTS = GenerateSyndromeConstants();
+ constexpr std::array<uint32_t, 25> SYNDROME_CONSTS = GenerateSyndromeConstants();as per @sipa's comment. |
|
ACK 2fa4fd1, verified the tables by inserting assertions generated by a Sage script. |
ryanofsky
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code review ACK 2fa4fd1 with caveat that I didn't try to understand or verify the math here, I just checked that all comments describing the math matched up with the code, and that all the changes seemed like improvements.
|
ACK 353b463 |
|
Re-ACK a4fe701 |
maflcko
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
left a nit
|
|
||
| /** Find index of an incorrect character in a Bech32 string. */ | ||
| std::string LocateErrors(const std::string& str, std::vector<int>& error_locations) { | ||
| std::pair<std::string, std::vector<int>> LocateErrors(const std::string& str) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The docstring reads odd in doxygen:
"Find index of an incorrect character in a Bech32 string. Return the positions of errors in a Bech32 string. "
Also, std::make_pair is not needed with C++11/17
diff --git a/src/bech32.cpp b/src/bech32.cpp
index 3cda1dfff5..abc617467d 100644
--- a/src/bech32.cpp
+++ b/src/bech32.cpp
@@ -111,7 +111,7 @@ constexpr std::pair<std::array<int16_t, 1023>, std::array<int16_t, 1024>> Genera
GF1024_LOG[v] = i;
}
- return std::make_pair(GF1024_EXP, GF1024_LOG);
+ return {GF1024_EXP, GF1024_LOG};
}
constexpr auto tables = GenerateGFTables();
@@ -395,27 +395,27 @@ DecodeResult Decode(const std::string& str) {
return {result, std::move(hrp), data(values.begin(), values.end() - 6)};
}
-/** Find index of an incorrect character in a Bech32 string. */
-std::pair<std::string, std::vector<int>> LocateErrors(const std::string& str) {
+std::pair<std::string, std::vector<int>> LocateErrors(const std::string& str)
+{
std::vector<int> error_locations{};
if (str.size() > 90) {
error_locations.resize(str.size() - 90);
std::iota(error_locations.begin(), error_locations.end(), 90);
- return std::make_pair("Bech32 string too long", std::move(error_locations));
+ return {"Bech32 string too long", std::move(error_locations)};
}
if (!CheckCharacters(str, error_locations)){
- return std::make_pair("Invalid character or mixed case", std::move(error_locations));
+ return {"Invalid character or mixed case", std::move(error_locations)};
}
size_t pos = str.rfind('1');
if (pos == str.npos) {
- return std::make_pair("Missing separator", std::vector<int>{});
+ return {"Missing separator", std::vector<int>{}};
}
if (pos == 0 || pos + 7 > str.size()) {
error_locations.push_back(pos);
- return std::make_pair("Invalid separator position", std::move(error_locations));
+ return {"Invalid separator position", std::move(error_locations)};
}
std::string hrp;
@@ -430,7 +430,7 @@ std::pair<std::string, std::vector<int>> LocateErrors(const std::string& str) {
int8_t rev = CHARSET_REV[c];
if (rev == -1) {
error_locations.push_back(i);
- return std::make_pair("Invalid Base 32 character", std::move(error_locations));
+ return {"Invalid Base 32 character", std::move(error_locations)};
}
values[i - pos - 1] = rev;
}
@@ -550,7 +550,7 @@ std::pair<std::string, std::vector<int>> LocateErrors(const std::string& str) {
}
} else {
// No errors
- return std::make_pair("", std::vector<int>{});
+ return {"", std::vector<int>{}};
}
if (error_locations.empty() || (!possible_errors.empty() && possible_errors.size() < error_locations.size())) {
@@ -562,7 +562,7 @@ std::pair<std::string, std::vector<int>> LocateErrors(const std::string& str) {
: error_encoding == Encoding::BECH32 ? "Invalid Bech32 checksum"
: "Invalid checksum";
- return std::make_pair(error_message, std::move(error_locations));
+ return {error_message, std::move(error_locations)};
}
} // namespace bech32
diff --git a/src/bech32.h b/src/bech32.h
index 5e89e6efda..707299f19e 100644
--- a/src/bech32.h
+++ b/src/bech32.h
@@ -45,7 +45,7 @@ struct DecodeResult
/** Decode a Bech32 or Bech32m string. */
DecodeResult Decode(const std::string& str);
-/** Return the positions of errors in a Bech32 string. */
+/** Return the positions of errors in a Bech32 string. (Empty string if no errors can be found). */
std::pair<std::string, std::vector<int>> LocateErrors(const std::string& str);
} // namespace bech32|
More nits:
|
a4fe701 Make Bech32 LocateErrors return error list rather than using out-arg (Samuel Dobson) 2fa4fd1 Use std::iota instead of manually pushing range (Samuel Dobson) 405c96f Use bounds-checked array lookups in Bech32 error detection code (Samuel Dobson) 28d9c28 Simplify encoding of e in GF(1024) tables to (1,0) (Samuel Dobson) 14358a0 Replace GF1024 tables and syndrome constants with compile-time generated constexprs. (Samuel Dobson) 63f7b69 Update release note for bech32 error detection (Samuel Dobson) c8b9a22 Report encoding type in bech32 error message (Samuel Dobson) 92f0caf Improve Bech32 boost tests (Samuel Dobson) bb4d3e9 Address review comments for Bech32 error validation (Samuel Dobson) Pull request description: A number of follow-ups and improvements to the bech32 error location code, introduced in bitcoin#16807. Notably, this removes the hardcoded GF1024 tables in favour of constexpr table generation. ACKs for top commit: laanwj: Re-ACK a4fe701 Tree-SHA512: 6312373c20ebd6636f5797304876fa0d70fa777de2f6c507245f51a652b3d1224ebc55b236c9e11e6956c1e88e65faadab51d53587078efccb451455aa2e2276
f5e22de Make Bech32 LocateErrors return error list rather than using out-arg (Samuel Dobson) a94d838 Use std::iota instead of manually pushing range (Samuel Dobson) e32e83d Use bounds-checked array lookups in Bech32 error detection code (Samuel Dobson) af1c914 Simplify encoding of e in GF(1024) tables to (1,0) (Samuel Dobson) a85037e Replace GF1024 tables and syndrome constants with compile-time generated constexprs. (Samuel Dobson) 9ac54d3 Update release note for bech32 error detection (Samuel Dobson) 5fb7801 Report encoding type in bech32 error message (Samuel Dobson) 1fc2945 Improve Bech32 boost tests (Samuel Dobson) e242dd3 Address review comments for Bech32 error validation (Samuel Dobson) Pull request description: A number of follow-ups and improvements to the bech32 error location code, introduced in #16807. Notably, this removes the hardcoded GF1024 tables in favour of constexpr table generation. ACKs for top commit: laanwj: Re-ACK f5e22de Tree-SHA512: 6312373c20ebd6636f5797304876fa0d70fa777de2f6c507245f51a652b3d1224ebc55b236c9e11e6956c1e88e65faadab51d53587078efccb451455aa2e2276
A number of follow-ups and improvements to the bech32 error location code, introduced in #16807.
Notably, this removes the hardcoded GF1024 tables in favour of constexpr table generation.