Skip to content

Conversation

@maflcko
Copy link
Member

@maflcko maflcko commented May 27, 2022

Seems a bit confusing to happily accept random bytes and pretend they are hex encoded strings.

@maflcko maflcko force-pushed the 2205-reject-non-hex- branch from fa83a82 to 6d5fba8 Compare May 27, 2022 14:11
@maflcko
Copy link
Member Author

maflcko commented May 27, 2022

Looks like this may be used by salvage wallet, but I am not sure if it is worth it to keep for that?

@sipa
Copy link
Member

sipa commented May 27, 2022

I don't think the distinction matters for salvage; it looks like it's bdb producing the hex dump output lines at runtime, so reading anything but hex data from it would imply buggy/incorrect use of bdb in the first place.

More generally... what about instead moving to a model where this hex parsing returns an std::optional<std::vector<std::byte>>, so callers are free to distinguish invalid input from empty input?

@laanwj
Copy link
Member

laanwj commented May 27, 2022

Concept ACK.

But if we're going to change this I'd prefer to return an optional<> so that the empty vector can be distinguished from the error condition.

@DrahtBot
Copy link
Contributor

DrahtBot commented May 28, 2022

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK stickies-v
Concept ACK laanwj, vincenzopalazzo
Stale ACK pinheadmz

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

No conflicts as of last run.

Copy link

@vincenzopalazzo vincenzopalazzo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK, and I also vote for the optional parameter, that is already used in the code base and we can try to unify the paradigm used

@maflcko maflcko changed the title Return empty vector on invalid hex encoding Return nullopt on invalid hex encoding May 30, 2022
@maflcko maflcko changed the title Return nullopt on invalid hex encoding Handle invalid hex encoding in ParseHex May 30, 2022
@maflcko maflcko force-pushed the 2205-reject-non-hex- branch from 6d5fba8 to 0aff758 Compare May 30, 2022 13:24
@maflcko maflcko marked this pull request as draft May 30, 2022 14:27
@maflcko maflcko force-pushed the 2205-reject-non-hex- branch from 0aff758 to fad748a Compare December 13, 2022 15:49
@maflcko
Copy link
Member Author

maflcko commented Dec 13, 2022

Ok, rebased and added a new function for optional. The old function remains an alias, with the fallback to an empty vector. This avoid having to change all existing code and making it needlessly verbose, because it is already properly handling empty vectors.

Unrelated: It is possible to convert hex at compile time
#include <array>
#include <cstddef>
#include <string_view>

#define FromHex(chars_lit)                                 \
    ([&]() {                                               \
        using namespace std::literals;                     \
        constexpr std::string_view hex_str{chars_lit##sv}; \
        static_assert(IsHex(hex_str));                     \
        std::array<std::byte, hex_str.size() / 2> b{};     \
        auto it = hex_str.begin();                         \
        for (auto& i : b) {                                \
            auto c1 = HexDigit(*(it++));                   \
            auto c2 = HexDigit(*(it++));                   \
            i = std::byte(c1 << 4) | std::byte(c2);        \
        }                                                  \
                                                           \
        return b;                                          \
    }())

Use:

constexpr auto a{FromHex("ffaa")}; // OK
constexpr auto a{FromHex("xxYx")}; // compile failure

@maflcko maflcko marked this pull request as ready for review December 13, 2022 18:45
@maflcko maflcko force-pushed the 2205-reject-non-hex- branch 2 times, most recently from fa68745 to fa6ec61 Compare January 3, 2023 11:34
Copy link
Contributor

@stickies-v stickies-v left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK

@fanquake fanquake requested a review from pinheadmz February 23, 2023 16:06
Copy link
Member

@pinheadmz pinheadmz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK fad0c892c34c30cf8f50e832425210e24d45837e

Built and ran all tests, reviewed code. I also inserted a few other test cases but nothing broke so, take em or leave em:

Ff aA (mixed case / multiple spaces, succeeds)
F F (spaces between nibbles, fails as expected)

Show Signature
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

ACK fad0c892c34c30cf8f50e832425210e24d45837e
-----BEGIN PGP SIGNATURE-----

iQIzBAEBCAAdFiEE5hdzzW4BBA4vG9eM5+KYS2KJyToFAmP5A/QACgkQ5+KYS2KJ
yToaJw/+OHwTtj30d86GvVcGf4v7Hz0nVl+qVI0/BNE+xEcVEOkfK47yYoIIuJ40
uWiW7ts9s8ERQ6o60nVRBm2V8yEjjbD1kJf4+qJiWIEVvIkgzXrshziqy8RGDeaj
yMpdSEwRmysnWQ4/hk6vVJ3+bn3dQfKCv7QA/CEEwcWQh148jZxPzSZwAuaU7dWH
uzmA+P3AYFYJ+jfOVlahWnNVlmFDedJixXV6qwEyBZbpC9oUwIgb2pMcbalojktS
G1IafwHMl0dvkhxVRlwKQK1I1tjoqaAT7Dfv/Y3wJaW4rjewVVE/Vs5iAheUmpKI
bSsGGCwmiv+leyBpWltwyYDklzOt2ITysVcVjpNVE4TmfcFK3z3b30rdQGvGYc2Q
UPG6GjOOZ1Kvb07oy0DcICbNkLTg5r4ZEqCRZFn9+o/Eu90EjHBD9PzvNAzk4yw0
y5GinmmwmBHXNCHBYbU+ZrjrYJSmAjOORoOrYin7Gt6BQ8nIlA8f8LurTmj9/RJp
Fj6vrfk8EjOmyPFoxzVOC7woirK8PMF8EU9yJIebaIzX2PzjSEXiTOhOV/BJs1bL
5WhjCHuqYdHWKS609twwvpEdQXFquThcLlmUATI+0jNt5gsegxdkF39BYQpr95B2
hMS/yJb18ywl4HOa/Bk4HyRFekI+EUbkeFxwlruerwxzzKtKUQE=
=qi0Q
-----END PGP SIGNATURE-----

pinheadmz's public key is on keybase

Copy link
Contributor

@stickies-v stickies-v left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK fad0c892c34c30cf8f50e832425210e24d45837e

An API that fails on invalid input instead of accepting the partial valid part is more robust imo. I don't see any places where this behaviour change will cause issues, but there are a lot of callsites.


Looking at the callsites of ParseHex(), it looks like there's a significant number of places where we first check IsHex() and then ParseHex(), iterating over the string twice when I think this can just be replaced with a single TryParseHex(). Besides being (I'd assume) more efficient, the main behaviour difference I can see is that IsHex() also returns false on an empty string, but in the proposed diff I don't think that makes a difference.

Can just as well be done in a follow-up, though, it's a much bigger diff than the current PR. (Also: didn't yet super thoroughly check all the template specifiers but happy to revise if you think this approach makes sense).

git diff
diff --git a/src/bitcoin-tx.cpp b/src/bitcoin-tx.cpp
index dc3038316..7b57b0637 100644
--- a/src/bitcoin-tx.cpp
+++ b/src/bitcoin-tx.cpp
@@ -442,12 +442,10 @@ static void MutateTxAddOutData(CMutableTransaction& tx, const std::string& strIn
     // extract and validate DATA
     const std::string strData{strInput.substr(pos, std::string::npos)};
 
-    if (!IsHex(strData))
-        throw std::runtime_error("invalid TX output data");
+    auto data{TryParseHex<unsigned char>(strData)};
+    if (!data.has_value()) throw std::runtime_error("invalid TX output data");
 
-    std::vector<unsigned char> data = ParseHex(strData);
-
-    CTxOut txout(value, CScript() << OP_RETURN << data);
+    CTxOut txout(value, CScript() << OP_RETURN << data.value());
     tx.vout.push_back(txout);
 }
 
diff --git a/src/core_read.cpp b/src/core_read.cpp
index 7bab171c8..b580e7867 100644
--- a/src/core_read.cpp
+++ b/src/core_read.cpp
@@ -194,20 +194,16 @@ static bool DecodeTx(CMutableTransaction& tx, const std::vector<unsigned char>&
 
 bool DecodeHexTx(CMutableTransaction& tx, const std::string& hex_tx, bool try_no_witness, bool try_witness)
 {
-    if (!IsHex(hex_tx)) {
-        return false;
-    }
-
-    std::vector<unsigned char> txData(ParseHex(hex_tx));
-    return DecodeTx(tx, txData, try_no_witness, try_witness);
+    auto tx_data{TryParseHex<unsigned char>(hex_tx)};
+    if (!tx_data) return false;
+    return DecodeTx(tx, tx_data.value(), try_no_witness, try_witness);
 }
 
 bool DecodeHexBlockHeader(CBlockHeader& header, const std::string& hex_header)
 {
-    if (!IsHex(hex_header)) return false;
-
-    const std::vector<unsigned char> header_data{ParseHex(hex_header)};
-    CDataStream ser_header(header_data, SER_NETWORK, PROTOCOL_VERSION);
+    auto header_data{TryParseHex<unsigned char>(hex_header)};
+    if (!header_data) return false;
+    CDataStream ser_header(header_data.value(), SER_NETWORK, PROTOCOL_VERSION);
     try {
         ser_header >> header;
     } catch (const std::exception&) {
@@ -218,11 +214,9 @@ bool DecodeHexBlockHeader(CBlockHeader& header, const std::string& hex_header)
 
 bool DecodeHexBlk(CBlock& block, const std::string& strHexBlk)
 {
-    if (!IsHex(strHexBlk))
-        return false;
-
-    std::vector<unsigned char> blockData(ParseHex(strHexBlk));
-    CDataStream ssBlock(blockData, SER_NETWORK, PROTOCOL_VERSION);
+    auto block_data{TryParseHex<unsigned char>(strHexBlk)};
+    if (!block_data) return false;
+    CDataStream ssBlock(block_data.value(), SER_NETWORK, PROTOCOL_VERSION);
     try {
         ssBlock >> block;
     }
@@ -247,9 +241,9 @@ std::vector<unsigned char> ParseHexUV(const UniValue& v, const std::string& strN
     std::string strHex;
     if (v.isStr())
         strHex = v.getValStr();
-    if (!IsHex(strHex))
-        throw std::runtime_error(strName + " must be hexadecimal string (not '" + strHex + "')");
-    return ParseHex(strHex);
+    auto hex{TryParseHex<unsigned char>(strHex)};
+    if (!hex) throw std::runtime_error(strName + " must be hexadecimal string (not '" + strHex + "')");
+    return hex.value();
 }
 
 int ParseSighashString(const UniValue& sighash)
diff --git a/src/rpc/util.cpp b/src/rpc/util.cpp
index 85158c99c..3e68182d5 100644
--- a/src/rpc/util.cpp
+++ b/src/rpc/util.cpp
@@ -112,9 +112,10 @@ std::vector<unsigned char> ParseHexV(const UniValue& v, std::string strName)
     std::string strHex;
     if (v.isStr())
         strHex = v.get_str();
-    if (!IsHex(strHex))
-        throw JSONRPCError(RPC_INVALID_PARAMETER, strName+" must be hexadecimal string (not '"+strHex+"')");
-    return ParseHex(strHex);
+    auto hex{TryParseHex<unsigned char>(strHex)};
+    if (!hex) throw JSONRPCError(RPC_INVALID_PARAMETER, strName+" must be hexadecimal string (not '"+strHex+"')");
+
+    return hex.value();
 }
 std::vector<unsigned char> ParseHexO(const UniValue& o, std::string strKey)
 {
@@ -198,10 +199,9 @@ std::string HelpExampleRpcNamed(const std::string& methodname, const RPCArgList&
 // Converts a hex string to a public key if possible
 CPubKey HexToPubKey(const std::string& hex_in)
 {
-    if (!IsHex(hex_in)) {
-        throw JSONRPCError(RPC_INVALID_ADDRESS_OR_KEY, "Invalid public key: " + hex_in);
-    }
-    CPubKey vchPubKey(ParseHex(hex_in));
+    auto hex{TryParseHex<uint8_t>(hex_in)};
+    if (!hex) throw JSONRPCError(RPC_INVALID_ADDRESS_OR_KEY, "Invalid public key: " + hex_in);
+    CPubKey vchPubKey(hex.value());
     if (!vchPubKey.IsFullyValid()) {
         throw JSONRPCError(RPC_INVALID_ADDRESS_OR_KEY, "Invalid public key: " + hex_in);
     }
diff --git a/src/script/descriptor.cpp b/src/script/descriptor.cpp
index 864eb8864..83d465039 100644
--- a/src/script/descriptor.cpp
+++ b/src/script/descriptor.cpp
@@ -1082,9 +1082,9 @@ std::unique_ptr<PubkeyProvider> ParsePubkeyInner(uint32_t key_exp_index, const S
         return nullptr;
     }
     if (split.size() == 1) {
-        if (IsHex(str)) {
-            std::vector<unsigned char> data = ParseHex(str);
-            CPubKey pubkey(data);
+        auto data{TryParseHex<unsigned char>(str)};
+        if (data) {
+            CPubKey pubkey(data.value());
             if (pubkey.IsFullyValid()) {
                 if (permit_uncompressed || pubkey.IsCompressed()) {
                     return std::make_unique<ConstPubkeyProvider>(key_exp_index, pubkey, false);
@@ -1092,9 +1092,9 @@ std::unique_ptr<PubkeyProvider> ParsePubkeyInner(uint32_t key_exp_index, const S
                     error = "Uncompressed keys are not allowed";
                     return nullptr;
                 }
-            } else if (data.size() == 32 && ctx == ParseScriptContext::P2TR) {
+            } else if (data.value().size() == 32 && ctx == ParseScriptContext::P2TR) {
                 unsigned char fullkey[33] = {0x02};
-                std::copy(data.begin(), data.end(), fullkey + 1);
+                std::copy(data.value().begin(), data.value().end(), fullkey + 1);
                 pubkey.Set(std::begin(fullkey), std::end(fullkey));
                 if (pubkey.IsFullyValid()) {
                     return std::make_unique<ConstPubkeyProvider>(key_exp_index, pubkey, true);
@@ -1160,15 +1160,15 @@ std::unique_ptr<PubkeyProvider> ParsePubkey(uint32_t key_exp_index, const Span<c
         return nullptr;
     }
     std::string fpr_hex = std::string(slash_split[0].begin(), slash_split[0].end());
-    if (!IsHex(fpr_hex)) {
+    auto fpr_bytes{TryParseHex<unsigned char>(fpr_hex)};
+    if (!fpr_bytes) {
         error = strprintf("Fingerprint '%s' is not hex", fpr_hex);
         return nullptr;
     }
-    auto fpr_bytes = ParseHex(fpr_hex);
     KeyOriginInfo info;
     static_assert(sizeof(info.fingerprint) == 4, "Fingerprint must be 4 bytes");
-    assert(fpr_bytes.size() == 4);
-    std::copy(fpr_bytes.begin(), fpr_bytes.end(), info.fingerprint);
+    assert(fpr_bytes.value().size() == 4);
+    std::copy(fpr_bytes.value().begin(), fpr_bytes.value().end(), info.fingerprint);
     if (!ParseKeyPath(slash_split, info.path, error)) return nullptr;
     auto provider = ParsePubkeyInner(key_exp_index, origin_split[1], ctx, out, error);
     if (!provider) return nullptr;
@@ -1488,12 +1488,12 @@ std::unique_ptr<DescriptorImpl> ParseScript(uint32_t& key_exp_index, Span<const
     }
     if (ctx == ParseScriptContext::TOP && Func("raw", expr)) {
         std::string str(expr.begin(), expr.end());
-        if (!IsHex(str)) {
+        auto bytes{TryParseHex<unsigned char>(str)};
+        if (!bytes) {
             error = "Raw script is not hex";
             return nullptr;
         }
-        auto bytes = ParseHex(str);
-        return std::make_unique<RawDescriptor>(CScript(bytes.begin(), bytes.end()));
+        return std::make_unique<RawDescriptor>(CScript(bytes.value().begin(), bytes.value().end()));
     } else if (Func("raw", expr)) {
         error = "Can only have raw() at top level";
         return nullptr;
diff --git a/src/script/miniscript.h b/src/script/miniscript.h
index fa3b0350e..f400aac44 100644
--- a/src/script/miniscript.h
+++ b/src/script/miniscript.h
@@ -1001,10 +1001,10 @@ std::optional<std::pair<std::vector<unsigned char>, int>> ParseHexStrEnd(Span<co
     int hash_size = FindNextChar(in, ')');
     if (hash_size < 1) return {};
     std::string val = std::string(in.begin(), in.begin() + hash_size);
-    if (!IsHex(val)) return {};
-    auto hash = ParseHex(val);
-    if (hash.size() != expected_size) return {};
-    return {{std::move(hash), hash_size}};
+    auto hash{TryParseHex<unsigned char>(val)};
+    if (!hash) return {};
+    if (hash.value().size() != expected_size) return {};
+    return {{std::move(hash.value()), hash_size}};
 }
 
 /** BuildBack pops the last two elements off `constructed` and wraps them in the specified Fragment */
diff --git a/src/wallet/dump.cpp b/src/wallet/dump.cpp
index efa548ad9..55fa278e0 100644
--- a/src/wallet/dump.cpp
+++ b/src/wallet/dump.cpp
@@ -240,22 +240,21 @@ bool CreateFromDump(const ArgsManager& args, const std::string& name, const fs::
                 continue;
             }
 
-            if (!IsHex(key)) {
+            auto k{TryParseHex<unsigned char>(key)};
+            if (!k) {
                 error = strprintf(_("Error: Got key that was not hex: %s"), key);
                 ret = false;
                 break;
             }
-            if (!IsHex(value)) {
+            auto v{TryParseHex<unsigned char>(value)};
+            if (!v) {
                 error = strprintf(_("Error: Got value that was not hex: %s"), value);
                 ret = false;
                 break;
             }
 
-            std::vector<unsigned char> k = ParseHex(key);
-            std::vector<unsigned char> v = ParseHex(value);
-
-            CDataStream ss_key(k, SER_DISK, CLIENT_VERSION);
-            CDataStream ss_value(v, SER_DISK, CLIENT_VERSION);
+            CDataStream ss_key(k.value(), SER_DISK, CLIENT_VERSION);
+            CDataStream ss_value(v.value(), SER_DISK, CLIENT_VERSION);
 
             if (!batch->Write(ss_key, ss_value)) {
                 error = strprintf(_("Error: Unable to write record to new wallet"));
diff --git a/src/wallet/rpc/backup.cpp b/src/wallet/rpc/backup.cpp
index 95117b6c1..382b731d4 100644
--- a/src/wallet/rpc/backup.cpp
+++ b/src/wallet/rpc/backup.cpp
@@ -290,9 +290,8 @@ RPCHelpMan importaddress()
             pwallet->MarkDirty();
 
             pwallet->ImportScriptPubKeys(strLabel, {GetScriptForDestination(dest)}, /*have_solving_data=*/false, /*apply_label=*/true, /*timestamp=*/1);
-        } else if (IsHex(request.params[0].get_str())) {
-            std::vector<unsigned char> data(ParseHex(request.params[0].get_str()));
-            CScript redeem_script(data.begin(), data.end());
+        } else if (auto data{TryParseHex<unsigned char>(request.params[0].get_str())}) {
+            CScript redeem_script(data.value().begin(), data.value().end());
 
             std::set<CScript> scripts = {redeem_script};
             pwallet->ImportScripts(scripts, /*timestamp=*/0);
@@ -463,10 +462,10 @@ RPCHelpMan importpubkey()
         throw JSONRPCError(RPC_WALLET_ERROR, "Wallet is currently rescanning. Abort existing rescan or wait.");
     }
 
-    if (!IsHex(request.params[0].get_str()))
-        throw JSONRPCError(RPC_INVALID_ADDRESS_OR_KEY, "Pubkey must be a hex string");
-    std::vector<unsigned char> data(ParseHex(request.params[0].get_str()));
-    CPubKey pubKey(data);
+    auto data{TryParseHex<unsigned char>(request.params[0].get_str())};
+    if (!data) throw JSONRPCError(RPC_INVALID_ADDRESS_OR_KEY, "Pubkey must be a hex string");
+
+    CPubKey pubKey(data.value());
     if (!pubKey.IsFullyValid())
         throw JSONRPCError(RPC_INVALID_ADDRESS_OR_KEY, "Pubkey is not a valid public key");
 
@@ -576,9 +575,8 @@ RPCHelpMan importwallet()
                 }
                 nTimeBegin = std::min(nTimeBegin, nTime);
                 keys.push_back(std::make_tuple(key, nTime, fLabel, strLabel));
-            } else if(IsHex(vstr[0])) {
-                std::vector<unsigned char> vData(ParseHex(vstr[0]));
-                CScript script = CScript(vData.begin(), vData.end());
+            } else if(auto data{TryParseHex<unsigned char>(vstr[0])}) {
+                CScript script = CScript(data.value().begin(), data.value().end());
                 int64_t birth_time = ParseISO8601DateTime(vstr[1]);
                 if (birth_time > 0) nTimeBegin = std::min(nTimeBegin, birth_time);
                 scripts.push_back(std::pair<CScript, int64_t>(script, birth_time));
@@ -959,11 +957,10 @@ static UniValue ProcessImportLegacy(ImportData& import_data, std::map<CKeyID, CP
         }
         script = GetScriptForDestination(dest);
     } else {
-        if (!IsHex(output)) {
-            throw JSONRPCError(RPC_INVALID_ADDRESS_OR_KEY, "Invalid scriptPubKey \"" + output + "\"");
-        }
-        std::vector<unsigned char> vData(ParseHex(output));
-        script = CScript(vData.begin(), vData.end());
+        auto data{TryParseHex<unsigned char>(output)};
+        if (!IsHex(output)) throw JSONRPCError(RPC_INVALID_ADDRESS_OR_KEY, "Invalid scriptPubKey \"" + output + "\"");
+
+        script = CScript(data.value().begin(), data.value().end());
         CTxDestination dest;
         if (!ExtractDestination(script, dest) && !internal) {
             throw JSONRPCError(RPC_INVALID_PARAMETER, "Internal must be set to true for nonstandard scriptPubKey imports.");
@@ -973,26 +970,24 @@ static UniValue ProcessImportLegacy(ImportData& import_data, std::map<CKeyID, CP
 
     // Parse all arguments
     if (strRedeemScript.size()) {
-        if (!IsHex(strRedeemScript)) {
+        auto parsed_redeemscript{TryParseHex<uint8_t>(strRedeemScript)};
+        if (!parsed_redeemscript) {
             throw JSONRPCError(RPC_INVALID_ADDRESS_OR_KEY, "Invalid redeem script \"" + strRedeemScript + "\": must be hex string");
         }
-        auto parsed_redeemscript = ParseHex(strRedeemScript);
-        import_data.redeemscript = std::make_unique<CScript>(parsed_redeemscript.begin(), parsed_redeemscript.end());
+        import_data.redeemscript = std::make_unique<CScript>(parsed_redeemscript.value().begin(), parsed_redeemscript.value().end());
     }
     if (witness_script_hex.size()) {
-        if (!IsHex(witness_script_hex)) {
+        auto parsed_witnessscript{TryParseHex<uint8_t>(witness_script_hex)};
+        if (!parsed_witnessscript) {
             throw JSONRPCError(RPC_INVALID_ADDRESS_OR_KEY, "Invalid witness script \"" + witness_script_hex + "\": must be hex string");
         }
-        auto parsed_witnessscript = ParseHex(witness_script_hex);
-        import_data.witnessscript = std::make_unique<CScript>(parsed_witnessscript.begin(), parsed_witnessscript.end());
+        import_data.witnessscript = std::make_unique<CScript>(parsed_witnessscript.value().begin(), parsed_witnessscript.value().end());
     }
     for (size_t i = 0; i < pubKeys.size(); ++i) {
         const auto& str = pubKeys[i].get_str();
-        if (!IsHex(str)) {
-            throw JSONRPCError(RPC_INVALID_ADDRESS_OR_KEY, "Pubkey \"" + str + "\" must be a hex string");
-        }
-        auto parsed_pubkey = ParseHex(str);
-        CPubKey pubkey(parsed_pubkey);
+        auto parsed_pubkey{TryParseHex<uint8_t>(str)};
+        if (!parsed_pubkey) throw JSONRPCError(RPC_INVALID_ADDRESS_OR_KEY, "Pubkey \"" + str + "\" must be a hex string");
+        CPubKey pubkey(parsed_pubkey.value());
         if (!pubkey.IsFullyValid()) {
             throw JSONRPCError(RPC_INVALID_ADDRESS_OR_KEY, "Pubkey \"" + str + "\" is not a valid public key");
         }
diff --git a/src/wallet/rpc/spend.cpp b/src/wallet/rpc/spend.cpp
index 5f31c1d72..631fa7c1e 100644
--- a/src/wallet/rpc/spend.cpp
+++ b/src/wallet/rpc/spend.cpp
@@ -605,11 +605,9 @@ void FundTransaction(CWallet& wallet, CMutableTransaction& tx, CAmount& fee_out,
         if (solving_data.exists("pubkeys")) {
             for (const UniValue& pk_univ : solving_data["pubkeys"].get_array().getValues()) {
                 const std::string& pk_str = pk_univ.get_str();
-                if (!IsHex(pk_str)) {
-                    throw JSONRPCError(RPC_INVALID_ADDRESS_OR_KEY, strprintf("'%s' is not hex", pk_str));
-                }
-                const std::vector<unsigned char> data(ParseHex(pk_str));
-                const CPubKey pubkey(data.begin(), data.end());
+                auto data(TryParseHex<unsigned char>(pk_str));
+                if (!data) throw JSONRPCError(RPC_INVALID_ADDRESS_OR_KEY, strprintf("'%s' is not hex", pk_str));
+                const CPubKey pubkey(data.value().begin(), data.value().end());
                 if (!pubkey.IsFullyValid()) {
                     throw JSONRPCError(RPC_INVALID_ADDRESS_OR_KEY, strprintf("'%s' is not a valid public key", pk_str));
                 }
@@ -623,11 +621,9 @@ void FundTransaction(CWallet& wallet, CMutableTransaction& tx, CAmount& fee_out,
         if (solving_data.exists("scripts")) {
             for (const UniValue& script_univ : solving_data["scripts"].get_array().getValues()) {
                 const std::string& script_str = script_univ.get_str();
-                if (!IsHex(script_str)) {
-                    throw JSONRPCError(RPC_INVALID_ADDRESS_OR_KEY, strprintf("'%s' is not hex", script_str));
-                }
-                std::vector<unsigned char> script_data(ParseHex(script_str));
-                const CScript script(script_data.begin(), script_data.end());
+                auto script_data{TryParseHex<unsigned char>(script_str)};
+                if (!script_data) throw JSONRPCError(RPC_INVALID_ADDRESS_OR_KEY, strprintf("'%s' is not hex", script_str));
+                const CScript script(script_data.value().begin(), script_data.value().end());
                 coinControl.m_external_provider.scripts.emplace(CScriptID(script), script);
             }
         }
</details>

@maflcko maflcko force-pushed the 2205-reject-non-hex- branch from fad0c89 to faab273 Compare February 27, 2023 12:40
@maflcko
Copy link
Member Author

maflcko commented Feb 27, 2023

force pushed to change tests. can be re-reviewed with range-diff

Copy link
Contributor

@stickies-v stickies-v left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re-ACK faab273

@fanquake fanquake merged commit a2877f7 into bitcoin:master Feb 27, 2023
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Feb 27, 2023
faab273 util: Return empty vector on invalid hex encoding (MarcoFalke)
fa3549a test: Add hex parse unit tests (MarcoFalke)

Pull request description:

  Seems a bit confusing to happily accept random bytes and pretend they are hex encoded strings.

ACKs for top commit:
  stickies-v:
    re-ACK faab273

Tree-SHA512: a808135f744f50aece03d4bf5a71481c7bdca1fcdd0d5b113abdb0c8b382bf81cafee6d17c239041fb49b59f4e19970f24a475378e7f711c3a47d6438de2bdab
@maflcko maflcko deleted the 2205-reject-non-hex-🌲 branch February 27, 2023 15:44
@hebasto
Copy link
Member

hebasto commented Mar 6, 2023

This PR causes cross compiling errors for some hosts using default toolchains on Ubuntu 22.04.

For example, for riscv64-linux-gnu

$ make -C src bitcoind
...
  CXXLD    bitcoind
/usr/lib/gcc-cross/riscv64-linux-gnu/11/../../../../riscv64-linux-gnu/bin/ld: libbitcoin_node.a(libbitcoin_node_a-net_processing.o): in function `.L11743':
net_processing.cpp:(.text+0x1cc82): undefined reference to `std::optional<std::vector<unsigned char, std::allocator<unsigned char> > > TryParseHex<unsigned char>(std::basic_string_view<char, std::char_traits<char> >)'
/usr/lib/gcc-cross/riscv64-linux-gnu/11/../../../../riscv64-linux-gnu/bin/ld: libbitcoin_node.a(libbitcoin_node_a-rest.o): in function `.L2528':
rest.cpp:(.text+0x4fe4): undefined reference to `std::optional<std::vector<unsigned char, std::allocator<unsigned char> > > TryParseHex<unsigned char>(std::basic_string_view<char, std::char_traits<char> >)'
/usr/lib/gcc-cross/riscv64-linux-gnu/11/../../../../riscv64-linux-gnu/bin/ld: libbitcoin_node.a(libbitcoin_node_a-torcontrol.o): in function `.L0 ':
torcontrol.cpp:(.text._Z8ParseHexIhESt6vectorIT_SaIS1_EESt17basic_string_viewIcSt11char_traitsIcEE[_Z8ParseHexIhESt6vectorIT_SaIS1_EESt17basic_string_viewIcSt11char_traitsIcEE]+0x1a): undefined reference to `std::optional<std::vector<unsigned char, std::allocator<unsigned char> > > TryParseHex<unsigned char>(std::basic_string_view<char, std::char_traits<char> >)'
/usr/lib/gcc-cross/riscv64-linux-gnu/11/../../../../riscv64-linux-gnu/bin/ld: libbitcoin_common.a(libbitcoin_common_a-external_signer.o): in function `.L3933':
external_signer.cpp:(.text+0x5378): undefined reference to `std::optional<std::vector<unsigned char, std::allocator<unsigned char> > > TryParseHex<unsigned char>(std::basic_string_view<char, std::char_traits<char> >)'
collect2: error: ld returned 1 exit status
make: *** [Makefile:7178: bitcoind] Error 1
make: Leaving directory '/home/hebasto/git/bitcoin/src'

OTOH, Guix build for riscv64-linux-gnu is successful.

A compiler bug about std::byte type implementation?

@maflcko
Copy link
Member Author

maflcko commented Mar 6, 2023

Yeah, the same happens on Lunar (gcc 12.2). Steps to reproduce on a fresh install:

export HOST=powerpc64le-linux-gnu && export MAKEJOBS="$(nproc)" && apt update && apt install git vim htop  -y && git clone https://github.com/bitcoin/bitcoin.git  --depth=1 ./bitcoin-core && cd bitcoin-core && apt install -y bzip2 make automake cmake curl g++-multilib libtool binutils bsdmainutils pkg-config python3 patch bison  && apt install -y     g++-arm-linux-gnueabihf binutils-arm-linux-gnueabihf   g++-aarch64-linux-gnu binutils-aarch64-linux-gnu    g++-powerpc64-linux-gnu binutils-powerpc64-linux-gnu g++-powerpc64le-linux-gnu binutils-powerpc64le-linux-gnu    g++-riscv64-linux-gnu binutils-riscv64-linux-gnu    g++-s390x-linux-gnu binutils-s390x-linux-gnu    && ( cd depends && make NO_QT=1 "-j${MAKEJOBS}" ) && ./autogen.sh && CONFIG_SITE=$PWD/depends/$HOST/share/config.site ./configure && make "-j${MAKEJOBS}" src/bitcoind

@maflcko
Copy link
Member Author

maflcko commented Mar 6, 2023

It works on gcc-10 (debian bullseye), same steps to reproduce as above.

@gruve-p
Copy link
Contributor

gruve-p commented Mar 6, 2023

Same issue on Ubuntu Jammy (22.04) when cross compiling with depends with gcc 11 on these hosts:
i686-pc-linux-gnu
arm-linux-gnueabihf
powerpc64-linux-gnu
powerpc64le-linux-gnu
riscv64-linux-gnu
s390x-linux-gnu

@john-moffett
Copy link
Contributor

I guess it's having issues with the template for the std::optional<std::vector<uint8_t>> instantiation, since it's not available to the linker?

Adding explicit instantiations should fix it, I think:

diff --git a/src/util/strencodings.cpp b/src/util/strencodings.cpp
index 03459dc..d8b6318 100644
--- a/src/util/strencodings.cpp
+++ b/src/util/strencodings.cpp
@@ -97,6 +97,8 @@ std::optional<std::vector<Byte>> TryParseHex(std::string_view str)
 }
 template std::vector<std::byte> ParseHex(std::string_view);
 template std::vector<uint8_t> ParseHex(std::string_view);
+template std::optional<std::vector<std::byte>> TryParseHex(std::string_view);
+template std::optional<std::vector<uint8_t>> TryParseHex(std::string_view);
 
 bool SplitHostPort(std::string_view in, uint16_t& portOut, std::string& hostOut)
 {

Alternatively, I suppose you could define the template in the .h file:

diff --git a/src/util/strencodings.cpp b/src/util/strencodings.cpp
index 03459dc..4fdaafd 100644
--- a/src/util/strencodings.cpp
+++ b/src/util/strencodings.cpp
@@ -80,21 +79,0 @@ bool IsHexNumber(std::string_view str)
-template <typename Byte>
-std::optional<std::vector<Byte>> TryParseHex(std::string_view str)
-{
-    std::vector<Byte> vch;
-    auto it = str.begin();
-    while (it != str.end()) {
-        if (IsSpace(*it)) {
-            ++it;
-            continue;
-        }
-        auto c1 = HexDigit(*(it++));
-        if (it == str.end()) return std::nullopt;
-        auto c2 = HexDigit(*(it++));
-        if (c1 < 0 || c2 < 0) return std::nullopt;
-        vch.push_back(Byte(c1 << 4) | Byte(c2));
-    }
-    return vch;
-}
-template std::vector<std::byte> ParseHex(std::string_view);
-template std::vector<uint8_t> ParseHex(std::string_view);
-
diff --git a/src/util/strencodings.h b/src/util/strencodings.h
index 05e7b95..4efd25d 100644
--- a/src/util/strencodings.h
+++ b/src/util/strencodings.h
@@ -68,0 +69 @@ std::vector<Byte> ParseHex(std::string_view hex_str)
+
@@ -82,0 +84 @@ std::optional<std::vector<unsigned char>> DecodeBase32(std::string_view str);
+
@@ -171,0 +174,19 @@ constexpr inline bool IsSpace(char c) noexcept {
+template <typename Byte>
+std::optional<std::vector<Byte>> TryParseHex(std::string_view str)
+{
+    std::vector<Byte> vch;
+    auto it = str.begin();
+    while (it != str.end()) {
+        if (IsSpace(*it)) {
+            ++it;
+            continue;
+        }
+        auto c1 = HexDigit(*(it++));
+        if (it == str.end()) return std::nullopt;
+        auto c2 = HexDigit(*(it++));
+        if (c1 < 0 || c2 < 0) return std::nullopt;
+        vch.push_back(Byte(c1 << 4) | Byte(c2));
+    }
+    return vch;
+}
+

I'm not sure why it's working on most builds. Maybe the existing ParseHex explicit instantiations force explicit instantiations for TryParseHex but only on some compilers?

@maflcko
Copy link
Member Author

maflcko commented Mar 7, 2023

I guess it's having issues with the template for the std::optional<std::vector<uint8_t>> instantiation, since it's not available to the linker?

It might be a compiler bug where the compiler skips the explicit instantiation of a template that has a function body (and thus is already required to be implicitly instantiated)?

Adding explicit instantiations should fix it, I think:

Thanks. Done something like that in ##27218

Alternatively, I suppose you could define the template in the .h file:

Seems fine, but maybe the code should be modified further with compile time assertions on Byte to catch programming mistakes, because moving it to the header exposed the Byte choice to the developer? We could also think about making a full compile-time version (#25227 (comment)). Can be done in a follow-up, I guess 🤷‍♂️

fanquake added a commit to bitcoin-core/gui that referenced this pull request Mar 7, 2023
…piler bug

fa8481b util: Work around ParseHex gcc cross compiler bug (MarcoFalke)

Pull request description:

  I fail to see how an explicit `ParseHex` template instantiation fails to also instantiate `TryParseHex`.

  Nonetheless, to work around a compiler bug, change the explicit instantiation from `ParseHex` to `TryParseHex`. (`ParseHex` is inline anyway and will be instantiated by the compiler either way).

  Fixes bitcoin/bitcoin#25227 (comment) :

  ```
    CXXLD    bitcoind
  /usr/lib/gcc-cross/powerpc64le-linux-gnu/11/../../../../powerpc64le-linux-gnu/bin/ld: libbitcoin_node.a(libbitcoin_node_a-net_processing.o): in function `(anonymous namespace)::PeerManagerImpl::ProcessMessage(CNode&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, CDataStream&, std::chrono::duration<long, std::ratio<1l, 1000000l> >, std::atomic<bool> const&)':
  net_processing.cpp:(.text+0x29660): undefined reference to `std::optional<std::vector<unsigned char, std::allocator<unsigned char> > > TryParseHex<unsigned char>(std::basic_string_view<char, std::char_traits<char> >)'
  /usr/lib/gcc-cross/powerpc64le-linux-gnu/11/../../../../powerpc64le-linux-gnu/bin/ld: libbitcoin_node.a(libbitcoin_node_a-rest.o): in function `rest_getutxos(std::any const&, HTTPRequest*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)':
  rest.cpp:(.text+0x83b4): undefined reference to `std::optional<std::vector<unsigned char, std::allocator<unsigned char> > > TryParseHex<unsigned char>(std::basic_string_view<char, std::char_traits<char> >)'
  /usr/lib/gcc-cross/powerpc64le-linux-gnu/11/../../../../powerpc64le-linux-gnu/bin/ld: libbitcoin_node.a(libbitcoin_node_a-torcontrol.o): in function `std::vector<unsigned char, std::allocator<unsigned char> > ParseHex<unsigned char>(std::basic_string_view<char, std::char_traits<char> >)':
  torcontrol.cpp:(.text._Z8ParseHexIhESt6vectorIT_SaIS1_EESt17basic_string_viewIcSt11char_traitsIcEE[_Z8ParseHexIhESt6vectorIT_SaIS1_EESt17basic_string_viewIcSt11char_traitsIcEE]+0x2c): undefined reference to `std::optional<std::vector<unsigned char, std::allocator<unsigned char> > > TryParseHex<unsigned char>(std::basic_string_view<char, std::char_traits<char> >)'
  /usr/lib/gcc-cross/powerpc64le-linux-gnu/11/../../../../powerpc64le-linux-gnu/bin/ld: libbitcoin_common.a(libbitcoin_common_a-external_signer.o): in function `ExternalSigner::SignTransaction(PartiallySignedTransaction&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&)':
  external_signer.cpp:(.text+0x8d84): undefined reference to `std::optional<std::vector<unsigned char, std::allocator<unsigned char> > > TryParseHex<unsigned char>(std::basic_string_view<char, std::char_traits<char> >)'
  collect2: error: ld returned 1 exit status

ACKs for top commit:
  gruve-p:
    ACK bitcoin/bitcoin@fa8481b
  hebasto:
    ACK fa8481b, tested on Ubuntu 22.04, gcc 11.3 for the `riscv64-linux-gnu` host.,

Tree-SHA512: 53efa424e7e18d85a2c9ac2267b9370ae3594d9be73da5135a3a79bf07ab50fcc5357cbde09dc0b2a9eb78d78ec37beae0c9f876333b568e678b9d0067bc9e4e
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Mar 7, 2023
fa8481b util: Work around ParseHex gcc cross compiler bug (MarcoFalke)

Pull request description:

  I fail to see how an explicit `ParseHex` template instantiation fails to also instantiate `TryParseHex`.

  Nonetheless, to work around a compiler bug, change the explicit instantiation from `ParseHex` to `TryParseHex`. (`ParseHex` is inline anyway and will be instantiated by the compiler either way).

  Fixes bitcoin#25227 (comment) :

  ```
    CXXLD    bitcoind
  /usr/lib/gcc-cross/powerpc64le-linux-gnu/11/../../../../powerpc64le-linux-gnu/bin/ld: libbitcoin_node.a(libbitcoin_node_a-net_processing.o): in function `(anonymous namespace)::PeerManagerImpl::ProcessMessage(CNode&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, CDataStream&, std::chrono::duration<long, std::ratio<1l, 1000000l> >, std::atomic<bool> const&)':
  net_processing.cpp:(.text+0x29660): undefined reference to `std::optional<std::vector<unsigned char, std::allocator<unsigned char> > > TryParseHex<unsigned char>(std::basic_string_view<char, std::char_traits<char> >)'
  /usr/lib/gcc-cross/powerpc64le-linux-gnu/11/../../../../powerpc64le-linux-gnu/bin/ld: libbitcoin_node.a(libbitcoin_node_a-rest.o): in function `rest_getutxos(std::any const&, HTTPRequest*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)':
  rest.cpp:(.text+0x83b4): undefined reference to `std::optional<std::vector<unsigned char, std::allocator<unsigned char> > > TryParseHex<unsigned char>(std::basic_string_view<char, std::char_traits<char> >)'
  /usr/lib/gcc-cross/powerpc64le-linux-gnu/11/../../../../powerpc64le-linux-gnu/bin/ld: libbitcoin_node.a(libbitcoin_node_a-torcontrol.o): in function `std::vector<unsigned char, std::allocator<unsigned char> > ParseHex<unsigned char>(std::basic_string_view<char, std::char_traits<char> >)':
  torcontrol.cpp:(.text._Z8ParseHexIhESt6vectorIT_SaIS1_EESt17basic_string_viewIcSt11char_traitsIcEE[_Z8ParseHexIhESt6vectorIT_SaIS1_EESt17basic_string_viewIcSt11char_traitsIcEE]+0x2c): undefined reference to `std::optional<std::vector<unsigned char, std::allocator<unsigned char> > > TryParseHex<unsigned char>(std::basic_string_view<char, std::char_traits<char> >)'
  /usr/lib/gcc-cross/powerpc64le-linux-gnu/11/../../../../powerpc64le-linux-gnu/bin/ld: libbitcoin_common.a(libbitcoin_common_a-external_signer.o): in function `ExternalSigner::SignTransaction(PartiallySignedTransaction&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&)':
  external_signer.cpp:(.text+0x8d84): undefined reference to `std::optional<std::vector<unsigned char, std::allocator<unsigned char> > > TryParseHex<unsigned char>(std::basic_string_view<char, std::char_traits<char> >)'
  collect2: error: ld returned 1 exit status

ACKs for top commit:
  gruve-p:
    ACK bitcoin@fa8481b
  hebasto:
    ACK fa8481b, tested on Ubuntu 22.04, gcc 11.3 for the `riscv64-linux-gnu` host.,

Tree-SHA512: 53efa424e7e18d85a2c9ac2267b9370ae3594d9be73da5135a3a79bf07ab50fcc5357cbde09dc0b2a9eb78d78ec37beae0c9f876333b568e678b9d0067bc9e4e
stickies-v added a commit to stickies-v/bitcoin that referenced this pull request Mar 15, 2023
TryParseHex() was introduced in bitcoin#25227 and returns a std::nullopt if
the string is an invalid hex representation. This allows us to simplify
code that would first check if a string is valid hex with IsHex(),
and then iterate over the string again with ParseHex().

There is slight behaviour difference between these approaches, in that
IsHex() returns false for an empty string, whereas TryParseHex()
does not fail. For the updated instances, this does not make a difference.
@bitcoin bitcoin locked and limited conversation to collaborators Mar 6, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.