rpc: getblock: implement with block height as input parameter. #26469

portlandhodl · 2022-11-08T10:00:08Z

Affects the getblock rpc call with the goal of adding the ability to call getblock with a hash or block height whist not impacting compatibility with existing services that rely on the getblock call.

Motivation

Decrease the number of RPC calls needed to get valid data from getblock where the user has only a block height available. As well as reduce complexity, bandwidth, and overhead. Note: getblock returns the hash of the block when parsed.

Overview

It is trivial to implement a check where the input univalue string (string) to getblock is greater than or less than a length of 64. This is because all valid block header hashes are a length of 64. This means if the input is not length 64 we should attempt to interpret the string as a block height input and pass it to ParseHashOrHeight and then retreive the phashBlock (hash) value from the returned block at the input height. This hash value can be dropped into the existing blockhash variable leaving the rest of the call unmodified.

If the length of the input is 64 the univalue is treated as a blockhash and passed though ParseHashOrHeight to get the block header hash value.

Contentious Points

Ambiguity

The fact that a user can now parse a hash or a block height as the required input argument to param[0]. There may be the issue of inputs being seen as too flexible or ambiguous.

My thoughts behind this are because a valid getblock hash input will always be a univalue string with length of 64 there likely wont be a base10 block height collision with the base16 hash inputs for 1.9*10^58 years assuming a block time of ~10 mins. Well before the collision of the block height with hash values all of datatypes we for block height storage and parsing would be refactored to support values that would overflow the int32_t type. As such a block height could never reasonably be interpreted as a hash. By software or people

The next note about ambiguity of inputs is that a hash no matter the actual numerical value when read must be a string of length 64. Example a hash value of 0 would still need to be input as "000...000"

Implementation Details

Overall the implementation is pretty simple. If input string is anything other than a univalue str of length 64 attempt to parse as a block height using ParseHashOrHeight. Since the maximum number height_or_hash input parameter value is a uint64t. We use the ParseInt32 function with the input being the .get_str of param[0]. This function works incredibly well because it by default doesn't accept negative values, alpha chars. , strings that overflow a int32_t type, and symbol chars. If the conversion to a unit64t is successful it is then safe to pass into ParseHashOrHeight.

If the user places a value higher than the tip they will get an error code -8: Target block height _m_ after current tip _n_. If the user inputs an invalid string for height then they will receive and error code -5

the blockhash parameter name was left unchanged for backwards compatibility.

Backward Compatibility

Should be fully backwards compatible with all existing applications that use getblock and are supplying even remotely valid inputs. This is just a drop in check to see if getblock should make an attempt to convert an input to a block height.

Performance

Summary +9.4% performance gain when returning details of all blocks from 0 to the tip.

To test the performance impact of reducing the number of RPC to get block details from 2 calls to 1 I wrote a NodeJS script that parses all blocks through the RPC from 0 to the tip with getblock to determine the size of the block on disk and at the end returns the details of that largest block by size on the timechain.

The control test was bitcoin core .23 where the details for each block were fetched using 2 calls, first getblockhash(height) was called to get the hash. Then getblock(hash) was called to get the details. This process was applied to every block from 0 to the tip at the time of 762535. This test resulted in a runtime of 117m 11s. The result was the largest block size is 748918 with a size of 2765062 bytes.

Using this PR the same script was modified to call the getblock(height) and thus skipping the need to get the hash first. When run on the same hardware the execution time was 106m 32s. A performance uplift of 9.4%

As a note the biggest gains are inversely proportional to the ratio between the size of the results of getblock and getblock hash. The smaller the ratio the bigger the relative performance uplift.

This is seen when iterating though the blocks many of the early empty and small blocks the uplift was close to 2x, but as the size and number of details returned by getblock increased the advantage of saving around 100 bytes at the RPC became irrelevant.

aureleoules · 2022-11-08T10:40:21Z

It would make more sense to use the following code as it is uniform with the codebase but it breaks compatibility because passing a hash would require wrapping it between quotes.

#19949 fixes this and should probably be merged first?

diff --git a/src/rpc/blockchain.cpp b/src/rpc/blockchain.cpp
index 082e1a76a..cb171f27b 100644
--- a/src/rpc/blockchain.cpp
+++ b/src/rpc/blockchain.cpp
@@ -642,7 +642,7 @@ static RPCHelpMan getblock()
                 "If verbosity is 2, returns an Object with information about block <hash> and information about each transaction.\n"
                 "If verbosity is 3, returns an Object with information about block <hash> and information about each transaction, including prevout information for inputs (only for unpruned blocks in the current best chain).\n",
                 {
-                    {"blockhash|blockheight", RPCArg::Type::STR_HEX, RPCArg::Optional::NO, "The block hash or height", RPCArgOptions{.type_str={"", "string or numeric"}}},
+                    {"hash_or_height", RPCArg::Type::NUM, RPCArg::Optional::NO, "The block hash or height", RPCArgOptions{.type_str={"", "string or numeric"}}},
                     {"verbosity|verbose", RPCArg::Type::NUM, RPCArg::Default{1}, "0 for hex-encoded data, 1 for a JSON object, 2 for JSON object with transaction data, and 3 for JSON object with transaction data including prevout information for inputs"},
                 },
                 {
@@ -706,18 +706,6 @@ static RPCHelpMan getblock()
 {
     uint256 hash;
     ChainstateManager& chainman = EnsureAnyChainman(request.context);
-    if(request.params[0].get_str().length() == 64){
-        hash = *ParseHashOrHeight(request.params[0], chainman)->phashBlock;
-    }
-    else{
-        uint64_t blockHeight;
-        if(ParseUInt64(request.params[0].get_str(),&blockHeight)){
-            hash = *ParseHashOrHeight(blockHeight, chainman)->phashBlock;
-        }
-        else{
-            throw JSONRPCError(RPC_INVALID_ADDRESS_OR_KEY, "Block height invalid");
-        }
-    }
 
     int verbosity = 1;
     if (!request.params[1].isNull()) {
@@ -729,17 +717,11 @@ static RPCHelpMan getblock()
     }
 
     CBlock block;
-    const CBlockIndex* pblockindex;
     const CBlockIndex* tip;
+    const auto* pblockindex{ParseHashOrHeight(request.params[0], chainman)};
     {
         LOCK(cs_main);
-        pblockindex = chainman.m_blockman.LookupBlockIndex(hash);
         tip = chainman.ActiveChain().Tip();
-
-        if (!pblockindex) {
-            throw JSONRPCError(RPC_INVALID_ADDRESS_OR_KEY, "Block not found");
-        }
-
         block = GetBlockChecked(chainman.m_blockman, pblockindex);
     }
 
diff --git a/src/rpc/client.cpp b/src/rpc/client.cpp
index 8688263ef..4273063da 100644
--- a/src/rpc/client.cpp
+++ b/src/rpc/client.cpp
@@ -96,6 +96,7 @@ static const CRPCConvertParam vRPCConvertParams[] =
     { "listunspent", 2, "addresses" },
     { "listunspent", 3, "include_unsafe" },
     { "listunspent", 4, "query_options" },
+    { "getblock", 0, "hash_or_height" },
     { "getblock", 1, "verbosity" },
     { "getblock", 1, "verbose" },
     { "getblockheader", 1, "verbose" },

portlandhodl · 2022-11-08T10:56:44Z

It would make more sense to use the following code as it is uniform with the codebase but it breaks compatibility because passing a hash would require wrapping it between quotes.

The below code is more uniform agreed. The point of this was that this a fully backward compatible drop in implementation with a minimal impact on the codebase. There is an advantage to this method is that it becomes possible to throw not only on invalid hash but also independently on invalid height input.

Edit:
Commit chain fixed thanks @sipa

Applied clang formatting to @aureleoules request to make the code more uniform.

DrahtBot · 2022-11-09T16:02:18Z

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type	Reviewers
Concept NACK	luke-jr, RandyMcMillan

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#26039 (refactor: Run type check against RPCArgs (1/2) by MarcoFalke)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

maflcko · 2022-11-11T08:19:22Z

It is possible to run the test locally. This way you don't have to push and wait for CI.

fjahr · 2022-11-12T21:14:43Z

#19949 fixes this and should probably be merged first?

I agree, #19949 doesn't take getblock into account, I don't recall why, probably an oversight. This PR here could be a follow-up building on top of it implementing the needed further changes for getblock.

luke-jr · 2022-11-14T22:58:03Z

Concept NACK. See also more reasonable (yet still rejected) proposals #8457, #16439, #14858, #16345, #16317

portlandhodl · 2022-11-15T00:11:23Z

Concept NACK. See also more reasonable (yet still rejected) proposals #8457, #16439, #14858, #16345, #16317

These are my thoughts about those particular listed PRs. I would disagree that they are all 'more reasonable' but none the less here are my thoughts.

Just curious as to why you would say an implementation such as #16317 is more reasonable since it clearely violates multiple suggestions outlined in the developer notes including the use of c_str and and atoi instead of parse ParseInt32. Then you have #14858. This solution introduces a whole new include for regex expressions and again uses stoi() and defeats the purpose of the guidelines calling for a ParseInt(n)

#16345 sure could be considered more reasonable since it adds a whole new RPC call as well as #16439 since it's very different solution and removes the ambiguity of using height as an argument. Also effects multiple RPCs. Both of these PRs actually create a new way defining the argument or create a new RPCs

Lastly #8457. This solution is quite close to the one in this PR. Why it's more reasonable other than it has a throw for invalid height and height out of range.

luke-jr · 2022-11-15T01:44:31Z

More reasonable mainly in the sense of using a Number parameter rather than a String.

But regardless, the main issue is the concept.

RandyMcMillan · 2022-11-15T01:46:23Z

I like @promag's argument here --> #8457 (comment)

    > I don't like overloading the argument, making it ambiguous.

Disagree, there is a place for overloading in APIs. In this case blocks are usually referenced by height or hash.

There is also overloading in the protocol: 4 lock_time uint32_t A time (Unix epoch time) or block number. See the locktime parsing rules.

It seems messy, and we don't do that anywhere else on the API.

From help importaddress: 1. "script" (string, required) The hex-encoded script (or address) either.

E.g. a long hash with no hexadecimal characters could be interpreted as a long number.

Unlikely? Still it's simple to distinguish that block hash from a number.

Though even then I'm not convinced it's a worthwhile addition.

From the API point of view, I think it's fair to fetch a block from it's height. From a technical point of view, making 2 calls causing locks on cs_main etc on a heavy loaded daemon is resource wasting.

So, +1 for getblock hash|height (verbose).

Originally posted by @promag in #8457 (comment)

portlandhodl · 2022-11-15T02:23:31Z

Per the ambiguity argument in #8457

E.g. a long hash with no hexadecimal characters could be interpreted as a long number.

Unlikely? Still it's simple to distinguish that block hash from a number.

This is currently not even possible with the current implementation .23. This is because any attempt to input a block hash that is more or less than 64 characters will be met with an error -8, blockhash must be of length 64

Example getblock "001" doesn't attempt to get the hex padded version of 0x01 instead the user is presented with blockhash must be of length 64 (not 3, for '001')

This is because of the behavior implemented in ParseHashV.

bitcoin/src/rpc/util.cpp

Line 97 in 48174c0

uint256 ParseHashV(const UniValue& v, std::string strName)

The implementation of parsing the hash in getblock that was a part of #8457 was just a call of uint256 hash(uint256S(strHash)) as string input that was less than 64 chars would be parsed into a hash. example in #8457 if the user called getblock "001" it would parse that as `bitcoin-cli getblock "000000000000000000000000000000000000000000000000000000000000001"

Since the implementation of ParseHashV into getblock this is no longer possible because of the check for a length == 64 followed by a throw if that case is not met. As such the ambiguity arguments of a collision might need to be looked at again.

To the concern about the input for a block height not being a number.

More reasonable mainly in the sense of using a Number parameter rather than a String.

So if the block hash could be input as a HEX_STR and a height could be parsed as NUM then the idea could potentially become reasonable?

portlandhodl · 2022-11-18T07:10:14Z

More reasonable mainly in the sense of using a Number parameter rather than a String.

But regardless, the main issue is the concept.

@luke-jr a solution to the problem is just to create a switch that cases over the result of UNIVALUE::VType of request.params[0] my most recent commit supports not only string inputs of hash or height for getblock but an integer input as well. All while maintaining complete backwards compatibility.

So at minimum I hope these changes would class this solution into the same league as the 'more reasonable' ones.

Edit... taking it even further if it was desired any integer input could be parsed as a height where as any string could be processed as a hash.

added support for unitype VSTR and VNUM rpc::getblock(int/string hash or height) complete fix throw typo

DrahtBot · 2023-01-17T10:06:45Z

🐙 This pull request conflicts with the target branch and needs rebase.

RandyMcMillan · 2023-03-18T19:01:23Z

Concept NACK

After further consideration:

This change sets up the condition that getblock has to
evaluate inputs:

getblock <hash> <int verbosity>

or

getblock <int> <int verbosity>

This seems error prone to me!

Using a nested command such as:

 bitcoin-cli getblock $(bitcoin-cli getblockhash 0)
 bitcoin-cli getblock $(bitcoin-cli getblockhash 0) 0
 bitcoin-cli getblock $(bitcoin-cli getblockhash 0) 1
 bitcoin-cli getblock $(bitcoin-cli getblockhash 0) 2

forces the user to handle the data query with intent.

The command is currently simple to use!

The added convenience of over loading this command doesn't out weigh potential problems it could create. ¹ ²

Additionally:

The getblock help output is simple to understand as is.

Arguments:
1. blockhash    (string, required) The block hash
2. verbosity    (numeric, optional, default=1) 0 for hex-encoded data, 1 for a JSON object, 2 for JSON object with transaction data, and 3 for JSON object with transaction data including prevout information for inputs

Consider currently deployed tooling for jq json "<string>" vs <integer> quotation for example. ↩
I think this is a good argument on why any previous or future variation on this should never be implemented. IMO ↩

portlandhodl · 2023-03-18T19:44:07Z

This was meant to be closed some time ago.

Though I don't agree completely in the case of a hash vs height overload since height and hashes do not and can not intersect ever. With that said, there are already other rpc calls that use this same method to get the block with a parseHashOrHeight

All and all, thanks for your input.

portlandhodl force-pushed the merge_getblock_getblock_hash branch from 53c0a1c to cf784cd Compare November 8, 2022 10:14

portlandhodl force-pushed the merge_getblock_getblock_hash branch from 7301850 to f713f70 Compare November 8, 2022 11:08

DrahtBot added the RPC/REST/ZMQ label Nov 8, 2022

portlandhodl closed this Nov 8, 2022

portlandhodl force-pushed the merge_getblock_getblock_hash branch from b80e0bf to 50422b7 Compare November 8, 2022 11:33

portlandhodl deleted the merge_getblock_getblock_hash branch November 8, 2022 11:33

portlandhodl restored the merge_getblock_getblock_hash branch November 8, 2022 11:33

sipa reopened this Nov 8, 2022

portlandhodl force-pushed the merge_getblock_getblock_hash branch from b5dcb87 to abfce6f Compare November 10, 2022 02:34

DrahtBot mentioned this pull request Nov 10, 2022

refactor: Run type check against RPCArgs (1/2) #26039

Merged

portlandhodl force-pushed the merge_getblock_getblock_hash branch 2 times, most recently from 0234e00 to 121b450 Compare November 13, 2022 09:01

portlandhodl force-pushed the merge_getblock_getblock_hash branch 4 times, most recently from d303b8b to 5e8f0e9 Compare November 18, 2022 07:09

portlandhodl force-pushed the merge_getblock_getblock_hash branch from 5e8f0e9 to 83dedd7 Compare November 18, 2022 07:12

rpc::getblock<VNUM/VSTR>(hash or height) complete

a917ae2

added support for unitype VSTR and VNUM rpc::getblock(int/string hash or height) complete fix throw typo

portlandhodl force-pushed the merge_getblock_getblock_hash branch from 83dedd7 to a917ae2 Compare November 18, 2022 07:33

DrahtBot added the Needs rebase label Jan 17, 2023

portlandhodl closed this Mar 18, 2023

bitcoin locked and limited conversation to collaborators Mar 17, 2024

rpc: getblock: implement with block height as input parameter. #26469

rpc: getblock: implement with block height as input parameter. #26469

Uh oh!

Conversation

portlandhodl commented Nov 8, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Motivation

Overview

Contentious Points

Implementation Details

Backward Compatibility

Performance

Uh oh!

aureleoules commented Nov 8, 2022

Uh oh!

portlandhodl commented Nov 8, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

DrahtBot commented Nov 9, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews

Conflicts

Uh oh!

maflcko commented Nov 11, 2022

Uh oh!

fjahr commented Nov 12, 2022

Uh oh!

luke-jr commented Nov 14, 2022

Uh oh!

portlandhodl commented Nov 15, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

luke-jr commented Nov 15, 2022

Uh oh!

RandyMcMillan commented Nov 15, 2022

Uh oh!

portlandhodl commented Nov 15, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

portlandhodl commented Nov 18, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

DrahtBot commented Jan 17, 2023

Uh oh!

RandyMcMillan commented Mar 18, 2023

This seems error prone to me!

The command is currently simple to use!

Footnotes

Uh oh!

portlandhodl commented Mar 18, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

8 participants

portlandhodl commented Nov 8, 2022 •

edited

Loading

portlandhodl commented Nov 8, 2022 •

edited

Loading

DrahtBot commented Nov 9, 2022 •

edited

Loading

portlandhodl commented Nov 15, 2022 •

edited

Loading

portlandhodl commented Nov 15, 2022 •

edited

Loading

portlandhodl commented Nov 18, 2022 •

edited

Loading

portlandhodl commented Mar 18, 2023 •

edited

Loading