Affects the getblock RPC call, with the goal of adding the ability to call getblock with either a block hash or a block height whilst not impacting compatibility with existing services that rely on the getblock call.
Motivation
Decrease the number of RPC calls needed to get valid data from getblock when the user only has a block height available, and reduce complexity, bandwidth, and overhead. Note: getblock returns the hash of the block when parsed.
Overview
It is trivial to implement a check on whether the input univalue string to getblock is exactly 64 characters long, because all valid block header hashes have a length of 64. If the input is not of length 64, we attempt to interpret the string as a block height, pass it to ParseHashOrHeight, and then retrieve the phashBlock (hash) value from the returned block at that height. This hash value can be dropped into the existing blockhash variable, leaving the rest of the call unmodified.
If the length of the input is 64, the univalue is treated as a block hash and passed through ParseHashOrHeight to get the block header hash value.
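The length check described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual diff: the names InputKind, kHashHexLength, and ClassifyGetBlockInput are hypothetical and do not exist in Bitcoin Core.

```cpp
#include <cstddef>
#include <string>

// Hypothetical names for illustration; not taken from the PR's diff.
enum class InputKind { BlockHash, BlockHeight };

// A block header hash is always written as 64 hex characters, so the
// length alone decides which path the input takes.
constexpr std::size_t kHashHexLength = 64;

InputKind ClassifyGetBlockInput(const std::string& input)
{
    // Exactly 64 characters: treat as a hash. Anything else: try as a height.
    return input.size() == kHashHexLength ? InputKind::BlockHash
                                          : InputKind::BlockHeight;
}
```

Under this sketch, an input such as "762535" takes the height path, while any 64-character string takes the hash path and is validated as a hash afterwards.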
Contentious Points
Ambiguity
A user can now pass either a hash or a block height as the required input argument param[0]. Such inputs may be seen as too flexible or ambiguous.
My reasoning is that a valid getblock hash input will always be a univalue string of length 64, so there likely won't be a collision between base-10 block heights and base-16 hash inputs for 1.9*10^58 years, assuming a block time of ~10 minutes. Well before block heights could collide with hash values, all of the data types we use for block height storage and parsing would be refactored to support values that would overflow the int32_t type. As such, a block height could never reasonably be interpreted as a hash, by software or by people.
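The 1.9*10^58 figure can be reproduced with a short calculation: the smallest decimal height that is 64 digits long is 10^63, and at ten minutes per block that many blocks take about 10^64 minutes. The helper below is a hypothetical sketch of that arithmetic, assuming a 365-day year.

```cpp
#include <cmath>

// Years until a decimal block height needs `digits` digits, assuming one
// block every `minutes_per_block` minutes. Hypothetical helper for the
// estimate only; both parameters are illustrative.
double YearsUntilHeightHasDigits(int digits, double minutes_per_block)
{
    const double first_height = std::pow(10.0, digits - 1); // smallest height with that many digits
    const double minutes_per_year = 60.0 * 24.0 * 365.0;    // 525600, assuming a 365-day year
    return first_height * minutes_per_block / minutes_per_year;
}
```

Calling YearsUntilHeightHasDigits(64, 10.0) gives roughly 1.9 * 10^58. For comparison, the largest int32_t value, 2147483647, is only 10 digits long.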
The next note about ambiguity of inputs is that a hash, no matter its actual numerical value, must be a string of length 64. For example, a hash value of 0 would still need to be input as "000...000".
Implementation Details
Overall the implementation is pretty simple. If the input is anything other than a univalue string of length 64, attempt to parse it as a block height using ParseHashOrHeight. Since the height_or_hash input parameter holds a numeric value, we use the ParseInt32 function with the input being the .get_str() of param[0]. This function works incredibly well because by default it doesn't accept negative values, alphabetic characters, symbol characters, or strings that overflow an int32_t type. If the conversion to an int32_t is successful, the value is then safe to pass into ParseHashOrHeight.
If the user supplies a height higher than the tip, they will get error code -8: Target block height _m_ after current tip _n_. If the user inputs an invalid string for the height, they will receive error code -5.
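The height path and its two error cases might look roughly like this. It is a sketch under stated assumptions: ParseHeightStrict is a hand-written stand-in that mirrors the behaviour described above, not Bitcoin Core's ParseInt32, and ResolveHeight and the constant names are hypothetical. Only the error codes -5 and -8 come from the text.

```cpp
#include <cstdint>
#include <limits>
#include <string>

// Error codes quoted in the text above; the constant names are hypothetical.
constexpr int kErrInvalidHeightString = -5;
constexpr int kErrHeightAfterTip = -8;

// Strict base-10 parse: digits only, and the value must fit in int32_t.
// Stand-in for the behaviour described for ParseInt32, not its real code.
bool ParseHeightStrict(const std::string& s, int32_t& out)
{
    if (s.empty()) return false;
    int64_t value = 0;
    for (const char c : s) {
        if (c < '0' || c > '9') return false; // rejects '-', '+', letters, symbols
        value = value * 10 + (c - '0');
        if (value > std::numeric_limits<int32_t>::max()) return false; // overflow
    }
    out = static_cast<int32_t>(value);
    return true;
}

// Returns 0 and writes the height on success, otherwise an error code.
int ResolveHeight(const std::string& input, int32_t tip, int32_t& height_out)
{
    int32_t parsed = 0;
    if (!ParseHeightStrict(input, parsed)) return kErrInvalidHeightString;
    if (parsed > tip) return kErrHeightAfterTip;
    height_out = parsed;
    return 0;
}
```

With a tip of 762535, the input "762536" yields -8, while "-1", "12ab", and "2147483648" all yield -5 in this sketch.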
The blockhash parameter name was left unchanged for backwards compatibility.
Backward Compatibility
This should be fully backwards compatible with all existing applications that use getblock and supply even remotely valid inputs. It is just a drop-in check to see whether getblock should attempt to convert an input to a block height.
Performance
Summary: +9.4% performance gain when returning details of all blocks from 0 to the tip.
To test the performance impact of reducing the number of RPC calls needed to get block details from 2 to 1, I wrote a NodeJS script that walks all blocks from 0 to the tip through the RPC, using getblock to determine the size of each block on disk. At the end it returns the details of the largest block by size on the timechain.
The control test was bitcoin core .23, where the details for each block were fetched using 2 calls: first getblockhash(height) was called to get the hash, then getblock(hash) was called to get the details. This process was applied to every block from 0 to the tip, which was 762535 at the time. This test resulted in a runtime of 117m 11s. The result was that the largest block is at height 748918, with a size of 2765062 bytes.
Using this PR, the same script was modified to call getblock(height), thus skipping the need to get the hash first. When run on the same hardware, the execution time was 106m 32s, a performance uplift of 9.4%.
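The difference between the two scan strategies can be illustrated with a mock that only counts RPC round trips. This is not the NodeJS script used for the measurements above. MockNode and both scan functions are hypothetical, and the mock returns placeholder strings rather than real block data.

```cpp
#include <string>
#include <vector>

// Mock node for illustration only: it counts RPC round trips instead of
// talking to a real node. All names here are hypothetical.
struct MockNode {
    int calls = 0;
    std::vector<std::string> hashes; // index = block height, each 64 chars long

    std::string getblockhash(int height)
    {
        ++calls;
        return hashes.at(height);
    }

    // Accepts a 64-character hash or a decimal height, as proposed in this PR.
    std::string getblock(const std::string& hash_or_height)
    {
        ++calls;
        if (hash_or_height.size() == 64) return "block:" + hash_or_height;
        return "block:" + hashes.at(std::stoi(hash_or_height));
    }
};

// Control flow: getblockhash(height) followed by getblock(hash).
int ScanWithTwoCalls(MockNode& node)
{
    for (int h = 0; h < static_cast<int>(node.hashes.size()); ++h) {
        node.getblock(node.getblockhash(h));
    }
    return node.calls;
}

// Flow with this PR: getblock(height) directly.
int ScanWithOneCall(MockNode& node)
{
    for (int h = 0; h < static_cast<int>(node.hashes.size()); ++h) {
        node.getblock(std::to_string(h));
    }
    return node.calls;
}
```

For a chain of n blocks the first scan issues 2n calls and the second issues n, which is the saving the timings above are measuring.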
As a note, the gains are inversely proportional to the ratio between the size of the results of getblock and getblockhash. The smaller the ratio, the bigger the relative performance uplift.
This is seen when iterating through the blocks. For many of the early empty and small blocks the uplift was close to 2x, but as the size and number of details returned by getblock increased, the advantage of saving around 100 bytes at the RPC became irrelevant.