I often need to process the whole blockchain (or a large part of it) using an external script/program, for which I need blocks with prevout information included. However, currently the only way to get that is `getblock <hash> 3`, which includes a lot of potentially unnecessary data and is quite slow, mainly (based on my experiments) because of `UniValue` overhead and descriptor inference.
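For context, the kind of client-side processing this is about can be sketched as follows. This is a minimal illustration, not an existing tool: `iter_prevouts` is a hypothetical helper name, and `sample_block` is a hand-written stand-in for a decoded `getblock <hash> 3` response, assuming the usual layout in which each non-coinbase input carries a `prevout` object with `generated`, `height`, `value` and `scriptPubKey`.

```python
from typing import Any, Dict, Iterator, Tuple


def iter_prevouts(block: Dict[str, Any]) -> Iterator[Tuple[str, int, Dict[str, Any]]]:
    """Yield (txid, input_index, prevout) for every input that has a prevout."""
    for tx in block.get("tx", []):
        for index, txin in enumerate(tx.get("vin", [])):
            prevout = txin.get("prevout")
            if prevout is None:
                continue  # e.g. coinbase inputs, which spend nothing
            yield tx["txid"], index, prevout


# Hand-written stand-in for a decoded verbosity=3 response (not real chain data).
sample_block = {
    "tx": [
        {"txid": "aa" * 32, "vin": [{"coinbase": "03a0d10c"}]},
        {
            "txid": "bb" * 32,
            "vin": [
                {
                    "txid": "cc" * 32,
                    "vout": 0,
                    "prevout": {
                        "generated": False,
                        "height": 839990,
                        "value": 0.5,
                        "scriptPubKey": {"hex": "0014" + "11" * 20},
                    },
                }
            ],
        },
    ]
}

for txid, index, prevout in iter_prevouts(sample_block):
    print(txid[:8], index, prevout["value"], prevout["scriptPubKey"]["hex"])
```

Only the `value` and the raw `scriptPubKey` hex are used here; everything else in the verbosity 3 response is serialized by the node and then discarded by a consumer like this.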
I benchmarked current master, retrieving 1000 blocks sequentially starting at block 840000, with different verbosity
parameters:
benchmark | result |
---|---|
getblock (verbosity=0) | 16.189s ± 1.165s |
getblock (verbosity=1) | 31.975s ± 1.014s |
getblock (verbosity=2) | 352.487s ± 1.636s |
getblock (verbosity=3) | 473.375s ± 2.280s |
As you can see, verbosity=3 is roughly 29 times slower than verbosity=0 (473.375s vs. 16.189s). It seems obvious that a faster way of getting blocks with prevout information is feasible.
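A measurement of this kind could be reproduced with something like the sketch below. It is not the script used for the numbers above. It assumes a hypothetical `rpc(method, *params)` callable that wraps a JSON-RPC connection to the node and returns the decoded result, and it resolves block hashes via `getblockhash` before starting the timer so that only the `getblock` calls are timed. The names `time_getblock` and `benchmark` are illustrative.

```python
import statistics
import time
from typing import Any, Callable, Tuple

# Hypothetical client signature: rpc("method", *params) -> decoded JSON result.
RpcCall = Callable[..., Any]


def time_getblock(rpc: RpcCall, start_height: int, count: int, verbosity: int) -> float:
    """Return the seconds spent fetching `count` consecutive blocks with getblock."""
    # Look up the hashes first so the timed region contains only getblock calls.
    hashes = [rpc("getblockhash", h) for h in range(start_height, start_height + count)]
    started = time.perf_counter()
    for block_hash in hashes:
        rpc("getblock", block_hash, verbosity)
    return time.perf_counter() - started


def benchmark(rpc: RpcCall, start_height: int, count: int, verbosity: int,
              runs: int = 3) -> Tuple[float, float]:
    """Return (mean, standard deviation) of the elapsed time over `runs` repetitions."""
    samples = [time_getblock(rpc, start_height, count, verbosity) for _ in range(runs)]
    spread = statistics.stdev(samples) if len(samples) > 1 else 0.0
    return statistics.mean(samples), spread
```

With a real client in place of `rpc`, a call such as `benchmark(rpc, 840000, 1000, 3)` would correspond to the last row of the table. Repeated runs will be affected by the OS page cache, so absolute numbers will differ between machines.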
Potential solutions that come to mind:
- Creating a new RPC call for undo data, say `getblockundo`. This would be perfect for my needs, but it would require making the undo data serialization format non-internal (not sure if this would be a problem, as IIRC it hasn't changed in many years).
- Creating a new verbosity level for `getblock` that would only provide the minimum amount of data necessary (i.e. no addresses, descriptors, ASM scripts, TXIDs/WTXIDs etc.) while still providing prevouts. This would be better than nothing but would still leave a lot of performance on the table because of `UniValue` overhead.