Faster way to get block with prevouts in JSON-RPC #30495

issue vostrnad opened this issue on July 21, 2024
  1. vostrnad commented at 10:48 pm on July 21, 2024: none

    I often need to process the whole blockchain (or a large part of it) using an external script/program, for which I need blocks with prevout information included. However, the only current way to get that is getblock <hash> 3, which includes a lot of potentially unnecessary data and is quite slow, mainly (based on my experiments) because of UniValue overhead and descriptor inferring.

    I benchmarked current master, retrieving 1000 blocks sequentially starting at block 840000, with different verbosity parameters:

    benchmark                 result
    getblock (verbosity=0)    16.189s ± 1.165s
    getblock (verbosity=1)    31.975s ± 1.014s
    getblock (verbosity=2)    352.487s ± 1.636s
    getblock (verbosity=3)    473.375s ± 2.280s

    As you can see, verbosity=3 is around 30 times slower than verbosity=0. It seems obvious that a faster way of getting blocks with prevout information is feasible.
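For reference, the slowdown factors implied by the figures above can be worked out with a few lines of Python. This is only a sketch of the arithmetic: the mean timings are copied from the benchmark table, and the helper names `slowdown` and `per_block_ms` are hypothetical.

```python
# Mean wall-clock time in seconds for 1000 blocks, copied from the table above.
TIMINGS = {0: 16.189, 1: 31.975, 2: 352.487, 3: 473.375}


def slowdown(timings, baseline=0):
    """Return each verbosity's time as a multiple of the baseline verbosity."""
    base = timings[baseline]
    return {verbosity: seconds / base for verbosity, seconds in timings.items()}


def per_block_ms(timings, blocks=1000):
    """Return the average time per block in milliseconds."""
    return {verbosity: seconds / blocks * 1000 for verbosity, seconds in timings.items()}


if __name__ == "__main__":
    ratios = slowdown(TIMINGS)
    for verbosity, ms in per_block_ms(TIMINGS).items():
        print(f"verbosity={verbosity}: {ms:.1f} ms/block, {ratios[verbosity]:.1f}x baseline")
```

By this calculation verbosity=3 comes out at roughly 29 times the verbosity=0 time, and verbosity=2 at roughly 22 times.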

    Potential solutions that come to mind:

    • Creating a new RPC call for undo data, say getblockundo. This would be perfect for my needs, but it would require making the undo data serialization format non-internal (not sure if this would be a problem, as IIRC it hasn’t changed in many years).
    • Creating a new verbosity level for getblock that would only provide the minimum amount of data necessary (i.e. no addresses, descriptors, ASM scripts, TXIDs/WTXIDs etc.) while still providing prevouts. This would be better than nothing but would still leave a lot of performance on the table because of UniValue overhead.
  2. maflcko added the label RPC/REST/ZMQ on Jul 22, 2024
  3. maflcko added the label Block storage on Jul 22, 2024
  4. maflcko added the label Feature on Jul 22, 2024
  5. andrewtoth commented at 11:16 pm on July 30, 2024: contributor

    There are a few strategies to speed this up on the client side instead:

    • Fetch blocks concurrently
    • Fetch blocks in parallel
    • Fetch blocks in batch requests
    • A combination of all of the above

    Setting rpcthreads to a higher number than the default 4 will allow you to request more concurrently or in parallel as well.
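A minimal sketch of how the batching and concurrency strategies could be combined on the client side, using only the Python standard library. The helper names (`build_batch`, `fetch_blocks`, `make_http_post`), the batch size, and the worker count are all assumptions chosen for illustration, as are the URL and credentials you would pass in. The transport is a plain callable so the logic can be tried without a running node.

```python
import base64
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def build_batch(hashes, verbosity=3):
    """Build one JSON-RPC batch body: a list of getblock calls, one per hash."""
    return [
        {"jsonrpc": "1.0", "id": i, "method": "getblock", "params": [h, verbosity]}
        for i, h in enumerate(hashes)
    ]


def chunked(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def fetch_blocks(hashes, post, batch_size=10, workers=4):
    """Fetch blocks in batches, several batches in flight at once.

    `post` is any callable that takes the request body (bytes) and returns
    the response body (bytes). Results come back in the order of `hashes`.
    """
    def run(chunk):
        replies = json.loads(post(json.dumps(build_batch(chunk)).encode()))
        # Do not rely on reply order within a batch; match replies by id.
        replies.sort(key=lambda reply: reply["id"])
        for reply in replies:
            if reply.get("error"):
                raise RuntimeError(reply["error"])
        return [reply["result"] for reply in replies]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields chunk results in submission order.
        per_chunk = pool.map(run, chunked(list(hashes), batch_size))
        return [block for chunk in per_chunk for block in chunk]


def make_http_post(url, user, password):
    """Return a `post` callable that sends the body over HTTP basic auth.

    The URL and credentials are placeholders supplied by the caller,
    e.g. "http://127.0.0.1:8332/" for a local mainnet node.
    """
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    headers = {"Authorization": f"Basic {token}", "Content-Type": "application/json"}

    def post(body):
        request = urllib.request.Request(url, data=body, headers=headers)
        with urllib.request.urlopen(request) as response:
            return response.read()

    return post
```

Usage would look something like `fetch_blocks(block_hashes, make_http_post(url, user, password), workers=8)`. Whether extra workers actually help depends on the node's rpcthreads setting, so the two would need to be raised together.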

  6. maflcko commented at 11:02 am on August 8, 2024: member
    #30595 mentions “Traversing the block index as well and using block index entries for reading block and undo data.” However, it does not return a JSON-RPC response but a kernel_BlockUndo*/BlockUndo. The pull is also experimental, doesn’t have versioning, and has some other drawbacks. (Just mentioning it for context: if you care about speed, this may be faster than JSON.)

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-09-08 01:12 UTC
