(slow) txindex in conjunction with pruning #12651

jonasschnelli opened this issue on March 9, 2018
  1. jonasschnelli commented at 0:33 am on March 9, 2018: contributor

    Use case: running a pruned peer on low-resource hardware without completely losing the ability to fetch any transaction by its txid.

    Idea: Instead of storing the CDiskBlockPos in WriteTxIndex, it could store the block hash (I guess it would be slightly more data, and if that is a problem, maybe something with the height could work).

    In the case of a pruned peer, if the block is not available on disk, it could fetch the block from other peers and locate the transaction either directly by txid or with CDiskTxPos::nTxOffset.

    If network speed is acceptable, a response within seconds through getrawtransaction() may still be possible.
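
A minimal sketch of the idea above, written in Python rather than Bitcoin Core's C++ to keep it short. Every name here (`Block`, `PrunedTxIndex`, `fetch_from_peer`) is hypothetical and not part of Bitcoin Core. The sketch only illustrates keying the index by block hash so that a lookup can fall back to the network once the block has been pruned.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Block:
    block_hash: str
    txs: Dict[str, str]  # txid -> raw transaction hex (toy representation)


class PrunedTxIndex:
    """Toy txindex mapping txid -> block hash instead of a disk position."""

    def __init__(self, fetch_from_peer: Callable[[str], Optional[Block]]):
        self.index: Dict[str, str] = {}            # txid -> block hash
        self.local_blocks: Dict[str, Block] = {}   # blocks still on disk
        self.fetch_from_peer = fetch_from_peer     # hypothetical network hook

    def connect_block(self, block: Block) -> None:
        self.local_blocks[block.block_hash] = block
        for txid in block.txs:
            self.index[txid] = block.block_hash

    def prune(self, block_hash: str) -> None:
        # Drop the block data but keep the index entries pointing at it.
        self.local_blocks.pop(block_hash, None)

    def get_raw_transaction(self, txid: str) -> Optional[str]:
        block_hash = self.index.get(txid)
        if block_hash is None:
            return None
        block = self.local_blocks.get(block_hash)
        if block is None:
            # Block was pruned: ask a peer for it. A real node would hash the
            # received header and compare it to the indexed hash; the string
            # comparison below merely stands in for that check.
            block = self.fetch_from_peer(block_hash)
            if block is None or block.block_hash != block_hash:
                return None
        return block.txs.get(txid)
```

Because the index stores a hash rather than a file offset, pruning does not invalidate its entries; it only changes where the block comes from.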

  2. jonasschnelli added the label Brainstorming on Mar 9, 2018
  3. jonasschnelli added the label UTXO Db and Indexes on Mar 9, 2018
  4. esotericnonsense commented at 2:25 pm on March 9, 2018: contributor
    Would the pruned peer immediately drop the block, or keep it around for some time - if so, how? Can we just append it to the blocks on disk and have it fall out after pruning LIFO style?
  5. eklitzke commented at 8:48 am on March 11, 2018: contributor

    Curious what the use case is for this. Makes sense, but I can’t think of why a pruned node would actually want this.

    That being said… I am uncomfortably excited by the idea of removing an index based on CDiskBlockPos. I have enough experience with logical/physical indexing schemes in my pre-Bitcoin life to feel strongly about this. Besides the pruning use case you just gave, using physical offsets now is trading a very small efficiency gain for a lot of pain in the future if we want to change the data format.

    Normally the way this is implemented in a real database is via a “secondary index”, which is not supported by LevelDB natively. If you imagine we stored blocks in MySQL or Postgres, you might have an auto-increment “block id” that maps to the full block hash, and then the txindex would map to block ids instead of the full hash. This is similar to your idea of using block height. Dealing with actual block heights sounds kind of confusing because you’d either need to deal with reorgs, or have confusing code that ignores reorgs (the node is missing data after all), which would make things more convoluted.
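
The two-level mapping described above could look roughly like the following Python sketch. The class and method names are made up for illustration, and plain in-memory containers stand in for whatever key-value store would actually hold the data.

```python
from typing import Dict, List, Optional


class SecondaryTxIndex:
    """Toy secondary index: txid -> small block id -> full block hash."""

    def __init__(self) -> None:
        self.block_hash_by_id: List[str] = []       # block id -> block hash
        self.block_id_by_txid: Dict[str, int] = {}  # txid -> block id

    def add_block(self, block_hash: str, txids: List[str]) -> int:
        # Auto-increment id: assigned once and never reused, so two competing
        # blocks at the same height still get distinct ids.
        block_id = len(self.block_hash_by_id)
        self.block_hash_by_id.append(block_hash)
        for txid in txids:
            self.block_id_by_txid[txid] = block_id
        return block_id

    def block_hash_for(self, txid: str) -> Optional[str]:
        block_id = self.block_id_by_txid.get(txid)
        return None if block_id is None else self.block_hash_by_id[block_id]
```

Unlike a height, the id here says nothing about chain position, which is what sidesteps the reorg ambiguity mentioned above.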

    Since this is a brainstorming issue, something else that’s been on my mind is whether it’s possible to support more exotic index types (e.g. address indexes) in a sane way with the existing LevelDB code. In general, having more flexible indexing code would be a big benefit. There’s a lot of duplication in the existing index logic we have that could be cleaned up and made more efficient.

  6. jonasschnelli commented at 12:09 pm on March 11, 2018: contributor

    […] but I can’t think of why a pruned node would actually want this.

    Use case: a secure and decentralised way to access and decompose a transaction by its ID on low-resource hardware. Usually, if you want a decentralised self-validated txindex, you need 100GB+ of free space.

  7. eklitzke commented at 5:54 am on March 13, 2018: contributor
    If I understand correctly, this would be very inefficient without additional extensions to the P2P protocol, since if you wanted to actually fetch the transaction data for a tx that’s not in a locally stored block you would have to download the full block. I can see how it would be useful though if you were in a kind of semi-trusted setup (e.g. querying a full node that you can access over a local network).
  8. jonasschnelli commented at 6:06 am on March 13, 2018: contributor

    If one has verified the chain on a pruned peer, fetching an already verified historical block (that has been pruned) seems secure to me. Seems ideal for merchants or other small businesses. It allows one to securely fetch a transaction by its ID with an overhead of ~1MB.

    Assume you have a node-in-a-box with limited disk space but you still want the ability to fetch any transaction by its ID instead of using a block explorer.

    Ideally fetching arbitrary transactions would be avoided with use of the wallet as relevant-transaction cache… but still: users want the ability to fetch by TX ID and it seems worthwhile to allow this on pruned peers since it’s unrealistic to only have this possibility in the long run for non pruned peers.

    It may also discourage the use of centralized validation (block explorers), or at least it gives low-resource systems an alternative option.

  9. sipa commented at 12:13 pm on April 19, 2018: member

    I think this is uninteresting.

    txindex should be a debugging feature, not something to rely on for production purposes. Having wallets that watch for things you’re interested in is a far more scalable approach than indexing.

    As for the use case of being able to quickly check if recent blocks had something relevant to you (faster than scanning through all blocks again), BIP158 is a more scalable alternative.

  10. jonasschnelli commented at 12:35 pm on April 19, 2018: contributor

    Creating a tx-index outside of Bitcoin-Core via ZMQ/RPC/REST is relatively inefficient.

    AFAIK, Bitcoin Core’s txindex is widely used in production and I’m not aware of stability problems.

    I agree that selective indexing (wallets) is far more efficient and BIP158 looks very promising, though there are reasonable use cases for a transaction-id index.

    If we continue to support a txindex, I don’t see a reason to exclude pruned peers.

  11. sipa commented at 12:38 pm on April 19, 2018: member

    Creating a tx-index outside of Bitcoin-Core via ZMQ/RPC/REST is relatively inefficient.

    Then use P2P.

    AFAIK, Bitcoin Core’s txindex is widely used in production and I’m not aware of stability problems.

    Yes, for lack of a better alternative (ease of watching in the wallet? external indexing daemon?)

  12. promag commented at 2:39 pm on April 19, 2018: member

    allow this on pruned peers since it’s unrealistic to only have this possibility in the long run for non pruned peers.

    This makes sense to me: (try to) give the same feature set to pruned peers.

  13. TheBlueMatt commented at 11:37 pm on April 27, 2018: member

    Use case: a secure and decentralised way to access and decompose a transaction by its ID on low-resource hardware. Usually, if you want a decentralised self-validated txindex, you need 100GB+ of free space.

    I think this use-case is significantly more simply implemented with gettxoutproof/verifytxoutproof/decoderawtransaction. If you’ve already discovered from somewhere that you want a transaction, that somewhere can just as easily provide a txoutproof, and verifytxoutproof will already (AFAIR) tell you if the proof corresponds to a transaction on your (fully-validated) chain.
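
To make the proof-based approach concrete, here is a small Python sketch of how a Merkle branch ties a transaction hash to a block's Merkle root, using Bitcoin's double-SHA256 and its duplicate-last-node rule for odd levels. It is not the serialized partial Merkle tree format that gettxoutproof produces and verifytxoutproof consumes, it ignores the reversed byte order in which txids are usually displayed, and all function names are illustrative.

```python
import hashlib
from typing import List


def dsha256(data: bytes) -> bytes:
    """Double SHA-256, as used for Bitcoin's Merkle tree nodes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def _next_level(level: List[bytes]) -> List[bytes]:
    # Expects an even-length level; callers pad odd levels first.
    return [dsha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves: List[bytes]) -> bytes:
    """Root of a non-empty list of leaf hashes."""
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = _next_level(level)
    return level[0]


def merkle_branch(leaves: List[bytes], index: int) -> List[bytes]:
    """Sibling hashes needed to recompute the root from leaves[index]."""
    branch = []
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        branch.append(level[index ^ 1])  # sibling sits at the paired position
        level = _next_level(level)
        index //= 2
    return branch


def verify_branch(leaf: bytes, index: int, branch: List[bytes], root: bytes) -> bool:
    """Recompute the root from a leaf and its branch, then compare."""
    h = leaf
    for sibling in branch:
        # An odd index means the current node is the right-hand child.
        h = dsha256(sibling + h) if index & 1 else dsha256(h + sibling)
        index >>= 1
    return h == root
```

A pruned node that has validated the chain already knows the Merkle root in each block header, so checking a branch like this requires none of the pruned block data.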

  14. Sjors commented at 1:55 pm on September 8, 2018: member

    I don’t know if that’s really more simple, but it’s certainly an option people should consider. There is additional complexity in applications that have to request, store and provide these txoutproofs. It’s a bit easier if applications relying on a pruned node only talk to applications relying on archival nodes.

    Hauling txoutproofs around, just in case you’re talking to a pruned node, does require bandwidth and storage. I don’t know if that outweighs the extra bandwidth needed for fetching an entire block when you just need one transaction.
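
For a rough sense of scale, the size of a plain Merkle branch proof can be estimated as follows. This is back-of-the-envelope arithmetic only: the 2,500-transaction block and 250-byte transaction are made-up illustrative figures, and the actual serialized txoutproof carries a few extra bytes of bookkeeping that are ignored here.

```python
HEADER_BYTES = 80  # serialized block header
HASH_BYTES = 32    # one SHA-256 digest per tree level


def branch_proof_bytes(num_txs: int, tx_bytes: int) -> int:
    """Approximate bytes for header + Merkle branch + the transaction itself."""
    depth = (num_txs - 1).bit_length()  # ceil(log2(num_txs)) for num_txs >= 1
    return HEADER_BYTES + depth * HASH_BYTES + tx_bytes


# Illustrative figures only: a 250-byte transaction in a 2,500-tx block.
proof_size = branch_proof_bytes(2500, 250)
```

Under these assumptions the proof comes to well under a kilobyte per transaction, versus on the order of 1MB for fetching the whole block; which cost matters more depends on how often proofs are carried around without being needed.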

  15. reardenlife commented at 2:39 am on September 3, 2019: none

    I don’t really understand why txindex is disabled in pruned mode.

    For example, I just want to check whether a payment was received on a bunch of addresses that I generated. So I have to parse the N most recent blocks for those specific addresses. Right now I am using getrawtransaction, but it is extremely slow. https://codereview.stackexchange.com/questions/227202/bash-bitcoin-blockchain-explorer

    It is unclear what I should do to speed things up. Should I parse the raw blockchain data manually?

  16. NicolasDorier commented at 11:27 am on September 3, 2019: contributor
    @reardenlife what about getblock $blockid 2? This gives you all transactions in one request.
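
As a sketch of how the decoded block returned by that call might be scanned for watched addresses, the Python function below walks an already-parsed JSON object. The field names (`tx`, `vout`, `scriptPubKey`, and `addresses` or `address` depending on the release) reflect my understanding of Bitcoin Core's verbose getblock output and should be treated as assumptions; the function name and the sample data are invented for illustration.

```python
from typing import List, Set, Tuple


def find_payments(block: dict, watched: Set[str]) -> List[Tuple[str, int, str, float]]:
    """Return (txid, output index, address, value) for outputs paying a watched address."""
    hits = []
    for tx in block.get("tx", []):
        for vout in tx.get("vout", []):
            spk = vout.get("scriptPubKey", {})
            # Assumed: older releases list "addresses", newer ones a single "address".
            addrs = list(spk.get("addresses", []))
            if "address" in spk:
                addrs.append(spk["address"])
            for addr in addrs:
                if addr in watched:
                    hits.append((tx["txid"], vout["n"], addr, vout["value"]))
    return hits


# Made-up sample in the assumed shape; not real chain data.
sample_block = {
    "tx": [
        {"txid": "tx1", "vout": [
            {"n": 0, "value": 0.5, "scriptPubKey": {"address": "addrA"}},
            {"n": 1, "value": 1.0, "scriptPubKey": {"addresses": ["addrB"]}},
        ]},
    ]
}
payments = find_payments(sample_block, {"addrA"})
```

One request per block replaces one getrawtransaction call per transaction, which is where the speed-up comes from.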
  17. reardenlife commented at 1:43 am on September 4, 2019: none
    @NicolasDorier great! I was able to make a bash blockchain explorer with it. https://bitcointalk.org/index.php?topic=5181207.msg52352018#msg52352018
  18. MarcoFalke commented at 12:28 pm on May 9, 2020: member

    The feature request didn’t seem to attract much attention in the past. Also, the issue seems not important enough right now to keep it sitting around idle in the list of open issues.

    Closing due to lack of interest. Pull requests with improvements are always welcome.

  19. MarcoFalke closed this on May 9, 2020

  20. DrahtBot locked this on Feb 15, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-04 18:12 UTC
