Use muhash for assumeUTXO snapshot #27669

Sjors opened this issue on May 16, 2023
  1. Sjors commented at 10:04 am on May 16, 2023: member

    Currently dumptxoutset uses hash_serialized_2 for its txoutset_hash. The proposed loadtxoutset (#27596) then checks this against the hash we hardcode in CChainParams.

    Unfortunately, the only way to check this hash, if you’re not using it yourself, is to roll back the chain to the assume-utxo snapshot height, call the (slow) gettxoutsetinfo, and then replay to the tip. This process is slow and can wreak havoc on your lightning node if you forget to shut it down first.

    If we used the MuHash instead then any user with -coinstatsindex can verify it with a simple gettxoutsetinfo muhash HEIGHT.

    For good measure we should also modify gettxoutsetinfo so it can calculate the muhash for the current UTXO set without relying on the index. This would help users with pruned nodes (and without the index) to verify a newly proposed snapshot (if they haven’t pruned beyond it). (we already do this)

    I think this can be changed after #27596 is merged. It would just require offering two snapshots for download for a while.

  2. fanquake commented at 10:22 am on May 16, 2023: member
  3. theStack commented at 0:30 am on May 17, 2023: contributor

    Had similar thoughts that using muhash instead would be nice in order to avoid the chain rollback for AssumeUTXO hash reviewers. Not sure if this is a huge problem, but one funny side-effect of this would be that for a chainstate with n coins, there are in theory n! different serializations possible that all would pass as valid (compared to just one for hash_serialized_2), since the order of coins doesn’t matter for muhash.
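
The order-independence described above can be illustrated with a toy multiplicative set hash. This is only a sketch under stated assumptions: the modulus `P`, the `element_to_int` mapping, and the coin strings are all made up for illustration, and none of it is Bitcoin Core's MuHash3072 code, which works modulo a much larger (3072-bit) prime.

```python
import hashlib

# Toy modulus (a Mersenne prime), chosen for readability only.
# Assumption: this is NOT the modulus used by Bitcoin Core's MuHash3072.
P = 2**127 - 1


def element_to_int(coin: bytes) -> int:
    """Map a serialized coin to a nonzero residue modulo P (hypothetical mapping)."""
    value = int.from_bytes(hashlib.sha256(coin).digest(), "big") % P
    return value or 1  # avoid zero so every element has an inverse


def multiplicative_set_hash(coins) -> int:
    """Multiply all element residues; multiplication is commutative, so order is irrelevant."""
    acc = 1
    for coin in coins:
        acc = (acc * element_to_int(coin)) % P
    return acc


def remove_coin(acc: int, coin: bytes) -> int:
    """Remove an element by multiplying with its modular inverse (Python 3.8+)."""
    return (acc * pow(element_to_int(coin), -1, P)) % P


def serialized_hash(coins) -> str:
    """Order-dependent contrast: SHA-256 over the coins in the order given."""
    hasher = hashlib.sha256()
    for coin in coins:
        hasher.update(coin)
    return hasher.hexdigest()


# Hypothetical coin identifiers, for illustration only.
coins = [f"txid{i}:0".encode() for i in range(5)]
reordered = list(reversed(coins))

same_set_hash = multiplicative_set_hash(coins) == multiplicative_set_hash(reordered)
same_serialized_hash = serialized_hash(coins) == serialized_hash(reordered)
```

Here `same_set_hash` is true for any reordering of `coins`, while `same_serialized_hash` is false for the reversed list, which is the n! effect mentioned in the comment.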

    > For good measure we should also modify gettxoutsetinfo so it can calculate the muhash for the current UTXO set without relying on the index. This would help users with pruned nodes (and without the index) to verify a newly proposed snapshot (if they haven’t pruned beyond it).

    I think calculating the muhash of the current chainstate without relying on coinstatsindex is already possible today by omitting the hash_or_height parameter, i.e. simply gettxoutsetinfo muhash? (If you want to try it out, I recommend passing -rpcclienttimeout=0, as the calculation takes quite long and could RPC-timeout with the default setting).
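
A small wrapper around the two invocations discussed in this thread might look like the sketch below. The helper names, the example height, and the assumption that `bitcoin-cli` is on the PATH are all illustrative; passing a height relies on the node running with -coinstatsindex, as noted in the opening comment.

```python
import json
import subprocess


def gettxoutsetinfo_muhash_cmd(height=None, cli="bitcoin-cli"):
    """Build the command line; -rpcclienttimeout=0 disables the client-side timeout."""
    cmd = [cli, "-rpcclienttimeout=0", "gettxoutsetinfo", "muhash"]
    if height is not None:
        # With a height, the node answers from coinstatsindex instead of scanning.
        cmd.append(str(height))
    return cmd


def query_muhash(height=None):
    """Run the RPC and return the 'muhash' field (requires a running node)."""
    result = subprocess.run(
        gettxoutsetinfo_muhash_cmd(height),
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)["muhash"]


def snapshot_matches(node_muhash: str, expected_muhash: str) -> bool:
    """Compare two hex strings, ignoring case and surrounding whitespace."""
    return node_muhash.strip().lower() == expected_muhash.strip().lower()


# Usage (not executed here; 800000 is a hypothetical snapshot height):
#   snapshot_matches(query_muhash(800000), "<muhash published with the snapshot>")
```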

  4. Sjors commented at 8:21 am on May 17, 2023: member

    For the purpose of verifying a consensus change the ordering doesn’t matter. The end user of the snapshot would benefit from a sha256 hash in order to not waste time verifying incorrect snapshots (assuming that sha256 is quicker than muhash here).

    A file checksum doesn’t need to be in the consensus code, though if we offer an automatic download it may need to be somewhere in the code. The right file checksum may depend on the specific distribution mechanism, e.g. if you use a torrent you’d use the torrent hash, not the .dat file hash.
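
For the file-level check, a streaming SHA-256 over the snapshot file is enough to reject a corrupted or wrong download before spending time on the muhash. A minimal sketch, assuming the snapshot is a single local file (the function name and chunk size are arbitrary choices, not part of any Bitcoin Core interface):

```python
import hashlib


def file_sha256(path, chunk_size=1 << 20):
    """Return the hex SHA-256 of a file, reading it in fixed-size blocks
    so a multi-gigabyte snapshot never has to fit in memory."""
    hasher = hashlib.sha256()
    with open(path, "rb") as snapshot:
        for block in iter(lambda: snapshot.read(chunk_size), b""):
            hasher.update(block)
    return hasher.hexdigest()
```

Unlike the muhash, this digest depends on the exact byte order of the file, so two snapshots with the same coins in a different order would have different checksums.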

    Only committing to the muhash in the consensus code also leaves the door open for some fancy download scheme that takes advantage of the fact that ordering and duplicates don’t matter.

    > I think calculating the muhash of the current chainstate without relying on coinstatsindex is already possible today

    You’re right, scrapping that.

  5. jamesob commented at 9:22 am on May 17, 2023: member

    See the proposal for why I didn’t use muhash to begin with: https://github.com/jamesob/assumeutxo-docs/tree/2019-04-proposal/proposal#how-will-users-and-reviewers-efficiently-verify-hashes-of-a-given-utxo-set

    That said, the original rationale (supporting snapshot chunks) doesn’t seem as relevant anymore, so this might be worth revisiting…

  6. Sjors commented at 8:07 pm on May 17, 2023: member

    > However, a rolling UTXO set hash is incompatible with assumeutxo commitment schemes that involve chunking snapshots (discussed below) and so the resulting assumeutxo value might have to be a tuple consisting of (rolling_set_hash, split_snapshot_chunks_merkle_root).

    This is what I was getting at above. You just need a second hash that is order dependent, which could be the sha256 hash of the .dat file, a torrent hash or the split_snapshot_chunks_merkle_root. I don’t think that needs to live in the consensus code. It’s more of a networking detail (potato, potahto, since it’s still an extra thing to set and review for each release).
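
One way an order-dependent commitment over snapshot chunks could be computed is sketched below. The thread does not specify a chunking or tree format, so the chunk size, the domain-separation prefixes, and the rule of promoting an unpaired node unchanged are all assumptions made for this example.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def split_chunks(data: bytes, chunk_size: int):
    """Cut the snapshot bytes into consecutive fixed-size chunks (last one may be shorter)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def merkle_root(chunks) -> bytes:
    """Merkle root over chunk hashes.

    Assumed conventions (not from the thread): leaves are prefixed with 0x00,
    inner nodes with 0x01, and an unpaired node is carried up unchanged.
    """
    if not chunks:
        raise ValueError("at least one chunk is required")
    level = [sha256(b"\x00" + chunk) for chunk in chunks]
    while len(level) > 1:
        next_level = [
            sha256(b"\x01" + level[i] + level[i + 1])
            for i in range(0, len(level) - 1, 2)
        ]
        if len(level) % 2:
            next_level.append(level[-1])  # odd node moves up as-is
        level = next_level
    return level[0]
```

With such a root, each chunk can be checked against the commitment as it arrives, while the muhash remains the order-independent check on the set as a whole.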

  7. Sjors commented at 10:31 am on September 28, 2023: member
    Tangentially, a Merkle-Sum Tree design could be interesting, so that the total supply can be checked against the issuance schedule without having to process (or even download?) the full multi gigabyte file.
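
As a rough illustration of the Merkle-sum idea, each node can carry a hash together with the total amount beneath it, so the root exposes the claimed supply. The node encoding below is invented for this sketch; only the subsidy schedule (50 BTC, halving every 210,000 blocks) is taken as given. The schedule yields an upper bound, since the actual UTXO total is lower (for example, the genesis block's coinbase output is unspendable).

```python
import hashlib

HALVING_INTERVAL = 210_000
INITIAL_SUBSIDY_SAT = 50 * 100_000_000


def leaf(coin: bytes, amount_sat: int):
    """Leaf node: (hash committing to coin and amount, amount). Encoding is hypothetical."""
    digest = hashlib.sha256(b"\x00" + coin + amount_sat.to_bytes(8, "little")).digest()
    return (digest, amount_sat)


def parent(left, right):
    """Inner node: commits to both child hashes and both child sums."""
    digest = hashlib.sha256(
        b"\x01"
        + left[0] + left[1].to_bytes(8, "little")
        + right[0] + right[1].to_bytes(8, "little")
    ).digest()
    return (digest, left[1] + right[1])


def merkle_sum_root(nodes):
    """Reduce leaves pairwise; an unpaired node is carried up unchanged."""
    if not nodes:
        raise ValueError("at least one leaf is required")
    level = list(nodes)
    while len(level) > 1:
        next_level = [parent(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            next_level.append(level[-1])
        level = next_level
    return level[0]


def max_supply_sat(height: int) -> int:
    """Upper bound on satoshis issued in blocks 0..height under the subsidy schedule."""
    total, subsidy, remaining = 0, INITIAL_SUBSIDY_SAT, height + 1
    while remaining > 0 and subsidy > 0:
        blocks = min(remaining, HALVING_INTERVAL)
        total += blocks * subsidy
        remaining -= blocks
        subsidy >>= 1  # halve the subsidy for the next interval
    return total
```

A verifier holding only the root could then compare `merkle_sum_root(...)[1]` against `max_supply_sat(snapshot_height)` before downloading the full file.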
  8. fjahr commented at 7:34 pm on October 19, 2023: contributor
    @Sjors After this conversation on IRC today, should we turn this into a brainstorming issue for UTXO set P2P distribution for assumeutxo? That seems to be very much intertwined with the hash conversation.
  9. Sjors commented at 7:46 am on October 23, 2023: member
    It makes sense to take p2p distribution into account.

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-01-21 09:12 UTC
