BIP draft: UTXO set sharing #2137

pull fjahr wants to merge 1 commits into bitcoin:master from fjahr:2026_utxo_set_share changing 1 files +244 −0
  1. fjahr commented at 9:54 PM on April 10, 2026: contributor

    This BIP draft describes the sharing of a full UTXO set via the p2p network.

    Design summary:

    • Uses a new service bit to signal ability to share one or more UTXO sets
    • Introduces four new P2P messages: one round trip to get information on the available UTXO sets, and one round trip for each chunk and its associated metadata
    • UTXO sets are downloaded in chunks of 3.9 MiB
    • For each chunk there is a merkle proof which shows the chunk is part of the same merkle tree; this prevents potential DoS/OOM attack vectors
    • The root of the merkle tree can be known through a trusted information source (assumeutxo params in Bitcoin Core), or multiple peers could be asked and the mechanism only used if there is agreement on the value, similar to compact block filters
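A minimal sketch of the per-chunk Merkle check from the summary above. The hash function (double SHA-256), the rule for odd nodes (paired with themselves), and all function names are assumptions made for illustration; the thread does not quote the draft's actual tree construction.

```python
import hashlib

def sha256d(data: bytes) -> bytes:
    # Double SHA-256 is an assumption; the draft's hash function is not quoted here.
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def build_levels(leaves):
    # Returns every tree level, leaves first and the root level last.
    # Assumption: a node without a sibling is paired with itself.
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2:
            cur = cur + [cur[-1]]
        levels.append([sha256d(cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels

def make_proof(levels, index):
    # Collects the sibling hashes on the path from leaf `index` up to the root.
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):
            sibling = index  # unpaired node: its sibling is itself
        proof.append(level[sibling])
        index //= 2
    return proof

def verify_chunk(chunk, index, proof, root):
    # Recomputes the root from one chunk and its proof, then compares.
    node = sha256d(chunk)
    for sibling in proof:
        node = sha256d(sibling + node) if index & 1 else sha256d(node + sibling)
        index >>= 1
    return index == 0 and node == root
```

Because each chunk can be checked against the root as soon as it arrives, a node never has to buffer unverified data beyond a single chunk, which is the DoS/OOM property the summary refers to.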

    The one part I am not so sure about yet: This references Bitcoin Core and its features or RPCs in a few places now. I am aware that this is not ideal for a specification that targets a wider audience, but the reality is that assumeutxo seems to be implemented only in Bitcoin Core, and mentioning RPCs from the workflow there seems the clearest way to describe this. But I am happy to generalize this further, and I would be very happy to receive some guidance on what level of referencing assumeutxo is acceptable here, since it is obviously the main current motivation. One concrete example: The Bitcoin Core PR that I will use as a reference implementation will rely on assumeutxo params rather than comparing multiple peer values of the merkle root. Is this discrepancy an issue?

    Mailing List post and reference implementation will follow shortly and I will add the links here asap.

  2. Add BIP UTXO set sharing 4bc39c8b17
  3. fjahr force-pushed on Apr 10, 2026
  4. in bip-XXXX.md:143 in 4bc39c8b17
     138 | +| `serialized_hash` | `uint256` | 32 | The UTXO set serialized hash. |
     139 | +| `data_length` | `uint64_t` | 8 | Total size of the serialized UTXO set in bytes (header + body). |
     140 | +| `merkle_root` | `uint256` | 32 | Root of the Merkle tree computed over chunk hashes. |
     141 | +
     142 | +A requesting node MUST ignore entries whose `serialized_hash` does not match a known
     143 | +utxo set hash for the corresponding height.
    


    ajtowns commented at 10:37 PM on April 10, 2026:

    I think it would be better for a client that supports this mechanism to hardcode the merkle root instead of the straight serialized hash, and to drop serialized_hash from this message.


    fjahr commented at 8:56 PM on April 11, 2026:

    Hm, yeah, I think that makes sense. I was struggling with finding the right path between building on top of the existing assumeutxo data we already have and extending it. I am adding the merkle root to the assumeutxo data in my PR, so checking the serialized_hash is a belt-and-suspenders check there, and it makes sense to make this explicit here as well.


    ajtowns commented at 2:24 AM on April 12, 2026:

    If you wanted to change the contents of the utxo dump as well, it would be nice to include the header chain for the block where the snapshot was taken. Then you could do an import without needing to connect to the network at all, I think.

  5. in bip-XXXX.md:169 in 4bc39c8b17
     164 | +|-------|------|------|-------------|
     165 | +| `height` | `uint32_t` | 4 | Block height this data corresponds to. |
     166 | +| `block_hash` | `uint256` | 32 | Block hash this data corresponds to. |
     167 | +| `chunk_index` | `uint32_t` | 4 | Zero-based index of this chunk. |
     168 | +| `proof_length` | `compact_size` | 1–9 | Number of hashes in the Merkle proof. |
     169 | +| `proof_hashes` | `uint256[]` | 32 × `proof_length` | Sibling hashes from leaf to root. |
    


    ajtowns commented at 10:42 PM on April 10, 2026:

    Rather than a proof, it might be better to just request getutxoset <height> <hash> 0xFFFFFFFF once to get the full list of chunk hashes -- that should be about 74kB for a 9GB utxo set, and should stay under 4MB until the utxo set is >450GB. Then each chunk is just <height> <hash> <number> <data>.
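The size estimates in the comment above can be checked with a little arithmetic. This sketch assumes a chunk size of 3.9 MiB rounded down to whole bytes, 32-byte chunk hashes, "9GB" read as 9 GiB, and "4MB" read as 4,000,000 bytes; none of these readings is confirmed in the thread, and the function names are made up for illustration.

```python
CHUNK_SIZE = 39 * 1024 * 1024 // 10  # 3.9 MiB rounded down to whole bytes (assumption)
HASH_SIZE = 32                        # one uint256 hash per chunk
MAX_MESSAGE = 4_000_000               # "4MB" read as 4,000,000 bytes (assumption)

def chunk_count(set_bytes: int) -> int:
    # Ceiling division: a partial final chunk still needs its own hash.
    return -(-set_bytes // CHUNK_SIZE)

def hash_list_bytes(set_bytes: int) -> int:
    # Size of the full list of chunk hashes for a UTXO set of the given size.
    return chunk_count(set_bytes) * HASH_SIZE

def max_set_bytes() -> int:
    # Largest UTXO set whose hash list still fits in one MAX_MESSAGE-sized message.
    return (MAX_MESSAGE // HASH_SIZE) * CHUNK_SIZE
```

Under these assumptions a 9 GiB set has 2364 chunks and a 75,648-byte hash list (about 74 KiB), and the list reaches the 4,000,000-byte limit at roughly 511 GB (476 GiB), which is consistent with the ">450GB" estimate.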


    fjahr commented at 8:51 PM on April 11, 2026:

    Huh, interesting idea to get all the chunk hashes first, I didn't think of that. It might make sense to do this with a separate message type even instead of hacking getutxoset as you described. I will have to think about it a little more.
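One way the hash-list alternative discussed here might look on the requesting side: fetch all chunk hashes once, check them against a root the node already trusts, then verify each chunk with a single hash comparison. The double SHA-256 hash, the pairing rule for odd nodes, and the function names are assumptions for illustration only, not part of the draft.

```python
import hashlib

def sha256d(data: bytes) -> bytes:
    # Double SHA-256 is an assumption; the draft's hash function is not quoted here.
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def merkle_root(hashes):
    # Assumption: a node without a sibling is paired with itself.
    level = list(hashes)
    if not level:
        raise ValueError("empty hash list")
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [sha256d(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def accept_hash_list(chunk_hashes, known_root):
    # Done once, before any chunk is requested.
    return merkle_root(chunk_hashes) == known_root

def accept_chunk(chunk, index, chunk_hashes):
    # Done per chunk: no proof is needed once the hash list is trusted.
    return 0 <= index < len(chunk_hashes) and sha256d(chunk) == chunk_hashes[index]
```

The trade-off is one larger up-front message in exchange for chunk messages that carry no proof hashes.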

  6. murchandamus commented at 11:33 PM on April 10, 2026: member

    I was going to complain that I haven’t seen a discussion about this proposal on the mailing list… but you did that already for me. If you already know that you should send it to the mailing list first, I don’t know why you opened the PR first, though. :stuck_out_tongue:

  7. fjahr commented at 9:04 PM on April 11, 2026: contributor

    I don’t know why you opened the PR first

    It wasn't clear to me that a ML post was a prerequisite to open the PR. I just thought it was necessary to do this at some point before the BIP could get merged/assigned a number. I think that having a place for more detail-oriented commentary makes sense to have in addition to the high-level discussion happening on the ML, if ML readers have such feedback but would rather use the more convenient inline commenting/suggestion features in GitHub. I was also looking for feedback on my assumeutxo-related question, e.g. can I assume knowledge of a feature that is only implemented in Bitcoin Core or should I define this in the BIP. The ML doesn't seem like the right place to ask about this.

    I will close this for now and re-open when I have made the ML post and given it some reasonable time for discussion.

  8. fjahr closed this on Apr 11, 2026

  9. murchandamus commented at 5:44 PM on April 12, 2026: member

    Thanks, and sorry, I might have come off as more gruff than intended — tone is hard in written text. Obviously, you’ve been around the block and your proposal reads well-considered, but we have been getting a lot of premature submissions out of the blue here, where then BIP Editors become the first line of feedback. Personally, it’s been taking a growing part of my work hours to even just get through all submissions to the repository. So, we have become a bit more insistent on proposals actually being posted to the list first, and I’d like to avoid giving the impression that I’m playing favorites.

    A hypothetical optimal order might be:

    1. Discuss your idea with a couple colleagues
    2. Post about the idea on the ML
    3. Compile a first draft in a PR against your personal fork of the BIPs repository
    4. Have someone give it a read
    5. Send your draft to the ML
    6. Open a PR here

    In your case it sounds like you’d be able to skip directly to 5, and we can of course reopen this PR, when there has been an ML discussion.

  10. in bip-XXXX.md:135 in 4bc39c8b17
     130 | +| `count` | `compact_size` | 1–9 | Number of available UTXO sets. |
     131 | +
     132 | +For each available UTXO set:
     133 | +
     134 | +| Field | Type | Size | Description |
     135 | +|-------|------|------|-------------|
    


    luke-jr commented at 10:07 PM on April 13, 2026:

    Since the format has a version number, it would make sense to include it here.

  11. in bip-XXXX.md:191 in 4bc39c8b17
     186 | +
     187 | +1. The requesting node identifies peers advertising `NODE_UTXO_SET`.
     188 | +2. The requesting node sends `getutxosetinfo` to one or more of these peers.
     189 | +3. Each peer responds with `utxosetinfo`. The requesting node verifies that the advertised
     190 | +   `serialized_hash` matches a known UTXO set hash, compares `merkle_root` values across peers,
     191 | +   and selects a UTXO set whose Merkle root has agreement among multiple peers.
    


    luke-jr commented at 10:09 PM on April 13, 2026:

    I think this process defeats the point of the service bit. Having "at least one" UTXO set only "works" while there are only a small number of UTXO set options to download. If it's not sufficient to just try peers until you find the UTXO set you want, then we probably need a way to advertise specific UTXO sets.
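The peer-agreement step quoted in the hunk above could be sketched as follows. The response layout, the `min_agreement` threshold, and the rule of rejecting ties are assumptions for illustration; the quoted text only says that the Merkle root must have agreement among multiple peers.

```python
from collections import Counter

def select_merkle_root(responses, height, known_hash, min_agreement=2):
    """Pick the Merkle root that enough peers agree on for one snapshot height.

    responses: {peer_id: [(height, serialized_hash, merkle_root), ...]}
    Returns the agreed root, or None if no root has sufficient, unambiguous support.
    """
    votes = Counter()
    for entries in responses.values():
        roots = set()
        for h, serialized_hash, root in entries:
            # Ignore entries that do not match the locally known UTXO set hash.
            if h == height and serialized_hash == known_hash:
                roots.add(root)
        for root in roots:
            votes[root] += 1  # each peer counts at most once per root
    ranked = votes.most_common(2)
    if not ranked or ranked[0][1] < min_agreement:
        return None
    if len(ranked) > 1 and ranked[1][1] == ranked[0][1]:
        return None  # tie between conflicting roots: treat as no agreement
    return ranked[0][0]
```

A node with a hardcoded Merkle root, as suggested earlier in this thread, would not need this step at all.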

  12. in bip-XXXX.md:210 in 4bc39c8b17
     205 | +
     206 | +**Serialized hash in `utxosetinfo`:** The requesting node should have access to a known UTXO set hash
     207 | +before initiating the process. Including the serialized hash in the advertisement lets the requester
     208 | +immediately filter out peers claiming a different UTXO set state before downloading any data.
     209 | +
     210 | +**Discovery before download:** The `getutxosetinfo`/`utxosetinfo` exchange lets the requesting node
    


    luke-jr commented at 10:11 PM on April 13, 2026:

    I'm not sure discovery is a useful feature. It enables a misguided implementation to blindly trust whatever nodes provide, and harms node privacy by adding a way to fingerprint nodes.

    If you know what you'll accept in advance, you can simply request that snapshot, and the node will either provide it or (potentially) say it can't.


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bips. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-14 15:10 UTC
