Unnamed repository; edit this file 'description' to name the repository.
 help / color / mirror / Atom feed
From: josie <josie@2140.dev>
To: Bitcoin Development Mailing List <bitcoindev@googlegroups.com>
Subject: [bitcoindev] Re: [BIP Draft] P2P UTXO Set Sharing
Date: Tue, 19 May 2026 02:32:33 -0700 (PDT)	[thread overview]
Message-ID: <8661e699-47d8-46e4-a561-41c5d2fdc0e6n@googlegroups.com> (raw)
In-Reply-To: <CmNC-9nAajMwaJi6dHTdlpgjzLZazPlMjGAxNT8DJXAnyw2sUKCygJJU4BLqaF8OYw3carG_pt1Rriqu66OG3wQ8u2itVlJCFo1AhI3V4es=@protonmail.com>


[-- Attachment #1.1: Type: text/plain, Size: 14970 bytes --]

I think others have articulated legitimate concerns with the AssumeUTXO 
approach better than I can, so I'll try not to repeat them. 

I'd like to instead take a step back and ask why is this being proposed in 
the first place? Even if I agreed with the AssumeUTXO approach (I do not), 
I would still push back on p2p sharing. AssumeUTXO's trust model is built 
around a hash committed to in the Bitcoin Core binary, which the snapshot 
is then verified against. So why not just distribute the snapshot from 
bitcoincore.org along with the release? As you said, this is meant to be 
opt-in so why not keep it fully opt-in by not adding code and attack 
surface to the p2p code that everyone who runs Bitcoin Core will use?

Furthermore, having the snapshot be distributed by Bitcoin Core makes the 
trust model explicitly clear: if you don't trust downloading a snapshot 
from the Bitcoin Core website then you certainly shouldn't be trusting the 
hash committed to in the Bitcoin Core binary. This also makes the feature 
available to any user who wants to use it, without requiring a certain 
number of peers on the network to also support it. I'd much rather validate 
beforehand how many people are *actually* using this feature before we 
continue writing more code to support it.

I'd also like to comment on the bandwidth arguments being used as a 
justification for AssumeUTXO. This does not make sense to me. Low bandwidth 
areas are often on metered bandwidth and AssumeUTXO increases bandwidth 
usage, not decreases. Furthermore, if low bandwidth is a concern, has 
anyone taken the time to do the math and verify that the node will catch up 
to the snapshot in a reasonable amount of time? As a reminder, the node 
also needs to keep up with new blocks and transaction relay, which would 
take away available bandwidth/CPU from validating in the background. I keep 
hearing "accepting payments" as the use case, which implies to me the node 
would care a lot more about validating recently blocks and learning about 
transactions. This implies the node may never catch up, or catch up so 
slowly that it invalidates the trust model of AssumeUTXO.

In closing, if I may be a bit (humorous) crass, this feels like putting 
lipstick on a pig by trying to dress up a trusted protocol as less trusted 
than it actually is.

Cheers,
josie
On Tuesday, May 5, 2026 at 5:39:56 PM UTC+2 Fabian wrote:

> Dear list,
>
> I am sharing a BIP draft for sharing the UTXO set over the P2P network, an 
> old idea that makes it possible to utilize AssumeUTXO without sourcing a 
> UTXO set dump from a third party source. You can find the full text below 
> or comment on the BIPs repository pull directly: 
> https://github.com/bitcoin/bips/pull/2137
>
> Fabian
>
>
> ```
>   BIP: ?
>   Layer: Peer Services
>   Title: P2P UTXO Set Sharing
>   Authors: Fabian Jahr <fj...@protonmail.com>
>   Status: Draft
>   Type: Specification
>   Assigned: ?
>   Discussion: ?
>   Version: 0.2.0
>   License: BSD-2-Clause
> ```
>
> ## Abstract
>
> This BIP defines a P2P protocol extension for sharing full UTXO sets 
> between peers. It introduces
> a new service bit `NODE_UTXO_SET`, four new P2P messages (`getutxotree`, 
> `utxotree`, `getutxoset`,
> `utxoset`), and a chunk-hash list anchored to a Merkle root known to the 
> requesting node, enabling
> per-chunk verification. This allows nodes to bootstrap from a recent 
> height by obtaining the
> required UTXO set directly from the P2P network via mechanisms such as 
> assumeutxo.
>
> ## Motivation
>
> The assumeutxo feature (implemented in Bitcoin Core) allows nodes to begin 
> operating from a serialized
> UTXO set while validating
> historical blocks in the background. However, there is currently no 
> canonical source for obtaining this
> data. Users must either generate one themselves from a fully synced node 
> (using `dumptxoutset` in 
> Bitcoin Core), or download one from a third party.
>
> By enabling UTXO set sharing over the P2P network, new nodes can obtain 
> the data directly from
> peers, removing the dependency on external infrastructure.
>
> ## Specification
>
> The key words "MUST", "MUST NOT", "SHOULD", "SHOULD NOT", and "MAY" in 
> this document are to be
> interpreted as described in RFC 2119.
>
> ### Service Bit
>
> | Name | Bit | Description |
> |------|-----|-------------|
> | `NODE_UTXO_SET` | 12 (0x1000) | The node can serve complete UTXO set 
> data for at least one height. |
>
> A node MUST NOT set this bit unless it has at least one full UTXO set 
> available to serve.
> A node signaling `NODE_UTXO_SET` MUST be capable of responding to 
> `getutxotree` and `getutxoset`
> requests for every UTXO set it is willing to serve, including the full 
> chunk-hash list and every
> chunk of those sets.
>
> ### Data Structures
>
> #### Serialized UTXO Set
>
> The serialized UTXO set uses the format established by the Bitcoin Core 
> RPC `dumptxoutset` (as of Bitcoin Core v31).
>
> **Header (55 bytes):**
>
> | Field | Type | Size | Description |
> |-------|------|------|-------------|
> | `magic` | `bytes` | 5 | `0x7574786fff` (ASCII `utxo` + `0xff`). |
> | `version` | `uint16_t` | 2 | Format version. |
> | `network_magic` | `bytes` | 4 | Network message start bytes. |
> | `base_height` | `uint32_t` | 4 | Block height of the UTXO set. |
> | `base_blockhash` | `uint256` | 32 | Block hash of the UTXO set. |
> | `coins_count` | `uint64_t` | 8 | Total number of coins (UTXOs) in the 
> set. |
>
> **Body (coin data):**
>
> Coins are grouped by transaction hash. For each group:
>
> | Field | Type | Size | Description |
> |-------|------|------|-------------|
> | `txid` | `uint256` | 32 | Transaction hash. |
> | `num_coins` | `compact_size` | 1–9 | Number of outputs for this txid. |
>
> For each coin in the group:
>
> | Field | Type | Size | Description |
> |-------|------|------|-------------|
> | `vout_index` | `compact_size` | 1–9 | Output index. |
> | `coin` | `Coin` | variable | Serialized coin (varint-encoded code for 
> height/coinbase, then compressed txout). |
>
> Coins are ordered lexicographically by outpoint (txid, then vout index), 
> matching the LevelDB iteration
> order of the coins database.
>
> #### Chunk Merkle Tree
>
> The serialized UTXO set (header + body) is split into chunks of exactly 
> 3,900,000 bytes (3.9 MB). The
> last chunk contains the remaining bytes and may be smaller.
>
> The leaf hash for each chunk is `SHA256d(chunk_data)`. The tree is built 
> as a balanced binary tree. When
> the number of nodes at a level is odd, the last node is duplicated before 
> hashing the next level.
> Interior nodes are computed as `SHA256d(left_child || right_child)`.
>
> The leaves are delivered to the node in a single `utxotree` response. A 
> node that knows
> the Merkle root for a given UTXO set checks a received list of leaves by 
> recomputing the root and
> comparing. The Merkle root is the sole trust input required to verify the 
> integrity of the received UTXO set.
>
> `SHA256d` denotes double-SHA256: `SHA256d(x) = SHA256(SHA256(x))`.
>
> ### Messages
>
> #### `getutxotree`
>
> Sent to request the chunk-hash list for a specific UTXO set.
>
> | Field | Type | Size | Description |
> |-------|------|------|-------------|
> | `block_hash` | `uint256` | 32 | Block hash identifying the requested 
> UTXO set. |
>
> A node that has advertised `NODE_UTXO_SET` and can serve the requested 
> UTXO set MUST respond with
> `utxotree`. If the serving node cannot fulfill the request, it MUST NOT 
> respond. The requesting
> node SHOULD apply a reasonable timeout and try another peer.
>
> #### `utxotree`
>
> Sent in response to `getutxotree`, delivering the full chunk-hash list 
> along with per-snapshot
> metadata.
>
> | Field | Type | Size | Description |
> |-------|------|------|-------------|
> | `block_hash` | `uint256` | 32 | Block hash this data corresponds to. |
> | `version` | `uint16_t` | 2 | Format version of the serialized UTXO set. |
> | `data_length` | `uint64_t` | 8 | Total size of the serialized UTXO set 
> in bytes (header + body). |
> | `num_chunks` | `compact_size` | 1–9 | Number of chunks the serialized 
> UTXO set is split into. |
> | `chunk_hashes` | `uint256[]` | 32 × `num_chunks` | The ordered list of 
> chunk hashes. |
>
> Upon receiving a `utxotree` message, the node MUST recompute the Merkle 
> root from
> `chunk_hashes` and compare it against the Merkle root it knows for the 
> corresponding UTXO set. If
> the roots do not match, the node MUST discard the response and MUST 
> disconnect the peer.
>
> #### `getutxoset`
>
> Sent to request a single chunk of UTXO set data. The requesting node MUST 
> have received a `utxotree`
> for the corresponding UTXO set before sending this message.
>
> | Field | Type | Size | Description |
> |-------|------|------|-------------|
> | `block_hash` | `uint256` | 32 | Block hash identifying the requested 
> UTXO set. |
> | `chunk_index` | `uint32_t` | 4 | Zero-based index of the requested 
> chunk. |
>
> If the serving node cannot fulfill the request, it MUST NOT respond. The 
> requesting node SHOULD apply
> a reasonable timeout and try another peer.
>
> #### `utxoset`
>
> Sent in response to `getutxoset`, delivering one chunk.
>
> | Field | Type | Size | Description |
> |-------|------|------|-------------|
> | `block_hash` | `uint256` | 32 | Block hash this data corresponds to. |
> | `chunk_index` | `uint32_t` | 4 | Zero-based index of this chunk. |
> | `data` | `bytes` | variable | Chunk payload, exactly 3.9 MB except for 
> the last chunk. |
>
> The transfer is receiver-driven: the requesting node sends one 
> `getutxoset` per chunk. Chunks MAY be
> requested in any order and from different peers.
>
> Upon receiving a `utxoset` message, the node MUST compute `SHA256d(data)` 
> and compare it against
> `chunk_hashes[chunk_index]` from the `utxotree` it accepted for this UTXO 
> set. If the hashes do not
> match, the node MUST discard the chunk and MUST disconnect the peer. A 
> node SHOULD also disconnect
> a peer that sends a `utxoset` message with fields (`chunk_index`, 
> `block_hash`) that do not match
> the outstanding request.
>
> After all chunks have been received, the node SHOULD parse the reassembled 
> UTXO set against the
> serialized UTXO set format to confirm it is well-formed.
>
> ### Protocol Flow
>
> 1. The requesting node identifies peers advertising `NODE_UTXO_SET`.
> 2. The requesting node sends `getutxotree` for the desired block hash to 
> one or more of these peers.
> 3. Each peer responds with `utxotree`. The requesting node verifies the 
> response by recomputing
>    the Merkle root against a value it knows for the given UTXO set, either 
> from a trusted source
>    or by selecting a root with agreement among multiple peers.
> 4. The requesting node downloads chunks via `getutxoset`/`utxoset` 
> exchanges, verifying each chunk
>    against its entry in the accepted `utxotree` on receipt. On 
> verification failure the peer is
>    disconnected and download continues from another peer without losing 
> already-verified chunks.
> 5. After all chunks are received, the node parses the reassembled UTXO set 
> against the serialized
>    UTXO set format to confirm that it is well-formed.
>
> Serving nodes are free to limit the number of concurrent and repeated 
> transfers per peer at their own
> discretion to manage resource consumption.
>
> ## Rationale
>
> **Usage of service bit 12:** Service bits allow selective peer discovery 
> through
> DNS seeds and addr relay. Bit 12 is chosen as the next unassigned bit 
> after `NODE_P2P_V2` (bit 11, BIP 324).
>
> **Direct request model:** Peers signal availability of UTXO sets via the 
> `NODE_UTXO_SET`
> service bit; the requesting node identifies the desired UTXO set by block 
> hash when sending
> `getutxotree`. The serving node responds only if it can serve that 
> specific UTXO set.
>
> **Per-chunk verification:** The chunk-hash list returned in `utxotree` 
> enables each chunk to be verified
> by direct lookup against the accepted list as it arrives, allowing 
> immediate detection of corrupt data,
> peer switching without data loss, and parallel download from multiple 
> peers. The list itself is small
> (~80 KB for a ~10 GB set). The specified serialization is deterministic, 
> so all honest nodes produce
> byte-identical output, guaranteeing Merkle root agreement.
>
> **3.9 MB chunk size:** The number balances round trips (~2,560 for a ~10 
> GB set) against memory usage
> for buffering and verifying a single chunk. Smaller chunks would increase 
> protocol overhead; larger
> chunks would increase memory pressure on constrained devices commonly used 
> to run Bitcoin nodes.
> Together with the additional message overhead, the `utxoset` message 
> including the chunk data also
> sits just below the theoretical maximum block size which means any 
> implementation should be able to
> handle messages of this size.
>
> **Reusing the `dumptxoutset` format:** Avoids introducing a new 
> serialization format and ensures
> compatibility with UTXO sets already being generated and shared.
>
> **Relationship to BIP 64:** BIP 64 defined a protocol for querying 
> individual UTXOs by outpoint and is
> now closed. This BIP addresses a different use case: bulk transfer of the 
> entire UTXO set for node
> bootstrapping.
>
> ## Reference Implementation
>
> [Bitcoin Core implementation pull request](
> https://github.com/bitcoin/bitcoin/pull/35054)
>
> ## Copyright
>
> This BIP is made available under the terms of the 2-clause BSD license. See
> https://opensource.org/license/BSD-2-Clause for more information.
>
> ## Changelog
>
> * __0.2.0__ (2026-05-04):
>     * Dropped discovery before download approach, instead request the 
> chunk-hash list via `getutxotree`/`utxotree`
>     * Dropped per-chunk Merkle proofs; chunks verified directly against 
> the chunk-hash list
>     * Dropped `height` from requests (`block_hash` is the sole 
> identifier); added format `version` to `utxotree`
>     * Dropped references to the serialized hash; the Merkle root is the 
> sole integrity check
> * __0.1.0__ (2026-04-10):
>     * Initial draft
>

-- 
You received this message because you are subscribed to the Google Groups "Bitcoin Development Mailing List" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bitcoindev+unsubscribe@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/bitcoindev/8661e699-47d8-46e4-a561-41c5d2fdc0e6n%40googlegroups.com.

[-- Attachment #1.2: Type: text/html, Size: 21196 bytes --]

      parent reply	other threads:[~2026-05-19 11:09 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-05 15:36 [bitcoindev] [BIP Draft] P2P UTXO Set Sharing 'Fabian' via Bitcoin Development Mailing List
2026-05-05 16:01 ` [bitcoindev] " Eric Voskuil
2026-05-06  1:06   ` Antoine Riard
2026-05-07 21:50     ` 'Fabian' via Bitcoin Development Mailing List
2026-05-07 21:34   ` 'Fabian' via Bitcoin Development Mailing List
2026-05-12 15:56     ` eric
2026-05-15 23:08       ` Anthony Towns
2026-05-16  0:58         ` eric
2026-05-16 17:58           ` Saint Wenhao
2026-05-16 21:48             ` 'Fabian' via Bitcoin Development Mailing List
2026-05-17  2:09               ` Eric Voskuil
2026-05-17  8:50                 ` sadiq Ismail
2026-05-17 21:29                   ` Eric Voskuil
2026-05-18  1:36                     ` Eric Voskuil
2026-05-19  8:36                       ` 'Fabian' via Bitcoin Development Mailing List
2026-05-19 23:20                         ` Eric Voskuil
2026-05-16 22:39             ` Eric Voskuil
2026-05-19  9:32 ` josie [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8661e699-47d8-46e4-a561-41c5d2fdc0e6n@googlegroups.com \
    --to=josie@2140.dev \
    --cc=bitcoindev@googlegroups.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox