p2p: UTXO set sharing #35054

fjahr commented at 9:58 PM on April 10, 2026: contributor

This implements a draft BIP for a protocol to share a UTXO set over P2P. Ideally, please review the BIP draft alongside this PR if you already want to look at the code.

Motivation

The motivation is to make it possible to use the assumeutxo feature without having to acquire a snapshot from a third party.

UX

The UX for the user is very similar to how loadtxoutset works today:

The user starts their new node and has to wait until the headers have synced
The user then invokes the RPC downloadutxoset with their selected height
The utxo set download finishes in the background
Upon completion the snapshot is compared to assumeutxo params and if those match, it is activated

Design choices

Capability to serve UTXO sets is signaled with a service bit
The UTXO set is shared in chunks of 3.9 MB
A merkle root of these chunks serves as integrity check to prevent sending useless data as a trivial DoS vector
UTXO set snapshots for sharing are placed in a share folder in the datadir, they are atomically loaded on startup and then shared
A node that downloaded a snapshot will share it themselves until the snapshot has been deleted from their share folder
The merkle root is introduced to the assumeutxo data in the chainparams to ensure peers can not lie to us about it
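
To illustrate the chunking described above, here is a minimal sketch of how a snapshot of a given length could be split into fixed-size chunks. The constant `CHUNK_SIZE` is an assumption: the description only says "3.9 MB", so 3,900,000 bytes is a placeholder. The names `ChunkRange`, `ChunkCount` and `GetChunkRange` are hypothetical and not taken from the PR.

```cpp
#include <cassert>
#include <cstdint>

// Assumed chunk size. The PR description says "3.9 MB" without giving an
// exact byte count, so this value is only a placeholder.
constexpr uint64_t CHUNK_SIZE = 3'900'000;

// Hypothetical helper type: the byte range covered by one chunk.
struct ChunkRange {
    uint64_t offset; // byte offset of the chunk within the snapshot data
    uint64_t size;   // number of bytes in this chunk
};

// Number of chunks needed to cover data_length bytes. Written as
// quotient plus remainder check so the ceiling division cannot overflow.
uint64_t ChunkCount(uint64_t data_length)
{
    return data_length / CHUNK_SIZE + (data_length % CHUNK_SIZE != 0 ? 1 : 0);
}

// Byte range of chunk `index`. Every chunk is full-sized except possibly
// the last one, which holds whatever remains.
ChunkRange GetChunkRange(uint64_t data_length, uint64_t index)
{
    assert(index < ChunkCount(data_length));
    const uint64_t offset = index * CHUNK_SIZE;
    const uint64_t remaining = data_length - offset;
    return {offset, remaining < CHUNK_SIZE ? remaining : CHUNK_SIZE};
}
```

Under these assumptions, each chunk would be hashed into one leaf of the Merkle tree, so a receiver can check a single chunk against the committed root without holding the rest of the file.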

Status

This is certainly still rough around the edges and the merkle roots in the chain params have not all been filled in yet. I am primarily looking for concept and approach feedback as well as potentially overlooked DoS vectors to start.

If you find a DoS vector on the serving side feel free to crash the live demo node below for bragging rights. Just please give me a heads up afterwards :)

Live demo

The feature can be tried out on mainnet:

Build this branch (duh)
Start with a fresh node (datadir) and -connect=178.104.141.103:8333 -debug=utxosetshare. You can also use addnode but this will make it harder to watch the logs since historical blocks are still synced at the same time. Note that since the demo node is pruned, your node will need to be restarted to sync blocks after snapshot validation if you are using -connect.
Wait until the headers have synced, then use bitcoin-cli downloadutxoset 935000
Watch the logs to see the chunks come in and then see the completed snapshot getting activated

net: Add NODE_UTXO_SET service bit and message types

Add NODE_UTXO_SET (bit 12) service flag indicating the node can
serve a UTXO set over P2P.

Also add the P2P message types for UTXO set sharing.

350985da5e
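
As a rough sketch of what signaling with bit 12 means in practice: service flags are a 64-bit field, so the flag is the value `1 << 12`, and a downloading node would test it against a peer's advertised services. The helper name `AdvertisesUtxoSet` is hypothetical and not taken from the PR.

```cpp
#include <cassert>
#include <cstdint>

// Bit position taken from the commit message above. Service flags are a
// 64-bit bit field, so the flag value is 1 << 12.
constexpr uint64_t NODE_UTXO_SET = uint64_t{1} << 12;

// Hypothetical helper: returns true if a peer's advertised service flags
// include NODE_UTXO_SET.
bool AdvertisesUtxoSet(uint64_t services)
{
    return (services & NODE_UTXO_SET) != 0;
}
```

For example, a services value of `0x1009` has bit 12 set, while `0x0409` does not.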

log: Add UTXOSETSHARE category 34fd9f6c3e

consensus: Expose ComputeMerklePath and add merkle proof verification

Allows reuse of ComputeMerklePath() for UTXO set chunk Merkle proofs.

78969203f5
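
The following is a simplified sketch of Merkle path computation and proof verification, not the code from this commit. It assumes the chunk tree follows the same shape as Bitcoin's block Merkle tree (the last node is duplicated on odd-sized levels). `HashPair` is a non-cryptographic 64-bit stand-in for the real hash, used only to keep the example self-contained. All function names here are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using Hash = uint64_t;

// Stand-in hash (FNV-1a over the bytes of both inputs). NOT secure; the
// real protocol would use a cryptographic hash over 32-byte values.
Hash Mix(Hash h, Hash v)
{
    for (int i = 0; i < 8; ++i) {
        h ^= (v >> (8 * i)) & 0xff;
        h *= 1099511628211ULL; // FNV-1a prime
    }
    return h;
}
Hash HashPair(Hash left, Hash right)
{
    return Mix(Mix(14695981039346656037ULL, left), right);
}

// Combine one level of the tree into the next, duplicating the last node
// when the level has an odd number of entries.
std::vector<Hash> NextLevel(std::vector<Hash> level)
{
    if (level.size() % 2 != 0) level.push_back(level.back());
    std::vector<Hash> next;
    for (size_t i = 0; i < level.size(); i += 2) {
        next.push_back(HashPair(level[i], level[i + 1]));
    }
    return next;
}

Hash MerkleRoot(std::vector<Hash> level)
{
    if (level.empty()) return 0;
    while (level.size() > 1) level = NextLevel(std::move(level));
    return level[0];
}

// Sibling hashes from the leaf at `index` up to (but excluding) the root.
std::vector<Hash> MerklePath(std::vector<Hash> level, size_t index)
{
    assert(index < level.size());
    std::vector<Hash> path;
    while (level.size() > 1) {
        if (level.size() % 2 != 0) level.push_back(level.back());
        path.push_back(level[index ^ 1]); // sibling within the pair
        level = NextLevel(std::move(level));
        index >>= 1;
    }
    return path;
}

// Recompute the root from a leaf and its path. The low bit of the index
// at each level says whether the current node is a right or left child.
bool VerifyMerkleProof(Hash leaf, size_t index, const std::vector<Hash>& path, Hash root)
{
    Hash h = leaf;
    for (Hash sibling : path) {
        h = (index & 1) ? HashPair(sibling, h) : HashPair(h, sibling);
        index >>= 1;
    }
    return h == root;
}
```

A receiver holding only the expected root could call `VerifyMerkleProof` on each incoming chunk hash and discard any chunk whose proof does not reproduce that root.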

net: Define P2P message serialization for UTXO set sharing 9a6fd56aea

node: Implement UTXO set share provider

Scans a share directory for valid dumptxoutset snapshot files, validates
them against known assumeutxo parameters and then serves chunks with Merkle
proofs. A sidecar file is generated to cache the necessary data and
signal the file has already been verified in future restarts.

d16e35ac62

init: Scan share folder in datadir for sharable snapshots 3f8e5c0dc9

net: Handle getutxostinf and getutxoset P2P messages ef15e22d58

node: Implement UTXO set download manager

The fetched UTXO set is written to the share dir so the node can
serve the downloaded snapshot afterwards.

a00eebc6b5

rpc: Add downloadutxoset RPC 0943ed1f17

contrib: Add snapshot merkle root tool

Standalone script that computes the merkle root of a
dumptxoutset snapshot file. The output can be used to set
the chunk_merkle_root field in the assumeutxo chain params.

0cda5f60a1

net: Validate chunk Merkle root against assumeutxo chain params

Add chunk_merkle_root to AssumeutxoData so the download manager can
reject peers that advertise a different Merkle root than expected.
This prevents a peer from serving a UTXO set where the merkle root does
not correspond to the UTXO set at the expected height or to the
correct serialized hash.

Check is skipped for now if no value is set for easier testing and
backwards compatibility.

b131adeb25

test: Add functional test for P2P UTXO set sharing

Makes the necessary changes to the functional test framework
like adding constants.

Also tests common failure cases:
- Disconnect on wrong block hash
- Disconnect on wrong height
- Non-serving node doesn't advertise NODE_UTXO_SET
- Non-serving node ignores getutxostinf
- RPC error on downloadutxoset with invalid height
- RPC error on duplicate downloadutxoset calls

3fffb9f085

chainparams: Add missing snapshot merkle roots cf5d5b9533

DrahtBot added the label P2P on Apr 10, 2026

DrahtBot commented at 9:59 PM on April 10, 2026: contributor

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/35054.

Reviews

See the guideline for information on the review process.

Type	Reviewers
Concept NACK	eynhaender, l0rinc, stickies-v, evoskuil, narula, nkaretnikov
Concept ACK	andrewtoth, svanstaa, danielabrozzoni

If your review is incorrectly listed, please copy-paste <code></code> into the comment that the bot should ignore.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#35148 (refactor: Remove confusing DataStream::in_avail() alias by maflcko)
#34860 (mining: always pad scriptSig at low heights, drop include_dummy_extranonce by Sjors)
#34628 (p2p: Replace per-peer transaction rate-limiting with global rate limits by ajtowns)
#34075 (fees: Introduce Mempool Based Fee Estimation to reduce overestimation by ismaelsadeeq)
#33421 (node: add BlockTemplateCache by ismaelsadeeq)
#31974 (Drop testnet3 by Sjors)
#28690 (build: Introduce internal kernel library by sedited)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

LLM Linter (✨ experimental)

Possible typos and grammar issues:

Compute leaf hashes by from snapshot file. -> Compute leaf hashes from snapshot file. [“by from” is a grammatical error that obscures the intended meaning]
the peer is be accepted. -> the peer is accepted. [“is be” is a typo that breaks the grammar]

DrahtBot added the label CI failed on Apr 10, 2026

fanquake added this to a project on Apr 11, 2026

fanquake changed the project status on Apr 11, 2026

andrewtoth commented at 7:50 PM on April 12, 2026: contributor

Concept ACK

Why invoke an RPC, instead of a configuration option (which can be set by default at a later point)?

If ActivateSnapshot fails, the download manager is forever in COMPLETE state. Not sure what we should do in that case. It would mean something is very wrong if our committed merkle root doesn't result in a correct snapshot, or there's a disk error. Probably want to abort?

DoS on downloading side: Server can set utxosetinfo's data_length field to 10 PB, which causes the receiving node to allocate >60 GB immediately. Recommend committing the data_length field alongside the merkle root.

svanstaa commented at 5:30 PM on April 20, 2026: none

Concept ACK

Built the branch, set up fresh data dir, connected exclusively to the demo node (-connect=178.104.141.103:8333 -debug=utxosetshare). Waited for headers to sync, then started the snapshot download: bitcoin-cli downloadutxoset 935000.

Observations:

snapshot was activated and node came up at height 935000
restarted without -connect to reconnect to the broader network
Node synced from the snapshot to current tip normally
Background validation from genesis also started

Overall feature works as advertised.

Also tried to shoot down the test node via several methods, but it survived all attempts. Will do another post later with the details.

DrahtBot commented at 10:19 AM on April 25, 2026: contributor

🐙 This pull request conflicts with the target branch and needs rebase.

DrahtBot added the label Needs rebase on Apr 25, 2026

in src/rpc/blockchain.cpp:3462 in cf5d5b9533

3457 | +        "downloadutxoset",
3458 | +        "Download a UTXO set snapshot from peers through P2P network.\n"
3459 | +        "The download happens asynchronously in the background. Once complete, the snapshot is automatically activated. "
3460 | +        "The snapshot is saved to the share directory inside the data dir, so the node can re-serve it to other peers.",
3461 | +        {
3462 | +            {"height", RPCArg::Type::NUM, RPCArg::Optional::NO, "The block height of the UTXO set to download."},

danielabrozzoni commented at 4:21 PM on April 29, 2026:

I think this is fine for the first implementation, but do you think that in the future we could be smarter and decide the height by ourselves, without the user specifying it? Based on what we have in chainparams, or what we see in utxosetinfo

stickies-v commented at 12:52 PM on May 13, 2026:

This parameter confuses me. Don't we only allow the specific hardcoded CChainParams::m_assumeutxo_data::hash_serialized to be activated? Then why allow the user to specify block height?
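
As a sketch of the idea raised in this thread, the height could in principle be derived from the hardcoded snapshot heights instead of being passed by the user: pick the highest known snapshot height that the synced headers already cover. The function name `PickSnapshotHeight` and its inputs are hypothetical and not part of the PR.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical helper: given the snapshot heights known from chainparams
// and the height of the best synced header, return the highest snapshot
// height that is not above the header tip, or nullopt if none qualifies.
std::optional<int> PickSnapshotHeight(const std::vector<int>& snapshot_heights, int header_tip)
{
    std::optional<int> best;
    for (int height : snapshot_heights) {
        if (height <= header_tip && (!best || height > *best)) {
            best = height;
        }
    }
    return best;
}
```

With this approach the RPC argument would become optional, and an empty result would map to an error such as "headers not yet synced far enough".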

in src/node/utxo_set_share.cpp:263 in cf5d5b9533

 258 | +                     peer_id, entry.merkle_root.ToString(),
 259 | +                     m_expected_chunk_merkle_root.ToString());
 260 | +            continue;
 261 | +        }
 262 | +
 263 | +        m_data_length = entry.data_length;

danielabrozzoni commented at 5:02 PM on April 29, 2026:

@andrewtoth said:

Recommend committing the data_length field alongside the merkle root.

Yes, I think right now if a peer sends us a utxosetinfo with the wrong data length, we would record it and never notice the mistake, since we are currently only processing the first utxosetinfo message that we receive that contains the merkle_root we're interested in (we switch to DOWNLOADING state in L268)

andrewtoth commented at 4:50 PM on April 30, 2026:

It's also a little worse than that currently, maybe I wasn't clear. If they specify a size of say 10 petabytes (or just uint64 max) then the receiving node will allocate 60 GB or more just to track the chunks. So immediate OOM DoS by a malicious server.
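
A minimal sketch of the mitigation recommended above, assuming the expected data length is committed alongside the Merkle root: the advertised length is checked before anything is allocated for chunk tracking. The struct `ExpectedSnapshot`, the function `AcceptAdvertisedLength` and the chunk size of 3,900,000 bytes are assumptions for illustration, not code from the PR.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Assumed chunk size; the PR description only says "3.9 MB".
constexpr uint64_t CHUNK_SIZE = 3'900'000;

// Hypothetical committed parameters for one snapshot. In the suggestion
// above, data_length would live next to the Merkle root in chainparams.
struct ExpectedSnapshot {
    uint64_t data_length;
};

// Returns the number of chunks to track if the peer's advertised length
// matches the committed one, or nullopt if the peer should be rejected.
// Nothing is allocated before this check passes.
std::optional<uint64_t> AcceptAdvertisedLength(uint64_t advertised, const ExpectedSnapshot& expected)
{
    if (advertised != expected.data_length) return std::nullopt;
    return advertised / CHUNK_SIZE + (advertised % CHUNK_SIZE != 0 ? 1 : 0);
}
```

Taking 10 PB as 10^16 bytes and the assumed chunk size, an unchecked length would imply roughly 2.56 billion chunks to track, which is where the memory blow-up described above comes from.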

danielabrozzoni commented at 4:45 PM on April 30, 2026: member

Concept ACK!

I'm slowly going through the BIP and the code, haven't tested yet :) I left a couple of questions.

evoskuil commented at 4:44 PM on May 5, 2026: none

Concept NACK. One is free to add trust vectors to his own node implementation. Shoving that into the p2p network is not acceptable.

fjahr marked this as a draft on May 5, 2026

fjahr commented at 10:57 PM on May 5, 2026: contributor

Temporarily moving this to draft status since it's out of sync with the latest version of the BIP and I will need a few days until I can get to it and catch up the code.

eynhaender commented at 2:53 PM on May 6, 2026: none

Concept NACK

This PR violates "don't trust, verify" by introducing a new protocol-level dependency on trusted, hardcoded data (Merkle roots in chainparams) for accepting a UTXO set snapshot from untrusted P2P peers.

l0rinc commented at 1:05 PM on May 13, 2026: contributor

I'm also leaning towards a concept NACK, I'd rather see a proposal to completely remove assumeUTXO from our codebase. Barely anyone uses it (I'm one of the torrent seeders), it downloads more total data than a normal IBD, and the dual chainstate complicates every code path that touches validation - reorgs, wallet rescanning, pruning, the *DANGER suffixed methods. All for a short-lived security downgrade that users may not even realize they're operating under. Adding P2P distribution on top makes this even more complex without addressing the fundamental issues.

stickies-v commented at 1:24 PM on May 13, 2026: contributor

I'm increasingly convinced that AssumeUTXO and having multiple chainstates was directionally a mistake for this project. It adds a ton of validation complexity, and the benefits to me seem limited. It also seems like more of a wallet than a node concern.

For that reason, I'm Concept NACK any changes that add more AssumeUTXO complexity.

fjahr commented at 9:52 PM on May 16, 2026: contributor

@evoskuil @eynhaender Your NACKs are noted but I think your critique is better addressed by my responses on the mailing list. I would like to keep the conversation here focussed on the code being added here and its impact on the code base. If you disagree and have specific critique based on the code here please continue to respond here, of course.

fjahr commented at 10:08 PM on May 16, 2026: contributor

@l0rinc @stickies-v

Thanks for sharing your thoughts. You make almost identical points, so I will respond to you both together.

Barely anyone uses it (I'm one of the torrent seeders)

and the benefits to me seem limited

I don't think the torrent seeders are a good indication for usage but certainly it isn't where we hoped it would be when we added the support for it. This proposal addresses that, projects like prepackaged nodes and BTCPayServer have had AssumeUTXO on their wishlists for a long time but since implementation was much harder without this change, they have kept pushing it back.

the dual chainstate complicates every code path that touches validation

I'm increasingly convinced that AssumeUTXO and having multiple chainstates was directionally a mistake for this project.

It adds a ton of validation complexity

I would say this was probably true when the initial assumeutxo PR was merged. But since then a ton of cleanup has been done and I find the current state of the assumeutxo code quite easy to reason about. Can you point at specific projects or PRs that have been slowed down by assumeutxo being present in the project?

I'm Concept NACK any changes that add more AssumeUTXO complexity.

Adding P2P distribution on top makes this even more complex

Even if assumeutxo adds notable complexity to this project's other code paths this PR certainly does not make it worse. This is the most self-contained piece of code I have ever worked on in Bitcoin Core I think. It barely touches any other lines of code outside of what it adds.

It also seems like more of a wallet than a node concern.

@stickies-v Can you explain how a wallet would implement this?

that users may not even realize they're operating under.

@l0rinc Please provide evidence for this claim.

without addressing the fundamental issues

@l0rinc What are "the fundamental issues"?

danielabrozzoni commented at 2:04 PM on May 18, 2026: member

Expanding a little bit on my initial concept ACK:

I think assumeutxo is a useful feature, and sharing the UTXO set over p2p makes the UX more intuitive, since users don't need to get it through external channels.

That said, I think it's more important to work on ways to speed up initial IBD, like SwiftSync. I'm also wondering how many contributors see these two projects as competing with each other: maybe we can realistically have one but not the other, because supporting both would make the validation code too complex and spread contributor time too thin.

So my concept ACK was a bit uninformed. It was more like: "I really like the idea in theory, but I don't know whether this should be the direction the project takes"

Barely anyone uses it (I'm one of the torrent seeders)

I wonder why this is the case!

Is it simply not a well-known feature among users? I honestly didn't know that assumeutxo was already merged in Bitcoin Core until a few months ago, and I found out only because I saw some assumeutxo PRs :sweat_smile:
Maybe it seems dangerous to download utxoset from torrent?
A lot of users I know run Bitcoin Core together with an Electrum server, so they can use Sparrow or Electrum as wallets. IIUC, you need to complete full IBD before the Electrum server is usable, so maybe assumeutxo just isn't useful for many of them.

stickies-v commented at 3:59 PM on May 18, 2026: contributor

Can you point at specific projects or PRs that have been slowed down by assumeutxo being present in the project?

In my experience, many PRs involving validation logic are made more complex simply by having to reason about multiple chainstates, e.g. this is the most recent example that comes to mind.

It's also generally slowing down kernel development, for example forcing us to expose the ChainstateManager in the interface until further (non-trivial) decoupling work is completed. (I recognize that the historical AssumeUTXO work probably also improved certain aspects wrt increased decoupling etc).

I recently came across this branch that removes AssumeUTXO. I didn't author it and I presume it's not heavily reviewed / may be incorrect, but glossing over it it seems directionally correct and I think it again highlights how AssumeUTXO is a significant code patch, still affecting lots of critical code. If we prioritize avoiding consensus failure, making consensus code trivial to understand by current and future contributors is an important step.

Can you explain how a wallet would implement this?

Let a new user get started quickly with increased trust assumptions, by e.g. running an SPV node or trust (a quorum of) third parties with clear disclaimers until the full node has fully validated. Obviously, I would prefer users to be able to use a full node as soon as possible, my point is that I think the current set of trade-offs for AssumeUTXO don't make sense. I'm not against the concept of AssumeUTXO, I think it's reasonable to let users choose to make certain trade-offs for e.g. faster startup times. However, I don't think it's wise to add this much code complexity to do so, especially when it can be handled by higher-level layers in ways that make sense for the use case.

josibake commented at 10:46 AM on May 19, 2026: member

I don't think the torrent seeders are a good indication for usage but certainly it isn't where we hoped it would be when we added the support for it. This proposal addresses that

Why aren't torrent seeders a good indication of usage? And the fact that usage isn't where we hoped it would be is a metric that should be seriously considered before building further on the project. "People aren't using it because it's not distributed via p2p" seems like a bit of a leap to me. As I mentioned on the mailing list, why can't prepackaged nodes just ship the snapshots directly, provide them via their websites, or get them via bitcoincore.org?

fjahr commented at 11:07 AM on May 19, 2026: contributor

Why aren't torrent seeders a good indication of usage?

Because it would require projects like prepackaged nodes to add Torrenting as a dependency to their nodes, which seems like a terrible idea to me and I don't think it's something they are interested in doing.

As I mentioned on the mailing list, why can't prepackaged nodes just ship the snapshots directly, provide them via their websites, or get them via bitcoincore.org?

I haven't seen a response from you on the ML but this is what BTCPayServer has been doing with FastSync for example. But of course we would like them to not do that but rather use assumeutxo because it is safer. The projects I have spoken with say that they have it high on their wishlist but they were just not able to get to it yet.

josibake commented at 11:20 AM on May 19, 2026: member

Because it would require projects like prepackaged nodes to add Torrenting as a dependency to their nodes, which seems like a terrible idea to me and I don't think it's something they are interested in doing.

This feels like a very subjective response to me. If people really needed assumeutxo, as in it was solving a real problem for them, I would expect people to use whatever means are available to them. That then creates momentum to further support the project. It seems like the opposite is happening: no one is using it, and we keep building on it trying to get people to use it.

I haven't seen a response from you on the ML but this is what BTCPayServer has been doing with FastSync for example. But of course we would like them to not do that but rather use assumeutxo because it is safer.

Likely still pending. But I don't understand how this is safer? BTCPayServer gets a snapshot, likely from a fully validating node they own? This seems safer, trust wise, than getting it from the p2p network to then be compared against a hash they did not create. But regardless, we already have a hash snapshot in the binary, why can't BTCPayServer just download the snapshot from bitcoincore.org? If it doesn't match the hash in the binary, they don't use it. This is exactly the same trust model you are proposing here, without requiring us to change our p2p code.

l0rinc commented at 1:53 PM on May 19, 2026: contributor

First, I want to separate a few things: I appreciate the work @fjahr is doing here, and I think the implementation itself looks careful and very well done. I also agree that startup cost is a real problem - it is why I have been working full time on validation optimizations for years. I also do not think assumeUTXO is useless: I use it all the time for benchmarking serialization and UTXO-set related code - even if that use case is not enough to justify the feature by itself.

My objection is that assumeUTXO makes the most critical part of the codebase harder to read and reason about, and this PR extends that model further. That is the part I do not think is justified. It looks like sunk cost reasoning: we added assumeUTXO, usage is not where we hoped, and now we keep adding the next missing piece hoping this one will finally make it broadly adopted.

Concrete examples, since you asked:

In #32975 (review), a simple assumevalid logging change became chainstate-specific. A global/static state would have produced confusing output with assumeUTXO, because one chainstate can be validating new blocks while the background chainstate is still in IBD. The end result was that, for simplicity, warnings are not generated during assumeUTXO.
#31703#pullrequestreview-2566967750 is another example. Pieter's PR was not primarily about assumeUTXO, but it exposed an assumeUTXO-specific path where the warning did not trigger the way I expected. Pieter was focused on the main change rather than the snapshot-specific path, so I ended up taking over the assumeUTXO part to make sure the warning covered that path as well. This blocked the review path for months.
#34521#pullrequestreview-3761920357 was an ordinary UB fix in LoadChainTip, but review found an extra issue because assumeUTXO can have two Chainstates, each with its own setBlockIndexCandidates. That is not theoretical complexity; it affected a concrete validation PR.
#33259 is a user-facing example. getblockchaininfo could report verificationprogress=1 and initialblockdownload=false while background validation was still running. That required adding explicit backgroundvalidation state.
#28598 and #28616 show the same issue from the wallet side. We had to be careful not to present transactions as normally confirmed while background validation was incomplete.
#32029 (comment) is another example of this being confusing even to developers. After a snapshot had supposedly been fully validated, I expected all traces of assumeUTXO seeding to be gone, but restart behavior still looked assumeUTXO-specific.
#32377 (comment) is related too. If we want the baked-in snapshot hashes to be more defensible, we could theoretically harden them by checking the committed assumeUTXO values during normal IBD/reindex/reindex-chainstate as well. That would make the commitments less one-sided. But that idea did not get much traction. So we are adding more functionality on top of these commitments before doing the obvious consistency checks during ordinary validation.

None of these are catastrophic individually. The problem is that the complexity accumulates in validation-adjacent code, where readability and reviewer confidence matter most.

On the claim that users may not realize the temporary security downgrade: the evidence is not a user study. The evidence is that we already needed extra RPC and wallet state to avoid misleading users and developers. Before background validation finishes, the node is in a confusing dual state: it is validating current transactions on top of an assumed base state, while we still cannot say that this node has independently validated that base state. That may be an acceptable tradeoff for some users, but it is not the same security model as a normally synced full node.

So when I say "fundamental issues", I mean:

The node is temporarily operating in a weird partial-validation state: it is not providing the same security as a normally synced full node until background validation completes.
The dual-chainstate model has a continuing maintenance cost in validation-adjacent code.
Demand still seems weak relative to the complexity. Torrent usage may not be a perfect metric, but it is a smell.
This does not solve the bandwidth- or CPU-constrained cases in the way it is sometimes presented. If background validation completes, the user still downloads the UTXO snapshot plus historical blocks, so total bandwidth is higher than normal IBD. For CPU-constrained users, moving validation to the background does not remove the work. It just makes the node usable earlier under an assumption. But you all know this.

If the gain is mostly that a normal node avoids waiting overnight, I do not think that justifies extending this model further into P2P and chainparams. If the user is genuinely CPU- or bandwidth-starved, then the historical work still takes a long time in the background, and the user remains in this purgatory state during that period.

For a bandwidth-constrained user, downloading no UTXO set is better than downloading a multi-GB UTXO set. For a CPU-constrained user, putting the CPU work in the background does not remove it. If the problem is really the UTXO set or the cost of keeping state, then UTXO-less or accumulator-based experiments such as Utreexo seem like a cleaner direction to explore. They have their own tradeoffs, but they attack the state problem directly instead of adding P2P transport for assumed state blobs.

My objection is to extending assumeUTXO further into P2P and chainparams because I do not see why assumeUTXO is the right solution to that problem. I would rather see it removed and provided as an external service; I do not think it is "Core".

fjahr commented at 8:05 PM on May 19, 2026: contributor

@josibake

no one is using it

That is not the case, and I don't think such exaggerations are helpful for the discourse.

Some prepackaged nodes or node setups do use it:
https://github.com/Start9Labs/bitcoin-core-startos/blob/31.x/README.md
https://github.com/FreeOnlineUser/bitcoin-pocket-node#-path-3-download-from-internet-assumeutxo-3-6-hours

And there is quite some educational content available as well:
https://blog.lopp.net/bitcoin-node-sync-with-utxo-snapshots/
"Sync your full node in an hour" workshop @ https://btcplusplus.dev/ba24/talks
https://yourdevice.ch/fullnode-schneller-syncen-mit-utxo-snapshots/

So I do think there is clear evidence that it is used, but it could be used more, and part of the barrier is the sourcing hoop that users have to jump through, which is the very thing this PR tries to fix. We don't have any telemetry on Bitcoin Core so the "nobody uses it" claim can probably be made about 90% of user-facing features that are not signaled on the network itself.

I don't think this line of arguing will lead anywhere. If I showed evidence of more usage, you would simply flip your argument that this is evidence the change isn't needed because adoption is good enough. There will be no usage statistic that will convince you that this change is warranted.

What actually matters in terms of usage is the following: Adoption could be better, and lowering the barrier for its usage will undoubtedly have a positive impact on it. Whether that justifies adding a sizable chunk of code (while very self-contained) is the question reviewers have to decide on. I am rather focused on the conditions that people try to set up nodes in and struggle with long IBD wait times and what AssumeUTXO could do for them if it were more accessible to them.

But I don't understand how this is safer? BTCPayServer gets a snapshot, likely from a fully validating node they own? This seems safer, trust wise, than getting it from the p2p network to then be compared against a hash they did not create.

You think an unverified utxo set dump from a single HTTP server, where it's unclear how safe it is, who hosts it, and who has access, is safer than this proposal? You also want their users to get the actual bitcoin core release build and verify the signatures, right? Not some custom bitcoin core build from BTCPayServer.

There are several websites that provide snapshots if you are looking for a centralized provider, by the way:
https://bitcoin-snapshots.jaonoctus.dev/
https://utxo.download/

I would rather not send users to a centrally hosted server where they only know if the snapshot is valid when they have already loaded it into Bitcoin Core.

But regardless, we already have a hash snapshot in the binary, why can't BTCPayServer just download the snapshot from bitcoincore.org? This is exactly the same trust model you are proposing here, without requiring us to change our p2p code.

Because bitcoincore.org doesn't host the snapshot. And I don't think we want to do that because it's another server we have to maintain, a centralizing entity and a single source of failure. But if this is something you would like to pursue as an alternative approach to help adoption of AssumeUTXO, of course you can. To me personally, this seems like a lot more work than maintaining the code I am proposing here.

yancyribbens commented at 11:30 PM on May 19, 2026: contributor

You think an unverified utxo set dump from a single HTTP server, where it's unclear how safe it is, who hosts it, and who has access, is safer than this proposal?

Why would that be the case? Even if true, it could easily be https/ssh/ftps or other encrypted internal mechanism, possibly even distributed.

narula commented at 4:47 PM on May 20, 2026: contributor

Concept NACK, for the same reasons @l0rinc has articulated very well here: #35054 (comment)

side note: I wonder if there are concrete suggestions for how one might better collect this reasoning before doing a lot of work on a feature that many don't highly prioritize, and which has real code complexity impact.

sipa commented at 7:14 PM on May 20, 2026: member

Before venturing into a general concept ack or nack, I want to offer a perspective on a different path forward. I don't know what the right path is, but framing it this way may help us figure out where fundamental costs/benefits lie.

I want to start with two viewpoints I haven't often seen in this debate:

P2P sync is a fundamental part of assumeutxo. We cannot judge usefulness, interest, or adoption, of assumeutxo without P2P sync. assumeutxo is an optimization whose intention is lowering the bar for users to have fully validating nodes. As such, it competes through ease of use with alternatives that have worse trust assumptions, such as lightweight validation, and just downloading a precreated chainstate directory from a possibly unreliable trusted source. Without P2P sync, the user experience of assumeUTXO doesn't compare with those alternatives, and it is entirely unsurprising that it is not used widely. It needs to work out of the box, or we shouldn't have it at all.
Background re-validation is an anti-feature. From the perspective of an end user, who is already trusting the software and the process that got it to them, the assumeutxo hash in the code is trusted already (because if it isn't, then that process might equally well remove critical consensus checks in validation code). They are not operating in a reduced partial level of validation before background re-validation is complete. The background re-validation adds a huge amount of bandwidth and processing, and simultaneously complicates our validation code significantly. It's there to keep the process honest, which all things equal is a good thing, but it's very bizarre to burden the users with what is effectively an extremely expensive mandatory self-check of the software they already trust.

So my question is: if we had the option of an assumeutxo with P2P sync, but without background re-validation, would that tilt the scales for commenters here, in one way or the other? The reduction in bandwidth, computation, and validation code complexity may make it more appealing. The fact that it intricately exposes how critically reliant on review the assumeutxo hash is (like all the rest of the code) may make it less appealing (and if it does to you, that's probably an argument against assumeutxo in the first place).

Note that while I believe there is no categorical difference in trust assumptions between trusting the code and trusting the assumeutxo hash, that does not make them equal in all regards, and does not mean they should be equally acceptable to everyone. For one, reviewing the assumeutxo hash has a very different cost profile (it requires syncing a node without assumeutxo). Another is that from a trust minimization perspective, a need to trust the (process around) code is inevitable, while trust in a trust anchor for assumeutxo is not. Yet another one is education, because it makes it unclear to users (and developers, see this thread and the one on the ML...) what the trust assumptions exactly are.

Regarding synchronization off bitcoincore.org instead of P2P: I really like that from an education perspective, as it aligns the data source with the trust assumptions. However, it is also completely unacceptable for other reasons, like DoS risks and logistics of operating the infrastructure.

evoskuil commented at 8:31 PM on May 20, 2026: none

assumeutxo is an optimization whose intention is lowering the bar for users to have fully validating nodes.

It may be the intention, but it is not the reality. As has been pointed out here, it increases this cost.

re-validation is an anti-feature... They are not operating in a reduced partial level of validation before background re-validation is complete.

The attempt to re-brand not validating as "re-validating" is noted.

Assume utxo operates under the exact same security model as SPV, with notable deficiencies.

startup cost is much higher (big download).
wallet complexity is pushed into nodes (e.g. dual chainstate, see thread).
inclusion proofs are not available for any tx supposedly confirmed within the assumption window.
trust must be established in the assumption in order to prevent very costly DoS.
degrades the formerly trustless p2p network.

It has been anticipated for many years that degrading performance would lead to these exact calls to abandon validation. If the argument is now that trust-me-bro is sufficient because you need to trust the software anyway, this project has failed.

stickies-v commented at 8:41 PM on May 20, 2026: contributor

P2P sync is a fundamental part of assumeutxo. ... It needs to work out of the box, or we shouldn't have it at all.

I wonder if most users who are (or would be) interested in AssumeUTXO install Bitcoin Core directly, or through node management software (Umbrel, etc) that focuses on one-click install (and/or hardware to go along with it). I kinda suspect it'd be the latter, as the focus there is on gaining convenience at the cost of increased trust assumptions? If so, then I'm not sure your statement is true: node management software can easily hide the overhead of downloading the txoutset (e.g. from their website)? (Of course, they could already achieve something similar by e.g. shipping datadirs.)

So my question is: if we had the option of an assumeutxo with P2P sync, but without background sync, would that tilt the scales for commenters here, in one way or the other?

I could be open to keeping AssumeUTXO without background sync around (depending on remaining complexity, and user demand), so it'd definitely tilt the scales somewhat more positive for me. However, I think I still would not support adding p2p set sharing to Bitcoin Core. In my view, people running fully validated full nodes should remain the default, and I don't think we need to go out of our way to facilitate the alternative. However, I would not strongly oppose it if there is widespread support amongst contributors and demand from users.

sipa commented at 9:33 PM on May 20, 2026: member

It may be the intention, but it is not the reality. As has been pointed out here, it increases this cost.

I agree. However, I believe that most of that added cost is due to background validation of chain prior to the pre-assumeutxo sync point.

The attempt to re-brand not validating as "re-validating" is noted.

I refer to the current approach of background validation of the chain prior to the pre-assumeutxo sync point as re-validation, because it is validating something that is already trusted to be correct anyway. I am arguing for - at least as a thought experiment - dropping that re-validation. That indeed means not validating it.

Assume utxo operates under the exact same security model as SPV

I don't see how. SPV is trusting hashrate majority to commit to a consensus-valid chain. Assumeutxo is trusting the development, review, and distribution process of the software you are using to only commit to a utxo set matching a particular block hash. Note that I am not claiming that that makes it a better (or worse) idea, just that categorically it belongs to the class of software-review assumptions and not hashrate assumptions.

startup cost is much higher (big download).

With re-validation, yes. Without it, it's replacing the pre-assumeutxo part of the chain with just its UTXO set.

wallet complexity is pushed into nodes (e.g. dual chainstate, see thread).

The dual chainstate is only there for re-validation. I am suggesting not doing that.

inclusion proofs are not available for any tx supposedly confirmed within the assumption window.

That seems like an implementation aspect. Nothing prevents using similar techniques as SPV for transactions in the window if the user cares about them (filtering, inclusion proofs w.r.t. the known and assumed-correct chain).

trust must be established in the assumption in order to prevent very costly DoS.

Indeed.

degrades the formerly trustless p2p network.

The network is neutral in this regard. Many features of the network can be used in ways that minimize trust or not. For example, one can download blocks from an arbitrary node without any validation whatsoever. Or one can trust the hashrate majority like SPV does. Other features, like IP address relay, involve fundamentally unverifiable information to begin with, because there is no better option.

Trustlessness is a property of the node software, not the network it connects to. Individual nodes participating in behavior that you find objectionable in no way degrades the usefulness of the network to you, as long as there are no compatibility issues.

Let me be clear that I value your opinion and the discussion about the assumeutxo feature here. There are many interesting points to be made about its trust assumptions, engineering tradeoffs, and overall usefulness. However, this one in particular seems like a bizarre purity argument, and I don't think it has a place in a technical debate.

It has been anticipated for many years that degrading performance would lead to these exact calls to abandon validation.

Assumeutxo is a pragmatic trade-off that exchanges validation time for a trust assumption, period. This is a trade-off that people are already making, and have been making forever, in the form of offering pre-packaged chainstates. This is what motivated assumeutxo in the first place, because having the utxo set hash reviewed by the same process you are already trusting for getting the consensus code implementation correct is better than having to additionally also trust j.random.website.com. And it's in my view - though I suspect you may disagree - also better than using a permanently lighter weight validation mode.

It's entirely fair to dislike the fact that people make such trade-offs, but they are, and I don't think dislike for reality is a reason to argue against an improvement of the status quo. There are however many other arguments against assumeutxo, like engineering costs and improvements to the state of the art in validation that don't need assumptions. I also listed others above, including the confusion about security assumptions it causes. I think getting rid of the background re-validation actually improves upon that, by ripping off the facade that makes it seem like something better than a trusted hash in the source code. I would have thought you'd appreciate that, actually.

evoskuil commented at 10:48 PM on May 20, 2026: none

It may be the intention, but it is not the reality. As has been pointed out here, it increases this cost.

I agree. However, I posit that most of that added cost is due to background validation of chain prior to the pre-assumeutxo sync point.

I don't see how that is relevant.

The attempt to re-brand not validating as "re-validating" is noted.

I refer to the current approach of background validation of the chain prior to the pre-assumeutxo sync point as re-validation, because it is validating something that is already trusted to be correct anyway. I am arguing for - at least as a thought experiment - dropping that re-validation. That indeed means not validating it.

"validating something that is already trusted to be correct anyway"

Trusted is not validated. You may be aware of the phrase, "don't trust, validate."

Assume utxo operates under the exact same security model as SPV

I don't see how. SPV is trusting hashrate majority to commit to a consensus-valid chain.

Inclusion of a tx is the only aspect of the design that is validating, exactly as with SPV. The author has made this specific argument on bitcoin-dev that this is why it's secure.

"An incoming payment can only be confirmed in a mined block on the headers-validated chain. For an attacker to trick the user into accepting a transaction that spends UTXOs which exist only in a malicious snapshot, the majority of mining hashpower would have to be running nodes that accepted and continued to run based only on the same malicious snapshot." - Fabian

Assumeutxo is trusting the development, review, and distribution process of the software you are using to only commit to a utxo set matching a particular block hash. Note that I am not claiming that that makes it a better (or worse) idea, just that categorically it belongs to the class of software-review assumptions and not hashrate assumptions.

When we refer to security we are specifically not referring to trust. The security of this feature (as Fabian describes above) is SPV. Beyond that there is trust. We can wax philosophical about the need to trust the software, but if one accepts this argument - the project has failed.

Trustlessness is a property of the node software, not the network it connects to.

Let's not be pedantic. The proposal enlists the P2P network into the distribution of multi-gigabyte blobs that cannot be validated without validating the entire chain.

Individual nodes participating in behavior that you find objectionable in no way degrades the usefulness of the network to you, as long as there are no compatibility issues... seems like a bizarre purity argument, and I don't think it has a place in a technical debate.

This statement justifies any imaginable use of the P2P network on the basis that others can ignore it. This is not a serious argument when the specific question is what should or should not be standardized network behavior.

It has been anticipated for many years that degrading performance would lead to these exact calls to abandon validation.

Assumeutxo is a pragmatic trade-off that exchanges validation time for a trust assumption, period.

Period or not, engineering tradeoffs have consequences.

because having the utxo set hash reviewed by the same process you are already trusting for getting the consensus code implementation correct is better than having to additionally also trust j.random.website.com.

This restates my summary of what you previously said. I don't see any reason to keep repeating it.

Bitcoin is based on a simple principle - validation. If you don't want to validate, and instead just trust the bros (who might as well just be j.random.website.com), you do not have to. But standardizing that as a matter of protocol is inherently broken.

also better than using a permanently lighter weight validation mode.

This is NOT a stronger "validation mode". It is the SPV validation mode. There are only two actual modes of Bitcoin validation, and it has been that way since the paper. You are arguing for the permanent use of one as if it was the other. To think we all fought against these arguments a decade ago.

Furthermore it's more costly, and requires a to-be-determined trust chain, from bitcoin-dev:

"The BIP intentionally leaves the source of the Merkle root to the implementation. The protocol's job is to enable transferring and verifying UTXO data once a root is known, not to dictate how each implementation establishes that root." - Fabian

It's entirely fair to dislike the fact that people make such trade-offs but they are, and I don't think dislike for reality...

Do not put words in my mouth. I never expressed any dislike for such tradeoffs. I rejected flawed arguments and standardizing this into the P2P protocol. You may have noticed that I pointed out that people already widely utilize SPV (which we support via a native electrum interface).

I think getting rid of the background re-validation actually improves upon that, by ripping off the facade that makes it seem like something better than a trusted hash in the source code. I would have thought you'd appreciate that, actually.

I do appreciate the fact that someone is finally being honest about the intent. Just 13 days ago on bitcoin-dev there was a rejection of the slippery slope argument, made by the author:

"I can not refute critique of something that is not part of this proposal except for pointing out that what you are insinuating is not something I am working on or plan on working on and I am not aware of anyone working on skipping IBD and I would not endorse such a proposal if it were to be published. In contrast to some hypothetical dangerous future extension of this proposal that you are warning about, I am convinced that it does have real positive impact on users today, as I pointed out above." - Fabian

Didn't even take 2 weeks.

fjahr commented at 11:34 PM on May 24, 2026: contributor

So, @evoskuil is already getting anxious on the mailing list that I haven’t posted my view here again in the context of @sipa's suggestion, despite himself making sure that my opinion is already present here by quoting me. Last time I checked, not posting didn’t mean that one’s opinion has quietly taken a 180, but that’s where this conversation is now, I guess.

To make it clear again and prevent him from continuing to put words into my mouth: I don’t find AssumeUTXO interesting if we don’t fully validate IBD in the background. As I have said several times already, I find it an interesting feature because it makes it quicker and easier to get started with running a full node, in contrast to losing more and more users to custodial solutions. It takes the time needed to run IBD out of the equation. Many people I spoke with back when AssumeUTXO was actually implemented made clear to me that this would be useful, for example, in convincing merchants to run their own full nodes and in getting people set up with pre-packaged full node projects and setup tutorials through a much nicer experience. These effects are reinforced in the typical circumstances found in developing regions. If AssumeUTXO turns into a pruned-only feature without full background sync, I don’t think that’s valuable because that’s not what the users I am thinking of actually want.

nkaretnikov commented at 12:16 AM on May 25, 2026: none

Concept NACK.

First, people disagree on the adoption rate of assumeUTXO, which this proposal builds on. So the usefulness of the proposal is also in question.

It was said that users are not using torrents, which is the closest existing thing to including this in the P2P network. So either this is not used at all, or users just download the data from a different source. If this is not used, then no proposal is needed. If they use a different data source, then the solution already exists. But somehow the counterargument to that is that users are using untrusted sources. However, the BIP doesn’t address the trust argument, it just changes the distribution method.

On adoption: including this in the P2P network also doesn’t automatically help with adoption because the proposal is opt-in. Someone has to volunteer to publish the data first, and why would they if there is a simpler way of solving this?

With all that uncertainty, this proposal comes with a bunch of concrete downsides: trust assumptions, implementation complexity leading to longer reviews and potential bugs, data usage.

Finally, I wish more projects followed OpenBSD’s lead and actively removed unused code. So I would encourage removing assumeUTXO support if it’s not useful, because almost always more code means more bugs.

evoskuil commented at 12:58 AM on May 25, 2026: none

Last time I checked, not posting didn’t mean that one’s opinion has quietly taken a 180, but that’s where this conversation is now, I guess.

You favorably referenced the very post that proposed exactly what you had dismissed to me as a "slippery slope" argument, without so much as mentioning that you didn't support it. That's called implicit support in any universe.

So now, despite the fact that we slipped all the way down that slope in less than two weeks, I see that above you at least "don't find that valuable". That's great to hear.

You are welcome to do it in your node, but this is not something that should be even jokingly considered for protocol standardization by the community.

josibake commented at 9:34 AM on May 28, 2026: member

I realise this PR has now drifted to a more general discussion of AssumeUTXO, but I'd prefer to respond here as this is where the relevant points I want to respond to are. I am also happy to continue this discussion in another place, if that is deemed more appropriate.

To start off, I realise I jumped in hastily before without fully letting my thoughts marinate. I had a strong reaction to the P2P proposal not because I think this change in isolation is a bad proposal, but rather because I think that AssumeUTXO is a bad proposal and adding the P2P mechanism on top of it is building on an unsound foundation. Apologies for not clearly articulating my point of view.

I do agree with @sipa's point that P2P is necessary for AssumeUTXO to compete with "less good" alternatives. But to me, this feels like the tail wagging the dog. I do not think we should be designing features to compete with users doing a "bad thing." We should be designing features that are technically sound and ideologically aligned with the goals of this project.

The two main reasons I've seen given for AssumeUTXO (I'm not commenting on whether or not I think they are good reasons, only that they seem to be the most commonly quoted) are that full IBD is slow and/or that full IBD requires too much data.

For slow IBD, I think we are nowhere near the limit and have plenty more work to do towards optimising this. @evoskuil's libbitcoin-node is a testament to this. I applaud the efforts of the developers who have been doing work in this area. It is my personal view that this area of work is of utmost importance if we expect people to run their own nodes.

For bandwidth concerns, I think this is an area where more work can be done, as well. Some time ago there was a proposal for witness pruning which claimed something on the order of 40% bandwidth savings and a trust model that I consider vastly superior to AssumeUTXO. There has also been some discussion on block and transaction encodings (though I am less familiar with this area).

I have mentioned several times in the past that I think these areas need to be explored more seriously before considering things like AssumeUTXO. I also am not fully convinced having users bootstrap from an unverified hash is the same as trusting the code of the node software. It seems easier to me to have a malicious hash introduced while leaving the majority of the critical code untouched, vs modifying consensus code to produce a malicious UTXO set whilst escaping detection.

Regardless, I think my main point is there is still plenty of work on the table to make full verification optimised and viable for the greatest number of users.

yancyribbens commented at 9:23 PM on May 28, 2026: contributor

I also am not fully convinced having users bootstrap from an unverified hash is the same as trusting the code of the node software.

This to me is the elephant in the room. Would you feel the same way if the node software was from a different source? For example, it's now possible to run alternate node implementations using the kernel lib. And in such a case, what would your level of trust be then?

ajtowns commented at 7:20 AM on June 5, 2026: contributor

To start off, I realise I jumped in hastily before without fully letting my thoughts marinate. I had a strong reaction to the P2P proposal not because I think this change in isolation is a bad proposal, but rather because I think that AssumeUTXO is a bad proposal and adding the P2P mechanism on top of it is building on an unsound foundation. Apologies for not clearly articulating my point of view.

I think this PR is pretty clearly aimed at "make AssumeUTXO more useful", and that a conversation about "AssumeUTXO is actively harmful and shouldn't be supported" should be held elsewhere (eg on the mailing list if you want to discourage users from using it, or as a new issue if you think it should not even be a supported option).

While we have AssumeUTXO in the codebase, I think having PRs that make it more useful is a good thing.

The background re-validation adds a huge amount of bandwidth and processing, and simultaneously complicates our validation code significantly. It's there to keep the process honest, which all things equal is a good thing, but it's very bizarre to burden the users with what is effectively an extremely expensive mandatory self-check of the software they already trust.

I don't think this is actually very expensive from a user's point of view -- it's just a bunch of extra processing happening in the background that doesn't delay them from using their node, which is fine unless their node is very resource constrained. It is somewhat expensive from a development/maintenance POV, but if the feature's useful, that can be okay.

I think the self check is fairly valuable for keeping everyone honest. We could probably simplify the code by changing the way we do it though: eg, running the check-from-genesis as a separate process with separate p2p connections and a separate debug.log, then, if we reach the target block and end up with a different utxo set, just abort and display an error with recovery instructions, rather than doing an automatic reorg. (In that model, perhaps you could request a utxo-snapshot-set-revalidation at arbitrary times when you switch to new node software, even if you're not using assumeutxo yourself, making the "self-test" behaviour more widely available)
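
As a rough illustration of the "compare at the target block, then abort instead of reorging" step in that model, here is a minimal sketch. It is not Bitcoin Core code: the names `RevalidationOutcome` and `check_revalidation`, the string-typed hashes, and the message wording are all hypothetical, and how the separate from-genesis process would actually compute its UTXO set hash is left out entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RevalidationOutcome:
    """Result of comparing the from-genesis UTXO set hash with the assumed one."""
    matches: bool
    message: str


def check_revalidation(target_height: int, assumed_hash: str, computed_hash: str) -> RevalidationOutcome:
    """Hypothetical final step of a separate check-from-genesis process.

    assumed_hash:  UTXO set hash the node has been running on (assumption: hex string)
    computed_hash: UTXO set hash the from-genesis check arrived at for the same block
    """
    if computed_hash == assumed_hash:
        return RevalidationOutcome(
            True, f"UTXO set at height {target_height} matches the assumed snapshot."
        )
    # On mismatch there is no automatic reorg: report the error and let the
    # caller abort and point the user at recovery instructions.
    return RevalidationOutcome(
        False,
        f"UTXO set mismatch at height {target_height}: "
        f"expected {assumed_hash}, got {computed_hash}. "
        "Aborting; see recovery instructions.",
    )
```

A caller would stop the node on a mismatch, for example with `raise SystemExit(outcome.message)` when `outcome.matches` is false, rather than attempting to switch chainstates automatically.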