
[BIP Draft] Segregated Data: a prunable, script-isolated block region for data carriage

An archive of delvingbitcoin.org

MrHash · #1 ·

Dear all,

I’ve drafted companion BIPs for Segregated Data (SegData), a soft-fork block region for arbitrary data (data without an intent to transfer value). SegData entries are committed in-block through a separate Merkle root and validated at the tip by every node. Beyond the standard retention window a node may prune them individually or in bulk, so the storage cost falls on the operators who choose to keep the data rather than on everyone.

This is not an effort to promote data carriage but to give it a designed structural home. OP_RETURN mitigated UTXO bloat by drawing data out of fake outputs, but its bytes are still synced in full and stored permanently by every node. SegData offers to carry the same data in a prunable region at the witness discount, so a carrier has sensible reasons to migrate. The OP_RETURN precedent shows the approach works in principle. Altering the existing vectors is deliberately out of scope.
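To make the discount argument concrete, here is a rough arithmetic sketch. It assumes SegData bytes are weighed like BIP-141 witness bytes (1 weight unit per byte, versus 4 for non-witness bytes such as an OP_RETURN payload). That is only one reading of "at the witness discount", and the draft BIP's actual weight formula is not reproduced here. The function name and the 80-byte payload are illustrative, and per-output or per-entry overhead is ignored.

```python
# Sketch only: BIP-141-style weight accounting for a data payload.
# Assumption: a discounted region counts 1 WU per byte, like witness data.

WITNESS_SCALE_FACTOR = 4  # BIP-141: each non-witness byte counts 4 weight units


def payload_weight(payload_bytes: int, discounted: bool) -> int:
    """Weight of the payload bytes alone, ignoring any framing overhead."""
    per_byte = 1 if discounted else WITNESS_SCALE_FACTOR
    return payload_bytes * per_byte


payload = 80  # hypothetical payload size in bytes
op_return_wu = payload_weight(payload, discounted=False)
discounted_wu = payload_weight(payload, discounted=True)
print(op_return_wu, discounted_wu)
```

Under these assumptions the same payload costs a quarter of the weight in the discounted region, which is the migration incentive described above.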

Design. SegData follows BIP-141 patterns. The commitment is a coinbase output that commits to a separate Merkle root, blocks gain a base and an extended serialisation, the weight formula is extended, and reference outputs are witness v2, value-zero, unspendable, and excluded from the UTXO set. No new sighash is needed.
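As a minimal sketch of what a separate Merkle root over data entries could look like, the snippet below uses Bitcoin's existing transaction-tree construction (double SHA-256, duplicating the last hash on odd levels). Whether the draft uses exactly this construction, and how entries are serialised into leaves, is an assumption here; `merkle_root` and the sample entries are hypothetical names for illustration.

```python
import hashlib


def dsha256(data: bytes) -> bytes:
    """Double SHA-256, as used for Bitcoin's block and transaction hashes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(entries: list[bytes]) -> bytes:
    """Bitcoin-style Merkle root over raw entries (assumed leaf = dsha256(entry))."""
    if not entries:
        raise ValueError("no entries to commit to")
    level = [dsha256(e) for e in entries]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last hash on odd-sized levels
        level = [dsha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


root = merkle_root([b"entry-a", b"entry-b", b"entry-c"])
print(root.hex())
```

A node that has pruned an entry can still check an inclusion proof against this root, which is what makes per-entry pruning compatible with a block-level commitment.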

Two specific properties carry most of the weight:

Who migrates:

Open questions I would value feedback on:

Consensus BIP: bips/bip-segdata.md at 4eeeb0afbb9d256d264225801e635d2df1cc875f · MrHash/bips · GitHub

Peer-services BIP: bips/bip-segdata-peer-services.md at 4eeeb0afbb9d256d264225801e635d2df1cc875f · MrHash/bips · GitHub

Also announced on the bitcoin-dev mailing list, but it has not passed moderation yet.

Happy to take detailed discussion here. I’m sure there will be many more questions; I hope the rationale section covers most of the obvious ones.

Hash

X: @hashamadeus

Nostr: npub1tjfwajj3cfy25ujx02c7q3e7pzc27jasxakk9v0lsrkrewahpkesee5a0v

Antoine Poinsot · #2 ·

I don’t understand the goal here. Since you give it consensus meaning, any full node would have to process the new data structure and use up resources. After processing the data, the node would not have to keep it around, but this saves storage (the least expensive resource) at the expense of other more expensive resources.

Alternatively, you could simply have the commitment but not give it consensus meaning through a soft fork. Essentially commit a Merkle root in an OP_RETURN. This is already possible, and done, today.
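For illustration, a bare commitment of this kind is just an OP_RETURN output script carrying a 32-byte root. The sketch below builds such a script; the helper name and the placeholder root are hypothetical, and real protocols usually prepend their own tag bytes, which are omitted here.

```python
import hashlib

OP_RETURN = 0x6A  # Bitcoin Script opcode marking a provably unspendable output


def op_return_commitment(root: bytes) -> bytes:
    """Build the scriptPubKey bytes: OP_RETURN <32-byte push of the root>."""
    if len(root) != 32:
        raise ValueError("expected a 32-byte Merkle root")
    # For pushes of 1 to 75 bytes, the push opcode is the length itself (0x20 here).
    return bytes([OP_RETURN, len(root)]) + root


# Placeholder root; in practice this would be the Merkle root of the off-chain data.
placeholder_root = hashlib.sha256(b"off-chain data set").digest()
script = op_return_commitment(placeholder_root)
print(script.hex())
```

The resulting script is 34 bytes, so the on-chain footprint is fixed regardless of how much data the root commits to.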

MrHash · #3 ·

Hi Antoine,

On resources: SegData is designed to carry data that would otherwise sit in OP_RETURN outputs, be stuffed into witnesses, and so on. With segregation, apart from the strictly necessary validation at the tip, SegData is required neither for IBD nor for storage, and its validation is otherwise skipped, so node resources are potentially saved in several dimensions.

On the other point: as you say, anyone can timestamp, so this doesn’t change that in principle. However, we can’t force people to keep their data off-chain, so this covers the case where people insist on putting it on-chain, which is what is happening in the wild. Instead of relying on OP_RETURN, it creates a structurally and semantically deliberate region that supports consensual retention.

I hope that answers the initial question. Please continue to press if there’s something I’m missing.

Chris Guida · #4 ·

Why would bitcoin noderunners want to store nonmonetary data for free?

MrHash · #5 · in reply to #4

I’m not sure I understand your question. SegData gives node operators the choice not to store data. Existing vectors offer no such choice.