Make provably unsignable standard P2PK and P2MS outpoints unspendable. #28400

pull russeree wants to merge 2 commits into bitcoin:master from russeree:23-08-16-prune-unspenable-p2ms-p2pk changing 5 files +180 −5
  1. russeree commented at 8:04 am on September 4, 2023: contributor


    This PR introduces additional conditionals into IsUnspendable() to remove provably unspendable P2PK and P2MS tx outpoints from the UTXO set. This is done by using IsFullyValid() to check that the public key(s) in the outpoint’s script are valid. This trims nearly 20K outpoints from the UTXO set at height 805618.

    A side effect of this PR is it removes the use case for uncompressed public keys through standard tx types to store arbitrary data in the UTXO set.


    P2PK outpoints with a single invalid pubkey can be flagged as unspendable because the public key does not lie on the secp256k1 curve, and thus no private key exists that makes OP_CHECKSIG evaluate to true.

    Script must be in the format of OP_PUSHBYTES PUBKEY OP_CHECKSIG(VERIFY)


    P2MS outpoints that do not have enough valid public keys to meet the threshold, i.e. where n - k < m, with n the number of public keys, m the signature threshold, and k the number of invalid public keys.

    Script must be in the format of OP_(M) PUBKEY1 ... PUBKEYN OP_(N) OP_CHECKMULTISIG(VERIFY)
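
    The threshold rule for P2MS can be sketched in isolation. This is a minimal illustration, not the PR's code: `IsProvablyUnsignable` is a hypothetical name, and the boolean flags stand in for the result of calling IsFullyValid() on each public key.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the PR): given the signature threshold m and a
// per-key validity flag (true = the key parsed as a valid curve point), report
// whether the m-of-n multisig can never be satisfied.
bool IsProvablyUnsignable(std::size_t m, const std::vector<bool>& key_is_valid)
{
    const std::size_t n = key_is_valid.size();
    std::size_t k = 0; // number of invalid keys
    for (bool valid : key_is_valid) {
        if (!valid) ++k;
    }
    // Fewer than m valid keys remain, so m signatures can never be produced.
    return n - k < m;
}
```

    Under this rule a 1-of-3 script with one valid key stays spendable, while the same script with three invalid keys does not.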

  2. DrahtBot commented at 8:04 am on September 4, 2023: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage

    For detailed information about the code coverage, see the test coverage report.


    See the guideline for information on the review process.

    Type Reviewers
    Concept NACK petertodd
    Concept ACK RandyMcMillan, dzyphr, dexX7, NicolasDorier

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.


    Reviewers, this pull request conflicts with the following ones:

    • #28728 (wallet: [bugfix] Mark CNoDestination and PubKeyDestination constructor explicit by maflcko)
    • #28690 (build: Introduce internal kernel library by TheCharlatan)
    • #28550 (Covenant tools softfork by jamesob)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. DrahtBot added the label CI failed on Sep 4, 2023
  4. theStack commented at 10:46 am on September 4, 2023: contributor

    Changing the IsUnspendable logic in general seems very brittle, as it obviously leads to nodes with diverging UTXO sets. Even if all nodes would run this patch, they’d start to run it at different block heights (if between blocks M and N an UTXO enters the set that classifies for pruning, you’d already diverge), and you’d only end up with the same fully pruned UTXO set if all of them also did a -reindex once.

    When the pruning of OP_RETURN outputs was implemented over 10 years ago (see #2791) this was seemingly less of a problem, but now that many projects depend on a normalized view of the UTXO set (muhash, assumeutxo, utreexo etc.), I think that would lead to much unwanted chaos.

    Happy to hear other inputs how this could be solved, maybe I’m missing something.

  5. russeree force-pushed on Sep 4, 2023
  6. russeree force-pushed on Sep 4, 2023
  7. russeree force-pushed on Sep 4, 2023
  8. in src/script/script.cpp:324 in 5892815ced outdated
    323@@ -258,6 +324,44 @@ bool CScript::IsPushOnly() const
    324     return this->IsPushOnly(begin());
    325 }
    328+bool CScript::IsUnspendable() const
    330+    if (size() > MAX_SCRIPT_SIZE) {

    rot13maxi commented at 11:59 am on September 4, 2023:
    did you find anything interesting when you added this check?

    russeree commented at 12:08 pm on September 4, 2023:

    No as it was previously defined here.

    The reason I placed it at the top of the if else stack was to avoid doing work on a script that already is invalid due to length. If the script is too long just mark as unspendable and avoid any extra checking

  9. ajtowns commented at 12:29 pm on September 4, 2023: contributor

    Changing the IsUnspendable logic in general seems very brittle, as it obviously leads to nodes with diverging UTXO sets.

    now that many projects depend on a normalized view of the UTXO set (muhash, assumeutxo, utreexo etc.), I think that would lead to much unwanted chaos.

    I think muhash provides a pretty straightforward way of dealing with this. Namely: rather than only maintaining a single muhash for the utxo set you have, also maintain one for the outputs you’ve pruned as unspendable (or multiple indexes if you introduce more over time). Then you might have:

    • at block height 810,000
    • the spendable utxo set has muhash X
    • the OP_RETURN utxos have muhash Y
    • the bad pubkey utxos have muhash Z

    if you compare with a node running current master, you then verify that your X+Z matches their X.

    It’s easy to initialize these values on upgrade: Z=0, then iterate through your utxo set looking for bad pubkeys, adding them to Z as you find them, and removing them from the utxo set. Downgrading breaks things though.

    I don’t think you can bootstrap utreexo from just the utxo set at an arbitrary height in the first place, so I don’t think its really affected by this. For assumeutxo, if you want to be able to distribute the utxo set snapshots over p2p, then peers that have pruned unspendable utxos won’t have them to share with peers that don’t realise they’re unspendable in the first place, but if you’re just producing signed snapshots, then you could probably just publish two versions until the old software goes out of its support window.
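
    The idea of keeping separate hashes for spendable and pruned outputs relies on the set hash being combinable. The sketch below uses a toy multiplicative hash over a small prime as a stand-in for MuHash (which uses a far larger modulus); `ElementHash`, `SetHash`, `Combine` and the string-encoded outpoints are all hypothetical names chosen for illustration.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Toy stand-in for MuHash: the set hash is the product of per-element hashes
// modulo a prime, so it is independent of insertion order and two disjoint
// subsets can be combined by multiplying their hashes.
constexpr std::uint64_t kPrime = 2147483647ULL; // 2^31 - 1, far too small for real use

// FNV-1a over the bytes, mapped into [1, kPrime - 1] so no element hashes to zero.
std::uint64_t ElementHash(const std::string& utxo)
{
    std::uint64_t h = 14695981039346656037ULL;
    for (unsigned char c : utxo) {
        h ^= c;
        h *= 1099511628211ULL;
    }
    return h % (kPrime - 1) + 1;
}

// Hash of a whole set: running product of element hashes modulo kPrime.
std::uint64_t SetHash(const std::vector<std::string>& utxos)
{
    std::uint64_t acc = 1;
    for (const auto& u : utxos) {
        acc = acc * ElementHash(u) % kPrime; // both factors < 2^31, no overflow
    }
    return acc;
}

// Combine the hashes of two disjoint sets (the "X+Z" step in the comment above).
std::uint64_t Combine(std::uint64_t a, std::uint64_t b) { return a * b % kPrime; }
```

    A node that prunes bad-pubkey outputs would compare `Combine(X, Z)` against the single hash reported by a node that keeps them.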

  10. in src/script/script.cpp:215 in 5892815ced outdated
    210+        return false;
    211+    }
    213+    // Last byte is equal to OP_CHECKSIG / OP_CHECKSIGVERIFY
    214+    const unsigned char last_byte = *(this->end() - 1);
    215+    if (!(last_byte != OP_CHECKSIG ^ last_byte != OP_CHECKSIGVERIFY)) {

    ajtowns commented at 12:37 pm on September 4, 2023:
    if (back() != OP_CHECKSIG && back() != OP_CHECKSIGVERIFY) ?
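
    The XOR form in the diff and the suggested `&&` form reject the same bytes, since the last byte can never equal both opcodes at once. A small exhaustive check, with the opcode constants redefined locally and hypothetical function names:

```cpp
#include <cstdint>

// Opcode values as defined in Bitcoin Script.
constexpr std::uint8_t OP_CHECKSIG = 0xac;
constexpr std::uint8_t OP_CHECKSIGVERIFY = 0xad;

// Condition as written in the reviewed diff (parenthesised to make the
// precedence of != over ^ explicit).
bool RejectsOriginal(std::uint8_t last_byte)
{
    return !((last_byte != OP_CHECKSIG) ^ (last_byte != OP_CHECKSIGVERIFY));
}

// Condition as suggested in review.
bool RejectsSuggested(std::uint8_t last_byte)
{
    return last_byte != OP_CHECKSIG && last_byte != OP_CHECKSIGVERIFY;
}

// Exhaustively compare both forms over every possible byte value.
bool FormsAgree()
{
    for (int b = 0; b <= 0xff; ++b) {
        const auto byte = static_cast<std::uint8_t>(b);
        if (RejectsOriginal(byte) != RejectsSuggested(byte)) return false;
    }
    return true;
}
```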
  11. in src/script/script.cpp:226 in 5892815ced outdated
    221+    if (first_byte == OP_0 || first_byte >= OP_PUSHDATA1) {
    222+        return false;
    223+    }
    225+    // The script contains 2 ops and is equal to the script length
    226+    if (first_byte + 2 == size){

    ajtowns commented at 12:41 pm on September 4, 2023:

    You’ve already checked size is between 2 and 77, so if first_byte + 2 == size, first_byte is between 0 and 75, so the check for OP_PUSHDATA1 is redundant (and the check for OP_0 could be subsumed if you checked size < 3).

    Also, first_byte == front().

  12. in src/script/script.cpp:327 in 5892815ced outdated
    328+bool CScript::IsUnspendable() const
    330+    if (size() > MAX_SCRIPT_SIZE) {
    331+        return true;
    332+    } else if (this->IsPayToPublicKey()) {
    333+        std::vector<unsigned char> pubkey(this->begin() + 1, this->end() - 1);

    ajtowns commented at 12:46 pm on September 4, 2023:
    Should just be a span
  13. theStack commented at 3:27 pm on September 4, 2023: contributor

    @ajtowns: Neat idea with the additional MuHash! I agree this should work, but obviously at the cost of increased complexity (as there is much more to change than only the logic of a single method), and there should be a really good reason for doing it.

    It’s easy to initialize these values on upgrade: Z=0, then iterate through your utxo set looking for bad pubkeys, adding them to Z as you find them, and removing them from the utxo set. Downgrading breaks things though.

    Right, and you’d also need to somehow detect if an upgrade has already happened. In #2791 it was proposed to either scan for a particular output in the UTXO set or to introduce a flag in the chainstate database.

    @russeree: Can you give some additional motivation? While I very much enjoy reasoning about these kinds of topics, it’s still unclear to me what concrete problem this PR is trying to solve. The description claims that about 20k outputs could be pruned. As of now (block 806205), that’s merely 0.016% of the total UTXO set size, freeing up about 1.24 MB of chainstate space (if we assume an average size of 65 bytes per UTXO). I’d argue that these numbers are way too low (even if you 10x them for the sake of projecting into the future) to justify the increased complexity in different areas that would need to be touched (i.e. differentiating between different IsUnspendable reasons and updating all call-sites correctly, maintaining an extra MuHash, implementing a UTXO set upgrade mechanism, probably adapting gettxoutsetinfo RPC results to include both “new-pruned-MuHash” and “old-unpruned-MuHash”, dealing with different dumptxoutset/loadtxoutset result/behaviour on old vs new nodes for AssumeUTXO etc.).

    Also, calling IsFullyValid is significantly more expensive than just doing the usual Solver matching and length checks, so one would also need to analyze whether this has a noticeable negative performance impact on standardness checks.
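
    As a rough check of the figures quoted in this comment, taking the PR description's "nearly 20K" outpoints as 20,000 and the assumed average of 65 bytes per UTXO (both approximations, not measured values):

    $$20{,}000 \times 65\ \text{B} = 1{,}300{,}000\ \text{B} \approx 1.24\ \text{MiB}$$

    $$\frac{20{,}000}{0.016\%} = \frac{20{,}000}{0.00016} = 125{,}000{,}000\ \text{UTXOs}$$

    The second line is only the total UTXO count implied by the quoted percentage.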

  14. in src/script/script.cpp:332 in 5892815ced outdated
    333+        std::vector<unsigned char> pubkey(this->begin() + 1, this->end() - 1);
    334+        if(!CPubKey(pubkey).IsFullyValid()){
    335+            return true;
    336+        }
    337+    } else if (this->IsPayToMultisig()) {
    338+        CScript::const_iterator pc = this->begin() + 1;

    ajtowns commented at 10:58 pm on September 4, 2023:

    Seems like it would be better to have this->IsInvalidMultisig() and move this code all into one place? Something like:

    if (size() < 3) return false;
    if (back() != OP_CHECKMULTISIG && back() != OP_CHECKMULTISIGVERIFY) return false;
    int n;
    if (!IsOpN(*(end() - 2), n)) return false;
    int m;
    if (!IsOpN(front(), m)) return false;
    std::vector<Span<unsigned char>> keys;

    // parse into keys, finish checking it's well-formed
    if (m > n) return true;
    int good = 0;
    int max_good = n;
    for (const auto& k : keys) {
        if (CPubKey(k).IsFullyValid()) {
            ++good;
        } else {
            --max_good;
        }
        if (good >= m) return false;   // enough valid keys: spendable
        if (max_good < m) return true; // too many invalid keys: unspendable
    }
    return false; // unreachable
  15. in src/script/script.cpp:205 in 5892815ced outdated
    201@@ -201,6 +202,71 @@ unsigned int CScript::GetSigOpCount(const CScript& scriptSig) const
    202     return subscript.GetSigOpCount(true);
    203 }
    205+bool CScript::IsPayToPublicKey() const

    Randy808 commented at 12:45 pm on September 5, 2023:
    It might be simpler to follow the style of the existing script template checks like ‘IsPayToScriptHash’. You can also use the existing constants to check against the compressed and uncompressed sizes
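
    A minimal sketch of what a size-based P2PK template check could look like, assuming only 33-byte and 65-byte keys are accepted and either OP_CHECKSIG or OP_CHECKSIGVERIFY terminates the script. `LooksLikeP2PK` and the locally defined constants are hypothetical stand-ins, not the PR's code.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Local stand-ins for the compressed and uncompressed public key sizes.
constexpr std::size_t kCompressedSize = 33;
constexpr std::size_t kUncompressedSize = 65;
constexpr std::uint8_t OP_CHECKSIG = 0xac;
constexpr std::uint8_t OP_CHECKSIGVERIFY = 0xad;

// Hypothetical template match: <push of 33 or 65 bytes> <key> <OP_CHECKSIG(VERIFY)>.
// This checks the script's shape only; it says nothing about key validity.
bool LooksLikeP2PK(const std::vector<std::uint8_t>& script)
{
    if (script.empty()) return false;
    // A direct push opcode (0x01..0x4b) equals the number of bytes pushed.
    const std::size_t key_len = script.front();
    if (key_len != kCompressedSize && key_len != kUncompressedSize) return false;
    // One push opcode + key bytes + one trailing opcode.
    if (script.size() != key_len + 2) return false;
    return script.back() == OP_CHECKSIG || script.back() == OP_CHECKSIGVERIFY;
}
```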
  16. in src/script/script.cpp:445 in 5892815ced outdated
    446@@ -343,6 +447,40 @@ bool IsOpSuccess(const opcodetype& opcode)
    447            (opcode >= 187 && opcode <= 254);
    448 }
    450+bool IsOpN(unsigned char op_code){
    451+    return OP_0 || (OP_1 <= op_code && op_code <= OP_16);

    Randy808 commented at 12:55 pm on September 5, 2023:
    I think you meant: return op_code == OP_0 || (OP_1 <= op_code && op_code <= OP_16);
  17. in src/script/script.cpp:448 in 5892815ced outdated
    446@@ -343,6 +447,40 @@ bool IsOpSuccess(const opcodetype& opcode)
    447            (opcode >= 187 && opcode <= 254);
    448 }
    450+bool IsOpN(unsigned char op_code){
    451+    return OP_0 || (OP_1 <= op_code && op_code <= OP_16);
    454+bool IsOpN(unsigned char op_code, unsigned char& value){

    Randy808 commented at 12:59 pm on September 5, 2023:

    Might be more concise if you used the above function in this one:

    bool IsOpN(unsigned char op_code, unsigned char &value) {
      if (IsOpN(op_code)) {
        if (op_code == OP_0) {
          value = 0;
        } else {
          value = op_code - 0x50;
        }
        return true;
      }
      return false;
    }
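
    To see why the one-argument IsOpN in the diff needs the `op_code ==` comparison: OP_0 is the constant 0, so `OP_0 || (...)` reduces to the range check alone and OP_0 itself is rejected. A self-contained sketch with locally defined opcode values and hypothetical function names:

```cpp
#include <cstdint>

constexpr std::uint8_t OP_0 = 0x00;
constexpr std::uint8_t OP_1 = 0x51;
constexpr std::uint8_t OP_16 = 0x60;

// As written in the diff: the left operand is the constant 0, not a comparison.
bool IsOpNAsWritten(std::uint8_t op_code)
{
    return OP_0 || (OP_1 <= op_code && op_code <= OP_16);
}

// With the comparison suggested in review.
bool IsOpNFixed(std::uint8_t op_code)
{
    return op_code == OP_0 || (OP_1 <= op_code && op_code <= OP_16);
}

// Decode the small integer: OP_0 -> 0, OP_1..OP_16 -> 1..16.
bool DecodeOpN(std::uint8_t op_code, std::uint8_t& value)
{
    if (!IsOpNFixed(op_code)) return false;
    value = (op_code == OP_0) ? 0 : static_cast<std::uint8_t>(op_code - 0x50);
    return true;
}
```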
  18. russeree commented at 1:48 am on September 11, 2023: contributor
    Sorry for the lack of work on this PR last week, was at TABConf working on #27260 . Work has resumed and I will update this thread with changes over the next few days.
  19. russeree force-pushed on Sep 12, 2023
  20. Make unsignable standard P2PK and P2MS outpoints unspendable.
    Author:    russeree <>
    @ajtowns - Remove OP_0 case from IsPayToPublicKey()
  21. russeree force-pushed on Sep 12, 2023
  22. russeree marked this as a draft on Sep 13, 2023
  23. russeree force-pushed on Sep 17, 2023
  24. russeree force-pushed on Sep 17, 2023
  25. p2pk unit tests 3733665990
  26. russeree force-pushed on Sep 26, 2023
  27. petertodd commented at 8:47 am on September 26, 2023: contributor

    This trims nearly 20K outpoints from the UTXO set at height 805618.

    Concept NACK.

    At the moment the UTXO set has 130 million entries. Reducing it by 0.015% isn’t worth the technical risk.

    By comparison, there have been 53 million OP_Return outputs. Removing those from the UTXO set was a not-so-trivial ~30% reduction.

  28. DrahtBot added the label Needs rebase on Oct 26, 2023
  29. DrahtBot commented at 4:42 pm on October 26, 2023: contributor

    🐙 This pull request conflicts with the target branch and needs rebase.

  30. RandyMcMillan commented at 7:55 pm on December 14, 2023: contributor
    Concept ACK
  31. BTCMcBoatface commented at 11:32 pm on January 2, 2024: none
    I have no business adding comments here but I may humbly add that, despite this only affecting 0.016% of the UTXOs, it is a smart preemptive defense against arbitrary data on chain using a method that is far more damaging than inscriptions.
  32. DrahtBot commented at 10:42 am on January 5, 2024: contributor

    There hasn’t been much activity lately and the patch still needs rebase. What is the status here?

    • Is it still relevant? ➡️ Please solve the conflicts to make it ready for review and to ensure the CI passes.
    • Is it no longer relevant? ➡️ Please close.
    • Did the author lose interest or time to work on this? ➡️ Please close it and mark it ‘Up for grabs’ with the label, so that it can be picked up in the future.
  33. owenstrevor commented at 7:12 pm on January 28, 2024: none

    So this would make all current and future STAMPS unspendable? Or just future ones? How would this affect other potential legitimate use cases?

    What is the projected growth of node requirements if this situation were to get worse and how does that track with Moore’s Law of the increase in computer performance? Or is there a better way to think about this?

    It’s worth noting, making these unspendable may not stop their proliferation. For example, solutions that allow you to trade private keys off-chain, in the event that there is market demand for STAMPS, would still incentivize their creation to the same degree even if they are unspendable.

  34. dzyphr commented at 11:12 pm on January 28, 2024: none

    Concept ACK


    So this would make all current and future STAMPS unspendable? Or just future ones? How would this affect other potential legitimate use cases?

    What is the projected growth of node requirements if this situation were to get worse and how does that track with Moore’s Law of the increase in computer performance? Or is there a better way to think about this?

    It’s worth noting, making these unspendable may not stop their proliferation. For example, solutions that allow you to trade private keys off-chain, in the event that there is market demand for STAMPS, would still incentivize their creation to the same degree even if they are unspendable.

    @owenstrevor it does nothing of the sort. You should read more carefully before asserting a slippery slope economic argument about STAMPS.

    to remove provably unspendable P2PK and P2MS tx outpoints from the UTXO set.

    These outputs are already unspendable mathematically, and the person spending to them agreed to that condition upon signing.

    This trims nearly 20K outpoints from the UTXO set at height 805618.

    Concept NACK.

    At the moment the UTXO set has 130 million entries. Reducing it by 0.015% isn’t worth the technical risk.

    By comparison, there have been 53 million OP_Return outputs. Removing those from the UTXO set was a not-so-trivial ~30% reduction.

    OP_RETURN outputs are not provably unspendable though, they have valid spending paths. So the comparison is irrelevant, the point is why would you incentivize the storage of something that is basically a burned output? Sure right now the storage space is negligible, however you would have to have a completely non-adversarial mindset to suggest that no one would ever exploit this further.

  35. theStack commented at 2:51 am on January 29, 2024: contributor
    Fun challenge for everyone: can you link to a specific bare multisig UTXO that has been created within the past year (let’s say, since block 769785) where the prunable detection in this PR would hit? Looking at an UTXO snapshot from now (block 827855), from all the 749496 P2MS UTXOs that have been created since the start of the year 2023, I haven’t managed to find a single one which is provably unspendable. // EDIT: nevermind, my script had a bug.
  36. dzyphr commented at 3:06 am on January 29, 2024: none

    Fun challenge for everyone: can you link to a specific bare multisig UTXO that has been created within the past year (let’s say, since block 769785) where the prunable detection in this PR would hit? Looking at an UTXO snapshot from now (block 827855), from all the 749496 P2MS UTXOs that have been created since the start of the year 2023, I haven’t managed to find a single one which is provably unspendable.

    they have provided a list:

  37. ajtowns commented at 7:22 am on January 29, 2024: contributor

    Fun challenge for everyone: can you link to a specific bare multisig UTXO that has been created within the past year (let’s say, since block 769785) where the prunable detection in this PR would hit? Looking at an UTXO snapshot from now (block 827855), from all the 749496 P2MS UTXOs that have been created since the start of the year 2023, I haven’t managed to find a single one which is provably unspendable.

    they have provided a list:

    The most recently confirmed txs in that list are from block 792783 (June 2023), namely c18fe6…, 74c96e… and a6062e…. All but ~50 of the txs are over 300k blocks ago (ie 6+ years). The ones that aren’t are:

  38. ajtowns commented at 7:29 am on January 29, 2024: contributor

    So this would make all current and future STAMPS unspendable? Or just future ones? How would this affect other potential legitimate use cases?

    Most STAMPS txs are spendable and are not affected by this PR. As I understand it, that protocol sets up a 1-of-3 multisig, where one of the keys is valid (providing a spendable path) and the other two are used for data, and are thus provably unspendable about 50% of the time as the data doesn’t match a valid secp point. The only times those txs become unspendable are when the one valid key is replaced by an invalid key – eg the txs linked in my previous comment have a pubkey 030303030303030303030303030303030303030303030303030303030303030303 which is neither a valid point nor carries any data.

  39. dexX7 commented at 8:44 am on January 29, 2024: contributor

    Concept ACK.

    Note, however, that pubkeys can be modified to be valid and still be used to store data, for example by using a byte that is shuffled until the pubkey becomes valid.

  40. NicolasDorier commented at 12:13 pm on January 29, 2024: contributor

    Concept ACK.

    However, note that from the time this PR runs on a node, its UTXO Set hash will start to differ from any peer that didn’t start using this PR at the same time.

    This can be solved by cleaning up the UTXO Set when the node starts. Unsure how expensive it would be.

    It would be nice to know how much space is saved running this PR from genesis.

    We are facing a problem in BTCPay: A large number of users have only 20GB of space to store the UTXO Set. Now that the UTXO Set is reaching 10GB and increasing, those users might have their node crashing sooner or later. So we are very interested in solutions to bring down the UTXO set size.

    EDIT: It turns out that our problem wouldn’t be solved see #28400 (comment)

  41. russeree commented at 12:47 pm on January 29, 2024: contributor

    and are thus provably unspendable about 50% of the time as the data doesn’t match a valid secp point.

    Sorry for not understanding this, but why 50%? I thought compressed pubkeys covered almost everything on the X axis because the y is derived? Also many of the pruned outpoints are not compressed pubkeys but instead uncompressed.

  42. theStack commented at 5:11 pm on January 29, 2024: contributor

    and are thus provably unspendable about 50% of the time as the data doesn’t match a valid secp point.

    Sorry for not understanding this, but why 50%? I thought compressed pubkeys covered almost everything on the X axis because the y is derived? Also many of the pruned outpoints are not compressed pubkeys but instead uncompressed.

    In my own non-cryptographer words: for about half of all possible field elements x, there exists no field element y such that the secp256k1 equation $y^2 = x^3 + 7$ holds (more precisely, you would get the coordinate y via $y = \sqrt{x^3 + 7}$, but for those x values, the expression $x^3 + 7$ doesn’t have a square root).

  43. ajtowns commented at 5:12 pm on January 29, 2024: contributor

    and are thus provably unspendable about 50% of the time as the data doesn’t match a valid secp point.

    Sorry for not understanding this, but why 50%? I thought compressed pubkeys covered almost everything on the X axis because the y is derived? Also many of the pruned outpoints are not compressed pubkeys but instead uncompressed.

    The formula is y^2 = x^3 + 7, so you derive x^3+7 for any x, but only half of those will be valid squares – every number 0 < i < p/2 squares to the same result as the different number p-i does, so you only use up half the possible numbers as squares; presuming x^3+7 is just randomly choosing a number between 1 and p, then half the time it won’t be a square. You get back to having ~2^256 points despite half of the ~2^256 x coords being invalid because each of the valid coords gives you two points (y and p-y).
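
    The "about half" claim can be checked by brute force on a toy field. The sketch below counts the x values for which x^3 + 7 is a square modulo a small prime; the prime 10039 is an arbitrary choice for illustration (secp256k1's field is roughly 2^256), and all names are hypothetical.

```cpp
#include <cstdint>
#include <vector>

// Toy field size (an assumption for illustration only).
constexpr std::uint32_t kP = 10039;

struct XCoordStats {
    std::uint32_t valid_x = 0;  // x for which x^3 + 7 is a square mod kP
    std::uint32_t zero_rhs = 0; // subset of valid_x where x^3 + 7 == 0 (y = 0 only)
};

XCoordStats CountValidX()
{
    // Mark every residue that is the square of some y.
    std::vector<bool> is_square(kP, false);
    for (std::uint64_t y = 0; y < kP; ++y) is_square[y * y % kP] = true;

    XCoordStats stats;
    for (std::uint64_t x = 0; x < kP; ++x) {
        const std::uint64_t rhs = (x * x % kP * x + 7) % kP;
        if (is_square[rhs]) {
            ++stats.valid_x;
            if (rhs == 0) ++stats.zero_rhs;
        }
    }
    return stats;
}

// Affine points: two per valid x (y and p - y), except a single point when y = 0.
std::uint32_t CountPoints(const XCoordStats& s)
{
    return 2 * (s.valid_x - s.zero_rhs) + s.zero_rhs;
}
```

    On such a toy curve roughly half of all x values are valid, yet the point count stays close to the field size because each valid x contributes two points.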

  44. 1440000bytes commented at 6:25 pm on January 29, 2024: none

    We are facing a problem in BTCPay: A large number of users have only 20GB of space to store the UTXO Set. Now that the UTXO Set is reaching 10GB and increasing, those users might have their node crashing sooner or later. So we are very interested in solutions to bring down the UTXO set size.

    This pull request won’t make any difference. Top 5 output types in UTXO set are P2PKH, P2WPKH, P2TR, P2SH and P2WSH.

  45. NicolasDorier commented at 11:39 pm on January 29, 2024: contributor

    @1440000bytes the distribution you are showing is by output count, not size. Since Ordinals, the UTXO set size grew from 5GB to 10GB. Even if they are marginal in terms of output count, in terms of size they take around 50% of the UTXO Set at the moment.

    I haven’t looked how they put JPEG and JSON files on the chain exactly, I assumed it was with invalid P2PK, but didn’t look closely into it.

  46. achow101 commented at 11:49 pm on January 29, 2024: member

    I haven’t looked how they put JPEG and JSON files on the chain exactly, I assumed it was with invalid P2PK, but didn’t look closely into it.

    Ordinal inscriptions don’t use P2PK or bare multisig, they use P2TR with tapscripts. In order for an inscription to be in the blockchain, it must create and then spend a UTXO, so inscriptions themselves have no impact on the UTXO set. Ordinals may have an impact as there are people creating small outputs for their “rare” sats, but those are all perfectly valid P2TR outputs, so nothing can be done there.

    Stamps is using the old Counterparty protocol that uses bare multisigs. These are 1-of-3 multisigs where one key is valid, and the other two are the data. These are also spendable so this PR would not remove them.

    OP_RETURN outputs are not provably unspendable though, they have valid spending paths

    The point of OP_RETURN outputs (defined as the ones following the template OP_RETURN <data>) is to have a provably unspendable output that can be, and already are, pruned from the UTXO set. Yes, technically there are scripts that can contain OP_RETURN that are valid as long as OP_RETURN is not executed, but that’s not what people generally refer to when saying “OP_RETURN output”.

  47. russeree commented at 11:20 pm on March 7, 2024: contributor
    Closing because the fragility of this PR does not justify its limited impact.
  48. russeree closed this on Mar 7, 2024

  49. bitcoin locked this on Mar 7, 2025


This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-03-09 21:13 UTC
