Cluster mempool, CPFP carveout, and V3 transaction policy #29319

issue sdaftuar opened this issue on January 25, 2024
  1. sdaftuar commented at 4:15 pm on January 25, 2024: member

    Opening an issue for high-level discussion, as the PR that implements this has gotten difficult to follow.

    Cluster mempool

    Work is underway to redesign the mempool with different topology constraints on the transaction graph than exist today. I originally described this proposal in a github issue (#27677), and have shared a draft implementation (#28676). In brief, with a new mempool design we could simultaneously: fix bugs with mempool eviction and the incentive compatibility of RBF replacements; achieve improved performance; eliminate the ancestor/descendant limits (with the introduction of a likely more relaxed “cluster” limit); and likely be in a position to better implement complex behaviors like package validation and package rbf.

    CPFP carveout is not compatible with cluster mempool

    As I explain here, I believe the CPFP carveout rule (introduced in #15681) is not compatible with the new design. Since some users may be relying on this behavior, we should come up with a workaround to avoid breaking existing applications.

    V3 transaction policy

    Described in #28948, the v3 transaction policy is a proposal to introduce topology restrictions on unconfirmed transactions with nVersion=3 (which are currently non-standard for relay, and would be made standard under the proposal). Specifically, v3 transactions would be:

    • opted-in to RBF replacement policies (whatever those may be, as they evolve in our project)
    • permitted to only be part of mempool clusters that are at most of size 2 (meaning 1 parent/1 child)
    • required to have any unconfirmed parents or children also be marked v3 (and therefore subject to these rules)
    • bounded in size to at most 1000 vbytes, if the child of an unconfirmed (and therefore v3) parent
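
    As a rough illustration, the four rules listed above could be checked along the following lines. This is a minimal sketch and not the code in #28948: the `Tx` class, its fields, and `check_v3` are hypothetical names introduced here, and the RBF opt-in rule is not modelled.

```python
from dataclasses import dataclass, field
from typing import List, Optional

V3_CHILD_MAX_VSIZE = 1000  # vbytes, taken from the rule list above


@dataclass
class Tx:
    """Hypothetical, simplified view of an unconfirmed transaction."""
    txid: str
    version: int
    vsize: int
    parents: List["Tx"] = field(default_factory=list)   # unconfirmed parents
    children: List["Tx"] = field(default_factory=list)  # unconfirmed children


def check_v3(tx: Tx) -> Optional[str]:
    """Return None if tx passes the sketched rules, else a reason string."""
    relatives = tx.parents + tx.children
    if tx.version != 3:
        # Non-v3 transactions may not attach to unconfirmed v3 transactions.
        if any(r.version == 3 for r in relatives):
            return "non-v3 transaction has an unconfirmed v3 relative"
        return None
    if any(r.version != 3 for r in relatives):
        return "v3 transaction has an unconfirmed non-v3 relative"
    # Cluster size of at most 2: one parent and one child.
    if len(relatives) > 1:
        return "cluster would exceed 2 transactions"
    for parent in tx.parents:
        if parent.parents or any(c.txid != tx.txid for c in parent.children):
            return "cluster would exceed 2 transactions"
        if tx.vsize > V3_CHILD_MAX_VSIZE:
            return "v3 child exceeds 1000 vbytes"
    return None
```

    For example, a 900 vbyte v3 child of a lone v3 parent passes, while a 1001 vbyte child, a v2 child, or a second child of the same parent is rejected with a reason string.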

    By offering a policy rule that can enforce a much tighter topology restriction than the current ancestor/descendant limits, we hope to achieve a few things:

    Replacing CPFP carveout (with a new sibling-replacement policy)

    We can provide a way to achieve the goals of the CPFP carveout rule even if the existing carveout rule were to be dropped. As explained in this post, the use-case contemplated by the CPFP carveout rule was one where a single transaction might have two spendable outputs, each spendable by a different party, and that either party should be able to spend their output without hitting global topology limits, provided that their spending transaction was bounded in size and had no other unconfirmed parents. If the v3 policy rule were to be applied to the LN commitment transactions described in that post, then neither of the two outputs that are spendable could be used to hit the limits that carveout is designed to bypass.

    So if we couple the v3 policy with an RBF rule that would allow one spend of a v3 transaction to replace an existing lower-feerate spend of that same parent – something we call sibling eviction, which has a draft implementation in #29306 – then I believe we will have enacted a set of policies that replicate the CPFP carveout use case.
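
    The sibling-eviction idea can be sketched as a feerate comparison between the existing child and the incoming one. This is only an illustration under stated assumptions: `sibling_eviction_allowed` is a hypothetical helper, and it captures just the "replace a lower-feerate spend" condition mentioned above. The draft in #29306 may impose further replacement conditions (for example on absolute fees), which are left out here.

```python
from fractions import Fraction

V3_CHILD_MAX_VSIZE = 1000  # vbytes, the v3 child size bound described above


def feerate(fee_sat: int, vsize: int) -> Fraction:
    """Exact feerate in sat/vB, avoiding floating-point rounding."""
    return Fraction(fee_sat, vsize)


def sibling_eviction_allowed(existing_fee_sat: int, existing_vsize: int,
                             new_fee_sat: int, new_vsize: int) -> bool:
    """Sketch: may a new child replace the existing child of the same v3 parent?"""
    if new_vsize > V3_CHILD_MAX_VSIZE:
        return False  # the new child must itself respect the v3 size bound
    return feerate(new_fee_sat, new_vsize) > feerate(existing_fee_sat, existing_vsize)
```

    Because the existing child is bounded at 1000 vbytes, the fee that a replacing sibling has to outbid is bounded as well.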

    Provide a general way for fee bumping to work better

    Even apart from the needs of layer 2 protocols, the new v3 policy serves as a general way to bypass the issue of RBF being expensive when a child transaction is created that has a large fee (and is typically large in size, allowing for it to be low feerate). This has been the source of user complaints since our RBF policies were first deployed, and by bounding the size of child transactions we effectively can limit the amount of additional fee a user would need to pay due to the presence of such children.

    Note, of course, that this property is also what makes the sibling-eviction-policy described above workable.

    Proposed roadmap

    To unblock the cluster mempool project, I think we need the following:

    1. Deployment of the V3 policy rules (current proposal is #28948)
    2. Deployment of V3 sibling eviction rbf rules (current proposal is #29306)

    Then, projects that use CPFP carveout would need to be able to adopt the new policy rules. After the last LN spec discussion, I understand that migrating their commitment transactions to use a new format (even just a change in version number) might take time to coordinate, and that in the meantime we should simply use some other unique markers to identify those commitment transactions, and implicitly imbue such transactions with the v3 policy rules. However, doing so would eliminate the ability to batch-CPFP several unconfirmed commitment transactions at once (although this isn’t reliable anyway today, since the carveout protections don’t apply to this case, but perhaps this sometimes works fine and is more efficient).

    It’s not clear to me if that final step – of imbuing transactions with v3 policy, rather than requiring explicit opt-in – is needed before we move further forward with cluster mempool, or even if doing so would be acceptable to the broader community, but I think we could entertain that idea as a way to assist the LN project with migration and to decouple progress in this project from upgrade timelines in that one.

    Feedback

    While we’re still working out implementation details in the v3 and v3-sibling-eviction PRs, it would be great to get concept ACKs on this roadmap from any users of the current CPFP carveout policy. In particular, please provide feedback on:

    1. whether there are use cases of CPFP carveout that are not covered by the V3 proposal as I described above, and
    2. whether the “imbued v3” behavior described above should be treated as a blocker for CPFP carveout removal as well, or if it’s a bad idea due to (eg) breaking the ability to batch-CPFP (or any other reason)
  2. sdaftuar commented at 4:19 pm on January 25, 2024: member
  3. glozow added the label Feature on Jan 25, 2024
  4. glozow added the label TX fees and policy on Jan 25, 2024
  5. glozow added the label Mempool on Jan 25, 2024
  6. ariard commented at 10:13 pm on January 25, 2024: member
    i think there is @petertodd ’s https://petertodd.org/2024/one-shot-replace-by-fee-rate to weigh as a pinning solution. sounds to me slightly more robust than v3 policy as no malleability in the fee-bumping mechanism. however the dynamic N replace-by-feerate window might be a mess for miners mempools. whatever the solution (v3 or replace-by-feerate), i believe you will still have exploitable asymmetries for L2s.
  7. petertodd commented at 0:11 am on January 26, 2024: contributor

    @sdaftuar

    bounded in size to at most 1000 vbytes, if the child of an unconfirmed (and therefore v3) parent

    This is insufficient to fix pinning in comparison to existing solutions: https://petertodd.org/2023/v3-txs-pinning-vulnerability

    For example, at the moment the transaction fee required to get into the next block is about 23sat/vB, while the minimum relay fee of a typical mempool is 20sat/vB. So an attacker who simply did a straightforward pinning attack on an ephemeral anchor spend could force the defender to spend an additional:

    20sat/vB * 1000vB = 20,000sat = $8 USD
    

    just to get their transaction mined, at almost no cost to themselves. We need a better solution than that. One obvious one would be to make the 1000vB limit much smaller, eg the same size as a 2-in, 1 out, anchor spend. A Replace-by-Fee-Rate carveout for this specific case is another potential solution. (and as @ariard mentions, replace-by-fee-rate in general is a solution; IIUC one-shot replace-by-fee-rate will be quite a bit easier to implement with the cluster mempool work)
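
    The arithmetic above can be reproduced with a short calculation. This is a sketch: the function names are made up for illustration, and the exchange rate of 40,000 USD per BTC is an assumption inferred from the "$8 USD" figure, not something stated in the comment. The 500 vB value is an arbitrary smaller limit used only to show that the cost scales linearly with the size bound.

```python
SATS_PER_BTC = 100_000_000


def extra_pinning_cost_sat(pin_feerate_sat_vb: int, max_child_vsize_vb: int) -> int:
    """Extra fee a defender pays to replace a pinning child of maximal size."""
    return pin_feerate_sat_vb * max_child_vsize_vb


def sat_to_usd(amount_sat: int, usd_per_btc: float) -> float:
    """Convert satoshis to USD at an assumed exchange rate."""
    return amount_sat * usd_per_btc / SATS_PER_BTC


cost_sat = extra_pinning_cost_sat(20, 1000)      # 20 sat/vB * 1000 vB
cost_usd = sat_to_usd(cost_sat, 40_000)          # assumed price, see above
smaller_limit = extra_pinning_cost_sat(20, 500)  # arbitrary smaller size bound
print(cost_sat, cost_usd, smaller_limit)         # prints: 20000 8.0 10000
```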

  8. petertodd commented at 0:14 am on January 26, 2024: contributor

    @ariard

    however the dynamic N replace-by-feerate window might be a mess for miners mempools.

    Can you give a bit more detail on what challenges you think that’ll pose?

  9. sdaftuar commented at 0:15 am on January 26, 2024: member
    @ariard @petertodd The discussion you are having is not related to this topic, can you please take it to another thread?
  10. petertodd commented at 0:25 am on January 26, 2024: contributor

    @sdaftuar As you said, “Opening an issue for high-level discussion”.

    Whether or not V3 achieves its goals is definitely a high level discussion that needs to be resolved here. I showed that V3 does not, as attackers can still cause defenders to pay significant amounts of money in response to pinning attacks, and that the way we intend to use V3 with ephemeral anchors is quite possibly worse than the status quo as anyone can be the attacker. I also provided a simple fix, and some less simple fixes. @ariard has also been discussing similar high-level design considerations separately.

    These are all items appropriate for high-level discussion to make sure the design of cluster mempool/V3 is correct and worth implementing (and in this case, I gave an argument that cluster mempool may be worthwhile without V3).

  11. sdaftuar commented at 0:41 am on January 26, 2024: member

    This topic is about the CPFP carveout rule, which cannot be supported in the cluster mempool design. My goal is to establish whether a particular set of policy rules would be a suitable replacement for carveout. You and ariard are discussing pinning more generally, and I infer from your prior comments that you think CPFP is a mistaken idea to begin with. So I assume that means you don’t believe that dropping the CPFP carveout rule should be a big deal either, is that a fair statement?

    That is fine with me, and I’ll take it as a concept ACK from you that you don’t think the cluster mempool efforts need to be gated on worrying about the effects of CPFP carveout. However, I believe that there are users of the CPFP carveout policy that the software supports today, and I’d like to hear from them about this topic before proposing that we drop support.

    Take detailed discussions of pinning vectors and potential solutions elsewhere.

  12. TheBlueMatt commented at 2:30 am on January 26, 2024: contributor

    However, doing so would eliminate the ability to batch-CPFP several unconfirmed commitment transactions at once (although this isn’t reliable anyway today, since the carveout protections don’t apply to this case, but perhaps this sometimes works fine and is more efficient).

    I believe this is something that needs to be addressed at some point in v3 transaction, but because, as you note, it is not reliable today, waiting to address it until later seems acceptable. The only requirement here, then, would be that any v3-auto-tx-type-match logic allow standard rules for transactions which otherwise match but which have in-mempool ancestors.

    It’s not clear to me if that final step – of imbuing transactions with v3 policy, rather than requiring explicit opt-in – is needed before we move further forward with cluster mempool

    If the outcome is to break existing software’s use of the carve-out, then I think it’s prudent to wait until that software has updated and channels have had a chance to migrate before moving forward. There’s plenty of problems with lightning, but making anchor channel confirmation even less reliable (even in the honest case) seems like a bad outcome. Admittedly these cases are likely very rare (it requires one peer broadcasting then going offline while the other peer tries to get things confirmed), but certainly not impossible.

    or even if doing so would be acceptable to the broader community, but I think we could entertain that idea as a way to assist the LN project with migration and to decouple progress in this project from upgrade timelines in that one.

    It certainly seems like the simplest way forward, especially if we have to figure out some 0conf solution for v3 transactions, which could further delay any lightning switchover.

  13. glozow commented at 9:02 am on January 26, 2024: member

    However, doing so would eliminate the ability to batch-CPFP several unconfirmed commitment transactions at once (although this isn’t reliable anyway today, since the carveout protections don’t apply to this case, but perhaps this sometimes works fine and is more efficient).

    I believe this is something that needs to be addressed at some point in v3 transaction, but because, as you note, it is not reliable today, waiting to address it until later seems acceptable. The only requirement here, then, would be that any v3-auto-tx-type-match logic allow standard rules for transactions which otherwise match but which have in-mempool ancestors.

    I think that pattern-matching / imbuing v3 is doable (that sounds like a concept ack for v3?). Assuming sibling eviction in v3, so far, it seems like it applies nicely to existing commitment transactions without any modifications. On the LN side, the requirement is no other unconfirmed inputs in the child, implying no batching.

    IIRC during the meeting, people said that LND (cc @roasbeef) are the only ones who do batching?

    Supporting batched bumping is definitely a goal, and was originally part of the v3 design until we realized it created a pinning problem and couldn’t support it.

    Since we’re on the topic of not breaking the usage of past features, though, batched CPFP and CPFP Carve Out have always been mutually exclusive. The extra descendant is given an ancestor limit of 2, which means it cannot be CPFPing more than one parent (see code and proposal). That suggests that there can’t be any transactions that are relying on both features simultaneously, so supporting such transactions shouldn’t be a blocker for cluster mempool. (^I know I don’t have to explain this to you but wanted to write this explanation somewhere).
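
    The mutual exclusivity can be seen from the ancestor count alone. The sketch below assumes, consistent with the statement above, that the limit of 2 counts the transaction itself plus its unconfirmed ancestors; `carveout_eligible` is a hypothetical name, and the other carve-out conditions (such as the size bound on the extra descendant) are not modelled.

```python
CARVEOUT_ANCESTOR_LIMIT = 2  # from the comment above; assumed to include the tx itself


def carveout_eligible(num_unconfirmed_ancestors: int) -> bool:
    """Sketch: does the ancestor count alone allow the carve-out to apply?"""
    return 1 + num_unconfirmed_ancestors <= CARVEOUT_ANCESTOR_LIMIT


single_parent_cpfp = carveout_eligible(1)  # child bumping one commitment tx
batched_cpfp = carveout_eligible(2)        # child bumping two commitment txs
print(single_parent_cpfp, batched_cpfp)    # prints: True False
```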

  14. TheBlueMatt commented at 8:12 pm on January 26, 2024: contributor

    I think that pattern-matching / imbuing v3 is doable (that sounds like a concept ack for v3?). Assuming sibling eviction in v3, so far, it seems like it applies nicely to existing commitment transactions without any modifications. On the LN side, the requirement is no other unconfirmed inputs in the child, implying no batching.

    Yes, though as I mention above I do think we may need to figure out a few additional specific details around v3 (eg how do we handle 0conf ln channels? I need to make a delving post to discuss this) before it’s a clear answer. Of course a v3 that simply was implicitly not v3 if it has unconfirmed ancestors would also address this, as long as we can expand the scope later (and presumably lightning may not use v3 until that point).

    IIRC during the meeting, people said that LND (cc @Roasbeef) are the only ones who do batching?

    That’s my understanding. I believe our API allows it, but it’d take some work on downstream users to build it, mostly we just need to update our docs to explicitly forbid it.

  15. t-bast commented at 11:43 am on January 29, 2024: contributor

    (eg how do we handle 0conf ln channels? I need to make a delving post to discuss this)

    I’d be curious to see your delvingbitcoin post around that, as I believe this shouldn’t be too much of an issue (but let’s discuss it on a dedicated post instead of here).

    I think that pattern-matching / imbuing v3 is doable (that sounds like a concept ack for v3?). Assuming sibling eviction in v3, so far, it seems like it applies nicely to existing commitment transactions without any modifications. On the LN side, the requirement is no other unconfirmed inputs in the child, implying no batching.

    Concept ACK on our side here, we don’t have any code that would break due to the implicit enrolling of existing commitment transactions to v3 (we don’t batch anchor-spends for both security and simplicity reasons).

    Supporting batched bumping is definitely a goal, and was originally part of the v3 design until we realized (#25038 (comment)) it created a pinning problem and couldn’t support it.

    I’m wondering if that would really be hard to support? It seems to me that sibling eviction rules could efficiently handle the 1-child-multiple-parent cases, but I may be missing something. I thought that the rules could be:

    • a v3 unconfirmed transaction may have multiple unconfirmed v3 parents (but is limited by the 1000 vbytes threshold)
    • a v3 unconfirmed transaction may have at most one unconfirmed child
    • a v3 transaction that has unconfirmed parents cannot have unconfirmed children

    Those rules ensure that we have at most two generations of unconfirmed v3 txs in a package. When evaluating sibling eviction, it’s thus easy to figure out the replaced package, it only consists of the sibling with all of its unconfirmed parents. We can then apply package RBF rules (not the current RBF rules though) to efficiently replace this, and it should be easier to evaluate whether it’s more incentive compatible?
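
    The three rules suggested above, and the "replaced package" they imply, could be sketched as follows. This is only an illustration of the idea as described in this comment: the dictionaries mapping txids to unconfirmed parents and children, and both function names, are hypothetical.

```python
V3_CHILD_MAX_VSIZE = 1000  # vbytes, the threshold mentioned in the first rule


def violations(txid, parents, children, vsize):
    """List which of the sketched two-generation rules a v3 transaction breaks."""
    problems = []
    has_parents = bool(parents.get(txid))
    kids = children.get(txid, set())
    if len(kids) > 1:
        problems.append("more than one unconfirmed child")
    if has_parents and kids:
        problems.append("has both unconfirmed parents and unconfirmed children")
    if has_parents and vsize[txid] > V3_CHILD_MAX_VSIZE:
        problems.append("child exceeds 1000 vbytes")
    return problems


def replaced_package(sibling, parents):
    """The package considered for eviction: the sibling plus all its unconfirmed parents."""
    return {sibling} | set(parents.get(sibling, set()))


# One child "c" bumping two unconfirmed parents "p1" and "p2".
parents = {"c": {"p1", "p2"}}
children = {"p1": {"c"}, "p2": {"c"}}
vsize = {"p1": 300, "p2": 300, "c": 900}
```

    With this data, `violations("c", parents, children, vsize)` is empty and `replaced_package("c", parents)` contains `c`, `p1` and `p2`, which is the set a sibling spending `p1` would have to replace.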

    It does require different package RBF rules than BIP 125 though:

    • we mustn’t apply BIP 125 rules 2 and 3
    • rule 4 should probably be replaced by an increase in total package feerate

    This is probably much more complex than the current sibling eviction proposal, and I may be missing important details as well, so feel free to ignore!

  16. sdaftuar commented at 7:59 pm on January 29, 2024: member

    Thanks @t-bast and @TheBlueMatt for chiming in here.

    It does require different package RBF rules than BIP 125 though:

    • we mustn’t apply BIP 125 rules 2 and 3

    Relaxing rule 3 is the main issue, in my view, as it stems from concerns around both incentive compatibility (which I explore in my cluster mempool post here) and anti-DoS protections against free relay. At any rate I’m not aware of any proposals that are fully-baked enough to point to as part of a currently foreseeable path forward (and we can take discussion of any particular proposals that people are interested in analyzing to another thread).

    @TheBlueMatt @t-bast Since you both seem to be on-board with the idea of imbuing commitment transactions with v3 semantics, can you advise on where you think we should start with identifying such transactions? What I have understood so far is that we want to only consider transactions with no in-mempool ancestors, which have exactly two 330-satoshi outputs (did I get that number right?). Do you have more criteria we should be looking at? It seems like a good next step would be for someone to code up something that tries to capture what you are talking about, and try to verify on historical data that it wouldn’t catch anything it shouldn’t.

  17. TheBlueMatt commented at 10:14 pm on January 29, 2024: contributor
    Looking at my code, I see version: 2, lock_time: (0x20 << 8*3) | (random_garbage & 0xffffff), one segwit non-wrapped input with sequence: (0x80 << 8*3) | (random_garbage & 0xffffff), outputs sorted first by value, then by script_pubkey. There should always be at least one (sometimes two or more) outputs of exactly 330 sats and all outputs should be P2WSH. There may be one or two things I didn’t catch, but that’s pretty specific, I imagine.
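
    A pattern matcher built from the criteria listed above might look like the following sketch. The dataclasses and `looks_like_commitment` are hypothetical stand-ins for a real transaction type. Two points are assumptions: the output ordering is taken to be ascending, and P2WSH is recognised as the 34-byte script `OP_0 <32-byte hash>`.

```python
from dataclasses import dataclass
from typing import List

ANCHOR_VALUE_SAT = 330


@dataclass
class TxIn:
    script_sig: bytes
    sequence: int
    has_witness: bool


@dataclass
class TxOut:
    value: int
    script_pubkey: bytes


@dataclass
class Transaction:
    version: int
    lock_time: int
    inputs: List[TxIn]
    outputs: List[TxOut]


def is_p2wsh(script_pubkey: bytes) -> bool:
    """OP_0 followed by a 32-byte push."""
    return len(script_pubkey) == 34 and script_pubkey[0] == 0x00 and script_pubkey[1] == 0x20


def looks_like_commitment(tx: Transaction) -> bool:
    if tx.version != 2 or len(tx.inputs) != 1:
        return False
    if tx.lock_time >> 24 != 0x20:      # upper byte of lock_time
        return False
    txin = tx.inputs[0]
    if txin.script_sig or not txin.has_witness:
        return False                    # must be a non-wrapped segwit input
    if txin.sequence >> 24 != 0x80:     # upper byte of sequence
        return False
    if not all(is_p2wsh(o.script_pubkey) for o in tx.outputs):
        return False
    if not any(o.value == ANCHOR_VALUE_SAT for o in tx.outputs):
        return False
    keys = [(o.value, o.script_pubkey) for o in tx.outputs]
    return keys == sorted(keys)         # sorted by value, then script_pubkey
```

    Whether such a matcher produces false positives would still need to be verified against historical data.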
  18. t-bast commented at 6:32 am on January 30, 2024: contributor

    Another approach that may be simpler is to imbue with v3 semantics only once one of the anchor output is being spent in the mempool (because if none are spent, there’s basically nothing to do). This way we can pattern match on the anchor output script which is very specific:

    OP_PUSHDATA(<some_public_key>) OP_CHECKSIG OP_IFDUP OP_NOTIF OP_16 OP_CHECKSEQUENCEVERIFY OP_ENDIF
    

    I think that this approach shouldn’t generate any false positive. Once such an anchor transaction is identified, we would imbue its parent from the input that matches that script with v3 semantics.
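
    Matching on that script could be done byte-for-byte on the witness script revealed by the anchor spend. The sketch below assumes the public key is a 33-byte compressed key pushed with a single-byte direct push (`0x21`). The function name is hypothetical, and extracting the witness script from the spending input is left out.

```python
# Standard Bitcoin Script opcode values.
OP_CHECKSIG = 0xAC
OP_IFDUP = 0x73
OP_NOTIF = 0x64
OP_16 = 0x60
OP_CHECKSEQUENCEVERIFY = 0xB2
OP_ENDIF = 0x68

ANCHOR_SUFFIX = bytes([OP_CHECKSIG, OP_IFDUP, OP_NOTIF, OP_16,
                       OP_CHECKSEQUENCEVERIFY, OP_ENDIF])


def is_anchor_witness_script(script: bytes) -> bool:
    """True if script is <33-byte pubkey> followed by the fixed anchor opcodes."""
    return (
        len(script) == 1 + 33 + len(ANCHOR_SUFFIX)
        and script[0] == 33               # direct push of 33 bytes (assumption)
        and script[1] in (0x02, 0x03)     # compressed public key prefix
        and script[34:] == ANCHOR_SUFFIX
    )
```

    Once an input's witness script matches, the transaction that created the spent output would be the one imbued with v3 semantics, as described above.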

  19. petertodd commented at 9:07 am on January 30, 2024: contributor

    @sdaftuar

    So I assume that means you don’t believe that dropping the CPFP carveout rule should be a big deal either, is that a fair statement?

    No, that’s not a fair statement. While I have my reservations about CPFP, the fact is we’ve shipped a lot of Lightning implementations that rely on the CPFP carveout for safety and we have to continue to provide that functionality for the foreseeable future. Indeed, this complexity is one reason why I’m dubious about rushing into the half-baked V3 proposal, as the complexity of it will simply have to live on in mempool policy for a long time to come if we adopt it.

    See: https://petertodd.org/2023/v3-transactions-review#recommendations

    I did not suggest that we get rid of CPFP in my V3 transactions review. Even though I recommended against shipping V3, I even said that we should try to ship package relay to make existing anchor/CPFP-using Lightning implementations less broken.

  20. instagibbs commented at 1:18 pm on January 30, 2024: member

    I think that this approach shouldn’t generate any false positive. Once such an anchor transaction is identified, we would imbue its parent from the input that matches that script with v3 semantics.

    Reminder we probably have to be aware of simple taproot channels, or we should just move fast enough to get CPFP carveout replaced via v3+sibling eviction, then only focus on imbuing segwitv0 channels? We’ll likely need @Roasbeef to weigh in

  21. bitcoin deleted a comment on Jan 30, 2024
  22. ariard commented at 9:28 pm on February 1, 2024: member

    @petertodd

    Can you give a bit more detail on what challenges you think that’ll pose?

    from my memory: “How this new replacement rule would behave if you have a parent in the “replace-by-feerate” half but the child is in the “replace-by-fee” one ?” see the conversation here: https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2022-January/019839.html in past conversations we have assumed a “static” N blocks worth of mempool, unclear to me with your proposal if dynamic. i wonder about threshold effect and it could be exploited by pinning adversaries

    @sdaftuar

    “This topic is about the CPFP carveout rule, which cannot be supported in the cluster mempool design. My goal is to establish whether a particular set of policy rules would be a suitable replacement for carveout.”

    i don’t understand why removing the carve out has to be technically linked with v3 deployment. you can break carve-out today by broadcasting one of the 2 latest valid commitment tx in lightning full node mempool. and the other state on the tx-relay network. end of the game. so it can be removed in my mind without talking about v3.

    srly which LN implementation has implemented “blind” broadcast on counterparty state ? can’t remember LDK doing it.

    and that in the meantime we should simply use some other unique markers to identify those commitment transactions,

    side-note, a generic mechanism in core to retro-actively apply policy rules semantics to pre-signed states would be great for emergency security deployment, where interactivity with lightning counterparty can’t be assumed. gave thoughts on it a year ago on one of the V3 PR: #25038 (comment)

  23. petertodd commented at 6:52 am on February 2, 2024: contributor

    @petertodd

    Can you give a bit more detail on what challenges you think that’ll pose?

    from my memory: “How this new replacement rule would behave if you have a parent in the “replace-by-feerate” half but the child is in the “replace-by-fee” one ?” see the conversation here: https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2022-January/019839.html in past conversations we have assumed a “static” N blocks worth of mempool, unclear to me with your proposal if dynamic. i wonder about threshold effect and it could be exploited by pinning adversaries

    My proposal is dynamic, as I mention in the Pure Replace-By-Fee-Rate section: “In a rising fee-rate environment, the one-shot policy may degrade to pure replace-by-fee-rate.”

    Not sure what you mean by “threshold effect”. So long as the N block threshold isn’t too large, transactions that don’t quite reach the threshold have a decent chance of getting mined eventually. Similarly, since the threshold is dynamic, it’s likely that the threshold will reduce at some point, so that a fee-bump will put it over the threshold.

    I’ve already implemented one-shot replace-by-fee-rate in a sense with the pure-replace-by-fee-rate implementation in Libre Relay: it has N=0, so replacement is always possible. That might not always be strictly incentive compatible for miners. But it’s not that far off. I’d estimate roughly 30 nodes are running it at the moment; hopefully I’ll convince a miner to run it too once more people get confident that it doesn’t introduce new, effective, DoS attacks.

    “This topic is about the CPFP carveout rule, which cannot be supported in the cluster mempool design. My goal is to establish whether a particular set of policy rules would be a suitable replacement for carveout.”

    i don’t understand why removing the carve out has to be technically linked with v3 deployment. you can break carve-out today by broadcasting one of the 2 latest valid commitment tx in lightning full node mempool. and the other state on the tx-relay network. end of the game. so it can be removed in my mind without talking about v3.

    To be clear, you’re assuming the attacker is connected to the LN node’s mempool directly, correct?

    That’s an obvious security risk. I know for a fact that many large LN nodes do not make their Bitcoin nodes publicly accessible to avoid attacks like this (of course, many don’t even make their LN nodes publicly accessible).

  24. glozow commented at 10:32 am on February 2, 2024: member

    i don’t understand why removing the carve out has to be technically linked with v3 deployment. you can break carve-out today by broadcasting one of the 2 latest valid commitment tx in lightning full node mempool. and the other state on the tx-relay network. end of the game. so it can be removed in my mind without talking about v3.

    It looks like removal of CPFP carve out is not a concern for you; you just don’t think v3 is a good idea. If that’s the case, there is no need to post general criticisms of v3 here. See #28948 description for a list of threads dedicated to discussing v3.

    My proposal is dynamic, as I mention in the Pure Replace-By-Fee-Rate section: “In a rising fee-rate environment, the one-shot policy may degrade to pure replace-by-fee-rate.”

    Unless your alternative proposal is relevant to CPFP carve out removal and/or cluster mempool, it is off topic here. Again, there are separate threads that are more appropriate.

  25. ariard commented at 2:52 am on February 5, 2024: member

    Not sure what you mean by “threshold effect”. So long as the N block threshold isn’t too large, transactions that don’t quite reach the threshold have a decent chance of getting mined eventually. Similarly, since the threshold is dynamic, it’s likely that the threshold will reduce at some point, so that a fee-bump will put it over the threshold.

    My understanding of the proposal is the following. We introduce a dynamic N block threshold, where everything inside the threshold is replace-by-feerate, everything under is replace-by-fee as of today. The “dynamism” trigger is scheduled on the mempool congestion rate (?) yet to be defined.

    I think you might have threshold effect where the parent commitment tx is in the replace-by-fee mempool group segment and a high-fee CPFP arrives qualifying the whole package to be reconsidered in the replace-by-feerate mempool group segment. Reconsideration might be computationally expensive. Interested to have a look on the implementation in Libre Relay if already available.

    To be clear, you’re assuming the attacker is connected to the LN node’s mempool directly, correct?

    You can trigger an off-chain force-close and have the honest node auto-partitioning itself. So not necessarily.

    That’s an obvious security risk. I know for a fact that many large LN nodes do not make their Bitcoin nodes publicly accessible to avoid attacks like this (of course, many don’t even make their LN nodes publicly accessible).

    If you’re running a malicious reachable public node and non-listening Bitcoin nodes connect to you for tx-relay, it’s easy to cross-layer map them (cf. time-dilation attack paper for more on cross-layer mapping).

    Here is a test illustrating a network-topology-aware (NTA) pinning https://github.com/ariard/bitcoin/commit/84e12b87b29637a26e8a9bb4c494ffc7206d3777

    Those types of advanced pinning are low-key known among some devs since https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2020-June/018011.html

    edited: to answer PT on the negative on the lack of necessary tx-relay connection for NTA pinning

  26. ariard commented at 3:11 am on February 5, 2024: member

    It looks like removal of CPFP carve out is not a concern for you; you just don’t think v3 is a good idea. If that’s the case, there is no need to post general criticisms of v3 here. See #28948 description for a list of threads dedicated to discussing v3.

    I would appreciate a gentler online communication tone from someone with whom I spent time sharing a lot of information (cf. my old gist “Mitigating Tx-Relay Jamming for Time-Sensitive Contract Protocols”) on all those issues circa 2020 / 2021. Thanks.

    Unless your alternative proposal is relevant to CPFP carve out removal and/or cluster mempool, it is off topic here. Again, there are separate threads that are more appropriate.

    Yes there are zero-to-minimal security benefits provided by the CPFP carve out to LN implementations, as an adversary can nullify the carve-out with NTA pinnings - today. And I’ll bet that half of the LN implementations have not implemented broadcast-CPFP-on-remote counterparty commitment transaction.

    In my reasonable opinion, the CPFP carve-out can be deprecated now without the necessity of introducing V3 transaction policy as a preliminary step. I think all non-lightning use-cases can upgrade their own protocol transaction format in the same way as lightning to introduce a single CPFP with two script path branches.

  27. sdaftuar commented at 4:48 pm on February 7, 2024: member
    Using a datalogging and replaying system that I have, I did some analysis of what effect the idea of imbuing LN commitment transaction spends with v3 semantics would have had in 2023. My writeup of that is here: https://delvingbitcoin.org/t/analysis-of-attempting-to-imbue-ln-commitment-transaction-spends-with-v3-semantics/527
  28. ariard commented at 10:54 pm on February 8, 2024: member

    So carve-out was introduced back by #15681 with motivation to allow one extra single ancestor tx. After testing https://github.com/bitcoin/bitcoin/commit/6337b978e77a4e1f93bb009958db7c9a619323df you can always replace a long-chain of transactions with a higher-fee candidate. Carve-out is already broken today due to NTA pinning. LN devs can upgrade to have a single output on each commitment with one branch for each counterparty. At least this should conserve the same level of robustness for non-adversarial env (synchronous broadcast of commitments)

    Using a datalogging and replaying system that I have, I did some analysis of what effect the idea of imbuing LN commitment transaction spends with v3 semantics would have had in 2023.

    Unclear to me what is the hypothesis this analysis aims to corroborate or invalidate. I’m understanding it as “assume all v3 semantics are opt-in by all LN commitment transaction owners and see success/failure”. For a high number of batch CPFP spending LN commitment transactions, it’s already unsafe (depends on `max_accepted_htlcs`). So it can be okay to be more restrictive actually in the deployment of any imbuance mechanism for LN transactions. (not endorsing v3 - just pointing out that some analyzed LN traffic sounds intrinsically unsafe at first sight).

  29. sdaftuar commented at 11:35 pm on February 8, 2024: member

    @ariard I agree that this would be simpler if there is just 1 anchor output (see also @instagibbs’ proposal for ephemeral anchors, fwiw). But (a) that is not up to this project to decide; (b) you still need some kind of pinning solution if a too-large spend is created (which v3 provides – while no other workable proposals have been put forward); and (c) even then it sounds like it would take LN some time to coordinate an upgrade to a 1-anchor system (whether EA or 2 branches of a single script, or whatever else could be on the table).

    As a developer on this project, I think we should generally strive to not break downstream users’ applications unnecessarily. I also don’t want this project to be blocked waiting on users to upgrade to a better way of doing things; the sense I got from the LN spec developers is that coordinated upgrades take time, which seems reasonable to me on its face.

    I believe that we have work in progress to mitigate many (most?) of the pinning scenarios that have been discussed in recent years. However it is a significant amount of work, and each piece requires careful review – so it is a challenge to present a complete solution all at once (though @glozow and @instagibbs have both done great work to try to demonstrate large pieces of these ideas at various times). Pinning is also not the only thing we are trying to solve – in my view it’s far from the most important problem we face – so some pieces of the pinning solution should rightfully be blocked on us solving bigger problems, such as making RBF incentive compatible, and making mining and eviction use the same transaction score metric (so that we don’t evict things we might want to mine).

    Overall, I think at some point we need to pick a path forward and get it done. The steps I’ve described above (v3, v3-sibling-eviction, and imbuing LN commitment transaction spends with v3 semantics) are close to ready at this point and would immediately address the pinning problem that carveout currently solves, without having to wait for LN developers to upgrade their systems. If we don’t go down that path, then we’re either breaking users or blocked until they upgrade their spec – both bad options in my view.

    If you have alternate work to share to address the variety of problems that have been discussed, please feel free to open PRs and try to get review for an alternative roadmap.

  30. ariard commented at 6:31 pm on February 22, 2024: member

    @sdaftuar While I share your opinion that we should not break downstream users’ applications unnecessarily, I think you’re missing my present observation on the lack of current benefit of the 2 anchor outputs on LN commitment transactions. As my test just above demonstrates, in an adversarial scenario, the ability to CPFP a counterparty commitment transaction can be neutralized by broadcasting one of the 2 commitment states in the target mempool. In a hazardous scenario (e.g. concurrent unilateral force-closure), network mempools might be partitioned by the propagation of the asymmetric states, and the fee-bumping CPFP can’t propagate as expected. As said above, the carve-out mechanism has been broken since its initial implementation in 2019, as its security benefits were not analyzed correctly at that time; to be fair, we’re far more knowledgeable in this area now than we used to be.

    Beyond that, I’m very doubtful that broadcasting a CPFP on a counterparty commitment transaction is currently implemented by all LN implementations; at the very least I can’t remember it being done by LDK or CLN. It might be implemented by Eclair or LND, yet I would again be very doubtful that they broadcast a CPFP on the latest 2 valid counterparty commitment transactions (allowed during the CS - RAA phase). So I believe that the current carve-out mechanism can be deprecated today, as it’s useless without package relay and it’s not even implemented and deployed correctly by LN implementations. They can upgrade directly to one of the alternatives (e.g. 2 branches of a single script).

    On the pinning mitigations, it turns out v3 is very far from addressing all the pinning scenarios known for years (e.g. NTA pinning, originally described in 2020), and it still offers significant exploitation areas with the currently selected parameters and the child-coverage-only approach. It is very uncertain that constraining only the child actually solves long-term pinning, and that an approach like replace-by-feerate isn’t more accurate to solve the NTA-like class of pinning, as it removes package malleability. On the EA idea itself, it turns out it might badly alter miners’ incentives, as its own author has recognized himself. We’re doing safety-first engineering, and going for a short-sighted solution only makes the problem worse in the long term, as we’re increasing the re-deployment cost of any newer and more efficient pinning mitigation.

    I do share the sentiment that we shouldn’t be rightfully blocked on coming to consensus on a pinning solution and be rightfully blocked on solving bigger problems, and yes I recognize that RBF incentive compatibility is a serious issue, as we all figured out during the initial deployment of mempoolfullrbf. However, I have yet to understand why solving pinning and deploying cluster mempool have to be linked together. In my mind they’re orthogonal and there is no need to commit and “package” them on a single roadmap. If there is any substantial technical reason why they should be linked together, please clarify.

    Calling to “ship-forward” and get stuff done, as its own intrinsic justification, appears in my eyes as a sign of “process fatigue”. When such a phenomenon occurs, we should take time to build a stronger technical consensus, an engineering mindset you have yourself advocated in the past (cf. Taproot activation).

    The path forward in my opinion is still to have LN developers upgrade their systems in a non-coordinated fashion on the question of the carve-out output, while we can still make progress in parallel on solving RBF incentive compatibility. In the present state of deployed codebases, there is no need to “point-lock” the issues.

  31. Roasbeef commented at 8:42 pm on February 27, 2024: none

    Beyond that, I’m very doubtful that broadcasting a CPFP on a counterparty commitment transaction is currently implemented by all LN implementations; at the very least I can’t remember it being done by LDK or CLN.

    lnd will attempt to sweep all 3 possible anchors (our local, remote commitment, remote pending commitment) here: https://github.com/lightningnetwork/lnd/blob/48a3d5633560d0f38c69e69b3a4bf520545c8590/contractcourt/channel_arbitrator.go#L1383-L1409. We’ll continue to offer those sweeps into the central input aggregation sub-system (the Sweeper) until one of them is spent or the commitment transaction confirms by itself. Each anchor is offered with a distinct exclusive group so they won’t be batched in the same transaction (would ofc be invalid).

    Today we configure a deadline based on the soonest expiring outgoing HTLC, and then will retarget based on only that deadline (so continually request a 15 block conf or w/e). In the next release, we’re planning on revamping this to be more directly aware of elapsed time, then using a fee bumping function to progressively increase the fee we offer until ultimate confirmation.
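
    As a rough illustration of what a deadline-aware fee bumping function could look like, here is a minimal sketch. The linear ramp, the function name and all parameters are assumptions made for illustration; nothing here is taken from lnd's code.

```python
def bumped_feerate(start_rate: float, max_rate: float,
                   blocks_elapsed: int, deadline_blocks: int) -> float:
    """Linearly ramp the offered feerate (sat/vB) from start_rate to max_rate.

    Hypothetical helper: the shape of the curve is an assumption, not lnd's.
    """
    if deadline_blocks <= 0:
        raise ValueError("deadline_blocks must be positive")
    # Clamp progress to [0, 1] so the offer never exceeds the budgeted maximum.
    progress = min(max(blocks_elapsed / deadline_blocks, 0.0), 1.0)
    return start_rate + (max_rate - start_rate) * progress
```

    With a 10-block deadline and a 10 to 110 sat/vB budget, this would offer 10 sat/vB at the start, 60 sat/vB halfway through, and stay at 110 sat/vB once the deadline has passed.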

    IIRC during the meeting, people said that LND (cc @Roasbeef) are the only ones who do batching?

    Yes today we’ll batch all inputs that need to be swept into a single mega transaction on a best effort basis. As a prep for some of these upcoming changes, we’ve started to abstract out the concept of input aggregation so we’re able to implement more flexible policies to allow the anchors to be swept by themselves.

    With our next release, we also start to use testmempoolaccept before publishing to attempt to cut down on wasted publish attempts due to violating the RBF rules or otherwise.
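
    A minimal sketch of that pre-publish check, written in Python rather than lnd's Go. It only builds the JSON-RPC request body for Bitcoin Core's `testmempoolaccept` and interprets a result array; the helper names are hypothetical and the transport (HTTP, authentication) is left out on purpose.

```python
import json


def build_request(raw_tx_hex: str, request_id: int = 0) -> str:
    """Build the JSON-RPC body for testmempoolaccept with a single raw tx."""
    return json.dumps({
        "jsonrpc": "1.0",
        "id": request_id,
        "method": "testmempoolaccept",
        "params": [[raw_tx_hex]],  # first parameter is an array of raw txs
    })


def should_publish(rpc_result: list) -> tuple:
    """Return (ok, reason) from the result array of testmempoolaccept."""
    entry = rpc_result[0]
    if entry.get("allowed"):
        return True, ""
    # "reject-reason" is present when the transaction was not allowed.
    return False, entry.get("reject-reason", "unknown")
```

    A caller would skip the broadcast, and possibly re-bump the fee, when `should_publish` returns `False`.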

  32. sdaftuar commented at 10:51 pm on February 27, 2024: member

    @roasbeef Thanks for commenting here. It sounds like in the not-too-distant future, the LND project would be ok with the v3 transaction policy + sibling eviction plan that I’ve laid out above, as a replacement for CPFP carveout?

    Just to update this thread – #28948 (with the v3 policy rules) was merged, but without v3 transactions being made standard yet (as we wanted to leave room for further feedback on whether the 1000 vbyte size limit on children is a reasonable number to pick). At this point, in the absence of any such additional feedback, I think we should just leave it as-is and move on.

    So as far as timeline/roadmap goes, I think if all goes well, then conceivably in the 28.0 release (~7 months from now) we will likely see in Bitcoin Core:

    • v3 sibling eviction (#29306, this is almost ready for merge)
    • v3 transactions made standard (#29496)

    And my hope is that we may also have cluster mempool in that release (draft is in #28676 for concept purposes only), but that is a big project so whether that makes it in to 28.0 or some later release is unknowable right now.

    Whenever we get to the point that we are ready to merge cluster mempool, the question will be whether the LN folks want the “imbued v3” behavior merged as well, so that LN commitment transactions pick up the v3 semantics without explicit changes to the transactions. I think the advantage of doing so would be that rollout of cluster mempool could happen without first having to deploy v3 as standard for a release or two (to let the network upgrade so that v3 transactions will propagate), because old software would relay nVersion=2 LN commitment transactions and apply the existing carveout rule, while new software would give the v3 semantics (limited child size/topology, plus sibling eviction) to ensure that fee bumping CPFP’s of commitment transactions will be relayed, with limited pinning potential by an adversarial counterparty.
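
    To make the “limited child size/topology” part of the v3 semantics concrete, here is a minimal sketch of the rules as listed at the top of this issue. The `Tx` type and `check_v3` helper are hypothetical, this is not Bitcoin Core code, and sibling eviction is deliberately not modeled (a second child is simply rejected).

```python
from dataclasses import dataclass, field
from typing import List, Optional

V3_MAX_CHILD_VSIZE = 1000  # vbytes; the child size limit discussed in this thread


@dataclass
class Tx:  # hypothetical mempool entry, not a Bitcoin Core type
    txid: str
    version: int
    vsize: int
    parents: List["Tx"] = field(default_factory=list)   # unconfirmed parents
    children: List["Tx"] = field(default_factory=list)  # children already in the mempool


def check_v3(tx: Tx) -> Optional[str]:
    """Return a rejection reason, or None if the candidate tx passes."""
    # v3 and non-v3 transactions may not be unconfirmed parent/child of each other.
    for parent in tx.parents:
        if (parent.version == 3) != (tx.version == 3):
            return "v3 and non-v3 cannot be unconfirmed relatives"
    if tx.version != 3 or not tx.parents:
        return None
    # Cluster size 2 means exactly one parent that has no other relatives.
    if len(tx.parents) > 1:
        return "more than one unconfirmed parent"
    parent = tx.parents[0]
    if parent.parents or parent.children:
        return "cluster would exceed two transactions"
    if tx.vsize > V3_MAX_CHILD_VSIZE:
        return "v3 child larger than 1000 vbytes"
    return None
```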

    However, as the imbued v3 patch is of course a giant hack and has downstream impact on LN wallets, the LN maintainers whose applications would be affected should explicitly ACK that idea before we consider merging it. A draft PR that shows what the code could look like is in #29427.

    Of course there’s also a very good chance that getting cluster mempool finished and reviewed in time for 28.0 is too ambitious, in which case we could always revisit the timeline and decide whether imbued v3 would be needed for a smooth rollout, or if LN will opt-in to the v3 rules explicitly before cluster mempool would be deployed.

  33. t-bast commented at 10:44 am on February 28, 2024: contributor

    However, as the imbued v3 patch is of course a giant hack and has downstream impact on LN wallets, the LN maintainers whose applications would be affected should explicitly ACK that idea before we consider merging it. A draft PR that shows what the code could look like is in #29427.

    I’m not a fan of merging hacky, temporary code into bitcoin core to be honest. Since LN implementations are not fully leveraging the carve-out rule anyway (because we can end up in a state where our mempool contains our local commitment transaction, but our peers have the remote commitment transaction with a chain of descendants to pin it, in which case we have no way of broadcasting our spend of the anchor output on the remote transaction), we’re already exposed to that risk today. We can’t easily fix that without package relay anyway, so we can just accept the fact that this stays somewhat broken until we start using package relay? And thus we don’t really need the hacky v3 imbuing logic?

  34. sdaftuar commented at 2:36 pm on February 28, 2024: member

    We can’t easily fix that without package relay anyway, so we can just accept the fact that this stays somewhat broken until we start using package relay?

    I would also prefer to not merge the imbued v3 patch.

    In parallel with the PRs I’ve mentioned above, there are PRs open to implement 1-parent-1-child relay (#28970), as well as cluster-size-2-package-rbf (#29242 + #28984), which I think is most of what you’re referring to. However, note these limitations:

    • Until cluster mempool is in, the limitation on package RBF to cluster-size-2 conflicts will mean that in adversarial conditions, an attacker using non-v3 commitment transactions will be able to trivially prevent replacement of their commitment tx by just creating a topology that isn’t eligible for replacement under #28984.

    • After cluster mempool is in, I expect we’ll have more general package RBF, where a parent+child could try to evict a more complex topology of conflicts, but without v3 policy being applied to the conflict, pinning will likely be an issue (though at least you’ll have a chance to get a package replacement to take place).
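
    The first limitation can be sketched as follows. This is only one reading of the “cluster-size-2 conflicts” restriction, with hypothetical helper names; it is not code from #28984. A cluster is taken to be the connected component of unconfirmed transactions linked by parent/child spends.

```python
from collections import defaultdict


def cluster_of(txid, edges):
    """Return the set of txids connected to txid.

    edges: list of (parent_txid, child_txid) pairs for unconfirmed spends.
    """
    neighbours = defaultdict(set)
    for parent, child in edges:
        neighbours[parent].add(child)
        neighbours[child].add(parent)
    seen, stack = {txid}, [txid]
    while stack:
        for nxt in neighbours[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def eligible_for_package_rbf(conflicts, edges, max_cluster=2):
    """True if every conflicting tx sits in a cluster of at most max_cluster txs."""
    return all(len(cluster_of(c, edges)) <= max_cluster for c in conflicts)
```

    Under this reading, a counterparty who attaches a grandchild to their commitment transaction turns a cluster of 2 into a cluster of 3, and the replacement is no longer eligible.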

  35. t-bast commented at 3:52 pm on February 28, 2024: contributor
    That sounds good enough to me, that’s what I had in mind. I expect to benefit from improved security only once we move to v3 commitment txs. We will aggressively try to migrate channels to use v3 commitment transactions once we’ve confirmed that a large enough part of the bitcoin relay nodes support them.
  36. ariard commented at 1:53 am on March 4, 2024: member

    @Roasbeef

    lnd will attempt to sweep all 3 possible anchors (our local, remote commitment, remote pending commitment) here: https://github.com/lightningnetwork/lnd/blob/48a3d5633560d0f38c69e69b3a4bf520545c8590/contractcourt/channel_arbitrator.go#L1383-L1409. We’ll continue to offer those sweeps into the central input aggregation sub-system (the Sweeper) until one of them is spent or the commitment transaction confirms by itself. Each anchor is offered with a distinct exclusive group so they won’t be batched in the same transaction (would ofc be invalid).

    See my note in #29306

    “LND head commit 935e550: mempool watch interface (SubscribeMempoolSpent() in chainntnfs/bitcoindnotify/bitcoind.go) does not subscribe to spend of the funding outpoint in all the contractcourt/ logic, only for HTLC timeout resolution L952 in htlc_timeout_resolver.go.”

    I don’t see how your channel arbitrator logic is monitoring your local mempool, so if your anchors.Remote is replaced by anchors.RemotePending, I’m very unsure you’ll rebroadcast in consequence the correct CPFP transaction spending an anchor. Do you have test coverage for this case ?

    With our next release, we also start to use testmempoolaccept before publishing to attempt to cut down on wasted publish attempts due to violating the RBF rules or otherwise.

    Actually testmempoolaccept should be run when you accept the counterparty signature in commitment_signed; at time of publication, if it’s a counter-signed transaction like the commitment transaction, it’s already too late. We don’t dual-sign second-stage HTLC transactions, though in a post-Taproot future you might have to.

    Just to update this thread – #28948 (with the v3 policy rules) was merged, but without v3 transactions being made standard yet (as we wanted to leave room for further feedback on whether the 1000 vbyte size limit on children is a reasonable number to pick). At this point, in the absence of any such additional feedback, I think we should just leave it as-is and move on.

    Just for everyone’s awareness, v3 transactions were merged. Yet this leaves 2 exploitable pinning strategies:

    • “loophole” pinning documented here
    • NTA pinning tested here

    From my understanding, the reviewers of the v3 patchset that got merged don’t understand pinning, so I don’t know if they’re disagreeing on the holes or if they are only motivated by merging anti-pinning “security theater” which leaves exploitable vectors exposing the LN ecosystem. Beyond that, this only increases the deployment cost of any future real pinning mitigations, as “unsafe” off-chain LN states might have to be pruned out by an on-chain write (i.e. one cooperative closing transaction spending funding outpoints).

    My recommendation would be to consider transaction-issuer selected parameters to reduce the area of pinning exposure (cf. #29454), directly embedded in the current v3 mechanism. We would save the introduction of a new opt-in policy mechanism and any bargaining around the 1000 vb limit.

    v3 sibling eviction (https://github.com/bitcoin/bitcoin/pull/29306, this is almost ready for merge)

    v3 sibling eviction is unsafe w.r.t. replacement cycling attacks, as raised here: https://github.com/bitcoin/bitcoin/commit/04fdc0a77f70a998a433a3839c807422bc2e3bfa.

    However, as the imbued v3 patch is of course a giant hack and has downstream impact on LN wallets, the LN maintainers whose applications would be affected should explicitly ACK that idea before we consider merging it. A draft PR that shows what the code could look like is in #29427.

    See my comments on the PR already: due to the inherent malleability of LN commitment transactions (i.e. as allowed by channel policy parameters), it sounds like you can always screw up the template. I would rather recommend proposing a more generic, use-case-side opt-in imbuance mechanism.

    My recommendations:

    • abandon v3 sibling eviction (it introduces new security exposure)
    • deploy #27463 as it allows to remove CPFP carve-out thanks to package-RBF
    • cluster-mempool can pursue its own review process - we might be stuck to check DoS risks simulations for a while
  37. ariard commented at 2:54 am on March 4, 2024: member

    @sdaftuar I’ll appreciate an answer on my remark on the duality of development standards between what we’re doing on consensus and policy changes. As pointed above, you did stand in the past for being conservative and patient in the context of Taproot deployment in the lack of consensus among bip 8 / bip 9.

    I don’t understand why you’re advocating a different standard of development behavior today. While I certainly understand that an enhancement of the mempool processing logic is welcome, and that we shouldn’t delay its deployment that much, I’m perplexed by your $YOLO behavior in matters of reviewing anti-pinning mitigations.

    If one does the quick math, we have ~4,400 BTC exposed in public Lightning channels that could be exploited under timelines of 144 blocks (the most conservative cltv_expiry_delta default value among LN implementations iirc) due to suboptimal pinning mitigations. This is far more funds at stake than the 144 blocks * 6.25 BTC = 900 BTC that we have at risk in case of a few hours of chain partition due to a consensus upgrade.
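
    The arithmetic in that comparison, spelled out as a small sketch. The 4,400 BTC and 6.25 BTC figures are the ones quoted in the comment and are not independently verified here; the variable names are illustrative.

```python
BLOCKS = 144            # timeline quoted in the comment (about one day of blocks)
SUBSIDY_BTC = 6.25      # per-block subsidy figure used in the comment
LN_EXPOSED_BTC = 4400   # figure quoted in the comment, not independently verified

subsidy_at_risk = BLOCKS * SUBSIDY_BTC
ratio = LN_EXPOSED_BTC / subsidy_at_risk
print(subsidy_at_risk, round(ratio, 2))  # 900.0 4.89
```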

  38. petertodd commented at 3:09 am on March 13, 2024: contributor

    If one does the quick math, we have ~4,400 BTC exposed in public Lightning channels that could be exploited under timelines of 144 blocks (the most conservative cltv_expiry_delta default value among LN implementations iirc) due to suboptimal pinning mitigations. This is far more funds at stake than the 144 blocks * 6.25 BTC = 900 BTC that we have at risk in case of a few hours of chain partition due to a consensus upgrade.

    There’s much more at stake than just the block reward. Both intentional and unintentional double spends will inevitably happen if we ever get a chain partition, and the total value of funds moved per block is much higher than the block reward. Even just cleaning up the mess of RBF’s getting mined differently on both sides will not be easy.

    Meanwhile re: Lightning, funds in a channel are not attackable by uninvolved third parties. If Kraken and Bitfinex have a channel with each other, there’s nothing I can do beyond a remote LND exploit to directly steal the funds in that channel. So your 4400BTC estimate is definitely an overestimate.

    I agree that V3 is being rushed and poses obvious risks. But the above wasn’t a solid argument.


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-09-14 07:12 UTC
