Package Mempool Submission with Package Fee-Bumping

glozow commented at 3:49 pm on June 20, 2021: member

This implements mempool validation logic for fee-bumping transactions using CPFP and RBFing mempool transactions within a package, and the combination of both (i.e. a child in a package can pay to RBF its parents’ conflicts). This does not implement package relay; there are no P2P changes.

For more info, see full proposal gist.

Package Mempool Accept Progress:

Note that, until package relay exists, fee-bumped package transactions that are otherwise too-low-fee won’t go any further than the user’s mempool. This PR doesn’t expose the package validation codepath to users because it would be misleading.

DrahtBot added the label Mempool on Jun 20, 2021

DrahtBot added the label RPC/REST/ZMQ on Jun 20, 2021

DrahtBot added the label Validation on Jun 20, 2021

DrahtBot commented at 5:19 pm on June 20, 2021: contributor

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#22981 (doc: Fix incorrect C++ named args by MarcoFalke)
#22976 (scripted-diff: Rename overloaded int GetArg to GetIntArg by ryanofsky)
#22901 (Improve mempool_package_limits.py by naiza2000)
#22674 (validation: mempool validation and submission for packages of 1 child + parents by glozow)
#22539 (Re-include RBF replacement txs in fee estimation by darosior)
#22097 (validation: Move package acceptance size limit from KvB to WU by ariard)
#21515 (Erlay: bandwidth-efficient transaction relay protocol by naumenkogs)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

glozow force-pushed on Jun 21, 2021

sdaftuar commented at 5:17 pm on June 21, 2021: member

I have some concerns around what semantics are desired for bypassing the fee rate checks for a single transaction and using a notion of package fee rate instead.

I think the logic here — of using the descendant fee rate as an alternate to the transaction’s fee rate — is insufficient for preventing free relay. Consider a 3 transaction package where one child transaction C has two parents, A and B, all of equal size. Suppose A and B are zero-fee transactions and C has a fee rate of 2. Then each of A and B would evaluate to having a fee rate of 1 (with C), but as a package the fee rate would be just 2/3. If the mempool min fee or min relay fee is 1, then this package would make it in despite being below the required fee rate.

I think this type of issue may be somewhat difficult to avoid if we don’t tailor our semantics to the use case(s) we are trying to support. Right now, if I understand correctly, we don’t enforce any particular topology on the packages we accept — in fact I think the package acceptance logic would even accept unrelated transactions as well? One idea I had was to require the whole package’s fee rate to be above the min relay and mempool min fee as well, but that doesn’t work very well if we allow someone to bundle in an unrelated high fee transaction to “pay” for some low fee rate package.

We could check that a package is connected (from a graph theory perspective) as a condition for acceptance, but that is also not quite sufficient for achieving the semantics that I think we want. For instance, if we are processing some package that has a sub graph of transactions which would not make it in on its own, we probably wouldn’t want to admit that whole graph? I’m not quite sure. It seems like if there is a detachable sub graph that would get evicted shortly after acceptance because it’s below the mempool min fee, that might still admit some kind of free relay problem, similar to the issue with unrelated transactions.

My previous approach to the package relay problem was to define packages in a future p2p protocol extension as being the set of unconfirmed ancestors of a single target transaction. If that is sufficient for the use cases we are currently trying to support, then I think that simplifies the concerns a great deal — in this simple case I believe we could just look at two things: (a) check the target transaction’s own fee rate is sufficient to get in, and (b) check that the entire ancestor package for that target transaction also has a total fee rate sufficient to get in. (Of course we’d have to add a check that validates a package only contains ancestor transactions of the target, too.)

The other benefit of using a target transaction’s ancestors as how we define a package is that it lines up better with how the mining algorithm currently works.

If multiple children paying for multiple parents is some desired use case, I’m not sure the mempool and mining code are set up well enough to support that, so it would be helpful to analyze those use cases better to make sure our implementation will work okay.

Rspigler commented at 9:37 pm on June 21, 2021: contributor

in this simple case I believe we could just look at two things: (a) check the target transaction’s own fee rate is sufficient to get in, and (b) check that the entire ancestor package for that target transaction also has a total fee rate sufficient to get in.

This seems reasonable.

glozow commented at 11:05 am on June 22, 2021: member

Thank you for the thoughtful review @sdaftuar!

I think the logic here — of using the descendant fee rate as an alternate to the transaction’s fee rate — is insufficient for preventing free relay. Consider a 3 transaction package where one child transaction C has two parents, A and B, all of equal size. Suppose A and B are zero-fee transactions and C has a fee rate of 2. Then each of A and B would evaluate to having a fee rate of 1 (with C), but as a package the fee rate would be just 2/3. If the mempool min fee or min relay fee is 1, then this package would make it in despite being below the required fee rate.

Great point. I had been thinking of descendant feerate as a good marker since that’s how we evict from mempool, but it is imperfect: I think we already have the case where a transaction’s ancestor score is too low to be mined, but descendant score too high to be evicted. And it’s additionally problematic with package relay.

A proposal: if the mempool is intended to store the best candidates for mining, then we should evict in the opposite order we include in blocks, which is ancestor score. So a transaction’s minerscore = max([ancestorfeerate(tx) for tx in {itself, all its descendants}]). (This is with the current mining code - I suppose the definition of minerscore would be updated if/when block template creation changes).

If we replaced descendant feerate with minerscore, would that solve this problem? We go through the package, calculate everyone’s ancestor feerate (including in-package and in-mempool ancestors), then we calculate everyone’s minerscore based on that? Everyone’s minerscore must surpass the min mempool/relay feerate. I think, then, it might not be necessary to specify/figure out which transactions are sponsees and which ones are sponsors?

(Very far down the line, but just throwing a thought out there: considering feefilters with package relay, I think we would also want to use [unmodified] minerscore for feefiltering as well).

sdaftuar commented at 9:29 pm on June 22, 2021: member

A proposal: if the mempool is intended to store the best candidates for mining, then we should evict in the opposite order we include in blocks, which is ancestor score. So a transaction’s minerscore = max([ancestorfeerate(tx) for tx in {itself, all its descendants}]). (This is with the current mining code - I suppose the definition of minerscore would be updated if/when block template creation changes).

If we replaced descendant feerate with minerscore, would that solve this problem?

I think we should consider these two things separately: (1) whether to change the eviction algorithm used by the mempool, and (2) whether to change the fee rate heuristic used to evaluate transaction / package acceptance.

Regarding the use of the miner score based on maximum-ancestor-feerate-of-descendants, I think that heuristic doesn’t work very well for eviction. Imagine this scenario: the mempool has a very large, very low fee rate transaction A, with children B and C. C also is a child of another low feerate parent D.

It is possible then that B has such a high feerate that A and B would be selected for the next block, and then that C and D would be selected as well (once A is paid for by B, C’s ancestor feerate score would go up). However, if A is in fact very large, then C’s own ancestor fee rate could be quite low, so that D’s minerscore could be very small. This might mean that D and C could be evicted if the mempool were full, even if they would be selected for the next block!

(As an aside I think the worst-case computation required to maintain this score would be worse than the status quo, too – going from O(n) to O(n^2) to update statistics in the mempool when transactions are added/removed, where n = ancestor/descendant count.)

However regarding the heuristic we use for admitting a package to the mempool, I think using this max-ancestor-feerate-of-descendants as an additional check that we compare to the min-relay-fee and mempool-min-fee probably does work. It might mean that packages which include a transaction that would be relying on some in-mempool-sibling to pay for a low fee parent might not make it in, but if that’s not a use case we’re worried about then probably this is fine (if conservative)?

That seems to be a generalization of what I had proposed; I had suggested requiring a single ancestor-package that passes the fee rate check in total, while you’re saying we can just require that every transaction in the package be part of some ancestor package that would pass the fee rate check. I’ll give that more thought but it seems like a plausible solution.

EDIT: One additional thought, assuming this idea works at all I think it ought to be sufficient to restrict the calculation of the score to the package transactions alone. That is, for each transaction, calculate its fee rate for mempool acceptance as max([in-package-ancestor-fee-rate(tx) for tx in {itself, descendants}], where the in-package-ancestor-fee-rate(tx) is defined to be the min(tx fee rate, tx's fee rate including in-package ancestors). And then for each transaction in the package, evaluate that fee rate against the mempool min fee and the min relay fee. The idea is that we only need to ensure that we’re paying enough to justify relay of the transactions – when the network gets busier, the mempool min fee goes up so as long as the new transaction packages being relayed are continuing to have their fee rates go up too, that should be good enough to prevent free relay. There shouldn’t be any need to look at the fee rates of the transactions that are already in the mempool.

glozow commented at 3:58 pm on June 23, 2021: member

Right, I agree with the separation between mempool eviction policy and evaluation of package transactions. Won’t pursue the former much further here…

My plan of attack after the IRC discussion last night is going to be 1 parent + 1 child packages (perhaps these can also be thought of as 1 sponsee + 1 sponsor) and I think checking everyone’s max-ancestor-feerate-of-descendants or descendant feerate of sponsee & ancestor feerate of sponsor would all work.

Edit: I’m going for 1 child + multiple parents instead, so package feerate is what I’ll use.

One additional thought, assuming this idea works at all I think it ought to be sufficient to restrict the calculation of the score to the package transactions alone. That is, for each transaction, calculate its fee rate for mempool acceptance as max([in-package-ancestor-fee-rate(tx) for tx in {itself, descendants}], where the in-package-ancestor-fee-rate(tx) is defined to be the min(tx fee rate, tx’s fee rate including in-package ancestors). And then for each transaction in the package, evaluate that fee rate against the mempool min fee and the min relay fee. The idea is that we only need to ensure that we’re paying enough to justify relay of the transactions – when the network gets busier, the mempool min fee goes up so as long as the new transaction packages being relayed are continuing to have their fee rates go up too, that should be good enough to prevent free relay. There shouldn’t be any need to look at the fee rates of the transactions that are already in the mempool.

Good point!!!

glozow force-pushed on Aug 10, 2021

glozow force-pushed on Aug 13, 2021

glozow force-pushed on Sep 2, 2021

glozow force-pushed on Sep 8, 2021

fanquake referenced this in commit b8336b22d3 on Sep 10, 2021

glozow force-pushed on Sep 10, 2021

glozow force-pushed on Sep 22, 2021

[packages] add sanity checks for package vs mempool limits f1b86db4e5

[packages] distinguish severity of package errors

No change in behavior.
This will help us distinguish between punishable protocol-violation
erros and mempool policy-related rejections later. Starting the
distinction here because we will be enforcing IsChildWithParents on
mempool submissions.

d9f3230248

glozow force-pushed on Sep 23, 2021

[validation] case-based constructors for ATMPArgs

No change in behavior.
ATMPArgs can continue to have granular rules like switching BIP125
on/off while we create an interface for the different sets of rules for
single transactions vs multiple-testmempoolaccept vs package validation.
This is a cleaner interface than manually constructing the args, which
makes it easy to mix up ordering, use the wrong default, etc. It also
means we don't need to edit ATMP/single transaction validation code
every time we update ATMPArgs for package validation.

014f2af89a

[validation] cache virtual size in Workspace struct

This is not only cleaner but also helps make sure we are always using
the virtual size measure that includes sigops.
Not in the scripted-diff because there are many other instances of nSize
within validation.cpp and it's simpler to keep in a separate commit.

c075453c69

[validation/rpc] use MempoolAcceptResult vsize for fee check and results

This uses the vsize that includes sigops weight. It also prevents a bug
where, we swap a package transaction for a same-txid-different-wtxid
mempool transaction and put the wrong virtual size in the RPC result.

27179e111a

[validation] make descendant limits per-transaction

Note to reviewers: this commit should not change any current behavior
because package conflicts are disallowed.

Descendant limits are still mempool-wide. When a transaction has
conflicts in the mempool, descendant limits may be increased (in
PreChecks) to account for replaced transactions. Not doing this now
would cause a bug where every package transaction could increase the
descendant limit despite having unrelated conflicts, resulting in an
inflated descendant limit.

3c6be6e383

scripted-diff: clean up MemPoolAccept aliases

The aliases are leftover from a previous MOVEONLY refactor - they are
unnecessary and removing them reduces the diff for splitting out mempool
Checks from PreChecks, making RBF variables MemPoolAccept-wide, etc.

-BEGIN VERIFY SCRIPT-

unalias() { sed -i "s:\<$1\>:$2:g" src/validation.cpp; sed -i "/$2 = $2/d" src/validation.cpp; }

unalias nModifiedFees ws.m_modified_fees
unalias nConflictingFees ws.m_conflicting_fees
unalias nConflictingSize ws.m_conflicting_size
unalias fReplacementTransaction ws.m_replacement_transaction
unalias setConflicts ws.m_conflicts
unalias allConflicting ws.m_all_conflicting
unalias setAncestors ws.m_ancestors

-END VERIFY SCRIPT-

ea42573603

MOVEONLY: mempool checks to their own functions

No change in behavior, because package transactions would not be going
through the rbf logic in PreChecks anyway (BIP125 is currently disabled
for package acceptance, see ATMPArgs).

We draw the line here because each individual transaction in package
validation still goes through all PreChecks. For example, checking that
one's own conflicts and dependencies are disjoint (a consensus check)
and individual transaction mempool ancestor/descendant limits.

94f395a033

[validation/refactor] store precomputed txdata in workspace

We want to be able to re-use the precomputed transaction data between
PolicyScriptChecks and ConsensusScriptChecks in
AcceptMultipleTransactions.

5c9d89efc1

[packages] add static IsChildWithParents function cc7af5c1b2

[unit test] static package checks 2be0e3d89a

[validation] full package accept + mempool submission 87b4c7b967

[policy] require packages to be child + all unconfirmed parents 52cbdb3af0

[rpc] add new submitrawpackage RPC

This is meant to be similar to sendrawtransaction, so it throws a
JSONRPCError when something goes wrong, and calls BroadcastTransaction()
when transactions succeed.

f2d8a872e9

[functional test] submitrawpackage RPC ffbfb716d2

[packages/policy] use package feerate in package validation

This allows CPFP within a package prior to submission to mempool.

edeb0749ab

[test] package submission with CPFP'd zero-fee parent

Update the existing unit test to have 0 fees in parent, 1BTC fees in
child. This makes the parent too-low-fee by itself, but enough with its
child.

f37e126bc8

[validation] make most RBF members MemPoolAccept-wide

No change in behavior.

For single transaction acceptance, this is a simple refactor:
Workspace::m_all_conflicting -> MemPoolAccept::m_all_conflicts
Workspace::m_replacement_transaction -> MemPoolAccept::m_rbf
Workspace::m_conflicting_fees -> MemPoolAccept::m_conflicting_fees
Workspace::m_conflicting_size -> MemPoolAccept::m_conflicting_size
Workspace::m_replaced_transactions -> MemPoolAccept::m_replaced_transactions

For package acceptance, we don't enable RBF yet, but we want these to be
package-wide variables because:
- Transactions will never have the same conflict by prevout in the
package, but they could have the same conflict by tx, and their
conflicts could share descendants.
- We want to compare conflicts with the package fee rather than
individual transaction fee.

40c368a236

[packages/policy] implement package RBF 1a0b766ab1

[functional test] package RBF

Created a new file because rpc_packages.py is for testing that the RPCs
are returning the results we want. feature_package_relay is for testing
the special policies and behaviors like CPFP and RBF.

98238e7325

[functional test] cpfp and rbf together 4768198eec

[validation] implement ancestor comparison for RBF

Gated on an ATMPARgs value until later, so there are no behavior
changes in this commit.

634f950f25

[policy] allow new unconfirmed if higher ancestor score in package RBF 042790acde

[functional test] package RBF with ancestor scores 10f20eaeb6

[validation/packages] handle package transactions already in mempool

As node operators are free to set their mempool policies however they
please, it's possible for package transaction(s) to already be in the
mempool. We definitely don't want to reject the entire package in that
case (as that could be a censorship vector).

We still need to return the successful result to the caller, so add
another result type to MempoolAcceptResult.

cda84edb09

[functional test] partially submitted packages 535f311ed6

glozow force-pushed on Oct 20, 2021

DrahtBot commented at 11:18 pm on December 13, 2021: contributor

🐙 This pull request conflicts with the target branch and needs rebase.

Want to unsubscribe from rebase notifications on this pull request? Just convert this pull request to a “draft”.

DrahtBot added the label Needs rebase on Dec 13, 2021

laanwj referenced this in commit 216f4ca9e7 on Dec 15, 2021

glozow commented at 2:38 pm on January 25, 2022: member

Next PR ready for review is #24152.

LarryRuane commented at 5:38 pm on March 29, 2022: contributor

Hi @sdaftuar, regarding your comment:

Consider a 3 transaction package where one child transaction C has two parents, A and B, all of equal size. Suppose A and B are zero-fee transactions and C has a fee rate of 2. Then each of A and B would evaluate to having a fee rate of 1 (with C), but as a package the fee rate would be just 2/3. If the mempool min fee or min relay fee is 1, then this package would make it in despite being below the required fee rate.

This is an interesting problem, after some thought I wrote a little demo to better understand these concepts, and I think it addresses some of these questions. I thought you and @glozow might find this helpful:

https://github.com/LarryRuane/bitcoin_package_accept_demo

fanquake referenced this in commit d844b5e799 on Apr 7, 2022

DrahtBot commented at 0:44 am on December 22, 2022: contributor

There hasn’t been much activity lately and the patch still needs rebase. What is the status here?

Is it still relevant? ➡️ Please solve the conflicts to make it ready for review and to ensure the CI passes.
Is it no longer relevant? ➡️ Please close.
Did the author lose interest or time to work on this? ➡️ Please close it and mark it ‘Up for grabs’ with the label, so that it can be picked up in the future.

glozow commented at 12:14 pm on December 22, 2022: member

Actually that’s a good point @DrahtBot - this PR was made to track progress on package relay / package mempool stuff but it’s not great at it since there are now multiple branches, v3 is separated, and there are things in multiple repos. Possibly more helpful, here’s a board to track everything package relay-related: https://github.com/orgs/bitcoin/projects/4/views/1?layout=board

glozow closed this on Dec 22, 2022

flack commented at 8:01 pm on December 22, 2022: contributor

@glozow that link gives a 404 for me. Also the board you mention isn’t listed in https://github.com/orgs/bitcoin/projects?query=is%3Aopen. Or is it only accessible to members?

glozow commented at 11:08 am on January 4, 2023: member

@flack apologies I didn’t realize! It should be public now.

bitcoin locked this on Jan 4, 2024

Package Mempool Submission with Package Fee-Bumping #22290

Conflicts