cluster mempool: Implement changeset interface for mempool

sdaftuar commented at 5:46 pm on October 20, 2024: member

part of cluster mempool: #30289

It became clear while working on cluster mempool that it would be helpful for transaction validation if we could consider a full set of proposed changes to the mempool – consisting of a set of transactions to add, and a set of transactions (ie conflicts) to simultaneously remove – and perform calculations on what the mempool would look like if the proposed changes were to be applied. Two specific examples of where we’d like to do this:

Determining if ancestor/descendant/TRUC limits would be violated (in the future, cluster limits) if either a single transaction or a package of transactions were to be accepted
Determining if an RBF would make the mempool “better”, however that idea is defined, both in the single transaction and package of transaction cases

In preparation for cluster mempool, I have pulled this reworking of the mempool interface out of #28676 so it can be reviewed on its own. I have not re-implemented ancestor/descendant limits to be run through the changeset, since with cluster mempool those limits will be going away, so this seems like wasted effort. However, I have rebased #28676 on top of this branch so reviewers can see what the new mempool interface could look like in the cluster mempool setting.

There are some minor behavior changes here, which I believe are inconsequential:

In the package validation setting, transactions would be added to the mempool before the ConsensusScriptChecks() are run. In theory, ConsensusScriptChecks() should always pass if the PolicyScriptChecks() have passed and it’s just a belt-and-suspenders for us, but if somehow they were to diverge then there could be some small behavior change from adding transactions and then removing them, versus never adding them at all.
The error reporting on CheckConflictTopology() has slightly changed due to no longer distinguishing between direct conflicts and indirect conflicts. I believe this should be entirely inconsequential because there shouldn’t be a logical difference between those two ideas from the perspective of this function, but I did have to update some error strings in some tests.
Because, in a package setting, RBFs now happen as part of the entire package being accepted, the logging has changed slightly because we do not know which transaction specifically evicted a given removed transaction.
- Specifically, the “package hash” is now used to reference the set of transactions that are being accepted, rather than any single txid. The log message relating to package RBF that happen in the TXPACKAGES category has been updated as well to include the package hash, so that it’s possible to see which specific set of transactions are being referenced by that package hash.
- Relatedly, the tracepoint logging in the package rbf case has been updated as well to reference the package hash, rather than a transaction hash.

DrahtBot commented at 5:46 pm on October 20, 2024: contributor

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/31122.

Reviews

See the guideline for information on the review process.

Type	Reviewers
ACK	instagibbs, ismaelsadeeq, naumenkogs, glozow

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#31318 (Drop script_pub_key arg from createNewBlock by Sjors)
#31283 (Add waitNext() to BlockTemplate interface by Sjors)
#30964 (Disable RBF Rule 2 by arik-so)
#30391 (BlockAssembler: return selected packages virtual size and fee by ismaelsadeeq)
#30157 (Fee Estimation via Fee rate Forecasters by ismaelsadeeq)
#29680 (wallet: fix unrelated parent conflict doesn’t cause child tx to be marked as conflict by Eunovo)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

laanwj added the label Mempool on Oct 20, 2024

DrahtBot added the label CI failed on Oct 20, 2024

DrahtBot commented at 7:19 pm on October 20, 2024: contributor

🚧 At least one of the CI tasks failed. Debug: https://github.com/bitcoin/bitcoin/runs/31793556861

Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.
A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.
An intermittent issue.

Leave a comment here, if you need help tracking down a confusing failure.

sdaftuar force-pushed on Oct 20, 2024

DrahtBot removed the label CI failed on Oct 21, 2024

sdaftuar force-pushed on Oct 21, 2024

in src/txmempool.h:831 in fc03c77f8b outdated

826+        CTxMemPoolChangeSet& operator=(const CTxMemPoolChangeSet&) = delete;
827+
828+        using TxHandle = CTxMemPool::txiter;
829+
830+        TxHandle AddTx(const CTransactionRef& tx, const CAmount fee, int64_t time, unsigned int entry_height, uint64_t entry_sequence, bool spends_coinbase, int64_t sigops_cost, LockPoints lp);
831+        void RemoveTx(CTxMemPool::txiter it) { m_all_conflicts.insert(it); }

glozow commented at 0:57 am on October 22, 2024:

StageAddition and StageRemoval seem a bit more intuitive, but it’s a nit

sdaftuar commented at 4:55 pm on October 22, 2024:

Agreed, done.

in src/txmempool.h:849 in fc03c77f8b outdated

844+        CTxMemPool* m_pool;
845+        CTxMemPool::indexed_transaction_set m_stage_tx;
846+        std::vector<CTxMemPool::txiter> m_entry_vec; // track the added transactions' insertion order
847+        // map from the stage_tx index to the ancestors for the transaction
848+        std::map<CTxMemPool::txiter, CTxMemPool::setEntries, CompareIteratorByHash> m_ancestors;
849+        CTxMemPool::setEntries m_all_conflicts;

glozow commented at 1:50 am on October 22, 2024:

Similarly, m_stage_tx and m_all_conflicts could be m_to_add and m_to_remove?

sdaftuar commented at 4:56 pm on October 22, 2024:

Agreed, done.

in src/txmempool.h:883 in 1282bc0ac2 outdated

853+        auto ret = std::make_unique<CTxMemPoolChangeSet>(this);
854+        m_changeset = ret.get();
855+        return ret;
856+    }
857+
858+    CTxMemPoolChangeSet* m_changeset GUARDED_BY(cs){nullptr};

glozow commented at 1:53 am on October 22, 2024:

Question: why isn’t this a shared_ptr / what’s the purpose of the unique_ptr? I get that there should only ever be one changeset, but I’m confused by the fact that this carries a copy of the underlying raw pointer, as the unique_ptr then doesn’t seem very unique.

sdaftuar commented at 4:39 pm on October 22, 2024:

This only exists for sanity checking that we don’t have more than one changeset at a time. We could just as easily replace this with a bool; I made it a pointer mostly for debugging purposes, if the Assume() were ever hit.

The idea behind the unique_ptr is that I wanted all the state in the ChangeSet to magically go away whenever the object goes out of scope, so that I didn’t have to explicitly clean up state everywhere that a transaction might fail validation (which seems like a lot of places). In this PR, there’s not a ton of state, but I expect that when we implement cluster mempool that we’ll be creating similar staging data structures within the TxGraph layer that will exist one level below the mempool (for computing clusters and feerate diagrams for candidate transactions), and I thought this mechanism of ending a staging when the changeset goes out of scope would be an easy way to manage things at that layer as well. So in the future you could imagine that in the CTxMemPoolChangeSet constructor, we might initiate a staging in the underlying TxGraph; in the mempool Apply() function we might also apply the staging to the TxGraph; and in the CTxMemPoolChangeSet destructor we might abort a staging in the underlying TxGraph.

The paradigm I’m looking for is that you can have at most one changeset at a time, and when there’s a changeset outstanding that serves as a lock on the mempool so that you can’t otherwise add/remove things (except via the changeset) until the changes are either applied or aborted, and the changeset is the destroyed. However if there’s a better way to implement that than what I tried here, I’m certainly open to suggestions.

sipa commented at 12:50 pm on October 27, 2024:

I think it suffices to have a bool m_have_changeset GUARDED_BY(cs){false};, as it’s only used for tracking and enforcing the “no two changesets” rule?

sdaftuar commented at 8:07 pm on October 27, 2024:

Done.

in src/validation.cpp:1312 in a8d3118d0c outdated

1318-                hash.ToString(),
1319-                tx.GetWitnessHash().ToString(),
1320-                workspaces[0].m_tx_handle->GetFee(),
1321-                workspaces[0].m_tx_handle->GetTxSize());
1322+                it->GetTxSize());
1323+        FeeFrac feerate = m_subpackage.m_changeset->GetAggregateFeeRate();

glozow commented at 2:00 am on October 22, 2024:

You could use m_subpackage.m_total_modified_fees, m_subpackage.m_total_vsize here? Unless you feel that CTxMemPoolChangeSet::GetAggregateFeeRate will be useful in the future.

sdaftuar commented at 4:56 pm on October 22, 2024:

Thanks that is much better; the only reason I had introduced GetAggregateFeeRate() was for access in this one place, so now that is gone.

glozow commented at 2:04 am on October 22, 2024: member

Concept ACK, did a first pass

in src/validation.cpp:1319 in a8d3118d0c outdated

1330+        } else {
1331+            LogDebug(BCLog::MEMPOOL, "New package with %lu txs, fees=%s, vsize=%s\n",
1332+                    m_subpackage.m_changeset->GetTxCount(),
1333+                    feerate.fee,
1334+                    feerate.size);
1335+        }

ismaelsadeeq commented at 1:02 pm on October 22, 2024:

Should this be just logged once; i.e after replacing the transactions.

sdaftuar commented at 4:48 pm on October 22, 2024:

I tried to make this code analogous to the old code, which repeated information for each transaction. I’m open to changing the logging but I am always a little afraid of breaking user’s scripts or habits if people are used to grepping for a transaction ID and seeing more information? (Though I guess this PR already is breaking the logging for package RBF… so maybe not a big concern.)

ismaelsadeeq commented at 12:14 pm on October 23, 2024:

Yeah will break existing grepping scripts, but I think we will need to do this anyway to improve the logging for package replacements.

I think we should log the new package or new transaction replacement only once after logging all the replaced transactions:

First, log all replaced transactions with their ID, witness, fee, and size.
Then:
- For a single transaction replacement, log its ID, witness, fee, and size.
- For a package replacement, log that a new package has been received, along with the package’s number of transactions, fees, and vsize. Then, log each transaction ID and witness in the package.

This approach preserves the previous logging behavior for individual transactions. Currently, in this PR, for a single transaction, the logs show the following:

02024-10-23T10:26:03.802679Z [httpworker.3] [src/validation.cpp:1310] [FinalizeSubpackage] [mempool] replacing mempool tx 6ae77ff52efae84e0cca125e4b5acfe2f3415298f4f98e850b40f89a2aa5990f (wtxid=57f1edd04c15f6532cdddf4a7c8393f5c5e5584f00003855df8c490053027ac8, fees=10000000, vsize=104).
12024-10-23T10:26:03.802686Z [httpworker.3] [src/validation.cpp:1317] [FinalizeSubpackage] [mempool] New tx e089a6391ae23d41baeed46fb1d8ac04cd0db73e82b4b330275f7d7cafcda33c (wtxid=db89e399619446928fb74a41a5207164beb0fd99972ba147b806a6a24f140e13, fees=490000000, vsize=104)

For a package, we see duplicate removal logs, which makes it seem like multiple packages are being added for each replacement:

02024-10-23T10:30:28.361089Z [httpworker.3] [src/validation.cpp:1310] [FinalizeSubpackage] [mempool] replacing mempool tx 05ad4c0db1a2fe169acd15bf5d8d1f4fa0ff834b5e8259baab4219f3d20b8991 (wtxid=39676091a39b5f77ee0865c2c52ac4bb8378ff7283fc1668738155ec247c3868, fees=10000, vsize=104).
12024-10-23T10:30:28.361110Z [httpworker.3] [src/validation.cpp:1322] [FinalizeSubpackage] [mempool] New package with 2 txs, fees=50000, vsize=208
22024-10-23T10:30:28.361127Z [httpworker.3] [src/validation.cpp:1310] [FinalizeSubpackage] [mempool] replacing mempool tx 992204e731c78b7afc6e4b17e14adf0f6b93d472fddc4ba184dd86e1899a83d1 (wtxid=05d9f03194089441530d7503a12b7f36086a2c41deac0960e238e426ed59e5f4, fees=10000, vsize=104).
32024-10-23T10:30:28.361139Z [httpworker.3] [src/validation.cpp:1322] [FinalizeSubpackage] [mempool] New package with 2 txs, fees=50000, vsize=208

With the idea above, the logs for individual transactions will remain the same, but for packages, the logs will look like this:

02024-10-23T11:46:49.459665Z [httpworker.2] [src/validation.cpp:1312] [operator()] [mempool] Replacing mempool tx 05ad4c0db1a2fe169acd15bf5d8d1f4fa0ff834b5e8259baab4219f3d20b8991 (wtxid=39676091a39b5f77ee0865c2c52ac4bb8378ff7283fc1668738155ec247c3868, fees=10000, vsize=104).
12024-10-23T11:46:49.459683Z [httpworker.2] [src/validation.cpp:1312] [operator()] [mempool] Replacing mempool tx 992204e731c78b7afc6e4b17e14adf0f6b93d472fddc4ba184dd86e1899a83d1 (wtxid=05d9f03194089441530d7503a12b7f36086a2c41deac0960e238e426ed59e5f4, fees=10000, vsize=104).
22024-10-23T11:46:49.459696Z [httpworker.2] [src/validation.cpp:1342] [FinalizeSubpackage] [mempool] New package (txs=2, fees=50000, vsize=208)
32024-10-23T11:46:49.459709Z [httpworker.2] [src/validation.cpp:1330] [operator()] [mempool] tx acd7ea806cd261b6c8cdb6d15210dad53daabbe68d5cb99f7df31646a769b69d (wtxid=8a867c2e5aea0d89831d3fe6831535e8cf8ce662ba36b6331fa072babc06f08c)
42024-10-23T11:46:49.459721Z [httpworker.2] [src/validation.cpp:1330] [operator()] [mempool] tx 28deb9d38a4490c3f542f0fd173805eb6c2c8309aa12d5e1669fc71a453d23fe (wtxid=8f8a74f38ed9deada97ef37bb5138b46d16f448da7585e33fc861b4142f764d5)

This will make the logs much easier to parse with a script. See naive diff implementing this change suggestion https://github.com/ismaelsadeeq/bitcoin/commit/555b05680d8e782fbe0d7b4807d2fce23f512b44

sdaftuar commented at 4:06 pm on October 23, 2024:

Thanks for pointing out how the logging looks in this PR – my intent was to preserve the existing behavior in the single transaction replacement case, and not introduce a new line break. (I think the behavior of LogDebug() changed recently in #30929, which made what I wrote here no longer do what I expected.)

So we’re all on the same page, here’s what the logging looks like today (I grabbed this output from the functional tests feature_rbf.py and mempool_package_rbf.py), in the single transaction and package replacement cases:

single transaction:

02024-10-23T14:08:01.250431Z [httpworker.3] [../../src/validation.cpp:1302] [Finalize] [mempool] replacing mempool tx 38335600f2465c0f8bb2b86d5830a34851d86fa879800c0e1434ddfc78c42898 (wtxid=c033cd3efd301c369d66cf759769159609471bd4f9efb3ee30e7209e57b74778, fees=31200, vsize=104). New tx b4001854dc577e21fde7acb4016a9179c3dd5a5af076295d8b0b7c47c35e2032 (wtxid=1f0a0b164999ac0843a8865df347e4fb63191ea3eec539d685b94dcdbeb39fae, fees=10031200, vsize=104)

package:

02024-10-23T14:10:19.726907Z [httpworker.2] [../../src/validation.cpp:1219] [PackageMempoolChecks] [txpackages] package RBF checks passed: parent acd7ea806cd261b6c8cdb6d15210dad53daabbe68d5cb99f7df31646a769b69d (wtxid=8a867c2e5aea0d89831d3fe68315
135e8cf8ce662ba36b6331fa072babc06f08c), child 28deb9d38a4490c3f542f0fd173805eb6c2c8309aa12d5e1669fc71a453d23fe (wtxid=8f8a74f38ed9deada97ef37bb5138b46d16f448da7585e33fc861b4142f764d5)
22024-10-23T14:10:19.727195Z [httpworker.2] [../../src/validation.cpp:1302] [Finalize] [mempool] replacing mempool tx 05ad4c0db1a2fe169acd15bf5d8d1f4fa0ff834b5e8259baab4219f3d20b8991 (wtxid=39676091a39b5f77ee0865c2c52ac4bb8378ff7283fc1668738155ec247c3868, fees=10000, vsize=104). New tx acd7ea806cd261b6c8cdb6d15210dad53daabbe68d5cb99f7df31646a769b69d (wtxid=8a867c2e5aea0d89831d3fe6831535e8cf8ce662ba36b6331fa072babc06f08c, fees=10000, vsize=104)
32024-10-23T14:10:19.727212Z [httpworker.2] [../../src/validation.cpp:1302] [Finalize] [mempool] replacing mempool tx 992204e731c78b7afc6e4b17e14adf0f6b93d472fddc4ba184dd86e1899a83d1 (wtxid=05d9f03194089441530d7503a12b7f36086a2c41deac0960e238e426ed59e5f4, fees=10000, vsize=104). New tx acd7ea806cd261b6c8cdb6d15210dad53daabbe68d5cb99f7df31646a769b69d (wtxid=8a867c2e5aea0d89831d3fe6831535e8cf8ce662ba36b6331fa072babc06f08c, fees=10000, vsize=104)

Here’s what the logging would look like in the commit you shared:

single transaction:

02024-10-23T13:51:10.728066Z [httpworker.3] [../../src/validation.cpp:1312] [operator()] [mempool] Replacing mempool tx 38335600f2465c0f8bb2b86d5830a34851d86fa879800c0e1434ddfc78c42898 (wtxid=c033cd3efd301c369d66cf759769159609471bd4f9efb3ee30e7209e57b74778, fees=31200, vsize=104).
12024-10-23T13:51:10.728089Z [httpworker.3] [../../src/validation.cpp:1333] [operator()] [mempool] New tx b4001854dc577e21fde7acb4016a9179c3dd5a5af076295d8b0b7c47c35e2032 (wtxid=1f0a0b164999ac0843a8865df347e4fb63191ea3eec539d685b94dcdbeb39fae, fees=10031200, vsize=104)

package:

02024-10-23T13:56:13.459972Z [httpworker.1] [../../src/validation.cpp:1230] [PackageMempoolChecks] [txpackages] package RBF checks passed: parent acd7ea806cd261b6c8cdb6d15210dad53daabbe68d5cb99f7df31646a769b69d (wtxid=8a867c2e5aea0d89831d3fe68315
135e8cf8ce662ba36b6331fa072babc06f08c), child 28deb9d38a4490c3f542f0fd173805eb6c2c8309aa12d5e1669fc71a453d23fe (wtxid=8f8a74f38ed9deada97ef37bb5138b46d16f448da7585e33fc861b4142f764d5)
22024-10-23T13:56:13.460177Z [httpworker.1] [../../src/validation.cpp:1312] [operator()] [mempool] Replacing mempool tx 05ad4c0db1a2fe169acd15bf5d8d1f4fa0ff834b5e8259baab4219f3d20b8991 (wtxid=39676091a39b5f77ee0865c2c52ac4bb8378ff7283fc1668738155ec247c3868, fees=10000, vsize=104).
32024-10-23T13:56:13.460227Z [httpworker.1] [../../src/validation.cpp:1312] [operator()] [mempool] Replacing mempool tx 992204e731c78b7afc6e4b17e14adf0f6b93d472fddc4ba184dd86e1899a83d1 (wtxid=05d9f03194089441530d7503a12b7f36086a2c41deac0960e238e426ed59e5f4, fees=10000, vsize=104).
42024-10-23T13:56:13.460243Z [httpworker.1] [../../src/validation.cpp:1342] [FinalizeSubpackage] [mempool] New package (txs=2, fees=50000, vsize=208)
52024-10-23T13:56:13.460260Z [httpworker.1] [../../src/validation.cpp:1330] [operator()] [mempool] tx acd7ea806cd261b6c8cdb6d15210dad53daabbe68d5cb99f7df31646a769b69d (wtxid=8a867c2e5aea0d89831d3fe6831535e8cf8ce662ba36b6331fa072babc06f08c)
62024-10-23T13:56:13.460274Z [httpworker.1] [../../src/validation.cpp:1330] [operator()] [mempool] tx 28deb9d38a4490c3f542f0fd173805eb6c2c8309aa12d5e1669fc71a453d23fe (wtxid=8f8a74f38ed9deada97ef37bb5138b46d16f448da7585e33fc861b4142f764d5)

My proposal would be to leave the single-transaction case exactly as-is, so I can just fix what I tried to before to avoid the extra line break. For the package case, given that the txpackages log level is already reporting what transactions are in the package, does it make sense to duplicate that information in the mempool log category as well?

An issue that occurred to me with splitting up log messages for what’s in a package across multiple lines is that other threads may be logging simultaneously and cause messages to be interleaved, making the resulting log difficult to parse.

Another option would be to simply augment the existing txpackages log message to include the aggregate fee/size of the package, and reference the cumulative fee/size of the package that evicted a transaction. Or we could be even more precise and use some kind of identifier for the package (perhaps BIP 331’s “Combined Hash”?) in our logging. Now that I think about it, this may make the most sense: if we logged the combined hash in the txpackages log message, and referenced the combined hash in the replacement message, I think that would be pretty good. And we could use the same combined in the tracepoints as well… Thoughts?

ismaelsadeeq commented at 4:35 pm on October 23, 2024:

For the package case, given that the txpackages log level is already reporting what transactions are in the package, does it make sense to duplicate that information in the mempool log category as well

No it does not

An issue that occurred to me with splitting up log messages for what’s in a package across multiple lines is that other threads may be logging simultaneously and cause messages to be interleaved, making the resulting log difficult to parse.

Good point

if we logged the combined hash in the txpackages log message, and referenced the combined hash in the replacement message, I think that would be pretty good. And we could use the same combined in the tracepoints as well

This makes sense +1.

At some point I was thinking whether we should add the package sponsor tx id in the tracing API, but I think the combinedHash is better.

sdaftuar commented at 10:10 am on October 24, 2024:

@0xB10C Any thoughts on changing the way replacement logging works via tracepoints, in the package RBF case? One issue I can think of with what I’m proposing here is that the “package hash” we’d be sending is not one that would come through any other tracing messages – would that be a problem?

If that is a problem, would it make sense to add a tracepoint that lines up with the existing txpackages debug log message, which (as of this PR) would now report the package hash along with the transaction ids that are in it?

0xB10C commented at 2:06 pm on October 29, 2024:

Thanks for the ping, sorry for the late response. I agree that a “package hash” on it’s own might not be useful for someone interested in “which transactions were replaced by which other transactions”. On the other hand, the tracepoints are on purpose only semi-stable as they expose internals. If the internals change, the tracepoints need to change too.

which (as of this PR) would now report the package hash along with the transaction ids that are in it?

That could be useful, yes. However, constructing and passing arrays/lists to tracepoints is probably not something we want / can do (at least not without #26593).

One solution could be to update the current tracepoint to a TRACE8 and have a bool argument indicating if tx_or_package_hash is a tx or package hash with m_subpackage.m_changeset->GetTxCount() == 1. This wouldn’t allow to get data on “which transactions were replaced by which other transactions”, but consumers would know that the hash is either referring to a tx or package. While I’m currently using the tracepoint to monitor count, size and fees in replacements, I don’t use the txids at the moment. I used them a while ago for fullrbf.mempool.observer.

also, CC @virtu who initially worked on these tracepoints. Maybe you have another idea.

0xB10C commented at 2:44 pm on November 7, 2024:

One solution could be to update the current tracepoint to a TRACE8 and have a bool argument indicating if tx_or_package_hash is a tx or package hash with m_subpackage.m_changeset->GetTxCount() == 1.

This could look like e.g. https://github.com/0xB10C/bitcoin/commit/2eb1410e774636040d19d3baf5e22b2e4a8fbfd2

sdaftuar commented at 5:38 pm on November 7, 2024:

Thanks! I’ve included your commit (and will mark this as resolved).

in src/validation.cpp:1329 in a8d3118d0c outdated

1339                 it->GetFee(),
1340                 std::chrono::duration_cast<std::chrono::duration<std::uint64_t>>(it->GetTime()).count(),
1341-                hash.data(),
1342-                workspaces[0].m_tx_handle->GetTxSize(),
1343-                workspaces[0].m_tx_handle->GetFee()
1344+                m_subpackage.m_changeset->GetTx(0).GetHash().data(),

ismaelsadeeq commented at 1:07 pm on October 22, 2024:

Why are we only tracing the first tx in the package

sdaftuar commented at 4:49 pm on October 22, 2024:

Not sure if the tracing API is well-specified for package based replacement. I’m open to doing something different than what I propose here though, suggestions?

ismaelsadeeq commented at 12:21 pm on October 23, 2024:

Its not specified, I think we have to specify tracing api for packages;

Should be something like

Tracepoint `mempool:replaced_by_package`

Is called when a transaction in the node’s mempool is getting replaced by a package. Passes information about the replaced transaction and the replacement package.

Arguments passed:

Replaced transaction ID (hash) as pointer to unsigned chars (i.e. 32 bytes in little-endian)
Replaced transaction virtual size as int32
Replaced transaction fee as int64
Replaced transaction mempool entry time (epoch) as uint64
Replacement package virtual size as int32
Replacement package fee as int64

sdaftuar commented at 10:11 am on October 24, 2024:

Tracking this issue above (https://github.com/bitcoin/bitcoin/pull/31122#discussion_r1814692827), so marking this conversation as resolved.

ismaelsadeeq commented at 1:10 pm on October 22, 2024: member

Concept ACK

The changes are straightforward to understand, and the large diff is primarily due to changes in the error message and the removal of addUnchecked.

sdaftuar force-pushed on Oct 22, 2024

sdaftuar commented at 5:01 pm on October 22, 2024: member

Thanks @glozow and @ismaelsadeeq for the review; I’ve incorporated some of the comments. If you have suggestions for how the logging can be improved in the package RBF case (and how the USDT tracing messages can be improved, similarly) please let me know your thoughts.

Also, I wanted to flag that the locking is a bit of a mess in this branch. Initially, I tried to add lock annotations to the CTxMemPoolChangeSet functions to guarantee that the mempool itself was locked whenever any of those functions were invoked. I had mixed results with the compiler understanding those annotations, and ran into a roadblock at some point with a unit test which was taking the mempool lock and should have satisfied the annotations, but the compiler wasn’t recognizing the lock as being the same one that needed to be held. I assume the complication is mostly that the changeset is a different object than the mempool, so one thing I could try is moving all the changeset methods to instead be methods of the mempool itself (which take a changeset as an argument, potentially) – but that also seems like a little bit of a mess to me, so wasn’t sure if that was worth doing.

sdaftuar force-pushed on Oct 22, 2024

in src/validation.cpp:1346 in 88e109f109 outdated

1330         m_subpackage.m_replaced_transactions.push_back(it->GetSharedTx());
1331     }
1332-    m_pool.RemoveStaged(m_subpackage.m_all_conflicts, false, MemPoolRemovalReason::REPLACED);
1333+    ws.m_changeset->Apply();
1334     // Don't attempt to process the same conflicts repeatedly during subpackage evaluation:
1335     // they no longer exist on subsequent calls to Finalize() post-RemoveStaged

instagibbs commented at 3:13 pm on October 23, 2024:

s/RemoveStaged/Apply/

sdaftuar commented at 10:07 am on October 24, 2024:

Fixed.

in src/txmempool.h:878 in 88e109f109 outdated

843+    private:
844+        CTxMemPool* m_pool;
845+        CTxMemPool::indexed_transaction_set m_to_add;
846+        std::vector<CTxMemPool::txiter> m_entry_vec; // track the added transactions' insertion order
847+        // map from the m_to_add index to the ancestors for the transaction
848+        std::map<CTxMemPool::txiter, CTxMemPool::setEntries, CompareIteratorByHash> m_ancestors;

instagibbs commented at 3:41 pm on October 23, 2024:

note: circa 88e109f1091561639aca323cb455180317a12b32 this field is set but never read from making the CalculateMemPoolAncestors wrapper a bit mysterious. Could a motivating comment be given?

sdaftuar commented at 8:42 pm on October 23, 2024:

My thought process here was to mimic the old behavior of having the in-mempool ancestors calculated once and then cached for any further needs, such as adding the transaction to the mempool (in the single transaction case).

In the package setting, we calculate the in-mempool ancestors of each transaction in a package, and I believe we use that information for some policy checks, but we throw it away as transactions get added.

This all goes away with cluster mempool, so I didn’t spend a lot of time thinking about this. Possibly it would be clearer to just not bother with the caching at all, but then we lose the optimization… Thoughts?

instagibbs commented at 6:09 pm on October 25, 2024:

logically seems to work, probably not worth nitting

sdaftuar commented at 6:04 pm on October 31, 2024:

I think this is resolved now with the rewrite of ChangeSet::CalculateMemPoolAncestors() – let me know if you agree.

instagibbs commented at 6:08 pm on October 31, 2024:

this is fine, feel free to resolve it

instagibbs commented at 7:20 pm on October 31, 2024:

ok so I think I was a bit wrong, these values were being used and would “blind” us potentially from catching the “BUG! Mempool ancestors or descendants were underestimated” error.

in src/txmempool.h:876 in 88e109f109 outdated

841+        void Apply() EXCLUSIVE_LOCKS_REQUIRED(cs_main);
842+
843+    private:
844+        CTxMemPool* m_pool;
845+        CTxMemPool::indexed_transaction_set m_to_add;
846+        std::vector<CTxMemPool::txiter> m_entry_vec; // track the added transactions' insertion order

instagibbs commented at 3:43 pm on October 23, 2024:

what order should caller call things, e.g. StageAddition? I think some notes for this class would help with assumptions for caller.

sdaftuar commented at 8:42 pm on October 23, 2024:

My intention was to allow calling StageAddition and StageRemoval in any order. I can add comments to that effect.

sdaftuar commented at 0:43 am on October 25, 2024:

Added docs in 7bc0f1f48c079b019f1d7a3fef00ae711f25d624

sdaftuar commented at 8:55 am on October 28, 2024:

Update: I realized that at least for now, there is a requirement that any added transactions are added in a topological order, with parents coming before children – this is so that we can add things to the mempool’s multi_index at Apply() time without creating any missing dependencies. But removals can happen in any order and can be interleaved with the additions.

in src/validation.cpp:1334 in 32101330ec outdated

1332         );
1333         m_subpackage.m_replaced_transactions.push_back(it->GetSharedTx());
1334     }
1335-    ws.m_changeset->Apply();
1336+    m_subpackage.m_changeset->Apply();
1337+    m_subpackage.m_changeset.reset();

instagibbs commented at 4:02 pm on October 23, 2024:

I think “cleanup” of m_subpackage can now all be managed inside ClearSubPackageState, including the m_all_conflicts clear below.

sdaftuar commented at 0:36 am on October 25, 2024:

After looking at this a little more, I don’t think we need m_all_conflicts to live in SubPackageState anymore – FinalizeSubpackage can grab this information from the changeset and then the other two places are just local uses within a function.

Pushed a commit to get rid of this extra state.

instagibbs commented at 5:56 pm on October 28, 2024:

think this is right, you can resolve this

in src/txmempool.cpp:1384 in 32101330ec outdated

1380@@ -1381,7 +1381,13 @@ void CTxMemPool::CTxMemPoolChangeSet::Apply()
1381     m_pool->RemoveStaged(m_to_remove, false, MemPoolRemovalReason::REPLACED);
1382     for (size_t i=0; i<m_entry_vec.size(); ++i) {
1383         auto tx_entry = m_entry_vec[i];
1384-        m_pool->addUnchecked(*tx_entry);
1385+        if (i == 0 && m_ancestors.count(tx_entry)) {

instagibbs commented at 4:05 pm on October 23, 2024:

as of 32101330ec6f6e9c4d467dec370c848cf4d14f23 , now the first entry of m_entry_vec now uses m_ancestors but no others.

Am I understanding the extend of its use?

instagibbs commented at 4:10 pm on October 23, 2024:

0        if (i == 0 && m_ancestors.contains(tx_entry)) {

sdaftuar commented at 8:43 pm on October 23, 2024:

Yes this matches the old behavior, I believe.

sdaftuar commented at 10:14 am on October 24, 2024:

I think this is done?

sdaftuar force-pushed on Oct 23, 2024

DrahtBot added the label CI failed on Oct 23, 2024

DrahtBot commented at 5:12 pm on October 23, 2024: contributor

🚧 At least one of the CI tasks failed. Debug: https://github.com/bitcoin/bitcoin/runs/31962210262

Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.
A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.
An intermittent issue.

Leave a comment here, if you need help tracking down a confusing failure.

in src/validation.cpp:1477 in 1360fd36d1 outdated

1477-        Assume(ws.m_state.GetResult() == TxValidationResult::TX_RECONSIDERABLE);
1478-        return MempoolAcceptResult::FeeFailure(ws.m_state, CFeeRate(ws.m_modified_fees, ws.m_vsize), {ws.m_ptx->GetWitnessHash()});
1479+    FinalizeSubpackage(args);
1480+
1481+    // Limit the mempool, if appropriate.
1482+    if (!args.m_bypass_limits) {

instagibbs commented at 5:49 pm on October 23, 2024:

The dropping of m_package_submission seems problematic, since that’s the original use-case for the argument: #28251

In this PR it’s essentially no longer used.

it has a regression test that isn’t failing in this commit/PR, so either it’s resolved a different way or the test fell over or I don’t understand what’s going on: https://github.com/bitcoin/bitcoin/pull/28251/commits/32c1dd1ad65af0ad4d36a56d2ca32a8481237e68#diff-63dc84ee23b871d1f4931c4f261b3c6b815bd8d5a65098a61bce8ea9b6b81965R81

sdaftuar commented at 10:19 am on October 24, 2024:

Agreed, I have restored this argument where LimitMempoolSize() is invoked in AcceptSingleTransaction(), and I pulled this change out into its own commit so that it’s easier to review and less tangled with the other changes.

More thoughts on this here. It would be nice to have a test that packages which involve a parent transaction that would be just at the bottom of the mempool without their children are able to make it in with a child, if the resulting package would score higher in the mempool than other transactions.

instagibbs commented at 4:49 pm on October 24, 2024:

I’m working on a test case for master for that case

sdaftuar commented at 0:44 am on October 25, 2024:

I think this is all good now (your test in #31152 passes for me locally), so marking this as resolved.

glozow commented at 1:39 am on October 25, 2024:

nice catch

in src/test/fuzz/rbf.cpp:113 in 1c33e133db outdated

103@@ -104,10 +104,18 @@ FUZZ_TARGET(package_rbf, .init = initialize_package_rbf)
104     std::vector<CTransaction> mempool_txs;
105     size_t iter{0};
106 
107-    int32_t replacement_vsize = fuzzed_data_provider.ConsumeIntegralInRange<int32_t>(1, 1000000);
108-
109     // Keep track of the total vsize of CTxMemPoolEntry's being added to the mempool to avoid overflow
110     // Add replacement_vsize since this is added to new diagram during RBF check
111+    std::optional<CMutableTransaction> replacement_tx = ConsumeDeserializable<CMutableTransaction>(fuzzed_data_provider, TX_WITH_WITNESS);

instagibbs commented at 7:02 pm on October 23, 2024:

noting that this isn’t actually required to get variable vsize transactions since ConsumeTxMemPoolEntry injects a fuzz-selected sigops value. Simpler if you just build a minimal tx then allow ConsumeTxMemPoolEntry to choose how big it is in vbytes?

sdaftuar commented at 6:08 pm on October 31, 2024:

I think I prefer letting the fuzzer pick it, so that I don’t have to worry about whether there are other aspects of the transaction I’m overfitting (and thus getting worse coverage than if I let the fuzzer do all the work).

On the other hand, if there are certain transaction templates that we want to be sure are covered, then perhaps I can flip a bool and determine whether to use a pre-constructed transaction versus use one from the fuzzer.

I can’t think of what that might be though, so I’m inclined to leave this as-is unless you feel differently.

EDIT: sorry I just realized that the transaction is actually carefully structured down below (at least the inputs), so yeah I can see why this feels a bit silly.

instagibbs commented at 6:11 pm on October 31, 2024:

I can’t imagine it would greatly effect performance of the fuzz target as it’s a single transaction, but it might matter more if more are constructed like in the rest of the harness.

feel free to resolve

in src/policy/rbf.cpp:187 in 1c33e133db outdated

183@@ -184,14 +184,10 @@ std::optional<std::string> PaysForRBF(CAmount original_fees,
184     return std::nullopt;
185 }
186 
187-std::optional<std::pair<DiagramCheckError, std::string>> ImprovesFeerateDiagram(CTxMemPool& pool,
188-                                                const CTxMemPool::setEntries& direct_conflicts,
189-                                                const CTxMemPool::setEntries& all_conflicts,
190-                                                CAmount replacement_fees,
191-                                                int64_t replacement_vsize)
192+std::optional<std::pair<DiagramCheckError, std::string>> ImprovesFeerateDiagram(CTxMemPool::CTxMemPoolChangeSet& changeset)

instagibbs commented at 7:14 pm on October 23, 2024:

at commit 1c33e133db5f876f2ba6eeb28877a5dcf3f6ac04 are we not applying priority values when computing RBFs, since they’re only being applied during Apply() stage?

subsequent commit 2b5268ebac8daee0521ea82c2688e5e4fac885bc pulls it into changeset, meaning it can act on it?

Do we have coverage of package rbf with priority?

sdaftuar commented at 12:36 pm on October 24, 2024:

Yep you’re right. I swapped the order of those two commits to avoid this problem, and I also added a test for package rbf with priority (which fails on https://github.com/bitcoin/bitcoin/commit/1c33e133db5f876f2ba6eeb28877a5dcf3f6ac04, but passes by the end of this branch).

in src/validation.cpp:1825 in b431ab44b0 outdated

1809@@ -1810,6 +1810,8 @@ PackageMempoolAcceptResult MemPoolAccept::AcceptPackage(const Package& package,
1810         AcceptSubPackage(txns_package_eval, args);
1811     PackageValidationState& package_state_final = multi_submission_result.m_state;
1812 
1813+    ClearSubPackageState();

instagibbs commented at 7:17 pm on October 23, 2024:

This shouldn’t be necessary, but shouldn’t hurt either. AcceptSubPackage calls it already for each subpackage.

sdaftuar commented at 8:53 am on October 28, 2024:

I think the issue is if we quit_early; I recall the fuzzer hitting the new Assume() calls via LimitMempoolSize() at the line just below, before I added this.

instagibbs commented at 5:55 pm on October 28, 2024:

Was the assert failed because of the changeset state, or something that could be hit in master?

AcceptSubPackage is called for each submission attempt of any subpackage, so I’m not sure why quit_early would be problematic.

instagibbs commented at 7:42 pm on October 29, 2024:

still curious about this if you figure out how it was hit

sdaftuar commented at 6:26 pm on October 31, 2024:

Now I think I may have been mistaken; I’ve commented out this line and started fuzzing again, and I cannot reproduce the crash. Also, I agree with your assessment after re-reading the code some more.

However, it would be nice if the SubPackageState were clearly destroyed when it goes out of scope from logic that uses it… Even though it’s seemingly not necessary to call cleanup here, it makes me a little uncomfortable to leave the code in a place where it takes some investigation to figure out that the changeset will be gone by the time LimitMempoolSize() is called.

I might just add a comment mentioning that this should have gotten cleaned up by AcceptSubPackage() but that I’m leaving the cleanup call for readability, given that the call to LimitMempoolSize requires that the changeset be gone. Does that seem reasonable?

glozow commented at 1:59 pm on November 18, 2024:

Comment seems clear to me.

in src/test/fuzz/rbf.cpp:157 in a981f07f03 outdated

153@@ -149,7 +154,9 @@ FUZZ_TARGET(package_rbf, .init = initialize_package_rbf)
154             mempool_txs.pop_back();
155             break;
156         }
157-        AddToMempool(pool, child_entry);
158+        if (!pool.GetIter(child_entry.GetTx().GetHash())) {

instagibbs commented at 7:19 pm on October 23, 2024:

belt-and-suspenders are fine, but it “shouldn’t happen” since each tx chain is sourced from a new g_outpoints?

sdaftuar commented at 12:31 pm on October 24, 2024:

No I think this one is just “belt” – if you look at lines 146-148 above, we’re not always modifying the “child” transaction here, so sometimes it’ll be a duplicate of something we already have.

instagibbs commented at 6:24 pm on October 25, 2024:

ah right

instagibbs commented at 7:23 pm on October 23, 2024: member

first pass through, concept ACK

DrahtBot removed the label CI failed on Oct 23, 2024

sdaftuar force-pushed on Oct 24, 2024

sdaftuar force-pushed on Oct 25, 2024

in src/txmempool.h:801 in b1f32cd943 outdated

796+     *
797+     * This class is used for all mempool additions and associated removals (eg due to rbf).
798+     * Callers can invoke StageAddition()/StageRemoval() of transactions in any order.
799+     *
800+     * CalculateChunksForRBF() can be used to calculate the feerate diagram of the proposed
801+     * set of new transactions and comare with the existing mempool.

instagibbs commented at 6:06 pm on October 25, 2024:

0     * set of new transactions and compare with the existing mempool.

sdaftuar commented at 9:12 pm on October 26, 2024:

Fixed, thanks.

in src/txmempool.h:797 in b1f32cd943 outdated

789@@ -816,6 +790,102 @@ class CTxMemPool
790         assert(m_epoch.guarded()); // verify guard even when it==nullopt
791         return !it || visited(*it);
792     }
793+
794+    /*
795+     * CTxMemPoolChangeSet:
796+     *
797+     * This class is used for all mempool additions and associated removals (eg due to rbf).

instagibbs commented at 6:06 pm on October 25, 2024:

aside from reorgs?

sdaftuar commented at 8:16 pm on October 26, 2024:

Reorgs are split: additions happen via the changesets (since we just use ATMP), while removals happen in their own special funky way.

I can update the comment.

sdaftuar commented at 2:41 pm on October 28, 2024:

This is updated – does this seem sufficient now, or should I add more documentation about how reorgs work?

instagibbs commented at 7:42 pm on October 29, 2024:

text is better thanks

in src/txmempool.h:808 in b1f32cd943 outdated

803+     * CalculateMemPoolAncestors() calculates the in-mempool (not including
804+     * what is in the change set itself) ancestors of a given transacion.
805+     *
806+     * Apply() will apply the removals and additions that are staged into the* mempool.
807+     *
808+     * Only one changeset may exist at a time. While a changeset is outstanding, no removals

instagibbs commented at 6:07 pm on October 25, 2024:

when would removals or additions happen outside of a changeset?

sdaftuar commented at 8:20 pm on October 26, 2024:

Additions never happen outside a changeset. RBF-related removals always go through a changeset as well.

Removals that happen because a block is found, or due to a reorg, or due to mempool expiry or mempool limiting do not go through the changeset.

in src/txmempool.h:806 in b1f32cd943 outdated

801+     * set of new transactions and comare with the existing mempool.
802+     *
803+     * CalculateMemPoolAncestors() calculates the in-mempool (not including
804+     * what is in the change set itself) ancestors of a given transacion.
805+     *
806+     * Apply() will apply the removals and additions that are staged into the* mempool.

instagibbs commented at 6:07 pm on October 25, 2024:

?

0     * Apply() will apply the removals and additions that are staged into the mempool.

sdaftuar commented at 9:12 pm on October 26, 2024:

Fixed, thanks.

in src/validation.cpp:1371 in b1f32cd943 outdated

1407-                                  strprintf("BUG! Adding to mempool failed: %s", ws.m_ptx->GetHash().ToString()));
1408-        }
1409+    }
1410+    if (!all_submitted) {
1411+        // This code should be unreachable; it's here as belt-and-suspenders
1412+        // to try to ensure we have no consensus-invalid transactions in the

instagibbs commented at 6:17 pm on October 25, 2024:

we’re still applying on failure, so it’s not really a belt and suspenders for that anymore is it?

sdaftuar commented at 8:28 pm on October 26, 2024:

Not sure I follow – if somehow the PolicyScriptChecks() passed for something that failed ConsensusChecks(), then this logic would trigger and we would then remove the transactions from the mempool. It’s slightly different than the old behavior of never allowing it to enter in the first place, but given that we’re holding the mempool locks the whole time it is not really directly observable?

instagibbs commented at 1:49 pm on October 28, 2024:

Misunderstood the code at time of writing this comment, mark as resolved

in src/validation.cpp:1103 in b1f32cd943 outdated

1099@@ -1100,13 +1100,15 @@ bool MemPoolAccept::ReplacementChecks(Workspace& ws)
1100                              strprintf("insufficient fee%s", ws.m_sibling_eviction ? " (including sibling eviction)" : ""), *err_string);
1101     }
1102 
1103+    CTxMemPool::setEntries m_all_conflicts;

instagibbs commented at 6:18 pm on October 25, 2024:

0    CTxMemPool::setEntries all_conflicts;

sdaftuar commented at 9:12 pm on October 26, 2024:

Fixed, thanks.

in src/validation.cpp:1188 in b1f32cd943 outdated

1184@@ -1178,13 +1185,16 @@ bool MemPoolAccept::PackageMempoolChecks(const std::vector<CTransactionRef>& txn
1185 
1186     // Don't consider replacements that would cause us to remove a large number of mempool entries.
1187     // This limit is not increased in a package RBF. Use the aggregate number of transactions.
1188+    CTxMemPool::setEntries m_all_conflicts;

instagibbs commented at 6:19 pm on October 25, 2024:

0    CTxMemPool::setEntries all_conflicts;

sdaftuar commented at 9:14 pm on October 26, 2024:

Fixed, thanks.

glozow referenced this in commit 2a52718d73 on Oct 26, 2024

sdaftuar force-pushed on Oct 26, 2024

sdaftuar commented at 9:21 pm on October 26, 2024: member

Rebased (to pick up the test introduced in #31152) and addressed latest feedback.

in src/txmempool.h:820 in 0df81a420e outdated

815+     *
816+     * Only one changeset may exist at a time. While a changeset is
817+     * outstanding, no removals or additions may be made directly to the
818+     * mempool.
819+     */
820+    class CTxMemPoolChangeSet {

sipa commented at 12:53 pm on October 27, 2024:

Nit, if it’s a member class of CTxMempool anyway, maybe a shorter name like ChangeSet suffices?

sdaftuar commented at 8:07 pm on October 27, 2024:

Done.

in src/txmempool.h:838 in 0df81a420e outdated

833+        const CTxMemPool::setEntries& GetRemovals() const { return m_to_remove; }
834+
835+        util::Result<CTxMemPool::setEntries> CalculateMemPoolAncestors(TxHandle tx, const Limits& limits) {
836+            LOCK(m_pool->cs);
837+            auto ret = m_pool->CalculateMemPoolAncestors(*tx, limits);
838+            if (ret) m_ancestors.insert({tx, *ret});

sipa commented at 1:08 pm on October 27, 2024:

Style: use braces/newlines/indentation when the entire if statement doesn’t fit on a single line.

sdaftuar commented at 8:07 pm on October 27, 2024:

No longer relevant after rewriting this function.

in src/txmempool.h:835 in 0df81a420e outdated

830+        TxHandle StageAddition(const CTransactionRef& tx, const CAmount fee, int64_t time, unsigned int entry_height, uint64_t entry_sequence, bool spends_coinbase, int64_t sigops_cost, LockPoints lp);
831+        void StageRemoval(CTxMemPool::txiter it) { m_to_remove.insert(it); }
832+
833+        const CTxMemPool::setEntries& GetRemovals() const { return m_to_remove; }
834+
835+        util::Result<CTxMemPool::setEntries> CalculateMemPoolAncestors(TxHandle tx, const Limits& limits) {

sipa commented at 1:36 pm on October 27, 2024:

I do find this function a bit confusing; it seems like m_ancestors is effectively a cache for the result of m_pool->CalculateMempoolAncestors(), but then I’d expect this function to also act like a cache?

How about:

 0util::Result<CTxMemPool::setEntries> CalculateMemPoolAncestors(TxHandle tx, const Limits& limits)
 1{
 2    LOCK(m_pool->cs);
 3    auto it = m_ancestors.find(tx);
 4    util::Result<CTxMemPool::setEntries> ret;
 5    if (it == m_ancestors.end()) {
 6        ret = m_pool->CalculateMemPoolAncestors(*tx, limits);
 7        if (ret) {
 8            m_ancestors.try_emplace(tx, *ret);
 9        } else {
10            // Is erasing ever needed? Maybe when the cache gets invalidated?
11        }
12    }
13    return ret;
14}

and then always use this function, rather than directly accessing m_ancestors in CTxMempool::Apply.

sdaftuar commented at 8:15 pm on October 27, 2024:

I reworked your suggestion to capture the caching idea, and now invoke this function in Apply. However, there’s still an ugly special case in Apply() in that we can only used a cached ancestor set calculation for the first transaction in a package (because the calculation is no longer correct once transactions start getting added to the mempool).

Note that in the cluster mempool implementation, this ugliness in Apply() goes away, because we will no longer need to have ancestor sets calculated in order to add a transaction to the mempool.

Furthermore, I believe that we may be able to completely eliminate the need for this function altogether, post-cluster-mempool. I should go back and try to rework #28676 to demonstrate that this is the case, but my recollection is that the only reason we will need to calculate ancestors sets in the future is for an RBF sanity check (namely that we’re not replacing an ancestor of any to-be-added transaction) which we can reimplement without these ancestor calculations.

sdaftuar commented at 4:28 pm on November 1, 2024:

Looking into whether I can avoid exposing this function even before cluster mempool lands; will consider it for a followup PR.

sdaftuar force-pushed on Oct 27, 2024

in src/validation.cpp:1373 in 8e2aac1e87 outdated

1369@@ -1366,7 +1370,7 @@ bool MemPoolAccept::SubmitPackage(const ATMPArgs& args, std::vector<Workspace>&
1370         // Re-calculate mempool ancestors to call addUnchecked(). They may have changed since the
1371         // last calculation done in PreChecks, since package ancestors have already been submitted.
1372         {
1373-            auto ancestors{m_pool.CalculateMemPoolAncestors(*ws.m_entry, m_pool.m_opts.limits)};
1374+            auto ancestors{ws.m_changeset->CalculateMemPoolAncestors(ws.m_tx_handle, m_pool.m_opts.limits)};

instagibbs commented at 5:03 pm on October 29, 2024:

8e2aac1e87b67a5d28cdc507df9bca85614cc97d it’s using the same cached value, so how can it return something new?

this disappears next commit of course…

sdaftuar commented at 6:40 pm on October 31, 2024:

I’m very puzzled why none of the tests appear to fail on this commit.

sdaftuar commented at 6:56 pm on October 31, 2024:

Ah, it’s because at this point, the Apply() function doesn’t use the ancestors anyway. Agreed that this is pretty ugly – I’ll try to fix.

sdaftuar commented at 2:13 pm on November 1, 2024:

Should be better now, I think – now I’m introducing the ancestor caching inside of CalculateMemPoolAncestors into the same commit that introduces using the cached value in Apply().

instagibbs commented at 2:45 pm on November 1, 2024:

history is cleaner, thank you

in src/validation.cpp:1368 in 27c2abffbd outdated

1402-            all_submitted = false;
1403-            package_state.Invalid(PackageValidationResult::PCKG_MEMPOOL_ERROR,
1404-                                  strprintf("BUG! Adding to mempool failed: %s", ws.m_ptx->GetHash().ToString()));
1405-        }
1406+    }
1407+    if (!all_submitted) {

instagibbs commented at 5:29 pm on October 29, 2024:

nit: would be nice to check the changeset exists before applying it

sdaftuar commented at 2:14 pm on November 1, 2024:

Added an Assume() for it.

in src/txmempool.h:857 in a8d0976de7 outdated

852+            }
853+            return ret;
854+        }
855+
856+        size_t GetTxCount() const { return m_entry_vec.size(); }
857+        const CTransaction& GetTx(size_t index) const { return m_entry_vec.at(index)->GetTx(); }

instagibbs commented at 6:21 pm on October 29, 2024:

GetAddedTxn ?

sdaftuar commented at 3:01 pm on November 1, 2024:

Done.

in src/validation.cpp:685 in a8d0976de7 outdated

681@@ -682,7 +682,9 @@ class MemPoolAccept
682     bool ConsensusScriptChecks(const ATMPArgs& args, Workspace& ws) EXCLUSIVE_LOCKS_REQUIRED(cs_main, m_pool.cs);
683 
684     // Try to add the transaction to the mempool, removing any conflicts first.
685-    void FinalizeSubpackage(const ATMPArgs& args, std::vector<Workspace>& ws) EXCLUSIVE_LOCKS_REQUIRED(cs_main, m_pool.cs);
686+    // Returns true if the transaction is in the mempool after any size

instagibbs commented at 6:23 pm on October 29, 2024:

copy/past error or something?

naumenkogs commented at 12:40 pm on October 30, 2024:

in a8d0976de7edf43d3cfea8dff3afc923580d8f84

sdaftuar commented at 2:14 pm on November 1, 2024:

No idea! Gone now.

in src/txmempool.h:868 in ca235933e0 outdated

863     std::unique_ptr<ChangeSet> GetChangeSet() EXCLUSIVE_LOCKS_REQUIRED(cs) { return std::make_unique<ChangeSet>(this); }
864+
865+    friend class CTxMemPool::ChangeSet;
866+
867+private:
868+    // addNewTransaction must updated state for all ancestors of a given transaction,

instagibbs commented at 7:00 pm on October 29, 2024:

while we’re moving the comment, can we fix the grammar: “must updated”?

sdaftuar commented at 2:14 pm on November 1, 2024:

Fixed.

in src/txmempool.h:875 in ca235933e0 outdated

870+    // addNewTransaction can be used to have it call CalculateMemPoolAncestors(), and
871+    // then invoke the second version.
872+    // Note that addNewTransaction is ONLY called from ATMP outside of tests
873+    // and any other callers may break wallet's in-mempool tracking (due to
874+    // lack of CValidationInterface::TransactionAddedToMempool callbacks).
875+    void Apply(CTxMemPool::ChangeSet* changeset) EXCLUSIVE_LOCKS_REQUIRED(cs);

instagibbs commented at 7:01 pm on October 29, 2024:

comment above this isn’t for Apply.

sdaftuar commented at 2:14 pm on November 1, 2024:

Moved.

in src/txmempool.h:872 in ca235933e0 outdated

867+private:
868+    // addNewTransaction must updated state for all ancestors of a given transaction,
869+    // to track size/count of descendant transactions.  First version of
870+    // addNewTransaction can be used to have it call CalculateMemPoolAncestors(), and
871+    // then invoke the second version.
872+    // Note that addNewTransaction is ONLY called from ATMP outside of tests

instagibbs commented at 7:02 pm on October 29, 2024:

it’s not called from ATMP or tests anymore, thankfully it’s private now

sdaftuar commented at 2:15 pm on November 1, 2024:

Reworded.

in src/txmempool.h:264 in ca235933e0 outdated

259@@ -260,7 +260,7 @@ struct TxMempoolInfo
260  *
261  * Usually when a new transaction is added to the mempool, it has no in-mempool
262  * children (because any such children would be an orphan).  So in
263- * addUnchecked(), we:
264+ * addNewTransaction(), we:
265  * - update a new entry's setMemPoolParents to include all in-mempool parents

instagibbs commented at 7:04 pm on October 29, 2024:

not your fault but: I don’t think setMemPoolParents exists in master even

I feel like these paragraphs are a minefield of old information

sdaftuar commented at 2:15 pm on November 1, 2024:

Tried to remove all instances of setMemPoolParents and setMemPoolChildren from code comments.

in src/txmempool.cpp:596 in c8b896067d outdated

602@@ -603,6 +603,7 @@ void CTxMemPool::removeRecursive(const CTransaction &origTx, MemPoolRemovalReaso
603 {

instagibbs commented at 7:17 pm on October 29, 2024:

c8b896067d42001f2a47cd837c0aed07bb39f329 commit message could probably use a little massaging since removals can happen in a changeset? I’m assuming these are all block/trimming related

sdaftuar commented at 2:16 pm on November 1, 2024:

This was a good suggestion, thanks – with the introduction of changesets, we can make RemoveStaged() a private method now, and so I updated the commit message to explain that the public removal methods may not be invoked while a changeset is outstanding.

in src/test/rbf_tests.cpp:379 in 5e447a3059 outdated

373@@ -374,6 +374,10 @@ BOOST_FIXTURE_TEST_CASE(improves_feerate, TestChain100Setup)
374     const auto tx2_fee = entry2->GetModifiedFee();
375     const auto tx2_size = entry2->GetTxSize();
376 
377+    // conflicting transactions
378+    const auto tx1_conflict = make_tx(/*inputs=*/ {m_coinbase_txns[0], m_coinbase_txns[2]}, /*output_values=*/ {10 * COIN});
379+    auto entry1_conflict = entry.FromTx(tx1_conflict);

instagibbs commented at 7:22 pm on October 29, 2024:

5e447a3059224ac97ec7cf5681dc3ca492770968 unused?

sdaftuar commented at 2:16 pm on November 1, 2024:

Gone thanks

in src/validation.cpp:742 in 27c2abffbd outdated

738@@ -740,6 +739,8 @@ class MemPoolAccept
739         CTxMemPool::setEntries m_all_conflicts;
740         /** Mempool transactions that were replaced. */
741         std::list<CTransactionRef> m_replaced_transactions;
742+        /* Changeset representing adding a transaction and removing its conflicts. */

instagibbs commented at 7:31 pm on October 29, 2024:

circa 27c2abffbd9c5a83fd2becf6d7b70c7b30031f76 it’s applying multiple transactions

0        /* Changeset representing adding transactions and removing their conflicts. */

sdaftuar commented at 2:16 pm on November 1, 2024:

Done

in src/txmempool.cpp:1321 in e60239454c outdated

1315-    Assume(replacement_vsize > 0);
1316+    LOCK(m_pool->cs);
1317+    FeeFrac replacement_feerate{0, 0};
1318+    for (auto it : m_entry_vec) {
1319+        replacement_feerate += {it->GetModifiedFee(), it->GetTxSize()};
1320+    }

instagibbs commented at 7:34 pm on October 29, 2024:

after this loop, sanity check replacement_feerate isn’t size 0?

sdaftuar commented at 2:10 pm on November 1, 2024:

I think if you created a changeset and immediately invoked CalculateChunksForRBF(), then this would be size 0 right? Does anything here break if it is size 0? (It looks ok to me, not sure if I’m missing something.)

instagibbs commented at 2:41 pm on November 1, 2024:

Ok, given new interface seems impossible to hit and seems well-defined. Added another test case to ComapreChunks to just make sure it was well-defined:

 0diff --git a/src/test/rbf_tests.cpp b/src/test/rbf_tests.cpp
 1index c6d06c937c..d610661878 100644
 2--- a/src/test/rbf_tests.cpp
 3+++ b/src/test/rbf_tests.cpp
 4@@ -631,20 +631,32 @@ BOOST_AUTO_TEST_CASE(feerate_chunks_utilities)
 5 {
 6     // Sanity check the correctness of the feerate chunks comparison.
 7 
 8     // A strictly better case.
 9     std::vector<FeeFrac> old_chunks{{{950, 300}, {100, 100}}};
10     std::vector<FeeFrac> new_chunks{{{1000, 300}, {50, 100}}};
11 
12     BOOST_CHECK(std::is_lt(CompareChunks(old_chunks, new_chunks)));
13     BOOST_CHECK(std::is_gt(CompareChunks(new_chunks, old_chunks)));
14 
15+    // Empty comparisons
16+    old_chunks = {{0, 0}};
17+    new_chunks = {{1000, 300}, {0, 100}};
18+
19+    BOOST_CHECK(std::is_lt(CompareChunks(old_chunks, new_chunks)));
20+    BOOST_CHECK(std::is_gt(CompareChunks(new_chunks, old_chunks)));
21+
22+    old_chunks = {{0, 0}};
23+    new_chunks = {{0, 0}};
24+
25+    BOOST_CHECK(std::is_eq(CompareChunks(old_chunks, new_chunks)));
26+
27     // Incomparable diagrams
28     old_chunks = {{950, 300}, {100, 100}};
29     new_chunks = {{1000, 300}, {0, 100}};
30 
31     BOOST_CHECK(CompareChunks(old_chunks, new_chunks) == std::partial_ordering::unordered);
32     BOOST_CHECK(CompareChunks(new_chunks, old_chunks) == std::partial_ordering::unordered);
33 
34     // Strictly better but smaller size.
35     old_chunks = {{950, 300}, {100, 100}};
36     new_chunks = {{1100, 300}};

sdaftuar commented at 3:02 pm on November 1, 2024:

Maybe worth a separate PR to ensure the test coverage; will mark this as resolved.

instagibbs commented at 7:41 pm on October 29, 2024: member

LGTM e2324779781b695024d3b17340001fa2da5d3c7f

with non-blocking nits

will do some testing

sdaftuar force-pushed on Nov 1, 2024

instagibbs commented at 2:54 pm on November 1, 2024: member

ACK 9512bd34a3d74d15c3ce1086a80a1edfa5ac3acf

via git range-diff master e2324779781b695024d3b17340001fa2da5d3c7f 9512bd34a3d74d15c3ce1086a80a1edfa5ac3acf

DrahtBot requested review from glozow on Nov 1, 2024

DrahtBot requested review from ismaelsadeeq on Nov 1, 2024

in src/test/txpackage_tests.cpp:1082 in 2ee7fd9708 outdated

1077@@ -1078,7 +1078,25 @@ BOOST_AUTO_TEST_CASE(package_rbf_tests)
1078         BOOST_CHECK_EQUAL(it_child_3->second.m_effective_feerate.value().GetFee(package3_total_vsize), 199 + 1300);
1079 
1080         BOOST_CHECK_EQUAL(m_node.mempool->size(), expected_pool_size);
1081-    }
1082 
1083+        // Finally, check that we can prioritise package1 and get it back into the mempool.

ismaelsadeeq commented at 4:11 pm on November 4, 2024:

For clarity:

0        // Finally, check that we can prioritize tx_child_1 to get package1 into the mempool.

We can also prioritize tx_parent_1 to get package1 in, but the nFeeDelta must be high enough to evict package3.

sdaftuar commented at 4:56 pm on November 7, 2024:

Done

in src/txmempool.h:876 in 810a794946 outdated

840+        void Apply() EXCLUSIVE_LOCKS_REQUIRED(cs_main);
841+
842+    private:
843+        CTxMemPool* m_pool;
844+        CTxMemPool::indexed_transaction_set m_to_add;
845+        std::vector<CTxMemPool::txiter> m_entry_vec; // track the added transactions' insertion order

ismaelsadeeq commented at 4:29 pm on November 4, 2024:

Can just be vector of TxHandle here.

sdaftuar commented at 4:58 pm on November 7, 2024:

My thought was that we’re storing CTxMemPool::txiter objects internally, but the TxHandle object could be replaced with some other wrapper if we wanted to hide the internals of the mempool further. So I’m leaving this as-is for now.

ismaelsadeeq commented at 9:22 pm on November 12, 2024:

Ahh Makes sense

in src/validation.cpp:1362 in 7807ae8265 outdated

1374-
1375-        // Re-calculate mempool ancestors to call addUnchecked(). They may have changed since the
1376-        // last calculation done in PreChecks, since package ancestors have already been submitted.
1377-        {
1378-            auto ancestors{ws.m_changeset->CalculateMemPoolAncestors(ws.m_tx_handle, m_pool.m_opts.limits)};
1379-            if(!ancestors) {

ismaelsadeeq commented at 3:00 pm on November 6, 2024:

This seems fine to remove because it’s redundant anyways. But worth noting in the commit message.

sdaftuar commented at 5:00 pm on November 7, 2024:

Updated the commit message.

in src/validation.cpp:1330 in 9ef19d4ff4 outdated

1336-        if (!m_pool.exists(GenTxid::Txid(hash)))
1337-            // The tx no longer meets our (new) mempool minimum feerate but could be reconsidered in a package.
1338-            return state.Invalid(TxValidationResult::TX_RECONSIDERABLE, "mempool full");
1339-    }
1340-    return true;
1341+    return;

ismaelsadeeq commented at 3:13 pm on November 6, 2024:

The return here is redundant.

sdaftuar commented at 5:01 pm on November 7, 2024:

Gone.

in src/validation.cpp:1326 in cc666ec52a outdated

1342                 it->GetFee(),
1343                 std::chrono::duration_cast<std::chrono::duration<std::uint64_t>>(it->GetTime()).count(),
1344-                hash.data(),
1345-                workspaces[0].m_tx_handle->GetTxSize(),
1346-                workspaces[0].m_tx_handle->GetFee()
1347+                tx_or_package_hash.data(),

ismaelsadeeq commented at 3:32 pm on November 6, 2024:

tracing.md doc is still saying

Replacement transaction ID (hash) as pointer to unsigned chars (i.e. 32 bytes in little-endian)

sdaftuar commented at 5:02 pm on November 7, 2024:

I cherry-picked https://github.com/0xB10C/bitcoin/commit/2eb1410e774636040d19d3baf5e22b2e4a8fbfd2, which updates the documentation and resolves the issue discussed above: #31122 (review)

ismaelsadeeq commented at 4:57 pm on November 6, 2024: member

ACK 9512bd34a3d74d15c3ce1086a80a1edfa5ac3acf With some comments.

sdaftuar force-pushed on Nov 7, 2024

DrahtBot added the label Needs rebase on Nov 11, 2024

0xB10C commented at 12:29 pm on November 11, 2024: contributor

🐙 This pull request conflicts with the target branch and needs rebase.

Note that in #26593 the tracepoint macros (e.g. TRACE7, TRACE8, ..) were renamed to just TRACEPOINT (without a number). When rebasing, there will be a few places where these will need to be renamed as you touched code close to it. Similarly, in the last commit (“tracing: pass if replaced by tx/pkg to tracepoint”, currently 0ca1e05d8bcbc42892156c15f128a5f829e9e48e) the TRACE8 won’t work and will need to be just TRACEPOINT.

naumenkogs commented at 10:41 am on November 12, 2024: member

ACK 0ca1e05d8bcbc42892156c15f128a5f829e9e48e

Spent a fair amount of time staring at ConsensusScriptChecks behavior change mentioned in the original post. It seems alright. Handling various limits is probably the most complex part of this code change.

Otherwise the code seems clear and nothing is broken. Still hard for me to reason about high-level design decisions on cluster mempool here, but I learned a lot while reviewing this pr at least.

DrahtBot requested review from ismaelsadeeq on Nov 12, 2024

DrahtBot requested review from instagibbs on Nov 12, 2024

instagibbs approved

instagibbs commented at 2:31 pm on November 13, 2024: member

ACK 0ca1e05d8bcbc42892156c15f128a5f829e9e48e

primary change is the tracing commit, which seems to make sense though I am not an expert

via git range-diff master 9512bd34a3d74d15c3ce1086a80a1edfa5ac3acf 0ca1e05d8bcbc42892156c15f128a5f829e9e48e

Add package hash to package-rbf log message 15d982f91e

test: Add unit test coverage of package rbf + prioritisetransaction 87d92fa340

Introduce mempool changesets

Introduce the CTxMemPool::ChangeSet, a wrapper for creating (potential) new
mempool entries and removing conflicts.

802214c083

Move changeset from workspace to subpackage

Removes a redundant check that mempool limits will not be violated during
package acceptance.

01e145b975

Move LimitMempoolSize to take place outside FinalizeSubpackage 57983b8add

Clean up FinalizeSubpackage to avoid workspace-specific information

Also, use the "package hash" for logging replacements in the package rbf
setting.

34b6c5833d

Apply mempool changeset transactions directly into the mempool

Rather than individually calling addUnchecked for each transaction added in a
changeset (after removing all the to-be-removed transactions), instead we can
take advantage of boost::multi_index's splicing features to extract and insert
entries directly from the staging multi_index into mapTx.

This has the immediate advantage of saving allocation overhead for mempool
entries which have already been allocated once. This also means that the memory
locations of mempool entries will not change when transactions go from staging
to the main mempool.

Additionally, eliminate addUnchecked and require all new transactions to enter
the mempool via a CTxMemPoolChangeSet.

7fb62f7db6

Enforce that there is only one changeset at a time 18829194ca

Make RemoveStaged() private 2b30f4d36c

Public mempool removal methods Assume() no changeset is outstanding

While a changeset is outstanding, removing transaction directly from the
mempool will invalidate the changeset state, so this is not permitted.

b447416fdd

Duplicate transactions are not permitted within a changeset b53041021a

Don't distinguish between direct conflicts and all conflicts when doing cluster-size-2-rbf checks 446b08b599

Move prioritisation into changeset 284a1d33f1

Move CalculateChunksForRBF() to the mempool changeset d7dc9fd2f7

Ensure that we don't add duplicate transactions in rbf fuzz tests d3c8e7dfb6

Remove m_all_conflicts from SubPackageState 83f814b1d1

doc: add comments for CTxMemPool::ChangeSet a4ec07f194

tracing: pass if replaced by tx/pkg to tracepoint

The mempool:replaced tracepoint now reports either a txid or a
package hash (previously it always was a txid). To let users know
if a txid or package hash is passed, a boolean argument is added
the the tracepoint.

In the functional test, a ctypes.Structure class for MempoolReplaced
is introduced as Python warns the following when not explcitly
casting it to a ctype:

  Type: 'bool' not recognized. Please define the data with ctypes manually.

5736d1ddac

sdaftuar force-pushed on Nov 13, 2024

sdaftuar commented at 6:33 pm on November 13, 2024: member

Just rebased this on master – @0xB10C thanks for the note on the tracepoint change. @instagibbs I think I could rewrite the CheckEphemeralSpends() code to take advantage of the changesets for package validation (since we will now be maintaining a map of package transactions that we can query), but maybe that can be done in a separate PR? Just left that code as-is for now. @naumenkogs Thanks for looking at the validation change relating to consensus-invalid packages. Someone should probably test that logic by changing the policy checks to NOT be a strict superset of the consensus checks and verify that the logic I’m introducing here actually does what I think it does. I intend to do that myself but haven’t gotten to it yet.

EDIT: I hacked up the validation logic to trigger this currently unreachable block of code – ie to make the ConsensusScriptChecks() fail in a package validation context – and verified that logic to remove the newly added transactions seems to work. Would be great for others to verify this as well!

DrahtBot removed the label Needs rebase on Nov 13, 2024

instagibbs approved

instagibbs commented at 7:28 pm on November 13, 2024: member

ACK 5736d1ddacc4019101e7a5170dd25efbc63b622a

Picks up TRACEPOINT and ephemeral dust stuff.

via git range-diff master 0ca1e05d8bcbc42892156c15f128a5f829e9e48e 5736d1ddacc4019101e7a5170dd25efbc63b622a

DrahtBot requested review from naumenkogs on Nov 13, 2024

naumenkogs commented at 7:48 am on November 14, 2024: member

@sdaftuar

EDIT: I hacked up the validation logic to trigger this currently unreachable block of code – ie to make the ConsensusScriptChecks() fail in a package validation context – and verified that logic to remove the newly added transactions seems to work. Would be great for others to verify this as well!

I had a little doubt that it will remove the transactions… Instead I was worried about evictions (i can’t think of other side effects) of add/remove invalid stuff. Doesn’t this allow to clear the victims mempool for free?

sdaftuar commented at 11:18 am on November 14, 2024: member

Instead I was worried about evictions (i can’t think of other side effects) of add/remove invalid stuff. Doesn’t this allow to clear the victims mempool for free?

Just to be clear, in case anyone has lost track: we’re specifically talking about a codepath we believe is not possible to trigger, namely one where the PolicyScriptChecks pass yet the ConsensusScriptChecks fail for some transaction, and someone is able to submit such a transaction as part of a package to our node.

In this event, I actually don’t think the mempool could be directly drained, because we don’t call LimitMempoolSize() until after this condition is detected and the new transactions are removed. However, I did observe that any RBF’s that take place would not be fully reverted (ie the replaced transactions aren’t being added back).

If it were simple to just re-add the just-removed transactions, I’d try to include that here as well, but it would take some additional work to topologically sort the removed transactions so they can be re-added. Given that this logic is unreachable and already difficult-to-test as a result, I don’t think it makes sense to add any significant complexity, at least for now – this may be simpler to think about post-cluster-mempool.

ismaelsadeeq commented at 4:35 pm on November 15, 2024: member

reACK 5736d1ddacc4019101e7a5170dd25efbc63b622a

naumenkogs commented at 8:32 am on November 18, 2024: member

ACK https://github.com/bitcoin/bitcoin/commit/5736d1ddacc4019101e7a5170dd25efbc63b622a

My concern above was related to LimitMempoolSize(), but indeed it’s easy to see it doesn’t happen between the two m_changeset->Apply() calls.

For the RBF transactions i agree it’s probably not worth complicating the code.

in src/txmempool.h:848 in 5736d1ddac

843+            // If not found, try to have the mempool calculate it, and cache
844+            // for later.
845+            LOCK(m_pool->cs);
846+            auto ret{m_pool->CalculateMemPoolAncestors(*tx, limits)};
847+            if (ret) m_ancestors.try_emplace(tx, *ret);
848+            return ret;

glozow commented at 1:40 pm on November 18, 2024:

This is maybe a TxGraph question, but will we get to the point where CMPA and CheckPackageLimits can do the calculations using all staged changes? For example, if it looks like I would bust cluster limits using current mempool, but my parent’s RBF is breaking up the cluster / removing things from it.

I’m wondering if we could easily align our multi-testmempoolaccept interface with the ProcessNewPackage interface to support package CPFP and RBF by pointing testmempoolaccept down the same codepath, with some if (test_accept) conditions to skip submission and retain changeset + coinsview state.

sipa commented at 2:23 pm on November 18, 2024:

TxGraph will be able reason about cluster count limits using arbitrary staged changes (so any combination of transaction deletions, transaction additions, and added dependencies).

These operations are always ordered as (1) tx additions (2) tx deletions (3) dep additions, even if they are invoked in a different order. So if you first add dependencies which - if applied - would violate the limit, and then delete some transactions in the cluster such that the result breaks up again and stays below the limit, that is fine.

sdaftuar commented at 1:46 pm on November 19, 2024:

I also have a goal of eliminating the need to invoke CMPA on a changeset anyway. In #28676, I believe the only place that CMPA is used in a changeset is for determining whether an RBF might be replacing something it depends on. I think it should be straightforward to have the changeset logic directly test for that condition and return failure if a dependency is being added to something that is slated for removal, which would then allow us to remove the method altogether.

in src/txmempool.h:797 in 5736d1ddac

789@@ -816,6 +790,122 @@ class CTxMemPool
790         assert(m_epoch.guarded()); // verify guard even when it==nullopt
791         return !it || visited(*it);
792     }
793+
794+    /*
795+     * CTxMemPool::ChangeSet:
796+     *
797+     * This class is used for all mempool additions and associated removals (eg

glozow commented at 1:53 pm on November 18, 2024:

Re: “all mempool additions and removals”

Is the plan to use changesets for all other mempool changes as well, such as removeForBlock and removeForReorg?

sdaftuar commented at 5:19 pm on November 18, 2024:

I’ve discussed this question with @sipa a bit; our thought was that changesets are for evaluating changes that we might need to back out and not apply; removals due to a block being found or to make the mempool consistent after a reorg have to happen, so we don’t bother with a changeset and just apply them directly.

If you have other thoughts on this let me know– happy to reconsider if there’s a good reason to structure removals differently.

glozow commented at 2:02 am on November 19, 2024:

That makes sense, thanks. I wanted to clarify what was meant by “all” additions and removals.

glozow commented at 12:36 pm on November 20, 2024:

Sorry about this comment - just reread what I said and it makes no sense.

glozow commented at 2:08 pm on November 18, 2024: member

Still working on review. I’ve skimmed through the commits and reviewed the end state of MemPoolAccept so far.

DrahtBot requested review from glozow on Nov 18, 2024

in src/txmempool.h:857 in 34b6c5833d outdated

852+            }
853+            return ret;
854+        }
855+
856+        size_t GetTxCount() const { return m_entry_vec.size(); }
857+        const CTransaction& GetAddedTxn(size_t index) const { return m_entry_vec.at(index)->GetTx(); }

glozow commented at 11:27 am on November 20, 2024:

nitty, but I wondered why not CTransactionRef here?

0        CTransactionRef GetAddedTxn(size_t index) const { return m_entry_vec.at(index)->GetSharedTx(); }

in src/txmempool.h:875 in 7fb62f7db6 outdated

870+    void Apply(CTxMemPool::ChangeSet* changeset) EXCLUSIVE_LOCKS_REQUIRED(cs);
871+
872+    // addNewTransaction must update state for all ancestors of a given transaction,
873+    // to track size/count of descendant transactions.  First version of
874+    // addNewTransaction can be used to have it call CalculateMemPoolAncestors(), and
875+    // then invoke the second version.

glozow commented at 11:43 am on November 20, 2024:

I misunderstood this comment initially, thinking it meant that you would call both addNewTransactions, but it actually means that one is a wrapper for the other, with a prepended CMPA. Callers should use one or the other.

in src/txmempool.cpp:1170 in b447416fdd outdated

1166@@ -1163,6 +1167,7 @@ void CTxMemPool::trackPackageRemoved(const CFeeRate& rate) {
1167 
1168 void CTxMemPool::TrimToSize(size_t sizelimit, std::vector<COutPoint>* pvNoSpendsRemaining) {
1169     AssertLockHeld(cs);
1170+    Assume(!m_have_changeset);

glozow commented at 12:06 pm on November 20, 2024:

b447416fddcb8c8647391502cca3dbfd1552e02e is nice!

in src/test/rbf_tests.cpp:378 in 284a1d33f1 outdated

373@@ -374,6 +374,9 @@ BOOST_FIXTURE_TEST_CASE(improves_feerate, TestChain100Setup)
374     const auto tx2_fee = entry2->GetModifiedFee();
375     const auto tx2_size = entry2->GetTxSize();
376 
377+    // conflicting transactions
378+    const auto tx1_conflict = make_tx(/*inputs=*/ {m_coinbase_txns[0], m_coinbase_txns[2]}, /*output_values=*/ {10 * COIN});

glozow commented at 12:28 pm on November 20, 2024:

nit: these changes were perhaps meant for d7dc9fd2f7bc675256687b9c55fdbec9cc8ac781 instead of 284a1d33f1dcbc3b3404ea40a948ff6600239613?

glozow commented at 12:39 pm on November 20, 2024: member

ACK 5736d1ddacc

I really like how ChangeSet codifies the concept of an atomic changeset of additions and removals, and how CalculateChunksForRBF and ImprovesFeerateDiagram operate on changesets. It’s like the mempool interface is moving towards it being more responsible for itself.

glozow merged this on Nov 20, 2024

glozow closed this on Nov 20, 2024

cluster mempool: Implement changeset interface for mempool #31122

Code Coverage & Benchmarks

Reviews

Conflicts

Tracepoint mempool:replaced_by_package

Tracepoint `mempool:replaced_by_package`