Tidy up ProcessOrphanTx #19498
pull jnewbery wants to merge 6 commits into bitcoin:master from jnewbery:2020-07-orphan-tidy changing 2 files +55 −43-
jnewbery commented at 10:30 am on July 12, 2020: memberOriginally a follow-up to #19364, this simplifies the logic in ProcessOrphanTx() and removes unused variables.
-
jnewbery commented at 10:32 am on July 12, 2020: member
-
fanquake added the label P2P on Jul 12, 2020
-
in src/net_processing.cpp:1967 in 9c954f4a86 outdated
1940+/** 1941+ * Reconsider orphan transactions after a parent has been accepted to the mempool. 1942+ * 1943+ * @param[in] connman Reference to the connection manager 1944+ * @param[in] mempool Reference to mempool 1945+ * @param[in/out] orphan_work_set The set of orphan transactions to reconsider. Only one
jonatack commented at 2:53 pm on July 12, 2020:2ec4eeadb nit, can mention “Reference to” here as well
jnewbery commented at 5:16 pm on July 12, 2020:Meh. I think I’d prefer to remove “Reference to” from the others than add this.in src/net_processing.cpp:1965 in 9c954f4a86 outdated
1935@@ -1925,64 +1936,67 @@ static void ProcessHeadersMessage(CNode& pfrom, CConnman* connman, ChainstateMan 1936 return; 1937 } 1938 1939-void static ProcessOrphanTx(CConnman* connman, CTxMemPool& mempool, std::set<uint256>& orphan_work_set, std::list<CTransactionRef>& removed_txn) EXCLUSIVE_LOCKS_REQUIRED(cs_main, g_cs_orphans) 1940+/** 1941+ * Reconsider orphan transactions after a parent has been accepted to the mempool. 1942+ * 1943+ * @param[in] connman Reference to the connection manager
jonatack commented at 3:16 pm on July 12, 2020:9c954f4 it looks likeconnman
can be reference to const?
jnewbery commented at 5:16 pm on July 12, 2020:sure. Donein src/net_processing.cpp:1966 in 9c954f4a86 outdated
1935@@ -1925,64 +1936,67 @@ static void ProcessHeadersMessage(CNode& pfrom, CConnman* connman, ChainstateMan 1936 return; 1937 } 1938 1939-void static ProcessOrphanTx(CConnman* connman, CTxMemPool& mempool, std::set<uint256>& orphan_work_set, std::list<CTransactionRef>& removed_txn) EXCLUSIVE_LOCKS_REQUIRED(cs_main, g_cs_orphans) 1940+/** 1941+ * Reconsider orphan transactions after a parent has been accepted to the mempool. 1942+ * 1943+ * @param[in] connman Reference to the connection manager 1944+ * @param[in] mempool Reference to mempool
jonatack commented at 3:18 pm on July 12, 2020:2ec4eead [in/out]? (mempool
needs to be mutable)
jnewbery commented at 5:16 pm on July 12, 2020:hmmm I’m not sure. mempool isn’t used by the caller after the function returns, so I’m not sure I’d call it an ‘out param’. Just because an argument is non-const, I don’t think that automatically makes it an out-param.jonatack commented at 3:38 pm on July 12, 2020: memberACK on the doxygen comments. The pure refactoring changes seem ok. I hesitate with the commits where refactoring is mixed with behavior changes: are the behavior changes desirable in themselves or ignorable side effects; are the benefits worth the review and risk.DrahtBot commented at 5:14 pm on July 12, 2020: memberThe following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
Conflicts
Reviewers, this pull request conflicts with the following ones:
- #20025 (validation/util: add GetTransactionFee by gzhao408)
- #19911 (net: guard vRecvGetData with cs_vRecv and orphan_work_set with g_cs_orphans by narula)
- #19339 (validation: re-delegate absurd fee checking from mempool to clients by gzhao408)
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
jnewbery commented at 5:15 pm on July 12, 2020: memberare the behavior changes desirable in themselves or ignorable side effects
There are no observable functional changes. The behavior changes are simply in the timing of how items are popped off the orphan_work_set list. They make the control flow of the function much easier to follow. After these changes either:
- orphan_work_set is empty and the function does nothing; or
- the function pops one item off orphan_work_set and processes it.
DrahtBot added the label Needs rebase on Jul 14, 2020jnewbery force-pushed on Jul 15, 2020jnewbery commented at 9:48 am on July 15, 2020: memberrebasedDrahtBot removed the label Needs rebase on Jul 15, 2020DrahtBot added the label Needs rebase on Jul 16, 2020jnewbery commented at 9:36 am on July 16, 2020: memberrebasedjnewbery force-pushed on Jul 16, 2020jnewbery removed the label Needs rebase on Jul 16, 2020DrahtBot added the label Needs rebase on Jul 22, 2020jnewbery commented at 10:07 am on July 23, 2020: memberrebasedjnewbery force-pushed on Jul 23, 2020DrahtBot removed the label Needs rebase on Jul 23, 2020in src/net_processing.cpp:3778 in eb30f87e09 outdated
3776@@ -3763,10 +3777,7 @@ bool PeerLogicValidation::ProcessMessages(CNode* pfrom, std::atomic<bool>& inter 3777 if (!pfrom->orphan_work_set.empty()) { 3778 std::list<CTransactionRef> removed_txn;
troygiorshev commented at 8:44 pm on July 23, 2020:This is now unused and can be removed.
This is the only thing blocking my ACK.
+1 to the removal of an out param! 👏
troygiorshev commented at 8:59 pm on July 23, 2020:
jnewbery commented at 7:51 am on July 24, 2020:Thanks. Removed
removed_txn
Those out params are still needed in AcceptToMemoryPool. They’re used to add the removed transactions to the compact block extra transactions here: https://github.com/jnewbery/bitcoin/blob/66ae250195ab7c5c94f72fa9cf222992e1ac46c1/src/net_processing.cpp#L2946-L2947
troygiorshev commented at 11:23 am on July 24, 2020:I’m suggesting that calls to AddtoCompactExtraTransactions be moved into AcceptToMemoryPoolWithTime, in a similar manner to how you moved it into ProcessOrphanTx. I’m pretty sure this is possible, whether it’s a good idea or not is something I’m less sure about.
jnewbery commented at 12:10 pm on July 24, 2020:That would be a layer violation.AddToCompactExtraTransactions()
is a net_processing function, adding data to a net_processing data structure.AcceptToMemoryPool()
and the functions it calls are validation functions and shouldn’t call into net_processing.in src/net_processing.cpp:2044 in eb30f87e09 outdated
2011+ * @param[in] mempool Mempool 2012+ * @param[in/out] orphan_work_set The set of orphan transactions to reconsider. Generally only one 2013+ * orphan will be reconsidered on each call of this function. This set 2014+ * may be added to if accepting an orphan causes its children to be 2015+ * reconsidered. 2016+ */
troygiorshev commented at 8:47 pm on July 23, 2020:non-blocking personal-opinion nit: I would prefer the docs to be wrapped to 80 💻
jnewbery commented at 7:39 am on July 24, 2020:We don’t have a project-wide style for this, although I try to keep my lines under 120 chars. 80 feels is a bit stingy to me.troygiorshev commented at 9:15 pm on July 23, 2020: contributoralmost-ACK
+💯 to removing old outdated comments! 👏👏
Travis failure is unrelated
jnewbery force-pushed on Jul 24, 2020jnewbery force-pushed on Jul 24, 2020jnewbery force-pushed on Jul 24, 2020jnewbery commented at 8:12 am on July 24, 2020: memberThanks for the review @troygiorshev . I’ve addressed your comment in the lastest force-push: https://github.com/bitcoin/bitcoin/compare/eb30f87e0972faae78df7c6116bc18f6fad7b353..833bccc522703f87d0d46b190a9a29bcb633c615troygiorshev commented at 11:26 am on July 24, 2020: contributortACK 833bccc viagit range-diff master eb30f87 833bccc
in src/net_processing.cpp:261 in 833bccc522 outdated
256@@ -253,12 +257,19 @@ namespace { 257 return &(*a) < &(*b); 258 } 259 }; 260- std::map<COutPoint, std::set<std::map<uint256, COrphanTx>::iterator, IteratorComparator>> mapOrphanTransactionsByPrev GUARDED_BY(g_cs_orphans); 261 262- std::vector<std::map<uint256, COrphanTx>::iterator> g_orphan_list GUARDED_BY(g_cs_orphans); //! For random eviction 263+ /** Index from COutPoint into the mapOrphanTransactions. Used to remove
amitiuttarwar commented at 2:50 am on July 25, 2020:would be nice if this mentioned that it’s the parents COutPoint
jnewbery commented at 8:05 am on July 27, 2020:donein src/net_processing.cpp:264 in 833bccc522 outdated
261 262- std::vector<std::map<uint256, COrphanTx>::iterator> g_orphan_list GUARDED_BY(g_cs_orphans); //! For random eviction 263+ /** Index from COutPoint into the mapOrphanTransactions. Used to remove 264+ * orphan transactions from the mapOrphanTransactions */ 265+ std::map<COutPoint, std::set<std::map<uint256, COrphanTx>::iterator, IteratorComparator>> mapOrphanTransactionsByPrev GUARDED_BY(g_cs_orphans); 266+ /** Orphan transactions listed in random order for random eviction */
amitiuttarwar commented at 3:02 am on July 25, 2020:1/3 nit on this comment, 2/3 for my understanding- these orphans aren’t listed exactly random right?
AddOrphanTx
adds the entries to in the order they are seen, butEraseOrphanTx
introduces some randomness bc it deletes an entry by swapping out the last element (which I’m still trying to understand the point of).seems the point of this data structure is to enable
LimitOrphanTxSize
to easily choose a random orphan to erase by providing a vector interface?if my understanding is correct, I think it’d be clearer to say “List of orphan transactions to enable random eviction”
jnewbery commented at 8:16 am on July 27, 2020:these orphans aren’t listed exactly random right?
if my understanding is correct, I think it’d be clearer to say “List of orphan transactions to enable random eviction”
You’re correct. I’ve now fixed the comment. Thank you!
it deletes an entry by swapping out the last element (which I’m still trying to understand the point of)
This is because
g_orphan_list
is a std::vector, which is a contiguous-memory container. It can’t have gaps in the underlying array. The normal way to erase an element from the middle of the vector is to delete it and then shift all the following elements forward by one position. That’s expensive, and would invalidate all of thelist_pos
values that point back to a position in the vector. It’s more efficient to copy the last element to the position of the element you want to delete, fix up thelist_pos
of just thatCOrphanTx
, and then pop of the (now duplicate) last element.
amitiuttarwar commented at 7:05 pm on July 27, 2020:does this sound right:
it seems to me that
list_pos
is tracked for this reason. so the alternative would be if we removedlist_pos
& used in the built in vector erase function. the tradeoff would be less efficiency.the current design choice is to store this additional field on every
COrphanTx
to reduce an O(n) operation on theg_orphan_list
vector, where n <= 100 |-maxorphantx
. but I don’t have much idea of how often we erase orphans in practice. so small increase in memory for small decrease in processing time?
jnewbery commented at 7:03 am on July 28, 2020:Yes, this is the only reason we storelist_pos
. Without it, we’d have to do a (relatively) expensive erase from the middle of the vector.in src/net_processing.cpp:164 in 833bccc522 outdated
155+/** Guards orphan transactions and extra txs for compact blocks */ 156 RecursiveMutex g_cs_orphans; 157+/** Map from txid to orphan transaction record. Limited by 158+ * -maxorphantx/DEFAULT_MAX_ORPHAN_TRANSACTIONS */ 159 std::map<uint256, COrphanTx> mapOrphanTransactions GUARDED_BY(g_cs_orphans); 160 std::map<uint256, std::map<uint256, COrphanTx>::iterator> g_orphans_by_wtxid GUARDED_BY(g_cs_orphans);
amitiuttarwar commented at 3:07 am on July 25, 2020:while adding comments, consider one here too? meybe: Index from wtxid into the mapOrphanTransactions to lookup orphan transactions using their witness ids.
jnewbery commented at 8:20 am on July 27, 2020:Thanks. I missed this in the rebase. Added.in src/net_processing.cpp:2040 in 833bccc522 outdated
2007+/** 2008+ * Reconsider orphan transactions after a parent has been accepted to the mempool. 2009+ * 2010+ * @param[in] connman Connection manager 2011+ * @param[in] mempool Mempool 2012+ * @param[in/out] orphan_work_set The set of orphan transactions to reconsider. Generally only one
amitiuttarwar commented at 3:13 am on July 25, 2020:after this PR, isn’t it a max of one orphan that will be reconsidered on each run of the function?
jnewbery commented at 8:25 am on July 27, 2020:Yes. Fixed!amitiuttarwar commented at 3:50 am on July 25, 2020: contributoryay tidy! thanks for the cleanup and docs, the code makes more sense.
is my understanding correct that there is a v. small change in behavior in
ProcessOrphanTx
- previously, if an orphan taken off theorphan_work_set
was not accepted to mempool bc of missing additional parents & there are more orphans in the work set, the while loop would continue & process the next one. with these changes it will consider max of one orphan per run. seems like an edge case & acceptable shift, but would like to confirm my understanding.jnewbery force-pushed on Jul 27, 2020jnewbery commented at 8:55 am on July 27, 2020: memberThanks for the review @amitiuttarwar . I’ve addressed your review comments.
is my understanding correct that there is a v. small change in behavior in ProcessOrphanTx- previously, if an orphan taken off the orphan_work_set was not accepted to mempool bc of missing additional parents & there are more orphans in the work set, the while loop would continue & process the next one. with these changes it will consider max of one orphan per run. seems like an edge case & acceptable shift, but would like to confirm my understanding.
Yes, that’s exactly right. See the commit log from ProcessOrphanTx: Only ever reprocess one orphan tx:
0 Minor changes in behavior: 1 2 - exit immediately if the orphan_work_set entry is not found in 3 mapOrphanTransactions. This can happen if the orphan is erased 4 from mapOrpahnTransactions due to action by another peer. 5 - exit immediately if the orphan is still an orphan (do not 6 continue and process another).
To expand a bit: this logic was extracted into its own function in #15644 to ensure that we only do one expensive transaction validation per
ProcessMessages()
/SendMessages()
call. There were two conditions where we could process more than one item from theorphan_work_set
: the transaction is no longer in the orphan pool, or the transaction still doesn’t have all of its parents (in which caseAcceptToMemoryPool()
exits withTX_MISSING_INPUTS
before doing any expensive checks).I think it’s a much simpler mental model to say that we’ll process exactly one transaction from the set on every call to
ProcessOrphanTx()
(and it makes the functionality match the function name!) I’ve seen multiple bugs in these auxiliary work queues where we’ve either failed to make forward progress or created a livelock with network peers. Keeping the mental model of these work queues as simple as possible definitely seems worth the minimal sacrifice of having to call this function more times.jnewbery force-pushed on Jul 27, 2020troygiorshev commented at 12:30 pm on July 27, 2020: contributorreACK c5b4c02ee2e423dbcccc985b4f573f8ed9aafa0a viagit range-diff master 833bccc c5b4c02
, only comment changesamitiuttarwar commented at 7:18 pm on July 27, 2020: contributorI’ve seen multiple bugs in these auxiliary work queues where … created a livelock with network peers
woah interesting, any you could point me towards?
See the commit log from ProcessOrphanTx: Only ever reprocess one orphan tx
oops sorry, I don’t know how I missed that. thanks for the explanation.
alright, I’ve been considering possible ways this difference could be exploited and I can’t think of anything substantial. I’m comfortable with these changes.
code review ACK c5b4c02ee2
sipa commented at 7:27 pm on July 27, 2020: memberI’m not worried about the only-reprocess-one-at-a-time as an attack vector - at worst, a peer could make his own transactions be processed slower. As long as there is no impact on what one peer can do w.r.t. processing of things arriving from other peers, there is no such concern.
However, I’m slightly concerned that there could be innocent transaction patterns that result in orphan transactions being processed significantly slower. As every incoming transaction may “wake up” a number of unprocessed orphans for reprocessing, and each of these again waking up earlier ones again, it sounds like the worst case may be pretty bad.
jnewbery commented at 7:21 am on July 28, 2020: memberI’m not worried about the only-reprocess-one-at-a-time as an attack vector - at worst, a peer could make his own transactions be processed slower. As long as there is no impact on what one peer can do w.r.t. processing of things arriving from other peers, there is no such concern.
If this is the case, then should we process the entire
orphan_work_set
in one pass? I’m happy to remove the ProcessOrphanTx: Only ever reprocess one orphan tx commit, but I want to understand (and document!) what the expected behaviour is. It’s currently not made obvious to the reader which circumstances will cause us to reconsider multiple transactions from the orphan set in one pass (answer: if the txid is no longer in the orphan set, we’llcontinue
, and if the result fromAcceptToMemoryPool
isTX_MISSING_INPUTS
we’ll iterate again without setting thedone
variable).sipa commented at 6:14 pm on July 28, 2020: member@jnewbery It currently iterates until either one orphan is actually accepted or actually rejected, or the work set is empty.
So it continues until it does some “real work”, as anything else (reconsidering things that still have missing inputs) is presumably fast.
jnewbery commented at 10:45 am on July 29, 2020: memberI don’t see how this could cause orphans to be processed significantly slower. We always make progress through the peer’s orphan_work_set on each ProcessMessages call, and we tell the message handler thread that there’s more work to do if the work set isn’t empty, so we’ll never sleep between calls.
In the absolute pessimal case, all orphans start in orphan_work_set, the first 99 are still orphans after reconsideration and the 100th is also a parent to the first 99. We then reconsider the first 99 again and the 99th is a parent to the first 98, and so on. To clear this, we’d need to call
ProcessOrphanTx()
100^2 / 2 = 50,000 times instead of 100 times currently. I don’t think that’s a problem since:- this can’t happen through innocent transaction patterns. The peer would need to construct a very particular transaction tree (including grinding to get the txids in order)
- this doesn’t make us do any additional work. We have to call AcceptToMemoryPool() 50,000 times in each case. The only difference is that with this branch we spread those calls out over 50,000 calls to ProcessOrphanTx() instead of 100.
- the peer that fed us the 100 transactions is therefore only hurting itself. Processing for other peers is unaffected.
Are you thinking of a different scenario?
ajtowns commented at 1:09 am on July 31, 2020: memberWow, big approach NACK on “tidy up” PRs that do behaviour changes.
I think a better mental model for this sort of work queue is “always make real progress every time you look at the queue”, and the main benefit of the tidy up here is moving the “make real progress” out of the while loop, since we’re not doing many of them we’re only doing one, so having it in the loop is misleading. But we can do both: just make the trivial progress first in a loop to find a real orphan tx to attempt, then do that outside of the loop. Something like:
0--- a/src/net_processing.cpp 1+++ b/src/net_processing.cpp 2@@ -2010,23 +2010,26 @@ static void ProcessHeadersMessage(CNode& pfrom, CConnman& connman, ChainstateMan 3 * 4 * [@param](/bitcoin-bitcoin/contributor/param/)[in] connman Connection manager 5 * [@param](/bitcoin-bitcoin/contributor/param/)[in] mempool Mempool 6- * [@param](/bitcoin-bitcoin/contributor/param/)[in/out] orphan_work_set The set of orphan transactions to reconsider. Only one orphan 7- * will be reconsidered on each call of this function. This set 8- * may be added to if accepting an orphan causes its children to 9- * be reconsidered. 10+ * [@param](/bitcoin-bitcoin/contributor/param/)[in/out] orphan_work_set The set of orphan transactions to reconsider. Exactly one orphan 11+ * will be re-validated on each call of this function if there is an 12+ * orphan to process, however if any of the items in orphan_work_set 13+ * have already been removed from mapOrphanTransactions they may 14+ * also be erased. This set may be added to if accepting an orphan 15+ * causes its children to be reconsidered. 16 */ 17 void static ProcessOrphanTx(CConnman& connman, CTxMemPool& mempool, std::set<uint256>& orphan_work_set) EXCLUSIVE_LOCKS_REQUIRED(cs_main, g_cs_orphans) 18 { 19 AssertLockHeld(cs_main); 20 AssertLockHeld(g_cs_orphans); 21 22- if (orphan_work_set.empty()) return; 23- 24- const uint256 orphanHash = *orphan_work_set.begin(); 25- orphan_work_set.erase(orphan_work_set.begin()); 26- 27- auto orphan_it = mapOrphanTransactions.find(orphanHash); 28+ std::map<uint256, COrphanTx>::iterator orphan_it = mapOrphanTransactions.end(); 29+ while (!orphan_work_set.empty() && orphan_it == mapOrphanTransactions.end()) { 30+ const uint256 orphanHash = *orphan_work_set.begin(); 31+ orphan_work_set.erase(orphan_work_set.begin()); 32+ orphan_it = mapOrphanTransactions.find(orphanHash); 33+ } 34 if (orphan_it == mapOrphanTransactions.end()) return; 35+ const uint256 orphanHash = orphan_it->first; 36 37 const CTransactionRef porphanTx = orphan_it->second.tx; 38 TxValidationState state;
EDIT: (this is still a behaviour change in that it stops work after finding a tx that still has missing inputs. I’m fine with counting that much checking as “real work”, just don’t see doing a failing map lookup as enough progress)
jnewbery commented at 9:31 am on July 31, 2020: memberWow, big approach NACK on “tidy up” PRs that do behaviour changes.
I don’t understand. You’ve NACKed a behaviour change and then suggested a different behaviour change. What’s the difference?
I agree that if a PR claimed to not change behaviour but did unintentionally, that’d be very bad, but all the behaviour changes in this PR have been fully documented and justified in the commit logs since the PR was opened. They’re observationally equivalent to the peers, and no peer can influence how the node processes other peers.
DrahtBot added the label Needs rebase on Aug 6, 2020jnewbery commented at 10:43 am on August 7, 2020: member@sipa @ajtowns - you’ve both expressed reservations about this PR. Before I go ahead and rebase, can you answer the questions above: @sipa
I’m slightly concerned that there could be innocent transaction patterns that result in orphan transactions being processed significantly slower.
I don’t see how this could cause orphans to be processed significantly slower. […] Are you thinking of a different scenario? @ajtowns
Wow, big approach NACK on “tidy up” PRs that do behaviour changes.
all the behaviour changes in this PR have been fully documented and justified in the commit logs since the PR was opened. They’re observationally equivalent to the peers, and no peer can influence how the node processes other peers.
ajtowns commented at 2:56 pm on August 11, 2020: member@sipa @ajtowns - you’ve both expressed reservations about this PR. Before I go ahead and rebase, can you answer the questions above:
Wow, big approach NACK on “tidy up” PRs that do behaviour changes. all the behaviour changes in this PR have been fully documented and justified in the commit logs since the PR was opened. They’re observationally equivalent to the peers, and no peer can influence how the node processes other peers.
The NACK is just on describing something as a “tidy up” when it’s a behaviour change; fixable either by describing the behaviour change, or moving it to a separate PR (and describing it there). “Oh, I tidied up” should mean “I moved the stuff over there” not “this code didn’t bring me joy, so I Marie Kondo’d it”
Sorry for the delay, I wanted to check that the behaviour was actually (potentially) changing – the difference between doing the next step in
ProcessOrphanTx
and henceProcessMessages
directly, rather than leaving it for the next iteration inThreadMessageHandler
(which is reduced to 0.1sec away due tofMoreWork
being set) is that this gives time for other peers to do things as well.I think with suitably well connected peers you could arrange for a set of transactions A1..A100 with A as a parent, B1..B100 each of which have each of A1..A100 as a parent, C1..C100 each of which have each of B1..B100 as a parent, D1..D100 etc. Node X provides A1..A100, and then B provides tx A. At this point A1..A100 are in its orphan_work_set, and it tries A1 then lets other peers have a go. Node Y comes along and provides B1..B100, and some random subset of A1..A100,B1..B100 get evicted from the orphanage. X then processes A2, either doing nothing, or adding B1..B100 to its work set. Then Y provides C1..C100. X processes A3, etc. Not sure if there’s a way to tweak that so the work set grows spectacularly unboundedly, but it definitely seems like a behaviour change to me, and it definitely doesn’t match my expectation of “making forward progress when working on a queue”.
jnewbery commented at 8:42 pm on August 11, 2020: memberThanks @ajtowns . A few questions/observations about your scenario:
- “which is reduced to 0.1sec away due to fMoreWork being set”. I think it’s the other way round. We wait for up to 0.1s if there’s no more work to do. If there is work to do, then we immediately reiterate through vNodes again.
- “Node X provides A1..A100, and then B provides tx A”. I think the B is a typo here and Node X provides A1..A100 and also provides tx A? (I don’t see node B mentioned anywhere else)
- "…it tries A1 then lets other peers have a go. Node Y comes along and provides B1..B100". When node X “lets other peers have a go” it returns from
ProcessMessages()
and we continue through the message handler loop. We can only process one message from each peer on each iteration of that loop, so node Y can’t roll mapOrphanTransactions in one iteration as you suggest here. Theoretically nodes Y1 through Y100 could do so, but their messages would need to be exactly synchronized with the message handler loop, which seems unlikely. - “Not sure if there’s a way to tweak that so the work set grows spectacularly unboundedly”. I don’t think so. Every iteration through the the message handler loop, each peer can theoretically add one tx to the orphan map, and node X will remove one tx from its orphan_work_set. Processing that tx from the orphan_work_set may result in more txs being added to the set.
- The scenario you describe where other peers construct transactions that node X adds to its orphan_work_set is possible today. The slower removal of evicted orphans from the orphan_work_set doesn’t change this.
- aside: this is a good argument to not process orphan transactions in the context of the node that provided the parent/ancestor, which I wanted to do in #19364.
- finally, to what end is node Y carrying out this “attack” on node X. Yes, it can prevent the victim from processing messages from node X, but only as long as it can continue send transactions that get accepted to the mempool at the victim’s transaction processing capacity. I’d expect that to be many transactions per second. This doesn’t seem like a particularly useful attack, but if there are more dangerous variations, I still think the correct fix is to not process orphans in the context of the peer that provided the parent/ancestor.
amitiuttarwar commented at 9:53 pm on August 11, 2020: contributorsince its the only point of contention, maybe the ProcessOrphanTx: Only ever reprocess one orphan tx commit can be broken out into a separate PR to separate review from the rest of the changes?
in regards to this change, I’m not sure I currently have much insight to offer on this nuanced discussion, I’m mostly trying to learn and make sure I understand the concerns/tradeoffs. this comment is mostly a biased summary, so sorry if it doesn’t feel useful, but I’m hoping it might be helpful for moving the conversation forward, maybe for other reviewers, and for my own understanding. so, I’m interested in any feedback on this take.
an attempt to sum up some of the principles we are discussing: (A) each attempt to process an item should make forward progress in the work queue (B) an adversarial peer shouldn’t be able to significantly impact the node eg. prevent it from processing messages from other peers (C) we should simplify the mental overhead required to reason about these work queues, because their interactions are inherently complex & error-prone (D) we don’t want to negatively impact processing in normal cases
on current master- to achieve (B) we cap at 1 expensive validation per call. and while “expensive” is always going to be a judgement call, I don’t think the existing logic has clear boundaries. if we invoke ATMP but the error is
TX_MISSING_INPUTS
, we attempt to process the next orphan. this helps (A), but in my opinion detracts from (C) because there are other failures that exit the loop that do even less work (like if the txn fails pre-checks). so essentially, instead of saying “exit if we do expensive work”, we are saying “exit if we do the right kind of work”. which @sipa defines here #19498 (comment) as “real work” -> orphan is actually accepted, actually rejected, or the work set is empty. which can be argued achieves (C).in the current proposed changes- (correct me if I’m wrong, but based on my impression..) @jnewbery is striving to improve (C) by loosening the definition of “forward progress” in (A), and makes a case that this doesn’t detract from (B) or (D). the changes define “forward progress” to be always remove exactly one orphan from the work queue (even if lookup fails or inputs still missing)
in @ajtowns suggestion here #19498 (comment), (again.. please correct me if I’m wrong) he agrees that it would be beneficial to improve (C) but doesn’t think (A) is sufficiently addressed in the patch and proposes a definition of “forward progress” between current master & current PR- essentially, ATMP an orphan.
👆🏽 seems like a reasonable compromise to me in improving (C) and a reasonable definition for (A). do others agree? or are there outstanding concerns with robustness / @jnewbery do you not think it achieves (C)?
also, I haven’t yet dug into fully understanding the attack being discussed, but @ajtowns is the essential concern around a way a sophisticated attacker can grow the queue faster than it would be processed in the current proposed patch?
jnewbery force-pushed on Aug 12, 2020jnewbery commented at 8:47 am on August 12, 2020: memberDrahtBot removed the label Needs rebase on Aug 12, 2020jnewbery force-pushed on Aug 12, 2020jnewbery commented at 9:04 am on August 12, 2020: memberThanks @amitiuttarwar . Great summary.
I’ve done as you’ve suggested and removed the ProcessOrphanTx: Only ever reprocess one orphan tx commit. I think this PR by itself is a marginal improvement, but not including #19364 and the ProcessOrphanTx: Only ever reprocess one orphan tx commit limits the benefits. The fact that @ajtowns and I can’t agree on whether node Y can interfere with node X’s processing by sending many transactions does confirm my belief that orphan handling is (unnecessarily?) complex and a simpler mental model would be beneficial here.
If people think this PR is still worthwhile without that commit, lets go ahead and try to get it merged, and perhaps revisit orphan processing at some point in the future.
jnewbery commented at 2:12 pm on August 24, 2020: memberrebasedjnewbery force-pushed on Aug 24, 2020amitiuttarwar commented at 9:53 pm on August 24, 2020: contributorutACK 82396bb897
orphan processing is complex & I appreciate the efforts to simplify & document. thank you!
as before, I still think simplifying the while loop to have a clearer definition of “forward progress” would be nice, but given the review strugs, I understand dropping from this PR.
in src/net_processing.cpp:2058 in f98ffeb432 outdated
2039@@ -2040,12 +2040,9 @@ void static ProcessOrphanTx(CConnman& connman, CTxMemPool& mempool, std::set<uin 2040 const CTransactionRef porphanTx = orphan_it->second.tx; 2041 const CTransaction& orphanTx = *porphanTx; 2042 NodeId fromPeer = orphan_it->second.fromPeer; 2043- // Use a new TxValidationState because orphans come from different peers (and we call 2044- // MaybePunishNodeForTx based on the source peer from the orphan map, not based on the peer 2045- // that relayed the previous transaction). 2046- TxValidationState orphan_state; 2047+ TxValidationState state;
MarcoFalke commented at 7:37 am on August 25, 2020:in commit f98ffeb4323111f214234ef33:
I’d be surprised if this commit compiles. Especially L2094
jnewbery commented at 9:02 am on August 25, 2020:Good catch Marco. This was a bad rebase. I’ve fixed it and verified that all commits compile.jnewbery force-pushed on Aug 25, 2020jnewbery commented at 9:02 am on August 25, 2020: memberFixed a bad intermediate commit (no changes to final code)amitiuttarwar commented at 4:35 pm on August 25, 2020: contributorre utACK e59bbcd06f - checked that nothing changed about final state & that commits buildMarcoFalke added the label Refactoring on Sep 7, 2020MarcoFalke commented at 7:14 am on September 7, 2020: memberConcept ACKDrahtBot commented at 5:40 pm on September 7, 2020: member🐙 This pull request conflicts with the target branch and needs rebase.
Want to unsubscribe from rebase notifications on this pull request? Just convert this pull request to a “draft”.
DrahtBot added the label Needs rebase on Sep 7, 2020[net processing] Add doxygen comments for orphan data and function 6e8dd99ef1ProcessOrphanTx: remove useless done variable
There is a keyword that allows us to break out of loops. Use it. There's a small change in behaviour here: if we process multiple orphans that are still orphans, then we'll only call mempool.check() once at the end, instead of after processing each tx.
ProcessOrphanTx: remove useless setMisbehaving set
This starts empty, and is only added to if we're about to exit the function (so we never read from it).
ProcessOrphanTx: Remove outdated commented
Also rename orphan_state to state. Both the comment and the variable name are leftover from when this logic was part of ProcessMessage().
ProcessOrphanTx: Remove aliases 4fce726bd1ProcessOrphanTx: Move AddToCompactExtraTransactions call into ProcessOrphanTx 001343f4bcjnewbery force-pushed on Sep 7, 2020jnewbery commented at 7:13 pm on September 7, 2020: memberRebasedjnewbery removed the label Needs rebase on Sep 7, 2020troygiorshev commented at 2:50 am on September 23, 2020: contributorACK 001343f4bc8b22fa9e313bd2867756eb9d614fa3
This is a rather straightforward PR now.
sipa commented at 1:08 am on September 30, 2020: memberutACK 001343f4bc8b22fa9e313bd2867756eb9d614fa3fanquake requested review from amitiuttarwar on Sep 30, 2020MarcoFalke commented at 1:53 pm on September 30, 2020: memberACK 001343f4bc8b22fa9e313bd2867756eb9d614fa3 🌮
Signature:
0-----BEGIN PGP SIGNED MESSAGE----- 1Hash: SHA512 2 3ACK 001343f4bc8b22fa9e313bd2867756eb9d614fa3 🌮 4-----BEGIN PGP SIGNATURE----- 5 6iQGzBAEBCgAdFiEE+rVPoUahrI9sLGYTzit1aX5ppUgFAlwqrYAACgkQzit1aX5p 7pUhi8gv+MYMSOeIoSOoTXDVzdEvq7uZMWyFQwE9JWb+4Gf7ArKX7dLYmeI7/Qx7b 8hk1wi8p7ilKfr6/9qmVrlEbautGDbPXvGQ9QfUhOLl7O73UKigynh+49Sq6VfH20 9Et7J3EcyPncqRvU+sQihvOJuXVfsd/K1LXFrxn1Yby8c++pLCmlGwzzlIexWrO+T 10eV19tSFabJaux0U5ft4l/ZAJ+DncYjM7IQ16W/PkfE3Z/dyfrsr92ZmfZKfHdPd/ 11b8kywFsZ5vZOQAX+4rcAvyHIZyOTZaRTRiKAOTeZ251zn8VjGlhYsbPmA1mwvtZn 12q9l5C4aTEvDFhqzlSf2kvZ5PR121SGezO5Eq4DEPyRHoK0oyybHv/D6doNOO99Da 13hVDhdsvCSVXrMbhOSCwboufxbEbYJ5yoLYNZzV8z2WF8Rq2TmdY/l2m+fww8L+SD 14vatKkGZQATYSd/kMzxBzH0JWLAmz0wEuixWJfaTavv/nO/PsDRDByy0/JNUHUEw2 15UiwmxSO1 16=BIT5 17-----END PGP SIGNATURE-----
Timestamp of file with hash
bf74cbfe89c193a0e98cecd45019f0465bbd807c268f14767ecff37e07f76fe5 -
MarcoFalke merged this on Sep 30, 2020MarcoFalke closed this on Sep 30, 2020
jnewbery deleted the branch on Sep 30, 2020sidhujag referenced this in commit 1c9fa80e11 on Sep 30, 2020Fabcien referenced this in commit ac7a1c6996 on Nov 3, 2021Fabcien referenced this in commit 4410964ccc on Nov 3, 2021Fabcien referenced this in commit e2e197f5ef on Nov 3, 2021Fabcien referenced this in commit 062d88d5fe on Nov 3, 2021Fabcien referenced this in commit 9b3b55dccc on Nov 3, 2021DrahtBot locked this on Feb 15, 2022
github-metadata-mirror
This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-18 15:12 UTC
This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me