p2p: track and use all potential peers for orphan resolution #31397

pull glozow wants to merge 12 commits into bitcoin:master from glozow:2024-11-multi-orphan changing 20 files +556 −183
  1. glozow commented at 9:06 pm on November 30, 2024: member

    Part of #27463.

    (Transaction) orphan resolution is a process that kicks off when we are missing UTXOs to validate an unconfirmed transaction. We currently request missing parents by txid; BIP 331 also defines a way to explicitly request ancestors.

    Currently, when we find that a transaction is an orphan, we only try to resolve it with the peer who provided the tx. If this doesn’t work out (e.g. they send a notfound or don’t respond), we do not try again. We actually can’t, because we’ve already forgotten who else could resolve this orphan (i.e. all the other peers who announced the transaction).

    What is wrong with this? It makes transaction download less reliable, particularly for 1p1c packages which must go through orphan resolution in order to be downloaded.

    Can we fix this with BIP 331 / is this “duct tape” before the real solution? BIP 331 (receiver-initiated ancestor package relay) is also based on the idea that there is an orphan that needs resolution, but it’s just a new way of communicating information. It’s not inherently more honest; you can request ancestor package information and get a notfound. So ancestor package relay still requires some kind of procedure for retrying when an orphan resolution attempt fails. See the #27742 implementation which builds on this orphan resolution tracker to keep track of what packages to download (it just isn’t rebased on this exact branch). The difference when using BIP 331 is that we request ancpkginfo and then pkgtxns instead of the parent txids.

    Approach This PR introduces m_orphan_resolution_tracker which is responsible for remembering which peers are candidates for orphan resolution and scheduling when to make orphan resolution requests (i.e. getdata for txid of missing parent(s)). The scheduling process should be:

    • Bandwidth-efficient: don’t have too many requests out at once. As already implemented today, transaction requests for orphan parents and regular download both go through the TxRequestTracker so that we don’t have duplicate requests out.
    • Not vulnerable to censorship: don’t give up too easily, use all candidate peers. See e.g. https://bitcoincore.org/en/2024/07/03/disclose_already_asked_for/
    • Load-balance between peers: don’t overload peers; use all peers available. This is also useful for when we introduce per-peer orphan protection, since each peer will have limited slots.

    Notice that the functionality and requirements are extremely similar to that of transaction download. Naturally, TxRequestTracker is the data structure used for m_orphan_resolution_tracker. Instead of tracking hashes of transactions we need to download, it tracks hashes of orphans we need to resolve.

  2. glozow added the label P2P on Nov 30, 2024
  3. DrahtBot commented at 9:06 pm on November 30, 2024: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/31397.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK instagibbs
    Concept ACK dergoegge, mzumsande

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #28031 (Package Relay 1/3: Introduce TxDownloadManager and improve orphan-handling by glozow)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  4. in src/txorphanage.cpp:25 in 3e16a36959 outdated
    22+    if (it != m_orphans.end()) {
    23+        Assume(!it->second.announcers.empty());
    24+        const auto ret = it->second.announcers.insert(peer);
    25+        if (ret.second) {
    26+            LogDebug(BCLog::TXPACKAGES, "added peer=%d as announcer of orphan tx %s\n", peer, wtxid.ToString());
    27+        }
    


    dergoegge commented at 12:04 pm on December 2, 2024:
    0    if (auto it{m_orphans.find(wtxid)}; it != m_orphans.end()) {
    1        AddAnnouncer(wtxid, peer);
    

    Duplicating the map lookup is better than duplicating the code imo.


    glozow commented at 12:01 pm on December 6, 2024:
    largely agree, done
  5. in test/functional/p2p_opportunistic_1p1c.py:251 in 3e16a36959 outdated
    249         node_mempool = node.getrawmempool()
    250         assert low_fee_parent["txid"] not in node_mempool
    251         assert tx_orphan_bad_wit.rehash() not in node_mempool
    252 
    253-        # 5. Peer that sent a consensus-invalid transaction should be disconnected.
    254-        bad_orphan_sender.wait_for_disconnect()
    


    instagibbs commented at 3:45 pm on December 2, 2024:
    we should cause the disconnect still to make sure we’re disconnecting on consensus failure

    glozow commented at 12:10 pm on December 6, 2024:
    Good point, added a line where this sender sends the parent too
  6. in test/functional/p2p_opportunistic_1p1c.py:294 in 3e16a36959 outdated
    289-        # 5. Peer sent a consensus-invalid transaction.
    290-        fake_parent_sender.wait_for_disconnect()
    291-
    292         self.log.info("Check that fake parent does not cause orphan to be deleted and real package can still be submitted")
    293-        # 6. Child-sending should not have been punished and the orphan should remain in orphanage.
    294+        # 5. Child-sending should not have been punished and the orphan should remain in orphanage.
    


    instagibbs commented at 3:58 pm on December 2, 2024:
    while we’re touching this, we can directly assert via getorphantxs now

    glozow commented at 12:33 pm on December 11, 2024:
    Done, and added a multi-announcer test to rpc_orphans.py
  7. in src/node/txdownloadman_impl.cpp:262 in 3e16a36959 outdated
    357@@ -356,34 +358,6 @@ std::optional<PackageToValidate> TxDownloadManagerImpl::Find1P1CPackage(const CT
    358             return PackageToValidate{ptx, child, nodeid, nodeid};
    359         }
    360     }
    361-
    362-    // If no suitable candidate from the same peer is found, also try children that were provided by
    


    instagibbs commented at 3:59 pm on December 2, 2024:
    3e16a36959f70da59846621f099c1f1df4a210ed love this cleanup of logic :)

    glozow commented at 11:55 am on December 6, 2024:
    That’s why I made them separate functions! Easy deletion.
  8. in src/node/txdownloadman_impl.cpp:336 in 938659fbd8 outdated
    307@@ -308,7 +308,7 @@ node::RejectedTxTodo TxDownloadManagerImpl::MempoolRejectedTx(const CTransaction
    308     // Whether we should call AddToCompactExtraTransactions at the end
    309     bool add_extra_compact_tx{first_time_failure};
    310     // Hashes to pass to AddKnownTx later
    311-    std::vector<uint256> unique_parents;
    312+    std::vector<Txid> unique_parents;
    


    instagibbs commented at 4:40 pm on December 3, 2024:
    nit: could also adapt the for loop over unique_parents in ProcessInvalidTx

    glozow commented at 12:26 pm on December 6, 2024:
    done
  9. in src/txorphanage.h:96 in 7badc7e004 outdated
    88@@ -84,6 +89,8 @@ class TxOrphanage {
    89 protected:
    90     struct OrphanTx : public OrphanTxBase {
    91         size_t list_pos;
    92+        /** Txids of the missing parents to request. Determined by peerman. */
    


    instagibbs commented at 5:17 pm on December 3, 2024:
    as of introduction commit 7badc7e004e288aef645404b1b09699be8b87d1e this is storing all parent txids, missing or not.

    mzumsande commented at 10:41 pm on December 10, 2024:
    now determined by txdownloadman
  10. in src/test/orphanage_tests.cpp:129 in 7badc7e004 outdated
    134@@ -135,7 +135,7 @@ BOOST_AUTO_TEST_CASE(DoS_mapOrphans)
    135         tx.vout[0].nValue = 1*CENT;
    136         tx.vout[0].scriptPubKey = GetScriptForDestination(PKHash(key.GetPubKey()));
    137 
    138-        orphanage.AddTx(MakeTransactionRef(tx), i);
    139+        orphanage.AddTx(MakeTransactionRef(tx), i, {});
    


    instagibbs commented at 5:19 pm on December 3, 2024:

    7badc7e004e288aef645404b1b09699be8b87d1e In this test and the fuzz harness, I’m not sure I’m happy with this new field sometimes being populated and sometimes not.

    Could you make a test util that automatically fills in this argument properly via just looking at the tx inputs?


    instagibbs commented at 2:58 pm on December 6, 2024:
    alternatively, just don’t set them ever unless they’re intended to be used in the test, deleting most of these(?) won’t invalidate tests
  11. in src/txorphanage.cpp:186 in acaa7c13fa outdated
    182@@ -152,6 +183,12 @@ bool TxOrphanage::HaveTx(const Wtxid& wtxid) const
    183     return m_orphans.count(wtxid);
    184 }
    185 
    186+bool TxOrphanage::HaveTxAndPeer(const Wtxid& wtxid, NodeId peer) const
    


    instagibbs commented at 5:26 pm on December 3, 2024:
    s/HaveTxAndPeer/HaveTxFromPeer/ or similar?

    glozow commented at 12:09 pm on December 6, 2024:
    changed
  12. in src/txorphanage.h:86 in acaa7c13fa outdated
    89@@ -80,7 +90,8 @@ class TxOrphanage {
    90     /** Allows providing orphan information externally */
    91     struct OrphanTxBase {
    92         CTransactionRef tx;
    93-        NodeId fromPeer;
    94+        /** Peers added with AddTx or AddAnnouncer. */
    


    instagibbs commented at 5:28 pm on December 3, 2024:

    noting as of acaa7c13fac3e35a55daf2c62c27ab85172b4964 AddAnnouncer isnt called

    3 2 cases?

    1. announced as INV (AddAnnouncer @ c04d1a876cdbcd159c53dde55d57a55675c41f38) 2. parent of another orphan given (?) because txid, not wtxid
    2. given directly (AddTx on all tracked advertisements of it too)

    glozow commented at 12:20 pm on December 6, 2024:
    Correct. I think this is not too misleading? Added “unused for now” in the commit messages.

    instagibbs commented at 2:42 pm on December 6, 2024:
    kind of just exposing my thoughts as I review!
  13. in src/txorphanage.cpp:26 in acaa7c13fa outdated
    22+        Assume(!it->second.announcers.empty());
    23+        const auto ret = it->second.announcers.insert(peer);
    24+        if (ret.second) {
    25+            LogDebug(BCLog::TXPACKAGES, "added peer=%d as announcer of orphan tx %s\n", peer, wtxid.ToString());
    26+        }
    27+        // Even if an announcer was added, no new orphan entry was created.
    


    instagibbs commented at 5:40 pm on December 3, 2024:
    0        // Since an additional announcer was added, no new orphan entry was created.
    

    glozow commented at 12:05 pm on December 6, 2024:
    We haven’t necessarily added an announcer. But I updated this comment.
  14. in src/txorphanage.cpp:140 in 83d276c7dd outdated
    138         while (iter != m_orphans.end())
    139         {
    140             std::map<Wtxid, OrphanTx>::iterator maybeErase = iter++;
    141             if (maybeErase->second.nTimeExpire <= nNow) {
    142-                nErased += EraseTx(maybeErase->second.tx->GetWitnessHash());
    143+                const auto& wtxid = maybeErase->second.tx->GetWitnessHash();
    


    instagibbs commented at 5:48 pm on December 3, 2024:
    isn’t MaybeErase.first the wtxid?

    glozow commented at 12:06 pm on December 6, 2024:
    switched
  15. in src/txorphanage.cpp:155 in 83d276c7dd outdated
    155     while (m_orphans.size() > max_orphans)
    156     {
    157         // Evict a random orphan:
    158         size_t randompos = rng.randrange(m_orphan_list.size());
    159-        EraseTx(m_orphan_list[randompos]->second.tx->GetWitnessHash());
    160+        const auto& wtxid = m_orphan_list[randompos]->second.tx->GetWitnessHash();
    


    instagibbs commented at 5:49 pm on December 3, 2024:
    not much shorter but m_orphan_list[randompos]->first is wtxid?

    glozow commented at 12:07 pm on December 6, 2024:
    done
  16. in src/node/txdownloadman_impl.cpp:375 in b64be9ed2c outdated
    347@@ -348,6 +348,12 @@ node::RejectedTxTodo TxDownloadManagerImpl::MempoolRejectedTx(const CTransaction
    348                 }
    349             }
    350             if (!fRejectedParents) {
    


    instagibbs commented at 6:14 pm on December 3, 2024:

    commit message b64be9ed2ce9b68aef178684dac3be998f18bbdd

    “announcments,”


    glozow commented at 11:57 am on December 6, 2024:
    fixed
  17. in src/node/txdownloadman_impl.cpp:181 in c04d1a876c outdated
    179@@ -174,10 +180,27 @@ void TxDownloadManagerImpl::DisconnectedPeer(NodeId nodeid)
    180 
    181 bool TxDownloadManagerImpl::AddTxAnnouncement(NodeId peer, const GenTxid& gtxid, std::chrono::microseconds now, bool p2p_inv)
    


    instagibbs commented at 6:29 pm on December 3, 2024:
    as of c04d1a876cdbcd159c53dde55d57a55675c41f38 p2p_inv=false has no usage aside from fuzz harness

    glozow commented at 12:01 pm on December 6, 2024:
    added a commit to clean this up
  18. in src/node/txdownloadman_impl.cpp:202 in c04d1a876c outdated
    197+
    198+    const bool already_had{AlreadyHaveTx(gtxid, /*include_reconsiderable=*/true)};
    199+
    200     // If this is an inv received from a peer and we already have it, we can drop it.
    201     // If this is a request for the parent of an orphan, we don't drop transactions that we already have. In particular,
    202     // we *do* want to request parents that are in m_lazy_recent_rejects_reconsiderable, since they can be CPFP'd.
    


    instagibbs commented at 6:52 pm on December 3, 2024:
    these comments seem outdated since AddTxAnnouncement is only being called for explicit INV related reasons now

    glozow commented at 12:01 pm on December 6, 2024:
    cleaned up
  19. in src/node/txdownloadman_impl.cpp:198 in c04d1a876c outdated
    193+            }
    194+        }
    195+        return true;
    196+    }
    197+
    198+    const bool already_had{AlreadyHaveTx(gtxid, /*include_reconsiderable=*/true)};
    


    instagibbs commented at 6:53 pm on December 3, 2024:
    already_had used exactly once, seems like we don’t need it

    glozow commented at 12:01 pm on December 6, 2024:
    cleaned up
  20. in src/node/txdownloadman_impl.cpp:243 in 3e16a36959 outdated
    238+    if (!info.m_relay_permissions) {
    239+        // This mirrors the delaying and dropping behavior in AddTxAnnouncement in order to preserve
    240+        // existing behavior: drop if we are tracking too many invs for this peer already. Each
    241+        // orphan resolution involves at least 1 transaction request which may or may not be
    242+        // currently tracked in m_txrequest, so we include that in the count.
    243+        if (m_txrequest.Count(nodeid) + m_orphan_resolution_tracker.Count(nodeid) >= MAX_PEER_TX_ANNOUNCEMENTS) return std::nullopt;
    


    glozow commented at 2:34 pm on December 4, 2024:
    Ah, need to add the size of unique_parents to this. Surprised fuzzer has not tripped on it…

    glozow commented at 11:57 am on December 6, 2024:
    deleted, no longer applicable
  21. in test/functional/p2p_orphan_handling.py:674 in 3e16a36959 outdated
    669+        peer_early_unresponsive = node.add_p2p_connection(PeerTxRelayer())
    670+
    671+        # Announces after tx is sent
    672+        peer_late_announcer = node.add_p2p_connection(PeerTxRelayer())
    673+
    674+        # Both peers send invs for the orphan, so the node an expect both to know its ancestors.
    


    dergoegge commented at 10:06 am on December 5, 2024:
    0        # Both peers send invs for the orphan, so the node can expect both to know its ancestors.
    

    glozow commented at 12:25 pm on December 6, 2024:
    done
  22. dergoegge commented at 11:25 am on December 5, 2024: member

    Concept ACK

    I implemented what I was talking about in the review club yesterday here: https://github.com/dergoegge/bitcoin/commits/2024-12-rslv-orphns/ (the only difference is the main commit 3928b400d805f75e0ba5b9947d5d441a828aa5be) and it looks like it passes your functional test additions! So perhaps it’s worth exploring if we actually need another tracker…

     017:34 	<dergoegge> I was wondering why we need another txrequest tracker instance for this? couldnt we just keep track of announcers in the orphanage (same as in the PR) and then add requests for the parents to the existing tracker on future announcements?
     117:35 	<glozow> dergoegge: that's an idea. How would you schedule the requests?
     217:35 	<dergoegge> ideally at the same time
     317:35 	<glozow> Is there a cost to having another txrequest tracker? It's not that different from adding another std::map
     417:35 	<glozow> No, I mean, how would you load-balance between peers, bake in preferences, etc.
     517:36 	<dergoegge> isn't that what the existing tracker does?
     617:36 	<glozow> Oh, you mean adding a new type to the tracker? So we'd have txid type, wtxid type, and orphan?
     717:37 	<glozow> also note that the parent requests are added to the txrequest tracker
     817:39 	<dergoegge> I guess I'm wondering why we need the concept of tracking the resolution by orphan id, as opposed to just putting the requests for the parents in the existing tracker
     917:39 	<glozow> we do put the requests for the parents in the existing tracker
    1017:39 	<glozow> Maybe we are crossing wires?
    1117:39 	<instagibbs> mmmm he's asking why the add to new tracker, then "immediately" add to old one, vs just add to old one, I think
    1217:40 	<dergoegge> yea
    1317:40 	<instagibbs> add to old one with "proper delays"
    1417:40 	<instagibbs> I didn't get far enough in my review to figure this out either 😅
    1517:40 	<glozow> We might have multiple candidates for orphan resolution
    1617:41 	<glozow> Oh I see what you're saying
    1717:42 	<dergoegge> "multiple candidates" as in same parents different orphan?
    1817:42 	<glozow> Perhaps that could work, where you're treating it as if all of them just announced the missing parents? I don't know how you'd add `ancpkginfo` orphan resolution easily this way though.
    1917:43 	<glozow> different peers same orphan
    2017:43 	<instagibbs> You'd also have to somehow track that you're no longer asking for any parents of an orphan in order to EraseOrphanOfPeer?
    2117:43 	<marcofleon> yeah i was thinking it made sense with GetCandidatePeers. Having another tracker to separate out the orphan reso process
    2217:46 	<glozow> will think about this idea some more
    2317:47 	<dergoegge> me too 👍
    2417:47 	<glozow> I think it's possible it works? My main questions would be (1) what is the cost of having a second tracker? Because it's the perfect data structure for this. (2) how do we make it extensible to package relay still.
    2517:48 	<instagibbs> imo the cost is mostly an additional structure lying around that needs to stay in sync with other things
    2617:48 	<dergoegge> 1) if we don't need it then it's just extra complexity 2) fair!
    2717:48 	<marcofleon> The fact that there are candidates that be added or not added to that new tracker is why it made sense to me in the first place I guess is what i was saying
    2817:48 	<marcofleon> can be*
    2917:49 	<glozow> (1) shoving it into the existing tracker but needing to have extra logic could also be complexity!
    3017:49 	<dergoegge> well in my mind it wouldn't need extra logic but I might be wrong, need to think more
    3117:49 	<instagibbs> proof of code for this I think..
    3217:49 	<glozow> but yeah, I don't like that we need to ensure m_orphanage and m_orphan_resolution_tracker are in sync. that's super fair
    3317:49 	<dergoegge> frfr
    3417:50 	<glozow> yeah let's see what the code would look like
    3517:50 	<marcofleon> fr
    3617:50 	<glozow> fr r r
    
  23. glozow commented at 1:00 pm on December 5, 2024: member
    Thanks @dergoegge, will spend some time thinking about whether there are any problems with it. Perhaps we can do this easier version now and cross the orphan reso tracker bridge when we need to build the 2-step orphan resolution protocol (info and then transactions).
  24. [txrequest] GetCandidatePeers
    Needed for a later commit adding logic to ask the TxRequestTracker for a
    list of announcers.  These announcers should know the parents of the
    transaction they announced.
    503e52a7fe
  25. [refactor] change type of unique_parents to Txid 4b93281980
  26. glozow force-pushed on Dec 6, 2024
  27. glozow commented at 12:27 pm on December 6, 2024: member
    Changed to the new approach, cleaned up a bunch of things, addressed most comments. I think I have 1 or 2 more to get to, and need to write an RPC test.
  28. glozow commented at 12:33 pm on December 6, 2024: member
    The 2 main functional differences I noticed with this approach are (1) If there are multiple missing parents, we may be using different peers to request them. (2) We don’t EraseOrphanOfPeer if orphan resolution expires. This also means we don’t delete the orphan after we’ve exhausted all potential orphan resolution candidates. This is similar to what we have today - it’ll stay for 20min before orphanage expires the entry.
  29. in src/node/txdownloadman.h:40 in 1cc9e0fda9 outdated
    35@@ -36,6 +36,8 @@ static constexpr auto NONPREF_PEER_TX_DELAY{2s};
    36 static constexpr auto OVERLOADED_PEER_TX_DELAY{2s};
    37 /** How long to wait before downloading a transaction from an additional peer */
    38 static constexpr auto GETDATA_TX_INTERVAL{60s};
    39+/** How long to wait before requesting orphan parents from an additional peer. */
    40+static constexpr auto ORPHAN_ANCESTOR_GETDATA_INTERVAL{60s};
    


    instagibbs commented at 3:36 pm on December 6, 2024:
    unused now since we’re Just using the regular txtracker flow (python tests still using is erroneously too)

    glozow commented at 11:59 am on December 11, 2024:
    Yes - deleted now
  30. in src/node/txdownloadman_impl.cpp:184 in 1cc9e0fda9 outdated
    179+    // - received as an p2p inv
    180+    // - is wtxid matching something in orphanage
    181+    // - exists in orphanage
    182+    // - peer can be an orphan resolution candidate
    183+    if (p2p_inv && gtxid.IsWtxid()) {
    184+        if (auto parent_txids{m_orphanage.GetParentTxids(Wtxid::FromUint256(gtxid.GetHash()))}) {
    


    instagibbs commented at 3:52 pm on December 6, 2024:

    1cc9e0fda9d8136bc36a247009676f07d5d84fb6

    0        const auto wtxid = Wtxid::FromUint256(gtxid.GetHash());
    1        if (auto parent_txids{m_orphanage.GetParentTxids(wtxid)) {
    

    as well as re-using next 2 lines, and the LogDebug?


    instagibbs commented at 3:54 pm on December 6, 2024:
    also, should we ever have something in our orphanage that doesn’t have a parent txid?

    glozow commented at 12:47 pm on December 10, 2024:

    also, should we ever have something in our orphanage that doesn’t have a parent txid?

    Yeah we shouldn’t ever have that situation. However I like the std:optional<X> GetX() interface convention.

  31. in test/functional/p2p_1p1c_network.py:153 in b86dcb01cf outdated
    155             submitpackage_result = self.nodes[0].submitpackage(package_hex)
    156             assert_equal(submitpackage_result["package_msg"], "success")
    157 
    158         self.log.info("Wait for mempools to sync")
    159-        self.sync_mempools(timeout=20)
    160+        self.sync_mempools(timeout=62)
    


    instagibbs commented at 6:28 pm on December 6, 2024:
    62 is very magical looking, can we do better?

    glozow commented at 12:07 pm on December 11, 2024:
    deleted now, I think it was a hack for some intermittent thing that I should have removed a while ago
  32. instagibbs commented at 6:43 pm on December 6, 2024: member
    took another pass with the major new commit f96df3d60a943b87eecd3a6915fcace0eee2838f
  33. DrahtBot added the label CI failed on Dec 9, 2024
  34. [txorphanage] store parent txids in OrphanTx
    Unused for now
    6e57412d51
  35. [txorphanage] support multiple announcers
    Add ability to add and track multiple announcers per orphan transaction,
    erasing announcers but not the entire orphan.
    
    The tx creation code in orphanage_tests needs to be updated so that each
    tx is unique, because the CountOrphans() check assumes that calling
    EraseForPeer necessarily means its orphans are deleted.
    
    Unused for now.
    9a77d84a56
  36. [unit test] multiple announcers 5e5fbed1c9
  37. [fuzz] orphanage multiple announcer functions a47452460e
  38. [txdownload] remove unique_parents that we already have
    This means we no longer return parents we already have in the
    m_unique_parents result from MempoolRejectedTx.
    
    We need to separate the loop that checks AlreadyHave parents from the
    loop that adds parents as announcements, because we may do the latter
    loop multiple times for different peers.
    1ff1a90baa
  39. [p2p] try multiple peers for orphan resolution
    Co-authored-by: dergoegge <n.goeggi@gmail.com>
    a32d396ef2
  40. [functional test] orphan handling with multiple announcers f5d869e2f2
  41. [functional test] getorphantxs reflects multiple announcers 84a3f5bb80
  42. [cleanup] remove p2p_inv from AddTxAnnouncement
    This param is no longer needed since orphan parent requests are added to
    the TxRequestTracker directly.
    5768fc2b6f
  43. [p2p] only attempt 1p1c when both txns provided by the same peer
    Now that we track all announcers of an orphan, it's not helpful to
    consider an orphan provided by a peer that didn't send us this parent.
    It can only hurt our chances of finding the right orphan when there are
    multiple candidates.
    
    Adapt the 2 tests in p2p_opportunistic_1p1c.py that looked at 1p1c
    packages from different peers. Instead of checking that the right peer
    is punished, we now check that the package is not submitted. We can't
    use the functional test to see that the package was not considered
    because the behavior is indistinguishable (except for the logs).
    40abc3f4f1
  44. in src/test/orphanage_tests.cpp:376 in a0e427b46e outdated
    389@@ -390,4 +390,177 @@ BOOST_AUTO_TEST_CASE(too_large_orphan_tx)
    390     BOOST_CHECK(orphanage.AddTx(MakeTransactionRef(tx), 0, {}));
    391 }
    392 
    393+BOOST_AUTO_TEST_CASE(process_block)
    


    mzumsande commented at 11:32 pm on December 10, 2024:
    Does process_block relate to the changes wrt multiple announcers? Could adjust the commit message to explain why it’s added.
  45. glozow force-pushed on Dec 11, 2024
  46. DrahtBot removed the label CI failed on Dec 11, 2024
  47. in src/txorphanage.cpp:109 in 9a77d84a56 outdated
    105@@ -89,9 +106,14 @@ void TxOrphanage::EraseForPeer(NodeId peer)
    106     while (iter != m_orphans.end())
    107     {
    108         // increment to avoid iterator becoming invalid after erasure
    109-        const auto& [wtxid, orphan] = *iter++;
    110-        if (orphan.fromPeer == peer) {
    111-            nErased += EraseTx(wtxid);
    112+        auto& [wtxid, orphan] = *iter++;
    


    instagibbs commented at 3:31 pm on December 11, 2024:

    nitty suggestion

     0diff --git a/src/txorphanage.cpp b/src/txorphanage.cpp
     1index 62c55c2975..cea0c05b9d 100644
     2--- a/src/txorphanage.cpp
     3+++ b/src/txorphanage.cpp
     4@@ -105,17 +105,15 @@ void TxOrphanage::EraseForPeer(NodeId peer)
     5     std::map<Wtxid, OrphanTx>::iterator iter = m_orphans.begin();
     6     while (iter != m_orphans.end())
     7     {
     8         // increment to avoid iterator becoming invalid after erasure
     9         auto& [wtxid, orphan] = *iter++;
    10-        if (orphan.announcers.contains(peer)) {
    11-            if (orphan.announcers.size() == 1) {
    12-                nErased += EraseTx(orphan.tx->GetWitnessHash());
    13-            } else {
    14-                // Don't erase this orphan. Another peer has also announced it, so it may still be useful.
    15-                orphan.announcers.erase(peer);
    16-            }
    17+        orphan.announcers.erase(peer);
    18+
    19+        // No remaining annoncers; cleanup orphanage entry
    20+        if (orphan.announcers.empty()) {
    21+            nErased += EraseTx(orphan.tx->GetWitnessHash());
    22         }
    23     }
    24     if (nErased > 0) LogDebug(BCLog::TXPACKAGES, "Erased %d orphan transaction(s) from peer=%d\n", nErased, peer);
    25 }
    
  48. in src/txorphanage.cpp:19 in 9a77d84a56 outdated
    15@@ -16,8 +16,11 @@ bool TxOrphanage::AddTx(const CTransactionRef& tx, NodeId peer, const std::vecto
    16 {
    17     const Txid& hash = tx->GetHash();
    18     const Wtxid& wtxid = tx->GetWitnessHash();
    19-    if (m_orphans.count(wtxid))
    20+    if (auto it{m_orphans.find(wtxid)}; it != m_orphans.end()) {
    


    instagibbs commented at 3:35 pm on December 11, 2024:
    Just AddAnnouncer and short-circuit if it returns true instead of doing the lookup here?
  49. in src/test/orphanage_tests.cpp:441 in 5e5fbed1c9 outdated
    436+    orphanage.EraseForBlock(block);
    437+    for (const auto& expected_removed : {bo_tx_same_txid, o_tx_same_txid_diff_witness, o_tx_conflict, o_tx_conflict_partial_2}) {
    438+        const auto& expected_removed_wtxid = expected_removed->GetWitnessHash();
    439+        BOOST_CHECK(!orphanage.HaveTx(expected_removed_wtxid));
    440+    }
    441+    BOOST_CHECK_EQUAL(orphanage.Size(), 0);
    


    instagibbs commented at 3:41 pm on December 11, 2024:
    probably should add one innocent bystander that isn’t evicted to ensure we’re not doing the stupidest thing of wiping all orphans on each block or something
  50. in src/test/orphanage_tests.cpp:474 in 5e5fbed1c9 outdated
    469+        auto ptx_mutated{MakeMutation(ptx)};
    470+        BOOST_CHECK(orphanage.AddTx(ptx_mutated, node0, {}));
    471+        BOOST_CHECK(orphanage.HaveTx(ptx_mutated->GetWitnessHash()));
    472+        expected_total_count += 1;
    473+
    474+        // It's too late to add parent_txids through AddTx.
    


    instagibbs commented at 3:45 pm on December 11, 2024:
    0        // parent_txids is ignored on subsequent AddTx for the same wtxid
    
  51. in src/test/orphanage_tests.cpp:492 in 5e5fbed1c9 outdated
    487+
    488+        // if EraseForPeer is called for an orphan with multiple announcers, the orphanage should only
    489+        // erase that peer from the announcers set.
    490+        orphanage.EraseForPeer(node0);
    491+        BOOST_CHECK(orphanage.HaveTx(ptx->GetWitnessHash()));
    492+        // node0 is the only one that announced ptx_mutated
    


    instagibbs commented at 3:50 pm on December 11, 2024:
    0        // node0 is the only one that announced ptx_mutated
    1        BOOST_CHECK(!orphanage.HaveTx(ptx_mutated->GetWitnessHash()))
    
  52. in src/test/orphanage_tests.cpp:481 in 5e5fbed1c9 outdated
    476+        // Parent txids is empty because the tx exists but no parent_txids were provided.
    477+        BOOST_CHECK(orphanage.GetParentTxids(wtxid)->empty());
    478+        BOOST_CHECK(orphanage.GetParentTxids(ptx_mutated->GetWitnessHash())->empty());
    479+
    480+        // Adding a new announcer should not change overall accounting.
    481+        orphanage.AddAnnouncer(ptx->GetWitnessHash(), node2);
    


    instagibbs commented at 3:59 pm on December 11, 2024:
    Can we check value of AddAnnouncer everywhere and include the already-HaveTxFromPeer case?
  53. in src/node/txdownloadman_impl.cpp:100 in a32d396ef2 outdated
     96@@ -97,6 +97,7 @@ void TxDownloadManagerImpl::ActiveTipChange()
     97 
     98 void TxDownloadManagerImpl::BlockConnected(const std::shared_ptr<const CBlock>& pblock)
     99 {
    100+    // Orphanage may include transactions conflicted by this block.
    


    instagibbs commented at 4:08 pm on December 11, 2024:
    micro-nit: this being added in a32d396ef27d626e0c60e7ee8a74cbe3d918acb1 is weird
  54. in src/txorphanage.cpp:294 in 6e57412d51 outdated
    286@@ -287,3 +287,10 @@ std::vector<TxOrphanage::OrphanTxBase> TxOrphanage::GetOrphanTransactions() cons
    287     }
    288     return ret;
    289 }
    290+
    291+std::optional<std::vector<Txid>> TxOrphanage::GetParentTxids(const Wtxid& wtxid)
    292+{
    293+    const auto it = m_orphans.find(wtxid);
    294+    if (it != m_orphans.end()) return it->second.parent_txids;
    


    mzumsande commented at 4:17 pm on December 11, 2024:
    Could we assume here that the returned vector isn’t empty? As far as I can see, orphans with no missing parents shouldn’t be possible. (this only seems to break some unit test that don’t set the parent)
  55. in src/txorphanage.h:71 in 6e57412d51 outdated
    67@@ -66,6 +68,9 @@ class TxOrphanage {
    68      * which peer provided each tx. */
    69     std::vector<std::pair<CTransactionRef, NodeId>> GetChildrenFromDifferentPeer(const CTransactionRef& parent, NodeId nodeid) const;
    70 
    71+    /** Get an orphan's parent_txids, or std::nullopt if the orphan is not present. */
    


    mzumsande commented at 4:23 pm on December 11, 2024:
    parent_txids could be updated to txids of missing parents with 1ff1a90baa395d231e7a3d66da74851302090b40
  56. in src/node/txdownloadman_impl.cpp:245 in a32d396ef2 outdated
    240+        // orphan resolution involves at least 1 transaction request which may or may not be
    241+        // currently tracked in m_txrequest, so we include that in the count.
    242+        if (m_txrequest.Count(nodeid) >= MAX_PEER_TX_ANNOUNCEMENTS) return std::nullopt;
    243+    }
    244+
    245+    std::chrono::seconds delay{0s};
    


    instagibbs commented at 4:43 pm on December 11, 2024:

    take or leave suggestion: instead of copying and pasting, make the delay computation re-usable?

     0diff --git a/src/node/txdownloadman_impl.cpp b/src/node/txdownloadman_impl.cpp
     1index 41d1badadf..d6e44c321b 100644
     2--- a/src/node/txdownloadman_impl.cpp
     3+++ b/src/node/txdownloadman_impl.cpp
     4@@ -171,10 +171,28 @@ void TxDownloadManagerImpl::DisconnectedPeer(NodeId nodeid)
     5         m_peer_info.erase(it);
     6     }
     7 
     8 }
     9 
    10+std::chrono::seconds TxDownloadManagerImpl::GetTxRequestDelay(NodeId peer, const TxDownloadConnectionInfo& info, const GenTxid& gtxid)
    11+{
    12+    // Decide the TxRequestTracker parameters for this announcement:
    13+    // - "preferred": if fPreferredDownload is set (= outbound, or NetPermissionFlags::NoBan permission)
    14+    // - "reqtime": current time plus delays for:
    15+    //   - NONPREF_PEER_TX_DELAY for announcements from non-preferred connections
    16+    //   - TXID_RELAY_DELAY for txid announcements while wtxid peers are available
    17+    //   - OVERLOADED_PEER_TX_DELAY for announcements from peers which have at least
    18+    //     MAX_PEER_TX_REQUEST_IN_FLIGHT requests in flight (and don't have NetPermissionFlags::Relay).
    19+    auto delay{0s};
    20+    if (!info.m_preferred) delay += NONPREF_PEER_TX_DELAY;
    21+    if (!gtxid.IsWtxid() && m_num_wtxid_peers > 0) delay += TXID_RELAY_DELAY;
    22+    const bool overloaded = !info.m_relay_permissions && m_txrequest.CountInFlight(peer) >= MAX_PEER_TX_REQUEST_IN_FLIGHT;
    23+    if (overloaded) delay += OVERLOADED_PEER_TX_DELAY;
    24+
    25+    return delay;
    26+}
    27+
    28 bool TxDownloadManagerImpl::AddTxAnnouncement(NodeId peer, const GenTxid& gtxid, std::chrono::microseconds now, bool p2p_inv)
    29 {
    30     // If this is an orphan we are trying to resolve, consider this peer as a orphan resolution candidate instead.
    31     // - received as an p2p inv
    32     // - is wtxid matching something in orphanage
    33@@ -206,24 +224,12 @@ bool TxDownloadManagerImpl::AddTxAnnouncement(NodeId peer, const GenTxid& gtxid,
    34     const auto& info = it->second.m_connection_info;
    35     if (!info.m_relay_permissions && m_txrequest.Count(peer) >= MAX_PEER_TX_ANNOUNCEMENTS) {
    36         // Too many queued announcements for this peer
    37         return false;
    38     }
    39-    // Decide the TxRequestTracker parameters for this announcement:
    40-    // - "preferred": if fPreferredDownload is set (= outbound, or NetPermissionFlags::NoBan permission)
    41-    // - "reqtime": current time plus delays for:
    42-    //   - NONPREF_PEER_TX_DELAY for announcements from non-preferred connections
    43-    //   - TXID_RELAY_DELAY for txid announcements while wtxid peers are available
    44-    //   - OVERLOADED_PEER_TX_DELAY for announcements from peers which have at least
    45-    //     MAX_PEER_TX_REQUEST_IN_FLIGHT requests in flight (and don't have NetPermissionFlags::Relay).
    46-    auto delay{0us};
    47-    if (!info.m_preferred) delay += NONPREF_PEER_TX_DELAY;
    48-    if (!gtxid.IsWtxid() && m_num_wtxid_peers > 0) delay += TXID_RELAY_DELAY;
    49-    const bool overloaded = !info.m_relay_permissions && m_txrequest.CountInFlight(peer) >= MAX_PEER_TX_REQUEST_IN_FLIGHT;
    50-    if (overloaded) delay += OVERLOADED_PEER_TX_DELAY;
    51 
    52-    m_txrequest.ReceivedInv(peer, gtxid, info.m_preferred, now + delay);
    53+    m_txrequest.ReceivedInv(peer, gtxid, info.m_preferred, now + GetTxRequestDelay(peer, info, gtxid));
    54 
    55     return false;
    56 }
    57 
    58 std::optional<std::chrono::seconds> TxDownloadManagerImpl::OrphanResolutionCandidate(NodeId nodeid, const Wtxid& orphan_wtxid)
    59@@ -240,21 +246,12 @@ std::optional<std::chrono::seconds> TxDownloadManagerImpl::OrphanResolutionCandi
    60         // orphan resolution involves at least 1 transaction request which may or may not be
    61         // currently tracked in m_txrequest, so we include that in the count.
    62         if (m_txrequest.Count(nodeid) >= MAX_PEER_TX_ANNOUNCEMENTS) return std::nullopt;
    63     }
    64 
    65-    std::chrono::seconds delay{0s};
    66-    if (!info.m_preferred) delay += NONPREF_PEER_TX_DELAY;
    67-    // The orphan wtxid is used, but resolution entails requesting the parents by txid. Sometimes
    68-    // parent and child are announced and thus requested around the same time, and we happen to
    69-    // receive child sooner. Waiting a few seconds may allow us to cancel the orphan resolution
    70-    // request if the parent arrives in that time.
    71-    if (m_num_wtxid_peers > 0) delay += TXID_RELAY_DELAY;
    72-    const bool overloaded = !info.m_relay_permissions && m_txrequest.CountInFlight(nodeid) >= MAX_PEER_TX_REQUEST_IN_FLIGHT;
    73-    if (overloaded) delay += OVERLOADED_PEER_TX_DELAY;
    74-
    75-    return delay;
    76+    // We're fetching parents via txid; so pass in as a txid
    77+    return GetTxRequestDelay(nodeid, info, GenTxid::Txid(orphan_wtxid));
    78 }
    79 
    80 std::vector<GenTxid> TxDownloadManagerImpl::GetRequestsToSend(NodeId nodeid, std::chrono::microseconds current_time)
    81 {
    82     std::vector<GenTxid> requests;
    83diff --git a/src/node/txdownloadman_impl.h b/src/node/txdownloadman_impl.h
    84index 560802ab5d..0bd673e956 100644
    85--- a/src/node/txdownloadman_impl.h
    86+++ b/src/node/txdownloadman_impl.h
    87@@ -158,10 +158,12 @@ public:
    88     bool AlreadyHaveTx(const GenTxid& gtxid, bool include_reconsiderable);
    89 
    90     void ConnectedPeer(NodeId nodeid, const TxDownloadConnectionInfo& info);
    91     void DisconnectedPeer(NodeId nodeid);
    92 
    93+    std::chrono::seconds GetTxRequestDelay(NodeId peer, const TxDownloadConnectionInfo& info, const GenTxid& gtxid);
    94+
    95     /** Consider adding this tx hash to txrequest. Should be called whenever a new inv has been received.
    96      * Also called internally when a transaction is missing parents so that we can request them.
    97      */
    98     bool AddTxAnnouncement(NodeId peer, const GenTxid& gtxid, std::chrono::microseconds now, bool p2p_inv);
    
  57. in src/node/txdownloadman_impl.cpp:433 in a32d396ef2 outdated
    438+
    439+                // If there is no candidate for orphan resolution, AddTx will not be called. This means
    440+                // that if a peer is overloading us with invs and orphans, they will eventually not be
    441+                // able to add any more transactions to the orphanage.
    442+                add_orphan_reso_candidate(ptx, unique_parents, nodeid, now);
    443+                for (const auto& candidate : m_txrequest.GetCandidatePeers(wtxid)) {
    


    instagibbs commented at 4:52 pm on December 11, 2024:
    question: since GetCandidatePeers is only used in this context, can we just have it take a transaction ref and fetch both txid and wtxid under the hood?
  58. instagibbs approved
  59. instagibbs commented at 4:58 pm on December 11, 2024: member

    ACK 40abc3f4f1a2b7c6da657f2cdffd2b61f5b1b91a

    non-blocking suggestions only

  60. DrahtBot requested review from dergoegge on Dec 11, 2024
  61. in src/node/txdownloadman_impl.cpp:179 in 5768fc2b6f outdated
    172@@ -173,14 +173,14 @@ void TxDownloadManagerImpl::DisconnectedPeer(NodeId nodeid)
    173 
    174 }
    175 
    176-bool TxDownloadManagerImpl::AddTxAnnouncement(NodeId peer, const GenTxid& gtxid, std::chrono::microseconds now, bool p2p_inv)
    177+bool TxDownloadManagerImpl::AddTxAnnouncement(NodeId peer, const GenTxid& gtxid, std::chrono::microseconds now)
    178 {
    179     // If this is an orphan we are trying to resolve, consider this peer as a orphan resolution candidate instead.
    180     // - received as an p2p inv
    


    mzumsande commented at 7:30 pm on December 11, 2024:
    this is now trivially true with 5768fc2b6f82fe43e712947e6f663ef14be23c6f, so maybe remove the line
  62. in src/node/txdownloadman_impl.cpp:242 in a32d396ef2 outdated
    237+    if (!info.m_relay_permissions) {
    238+        // This mirrors the delaying and dropping behavior in AddTxAnnouncement in order to preserve
    239+        // existing behavior: drop if we are tracking too many invs for this peer already. Each
    240+        // orphan resolution involves at least 1 transaction request which may or may not be
    241+        // currently tracked in m_txrequest, so we include that in the count.
    242+        if (m_txrequest.Count(nodeid) >= MAX_PEER_TX_ANNOUNCEMENTS) return std::nullopt;
    


    mzumsande commented at 7:35 pm on December 11, 2024:
    I think we can exceed MAX_PEER_TX_ANNOUNCEMENTS now, because after doing the check, we may add more than 1 parent tx. Is that a problem?

    instagibbs commented at 8:57 pm on December 11, 2024:
    The fuzzer should theoretically hit this if so, though the limit is probably high enough to make this hard to hit.
  63. in src/txorphanage.h:93 in 6e57412d51 outdated
    88@@ -84,6 +89,8 @@ class TxOrphanage {
    89 protected:
    90     struct OrphanTx : public OrphanTxBase {
    91         size_t list_pos;
    92+        /** Txids of the missing parents to request. Determined by peerman. */
    93+        std::vector<Txid> parent_txids;
    


    mzumsande commented at 7:41 pm on December 11, 2024:

    FWIW originally I thought I’d prefer calculating the parents when needed instead of storing them - while there might be a few more AlreadyHaveTx calls (is this function particularly expensive?), we’d be operating on more up-to-date data and wouldn’t create tx requests for parents that have already arrived and therefore will never be sent out. Plus, the code will be simpler. (I later saw that there was a short discussion in the review club about this.)

    But if we store the missing parents anyway - would it be possible in future work to actually keep them up-to-date, by removing entries from parent_txids in AddChildrenToWorkSet() / Block processing, and only attempting to re-validate the tx once we have all mssing parents / parent_txids is empty - to save us from doing unnecessary work if only one of multiple missing parents has arrived?


    glozow commented at 9:03 pm on December 11, 2024:

    only attempting to re-validate the tx once we have all mssing parents / parent_txids is empty

    That’s a good idea! Especially for 1p1cs 🤔 . We could update parent_txids at the same time as AddChildrenToWorkSet and only add them to workset when empty.


    glozow commented at 9:19 pm on December 11, 2024:

    Oh, I just had a thought that maybe it’s kind of bad that we’re locked in to the parent_txids we started with… even if we’re able to remove some, what happens if a new parent goes missing?

    1. Get tx that spends [A, B, C]. We are missing inputs A and B. Store in tx orphanage. Request A and B from peers…
    2. C gets replaced* or falls out of mempool somehow
    3. We keep only requesting A and B. We receive and accept them.
    4. AddChildrenToWorkSet… C is still an orphan. We can’t figure out how to resolve it properly though.

    So maybe we do this, but also need to update parent_txids on every MempoolRejectedTx TX_MISSING_INPUTS as well.


    instagibbs commented at 3:44 pm on December 12, 2024:

    Note that also A could immediately be replaced after fetching it, before B is received. There are probably a bunch of ways this and other ancestor fetching can fail.

    Even if we manage to recover the entire ancestry tree (like in this example ABC+tx), if C was replaced, we’d likely be relying on general package RBF to have the tx successfully submitted.

    That’s a good idea! Especially for 1p1cs 🤔 . We could update parent_txids at the same time as AddChildrenToWorkSet and only add them to workset when empty.

    This seems reasonable on its own.


    glozow commented at 8:29 pm on December 17, 2024:
    Perhaps we start with recalculating every time and consider doing this later? I agree it doesn’t seem too expensive.
  64. mzumsande commented at 8:06 pm on December 11, 2024: contributor
    Concept ACK Did a first round of review, mostly comment nits (it’s been a while since I last looked at the orphanage).

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-21 15:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me