p2p: Track orphans by who provided them #26551

pull ajtowns wants to merge 5 commits into bitcoin:master from ajtowns:202211-orphanguardians changing 4 files +57 −49
  1. ajtowns commented at 7:44 PM on November 21, 2022: contributor

    We currently process orphans by assigning them to the peer that provided a missing parent; instead assign them to the peer that provided the orphan in the first place. This prevents a peer from being able to marginally delay another peer's transactions and also simplifies the internal API slightly. Because we're now associating orphan processing with the peer that provided the orphan originally, we no longer process orphans immediately after receiving the parent, but defer until a future call to ProcessMessage.

    Based on #26295

  2. DrahtBot commented at 7:44 PM on November 21, 2022: contributor

    <!--e57a25ab6845829454e8d69fc972939a-->

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    <!--021abf342d371248e50ceaed478a90ca-->

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK naumenkogs, glozow, mzumsande

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    <!--174a7506f384e20aa4161008e828411d-->

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #26247 (refactor: Make m_mempool optional in PeerManager by MarcoFalke)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. ajtowns force-pushed on Nov 21, 2022
  4. ajtowns force-pushed on Nov 21, 2022
  5. glozow commented at 10:50 AM on November 28, 2022: member

    Concept ACK

  6. DrahtBot added the label Needs rebase on Nov 28, 2022
  7. txorphanage: index workset by originating peer a4fe09973a
  8. ajtowns force-pushed on Nov 28, 2022
  9. DrahtBot removed the label Needs rebase on Nov 28, 2022
  10. ajtowns marked this as ready for review on Nov 29, 2022
  11. glozow added the label P2P on Nov 29, 2022
  12. maflcko renamed this:
    net_processing: Track orphans by who provided them
    p2p: Track orphans by who provided them
    on Nov 29, 2022
  13. maflcko removed the label P2P on Nov 29, 2022
  14. DrahtBot added the label P2P on Nov 29, 2022
  15. ajtowns commented at 3:25 AM on December 8, 2022: contributor

    Some previous discussion at #21527 (review)

  16. in src/net_processing.cpp:2892 in 75b6f82136 outdated
    2888 | @@ -2889,7 +2889,7 @@ void PeerManagerImpl::ProcessHeadersMessage(CNode& pfrom, Peer& peer,
    2889 |  bool PeerManagerImpl::ProcessOrphanTx(Peer& peer)
    2890 |  {
    2891 |      AssertLockHeld(g_msgproc_mutex);
    2892 | -    AssertLockHeld(cs_main);
    


    naumenkogs commented at 10:12 AM on December 21, 2022:

    Should become AssertLockNotHeld? :)


    ajtowns commented at 12:43 PM on December 21, 2022:

    Seems more reasonable to continue to allow invoking ProcessOrphanTx with cs_main held, even though we could forbid it. (Before the "only process orphans before messages" commit, ProcessOrphanTx() was called from ProcessMessage() with cs_main held)

  17. in src/net_processing.cpp:588 in 0a32d4f3c4 outdated
     584 | @@ -585,10 +585,10 @@ class PeerManagerImpl final : public PeerManager
     585 |       * Reconsider orphan transactions after a parent has been accepted to the mempool.
     586 |       *
     587 |       * @peer[in]  peer     The peer whose orphan transactions we will reconsider. Generally only one
     588 | -     *                     orphan will be reconsidered on each call of this function. This set
     589 | -     *                     may be added to if accepting an orphan causes its children to be
     590 | +     *                     orphan will be reconsidered on each call of this function. May
    


    naumenkogs commented at 10:25 AM on December 21, 2022:
    1. I think it should say "Creates more work if", as it's less ambiguous (I actually spent time figuring the difference out...).
    2. So now bool fMoreNodeWork = m_msgproc->ProcessMessages(pnode, flagInterruptMsgProc); should be renamed to may_have_more_work right?

    ajtowns commented at 1:29 PM on December 22, 2022:

    I don't think this PR changes whether fMoreNodeWork should be renamed -- the bool should have the same properties both before and after this PR.

    But... I think its current behaviour is technically not quite right? If we process a TX message in ProcessMessage now, we'll set fMoreWork = true only when m_getdata_requests has entries, but I think we should also be doing it if the peer's orphanage work queue isn't empty, as ProcessMessage may have added to it. I think this is technically a bug introduced in #15644 -- there should have been an if (!pfrom->orphan_work_set.empty()) fMoreWork = true; added after the ProcessMessage() call, as well as after the // this maintains the order of responses comment.

    The only result of the bug is that when a parent comes in, processing its orphaned children will be delayed by 100ms rather than occurring essentially immediately.


    ajtowns commented at 2:36 PM on December 22, 2022:

    Maybe it would be better to write:

    while (true) {
        bool all_nodes_finished_processing = true;
        for (pnode : nodes) {
            if (m_msgproc->ProcessMessages(pnode)) all_nodes_finished_processing = false;
            m_msgproc->SendMessages(pnode);
        }
        if (all_nodes_finished_processing) wait_for(100ms);
    }
    

    The idea (as I understand it) is just that ProcessMessages() returns true if it does some substantial work and exits early, without checking everything. If that's the case for any peer, then it doesn't wait before looping through all peers again. If ProcessMessages() completed for all peers, whether it did work or not, then it is okay to sleep briefly before looping over everything again.


    naumenkogs commented at 9:30 AM on December 23, 2022:

    I think you are right. Let's try applying these changes and see whether everything makes sense once again.

    Part of my confusion is that it's unclear whether has_more_work applies to a given peer, or to all peers... That variable rename could help. (peer_has_more_work vs node_has_more_work or something.)

  18. naumenkogs commented at 10:25 AM on December 21, 2022: member

    Concept ACK

  19. ajtowns force-pushed on Dec 22, 2022
  20. in src/txorphanage.h:34 in 8f9d9ab943 outdated
      32 | @@ -33,7 +33,7 @@ class TxOrphanage {
      33 |       *  the originating peer, and whether there are more orphans for this peer
      34 |       *  to work on after this tx.
      35 |       */
      36 | -    CTransactionRef GetTxToReconsider(NodeId peer, bool& more) EXCLUSIVE_LOCKS_REQUIRED(!m_mutex);
      37 | +    CTransactionRef GetTxToReconsider(NodeId peer) EXCLUSIVE_LOCKS_REQUIRED(!m_mutex);
    


    mzumsande commented at 3:45 PM on December 22, 2022:

    The doc for GetTxToReconsider needs an update after both more and originator are removed.

  21. in src/net_processing.cpp:4910 in 10de619ba2 outdated
    4906 | @@ -4907,6 +4907,7 @@ bool PeerManagerImpl::ProcessMessages(CNode* pfrom, std::atomic<bool>& interrupt
    4907 |              LOCK(peer->m_getdata_requests_mutex);
    4908 |              if (!peer->m_getdata_requests.empty()) fMoreWork = true;
    4909 |          }
    4910 | +        if (m_orphanage.HaveTxToReconsider(peer->m_id)) fMoreWork = true;
    


    glozow commented at 5:11 PM on December 22, 2022:

    nit for 10de619ba2bc48de6816b3940509576c575e9ff3 typo "indidicate" in commit message


    ajtowns commented at 12:44 AM on December 23, 2022:

    This doesn't work after the other changes in this PR -- you might be creating more work for some other peer, so should be checking if any peers have txs to reconsider, and only doing this once after looping through all the peers.


    ajtowns commented at 6:40 PM on December 23, 2022:

    So, I guess this raises the question: should we move orphan processing outside of ProcessMessages and SendMessages entirely, and just do it separately, independent of both who sent the orphan and who sent the parent?

    Unless someone has a good reason not to, I think I might draft that up as an alternative PR so the approaches can be compared more directly.


    glozow commented at 6:33 PM on December 26, 2022:

    should we move orphan processing outside of ProcessMessages and SendMessages entirely, and just do it separately, independent of both who sent the orphan and who sent the parent?

    Had a similar thought, mainly I don't see why we should try to process all orphans before moving on to ProcessMessage()... afaict we never send a message to a peer as a result of processing an orphan (other than potentially disconnecting) so maintaining the order of responses is not justification like it is for finishing ProcessGetData() queue.

    fwiw doing orphan processing independently seems reasonable to me, though I don't have a very strong grasp of all the possible implications.


    ajtowns commented at 1:27 AM on December 29, 2022:

    I thought of it more along the lines of "if a node gives us a lot of orphans to process, and we're now processing them, we shouldn't do more work for that node until all the orphan work is done; also we shouldn't delay work from other nodes while we're processing that node's orphans".


    mzumsande commented at 9:39 PM on January 4, 2023:

    So, I guess this raises the question: should we move orphan processing outside of ProcessMessages and SendMessages entirely,

    Where do you intend to move it? Into a dedicated thread for orphan processing?

    This doesn't work after the other changes in this PR -- you might be creating more work for some other peer, so should be checking if any peers have txs to reconsider, and only doing this once after looping through all the peers.

    Is this situation (that the peer who sent us an orphan is different from the peer that sent us the parent) typical? Otherwise might it be acceptable to just have a unnecessary 100ms wait if it occurs very rarely?


    ajtowns commented at 6:40 AM on January 25, 2023:

    Where do you intend to move it? Into a dedicated thread for orphan processing?

    No, I was thinking a separate ProcessOrphans method like ProcessMessages and SendMessages.

    Is this situation (that the peer who sent us an orphan is different from the peer that sent us the parent) typical? Otherwise might it be acceptable to just have a unnecessary 100ms wait if it occurs very rarely?

    On a long running node, I'm seeing maybe a few hundred orphans being accepted during startup (ie, children whose parent txs were announced while my node was offline) but then perhaps only 20-60 accepted orphans per day. I don't have much of an idea how many of those would have a different node sending the parent compared to the child, but it doesn't seem likely to be a real problem.


    jnewbery commented at 1:41 PM on April 18, 2023:

    should we move orphan processing outside of ProcessMessages and SendMessages entirely, and just do it separately, independent of both who sent the orphan and who sent the parent?

    Unless someone has a good reason not to, I think I might draft that up as an alternative PR so the approaches can be compared more directly.

    See #19364

  22. mzumsande commented at 9:40 PM on January 4, 2023: contributor

    Concept ACK

  23. ajtowns force-pushed on Jan 25, 2023
  24. ajtowns commented at 6:45 AM on January 25, 2023: contributor

    Updated. I left the orphan processing in ProcessMessage based on the following thoughts:

    • an attacker could probably create arbitrary amounts of work by sending orphan txs that then gets added to the todo list as soon as the parent becomes available. if they do this, it should only slow down communication with that peer, not any of our other peers.
    • avoiding a pretty rare 100ms delay isn't worth the hassle of extending NetEventsInterface with new virtual methods for handling orphan txs
  25. in src/txorphanage.h:32 in 048bfe1679 outdated
      28 | @@ -29,11 +29,11 @@ class TxOrphanage {
      29 |      /** Extract a transaction from a peer's work set
      30 |       *  Returns nullptr and sets more to false if there are no transactions
      31 |       *  to work on. Otherwise returns the transaction reference, removes
      32 | -     *  the transaction from the work set, and populates its arguments with
      33 | -     *  the originating peer, and whether there are more orphans for this peer
      34 | +     *  the transaction from the work set, and populates "more" with
    


    naumenkogs commented at 7:44 AM on January 25, 2023:

    048bfe16795e2da18c7bb5bf1137e4370079dc33 nit: does "with and whether" make any sense? Not for me, so might be a typo.


    ajtowns commented at 8:16 AM on January 25, 2023:

    fixed

  26. naumenkogs commented at 8:00 AM on January 25, 2023: member

    utACK 6c0a068ea3670bb7a1c955ed6ecde5d050c1a812

  27. txorphange: Drop redundant originator arg from GetTxToReconsider be2304676b
  28. net_processing: only process orphans before messages
    Previously, when we processed a new tx we would attempt to ATMP any
    orphans that considered the new tx a parent immediately, but would only
    accept at most one such tx, leaving any others to be considered on a
    future run of ProcessMessages(). With this patch, we don't attempt any
    orphan processing immediately after receiving a tx, instead deferring
    all of them until the next call to ProcessMessages().
    c583775706
  29. net_processing: Don't process tx after processing orphans
    If we made progress on orphans, consider that enough work for this peer
    for this round of ProcessMessages. This also allows cleaning up the api
    for TxOrphange:GetTxToReconsider().
    ecb0a3e425
  30. net_processing: indicate more work to do when orphans are ready to reconsider
    When PR#15644 made orphan processing interruptible, it also introduced a
    potential 100ms delay between processing of the first and second newly
    reconsiderable orphan, because it didn't check if the orphan work set
    was non-empty after invoking ProcessMessage(). This adds that check, so
    that ProcessMessages() will return true if there are orphans to process,
    usually avoiding the 100ms delay in CConnman::ThreadMessageHandler().
    c58c249a5b
  31. ajtowns force-pushed on Jan 25, 2023
  32. naumenkogs commented at 9:00 AM on January 25, 2023: member

    utACK c58c249a5b694c88122589fedbef4e2f13f08bb4

  33. in src/net_processing.cpp:4905 in c58c249a5b
    4901 | @@ -4902,6 +4902,12 @@ bool PeerManagerImpl::ProcessMessages(CNode* pfrom, std::atomic<bool>& interrupt
    4902 |              LOCK(peer->m_getdata_requests_mutex);
    4903 |              if (!peer->m_getdata_requests.empty()) fMoreWork = true;
    4904 |          }
    4905 | +        // Does this peer has an orphan ready to reconsider?
    


    glozow commented at 2:16 PM on January 25, 2023:

    c58c249a5b694c88122589fedbef4e2f13f08bb4 nit: s/has/have

  34. glozow commented at 2:49 PM on January 25, 2023: member

    ACK c58c249a5b694c88122589fedbef4e2f13f08bb4

  35. glozow requested review from mzumsande on Jan 25, 2023
  36. in src/net_processing.cpp:588 in ecb0a3e425 outdated
     588 | -     *                     orphan will be reconsidered on each call of this function. This set
     589 | -     *                     may be added to if accepting an orphan causes its children to be
     590 | -     *                     reconsidered.
     591 | -     * @return             True if there are still orphans in this peer's work set.
     592 | +     * @peer[in]  peer     The peer whose orphan transactions we will reconsider. Generally only
     593 | +     *                     one orphan will be reconsidered on each call of this function. If an
    


    mzumsande commented at 10:44 PM on January 25, 2023:

    nit: I think "Generally only one orphan will be reconsidered" is misleading. Calling ProcessTransaction for a tx once more, getting TX_MISSING_INPUTS again and therefore keeping the orphan would qualify as being "reconsidered" for me, whereas what is meant is that the tx is being removed from the orphanage. But no need to address here, since this PR didn't change this.

  37. in src/txorphanage.h:52 in c58c249a5b
      47 | @@ -48,6 +48,9 @@ class TxOrphanage {
      48 |      /** Add any orphans that list a particular tx as a parent into the from peer's work set */
      49 |      void AddChildrenToWorkSet(const CTransaction& tx) EXCLUSIVE_LOCKS_REQUIRED(!m_mutex);;
      50 |  
      51 | +    /** Does this peer have any work to do? */
      52 | +    bool HaveTxToReconsider(NodeId peer) EXCLUSIVE_LOCKS_REQUIRED(!m_mutex);;
    


    mzumsande commented at 11:02 PM on January 25, 2023:

    nit: double ;

  38. mzumsande commented at 11:07 PM on January 25, 2023: contributor

    Code Review ACK c58c249a5b694c88122589fedbef4e2f13f08bb4

    nits can be ignored.

  39. glozow merged this on Jan 26, 2023
  40. glozow closed this on Jan 26, 2023

  41. sidhujag referenced this in commit 09be5d2ccc on Jan 26, 2023
  42. Wenyu1227 commented at 9:50 AM on September 27, 2023: none

    src/net_processing.cpp

  43. Fabcien referenced this in commit 3f9234a413 on May 29, 2024
  44. bitcoin locked this on Sep 26, 2024

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-26 09:14 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me