p2p: make block download logic aware of limited peers threshold #28120

furszy wants to merge 3 commits into bitcoin:master from furszy:2023_net_limited_peers, changing 2 files: +96 −16
  1. furszy commented at 1:55 pm on July 21, 2023: member

    Even when the node believes it has completed IBD, it needs to avoid requesting historical blocks from network-limited peers. Otherwise, the limited peer will disconnect right away.

    The simplest scenario could be a node that gets synced, drops its connections, and stays inactive for a while. Then, once it re-connects (IBD remains completed), the node tries to fetch all the missing blocks from any peer and gets disconnected by the limited ones.

    Note: the behavior can be verified by cherry-picking the test commit alone onto master. It will fail there.

  2. DrahtBot commented at 1:55 pm on July 21, 2023: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage

    For detailed information about the code coverage, see the test coverage report.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK vasild, mzumsande, pinheadmz, achow101
    Stale ACK andrewtoth

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    No conflicts as of last run.

  3. DrahtBot added the label P2P on Jul 21, 2023
  4. mzumsande commented at 6:53 pm on July 21, 2023: contributor

    Maybe getting disconnected could be beneficial in this rare situation? If our tip is that much behind but we’re not in IBD, we can’t help them and they can’t help us, so our priority should be to get the missing blocks asap, and getting disconnected by useless peers seems like it’s not such a bad thing towards that goal.

    What would be bad is if all our current peers were limited, and we wouldn’t try to request the needed blocks from anyone, but also wouldn’t try to exchange them for better peers, so that we would be stuck.

  5. furszy commented at 8:54 pm on July 21, 2023: member

    Maybe getting disconnected could be beneficial in this rare situation? If our tip is that much behind but we’re not in IBD, we can’t help them and they can’t help us, so our priority should be to get the missing blocks asap, and getting disconnected by useless peers seems like it’s not such a bad thing towards that goal.

    Agree, but while the outcome might be the desired one, the approach to achieve it isn’t the best. It seems unwise to have the node send a message over the wire, knowing it will fail, just to force the other peer to disconnect. The node should be smarter than that and not rely on external factors to make decisions.

    Moreover, in cases where the other peer chooses to ignore the message without disconnecting (maybe due to using different software than Bitcoin Core), the node will end up waiting for a response that will never arrive, wasting time and resources that could easily have been avoided.

    Also, while the node still has connection slots available, I don’t see why it should disconnect a peer that is behaving properly.

    Instead, I think we should keep the processes separated. The eviction process should be in charge of disconnecting peers that aren’t/won’t provide any meaningful data, and the block download logic should only be responsible for deciding which blocks to request from each peer and which not to.

  6. naumenkogs commented at 9:40 am on July 25, 2023: member

    What would be bad is if all our current peers were limited, and we wouldn’t try to request the needed blocks from anyone, but also wouldn’t try to exchange them for better peers, so that we would be stuck.

    I agree with this part. I think the solution should be expanded to give a chance in such cases, and it should be merged within the same PR.

  7. luke-jr commented at 6:33 pm on July 25, 2023: member
    Agree with @mzumsande and @naumenkogs. Maybe not the same PR if it’s substantially different, but the new eviction logic should be merged before this fix.
  8. furszy commented at 2:56 pm on August 1, 2023: member

    What would be bad is if all our current peers were limited, and we wouldn’t try to request the needed blocks from anyone, but also wouldn’t try to exchange them for better peers, so that we would be stuck.

    Right now, this will hardly happen. The reason is CheckForStaleTipAndEvictPeers. The logic is as follows:

    When the node doesn’t update the tip for a period longer than 30 minutes (TipMayBeStale()), it triggers the extra outbound connection process, which instructs the net layer to bypass the maximum outbound connection restriction and create another full-relay outbound connection to sync up the missing blocks.

    Now consider the worst-case scenario, where the extra outbound connection is also a limited peer. In that case, 45 seconds after connecting to the extra peer (the scheduler task timeout), CheckForStaleTipAndEvictPeers will call EvictExtraOutboundPeers, which evicts the peer that least recently announced a block header. Then, if there is still no chain movement, the process starts again: another extra outbound connection is added, and so on, until a peer that can provide the missing blocks is found.

    Note: While this sooner or later corrects the stalling situation, it’s not ideal to keep establishing connections to limited peers through the extra outbound connection process while the node is stalled. To address that, I have created #28170, which basically disallows connections to limited peers when the node is far enough behind to know that limited peers cannot provide any of the required historical blocks (although they will still provide valid headers).

  9. achow101 requested review from vasild on Sep 20, 2023
  10. achow101 requested review from sipa on Sep 20, 2023
  11. DrahtBot added the label Needs rebase on Oct 2, 2023
  12. furszy force-pushed on Oct 18, 2023
  13. furszy commented at 1:35 pm on October 18, 2023: member
    rebased. Conflicts solved.
  14. DrahtBot removed the label Needs rebase on Oct 18, 2023
  15. in src/net_processing.cpp:1480 in c1612ea1f0 outdated
    1491+                }
    1492+                return;
    1493+            }
    1494+
    1495+            // Don't request blocks that go further than what limited peers can provide
    1496+            if (is_limited_peer && (state->pindexBestKnownBlock->nHeight - pindex->nHeight > static_cast<int>(NODE_NETWORK_LIMITED_MIN_BLOCKS) + 2 /* add two blocks buffer extension for possible races */)) {
    


    andrewtoth commented at 9:14 pm on October 29, 2023:
    I don’t think we want to keep the + 2 buffer here. This means we would potentially still request blocks up to 2 blocks beyond 288, so up to 290 deep. The buffer on the sender side, which serves requested blocks up to 288 + 2 deep, is there for when the sender sees new blocks that the requester has not yet seen. In this case we are the requester, so we should respect the 288 limit.

    vasild commented at 2:36 pm on November 1, 2023:

    I agree - no need for +2, or it could even be harmful and defeat the margin on the sender side - e.g. if the sender is at height 1290, this logic would allow requesting block 1000. Then if the sender progresses to height 1291 in the meantime and sees the request for 1000, they will disconnect.

    Further, to make this symmetric with the sender, maybe it should be -2? If we assume the sender has no margin and disconnects for anything >288, then the requester should only request up to 286 deep (inclusive)?
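
    (A worked example of this race with the same heights, as a self-contained check; the sender-side 288 + 2 serving allowance is taken from andrewtoth's comment above, not quoted from the code:)

        #include <cassert>

        int main()
        {
            const int LIMIT = 288;       // NODE_NETWORK_LIMITED_MIN_BLOCKS
            const int requested = 1000;  // height of the oldest block we would ask for
            int sender_tip = 1290;

            // Requester with a +2 buffer: 1290 - 1000 = 290 is not > 290, so the
            // request for block 1000 would be sent.
            assert(!(sender_tip - requested > LIMIT + 2));

            // The sender advances one block before it processes the request. With its
            // own +2 serving allowance, 1291 - 1000 = 291 > 290, so it disconnects us.
            sender_tip = 1291;
            assert(sender_tip - requested > LIMIT + 2);

            // With a -2 buffer on the requester side instead, block 1000 is never
            // requested in the first place: 1290 - 1000 = 290 > 286.
            assert(1290 - requested > LIMIT - 2);
            return 0;
        }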


    furszy commented at 7:29 pm on November 2, 2023:
    Oops, yeah. This was intended to be -2. Fixing it.
  16. in src/net_processing.cpp:1482 in c1612ea1f0 outdated
    1443@@ -1443,30 +1444,46 @@ void PeerManagerImpl::FindNextBlocks(std::vector<const CBlockIndex*>& vBlocks, c
    1444                 // We consider the chain that this peer is on invalid.
    1445                 return;
    1446             }
    1447+
    1448             if (!CanServeWitnesses(peer) && DeploymentActiveAt(*pindex, m_chainman, Consensus::DEPLOYMENT_SEGWIT)) {
    1449                 // We wouldn't download this block or its descendants from this peer.
    1450                 return;
    1451             }
    1452+
    


    vasild commented at 2:02 pm on November 1, 2023:

    In the commit message of f3089e57aadac462e59499fdeb8091c42f39327b refactor: Make FindNextBlocksToDownload friendlier:

    s/FindNextBlocksToDownload/PeerManagerImpl::FindNextBlocks()/


    furszy commented at 10:30 pm on November 2, 2023:
    Thx, done as suggested.
  17. in src/net_processing.cpp:1481 in c1612ea1f0 outdated
    1492+                return;
    1493+            }
    1494+
    1495+            // Don't request blocks that go further than what limited peers can provide
    1496+            if (is_limited_peer && (state->pindexBestKnownBlock->nHeight - pindex->nHeight > static_cast<int>(NODE_NETWORK_LIMITED_MIN_BLOCKS) + 2 /* add two blocks buffer extension for possible races */)) {
    1497+                return;
    


    vasild commented at 2:51 pm on November 1, 2023:
    Should this not be continue;? I find PeerManagerImpl::FindNextBlocks() hard to follow. Are the entries in vToFetch sorted in ascending or descending order?

    furszy commented at 7:54 pm on November 2, 2023:

    Should this not be continue;? I find PeerManagerImpl::FindNextBlocks() hard to follow. Are the entries in vToFetch sorted in ascending or descending order?

    vToFetch is arranged in increasing order. It needs to stop once it reaches the last block the peer can provide, without continuing further.

    FindNextBlocks begins from the last block in common with the peer. Then, during each iteration of the loop, it takes the block that is 128 blocks ahead and populates vToFetch by moving backwards: it pushes the last block of the window into the last position of the array, the second-to-last block into the second-to-last position, and so on. Subsequently, the function iterates over the now-ordered vToFetch in the forward direction, adding the blocks that will be requested to vBlocks.
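
    (For orientation, a toy sketch of that fill order with plain heights standing in for CBlockIndex pointers; this mirrors the description above, not the actual FindNextBlocks code:)

        #include <algorithm>
        #include <iostream>
        #include <vector>

        int main()
        {
            const int last_common = 1;    // last block in common with the peer
            const int peer_best   = 301;  // state->pindexBestKnownBlock->nHeight

            // Take the block up to 128 ahead of the last common block...
            const int to_fetch = std::min(peer_best - last_common, 128);
            std::vector<int> vToFetch(to_fetch);

            // ...and fill backwards: the last block of the window goes into the last
            // slot, the one before it into the slot before that, and so on, which
            // leaves the vector in ascending height order.
            for (int i = to_fetch; i > 0; --i) {
                vToFetch[i - 1] = last_common + i;
            }

            // Prints "2 ... 129": the forward pass over this ascending list is what
            // then decides which blocks to actually request from the peer.
            std::cout << vToFetch.front() << " ... " << vToFetch.back() << "\n";
            return 0;
        }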


    vasild commented at 3:11 pm on November 10, 2023:

    Ok, let's use an example (from the newly added test):

    vToFetch = [2 ... 129]
    state->pindexBestKnownBlock->nHeight = 301

    So the peer’s height is 301. We can assume they can provide blocks [13 ... 301]. At the first iteration of the loop, pindex->nHeight is 2 (vToFetch[0]). The code ends up with return; because 301 - 2 > 286, and we never request any blocks from that peer.

    Shouldn’t we request blocks [13 ... 301] instead?


    furszy commented at 11:41 pm on November 10, 2023:
    Hmm, that will only happen when the node is close to the tip, because of the 1024-block download window. More precisely, when the node is within 1312 blocks of the tip (window + limited peers threshold). Nice observation! Will modify it and add coverage for it.
  18. in test/functional/p2p_node_network_limited.py:77 in c1612ea1f0 outdated
    74+        self.disconnect_nodes(0, 2)
    75+        self.disconnect_nodes(1, 2)
    76+        assert_equal(len(full_node.getpeerinfo()), 0)
    77+
    78+        # Mine blocks and sync the pruned node. Surpass the NETWORK_NODE_LIMITED threshold.
    79+        self.generate(miner, MIN_BLOCKS_TO_KEEP + 12, sync_fun=self.no_op)
    


    vasild commented at 3:08 pm on November 1, 2023:
    Is the intention here to mimic NODE_NETWORK_LIMITED_MIN_BLOCKS from the C++ source code, which is 288? MIN_BLOCKS_TO_KEEP is a different constant defined in both the C++ and Python code which happens to be also 288. Should NODE_NETWORK_LIMITED_MIN_BLOCKS be defined in the Python code (equal to 288, but not necessary equal or tied to MIN_BLOCKS_TO_KEEP)?

    furszy commented at 8:12 pm on November 2, 2023:

    Is the intention here to mimic NODE_NETWORK_LIMITED_MIN_BLOCKS from the C++ source code, which is 288?

    Yeah, I used MIN_BLOCKS_TO_KEEP merely because it is already part of the test framework.

    MIN_BLOCKS_TO_KEEP is a different constant defined in both the C++ and Python code which happens to be also 288. Should NODE_NETWORK_LIMITED_MIN_BLOCKS be defined in the Python code (equal to 288, but not necessary equal or tied to MIN_BLOCKS_TO_KEEP)?

    If it helps readability, sure (I would just need to see where to place it). It is important to note that NODE_NETWORK_LIMITED_MIN_BLOCKS is dependent on MIN_BLOCKS_TO_KEEP and should always be less than or equal to it.


    furszy commented at 1:10 am on November 11, 2023:
    Added NODE_NETWORK_LIMITED_MIN_BLOCKS on the last update.
  19. in test/functional/p2p_node_network_limited.py:84 in c1612ea1f0 outdated
    75+        self.disconnect_nodes(1, 2)
    76+        assert_equal(len(full_node.getpeerinfo()), 0)
    77+
    78+        # Mine blocks and sync the pruned node. Surpass the NETWORK_NODE_LIMITED threshold.
    79+        self.generate(miner, MIN_BLOCKS_TO_KEEP + 12, sync_fun=self.no_op)
    80+        self.sync_blocks([miner, pruned_node])
    


    vasild commented at 3:11 pm on November 1, 2023:
    Here sync_blocks() is used and above self.wait_until(lambda: node.getbestblockhash() == ... is used to achieve the same purpose. For consistency maybe use only one of the methods.

    furszy commented at 10:30 pm on November 2, 2023:
    Thx, done as suggested.
  20. in test/functional/p2p_node_network_limited.py:92 in c1612ea1f0 outdated
    83+        # (they will disconnect if full_node, which is chain-wise behind, request blocks
    84+        #  older than MIN_BLOCKS_TO_KEEP)
    85+        self.connect_nodes(0, 2)
    86+        assert_equal(len(full_node.getpeerinfo()), 1)
    87+        time.sleep(3)  # Wait for a few seconds to ensure peers remain connected
    88+        assert_equal(len(full_node.getpeerinfo()), 1)
    


    vasild commented at 3:18 pm on November 1, 2023:

    This sleep is non-deterministic. The good thing is that it will not cause sporadic test failures, but still, it could occasionally report success even if the full node requests an old block and gets disconnected (a false positive). It also adds an unconditional 3 seconds to the test's execution time.

    Is it possible to lookup something in the log or via RPC to cross check that the full node has made a decision not to request blocks from the pruned node?


    furszy commented at 8:28 pm on November 2, 2023:

    Is it possible to lookup something in the log or via RPC to cross check that the full node has made a decision not to request blocks from the pruned node?

    Hmm, not really. We aren’t logging any of the reasons behind the lack of a getdata request for a certain block from a certain peer (invalid tree, no segwit support, already in-flight). That approach would lead to excessive logging for no real gain.

    What we could do, maybe in a follow-up, to eliminate the unnecessary sleep time, is migrate this functional test check into a unit test, similar to what I’m doing in #28170 for FindNextBlocks.


    furszy commented at 1:10 am on November 11, 2023:
    Removed the unconditional wait on the last update :).
  21. vasild commented at 3:20 pm on November 1, 2023: contributor
    Approach ACK c1612ea1f0
  22. furszy force-pushed on Nov 2, 2023
  23. furszy force-pushed on Nov 2, 2023
  24. furszy commented at 10:40 pm on November 2, 2023: member

    Updated per feedback, thanks for the review!

    Fixed the two-block buffer extension (it is now a reduction). Small diff.

  25. in test/functional/p2p_node_network_limited.py:61 in 534996ab53 outdated
    56+        miner = self.nodes[1]
    57+        full_node = self.nodes[2]
    58+
    59+        # Connect and generate block to ensure IBD=false
    60+        self.connect_nodes(1, 2)
    61+        self.connect_nodes(0, 2)
    


    vasild commented at 3:16 pm on November 10, 2023:

    The test would fail if run standalone without the preceding tests. The failure is on line 78: self.sync_blocks([miner, pruned_node]). That is because the test relies on the pruned node being connected to the miner which is indeed the case - some leftover from previous tests. This diff makes it succeed when run standalone:

    --- i/test/functional/p2p_node_network_limited.py
    +++ w/test/functional/p2p_node_network_limited.py
    @@ -54,25 +54,24 @@ class NodeNetworkLimitedTest(BitcoinTestFramework):
             # The request of blocks further than the NETWORK_NODE_LIMITED threshold causes a direct disconnection.
             pruned_node = self.nodes[0]
             miner = self.nodes[1]
             full_node = self.nodes[2]

             # Connect and generate block to ensure IBD=false
    +        self.connect_nodes(0, 1)
             self.connect_nodes(1, 2)
    -        self.connect_nodes(0, 2)
             self.generate(miner, 1, sync_fun=self.no_op)

             # Verify all peers are at the same height
             self.sync_blocks([miner, pruned_node])

             # Verify peers are out of IBD
             for node in self.nodes:
                 assert not node.getblockchaininfo()['initialblockdownload']

             # Disconnect full_node from the other two peers (the node will remain out of IBD)
    -        self.disconnect_nodes(0, 2)
             self.disconnect_nodes(1, 2)
             assert_equal(len(full_node.getpeerinfo()), 0)

             # Mine blocks and sync the pruned node. Surpass the NETWORK_NODE_LIMITED threshold.
    

    It is also shorter and, I think, clearer.


    furszy commented at 1:09 am on November 11, 2023:
    Ended up doing it in a slightly different way: using setnetworkactive to isolate the full node from the other two. I think it is clearer this way.
  26. furszy force-pushed on Nov 11, 2023
  27. furszy commented at 1:06 am on November 11, 2023: member

    Updated per feedback, thanks for the review @vasild!

    The node will now request blocks within the limited-peer threshold whenever the block download window covers them, and not only when the preceding blocks have arrived or are in flight. Added test coverage exercising this behavior and verifying that the node can get synced even when the blocks arrive out of order.

  28. DrahtBot added the label CI failed on Nov 11, 2023
  29. furszy force-pushed on Nov 11, 2023
  30. DrahtBot removed the label CI failed on Nov 11, 2023
  31. in src/net_processing.cpp:1511 in 930e531046 outdated
    1492+                return;
    1493+            }
    1494+
    1495+            // Don't request blocks that go further than what limited peers can provide
    1496+            if (is_limited_peer && (state->pindexBestKnownBlock->nHeight - pindex->nHeight > static_cast<int>(NODE_NETWORK_LIMITED_MIN_BLOCKS) - 2 /* two blocks buffer for possible races */)) {
    1497+                continue;
    


    vasild commented at 1:00 pm on November 13, 2023:
    I think there is an off-by-one error here; should this be >= instead of >? Let's say the limited peer is at height H; then the intention is to request blocks (H-286, H] from it, a total of 286 blocks. However, the loop would not continue for height H-286 and would request that block too.
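
    (The boundary spelled out with the heights used elsewhere in this thread, as a self-contained check rather than the actual code:)

        #include <cassert>

        int main()
        {
            const int THRESHOLD = 288 - 2;  // NODE_NETWORK_LIMITED_MIN_BLOCKS minus the two-block buffer
            const int peer_best = 301;      // state->pindexBestKnownBlock->nHeight

            // Intention: only request blocks in (peer_best - 286, peer_best], i.e. heights 16..301.
            const int oldest_wanted = peer_best - THRESHOLD + 1;  // 16

            // With '>', the block exactly 286 deep (height 15) is NOT skipped: 301 - 15 > 286 is false.
            assert(!(peer_best - (oldest_wanted - 1) > THRESHOLD));

            // With '>=', that block is skipped, while height 16 is still requested.
            assert(peer_best - (oldest_wanted - 1) >= THRESHOLD);
            assert(!(peer_best - oldest_wanted >= THRESHOLD));
            return 0;
        }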

    furszy commented at 9:30 pm on November 13, 2023:
    Sounds good. Corrected.
  32. in test/functional/p2p_node_network_limited.py:104 in 930e531046 outdated
     99+            full_node.getblock(pruned_node.getblockhash(height))  # just need to assert it does not raise an exception
    100+
    101+        # Verify all blocks that should not have been requested.
    102+        for move in range(1, limited_threshold_buffer):
    103+            height = start_height_full_node + move
    104+            assert_raises_rpc_error(-1, "Block not found on disk", full_node.getblock, pruned_node.getblockhash(height))
    


    vasild commented at 1:17 pm on November 13, 2023:

    The full node is at height 1 and the pruned node is at height 289. The first loop verifies that the full node has blocks [4, 289] (286 blocks in total), which is ok. But then it only checks that the full node does not have block 2. What about block 3? It is not checked and should not have been downloaded but it has been downloaded.

    To avoid the gap the range in the second loop should be range(1, limited_threshold_buffer + 1) or even:

    for height in range(start_height_full_node + 1, start_height_full_node + 1 + limited_threshold_buffer):
    

    or have one loop instead of two:

    for height in range(start_height_full_node + 1, tip_height + 1):
        if height <= tip_height - (NODE_NETWORK_LIMITED_MIN_BLOCKS - 2):
            assert_raises_rpc_error(-1, "Block not found on disk", full_node.getblock, pruned_node.getblockhash(height))
        else:
            full_node.getblock(pruned_node.getblockhash(height))
    

    furszy commented at 9:30 pm on November 13, 2023:
    Done as suggested. Thanks.
  33. furszy force-pushed on Nov 13, 2023
  34. furszy force-pushed on Nov 13, 2023
  35. DrahtBot added the label CI failed on Nov 13, 2023
  36. DrahtBot removed the label CI failed on Nov 14, 2023
  37. vasild approved
  38. vasild commented at 9:39 am on November 14, 2023: contributor

    ACK a0687ffd568b0984a8b1e51fd0061f87f25e95d7

    Thanks!

  39. in test/functional/p2p_node_network_limited.py:82 in a0687ffd56 outdated
    69+        full_node.setnetworkactive(False)
    70+        self.wait_until(lambda: len(full_node.getpeerinfo()) == 0)
    71+
    72+        # Mine blocks and sync the pruned node. Surpass the NETWORK_NODE_LIMITED threshold.
    73+        # Blocks deeper than the threshold are considered "historical blocks"
    74+        num_historial_blocks = 12
    


    andrewtoth commented at 6:20 pm on December 3, 2023:
    nit: Could add a comment on why number 12 is chosen here?

    furszy commented at 12:49 pm on December 4, 2023:
    Sure, will do if I have to retouch. The number is there to exercise the block download procedure and its moving window. For instance, if the peer is at block 100 and 300 blocks (288 + 12) are mined, the test verifies that the node only downloads from the limited peer the blocks below the threshold (the last 288 blocks minus the buffer). Then the test connects a full node and checks that the node downloads the remaining historical blocks from it.

    andrewtoth commented at 12:55 pm on December 4, 2023:
    Ah yes, the purpose of num_historical_blocks makes sense. The nit is why the number 12 was chosen specifically. I suppose it's just to make an even 300, and it can be arbitrary?

    furszy commented at 12:58 pm on December 4, 2023:

    Ah yes, the purpose of num_historical_blocks makes sense. The nit is why the number 12 was chosen specifically. I suppose it's just to make an even 300, and it can be arbitrary?

    Yep.

  40. andrewtoth approved
  41. andrewtoth commented at 6:21 pm on December 3, 2023: contributor
    ACK a0687ffd568b0984a8b1e51fd0061f87f25e95d7
  42. in src/net_processing.cpp:1510 in ea3138b7b1 outdated
    1475@@ -1475,6 +1476,11 @@ void PeerManagerImpl::FindNextBlocks(std::vector<const CBlockIndex*>& vBlocks, c
    1476                 return;
    1477             }
    1478 
    1479+            // Don't request blocks that go further than what limited peers can provide
    1480+            if (is_limited_peer && (state->pindexBestKnownBlock->nHeight - pindex->nHeight >= static_cast<int>(NODE_NETWORK_LIMITED_MIN_BLOCKS) - 2 /* two blocks buffer for possible races */)) {
    


    naumenkogs commented at 12:54 pm on December 6, 2023:

    ea3138b7b189158db80bfd1f9d9dd17e7ef52f3f

    might as well jump pindex forward here, so that the second condition is satisfied?


    furszy commented at 9:04 pm on December 29, 2023:
    I’m not sure what you are referring to. Could you rephrase it, please? Do you want to update pindexLastCommonBlock to pindex?

    andrewtoth commented at 11:29 pm on February 21, 2024:

    I think what @naumenkogs is referring to here is that if this condition is true, it will be hit by the first pindex in the loop and by every subsequent pindex until it is no longer true. This could be made more efficient by just bumping pindex to the first one that is past the network-limited blocks threshold.

    So, we can either change the loop from range-based to a regular for-loop and advance the index, or move this check out of this loop and into the loop on line 1464 and break early if the condition is met.
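
    (For illustration, a self-contained sketch of that restructuring under heavily simplified types, with plain heights instead of CBlockIndex pointers; the helper select_from and its shape are hypothetical, not the actual FindNextBlocks:)

        #include <algorithm>
        #include <cassert>
        #include <vector>

        // Instead of hitting the limited-peer check on every early entry of the window,
        // start the scan at the first height the peer can still serve.
        std::vector<int> select_from(const std::vector<int>& vToFetch,  // ascending heights
                                     int peer_best_height, bool is_limited_peer)
        {
            const int threshold = 288 - 2;  // NODE_NETWORK_LIMITED_MIN_BLOCKS minus the buffer
            const int oldest_allowed = is_limited_peer ? peer_best_height - threshold + 1 : 0;
            // Jump straight to the first eligible entry rather than continue-ing one by one.
            auto it = std::lower_bound(vToFetch.begin(), vToFetch.end(), oldest_allowed);
            return {it, vToFetch.end()};
        }

        int main()
        {
            std::vector<int> window(128);
            for (int i = 0; i < 128; ++i) window[i] = i + 2;      // heights 2..129
            assert(select_from(window, 301, true).front() == 16);  // limited peer at 301: start at 16
            assert(select_from(window, 301, false).size() == 128); // full peer: keep the whole window
            return 0;
        }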


    furszy commented at 1:05 pm on February 27, 2024:

    I think what @naumenkogs is referring to here is that if this condition is true, it will be hit by the first pindex in the loop and by every subsequent pindex until it is no longer true. This could be made more efficient by just bumping pindex to the first one that is past the network-limited blocks threshold.

    So, we can either change the loop from range-based to a regular for-loop and advance the index, or move this check out of this loop and into the loop on line 1464 and break early if the condition is met.

    Ok, we are on the same page now, thanks. Let’s leave it for a follow-up. It would be nice to re-think this function and benchmark it properly. At first glance, I think we would be optimizing the function for the less likely case and might degrade performance for the likely ones.

  43. naumenkogs commented at 12:59 pm on December 6, 2023: member
    The code looks alright, but I’d still rather have this addressed first. It looks like even #28170 doesn’t directly handle this, but rather prepares the codebase for another PR handling it?
  44. furszy commented at 4:08 pm on December 24, 2023: member

    Sorry for the delay @naumenkogs. I was working on #28170 before returning here. All good now, it is ready to go.

    The code looks alright, but I’d still rather have this addressed first. It looks like even #28170 doesn’t directly handle this, but rather prepares the codebase for another PR handling it?

    The concern was: all connections are to limited peers, we are lagging behind, and we no longer disconnect from them due to the lack of the historical block request disconnection misfeature.

    With #28170, when this happens, the node will detect that it is lagging behind and update the desirable connection service flags accordingly, ensuring that the extra outbound connection process only connects to full nodes (unlike now, where the node can still connect to another limited peer, etc). Then, once the connection is established, the node will recover from the stalling situation.

    Essentially, instead of resolving the situation by evicting a good limited peer connection (which could be self-defeating in some cases), we will solve it by establishing a new connection. With this, I’m not saying that we couldn’t actually do both (it wouldn’t be bad to reserve some outbound slots for full nodes and/or evict the least helpful limited peer when the node is stalled), but I think the concern is no longer a blocker after #28170.

  45. DrahtBot added the label CI failed on Jan 12, 2024
  46. achow101 referenced this in commit 0b768746ef on Jan 31, 2024
  47. refactor: Make FindNextBlocks friendlier
    No behavior change.
    73127722a2
  48. p2p: sync from limited peer, only request blocks below threshold
    Requesting historical blocks from network limited peers is a
    direct disconnection cause.
    The node must only request the blocks it knows for sure the
    limited peer can provide.
    2f6a05512f
  49. test: avoid requesting blocks beyond limited peer threshold
    Even when the node believes it completed IBD, need to avoid
    requesting historical blocks from network-limited peers.
    Otherwise, the limited peer will disconnect right away.
    c5b5843d8f
  50. furszy force-pushed on Feb 1, 2024
  51. DrahtBot removed the label CI failed on Feb 1, 2024
  52. vasild approved
  53. vasild commented at 1:15 pm on February 4, 2024: contributor
    ACK c5b5843d8f10d96f76ee6b95f2b1b1b4ce755f75
  54. DrahtBot requested review from andrewtoth on Feb 4, 2024
  55. DrahtBot removed review request from andrewtoth on Feb 6, 2024
  56. DrahtBot requested review from andrewtoth on Feb 6, 2024
  57. bitcoin deleted a comment on Feb 6, 2024
  58. in test/functional/p2p_node_network_limited.py:70 in c5b5843d8f
    65+        full_node = self.nodes[2]
    66+
    67+        # Connect and generate block to ensure IBD=false
    68+        self.connect_nodes(1, 0)
    69+        self.connect_nodes(1, 2)
    70+        self.generate(miner, 1)
    


    mzumsande commented at 4:56 pm on February 15, 2024:
    I don’t understand this part of the test. Why do we connect nodes 1 and 2, just to disconnect again without doing anything in between? Why is the generate call necessary? For me, the test also succeeds if I delete it. But if we do have to generate a block, shouldn’t we then also at least add a sync_blocks to ensure it propagates to all nodes before disconnecting?

    furszy commented at 6:48 pm on February 15, 2024:

    I don’t understand this part of the test. Why do we connect nodes 1 and 2, just to disconnect again without doing anything in between? Why is the generate call necessary?

    To ensure all nodes are out of IBD. The idea was to isolate this test case from the previous cases executed in the file, so that it can be run at any point of the file. This was vasild's request, because he was running it first and commenting out the rest of the cases to speed the process up.

    The connection requirement is because the test overrides setup_network, which skips the framework’s connection creation process (peers start the test with no connections).

    shouldn’t we then also at least add a sync_blocks to ensure it propagates to all nodes before disconnecting?

    The self.generate(miner, 1) function calls sync_blocks() internally.


    mzumsande commented at 9:01 pm on February 15, 2024:
    ok, makes sense!
  59. DrahtBot removed review request from andrewtoth on Feb 15, 2024
  60. DrahtBot requested review from andrewtoth on Feb 15, 2024
  61. DrahtBot removed review request from andrewtoth on Feb 15, 2024
  62. DrahtBot requested review from andrewtoth on Feb 15, 2024
  63. DrahtBot removed review request from andrewtoth on Feb 15, 2024
  64. DrahtBot requested review from andrewtoth on Feb 15, 2024
  65. DrahtBot removed review request from andrewtoth on Feb 15, 2024
  66. DrahtBot requested review from andrewtoth on Feb 15, 2024
  67. DrahtBot removed review request from andrewtoth on Feb 15, 2024
  68. DrahtBot requested review from andrewtoth on Feb 15, 2024
  69. in src/net_processing.cpp:1454 in 2f6a05512f outdated
    1450@@ -1451,6 +1451,7 @@ void PeerManagerImpl::FindNextBlocks(std::vector<const CBlockIndex*>& vBlocks, c
    1451 {
    1452     std::vector<const CBlockIndex*> vToFetch;
    1453     int nMaxHeight = std::min<int>(state->pindexBestKnownBlock->nHeight, nWindowEnd + 1);
    1454+    bool is_limited_peer = IsLimitedPeer(peer);
    


    andrewtoth commented at 10:55 pm on February 21, 2024:
    nit: const bool is_limited_peer{IsLimitedPeer};
  70. DrahtBot requested review from andrewtoth on Feb 21, 2024
  71. DrahtBot removed review request from andrewtoth on Feb 21, 2024
  72. DrahtBot requested review from andrewtoth on Feb 21, 2024
  73. mzumsande commented at 11:12 pm on February 26, 2024: contributor
    Code Review ACK c5b5843d8f10d96f76ee6b95f2b1b1b4ce755f75
  74. DrahtBot removed review request from andrewtoth on Feb 26, 2024
  75. DrahtBot requested review from andrewtoth on Feb 26, 2024
  76. DrahtBot removed review request from andrewtoth on Feb 27, 2024
  77. DrahtBot requested review from andrewtoth on Feb 27, 2024
  78. pinheadmz approved
  79. pinheadmz commented at 11:23 pm on March 9, 2024: member

    ACK c5b5843d8f10d96f76ee6b95f2b1b1b4ce755f75

    Built and ran all functional and unit tests on x86 Ubuntu. Confirmed test fails on master, passes with the patch in the branch. Reviewed code and existing discussion. Regarding the decision to keep the pruning peer connected at all, I agree with the author that the affected code is only responsible for requesting available blocks from a peer and it is reasonable to take the limited flag into account. I like the clean up in the first commit as well, it’s easier to read the loop with continue.

    For sanity I added checks in the test to confirm blocks from the pruned node were “found” by the full node but had 'confirmations': -1, which makes sense because there is a gap between the full node’s chain tip and the blocks available from the pruning peer.

     0-----BEGIN PGP SIGNED MESSAGE-----
     1Hash: SHA512
     2
     3ACK c5b5843d8f10d96f76ee6b95f2b1b1b4ce755f75
     4-----BEGIN PGP SIGNATURE-----
     5
     6iQIzBAEBCgAdFiEE5hdzzW4BBA4vG9eM5+KYS2KJyToFAmXs7UsACgkQ5+KYS2KJ
     7yTpnUg//b9HSUh/s1VJRWSoX23pcmFJtmeA4ndd5dGM+Q2RbyBZenI4ZwVlkunxy
     8l5B+U/gpD7cL/6iZsV6Eph/ouX9Om8NKQcxMn0TXzvOBYcByCZB++r5oNwfeKohu
     9aUJWRce9MYDj4GtdbW7p5DwcsKaX6nC8oOnxfv9ivP473F7BDjnYZpX7F0nKsrx7
    10b5vKMWCY7fXxmHpe0y0FnWVAJ10hvIQFD6I0EMO1mQpq14U3NUcG+Axfm6kCrJRE
    11EbeAKtxz/tITNNiN5AZg2NdgnmOR1MaOatkOuCxWgMer+N2EOWZgeW+/42Iy0Isz
    12hOnv9BQi4MwewY/I5ruM7Yx2oghUEpIkIhBX8/JPeA9iD2pnVIcXi6Xkaawd31C1
    13Xl7OMpX1mOVtPjNPvmYxpbFjze6BhGpaa3h7UJFSBzqhFm0gpe70f+hza4HbXUQj
    14MihsnKASJPmCDLFY/kGddad/E+yJ5QNxSHhLqEugkWPGPnIa4CHyd91uU5qQwwOv
    15H1sN9hAlNYguDZs79jPY32yIXgpufQyUWQA1+emQpDQun68jHjTxTdgUFEbvlIDU
    16Nl+fxsY9Fqj4D3HpdnnwIUcX1W68itMhPPBQUDu3R1abdUHMc/ntE4gwr6BP3LcL
    17AFv+d4aflQrEG9RntgmkyUSnYw6VIj9INlQMS6ql4r70Kh/6Bhk=
    18=C8tL
    19-----END PGP SIGNATURE-----
    

    pinheadmz’s public key is on keybase

  80. DrahtBot removed review request from andrewtoth on Mar 9, 2024
  81. DrahtBot requested review from andrewtoth on Mar 9, 2024
  82. achow101 commented at 12:09 pm on March 11, 2024: member
    ACK c5b5843d8f10d96f76ee6b95f2b1b1b4ce755f75
  83. DrahtBot removed review request from andrewtoth on Mar 11, 2024
  84. DrahtBot requested review from andrewtoth on Mar 11, 2024
  85. achow101 merged this on Mar 11, 2024
  86. achow101 closed this on Mar 11, 2024

  87. furszy deleted the branch on Mar 11, 2024
