test: intermittent issue in p2p_1p1c_network.py #33318

issue fanquake openend this issue on September 5, 2025
  1. fanquake commented at 10:56 am on September 5, 2025: member

    Timeout failure in the TSAN job here https://github.com/bitcoin/bitcoin/actions/runs/17489459842/job/49675712585?pr=33317:

     02025-09-05T10:51:37.2200865Z  node3 2025-09-05T10:51:30.541244Z [httpworker.7] [rpc/request.cpp:243] [void JSONRPCRequest::parse(const UniValue &)] [rpc] ThreadRPCServer method=getpeerinfo user=__cookie__ 
     12025-09-05T10:51:37.2201033Z  test  2025-09-05T10:51:31.542469Z TestFramework (ERROR): Unexpected exception 
     22025-09-05T10:51:37.2201165Z                                    Traceback (most recent call last):
     32025-09-05T10:51:37.2201494Z                                      File "/home/admin/actions-runner/_work/_temp/test/functional/test_framework/test_framework.py", line 195, in main
     42025-09-05T10:51:37.2201593Z                                        self.run_test()
     52025-09-05T10:51:37.2201851Z                                      File "/home/admin/actions-runner/_work/_temp/build/test/functional/p2p_1p1c_network.py", line 155, in run_test
     62025-09-05T10:51:37.2201964Z                                        self.sync_mempools()
     72025-09-05T10:51:37.2202246Z                                      File "/home/admin/actions-runner/_work/_temp/test/functional/test_framework/test_framework.py", line 811, in sync_mempools
     82025-09-05T10:51:37.2202421Z                                        raise AssertionError("Mempool sync timed out after {}s:{}".format(
     92025-09-05T10:51:37.2202566Z                                    AssertionError: Mempool sync timed out after 2400s:
    102025-09-05T10:51:37.2216842Z                                      {'74c951d3e1bc27437394377a48d6ff7f11b49c1e0bd05e169d54c5138deea7f2', '77708ea74fbffa47e192e5be99b03a72e830f22a8b60d02e262ba8cfbc3e9b47', '233c3760167a81cd6ffccae81bcb650bdbd9c84b72cf87bf329940d2ac97b8ee', '390ec58b4c5b017f557e30ead39f9921b3fec5729b07ff89415c5570275a87dc', 'c8ca64cb3da89905b3ccfc5423d908a17273f90c4893373af9354e61f3bd651c', 
    
  2. fanquake added the label Tests on Sep 5, 2025
  3. fanquake added the label CI failed on Sep 5, 2025
  4. instagibbs commented at 5:31 pm on October 10, 2025: member

    I think I’ve nailed down what’s happening from logs:

    1. test peer sends child to node1
    2. node1 has parent resolved normally, it’s above minfee
    3. test peer is picked as unique peer to schedule resolution of child during test peer’s next “turn” in networking thread
    4. node0 advertises child by wtxid (it’s ignored currently because it’s in orphanage and has no parents, test peer is already the unique reconsideration peer so that’s fairly immaterial?)
    5. test peer disconnects before ProcessOrphanTx is called (and we never reconsider the orphan again because all parents are known!)
    6. now any peer can advertise the child again but in this test only node0 can, and it already has, so the transaction never makes it

    The test does the disconnect earlier but from socket closing to clearing of peer state, this race can occur.

    due to #31829 node1’s orphanage will actually empty out on disconnect, so it’s not actually testing much anymore, so that can be cleaned up and I expect the test to work better (especially if you assert the orphanage of each node is in the expected state)

    Overall, I think this is a vote in favor of forcing ProcessOrphanTx on disconnect by peer, or having the reconsiderable slot reassigned to a new peer to disallow them from playing similar games. Seems like it only effects “catch up” orphanage usage, not 1p1c, and relies on some incredible timing to disrupt it successfully

    Some discussion of this DoS strategy here: #31829 (comment)

  5. glozow commented at 7:52 pm on October 10, 2025: member

    The test does the disconnect earlier but from socket closing to clearing of peer state, this race can occur.

    If it’s a race, then this should go away if we make node3 the one that pre-receives orphans?

     0
     1diff --git a/test/functional/p2p_1p1c_network.py b/test/functional/p2p_1p1c_network.py
     2index e4d3b738c19..5567b4e7002 100755
     3--- a/test/functional/p2p_1p1c_network.py
     4+++ b/test/functional/p2p_1p1c_network.py
     5@@ -109,14 +109,14 @@ class PackageRelayTest(BitcoinTestFramework):
     6         # Assemble return results
     7         packages_to_submit = [package_hex_1, package_hex_2, package_hex_3, package_hex_4]
     8         # node0: sender
     9-        # node1: pre-received the children (orphan)
    10-        # node3: pre-received the parents (too low fee)
    11+        # node2: pre-received the parents (too low fee)
    12+        # node3: pre-received the children (orphan)
    13         # All nodes receive parent_31 ahead of time.
    14         txns_to_send = [
    15             [],
    16-            [child_1, child_2, parent_31, child_3, child_4],
    17             [parent_31],
    18-            [parent_1, parent_2, parent_31, parent_4]
    19+            [parent_1, parent_2, parent_31, parent_4],
    20+            [child_1, child_2, parent_31, child_3, child_4]
    21         ]
    22 
    23         return packages_to_submit, txns_to_send
    
  6. instagibbs commented at 12:37 pm on October 13, 2025: member

    I don’t think the pre-received child peer(s) make sense anymore, as we intentionally zero these out of the orphanage when all the announcing peers disconnect.

    If we want to keep this part just to exercise that state space, we could make sure that we wait until all the node’s orphan slots are empty via rpc before continuing with the ultimate submitpackage?


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-10-30 21:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me