test: avoid interface_ipc.py race and null pointer dereference #34715

pull ryanofsky wants to merge 1 commits into bitcoin:master from ryanofsky:pr/ipc-noctx changing 1 files +3 −1
  1. ryanofsky commented at 7:26 pm on March 2, 2026: contributor
    Avoid race condition in run_deprecated_mining_test caused by creating and immediately destroying an unused worker thread. This is leading to test failures reported by maflcko in #34711
  2. test: avoid interface_ipc.py race and null pointer dereference
    Avoid race condition in run_deprecated_mining_test caused by creating and
    immediately destroying an unused worker thread. This leads to test failures
    reported by maflcko in #34711
    1c1de334e9
  3. DrahtBot added the label Tests on Mar 2, 2026
  4. DrahtBot commented at 7:26 pm on March 2, 2026: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK w0xlt, optout21, l0rinc, enirox001

    If your review is incorrectly listed, please copy-paste <!–meta-tag:bot-skip–> into the comment that the bot should ignore.

  5. achow101 added this to the milestone 31.0 on Mar 2, 2026
  6. fanquake commented at 9:42 am on March 3, 2026: member
    cc @Sjors
  7. w0xlt commented at 6:49 pm on March 3, 2026: contributor
    ACK 1c1de334e9c0e3b4c72c3f3fe45b1f4df5ce4502 In run_deprecated_mining_test, ctx was created but never passed to any call, causing the race condition. The only call is init.makeMiningOld2() which takes no arguments.
  8. in test/functional/interface_ipc.py:76 in 1c1de334e9
    70@@ -71,7 +71,9 @@ async def async_routine():
    71     def run_deprecated_mining_test(self):
    72         self.log.info("Running deprecated mining interface test")
    73         async def async_routine():
    74-            ctx, init = await make_capnp_init_ctx(self)
    75+            node = self.nodes[0]
    76+            connection = await capnp.AsyncIoStream.create_unix_connection(node.ipc_socket_path)
    77+            init = capnp.TwoPartyClient(connection).bootstrap().cast_as(self.capnp_modules['init'].Init)
    


    optout21 commented at 10:57 am on March 4, 2026:

    To reduce code duplication, this could be solved by breaking up make_capnp_init_ctx into two: one to create init only, and the other also the ctx. Counter arguments:

    • this is a quick test fix
    • this is the single place where ctx is not needed.

    Feel free to resolve this comment if discarded.

  9. optout21 commented at 11:14 am on March 4, 2026: contributor

    utACK 1c1de334e9c0e3b4c72c3f3fe45b1f4df5ce4502

    This is a fix for an identified sporadic CI failure, due to a race condition, due to a quick create-and-destroy of a capnp context. The change is test-only and limited. Instead of calling into the make_capnp_init_ctx util, only its needed part is inlined. Verified by code review and unit tests.

  10. l0rinc commented at 12:46 pm on March 4, 2026: contributor

    Tested ACK 1c1de334e9c0e3b4c72c3f3fe45b1f4df5ce4502

    Added a temporary delay to ProxyServer::makeThread to make the race explicit. It passes after the fix and fails before it.

     0diff --git a/src/ipc/libmultiprocess/src/mp/proxy.cpp b/src/ipc/libmultiprocess/src/mp/proxy.cpp
     1--- a/src/ipc/libmultiprocess/src/mp/proxy.cpp	(revision 9cad97f6cdf1bd660732cd10e844a6a7e0771ea0)
     2+++ b/src/ipc/libmultiprocess/src/mp/proxy.cpp	(revision 79e9c9cc5be4932352116e6db0ea84f5f769929c)
     3@@ -413,6 +413,7 @@
     4         g_thread_context.thread_name = ThreadName(m_connection.m_loop->m_exe_name) + " (from " + from + ")";
     5         g_thread_context.waiter = std::make_unique<Waiter>();
     6         thread_context.set_value(&g_thread_context);
     7+        usleep(250000); // TODO Reproducer-only delay to widen the race between thread startup and shutdown.
     8         Lock lock(g_thread_context.waiter->m_mutex);
     9         // Wait for shutdown signal from ProxyServer<Thread> destructor (signal
    10         // is just waiter getting set to null.)
    
     0cmake -B build -DCMAKE_BUILD_TYPE=Debug && cmake --build build -j && build/test/functional/test_runner.py interface_ipc.py
     1...
     21/1 - interface_ipc.py failed, Duration: 4 s
     3
     4stdout:
     52026-03-04T12:41:29.808409Z TestFramework (INFO): PRNG seed is: 9184316120798979695
     62026-03-04T12:41:29.808817Z TestFramework (INFO): Initializing test directory /var/folders/5t/04gq0pqj5yv4t8cxw51q0s2m0000gn/T/test_runner_₿_🏃_20260304_124129/interface_ipc_0
     72026-03-04T12:41:33.104595Z TestFramework (INFO): Running echo test
     82026-03-04T12:41:33.361561Z TestFramework (INFO): Running mining test
     92026-03-04T12:41:33.619016Z TestFramework (INFO): Running deprecated mining interface test
    102026-03-04T12:41:33.675022Z TestFramework (INFO): Stopping nodes
    11[node 0] Cleaning up ipc directory '/var/folders/5t/04gq0pqj5yv4t8cxw51q0s2m0000gn/T/test-ipc-0e6nefup'
    12
    13
    14stderr:
    15Traceback (most recent call last):
    16  File "/Users/lorinc/IdeaProjects/bitcoin/build/test/functional/interface_ipc.py", line 90, in <module>
    17    IPCInterfaceTest(__file__).main()
    18    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^
    19  File "/Users/lorinc/IdeaProjects/bitcoin/test/functional/test_framework/test_framework.py", line 156, in main
    20    exit_code = self.shutdown()
    21  File "/Users/lorinc/IdeaProjects/bitcoin/test/functional/test_framework/test_framework.py", line 289, in shutdown
    22    self.stop_nodes()
    23    ~~~~~~~~~~~~~~~^^
    24  File "/Users/lorinc/IdeaProjects/bitcoin/test/functional/test_framework/test_framework.py", line 535, in stop_nodes
    25    node.wait_until_stopped()
    26    ~~~~~~~~~~~~~~~~~~~~~~~^^
    27  File "/Users/lorinc/IdeaProjects/bitcoin/test/functional/test_framework/test_node.py", line 524, in wait_until_stopped
    28    self.wait_until(lambda: self.is_node_stopped(**kwargs), timeout=timeout)
    29    ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    30  File "/Users/lorinc/IdeaProjects/bitcoin/test/functional/test_framework/test_node.py", line 931, in wait_until
    31    return wait_until_helper_internal(test_function, timeout=timeout, timeout_factor=self.timeout_factor, check_interval=check_interval)
    32  File "/Users/lorinc/IdeaProjects/bitcoin/test/functional/test_framework/util.py", line 429, in wait_until_helper_internal
    33    if predicate():
    34       ~~~~~~~~~^^
    35  File "/Users/lorinc/IdeaProjects/bitcoin/test/functional/test_framework/test_node.py", line 524, in <lambda>
    36    self.wait_until(lambda: self.is_node_stopped(**kwargs), timeout=timeout)
    37                            ~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^
    38  File "/Users/lorinc/IdeaProjects/bitcoin/test/functional/test_framework/test_node.py", line 500, in is_node_stopped
    39    assert return_code in expected_ret_code, self._node_msg(
    40           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    41AssertionError: [node 0] Node returned unexpected exit code (-11) vs ((0,)) when stopping
    42[node 0] Cleaning up leftover process
    43
    44
    45
    46TEST             | STATUS    | DURATION
    47
    48interface_ipc.py | ✖ Failed  | 4 s
    49                                                                                                                                                                                                 
    50ALL              | ✖ Failed  | 4 s (accumulated)                                                                                                                                                 
    51Runtime: 4 s                                                                                                                                                                                     
    
  11. enirox001 commented at 1:10 pm on March 4, 2026: contributor

    ACK 1c1de33

    Tested locally on Fedora (-DENABLE_IPC=ON).

    To verify the race condition, I artificially widened the race window on master by injecting a 100ms std::this_thread::sleep_for inside ProxyServer::makeThread just before the waiter mutex lock. This successfully forced the fatal crash during the test teardown sequence.

    Compiling this PR branch with the exact same sabotaged C++ code results in the test passing. Because the test now bypasses make_capnp_init_ctx and connects directly without requesting a remote worker thread, makeThread is never invoked, completely eliminating the premature teardown race condition.

  12. fanquake merged this on Mar 4, 2026
  13. fanquake closed this on Mar 4, 2026


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-03-23 12:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me