ipc, test: Add tests for unclean disconnect and thread busy behavior #34284
pull ryanofsky wants to merge 1 commits into bitcoin:master from ryanofsky:pr/ipc-testasync changing 2 files +118 −2-
ryanofsky commented at 1:46 pm on January 14, 2026: contributorUpcoming libmultiprocess changes are expected to alter this behavior (https://github.com/bitcoin/bitcoin/issues/34250#issuecomment-3749243782), making test coverage useful for documenting current behavior and validating the intended changes.
-
DrahtBot commented at 1:46 pm on January 14, 2026: contributor
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
Code Coverage & Benchmarks
For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/34284.
Reviews
See the guideline for information on the review process.
Type Reviewers ACK Sjors Stale ACK ismaelsadeeq If your review is incorrectly listed, please copy-paste <!–meta-tag:bot-skip–> into the comment that the bot should ignore.
Conflicts
Reviewers, this pull request conflicts with the following ones:
- #34568 (mining: Break compatibility with existing IPC mining clients by ryanofsky)
- #34422 (Update libmultiprocess subtree to be more stable with rust IPC client by ryanofsky)
- #34184 (mining: add cooldown to createNewBlock() immediately after IBD by Sjors)
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
LLM Linter (✨ experimental)
Possible typos and grammar issues:
- “If the process has exited, asserts that the exit code matches” -> “If the process has exited, the method asserts that the exit code matches” [Adds missing subject to avoid a dangling clause and clarify what performs the assertion]
2026-02-12 01:12:11
-
DrahtBot added the label CI failed on Jan 14, 2026
-
DrahtBot commented at 3:23 pm on January 14, 2026: contributor
🚧 At least one of the CI tasks failed. Task
lint: https://github.com/bitcoin/bitcoin/actions/runs/20996362200/job/60354097964 LLM reason (✨ experimental): Lint check failed (ruff) due to unused variable errors in Python code, causing the CI to fail.Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:
-
Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.
-
A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.
-
An intermittent issue.
Leave a comment here, if you need help tracking down a confusing failure.
-
-
ismaelsadeeq commented at 4:00 pm on January 14, 2026: memberConcept ACK
-
ryanofsky force-pushed on Jan 14, 2026
-
ryanofsky commented at 11:54 pm on January 14, 2026: contributorUpdated 7666f1cbafdff3fc8511ea540d9d7c7421902142 -> 1835ee877af30800c361e2075baae44030dc33de (
pr/ipc-testasync.1->pr/ipc-testasync.2, compare) to fix various ci errors -
ryanofsky force-pushed on Jan 15, 2026
-
ryanofsky commented at 1:39 pm on January 15, 2026: contributor
Updated 1835ee877af30800c361e2075baae44030dc33de -> 568a0d8311a73094e6510699cd0d65ef6c4a252d (
pr/ipc-testasync.2->pr/ipc-testasync.3, compare) to fix CI errors https://github.com/bitcoin/bitcoin/actions/runs/21014168771Updated 568a0d8311a73094e6510699cd0d65ef6c4a252d -> 96fd77bbcc42f1ac76be025e006bb59866bb7cc3 (
pr/ipc-testasync.3->pr/ipc-testasync.4, compare) to fix CI errors https://github.com/bitcoin/bitcoin/actions/runs/21033187005Updated 96fd77bbcc42f1ac76be025e006bb59866bb7cc3 -> f532e9b2ff1a8b6359a79f238426a912abbf7428 (
pr/ipc-testasync.4->pr/ipc-testasync.5, compare) making LLM suggested typo fixes and trying to bypass apparently stuck and incomplete CI job https://github.com/bitcoin/bitcoin/actions/runs/21036706108 -
ryanofsky force-pushed on Jan 15, 2026
-
ryanofsky force-pushed on Jan 15, 2026
-
DrahtBot removed the label CI failed on Jan 15, 2026
-
in test/functional/interface_ipc.py:455 in f532e9b2ff outdated
450+ https://github.com/bitcoin/bitcoin/issues/33923 and currently causes a 451+ thread busy error. A future change will make this just queue the calls 452+ for execution and not trigger any error""" 453+ node = self.nodes[0] 454+ self.log.info("Running thread busy test") 455+ timeout = self.rpc_timeout * 1000.0
ismaelsadeeq commented at 12:47 pm on January 19, 2026:In “ipc, test: Add tests for unclean disconnect and thread busy behavior” f532e9b2ff1a8b6359a79f238426a912abbf7428
nitty-nit: 60s is fine, but
rpc_timeoutseems unrelated to block wait timeout. I’d rather we define a timeout relative to the code in scope that would make the template wait till the required precondition is met; in case where changing the value of timeout should not necessitate editing this code (I assume that unlikely, so this suggestion is just a cosmetic).
ryanofsky commented at 12:52 pm on January 20, 2026:re: #34284 (review)
I think the idea behind timeout variable is to choose a cutoff time, where if the node does not respond to the request within that time, the request is considered hung and the test should fail.
The timeout should not be so short that the test fails spuriously. And the timeout should not be so long that hangs take a very long time to detect. Since test environments vary a lot, the timeout should also be controllable by the
--timeout-factorcommand line option.Passing
rpc_timeoutas thewaitNexttimeout accomplishes these things, which I think are what we want, but we could potentially document therpc_timeoutvariable to describe it better, or rename it to something more general likerequest_timeout. The current name also seems ok since IPC is a type of RPC. I do think we want to avoid having multiple timeout variables that are controlled separately if we can avoid that.ismaelsadeeq approvedismaelsadeeq commented at 12:57 pm on January 19, 2026: memberACK f532e9b2ff1a8b6359a79f238426a912abbf7428
I have tested locally and verified that the issue reported exists on master.
I also cherry-picked https://github.com/bitcoin/bitcoin/compare/master...ryanofsky:bitcoin:pr/ipc-cancel compile and ran the test
run_unclean_disconnect_test. It seems to fix the issue, and the subtest is not passing after that patch.Post that commit, it will be easy to update this test to reflect the new behaviour; we just have to assert the opposite of what’s assumed here, which is that an unclean disconnect does not trigger a crash and that more concurrency is allowed.
in test/functional/interface_ipc.py:492 in f532e9b2ff
487+ assert_equal(e.type, "FAILED") 488+ else: 489+ raise AssertionError("Expected thread busy exception") 490+ 491+ # Generate a new block to make the active waitNext calls return, then clean up. 492+ self.generate(node, 1, sync_fun=lambda: None)
maflcko commented at 3:10 pm on January 19, 2026:nit: I think you can useself.no_opinstead of the lambda
ryanofsky commented at 8:26 pm on February 11, 2026:re: #34284 (review)
nit: I think you can use
self.no_opinstead of the lambdaThanks, switched to
self.no_opryanofsky commented at 4:15 am on January 21, 2026: contributorI have a PR https://github.com/bitcoin-core/libmultiprocess/pull/240 which addresses both of the of the issues tested here (#33923 and #34250). The PR seems to work well but needs more testing and cleanup. I still think it would be good to merge this PR first to get more test coverage in place. If the other PR goes ahead the tests here can be updated as follows:
0--- a/test/functional/interface_ipc.py 1+++ b/test/functional/interface_ipc.py 2@@ -401,9 +401,9 @@ class IPCInterfaceTest(BitcoinTestFramework): 3 4 def run_unclean_disconnect_test(self): 5 """Test behavior when disconnecting during an IPC call that later 6- returns a non-null interface pointer. Currently this behavior causes a 7- crash as reported [#34250](/bitcoin-bitcoin/34250/), but a 8- followup will change this behavior.""" 9+ returns a non-null interface pointer. This used to cause a crash as 10+ reported [#34250](/bitcoin-bitcoin/34250/), but now just 11+ results in a cancellation log message""" 12 node = self.nodes[0] 13 self.log.info("Running disconnect during BlockTemplate.waitNext") 14 timeout = self.rpc_timeout * 1000.0 15@@ -428,28 +428,22 @@ class IPCInterfaceTest(BitcoinTestFramework): 16 with node.assert_debug_log(expected_msgs=["BlockTemplate.waitNext", "IPC server post request"]): 17 promise = template.waitNext(ctx, waitoptions) 18 await asyncio.sleep(0.1) 19- disconnected_log_check.enter_context(node.assert_debug_log(expected_msgs=["IPC server: socket disconnected"])) 20+ disconnected_log_check.enter_context(node.assert_debug_log(expected_msgs=["IPC server: socket disconnected", "cancelled while executing"])) 21 del promise 22 23 asyncio.run(capnp.run(async_routine())) 24 25 # Wait for socket disconnected log message, then generate a block to 26- # cause the waitNext() call to return a new template. This will cause a 27- # crash and disconnect with error output. 28- disconnected_log_check.close() 29- try: 30+ # cause the waitNext() call to return a new template. Look for a 31+ # cancelled IPC log message after waitNext returns. 32+ with node.assert_debug_log(expected_msgs=["interrupted (cancelled)"]): 33+ disconnected_log_check.close() 34 self.generate(node, 1) 35- except (http.client.RemoteDisconnected, BrokenPipeError, ConnectionResetError): 36- pass 37- node.wait_until_stopped(expected_ret_code=(-11, -6, 1, 66), expected_stderr=re.compile(r"\S")) 38- self.start_node(0) 39 40 def run_thread_busy_test(self): 41 """Test behavior when sending multiple calls to the same server thread 42 which used to cause a crash as reported 43- [#33923](/bitcoin-bitcoin/33923/) and currently causes a 44- thread busy error. A future change will make this just queue the calls 45- for execution and not trigger any error""" 46+ [#33923](/bitcoin-bitcoin/33923/).""" 47 node = self.nodes[0] 48 self.log.info("Running thread busy test") 49 timeout = self.rpc_timeout * 1000.0 50@@ -480,18 +474,15 @@ class IPCInterfaceTest(BitcoinTestFramework): 51 with node.assert_debug_log(expected_msgs=["BlockTemplate.waitNext", "IPC server post request"]): 52 promise2 = template.waitNext(ctx, waitoptions) 53 await asyncio.sleep(0.1) 54- try: 55- await template.waitNext(ctx, waitoptions) 56- except capnp.lib.capnp.KjException as e: 57- assert_equal(e.description, "remote exception: std::exception: thread busy") 58- assert_equal(e.type, "FAILED") 59- else: 60- raise AssertionError("Expected thread busy exception") 61+ with node.assert_debug_log(expected_msgs=["BlockTemplate.waitNext", "IPC server post request"]): 62+ promise3 = template.waitNext(ctx, waitoptions) 63+ await asyncio.sleep(0.1) 64 65 # Generate a new block to make the active waitNext calls return, then clean up. 66 self.generate(node, 1, sync_fun=lambda: None) 67 await ((await promise1).result).destroy(ctx) 68 await ((await promise2).result).destroy(ctx) 69+ await ((await promise3).result).destroy(ctx) 70 await template.destroy(ctx) 71 72 asyncio.run(capnp.run(async_routine()))DrahtBot added the label Needs rebase on Feb 7, 202661c9db4872ipc, test: Add tests for unclean disconnect and thread busy behavior
Upcoming libmultiprocess changes are expected to alter this behavior (https://github.com/bitcoin/bitcoin/issues/34250#issuecomment-3749243782), making test coverage useful for documenting current behavior and validating the intended changes.
in test/functional/test_framework/test_node.py:481 in f532e9b2ff outdated
477@@ -477,12 +478,17 @@ def is_node_stopped(self, *, expected_stderr="", expected_ret_code=0): 478 return False 479 480 # process has stopped. Assert that it didn't return an error code. 481- assert return_code == expected_ret_code, self._node_msg( 482+ if not isinstance(expected_ret_code, Iterable):
Sjors commented at 9:03 am on February 11, 2026:Let’s add a comment tois_node_stoppedthat this takes either one ret code or a list, then the line here is less a surprise.
ryanofsky commented at 8:27 pm on February 11, 2026:re: #34284 (review)
Let’s add a comment to
is_node_stoppedthat this takes either one ret code or a list, then the line here is less a surprise.Thanks, added comment
ryanofsky referenced this in commit 4f42fc4ddc on Feb 11, 2026ryanofsky force-pushed on Feb 11, 2026ryanofsky commented at 8:50 pm on February 11, 2026: contributorThanks for the review!
Rebased f532e9b2ff1a8b6359a79f238426a912abbf7428 -> 66d0258efa9edcb64dc09b0e2b00e1b82b1dd97f (
pr/ipc-testasync.5->pr/ipc-testasync.6, compare) due to conflicts with #34452 and implemented suggestionsUpdated 66d0258efa9edcb64dc09b0e2b00e1b82b1dd97f -> 97d2b7375ae3b7bcb479b4ea9b50fd83f7600209 (
pr/ipc-testasync.6->pr/ipc-testasync.7, compare) to fix race condition in thread busy test https://github.com/bitcoin/bitcoin/actions/runs/21922475933/job/63306151362?pr=34284Updated 97d2b7375ae3b7bcb479b4ea9b50fd83f7600209 -> 61c9db4872bc069658f87743e07688d05cab01ae (
pr/ipc-testasync.7->pr/ipc-testasync.8, compare) to deal with no stderr output from musl in https://github.com/bitcoin/bitcoin/actions/runs/21924150204/job/63312180100?pr=34284ryanofsky force-pushed on Feb 11, 2026DrahtBot added the label CI failed on Feb 11, 2026DrahtBot removed the label Needs rebase on Feb 11, 2026ryanofsky force-pushed on Feb 12, 2026DrahtBot removed the label CI failed on Feb 12, 2026DrahtBot requested review from ismaelsadeeq on Feb 13, 2026
github-metadata-mirror
This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-02-17 09:13 UTC
This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me