Currently, when a client makes simultaneous IPC calls on a single server thread (assigning the same Context.thread value in the requests), the assertion assert(!m_fn) in Waiter::post fails and crashes the server.
Since this error is pretty easy to trigger from Python and Rust clients not using libmultiprocess, it would be better for the server to refuse these requests and provide a clear error message instead of crashing. This should just require a small change to the Waiter class and to the code calling it in type-context.h. Context: https://github.com/bitcoin/bitcoin/pull/33201#issuecomment-3258111873 and https://gnusha.org/bitcoin-core-dev/2025-09-05.log:
04:50 < sipa> ryanofsky: any idea about this error in https://github.com/bitcoin/bitcoin/actions/runs/17491932432/job/49683489249?pr=33201 :
04:51 < sipa> node0 stderr bitcoin-node: ipc/libmultiprocess/include/mp/proxy-io.h:279: void mp::Waiter::post(Fn &&) [Fn = (lambda at ./ipc/libmultiprocess/include/mp/type-context.h:67:19)]: Assertion `!m_fn' failed.
04:55 < kevkevin> looks like the test got hung up on this line in the ipc test: template4 = File "/home/admin/actions-runner/_work/_temp/build/test/functional/interface_ipc.py", line 156, in async_routine: await template2.result.waitNext(ctx, waitoptions)
04:56 < sipa> yeah, bitcoin-node crashed during that invocation
05:01 < sipa> i can reproduce it locally, though it's pretty rare (once in 60 runs, so far)
05:04 < kevkevin> hmm interesting, I'll try to reproduce it locally myself
05:05 < ryanofsky> that is interesting, i've never seen that assert hit before. i think it could happen if two ipc clients try to execute different IPC calls on the same thread at the same time
05:06 < sipa> huh!
05:06 < ryanofsky> libmultiprocess assumes a 1:1 threading model where there is one server thread for every client thread, emulating a traditional c++ call stack
05:06 < ryanofsky> allowing recursive mutexes, synchronous callbacks, etc
05:06 < sipa> should the python side do anything specific to arrange that?
05:07 < sipa> i would assume not, because there is only one thread on the python side that does all IPC calls
05:07 < ryanofsky> it should just make two IPC calls at the same time, and they should be calls that do not return instantly like waitNext
05:08 < ryanofsky> yeah, it is unexpected that current python test would cause that
05:10 < ryanofsky> maybe there is a race condition where c++ code sends a response but the server thread doesn't return to being idle, and python code sends the next request quickly enough to hit the assert
05:11 < sipa> oh, i may have messed something up
05:11 < sipa> waitnext = template2.result.waitNext(ctx, waitoptions)
05:11 < sipa> template4 = await template2.result.waitNext(ctx, waitoptions)
05:11 < sipa> might create two asynchronous calls simultaneously?
05:12 < ryanofsky> it seems like that would just chain the calls so the server runs them back to back, and make the bug more likely to happen
05:13 < ryanofsky> but if you added an await on the first line it might prevent this
05:13 < sipa> the first one is just unnecessary - i inlined it into the second one, but forgot to remove the original
05:14 < sipa> you say "make the bug more likely", so you think that the bug isn't in this code itself, but elsewhere?
05:14 < ryanofsky> actually i misread. yeah i assumed you were using the waitnext variable on the second line and the calls would be chained. but actually this does look like simultaneous calls
05:14 < sipa> alright, let me run it 1000 times to see if it reappears, but if not, i'm going to assume that was the culprit
05:14 < ryanofsky> no i think this code does explain the bug
05:15 < sipa> thanks!
05:15 < ryanofsky> i just first misread it as chained calls that would be pipelined when actually those are parallel calls
05:15 < sipa> right
05:16 < ryanofsky> thanks for the test, and finding this. server could definitely do better here, it could just refuse the request instead of asserting
05:18 < sipa> or at least a more helpful error message
05:18 < sipa> i could imagine third parties trying to use the interface might hit something like this too
05:20 < ryanofsky> yeah the error message is the most important thing in practice