Bitcoin Core rpc_misc.py test failure #128

issue Sjors openend this issue on January 17, 2025
  1. Sjors commented at 9:49 am on January 17, 2025: member
  2. Sjors commented at 9:56 am on January 17, 2025: member

    So far I tried on (M4) macOS 15.2 with commit 9a90f964a3c4fe75477926d34b7f1813b144449d:

    0cmake -B build -DWITH_MULTIPROCESS=ON 
    1cmake --build build
    2BITCOIND=$(pwd)/build/src/bitcoin-node build/test/functional/rpc_misc.py
    

    Using multiprocess master @ 3b2617b3e59f55ae3501c82a62c87f393077779f

  3. ryanofsky commented at 3:09 pm on January 17, 2025: collaborator

    From the CI link above the error seems to be:

    0node0 stderr libc++abi: terminating due to uncaught exception of type std::__1::system_error: mutex lock failed: Invalid argument
    

    which according to https://stackoverflow.com/questions/66773247/libcabi-dylib-terminating-with-uncaught-exception-of-type-std-1system-er “typically happens when .lock() is called on a mutex that is not yet constructed, or has already been destructed.”

    In the test, the error happens when the node is shutting down after an echoipc call has been made:

    https://github.com/bitcoin/bitcoin/blob/90227f89411c1a1d4e62ed0e4de3099e1104bc19/test/functional/rpc_misc.py#L83

  4. Sjors commented at 8:56 am on January 21, 2025: member

    Another CI failure on the same test, this time on centOS. I’m unsure if it’s the same issue:

    https://cirrus-ci.com/task/6027377219731456?logs=ci#L4105

    0raise AssertionError("Unexpected stderr {} != {}".format(stderr, expected_stderr))
    1AssertionError: Unexpected stderr terminate called after throwing an instance of 'kj::ExceptionImpl'
    2what():  mp/proxy.cpp:242: disconnected: write(m_post_fd, &buffer, 1): Broken pipe
    
  5. ryanofsky commented at 4:20 pm on January 21, 2025: collaborator

    The failure on centos https://cirrus-ci.com/task/6027377219731456?logs=ci#L4111 also reported https://github.com/bitcoin/bitcoin/pull/30975#issuecomment-2603827142 is happening in nearly the same place as originally reported failure on macos, when the test tries to stop the node on line https://github.com/bitcoin/bitcoin/blob/11b293a9e7e89f25acc74d0bda5cc68cd164b551/test/functional/rpc_misc.py#L83 after an echoipc call is made.

    However, there are some differences between the two cases:

    In both cases there seems to be a race condition and the bug could have the same root cause. I think my next step will be to try to run the test in loop to see if I could reproduce locally.

  6. ryanofsky commented at 9:00 pm on January 21, 2025: collaborator
    The “Unexpected stderr” “terminate called after throwing an instance of ‘kj::ExceptionImpl” “mp/proxy.cpp:242: disconnected: write(m_post_fd, &buffer, 1): Broken pipe” error that happens on CentOS was actually seen before and has an issue open https://github.com/bitcoin/bitcoin/issues/31151
  7. ryanofsky commented at 3:18 pm on January 23, 2025: collaborator

    Looking into this more it’s not clear if this “mutex lock failed: Invalid argument” issue is the same as the other “disconnected: write(m_post_fd, &buffer, 1): Broken pipe” issue reported in https://github.com/bitcoin/bitcoin/issues/31151, or if the fix in #129 will affect it.

    The other “Broken pipe” issue happens the when the child process in the echoipc test does not shut down cleanly due to a race condition in the EventLoop shutdown sequence that happens there. But the “mutex lock failed” here is happening in the parent process not the child process, and even though the last log messages are about the event loop shutting down, they could just indicate that the event loop shut down successfully, and the error is happening at some point after that.

    The echoipc parent process starts its event loop with the CapnpProtocol::connect function instead of CapnpProtocol::serve method running the event loop in the child process. And instead of shutting down when the last clean disconnects, it shuts down when the main() process exits and MakeNodeInit() destructor is called to destroy the interfaces::Init object which owns the interfaces::Ipc object which owns the CapnpProtocol object which calls removeClient in its destructor to stop the event loop.

    But even though the event loop shutdown sequence is different, it could have the race conditions described in https://github.com/bitcoin/bitcoin/issues/31151 as long as there are any ProxyServerBase::m_impl objects that need to be destroyed asynchronously and could cause EventLoop::startAsyncThread to spawn a separate thread.

    The echoipc method in the parent process isn’t passing any interface pointers to the child process so it isn’t actually creating any ProxyServer objects of its own. However the IPC framework is creating ProxyServer<Thread> and ProxyServer<ThreadMap> objects and while the former is a custom class that doesn’t inherit from ProxyServerBase, the latter does, so it’s possible a similar race condition could exist that causes this issue and would be fixed by #129.

    But it’s also possible #129 does not fix this, so will probably just need to see if it happens again.

  8. ryanofsky commented at 7:20 pm on January 27, 2025: collaborator
    I’m going to close this issue assuming #129 fixes it, but if it happens again, this can be reopened.
  9. ryanofsky closed this on Jan 27, 2025

  10. ryanofsky commented at 12:08 pm on February 7, 2025: collaborator

    But it’s also possible #129 does not fix this, so will probably just need to see if it happens again.

    This issue did turn out to be different than #31151 and not fixed by #129. A similar macos “mutex lock failed” error was reported in #154. Both this issue and #154 should be fixed by https://github.com/bitcoin/bitcoin/pull/31815, but this has not been confirmed yet.


Sjors ryanofsky


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin-core/libmultiprocess. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-12-04 19:30 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me