multiprocess: Lock CapnpProtocol::m_loop with mutex #31815

pull ryanofsky wants to merge 1 commits into bitcoin:master from ryanofsky:pr/macrace changing 1 files +26 −22
  1. ryanofsky commented at 11:55 am on February 7, 2025: contributor

    This change is intended to fix intermittent failures in ipc_tests unit test and rpc_misc.py functional tests that happen on mac os, which sometimes fail with error terminating due to uncaught exception of type std::__1::system_error: mutex lock failed: Invalid argument as reported https://github.com/chaincodelabs/libmultiprocess/issues/154

    This error could happen when m_loop.reset(); in CapnpProtocol::startLoop executes before the m_loop->removeClient(lock); method returns in the ~CapnpProtocol destructor , causing an failure to reacquire the EventLoop::m_mutex inside the EventLoop::removeClient method, and the “mutex lock failed” exception to be thrown.

    Fix this error by adding a new mutex to guard the CapnpProtocol::m_loop variable and making sure m_loop.reset(); cannot be called until after m_loop->removeClient(lock); returns.


    This PR is part of the process separation project.

  2. multiprocess: Lock CapnpProtocol::m_loop with mutex
    This change is intended to fix occasional ipc_tests unit tests and rpc_misc.py
    functional tests errors that happen on mac os, which sometimes fail with error
    "terminating due to uncaught exception of type std::__1::system_error: mutex
    lock failed: Invalid argument" as reported
    https://github.com/chaincodelabs/libmultiprocess/issues/154
    
    This error could happen when `m_loop.reset();` in `CapnpProtocol::startLoop`
    executes before the `m_loop->removeClient(lock);` call in the `~CapnpProtocol`
    destructor returns, causing an failure to reacquire the `EventLoop::m_mutex`
    inside the `removeClient` method.
    
    Fix this error by adding a new mutex to guard access to the
    `CapnpProtocol::m_loop` variable and making sure `m_loop.reset();` cannot be
    called until after `m_loop->removeClient(lock);` returns.
    baea320be0
  3. DrahtBot commented at 11:55 am on February 7, 2025: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/31815.

    Reviews

    See the guideline for information on the review process. A summary of reviews will appear here.

  4. maflcko added the label Bug on Feb 7, 2025
  5. maflcko removed the label Bug on Feb 7, 2025
  6. DrahtBot added the label interfaces on Feb 7, 2025
  7. ryanofsky commented at 8:16 pm on February 7, 2025: contributor

    So far as of baea320be01eb68ff7ba47d0ca0e00b64366a880 it seems like this change does not solve the problem it was supposed to fix.

    There is a new “libc++abi: terminating due to uncaught exception of type std::__1::system_error: mutex lock failed: Invalid argument” failure https://github.com/bitcoin/bitcoin/actions/runs/13206947916/job/36872233303?pr=30975#step:7:2433 in 553a7e2f66649590864604ed187bf5bf2e796297 which includes a cherry-picked version of this fix.

  8. Sjors commented at 8:17 pm on February 7, 2025: member

    This might have a bug somewhere, in #30975:

     0test/ipc_tests.cpp:11: Entering test suite "ipc_tests"
     1test/ipc_tests.cpp:12: Entering test case "ipc_tests"
     2libc++abi: terminating due to uncaught exception of type std::__1::system_error: mutex lock failed: Invalid argument
     32025-02-07T19:59:17.387846Z [unknown] [test/util/random.cpp:48] [SeedRandomStateForTest] Setting random seed for current tests to RANDOM_CTX_SEED=32db8d22914609b5d53f87f6947ca780fea50790b141c9ccc134e7a13f971121
     42025-02-07T19:59:17.388318Z [test] [init/common.cpp:153] [LogPackageVersion] Bitcoin Core version v28.99.0-553a7e2f6664-dirty (release build)
     52025-02-07T19:59:17.388468Z [test] [kernel/context.cpp:20] [operator()] Using the 'arm_shani(1way,2way)' SHA256 implementation
     62025-02-07T19:59:17.397870Z [test] [test/ipc_test.cpp:60] [operator()] LOG0: {IpcPipeTest-24701/b-test-58682} IPC client first request from current thread, constructing waiter
     72025-02-07T19:59:17.398037Z [unknown] [test/ipc_test.cpp:60] [operator()] LOG0: {IpcPipeTest-24701/b-test-58682} IPC client send FooInterface.add$Params (a = 1, b = 2)
     82025-02-07T19:59:17.398114Z [unknown] [test/ipc_test.cpp:60] [operator()] LOG0: {IpcPipeTest-24701/58689} IPC server recv request  [#1](/bitcoin-bitcoin/1/) FooInterface.add$Params (a = 1, b = 2)
     92025-02-07T19:59:17.398144Z [unknown] [test/ipc_test.cpp:60] [operator()] LOG0: {IpcPipeTest-24701/58689} IPC server send response [#1](/bitcoin-bitcoin/1/) FooInterface.add$Results (result = 3)
    102025-02-07T19:59:17.398166Z [unknown] [test/ipc_test.cpp:60] [operator()] LOG0: {IpcPipeTest-24701/b-test-58682} IPC client recv FooInterface.add$Results (result = 3)
    112025-02-07T19:59:17.398206Z [unknown] [test/ipc_test.cpp:60] [operator()] LOG0: {IpcPipeTest-24701/b-test-58682} IPC client send FooInterface.passOutPoint$Params (arg = "d\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\310\\000\\000\\000")
    122025-02-07T19:59:17.398231Z [unknown] [test/ipc_test.cpp:60] [operator()] LOG0: {IpcPipeTest-24701/58689} IPC server recv request  [#2](/bitcoin-bitcoin/2/) FooInterface.passOutPoint$Params (arg = "d\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\310\\000\\000\\000")
    132025-02-07T19:59:17.398244Z [unknown] [test/ipc_test.cpp:60] [operator()] LOG0: {IpcPipeTest-24701/58689} IPC server send response [#2](/bitcoin-bitcoin/2/) FooInterface.passOutPoint$Results (result = "d\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\310\\000\\000\\000")
    142025-02-07T19:59:17.398262Z [unknown] [test/ipc_test.cpp:60] [operator()] LOG0: {IpcPipeTest-24701/b-test-58682} IPC client recv FooInterface.passOutPoint$Results (result = "d\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\310\\000\\000\\000")
    152025-02-07T19:59:17.398306Z [unknown] [test/ipc_test.cpp:60] [operator()] LOG0: {IpcPipeTest-24701/b-test-58682} IPC client send FooInterface.passUniValue$Params (arg = "{\\"i\\":1,\\"s\\":\\"two\\"}")
    162025-02-07T19:59:17.398324Z [unknown] [test/ipc_test.cpp:60] [operator()] LOG0: {IpcPipeTest-24701/58689} IPC server recv request  [#3](/bitcoin-bitcoin/3/) FooInterface.passUniValue$Params (arg = "{\\"i\\":1,\\"s\\":\\"two\\"}")
    172025-02-07T19:59:17.398338Z [unknown] [test/ipc_test.cpp:60] [operator()] LOG0: {IpcPipeTest-24701/58689} IPC server send response [#3](/bitcoin-bitcoin/3/) FooInterface.passUniValue$Results (result = "{\\"i\\":1,\\"s\\":\\"two\\"}")
    182025-02-07T19:59:17.398354Z [unknown] [test/ipc_test.cpp:60] [operator()] LOG0: {IpcPipeTest-24701/b-test-58682} IPC client recv FooInterface.passUniValue$Results (result = "{\\"i\\":1,\\"s\\":\\"two\\"}")
    192025-02-07T19:59:17.398384Z [unknown] [test/ipc_test.cpp:60] [operator()] LOG0: {IpcPipeTest-24701/b-test-58682} IPC client send FooInterface.passTransaction$Params (arg = "\\002\\000\\000\\000\\001d\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\310\\000\\000\\000\\000\\377\\377\\377\\377\\001\\000\\341\\365\\005\\000\\000\\000\\000\\000\\003\\000\\000\\000")
    202025-02-07T19:59:17.398404Z [unknown] [test/ipc_test.cpp:60] [operator()] LOG0: {IpcPipeTest-24701/58689} IPC server recv request  [#4](/bitcoin-bitcoin/4/) FooInterface.passTransaction$Params (arg = "\\002\\000\\000\\000\\001d\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\310\\000\\000\\000\\000\\377\\377\\377\\377\\001\\000\\341\\365\\005\\000\\000\\000\\000\\000\\003\\000\\000\\000")
    212025-02-07T19:59:17.398448Z [unknown] [test/ipc_test.cpp:60] [operator()] LOG0: {IpcPipeTest-24701/58689} IPC server send response [#4](/bitcoin-bitcoin/4/) FooInterface.passTransaction$Results (result = "\\002\\000\\000\\000\\001d\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\310\\000\\000\\000\\000\\377\\377\\377\\377\\001\\000\\341\\365\\005\\000\\000\\000\\000\\000\\003\\000\\000\\000")
    222025-02-07T19:59:17.398467Z [unknown] [test/ipc_test.cpp:60] [operator()] LOG0: {IpcPipeTest-24701/b-test-58682} IPC client recv FooInterface.passTransaction$Results (result = "\\002\\000\\000\\000\\001d\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\000\\310\\000\\000\\000\\000\\377\\377\\377\\377\\001\\000\\341\\365\\005\\000\\000\\000\\000\\000\\003\\000\\000\\000")
    232025-02-07T19:59:17.398501Z [unknown] [test/ipc_test.cpp:60] [operator()] LOG0: {IpcPipeTest-24701/b-test-58682} IPC client send FooInterface.passVectorChar$Params (arg = "Hello")
    242025-02-07T19:59:17.398516Z [unknown] [test/ipc_test.cpp:60] [operator()] LOG0: {IpcPipeTest-24701/58689} IPC server recv request  [#5](/bitcoin-bitcoin/5/) FooInterface.passVectorChar$Params (arg = "Hello")
    252025-02-07T19:59:17.398525Z [unknown] [test/ipc_test.cpp:60] [operator()] LOG0: {IpcPipeTest-24701/58689} IPC server send response [#5](/bitcoin-bitcoin/5/) FooInterface.passVectorChar$Results (result = "Hello")
    262025-02-07T19:59:17.398538Z [unknown] [test/ipc_test.cpp:60] [operator()] LOG0: {IpcPipeTest-24701/b-test-58682} IPC client recv FooInterface.passVectorChar$Results (result = "Hello")
    272025-02-07T19:59:17.398663Z [unknown] [test/ipc_test.cpp:60] [operator()] LOG0: {IpcPipeTest-24701/b-test-58682} IPC client send FooInterface.passBlockState$Params (arg = (mode = 1, result = 8, rejectReason = "reject reason", debugMessage = "debug message"))
    282025-02-07T19:59:17.398704Z [unknown] [test/ipc_test.cpp:60] [operator()] LOG0: {IpcPipeTest-24701/58689} IPC server recv request  [#6](/bitcoin-bitcoin/6/) FooInterface.passBlockState$Params (arg = (mode = 1, result = 8, rejectReason = "reject reason", debugMessage = "debug message"))
    292025-02-07T19:59:17.398725Z [unknown] [test/ipc_test.cpp:60] [operator()] LOG0: {IpcPipeTest-24701/58689} IPC server send response [#6](/bitcoin-bitcoin/6/) FooInterface.passBlockState$Results (result = (mode = 1, result = 8, rejectReason = "reject reason", debugMessage = "debug message"))
    302025-02-07T19:59:17.398749Z [unknown] [test/ipc_test.cpp:60] [operator()] LOG0: {IpcPipeTest-24701/b-test-58682} IPC client recv FooInterface.passBlockState$Results (result = (mode = 1, result = 8, rejectReason = "reject reason", debugMessage = "debug message"))
    312025-02-07T19:59:17.398772Z [unknown] [test/ipc_test.cpp:60] [operator()] LOG0: {IpcPipeTest-24701/b-test-58682} IPC client send FooInterface.passBlockState$Params (arg = (mode = 0, result = 0, rejectReason = "", debugMessage = ""))
    322025-02-07T19:59:17.398790Z [unknown] [test/ipc_test.cpp:60] [operator()] LOG0: {IpcPipeTest-24701/58689} IPC server recv request  [#7](/bitcoin-bitcoin/7/) FooInterface.passBlockState$Params (arg = (mode = 0, result = 0, rejectReason = "", debugMessage = ""))
    332025-02-07T19:59:17.398801Z [unknown] [test/ipc_test.cpp:60] [operator()] LOG0: {IpcPipeTest-24701/58689} IPC server send response [#7](/bitcoin-bitcoin/7/) FooInterface.passBlockState$Results (result = (mode = 0, result = 0, rejectReason = "", debugMessage = ""))
    342025-02-07T19:59:17.398819Z [unknown] [test/ipc_test.cpp:60] [operator()] LOG0: {IpcPipeTest-24701/b-test-58682} IPC client recv FooInterface.passBlockState$Results (result = (mode = 0, result = 0, rejectReason = "", debugMessage = ""))
    352025-02-07T19:59:17.398843Z [unknown] [test/ipc_test.cpp:60] [operator()] LOG0: {IpcPipeTest-24701/b-test-58682} IPC client send FooInterface.passScript$Params (arg = "[")
    362025-02-07T19:59:17.398858Z [unknown] [test/ipc_test.cpp:60] [operator()] LOG0: {IpcPipeTest-24701/58689} IPC server recv request  [#8](/bitcoin-bitcoin/8/) FooInterface.passScript$Params (arg = "[")
    372025-02-07T19:59:17.398868Z [unknown] [test/ipc_test.cpp:60] [operator()] LOG0: {IpcPipeTest-24701/58689} IPC server send response [#8](/bitcoin-bitcoin/8/) FooInterface.passScript$Results (result = "[")
    382025-02-07T19:59:17.398883Z [unknown] [test/ipc_test.cpp:60] [operator()] LOG0: {IpcPipeTest-24701/b-test-58682} IPC client recv FooInterface.passScript$Results (result = "[")
    392025-02-07T19:59:17.399029Z [unknown] [test/ipc_test.cpp:60] [operator()] LOG0: {IpcPipeTest-24701/58689} IPC server destroy N2mp11ProxyServerIN3gen12FooInterfaceEEE
    402025-02-07T19:59:17.399089Z [unknown] [test/ipc_test.cpp:60] [operator()] LOG0: {IpcPipeTest-24701/58689} EventLoop::loop done, cancelling event listeners.
    412025-02-07T19:59:17.399098Z [unknown] [test/ipc_test.cpp:60] [operator()] LOG0: {IpcPipeTest-24701/58689} EventLoop::loop bye.
    

    https://github.com/bitcoin/bitcoin/actions/runs/13206947916/job/36872233303?pr=30975

  9. ryanofsky referenced this in commit be6ad9f955 on Feb 10, 2025
  10. ryanofsky referenced this in commit f8cbd0d34a on Feb 10, 2025
  11. ryanofsky referenced this in commit 61bf9456f9 on Feb 10, 2025
  12. ryanofsky referenced this in commit 48d01bcca7 on Feb 10, 2025
  13. ryanofsky referenced this in commit 26b9f3dda4 on Feb 10, 2025
  14. ryanofsky commented at 5:07 am on February 10, 2025: contributor
    I reproduced the bug locally (see https://github.com/chaincodelabs/libmultiprocess/issues/154#issuecomment-2646926155) and implemented a different fix in https://github.com/chaincodelabs/libmultiprocess/pull/159, which should make these changes unnecessary, because instead of trying to delay eventloop destruction in Bitcoin code until all locks are released, it just avoids reacquiring a lock unnecessary and pretents eventloop::loop() from returning until all locks are released so it won’t be destroyed. Will close this PR since these changes only incompletely fixed the bug and should no longer be needed.
  15. ryanofsky closed this on Feb 10, 2025

  16. Sjors commented at 8:28 am on February 10, 2025: member
    Thanks. I dropped this from #30975 and rebased on your latest #31741 which contains https://github.com/chaincodelabs/libmultiprocess/pull/159.

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-02-22 15:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me