Fix race conditions on disconnects #250

pull Sjors wants to merge 6 commits into bitcoin-core:master from Sjors:2026/03/races changing 5 files +137 −5
  1. race fix: worker thread destroyed before it is initialized (#34711, #34756)
    Fix from: https://github.com/bitcoin/bitcoin/issues/34711#issuecomment-3986380070
    Also reported in: https://github.com/bitcoin/bitcoin/issues/34756
    4fa90c6801
  2. race fix: getParams() called after request cancel (#34777)
    Fix from: https://github.com/bitcoin/bitcoin/issues/34777#issuecomment-4024285314
    88b4d85099
  3. race fix: m_on_cancel called after request finishes (#34782)
    Fix from: https://github.com/bitcoin/bitcoin/issues/34782#issuecomment-4033169085
    Failure is also triggered by antithesis https://github.com/bitcoin/bitcoin/issues/34782#issuecomment-4031170798
    7f954aa5ea
  4. DrahtBot commented at 10:38 am on March 11, 2026: none

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Reviews

    See the guideline for information on the review process. A summary of reviews will appear here.

  5. Sjors commented at 10:39 am on March 11, 2026: member

    I made some adjustments to the tests in #249:

    • drop the NDEBUG guard
    • (hopefully) fix the sanitizer issue by saving testing_hook_after_cleanup to a local

    Opened a fresh PR for a full CI run. @ryanofsky feel free to take the next turn and push to your PR, or I can address remaining issues here.

  6. test: getParams() called after request cancel
    Test from: https://github.com/bitcoin/bitcoin/issues/34777#issuecomment-4033231296
    5aa8c36cdd
  7. test: m_on_cancel called after request finishes
    Test from: https://github.com/bitcoin/bitcoin/issues/34782#issuecomment-4033363093
    789ac8c86d
  8. Sjors force-pushed on Mar 11, 2026
  9. Sjors commented at 12:33 pm on March 11, 2026: member
    • dropped # PR number from the (test) commit messages, since those link to this repo and the commit message already has the full link

    The LLM came up with a third test, but beyond reproducing the original issue, it doesn’t seem very useful. So I omitted it in this PR. See https://github.com/Sjors/libmultiprocess/commit/7c59dacbddd0779240c14f554c18605b2d39cbdd

  10. ryanofsky commented at 3:05 pm on March 11, 2026: collaborator

    Thanks for updating this! I opened https://github.com/bitcoin/bitcoin/pull/34804 to see how this interacts with bitcoin core CI.

    (I wonder if there is a way we automatically run bitcoin core CI on every libmultiprocess PR, or run it more easily. Might be something to look into).

    The LLM came up with a third test, but beyond reproducing the original issue, it doesn’t seem very useful. So I omitted it in this PR. See Sjors@7c59dac

    Didn’t look very closely but the test seems clean and I’d be inclined to add it since I already know I want to follow up this PR by duplicating the disconnect tests and maybe reviving my earlier attempt. If the test turns out not to be meaningful or is a maintenance burden we can remove it.

    Anyway I plan to review this soon

  11. maflcko commented at 3:26 pm on March 11, 2026: contributor

    (I wonder if there is a way we automatically run bitcoin core CI on every libmultiprocess PR, or run it more easily. Might be something to look into).

    I think you can just copy-paste this yaml:

    https://github.com/bitcoin-core/qa-assets/blob/main/.github/workflows/ci.yml#L1

    (The qa-assets checkout can be removed)

  12. Sjors commented at 4:13 pm on March 11, 2026: member

    I pushed the third test.

    (oh oops, let me check the CI error…)

  13. test: worker thread destroyed before it is initialized
    Add testing_hook_makethread that fires on the worker thread between
    the Lock and set_value operations in makeThread. A checker thread uses
    try_lock to verify the waiter mutex is held at that point. With the
    fix (Lock before set_value) the mutex is held, so try_lock fails.
    Without the fix (set_value before Lock) the hook fires before Lock, so
    try_lock succeeds -- indicating the race window exists.
    3f28bca517
  14. Sjors force-pushed on Mar 11, 2026
  15. Sjors commented at 4:33 pm on March 11, 2026: member
    Fixed missing include.
  16. Sjors commented at 10:36 am on March 12, 2026: member
    ACK 3f28bca5174f2a231b7fff6772411f8689f3d7e7 for the fix commits by @ryanofsky.
  17. fanquake commented at 2:23 pm on March 12, 2026: member
    Looks like the subtree update PR that includes this, in our CI, still has heap-use-after-free / data race issues: https://github.com/bitcoin/bitcoin/pull/34804#issuecomment-4047157116.
  18. ryanofsky commented at 4:49 pm on March 12, 2026: collaborator

    Looks like the subtree update PR that includes this, in our CI, still has heap-use-after-free / data race issues: bitcoin/bitcoin#34804 (comment).

    Commented in the other PR, but those are both caused by this line in newly added test code. I’m currently going through the PR and improving / rewriting the tests now. I will probably reopen #249 with the updates then refresh https://github.com/bitcoin/bitcoin/pull/34804

  19. Sjors commented at 5:56 pm on March 12, 2026: member

    Ok, I’ll close this one.

    but those are both caused by this line in newly added test code.

    The first two commits also had issues in their test in earlier versions. I didn’t thoroughly check the third commit because I initially didn’t use it.

  20. Sjors closed this on Mar 12, 2026

  21. ryanofsky commented at 6:04 pm on March 12, 2026: collaborator
    Thanks! I wound up changing a bunch of things in the tests too, but they were extremely helpful to have as a starting point

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin-core/libmultiprocess. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-03-29 21:30 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me