ThreadSanitizer: Fix #29767 #29776

pull nanlour wants to merge 1 commits into bitcoin:master from nanlour:master changing 1 files +1 −1
  1. nanlour commented at 3:24 am on April 1, 2024: contributor
    I think this problem #29767#issue-2216373048 is because of in BaseIndex::Sync https://github.com/bitcoin/bitcoin/blob/61de64df6790077857faba84796bb874b59c5d15/src/index/base.cpp#L163-L168 Setup m_synced = true; before Commit(); So this may cause a race condition window to BaseIndex::BlockConnected https://github.com/bitcoin/bitcoin/blob/61de64df6790077857faba84796bb874b59c5d15/src/index/base.cpp#L271-L274 So i try to fix it with move m_synced = true after Commit(). Also see comment of Sync(): https://github.com/bitcoin/bitcoin/blob/61de64df6790077857faba84796bb874b59c5d15/src/index/base.h#L151-L156 I am a newcomer interested in Bitcoin, trying to become a member of the Bitcoin Core development team. Please give me some feedback if you could, as I may be doing something wrong. Thank you!
  2. Fix #29767, set m_synced = true after Commit() bbe82c116e
  3. DrahtBot commented at 3:24 am on April 1, 2024: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage

    For detailed information about the code coverage, see the test coverage report.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK ryanofsky, fjahr
    Concept ACK mzumsande

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

  4. fanquake renamed this:
    ThreadSanitizer: Fix #29767, ThreadSanitizer: data race src/flatfile.cpp:47:13 in FlatFileSeq::Open(FlatFilePos const&, bool)
    ThreadSanitizer: Fix #29767
    on Apr 1, 2024
  5. furszy commented at 4:45 pm on April 1, 2024: member

    Setup m_synced = true; before Commit(); So this may cause a race condition window to BaseIndex::BlockConnected

    If that’s the case, sleeping the thread after setting m_synced and before Commit() should trigger the failure. Have you tried it?

  6. mzumsande commented at 9:45 pm on April 1, 2024: contributor

    Concept ACK

    Seems like this is not just a theoretical ThreadSanitizer error but a bug that could lead to index corruption in the case of unfortunate timing between the threads running Sync() and BlockConnected().

  7. Sjors commented at 1:55 pm on April 2, 2024: member

    The change to bbe82c116e72ca0638751e063bf564cd1fe5c4d5 in index/base.cpp looks safe to me.

    cc @fjahr

  8. nanlour force-pushed on Apr 2, 2024
  9. nanlour commented at 3:37 pm on April 2, 2024: contributor

    @furszy

    If that’s the case, sleeping the thread after setting m_synced and before Commit() should trigger the failure. Have you tried it?

    Yes, I have tried to recreate this bug with the original feature_assumeutxo test from #29767, but simply sleeping between setting m_synced and Commit() cannot trigger it. I’ve attempted to recreate the data race by only pause Sync thread after it set m_synced and before it call FlatFileSeq::Open, this trigger data race when run feature_assumeutxo test with a probability of around 10%. This is my way to recreate it: 5931fb8f0966f435e9b975ee1c4c4357e88e9e56. I am pretty sure there will be a better solution, but I can only come up with this one for now.

  10. ryanofsky approved
  11. ryanofsky commented at 4:16 pm on April 2, 2024: contributor

    Code review ACK bbe82c116e72ca0638751e063bf564cd1fe5c4d5

    It seems like the race condition between m_synced = true and committing has been around since 4368384f1d267b011e03a805f934f5c47e2ca1b2 from #14121, and the race condition accessing the m_next_filter_pos variable existed then too.

  12. DrahtBot requested review from mzumsande on Apr 2, 2024
  13. DrahtBot added the label CI failed on Apr 2, 2024
  14. fjahr commented at 10:40 pm on April 3, 2024: contributor

    Code review ACK bbe82c116e72ca0638751e063bf564cd1fe5c4d5

    This looks like the correct fix to me. CI failure is unrelated.

  15. ryanofsky merged this on Apr 4, 2024
  16. ryanofsky closed this on Apr 4, 2024

  17. ryanofsky commented at 1:30 pm on April 4, 2024: contributor
    Merged this since it is a small straightfoward fix, only affects indexing, has 2 acks (ryanofsky, fjahr) and positive comments from other reviewers (mzumsande, sjors, furszy)
  18. Pttn referenced this in commit 037f05580a on Apr 13, 2024
  19. luke-jr commented at 6:50 pm on April 21, 2024: member

    It seems like the race condition between m_synced = true and committing has been around since 4368384 from #14121, and the race condition accessing the m_next_filter_pos variable existed then too.

    Until #28955, cs_main was held across all this to prevent a race.

    Not going to look at it right now, but I suspect this fix just creates a different race instead.

  20. luke-jr commented at 7:49 pm on April 21, 2024: member
    Looks like furszy already got this in #29867
  21. nanlour commented at 11:49 am on April 26, 2024: contributor

    Not going to look at it right now, but I suspect this fix just creates a different race instead. Looks like furszy already got this in #29867

    I think the race condition fixed in #29867 was not introduced by my change; it was created by #28955, although my change makes the race condition window bigger.

  22. ryanofsky commented at 12:29 pm on April 26, 2024: contributor

    Until #28955, cs_main was held across all this to prevent a race.

    The race existed before #28955 because even before #28955, only the BaseIndex::Sync thread was locking cs_main. The BaseIndex::BlockConnected notification thread was never locking cs_main. So setting m_synced = true; before calling Commit(); was always unsafe, even with cs_main held because the BlockConnected thread could run in parallel and both threads could try to update index state at the same time. #28955 would make the race more likely to happen though, which was probably why it was caught more recently.

  23. fanquake referenced this in commit 0fcceefe22 on Apr 26, 2024
  24. fanquake commented at 1:07 pm on April 26, 2024: member
    Backported for 27.x in #29888.

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-07-03 10:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me