I think this problem #29767#issue-2216373048 is because of in BaseIndex::Sync https://github.com/bitcoin/bitcoin/blob/61de64df6790077857faba84796bb874b59c5d15/src/index/base.cpp#L163-L168 Setup m_synced = true; before Commit(); So this may cause a race condition window to BaseIndex::BlockConnected https://github.com/bitcoin/bitcoin/blob/61de64df6790077857faba84796bb874b59c5d15/src/index/base.cpp#L271-L274 So i try to fix it with move m_synced = true after Commit(). Also see comment of Sync(): https://github.com/bitcoin/bitcoin/blob/61de64df6790077857faba84796bb874b59c5d15/src/index/base.h#L151-L156 I am a newcomer interested in Bitcoin, trying to become a member of the Bitcoin Core development team. Please give me some feedback if you could, as I may be doing something wrong. Thank you!
ThreadSanitizer: Fix #29767 #29776
pull nanlour wants to merge 1 commits into bitcoin:master from nanlour:master changing 1 files +1 −1-
nanlour commented at 3:24 AM on April 1, 2024: contributor
-
Fix #29767, set m_synced = true after Commit() bbe82c116e
-
DrahtBot commented at 3:24 AM on April 1, 2024: contributor
<!--e57a25ab6845829454e8d69fc972939a-->
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
<!--006a51241073e994b41acfe9ec718e94-->
Code Coverage
For detailed information about the code coverage, see the test coverage report.
<!--021abf342d371248e50ceaed478a90ca-->
Reviews
See the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.
- fanquake renamed this:
ThreadSanitizer: Fix #29767, ThreadSanitizer: data race src/flatfile.cpp:47:13 in FlatFileSeq::Open(FlatFilePos const&, bool)
ThreadSanitizer: Fix #29767
on Apr 1, 2024 -
furszy commented at 4:45 PM on April 1, 2024: member
Setup m_synced = true; before Commit(); So this may cause a race condition window to BaseIndex::BlockConnected
If that's the case, sleeping the thread after setting
m_syncedand beforeCommit()should trigger the failure. Have you tried it? -
mzumsande commented at 9:45 PM on April 1, 2024: contributor
Concept ACK
Seems like this is not just a theoretical ThreadSanitizer error but a bug that could lead to index corruption in the case of unfortunate timing between the threads running
Sync()andBlockConnected(). - nanlour force-pushed on Apr 2, 2024
-
nanlour commented at 3:37 PM on April 2, 2024: contributor
If that's the case, sleeping the thread after setting
m_syncedand beforeCommit()should trigger the failure. Have you tried it?Yes, I have tried to recreate this bug with the original
feature_assumeutxotest from #29767, but simply sleeping between settingm_syncedandCommit()cannot trigger it. I've attempted to recreate the data race by only pause Sync thread after it setm_syncedand before it callFlatFileSeq::Open, this trigger data race when runfeature_assumeutxotest with a probability of around 10%. This is my way to recreate it: 5931fb8f0966f435e9b975ee1c4c4357e88e9e56. I am pretty sure there will be a better solution, but I can only come up with this one for now. - ryanofsky approved
-
ryanofsky commented at 4:16 PM on April 2, 2024: contributor
Code review ACK bbe82c116e72ca0638751e063bf564cd1fe5c4d5
It seems like the race condition between
m_synced = trueand committing has been around since 4368384f1d267b011e03a805f934f5c47e2ca1b2 from #14121, and the race condition accessing them_next_filter_posvariable existed then too. - DrahtBot requested review from mzumsande on Apr 2, 2024
- DrahtBot added the label CI failed on Apr 2, 2024
-
fjahr commented at 10:40 PM on April 3, 2024: contributor
Code review ACK bbe82c116e72ca0638751e063bf564cd1fe5c4d5
This looks like the correct fix to me. CI failure is unrelated.
- ryanofsky merged this on Apr 4, 2024
- ryanofsky closed this on Apr 4, 2024
-
ryanofsky commented at 1:30 PM on April 4, 2024: contributor
Merged this since it is a small straightfoward fix, only affects indexing, has 2 acks (ryanofsky, fjahr) and positive comments from other reviewers (mzumsande, sjors, furszy)
- Pttn referenced this in commit 037f05580a on Apr 13, 2024
-
luke-jr commented at 6:50 PM on April 21, 2024: member
It seems like the race condition between m_synced = true and committing has been around since 4368384 from #14121, and the race condition accessing the m_next_filter_pos variable existed then too.
Until #28955,
cs_mainwas held across all this to prevent a race.Not going to look at it right now, but I suspect this fix just creates a different race instead.
-
nanlour commented at 11:49 AM on April 26, 2024: contributor
Not going to look at it right now, but I suspect this fix just creates a different race instead. Looks like furszy already got this in #29867
I think the race condition fixed in #29867 was not introduced by my change; it was created by #28955, although my change makes the race condition window bigger.
-
ryanofsky commented at 12:29 PM on April 26, 2024: contributor
Until #28955,
cs_mainwas held across all this to prevent a race.The race existed before #28955 because even before #28955, only the
BaseIndex::Syncthread was lockingcs_main. TheBaseIndex::BlockConnectednotification thread was never lockingcs_main. So settingm_synced = true;before callingCommit();was always unsafe, even withcs_mainheld because theBlockConnectedthread could run in parallel and both threads could try to update index state at the same time. #28955 would make the race more likely to happen though, which was probably why it was caught more recently. - fanquake referenced this in commit 0fcceefe22 on Apr 26, 2024
- fanquake referenced this in commit c7885ecd77 on May 13, 2024
- glozow referenced this in commit bb46b90b2e on May 23, 2024
- glozow referenced this in commit 6d7a1e3670 on May 24, 2024
- tomt1664 referenced this in commit 5e2be91127 on Apr 1, 2025
- tomt1664 referenced this in commit c1603c82e4 on Apr 21, 2025
- bitcoin locked this on Apr 26, 2025