No longer re-acquire lock cs_main during IBD in CChainState::IsInitialBlockDownload(), as most of its callers already hold the lock. Have these callers invoke non-locking function CChainState::IsIBD() instead. Apply thread safety analysis to the IBD functions. Retains the post-IBD optimization latch for the callers that do not hold the lock. Makes explicit which callers hold cs_main, and which do not.
validation: don't re-acquire cs_main during IBD in CChainState::IsInitialBlockDownload() #24220
pull jonatack wants to merge 3 commits into bitcoin:master from jonatack:remove-cs_main-lock-from-IsInitialBlockDownload changing 6 files +38 −25-
jonatack commented at 5:59 PM on January 31, 2022: contributor
- DrahtBot added the label P2P on Jan 31, 2022
- DrahtBot added the label Validation on Jan 31, 2022
-
sdaftuar commented at 6:59 PM on January 31, 2022: member
I recall that the reason IsIBD was written like this:
bool CChainState::IsInitialBlockDownload() const { // Optimization: pre-test latch before taking the lock. if (m_cached_finished_ibd.load(std::memory_order_relaxed)) return false; LOCK(cs_main);is so that in the common case -- which is that we've already latched to false -- that we can respond without ever acquiring cs_main inside this function. See #8007
Maybe there's a better refactor that could be done so that we can reduce locking contention on cs_main further, but this patch would have all the call sites grab the lock in order to even call IsIBD(), which would add more cs_main lock acquisitions in general -- and that seems counter to where we want to end up?
Is the motivation for this patch to reduce the uses of cs_main as a recursive mutex, and thereby get to a better design? If so that sounds like a good direction, but I think it'd be better if we avoided adding extra lock acquisitions (and holding cs_main longer) in order to get there. For instance it seems like with this patch some of the validation callbacks will now be invoked while holding cs_main the whole time? That seems undesirable.
-
jonatack commented at 7:09 PM on January 31, 2022: contributor
Yes, the trade-off this pull makes EDIT: is no longer made, thanks to review feedback. <strike>foregoing the optimization for a half dozen callers that don't already hold cs_main, in favor of avoiding re-taking the cs_main lock for the 18 callers that do already hold it, by moving the lock-taking to the half dozen callers (but just for the call itself, so essentially only the optimization is affected unless I am missing something).</strike>
Thanks for the link to #8007. It looks like the proportion of callers that hold the lock has since increased.
-
in src/validation.cpp:3117 in 595f34d709 outdated
3116 | - // Only notify about a new block tip if the active chain was modified. 3117 | - if (pindex_was_in_chain) { 3118 | - uiInterface.NotifyBlockTip(GetSynchronizationState(IsInitialBlockDownload()), to_mark_failed->pprev); 3119 | + // Only notify about a new block tip if the active chain was modified. 3120 | + if (pindex_was_in_chain) { 3121 | + uiInterface.NotifyBlockTip(GetSynchronizationState(IsInitialBlockDownload()), to_mark_failed->pprev);
sdaftuar commented at 7:15 PM on January 31, 2022:I believe this change will cause the NotifyBlockTip() callbacks to be invoked while holding cs_main the whole time.
jonatack commented at 7:35 PM on January 31, 2022:Thanks! Narrowed the lock to the IBD call:
uiInterface.NotifyBlockTip(GetSynchronizationState(WITH_LOCK(::cs_main, return IsInitialBlockDownload())), to_mark_failed->pprev);jonatack force-pushed on Jan 31, 2022jamesob commented at 7:47 PM on January 31, 2022: memberFor instance it seems like with this patch some of the validation callbacks will now be invoked while holding cs_main the whole time? That seems undesirable.
Agree with @sdaftuar; IsIBD() is called in a few performance-critical places and at the very least this change should be benchmarked before merge. I'm Concept NACK if this change is essentially just for refactoring's sake; if there's some overarching benefit it would be nice to hear that.
jonatack renamed this:validation, refactor: remove cs_main lock from CChainState::IsInitialBlockDownload()
validation: remove cs_main lock from CChainState::IsInitialBlockDownload()
on Jan 31, 2022DrahtBot commented at 8:37 PM on January 31, 2022: contributor<!--e57a25ab6845829454e8d69fc972939a-->
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
<!--174a7506f384e20aa4161008e828411d-->
Conflicts
Reviewers, this pull request conflicts with the following ones:
- #24178 (p2p: Respond to getheaders if we have sufficient chainwork by sdaftuar)
- #24171 (p2p: Sync chain more readily from inbound peers during IBD by sdaftuar)
- #24008 (assumeutxo: net_processing changes by jamesob)
- #22564 (refactor: Move mutable globals cleared in
::UnloadBlockIndextoBlockManagerby dongcarl) - #21160 (net/net processing: Move tx inventory into net_processing by jnewbery)
- #20827 (During IBD, prune as much as possible until we get close to where we will eventually keep blocks by luke-jr)
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
sdaftuar commented at 10:12 PM on January 31, 2022: memberpotentially improving performance by not needlessly retaking the cs_main lock is the only point here
AFAIU lock acquisition is essentially free if it's already held, so I would be surprised if there's a performance gain from avoiding re-acquisition.
jonatack commented at 11:05 PM on January 31, 2022: contributorLooking ahead at the assumeutxo work in the pipeline, IsIBD() will see further callers already holding cs_main. If the assumption holds that a thread safety annotation is preferable to retaking a lock, we might have our cake and eat it too: the majority can call the core IsIBD() with an annotation, and the seven callers sans lock can go through a wrapper with optimization and lock.
in src/net_processing.cpp:4615 in 5159d2cb1d outdated
4608 | @@ -4608,11 +4609,11 @@ bool PeerManagerImpl::SendMessages(CNode* pto) 4609 | // MaybeSendPing may have marked peer for disconnection 4610 | if (pto->fDisconnect) return true; 4611 | 4612 | - MaybeSendAddr(*pto, *peer, current_time); 4613 | - 4614 | { 4615 | LOCK(cs_main); 4616 | 4617 | + MaybeSendAddr(*pto, *peer, current_time);
MarcoFalke commented at 8:54 AM on February 1, 2022:Slightly tend toward NACK. We shouldn't be broadening the scope of cs_main, but reduce it. Seems wrong to use a validation lock for sending p2p addresses on the network.
jonatack commented at 9:30 PM on February 1, 2022:Agree. Addressed in latest push.
MarcoFalke dismissedMarcoFalke commented at 8:56 AM on February 1, 2022: memberThis may also have a silent merge conflict with #22053?
jonatack force-pushed on Feb 1, 2022jonatack commented at 9:36 PM on February 1, 2022: contributorPushed a different approach along the lines of #24220 (comment). It does not increase the scope of
cs_mainand maintains the optimization, while avoiding reacquiring the mutex during IBD for the majority of callers that already hold it.This may also have a silent merge conflict with #22053?
With the latest push, if
MaybeSendAddrno longer holds cs_main it should callIsIBD, otherwise clang provides a thread safety warning.jonatack renamed this:validation: remove cs_main lock from CChainState::IsInitialBlockDownload()
validation: don't re-acquire cs_main during IBD in CChainState::IsInitialBlockDownload()
on Feb 1, 2022jonatack renamed this:validation: don't re-acquire cs_main during IBD in CChainState::IsInitialBlockDownload()
validation: don't re-acquire cs_main after IBD in CChainState::IsInitialBlockDownload()
on Feb 1, 2022jonatack renamed this:validation: don't re-acquire cs_main after IBD in CChainState::IsInitialBlockDownload()
validation: don't re-acquire cs_main during IBD in CChainState::IsInitialBlockDownload()
on Feb 1, 2022jonatack force-pushed on Feb 4, 2022MarcoFalke commented at 3:24 PM on February 4, 2022: memberto reduce the use of cs_main as a recursive mutex.
I think this is a goal that is almost impossible to achieve with a mutex visible in the global scope/namespace. And if someone did it, it would be brittle and easily lead to UB.
See also how for a far simpler and private mutex (
m_chainstate_mutex), it took two attempts to get right: #24235.Personally, I think a better way would be to no longer hold cs_main for non-validation uses and then make cs_main private to chainman.
jonatack commented at 4:00 PM on February 4, 2022: contributorYes. This change makes explicit which callers hold
cs_mainand which do not, now and going forward. Edit: updated the pull description.in src/validation.h:679 in a22c867c47 outdated
675 | @@ -676,7 +676,13 @@ class CChainState 676 | void UnloadBlockIndex() EXCLUSIVE_LOCKS_REQUIRED(::cs_main); 677 | 678 | /** Check whether we are doing an initial block download (synchronizing from disk or network) */ 679 | - bool IsInitialBlockDownload() const; 680 | + bool IsInitialBlockDownload() const EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
luke-jr commented at 9:12 PM on February 5, 2022:Would prefer to make
IsIBDthe new semantics
jonatack commented at 12:20 PM on February 7, 2022:Done
validation: create non-locking IsIBD() wrapped by IsInitialBlockDownload() d780578d178e644157b3validation: don't re-acquire cs_main in IsInitialBlockDownload() during IBD
as most of its callers already hold the cs_main lock; have these callers invoke non-locking IsIBD() instead. Retains the post-IBD optimization latch for the callers that do not hold the lock. Makes explicit which callers hold cs_main and which do not.
601f90ec46validation: apply thread safety analysis to IBD boolean functions
Ensures that callers now and going forward invoke the correct function.
in src/validation.h:685 in a22c867c47 outdated
681 | + 682 | + /** 683 | + * Same as `IsInitialBlockDownload()`, but also locks mutex cs_main during 684 | + * initial block download. Use for callers not already holding the lock. 685 | + */ 686 | + bool IsIBD() const;
PastaPastaPasta commented at 9:30 AM on February 7, 2022:should we add a
LOCKS_EXCLUDEDto prevent accidentally calling this when we know we have cs_main locked?
jonatack commented at 12:23 PM on February 7, 2022:You read my mind. Added a
LOCKS_EXCLUDEDthread safety annotation to the declaration along with anAssertLockNotHeldrun-time assertion in the definition.jonatack force-pushed on Feb 7, 2022jonatack commented at 12:48 PM on February 7, 2022: contributorUpdated with @luke-jr and @PastaPastaPasta feedback (thanks!)
DrahtBot commented at 5:16 PM on March 25, 2022: contributor<!--cf906140f33d8803c4a75a2196329ecb-->
🐙 This pull request conflicts with the target branch and needs rebase.
<sub>Want to unsubscribe from rebase notifications on this pull request? Just convert this pull request to a "draft".</sub>
DrahtBot added the label Needs rebase on Mar 25, 2022DrahtBot commented at 7:03 AM on July 25, 2022: contributor<!--13523179cfe9479db18ec6c5d236f789-->There hasn't been much activity lately and the patch still needs rebase. What is the status here?
- Is it still relevant? ➡️ Please solve the conflicts to make it ready for review and to ensure the CI passes.
- Is it no longer relevant? ➡️ Please close.
- Did the author lose interest or time to work on this? ➡️ Please close it and mark it 'Up for grabs' with the label, so that it can be picked up in the future.
MarcoFalke commented at 7:07 AM on July 25, 2022: memberClosing for now due to inactivity. Let us know if this should be reopened.
MarcoFalke closed this on Jul 25, 2022bitcoin locked this on Jul 25, 2023
github-metadata-mirror
This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-14 21:13 UTC
This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me