validation: don’t re-acquire cs_main during IBD in CChainState::IsInitialBlockDownload() #24220

pull jonatack wants to merge 3 commits into bitcoin:master from jonatack:remove-cs_main-lock-from-IsInitialBlockDownload changing 6 files +38 −25
  1. jonatack commented at 5:59 pm on January 31, 2022: contributor
    No longer re-acquire lock cs_main during IBD in CChainState::IsInitialBlockDownload(), as most of its callers already hold the lock. Have these callers invoke non-locking function CChainState::IsIBD() instead. Apply thread safety analysis to the IBD functions. Retains the post-IBD optimization latch for the callers that do not hold the lock. Makes explicit which callers hold cs_main, and which do not.
  2. jonatack commented at 6:03 pm on January 31, 2022: contributor
    Noticed while reviewing #24178.
  3. DrahtBot added the label P2P on Jan 31, 2022
  4. DrahtBot added the label Validation on Jan 31, 2022
  5. sdaftuar commented at 6:59 pm on January 31, 2022: member

    I recall that the reason IsIBD was written like this:

    0bool CChainState::IsInitialBlockDownload() const
    1{
    2    // Optimization: pre-test latch before taking the lock.
    3    if (m_cached_finished_ibd.load(std::memory_order_relaxed))
    4        return false;
    5
    6    LOCK(cs_main);
    

    is so that in the common case – which is that we’ve already latched to false – that we can respond without ever acquiring cs_main inside this function. See #8007

    Maybe there’s a better refactor that could be done so that we can reduce locking contention on cs_main further, but this patch would have all the call sites grab the lock in order to even call IsIBD(), which would add more cs_main lock acquisitions in general – and that seems counter to where we want to end up?

    Is the motivation for this patch to reduce the uses of cs_main as a recursive mutex, and thereby get to a better design? If so that sounds like a good direction, but I think it’d be better if we avoided adding extra lock acquisitions (and holding cs_main longer) in order to get there. For instance it seems like with this patch some of the validation callbacks will now be invoked while holding cs_main the whole time? That seems undesirable.

  6. jonatack commented at 7:09 pm on January 31, 2022: contributor

    Yes, the trade-off this pull makes EDIT: is no longer made, thanks to review feedback. foregoing the optimization for a half dozen callers that don’t already hold cs_main, in favor of avoiding re-taking the cs_main lock for the 18 callers that do already hold it, by moving the lock-taking to the half dozen callers (but just for the call itself, so essentially only the optimization is affected unless I am missing something).

    Thanks for the link to #8007. It looks like the proportion of callers that hold the lock has since increased.

  7. in src/validation.cpp:3117 in 595f34d709 outdated
    3116-    // Only notify about a new block tip if the active chain was modified.
    3117-    if (pindex_was_in_chain) {
    3118-        uiInterface.NotifyBlockTip(GetSynchronizationState(IsInitialBlockDownload()), to_mark_failed->pprev);
    3119+        // Only notify about a new block tip if the active chain was modified.
    3120+        if (pindex_was_in_chain) {
    3121+            uiInterface.NotifyBlockTip(GetSynchronizationState(IsInitialBlockDownload()), to_mark_failed->pprev);
    


    sdaftuar commented at 7:15 pm on January 31, 2022:
    I believe this change will cause the NotifyBlockTip() callbacks to be invoked while holding cs_main the whole time.

    jonatack commented at 7:35 pm on January 31, 2022:
    Thanks! Narrowed the lock to the IBD call: uiInterface.NotifyBlockTip(GetSynchronizationState(WITH_LOCK(::cs_main, return IsInitialBlockDownload())), to_mark_failed->pprev);
  8. jonatack force-pushed on Jan 31, 2022
  9. jamesob commented at 7:47 pm on January 31, 2022: member

    For instance it seems like with this patch some of the validation callbacks will now be invoked while holding cs_main the whole time? That seems undesirable.

    Agree with @sdaftuar; IsIBD() is called in a few performance-critical places and at the very least this change should be benchmarked before merge. I’m Concept NACK if this change is essentially just for refactoring’s sake; if there’s some overarching benefit it would be nice to hear that.

  10. jonatack commented at 7:55 pm on January 31, 2022: contributor
    @jamesob I agree; potentially improving performance by not needlessly retaking the cs_main lock is the only point here.
  11. jonatack renamed this:
    validation, refactor: remove cs_main lock from CChainState::IsInitialBlockDownload()
    validation: remove cs_main lock from CChainState::IsInitialBlockDownload()
    on Jan 31, 2022
  12. DrahtBot commented at 8:37 pm on January 31, 2022: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #24178 (p2p: Respond to getheaders if we have sufficient chainwork by sdaftuar)
    • #24171 (p2p: Sync chain more readily from inbound peers during IBD by sdaftuar)
    • #24008 (assumeutxo: net_processing changes by jamesob)
    • #22564 (refactor: Move mutable globals cleared in ::UnloadBlockIndex to BlockManager by dongcarl)
    • #21160 (net/net processing: Move tx inventory into net_processing by jnewbery)
    • #20827 (During IBD, prune as much as possible until we get close to where we will eventually keep blocks by luke-jr)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  13. sdaftuar commented at 10:12 pm on January 31, 2022: member

    potentially improving performance by not needlessly retaking the cs_main lock is the only point here

    AFAIU lock acquisition is essentially free if it’s already held, so I would be surprised if there’s a performance gain from avoiding re-acquisition.

  14. jonatack commented at 11:05 pm on January 31, 2022: contributor
    Looking ahead at the assumeutxo work in the pipeline, IsIBD() will see further callers already holding cs_main. If the assumption holds that a thread safety annotation is preferable to retaking a lock, we might have our cake and eat it too: the majority can call the core IsIBD() with an annotation, and the seven callers sans lock can go through a wrapper with optimization and lock.
  15. in src/net_processing.cpp:4615 in 5159d2cb1d outdated
    4608@@ -4608,11 +4609,11 @@ bool PeerManagerImpl::SendMessages(CNode* pto)
    4609     // MaybeSendPing may have marked peer for disconnection
    4610     if (pto->fDisconnect) return true;
    4611 
    4612-    MaybeSendAddr(*pto, *peer, current_time);
    4613-
    4614     {
    4615         LOCK(cs_main);
    4616 
    4617+        MaybeSendAddr(*pto, *peer, current_time);
    


    MarcoFalke commented at 8:54 am on February 1, 2022:
    Slightly tend toward NACK. We shouldn’t be broadening the scope of cs_main, but reduce it. Seems wrong to use a validation lock for sending p2p addresses on the network.

    jonatack commented at 9:30 pm on February 1, 2022:
    Agree. Addressed in latest push.
  16. MarcoFalke dismissed
  17. MarcoFalke commented at 8:56 am on February 1, 2022: member
    This may also have a silent merge conflict with #22053?
  18. jonatack force-pushed on Feb 1, 2022
  19. jonatack commented at 9:36 pm on February 1, 2022: contributor

    Pushed a different approach along the lines of #24220 (comment). It does not increase the scope of cs_main and maintains the optimization, while avoiding reacquiring the mutex during IBD for the majority of callers that already hold it.

    This may also have a silent merge conflict with #22053?

    With the latest push, if MaybeSendAddr no longer holds cs_main it should call IsIBD, otherwise clang provides a thread safety warning.

  20. jonatack renamed this:
    validation: remove cs_main lock from CChainState::IsInitialBlockDownload()
    validation: don't re-acquire cs_main during IBD in CChainState::IsInitialBlockDownload()
    on Feb 1, 2022
  21. jonatack renamed this:
    validation: don't re-acquire cs_main during IBD in CChainState::IsInitialBlockDownload()
    validation: don't re-acquire cs_main after IBD in CChainState::IsInitialBlockDownload()
    on Feb 1, 2022
  22. jonatack renamed this:
    validation: don't re-acquire cs_main after IBD in CChainState::IsInitialBlockDownload()
    validation: don't re-acquire cs_main during IBD in CChainState::IsInitialBlockDownload()
    on Feb 1, 2022
  23. jonatack force-pushed on Feb 4, 2022
  24. MarcoFalke commented at 3:24 pm on February 4, 2022: member

    to reduce the use of cs_main as a recursive mutex.

    I think this is a goal that is almost impossible to achieve with a mutex visible in the global scope/namespace. And if someone did it, it would be brittle and easily lead to UB.

    See also how for a far simpler and private mutex (m_chainstate_mutex), it took two attempts to get right: #24235.

    Personally, I think a better way would be to no longer hold cs_main for non-validation uses and then make cs_main private to chainman.

  25. jonatack commented at 4:00 pm on February 4, 2022: contributor
    Yes. This change makes explicit which callers hold cs_main and which do not, now and going forward. Edit: updated the pull description.
  26. in src/validation.h:679 in a22c867c47 outdated
    675@@ -676,7 +676,13 @@ class CChainState
    676     void UnloadBlockIndex() EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
    677 
    678     /** Check whether we are doing an initial block download (synchronizing from disk or network) */
    679-    bool IsInitialBlockDownload() const;
    680+    bool IsInitialBlockDownload() const EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
    


    luke-jr commented at 9:12 pm on February 5, 2022:
    Would prefer to make IsIBD the new semantics

    jonatack commented at 12:20 pm on February 7, 2022:
    Done
  27. validation: create non-locking IsIBD() wrapped by IsInitialBlockDownload() d780578d17
  28. validation: don't re-acquire cs_main in IsInitialBlockDownload() during IBD
    as most of its callers already hold the cs_main lock;
    have these callers invoke non-locking IsIBD() instead.
    
    Retains the post-IBD optimization latch for the callers that
    do not hold the lock.
    
    Makes explicit which callers hold cs_main and which do not.
    8e644157b3
  29. validation: apply thread safety analysis to IBD boolean functions
    Ensures that callers now and going forward invoke the correct function.
    601f90ec46
  30. in src/validation.h:685 in a22c867c47 outdated
    681+
    682+    /**
    683+     * Same as `IsInitialBlockDownload()`, but also locks mutex cs_main during
    684+     * initial block download. Use for callers not already holding the lock.
    685+     */
    686+    bool IsIBD() const;
    


    PastaPastaPasta commented at 9:30 am on February 7, 2022:
    should we add a LOCKS_EXCLUDED to prevent accidentally calling this when we know we have cs_main locked?

    jonatack commented at 12:23 pm on February 7, 2022:
    You read my mind. Added a LOCKS_EXCLUDED thread safety annotation to the declaration along with an AssertLockNotHeld run-time assertion in the definition.
  31. jonatack force-pushed on Feb 7, 2022
  32. jonatack commented at 12:48 pm on February 7, 2022: contributor
    Updated with @luke-jr and @PastaPastaPasta feedback (thanks!)
  33. DrahtBot commented at 5:16 pm on March 25, 2022: contributor

    🐙 This pull request conflicts with the target branch and needs rebase.

    Want to unsubscribe from rebase notifications on this pull request? Just convert this pull request to a “draft”.

  34. DrahtBot added the label Needs rebase on Mar 25, 2022
  35. DrahtBot commented at 7:03 am on July 25, 2022: contributor
    • Is it still relevant? ➡️ Please solve the conflicts to make it ready for review and to ensure the CI passes.
    • Is it no longer relevant? ➡️ Please close.
    • Did the author lose interest or time to work on this? ➡️ Please close it and mark it ‘Up for grabs’ with the label, so that it can be picked up in the future.
  36. MarcoFalke commented at 7:07 am on July 25, 2022: member
    Closing for now due to inactivity. Let us know if this should be reopened.
  37. MarcoFalke closed this on Jul 25, 2022

  38. bitcoin locked this on Jul 25, 2023

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-07-01 10:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me