Add cs_main annotation to WriteBatchSync(), drop lock in CDiskBlockIndex #24199

pull jonatack wants to merge 2 commits into bitcoin:master from jonatack:remove-cs_main-lock-in-CDiskBlockIndex changing 3 files +9 −3
  1. jonatack commented at 0:10 am on January 29, 2022: contributor

    Addresses review feedback by Marco Falke in #22932 (review):

    1. add cs_main thread safety annotation to CBlockTreeDB::WriteBatchSync(), which allows us to

    2. remove the cs_main lock in CDiskBlockIndex::SERIALIZE_METHODS with respect to its 2 callers in the codebase, CBlockTreeDB::LoadBlockIndexGuts() and CBlockTreeDB::WriteBatchSync(), that already hold the lock and whose callers hold the lock.

    I didn’t manage to remove the thread safety warnings generated by the fuzz tests (suggestions welcome) and resolved them here with a NO_THREAD_SAFETY_ANALYSIS annotation and explanatory documentation.

  2. Add cs_main thread safety annotation to CBlockTreeDB::WriteBatchSync()
    This allows us to remove the cs_main lock in CDiskBlockIndex::SERIALIZE_METHODS
    with the remaining thread safety warnings generated only by the tests.
    9b5f90127d
  3. Drop lock and thread safety analysis in CDiskBlockIndex serialization 0e052eb04f
  4. DrahtBot added the label UTXO Db and Indexes on Jan 29, 2022
  5. DrahtBot added the label Validation on Jan 29, 2022
  6. luke-jr commented at 3:58 am on January 31, 2022: member
    Concept NACK. Prefer recursive locks over fragile assumptions about the rest of the codebase.
  7. maflcko commented at 7:54 am on January 31, 2022: member

    Concept NACK. Prefer recursive locks over fragile assumptions about the rest of the codebase.

    My point was that the lock isn’t needed to be recursively taken during serialization since the data is copied before serialization.

  8. luke-jr commented at 8:16 am on January 31, 2022: member
    Specifically, I don’t think we should need NO_THREAD_SAFETY_ANALYSIS - surely there’s a better way?
  9. jonatack commented at 8:20 am on January 31, 2022: contributor
    I agree but didn’t see how to update the fuzz tests to do it – happy to update with suggestions.
  10. vasild commented at 2:49 pm on March 7, 2022: contributor

    To me it looks like that it is not just the tests that would be problematic:

    1. This CDBBatch::Write() call would internally invoke the CDiskBlockIndex serialization method: https://github.com/bitcoin/bitcoin/blob/5e49b2a2529adc737175ebdcbf312635c52b05f6/src/txdb.cpp#L280

    2. This CDBIterator::GetValue() call would internally invoke the CDiskBlockIndex deserialization method: https://github.com/bitcoin/bitcoin/blob/5e49b2a2529adc737175ebdcbf312635c52b05f6/src/txdb.cpp#L309

    Both Write() and GetValue() are too generic and we can’t tag them as requiring cs_main or acquire cs_main from within them.

    So, we have some methods in the (non-test) code that invoke CDiskBlockIndex ser/deser and can’t acquire cs_main or require their callers to acquire it.

  11. in src/chain.h:388 in 0e052eb04f
    384+    // Thread safety analysis is disabled because (a) CDiskBlockIndex creates a
    385+    // copy of the data to serialize a struct that is subsequently discarded,
    386+    // and (b) if the analysis is enabled the nStatus compiler warnings are only
    387+    // generated by the tests, as the callers in the codebase that invoke this
    388+    // serialization hold cs_main and have lock annotations and assertions.
    389+    SERIALIZE_METHODS(CDiskBlockIndex, obj) NO_THREAD_SAFETY_ANALYSIS
    


    ryanofsky commented at 6:16 pm on March 8, 2022:

    In commit “Drop lock and thread safety analysis in CDiskBlockIndex serialization” (0e052eb04fa07fc4915c11faa8866ab759365dc4)

    It’d be nice to do this without disabling thread safety checks. Though I can see it would require more work, and maybe better to do separately or as a followup. (IMO ideally we would have a linter to make sure every NO_THREAD_SAFETY_ANALYSIS usage was accompanied by a link to a github issue where we could discuss the plan to reenable thread safety checks.)

    In this case it looks like most obvious way of removing NO_THREAD_SAFETY_ANALYSIS would be to replace public guarded variables with private unguarded variables, changing:

    0public:
    1    int nFile GUARDED_BY(::cs_main){0};
    2    unsigned int nDataPos GUARDED_BY(::cs_main){0};
    3    unsigned int nUndoPos GUARDED_BY(::cs_main){0};
    4    uint32_t nStatus GUARDED_BY(::cs_main){0};
    

    to:

    0private:
    1    int m_file_num = 0;
    2    unsigned int m_data_pos = 0;
    3    unsigned int m_undo_pos = 0;
    4    uint32_t m_status = 0;
    

    And add EXCLUSIVE_LOCKS_REQUIRED accessor methods as needed to ensure variables are always accessed with the right locks held.


    vasild commented at 7:54 am on March 9, 2022:

    Hmm, how would that be different than the current code?

    current: LOCK(cs_main); read nStatus proposed: LOCK(cs_main); call GetNStatus()

    and the ser/deser methods would still need to acquire cs_main, no?

    The access to all nStatus/nDataPos/nUndoPos from within ser/deser has to happen in one atomic block. I mean that this is not ok:

    0lock
    1read nStatus
    2unlock
    3lock
    4read nDataPos
    5unlock
    

    it has to be

    0lock
    1read nStatus
    2read nDataPos
    3unlock
    

    maflcko commented at 8:20 am on March 9, 2022:
    All CDiskBlockIndex are short-lived copied data on the stack, shared with no other thread, so they don’t need any locks. If there is a missing lock then it can at most be missed while copying the data.

    vasild commented at 9:52 am on March 9, 2022:

    @ryanofsky, now I see your point - to change the members of the parent class CBlockIndex to be not-tagged as guarded by cs_main and change them to private, in the hopes that the methods that touch them respect the implicit guarding. Then, access them without guarding from the child class CDiskBlockIndex. I do not like this much because the members in CBlockIndex need to be guarded, but will be without a GUARDED_BY() tag. Hmm… @MarcoFalke, I did not realize this before, thanks for pointing it out!

    If there is a missing lock then it can at most be missed while copying the data

    there is not - both users of CDiskBlockIndex own cs_main when creating such objects.


    maflcko commented at 9:55 am on March 9, 2022:

    @MarcoFalke, I did not realize this before, thanks for pointing it out!

    I pointed it only out thrice up to now :sweat_smile: . #24199 (comment) and #22932 (review)


    vasild commented at 10:57 am on March 9, 2022:
    yeah, I needed a few days to grasp that :exploding_head:

    ryanofsky commented at 1:35 pm on March 9, 2022:

    I do not like this much because the members in CBlockIndex need to be guarded, but will be without a GUARDED_BY() tag.

    I should probably explain reasoning behind what I suggested yesterday. I don’t think this is a question what anybody likes or doesn’t like, but a question of what’s right and wrong. It’s wrong to annotate these struct members as being guarded by cs_main, because they aren’t guarded by it. The whole struct is not even guarded by cs_main, just certain instances of it are. What is guarded by cs_main is the m_block_index map and pointers & references to CBlockIndex instances specifically inside that map.

    The correct way to annotate this code would be not to annotate anything in CBlockIndex (Ideally I would rename it to struct BlockInfo and make it plain struct with no methods), but to not expose this low-level struct to high level application code. Higher level code would use a BlockReference accessor class that has a BlockInfo& member and methods appropriately annotated with EXCLUSIVE_LOCKS_REQUIRED(cs_main) to enforce the access pattern which enables our lock usage. CBlockIndex* pointers throughout the code would be replaced by BlockReference objects. In my ideal world, this would look something like:

     0struct BlockInfo {
     1   uint256 hash;
     2   int height = 0;
     3   BlockInfo* prev = nullptr;
     4   /// ... more members here ... ///
     5};
     6
     7struct BlockState : public BlockInfo {
     8  int flags = 0;
     9  int file_num = -1;
    10  unsigned data_pos = 0;
    11  unsigned undo_pos = 0;
    12};
    13
    14class BlockReference {
    15private:
    16   BlockState& m_block;
    17public:
    18   BlockInfo& info() const { return m_block; }
    19   BlockState& state() EXCLUSIVE_LOCKS_REQUIRED(cs_main) { return m_block; }
    20   BlockReference prev() const { return {static_cast<BlockState&>(*Assert(m_block.prev))}; }
    21   // Could add more fine-grained accessors like height() or flags() or whatever.
    22};
    23
    24class BlockManager {
    25private:
    26  std::unordered_set<BlockState> m_blocks GUARDED_BY(cs_main);
    27public:
    28  BlockReference FindBlock(const uint256& hash) EXCLUSIVE_LOCKS_REQUIRED(cs_main);
    29   /// ... more stuff ... ///
    30}
    

    The appoach I suggested yesterday does the same thing but keeps everything mushed into the same CBlockIndex class/struct hybrid instead of splitting it out. It would be pretty easy to implement what I suggested yesterday as a followup to this PR to re-enable thread safety checking. In any case I was not suggesting doing something bigger like that in this PR.


    vasild commented at 9:36 am on March 10, 2022:

    to use another wording (not like/dislike):

    It is a design smell to have a variable without GUARDED_BY() and rely on EXCLUSIVE_LOCKS_REQUIRED() on some methods to protect it. This way, if the code goes wrong (like, new access is added to the variable from a function not marked with EXCLUSIVE_LOCKS_REQUIRED()) then the compiler has no way to detect that. This defeats the purpose of thread safety annotations which is to enforce proper access and detect mistakes if the code goes wrong.

    Your suggestion above does that:

    replace public guarded variables with private unguarded variables … And add EXCLUSIVE_LOCKS_REQUIRED accessor methods as needed to ensure variables are always accessed with the right locks held.


    jonatack commented at 9:44 am on March 10, 2022:
    Thanks for the review feedback! Given the number of pulls that are currently touching this area of the codebase, it seems good to keep this change minimal and perhaps propose these ideas as a follow-up (happy to review them).

    ryanofsky commented at 12:33 pm on March 10, 2022:

    It is a design smell to have a variable without GUARDED_BY() and rely on EXCLUSIVE_LOCKS_REQUIRED() on some methods to protect it. This way, if the code goes wrong (like, new access is added to the variable from a function not marked with EXCLUSIVE_LOCKS_REQUIRED()) then the compiler has no way to detect that. This defeats the purpose of thread safety annotations which is to enforce proper access and detect mistakes if the code goes wrong.

    These are mostly incorrect claims made without specific reasoning or justification. If you have an inner variable that does not require lucking, and an outer data structure containing the variable, or an outer pointer pointing to the variable, or an outer function accessing the variable, and the outer container/pointer/function does require locking, the correct way of annotating the code is to not annotate the variable, and to annotate the outer container/pointer/function.

    If you look at the actual code I posted in #24199 (review), you can see that the compiler will detect all unsafe accesses. If you want to talk about a code smell, NO_THREAD_SAFETY_ANALYSIS is the code smell, and the design in that post shows how to get rid of it.

    That said, there is a difference between the bigger change I posted yesterday #24199 (review), and the smaller suggestion #24199 (review) I posted two days ago. In both cases the compiler can guarantee thread safety outside of chain.h/chain.cpp if these files are annotated correctly. But in the bigger change where I split up CBlockIndex class into separate BlockInfo / BlockState / BlockReference components, it’s easy to annotate everything correctly, while in the smaller change where CBlockIndex is kept as a monolithic class, what you are saying about it being trickier to see notice missing EXCLUSIVE_LOCKS_REQUIRED annotations on CBlockIndex methods is true.

    If you want to argue that using incorrect annotations is better than using correct ones for practical reasons (e.g. it is better to use NO_THREAD_SAFETY_ANALYSIS in one method, so the rest of the file is easier to maintain), that’s fine. But let’s not create confusion about what correct annotations are and what incorrect annotations are. Correct annotations do not require using NO_THREAD_SAFETY_ANALYSIS, and that’s just the most obvious clue.


    vasild commented at 11:58 am on March 16, 2022:
    Wrt NO_THREAD_SAFETY_ANALYSIS I tend to agree with @luke-jr: #24199 (comment). This is why I did not ACK this PR, which contains NO_THREAD_SAFETY_ANALYSIS. Personally, I do not see a good and elegant solution to the problem this PR aims to fix (yet).
  12. ryanofsky approved
  13. ryanofsky commented at 6:43 pm on March 8, 2022: contributor

    Code review ACK 0e052eb04fa07fc4915c11faa8866ab759365dc4

    re: #24199 (comment)

    To me it looks like that it is not just the tests that would be problematic […]

    This is all true, CDiskBlockIndex flow is a little convoluted. I think following my suggestion below to make relevant CBlockIndex member variables private & nonguarded, you could keep the CDiskBlockIndex class working by declaring it as a friend of CBlockIndex, or by changing it to use SER_READ / SER_WRITE macros with specialized Set/Get accessor methods that don’t require cs_main.

  14. DrahtBot commented at 11:26 pm on August 17, 2022: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK ryanofsky
    Concept NACK luke-jr

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #25862 (refactor, kernel: Remove gArgs accesses from dbwrapper and txdb by ryanofsky)
    • #19463 (Prune locks by luke-jr)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  15. achow101 commented at 7:01 pm on October 12, 2022: member
    Are you still working on this?
  16. achow101 commented at 5:13 pm on November 10, 2022: member
    Closing this as it has not had any activity in a while. If you are interested in continuing work on this, please leave a comment so that it can be reopened.
  17. achow101 closed this on Nov 10, 2022

  18. jonatack commented at 5:19 pm on November 10, 2022: contributor

    Are you still working on this?

    Yes, sorry not to have seen this comment and replied. There is valuable review feedback here so it may be helpful to re-open here rather than open a new pull (mentioning as I do not have the privileges to re-open it.)

  19. maflcko reopened this on Nov 10, 2022

  20. fanquake marked this as a draft on Feb 6, 2023
  21. fanquake commented at 3:02 pm on February 6, 2023: member
    Moved this to draft. It’s not clear what the status of this is. There is clearly disagreement about the approach amongst contributors. Was re-opened to address feedback, but no activity in the 4 months since.
  22. DrahtBot commented at 11:35 pm on February 17, 2023: contributor

    🐙 This pull request conflicts with the target branch and needs rebase.

  23. DrahtBot added the label Needs rebase on Feb 17, 2023
  24. jonatack commented at 11:09 am on April 25, 2023: contributor
    Closing temporarily so that I can re-open it – am still interested to work on this.
  25. jonatack closed this on Apr 25, 2023

  26. bitcoin locked this on Apr 24, 2024

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-07-01 10:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me