Fix some race conditions in BanMan::DumpBanlist() #24168

pull hebasto wants to merge 4 commits into bitcoin:master from hebasto:220126-dump changing 1 files +13 −7
  1. hebasto commented at 3:32 pm on January 26, 2022: member

    This PR split from bitcoin/bitcoin#24097 with some additions. This makes the following switch from RecursiveMutex to Mutex a pure refactoring.

    See details in commit messages.

  2. in src/banman.cpp:44 in 2c77427fd9 outdated
    38@@ -39,6 +39,10 @@ BanMan::~BanMan()
    39 
    40 void BanMan::DumpBanlist()
    41 {
    42+    static Mutex dump_mutex;
    43+    TRY_LOCK(dump_mutex, dump_lock);
    44+    if (!dump_lock) return;
    


    MarcoFalke commented at 4:52 pm on January 26, 2022:
    why is is ok to refuse to dump when two threads call this function?

    hebasto commented at 4:59 pm on January 26, 2022:
    It is an optimization. Calling this function in the destructor guarantees that the final state will be dumped to the disk.

    MarcoFalke commented at 5:10 pm on January 26, 2022:
    Then it should be mentioned in the release notes that calling the ban rpc no longer implies the stuff is written to disk and when the node crashes, the state will be lost.

    hebasto commented at 7:42 pm on January 28, 2022:
    Not relevant since the recent update, I guess.
  3. DrahtBot commented at 11:17 pm on January 26, 2022: member

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #24097 (Replace RecursiveMutex m_cs_banned with Mutex, and rename it by hebasto)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  4. w0xlt approved
  5. w0xlt commented at 12:07 pm on January 28, 2022: contributor

    tACK e06d092

    This PR :

    . adds m_banned_mutex to prevent the m_is_dirty value from being read in a different state in BannedSetIsDirty() and SweepBanned().

    . gets rid of GetBanned() so it doesn’t call SweepBanned() twice and sets banmap = m_banned. This works because the m_banned_mutex is already held in this operation.

    . adds the same strategy used in DumpMempool to avoid concurrent writing to disk, but adds an optimization to return early if a write is in progress.

    . Since there is only m_is_dirty = true; (not = false) outside of BanMan::DumpBanlist(), the last commit makes sense to avoid inconsistent values for m_dirty in a race condition.

  6. shaavan commented at 1:06 pm on January 28, 2022: contributor

    Concept ACK

    Going commit wise:

    1. Concurrent calls to Write() have been prevented by “Trying” to lock dump_lock and exit if locking fails. However, as @MarcoFalke pointed out, it can lead to ban rpc calls in which DumpBanList() could not write anything to disk, leading to potential data loss. This behavior should be appropriately documented. Other than this, it could also lead to writing failure when called through other functions as well. What do you think about introducing a mutex for this function that allows calling DumpBanList only when we AssertLockNotHeld(new_mutex).

    2. SweepBanned and BannedSetIsDirty are brought under the same lock block to prevent race conditions for m_is_dirty. I agree with this approach.

    3. Prevented SweepBanned to be called twice by directly setting the value of banmap (= m_banned) instead of calling GetBanned(). I think this is an improvement over the current code.

    4. Modifying the code to call SetBannedSetDirty(true); throughout the codebase prevents the risk of inconsistent value due to race conditions. I agree with this change.

  7. w0xlt commented at 1:29 pm on January 28, 2022: contributor

    Perhaps the same strategy used in DumpMempool would be a safer approach in 2c77427 .

    0static Mutex dump_mutex;
    1LOCK(dump_mutex);
    
  8. Prevent possible concurrent CBanDB::Write() calls 5e20e9ec38
  9. Fix data race condition in BanMan::DumpBanlist()
    The m_is_dirty value being read in BannedSetIsDirty() can differ from
    the value being set in SweepBanned(), i.e., be inconsistent with a
    BanMan instance internal state.
    33bda6ab87
  10. Avoid calling BanMan::SweepBanned() twice in a row 83c7646715
  11. Fix race condition for SetBannedSetDirty() calls
    Another thread can `SetBannedSetDirty(true)` while `CBanDB::Write()`
    call being executed. The following `SetBannedSetDirty(false)`
    effectively makes `m_is_dirty` flag value inconsistent with the actual
    `m_banned` state. Such behavior can result in data loss, e.g., during
    shutdown.
    99a6b699cd
  12. hebasto force-pushed on Jan 28, 2022
  13. hebasto commented at 7:40 pm on January 28, 2022: member

    Updated e06d092f475b261afc05bbec7c4aba903d1b1a4b -> 99a6b699cd650f13d7200d344bf5e2d4b45b20ac (pr24168.01 -> pr24168.02, diff).

    Lock optimization dropped as an any optimization must be justified with benchmarking, at least. That was not done in this case. So sorry for a premature optimization.

  14. w0xlt approved
  15. w0xlt commented at 7:58 pm on January 28, 2022: contributor

    reACK 99a6b69

    Optimization removed, but the dump behavior now is more predictable.

  16. jonatack commented at 8:54 pm on January 28, 2022: member
    Concept ACK
  17. shaavan approved
  18. shaavan commented at 2:36 pm on January 30, 2022: contributor

    ACK 99a6b699cd650f13d7200d344bf5e2d4b45b20ac

    Changes since my last review:

    • Changed logic of preventing concurrent write by creating a static mutex and locking it. @hebasto We can move on to the optimization logic after we have some benchmark results. Until then, we can stick with the current logic, which is more reliable and predictable.
  19. luke-jr approved
  20. luke-jr commented at 2:37 am on January 31, 2022: member
    utACK
  21. MarcoFalke merged this on Jan 31, 2022
  22. MarcoFalke closed this on Jan 31, 2022

  23. in src/banman.cpp:57 in 99a6b699cd
    59         SetBannedSetDirty(false);
    60     }
    61 
    62+    int64_t n_start = GetTimeMillis();
    63+    if (!m_ban_db.Write(banmap)) {
    64+        SetBannedSetDirty(true);
    


    MarcoFalke commented at 8:37 am on January 31, 2022:
    unrelated: Is m_dirty even used on shutdown when true? Shouldn’t the below log be adjusted when it fails?
  24. hebasto deleted the branch on Jan 31, 2022
  25. sidhujag referenced this in commit b7b7bf20e3 on Feb 1, 2022
  26. DrahtBot locked this on Jan 31, 2023

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-11-24 15:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me