This PR split from bitcoin/bitcoin#24097 with some additions. This makes the following switch from RecursiveMutex to Mutex a pure refactoring.
See details in commit messages.
This PR split from bitcoin/bitcoin#24097 with some additions. This makes the following switch from RecursiveMutex to Mutex a pure refactoring.
See details in commit messages.
38 | @@ -39,6 +39,10 @@ BanMan::~BanMan() 39 | 40 | void BanMan::DumpBanlist() 41 | { 42 | + static Mutex dump_mutex; 43 | + TRY_LOCK(dump_mutex, dump_lock); 44 | + if (!dump_lock) return;
why is is ok to refuse to dump when two threads call this function?
It is an optimization. Calling this function in the destructor guarantees that the final state will be dumped to the disk.
Then it should be mentioned in the release notes that calling the ban rpc no longer implies the stuff is written to disk and when the node crashes, the state will be lost.
<!--e57a25ab6845829454e8d69fc972939a-->
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
<!--174a7506f384e20aa4161008e828411d-->
Reviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
tACK e06d092
This PR :
. adds m_banned_mutex to prevent the m_is_dirty value from being read in a different state in BannedSetIsDirty() and SweepBanned().
. gets rid of GetBanned() so it doesn't call SweepBanned() twice and sets banmap = m_banned. This works because the m_banned_mutex is already held in this operation.
. adds the same strategy used in DumpMempool to avoid concurrent writing to disk, but adds an optimization to return early if a write is in progress.
. Since there is only m_is_dirty = true; (not = false) outside of BanMan::DumpBanlist(), the last commit makes sense to avoid inconsistent values for m_dirty in a race condition.
Concept ACK
Going commit wise:
Concurrent calls to Write() have been prevented by "Trying" to lock dump_lock and exit if locking fails. However, as @MarcoFalke pointed out, it can lead to ban rpc calls in which DumpBanList() could not write anything to disk, leading to potential data loss. This behavior should be appropriately documented.
Other than this, it could also lead to writing failure when called through other functions as well. What do you think about introducing a mutex for this function that allows calling DumpBanList only when we AssertLockNotHeld(new_mutex).
SweepBanned and BannedSetIsDirty are brought under the same lock block to prevent race conditions for m_is_dirty. I agree with this approach.
Prevented SweepBanned to be called twice by directly setting the value of banmap (= m_banned) instead of calling GetBanned(). I think this is an improvement over the current code.
Modifying the code to call SetBannedSetDirty(true); throughout the codebase prevents the risk of inconsistent value due to race conditions. I agree with this change.
Perhaps the same strategy used in DumpMempool would be a safer approach in 2c77427 .
static Mutex dump_mutex;
LOCK(dump_mutex);
The m_is_dirty value being read in BannedSetIsDirty() can differ from
the value being set in SweepBanned(), i.e., be inconsistent with a
BanMan instance internal state.
Another thread can `SetBannedSetDirty(true)` while `CBanDB::Write()`
call being executed. The following `SetBannedSetDirty(false)`
effectively makes `m_is_dirty` flag value inconsistent with the actual
`m_banned` state. Such behavior can result in data loss, e.g., during
shutdown.
Updated e06d092f475b261afc05bbec7c4aba903d1b1a4b -> 99a6b699cd650f13d7200d344bf5e2d4b45b20ac (pr24168.01 -> pr24168.02, diff).
Lock optimization dropped as an any optimization must be justified with benchmarking, at least. That was not done in this case. So sorry for a premature optimization.
reACK 99a6b69
Optimization removed, but the dump behavior now is more predictable.
Concept ACK
ACK 99a6b699cd650f13d7200d344bf5e2d4b45b20ac
Changes since my last review:
utACK
59 | SetBannedSetDirty(false); 60 | } 61 | 62 | + int64_t n_start = GetTimeMillis(); 63 | + if (!m_ban_db.Write(banmap)) { 64 | + SetBannedSetDirty(true);
unrelated: Is m_dirty even used on shutdown when true? Shouldn't the below log be adjusted when it fails?