scripted-diff: Restore AssertLockHeld after #19668, remove LockAssertion

ryanofsky commented at 11:49 pm on September 3, 2020: member

This PR is a simple 3-line scripted diff followed by a documentation cleanup. The scripted diff does three things:

Deletes all existing AssertLockHeld calls. Since #19668, AssertLockHeld has had an EXCLUSIVE_LOCKS_REQUIRED annotation which has guaranteed (though ongoing travis and cirrus CI builds since #19668 was merged) that these AssertLockHeld calls are redundant with EXCLUSIVE_LOCKS_REQUIRED or EXCLUSIVE_LOCK_FUNCTION annotations already present in surrounding code, and specifically that:

There is no way the assert calls can trigger any behavior at runtime.
The assert calls provide no new thread safety information to the compiler.

Reverts AssertLockHeld implementation which got changed in #19668 to it’s original pre-#19668 state. In #19668, AssertLockHeld was changed to have an EXCLUSIVE_LOCKS_REQUIRED annotation instead of having an ASSERT_EXCLUSIVE_LOCK annotation. As described above, having an EXCLUSIVE_LOCKS_REQUIRED annotation on an assert statement is only useful as proof that the assert statement doesn’t do anything or convey any new information. By contrast, having an ASSERT_EXCLUSIVE_LOCK annotation on an assert statement can be an actually useful way of conveying to the compiler than a specific mutex is locked at a specific place in the code, when the compiler thread safety analysis can’t determine that on its own (because of lost type information).
Removes LockAssertion class, replacing all current uses with calls to AssertLockHeld. Ever since the LockAssertion class was first introduced in #14437 (as LockAnnotation), it has always been an inferior alternative to AssertLockHeld and not had a reason to exist. (#14437 was originally written before #13423 which added ASSERT_EXCLUSIVE_LOCK annotation. It was justified when the code was first written, but no longer necessary by the time it was merged).

Motivation for this PR is to get rid of confusion between different types of lock assertions and only keep one assertion: AssertLockHeld which goes back to the implementation it’s had since #13423 was merged until #19668 was merged. After this PR, AssertLockHeld only needs to be used sparingly to augment compile-time thread safety annotations, which are a superior alternative to runtime checks for guaranteeing thread safety.

PR is compared with alternatives in https://github.com/bitcoin-core/bitcoin-devwiki/wiki/AssertLockHeld-PRs

DrahtBot added the label Mempool on Sep 4, 2020

DrahtBot added the label P2P on Sep 4, 2020

DrahtBot added the label Refactoring on Sep 4, 2020

DrahtBot added the label RPC/REST/ZMQ on Sep 4, 2020

DrahtBot added the label TX fees and policy on Sep 4, 2020

DrahtBot added the label UTXO Db and Indexes on Sep 4, 2020

DrahtBot added the label Validation on Sep 4, 2020

DrahtBot added the label Wallet on Sep 4, 2020

DrahtBot commented at 1:57 am on September 4, 2020: member

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#19980 (refactor: Some wallet cleanups by promag)
#19979 (Use proper TSA attributes (attempt two) by hebasto)
#19970 (sync.h: fix LockAssertion error reporting by ajtowns)
#19918 (sync: Replace LockAssertion with AssertLockHeldUnverified by ryanofsky)
#19905 (Remove dead CheckForkWarningConditionsOnNewFork by MarcoFalke)
#19498 (Tidy up ProcessOrphanTx by jnewbery)
#10443 (Add fee_est tool for debugging fee estimation code by ryanofsky)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

ajtowns commented at 3:28 am on September 4, 2020: member

-0.5 on concept, I don’t think dropping the runtime checks has any advantage.

For approach, I think just doing git grep -l AssertLockHeld src | grep -v 'sync.h$' | xargs sed -i '/^ *AssertLockHeld(.*);/d' would be better. The current code has AssertLockHeld and AssertLockNotHeld behave in the same way (a runtime check that the mutex is held by this thread or not), which is worth preserving imo.

The updated docs in doc/developer-notes.md would also need updating if this change were to be merged.

hebasto commented at 8:02 am on September 4, 2020: member

I agree with @ajtowns’ comment:

We have three types of assertions related for locking:

marking a function as EXCLUSIVE_LOCKS_REQUIRED has a compile time check with clang that the caller has already obtained the lock

declaring a LockAssertion instance overrides the clang compile time checks by saying that we’ve already somehow acquired the lock in a way that we can’t prove via clang thread safety annotations. This is useful for lambda functions (which could be annotated) that are called from some generic dispatcher like ForEach (that can’t be annotated).

adding a call to AssertLockHeld does a runtime check if DEBUG_LOCKORDER is enabled. it’s annotated with EXCLUSIVE_LOCKS_REQUIRED so is redundant with the compile time checks, but the compile time checks aren’t available outside of clang. it’s automatically called from LockAssertion to ensure that the compile time checks aren’t overridden incorrectly.

As a result:

using AssertLockHeld everywhere remains fine, and no longer needlessly disables the compile time checks

LockAssertion should only be used very rarely – and in fact it’s only used in net_processing in some ForEach/ForEachNode functions.

Probably, AssertLockHeld deserves a better name, but its functionality and usage are ok.

EXCLUSIVE_LOCKS_REQUIRED is used in a header file. If a function definition is placed in a *.cpp file, using AssertLockHeld has the following benefits (besides a run time check):

it shows to a code reader the expected state of lock without referencing to a header file
it will warn about missed proper EXCLUSIVE_LOCKS_REQUIRED annotation

AssertLockHeld is a great tool to transit from RecursiveMutex to Mutex in a safe and proven manner. See: #19303, #19833, #19854.

Concept NACK.

ryanofsky commented at 2:46 pm on September 4, 2020: member

If a function definition is placed in a *.cpp file, using AssertLockHeld has the following benefits (besides a run time check):

it shows to a code reader the expected state of lock without referencing to a header file

it will warn about missed proper EXCLUSIVE_LOCKS_REQUIRED annotation

You are literally talking about adding an annotation to check for the presence of another annotation. This is an absurd idea to me, but to take it seriously, what should the developer guidelines say about using AssertLockHeld this way? Should every function that is annotated with EXCLUSIVE_LOCKS_REQUIRED also have an AssertLockHeld at the top? Is there going to be a linter to check for this, or is this going to be another source of nits in review comments?

AssertLockHeld is a great tool to transit from RecursiveMutex to Mutex in a safe and proven manner. See: #19303, #19833, #19854.

Before the lock annotations added in #19668 this was true. It was a good way to discover where to add EXCLUSIVE_LOCKS_REQUIRED annotations. But now those annotations are added AssertLockHeld is only functioning as annotation checking the presence of another annotation, and doesn’t impact the work in those other PRs or improve thread safety in any way.

hebasto commented at 2:58 pm on September 4, 2020: member

You are literally talking about adding an annotation to check for the presence of another annotation.

This is one benefit among others (run time check is the main purpose of AssertLockHeld).

This is an absurd idea to me…

Annotations that was missed and added in 3ddc150857178bfb1c854c05bf9b526777876f56 and 2ee7743fe723227f2ea1b031eddb14fc6863f4c8 justify the #19668 approach.

… but to take it seriously, what should the developer guidelines say about using AssertLockHeld this way? Should every function that is annotated with EXCLUSIVE_LOCKS_REQUIRED also have an AssertLockHeld at the top?

Why not?

… is this going to be another source of nits in review comments?

I think it is desirable for a new code.

ryanofsky commented at 3:06 pm on September 4, 2020: member

Good! So we agree this PR has no detrimental effects on thread safety, and the NACK is based on a style preference?

hebasto commented at 3:10 pm on September 4, 2020: member

Good! So we agree this PR has no detrimental effects on thread safety, and the NACK is based on a style preference?

No. It is based on thread safety.

While migrating from RecursiveMutex to Mutex how one could be confident in the fact that a mutex is actually locked without run time assertion?

Try to remove confusing and no longer useful lock asserts.

At least, could this change be postponed until getting rid of RecursiveMutexs?

ryanofsky commented at 3:56 pm on September 4, 2020: member

While migrating from RecursiveMutex to Mutex how one could be confident in the fact that a mutex is actually locked without run time assertion?

If it is annotated with EXCLUSIVE_LOCKS_REQUIRED, it seems you should be confident either that the mutex is actually locked or that LockAssertion was used earlier and would have triggered a runtime error where it was used. This PR isn’t removing all runtime checks, just runtime checks redundant with compile time checks. AssertLockHeld is still available whenever you want to use it. It just returns to functioning like a normal runtime check, and not a strange compile time check enforcing the presence of a different compile time check. I can see how the strange check was useful during development of #19668, but it doesn’t serve a purpose for thread safety going forward or help with future PRs.

If you want to make an argument for keeping all AssertLockHelds based on readability, that’s fine, but then I think you should make a developer guideline saying that AssertLockHeld should be called first thing in any function annotated with EXCLUSIVE_LOCKS_REQUIRED, and ideally have a linter to enforce this. Otherwise if the assert is only used in some places but not others, that is just adding confusion and inconsistency.

jonatack commented at 4:33 pm on September 4, 2020: member

I need to review this PR just for what I’ll learn. 🐳

MarcoFalke commented at 10:19 am on September 5, 2020: member

I think by default ./configure will pick up gcc, which does not check lock annotations, so the current AssertLockHeld in master have a slight benefit of telling ./configure --enable-debug devs (with gcc) who run the tests before creating a pull that something with their locks is wrong. Though, you correctly say that travis will compile with clang and fail if there is an inconsistency.

The redundant run time checks also serve as a insurance against bugs in clang.

Ideally, they’d be inserted by the compiler whenever a function is annotated. Though, I don’t see a way to do this in C++ without wrapping everything into more macros. Another option would be to have a preprocessing step in our ci scripts to insert the redundant run-time checks in enable-debug builds. At least that would make me feel more comfortable removing them.

MarcoFalke removed the label Mempool on Sep 5, 2020

MarcoFalke removed the label P2P on Sep 5, 2020

MarcoFalke removed the label RPC/REST/ZMQ on Sep 5, 2020

MarcoFalke removed the label TX fees and policy on Sep 5, 2020

MarcoFalke removed the label UTXO Db and Indexes on Sep 5, 2020

MarcoFalke removed the label Validation on Sep 5, 2020

MarcoFalke removed the label Wallet on Sep 5, 2020

ryanofsky commented at 12:23 pm on September 5, 2020: member

At least that would make me feel more comfortable removing them.

Can you clarify what is uncomfortable? If a function is annotated with EXCLUSIVE_LOCKS_REQUIRED(mutex), and the developer is not using clang, and the developer makes a change that calls the function without locking mutex, having a redundant AssertLockHeld(mutex) isn’t going to impact the thread safety of the codebase, because CI ensures we will not merge this code.

So you are uncomfortable about the inconvenience that removing asserts which are already haphazardly and inconsistently placed will cause for developers who are not using clang, but who are building in debug mode, and who are removing locks or calling functions in new places, and who are doing some kind of manual or automated testing that would happen to trigger these assertions at runtime?

Or uncomfortable about something different?

MarcoFalke commented at 12:28 pm on September 5, 2020: member

I simply wouldn’t put too much trust into clang, since it may have bugs.

vasild commented at 10:28 am on September 7, 2020: member

We managed to misuse the compiler directives:

Our AssertLockHeldInternal() does a runtime check and would not return (aka abort()) if it fails. There is an attribute exactly for that: ASSERT_EXCLUSIVE_LOCK, but we don’t use it. We use EXCLUSIVE_LOCKS_REQUIRED, meant for something else. This is a misuse because AssertLockHeldInternal() does not do anything that requires holding the mutex (like reading variables protected by that mutex).
Our LockAssertion::LockAssertion() does the same as AssertLockHeldInternal(), but is tagged with EXCLUSIVE_LOCK_FUNCTION. A misuse because LockAssertion() does not acquire the mutex.

I think we shouldn’t be doing that for no matter what reason - misleading the compiler that we do one thing while we do another.

The OP in #19668 boils down to:

0 1 int x GUARDED_BY(cs_main);
1 2
2 3 void f() // not annotated with EXCLUSIVE_LOCKS_REQUIRED(cs_main)
3 4 {
4 5     // AssertLockHeld was properly annotated with ASSERT_EXCLUSIVE_LOCK() before [#19668](/bitcoin-bitcoin/19668/)
5 6     AssertLockHeld(cs_main);
6 7
7 8     // no warning here that we access x without holding cs_main
8 9     x = 5;
910 }

The compiler does not issue a compile-time warning for line 9 because it knows line 9 is unreachable at run-time if cs_main is not held - AssertLockHeld() from line 6 would not return in that case. IMO we shouldn’t try to extort the compiler to produce a warning here. Mis-labeling AssertLockHeld() with EXCLUSIVE_LOCKS_REQUIRED() only to ensure that f() has that attribute indeed looks like adding an annotation to check for the presence of another annotation.

I agree with @MarcoFalke that clang may have bugs and we better not have it as our sole protection. The thread safety analysis look a bit immature to me - for example the warnings produced depend on the order in which attributes are defined (:-O). IMO compile time checks are good and they are an addition to runtime checks, not a replacement.

What about going back to using the proper attributes? Tag AssertLockHeldInternal() with ASSERT_EXCLUSIVE_LOCK when DEBUG_LOCKORDER is defined and no attributes otherwise (because it is a noop, don’t fool the compiler) and remove the confusing LockAssertion()?

hebasto commented at 10:57 am on September 7, 2020: member

@vasild

The compiler does not issue a compile-time warning for line 9 because it knows line 9 is unreachable at run-time if cs_main is not held - AssertLockHeld() from line 6 would not return in that case.

But it is not known for a code reader.

ajtowns commented at 5:39 pm on September 7, 2020: member

The compiler does not issue a compile-time warning for line 9 because it knows line 9 is unreachable at run-time if cs_main is not held

Line 9 is reachable at run-time if cs_main is not held and DEBUG_LOCKORDER is not specified. If there is not a test that exercises that code path, CI will not detect the bug and prevent the merge.

vasild commented at 6:38 pm on September 7, 2020: member

@ajtowns this is because of another misuse (or “trying to fool the compiler”) - in master before #19668 in the non-DEBUG_LOCKORDER case AssertLockHeldInternal() was defined like this:

0void static inline AssertLockHeldInternal(...) ASSERT_EXCLUSIVE_LOCK(cs) {}

So, we lied the compiler that we will check and abort(), but we did not do that.

In this case, IMO ASSERT_EXCLUSIVE_LOCK() should not be present. Then line 9 is reachable and we get a compilation warning.

hebasto commented at 12:32 pm on September 8, 2020: member

@vasild

@ajtowns this is because of another misuse (or “trying to fool the compiler”) - in master before #19668 in the non-DEBUG_LOCKORDER case AssertLockHeldInternal() was defined like this:
0void static inline AssertLockHeldInternal(...) ASSERT_EXCLUSIVE_LOCK(cs) {}
So, we lied the compiler that we will check and abort(), but we did not do that.

In this case, IMO ASSERT_EXCLUSIVE_LOCK() should not be present. Then line 9 is reachable and we get a compilation warning.

So, the 23d71d171e6e22ba5e4a909d597a54595b2a2c1f commit from the #19668 is a correct change, and it should not be reverted, right?

vasild commented at 12:42 pm on September 8, 2020: member

23d71d1 contains 2 changes. IMO the first one should be reverted and the second change should stay.

ajtowns commented at 5:47 am on September 14, 2020: member

So, we lied the compiler that we will check and abort(), but we did not do that.

I don’t think it’s useful to put this in moral terms. We’re trying to prevent buggy code, by in order of preference, (a) making it impossible to write (eg RAII so locks are free automatically); (b) making the compiler complain about it (eg thread safety annotations); (c) getting predictable safe errors at runtime rather than crashes, hangs or undefined behaviour that only happen randomly (eg lock order checks).

In our code, it’s never correct to call AssertLockHeld (that is as it stands prior to this PR, not after LockAssertion is renamed) without already holding the lock, so the additional behaviours allowed by ASSERT_EXCLUSIVE_LOCK aren’t helpful, and make it easier to write buggy code.

scripted-diff: Restore AssertLockHeld after #19668, remove LockAssertion

-BEGIN VERIFY SCRIPT-
git grep -l AssertLockHeldInternal | xargs sed -i /AssertLockHeldInternal/s/EXCLUSIVE_LOCKS_REQUIRED/ASSERT_EXCLUSIVE_LOCK/
git grep -l AssertLockHeld ':!src/sync.h' | xargs sed -i '/^ *AssertLockHeld(.*);/d'
git grep -l LockAssertion | xargs sed -i 's/LockAssertion lock(\(.*\));/AssertLockHeld(\1);/g'
-END VERIFY SCRIPT-

2cc908d201

Clean up documentation and header after AssertLockHeld script-diff 0c53f3d5c4

ryanofsky force-pushed on Sep 14, 2020

ryanofsky marked this as ready for review on Sep 14, 2020

DrahtBot added the label Needs rebase on Sep 23, 2020

DrahtBot commented at 4:15 pm on September 23, 2020: member

🐙 This pull request conflicts with the target branch and needs rebase.

Want to unsubscribe from rebase notifications on this pull request? Just convert this pull request to a “draft”.

MarcoFalke commented at 9:54 am on October 19, 2020: member

Can be closed?

MarcoFalke commented at 1:41 pm on October 25, 2020: member

Let me know if I should reopen

MarcoFalke closed this on Oct 25, 2020

DrahtBot locked this on Feb 15, 2022

scripted-diff: Restore AssertLockHeld after #19668, remove LockAssertion #19865

Conflicts