Assume
may be optimized away in production builds when the compiler proves they are side-effect-free.
This use case is demonstrated in #31363 and is suggested to be documented in #31363 (comment).
doc: clarify the documentation of Assume
assertion
#32100
pull
ismaelsadeeq
wants to merge
1
commits into
bitcoin:master
from
ismaelsadeeq:03-2025-clarify-assume-doc
changing
1
files
+4 −1
-
ismaelsadeeq commented at 11:20 am on March 20, 2025: memberAn Expression inside
-
DrahtBot commented at 11:21 am on March 20, 2025: contributor
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
Code Coverage & Benchmarks
For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/32100.
Reviews
See the guideline for information on the review process.
Type Reviewers ACK l0rinc, hodlinator, jonatack, rkrux Stale ACK sipa If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.
-
DrahtBot added the label Docs on Mar 20, 2025
-
in doc/developer-notes.md:464 in a880d1bf87 outdated
459@@ -460,7 +460,14 @@ other input. 460 safely continue even if the assumption is violated. In debug builds it 461 behaves like `Assert`/`assert` to notify developers and testers about 462 nonfatal errors. In production it doesn't warn or log anything, though the 463- expression is always evaluated. 464+ expression is always evaluated. However, if the compiler can prove that a 465+ statement inside `Assume` is side-effect-free, it may optimize the call away,
yancyribbens commented at 2:21 pm on March 20, 2025:Interesting. If
Assume
can be optimized out during production possibly, is there any reason to preferAssert
?There’s an assert here in a coin-selection algo that is meant to be performant. Maybe this should be changed to
Asume
.
yancyribbens commented at 6:32 pm on March 20, 2025:I tested however the benchmarks don’t seem to be sensitive enough for this to make a difference.sipa commented at 2:36 pm on March 20, 2025: memberConcept ACK, this is largely how I’ve been using Assume().
My thinking is that due to the optimizing-out property of
Assume()
, the concern about potential performance impact is (even) lower than forAssert()
/assert()
, and thus one can be even more liberal when using these. Note that evenAssert()
/assert()
tend to have very low cost for cheap conditions, as they’re generally well-predicted by the CPU branch predictor (so the cost tends to be in the nanosecond or less range), but for cheap provably-side-effect-free conditions, forAssume()
the concern just disappears entirely.So the benefit, I think, isn’t directly that
Assume()
is necessarily significantly faster thanAssert()
/assert()
- in almost all cases in the codebase, the impact of an additional well-predicted conditional isn’t going to be measurable. The benefit is that it may, for reviewers, largely remove the consideration of whether or not there is any performance impact at all.Of course, compared to the others it has the downside of not actually triggering in production builds, so it should be reserved for cases where (a) the code can safely continue when the assumption is violated (that’s what’s documented already) but also (b) when there is sufficient test coverage to give confidence that the assumption actually can’t trigger in production.
So I think the advice about this kind of usage should be:
- There aren’t other reasons to have an
Assert()
/assert()
; i.e., if it weren’t for the possibility of a production-optimized-outAssume()
, you’d instead have no assertion at all. - The condition is clearly side-effect free.
- It helps readability/reviewability by making assumptions explicitly verified in fuzz tests. If the code doesn’t have good fuzz test coverage, this doesn’t achieve much.
in doc/developer-notes.md:470 in a880d1bf87 outdated
466+ skipping its evaluation in production. 467+ - `Assume` can also act as a lightweight debugging assertion, ensuring 468+ statements are tested e.g., during fuzzing—to catch violations. With 469+ sufficient test coverage, the compiler may optimize away proven conditions 470+ in production, eliminating runtime overhead. This aids both testing and code 471+ review by making statements explicit and tested.
hodlinator commented at 9:57 am on March 22, 2025:This seems sufficient to me:
0 expression is always evaluated. However, if the compiler can prove that a 1 statement inside `Assume` is side-effect-free, it may optimize the call away, 2 skipping its evaluation in production. This enables a lower-cost way of 3 making explicit statements about the code, aiding review.
Motivation:
Assume
can also act as a lightweight debugging assertion, ensuring statements are tested e.g., during fuzzing—to catch violations.Even if a lot of fuzzing is were to be performed with debug builds, this feels redundant. If we fuzz mostly in non-debug, it’s not very accurate.
With sufficient test coverage, the compiler may optimize away proven conditions in production, eliminating runtime overhead.
Seems to imply that the compiler optimizes away depending on test coverage.
This aids both testing and code review by making statements explicit and tested.
Rephrased and appended to preceding paragraph.
An additional reason for not inserting a extra bullet-point here is to maintain the existing pattern of using one bulletpoint for each assert/check type, starting with “For example, “. So even if insist on adding a mention of fuzzing, ideally it should not be in bullet-point form, but rather appended to the preceding paragraph.
ismaelsadeeq commented at 12:17 pm on March 24, 2025:fixed!
hodlinator commented at 1:07 pm on March 24, 2025:Thank you for taking my suggestion!hodlinator commented at 10:04 am on March 22, 2025: contributorConcept ACK
Good to document useful aspects of
Assume
.rkrux commented at 10:33 am on March 24, 2025: contributorConcept ACK a880d1bf87817b1e6606c971cbfe98382898a0be
I find myself agreeing with the points mentioned in this comment #32100 (review), there is some redundancy in the verbiage currently that can be removed.
ismaelsadeeq force-pushed on Mar 24, 2025ismaelsadeeq commented at 12:19 pm on March 24, 2025: memberI find myself agreeing with the points mentioned in this comment #32100 (comment), there is some redundancy in the verbiage currently that can be removed.
This is fixed now in 2898a0be..a7c65edc
hodlinator approvedhodlinator commented at 1:11 pm on March 24, 2025: contributorACK a7c65edc884b0e22aaabd7e725f5f39e60b6e76b
Adds more information on the strengths of
Assume()
, being able to be used as a kind of low-cost documentation of constraints that in the best case can be optimized away in production.Latest version of the change also keeps with the pattern of only having one
- For example ...
-subsection.DrahtBot requested review from sipa on Mar 24, 2025DrahtBot requested review from rkrux on Mar 24, 2025sipa commented at 1:24 pm on March 24, 2025: memberACK a7c65edc884b0e22aaabd7e725f5f39e60b6e76brkrux approvedrkrux commented at 1:29 pm on March 24, 2025: contributorACK a7c65edc884b0e22aaabd7e725f5f39e60b6e76bin doc/developer-notes.md:464 in a7c65edc88 outdated
459@@ -460,7 +460,10 @@ other input. 460 safely continue even if the assumption is violated. In debug builds it 461 behaves like `Assert`/`assert` to notify developers and testers about 462 nonfatal errors. In production it doesn't warn or log anything, though the 463- expression is always evaluated. 464+ expression is always evaluated. However, if the compiler can prove that a 465+ statement inside `Assume` is side-effect-free, it may optimize the call away,
l0rinc commented at 2:06 pm on March 24, 2025:According to https://www.reddit.com/r/cpp_questions/comments/plg8ij/expression_vs_statement/
Statements do not have a return type
So by definition, unless they’re empty, they cannot be
side-effect-free
- if my interpretation is correct.0 expression is always evaluated. However, if the compiler can prove that 1 an expression inside `Assume` is side-effect-free, it may optimize the call away,
ismaelsadeeq commented at 2:30 pm on March 24, 2025:Fixed!l0rinc changes_requesteddoc: clarify the documentation of `Assume` 329a0dcdafismaelsadeeq force-pushed on Mar 24, 2025l0rinc commented at 2:37 pm on March 24, 2025: contributorACK 329a0dcdafe05002f662e8737a76bfdeaba9a3edDrahtBot requested review from rkrux on Mar 24, 2025DrahtBot requested review from hodlinator on Mar 24, 2025hodlinator approvedhodlinator commented at 3:17 pm on March 24, 2025: contributorre-ACK 329a0dcdafe05002f662e8737a76bfdeaba9a3edjonatack commented at 5:04 pm on March 24, 2025: memberACK 329a0dcdafe05002f662e8737a76bfdeaba9a3edl0rinc approvedrkrux approvedrkrux commented at 7:45 am on March 25, 2025: contributorre-ACK 329a0dcdafe05002f662e8737a76bfdeaba9a3edyancyribbens commented at 7:02 pm on March 25, 2025: contributorOf course, compared to the others it has the downside of not actually triggering in production builds, so it should be reserved for cases where (a) the code can safely continue when the assumption is violated (that’s what’s documented already)
This is a good call out, although I don’t see it already documented (here), maybe I’m blind
The other reason I think this is a good call it is this excerpt from http://fiona.dmcs.p.lodz.pl/oopc/reference/en/cpp/language/attributes/assume.html.
If the expression can ever evaluate to false, it will inject undefined behavior into your whole program (not just where the assumption appears). It should never be used in place of an assertion or precondition.
Other note is I don’t see any mention in the docs that this is optimized out in production. The best I can find is that the expression is not evaluated and converted to a bool, then when the program reaches that bool, if the bool evaluates to true it’s basically a noop (has no effect) otherwise boom.
in doc/developer-notes.md:464 in 329a0dcdaf
459@@ -460,7 +460,10 @@ other input. 460 safely continue even if the assumption is violated. In debug builds it 461 behaves like `Assert`/`assert` to notify developers and testers about 462 nonfatal errors. In production it doesn't warn or log anything, though the 463- expression is always evaluated. 464+ expression is always evaluated. However, if the compiler can prove that 465+ an expression inside `Assume` is side-effect-free, it may optimize the call away,
yancyribbens commented at 7:03 pm on March 25, 2025:0 an expression inside `Assume` is true, it _will_ have no affect,
yancyribbens commented at 7:04 pm on March 25, 2025:s/affect/effect
ismaelsadeeq commented at 7:32 pm on March 25, 2025:I will fix nit when there is need to retouch.
sipa commented at 7:38 pm on March 25, 2025:This suggestion is not correct.
If the expression inside
Assume
has a side-effect, but still can be proven to evaluate to true, the side effect will still remain.
yancyribbens commented at 6:36 pm on March 26, 2025:If the expression inside Assume has a side-effect, but still can be proven to evaluate to true, the side effect will still remain.
I am assuming you are talking about the Macro then and not the C++ 23 Assume since that’s not how the docs read. The docs read that if it’s true it has no effect: “If the expression would not evaluate to true at the place the assumption occurs, the behavior is undefined. Otherwise, the statement has no effect. "
Also, I think may should be replaced with will unless it’s ambigous wether or not this will be otimized away depending on it’s side-effect which it doesn’t sound like it is.
sipa commented at 6:44 pm on March 26, 2025:Yes, nothing in this PR is about the C++23[[assume()]]
attribute. I don’t understand why you keep bringing it up.
yancyribbens commented at 6:46 pm on March 26, 2025:I just noticed that Sipa said “There is no guarantee that the compiler will do that.” which I guess is why the may exists here.
yancyribbens commented at 6:56 pm on March 26, 2025:Yes, nothing in this PR is about the C++23 [[assume()]] attribute. I don’t understand why you keep bringing it up.
The name collision had me confused
maflcko commented at 7:07 pm on March 25, 2025: memberThe assume attribute is C++23. However, this codebase is C++20 and this pull request is about the
Assume
macro in the codebase itself.yancyribbens commented at 7:11 pm on March 25, 2025: contributorThe assume attribute is C++23. However, this codebase is C++20 and this pull request is about the Assume macro in the codebase itself.
Good point. I did also link the C++23 link above. I’m just looking for confirmation that this doesn’t appear in production which doesn’t appear in any reference that I see.
sipa commented at 7:36 pm on March 25, 2025: memberI’m just looking for confirmation that this doesn’t appear in production which doesn’t appear in any reference that I see.
You cannot find any reference about it, because
Assume()
is a macro defined in the Bitcoin Core codebase, not a C++ language/library feature. It’s defined here. It is different from, and unrelated to, the[[assume(...)]]
attribute added in C++23 which you link to.And you can’t find confirmation that it doesn’t appear in production, because as far as the language is concerned, it does appear in production. It just happens to be the case that compilers may always remove code which they can prove has no effect.
yancyribbens commented at 5:10 pm on March 26, 2025: contributorYou cannot find any reference about it, because Assume() is a macro defined in the Bitcoin Core codebase, not a C++ language/library feature. It’s defined here. It is different from, and unrelated to, the [[assume(…)]] attribute added in C++23 which you link to.
I see, thanks for the explanation. Others reading this may also think this is talking about the C++ assume (maybe).
And you can’t find confirmation that it doesn’t appear in production, because as far as the language is concerned, it does appear in production. It just happens to be the case that compilers may always remove code which they can prove has no effect.
Ok. The wording might be more clear if it said that the compiler will optimize this out if it can prove there is no side-effect and drop the reference to production.
sipa commented at 5:12 pm on March 26, 2025: memberif it said that the compiler will optimize this out if it can prove there is no side-effect
There is no guarantee that the compiler will do that.
drop the reference to production
In debug builds, it cannot be optimized out, because the statement necessarily has a side-effect (aborting if it detects the condition is false).
fanquake merged this on Mar 27, 2025fanquake closed this on Mar 27, 2025
ismaelsadeeq deleted the branch on Mar 27, 2025ryanofsky commented at 2:57 pm on March 27, 2025: contributorPost-merge code review ACK 329a0dcdafe05002f662e8737a76bfdeaba9a3ed. This should be a helpful change since it clarifies how
Assume
works.IMO, the
Assume
macro is not very well designed because it is useful for two different things and not good at doing either of them.Assume
is useful:-
To check for states you never expect to occur, but are harmless. It can be used when you want to make sure unexpected states trigger failures during testing, but don’t want those states to crash the program in release builds. You may want to handle them instead by logging a warning, returning an error from the current RPC, safely aborting the current operation, or just doing nothing and moving on.
-
To check for states you never expect to occur, and probably indicate serious bugs, but the checks are slow, or are located in hotspots where they would be too expensive to perform in release builds, so are better to skip.
Assume
is suboptimal for the first case because in most of the places where it is called the return value is never checked, so even when the check is performed in release builds, nothing is logged and there’s no indication of any problem. For the first case it would make more sense to have anAssume
function that did the same thing but logged a warning like “Unexpected condition […] reached, consider reporting a bug.” Release builds are run in many more environments than debug builds so if unexpected states that we have explicitly written checks for are being reached, it would be probably better to know about them than not know about them.Assume
is suboptimal for the second case because instead of reliably not performing checks in release builds, checks will only be optimized out if the compiler is able to determine that the check has no side effects. This means the slowest checks may not be optimized out, which is confusing to say the least.My inclination to improve this would be to keep using
Assume
for the first case and improve it, but introduce a new macro likeDCHECK
to be used for the second case that would only execute checks in debug builds. (I previously also suggested aSLOWCHECK
macro to try to improve the second case)hodlinator commented at 3:22 pm on March 27, 2025: contributorWhat I’m missing is a
SLOWCHECK
/DCHECK
macro that is active in debug and always active in fuzz builds (and fails fuzz-builds on failure even in release). For cases when something very unexpected is happening but it’s not worth crashing the node in production over it. Hopefully this need is not due to wanting to make up for sloppy comprehension of the code.(It’s also unfortunate that
Assume()
uses the same term as the C++23[[assume]]
feature with very different semantics).maflcko commented at 9:41 pm on March 27, 2025: memberWhat I’m missing is a
SLOWCHECK
/DCHECK
macro that is active in debug and always active in fuzz builds (and fails fuzz-builds on failure even in release).The
Assume()
macro does exactly this, albeit the fuzz part is only documented in the code itself.hodlinator commented at 9:44 pm on March 27, 2025: contributorThe
Assume()
macro does exactly this, albeit the fuzz part is only documented in the code itself.Ah, nice. Was getting thrown off by
ABORT_ON_FAILED_ASSUME
being debug-only, butG_FUZZING
/FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION
seems to achieve what I’m after.
github-metadata-mirror
This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-03-28 15:12 UTC
This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me