linearizeoptimallyexample11
benchmark 4x’d it’s execution time at some point in the last few days.
bench: linearizeoptimallyexample11
benchmark now running 4x slow than previously
#31178
issue
fanquake
openend this issue on
October 29, 2024
-
fanquake commented at 2:16 pm on October 29, 2024: memberAccording to https://corecheck.dev/benchmarks, the
-
sipa commented at 2:20 pm on October 29, 2024: memberMore like 2x (from 0.64 to 1.21)? Still bizarre, this code has not been touched the past month for as far as I can see?
-
fanquake commented at 2:22 pm on October 29, 2024: memberI was just looking at total execution time increasing from ~4.3s to ~16.3s.
-
sipa commented at 2:23 pm on October 29, 2024: memberOh nevermind, I was looking at the “outliers” panel, which doesn’t give absolute time. Indeed, 4x.
-
m3dwards commented at 10:19 pm on October 29, 2024: contributor
I have been able to recreate a roughly 2x slowdown on both x86 Linux and AArch64 Mac. Not quite as much as the corcheck.dev slowdown but still significant. Just for reference, corecheck.dev runs on AArch64 Linux.
I compared f0130ab1a1 (October 17th) with 97b790e844abd2 (October 29th).
I’ll have a scan for the offending commit and worse case do a binary search.
-
m3dwards commented at 11:05 pm on October 30, 2024: contributor
The slowdown appears to be introduced with https://github.com/bitcoin/bitcoin/commit/1c7ca6e64de9b47b2295c81cb0fedd432ffaf001
Not sure why yet.
-
davidgumberg commented at 2:40 am on October 31, 2024: contributor
@m3dwards Thanks for digging! I believe that explains it: since #31093,
Assume()
calls are no longer optimized out in production builds.+1 I’ve reproduced this performance issue locally, and when you remove
g_fuzzing
from theinline_assertion_check
’sif
clause, the regression goes away:./build/src/bench/bench_bitcoin -filter=LinearizeOptimallyExample11 -min-time=30000
branch ns/op op/s err% ins/op cyc/op IPC bra/op miss% total benchmark 1c7ca6e 374,612,291.12 2.67 0.1% 7,324,884,820.00 1,304,215,224.38 5.616 898,388,711.38 0.2% 32.23 LinearizeOptimallyExample11
1c7ca6e~1 301,758,341.60 3.31 0.3% 4,990,499,172.00 1,050,585,438.00 4.750 352,054,044.90 0.3% 32.48 LinearizeOptimallyExample11
no check g_fuzzing in assume 88a2427a68 300,695,825.00 3.33 0.1% 4,990,499,171.80 1,046,167,482.50 4.770 352,054,044.70 0.3% 32.39 LinearizeOptimallyExample11
I recorded some flamegraphs that point to where the hot
Assume
call(s) are: 1c7ca6e (permalink) vs parent (permalink) -
l0rinc commented at 7:59 am on October 31, 2024: contributorCould we obviate in the DrahtBot’s response that the link doesn’t just provide a
test coverage report
- since the regression was already visible there: https://corecheck.dev/bitcoin/bitcoin/pulls/31093 -
maflcko commented at 9:42 am on October 31, 2024: memberSure, it is open source and pull requests for re-wording are welcome: https://github.com/maflcko/DrahtBot/blob/main/webhook_features/src/features/summary_comment.rs#L194
-
maflcko commented at 9:48 am on October 31, 2024: memberI wonder if an
[[unlikely]]
annotation can take care of this, so that the compiler can treat the Assume as an (almost) no-op in the happy path. Before and after theg_fuzzing
change,val
would have to always be evaluated. However, it seems the!val
evaluation is the overhead in this case? -
sipa commented at 11:35 am on October 31, 2024: member@maflcko Worth trying, but I have been using
Assume
in many places assuming it’ll be optimized out entirely in production code. If we leave thisg_fuzzing
in place, we will need to re-evaluate whether all thoseAssume
s are still worth having if they have a runtime impact. -
maflcko commented at 11:51 am on October 31, 2024: member
Then I think the only solution is to make it a build-time option again:
0constexpr bool G_FUZZING{ 1#ifdef FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION 2 true 3#else 4 false 5#endif 6};
-
ryanofsky commented at 1:58 pm on October 31, 2024: contributor
I think it is worth looking for a better way to fix this because of the clumsiness of needing to compile two different versions of libraries when fuzzing is used.
It is not true that in general that
Assume
checks were “optimized out” before #31093 because the checks are performed before theinline_assertion_check
function is called, so wouldn’t be skipped in general just because the function is implemented differently. The compiler would be able to skip the checks if it knows they don’t have side effects. But if it knows they don’t have side effects, it should also be able to skip the checks wheng_fuzzing
is false.I think we should look at what compiler is doing here and see if there’s a way we can get it to generate better code to fix the benchmark. Also if we can change the implementation of Assume to check
g_fuzzing
before running the Assume checks it might increase performance of the production builds generally in cases where the compile doesn’t know if the checks have side effects. -
maflcko commented at 2:19 pm on October 31, 2024: member
It is not true that in general that
Assume
checks were “optimized out” before #31093 because the checks are performed before theinline_assertion_check
function is called, so wouldn’t be skipped in general just because the function is implemented differently.I may be missing something, but I don’t think this is true. The condition
!val
is never evaluated (and the compiler doesn’t has to account for it being evaluated and doesn’t have to emit any statements for it) in the following code example:0if (false) { 1 if (!val) { 2 assertion_fail(file, line, func, assertion); 3 } 4} 5return std::forward<T>(val);
However, if the condition was
false || g_fuzzing
, the compiler would have to account for it and emit the!val
check and theassertion_fail
branch, even if tagged with[[unlikely]]
. -
ryanofsky commented at 2:44 pm on October 31, 2024: contributorI could also be missing something, but doesn’t
Assume(<check expression>)
just expand toinline_assertion_check<false>(<check expression>, __FILE__, __LINE__, __func__, "<check expression>")
so unless the compiler knows<check expression>
does not have side effects, must execute it and can’t optimize it out? IfAssume
macro were implemented differently it could skip evaluating<check expression>
at all. -
sipa commented at 2:58 pm on October 31, 2024: member
@ryanofsky Yeah, the context here is exactly
Assume()
calls where the conditions is easily shown to be side-effect free by the compiler. I have been using those fairly liberally (inside production code), because it means free extra checks in fuzzing tests that have no runtime cost, but are simultaneously definitely not going to result in different behavior between fuzz tests and production code (because it’s the compiler deciding the condition is side-effect free).I think these are very useful to have, but it inherently requires a separate compilation for fuzz tests (or any other testing environment where we’d want them enabled). I see that such a separate compilation has downsides too, but the alternative I think does mean just getting rid of some of the most tight-loop Assumes. I don’t think it’s practical to prevent those from running by putting a conditional higher up the stack (at least not without introducing a much more significant risk of distinct behavior between production and testing).
-
maflcko commented at 3:10 pm on October 31, 2024: member
so unless the compiler knows
<check expression>
does not have side effects, must execute it and can’t optimize it out?Correct, but the that hasn’t been changed by
g_fuzzing
at all. It is not the execution of<check expression>
, but the execution of the negation of it (wrong, see #31178 (comment) ), and the corresponding expensive branch in hot paths which has to be emitted for the unlikely case that the negation evaluates to true. -
sipa commented at 3:16 pm on October 31, 2024: member
Correct, but the that hasn’t been changed by
g_fuzzing
at all.Ah, right, it’s possible that the compiler can turn
Assume(<expr>);
intoif (g_fuzzing) { bool x = <expr>; if (!x) { ... } }
, where<expr>
is not evaluated at all ifg_fuzzing
is false. I was operating under the assumption that with the emitted branch,<expr>
would always be evaluated even if side-effect free, but that is indeed not the case.Still, just the fact that the
g_fuzzing
condition is there means a conditional is likely emitted, which can slow things down even if correctly predicted. -
maflcko commented at 3:26 pm on October 31, 2024: memberIn any case #31191 should fix the recently introduced
g_fuzzing
regression. If there are suggestions for other (more fundamental) changes or alternative code patches, separate pull requests would probably be best going forward. I am happy to review any of them, if they exist. -
ryanofsky commented at 3:46 pm on October 31, 2024: contributor
After testing this, it does seem like, in practice, the slowdown comes from checking the
g_fuzzing
variable, not from evaluating the assume condition. I don’t agree with (or don’t understand) Marco’s logic that this was neccesarily the case, since before #31093 if the compiler knew evaluating the assume condition didn’t have side effects and that result of the assume condition was not used, it would have no reason to evaluate it before #31093, but would need to evaluate it after #31093, so an optimization that would skip evaluating the assume condition wheng_fuzzing
is false could result in a speedup.But in practice, the speedup that comes from doing this is relatively minor. When I run the LinearizeOptimallyExample11 benchmark on master, I get around 1.4 operations per second. With a change to skip evaluating the assume condition when
g_fuzzing
is false, this increases to 1.5 operations per second. But if I revert #31093, it increases to 1.8 operation to second.Change I was tested with was:
0--- a/src/util/check.h 1+++ b/src/util/check.h 2@@ -41,21 +41,24 @@ T&& inline_check_non_fatal(LIFETIMEBOUND T&& val, const char* file, int line, co 3 void assertion_fail(std::string_view file, int line, std::string_view func, std::string_view assertion); 4 5 /** Helper for Assert()/Assume() */ 6-template <bool IS_ASSERT, typename T> 7+template <typename T> 8 constexpr T&& inline_assertion_check(LIFETIMEBOUND T&& val, [[maybe_unused]] const char* file, [[maybe_unused]] int line, [[maybe_unused]] const char* func, [[maybe_unused]] const char* assertion) 9 { 10- if (IS_ASSERT || std::is_constant_evaluated() || g_fuzzing 11-#ifdef ABORT_ON_FAILED_ASSUME 12- || true 13-#endif 14- ) { 15 if (!val) { 16 assertion_fail(file, line, func, assertion); 17 } 18- } 19 return std::forward<T>(val); 20 } 21 22+constexpr bool AbortOnFailedAssume() { 23+#ifdef ABORT_ON_FAILED_ASSUME 24+ return true; 25+#endif 26+ if (std::is_constant_evaluated()) return true; 27+ if (g_fuzzing) [[unlikely]] return true; 28+ return false; 29+} 30+ 31 // All macros may use __func__ inside a lambda, so put them under nolint. 32 // NOLINTBEGIN(bugprone-lambda-function-name) 33 34@@ -76,7 +79,7 @@ constexpr T&& inline_assertion_check(LIFETIMEBOUND T&& val, [[maybe_unused]] con 35 inline_check_non_fatal(condition, __FILE__, __LINE__, __func__, #condition) 36 37 /** Identity function. Abort if the value compares equal to zero */ 38-#define Assert(val) inline_assertion_check<true>(val, __FILE__, __LINE__, __func__, #val) 39+#define Assert(val) inline_assertion_check(val, __FILE__, __LINE__, __func__, #val) 40 41 /** 42 * Assume is the identity function. 43@@ -88,7 +91,7 @@ constexpr T&& inline_assertion_check(LIFETIMEBOUND T&& val, [[maybe_unused]] con 44 * - For non-fatal errors in interactive sessions (e.g. RPC or command line 45 * interfaces), CHECK_NONFATAL() might be more appropriate. 46 */ 47-#define Assume(val) inline_assertion_check<false>(val, __FILE__, __LINE__, __func__, #val) 48+#define Assume(val) inline_assertion_check(!AbortOnFailedAssume() || (val), __FILE__, __LINE__, __func__, #val) 49 50 /** 51 * NONFATAL_UNREACHABLE() is a macro that is used to mark unreachable code. It throws a NonFatalCheckError.
-
ryanofsky referenced this in commit 03cff2c142 on Nov 5, 2024
-
ryanofsky closed this on Nov 5, 2024
This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-04 06:12 UTC
More mirrored repositories can be found on mirror.b10c.me