g_genesis_wait_cv
is similar to m_tip_block_cv
but shuffling everything through a redundant boost::signals2
.
So remove it, along with some other dead code, as well as minor fixups.
g_genesis_wait_cv
is similar to m_tip_block_cv
but shuffling everything through a redundant boost::signals2
.
So remove it, along with some other dead code, as well as minor fixups.
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
For detailed information about the code coverage, see the test coverage report.
See the guideline for information on the review process.
Type | Reviewers |
---|---|
ACK | ryanofsky, Sjors, TheCharlatan |
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.
Reviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
1797@@ -1822,17 +1798,17 @@ bool AppInitMain(NodeContext& node, interfaces::BlockAndHeaderTipInfo* tip_info)
1798
1799 // ********************************************************* Step 12: start node
1800
1801- //// debug print
utACK fa964f82c52a7d0fd40c5133cf5fc18ef13819e5
Dropping g_genesis_wait_cv
is a nice simplification.
944@@ -945,7 +945,7 @@ class MinerImpl : public Mining
945 const auto deadline{now + timeout};
946 {
947 WAIT_LOCK(notifications().m_tip_block_mutex, lock);
948- while ((notifications().m_tip_block == uint256() || notifications().m_tip_block == current_tip) && !chainman().m_interrupt) {
949+ while ((notifications().m_tip_block == uint256::ZERO || notifications().m_tip_block == current_tip) && !chainman().m_interrupt) {
wait_for
that takes a predicate, rather than wrapping a wait_until
in a while()
.
I don’t think this is possible in a refactor, because the code was written to ignore system-time adjustments and always ensure the wait is measured on the steady clock. (No idea if this matters or even was intentional)
If you disagree, I am happy to take and push any diff that is a refactor or a behavior change.
re: #30967 (review)
I’m not sure about the steady clock thing, but I think the reason this code is written to use a while loop is because it wants to wake up every 1000ms to check m_interrupt in case m_interrupt
was signaled without m_tip_block_cv
being signalled. I posted another change above to ensure that signalling m_interrupt
will reliably signal m_tip_block_cv
, After that, I think this code could be changed to use a simple wait predicate and drop the 1000ms polling.
I’m not sure about the steady clock thing
I was referring to https://en.cppreference.com/w/cpp/thread/condition_variable/wait_until#Notes, but it can probably be dropped. I just wanted to mention that it may not be a refactor in a strict sense.
re: #30967 (review)
I think Cory’s suggestion could be implemented here with:
0--- a/src/node/interfaces.cpp
1+++ b/src/node/interfaces.cpp
2@@ -940,17 +940,11 @@ public:
3
4 BlockRef waitTipChanged(uint256 current_tip, MillisecondsDouble timeout) override
5 {
6- // Interrupt check interval
7- const MillisecondsDouble tick{1000};
8- auto now{std::chrono::steady_clock::now()};
9- const auto deadline{now + timeout};
10 {
11 WAIT_LOCK(notifications().m_tip_block_mutex, lock);
12- while ((notifications().m_tip_block == uint256::ZERO || notifications().m_tip_block == current_tip) && !chainman().m_interrupt) {
13- now = std::chrono::steady_clock::now();
14- if (now >= deadline) break;
15- notifications().m_tip_block_cv.wait_until(lock, std::min(deadline, now + tick));
16- }
17+ notifications().m_tip_block_cv.wait_for(lock, timeout, [&] {
18+ return (notifications().m_tip_block != current_tip && notifications().m_tip_block != uint256::ZERO) || chainman().m_interrupt;
19+ });
20 }
21 // Must release m_tip_block_mutex before locking cs_main, to avoid deadlocks.
22 LOCK(::cs_main);
I think the reason this code is written to use a while loop is because it wants to wake up every 1000ms to check m_interrupt in case
m_interrupt
was signaled withoutm_tip_block_cv
being signalled
IIRC that’s indeed why.
the code was written to ignore system-time adjustments and always ensure the wait is measured on the steady clock
The Sv2 TemplateProvider unit tests rely on SetMockTime
, though I think they mostly care about timeout
, not the 1000ms
shutdown ticker. In any case I can adjust those tests if needed to simplify this code.
The Sv2 TemplateProvider unit tests rely on
SetMockTime
mocktime only affects the NodeClock
, neither the system clock, nor the steady clock (obviously), so I don’t think this will affect your unit tests and you don’t have to change anything.
I think Cory’s suggestion could be implemented here with:
Thanks, pushed the diff as a new commit on top, with added clang-threadsafety annotations.
Actually it uncovered a bug in GCC:
0node1 stderr /usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/chrono.h:225:8: runtime error: inf is outside the range of representable values of type 'long'
1 [#0](/bitcoin-bitcoin/0/) 0x55c8ba4a1562 in std::chrono::duration<long, std::ratio<1l, 1000000000l>> std::chrono::__duration_cast_impl<std::chrono::duration<long, std::ratio<1l, 1000000000l>>, std::ratio<1000000l, 1l>, double, false, true>::__cast<double, std::ratio<1l, 1000l>>(std::chrono::duration<double, std::ratio<1l, 1000l>> const&) /usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/chrono.h:225:8
2 [#1](/bitcoin-bitcoin/1/) 0x55c8ba4a1562 in std::enable_if<__is_duration<std::chrono::duration<long, std::ratio<1l, 1000000000l>>>::value, std::chrono::duration<long, std::ratio<1l, 1000000000l>>>::type std::chrono::duration_cast<std::chrono::duration<long, std::ratio<1l, 1000000000l>>, double, std::ratio<1l, 1000l>>(std::chrono::duration<double, std::ratio<1l, 1000l>> const&) /usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/chrono.h:287:11
3 [#2](/bitcoin-bitcoin/2/) 0x55c8ba4a1562 in std::enable_if<__is_duration<std::chrono::duration<long, std::ratio<1l, 1000000000l>>>::value, std::chrono::duration<long, std::ratio<1l, 1000000000l>>>::type std::chrono::ceil<std::chrono::duration<long, std::ratio<1l, 1000000000l>>, double, std::ratio<1l, 1000l>>(std::chrono::duration<double, std::ratio<1l, 1000l>> const&) /usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/chrono.h:410:14
4 [#3](/bitcoin-bitcoin/3/) 0x55c8ba4a1562 in bool std::condition_variable::wait_for<double, std::ratio<1l, 1000l>, node::(anonymous namespace)::MinerImpl::waitTipChanged(uint256, std::chrono::duration<double, std::ratio<1l, 1000l>>)::'lambda'()>(std::unique_lock<std::mutex>&, std::chrono::duration<double, std::ratio<1l, 1000l>> const&, node::(anonymous namespace)::MinerImpl::waitTipChanged(uint256, std::chrono::duration<double, std::ratio<1l, 1000l>>)::'lambda'()) /usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/condition_variable:179:6
5 [#4](/bitcoin-bitcoin/4/) 0x55c8ba4a1562 in node::(anonymous namespace)::MinerImpl::waitTipChanged(uint256, std::chrono::duration<double, std::ratio<1l, 1000l>>) ci/scratch/build-x86_64-pc-linux-gnu/src/./src/node/interfaces.cpp:945:44
6 [#5](/bitcoin-bitcoin/5/) 0x55c8ba635ae9 in waitforblockheight()::$_0::operator()(RPCHelpMan const&, JSONRPCRequest const&) const ci/scratch/build-x86_64-pc-linux-gnu/src/./src/rpc/blockchain.cpp:397:27
7 [#6](/bitcoin-bitcoin/6/) 0x55c8ba635ae9 in UniValue std::__invoke_impl<UniValue, waitforblockheight()::$_0&, RPCHelpMan const&, JSONRPCRequest const&>(std::__invoke_other, waitforblockheight()::$_0&, RPCHelpMan const&, JSONRPCRequest const&) /usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/invoke.h:61:14
8 [#7](/bitcoin-bitcoin/7/) 0x55c8ba635ae9 in std::enable_if<is_invocable_r_v<UniValue, waitforblockheight()::$_0&, RPCHelpMan const&, JSONRPCRequest const&>, UniValue>::type std::__invoke_r<UniValue, waitforblockheight()::$_0&, RPCHelpMan const&, JSONRPCRequest const&>(waitforblockheight()::$_0&, RPCHelpMan const&, JSONRPCRequest const&) /usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/invoke.h:114:9
9 [#8](/bitcoin-bitcoin/8/) 0x55c8ba635ae9 in std::_Function_handler<UniValue (RPCHelpMan const&, JSONRPCRequest const&), waitforblockheight()::$_0>::_M_invoke(std::_Any_data const&, RPCHelpMan const&, JSONRPCRequest const&) /usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/std_function.h:290:9
10 [#9](/bitcoin-bitcoin/9/) 0x55c8bb3491fd in std::function<UniValue (RPCHelpMan const&, JSONRPCRequest const&)>::operator()(RPCHelpMan const&, JSONRPCRequest const&) const /usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/std_function.h:591:9
11 [#10](/bitcoin-bitcoin/10/) 0x55c8bb3491fd in RPCHelpMan::HandleRequest(JSONRPCRequest const&) const ci/scratch/build-x86_64-pc-linux-gnu/src/./src/rpc/util.cpp:667:20
12 [#11](/bitcoin-bitcoin/11/) 0x55c8ba650ee1 in CRPCCommand::CRPCCommand(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, RPCHelpMan (*)())::'lambda'(JSONRPCRequest const&, UniValue&, bool)::operator()(JSONRPCRequest const&, UniValue&, bool) const ci/scratch/build-x86_64-pc-linux-gnu/src/./src/rpc/server.h:101:91
13 [#12](/bitcoin-bitcoin/12/) 0x55c8ba85576c in std::function<bool (JSONRPCRequest const&, UniValue&, bool)>::operator()(JSONRPCRequest const&, UniValue&, bool) const /usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/std_function.h:591:9
14 [#13](/bitcoin-bitcoin/13/) 0x55c8ba85576c in ExecuteCommand(CRPCCommand const&, JSONRPCRequest const&, UniValue&, bool) ci/scratch/build-x86_64-pc-linux-gnu/src/./src/rpc/server.cpp:512:20
15 [#14](/bitcoin-bitcoin/14/) 0x55c8ba85576c in ExecuteCommands(std::vector<CRPCCommand const*, std::allocator<CRPCCommand const*>> const&, JSONRPCRequest const&, UniValue&) ci/scratch/build-x86_64-pc-linux-gnu/src/./src/rpc/server.cpp:477:13
16 [#15](/bitcoin-bitcoin/15/) 0x55c8ba8531b2 in CRPCTable::execute(JSONRPCRequest const&) const ci/scratch/build-x86_64-pc-linux-gnu/src/./src/rpc/server.cpp:497:13
17 [#16](/bitcoin-bitcoin/16/) 0x55c8ba852455 in JSONRPCExec(JSONRPCRequest const&, bool) ci/scratch/build-x86_64-pc-linux-gnu/src/./src/rpc/server.cpp:353:31
18 [#17](/bitcoin-bitcoin/17/) 0x55c8baad4b11 in HTTPReq_JSONRPC(std::any const&, HTTPRequest*) ci/scratch/build-x86_64-pc-linux-gnu/src/./src/httprpc.cpp:218:21
19 [#18](/bitcoin-bitcoin/18/) 0x55c8baaf1ad0 in std::function<bool (HTTPRequest*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&)>::operator()(HTTPRequest*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&) const /usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/std_function.h:591:9
20 [#19](/bitcoin-bitcoin/19/) 0x55c8baaf1ad0 in HTTPWorkItem::operator()() ci/scratch/build-x86_64-pc-linux-gnu/src/./src/httpserver.cpp:62:9
21 [#20](/bitcoin-bitcoin/20/) 0x55c8baaf76b3 in WorkQueue<HTTPClosure>::Run() ci/scratch/build-x86_64-pc-linux-gnu/src/./src/httpserver.cpp:117:13
22 [#21](/bitcoin-bitcoin/21/) 0x55c8baae4031 in HTTPWorkQueueRun(WorkQueue<HTTPClosure>*, int) ci/scratch/build-x86_64-pc-linux-gnu/src/./src/httpserver.cpp:415:12
23 [#22](/bitcoin-bitcoin/22/) 0x7f71e9fa5bb3 (/lib/x86_64-linux-gnu/libstdc++.so.6+0xeabb3) (BuildId: 40b9b0d17fdeebfb57331304da2b7f85e1396ef2)
24 [#23](/bitcoin-bitcoin/23/) 0x55c8ba0d69fc in asan_thread_start(void*) asan_interceptors.cpp.o
25 [#24](/bitcoin-bitcoin/24/) 0x7f71e9c1ca93 (/lib/x86_64-linux-gnu/libc.so.6+0x9ca93) (BuildId: 6d64b17fbac799e68da7ebd9985ddf9b5cb375e6)
26 [#25](/bitcoin-bitcoin/25/) 0x7f71e9ca9c3b (/lib/x86_64-linux-gnu/libc.so.6+0x129c3b) (BuildId: 6d64b17fbac799e68da7ebd9985ddf9b5cb375e6)
27
28SUMMARY: UndefinedBehaviorSanitizer: float-cast-overflow /usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/chrono.h:225:8
I guess the bug is in stdlibc++ as well: https://godbolt.org/z/3az431Gdn
Yeah I guess values too large or too small are unsupported in the stdlibs for now and have to be avoided before passing them in.
I’ve capped at 100 years for now.
1789 // We previously could hang here if shutdown was requested prior to
1790 // ImportBlocks getting started, so instead we just wait on a timer to
1791 // check ShutdownRequested() regularly.
1792- while (!fHaveGenesis && !ShutdownRequested(node)) {
1793- g_genesis_wait_cv.wait_for(lock, std::chrono::milliseconds(500));
1794+ while (kernel_notifications.m_tip_block.IsNull() && !ShutdownRequested(node)) {
In commit “refactor: Replace g_genesis_wait_cv with m_tip_block_cv” (faaf888eb275cecc2d2803c5756afac633e0e2a4)
This code is waiting for m_tip_block
to be non-null, but then code below will be assuming that chainman.ActiveTip()
will be non-null. These are similar conditions but not exactly the same, so I think it would be good to add an assert below the if statement like assert(WITH_LOCK(chainman.GetMutex(), return chainman.ActiveChain().Tip() != nullptr));
to make sure this code is functioning as expected.
These are similar conditions but not exactly the same
The m_tip_block
condition implies the Tip()
condition, because SetTip()
happens before blockTip()
.
An assert (Assert
) is already present in the next commit, which was asked to be split up: #30967 (review)
8@@ -9,7 +9,6 @@
9 #include <rpc/request.h>
10 #include <rpc/util.h>
11
12-#include <any>
NodeContext
here instead.
Code review ACK fa61644768c3f6932860d1c5e2c6c0a38d5257e5.
Overall very nice changes, but I think the first commit fa5ac85d19a387c2e12fc0833368e1240a3dcf51 has a drawback because it adds a NodeContext
argument to RPCStop
when ideally RPC code should be agnostic to node initialization and not depend on node types. Would suggest following change to clean it up, which could be squashed into the first commit:
This would also enable additional simplification in AppInitMain which could be squashed into commit faaf888eb275cecc2d2803c5756afac633e0e2a4:
(Suggested commits are from branch https://github.com/ryanofsky/bitcoin/tree/review.30967.2-edit based on this PR)
🚧 At least one of the CI tasks failed. Debug: https://github.com/bitcoin/bitcoin/runs/30654153102
Make sure to run all tests locally, according to the documentation.
The failure may happen due to a number of reasons, for example:
Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.
A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.
An intermittent issue.
Leave a comment here, if you need help tracking down a confusing failure.
302@@ -303,9 +303,6 @@ void StopRPC(const std::any& context)
303 LogDebug(BCLog::RPC, "Stopping RPC\n");
304 WITH_LOCK(g_deadline_timers_mutex, deadlineTimers.clear());
305 DeleteAuthCookie();
306- node::NodeContext& node = EnsureAnyNodeContext(context);
307- // The notifications interface doesn't exist between initialization step 4a and 7.
if (node.notifications)
check is still there)
Commit 86bddb439bff846630b1efdea777119439ba56b2 is a bit over my head, but seems like a good idea.
Light code review ACK fa7c2c9f18f5468baf81c04177b75e2b684a327a.
Code review ACK fa877f4c5f9abb057f1476f21a063e0f687d0ad1. Since last review, just added new commit with another simplification.
Could consider squashing the new commit:
into earlier commit:
since they are touching same code, and there should be no need to have one commit simplifying some arithmetic followed by another commit removing the arithmetic.
Could also consider moving the doc commit:
to be first, so it’s clear the documentation fixes in it apply to current code, not just new code.
947@@ -948,7 +948,7 @@ class MinerImpl : public Mining
948 if (deadline < now) deadline = std::chrono::steady_clock::time_point::max();
949 {
950 WAIT_LOCK(notifications().m_tip_block_mutex, lock);
951- while ((notifications().m_tip_block == uint256() || notifications().m_tip_block == current_tip) && !chainman().m_interrupt) {
952+ while ((notifications().m_tip_block == uint256::ZERO || notifications().m_tip_block == current_tip) && !chainman().m_interrupt) {
In commit “refactor: Add missing GUARDED_BY(m_tip_block_mutex)” (fa47d893acdd18132204810dd95880ab21abf415)
It seems like it would be a little better to make this change in the next commit “refactor: Use wait_for predicate to check for interrupt” (faa3f411f6842bf1201b25d6391b1913e1b72439) instead of this one so this code is not changing more than once.
944- const MillisecondsDouble tick{1000};
945- auto now{std::chrono::steady_clock::now()};
946- auto deadline = now + timeout;
947- // std::chrono does not check against overflow
948- if (deadline < now) deadline = std::chrono::steady_clock::time_point::max();
949+ if (timeout > std::chrono::years{100}) timeout = std::chrono::years{100}; // Upper bound to avoid UB in std::chrono
In commit “refactor: Use wait_for predicate to check for interrupt” (faa3f411f6842bf1201b25d6391b1913e1b72439)
Might be better to avoid the 100 year hack by switching to wait_until
0--- a/src/node/interfaces.cpp
1+++ b/src/node/interfaces.cpp
2@@ -940,10 +940,10 @@ public:
3
4 BlockRef waitTipChanged(uint256 current_tip, MillisecondsDouble timeout) override
5 {
6- if (timeout > std::chrono::years{100}) timeout = std::chrono::years{100}; // Upper bound to avoid UB in std::chrono
7 {
8 WAIT_LOCK(notifications().m_tip_block_mutex, lock);
9- notifications().m_tip_block_cv.wait_for(lock, timeout, [&]() EXCLUSIVE_LOCKS_REQUIRED(notifications().m_tip_block_mutex) {
10+ const auto deadline{std::chrono::steady_clock::now() + timeout};
11+ notifications().m_tip_block_cv.wait_until(lock, deadline, [&]() EXCLUSIVE_LOCKS_REQUIRED(notifications().m_tip_block_mutex) {
12 return (notifications().m_tip_block != current_tip && notifications().m_tip_block != uint256::ZERO) || chainman().m_interrupt;
13 });
14 }
This fails with the same error. It would have also surprised me if a float-cast UB was hit in one code path, but not in an equivalent one.
0node1 stderr /usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/chrono.h:212:8: runtime error: inf is outside the range of representable values of type 'long'
1 [#0](/bitcoin-bitcoin/0/) 0x55cbcf9b4fb3 in std::chrono::duration<long, std::ratio<1l, 1l>> std::chrono::__duration_cast_impl<std::chrono::duration<long, std::ratio<1l, 1l>>, std::ratio<1l, 1000000000l>, double, true, false>::__cast<double, std::ratio<1l, 1000000000l>>(std::chrono::duration<double, std::ratio<1l, 1000000000l>> const&) /usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/chrono.h:212:8
2 [#1](/bitcoin-bitcoin/1/) 0x55cbcf9b4fb3 in std::enable_if<__is_duration<std::chrono::duration<long, std::ratio<1l, 1l>>>::value, std::chrono::duration<long, std::ratio<1l, 1l>>>::type std::chrono::duration_cast<std::chrono::duration<long, std::ratio<1l, 1l>>, double, std::ratio<1l, 1000000000l>>(std::chrono::duration<double, std::ratio<1l, 1000000000l>> const&) /usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/chrono.h:287:11
3 [#2](/bitcoin-bitcoin/2/) 0x55cbcf9b4fb3 in std::enable_if<__is_duration<std::chrono::duration<long, std::ratio<1l, 1l>>>::value, std::chrono::time_point<std::chrono::_V2::steady_clock, std::chrono::duration<long, std::ratio<1l, 1l>>>>::type std::chrono::time_point_cast<std::chrono::duration<long, std::ratio<1l, 1l>>, std::chrono::_V2::steady_clock, std::chrono::duration<double, std::ratio<1l, 1000000000l>>>(std::chrono::time_point<std::chrono::_V2::steady_clock, std::chrono::duration<double, std::ratio<1l, 1000000000l>>> const&) /usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/chrono.h:1022:22
4 [#3](/bitcoin-bitcoin/3/) 0x55cbcf9b4fb3 in std::cv_status std::condition_variable::__wait_until_impl<std::chrono::duration<double, std::ratio<1l, 1000000000l>>>(std::unique_lock<std::mutex>&, std::chrono::time_point<std::chrono::_V2::steady_clock, std::chrono::duration<double, std::ratio<1l, 1000000000l>>> const&) /usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/condition_variable:194:13
5 [#4](/bitcoin-bitcoin/4/) 0x55cbcf99f119 in std::cv_status std::condition_variable::wait_until<std::chrono::duration<double, std::ratio<1l, 1000000000l>>>(std::unique_lock<std::mutex>&, std::chrono::time_point<std::chrono::_V2::steady_clock, std::chrono::duration<double, std::ratio<1l, 1000000000l>>> const&) /usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/condition_variable:113:16
6 [#5](/bitcoin-bitcoin/5/) 0x55cbcf99f119 in bool std::condition_variable::wait_until<std::chrono::_V2::steady_clock, std::chrono::duration<double, std::ratio<1l, 1000000000l>>, node::(anonymous namespace)::MinerImpl::waitTipChanged(uint256, std::chrono::duration<double, std::ratio<1l, 1000l>>)::'lambda'()>(std::unique_lock<std::mutex>&, std::chrono::time_point<std::chrono::_V2::steady_clock, std::chrono::duration<double, std::ratio<1l, 1000000000l>>> const&, node::(anonymous namespace)::MinerImpl::waitTipChanged(uint256, std::chrono::duration<double, std::ratio<1l, 1000l>>)::'lambda'()) /usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/condition_variable:154:8
7 [#6](/bitcoin-bitcoin/6/) 0x55cbcf99f119 in node::(anonymous namespace)::MinerImpl::waitTipChanged(uint256, std::chrono::duration<double, std::ratio<1l, 1000l>>) ci/scratch/build-x86_64-pc-linux-gnu/src/./src/node/interfaces.cpp:945:44
8 [#7](/bitcoin-bitcoin/7/) 0x55cbcfb33ca9 in waitforblockheight()::$_0::operator()(RPCHelpMan const&, JSONRPCRequest const&) const ci/scratch/build-x86_64-pc-linux-gnu/src/./src/rpc/blockchain.cpp:397:27
9 [#8](/bitcoin-bitcoin/8/) 0x55cbcfb33ca9 in UniValue std::__invoke_impl<UniValue, waitforblockheight()::$_0&, RPCHelpMan const&, JSONRPCRequest const&>(std::__invoke_other, waitforblockheight()::$_0&, RPCHelpMan const&, JSONRPCRequest const&) /usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/invoke.h:61:14
10 [#9](/bitcoin-bitcoin/9/) 0x55cbcfb33ca9 in std::enable_if<is_invocable_r_v<UniValue, waitforblockheight()::$_0&, RPCHelpMan const&, JSONRPCRequest const&>, UniValue>::type std::__invoke_r<UniValue, waitforblockheight()::$_0&, RPCHelpMan const&, JSONRPCRequest const&>(waitforblockheight()::$_0&, RPCHelpMan const&, JSONRPCRequest const&) /usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/invoke.h:114:9
11 [#10](/bitcoin-bitcoin/10/) 0x55cbcfb33ca9 in std::_Function_handler<UniValue (RPCHelpMan const&, JSONRPCRequest const&), waitforblockheight()::$_0>::_M_invoke(std::_Any_data const&, RPCHelpMan const&, JSONRPCRequest const&) /usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/std_function.h:290:9
12 [#11](/bitcoin-bitcoin/11/) 0x55cbd08473bd in std::function<UniValue (RPCHelpMan const&, JSONRPCRequest const&)>::operator()(RPCHelpMan const&, JSONRPCRequest const&) const /usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/std_function.h:591:9
13 [#12](/bitcoin-bitcoin/12/) 0x55cbd08473bd in RPCHelpMan::HandleRequest(JSONRPCRequest const&) const ci/scratch/build-x86_64-pc-linux-gnu/src/./src/rpc/util.cpp:667:20
14 [#13](/bitcoin-bitcoin/13/) 0x55cbcfb4f0a1 in CRPCCommand::CRPCCommand(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, RPCHelpMan (*)())::'lambda'(JSONRPCRequest const&, UniValue&, bool)::operator()(JSONRPCRequest const&, UniValue&, bool) const ci/scratch/build-x86_64-pc-linux-gnu/src/./src/rpc/server.h:101:91
15 [#14](/bitcoin-bitcoin/14/) 0x55cbcfd5392c in std::function<bool (JSONRPCRequest const&, UniValue&, bool)>::operator()(JSONRPCRequest const&, UniValue&, bool) const /usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/std_function.h:591:9
16 [#15](/bitcoin-bitcoin/15/) 0x55cbcfd5392c in ExecuteCommand(CRPCCommand const&, JSONRPCRequest const&, UniValue&, bool) ci/scratch/build-x86_64-pc-linux-gnu/src/./src/rpc/server.cpp:512:20
17 [#16](/bitcoin-bitcoin/16/) 0x55cbcfd5392c in ExecuteCommands(std::vector<CRPCCommand const*, std::allocator<CRPCCommand const*>> const&, JSONRPCRequest const&, UniValue&) ci/scratch/build-x86_64-pc-linux-gnu/src/./src/rpc/server.cpp:477:13
18 [#17](/bitcoin-bitcoin/17/) 0x55cbcfd51372 in CRPCTable::execute(JSONRPCRequest const&) const ci/scratch/build-x86_64-pc-linux-gnu/src/./src/rpc/server.cpp:497:13
19 [#18](/bitcoin-bitcoin/18/) 0x55cbcfd50615 in JSONRPCExec(JSONRPCRequest const&, bool) ci/scratch/build-x86_64-pc-linux-gnu/src/./src/rpc/server.cpp:353:31
20 [#19](/bitcoin-bitcoin/19/) 0x55cbcffd2cd1 in HTTPReq_JSONRPC(std::any const&, HTTPRequest*) ci/scratch/build-x86_64-pc-linux-gnu/src/./src/httprpc.cpp:218:21
21 [#20](/bitcoin-bitcoin/20/) 0x55cbcffefc90 in std::function<bool (HTTPRequest*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&)>::operator()(HTTPRequest*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&) const /usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/std_function.h:591:9
22 [#21](/bitcoin-bitcoin/21/) 0x55cbcffefc90 in HTTPWorkItem::operator()() ci/scratch/build-x86_64-pc-linux-gnu/src/./src/httpserver.cpp:62:9
23 [#22](/bitcoin-bitcoin/22/) 0x55cbcfff5873 in WorkQueue<HTTPClosure>::Run() ci/scratch/build-x86_64-pc-linux-gnu/src/./src/httpserver.cpp:117:13
24 [#23](/bitcoin-bitcoin/23/) 0x55cbcffe21f1 in HTTPWorkQueueRun(WorkQueue<HTTPClosure>*, int) ci/scratch/build-x86_64-pc-linux-gnu/src/./src/httpserver.cpp:415:12
25 [#24](/bitcoin-bitcoin/24/) 0x7fb54623cbb3 (/lib/x86_64-linux-gnu/libstdc++.so.6+0xeabb3) (BuildId: 40b9b0d17fdeebfb57331304da2b7f85e1396ef2)
26 [#25](/bitcoin-bitcoin/25/) 0x55cbcf5d49fc in asan_thread_start(void*) asan_interceptors.cpp.o
27 [#26](/bitcoin-bitcoin/26/) 0x7fb545eb3a93 (/lib/x86_64-linux-gnu/libc.so.6+0x9ca93) (BuildId: 6d64b17fbac799e68da7ebd9985ddf9b5cb375e6)
28 [#27](/bitcoin-bitcoin/27/) 0x7fb545f40c3b (/lib/x86_64-linux-gnu/libc.so.6+0x129c3b) (BuildId: 6d64b17fbac799e68da7ebd9985ddf9b5cb375e6)
29
30SUMMARY: UndefinedBehaviorSanitizer: float-cast-overflow /usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/chrono.h:212:8
if
statement is a bit tricky to read, because of the open braces on the next line. How about adding whitespace, or adding braces to the if statement?
950 {
951 WAIT_LOCK(notifications().m_tip_block_mutex, lock);
952- while ((notifications().m_tip_block == uint256() || notifications().m_tip_block == current_tip) && !chainman().m_interrupt) {
953- now = std::chrono::steady_clock::now();
954- if (now >= deadline) break;
955- notifications().m_tip_block_cv.wait_until(lock, std::min(deadline, now + tick));
953- now = std::chrono::steady_clock::now();
954- if (now >= deadline) break;
955- notifications().m_tip_block_cv.wait_until(lock, std::min(deadline, now + tick));
956- }
957+ notifications().m_tip_block_cv.wait_for(lock, timeout, [&]() EXCLUSIVE_LOCKS_REQUIRED(notifications().m_tip_block_mutex) {
958+ return (notifications().m_tip_block != current_tip && notifications().m_tip_block != uint256::ZERO) || chainman().m_interrupt;
444424b96deef72780f617d1dbab844a2c9d72ab I think that since waitTipChanged
takes a current_tip
argument the m_tip_block != uint256::ZERO
check is no longer needed. Though it seems fine to keep it.
IIRC the reason I added it was in case waitTipChanged
was called during early node initialization, before m_tip_block
is set to the genesis block.
That scenario is now handled by m_tip_block != current_tip
.
In terms of how the Template Provider works, it used to rely on this check, but now it just waits for getTip()
to return a value before calling waitTipChanged()
. .
When implemented as part of Bitcoin Core, as in #29432, it can’t call waitTipChanged()
earlier, because it has no way of knowing what to set for current_tip
.
When implemented as an external application (e.g. https://github.com/Sjors/bitcoin/pull/48) it might know the tip from some other source before m_tip_block
is set. So it could call waitTipChanged()
very early in node initialization, but that still doesn’t require the m_tip_block != uint256::ZERO
check.
The m_tip_block.IsNull()
is used in fa05721cbc622b9b044af920f63312a5e407a022, but that doesn’t matter here, so I am happy to drop it. (In theory it should be std::optional
, instead of encoding nullopt as zero, but that may be overkill)
IIRC the reason I added it was in case
waitTipChanged
was called during early node initialization, beforem_tip_block
is set to the genesis block.
I don’t think this is true. m_tip_block
may never be set to the genesis block, and stay zero forever if no block is ever connected and state is just initialized from persisted storage.
You can test this by calling waitfornewblock 11 0000000000000000000000000000000000000000000000000000000000000000
and observing the stdout 1
, indicating that the block tip is null, using the following diff:
0diff --git a/src/node/interfaces.cpp b/src/node/interfaces.cpp
1index cbbe86d669..122e0020fc 100644
2--- a/src/node/interfaces.cpp
3+++ b/src/node/interfaces.cpp
4@@ -943,6 +943,7 @@ public:
5 if (timeout > std::chrono::years{100}) timeout = std::chrono::years{100}; // Upper bound to avoid UB in std::chrono
6 {
7 WAIT_LOCK(notifications().m_tip_block_mutex, lock);
8+ std::cout << notifications().m_tip_block.IsNull() << std::endl;
9 notifications().m_tip_block_cv.wait_for(lock, timeout, [&]() EXCLUSIVE_LOCKS_REQUIRED(notifications().m_tip_block_mutex) {
10 return (notifications().m_tip_block != current_tip && notifications().m_tip_block != uint256::ZERO) || chainman().m_interrupt;
11 });
12diff --git a/src/rpc/blockchain.cpp b/src/rpc/blockchain.cpp
13index 683a9dae88..2b47c3327d 100644
14--- a/src/rpc/blockchain.cpp
15+++ b/src/rpc/blockchain.cpp
16@@ -262,6 +262,7 @@ static RPCHelpMan waitfornewblock()
17 "\nMake sure to use no RPC timeout (bitcoin-cli -rpcclienttimeout=0)",
18 {
19 {"timeout", RPCArg::Type::NUM, RPCArg::Default{0}, "Time in milliseconds to wait for a response. 0 indicates no timeout."},
20+ {"blockhash", RPCArg::Type::STR_HEX, RPCArg::Optional::OMITTED, "Last known block hash. Returns immediately if the tip is different."},
21 },
22 RPCResult{
23 RPCResult::Type::OBJ, "", "",
24@@ -284,6 +285,10 @@ static RPCHelpMan waitfornewblock()
25 Mining& miner = EnsureMining(node);
26
27 auto block{CHECK_NONFATAL(miner.getTip()).value()};
28+ if (!request.params[1].isNull()) {
29+ block.hash=(ParseHashV(request.params[1], "blockhash"));
30+ }
31+
32 if (IsRPCRunning()) {
33 block = timeout ? miner.waitTipChanged(block.hash, std::chrono::milliseconds(timeout)) : miner.waitTipChanged(block.hash);
34 }
Either approach seems ok. I probably wouldn’t have added the null check, but it does a provide a small convenience if miner is running while the node is restarted, since waitTipChanged will patiently wait for startup to complete and a new block to be connected, instead of returning 0 on the first waitTipChanged call and requiring the miner to make a second call.
Whether we keep this check or drop it, it would probably be good to update the waitTipChanged
documentation for the method to say that it either returns 0 on startup (if null check is dropped) or waits for the first block to be connected on startup (if null check is kept).
re: #30967 (review)
In terms of how the Template Provider works, it used to rely on this check, but now it just waits for
getTip()
to return a value before callingwaitTipChanged()
. .When implemented as part of Bitcoin Core, as in #29432, it can’t call
waitTipChanged()
earlier, because it has no way of knowing what to set forcurrent_tip
.
I may be misunderstanding, but this doesn’t seem right to me. If a caller just wants to know when a new block is connected it shouldn’t need to call getTip()
in a loop until it returns non-null before calling waitTipChanged()
, if that is what this comment is saying. The code should just be able to call waitTipChanged()
with the zero value, which will return as soon as node is started and it knows what the tip is. (But I do agree with Marco it might be better to make let waitTipChanged() take a std::optional hash and not treat 0 as special). If this doesn’t work, it suggests there could be a bug and code is not setting m_tip_block
reliably.
documentation
Updated the m_tip_block
and waitTipChanged
docs.
The code should just be able to call
waitTipChanged()
with the zero value, which will return as soon as node is started and it knows what the tip is.
That indeed seems like a better approach, will change that for the TP.
re-utACK cccc5b19fba6d04e586f59bff5f5b2cdef3dffba
The main change since my last review is in 444424b96deef72780f617d1dbab844a2c9d72ab, which is a nice simplification.
204@@ -205,7 +205,14 @@ void InitContext(NodeContext& node)
205 g_shutdown.emplace();
206
207 node.args = &gArgs;
208- node.shutdown = &*g_shutdown;
209+ node.shutdown_signal = &*g_shutdown;
210+ node.shutdown_request = [&node] {
211+ assert(node.shutdown_signal);
212+ if (!(*node.shutdown_signal)()) return false;
213+ // Wake any threads that may be waiting for the tip to change.
To avoid a race, this needs to lock node.notifications->m_tip_block_mutex
before calling notify. (*node.shutdown_signal)()
doesn’t necessarily need to happen under the lock, but a lock is required after the signal in order to synchronize. Otherwise the update could be missed and we could potentially wait forever.
This is currently an example of this trap (see the “An atomic predicate” example).
See another example (and solution) here (ctrl+f for “acquisition above is significant”).
So this would become:
0if (!(*node.shutdown_signal)()) return false;
1// Wake any threads that may be waiting for the tip to change.
2if (node.notifications) {
3 {
4 // synchronize the shutdown state
5 LOCK(node.notifications->m_tip_block_mutex);
6 }
7 node.notifications->m_tip_block_cv.notify_all();
8}
Thanks, good catch. I knew that pthreads required the lock to be held while notifying to avoid the lost wakeup bug (“if predictable scheduling behavior is required”), and that c++ dropped this requirement, even recommending against it saying “Doing so may be a pessimization…”.
But I didn’t know that c++ still requires the lock to be held before notifying, even though it now recommends against holding the lock while notifying.
It does make sense that after the notifying thread updates the shutdown state, some extra synchronization is needed to make sure the notification is not sent too early, before the waiting thread is actually waiting.
Suggested fix looks right to me. And it looks like this bug might have been present even before this commit, though was much less likely to happen since notify_all call was happening long after the shutdown flag was set, not immediately after.
notify_all
, so doing it here once more during shutdown shouldn’t be noticeable either way.
Added commit fa3ad1da70 bugfix: Mark m_tip_block_cv as guarded by m_tip_block_mutex
It should be obvious that a wait is not needed if the tip does not
match.
Also, remove a comment that the blockTip notification was only meant for
the "UI". (It is used by other stuff for a long time)
Found by Cory Fields in
https://github.com/bitcoin/bitcoin/pull/30409#discussion_r1774001261
This is not strictly required, but all places using m_tip_block_cv
(except shutdown) already take the lock. The annotation makes it easier
to catch potential deadlocks before review.
Adding the missing lock to the shutdown sequence is a bugfix.
An alternative would be to take the lock and release it before
notifying, see
https://github.com/bitcoin/bitcoin/pull/30967#discussion_r1778899716
Instead of having a single NodeContext::shutdown member that is used both to
request shutdowns and check if they have been requested, use separate members
for each. Benefits of this change:
1. Should make code a little clearer and easier to search because it is easier
to see which parts of code are triggering shutdowns and which parts are just
checking to see if they were triggered.
2. Makes it possible for init.cpp to specify additional code to run when a
shutdown is requested, like signalling the m_tip_block_cv condition variable.
Motivation for this change was to remove hacky NodeContext argument and
m_tip_block_cv access from the StopRPC function, so StopRPC can just be
concerned with RPC functionality, not other node functionality.
Also use uint256::ZERO where appropriate for self-documenting code.
They achieve the same, but the extra indirection over boost::signals2 is
not needed.
The tip is set after waiting for the genesis block.
ACK fa22e5c430acaef9713d9a4b4b97bb3f4876f816
Manually tested -stopatheight
, a fatal indexer error (by breaking some code), calling bitcoin-cli stop
, manually quitting qt, calling stop
form the GUI console.
31@@ -35,8 +32,8 @@ static constexpr int DEFAULT_STOPATHEIGHT{0};
32 class KernelNotifications : public kernel::Notifications
33 {
34 public:
35- KernelNotifications(util::SignalInterrupt& shutdown, std::atomic<int>& exit_status, node::Warnings& warnings)
36- : m_shutdown(shutdown), m_exit_status{exit_status}, m_warnings{warnings} {}
37+ KernelNotifications(const std::function<bool()>& shutdown_request, std::atomic<int>& exit_status, node::Warnings& warnings)
38+ : m_shutdown_request(shutdown_request), m_exit_status{exit_status}, m_warnings{warnings} {}
ACK fa22e5c430acaef9713d9a4b4b97bb3f4876f816
Nice improvements to the shutdown handling since I last reviewed. The nits can be ignored.