Introduce waitTipChanged() mining interface, replace RPCNotifyBlockChange, drop CRPCSignals & g_best_block #30409

pull Sjors wants to merge 12 commits into bitcoin:master from Sjors:2024/07/wait-tip-changed changing 22 files +203 −164
  1. Sjors commented at 5:11 pm on July 8, 2024: member

    This continues the work in #30200 so that a future Stratum v2 Template Provider (see #29432) can avoid accessing node internals. It needs to know when a new block arrives in order to push new templates to connected clients.

    waitTipChanged() uses a new kernel notification notifications().m_tip_block_mutex, which this PR also introduces (a previous version used g_best_block).

    In order to ensure the new method works as intended, the waitfornewblock, waitforblock and waitforblockheight RPC methods are refactored to use it. This allows removing RPCNotifyBlockChange.

    There’s a commit to add (direct) tests for the methods that are about to be refactored:

    • waitfornewblock was already implicitly tested by feature_shutdown.py.
    • waitforblockheight by feature_coinstatsindex.py and example_test.py

    This PR renames getTipHash() to getTip() and returns a BlockRef (renamed from BlockKey) so that callers can use either the height or hash.

    The later commits make trivial improvements to the waitfor* RPC calls (not needed for this PR).

    The waitTipChanged() method could probably also be used for the longpoll functionality in getblocktemplate, but I’m a bit reluctant to touch that.

    RPCServer::OnStarted no longer does anything and RPCServer::OnStopped merely prints a log statement. They were added in #5711 as a refactor. This PR drops them entirely.

    Finally g_best_block is also dropped.

  2. DrahtBot commented at 5:11 pm on July 8, 2024: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage

    For detailed information about the code coverage, see the test coverage report.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK ryanofsky, TheCharlatan, achow101

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #30437 (multiprocess: add bitcoin-mine test program by ryanofsky)
    • #26593 (tracing: Only prepare tracepoint arguments when actually tracing by 0xB10C)
    • #19461 (multiprocess: Add bitcoin-gui -ipcconnect option by ryanofsky)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. DrahtBot added the label CI failed on Jul 8, 2024
  4. Sjors force-pushed on Jul 9, 2024
  5. Sjors renamed this:
    Introduce waitTipChanged() mining interface and replace RPCNotifyBlockChange
    Introduce waitTipChanged() mining interface, replace RPCNotifyBlockChange and drop CRPCSignals
    on Jul 9, 2024
  6. Sjors force-pushed on Jul 9, 2024
  7. Sjors commented at 8:17 am on July 9, 2024: member
    Some of the functional tests that use the refactored RPC calls still fail under TSAN, so keeping this draft until that’s resolved.
  8. Sjors force-pushed on Jul 9, 2024
  9. Sjors commented at 8:56 am on July 9, 2024: member

    Fixed the deadlock. I ran into the same issue before when working on #29432: UpdateTip requires holding cs_main. It then takes g_best_block_mutex in order to announces the new block. So then lines after g_best_block_cv.wait_until should release g_best_block_mutex before trying to lock cs_main.

    That’s not a problem here; I just had to break out of the while loop instead of returning from it.

    Back in the other PR this led to a deadlock if we simultaneously submitted a block and received it via p2p from the pool, which can happen in Stratum v2, see #29432 (comment)

  10. Sjors marked this as ready for review on Jul 9, 2024
  11. DrahtBot removed the label CI failed on Jul 9, 2024
  12. Sjors force-pushed on Jul 11, 2024
  13. in src/node/interfaces.cpp:879 in 482a5fd48d outdated
    866+
    867+        auto deadline = std::chrono::steady_clock::now() + timeout;
    868+        {
    869+            WAIT_LOCK(g_best_block_mutex, lock);
    870+            while (!chainman().m_interrupt && std::chrono::steady_clock::now() < deadline) {
    871+                auto check_time = std::chrono::steady_clock::now() + std::min(timeout, MillisecondsDouble(1000));
    


    Sjors commented at 12:35 pm on July 11, 2024:
    I changed this in the last push, previously it would always wait for one second each interrupt “round” even if the requested timeout was shorter.

    ryanofsky commented at 2:24 pm on August 12, 2024:

    In commit “Add waitTipChanged to Mining interface” (3435d7dd430559abffc964c163a2ae73febd99d1)

    I changed this in the last push, previously it would always wait for one second each interrupt “round” even if the requested timeout was shorter.

    This doesn’t seem right. If timeout is 1500ms and the block never changes, it looks like this will wait 1000ms+1000ms instead of 1000ms+500ms. I’m also concerned about the deadline = now() + timeout assignment above because it seems like that will be an overflowed value when timeout is MillisecondsDouble::max() so the while loop condition might be invalid. I fed this to chatgpt and it suggested the following which seems a bit cleaner & safer:

     0auto deadline = (timeout != std::chrono::duration::max()) 
     1    ? std::make_optional(std::chrono::steady_clock::now() + timeout)
     2    : std::nullopt;
     3
     4WAIT_LOCK(g_best_block_mutex, lock);
     5while (g_best_block == current_tip.hash && !chainman().m_interrupt) {
     6    auto now = std::chrono::steady_clock::now();
     7    auto check_time = now + MillisecondsDouble(1000);
     8    if (deadline) {
     9        if (now >= *deadline) {
    10            break;
    11        }
    12        check_time = std::min(*deadline, check_time);
    13    }
    14    g_best_block_cv.wait_until(lock, check_time);
    15}
    
  14. ryanofsky referenced this in commit 4e1a4342f3 on Jul 16, 2024
  15. ryanofsky referenced this in commit e1ae38f4bb on Jul 17, 2024
  16. ryanofsky referenced this in commit e1fa475a5b on Jul 17, 2024
  17. ryanofsky referenced this in commit 5126ac430f on Jul 17, 2024
  18. ryanofsky referenced this in commit bdd68d5bca on Jul 17, 2024
  19. Sjors referenced this in commit 173580b123 on Jul 17, 2024
  20. Sjors referenced this in commit 7967d78d73 on Jul 18, 2024
  21. Sjors referenced this in commit 79336c95fa on Jul 18, 2024
  22. DrahtBot added the label Needs rebase on Jul 18, 2024
  23. Sjors force-pushed on Jul 18, 2024
  24. Sjors commented at 4:59 pm on July 18, 2024: member
    Rebased after #30356 (#include order).
  25. DrahtBot removed the label Needs rebase on Jul 18, 2024
  26. ryanofsky referenced this in commit 8a814d5f93 on Jul 18, 2024
  27. ryanofsky referenced this in commit fba04d7756 on Jul 18, 2024
  28. ryanofsky referenced this in commit 0b76ed6480 on Jul 18, 2024
  29. ryanofsky referenced this in commit b01bf642b7 on Jul 18, 2024
  30. Sjors referenced this in commit 01bc77e850 on Jul 19, 2024
  31. ryanofsky referenced this in commit 0fb50734c3 on Jul 24, 2024
  32. ryanofsky referenced this in commit ec5f45b918 on Jul 24, 2024
  33. ryanofsky referenced this in commit 257cc06e87 on Jul 24, 2024
  34. ryanofsky referenced this in commit 35b80d2260 on Jul 24, 2024
  35. in src/interfaces/mining.h:50 in 4554b3b1ad outdated
    45+     *
    46+     * @param[in] timeout how long to wait for a new tip
    47+     * @returns hash and height for the new tip or the previous tip if
    48+     *          interrupted or after the timeout
    49+     */
    50+    virtual std::pair<uint256, int> waitTipChanged(MillisecondsDouble timeout = MillisecondsDouble::max()) = 0;
    


    ryanofsky commented at 1:49 pm on July 25, 2024:

    In commit “Add waitTipChanged to Mining interface” (4554b3b1adf678e17d479d72738f4f2089971c4a)

    Not important, but it could be good to return BlockKey struct here instead of std::pair. Could also be good to provide a more consistent interface instead of separate getTipHash() and getTipHeight() functions:

    0virtual BlockKey getTip() = 0;
    1virtual BlockKey waitTipChanged(MillisecondsDouble timeout = MillisecondsDouble::max()) = 0;
    

    If you do use BlockKey here would suggest moving it from interfaces/chain.h to interfaces/types.h. Also I think I a name like BlockId or BlockRef would be better than BlockKey in this context, given other implications ofthe word “key”. But feel free to move/rename/reuse BlockKey from chain.h here, or not.


    Sjors commented at 10:47 am on August 12, 2024:
    getTip() and waitTipChanged() now both return BlockRef.
  36. in src/node/interfaces.cpp:866 in 4554b3b1ad outdated
    860@@ -861,6 +861,25 @@ class MinerImpl : public Mining
    861         return tip->GetBlockHash();
    862     }
    863 
    864+    std::pair<uint256, int> waitTipChanged(MillisecondsDouble timeout) override
    865+    {
    866+        uint256 previous_hash{WITH_LOCK(::cs_main, return chainman().ActiveChain().Tip()->GetBlockHash();)};
    


    ryanofsky commented at 1:57 pm on July 25, 2024:

    In commit “Add waitTipChanged to Mining interface” (4554b3b1adf678e17d479d72738f4f2089971c4a)

    I think caller should be required to pass in the previous hash to avoid race conditions. Otherwise, the caller could initiate a waitTipChanged call waiting for the tip to change, and then it could change before this line is executed, and then the previous_hash variable will contain the new tip hash, and the function won’t return until the tip changes a second time. This problem wouldn’t happen if the caller were required to pass in the block hash that they think is the current one.


    Sjors commented at 10:47 am on August 12, 2024:
    Done. I also simplified the waitfornewblock (etc) RPC call to just call getTip() and then pass that into waitTipChanged. That gets rid of struct CUpdatedBlock.
  37. in src/node/interfaces.cpp:874 in 4554b3b1ad outdated
    869+        {
    870+            WAIT_LOCK(g_best_block_mutex, lock);
    871+            while (!chainman().m_interrupt && std::chrono::steady_clock::now() < deadline) {
    872+                auto check_time = std::chrono::steady_clock::now() + std::min(timeout, MillisecondsDouble(1000));
    873+                g_best_block_cv.wait_until(lock, check_time);
    874+                if (g_best_block != previous_hash) break;
    


    ryanofsky commented at 2:01 pm on July 25, 2024:

    In commit “Add waitTipChanged to Mining interface” (4554b3b1adf678e17d479d72738f4f2089971c4a)

    If you adopt suggestion above for caller to pass in the previous_hash value, you will want to delete this line and add g_best_block == previous_hash && to the while loop condition so this function does not block if the hash is already different when it is called.


    Sjors commented at 10:47 am on August 12, 2024:
    Done
  38. in test/functional/rpc_blockchain.py:539 in a18cb7adf1 outdated
    534+        node.invalidateblock(b2.hash)
    535+        node.reconsiderblock(b2.hash)
    536+
    537+        assert_equal(
    538+            node.waitfornewblock(timeout=2)['hash'],
    539+            current_hash)
    


    ryanofsky commented at 2:25 pm on July 25, 2024:

    In commit “rpc: add test for waitforblock and waitfornewblock” (a18cb7adf192bae481c879fe865f1c159b2b8622)

    I don’t think I understand what this test is checking for. Maybe there could be a comment that describes what the test setup is doing what a failed result might look like. It seems like it is just creating a small fork and then invalidating the fork tip, so it shouldn’t affect the most work tip or current chain. I would expect to see wait_until call after the invalidateblock call which verifies it has the intended effect. Also the 2 millisecond timeout seems small so if this is supposed to wait for something it doesn’t seem like it waiting very long.

    Same comments also apply to the _test_waitforblock test below, and maybe it would be good to dedupe the test setup shared by these functions, moving it into a helper method they both could call.


    Sjors commented at 10:47 am on August 12, 2024:
    I don’t fully understand the original test either. Will look into it later.
  39. in src/rpc/blockchain.cpp:260 in e2a1347db0 outdated
    254@@ -255,7 +255,7 @@ void RPCNotifyBlockChange(const CBlockIndex* pindex)
    255 static RPCHelpMan waitfornewblock()
    256 {
    257     return RPCHelpMan{"waitfornewblock",
    258-                "\nWaits for a specific new block and returns useful info about it.\n"
    259+                "\nWaits for any new block and returns useful info about it.\n"
    


    ryanofsky commented at 2:30 pm on July 25, 2024:

    In commit “rpc: fix waitfornewblock description” (e2a1347db07933d1c6e40ab45f2ca4c56c691fed)

    This is a good documentation fix, but as described previously this waitfornewblock interface is inherently racy. Ideally it would accept an current_block option or argument, and return as soon as the tip is not not pointing to that block. This could be something to add in a followup PR, though, because it is not directly relevant to this one.


    Sjors commented at 10:48 am on August 12, 2024:
    I made a note to either add that argument in this PR or make a followup.
  40. in src/init.cpp:442 in 9d3b075261 outdated
    435 
    436 static void OnRPCStopped()
    437 {
    438-    rpc_notify_block_change_connection.disconnect();
    439-    RPCNotifyBlockChange(nullptr);
    440-    g_best_block_cv.notify_all();
    


    ryanofsky commented at 2:32 pm on July 25, 2024:

    In commit “Replace RPCNotifyBlockChange with waitTipChanged()” (9d3b07526110a842c9592425b89785e688a8f46c)

    g_best_block_cv still exists, so I don’t think you should delete this line

    Can move it to the place this function is called when this function is deleted in the next commit.


    Sjors commented at 10:48 am on August 12, 2024:
    I moved it to StopRPC().
  41. in src/rpc/blockchain.cpp:327 in 9d3b075261 outdated
    329+    block.height = CHECK_NONFATAL(miner.getTipHeight()).value();
    330+
    331+    const auto deadline{std::chrono::steady_clock::now() + 1ms * timeout};
    332+    while(IsRPCRunning() && (timeout == 0 || std::chrono::steady_clock::now() < deadline)) {
    333+        if (block.hash == hash) break;
    334+        auto res = timeout ? miner.waitTipChanged(std::chrono::milliseconds(timeout)) : miner.waitTipChanged();
    


    ryanofsky commented at 2:35 pm on July 25, 2024:

    In commit “Replace RPCNotifyBlockChange with waitTipChanged()” (9d3b07526110a842c9592425b89785e688a8f46c)

    Should pass block.hash as waitTipChanged argument here to avoid race condition (see earlier suggestion)


    Sjors commented at 10:48 am on August 12, 2024:
    Done
  42. ryanofsky commented at 2:47 pm on July 25, 2024: contributor
    Code review e367258c9b37f0494bdc9d36a6a48adb62f9c94f. This is a nice change, but I didn’t ack because I think it would be a regression to not signal g_best_block_cv on shutdown. I posted other suggestions to consider too but they are not critical.
  43. Sjors force-pushed on Aug 12, 2024
  44. Sjors commented at 10:47 am on August 12, 2024: member

    Rebased and restored signalling g_best_block_cv on shutdown.

    I created interfaces/types.h, moved BlockKey there and renamed it to BlockRef. I then renamed getTipHash() to getTip() instead of adding getTipHeight(). waitTipChanged also returns a BlockKey.

    Addressed inline comments, except that I still need to look into the tests.

  45. Sjors commented at 11:43 am on August 12, 2024: member

    As for race conditions with RPC calls that rely on waitForBlock():

    • waitforblock [hash]: the while loop breaks if block.hash == hash, which together with the early return inside waitTipChanged should prevent any blockage
    • waitforblockheight: similar to waitforblock, the while loop checks block.height >= height
    • waitfornewblock: I opened a draft PR to add an optional blockhash argument #30635
  46. Sjors force-pushed on Aug 12, 2024
  47. Sjors force-pushed on Aug 12, 2024
  48. Sjors commented at 1:37 pm on August 12, 2024: member

    I simplified the waitforblock and waitfornewblock tests.

    They’re not very thorough at the moment, because reconsiderblock is either atomic or too fast. A better way to test the asynchronous behaviour would be to spin up a second node, and feed it blocks through add_p2p_connection.

  49. Sjors force-pushed on Aug 12, 2024
  50. DrahtBot commented at 1:57 pm on August 12, 2024: contributor

    🚧 At least one of the CI tasks failed. Debug: https://github.com/bitcoin/bitcoin/runs/28654787886

    Make sure to run all tests locally, according to the documentation.

    The failure may happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  51. DrahtBot added the label CI failed on Aug 12, 2024
  52. in src/interfaces/mining.h:52 in dcf984b055 outdated
    49+     * @param[in] current_tip the current tip, in case of a race condition
    50+     * @param[in] timeout how long to wait for a new tip
    51+     * @returns hash and height for the new tip or the previous tip if
    52+     *          interrupted or after the timeout
    53+     */
    54+    virtual BlockRef waitTipChanged(BlockRef current_tip, MillisecondsDouble timeout = MillisecondsDouble::max()) = 0;
    


    ryanofsky commented at 3:08 pm on August 12, 2024:

    In commit “Add waitTipChanged to Mining interface” (3435d7dd430559abffc964c163a2ae73febd99d1)

    I think it would be slightly better for current_tip to just be a block hash, rather than hash and height, because the height value is unused, and a accepting height might create misleading impression that this method could be called to wait for a height to be reached.


    Sjors commented at 9:53 am on August 13, 2024:
    Done
  53. in src/rpc/blockchain.cpp:289 in d919a23056 outdated
    274+    const auto deadline{std::chrono::steady_clock::now() + 1ms * timeout};
    275+    auto block{CHECK_NONFATAL(miner.getTip()).value()};
    276+
    277+    while(IsRPCRunning() && (timeout == 0 || std::chrono::steady_clock::now() < deadline)) {
    278+        block = timeout ? miner.waitTipChanged(block, std::chrono::milliseconds(timeout)) : miner.waitTipChanged(block);
    279     }
    


    ryanofsky commented at 3:10 pm on August 12, 2024:

    In commit “Replace RPCNotifyBlockChange with waitTipChanged()” (d919a230563bd2efee1fccb83194b43f436781ef)

    It seems like the implementation of the this loop always waits for the timeout to be reached, and doesn’t exit early if the tip did change. So if timeout is 0 it will just infinite loop.

    I think if it can be simplified to just:

    0    if (IsRPCRunning()) {
    1        block = timeout 
    2            ? miner.waitTipChanged(block, std::chrono::milliseconds(timeout))
    3            : miner.waitTipChanged(block);
    4    }
    

    And you could get rid of the deadline variable.


    Sjors commented at 8:39 am on August 13, 2024:
    Indeed, this should not have been a while loop. I reproduced the infinite loop here.
  54. in src/rpc/blockchain.cpp:316 in d919a23056 outdated
    319+
    320+    auto block{CHECK_NONFATAL(miner.getTip()).value()};
    321+    const auto deadline{std::chrono::steady_clock::now() + 1ms * timeout};
    322+    while(IsRPCRunning() && (timeout == 0 || std::chrono::steady_clock::now() < deadline)) {
    323+        if (block.hash == hash) break;
    324+        block = timeout ? miner.waitTipChanged(block, std::chrono::milliseconds(timeout)) : miner.waitTipChanged(block);
    


    ryanofsky commented at 3:31 pm on August 12, 2024:

    In commit “Replace RPCNotifyBlockChange with waitTipChanged()” (d919a230563bd2efee1fccb83194b43f436781ef)

    Passing timeout parameter doesn’t seem right, because it could wait too long. For example if timeout is 10 seconds, and tip changed after 8 seconds but did not match expected block hash, the next wait call will wait 10 seconds instead of 2 seconds.

    Might be better to replace with:

    0while (IsRPCRunning()) {
    1    auto now = timeout ? std::chrono::steady_clock::now() : std::chrono::steady_clock::time_point::min();
    2    if (block.hash == hash || now > deadline) break;
    3    block = timeout ? miner.waitTipChanged(block, deadline - now) : miner.waitTipChanged(block);
    4}
    

    Sjors commented at 9:52 am on August 13, 2024:
    Fixed, though in a slightly different way.
  55. in src/rpc/blockchain.cpp:364 in d919a23056 outdated
    367+
    368+    auto block{CHECK_NONFATAL(miner.getTip()).value()};
    369+    const auto deadline{std::chrono::steady_clock::now() + 1ms * timeout};
    370+    while(IsRPCRunning() && (timeout == 0 || std::chrono::steady_clock::now() < deadline)) {
    371+        if (block.height >= height) break;
    372+        block = timeout ? miner.waitTipChanged(block, std::chrono::milliseconds(timeout)) : miner.waitTipChanged(block);
    


    ryanofsky commented at 3:34 pm on August 12, 2024:

    In commit “Replace RPCNotifyBlockChange with waitTipChanged()” (d919a230563bd2efee1fccb83194b43f436781ef)

    Seems like this has same problem as waitforblock where waitTipChanged could wait too long because it is called with wrong timeout value. Might be better as:

    0while (IsRPCRunning()) {
    1    auto now = timeout ? std::chrono::steady_clock::now() : std::chrono::steady_clock::time_point::min();
    2    if (block.height >= height || now > deadline) break;
    3    block = timeout ? miner.waitTipChanged(block, deadline - now) : miner.waitTipChanged(block);
    4}
    

    Sjors commented at 9:52 am on August 13, 2024:
    Fixed (in the same way as waitforblock)
  56. ryanofsky commented at 4:02 pm on August 12, 2024: contributor
    Code review dcf984b055a367facf7704eb8055619d6fe64a55. I think there might be an infinite loop bug if waitfornewblock is called with timeout of 0, and it looks like there are some other quirks. Left a few suggestions below.
  57. DrahtBot removed the label CI failed on Aug 12, 2024
  58. Sjors force-pushed on Aug 13, 2024
  59. Sjors commented at 9:53 am on August 13, 2024: member

    Re #30409 (review)

    it looks like this will wait 1000ms+1000ms instead of 1000ms+500ms

    Indeed it seems to pick the wrong wait time after the first inner loop. If fixed that and the overflow, though in a slightly different way than the AI overlord suggested.

    I also added a g_best_block == uint256() check because it starts 0 until the first UpdateTip().

    Changed current_tip argument to uint256 instead of BlockRef.

    I also added a commit to recommend -rpcclienttimeout=0 for these three calls.

  60. Sjors referenced this in commit 2cff441694 on Aug 13, 2024
  61. Sjors referenced this in commit 0fc045978e on Aug 13, 2024
  62. Sjors referenced this in commit 1758b5d2b4 on Aug 13, 2024
  63. Sjors referenced this in commit 27630ec524 on Aug 13, 2024
  64. in src/interfaces/mining.h:47 in d1c197fbba outdated
    40@@ -40,6 +41,16 @@ class Mining
    41     //! Returns the hash and height for the tip of this chain
    42     virtual std::optional<BlockRef> getTip() = 0;
    43 
    44+    /**
    45+     * Waits for the tip to change
    46+     *
    47+     * @param[in] current_tip the current tip, in case of a race condition
    


    ryanofsky commented at 2:32 pm on August 16, 2024:

    In commit “Add waitTipChanged to Mining interface” (d1c197fbbad86e3f966d9bf9abe4dd0aa76bc83d)

    Not sure “in case of race condition” actually explains this. Would suggest changing this to:

    • current_tip - block hash compared against the current chain tip. Function waits for the chain tip to change if this matches, otherwise it returns right away.

    Sjors commented at 7:18 am on August 19, 2024:
    Taken (mostly).
  65. in src/interfaces/mining.h:50 in d1c197fbba outdated
    45+     * Waits for the tip to change
    46+     *
    47+     * @param[in] current_tip the current tip, in case of a race condition
    48+     * @param[in] timeout how long to wait for a new tip
    49+     * @returns hash and height for the new tip or the previous tip if
    50+     *          interrupted or after the timeout
    


    ryanofsky commented at 2:39 pm on August 16, 2024:

    In commit “Add waitTipChanged to Mining interface” (d1c197fbbad86e3f966d9bf9abe4dd0aa76bc83d)

    Think it would be simpler to say “returns hash and height of the current chain tip after this call.”

    It seems potentially misleading to refer to a “previous” tip here because that sounds like this could sometimes return an old block hash. And it’s a little confusing to refer to a “new “tip in the case where this function just returns immediately and there no interruption or wait or timeout.


    Sjors commented at 7:18 am on August 19, 2024:
    Done
  66. in src/node/interfaces.cpp:890 in d1c197fbba outdated
    885+                if (now >= deadline) break;
    886+                const MillisecondsDouble remaining{deadline - now};
    887+                const auto interval{std::min(remaining, tick)};
    888+                g_best_block_cv.wait_until(lock, now + interval);
    889+                // Obtaining the height here using chainman().ActiveChain().Tip()->nHeight
    890+                // would result in a deadlock, because UpdateTip requires holding cs_main.
    


    ryanofsky commented at 2:57 pm on August 16, 2024:

    In commit “Add waitTipChanged to Mining interface” (d1c197fbbad86e3f966d9bf9abe4dd0aa76bc83d)

    I think this comment is only warning about a small part of a bigger danger. It’s not just unsafe to obtain the height here but unsafe to acquire cs_main anywhere in this code block, even after the while loop.

    Would suggest dropping this comment and instead adding a simpler comment above LOCK(::cs_main) that says “Must release g_best_block_mutex before locking cs_main, to avoid deadlocks.”

    A comment like that seems more likely to prevent deadlocks generally in this function.

  67. in src/node/interfaces.cpp:888 in d1c197fbba outdated
    883+            while ((g_best_block == uint256() || g_best_block == current_tip) && !chainman().m_interrupt) {
    884+                now = std::chrono::steady_clock::now();
    885+                if (now >= deadline) break;
    886+                const MillisecondsDouble remaining{deadline - now};
    887+                const auto interval{std::min(remaining, tick)};
    888+                g_best_block_cv.wait_until(lock, now + interval);
    


    ryanofsky commented at 3:06 pm on August 16, 2024:

    In commit “Add waitTipChanged to Mining interface” (d1c197fbbad86e3f966d9bf9abe4dd0aa76bc83d)

    Not important but I think it would be clearer to condense these 3 lines to just:

    0g_best_block_cv.wait_until(lock, std::min(deadline, now + tick));
    

    I don’t think it helps to introduce 2 extra variables and subtract now then add now back.


    Sjors commented at 7:25 am on August 19, 2024:
    Indeed that’s nicer.
  68. ryanofsky approved
  69. ryanofsky commented at 3:28 pm on August 16, 2024: contributor
    Code review ACK 9ba60b3a2c22dcd1c9e7b4880f0adb678e4e8d63. Just fixed a few bugs to make waiting more reliable since last review and changed test logging. I like the new change to check for overflows generally in waitTipChanged, instead of just checking specifically for the max timeout value. That was better than the suggestion.
  70. Sjors force-pushed on Aug 19, 2024
  71. in src/rpc/blockchain.cpp:315 in bf3377f4bf outdated
    318+    Mining& miner = EnsureMining(node);
    319+
    320+    auto block{CHECK_NONFATAL(miner.getTip()).value()};
    321+    const auto deadline{std::chrono::steady_clock::now() + 1ms * timeout};
    322+    while (IsRPCRunning()) {
    323+        auto now{std::chrono::steady_clock::now()};
    


    ryanofsky commented at 1:01 pm on August 19, 2024:

    In commit “Replace RPCNotifyBlockChange with waitTipChanged()” (bf3377f4bf988f2477944ca54c4dee33b9b45b6e)

    now variable is unused if timeout is zero, so this line could be moved inside the if (timeout) block below.


    Sjors commented at 2:15 pm on August 19, 2024:
    Done here and in waitforblockheight.
  72. ryanofsky approved
  73. ryanofsky commented at 1:08 pm on August 19, 2024: contributor

    Code review ACK d880419cb65fd9d446057ea05384dad55da0f292. Only changes since list review are documentation improvements and a small code simplification.

    Overall this PR makes nice improvements to the waitfor* RPCs and simplifies the code getting rid of boost signals and a global variable.

  74. Sjors force-pushed on Aug 19, 2024
  75. in src/rpc/blockchain.cpp:316 in b4d37fa705 outdated
    318+    Mining& miner = EnsureMining(node);
    319+
    320+    auto block{CHECK_NONFATAL(miner.getTip()).value()};
    321+    const auto deadline{std::chrono::steady_clock::now() + 1ms * timeout};
    322+    while (IsRPCRunning()) {
    323+        if (block.hash == hash) break;
    


    ryanofsky commented at 2:24 pm on August 19, 2024:

    In commit “Replace RPCNotifyBlockChange with waitTipChanged()” (b4d37fa70506129ba4a4d218bc49ce8ca9d8380c)

    Not important, but could simplify these two lines to while (IsRPCRunning() && block.hash != hash)

    Similarly, could simplify to while (IsRPCRunning() && block.height < height) in waitforblockheight below.


    Sjors commented at 2:56 pm on August 19, 2024:
    Thanks, will do if I have to retouch.

    Sjors commented at 9:46 am on August 26, 2024:
    Done
  76. ryanofsky approved
  77. ryanofsky commented at 2:27 pm on August 19, 2024: contributor
    Code review ACK e8176694bb5508b2b488a8c79a6001d2fa625302, just reduced a variable scope since last review.
  78. in src/rpc/blockchain.cpp:288 in b4d37fa705 outdated
    272+    NodeContext& node = EnsureAnyNodeContext(request.context);
    273+    Mining& miner = EnsureMining(node);
    274+
    275+    auto block{CHECK_NONFATAL(miner.getTip()).value()};
    276+    if (IsRPCRunning()) {
    277+        block = timeout ? miner.waitTipChanged(block.hash, std::chrono::milliseconds(timeout)) : miner.waitTipChanged(block.hash);
    


    TheCharlatan commented at 1:52 pm on August 22, 2024:

    In commit b4d37fa70506129ba4a4d218bc49ce8ca9d8380c

    I think there is a significant change in behaviour here between waitTipChanged, or rather g_best_block, and the rpc notification through blockTip. AFAICT the blockTip notification reports on completed operations in ActivateBestChain and InvalidatedBlock, while g_best_block is updated through the lower-level ConnectTip and DisconnectTip, which may respectively be called iteratively by ActivateBestChain and InvalidateBlock. So for operations involving multiple blocks this might yield different results here. I’m not sure if these RPCs should report on these more ephemeral state changes rather than the results of the respective operations.

    I have not actually tested this, so I might be mistaken here, in which case I think it would still be good to have a note in the commit message comparing the two methods.


    Sjors commented at 3:27 pm on August 22, 2024:
    g_best_block_cv.notify_all(); is called by Chainstate::UpdateTip(), which indeed is called by ConnectTip and DisconnectTip(). Still wrapping my head around what the previous approach was. I doubt these RPC’s were used outside of tests, but it would make sense to document the different.

    ryanofsky commented at 4:35 pm on August 22, 2024:

    re: #30409 (review)

    This is a good catch, but I"m not sure there is a observable change of behavior in ActivateBestChain because cs_main is held the whole time blockTip / uiInterface.NotifyBlockTip / RPCNotifyBlockChange is called and while the old latestblock global (now replaced by g_best_block) was updated:

    https://github.com/bitcoin/bitcoin/blob/5ce2285b8739e12eebd1d53358038ec2277c662b/src/validation.cpp#L3552

    So I don’t think waitTipChanged will actually return during any intermediate tip changes, because it can’t return until it locks cs_main, so it should just report the latest block as before.

    However, InvalidBlock seems like a different story because it is releasing cs_main between disconnecting blocks:

    https://github.com/bitcoin/bitcoin/blob/5ce2285b8739e12eebd1d53358038ec2277c662b/src/validation.cpp#L3701

    but it would previously only notify about a tip change and update latestblock at the end:

    https://github.com/bitcoin/bitcoin/blob/5ce2285b8739e12eebd1d53358038ec2277c662b/src/validation.cpp#L3783

    Now it will update g_best_block and release cs_main while it is disconnecting blocks. This seems like something worth documenting, but probably not likely to cause real problems.


    TheCharlatan commented at 8:33 pm on August 22, 2024:

    Couple of months ago I was thinking how to get rid of g_best_block and its friends entirely in the context of the kernel library. We already have two interfaces for notifying if something changed during validation, having a third one, that is also globally mutable and potentially dangerous if multiple chainstate managers are used is not ideal. I did not make progress then, but looking at it again now, I think this might be the opportunity to get rid of it.

    How about duplicating the best block logic into the kernel notifications like I’ve done here https://github.com/TheCharlatan/bitcoin/commit/3c665065f132fd80ea3b1733785b98e45c8b371e (our locking conventions still puzzle me a bit, maybe there is a better way to do it)? This would avoid any change in behavior altogether and mostly make this a refactor. In a follow-up the remaining users of g_best_block could then be migrated and the global deleted.

    I think the current user of g_best_block (getblocktemplate() in mining.cpp), could be migrated to this interface as well. I don’t really see the downside of not being notified on these interim states, since it is undesirable to build work on top of them.


    Sjors commented at 8:44 am on August 26, 2024:
    @TheCharlatan do you want to make a separate PR for the kernel notification changes, or should I just include them here?

    TheCharlatan commented at 8:47 am on August 26, 2024:
    I think it would be good to integrate them here, and I’ll try and follow up with removing the m_best_header global from validation.

    ryanofsky commented at 3:41 pm on August 26, 2024:

    If we are going to add a new m_tip_block variable to replace g_best_block seems like it would be good to make replacement complete and eliminate g_best_block. I think following could work (compiles but untested):

      0--- a/src/rpc/mining.cpp
      1+++ b/src/rpc/mining.cpp
      2@@ -739,7 +739,6 @@ static RPCHelpMan getblocktemplate()
      3     {
      4         // Wait to respond until either the best block changes, OR a minute has passed and there are more transactions
      5         uint256 hashWatchedChain;
      6-        std::chrono::steady_clock::time_point checktxtime;
      7         unsigned int nTransactionsUpdatedLastLP;
      8 
      9         if (lpval.isStr())
     10@@ -760,19 +759,14 @@ static RPCHelpMan getblocktemplate()
     11         // Release lock while waiting
     12         LEAVE_CRITICAL_SECTION(cs_main);
     13         {
     14-            checktxtime = std::chrono::steady_clock::now() + std::chrono::minutes(1);
     15-
     16-            WAIT_LOCK(g_best_block_mutex, lock);
     17-            while (g_best_block == hashWatchedChain && IsRPCRunning())
     18-            {
     19-                if (g_best_block_cv.wait_until(lock, checktxtime) == std::cv_status::timeout)
     20-                {
     21-                    // Timeout: Check transactions for update
     22-                    // without holding the mempool lock to avoid deadlocks
     23-                    if (miner.getTransactionsUpdated() != nTransactionsUpdatedLastLP)
     24-                        break;
     25-                    checktxtime += std::chrono::seconds(10);
     26-                }
     27+            MillisecondsDouble checktxtime{std::chrono::minutes(1)};
     28+            while (tip == hashWatchedChain && IsRPCRunning()) {
     29+                tip = miner.waitTipChanged(hashWatchedChain, checktxtime).hash;
     30+                // Timeout: Check transactions for update
     31+                // without holding the mempool lock to avoid deadlocks
     32+                if (miner.getTransactionsUpdated() != nTransactionsUpdatedLastLP)
     33+                    break;
     34+                checktxtime = std::chrono::seconds(10);
     35             }
     36         }
     37         ENTER_CRITICAL_SECTION(cs_main);
     38--- a/src/rpc/server.cpp
     39+++ b/src/rpc/server.cpp
     40@@ -303,7 +303,6 @@ void StopRPC(const std::any& context)
     41         LogPrint(BCLog::RPC, "Stopping RPC\n");
     42         WITH_LOCK(g_deadline_timers_mutex, deadlineTimers.clear());
     43         DeleteAuthCookie();
     44-        g_best_block_cv.notify_all();
     45         Assert(EnsureAnyNodeContext(context).notifications)->m_tip_block_cv.notify_all();
     46         LogPrint(BCLog::RPC, "RPC stopped.\n");
     47     });
     48--- a/src/test/validation_chainstate_tests.cpp
     49+++ b/src/test/validation_chainstate_tests.cpp
     50@@ -4,6 +4,7 @@
     51 //
     52 #include <chainparams.h>
     53 #include <consensus/validation.h>
     54+#include <node/kernel_notifications.h>
     55 #include <random.h>
     56 #include <rpc/blockchain.h>
     57 #include <sync.h>
     58@@ -69,14 +70,14 @@ BOOST_AUTO_TEST_CASE(validation_chainstate_resize_caches)
     59 BOOST_FIXTURE_TEST_CASE(chainstate_update_tip, TestChain100Setup)
     60 {
     61     ChainstateManager& chainman = *Assert(m_node.chainman);
     62-    uint256 curr_tip = ::g_best_block;
     63+    uint256 curr_tip = m_node.notifications->m_tip_block;
     64 
     65     // Mine 10 more blocks, putting at us height 110 where a valid assumeutxo value can
     66     // be found.
     67     mineBlocks(10);
     68 
     69     // After adding some blocks to the tip, best block should have changed.
     70-    BOOST_CHECK(::g_best_block != curr_tip);
     71+    BOOST_CHECK(m_node.notifications->m_tip_block != curr_tip);
     72 
     73     // Grab block 1 from disk; we'll add it to the background chain later.
     74     std::shared_ptr<CBlock> pblockone = std::make_shared<CBlock>();
     75@@ -91,15 +92,15 @@ BOOST_FIXTURE_TEST_CASE(chainstate_update_tip, TestChain100Setup)
     76     // Ensure our active chain is the snapshot chainstate.
     77     BOOST_CHECK(WITH_LOCK(::cs_main, return chainman.IsSnapshotActive()));
     78 
     79-    curr_tip = ::g_best_block;
     80+    curr_tip = m_node.notifications->m_tip_block;
     81 
     82     // Mine a new block on top of the activated snapshot chainstate.
     83     mineBlocks(1);  // Defined in TestChain100Setup.
     84 
     85     // After adding some blocks to the snapshot tip, best block should have changed.
     86-    BOOST_CHECK(::g_best_block != curr_tip);
     87+    BOOST_CHECK(m_node.notifications->m_tip_block != curr_tip);
     88 
     89-    curr_tip = ::g_best_block;
     90+    curr_tip = m_node.notifications->m_tip_block;
     91 
     92     BOOST_CHECK_EQUAL(chainman.GetAll().size(), 2);
     93 
     94@@ -138,7 +139,7 @@ BOOST_FIXTURE_TEST_CASE(chainstate_update_tip, TestChain100Setup)
     95     // g_best_block should be unchanged after adding a block to the background
     96     // validation chain.
     97     BOOST_CHECK(block_added);
     98-    BOOST_CHECK_EQUAL(curr_tip, ::g_best_block);
     99+    BOOST_CHECK_EQUAL(curr_tip, m_node.notifications->m_tip_block);
    100 }
    101 
    102 BOOST_AUTO_TEST_SUITE_END()
    103--- a/src/validation.cpp
    104+++ b/src/validation.cpp
    105@@ -107,10 +107,6 @@ const std::vector<std::string> CHECKLEVEL_DOC {
    106  * */
    107 static constexpr int PRUNE_LOCK_BUFFER{10};
    108 
    109-GlobalMutex g_best_block_mutex;
    110-std::condition_variable g_best_block_cv;
    111-uint256 g_best_block;
    112-
    113 const CBlockIndex* Chainstate::FindForkInGlobalIndex(const CBlockLocator& locator) const
    114 {
    115     AssertLockHeld(cs_main);
    116@@ -2987,12 +2983,6 @@ void Chainstate::UpdateTip(const CBlockIndex* pindexNew)
    117         m_mempool->AddTransactionsUpdated(1);
    118     }
    119 
    120-    {
    121-        LOCK(g_best_block_mutex);
    122-        g_best_block = pindexNew->GetBlockHash();
    123-        g_best_block_cv.notify_all();
    124-    }
    125-
    126     std::vector<bilingual_str> warning_messages;
    127     if (!m_chainman.IsInitialBlockDownload()) {
    128         const CBlockIndex* pindex = pindexNew;
    129--- a/src/validation.h
    130+++ b/src/validation.h
    131@@ -85,11 +85,6 @@ enum class SynchronizationState {
    132     POST_INIT
    133 };
    134 
    135-extern GlobalMutex g_best_block_mutex;
    136-extern std::condition_variable g_best_block_cv;
    137-/** Used to notify getblocktemplate RPC of new tips. */
    138-extern uint256 g_best_block;
    139-
    140 /** Documentation for argument 'checklevel'. */
    141 extern const std::vector<std::string> CHECKLEVEL_DOC;
    142 
    

    Sjors commented at 4:49 pm on August 26, 2024:

    Nice! I split the changes to rpc/mining.cpp into its own commit, that looks correct.

    Running and debugging some failing tests before pushing…


    TheCharlatan commented at 7:35 pm on August 26, 2024:
    Thanks ryanofsky, that is exactly what I had in mind for a potential followup - happy to have this in one go. I still need to think through the trade-offs for the block template rpc, but I think this might actually be better.
  79. Sjors force-pushed on Aug 26, 2024
  80. Sjors commented at 9:39 am on August 26, 2024: member
    I incorporated https://github.com/TheCharlatan/bitcoin/commit/3c665065f132fd80ea3b1733785b98e45c8b371e so the waitTipChanged() implementation uses the kernel signal instead of g_best_block. Also rebased.
  81. Sjors force-pushed on Aug 26, 2024
  82. in src/node/interfaces.cpp:952 in 078fc60f19 outdated
    882+        {
    883+            WAIT_LOCK(notifications().m_tip_block_mutex, lock);
    884+            while ((notifications().m_tip_block == uint256() || notifications().m_tip_block == current_tip) && !chainman().m_interrupt) {
    885+                now = std::chrono::steady_clock::now();
    886+                if (now >= deadline) break;
    887+                notifications().m_tip_block_cv.wait_until(lock, std::min(deadline, now + tick));
    


    ryanofsky commented at 3:35 pm on August 26, 2024:

    In commit “Add waitTipChanged to Mining interface” (078fc60f198eb7a3bcccb557416fe1d4498f26e7)

    One problem with switching from g_best_block_cv to m_tip_block_cv is that this m_tip_block_cv is not signaled when StopRPC is called, which could mean shutdown hangs for a second instead of being instantaneous. I believe this could be fixed in earlier commit 8400d49ffec47add2dc27267ae13ede5d2fabeea with:

     0--- a/src/init.cpp
     1+++ b/src/init.cpp
     2@@ -282,7 +282,7 @@ void Shutdown(NodeContext& node)
     3 
     4     StopHTTPRPC();
     5     StopREST();
     6-    StopRPC();
     7+    StopRPC(&node);
     8     StopHTTPServer();
     9     for (const auto& client : node.chain_clients) {
    10         client->flush();
    11--- a/src/node/interfaces.cpp
    12+++ b/src/node/interfaces.cpp
    13@@ -139,7 +139,7 @@ public:
    14         // Stop RPC for clean shutdown if any of waitfor* commands is executed.
    15         if (args().GetBoolArg("-server", false)) {
    16             InterruptRPC();
    17-            StopRPC();
    18+            StopRPC(m_context);
    19         }
    20     }
    21     bool shutdownRequested() override { return ShutdownRequested(*Assert(m_context)); };
    22--- a/src/rpc/server.cpp
    23+++ b/src/rpc/server.cpp
    24@@ -11,6 +11,7 @@
    25 #include <common/system.h>
    26 #include <logging.h>
    27 #include <node/context.h>
    28+#include <node/kernel_notifications.h>
    29 #include <rpc/server_util.h>
    30 #include <rpc/util.h>
    31 #include <sync.h>
    32@@ -293,16 +294,17 @@ void InterruptRPC()
    33     });
    34 }
    35 
    36-void StopRPC()
    37+void StopRPC(const std::any& context)
    38 {
    39     static std::once_flag g_rpc_stop_flag;
    40     // This function could be called twice if the GUI has been started with -server=1.
    41     assert(!g_rpc_running);
    42-    std::call_once(g_rpc_stop_flag, []() {
    43+    std::call_once(g_rpc_stop_flag, [&]() {
    44         LogPrint(BCLog::RPC, "Stopping RPC\n");
    45         WITH_LOCK(g_deadline_timers_mutex, deadlineTimers.clear());
    46         DeleteAuthCookie();
    47         g_best_block_cv.notify_all();
    48+        Assert(EnsureAnyNodeContext(context).notifications)->m_tip_block_cv.notify_all();
    49         LogPrint(BCLog::RPC, "RPC stopped.\n");
    50     });
    51 }
    52--- a/src/rpc/server.h
    53+++ b/src/rpc/server.h
    54@@ -9,6 +9,7 @@
    55 #include <rpc/request.h>
    56 #include <rpc/util.h>
    57 
    58+#include <any>
    59 #include <functional>
    60 #include <map>
    61 #include <stdint.h>
    62@@ -172,7 +173,7 @@ extern CRPCTable tableRPC;
    63 
    64 void StartRPC();
    65 void InterruptRPC();
    66-void StopRPC();
    67+void StopRPC(const std::any& context);
    68 UniValue JSONRPCExec(const JSONRPCRequest& jreq, bool catch_errors);
    69 
    70 #endif // BITCOIN_RPC_SERVER_H
    
  83. ryanofsky commented at 3:46 pm on August 26, 2024: contributor
    Code review 93176016322687f3ad9ade8f32c1ff1392ee6470, but I’m concerned code is failing to signal m_tip_block_cv on shutdown so it could make shutdown slower. Suggested a fix below. It also seems like it would be easy to eliminate g_best_block completely, so suggested a way to do that too.
  84. Sjors force-pushed on Aug 26, 2024
  85. Sjors commented at 5:26 pm on August 26, 2024: member

    Alright then, dropping g_best_block as well!

    Some tests failed with EnsureAnyNodeContext(context).notifications, so I fixed that (relative to the above patches). In init.cpp the RPC server starts at step 4a, but the notifications are added at step 7. This case is hit by several functional tests that shut the node down early.

  86. Sjors renamed this:
    Introduce waitTipChanged() mining interface, replace RPCNotifyBlockChange and drop CRPCSignals
    Introduce waitTipChanged() mining interface, replace RPCNotifyBlockChange, drop CRPCSignals & g_best_block
    on Aug 26, 2024
  87. in src/node/kernel_notifications.cpp:55 in eb56e9a819 outdated
    49@@ -50,6 +50,12 @@ namespace node {
    50 
    51 kernel::InterruptResult KernelNotifications::blockTip(SynchronizationState state, CBlockIndex& index)
    52 {
    53+    {
    54+        LOCK(m_tip_block_mutex);
    55+        m_tip_block= index.GetBlockHash();
    


    TheCharlatan commented at 8:33 am on August 28, 2024:
    Self-Nit: I missed the space here.

    Sjors commented at 1:41 pm on August 29, 2024:
    Fixed
  88. Sjors commented at 12:22 pm on August 29, 2024: member
    Rebased to ensure it still works with CMake. This PR only adds a header file (interfaces/types.h) which don’t need to be added to CMakeLists.txt. If #30664 lands before this PR, I’ll have to drop the addition to src/Makefile.am.
  89. Sjors force-pushed on Aug 29, 2024
  90. Sjors force-pushed on Aug 29, 2024
  91. Sjors commented at 2:23 pm on August 29, 2024: member
    Based on IRC discussion today it appears there was no need to keep src/Makefile.am, so I dropped that: 1b2cdef410abb7fa900c52809ebca249cdaa099c -> 02272716dce292618453d0407988118bc5f07140.
  92. Sjors force-pushed on Aug 29, 2024
  93. DrahtBot added the label Needs rebase on Sep 2, 2024
  94. Sjors force-pushed on Sep 3, 2024
  95. DrahtBot removed the label Needs rebase on Sep 3, 2024
  96. Sjors commented at 9:03 am on September 3, 2024: member
    Rebased after #8230.
  97. TheCharlatan commented at 9:18 am on September 3, 2024: contributor

    Rebased after #8230.

    Huh, that was merged 8 years ago? :smile:

  98. Sjors commented at 10:37 am on September 3, 2024: member
    @TheCharlatan oops, I meant #30750.
  99. DrahtBot added the label Needs rebase on Sep 3, 2024
  100. Sjors commented at 9:12 am on September 4, 2024: member
    Rebased after trivial conflict with #29553.
  101. Sjors force-pushed on Sep 4, 2024
  102. DrahtBot removed the label Needs rebase on Sep 4, 2024
  103. ryanofsky approved
  104. ryanofsky commented at 6:19 pm on September 4, 2024: contributor
    Code review ACK 6c419285720fe5a954e01f09cbe053a0f51bef47. Since last review just rebased, added suggested m_tip_block_cv notify call on shutdown, and added two new commits dropping g_best_block.
  105. TheCharlatan approved
  106. TheCharlatan commented at 8:35 am on September 5, 2024: contributor

    ACK 6c419285720fe5a954e01f09cbe053a0f51bef47

    But I think it would be good to get more review here. I am fairly convinced that the changes to when the notifications are issued now are good, but the change in logic is a bit tricky to follow.

  107. Sjors commented at 11:27 am on September 5, 2024: member
    Paging some people who touched this code in recent years… well, decades: @sipa #12743, @theuni #8680 (d6a5dc4a2eaa0d7348804254ca09e75fc3a858ab), @luke-jr & @laanwj #4503.
  108. Sjors commented at 11:41 am on September 5, 2024: member

    the change in logic is a bit tricky to follow.

    Any particular commit that’s problematic? I could try splitting it.

  109. TheCharlatan commented at 12:21 pm on September 5, 2024: contributor

    Any particular commit that’s problematic? I could try splitting it.

    I don’t think so. The logic in the changed lines is relatively easy to understand, but it takes a bit of time to check where the notifications are issued and what the differences between the two might be. I don’t think dividing anything here might help with that.

  110. ryanofsky commented at 2:24 pm on September 5, 2024: contributor

    the change in logic is a bit tricky to follow.

    Any particular commit that’s problematic? I could try splitting it.

    I think the PR is good in it’s current form, but if I wanted to be more confident it wasn’t changing behavior I would not change code from using g_best_block and ::latestblock directly to calling waitTipChanged(). Instead, I would first remove g_best_block and replace it with KernelNotifications::m_tip_block, then I would remove ::latestblock and replace it with KernelNotifications::m_tip_block, then finally introduce waitTipChanged() and change RPCs to call waitTipChanged() instead of using m_tip_block directly.

    The drawback to this approach is it would change the same code twice, and increase the amount of code reviewers need to look at and intermediate state they need to think about, which is the same problem as the approach I took in #24230. So I wouldn’t necessarily recommend that. (For #24230 I think I will simplify it and keeping the granular commits in a side branch with the same overall diff. So if you want to just review the changes for correctness you can look at the PR and if you want to be more sure behavior isn’t changing you can look at the side branch.)

  111. ryanofsky referenced this in commit cdf59bbbec on Sep 6, 2024
  112. ryanofsky referenced this in commit 697ba11c94 on Sep 6, 2024
  113. ryanofsky referenced this in commit 4080052474 on Sep 6, 2024
  114. ryanofsky referenced this in commit 4d077f17de on Sep 6, 2024
  115. DrahtBot added the label Needs rebase on Sep 9, 2024
  116. Sjors force-pushed on Sep 10, 2024
  117. Sjors commented at 6:23 am on September 10, 2024: member
    Trivial rebase after #30509.
  118. DrahtBot removed the label Needs rebase on Sep 10, 2024
  119. ryanofsky approved
  120. ryanofsky commented at 12:38 pm on September 10, 2024: contributor
    Code review ACK 966027bdda53d150321af7db48b57f9756c54e68, no changes since last rebase
  121. DrahtBot requested review from TheCharlatan on Sep 10, 2024
  122. Sjors referenced this in commit d38df65999 on Sep 10, 2024
  123. DrahtBot added the label CI failed on Sep 10, 2024
  124. TheCharlatan approved
  125. TheCharlatan commented at 7:09 pm on September 10, 2024: contributor
    Re-ACK 966027bdda53d150321af7db48b57f9756c54e68
  126. Sjors referenced this in commit 8c1329ff8b on Sep 11, 2024
  127. Sjors referenced this in commit dc10cd379d on Sep 11, 2024
  128. DrahtBot added the label Needs rebase on Sep 12, 2024
  129. Sjors force-pushed on Sep 13, 2024
  130. Sjors commented at 8:28 am on September 13, 2024: member
    Trivial #include rebase after #30546.
  131. DrahtBot removed the label Needs rebase on Sep 13, 2024
  132. DrahtBot removed the label CI failed on Sep 13, 2024
  133. ryanofsky approved
  134. ryanofsky commented at 2:23 pm on September 13, 2024: contributor
    Code review ACK 76ec301c60e2530de7f58e2e4cea3d15c6a77cbc. Just simple rebase since last review
  135. DrahtBot requested review from TheCharlatan on Sep 13, 2024
  136. DrahtBot added the label Needs rebase on Sep 16, 2024
  137. refactor: rename BlockKey to BlockRef 89a8f74bbb
  138. Rename getTipHash() to getTip() and return BlockRef ebb8215f23
  139. node: Track last block that received a blockTip notification
    Also signal m_tip_block_cv when StopRPC is called, for
    consistency with g_best_block_cv. This is handled in
    StopRPC instead of OnRPCStopped() because the latter
    is deleted in a later commit.
    
    Co-authored-by: TheCharlatan <seb.kung@gmail.com>
    Co-authored-by: Ryan Ofsky <ryan@ofsky.org>
    7eccdaf160
  140. Add waitTipChanged to Mining interface
    Co-authored-by: TheCharlatan <seb.kung@gmail.com>
    b94b27cf05
  141. rpc: add test for waitforblock and waitfornewblock 285fe9fb51
  142. rpc: fix waitfornewblock description
    The waitforblock RPC method takes a hash argument and waits for that specific block.  The waitfornewblock waits for any new block. This commit fixes the documentation.
    77ec072925
  143. rpc: recommend -rpcclienttimeout=0 for waitfor* de7c855b3a
  144. rpc: check for negative timeout arg in waitfor* 2a40ee1121
  145. Replace RPCNotifyBlockChange with waitTipChanged()
    This refactoring commit uses the newly introduced waitTipChanged mining interface method to replace the RPCNotifyBlockChange mechanism.
    dca923150e
  146. Remove unused CRPCSignals
    They are no longer used for anything since RPCNotifyBlockChange was replaced with waitTipChanged() from the mining interface.
    460687a09c
  147. rpc: use waitTipChanged for longpoll
    This removes the last remaining use of g_best_block by the RPC.
    e3a560ca68
  148. Remove unused g_best_block 7942951e3f
  149. Sjors force-pushed on Sep 17, 2024
  150. Sjors commented at 7:32 am on September 17, 2024: member
    Rebased after #30440. Just some #include and using conflicts since that PR also touched the mining interface.
  151. DrahtBot removed the label Needs rebase on Sep 17, 2024
  152. ryanofsky approved
  153. ryanofsky commented at 3:19 pm on September 17, 2024: contributor
    Code review ACK 7942951e3fcc58f7db0059546d03be9ea243f1db. Just rebased since last review
  154. TheCharlatan approved
  155. TheCharlatan commented at 3:21 pm on September 17, 2024: contributor
    Re-ACK 7942951e3fcc58f7db0059546d03be9ea243f1db
  156. ryanofsky referenced this in commit 54612bd402 on Sep 19, 2024
  157. ryanofsky referenced this in commit 7b9aa0b6eb on Sep 19, 2024
  158. ryanofsky referenced this in commit 9ccbbcc7b8 on Sep 19, 2024
  159. ryanofsky referenced this in commit 4bc985fe38 on Sep 19, 2024
  160. Sjors referenced this in commit 3030e023e1 on Sep 19, 2024
  161. Sjors referenced this in commit 28f1c62dee on Sep 19, 2024
  162. Sjors referenced this in commit 709e1fd91e on Sep 20, 2024
  163. achow101 commented at 7:26 pm on September 23, 2024: member
    ACK 7942951e3fcc58f7db0059546d03be9ea243f1db
  164. achow101 merged this on Sep 23, 2024
  165. achow101 closed this on Sep 23, 2024

  166. Sjors deleted the branch on Sep 24, 2024
  167. in src/rpc/server.cpp:325 in 7eccdaf160 outdated
    322+    std::call_once(g_rpc_stop_flag, [&]() {
    323         LogDebug(BCLog::RPC, "Stopping RPC\n");
    324         WITH_LOCK(g_deadline_timers_mutex, deadlineTimers.clear());
    325         DeleteAuthCookie();
    326         g_rpcSignals.Stopped();
    327+        node::NodeContext& node = EnsureAnyNodeContext(context);
    


    maflcko commented at 10:41 am on September 24, 2024:

    nit in 7eccdaf16081d6f624c4dc21df75b0474e049d2b: Any reason to use the throwing function? The internal error isn’t caught anywhere, IIUC. So, the program will terminate either way. By using *Assert(AnyPtr(context)) here instead, at least a usable error message is printed.


    Sjors commented at 1:59 pm on September 24, 2024:

    I probably just copied similar code from another place :-)

    Can make a followup, though *Assert(AnyPtr(context)) looks pretty scary.


    maflcko commented at 2:17 pm on September 24, 2024:

    You probably copied it from an RPC method implementation. There, it would be the right choice, because the http threads catch exceptions and then continue with the next RPC call.

    However, here no exceptions are caught (and doing so wouldn’t make sense, probably). The program will terminate either way, for example:

    02024-09-24T10:38:06Z opencon thread exit
    1terminate called after throwing an instance of 'UniValue'
    2Aborted (core dumped)
    

    However, by using Assert/assert, at least a source location is printed. If you don’t like the assert, you can also use an if guard against a nullptr, but I am not sure if that is cleaner.

    In any case, this is just a style nit.

  168. in src/node/interfaces.cpp:946 in b94b27cf05 outdated
    941+        // Interrupt check interval
    942+        const MillisecondsDouble tick{1000};
    943+        auto now{std::chrono::steady_clock::now()};
    944+        auto deadline = now + timeout;
    945+        // std::chrono does not check against overflow
    946+        if (deadline < now) deadline = std::chrono::steady_clock::time_point::max();
    


    maflcko commented at 12:09 pm on September 24, 2024:

    b94b27cf05c709674117e308e441a8d1efddcd0a: Can you explain the meaning of “overflow” in the context of double?

    To clarify: Signed interger overflow is UB, so it can’t happen. And overflow for double can’t happen either, so this is dead code either way, unless I am missing something.


    Sjors commented at 2:02 pm on September 24, 2024:

    It was in response to: #30409 (review) where @ryanofsky wrote:

    I’m also concerned about the deadline = now() + timeout assignment above because it seems like that will be an overflowed value when timeout is MillisecondsDouble::max() so the while loop condition might be invalid.


    ryanofsky commented at 2:49 pm on September 24, 2024:

    Nice catch. I believe this can be simplified to:

     0--- a/src/node/interfaces.cpp
     1+++ b/src/node/interfaces.cpp
     2@@ -941,9 +941,7 @@ public:
     3         // Interrupt check interval
     4         const MillisecondsDouble tick{1000};
     5         auto now{std::chrono::steady_clock::now()};
     6-        auto deadline = now + timeout;
     7-        // std::chrono does not check against overflow
     8-        if (deadline < now) deadline = std::chrono::steady_clock::time_point::max();
     9+        const auto deadline{now + timeout};
    10         {
    11             WAIT_LOCK(notifications().m_tip_block_mutex, lock);
    12             while ((notifications().m_tip_block == uint256() || notifications().m_tip_block == current_tip) && !chainman().m_interrupt) {
    

    This previous code written to handle the case where now + timeout calculation would overflow, basically assuming both values were unsigned integer values and overflow behavior would be well defined. But since timeout is a floating point number where “overflow” would just result in inf being set, trying to deal with it this way doesn’t make sense.

  169. maflcko commented at 1:32 pm on September 24, 2024: member

    review ACK 7942951e3fcc58f7db0059546d03be9ea243f1db 🛋

    Signature:

    0untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
    1RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
    2trusted comment: review ACK 7942951e3fcc58f7db0059546d03be9ea243f1db 🛋
    3KRzS4gS3SXCvu+SqXrmOlO59h7EfpDm03wNYXbXyxcUpf1uSvRs0Wxsdg7n96gG9dwQPet+ZPAL6FR2cNKqBCw==
    
  170. bitcoin deleted a comment on Sep 24, 2024
  171. in src/node/kernel_notifications.h:63 in 7eccdaf160 outdated
    55@@ -52,6 +56,12 @@ class KernelNotifications : public kernel::Notifications
    56     int m_stop_at_height{DEFAULT_STOPATHEIGHT};
    57     //! Useful for tests, can be set to false to avoid shutdown on fatal error.
    58     bool m_shutdown_on_fatal_error{true};
    59+
    60+    Mutex m_tip_block_mutex;
    61+    std::condition_variable m_tip_block_cv;
    62+    //! The block for which the last blockTip notification was received for.
    63+    uint256 m_tip_block;
    


    theuni commented at 7:58 pm on September 24, 2024:

    This needs a GUARDED_BY(m_tip_block_mutex).

    Adding it causes new warnings in validation_chainstate_tests.cpp. For ex:

    src/test/validation_chainstate_tests.cpp:73:46: warning: reading variable ’m_tip_block’ requires holding mutex ’m_node.notifications->m_tip_block_mutex’ [-Wthread-safety-analysis]

    I assume those particular accesses are harmless (maybe actually racy in the right conditions?), but future users may not be.

    Related: It’s a shame that the condvar/mutex leak out of here. Is there a reason not to add getter/waiter/notifier functions?

  172. Sjors commented at 7:21 am on September 25, 2024: member
    I’ll work on a followup for the above things.
  173. maflcko commented at 8:10 am on September 25, 2024: member

    I’ll work on a followup for the above things.

    Already started working on this, because I needed it for other stuff.

    Happy to drop mine, if you are done already.

  174. Sjors commented at 8:16 am on September 25, 2024: member
  175. maflcko referenced this in commit faa0e6091b on Sep 25, 2024
  176. maflcko referenced this in commit fa7dd1d072 on Sep 25, 2024

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-11-21 12:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me