Set notifications m_tip_block in LoadChainTip() #31346

pull Sjors wants to merge 1 commits into bitcoin:master from Sjors:2024/11/init_m_tip_block changing 6 files +27 −9
  1. Sjors commented at 12:13 pm on November 22, 2024: member

    Ensure KernelNotifications m_tip_block is set even if no new block arrives.

    Suggested in #31297 (comment)

  2. DrahtBot commented at 12:13 pm on November 22, 2024: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/31346.

    Reviews

    See the guideline for information on the review process. A summary of reviews will appear here.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #31393 (refactor: Move GuessVerificationProgress into ChainstateManager by maflcko)
    • #31325 (Make m_tip_block std::optional by Sjors)
    • #31283 (Add waitNext() to BlockTemplate interface by Sjors)
    • #31177 (rpc, logging: return “verificationprogress” of 1 when up to date by polespinasa)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. in src/validation.cpp:4712 in b6d5be198a outdated
    4705@@ -4706,6 +4706,11 @@ bool Chainstate::LoadChainTip()
    4706               m_chain.Height(),
    4707               FormatISO8601DateTime(tip->GetBlockTime()),
    4708               GuessVerificationProgress(m_chainman.GetParams().TxData(), tip));
    4709+
    4710+    // Ensure KernelNotifications m_tip_block is set even if no new block arrives.
    4711+    // Ignoring return value for now.
    4712+    (void)m_chainman.GetNotifications().blockTip(GetSynchronizationState(/*init=*/true, m_chainman.m_blockman.m_blockfiles_indexed), *pindex);
    


    l0rinc commented at 12:17 pm on November 22, 2024:
    nit: at this point it may make more sense to just remove the [[nodiscard]] from blockTip instead

    Sjors commented at 12:30 pm on November 22, 2024:
    Maybe, or we could do something with the return value. I’m not familiar enough with init code and kernel notifications to have a strong opinion on this. cc @ryanofsky, @TheCharlatan

    ryanofsky commented at 1:16 pm on November 22, 2024:
    Just ignoring the return value is probably the best thing for now since this function and its callers return bool. After #29700 it could more easily bubble up the interrupted status. The kernel API is designed for a better world where operations can be more easily interrupted but not everything is interruptible for now.

    mzumsande commented at 4:24 am on November 25, 2024:
    LoadChainTip() is called in a loop for all chainstates in CompleteChainstateInitialization() so that blockTip would be called twice, for different blocks, in the case of an ongoing AssumeUtxo sync. This seems problematic to me - how would subscribers to blockTip be supposed to deal with that? I think that the background chainstate shouldn’t send a notification, as it it is done in ActivateBestChain().
  4. maflcko commented at 12:42 pm on November 22, 2024: member
    This is incomplete. You’ll have to update the docs as well: “If the tip was not connected on * startup, this will wait.”, and “which may //! be true even long after startup, until shutdown.”. Also, in init.cpp you can remove the if-guard?
  5. Sjors force-pushed on Nov 22, 2024
  6. Sjors commented at 2:19 pm on November 22, 2024: member
    I updated the waitTipChanged() and m_tip_block doc. @maflcko which if-guard do you mean?
  7. ryanofsky commented at 2:23 pm on November 22, 2024: contributor

    @maflcko which if-guard do you mean?

    I think probably this one:

    0    // Wait for genesis block to be processed
    1    if (WITH_LOCK(chainman.GetMutex(), return chainman.ActiveTip() == nullptr)) {
    

    Removing the if() might help verify that m_tip_block is set reliably

  8. maflcko commented at 2:29 pm on November 22, 2024: member
    Also, the ZERO workaround could be removed completely, ensuring that any interfaces will only be spun up after that point in init?
  9. Sjors force-pushed on Nov 22, 2024
  10. Sjors commented at 2:42 pm on November 22, 2024: member

    I dropped the if guard.

    ensuring that any interfaces will only be spun up after that point in init

    It could, but I’m not sure if it’s worth putting such a restriction on the interfaces.

    Similarly we have startupnotify which happens at the end of init. If we also ensure that the interfaces are ready to listen before this notification, then an application launched with systemd (or similiar) can wait for this signal before trying to connect.

    If on the other hand an external application is started at the same time, it would just have to keep trying to connect until we listen. Such a race would exist even if we opened the interface a bit earlier in the init process, though it would be shorter.

  11. ryanofsky commented at 3:00 pm on November 22, 2024: contributor

    It could, but I’m not sure if it’s worth putting such a restriction on the interfaces.

    Tend to agree for now. Having waitTipChanged wait for a tip to be connected seems like a natural thing to implement and might avoid the need for clients to poll the node on startup. But I think this check could be dropped in BlockTemplate::waitNext in #31283 (maybe replaced with an assert there) since it should not be possible to get a blocktemplate reference before the tip is connected. After #31283 it might make sense to drop the waitTipChanged method entirely too.

  12. Sjors commented at 4:02 pm on November 22, 2024: member

    Several reindex related tests are broken, investigating…

    After #31283 it might make sense to drop the waitTipChanged method entirely too.

    I think it’s generically useful though. There are various types of applications that need to know when a new block is added to the chain (and have no need for a template).

  13. DrahtBot added the label CI failed on Nov 22, 2024
  14. DrahtBot commented at 4:06 pm on November 22, 2024: contributor

    🚧 At least one of the CI tasks failed. Debug: https://github.com/bitcoin/bitcoin/runs/33386413030

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  15. Sjors commented at 4:46 pm on November 22, 2024: member

    Specifically -reindex-chainstate is broken, easy to reproduce with e.g. testnet4.

    It stops after the second block:

    02024-11-22T16:44:14.338986Z Setting NODE_NETWORK on non-prune mode
    12024-11-22T16:44:14.339014Z Wait for genesis block to be processed
    22024-11-22T16:44:14.339023Z initload thread start
    32024-11-22T16:44:14.341109Z UpdateTip: new best=00000000da84f2bafbbc53dee25a72ae507ff4914b867c565be350b0da8bf043 height=0 version=0x00000001 log2_work=32.000022 tx=1 date='2024-05-03T23:11:00Z' progress=0.000001 cache=0.3MiB(0txo)
    42024-11-22T16:44:14.341179Z UpdateTip: new best=0000000012982b6d5f621229286b880e909984df669c2afabb102ce311b13f28 height=1 version=0x20000000 log2_work=33.000022 tx=2 date='2024-05-06T13:08:48Z' progress=0.000002 cache=0.3MiB(1txo)
    

    Putting the if-guard back “fixes” it. When I put a log statement before WAIT_LOCK and after the wait function returns, I can see that this lines are called regardless of the if-guard.

    Turns out the only critical thin is that the WAIT_LOCK stuff lives in its own scope.

  16. Sjors force-pushed on Nov 22, 2024
  17. DrahtBot removed the label CI failed on Nov 22, 2024
  18. in src/init.cpp:1794 in 3317123aa3 outdated
    1791@@ -1792,7 +1792,7 @@ bool AppInitMain(NodeContext& node, interfaces::BlockAndHeaderTipInfo* tip_info)
    1792     });
    1793 
    1794     // Wait for genesis block to be processed
    


    mzumsande commented at 3:56 am on November 25, 2024:

    My understanding is that this will only ever actually wait in three cases: 1.) first startup with an empty datadir 2.) reindex 3.) reindex-chainstate

    where the genesis block is being connected by the initload thread, whereas otherwise the m_tip_block_cv wait condition should be satisfied because m_tip_block should have been set earlier by the same thread. So removing the if guard is not necessary but just a matter of taste.

    After this PR, it will be harder to understand how the folllowing code block refers to the genesis block, so maybe the comment could be expanded to mention some of the above.


    Sjors commented at 8:42 am on November 25, 2024:

    I drafted a comment, but I think it’s wrong so I haven’t pushed it yet:

    0    /* 
    1     * Wait for genesis block to be processed. Typically kernel_notifications.m_tip_block
    2     * has already been set, except in three cases where the genesis block is
    3     * being connected by the initload thread:
    4     *
    5     * 1. first startup with an empty datadir
    6     * 2. reindex
    7     * 3. reindex-chainstate
    8     */
    

    I don’t think the initload thread ever sets kernel_notifications.m_tip_block. The thread only deals with indexes derived from BaseIndex, which doesn’t include CBlockIndex.

    I’m still confused under what circumstance it would actually be unset.


    Sjors commented at 8:47 am on November 25, 2024:

    It’s clearly doing something though, because the following breaks several functional tests:

     0--- a/src/init.cpp
     1+++ b/src/init.cpp
     2@@ -1792,13 +1792,14 @@ bool AppInitMain(NodeContext& node, interfaces::BlockAndHeaderTipInfo* tip_info)
     3     });
     4 
     5     // Wait for genesis block to be processed
     6     {
     7         WAIT_LOCK(kernel_notifications.m_tip_block_mutex, lock);
     8-        kernel_notifications.m_tip_block_cv.wait(lock, [&]() EXCLUSIVE_LOCKS_REQUIRED(kernel_notifications.m_tip_block_mutex) {
     9-            return !kernel_notifications.m_tip_block.IsNull() || ShutdownRequested(node);
    10-        });
    11+        Assert(!kernel_notifications.m_tip_block.IsNull());
    12+        // kernel_notifications.m_tip_block_cv.wait(lock, [&]() EXCLUSIVE_LOCKS_REQUIRED(kernel_notifications.m_tip_block_mutex) {
    13+        //     return !kernel_notifications.m_tip_block.IsNull() || ShutdownRequested(node);
    14+        // });
    15     }
    

    Sjors commented at 9:02 am on November 25, 2024:
    With some manual testing I indeed get all three conditions from your comment to hit that assert. I’m just confused why.

    Sjors commented at 9:31 am on November 25, 2024:

    Outside the context of a snapshot, we only call LoadChainTip() if:

    https://github.com/bitcoin/bitcoin/blob/2638fdb4f934be96b7c798dbac38ea5ab8a6374a/src/node/chainstate.cpp#L147-L150

    That explains why these three scenarios are different. With an empty datadir or when we -reindex(-chainstate) the UTXO set is empty[0].

    Still looking into where it is set.

    [0] more accurately, doesn’t point to a tip or we asked to wipe it. The name is misleading, it doesn’t matter if by some magic every UTXO ends up spent.

    https://github.com/bitcoin/bitcoin/blob/2638fdb4f934be96b7c798dbac38ea5ab8a6374a/src/node/chainstate.cpp#L260-L262


    Sjors commented at 10:22 am on November 25, 2024:

    In the case of fresh data dir or reindex, it’s ActivateBestChain() that calls blockTip() once it sets the genesis block to the tip.

    This indeed happens in the initload threads, which seems to keep doing this throughout the reindex process. Once all previously stored blocks are done, it exits and the msghand thread takes over.

    With a fresh datadir the initload thread only connects the genesis block.


    Sjors commented at 10:42 am on November 25, 2024:

    Specifically it’s ImportBlocks() that does this, even if no blocks are passed to it.

    which seems to keep doing this throughout the reindex process.

    In fact it does it after reindex is finished. ActivateBestChain connects blocks one by one until it has nothing left to do so that makes sense.


    Sjors commented at 10:56 am on November 25, 2024:
    Took the comment and expanded it a bit.

    mzumsande commented at 3:32 pm on November 25, 2024:

    In fact it does it after reindex is finished. ActivateBestChain connects blocks one by one until it has nothing left to do so that makes sense.

    Just as an aside: ABC is also called during reindexing when encountering the genesis block. That’s why we have this slightly weird behavior with a full -reindex that normally first reindexes all blocks and then builds the chainstate, but if the user interrupts it during reindexing and restarts the node without -reindex, it will pick up by first connecting all blocks of the previously indexed block files, then reindex the remaining block files and finally connect the blocks from the second batch.

    But that’s not really relevant to this PR, the comment seems fine now!

  19. Sjors force-pushed on Nov 25, 2024
  20. Sjors commented at 8:44 am on November 25, 2024: member
    I added a check to ensure we’re not the background chainstate when sending the notification.
  21. Set notifications m_tip_block in LoadChainTip()
    Ensure KernelNotifications m_tip_block is set even if no new block arrives.
    
    Additionally, have node init always wait for this to happen.
    be78eaa983
  22. Sjors force-pushed on Nov 25, 2024
  23. Sjors commented at 10:59 am on November 25, 2024: member

    I added a comment based on #31346 (review) to clarify under which circumstances m_tip_block is unset at this init stage.

    I find it quite confusing that ActivateBestChain is called by ImportBlocks even if no blocks are imported, but haven’t dared to refactor it.

  24. allysa08 approved
  25. maflcko commented at 11:17 am on November 25, 2024: member

    I find it quite confusing that ActivateBestChain is called by ImportBlocks even if no blocks are imported, but haven’t dared to refactor it.

    It is explained in the code comment above the line that does it. It was added as a fix for https://github.com/bitcoin/bitcoin/issues/2239


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-03 21:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me