kernel: Remove StartShutdown calls from validation code #28048

pull ryanofsky wants to merge 1 commits into bitcoin:master from ryanofsky:pr/stopafter changing 10 files +55 −16
  1. ryanofsky commented at 5:03 pm on July 7, 2023: contributor

    This change drops the last kernel dependency on shutdown.cpp. It also adds new hooks for libbitcoinkernel applications to be able to interrupt kernel operations when the chain tip changes.

    ~This change is mostly a refactoring, but does slightly change -stopatheight behavior (see release note and commit message)~

    (EDIT: This change is a refactoring and doesn’t change behavior. The code change appeared to affect -stopatheight behavior, but doesn’t really).

  2. DrahtBot commented at 5:03 pm on July 7, 2023: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK TheCharlatan, furszy, hebasto, MarcoFalke
    Concept ACK theuni, stickies-v, mzumsande

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #28051 (Get rid of shutdown.cpp/shutdown.h, use SignalInterrupt directly by ryanofsky)
    • #26762 (bugfix: Make CCheckQueue RAII-styled by hebasto)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. DrahtBot added the label Validation on Jul 7, 2023
  4. ryanofsky force-pushed on Jul 7, 2023
  5. ryanofsky commented at 6:07 pm on July 7, 2023: contributor
    Updated 828af9ad19a702e711963f9b498f7462bde8aabb -> c2bcc339d7b8def58289d3ff01b2a905dc291ed6 (pr/stopafter.1 -> pr/stopafter.2, compare) just updating comments and adding release note about -stopatheight behavior
  6. theuni commented at 7:01 pm on July 7, 2023: member

    Very nice. Concept ACK and quick (very shallow) codereview ACK.

    I think the -stopatheight change is reasonable. If it turns out anyone/anything was relying on the way this worked before, we could always add functionality for that use-case (maybe -stopatblock?)

    However, the help string should be updated from:

    Stop running after reaching the given height in the main chain

    It sounds like it’s not completely accurate now, and less so after this change :)

  7. in src/node/kernel_notifications.cpp:64 in c2bcc339d7 outdated
    69+}
    70+
    71+kernel::InterruptResult KernelNotifications::blockTip(SynchronizationState state, CBlockIndex& index)
    72 {
    73     uiInterface.NotifyBlockTip(state, &index);
    74+    if (m_stop_at_height && index.nHeight >= m_stop_at_height) {
    


    theuni commented at 7:24 pm on July 7, 2023:

    Is it possible for this to be called twice?

    0if (m_stop_at_height) {
    1  if (index.nHeight > m_stop_at_height) {
    2    // assert(false) or return kernel::Interrupted{} ?
    3  } else if (index.nHeight == m_stop_at_height) {
    4  ..
    5  }
    6}
    

    Not that I imagine calling StartShutdown() twice would be particularly harmful anyway.


    ryanofsky commented at 8:09 pm on July 7, 2023:

    re: #28048 (review)

    I don’t think StartShutdown() can be called twice, but it should be ok if that did happen, like you say.

    Checking index.nHeight > m_stop_at_height wouldn’t be a good way to detect two shutdown calls, though, because that condition can also happen if they the node is restarted with a lower -stopatheight value (because the implementation only stops when new blocks are added, and doesn’t make any attempt to rewind)

  8. TheCharlatan commented at 9:12 pm on July 7, 2023: contributor

    Strong Concept ACK :)

    I’m glad you changed your mind on returning things from the notifications, putting the infrastructure in place to allow the code to now bubble a -stopatheight interrupt is a clear advantage to me.

    I was about to open a similar PR, albeit with a different approach for handling -stopafterblockimport (patch). Instead of creating a new notification for it, I thought slightly refactoring the function to instead return when the block import is done and then allow the user of the kernel (in this case the init code) to decide what to do is easier to reason with. This has the added benefit that users of the kernel can now just call BlockImport without having to rely on options and notification handlers to skip loading the mempool. Can you comment on this approach?

  9. ryanofsky force-pushed on Jul 7, 2023
  10. ryanofsky commented at 9:48 pm on July 7, 2023: contributor

    Updated c2bcc339d7b8def58289d3ff01b2a905dc291ed6 -> 09938a41d904c05b4676b064da9baa85e53e3e6f (pr/stopafter.2 -> pr/stopafter.3, compare) updating -stopatheight documentation to be a little more precise.

    I was also going to add a new commit 03dfa85e057192e38d706b8c40a21e70c5aeec91 dropping shutdown.h and shutdown.cpp entirely, but it was bigger than I thought it would be so will save it for another PR. [EDIT: This is now done in #28051]

    re: #28048 (comment)

    different approach for handling -stopafterblockimport (patch). Instead of creating a new notification for it, I thought slightly refactoring the function to instead return when the block import is done and then allow the user of the kernel (in this case the init code) to decide what to do is easier to reason with. This has the added benefit that users of the kernel can now just call BlockImport without having to rely on options and notification handlers to skip loading the mempool. Can you comment on this approach?

    I think the approach looks good, and thought of doing something similar, but I didn’t because I thought it would make this change bigger and more complicated. I think you should post that patch as a separate PR, because @furszy is actually making similar changes in #27607 and should be able to give good review and feedback. Having those changes done separately would let me simplify this PR and probably make for a better kernel API as you say.

  11. hebasto commented at 10:15 pm on July 7, 2023: member
    Concept ACK.
  12. furszy commented at 10:22 pm on July 7, 2023: member

    Oh yeah. The first commit on #27607 extracts the LoadMempool call out of the ImportBlocks method, second and third one are about renaming the function (and some other fields) and documenting it. Then the rest are improving the indexes initialization process so we can, at the end, erase the indexes global flag.

    But the -stopafterblockimport changes are new and conceptually look great in a first glance @TheCharlatan. Happy to review it if you push it into a standalone PR.

    Maybe the simplest path to not conflict much could be to provide ThreadImport with a “return_after_import” bool arg (true when -stopafterblockimport is set), so the function returns early after importing blocks, so you don’t need to make the mempool and indexes flag changes.

  13. DrahtBot added the label CI failed on Jul 7, 2023
  14. TheCharlatan commented at 11:24 am on July 8, 2023: contributor

    Re #28048#pullrequestreview-1520042712

    Maybe the simplest path to not conflict much could be to provide ThreadImport with a “return_after_import” bool arg (true when -stopafterblockimport is set)

    Thanks @furszy, opened #28053.

  15. in src/kernel/notifications_interface.h:19 in 09938a41d9 outdated
    14 class CBlockIndex;
    15 enum class SynchronizationState;
    16 
    17 namespace kernel {
    18 
    19+//! Result type for use with std::variant to indiciate that an operation should be interrupted.
    


    stickies-v commented at 10:17 am on July 10, 2023:

    nit

    0//! Result type for use with std::variant to indicate that an operation should be interrupted.
    

    ryanofsky commented at 10:40 pm on July 10, 2023:

    re: #28048 (review)

    nit

    Thanks, fixed

  16. in src/init.cpp:1412 in 09938a41d9 outdated
    1409@@ -1409,6 +1410,7 @@ bool AppInitMain(NodeContext& node, interfaces::BlockAndHeaderTipInfo* tip_info)
    1410     // ********************************************************* Step 7: load block chain
    1411 
    1412     node.notifications = std::make_unique<KernelNotifications>(node.exit_status);
    1413+    ReadNotificationArgs(args, *node.notifications);
    


    stickies-v commented at 4:45 pm on July 10, 2023:
    nit: would SetNotificationArgs be a better name to clarify the side effects this function has?

    ryanofsky commented at 10:40 pm on July 10, 2023:

    re: #28048 (review)

    nit: would SetNotificationArgs be a better name to clarify the side effects this function has?

    It seems like if you just saw a ReadNotificationArgs(args, *node.notifications) call with no context you would assume that it is reading args from the “args” variable into the “notifications” variable. Is there another way it could be interpreted?

    node::ReadNotificationArgs function name is meant to be consistent with node::ReadDatabaseArgs, node::ReadCoinsViewArgs, and wallet::ReadDatabaseArgs functions. The “args” that are being referred to are command line arguments and other ArgsManager settings that come from the OS and filesystem. So it seems more accurate to me to say that the args are read or parsed or interpreted here, not set here.


    stickies-v commented at 11:01 am on July 11, 2023:
    You make a lot of good points, and I think I agree with all of them. Thank you.
  17. in src/validation.cpp:3377 in 09938a41d9 outdated
    3371@@ -3369,7 +3372,10 @@ bool Chainstate::InvalidateBlock(BlockValidationState& state, CBlockIndex* pinde
    3372 
    3373     // Only notify about a new block tip if the active chain was modified.
    3374     if (pindex_was_in_chain) {
    3375-        m_chainman.GetNotifications().blockTip(GetSynchronizationState(IsInitialBlockDownload()), *to_mark_failed->pprev);
    3376+        // Ignoring return value for now, this could be changed to bubble up
    3377+        // kernel::Interrupted value to the caller so the caller could
    3378+        // distinguish between completed and interrupted operations.
    


    stickies-v commented at 4:58 pm on July 10, 2023:
    I don’t see in which scenario blockTip() would return kernel::Interrupted here, but the docs seem to indicate that it could. I’d suggest to either update the docs to reflect that we never expect this to interrupt, or add a logging statement similar to what’s done in ThreadImport?

    ryanofsky commented at 11:23 pm on July 10, 2023:

    re: #28048 (review)

    I don’t see in which scenario blockTip() would return kernel::Interrupted here, but the docs seem to indicate that it could. I’d suggest to either update the docs to reflect that we never expect this to interrupt, or add a logging statement similar to what’s done in ThreadImport?

    I’m happy to update the comment, but it would help to have a specific suggestion, because to me this is only saying that a kernel::Interrupted value might be returned here (as a fact of the [[nodiscard]] virtual InterruptResult blockTip return type). I don’t think there’s a suggestion here that bitcoind will return kernel::Interrupted{} . But other libbitcoinkernel applications or tests might could handle the blockTip notification in the future and return an interrupted value, and that case could be handled here.


    stickies-v commented at 10:55 am on July 11, 2023:

    I think I was confusing the kernel/node boundaries a bit. You’re right that here in the kernel logic that is validation.cpp, the documentation and behaviour are what we’d expect.

    My concern was around the node implementation of blockTip() now potentially calling StartShutdown() and returning a kernel:Interrupted without leaving any trace. It’s not something we’d expect to happen in InvalidateBlock() (and it would be behaviour change). I think I was wrong in suggesting we catch/log/document that here in validation.cpp though, as it’s node specific behaviour.

    Perhaps it would be beneficial to move this logging to KernelNotifications::blocksImported() and add a similar logging statement in KernelNotifications::blockTip?

    Although having typed all of this out… Is InvalidateBlock() even meant to be/stay in kernel?


    ryanofsky commented at 3:04 pm on July 13, 2023:

    re: #28048 (review)

    I’m not too concerned about blockTip notification shutting down during an invalidateblock call, because it seems like a corner case to worry about someone using -stopafterheight and invalidblock at the same time. If someone is doing this, I think there is only a small change of behavior not worth documenting where the -stopatheight option might take effect a little earlier and the node would shut down a little sooner.

    If distinguishing invalidateblock tip changes from other tip changes is a concern for other reaons in the future, it could be addressed by adding an enum parameter to the blockTip function that indicates the source or the tip change.

    The point of the “this could be changed to bubble up kernel::Interrupted” comment is not to worry about the stopatheight+invalidateblock corner case, but just suggest in general it should probably become a pattern to do something like if (IsInterrupted(result)) return std::get<Interrrupted>(result) and bubble up interrupted values to callers.

    On logging, I think it is usually better to log from kernel code rather than from node hooks, because if the log statements are actually useful for our code there is a good chance they will be useful for other code as well.

    On InvalidateBlock staying in the kernel, declaring blocks invalid seems like a useful feature to offer to kernel applications, so I don’t think there would be much benefit to removing it.

  18. stickies-v commented at 5:01 pm on July 10, 2023: contributor
    Concept ACK
  19. ryanofsky force-pushed on Jul 11, 2023
  20. ryanofsky commented at 0:04 am on July 11, 2023: contributor
    Updated 09938a41d904c05b4676b064da9baa85e53e3e6f -> 3e6b6877645f978b4999efdfb9e616f4dbf9dd19 (pr/stopafter.3 -> pr/stopafter.4, compare) just fixing a spelling mistake that was pointed out
  21. DrahtBot removed the label CI failed on Jul 11, 2023
  22. ryanofsky referenced this in commit e253568da8 on Jul 11, 2023
  23. DrahtBot added the label Needs rebase on Jul 11, 2023
  24. kernel: Remove StartShutdown calls from validation code
    This change drops the last kernel dependency on shutdown.cpp. It also adds new
    hooks for libbitcoinkernel applications to be able to interrupt kernel
    operations when the chain tip changes.
    
    This is a refactoring that does not affect behavior. (Looking at the code it
    can appear like the new break statement in the ActivateBestChain function is a
    change in behavior, but actually the previous StartShutdown call was indirectly
    triggering a break before, because it was causing m_chainman.m_interrupt to be
    true. The new code just makes the break more obvious.)
    31eca93a9e
  25. in doc/release-notes-28048.md:1 in 3e6b687764 outdated
    0@@ -0,0 +1 @@
    1+- The `-stopatheight` option will now stop exactly at the specified height, rather than at or above the specified height. Since the node shuts down as soon as it find a valid chain at the specified height, it is possible for the resulting chain to no longer be the most-work chain, if some of the blocks above the specified height are invalid.
    


    mzumsande commented at 5:36 pm on July 11, 2023:

    Just wanted to note that this proposal has some history, see Issue #13477 and the two closed PRs: #13490 and #13713.

    In particular, comment #13490 (comment) still seems relevant: The expectation that -stopatheight stops exactly at the specified height relies on ActivateBestChainStep() only connecting a single block in each invocation, which is currently the case, but more of a coincidence due to not wanting to lock cs_main for too long than an actual guarantee given by the function.

    Maybe it would be good to explicitly document in ActivateBestChainStep that -stopatheight relies on it conecting just a single block (outside of a reorg situation)?


    ryanofsky commented at 6:38 pm on July 11, 2023:

    I don’t actually want to make any new guarantees here. I’m just trying to make a simplification to the ActivateBestChain function and document the resulting behavior in release notes. Maybe the following release note would be better?

    • The -stopatheight option now stops earlier and no longer tries to validate and connect all the downloaded blocks above the specified height on the most-work chain. This means the resulting chain is more likely to be exactly the specified height instead of longer. But it also means the resulting chain might no longer the most-work chain if blocks above the specified height appeared to contain more work but were actually invalid.

    Would appreciate any suggestions to make the description better.


    mzumsande commented at 5:14 pm on July 12, 2023:

    I looked into this deeper and did some test runs on signet, and now I’m no longer sure that this is a behavioral change requiring a release note in the first place:

    My understanding is that the IBD behavior on master is:

    • ActivateBestChainStep()  will connect only one block if we made progress
    • The inner do / while loop of ActivateBestChain also will only run once unless the chain has less work than intitially (only relevant in a reorg).
    • so during IBD we’ll connect one block, release cs_main, call StartShutdown() if the stopheight is reached, break out of the outer do / while loop because m_chainman.m_interrupt is set.
    • There is a race between Shutdown() executed in the init thread, and msghand, so it’s possible that msghand connects another block or two (by executing new calls to ABC!) before it’s being stopped by Shutdown(). But the same should be possible after this PR?

    Also, the explanation of previous behavior in the commit msg (“It would not stop connecting blocks after the specified  height if they were already downloaded.”) does not seem to be accurate because of the existing break.


    ryanofsky commented at 5:53 pm on July 12, 2023:

    re: #28048 (review)

    :man_facepalming: Thanks! I didn’t notice the StartShutdown call was followed by a if (m_chainman.m_interrupt) break line. So it was already exiting the loop after the height check, and adding a new break would not change anything.

    I think you are also right that there are race conditions where extra blocks could be attached, but I haven’t looked into the details. Maybe it is worth opening an issue with your observations, in case there is a quick fix, or more questions to discuss. Naively, I’d think ProcessNewBlock should just do if (m_interrupt) return false; and the types of races you mention would be avoided.


    furszy commented at 7:45 pm on July 13, 2023:

    Naively, I’d think ProcessNewBlock should just do if (m_interrupt) return false; and the types of races you mention would be avoided.

    Hmm, if it arrived to ProcessNewBlock is because the node requested the block. So I would try to not discard it, and instead make it pass through AcceptBlock so it can be stored. Then quit early before ABC.

  26. mzumsande commented at 5:47 pm on July 11, 2023: contributor
    Concept ACK
  27. ryanofsky force-pushed on Jul 13, 2023
  28. DrahtBot removed the label Needs rebase on Jul 13, 2023
  29. ryanofsky force-pushed on Jul 13, 2023
  30. ryanofsky commented at 3:33 pm on July 13, 2023: contributor
    Rebased 3e6b6877645f978b4999efdfb9e616f4dbf9dd19 -> cbea21da035d9dbcd38669d780611c9a05442f3d (pr/stopafter.4 -> pr/stopafter.5, compare) fixing conflict with #28053 and also removing mentions about changed -stopafterheight behavior after Martin’s comment #28048 (review) Updated cbea21da035d9dbcd38669d780611c9a05442f3d -> 31eca93a9eb8e54f856d3f558aa3c831ef181d37 (pr/stopafter.5 -> pr/stopafter.6, compare) adding comment about distinguishing types of blockTip to address stickies comment #28048 (review)
  31. TheCharlatan approved
  32. TheCharlatan commented at 4:13 pm on July 13, 2023: contributor

    Very happy with this change. Nice to see these last few PRs pushing configuration options out of the kernel and letting the user decide how to react to kernel actions.

    ACK 31eca93a9eb8e54f856d3f558aa3c831ef181d37

  33. furszy commented at 7:52 pm on July 13, 2023: member
    Concept and light review ACK 31eca93a
  34. in src/node/kernel_notifications.cpp:61 in 31eca93a9e
    57@@ -57,9 +58,14 @@ static void DoWarning(const bilingual_str& warning)
    58 
    59 namespace node {
    60 
    61-void KernelNotifications::blockTip(SynchronizationState state, CBlockIndex& index)
    62+kernel::InterruptResult KernelNotifications::blockTip(SynchronizationState state, CBlockIndex& index)
    


    hebasto commented at 10:01 am on July 14, 2023:
    This requires to #include "kernel/notifications_interface.h", no?

    ryanofsky commented at 12:23 pm on July 14, 2023:

    This requires to #include "kernel/notifications_interface.h", no?

    It is included from the header, and is not a new dependency since this file is subclassing that one, but yes according to IWYU’s reasoning it should be included directly here as well. I’ll add the include if I need to update this PR for another reason.


    MarcoFalke commented at 12:47 pm on July 14, 2023:
    I’d say no. There should be never a need to include kernel/notifications_interface.h when you’ve included node/kernel_notifications.h. And I don’t see an outcome where node/kernel_notifications.h was ever changed to not include kernel/notifications_interface.h. Thus, iwyu pragma export should be used.
  35. hebasto approved
  36. hebasto commented at 10:02 am on July 14, 2023: member
    ACK 31eca93a9eb8e54f856d3f558aa3c831ef181d37, I have reviewed the code and it looks OK.
  37. MarcoFalke approved
  38. MarcoFalke commented at 12:47 pm on July 14, 2023: member

    lgtm ACK 31eca93a9eb8e54f856d3f558aa3c831ef181d37 🕷

    Signature:

    0untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
    1RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
    2trusted comment: lgtm ACK 31eca93a9eb8e54f856d3f558aa3c831ef181d37 🕷
    3Hkw+kA0Lvq2l5r2bA8OJD4QEslihedKBKDZiOpVUMUTGt5An9XfkcjccSf/I7WbXmLoxbsq78Feb4fRhSsC+AA==
    
  39. achow101 merged this on Jul 14, 2023
  40. achow101 closed this on Jul 14, 2023

  41. mzumsande commented at 5:21 pm on July 14, 2023: contributor

    ACK 31eca93a9eb8e54f856d3f558aa3c831ef181d37

    The PR description is outdated - its last sentence can be removed.

  42. ryanofsky commented at 7:58 pm on July 14, 2023: contributor

    The PR description is outdated - its last sentence can be removed.

    Thanks, struck it out since PR was merged and the old description did get added to git history.

  43. sidhujag referenced this in commit ef5ea647e7 on Jul 15, 2023

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-07-01 10:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me