kernel: Flush in ChainstateManager destructor #31382

pull TheCharlatan wants to merge 6 commits into bitcoin:master from TheCharlatan:chainman_flush_destructor changing 14 files +70 −89
  1. TheCharlatan commented at 1:18 pm on November 27, 2024: contributor

    This simplifies the shutdown code a bit and allows for getting rid of the goto in bitcoin-chainstate as well as making it a bit easier to use for future users of the kernel library.

    Moving the second flush out of the Shutdown function into the ChainstateManager destructor, should be safe since there should be no new events created after the first flush. Tweak the chainman tests to actually exercise this new tear down behaviour.


    This PR is part of the libbitcoinkernel project

  2. DrahtBot commented at 1:18 pm on November 27, 2024: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/31382.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Concept ACK l0rinc, stickies-v

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #31664 (Fees: add Fee rate Forecaster Manager by ismaelsadeeq)
    • #30610 (validation: do not wipe utxo cache for stats/scans/snapshots by sipa)
    • #30157 (Fee Estimation via Fee rate Forecasters by ismaelsadeeq)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. DrahtBot added the label Validation on Nov 27, 2024
  4. TheCharlatan marked this as a draft on Nov 27, 2024
  5. DrahtBot added the label CI failed on Nov 27, 2024
  6. DrahtBot commented at 2:04 pm on November 27, 2024: contributor

    🚧 At least one of the CI tasks failed. Debug: https://github.com/bitcoin/bitcoin/runs/33601554977

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  7. in src/bitcoin-chainstate.cpp:69 in b17201e7d2 outdated
    68-    // things instantiated so far requires running the epilogue to be torn down
    69-    // properly
    70-    assert(kernel::SanityChecks(kernel_context));
    71+    if (!kernel::SanityChecks(kernel_context)) {
    72+        std::cerr << "Failed sanity check.";
    73+        return 0;
    


    l0rinc commented at 2:09 pm on November 27, 2024:
    nit: this being a console app, it might make sense to return a non-zero error code when cerr is used

    TheCharlatan commented at 2:22 pm on November 27, 2024:
    Yeah, done.
  8. l0rinc commented at 2:15 pm on November 27, 2024: contributor

    Concept ACK

    Closed #31362 in favor of this

  9. TheCharlatan force-pushed on Nov 27, 2024
  10. TheCharlatan force-pushed on Nov 27, 2024
  11. TheCharlatan commented at 3:01 pm on November 27, 2024: contributor
    Seems like I am running into this here: #25073 (comment), maybe time to introduce a replace mempool method like Carl suggested there?
  12. shutdown: Tear down mempool after chainman
    Next to being more consistent, since tearing down the mempool first
    leaves a dangling pointer in the chainman, this also enables the changes
    in the next commits, since flushing calls GetCoinsCacheSizeState, which
    would access this dangling pointer otherwise.
    fa078572f8
  13. TheCharlatan force-pushed on Nov 27, 2024
  14. TheCharlatan force-pushed on Nov 27, 2024
  15. TheCharlatan force-pushed on Nov 27, 2024
  16. DrahtBot removed the label CI failed on Nov 27, 2024
  17. TheCharlatan commented at 8:51 am on November 28, 2024: contributor

    maybe time to introduce a replace mempool method like Carl suggested there?

    Did this now, ready for review.

  18. TheCharlatan marked this as ready for review on Nov 28, 2024
  19. in src/test/util/setup_common.cpp:316 in 93049689f3 outdated
    311+    m_node.mempool.reset();
    312+    bilingual_str error;
    313+    m_node.mempool = std::make_unique<CTxMemPool>(MemPoolOptionsForTest(m_node), error);
    314+    auto& dummy_chainstate{static_cast<DummyChainState&>(m_node.chainman->ActiveChainstate())};
    315+    dummy_chainstate.SetMempool(m_node.mempool.get());
    316+    Assert(error.empty());
    


    stickies-v commented at 2:42 pm on December 17, 2024:
    Shouldn’t this be checked right after making the mempool?

    TheCharlatan commented at 1:34 pm on January 2, 2025:
    Yeah, that makes more sense.
  20. in src/test/util/setup_common.cpp:316 in 93049689f3 outdated
    310+    // instead.
    311+    m_node.mempool.reset();
    312+    bilingual_str error;
    313+    m_node.mempool = std::make_unique<CTxMemPool>(MemPoolOptionsForTest(m_node), error);
    314+    auto& dummy_chainstate{static_cast<DummyChainState&>(m_node.chainman->ActiveChainstate())};
    315+    dummy_chainstate.SetMempool(m_node.mempool.get());
    


    stickies-v commented at 9:28 am on January 2, 2025:
    I wonder if it would make sense to (probably in a separate PR, I suspect it’ll be quite a big change) remove the Chainstate::m_mempool member. It feels like a footgunny shortcut (but need to investigate more), conceptually doesn’t make a lot of sense to me, and not having it would avoid having to do e.g. theDummyChainState gymnastics. Is this something you’ve explored already?

    TheCharlatan commented at 1:34 pm on January 2, 2025:
    I could not come up with a satisfactory alternative yet, suggestions welcome!
  21. in src/init.cpp:357 in 7a37f20083 outdated
    352+    // if missing a callback results in an unrecoverable situation, unclean
    353+    // shutdown would too. The only reason to do the above flushes is to let
    354+    // the wallet and indexes catch up with our current chain to avoid any
    355+    // strange pruning edge cases and make next startup faster by avoiding
    356+    // rescan.
    357+
    


    stickies-v commented at 10:02 am on January 2, 2025:

    nit: I would go further and merge it with the above docstring. IIUC correctly, they all talk about L349?

     0diff --git a/src/init.cpp b/src/init.cpp
     1index 695e365867..b558a1594d 100644
     2--- a/src/init.cpp
     3+++ b/src/init.cpp
     4@@ -344,16 +344,17 @@ void Shutdown(NodeContext& node)
     5         }
     6     }
     7 
     8-    // After there are no more peers/RPC left to give us new data which may generate
     9-    // CValidationInterface callbacks, flush them...
    10+    // After there are no more peers/RPC left to give us new data which
    11+    // may generate CValidationInterface callbacks, flush them. Any
    12+    // future callbacks will be dropped. This should absolutely be safe
    13+    // - if missing a callback results in an unrecoverable situation,
    14+    // unclean shutdown would too. The only reason to flush is to let
    15+    // the wallet and indexes catch up with our current chain to avoid
    16+    // any strange pruning edge cases and make next startup faster by
    17+    // avoiding rescan.
    18     if (node.validation_signals) node.validation_signals->FlushBackgroundCallbacks();
    19 
    20-    // Any future callbacks will be dropped. This should absolutely be safe -
    21-    // if missing a callback results in an unrecoverable situation, unclean
    22-    // shutdown would too. The only reason to do the above flushes is to let
    23-    // the wallet and indexes catch up with our current chain to avoid any
    24-    // strange pruning edge cases and make next startup faster by avoiding
    25-    // rescan.
    26+
    27 
    28     // Stop and delete all indexes only after flushing background callbacks.
    29     for (auto* index : node.indexes) index->Stop();
    
  22. in src/test/validation_chainstatemanager_tests.cpp:391 in 32dc5732ee outdated
    385@@ -397,7 +386,11 @@ struct SnapshotTestSetup : TestChain100Setup {
    386             // For robustness, ensure the old manager is destroyed before creating a
    387             // new one.
    388             m_node.chainman.reset();
    389+            // Process all callbacks referring to the old manager before creating a
    390+            // new one.
    391+            m_node.validation_signals->SyncWithValidationInterfaceQueue();
    


    stickies-v commented at 11:25 am on January 2, 2025:
    Should we do this before calling m_node.chainman.reset();? I’m not sure if any of the callbacks actually depend on the chainman (it doesn’t look like it, but I’ll need to investigate more), but I think it’s a bit more logical at least?

    TheCharlatan commented at 1:35 pm on January 2, 2025:
    The chainman reset may trigger validation interface notifications, so I think it makes sense to sync with the queue afterwards? Unlike the shutdown procedure, we keep the validation interface around beyond the restarts there.

    stickies-v commented at 12:00 pm on January 3, 2025:

    The chainman reset may trigger validation interface notifications

    Ah, right, ChainStateFlushed would be called. Yeah, I think with that knowledge doing it afterwards makes more sense. Also, callbacks are processed regardless of SyncWithValidationInterfaceQueue() being called, so I suppose this doesn’t really change anything wrt callbacks assuming m_node.chainman exists.

  23. in src/test/validation_chainstatemanager_tests.cpp:383 in 32dc5732ee outdated
    378-                cs->ForceFlushStateToDisk();
    379-            }
    380-            // Process all callbacks referring to the old manager before wiping it.
    381-            m_node.validation_signals->SyncWithValidationInterfaceQueue();
    382-            LOCK(::cs_main);
    383-            chainman.ResetChainstates();
    


    stickies-v commented at 11:26 am on January 2, 2025:

    nit: the docstring is now a bit out of date:

    0    // Simulate a restart of the node by flushing all state to disk, clearing the
    1    // existing ChainstateManager, and unloading the block index.
    

    TheCharlatan commented at 1:59 pm on January 2, 2025:
    This seems fine as is? I’d prefer not to document these deeper behaviours, that the caller can expect to “just work”.

    stickies-v commented at 2:23 pm on January 2, 2025:

    Sorry, I meant to to remove the flushing reference since that’s now automatically handled by clearing chainman, so I agree with your comment.

    Simulate a restart of the node by clearing the existing ChainstateManager, and unloading the block index.


    TheCharlatan commented at 12:15 pm on January 13, 2025:
    I’ll do that next time I have to push.
  24. stickies-v commented at 1:05 pm on January 2, 2025: contributor

    Concept ACK

    Simplifies the shutdown logic, and makes ChainstateManager more self contained. I’ll need to think a bit more about how to test or verify the safety of changes such as in 7a37f2008372c021541809d06d9300278fd5a8da.

  25. tests: Add replace mempool method
    This is done in preparation for the next commit to also change the
    mempool pointer within the chainstate once the old mempool is reset.
    Without this the old, dangling pointer will be reused when destructing
    the chainstate manager.
    91bcb04538
  26. kernel: Flush in ChainstateManager dtor
    Moving the second flush out of the Shutdown function into the
    ChainstateManager destructor, should be safe since there should be no
    new events created after the first flush.
    
    This simplifies the shutdown code a bit and allows for getting rid of
    the `goto` in bitcoin-chainstate.
    c79a714461
  27. tests: Use chainman dtor logic to simulate restart
    Also remove the reference to the destructed chainman, which could cause
    UB once the chainman is reset.
    9bf6df763d
  28. validation: Remove ResetCoinsViews
    Now that its only callsite has been moved to the destructor, let the
    smart pointer do its thing.
    cd618e2020
  29. kernel: Remove epilogue in bitcoin-chainstate
    There is no need to flush the background callbacks, since the immediate
    task runner is used for the validation signals, which does not have any
    pending callbacks. This allows for removal of the epilogue goto.
    8e86e77311
  30. TheCharlatan force-pushed on Jan 3, 2025
  31. TheCharlatan commented at 8:54 pm on January 3, 2025: contributor

    Thank you for the review @stickies-v,

    Updated 6c3bc478ca343557d798e2b18b5f99c08079ba9a -> 8e86e77311a2c76fd34b4481f83205b2432b151a (chainman_flush_destructor_0 -> chainman_flush_destructor_1, compare)

    • Addressed @stickies-v’s comment, moved Assert checking that there is no mempool construction error to right after constructing the mempool.
    • Addressed @stickies-v’s comment, consolidated teardown notification comment into a single block.

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-01-21 06:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me