refactor: Improve assumeutxo state representation #30214

pull ryanofsky wants to merge 17 commits into bitcoin:master from ryanofsky:pr/cstate changing 39 files +693 −744
  1. ryanofsky commented at 2:53 pm on May 31, 2024: contributor

    This PR contains the first part of #28608, which tries to make assumeutxo code more maintainable, and improve it by not locking cs_main for a long time when the snapshot block is connected, and by deleting the snapshot validation chainstate when it is no longer used, instead of waiting until the next restart.

    The changes in this PR are just refactoring. They make Chainstate objects self-contained, so for example, it is possible to determine what blocks to connect to a chainstate without querying ChainstateManager, and to determine whether a Chainstate is validated without basing it on inferences like &cs != &ActiveChainstate() or GetAll().size() == 1.

    The PR also tries to make assumeutxo terminology less confusing, using “current chainstate” to refer to the chainstate targeting the current network tip, and “historical chainstate” to refer to the chainstate downloading old blocks and validating the assumeutxo snapshot. It removes uses of the terms “active chainstate,” “usable chainstate,” “disabled chainstate,” “ibd chainstate,” and “snapshot chainstate” which are confusing for various reasons.

  2. DrahtBot commented at 2:53 pm on May 31, 2024: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/30214.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Concept ACK mzumsande
    Approach ACK fjahr, TheCharlatan

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #31845 (Add -pruneduringinit option to temporarily use another prune target during IBD by luke-jr)
    • #31615 (validation: ensure assumevalid is always used during reindex by Eunovo)
    • #31533 (fuzz: Add fuzz target for block index tree and related validation events by mzumsande)
    • #31382 (kernel: Flush in ChainstateManager destructor by TheCharlatan)
    • #31282 (refactor: Make node_id a const& in RemoveBlockRequest by maflcko)
    • #31144 ([IBD] multi-byte block obfuscation by l0rinc)
    • #30610 (validation: do not wipe utxo cache for stats/scans/snapshots by sipa)
    • #30595 (kernel: Introduce initial C header API by TheCharlatan)
    • #30221 (wallet: Ensure best block matches wallet scan state by achow101)
    • #29700 (kernel, refactor: return error status on all fatal errors by ryanofsky)
    • #29680 (wallet: fix unrelated parent conflict doesn’t cause child tx to be marked as conflict by Eunovo)
    • #29652 (wallet: Avoid potentially writing incorrect best block locator by ryanofsky)
    • #29641 (scripted-diff: Use LogInfo over LogPrintf [WIP, NOMERGE, DRAFT] by maflcko)
    • #29039 (versionbits refactoring by ajtowns)
    • #26022 (Add util::ResultPtr class by ryanofsky)
    • #25665 (refactor: Add util::Result failure values, multiple error and warning messages by ryanofsky)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. DrahtBot added the label Refactoring on May 31, 2024
  4. DrahtBot added the label CI failed on May 31, 2024
  5. DrahtBot commented at 7:56 pm on May 31, 2024: contributor

    🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the documentation.

    Possibly this is due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    Leave a comment here, if you need help tracking down a confusing failure.

    Debug: https://github.com/bitcoin/bitcoin/runs/25656143496

  6. ryanofsky force-pushed on Jun 3, 2024
  7. DrahtBot removed the label CI failed on Jun 4, 2024
  8. fjahr commented at 9:21 am on June 8, 2024: contributor
    Concept ACK
  9. DrahtBot added the label Needs rebase on Jun 10, 2024
  10. ryanofsky force-pushed on Jun 17, 2024
  11. DrahtBot removed the label Needs rebase on Jun 17, 2024
  12. DrahtBot added the label Needs rebase on Jun 17, 2024
  13. ryanofsky force-pushed on Jun 17, 2024
  14. DrahtBot removed the label Needs rebase on Jun 17, 2024
  15. DrahtBot added the label Needs rebase on Jun 25, 2024
  16. in src/validation.h:556 in cdc44e878a outdated
    551+    {
    552+        return m_validity;
    553+    }
    554+
    555+    //! Return true if chainstate fully validated or is assumed valid.
    556+    bool IsValid() const EXCLUSIVE_LOCKS_REQUIRED(::cs_main)
    


    fjahr commented at 9:49 am on July 15, 2024:
    nit: In other places similar checks are inlined without a helper method. That might be actually a bit more clear as well.

    ryanofsky commented at 5:43 pm on July 18, 2024:

    re: #30214 (review)

    nit: In other places similar checks are inlined without a helper method. That might be actually a bit more clear as well.

    Good point, at least at first glance this looks reasonable to inline.


    ryanofsky commented at 9:48 pm on July 19, 2024:

    re: #30214 (review)

    nit: In other places similar checks are inlined without a helper method. That might be actually a bit more clear as well.

    You’re right, that is clearer. Updated different places where this was used.

  17. in src/test/validation_chainstatemanager_tests.cpp:32 in ce02d68008 outdated
    28@@ -29,6 +29,17 @@ using node::BlockManager;
    29 using node::KernelNotifications;
    30 using node::SnapshotMetadata;
    31 
    32+static std::vector<Chainstate*> GetAll(const ChainstateManager& chainman)
    


    fjahr commented at 10:13 am on July 15, 2024:
    Why do you add this function around here instead of using m_chainstates as well?

    ryanofsky commented at 5:43 pm on July 18, 2024:

    Why do you add this function around here instead of using m_chainstates as well?

    I need to double-check but IIRC, the issue is that m_chainstates requires a lock to access, and it would require larger changes and more verbosity in the tests if they had to be changed to access m_chainstates directly. Will follow up and add a comment describing purpose of this function.


    ryanofsky commented at 9:47 pm on July 19, 2024:

    re: #30214 (review)

    Why do you add this function around here instead of using m_chainstates as well?

    Dropped this function now, it was not hard to get rid of.

  18. in src/validation.h:571 in 493d3844cf outdated
    551     //! @sa ChainstateRole
    552-    ChainstateRole GetRole() const EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
    553+    kernel::ChainstateRole GetRole() const EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
    554+
    555+    //! Return whether chain is fully validated, assumed valid, or invalid.
    556+    ChainValidity Validity() const EXCLUSIVE_LOCKS_REQUIRED(::cs_main)
    


    fjahr commented at 10:51 am on July 15, 2024:
    Not sure why we need this when in another commit we remove GetAll() and instead use m_chainstates. Seems kind of inconsistent but maybe I am missing something.

    ryanofsky commented at 5:43 pm on July 18, 2024:

    re: #30214 (review)

    Not sure why we need this when in another commit we remove GetAll() and instead use m_chainstates. Seems kind of inconsistent but maybe I am missing something.

    Is this comment about the Validity() method? This is just supposed to be a public accessor for code that cannot access the private m_validity member/

  19. fjahr commented at 11:11 am on July 15, 2024: contributor

    Approach ACK

    I think pretty much everything here is a worth while improvement and should set up the following improvements well. I am not so deep into the current state of the kernel work though, so I can not comment on that part, for example the newly introduced typing file. I guess @TheCharlatan could give his nod to that here.

    It removes uses of the terms “active chainstate,” “usable chainstate,” “disabled chainstate,” “ibd chainstate,” and “snapshot chainstate” which are confusing for various reasons.

    I think this is a bit confusing because it sounds like you are getting rid of this completely but it’s actually just avoided where you touch the code, right? Cleaning the terms up everywhere can be a follow-up, I would like to try to rewrite the design doc a bit more anyway. But I think (selfishly) would like to break it up into design and usage docs because IMO that makes it way more manageable, i.e. the last commit of #29553.

  20. ryanofsky commented at 6:41 pm on July 18, 2024: contributor

    re: #30214#pullrequestreview-2177291599

    Thanks for reviewing! Will rebase and try to address comments. I’m especially interested if you have ideas on splitting up documentation and code changes into separate PRs.

    I am not so deep into the current state of the kernel work though, so I can not comment on that part, for example the newly introduced typing file. I guess @TheCharlatan could give his nod to that here.

    Sure, for background kernel/types.h is analogous to node/types.h, wallet/types.h, and common/types.h. The types files are used when libraries want to define shared struct and enum types that can be used by code NOT directly depending on the library. For example the gui code uses the wallet::isminetype enum type, but the gui should not depend on the wallet library, so the enum is defined in a shared header wallet/types.h. In this PR, wallet code uses the kernel::ChainstateRole struct type, but the wallet code should not depend on the kernel library, so the struct is defined in a shared header kernel/types.h. This pattern started with the wallet library, but has been useful for node and common libraries as well, so it is natural to extend to the kernel.

    A more stricter separation between libraries would be possible with a more complicated structure, but having shared types files has been a reasonable way to separate the libraries while having a 1 library = 1 directory = 1 namespace structure.

    It removes uses of the terms “active chainstate,” “usable chainstate,” “disabled chainstate,” “ibd chainstate,” and “snapshot chainstate” which are confusing for various reasons.

    I think this is a bit confusing because it sounds like you are getting rid of this completely but it’s actually just avoided where you touch the code, right? Cleaning the terms up everywhere can be a follow-up, I would like to try to rewrite the design doc a bit more anyway. But I think (selfishly) would like to break it up into design and usage docs because IMO that makes it way more manageable, i.e. the last commit of #29553.

    Yeah this description was written aspirationally before I implemented this PR and I did not get around to removing all the old language. I’d be interested if you have specific ideas on how to break this PR up better. I definitely think it could make sense to move documentation / comment changes to a separate PR, and I hope #29553 can be reviewed more and merged soon, before this PR. (This is a good reminder for me to re-ack #29553)

  21. ryanofsky force-pushed on Jul 19, 2024
  22. ryanofsky commented at 9:55 pm on July 19, 2024: contributor
    Rebased 493d3844cfc8628fd63184fa7a657126d9b5dc95 -> 2c1d8b426fa1832fa464c12f3cae7e2976584430 (pr/cstate.4 -> pr/cstate.5, compare) due to conflicts with various prs including #30267, #30388, #30395, #29996, #30406, #30320. Also made suggested simplifications. Rebased 2c1d8b426fa1832fa464c12f3cae7e2976584430 -> 4c995842e3689974364c8c82ee220bfaa30f7f79 (pr/cstate.5 -> pr/cstate.6, compare) due to conflict with #30111 Rebased 4c995842e3689974364c8c82ee220bfaa30f7f79 -> 7fd7960389254c7a2ba50614e1cbdf8911ef6a7d (pr/cstate.6 -> pr/cstate.7, compare) due to conflict with #29656
  23. DrahtBot removed the label Needs rebase on Jul 19, 2024
  24. DrahtBot added the label Needs rebase on Jul 24, 2024
  25. ryanofsky force-pushed on Jul 24, 2024
  26. DrahtBot removed the label Needs rebase on Jul 24, 2024
  27. DrahtBot added the label Needs rebase on Aug 5, 2024
  28. ryanofsky force-pushed on Aug 7, 2024
  29. DrahtBot removed the label Needs rebase on Aug 7, 2024
  30. DrahtBot added the label Needs rebase on Aug 9, 2024
  31. fjahr commented at 8:20 pm on August 25, 2024: contributor

    Thanks for addressing my feedback @ryanofsky ! I hope we can merge #29553 soon and then I will get back to this with another full review. But it does look very good to me already.

    I’m especially interested if you have ideas on splitting up documentation and code changes into separate PRs.

    I’m unsure if it’s really needed because I find clean refactoring commits like those here can be reviewed quicker than other changes, so I would be fine to keep it as is just based on volume of the change. But I did think about it and if that is feasible maybe splitting along the lines of everything that touches ChainstateRole and the rest (in whatever order you think works better) could be an idea. If you feel like splitting it that’s fine for me but I think I would prefer to keep it as one PR as I have already passed everything so I am biased towards fewer fundamental changes ;)

    Ah, and since the first is a fix you could just open that as a separate PR now. That should be an easy review and merge that doesn’t conflict with anything else. The docs changes it feels like it’s better to keep them along the code since you need to get your head into the topic either way. I often feel like reviewing docs and code side by side challenges your understanding and might lead to higher quality review.

  32. TheCharlatan commented at 9:09 pm on October 9, 2024: contributor
    Concept ACK
  33. in src/validation.cpp:3524 in 7fd7960389 outdated
    3489-    if (WITH_LOCK(::cs_main, return m_disabled)) {
    3490-        LogPrintf("m_disabled is set - this chainstate should not be in operation. "
    3491+    // Belt-and-suspenders check that we aren't attempting to advance the
    3492+    // chainstate past the target block.
    3493+    if (WITH_LOCK(::cs_main, return m_target_utxohash)) {
    3494+        LogPrintf("m_target_utxohash is set - this chainstate should not be in operation. "
    


    TheCharlatan commented at 12:57 pm on March 9, 2025:
    I’ve always been a bit puzzled of this check. Is it really required? If so, why don’t we need to check it on every iteration of the do loop further down? What if it is actually reached, will validation on that chain just always be stuck in an unrecoverable state? Wouldn’t a fatal error be more appropriate then?

    ryanofsky commented at 3:53 pm on March 10, 2025:

    #30214 (review)

    I’ve always been a bit puzzled of this check. Is it really required? If so, why don’t we need to check it on every iteration of the do loop further down? What if it is actually reached, will validation on that chain just always be stuck in an unrecoverable state? Wouldn’t a fatal error be more appropriate then?

    This is labeled as “Belt-and-suspenders check” so it should be safe to assume that it is not required. Sometimes it is useful to have checks for unexpected states that should never happen, so it is possible to detect bugs that may otherwise go unnoticed, or to provide better debug information if the unexpected state does lead to a problem later.

    Sometimes unexpected states should trigger asserts or fatal errors, but sometimes they don’t need to be fatal and can just trigger warnings like this one. In this case the unexpected state may cause the current snapshot to never get fully validated (if the background chainstate is disabled), but since the node will still be functional, and the user might be able to recover from it by loading another snapshot, maybe it is not worth causing a fatal error. Also it might make sense to have this check at the top of this function not in the loop below because it is checking a precondition: looking for an invalid state set by previous code, not invalid state set by this code. It may be possible to improve this by adding other checks, though. And it would probably be good to use if (Assume(...)) here so if this condition is reached it will be detected in fuzz and debug builds.

  34. TheCharlatan commented at 4:01 pm on March 12, 2025: contributor
    Reading through commit a35a2a05887202de8c49727b6ea9166bf5f1a6f2 I am asking myself if with the addition of m_target_blockhash, the state of m_validity can just be computed on the fly without having to lug around and assert its state. For tracking the change in validity to INVALID once the tip reaches the target block the historical chainstate, could the presence of the m_target_utxo_hash (similar to how the belt-and-suspenders check is done later on) be checked instead? And for the current chainstate could the ASSUMED_VALID state be tracked with the presence of the snapshot base block and then the transition to the VALIDATED state by checking if the blocks up to the snapshot base block have BLOCK_VALID_MASK?
  35. ryanofsky commented at 4:42 pm on March 12, 2025: contributor

    Reading through commit a35a2a0 I am asking myself if with the addition of m_target_blockhash, the state of m_validity can just be computed on the fly without having to lug around and assert its state.

    I think it could probably work but two reasons not to do it might be:

    (1) to avoid performance regressions in parts of the code that are assuming it is possible to determine whether a chainstate is valid or valid instantly without without iterating over the chain (2) to make intent explicit and make code general and future proof

    To explain (2) maybe it is safe right now to assume that any chain that is not invalid and not fully validated must be assumed-valid. But it’s not clear that this should always be the case. Maybe a potential libbitcoinkernel application could be a fork-monitoring client that tracks multiple chains for some other reason and doesn’t want to mark any of them as assume-valid. Having an explicit assume-valid state could make purpose of the chain more explicit and could let code dealing with multiple chainstates use chains based on how they are intended to be used without trying to infer intent from other attributes.

    Maybe what you are are suggesting could be an improvement, through, and it’s been a while since I thought about this and reasons I listed above are speculative. Could try implementing the suggestion and see if it simplifies the code.

  36. TheCharlatan commented at 11:18 am on March 13, 2025: contributor

    Maybe what you are are suggesting could be an improvement, through, and it’s been a while since I thought about this and reasons I listed above are speculative. Could try implementing the suggestion and see if it simplifies the code.

    This is what I had in mind: https://github.com/TheCharlatan/bitcoin/commit/9c89548c670e7b807a969e13408ff03df15143a2 (also rebased this branch to get cmake) . It is very broken in its current state though. There are a few problems with it, but I guess the fundamental problem is that the current application of a snapshot is done in relation to an existing chain and its state, and not against the block index and its state. E.g. the check for not loading a snapshot with less work only looks at the chain state, not the block index. If the chain is behind the block index, but the block index already has more fully-validated work, the snapshot application passes, but my new function would think it is already fully validated. The tests heavily rely on the current behaviour, where the best work chain is taken from the existing chainstate.

    Thinking outside of this proposed change, is this really the desired behaviour though? Shouldn’t the block index be our fundamental source of truth? I can think of a few scenarios outside of the tests where the chain being behind the fully-validated block index can be encountered, for example during a chainstate reindex, but I don’t think they are something where snapshots should be supported.

    (2) to make intent explicit and make code general and future proof

    I would argue that adding this validity state could also make it more complicated to implement, or compose from existing functionality, parallel validation, or similar IBD speedups that might partition the chain even more.

  37. ryanofsky force-pushed on Mar 13, 2025
  38. ryanofsky commented at 5:14 pm on March 13, 2025: contributor

    Rebased 7fd7960389254c7a2ba50614e1cbdf8911ef6a7d -> 58f62c2221d82e22e7c2d9cccedd58e4390ace5a (pr/cstate.7 -> pr/cstate.8, compare) due to various conflicts. Updated 58f62c2221d82e22e7c2d9cccedd58e4390ace5a -> d09b82156ced5d543d08226e8d7b8fda2e0ec532 (pr/cstate.8 -> pr/cstate.9, compare) to fix “error: private field ’m_is_memory’ is not used” https://github.com/bitcoin/bitcoin/pull/30214/checks?check_run_id=38726768015


    re: #30214 (comment)

    Shouldn’t the block index be our fundamental source of truth?

    I’m open to this idea but not sure the block index has enough information in this case to provide information efficiently since you can’t know if a particular block is fully validated without iterating over the whole chain back to genesis. Also, if mulltiple chainstates might exist for different purposes, it may be useful to store information about the purpose of each chainstate and what assumptions it was created under in the chainstate itself not in blocks that it is pointing to. I think the Chainstate::m_validity member in this PR does both of these things in a pretty straightforward way, but if we want to replace it with something better, hopefully that should not be too hard since it is a private member, so the public Chainstate interface would not have change.

    One thing this PR is trying to accomplish is making it possible determine status of Chainstate objects independently, without having to look at ChainstateManager objects and other Chainstate objects. But it should be fine for Chainstate methods to use BlockIndex flags to return their status.

    I would argue that adding this validity state could also make it more complicated to implement, or compose from existing functionality, parallel validation, or similar IBD speedups that might partition the chain even more.

    I think if multiple chainstates will exist for different purposes, the chainstate objects will need to have some field describing their purpose and controlling their behavior. I can easily see the m_validity field not being general enough to do that, but it’s more easy for me to imagine it being replaced by another field than being eliminated entirely. But again I think the fact that it’s a private field should make it easier to change or eliminate.

    Should also point out the that m_validity member added here is not really new. It’s just replacing the previous m_disabled member with something more positive and less ambiguous.

  39. ryanofsky force-pushed on Mar 13, 2025
  40. DrahtBot added the label CI failed on Mar 13, 2025
  41. DrahtBot commented at 5:34 pm on March 13, 2025: contributor

    🚧 At least one of the CI tasks failed. Debug: https://github.com/bitcoin/bitcoin/runs/38726767974

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  42. DrahtBot removed the label Needs rebase on Mar 13, 2025
  43. DrahtBot removed the label CI failed on Mar 13, 2025
  44. TheCharlatan commented at 8:39 pm on March 13, 2025: contributor

    I think if multiple chainstates will exist for different purposes, the chainstate objects will need to have some field describing their purpose and controlling their behavior.

    Yes, I think I have come to a similar conclusion, and I think the introduced reliance here on the m_target_blockhash and m_from_snapshot_blockhash are very good changes. Having some kind of “start” and “stop” blocks for a chainstate seems like something that is easy to understand and might also be useful beyond assumeutxo.

    since you can’t know if a particular block is fully validated without iterating over the whole chain back to genesis

    Isn’t that only true when iterating over the snapshot block or when you don’t know if you might come across it? My understanding is that is also what the docs say about BLOCK_VALID_CHAIN and BLOCK_VALID_SCRIPTS and is also checked in CheckBlockIndex.

    I think if multiple chainstates will exist for different purposes, the chainstate objects will need to have some field describing their purpose and controlling their behavior.

    I would also like to see a future with more diverse modes of validation, but I feel like we already have the ChainstateManager class for controlling the behaviour of a Chainstate. In my mind a Chainstate should not have to know all the details of what it is being used for, or have rich descriptive state about its own behaviour. I feel like many of the self-describing behaviours, like passing the ChainstateRole through a notification, should actually be done by the ChainstateManager. I don’t think the changes in this PR go diametrically against that goal, but I am also not sure yet if the changes here really make it easier to eventually go into that direction.

  45. ryanofsky commented at 11:08 pm on March 13, 2025: contributor

    I think I agree with all that, and think chainstate objects should just contain information for chainstatemanager to deal with them, not to have “rich” data or application specific data. I also think depending on what you want to do with chainstates, block status flags may provide enough information, but they are not equivalent to the old Chainstate::m_disabled field or the new Chainstate::m_validity field replacing it, since block flags only tell you about validity back to the last snapshot block, not back to genesis block.

    I guess I’m not really sure what problems you see with chainstates knowing their role or validity, but it is true they continue to know some of that information after this PR just like they knew it before. If you want to move that information elsewhere like chainstate manager or derive it from block validity flags I don’t think this PR should get in the way of that, and it probably even helps a little because it narrows the definition of ChainstateRole to not indicate assumeutxo status but only indicate basic information about whether the chainstate is validated and what block it is targeting. The PR should also help more if you are interested in using chainstate objects for applications other than assumeutxo, because it is dropping assumeutxo-specific assumptions many places in validation and networking code and adding a generic ChainstateManager::m_chainstates vector that can hold arbitrary chainstates.

    I’d also point out that the Chainstate::m_validity field only gets written in 2 places: in the Chainstate constructor and in the ChainstateManager::MaybeCompleteSnapshotValidation method, so I don’t think it adds much maintenance burden even if you think other fields might be more useful.

  46. DrahtBot added the label Needs rebase on Mar 14, 2025
  47. in src/validation.cpp:6112 in 22bfd948f0 outdated
    6146     }
    6147     const int snapshot_tip_height = this->ActiveHeight();
    6148     const int snapshot_base_height = *Assert(this->GetSnapshotBaseHeight());
    6149     const CBlockIndex& index_new = *Assert(m_ibd_chainstate->m_chain.Tip());
    6150-
    6151-    if (index_new.nHeight < snapshot_base_height) {
    


    TheCharlatan commented at 12:52 pm on March 14, 2025:
    In commit 22bfd948f0d45083748724a304fa2c8482464028: Just a note: The removal of this particular check in this commit seems a bit random with all the following kind of redundant assertions, but the code does get cleaned up in later commits.

    ryanofsky commented at 4:06 pm on March 17, 2025:

    re: #30214 (review)

    In commit 22bfd94: Just a note: The removal of this particular check in this commit seems a bit random with all the following kind of redundant assertions, but the code does get cleaned up in later commits.

    We might be noticing different things here so let me know if you have suggestions for improving this, but I don’t think this change is random. This is replacing the index_new.nHeight < snapshot_base_height check with a ReachedTarget() check, and combining the return SKIPPED it triggers with the earlier one with a more complete comment about when SKIPPED is returned.

    The other changes to this function that happen in later commit don’t really affect these things, they just pass new parameters into this function so it can be more independent of ChainstateManager.


    TheCharlatan commented at 3:12 pm on March 27, 2025:

    Missed this in my previous round, but is this not changing behaviour significantly? As far as I can tell we now just check that the target blockhash is reached at some point, while this check that is removed here only skips blocks up to the snapshot height and tries to complete validation with the rest once the snapshot base height is reached. What if the snapshot block is never reached? Does background validation then just continue on indefinitely?

    I might be missing something, but the way I am reading this we also don’t have any tests for exercising what previously was previously the BASE_BLOCKHASH_MISMATCH failure case.


    ryanofsky commented at 3:22 pm on March 27, 2025:

    re: #30214 (review)

    Will look more closely, but I think both the old code and new code are assuming that the chainstate tip is an ancestor of the chainstate target block, or points directly to it. If that’s not true I think you are right that the behavior of this function before and after this change would be different, but a lot of other things would be broken too.


    ryanofsky commented at 4:03 pm on March 27, 2025:

    re: #30214 (comment)

    What if the snapshot block is never reached? Does background validation then just continue on indefinitely?

    I think you are right and there is an improvement that can be made here. The case where the validation chainstate tip somehow points to a higher block than than target block (snapshot block) is supposed to be impossible and should never occur. That case used to return BASE_BLOCKHASH_MISMATCH, but was changed here to trigger an Assert instead, so the code would be simpler and so we would know if a bug occurred. However, because the Assert is place on line 6173 below the SKIPPED check on line 6126 I think the assert would never trigger, and SKIPPED would always be returned instead.

    There are two callers MaybeCompleteSnapshotValidation, one is in ConnectTip which was ignoring the return value so would not be affected, but the other call in LoadChainstate would be affected and could cause the error to be ignored, instead of causing a startup failure.

    I think it would probably be good to add an assert to the top of this function to make sure the chain tip is really is an ancestor of the target block, to document the assumption make make it for likely for bugs to be caught.


    TheCharlatan commented at 9:40 pm on March 27, 2025:

    I find it a bit hard to check this, but wouldn’t the assertion then be triggerable if an alternative block at that height is received? How about adding a ReachedTargetHeight, and use that in the if block? Something like:

     0diff --git a/src/validation.cpp b/src/validation.cpp
     1index 2d19ce45e5..da9efae00f 100644
     2--- a/src/validation.cpp
     3+++ b/src/validation.cpp
     4@@ -6091 +6091 @@ SnapshotCompletionResult ChainstateManager::MaybeCompleteSnapshotValidation(Chai
     5-            !validated_chainstate.ReachedTarget()) {
     6+            !validated_chainstate.ReachedTargetHeight()) {
     7@@ -6096,0 +6097 @@ SnapshotCompletionResult ChainstateManager::MaybeCompleteSnapshotValidation(Chai
     8+    if (!validated_chainstate.ReachedTarget()) return SnapshotCompletionResult::BASE_BLOCKHASH_MISMATCH;
     9diff --git a/src/validation.h b/src/validation.h
    10index 1b872c5af6..2d66995440 100644
    11--- a/src/validation.h
    12+++ b/src/validation.h
    13@@ -643,0 +644,7 @@ public:
    14+    //! Return true if chainstate reached target block height.
    15+    bool ReachedTargetHeight() const EXCLUSIVE_LOCKS_REQUIRED(::cs_main)
    16+    {
    17+        const CBlockIndex* target_block{TargetBlock()};
    18+        return target_block && target_block->nHeight <= m_chain.Height();
    19+    }
    20+
    21@@ -865,0 +873,2 @@ enum class SnapshotCompletionResult {
    22+    BASE_BLOCKHASH_MISMATCH,
    
  48. in src/validation.cpp:6157 in 22bfd948f0 outdated
    6181@@ -6161,14 +6182,7 @@ SnapshotCompletionResult ChainstateManager::MaybeCompleteSnapshotValidation()
    6182         GetNotifications().fatalError(user_error);
    6183     };
    6184 
    6185-    if (index_new.GetBlockHash() != snapshot_blockhash) {
    6186-        LogPrintf("[snapshot] supposed base block %s does not match the "
    6187-          "snapshot base block %s (height %d). Snapshot is not valid.\n",
    6188-          index_new.ToString(), snapshot_blockhash.ToString(), snapshot_base_height);
    6189-        handle_invalid_snapshot();
    6190-        return SnapshotCompletionResult::BASE_BLOCKHASH_MISMATCH;
    


    TheCharlatan commented at 12:56 pm on March 14, 2025:
    I think this SnapshotcompletionResult variant can be removed, no?

    ryanofsky commented at 4:04 pm on March 17, 2025:

    re: #30214 (review)

    I think this SnapshotcompletionResult variant can be removed, no?

    Good catch, removed.

  49. in src/validation.h:1142 in 9853c8384e outdated
    1138@@ -1136,7 +1139,7 @@ class ChainstateManager
    1139     Chainstate* HistoricalChainstate() const EXCLUSIVE_LOCKS_REQUIRED(GetMutex())
    1140     {
    1141         auto* cs{m_ibd_chainstate.get()};
    1142-        return IsUsable(cs) && cs->m_target_blockhash ? cs : nullptr;
    1143+        return cs && cs->Validity() != ChainValidity::INVALID && cs->m_target_blockhash && !cs->m_target_utxohash ? cs : nullptr;
    


    TheCharlatan commented at 1:37 pm on March 14, 2025:
    In commit 9853c8384e1df9ad14c2801e780cc189ac20c6d0: Is the validity check required? AFAICT the historic chainstate is never invalid.

    ryanofsky commented at 6:35 pm on March 17, 2025:

    re: #30214 (review)

    In commit 9853c83: Is the validity check required? AFAICT the historic chainstate is never invalid.

    That’s true. This anticipates a later change but there is no reason to add that complexity here. Dropped this condition.

  50. in src/kernel/types.h:1 in 7e9ee9345a outdated
    0@@ -0,0 +1,30 @@
    1+// Copyright (c) 2024 The Bitcoin Core developers
    


    TheCharlatan commented at 4:52 pm on March 14, 2025:
    Nit: Are we still doing -present now? I thought I saw some discussion about that recently, but can’t find it anymore.

    ryanofsky commented at 7:04 pm on March 17, 2025:

    re: #30214 (review)

    Nit: Are we still doing -present now? I thought I saw some discussion about that recently, but can’t find it anymore.

    The change to “present” never made sense to me because I think the only thing “present” could mean is “when this line was written”, not what we might want it to mean like “whenever the code here was last modified” or “whenever the code here was published.”

    Since “present” just seems like a more ambiguous way of writing the current year, I’d be inclined to stick with a more a conventional approach. But I’m happy to change if the project wants to make a wider switch to “present” or someone can explain why writing “present” is better than writing the current year and never updating it. I couldn’t find much discussion about this topic, but https://opensource.stackexchange.com/questions/4885/copyright-until-the-present-year-in-bsd-license seems to back up my suspicions here.


    TheCharlatan commented at 12:53 pm on March 27, 2025:
    I really hate copyright discussions, but I think it would good to at least apply it consistently in the project. Definitely not something that should be hashed out on this PR though.
  51. in src/kernel/types.h:20 in 7e9ee9345a outdated
    15+
    16+namespace kernel {
    17+//! Information about chainstate that notifications are sent from.
    18+struct ChainstateRole {
    19+    //! Whether this is a notification from chainstate that's been fully
    20+    //! validated starting from the genesis block. False if is from an
    


    TheCharlatan commented at 4:53 pm on March 14, 2025:
    Nit: s/if is/if it is/. I think there are some missing articles around here too.

    ryanofsky commented at 7:16 pm on March 17, 2025:

    re: #30214 (review)

    Nit: s/if is/if it is/. I think there are some missing articles around here too.

    Thanks, fixed up various comments in this header.

  52. in src/kernel/types.h:24 in 7e9ee9345a outdated
    19+    //! Whether this is a notification from chainstate that's been fully
    20+    //! validated starting from the genesis block. False if is from an
    21+    //! assumeutxo snapshot chainstate that has not been validated yet.
    22+    bool validated{true};
    23+
    24+    //! Whether this a historical chainstate downloading old blocks to validate
    


    TheCharlatan commented at 4:58 pm on March 14, 2025:
    Nit: s/this a/this is a/

    ryanofsky commented at 7:16 pm on March 17, 2025:

    re: #30214 (review)

    Nit: s/this a/this is a/

    Thanks, fixed

  53. in src/validation.cpp:3930 in 5e22fdcdb2 outdated
    3926@@ -3927,7 +3927,7 @@ void ChainstateManager::ReceivedBlockTransactions(const CBlock& block, CBlockInd
    3927     // m_chain_tx_count value is not zero, assert that value is actually correct.
    3928     auto prev_tx_sum = [](CBlockIndex& block) { return block.nTx + (block.pprev ? block.pprev->m_chain_tx_count : 0); };
    3929     if (!Assume(pindexNew->m_chain_tx_count == 0 || pindexNew->m_chain_tx_count == prev_tx_sum(*pindexNew) ||
    3930-                pindexNew == CurrentChainstate().SnapshotBase())) {
    3931+                std::any_of(m_chainstates.begin(), m_chainstates.end(), [&](auto& cs) EXCLUSIVE_LOCKS_REQUIRED(cs_main) { return cs->SnapshotBase() == pindexNew; }))) {
    


    TheCharlatan commented at 8:07 pm on March 14, 2025:
    I don’t quite follow this change. What is wrong with using CurrentChainstate here?

    ryanofsky commented at 5:30 pm on March 18, 2025:

    In commit “refactor: Add ChainstateManager::m_chainstates member” (5e22fdcdb2665c0bb5e47c53719796096afe2dd1)

    re: #30214 (review)

    I don’t quite follow this change. What is wrong with using CurrentChainstate here?

    IMO chainstatemanager should generally just validate and update chainstates with incoming data and not rely on concept of a “current” chainstate. The concept of a current chainstate is a little vague and makes more sense at the appilcation level than this level. For example wallets will treat chainstates loaded from assumeutxo snapshots as the current chainstate while indexes ignore that chainstate and treat the fully validated background chainstate as the current one.

    More specifically in this case avoiding CurrentChainstate() makes this code more correct, and would avoid a failed Assume check and a warning if multiple snapshot chainstates were loaded, since loading a snapshot updates the snapshots block’s m_chain_tx_count value when when the previous block may have a 0 value. (We don’t currently support loading multiple snapshot chainstates now, but there is no reason we could not allow it in the future, for example to support loading a new snapshot after an older one has been loaded.)

  54. in src/validation.cpp:6510 in 58265b955a outdated
    6505@@ -6514,3 +6506,24 @@ std::optional<std::pair<const CBlockIndex*, const CBlockIndex*>> ChainstateManag
    6506     if (!chainstate) return {};
    6507     return std::make_pair(chainstate->m_chain.Tip(), chainstate->TargetBlock());
    6508 }
    6509+
    6510+util::Result<void> ChainstateManager::ActivateBestChains() const
    


    TheCharlatan commented at 8:36 pm on March 14, 2025:
    Why is this not re-used for ProcessNewBlock?

    ryanofsky commented at 5:30 pm on March 18, 2025:

    In commit “refactor: Add ChainstateManager::ActivateBestChains() method” (58265b955a3d8622f22958943f79409129d58b36)

    re: #30214 (review)

    Why is this not re-used for ProcessNewBlock?

    ProcessNewBlock() passes a non-null block pointer to ActivateBestChain so it can’t directly reuse the new method. It might make sense for generalize the method, but the main goal here to get rid of unnecessary uses of ChainstateManager::GetAll() by outside code, so it’s a bit outside the scope of what this particular commit is trying to do.

  55. in src/validation.cpp:6519 in 58265b955a outdated
    6514+    // the relevant pointers before the ABC call.
    6515+    std::vector<Chainstate*> chainstates;
    6516+    {
    6517+        LOCK(GetMutex());
    6518+        for (auto& chainstate : m_chainstates) {
    6519+            chainstates.emplace_back(chainstate.get());
    


    TheCharlatan commented at 8:38 pm on March 14, 2025:
    Shouldn’t this skip over the historical chainstate if m_target_utxohash is set?

    ryanofsky commented at 5:30 pm on March 18, 2025:

    In commit “refactor: Add ChainstateManager::ActivateBestChains() method” (58265b955a3d8622f22958943f79409129d58b36)

    re: #30214 (review)

    Shouldn’t this skip over the historical chainstate if m_target_utxohash is set?

    Yes good catch. Updated this to be equivalent.

  56. in src/node/blockstorage.cpp:317 in ca48478fbf outdated
    309@@ -310,8 +310,9 @@ void BlockManager::FindFilesToPrune(
    310 {
    311     LOCK2(cs_main, cs_LastBlockFile);
    312     // Distribute our -prune budget over all chainstates.
    313+    const int num_chainstates{chainman.HistoricalChainstate() ? 2 : 1};
    


    TheCharlatan commented at 8:40 pm on March 14, 2025:
    I’m surprised this is not just taking the m_chainstates size. Why is that?

    ryanofsky commented at 5:30 pm on March 18, 2025:

    In commit “refactor: Delete ChainstateManager::GetAll() method” (ca48478fbfcc83cc6f8c928e57800707fe0c3b51)

    re: #30214 (review)

    I’m surprised this is not just taking the m_chainstates size. Why is that?

    Wrote a better comment since existing comment wasn’t really explaining what was happening here. It’s possible this code might need to be updated to customize pruning behavior if more chainstates are added in the future. But I think default behavior should just be to reserve block storage for the two chainstates we know about: the most-work chainstate and a historical chainstate, not to reserve storage for an arbitrary number of chainstates which may or may not need it.

  57. in src/test/validation_chainstatemanager_tests.cpp:372 in ca48478fbf outdated
    368@@ -365,15 +369,17 @@ struct SnapshotTestSetup : TestChain100Setup {
    369 
    370         BOOST_TEST_MESSAGE("Simulating node restart");
    371         {
    372-            for (Chainstate* cs : chainman.GetAll()) {
    373-                LOCK(::cs_main);
    374-                cs->ForceFlushStateToDisk();
    375+            LOCK(::cs_main);
    


    TheCharlatan commented at 8:52 pm on March 14, 2025:
    I’m a bit confused when it is appropriate to use ::cs_main and when chainman.GetMutex(). Is there a rule to that?

    ryanofsky commented at 5:30 pm on March 18, 2025:

    In commit “refactor: Delete ChainstateManager::GetAll() method” (ca48478fbfcc83cc6f8c928e57800707fe0c3b51)

    re: #30214 (review)

    I’m a bit confused when it is appropriate to use ::cs_main and when chainman.GetMutex(). Is there a rule to that?

    I don’t really make any distinction between the two but probably it’s better to use the latter so switched to that in these tests

  58. TheCharlatan commented at 9:24 pm on March 14, 2025: contributor

    Approach ACK

    I am not yet convinced of the last three commits changing the code (5e22fdcdb2665c0bb5e47c53719796096afe2dd1, 58265b955a3d8622f22958943f79409129d58b36, ca48478fbfcc83cc6f8c928e57800707fe0c3b51), maybe they could do with some more motivation in the commit message? I guess the last of the commits ca48478fbfcc83cc6f8c928e57800707fe0c3b51 is kind of the result of the preceding two and it is nice not to have to construct the vector on the fly for every GetAll() call anymore.

    There is an opportunity for also simplifying the pruning code a bit, I made a commit here https://github.com/TheCharlatan/bitcoin/commit/fe76cc3a04c94ca4fdbb9550d478909f3b13d8ec that removes the chainman arguments from the manual pruning function. I think it would fit in well with the patch set here.

  59. ryanofsky force-pushed on Mar 18, 2025
  60. ryanofsky commented at 9:26 pm on March 18, 2025: contributor

    Rebased d09b82156ced5d543d08226e8d7b8fda2e0ec532 -> 467528960689c2913c101ef75bc833e8f04bd0f3 (pr/cstate.9 -> pr/cstate.10, compare) due to conflict 31961 and implementing various suggestions.

    re: #30214#pullrequestreview-2685423528

    Thanks for the review! I added comments to the m_chainstates commits to motivate them better, also added your simplification fe76cc3a04c94ca4fdbb9550d478909f3b13d8ec and implemented other suggestions below.

  61. in src/test/validation_chainstatemanager_tests.cpp:607 in df99f19d81 outdated
    599@@ -600,9 +600,14 @@ BOOST_FIXTURE_TEST_CASE(chainstatemanager_snapshot_init, SnapshotTestSetup)
    600     // This call reinitializes the chainstates.
    601     this->LoadVerifyActivateChainstate();
    602 
    603+    std::vector<Chainstate*> chainstates;
    604     {
    605         LOCK(chainman_restarted.GetMutex());
    606-        BOOST_CHECK_EQUAL(chainman_restarted.GetAll().size(), 2);
    607+        chainstates = chainman_restarted.GetAll();
    608+        BOOST_CHECK_EQUAL(chainstates.size(), 2);
    


    mzumsande commented at 7:17 pm on March 19, 2025:

    df99f19d81537ecfbf66f6662bc9814b594ef8d2: I was trying to understand why this fix is correct / the height suddenly changes from 109 to 110, and I find the entire chainstatemanager_snapshot_init subtest very confusing: First, there is a comment saying “Test that simulating a shutdown (…) we end up with a single chainstate that is at tip”, followed by a check that there are still 2 chainstates - so the test contradicts itself.

    Then the test calls the test-only function LoadVerifyActivateChainstate which only calls ABC for the active chainstate, instead of doing it for all chainstates like its counterpart in ImportBlocks() during actual init does - why is this using an artificial init process that differs from the way it’s done in production?

    Because of that, only during mineBlocks() (as a result of ProcessNewBlock() calling ABC for both chainstates) the block that was previously disconnected in a hacky way (by simply calling DisconnectBlock() instead of e.g. invalidating it) is being connected again, and the background snapshot is verified, which is a surprising side effects of just trying to connect some blocks on the tip - and the actual reason why the height is suddenly 110. I think that this deserves an explanation, but more importantly, I don’t see the point / strategy of the subtest in general - what situation is it actually trying to test?

    I realize that this isn’t really what this PR is about, so maybe there could be a separate PR to clean this test up properly, especially if the other subtests (which I didn’t check) would have similar problems?


    ryanofsky commented at 12:53 pm on March 20, 2025:

    re: #30214 (review)

    I find the entire chainstatemanager_snapshot_init subtest very confusing

    Yes. And thanks for digging into the test. I added comments to try to make it less confusing. For reference the test was added in e4d7995 and seems basically unchanged since when it was written.

    First, there is a comment saying “Test that simulating a shutdown (…) we end up with a single chainstate that is at tip”, followed by a check that there are still 2 chainstates - so the test contradicts itself.

    Good catch, the comment is wrong. If you search for “single chainstate” you can see the same comment is copied from 2 other tests in this file and is ok in those places. Fixed in latest push.

    Then the test calls the test-only function LoadVerifyActivateChainstate which only calls ABC for the active chainstate, instead of doing it for all chainstates like its counterpart in ImportBlocks() during actual init does - why is this using an artificial init process that differs from the way it’s done in production?

    I don’t know the answer to this. The unit tests do a lot of artificial things and I’d agree it is usually better when we can remove artificial code so tests are more realistic and there is less code to maintain.

    Because of that, only during mineBlocks() (as a result of ProcessNewBlock() calling ABC for both chainstates) the block that was previously disconnected in a hacky way (by simply calling DisconnectBlock() instead of e.g. invalidating it) is being connected again, and the background snapshot is verified, which is a surprising side effects of just trying to connect some blocks on the tip - and the actual reason why the height is suddenly 110.

    Added a comment about that instead of changing it so this is still a focused bugfix. Agree if test were more realistic it might be clearer.

    I think that this deserves an explanation, but more importantly, I don’t see the point / strategy of the subtest in general - what situation is it actually trying to test?

    I think the main thing it is trying to test is BOOST_CHECK_EQUAL(chainstates.size(), 2); i.e. “that snapshot chainstates initialize properly when found on disk” and that the chainstates function as expected when more blocks are mined. Expanded the comment at the top of the test, too.

    I realize that this isn’t really what this PR is about, so maybe there could be a separate PR to clean this test up properly, especially if the other subtests (which I didn’t check) would have similar problems?

    I think it would be nice if tests were more realistic, and the main thing that annoys me about these tests that is instead of using literal numbers and expressions, the tests like to declare lots of helper variables, so there are many similarly named variables it it is difficult to keep track of what they all mean and how they are used. It probably would be nice to clean these tests up and make them simpler, though not a priority.

  62. in src/validation.h:616 in a1fd5a8862 outdated
    605@@ -598,12 +606,16 @@ class Chainstate
    606     //! target the most-work, valid block.
    607     std::optional<uint256> m_target_blockhash GUARDED_BY(::cs_main);
    608 
    609+    //! Hash of the UTXO set at the target block, computed when the chainstate
    610+    //! reaches the target block, and null before then.
    611+    std::optional<AssumeutxoHash> m_target_utxohash GUARDED_BY(::cs_main);
    


    mzumsande commented at 8:55 pm on March 19, 2025:

    a1fd5a886252ebd813ee69029dcf17de6883e047: Why is m_target_utxohash a std::optional<AssumeutxoHash> if it’s only ever used as a bool? The hash, once set , is never read anywhere as far as I can see.

    nit: it should be m_target_utxohash instead of m_target_blockhash in the commit message.


    TheCharlatan commented at 9:44 am on March 20, 2025:
    I was thinking that this might be interesting to add to the getchainstates rpc call.

    ryanofsky commented at 2:12 pm on March 20, 2025:

    re: #30214 (review)

    Thanks, fixed commit message. Ultimately I think it would be best to drop the m_target_utxohash member and just check the ReachedTarget() condition everywhere instead of using m_target_utxohash.has_value(). But this needs to be done carefully, and it might require getting rid of some “belt and suspenders” checks in the current code which are technically unnecessary but give some confidence the code works correctly.

    This commit is a more of a pure refactoring and tries to keep code the working the way it does currently, but add transparency so it clearer why it works and what assumptions it is making. It’s trying to replace the opaque m_disabled bool which could indicate either (1) that the chainstate target is invalid or (2) that a chainstate is valid but no longer in use after its utxo hash was checked. Storing the utxo hash in the latter state seemed more natural representation than adding a bool utxohash_checked member, but I could switch to that if preferred.

  63. ryanofsky removed the label Needs rebase on Mar 19, 2025
  64. mzumsande commented at 9:42 pm on March 19, 2025: contributor
    Concept ACK only reviewed up to 199a2eb87c3ad9382f2fe4ae97524099224a7b04 so far.
  65. DrahtBot added the label Needs rebase on Mar 20, 2025
  66. test: Fix broken chainstatemanager_snapshot_init check
    The following test code never checked anything because the if statement was
    always false:
    
        if (cs != &chainman_restarted.ActiveChainstate()) {
            BOOST_CHECK_EQUAL(cs->m_chain.Height(), 109);
        }
    
    Also, the height of the background chainstate it was intending to check is 110,
    not 109. Fix both problems by rewriting the check.
    761e5675a9
  67. refactor: Add Chainstate::m_target_blockhash member
    Make Chainstate objects aware of what block they are targeting. This makes
    Chainstate objects more self contained, so it's possible to determine what
    blocks should be downloaded and connected to them without needing to query
    ChainstateManager or look at other Chainstate objects.
    
    The motivation for this change is to make code more readable, so it only
    requires knowing about Chainstate targets, not reasoning about the assumeutxo
    snapshot validation sequence. This change also enables simplifications to the
    ChainstateManager interface in subsequent commits, and could make it easier to
    implement new features like creating new Chainstate objects to generate UTXO
    snapshots or index UTXO data.
    a08d837377
  68. refactor: Add Chainstate m_validity and m_target_utxohash members
    Get rid of m_disabled/IsUsable members. Instead of marking chains disabled for
    different reasons, record chainstate validity and validation progress
    explicitly and use that information directly.
    050a7ff8a9
  69. refactor: Deduplicate Chainstate activation code
    Move duplicate code from ChainstateManager::ActivateSnapshot and
    ChainstateManager::ActivateExistingSnapshot methods to a new
    ChainstateManager::AddChainstate method.
    
    The "AddChainstate" method name doesn't mention snapshots even though it is
    only used to add snapshot chainstates now, because it becomes more generalized
    in a later commit in this PR ("refactor: Add ChainstateManager::m_chainstates
    member")
    bd13bda148
  70. refactor: Pass chainstate parameters to MaybeCompleteSnapshotValidation
    Remove hardcoded references to m_ibd_chainstate and m_snapshot_chainstate so
    MaybeCompleteSnapshotValidation function can be simpler and focus on validating
    the snapshot without dealing with internal ChainstateManager states.
    
    This is a step towards being able to validate the snapshot outside of
    ActivateBestChain loop so cs_main is not locked for minutes when the snapshot
    block is connected.
    214d10aef6
  71. refactor: Add Chainstate::StoragePath() method
    Use to simplify code determining the chainstate leveldb paths.
    4538ab5d39
  72. refactor: Add ChainstateManager::CurrentChainstate() method
    CurrentChainstate() is basically the same as ActiveChainstate() except it
    requires cs_main to be locked when it is called, instead of locking cs_main
    internally.
    
    The name "current" should also be less confusing than "active" because multiple
    chainstates can be active, and CurrentChainstate() returns the chainstate
    targeting the current network tip, regardless of what chainstates are being
    downloaded or how they are used.
    e0307f892e
  73. refactor: Add ChainstateManager::ValidatedChainstate() method
    ValidatedChainstate() accessor replaces GetChainstateForIndexing() with no
    change in behavior.
    2593f882fa
  74. refactor: Convert ChainstateRole enum to struct
    Change ChainstateRole parameter passed to wallets and indexes. Wallets and
    indexes need to know whether chainstate is historical and whether it is fully
    validated. They should not be aware of the assumeutxo snapshot validation
    process.
    6d368aa22e
  75. refactor: Delete ChainstateManager::IsSnapshotActive() method
    IsSnapshotActive() method is only called one place outside of tests and
    asserts, and is confusing because it returns true even after the snapshot is
    fully validated.
    
    The documentation which said this "implies that a background validation
    chainstate is also in use" is also incorrect, because after the snapshot is
    validated, the background chainstate gets disabled and IsUsable() would return
    false.
    64a8df94d5
  76. refactor: Delete ChainstateManager::IsSnapshotValidated() method
    IsSnapshotValidated() is only called one place outside of tests, and is use
    redundantly in some tests, asserting that a snapshot is not validated when a
    snapshot chainstate does not even exist. Simplify by dropping the method and
    checking Chainstate validity field directly.
    8e93d7d242
  77. refactor: Delete ChainstateManager::SnapshotBlockhash() method
    SnapshotBlockhash() is only called two places outside of tests, and is used
    redundantly in some tests, checking the same field as other checks. Simplify by
    dropping the method and using the m_from_snapshot_blockhash field directly.
    7c2295aeb7
  78. refactor: Add ChainstateManager::m_chainstates member
    Use to replace m_active_chainstate, m_ibd_chainstate, and m_snapshot_chainstate
    members. This has several benefits:
    
    - Ensures ChainstateManager treats chainstates instances equally, making
      distinctions based on their attributes, not having special cases and making
      assumptions based on their identities.
    
    - Normalizes ChainstateManager representation so states that should be
      impossible to reach and validation code has no handling for (like
      m_snapshot_chainstate being set and m_ibd_chainstate being unset, or both
      being set but m_active_chainstate pointing to the m_ibd_chainstate) can no
      longer be represented.
    
    - Makes ChainstateManager more extensible so new chainstates can be added for
      different purposes, like indexing or generating and validating assumeutxo
      snapshots without interrupting regular node operations. With the
      m_chainstates member, new chainstates can be added and handled without needing
      to make changes all over validation code or to copy/paste/modify the existing
      code that's been already been written to handle m_ibd_chainstate and
      m_snapshot_chainstate.
    
    - Avoids terms that are confusing and misleading:
    
      - The term "active chainstate" term is confusing because multiple chainstates
        will be active and in use at the same time. Before a snapshot is validated,
        wallet code will use the snapshot chainstate, while indexes will use the IBD
        chainstate, and netorking code will use both chainstates, downloading
        snapshot blocks at higher priority, but also IBD blocks simultaneously.
    
      - The term "snapshot chainstate" is ambiguous because it could refer either
        to the chainstate originally loaded from a snapshot, or to the chainstate
        being used to validate a snapshot that was loaded, or to a chainstate being
        used to produce a snapshot, but it is arbitrary used to refer the first
        thing. The terms "most-work chainstate" or "assumed-valid chainstate" should
        be less ambiguous ways to refer to chainstates loaded from snapshots.
    
      - The term "IBD chainstate" is not just ambiguous because actively confusing
        because technically IBD ends and the node is considered synced when the
        snapshot chainstate finishes syncing, so in practice the IBD chainstate
        will mostly by synced after IBD is complete. The term "fully-validated" is
        a better way of describing the characteristics and purpose of this
        chainstate.
    3f49903cd3
  79. refactor: Add ChainstateManager::ActivateBestChains() method
    Deduplicate code looping over chainstate objects and calling
    ActivateBestChain() and avoid need for code outside ChainstateManager to use
    the GetAll() method.
    ea52cd4bc3
  80. refactor: Delete ChainstateManager::GetAll() method
    Just use m_chainstates array instead.
    cb2467bfd5
  81. refactor: Simplify pruning functions bb037a6a27
  82. doc: Improve ChainstateManager documentation, use consistent terms 137d5980ab
  83. ryanofsky force-pushed on Mar 20, 2025
  84. ryanofsky force-pushed on Mar 20, 2025
  85. ryanofsky commented at 2:18 pm on March 20, 2025: contributor

    Thanks for the reviews

    Rebased 467528960689c2913c101ef75bc833e8f04bd0f3 -> 00c3e4db49ebd2c3bf2f3c7f2a8c9ba5dfb3c8f5 (pr/cstate.10 -> pr/cstate.11, compare) due to conflict with #31519, also adding test comments. Update 00c3e4db49ebd2c3bf2f3c7f2a8c9ba5dfb3c8f5 -> 137d5980abd49a1d2bf783cb57a9386b1941550f (pr/cstate.11 -> pr/cstate.12, compare) just fixing a commit message

  86. DrahtBot removed the label Needs rebase on Mar 20, 2025

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-03-31 15:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me