validation: have LoadBlockIndex account for snapshot use #23174

pull jamesob wants to merge 3 commits into bitcoin:master from jamesob:2021-10-au-loadblockindex changing 3 files +157 −18
  1. jamesob commented at 2:33 pm on October 4, 2021: member

    This is part of the assumeutxo project (parent PR: #15606)


    Currently, BlockManager::LoadBlockIndex adds all blocks that have downloaded transactions to the active chain state’s setBlockIndexCandidates set, ignoring the background chain state.

    This PR changes ChainstateManager::LoadBlockIndex to update setBlockIndexCandidates in the background chain, not just the active chain. In the active chain, the same blocks are added as before. In the background chain, only blocks that have actually been validated, not blocks marked assumed-valid are added so the background chain will continue to download and validate assumed-valid blocks.

  2. DrahtBot added the label Validation on Oct 4, 2021
  3. DrahtBot commented at 11:40 am on October 5, 2021: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #23581 (Move BlockManager to node/blockstorage by MarcoFalke)
    • #23448 (refactor, consensus: remove calls to global Params() in validation layer by lsilva01)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  4. Sjors commented at 8:42 am on October 7, 2021: member
    (this PR is missing in the Assume UTXO project)
  5. jonatack commented at 7:43 pm on October 7, 2021: contributor
    Concept ACK
  6. in src/validation.cpp:3643 in 51e614fdbc outdated
    3638+    // setBlockIndexCandidates of the background chainstate.
    3639+    bool saw_end_of_assumedvalid{false};
    3640+
    3641+    std::optional<uint256> assumedvalid_end_blockhash;
    3642+    unsigned int assumedvalid_end_nchaintx;
    3643+    std::tie(assumedvalid_end_blockhash, assumedvalid_end_nchaintx) =
    


    jonatack commented at 3:02 pm on October 8, 2021:

    For std::tie

    0+++ b/src/validation.cpp
    1@@ -58,6 +58,7 @@
    2 #include <numeric>
    3 #include <optional>
    4 #include <string>
    5+#include <tuple>
    
  7. in src/validation.cpp:3695 in 51e614fdbc outdated
    3690@@ -3658,7 +3691,18 @@ bool BlockManager::LoadBlockIndex(
    3691         if (pindex->IsAssumedValid() ||
    3692                 (pindex->IsValid(BLOCK_VALID_TRANSACTIONS) &&
    3693                  (pindex->HaveTxsDownloaded() || pindex->pprev == nullptr))) {
    3694-            block_index_candidates.insert(pindex);
    3695+            for (CChainState* chainstate : chainman.GetAll()) {
    3696+                // If we're on an block index entry that is passed the end of the
    


    jonatack commented at 3:07 pm on October 8, 2021:
    • s/an block/a block/
    • s/passed the end/past the end/?

    my editor auto-proposes this

    0-                // If we're on an block index entry that is passed the end of the
    1-                // assumed-valid region of the chain, avoid
    2-                // adding this as a candidate tip to the background validation
    3-                // chain, since that would prevent background IBD from happening.
    4+                // If we're on a block index entry that is past the end of the
    5+                // assumed-valid region of the chain, avoid adding this as a
    6+                // candidate tip to the background validation chain, since that
    7+                // would prevent background IBD from happening.
    

    jamesob commented at 1:56 pm on November 1, 2021:
    Fixed
  8. in src/validation.cpp:3659 in 51e614fdbc outdated
    3655+        //
    3656+        // Do not expect assumed-valid blocks to have nTx.
    3657         if (pindex->nTx > 0) {
    3658             if (pindex->pprev) {
    3659-                if (pindex->pprev->HaveTxsDownloaded()) {
    3660+                bool is_assumedvalid_end =
    


    jonatack commented at 3:08 pm on October 8, 2021:
    0                const bool is_assumedvalid_end =
    
  9. in src/validation.h:1020 in 51e614fdbc outdated
    1013@@ -1010,6 +1014,12 @@ class ChainstateManager
    1014     //! ResizeCoinsCaches() as needed.
    1015     void MaybeRebalanceCaches() EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
    1016 
    1017+    //! If some region of the block index is assumed to be valid (as in the case of
    1018+    //! UTXO snapshot usage), return the last blockhash to be assumed valid (i.e. the
    1019+    //! base of the snapshot) and the correspondent nChainTx value associated with it.
    1020+    std::pair<std::optional<uint256>, unsigned int> getAssumedValidEnd()
    


    jonatack commented at 3:18 pm on October 8, 2021:

    style nit, PascalCase unless there is some reason I’m not aware of here (same for the other get... functions introduced here)

    0    std::pair<std::optional<uint256>, unsigned int> GetAssumedValidEnd()
    

    ryanofsky commented at 3:49 pm on October 13, 2021:

    In commit “validation: have LoadBlockIndex account for snapshot use” (51e614fdbc6a23ade324e9bd01228bf97df9bc5d):

    Would suggest changing GetAssumedValidEnd to GetLastAssumedValid because “end” does not seem right here. In c++ “end” usually refers to the first element after the range, not the last element in the range.

    Also, it seems like whole return value should be optional, not just the first element of the pair (optional<tuple<uint256, int>> not pair<optional<uint256>, int>. Changing this would also avoid having having to return a dummy 0 nchaintx value in the implementation if there is no snapshot.


    ryanofsky commented at 4:52 pm on October 19, 2021:

    style nit, PascalCase unless there is some reason I’m not aware of here (same for the other get... functions introduced here)

    I always liked the lower/upper method/function convention so you know if some random function call is going to implicitly have access to *this as a hidden parameter. Making members or private members lowercase is less verbose than prefixing method calls with this-> and this convention is already followed other parts of code. Nice to see internal method calls made more obvious in new code, IMO.

  10. jonatack commented at 3:28 pm on October 8, 2021: contributor
    After debug building rebased to master and an initial review, the code builds cleanly and makes sense, but I need to look more deeply and haven’t yet checked if the change is covered by tests, so work-in-progress ACK 51e614fdbc6a23ade324e9bd01228bf97df9bc5d. A few minor comments, feel free to ignore.
  11. in src/validation.cpp:3639 in 51e614fdbc outdated
    3630@@ -3631,17 +3631,50 @@ bool BlockManager::LoadBlockIndex(
    3631         vSortedByHeight.push_back(std::make_pair(pindex->nHeight, pindex));
    3632     }
    3633     sort(vSortedByHeight.begin(), vSortedByHeight.end());
    3634+
    3635+    // If we have a historical region of the chain that is assumed-valid, we should
    3636+    // mark the last assumed-valid block that we see, since we then have to populate
    3637+    // the final nChainTx as well as prevent inclusion of blocks after that in the
    3638+    // setBlockIndexCandidates of the background chainstate.
    3639+    bool saw_end_of_assumedvalid{false};
    


    ryanofsky commented at 9:40 am on October 14, 2021:

    In commit “validation: have LoadBlockIndex account for snapshot use” (51e614fdbc6a23ade324e9bd01228bf97df9bc5d)

    This comment seems a little out of place since it’s not describing these variables, but describing how they are used in code below. Would maybe suggest breaking this up and moving it to relevant places below.

  12. in src/validation.h:630 in 51e614fdbc outdated
    624@@ -629,6 +625,10 @@ class CChainState
    625      */
    626     const std::optional<uint256> m_from_snapshot_blockhash;
    627 
    628+    //! Return true if this chainstate relies on blocks that are assumed-valid. In
    629+    //! practice this means it was created based on a UTXO snapshot.
    630+    bool ReliesOnAssumeValid() { return m_from_snapshot_blockhash.has_value(); }
    


    ryanofsky commented at 5:16 pm on October 19, 2021:

    In commit “validation: have LoadBlockIndex account for snapshot use” (51e614fdbc6a23ade324e9bd01228bf97df9bc5d)

    Another naming nitpick. Feel free to ignore this naming suggestion and all the other ones, but might suggest inverting this function and calling it something like “OnlyContainsAssumedValidBlocks”. This seems less handwavy than “ReliesOnAssumeValid” because it’s not clear what “relies” mean or who is relying.

  13. in src/validation.cpp:3641 in 51e614fdbc outdated
    3636+    // mark the last assumed-valid block that we see, since we then have to populate
    3637+    // the final nChainTx as well as prevent inclusion of blocks after that in the
    3638+    // setBlockIndexCandidates of the background chainstate.
    3639+    bool saw_end_of_assumedvalid{false};
    3640+
    3641+    std::optional<uint256> assumedvalid_end_blockhash;
    


    ryanofsky commented at 5:23 pm on October 19, 2021:

    In commit “validation: have LoadBlockIndex account for snapshot use” (51e614fdbc6a23ade324e9bd01228bf97df9bc5d)

    Also suggested this in another comment, but maybe consider s/assumedvalid_end/last_assumed_valid/ throughout this function, because “end” in c++ usually refers to first element after the range, not the last element in the range

  14. in src/validation.cpp:3644 in 51e614fdbc outdated
    3639+    bool saw_end_of_assumedvalid{false};
    3640+
    3641+    std::optional<uint256> assumedvalid_end_blockhash;
    3642+    unsigned int assumedvalid_end_nchaintx;
    3643+    std::tie(assumedvalid_end_blockhash, assumedvalid_end_nchaintx) =
    3644+        chainman.getAssumedValidEnd();
    


    ryanofsky commented at 5:25 pm on October 19, 2021:

    In commit “validation: have LoadBlockIndex account for snapshot use” (51e614fdbc6a23ade324e9bd01228bf97df9bc5d)

    Could use structured bindings to save space and avoid potentially dangerous type mismatches:

     0--- a/src/validation.cpp
     1+++ b/src/validation.cpp
     2@@ -3638,10 +3638,7 @@ bool BlockManager::LoadBlockIndex(
     3     // setBlockIndexCandidates of the background chainstate.
     4     bool saw_end_of_assumedvalid{false};
     5 
     6-    std::optional<uint256> assumedvalid_end_blockhash;
     7-    unsigned int assumedvalid_end_nchaintx;
     8-    std::tie(assumedvalid_end_blockhash, assumedvalid_end_nchaintx) =
     9-        chainman.getAssumedValidEnd();
    10+    auto [assumedvalid_end_blockhash, assumedvalid_end_nchaintx] = chainman.getAssumedValidEnd();
    11 
    12     for (const std::pair<int, CBlockIndex*>& item : vSortedByHeight)
    13     {
    
  15. in src/validation.cpp:3698 in 51e614fdbc outdated
    3690@@ -3658,7 +3691,18 @@ bool BlockManager::LoadBlockIndex(
    3691         if (pindex->IsAssumedValid() ||
    3692                 (pindex->IsValid(BLOCK_VALID_TRANSACTIONS) &&
    3693                  (pindex->HaveTxsDownloaded() || pindex->pprev == nullptr))) {
    3694-            block_index_candidates.insert(pindex);
    3695+            for (CChainState* chainstate : chainman.GetAll()) {
    3696+                // If we're on an block index entry that is passed the end of the
    3697+                // assumed-valid region of the chain, avoid
    3698+                // adding this as a candidate tip to the background validation
    3699+                // chain, since that would prevent background IBD from happening.
    


    ryanofsky commented at 6:07 pm on October 19, 2021:

    In commit “validation: have LoadBlockIndex account for snapshot use” (51e614fdbc6a23ade324e9bd01228bf97df9bc5d)

    It’s not so clear why adding blocks to setBlockIndexCandidates would prevent IBD from happening. The negative “avoid doing X so Y doesn’t happen” logic here is a bit twisty, so I would try to describe more positively what this is doing. Something like:

    0// If a chain only supposed to contain assume valid blocks (because it is
    1// validating them in the background) avoid adding later validated blocks,
    2// because these would prevent earlier blocks from being downloaded and
    3// validated.
    4if (!chainstate->OnlyContainsAssumeValidBlocks() || block_assumed_valid) {
    5    chainstate->setBlockIndexCandidates.insert(pindex);
    6}
    
  16. in src/validation.cpp:3658 in 51e614fdbc outdated
    3652+
    3653         // We can link the chain of blocks for which we've received transactions at some point.
    3654         // Pruned nodes may have deleted the block.
    3655+        //
    3656+        // Do not expect assumed-valid blocks to have nTx.
    3657         if (pindex->nTx > 0) {
    


    ryanofsky commented at 6:19 pm on October 19, 2021:

    In commit “validation: have LoadBlockIndex account for snapshot use” (51e614fdbc6a23ade324e9bd01228bf97df9bc5d)

    Is there a bug here? Comment is saying do not expect assumed valid blocks to have nTx, then the next line immediately skips assume valid logic if nTx is 0


    ariard commented at 11:30 pm on October 25, 2021:
    Agree, I think if the new logic seeks to detect the last block of the assumed-valid range but assumed-valid blocks are expected to not have nTx, detection success cannot be reached if it’s conditional on nTx > 0.

    jamesob commented at 4:51 pm on October 27, 2021:

    Good points - this is a little confusing; my fault for not being clearer in commentary. nTx = 1 is set for all block headers during snapshot activation (https://github.com/jamesob/bitcoin/blob/2021-10-au-loadblockindex/src/validation.cpp#L4966-L4970) so this code will be executed for all assumed-valid blocks.

    Now that you’ve raised the point, I’m wondering if this code is even necessary. The modifications here are redundant with the ones we do during snapshot activation (linked above), and I think I added these here in LoadBlockIndex as a matter of necessity before I thought to set the entire blockindex to dirty in the activation routine (thereby persisting all changes during FlushStateToDisk).

    I was going to say that having this code here is still valuable, if not a little confusing, in case there is some kind of interruption when writing the block index immediately after snapshot load… but changes to the block index are batched, so if we fail to write the block index, none of the chain will be considered assumedvalid!

    So: maybe some of these LoadBlockIndex changes are unnecessary? I’ll work to determine this today.

  17. in src/validation.cpp:3672 in 51e614fdbc outdated
    3668+                    // candidate tips to setBlockIndexCandidates below.
    3669+                    //
    3670+                    // After the snapshot has completed validation, nChainTx
    3671+                    // values should compute normally.
    3672+                    assert(assumedvalid_end_nchaintx > 0);
    3673+                    pindex->nChainTx = assumedvalid_end_nchaintx;
    


    ryanofsky commented at 6:21 pm on October 19, 2021:

    In commit “validation: have LoadBlockIndex account for snapshot use” (51e614fdbc6a23ade324e9bd01228bf97df9bc5d)

    It would be good to have a unit test that ensures the function is actually setting this.

  18. in src/validation.cpp:3705 in 51e614fdbc outdated
    3700+                //
    3701+                // If there is no assumed-valid region of the chain,
    3702+                // !saw_end_of_assumedvalid will always be true.
    3703+                if (!saw_end_of_assumedvalid || chainstate->ReliesOnAssumeValid()) {
    3704+                    chainstate->setBlockIndexCandidates.insert(pindex);
    3705+                }
    


    ryanofsky commented at 6:23 pm on October 19, 2021:

    In commit “validation: have LoadBlockIndex account for snapshot use” (51e614fdbc6a23ade324e9bd01228bf97df9bc5d)

    It would be good to have a unit test that ensure the function is adding the right blocks as candidates in both chains.

  19. in src/validation.cpp:5100 in 51e614fdbc outdated
    5095+std::pair<std::optional<uint256>, unsigned int> ChainstateManager::getAssumedValidEnd()
    5096+{
    5097+    std::optional<uint256> snapshotblockhash = SnapshotBlockhash();
    5098+    std::optional<unsigned int> nchaintx = getSnapshotNChainTx();
    5099+
    5100+    if (!snapshotblockhash || !nchaintx) return {std::nullopt, 0};
    


    ryanofsky commented at 6:51 pm on October 19, 2021:

    In commit “validation: have LoadBlockIndex account for snapshot use” (51e614fdbc6a23ade324e9bd01228bf97df9bc5d)

    Might write to assert when values are inconsistent:

    0if (nchaintx) {
    1  assert(snapshotblockhash);
    2  return {snapshotblockhash, *nchaintx};
    3}
    4return {nullopt, 0};
    
  20. in src/validation.cpp:3673 in 51e614fdbc outdated
    3669+                    //
    3670+                    // After the snapshot has completed validation, nChainTx
    3671+                    // values should compute normally.
    3672+                    assert(assumedvalid_end_nchaintx > 0);
    3673+                    pindex->nChainTx = assumedvalid_end_nchaintx;
    3674+                    saw_end_of_assumedvalid = true;
    


    ryanofsky commented at 7:32 pm on October 19, 2021:

    In commit “validation: have LoadBlockIndex account for snapshot use” (51e614fdbc6a23ade324e9bd01228bf97df9bc5d)

    There can be multiple blocks at the same height in vSortedByHeight, right? If so, this behavior seems nondeterministic, because it’s sorted by (height, pointer address), so blocks at the assume valid height with pointer addresses less than the assume valid height will be added to setBlockIndexCandidates and blocks at the same height with greater pointer addresses will not by added to setBlockIndexCandidates.

    Having this mutable bool variable in the for loop makes this loop more stateful and complicated anyway. I think maybe ideally the GetAssumeValid method should return a (hash, height, nchaintx) tuple instead of a (hash, nchaintx) pair. So the height would be known in advance and the code could look something like:

     0const auto last_assumed_valid_block = chainman.GetLastAssumedValid();
     1for (const auto& [height, pindex] : vSortedByHeight) {
     2    // Update nChainTx the last assumed valid block
     3    if (last_assumed_valid_block) {
     4        const auto& [av_hash, av_height, av_chaintx] = *last_assumed_valid_block;
     5        if (pindex->GetBlockHash() == av_hash) {
     6            pindex->nChainTx = av_chaintx;
     7        }
     8    }
     9    ....
    10    // Add block to chain candidates lists, unless the block is after the assumed valid
    11    // height and the chain is only supposed to contain assumed valid blocks (if it's a
    12    // background chain supposed to download and validate them).
    13    const int av_height = last_assumed_valid_block ? std::get<1>(*last_assumed_valid_block) : std::numeric_limits<int>::max();
    14    if (!chainstate->OnlyContainsAssumedValidBlocks() || pindex->nHeight <= av_height) {
    15         chainstate->setBlockIndexCandidates.insert(pindex);
    16    }
    17}
    

    jamesob commented at 9:41 pm on October 28, 2021:
    Thanks for this comment; this raises a really good point. The LoadBlockIndex() modifications are now no longer stateful, and instead we walk backwards for each pindex to see if any of its ancestors are assumed-valid (which is O(n**2) with chain size, but since LoadBlockIndex() happens once on startup that doesn’t really matter).
  21. ryanofsky approved
  22. ryanofsky commented at 8:05 pm on October 19, 2021: contributor
    Code review ACK 51e614fdbc6a23ade324e9bd01228bf97df9bc5d. I left a lot of suggestions and some questions, but this change seems like a step in right direction and it shouldn’t change behavior in any noticeable way when there’s no snapshot loaded.
  23. ryanofsky commented at 8:20 pm on October 19, 2021: contributor

    Forgot to post this earlier. I wrote up a description of this PR when I started reviewing it that may be useful to others (or helpful to me if I got something wrong):

    Currently, BlockManager::LoadBlockIndex adds all blocks that have downloaded transactions to the active chain state’s setBlockIndexCandidates set, ignoring the background chain state.

    This PR changes ChainstateManager::LoadBlockIndex to:

    • Update setBlockIndexCandidates in the background chain, not just the active chain. In the active chain, all blocks are added as before. In the background chain, only blocks below the snapshot height are added so the background chain will continue to download and validate them.
    • Sets pindex->nChainTx of the snapshot block to the value from snapshot metadata instead of leaving it 0.
  24. in src/validation.cpp:5097 in 51e614fdbc outdated
    5092+    return au_data->nChainTx;
    5093+}
    5094+
    5095+std::pair<std::optional<uint256>, unsigned int> ChainstateManager::getAssumedValidEnd()
    5096+{
    5097+    std::optional<uint256> snapshotblockhash = SnapshotBlockhash();
    


    ariard commented at 10:50 pm on October 25, 2021:
    Can be auto blockhash_op ?
  25. in src/validation.cpp:3702 in 51e614fdbc outdated
    3698+                // adding this as a candidate tip to the background validation
    3699+                // chain, since that would prevent background IBD from happening.
    3700+                //
    3701+                // If there is no assumed-valid region of the chain,
    3702+                // !saw_end_of_assumedvalid will always be true.
    3703+                if (!saw_end_of_assumedvalid || chainstate->ReliesOnAssumeValid()) {
    


    ariard commented at 11:43 pm on October 25, 2021:

    Shouldn’t the logical operator be an && ?

    Otherwise I think if the end of the assumed-valid block region has been reached (saw_end_of_assumevalid=true) logic still enters this control flow. And as such block index entries past the assumed-valid region of the chain are added to setBlockIndexCandidates (AFAIU from the comment just above).

  26. in src/validation.cpp:3666 in 51e614fdbc outdated
    3662+                    pindex->GetBlockHash() == *assumedvalid_end_blockhash;
    3663+
    3664+                if (is_assumedvalid_end) {
    3665+                    // While starting up with an assumeutxo snapshot chain,
    3666+                    // we need to manually populate the nChainTx field of the
    3667+                    // base of the snapshot so that we can add assumed-valid
    


    ariard commented at 11:46 pm on October 25, 2021:
    I don’t see the causality between populating the nChainTx and adequately adding assumed-valid and assumed-valid only blocks to setBlockIndexCandidates for snapshot chainstates ? I think conditional L3691 isn’t pending on nChainTx.

    jamesob commented at 9:24 pm on October 28, 2021:
    Good point! Addressed in new changeset.
  27. ariard commented at 11:51 pm on October 25, 2021: member
    Concept ACK, as highlighted by other reviewers would be good to have unit tests to assert correctness of the changes.
  28. jamesob force-pushed on Oct 28, 2021
  29. jamesob commented at 9:37 pm on October 28, 2021: member

    Hey, review works! Thanks to commentary from @ryanofsky @ariard, I’ve found that a lot of the original patch was unnecessary .I’ve pushed a much-simplified changeset that addresses the quandary I posted yesterday (https://github.com/bitcoin/bitcoin/pull/23174#discussion_r737666011).

    It turns out that the earlier version of this patch contained some unnecessary code that was a vestige from before I was persisting changes to the block index during snapshot activation (added in #21526). The new changeset just ensures that we don’t add tip candidates which rely on assumed-valid blocks to the IBD chainstate. I think the changes are much more intuitive to follow, and a decent amount of code was dropped.

    For anyone who was in doubt: reviewing code pays off!

  30. in src/validation.cpp:3660 in 2d7ecfb6e9 outdated
    3636+        // PopulateAndValidateSnapshot()).
    3637         // Pruned nodes may have deleted the block.
    3638         if (pindex->nTx > 0) {
    3639             if (pindex->pprev) {
    3640-                if (pindex->pprev->HaveTxsDownloaded()) {
    3641+                if (pindex->pprev->nChainTx > 0) {
    


    jamesob commented at 9:43 pm on October 28, 2021:
    Note that this change is tautological, but done for clarity. HaveTxsDownloaded() should probably be removed since its name conflicts with assumed-valid blocks that have had their nChainTx value set during snapshot activation.
  31. jamesob force-pushed on Oct 28, 2021
  32. in src/validation.cpp:4915 in cb1d424480 outdated
    4911@@ -4912,6 +4912,14 @@ bool ChainstateManager::PopulateAndValidateSnapshot(
    4912     for (int i = 0; i <= snapshot_chainstate.m_chain.Height(); ++i) {
    4913         index = snapshot_chainstate.m_chain[i];
    4914 
    4915+        if (index->isGenesis()) {
    


    ryanofsky commented at 9:36 pm on November 2, 2021:

    In commit “validation: don’t modify genesis during snapshot load” (cb1d4244806980d31ceaa2307431dd70e6c39fe9)

    The new comment here seems good, but I think it would be more direct and straightforward to write

    0-for (int i = 0; i <= snapshot_chainstate.m_chain.Height(); ++i) {
    1+for (int i = 1; i <= snapshot_chainstate.m_chain.Height(); ++i) {
    

    above than add this extra control flow and indirect isGenesis() height check.


    jamesob commented at 9:21 pm on November 9, 2021:
    Done.
  33. in src/validation.cpp:4917 in cb1d424480 outdated
    4911@@ -4912,6 +4912,14 @@ bool ChainstateManager::PopulateAndValidateSnapshot(
    4912     for (int i = 0; i <= snapshot_chainstate.m_chain.Height(); ++i) {
    4913         index = snapshot_chainstate.m_chain[i];
    4914 
    4915+        if (index->isGenesis()) {
    4916+            // Don't make any modifications to the genesis block.
    4917+            // This is especially important because we don't want to erroneously
    


    ryanofsky commented at 9:41 pm on November 2, 2021:

    In commit “validation: don’t modify genesis during snapshot load” (cb1d4244806980d31ceaa2307431dd70e6c39fe9)

    Would be good to have a unit test for this, if changing this could break things.


    jamesob commented at 9:21 pm on November 9, 2021:
    Done.
  34. in src/chain.h:317 in 619ada31fd outdated
    311@@ -312,6 +312,10 @@ class CBlockIndex
    312     //!   validated by a background chainstate.
    313     bool IsAssumedValid() const { return nStatus & BLOCK_ASSUMED_VALID; }
    314 
    315+    // Runtime: O(n) with chain length.
    316+    // Returns true if the block specified relies on ancestors which are assumed-valid.
    317+    bool reliesOnAssumedValid() const;
    


    ryanofsky commented at 9:50 pm on November 2, 2021:

    In commit “validation: add CBlockIndex reliesOnAssumeValid” (619ada31fdc90f079cb1a6c44df6b76f6a162c1e)

    Would suggest calling method HasAssumedValidAncestors() instead of reliesOnAssumedValid() to avoid vague “relies”, to be more consistent with other IsAssumedValid() and GetAncestor() method names, and to be less confusing in the place where CBlockIndex::reliesOnAssumedValid and CChainState::reliesOnAssumedValid are both called together.


    jamesob commented at 9:21 pm on November 9, 2021:
    Removed this method altogether, so done.
  35. in src/chain.h:344 in 619ada31fd outdated
    340@@ -337,7 +341,7 @@ class CBlockIndex
    341     //! Build the skiplist pointer for this entry.
    342     void BuildSkip();
    343 
    344-    //! Efficiently find an ancestor of this block.
    345+    //! Efficiently find an ancestor of this block. If nHeight is passed, this is returned.
    


    ryanofsky commented at 9:57 pm on November 2, 2021:

    In commit “validation: add CBlockIndex reliesOnAssumeValid” (619ada31fdc90f079cb1a6c44df6b76f6a162c1e)

    I was confused what nHeight might be referring to, or if it might be a typo for height. Maybe say “If current block height is passed” or “If this->nHeight is passed” instead of “If nHeight is passed”

  36. in src/validation.cpp:3706 in dbca47ece6 outdated
    3663+                    }
    3664+                } else {
    3665+                    // When blockindex doesn't rely on assumedvalid blocks, add it to all chainstates.
    3666+                    chainstate->setBlockIndexCandidates.insert(pindex);
    3667+                }
    3668+            }
    


    ryanofsky commented at 10:37 pm on November 2, 2021:

    In commit “validation: have LoadBlockIndex account for snapshot use” (dbca47ece6b9a3a5e02969270bfab340fce286c0)

    I think I would suggest a few changes here:

    • Consolidate the insert calls, and just focus on one case where blocks shouldn’t be added instead of branching for other cases.
    • Try to be less vague in comments (“fitting chainstates”)
    • Be a little more efficient avoiding the O(n) pindex call, and mention it explicitly in comment to be clear that it’s intentional.
     0diff --git a/src/validation.cpp b/src/validation.cpp
     1index 4af1c7f07a0..eec27f63abb 100644
     2--- a/src/validation.cpp
     3+++ b/src/validation.cpp
     4@@ -3653,15 +3653,15 @@ bool BlockManager::LoadBlockIndex(
     5         if (pindex->IsAssumedValid() ||
     6                 (pindex->IsValid(BLOCK_VALID_TRANSACTIONS) &&
     7                  (pindex->HaveTxsDownloaded() || pindex->pprev == nullptr))) {
     8+            // Following call is O(n), but should not be a significant slowdown.
     9+            const bool has_assumed_valid_ancestors = pindex->reliesOnAssumedValid();
    10+
    11+            // Fill each chainstate's block candidate set. Add all blocks as
    12+            // candidates if the chainstate is allowed to rely on assumed-valid
    13+            // data. Otherwise avoid adding assumed-valid blocks so the
    14+            // chainstate will download and validate all earlier blocks.
    15             for (CChainState* chainstate : chainman.GetAll()) {
    16-                // Only add blocks which are assumed-valid to fitting chainstates, i.e. those
    17-                // populated on the basis of a UTXO snapshot.
    18-                if (pindex->reliesOnAssumedValid()) {
    19-                    if (chainstate->reliesOnAssumedValid()) {
    20-                        chainstate->setBlockIndexCandidates.insert(pindex);
    21-                    }
    22-                } else {
    23-                    // When blockindex doesn't rely on assumedvalid blocks, add it to all chainstates.
    24+                if (chainstate->reliesOnAssumedValid() || !has_assumed_valid_ancestors) {
    25                     chainstate->setBlockIndexCandidates.insert(pindex);
    26                 }
    27             }
    

    Also it would be good to have unit test coverage for this


    jamesob commented at 9:22 pm on November 9, 2021:
    Addressed everything here aside from the unittest, which I will follow up with.
  37. in src/chain.cpp:122 in 619ada31fd outdated
    118@@ -119,6 +119,15 @@ void CBlockIndex::BuildSkip()
    119         pskip = pprev->GetAncestor(GetSkipHeight(nHeight));
    120 }
    121 
    122+bool CBlockIndex::reliesOnAssumedValid() const {
    


    ryanofsky commented at 10:47 pm on November 2, 2021:

    In commit “validation: add CBlockIndex reliesOnAssumeValid” (619ada31fdc90f079cb1a6c44df6b76f6a162c1e)

    Would be good and should be pretty easy add a simple unit test for this. If you want a template, findearliestatleast_edge_test is a simple existing test making a chain of blocks and checking some properties.


    jamesob commented at 9:33 pm on November 8, 2021:

    Yep, will test this.

    Also, this change is not mergeable as-is: it turns out invoking this for each CBlockIndex in the chain takes a really. long. time. making startup on mainnet appear as though it is hung. :cry:


    jamesob commented at 9:22 pm on November 9, 2021:
    Fixed - nothing to test anymore.
  38. ryanofsky approved
  39. ryanofsky commented at 11:13 pm on November 2, 2021: contributor

    Code review ACK dbca47ece6b9a3a5e02969270bfab340fce286c0. Much simpler now and looks all good to me!

    In the PR description #23174#issue-1015261643:

    When reconnecting the block index on startup, we have to account for the fact that assumed-valid blocks (per the snapshot chainstate) do not have nTx values. We can’t exclude blocks from inclusion in setBlockIndexCandidates on this basis or we will trip over an empty set in an assertion later in startup.

    I’m actually not sure what this is referring to, and it seems like it might not be applicable anymore according to #23174 (review) if nTx values are set and the previous code is now reverted

    We also have to ensure that blocks past the snapshot base block (i.e. the end of the assumed-valid region of the chain) are not included in setBlockIndexCandidates for the background validation chainstate.

    The way this is phrased makes it sound like this change avoids adding blocks, when actually it is adding more blocks. Would maybe try to make it clearer this is now adding blocks for the background chainstate. Feel free to steal PR description I posted earlier too #23174 (comment) if that is useful

  40. jamesob force-pushed on Nov 9, 2021
  41. jamesob commented at 9:24 pm on November 9, 2021: member

    Man this changeset keeps getting smaller and smaller. Thanks @ryanofsky!

    I’ve reverted back to doing a stateful approach for assumed-valid detection in LoadBlockIndex since the more holistic approach of scanning ancestors ended up being prohibitively slow when loading the main chain. This has allowed me to drop a few more changes from the original patch.

    The only remaining item left is to write unittests for the contents of setBlockIndexCandidates after calling LoadBlockIndex, which I’ll be working on shortly.

  42. in src/validation.cpp:3638 in 2833bd0277 outdated
    3634         pindex->nChainWork = (pindex->pprev ? pindex->pprev->nChainWork : 0) + GetBlockProof(*pindex);
    3635         pindex->nTimeMax = (pindex->pprev ? std::max(pindex->pprev->nTimeMax, pindex->nTime) : pindex->nTime);
    3636-        // We can link the chain of blocks for which we've received transactions at some point.
    3637+
    3638+        if (!seen_assumed_valid && pindex->IsAssumedValid()) {
    3639+            seen_assumed_valid = true;
    


    ryanofsky commented at 8:36 pm on November 10, 2021:

    In commit “validation: don’t modify genesis during snapshot load” (86d48535198724c723370a447f73e134029331c2)

    I still don’t think it is good that if there are two blocks at the same height and one is assumed valid, then how the other one is treated depends on whether its pointer address is higher or lower than the assumed valid block pointer address. (Previous comment about this was #23174 (review)). I get that this only temporarily impacts validation state of the background chainstate, but I don’t think validation behavior should be nondeterministic and treat blocks differently based on what addresses malloc returns. Would suggest just doing height comparison directly so order of vSortedByHeight entries with the same height won’t matter:

     0diff --git a/src/validation.cpp b/src/validation.cpp
     1index 02b0f76f7b5..119b9141706 100644
     2--- a/src/validation.cpp
     3+++ b/src/validation.cpp
     4@@ -3623,9 +3623,14 @@ bool BlockManager::LoadBlockIndex(
     5     }
     6     sort(vSortedByHeight.begin(), vSortedByHeight.end());
     7 
     8-    // Have we yet seen a pindex->IsAssumedValid()? If so, account for the fact that all subsequent
     9-    // index entries rely on assumed-valid ancestors.
    10-    bool seen_assumed_valid{false};
    11+    // Find start of assumed-valid region.
    12+    int first_assumed_valid_height = std::numeric_limits<int>::max();
    13+    for (const auto& [height, block] : vSortedByHeight) {
    14+        if (block->IsAssumedValid()) {
    15+            first_assumed_valid_height = height;
    16+            break;
    17+        }
    18+    }
    19 
    20     for (const std::pair<int, CBlockIndex*>& item : vSortedByHeight)
    21     {
    22@@ -3634,10 +3639,6 @@ bool BlockManager::LoadBlockIndex(
    23         pindex->nChainWork = (pindex->pprev ? pindex->pprev->nChainWork : 0) + GetBlockProof(*pindex);
    24         pindex->nTimeMax = (pindex->pprev ? std::max(pindex->pprev->nTimeMax, pindex->nTime) : pindex->nTime);
    25 
    26-        if (!seen_assumed_valid && pindex->IsAssumedValid()) {
    27-            seen_assumed_valid = true;
    28-        }
    29-
    30         // We can link the chain of blocks for which we've received transactions at some point, or
    31         // blocks that are assumed-valid on the basis of snapshot load (see
    32         // PopulateAndValidateSnapshot()).
    33@@ -3674,7 +3675,7 @@ bool BlockManager::LoadBlockIndex(
    34             // happen because assumed-valid blocks are buried deeply in
    35             // the valid chain.
    36             for (CChainState* chainstate : chainman.GetAll()) {
    37-                if (!seen_assumed_valid || chainstate->reliesOnAssumedValid()) {
    38+                if (chainstate->reliesOnAssumedValid() || pindex->nHeight < first_assumed_valid_height) {
    39                     chainstate->setBlockIndexCandidates.insert(pindex);
    40                 }
    41             }
    

    jamesob commented at 9:50 pm on November 10, 2021:
    Great suggestions, all integrated. Thank you!
  43. in src/validation.cpp:3675 in 2833bd0277 outdated
    3671+            // `seen_assumed_valid` works, it is (in general) possible that we
    3672+            // are considering a block which has a height higher than the
    3673+            // assumed-valid region but is not itself reliant on assumed-valid
    3674+            // ancestors (i.e. a very deep fork). In practice this will not
    3675+            // happen because assumed-valid blocks are buried deeply in
    3676+            // the valid chain.
    


    ryanofsky commented at 8:50 pm on November 10, 2021:

    In commit “validation: have LoadBlockIndex account for snapshot use” (2833bd0277ac87cd07e98b0a23b73eed97a2e513)

    I think this note is helpful but a little confusing:

    • It’s vague where it refers to “the way seen_assumed_valid works” and what " considering a block" means
    • It’s unclear where the claim that this has to be a very deep fork comes from. IIUC, the fork just has to happen before the first assumed valid block height, and this could be anywhere in between the genesis block and the snapshot height. It could actually be very close to the snapshot height if the background chainstate is almost finished validating.
    • It isn’t clear about why this behavior is safe. That last sentence “In practice this will not happen” seems to imply that this is safe just because the condition is unlikely to happen. But comment is ambiguous whether it is safe regardless. IIUC, it is safe regardless.

    My attempt to describe this would be “Note: This is considering all blocks whose height is greater or equal to the first assumed-valid block to be assumed-valid blocks, and excluding them from the background chainstate’s setBlockIndexCandidates set. This does mean that some blocks which are not technically assumed-valid (later blocks on a fork beginning before the first assumed-valid block) might not get added to the the background chainstate, but this is ok, because they will still be attached to the active chainstate if they actually contain more work.”

  44. in src/validation.cpp:3677 in 2833bd0277 outdated
    3673+            // assumed-valid region but is not itself reliant on assumed-valid
    3674+            // ancestors (i.e. a very deep fork). In practice this will not
    3675+            // happen because assumed-valid blocks are buried deeply in
    3676+            // the valid chain.
    3677+            for (CChainState* chainstate : chainman.GetAll()) {
    3678+                if (!seen_assumed_valid || chainstate->reliesOnAssumedValid()) {
    


    ryanofsky commented at 8:53 pm on November 10, 2021:

    In commit “validation: have LoadBlockIndex account for snapshot use” (2833bd0277ac87cd07e98b0a23b73eed97a2e513)

    I’d maybe swap these conditions and write this as if (chainstate->reliesOnAssumedValid() || !seen_assumed_valid) to match the description above “Add all blocks as candidates if the chainstate is allowed to rely on assumed-valid blocks. Otherwise avoid adding assumed-valid blocks”

  45. in src/validation.cpp:3704 in 2833bd0277 outdated
    3674+            // ancestors (i.e. a very deep fork). In practice this will not
    3675+            // happen because assumed-valid blocks are buried deeply in
    3676+            // the valid chain.
    3677+            for (CChainState* chainstate : chainman.GetAll()) {
    3678+                if (!seen_assumed_valid || chainstate->reliesOnAssumedValid()) {
    3679+                    chainstate->setBlockIndexCandidates.insert(pindex);
    


    ryanofsky commented at 9:29 pm on November 10, 2021:

    In commit “validation: have LoadBlockIndex account for snapshot use” (2833bd0277ac87cd07e98b0a23b73eed97a2e513)

    I still think it would be really nice to call this function from a unit test and make sure this set is filled correctly. The logic here is fragile, and testing this part of LoadBlockIndex could make it easier to test other parts as well.

    EDIT: Sorry, I see you mentioned working on this in #23174 (comment)


    sdaftuar commented at 1:10 am on May 19, 2023:

    Sorry for chiming in on what is now a very old PR, but I believe this block of code is incorrect. If we have a snapshot in place (and so some block index is marked IsAssumedValid), but we have downloaded blocks that haven’t yet been activated (eg on the background-validation chainstate), this logic prevents us from adding those downloaded-but-not-yet verified blocks to setBlockIndexCandidate for the background chainstate.

    I ran into this while working on a download implementation for assumeutxo, and in some sequence of node restarts I got stuck in a state where my node had downloaded a bunch of blocks but never activated them, because they weren’t in setBlockIndexCandidates. I thought that restarting should fix the issue, but it didn’t due to this block of code not doing the right thing…

    Admittedly there is a lot of complicated interaction here but I think there’s an invariant for setBlockIndexCandidates that is violated by this block of code, which is that if we have the data for a block and all its predecessors and it has more work than our tip, then it should be in setBlockIndexCandidates. By excluding blocks that are marked IsAssumedValid() we violate this.


    jamesob commented at 4:46 pm on May 23, 2023:

    @sdaftuar:

    If we have a snapshot in place (and so some block index is marked IsAssumedValid), but we have downloaded blocks that haven’t yet been activated (eg on the background-validation chainstate), this logic prevents us from adding those downloaded-but-not-yet verified blocks to setBlockIndexCandidate for the background chainstate

    I’m having a hard time following how this could happen. If the bg chainstate has some downloaded-but-not-activated blocks, they are definitionally beneath the first_assumed_valid_height (since we don’t attach blocks above that height to the bg chainstate). So any of those blocks should be added by this code to the bg chainstate’s setBlockIndexCandidates. Have you seen an example where you’re sure that a downloaded block on the bg chain is excluded by this code?

    I don’t doubt that you’re seeing an issue though… I wonder if there might be a problem with how m_best_header is set below? I.e. that state isn’t qualified per chainstate. That’s something I’ll investigate now.

    I’d also be curious if anyone else has seen this issue when testing the assumeutxo tracking branch (https://github.com/bitcoin/bitcoin/pull/27596). Is there a chance some code you’ve added is causing this?

    By excluding blocks that are marked IsAssumedValid() we violate this.

    Note that we’re only excluding blocks marked assumedvalid (by way of the height check) for the bg chainstate, which still seems sensible to me. All blocks, assumedvalid or not, are added to the snapshot chainstate (or “regular” chain if no snapshot is in use).


    sdaftuar commented at 5:50 pm on May 23, 2023:

    they are definitionally beneath the first_assumed_valid_height (since we don’t attach blocks above that height to the bg chainstate)

    I don’t believe this is true; after we download a block and before we validate the block, the CBlockIndex entry for that block will have the AssumedValid bit set to true. This means that if you were to download a bunch of blocks and then shutdown prior to those blocks being validated on the bg chainstate, that when you restart you’ll be in this state.


    jamesob commented at 6:11 pm on May 23, 2023:
    But it’s no problem for them to have the ASSUMED_VALID bit set to true; since they’re beneath the snapshot base height, they’ll be added to the setBlockIndexCandidates.

    sdaftuar commented at 6:22 pm on May 23, 2023:
    The code here isn’t comparing with the snapshot base height; it’s just checking if a block index entry is below the first_assumed_valid_height, and only in that case does it get added to setBlockIndexCandidates. [I don’t believe there’s any other logic elsewhere that will do what you are describing, of adding entries below the snapshot base height to setBlockIndexCandidates?]

    sdaftuar commented at 6:26 pm on May 23, 2023:
    More generally: I think that we don’t need to worry at all about whether the assumed-valid bit is set when adding entries to setBlockIndexCandidates. We always validate a block when it gets connected, so the only invariant we need on entries in setBlockIndexCandidates is whether they have enough work to be worth considering. (We used to have a hard requirement that we always HAVE_DATA for such entries, but after pruning we had to relax that a bit, as there are edge-case scenarios where you might HAVE_DATA for such an entry and then prune it before you try to connect it, so we handle that case by (re-)adding such entries to mapBlocksUnlinked when they occur. But I digress…)

    jamesob commented at 6:35 pm on May 23, 2023:

    The code here isn’t comparing with the snapshot base height; it’s just checking if a block index entry is below the first_assumed_valid_height

    Oops, total misread on my part. You’re right; first_assumed_valid_height will be the block after the tip of the background chain.


    ryanofsky commented at 6:40 pm on May 23, 2023:

    The logic here does seem confusing.

    Code comment says background candidate set needs to exclude BLOCK_ASSUMED_VALID blocks because otherwise ActivateBestChain “would add assumed-valid blocks to the chain”. But that seems like exactly what it should do? (after validating them)

    Commit message says background candidate set needs to exclude blocks after the snapshot height that don’t have BLOCK_ASSUMED_VALID, but depend on blocks that do have BLOCK_ASSUMED_VALID. That idea does make sense, but does not seem to be what is implemented.

    Taking another a step back, it seems like all of this would be simpler if background ActivateBestChain function just skipped calling FindMostWorkChain and tried to connect blocks up to the base of the snapshot.

  46. ryanofsky approved
  47. ryanofsky commented at 9:36 pm on November 10, 2021: contributor

    Code review ACK 2833bd0277ac87cd07e98b0a23b73eed97a2e513. Changes since last review: Making PopulateAndValidateSnapshot loop from 1 instead of calling isGenesis. Also adding unit test. Making LoadBlockIndex use height comparison to determine assumed-valid blocks instead of traversing pprev.

    re: #23174#issue-1015261643, #23174 (comment)

    I edited PR description to drop the part about nChainTx and stop referring to the snapshot height since implementation was changed to use first assumed-valid height, not last assumed-valid height. Previous description was:

    This is part of the assumeutxo project (parent PR: #15606)

    (PR description by ryanofsky)

    Currently, BlockManager::LoadBlockIndex adds all blocks that have downloaded transactions to the active chain state’s setBlockIndexCandidates set, ignoring the background chain state.

    This PR changes ChainstateManager::LoadBlockIndex to:

    Update setBlockIndexCandidates in the background chain, not just the active chain. In the active chain, all blocks are added as before. In the background chain, only blocks below the snapshot height are added so the background chain will continue to download and validate them. Sets pindex->nChainTx of the snapshot block to the value from snapshot metadata instead of leaving it 0.

  48. jamesob force-pushed on Nov 10, 2021
  49. jamesob commented at 9:37 pm on November 10, 2021: member
    @ryanofsky nice timing. I just pushed a unittest for LoadBlockIndex()/setBlockIndexCandidates.
  50. jamesob force-pushed on Nov 10, 2021
  51. jamesob force-pushed on Nov 10, 2021
  52. in src/test/validation_chainstatemanager_tests.cpp:335 in e35d167682 outdated
    330+
    331+    int num_indexes{0};
    332+    int num_assumed_valid{0};
    333+    int expected_assumed_valid{20};
    334+    int last_assumed_valid_idx{40};
    335+    int assumed_valid_start_idx = last_assumed_valid_idx - expected_assumed_valid;
    


    ryanofsky commented at 4:52 pm on November 12, 2021:

    In commit “test: add tests for LoadBlockIndex when using multiple chainstates” (e35d167682bc527fa9d8734f1cc7558f18d95b09)

    Declaring these const could make the test a little easier to grok


    jamesob commented at 10:26 pm on November 18, 2021:
    Fixed, thanks
  53. in src/test/validation_chainstatemanager_tests.cpp:359 in e35d167682 outdated
    354+
    355+    // Mark some region of the chain assumed-valid.
    356+    for (int i = 0; i <= cs1.m_chain.Height(); ++i) {
    357+        auto index = cs1.m_chain[i];
    358+
    359+        if (i < last_assumed_valid_idx && i >= (last_assumed_valid_idx - expected_assumed_valid)) {
    


    ryanofsky commented at 4:53 pm on November 12, 2021:

    In commit “test: add tests for LoadBlockIndex when using multiple chainstates” (e35d167682bc527fa9d8734f1cc7558f18d95b09)

    Could simplify last_assumed_valid_idx - expected_assumed_valid as assumed_valid_start_idx

  54. ryanofsky approved
  55. ryanofsky commented at 5:02 pm on November 12, 2021: contributor

    Code review ACK e35d167682bc527fa9d8734f1cc7558f18d95b09. Main change since last review was just adding test. And some suggestions from the last review were also applied.

    This PR is now a lot simpler than it was at previous points, so it should be pretty easy for previous reviewers to reack, or new reviewers to take a look at.

  56. in src/test/validation_chainstatemanager_tests.cpp:389 in e35d167682 outdated
    384+    BOOST_CHECK_EQUAL(cs1.setBlockIndexCandidates.size(), assumed_valid_start_idx);
    385+
    386+    // The assumed-valid tolerant chain has all blocks as candidates.
    387+    BOOST_CHECK_EQUAL(cs2.setBlockIndexCandidates.count(validated_tip), 1);
    388+    BOOST_CHECK_EQUAL(cs2.setBlockIndexCandidates.count(assumed_tip), 1);
    389+    BOOST_CHECK_EQUAL(cs2.setBlockIndexCandidates.size(), num_indexes);
    


    ariard commented at 1:15 am on November 15, 2021:

    IIUC any fully validated block beyond the assumed-valid range of blocks shouldn’t not be tolerated by the “fully-validated” chain, as they’re relying on assumed-valid blocks ?

    If so can you add a BOOST_CHECK_EQUAL(cs1.setBlockIndexCandidates.count(m_chain.Height()), 0) to verify this expected behavior holds ?


    jamesob commented at 7:25 pm on November 16, 2021:

    Did you mean cs1.setBlockIndexCandidates.count(m_chain.Height())? Because the set doesn’t contain heights, it contains CBlockIndex*. Also which m_chain are you referring to? Because there’s one for c1 and c2.

    I think the check you want is already included in BOOST_CHECK_EQUAL(cs1.setBlockIndexCandidates.count(assumed_tip), 0);.

  57. in src/validation.cpp:3642 in e35d167682 outdated
    3621@@ -3622,17 +3622,30 @@ bool BlockManager::LoadBlockIndex(
    3622         vSortedByHeight.push_back(std::make_pair(pindex->nHeight, pindex));
    3623     }
    3624     sort(vSortedByHeight.begin(), vSortedByHeight.end());
    3625+
    3626+    // Find start of assumed-valid region.
    3627+    int first_assumed_valid_height = std::numeric_limits<int>::max();
    3628+    for (const auto& [height, block] : vSortedByHeight) {
    3629+        if (block->IsAssumedValid()) {
    3630+            first_assumed_valid_height = height;
    


    ariard commented at 1:17 am on November 15, 2021:
    nit: Maybe add a assert(IsSnapshotActive() as we shouldn’t have assumed-valid blocks in the absence of a snapshot loaded ?

    jamesob commented at 7:27 pm on November 16, 2021:
    Part of @ryanofsky’s overarching (and good) feedback a few months earlier was to avoid leaking knowledge of the idea of “UTXO snapshots” in places where it doesn’t need to be. Instead we should rely on more fundamental state (like the presence of assumed-valid block index entries). I think referencing snapshots here would be going the other direction.

    ryanofsky commented at 4:03 am on November 17, 2021:

    re: #23174 (review)

    nit: Maybe add a assert(IsSnapshotActive() as we shouldn’t have assumed-valid blocks in the absence of a snapshot loaded ?

    I guess like James said I’d be a little disinclined to add an assert checking for assumeutxo snapshots in validation code that’s dealing more abstractly with chainstates. But I could also see the assert being useful as a sanity check, so no strong opinion either way.

    Another option would be to write the assert a more generically with something like:

    0
    1// assert that the assumed valid blocks come from some assumed-valid chainstate.
    2assert(any_of(chainstates, [](auto& chainstate) { return chainstate->reliesOnAssumedValid(); });
    3
    4// assert that the assumed valid blocks will be validated by some not assumed-valid chainstate
    5assert(any_of(chainstates, [](auto& chainstate) { return !chainstate->reliesOnAssumedValid(); });
    

    jamesob commented at 6:34 pm on November 19, 2021:
    Good idea! Added.
  58. in src/validation.cpp:3669 in e35d167682 outdated
    3661@@ -3649,7 +3662,28 @@ bool BlockManager::LoadBlockIndex(
    3662         if (pindex->IsAssumedValid() ||
    3663                 (pindex->IsValid(BLOCK_VALID_TRANSACTIONS) &&
    3664                  (pindex->HaveTxsDownloaded() || pindex->pprev == nullptr))) {
    3665-            block_index_candidates.insert(pindex);
    3666+
    3667+            // Fill each chainstate's block candidate set. Add all blocks as
    3668+            // candidates if the chainstate is allowed to rely on assumed-valid
    3669+            // blocks. Otherwise avoid adding assumed-valid blocks so the
    3670+            // chainstate will download and validate all earlier blocks.
    


    ariard commented at 1:19 am on November 15, 2021:

    I think the comment can be slightly modified to mention the descendants of assumed-valid blocks, like in the commit description.

    “Otherwise avoid adding assumed-valid blocks and their fully-validated descendants therefore inducing the chainstate to download and validate all blocks earlier than the first assumed-valid block”.


    jamesob commented at 8:23 pm on November 16, 2021:
    Not sure I follow your suggested comment, since fully-validated blocks should be added to both chainstates, but I’ve updated the comment to be try and be clearer.
  59. ariard commented at 1:24 am on November 15, 2021: member
    The newly patchset is far simpler, just left few minor comments to confirm my understanding of the changes.
  60. jamesob force-pushed on Nov 16, 2021
  61. ryanofsky approved
  62. ryanofsky commented at 4:05 am on November 17, 2021: contributor
    Code review ACK bc3288b0ba1d949f0004513e250bb21ddfb5b389. Only change since previous review was fixing up a code comment to be more clear about which chainstate it was referring to
  63. jamesob force-pushed on Nov 18, 2021
  64. jamesob force-pushed on Nov 19, 2021
  65. ryanofsky commented at 4:45 am on November 24, 2021: contributor
    Code review ACK 7a576ba9d24308e36dcde2b48b34c0676de98daa. Only change since previous review is adding new asserts and some tweaks in the test
  66. in src/validation.cpp:3635 in e19253f2fd outdated
    3630+        if (block->IsAssumedValid()) {
    3631+            auto chainstates = chainman.GetAll();
    3632+
    3633+            // assert that the assumed valid blocks come from some assumed-valid chainstate.
    3634+            assert(any_of(chainstates.cbegin(), chainstates.cend(),
    3635+                [](auto chainstate) { return chainstate->reliesOnAssumedValid(); }));
    


    fjahr commented at 3:49 pm on November 28, 2021:
    Hm, this feels a bit backwards to me. Couldn’t we skip the whole loop if we first check that one of the chainstates relies on assume-valid?

    jamesob commented at 4:24 pm on November 29, 2021:
    I don’t understand your comment: these two asserts are only performed once if there is an assumed-valid block detected, and then the loop is broken out of (line 3642). We don’t want to do the asserts if there is no assumed-valid block, so I don’t see how or why we’d do them outside of the loop.

    fjahr commented at 11:24 pm on November 30, 2021:
    Unless I have a misunderstanding here, what I meant: We only have blocks that are assumed valid if we have a chainstate that relies on assume valid, those two will always go together. We will probably have very few chain states and we have a lot of blocks. So it seems to me that it would be more efficient to first loop the chainstates and then the blocks.
  67. in src/test/validation_chainstatemanager_tests.cpp:188 in 86d4853519 outdated
    183@@ -184,6 +184,9 @@ BOOST_FIXTURE_TEST_CASE(chainstatemanager_activate_snapshot, TestChain100Setup)
    184     BOOST_CHECK(!chainman.ActiveChainstate().m_from_snapshot_blockhash);
    185     BOOST_CHECK(!chainman.SnapshotBlockhash());
    186 
    187+    // Ensure that the genesis block was not marked assumed-valid.
    188+    BOOST_CHECK(!chainman.ActiveChain().Genesis()->IsAssumedValid());
    


    fjahr commented at 4:47 pm on November 28, 2021:
    Is this in the right place here? When I set AFTER_GENESIS_START to 0 this still doesn’t fail for me.

    jamesob commented at 4:01 pm on November 29, 2021:
    Oh yep, you’re very right - we haven’t actually yet activated the snapshot at this point in the test. Good find, thanks.
  68. fjahr commented at 5:04 pm on November 28, 2021: contributor

    I reviewed the code and it looks very good to me with one potential optimization I suggested below.

    I also did some mutation testing on the changes and discovered that the test in the first commit doesn’t seem to cover the change. Unless I made some error here myself, the test should probably be fixed before merging.

  69. in src/validation.cpp:4968 in 7a576ba9d2 outdated
    4951@@ -4909,7 +4952,14 @@ bool ChainstateManager::PopulateAndValidateSnapshot(
    4952 
    4953     // Fake various pieces of CBlockIndex state:
    4954     CBlockIndex* index = nullptr;
    4955-    for (int i = 0; i <= snapshot_chainstate.m_chain.Height(); ++i) {
    4956+
    4957+    // Don't make any modifications to the genesis block.
    4958+    // This is especially important because we don't want to erroneously
    4959+    // apply BLOCK_ASSUMED_VALID to genesis, which would happen if we didn't skip
    4960+    // it here (since it apparently isn't BLOCK_VALID_SCRIPTS).
    


    maflcko commented at 9:12 am on November 29, 2021:
    I don’t understand this comment. If genesis isn’t BLOCK_VALID_SCRIPTS, then BLOCK_ASSUMED_VALID wouldn’t be applied, so the workaround isn’t needed??

    jamesob commented at 4:02 pm on November 29, 2021:
    No, because genesis isn’t BLOCK_VALID_SCRIPTS we would apply BLOCK_ASSUMED_VALID unless we specifically skip it. The workaround is still desirable and necessary, I just bungled the unittest (see above).

    maflcko commented at 4:13 pm on November 29, 2021:

    Yes, and the code below is:

    0        if (!index->IsValid(BLOCK_VALID_SCRIPTS)) {
    1            // This flag will be removed once the block is fully validated by a
    2            // background chainstate.
    3            index->nStatus |= BLOCK_ASSUMED_VALID;
    4        }
    

    So how could it be executed if BLOCK_VALID_SCRIPTS isn’t set?


    jamesob commented at 4:55 pm on November 29, 2021:
    It is only executed when BLOCK_VALID_SCRIPTS isn’t set - are you by chance missing the ! in the conditional? Edit: changed an “is” to “isn’t” - the double negative got me!

    maflcko commented at 12:24 pm on November 30, 2021:
    Nvm, apparently I don’t understand how negation works.

    jamesob commented at 2:59 pm on November 30, 2021:
    This is an argument for including Python’s not operator in C++21. ;)

    ryanofsky commented at 4:45 pm on November 30, 2021:

    This is an argument for including Python’s not operator in C++21. ;)

    It’s actually already there, no need to wait https://en.cppreference.com/w/cpp/language/operator_alternative


    maflcko commented at 4:53 pm on November 30, 2021:

    Pretty sure that not may fail on some compilers (on windows?).

    In general I think if there is a way to avoid a negation, it might be preferable. In this case saying “(since it apparently is lacking BLOCK_VALID_SCRIPTS)” might work?

  70. jamesob force-pushed on Nov 29, 2021
  71. ryanofsky approved
  72. ryanofsky commented at 6:11 pm on November 29, 2021: contributor
    Code review ACK 709a09c9b6a902d5ed0e5ad6ac421b17c8921e6e. Only change since last review is test change moving genesis not assume-valid check after the successful CreateAndActivateUTXOSnapshot call instead of before, so it is actually meaningful.
  73. in src/validation.cpp:4974 in 9be75f2235 outdated
    4918+    constexpr int AFTER_GENESIS_START{1};
    4919+
    4920+    for (int i = AFTER_GENESIS_START; i <= snapshot_chainstate.m_chain.Height(); ++i) {
    4921         index = snapshot_chainstate.m_chain[i];
    4922 
    4923         // Fake nTx so that LoadBlockIndex() loads assumed-valid CBlockIndex
    


    maflcko commented at 5:03 pm on November 30, 2021:

    in the first commit (9be75f223577108bc573c2b04f2f8c67eefe88ef):

    As the genesis block is the only block which doesn’t have a previous block, it might be good to remove the dead code in this body now.

    index->pprev is always true now and can/should be removed. If you are worried about future logic bugs, you may use Assert(index->pprev)->nChainTx instead of index->pprev->nChainTx.


    jamesob commented at 6:49 pm on December 13, 2021:
    Fixed, thanks.
  74. in src/validation.cpp:3634 in 36e2bc4275 outdated
    3629+    for (const auto& [height, block] : vSortedByHeight) {
    3630+        if (block->IsAssumedValid()) {
    3631+            auto chainstates = chainman.GetAll();
    3632+
    3633+            // assert that the assumed valid blocks come from some assumed-valid chainstate.
    3634+            assert(any_of(chainstates.cbegin(), chainstates.cend(),
    


    maflcko commented at 5:24 pm on November 30, 2021:

    in commit 36e2bc4275fdb0d72390215fcea4d499f8049701:

    Can you explain why this compiles? I always assumed that standard library helpers are only accessible in the std:: namespace.


    jamesob commented at 6:49 pm on December 13, 2021:
    Fixed.
  75. in src/validation.cpp:3639 in 36e2bc4275 outdated
    3634+            assert(any_of(chainstates.cbegin(), chainstates.cend(),
    3635+                [](auto chainstate) { return chainstate->reliesOnAssumedValid(); }));
    3636+            // assert that the assumed valid blocks will be validated by some not assumed-valid
    3637+            // chainstate.
    3638+            assert(any_of(chainstates.cbegin(), chainstates.cend(),
    3639+                [](auto chainstate) { return !chainstate->reliesOnAssumedValid(); }));
    


    maflcko commented at 5:30 pm on November 30, 2021:

    in commit 36e2bc4275fdb0d72390215fcea4d499f8049701:

    Can you explain the asserts a bit better? Currently it looks like they are asserting that two chainstates exist. This condition should be independent of the loop and could be moved outside?


    jamesob commented at 6:06 pm on December 13, 2021:

    This condition should be independent of the loop and could be moved outside?

    I don’t see why you say that because (i) this condition should only be asserted if we encounter an assumed-valid block, which we need to loop through the index to find out, and (ii) the condition is only asserted once since we break immediately afterwards.


    jamesob commented at 6:49 pm on December 13, 2021:
    I’ve refactored this code and updated the commentary; hopefully you find it more intuitive now.

    maflcko commented at 11:05 am on December 14, 2021:

    With “condition” I mean “two chainstates exist”.

    Thus a loop of the form:

    0for b in blocks:
    1  if b.assumed:
    2    assert len(chains())==2
    

    Can be rewritten into:

    0have_two_chains = len(chains())==2
    1for b in blocks:
    2  if b.assumed:
    3    assert have_two_chains
    

    Obviously the performance should be the same here, but evaluating the condition outside the loop might be clearer. It would even allow to add additional checks. (Like asserting that no block is assumed if the number of chains is one?)

  76. in src/validation.cpp:3695 in 36e2bc4275 outdated
    3687+            // setBlockIndexCandidates set. This does mean that some blocks
    3688+            // which are not technically assumed-valid (later blocks on a fork
    3689+            // beginning before the first assumed-valid block) might not get
    3690+            // added to the the background chainstate, but this is ok, because
    3691+            // they will still be attached to the active chainstate if they
    3692+            // actually contain more work.
    


    maflcko commented at 6:02 pm on November 30, 2021:

    in commit 36e2bc4275fdb0d72390215fcea4d499f8049701:

    I don’t understand why it is safe to not attach them. If the assumed valid block is sufficiently old, it may cost little POW to create an alternative fork up to that height, and confusing the background chainstate that it can’t connect to the active chainstate?

    What is the reason for the if-condition in the first place? What would happen if both candidate sets are fully populated?


    jamesob commented at 5:52 pm on December 13, 2021:
    If both candidate sets are fully populated, when ActivateBestChain is called on the background validation chainstate, FindMostWorkChain() will return the assumed-valid block index entry as installed by snapshot activation, and it won’t think it has to enter IBD to download and fully validate the chain up to the snapshot base block. I’ll include this in the commentary above since people (myself included) seem to constantly forget why we need this conditional.

    jamesob commented at 6:48 pm on December 13, 2021:

    If the assumed valid block is sufficiently old, it may cost little POW to create an alternative fork up to that height, and confusing the background chainstate that it can’t connect to the active chainstate?

    I’m confused by this because the background chainstate shouldn’t be considering forks anyway; its purpose is to verify the assumed-valid blocks beneath the snapshot and it does not handle reorgs or forks.


    maflcko commented at 12:08 pm on December 14, 2021:

    Ok, I guess I am missing context on how the download logic will be modified for assumevalid.

    If the assumevalid chain is assumed to be always linear, then maybe a check that all assumed blocks are parents of the assumed tip block makes sense?

  77. maflcko commented at 6:04 pm on November 30, 2021: member

    Concept ACK 709a09c9b6a902d5ed0e5ad6ac421b17c8921e6e, did not look at the last commit (test) 🥇

    Signature:

     0-----BEGIN PGP SIGNED MESSAGE-----
     1Hash: SHA512
     2
     3Concept ACK 709a09c9b6a902d5ed0e5ad6ac421b17c8921e6e, did not look at the last commit (test) 🥇
     4-----BEGIN PGP SIGNATURE-----
     5
     6iQGzBAEBCgAdFiEE+rVPoUahrI9sLGYTzit1aX5ppUgFAlwqrYAACgkQzit1aX5p
     7pUggmQv+IHs5tQZR4a2VV+VvpfYa4GEdcIbiiJjRJqReLcICNClFReu5hpKfkKm4
     88VbU64iwXezNAeAzlJYUB7vWMKdGlHMMYX4V9fpJ6zwGkl38eL8Xq7kzHLUDH6iO
     9FfC32v0nywoz2FcnT5sFtAodnTl8kR01lQOCZSUvsvHnrUiYh6TGZt0Xlt1lz8Sh
    102/Orx7mEYlmpR9lDPLrU6DJscUm12r5hE/PQnLC0iJ+nmc/kpKGV6S25fR08E0Id
    11JfPUnpzetTdiJGgjx3Zi9fau0ROmadXYPRJw8LSLm5cWCTxAt32oo4OQnoQeILy2
    12pmp76uQGwEnnD4vytiPg4J2ZBQUIzYz+X3f1DhWJhbDplfAcR1pKHO/f1itCqUhu
    13CUw5IbT1Wu4ocMepisosddMuoIuYSbb3Z3QXizXZHnD6E7WWKa1oNherTzX3IaFM
    14RW+rKuEZ0n1Wa0Sv5NV8OxxPQYLDwPXopHHz1K9Mz4/DJV6eOE9Q5+l1RtkpH9mr
    150nul+aMW
    16=nvxK
    17-----END PGP SIGNATURE-----
    
  78. in src/validation.h:430 in 36e2bc4275 outdated
    426@@ -427,20 +427,16 @@ class BlockManager
    427 
    428     std::unique_ptr<CBlockTreeDB> m_block_tree_db GUARDED_BY(::cs_main);
    429 
    430-    bool LoadBlockIndexDB(std::set<CBlockIndex*, CBlockIndexWorkComparator>& setBlockIndexCandidates) EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
    431+    bool LoadBlockIndexDB(ChainstateManager& chainman) EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
    


    Sjors commented at 12:49 pm on December 6, 2021:
    36e2bc4275fdb0d72390215fcea4d499f8049701: is it possible to spit this out as a refactor commit? IIUC setBlockIndexCandidates here is always just ActiveChainstate().setBlockIndexCandidates.

    jamesob commented at 6:46 pm on December 13, 2021:
    It’s possible, but I’m not going to do it because it creates too much intermediate churn (updating code that is changed anyway in the subsequent commit).
  79. Sjors commented at 12:57 pm on December 6, 2021: member

    utACK for the first commit 9be75f223577108bc573c2b04f2f8c67eefe88ef

    One suggestion to make the second easier to review…

  80. validation: don't modify genesis during snapshot load
    Avoid modifying the genesis block index entry during snapshot load. This
    is because, in a future change that fixes LoadBlockIndex for UTXO
    snapshots, we detect block index entries that are reliant on
    assumed-valid ancestors and treat them specially.
    
    Since the genesis block doesn't have BLOCK_VALID_SCRIPTS, it would be
    erroneously marked BLOCK_ASSUMED_VALID during snapshot load if we didn't
    skip it here. This would cause a "setBlockIndexCandidates() empty"
    assertion to be tripped since all block index entries would be marked
    assume-valid due to genesis, which is never re-validated.
    
    There's probably no good reason to modify the genesis block index entry
    during snapshot load anyway...
    d0c6e61f5d
  81. validation: have LoadBlockIndex account for snapshot use
    Ensure that blocks past the snapshot base block (i.e. the end of the
    assumed-valid region of the chain) are not included in
    setBlockIndexCandidates for the background validation chainstate. These
    blocks, while fully validated and lacking the BLOCK_ASSUMED_VALID flag,
    *rely* on blocks which are assumed-valid, and so shouldn't be added to
    the IBD chainstate.
    
    Co-authored-by: Russ Yanofsky <russ@yanofsky.org>
    0fd599a51a
  82. test: add tests for LoadBlockIndex when using multiple chainstates
    Incorporates feedback from Russ Yanofsky.
    2283b9cd1e
  83. jamesob force-pushed on Dec 13, 2021
  84. in src/validation.cpp:4928 in d0c6e61f5d outdated
    4924@@ -4918,7 +4925,7 @@ bool ChainstateManager::PopulateAndValidateSnapshot(
    4925             index->nTx = 1;
    4926         }
    4927         // Fake nChainTx so that GuessVerificationProgress reports accurately
    4928-        index->nChainTx = index->pprev ? index->pprev->nChainTx + index->nTx : 1;
    4929+        index->nChainTx = index->pprev->nChainTx + index->nTx;
    


    Sjors commented at 5:41 am on December 14, 2021:
    d0c6e61f5dd3b6af818459d9d03b7ba316c5a3f7: I like @MarcoFalke’s suggestion of using Assert(index->pprev) instead.
  85. in src/test/validation_chainstatemanager_tests.cpp:360 in 2283b9cd1e
    355+    // Mark some region of the chain assumed-valid.
    356+    for (int i = 0; i <= cs1.m_chain.Height(); ++i) {
    357+        auto index = cs1.m_chain[i];
    358+
    359+        if (i < last_assumed_valid_idx && i >= assumed_valid_start_idx) {
    360+            index->nStatus = BlockStatus::BLOCK_VALID_TREE | BlockStatus::BLOCK_ASSUMED_VALID;
    


    Sjors commented at 6:28 am on December 14, 2021:

    Maybe add:

    0    BOOST_CHECK(index->IsAssumedValid());
    1} else {
    2    BOOST_CHECK(index->!IsAssumedValid());
    
  86. Sjors approved
  87. Sjors commented at 6:32 am on December 14, 2021: member
    utACK 2283b9cd1ee0fbd1e8ebc61673b1fe7596199a24
  88. maflcko commented at 12:04 pm on December 14, 2021: member

    Concept ACK 2283b9cd1ee0fbd1e8ebc61673b1fe7596199a24 🤽

    Signature:

     0-----BEGIN PGP SIGNED MESSAGE-----
     1Hash: SHA512
     2
     3Concept ACK 2283b9cd1ee0fbd1e8ebc61673b1fe7596199a24 🤽
     4-----BEGIN PGP SIGNATURE-----
     5
     6iQGzBAEBCgAdFiEE+rVPoUahrI9sLGYTzit1aX5ppUgFAlwqrYAACgkQzit1aX5p
     7pUjnlQwAmzp6vV9GqCu06YyivJ9zxvJ3Wsv+L8FxehFxMsVm3GTST2sM0bSfPL1M
     86p9eBqXz3JgX6qwPaBsH8U5neWw75LkvpVMPMwmXQairRM3Odr9aGIyZJ5bOtCvv
     9OIMgrPsRz+bVMu1xGKcwCc3MM92LYRluGBAaGVq8qQTXSxZUfQWAoemqC+Ubhaiq
    10XXGJ9MPV02NOWD7ZK2qMTgNFsbPYM1zkVKX6r3kXVi0/jOBexZDrECJIcFvQZ/Ol
    11wqfv338xZnBYq3EnGkZVAiReSVJhpz7O+oMS/yKoHrtYhV07VRY15wxbZBXWaLgU
    12NlLqK5jZO49f2G5uaP2DgQ3CncMNbXEpG1lqJNxZy0bKUuuuio2GcgqJEWnSd7BN
    13f38nMQtQ80T9yUpfMW4/LWjj68XKROGbn9xLswIH654t+7DxHq63FzLq6jMFFto/
    149K18qF1bEBwBpSX2sM9RxoTXT16W13sKxQnAjowcrA8HdAS5WXpVNGknyq01qEUE
    15bPxW9jNk
    16=FgfJ
    17-----END PGP SIGNATURE-----
    
  89. laanwj commented at 9:56 am on December 15, 2021: member
    It looks like this PR has had a good review cycle and there are no critical issues left. I think we should move toward a final round of ACKs and merge it. Smaller suggestions and nits that come up now can be addressed later.
  90. maflcko commented at 10:02 am on December 15, 2021: member
    Ok, I’ll open a follow-up next week or so, unless someone beats me to it.
  91. maflcko merged this on Dec 15, 2021
  92. maflcko closed this on Dec 15, 2021

  93. sidhujag referenced this in commit b32013c08d on Dec 15, 2021
  94. Fabcien referenced this in commit be0f1881ef on Oct 19, 2022
  95. delta1 referenced this in commit 6179a5a04e on Jun 13, 2023
  96. bitcoin locked this on May 22, 2024

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-01-21 09:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me