assumeutxo: Get rid of faked nTx and nChainTx values #29370

pull ryanofsky wants to merge 8 commits into bitcoin:master from ryanofsky:pr/nofake changing 8 files +231 −156
  1. ryanofsky commented at 4:18 pm on February 2, 2024: contributor

    The PopulateAndValidateSnapshot function introduced in f6e2da5fb7c6406c37612c838c998078ea8d2252 from #19806 has been setting fake nTx and nChainTx values that can show up in RPC results (https://github.com/bitcoin/bitcoin/issues/29328) and make CBlockIndex state hard to reason about, because it is difficult to know whether the values are real or fake.

    Revert to previous behavior of setting nTx and nChainTx to 0 when the values are unknown, instead of faking them. Also drop no-longer needed BLOCK_ASSUMED_VALID flag.

    Dropping the faked values also fixes assert failures in the CheckBlockIndex (pindex->nChainTx == pindex->nTx + prev_chain_tx) check that could happen previously if forked or out-of-order blocks before the snapshot got submitted while the snapshot was being validated. The PR includes two commits adding tests for these failures and describing them in detail.

    Compatibility note: This change could cause new -checkblockindex failures if a snapshot was loaded by a previous version of Bitcoin Core and not fully validated, because fake nTx values will have been saved to the block index. It would be pretty easy to avoid these failures by adding some compatibility code to LoadBlockIndex and changing nTx values from 1 to 0 when they are fake (when (pindex->nStatus & BLOCK_VALID_MASK) < BLOCK_VALID_TRANSACTIONS), but a little simpler not to worry about being compatible in this case.

  2. DrahtBot commented at 4:18 pm on February 2, 2024: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage

    For detailed information about the code coverage, see the test coverage report.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK Sjors, mzumsande, maflcko, achow101
    Concept ACK fjahr

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #29656 (chainparams: Change nChainTx type to uint64_t by fjahr)
    • #29617 (test: Validate UTXO snapshot with coin height > base height & amount > MAX_MONEY supply by jrakibi)
    • #29553 (assumeutxo: Add dumptxoutset height param, remove shell scripts by fjahr)
    • #28339 (validation: improve performance of CheckBlockIndex by mzumsande)
    • #27006 (reduce cs_main scope, guard block index ’nFile’ under a local mutex by furszy)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. maflcko commented at 4:47 pm on February 2, 2024: member

    Backwards compatibility code in LoadBlockIndex to override nTx == 1 values set by previous code when they are fake (when (pindex->nStatus & BLOCK_VALID_MASK) < BLOCK_VALID_TRANSACTIONS)

    Not sure. This can only happen on test-chains, so I think it would be cleaner to just require the developers, if there are any, to finish their background sync, or to wipe their chainstate, or the blocksdir, or the whole test data dir.

  4. DrahtBot added the label CI failed on Feb 2, 2024
  5. DrahtBot commented at 5:16 pm on February 2, 2024: contributor

    🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the documentation.

    Possibly this is due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    Leave a comment here, if you need help tracking down a confusing failure.

    Debug: https://github.com/bitcoin/bitcoin/runs/21160915955

  6. ryanofsky force-pushed on Feb 2, 2024
  7. ryanofsky commented at 6:44 pm on February 2, 2024: contributor

    Not sure. This can only happen on test-chains, so I think it would be cleaner to just require the developers, if there are any, to finish their background sync, or to wipe their chainstate, or the blocksdir, or the whole test data dir.

    It makes sense that having backwards compatibility may not be worth it, but I don’t see why test chains would set nTx to a real value without also setting BLOCK_VALID_TRANSACTIONS. It seems like we should be able to use BLOCK_VALID_TRANSACTIONS to know if nTx is real or fake, as long as we are checking for BLOCK_VALID_TRANSACTIONS, directly, and not using the IsValid() helper which is also influenced by BLOCK_FAILED flags.


    Updated 4c70d993e3ac0e9b7923eded51826fe1faa2d004 -> a3228f02f69e28960f7da76054965c916e781f27 (pr/nofake.1 -> pr/nofake.2, compare) to fix broken validation_block_tests due to a buggy check, also making some minor cleanups

  8. DrahtBot removed the label CI failed on Feb 2, 2024
  9. in src/chain.h:185 in a3228f02f6 outdated
    178     //! @sa ActivateSnapshot
    179     unsigned int nTx{0};
    180 
    181     //! (memory only) Number of transactions in the chain up to and including this block.
    182-    //! This value will be non-zero only if and only if transactions for this block and all its parents are available.
    183-    //! Change to 64-bit type before 2024 (assuming worst case of 60 byte transactions).
    


    maflcko commented at 10:39 am on February 5, 2024:
    Any reason to touch this LOC?

    ryanofsky commented at 4:40 pm on February 5, 2024:

    re: #29370 (review)

    Any reason to touch this LOC?

    No, will restore. I only meant to update the comments around it.


    ryanofsky commented at 9:46 pm on February 5, 2024:

    re: #29370 (review)

    Thanks, restored this line

  10. in src/test/validation_chainstatemanager_tests.cpp:472 in a3228f02f6 outdated
    464@@ -465,6 +465,12 @@ BOOST_FIXTURE_TEST_CASE(chainstatemanager_loadblockindex, TestChain100Setup)
    465         // Blocks with heights in range [91, 110] are marked ASSUMED_VALID
    466         if (i < last_assumed_valid_idx && i >= assumed_valid_start_idx) {
    467             index->nStatus = BlockStatus::BLOCK_VALID_TREE | BlockStatus::BLOCK_ASSUMED_VALID;
    468+            index->nTx = 0;
    469+            // Keep nChainTx for the snapshot base since it is part of the
    470+            // snapshot metadata. Reset it for other blocks this is simulating
    471+            // not having been downloaded yet.
    472+            if (i < last_assumed_valid_idx == 1) index->nChainTx = 0;
    


    maflcko commented at 10:51 am on February 5, 2024:
    0            index->nChainTx = 0;
    

    isn’t this always true? Maybe you meant to type - instead of <?

    In any case, the test seems to be passing either way, so this is “dead” code, similar to a code comment.


    ryanofsky commented at 4:47 pm on February 5, 2024:

    re: #29370 (review)

    isn’t this always true? Maybe you meant to type - instead of <?

    In any case, the test seems to be passing either way, so this is “dead” code, similar to a code comment.

    I think it was supposed to be i < last_assumed_valid_idx - 1, to preserve nChainTx in the snapshot block, and only set nChainTx to 0 before the snapshot block, so the test matches what happens in reality. But it does seem like this is not necessary so will simplify.


    ryanofsky commented at 7:12 pm on February 5, 2024:

    re: #29370 (review)

    Thanks, removed this

  11. in src/test/util/chainstate.h:103 in a3228f02f6 outdated
     99-                // see CheckBlockIndex().
    100-                pindex->nStatus |= BLOCK_ASSUMED_VALID;
    101+                pindex->nStatus = BlockStatus::BLOCK_VALID_TREE;
    102+                pindex->nTx = 0;
    103+                pindex->nChainTx = 0;
    104+                pindex->nSequenceId = 0;
    


    maflcko commented at 10:53 am on February 5, 2024:
    Any reason to change the sequence id? I think this isn’t changed in loadtxoutset either?

    ryanofsky commented at 4:43 pm on February 5, 2024:

    re: #29370 (review)

    Any reason to change the sequence id? I think this isn’t changed in loadtxoutset either?

    I’ll double check and add a comment. I think this might be needed to satisfy an assert that ensures nChainTx is set if nSequenceId is set, since these are both set when a block is received and all blocks before it have also been received.


    ryanofsky commented at 7:03 pm on February 5, 2024:

    re: #29370 (review)

    Added a comment. It is necessary to set nSequenceId to avoid an assert failure in CheckBlockIndex. I think it’s also a reasonable thing to do to make the block look like it was never received.

  12. in src/test/validation_chainstatemanager_tests.cpp:473 in a3228f02f6 outdated
    464@@ -465,6 +465,12 @@ BOOST_FIXTURE_TEST_CASE(chainstatemanager_loadblockindex, TestChain100Setup)
    465         // Blocks with heights in range [91, 110] are marked ASSUMED_VALID
    466         if (i < last_assumed_valid_idx && i >= assumed_valid_start_idx) {
    467             index->nStatus = BlockStatus::BLOCK_VALID_TREE | BlockStatus::BLOCK_ASSUMED_VALID;
    468+            index->nTx = 0;
    469+            // Keep nChainTx for the snapshot base since it is part of the
    470+            // snapshot metadata. Reset it for other blocks this is simulating
    471+            // not having been downloaded yet.
    472+            if (i < last_assumed_valid_idx == 1) index->nChainTx = 0;
    473+            index->nSequenceId = 0;
    


    maflcko commented at 10:53 am on February 5, 2024:
    Same (sequence id)

    ryanofsky commented at 7:12 pm on February 5, 2024:

    re: #29370 (review)

    Same (sequence id)

    Thanks, removed this

  13. in src/validation.cpp:4754 in a3228f02f6 outdated
    4581-            // block will be a candidate for the tip, but it may not be
    4582-            // VALID_TRANSACTIONS (eg if we haven't yet downloaded the block),
    4583-            // so we special-case the snapshot block as a potential candidate
    4584-            // here.
    4585-            if (pindex == GetSnapshotBaseBlock() ||
    4586-                    (pindex->IsValid(BLOCK_VALID_TRANSACTIONS) &&
    


    maflcko commented at 11:00 am on February 5, 2024:
    Is removing the validity check intended?

    ryanofsky commented at 4:59 pm on February 5, 2024:

    re: #29370 (review)

    Is removing the validity check intended?

    It was intended, but on second thought it is probably better to add back. The reason for removing it is that the block reaching VALID_TRANSACTIONS level is not really significant, because what is actually needed is for the block and all its ancestors back to the snapshot block or genesis block to reach VALID_TRANACTIONS level, and HaveNumChainTxs() is the direct and efficient way to check for that.

    But it should still be useful to call the IsValid() function because it additionally checks for BLOCK_FAILED flags. So it would be good to add back to avoid changing behavior, and be more efficient if there are invalid blocks.


    ryanofsky commented at 7:23 pm on February 5, 2024:

    re: #29370 (review)

    Thanks, reverted this

  14. in src/validation.cpp:4907 in a3228f02f6 outdated
    4924+        assert(((pindex->nStatus & BLOCK_VALID_MASK) >= BLOCK_VALID_TRANSACTIONS) == (pindex->nTx > 0)); // This is pruning-independent.
    4925         // All parents having had data (at some point) is equivalent to all parents being VALID_TRANSACTIONS, which is equivalent to HaveNumChainTxs().
    4926-        assert((pindexFirstNeverProcessed == nullptr) == pindex->HaveNumChainTxs());
    4927-        assert((pindexFirstNotTransactionsValid == nullptr) == pindex->HaveNumChainTxs());
    4928+        // HaveNumChainTxs will also be set in the assumeutxo snapshot block from snapshot metadata.
    4929+        const bool snap_block = pindex == GetSnapshotBaseBlock();
    


    maflcko commented at 11:11 am on February 5, 2024:
    Can this be further limited to only mark the snap_block if it has ASSUMED_VALID set? Otherwise, nodes which never used loadtxoutset will assume that some block is a “snapshot block”, even though they never used that feature.

    ryanofsky commented at 5:17 pm on February 5, 2024:

    re: #29370 (review)

    Can this be further limited to only mark the snap_block if it has ASSUMED_VALID set? Otherwise, nodes which never used loadtxoutset will assume that some block is a “snapshot block”, even though they never used that feature.

    I’m pretty sure I can do that, but it should not change anything. If loadtxoutset is not used, GetSnapshotBaseBlock will return null and snap_block should already be false. The only case where the suggestion would change the checks performed (when GetSnapshotBaseBlock() == pindex && !pindex->IsAssumedValid()) is the case where background chain has been downloaded and fully validated, and the snapshot block loses it’s assumed_valid flag after it become validated. In that case, none of the snap_block conditions will ever be reached. Whether a snapshot block is still considered a snapshot block after the snapshot is validated is just a semantic question.

  15. maflcko approved
  16. maflcko commented at 11:12 am on February 5, 2024: member
    lgtm
  17. in src/validation.cpp:5097 in a3228f02f6 outdated
    4911         if (pindex->nStatus & BLOCK_HAVE_UNDO) assert(pindex->nStatus & BLOCK_HAVE_DATA);
    4912-        if (pindex->IsAssumedValid()) {
    4913-            // Assumed-valid blocks should have some nTx value.
    4914-            assert(pindex->nTx > 0);
    4915-            // Assumed-valid blocks should connect to the main chain.
    4916-            assert((pindex->nStatus & BLOCK_VALID_MASK) >= BLOCK_VALID_TREE);
    


    maflcko commented at 11:14 am on February 5, 2024:
    Any reason to remove this check?

    ryanofsky commented at 5:08 pm on February 5, 2024:

    re: #29370 (review)

    Any reason to remove this check?

    No, good catch, I’ll add it back. I was thinking it wasn’t really related to the other checks here, and trying to eliminate unnecessary uses of BLOCK_ASSUMED_VALID / IsAssumedValid in general. But as long as we have the assumed valid flag, might as well check this.


    ryanofsky commented at 9:51 pm on February 5, 2024:

    re: #29370 (review)

    Restored this check

  18. maflcko commented at 3:00 pm on February 5, 2024: member

    Also, it would be good to include the test to ensure the crash no longer happens.

    #29261 (comment)

  19. ryanofsky commented at 5:24 pm on February 5, 2024: contributor
    Thanks for the close review! Will work on your suggestions
  20. ryanofsky force-pushed on Feb 5, 2024
  21. ryanofsky commented at 10:29 pm on February 5, 2024: contributor
    Updated a3228f02f69e28960f7da76054965c916e781f27 -> 3bdbc3fc7f116856acc7d955f591110b96f5ebd2 (pr/nofake.2 -> pr/nofake.3, compare) implementing all suggestions, adding test coverage for the RPCs, and adding a new commit that drops the no longer needed BLOCK_ASSUMED_VALID flag. Updated 3bdbc3fc7f116856acc7d955f591110b96f5ebd2 -> 603a92c9881ed78e29f15c48121ce8bf35d30837 (pr/nofake.3 -> pr/nofake.4, compare) removing more BLOCK_ASSUMED_VALID references
  22. ryanofsky force-pushed on Feb 5, 2024
  23. DrahtBot added the label CI failed on Feb 5, 2024
  24. in src/validation.cpp:5709 in 59a8ca9333 outdated
    5535@@ -5531,14 +5536,6 @@ bool ChainstateManager::PopulateAndValidateSnapshot(
    5536     for (int i = AFTER_GENESIS_START; i <= snapshot_chainstate.m_chain.Height(); ++i) {
    5537         index = snapshot_chainstate.m_chain[i];
    5538 
    5539-        // Fake nTx so that LoadBlockIndex() loads assumed-valid CBlockIndex
    5540-        // entries (among other things)
    5541-        if (!index->nTx) {
    5542-            index->nTx = 1;
    5543-        }
    5544-        // Fake nChainTx so that GuessVerificationProgress reports accurately
    


    mzumsande commented at 3:07 am on February 6, 2024:
    Would GuessVerificationProgress lose accuracy with the removal or was the comment wrong?

    maflcko commented at 9:01 am on February 6, 2024:

    I think the comment was wrong, or do you have a code path where nChainTx is read when it was wrong (previously) or is unset (now)?

    Generally, GuessVerificationProgress is only called to estimate the IBD progress at the tip, in which case nChainTx is set to the correct value. It is also used to guess rescan progress, which is not possible for blocks outside the active chain or blocks that have data missing, so nChainTx should be correct there as well.

    unrelated to this pull: Maybe an Assume(nchaintx) can be added to GuessVerificationProgress?


    mzumsande commented at 7:27 pm on February 6, 2024:

    do you have a code path where nChainTx is read when it was wrong (previously) or is unset (now)

    no, I never understood how the fake nChainTx values would help with guessing progress, but also wasn’t 100% sure.


    ryanofsky commented at 4:35 pm on February 8, 2024:

    re: #29370 (review)

    unrelated to this pull: Maybe an Assume(nchaintx) can be added to GuessVerificationProgress?

    I went ahead and added this in a separate commit

  25. in test/functional/feature_assumeutxo.py:245 in 603a92c988 outdated
    223+
    224+        # nChainTx of all blocks after START_HEIGHT and before
    225+        # SNAPSHOT_BASE_HEIGHT should be 0. Confirm expected difference at
    226+        # beginning and end of this window.
    227+        stats = n1.getchaintxstats(nblocks=SNAPSHOT_BASE_HEIGHT-START_HEIGHT-2, blockhash=snapshot_prev_hash)
    228+        assert_equal(stats["window_tx_count"], 0)
    


    maflcko commented at 9:02 am on February 6, 2024:
    unrelated to this pull: May be better to throw here, instead of returning wrong data, but this can be done in a follow-up.

    maflcko commented at 2:17 pm on February 19, 2024:

    Would need to add a suppression for now?

     0 node1 stderr rpc/blockchain.cpp:1658:42: runtime error: unsigned integer overflow: 0 - 200 cannot be represented in type 'unsigned int'
     1    [#0](/bitcoin-bitcoin/0/) 0x5622afcfe34c in getchaintxstats()::$_0::operator()(RPCHelpMan const&, JSONRPCRequest const&) const src/rpc/blockchain.cpp:1658:42
     2    [#1](/bitcoin-bitcoin/1/) 0x5622afcfe34c in UniValue std::__invoke_impl<UniValue, getchaintxstats()::$_0&, RPCHelpMan const&, JSONRPCRequest const&>(std::__invoke_other, getchaintxstats()::$_0&, RPCHelpMan const&, JSONRPCRequest const&) /usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/invoke.h:61:14
     3    [#2](/bitcoin-bitcoin/2/) 0x5622afcfe34c in std::enable_if<is_invocable_r_v<UniValue, getchaintxstats()::$_0&, RPCHelpMan const&, JSONRPCRequest const&>, UniValue>::type std::__invoke_r<UniValue, getchaintxstats()::$_0&, RPCHelpMan const&, JSONRPCRequest const&>(getchaintxstats()::$_0&, RPCHelpMan const&, JSONRPCRequest const&) /usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/invoke.h:114:9
     4    [#3](/bitcoin-bitcoin/3/) 0x5622afcfe34c in std::_Function_handler<UniValue (RPCHelpMan const&, JSONRPCRequest const&), getchaintxstats()::$_0>::_M_invoke(std::_Any_data const&, RPCHelpMan const&, JSONRPCRequest const&) /usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/std_function.h:290:9
     5    [#4](/bitcoin-bitcoin/4/) 0x5622b0be25a6 in std::function<UniValue (RPCHelpMan const&, JSONRPCRequest const&)>::operator()(RPCHelpMan const&, JSONRPCRequest const&) const /usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/std_function.h:591:9
     6    [#5](/bitcoin-bitcoin/5/) 0x5622b0be25a6 in RPCHelpMan::HandleRequest(JSONRPCRequest const&) const src/rpc/util.cpp:620:20
     7    [#6](/bitcoin-bitcoin/6/) 0x5622afd5574a in CRPCCommand::CRPCCommand(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>, RPCHelpMan (*)())::'lambda'(JSONRPCRequest const&, UniValue&, bool)::operator()(JSONRPCRequest const&, UniValue&, bool) const src/./rpc/server.h:107:91
     8    [#7](/bitcoin-bitcoin/7/) 0x5622affb821e in std::function<bool (JSONRPCRequest const&, UniValue&, bool)>::operator()(JSONRPCRequest const&, UniValue&, bool) const /usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/std_function.h:591:9
     9    [#8](/bitcoin-bitcoin/8/) 0x5622affb821e in ExecuteCommand(CRPCCommand const&, JSONRPCRequest const&, UniValue&, bool) src/rpc/server.cpp:537:20
    10    [#9](/bitcoin-bitcoin/9/) 0x5622affb821e in ExecuteCommands(std::vector<CRPCCommand const*, std::allocator<CRPCCommand const*>> const&, JSONRPCRequest const&, UniValue&) src/rpc/server.cpp:504:13
    11    [#10](/bitcoin-bitcoin/10/) 0x5622affb4fa1 in CRPCTable::execute(JSONRPCRequest const&) const src/rpc/server.cpp:524:13
    12    [#11](/bitcoin-bitcoin/11/) 0x5622b02e0d82 in HTTPReq_JSONRPC(std::any const&, HTTPRequest*) src/httprpc.cpp:203:40
    13    [#12](/bitcoin-bitcoin/12/) 0x5622b030a5ad in std::function<bool (HTTPRequest*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&)>::operator()(HTTPRequest*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>> const&) const /usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/std_function.h:591:9
    14    [#13](/bitcoin-bitcoin/13/) 0x5622b030a5ad in HTTPWorkItem::operator()() src/httpserver.cpp:60:9
    15    [#14](/bitcoin-bitcoin/14/) 0x5622b03102b6 in WorkQueue<HTTPClosure>::Run() src/httpserver.cpp:115:13
    16    [#15](/bitcoin-bitcoin/15/) 0x5622b02f939c in HTTPWorkQueueRun(WorkQueue<HTTPClosure>*, int) src/httpserver.cpp:404:12
    17    [#16](/bitcoin-bitcoin/16/) 0x7f71dd506b83  (/lib/x86_64-linux-gnu/libstdc++.so.6+0xeab83) (BuildId: befd6a93d352d7d294804769783f60c9cc0cb5f5)
    18    [#17](/bitcoin-bitcoin/17/) 0x5622af76219e in asan_thread_start(void*) asan_interceptors.cpp.o
    19    [#18](/bitcoin-bitcoin/18/) 0x7f71dd17aac9  (/lib/x86_64-linux-gnu/libc.so.6+0x97ac9) (BuildId: f0b834daa3d05a80967e9ec2f990a1ea71c958fa)
    20    [#19](/bitcoin-bitcoin/19/) 0x7f71dd20b47b  (/lib/x86_64-linux-gnu/libc.so.6+0x12847b) (BuildId: f0b834daa3d05a80967e9ec2f990a1ea71c958fa)
    21SUMMARY: UndefinedBehaviorSanitizer: unsigned-integer-overflow rpc/blockchain.cpp:1658:42 in 
    

    https://cirrus-ci.com/task/5549867297144832?logs=ci#L2930

  26. maflcko commented at 9:04 am on February 6, 2024: member
    Needs rebase
  27. ryanofsky force-pushed on Feb 7, 2024
  28. ryanofsky commented at 1:07 am on February 7, 2024: contributor

    Needs rebase

    Yes, thanks for recognizing the CI failures.

    Rebased 603a92c9881ed78e29f15c48121ce8bf35d30837 -> 3f0b8607f549b27e9bb0cc8b189252a5cd954a36 (pr/nofake.4 -> pr/nofake.5, compare) with test fixes needed after silent conflict with #29354

  29. DrahtBot removed the label CI failed on Feb 7, 2024
  30. maflcko commented at 8:25 am on February 7, 2024: member
    Could include the test #29370 (comment) , or did you want to leave that for later?
  31. ryanofsky force-pushed on Feb 7, 2024
  32. ryanofsky commented at 7:19 pm on February 7, 2024: contributor

    re: #29370 (comment)

    Could include the test #29370 (comment) , or did you want to leave that for later?

    I had just forgotten about this. Added the test now, and fixed another bug uncovered by the test (forgetting to reset pindexFirstNeverProcessed after moving upwards from the snapshot block).

    Updated 3f0b8607f549b27e9bb0cc8b189252a5cd954a36 -> 2f4e7e8dc82adb60a339211cdc15408b5a6919ce (pr/nofake.5 -> pr/nofake.6, compare) adding maflcko’s test

  33. ryanofsky marked this as ready for review on Feb 7, 2024
  34. ryanofsky referenced this in commit 89d95bf13b on Feb 8, 2024
  35. ryanofsky force-pushed on Feb 8, 2024
  36. ryanofsky commented at 4:48 pm on February 8, 2024: contributor
    Updated 2f4e7e8dc82adb60a339211cdc15408b5a6919ce -> c9ca4ca4d970f9d42d0254775c04d114bd2152c3 (pr/nofake.6 -> pr/nofake.7, compare) added suggested unset nChainTx check in GuessVerificationProgress
  37. in test/functional/feature_assumeutxo.py:223 in d2bbe427d3 outdated
    218+        assert_equal(n1.getblockheader(start_hash)["nTx"], 1)
    219+        assert_equal(n1.getblockheader(start_next_hash)["nTx"], 1)
    220+        assert_equal(n1.getblockheader(snapshot_prev_hash)["nTx"], 1)
    221+        assert_equal(n1.getblockheader(snapshot_hash)["nTx"], 1)
    222+
    223+        # After loading the snapshot, check nChainTx values indirectly, by using the
    


    mzumsande commented at 9:42 pm on February 8, 2024:
    Why check these values indirectly instead of directly (field txcount)?

    ryanofsky commented at 7:51 pm on February 9, 2024:

    Why check these values indirectly instead of directly (field txcount)?

    Wow, I did not notice that field 🤦. That will make the test a lot better.


    ryanofsky commented at 3:46 pm on February 15, 2024:

    re: #29370 (review)

    Why check these values indirectly instead of directly (field txcount)?

    Thanks, checking these values directly now

  38. in src/chain.h:183 in 207a171769 outdated
    176@@ -173,21 +177,19 @@ class CBlockIndex
    177     //! (memory only) Total amount of work (expected number of hashes) in the chain up to and including this block
    178     arith_uint256 nChainWork{};
    179 
    180-    //! Number of transactions in this block.
    181+    //! Number of transactions in this block. This will be nonzero if the block
    182+    //! reached the VALID_TRANSACTIONS level, and zero otherwise.
    183     //! Note: in a potential headers-first mode, this number cannot be relied upon
    184     //! Note: this value is faked during UTXO snapshot load to ensure that
    


    mzumsande commented at 10:40 pm on February 8, 2024:
    comment can be removed, it’s no longer faked.

    ryanofsky commented at 3:47 pm on February 15, 2024:

    re: #29370 (review)

    comment can be removed, it’s no longer faked.

    Good catch, dropped this comment.

  39. in src/chain.h:131 in c9ca4ca4d9 outdated
    141-     *
    142-     * This flag is only used to implement checks in CheckBlockIndex() and
    143-     * should not be used elsewhere.
    144-     */
    145-    BLOCK_ASSUMED_VALID      =   256,
    146+    BLOCK_STATUS_RESERVED    =   256, //!< Unused flag that was previously set on assumeutxo snapshot blocks and their
    


    mzumsande commented at 7:07 pm on February 9, 2024:
    Is it necessary to keep a placeholder / would just removing it break something? I’m asking beacuse as far as I can see, the enum BlockStatus is only used in IsValid / RaiseValidity, but not for nStatus, which is a uint32_t.

    ryanofsky commented at 3:46 pm on February 15, 2024:

    re: #29370 (review)

    Is it necessary to keep a placeholder / would just removing it break something? I’m asking beacuse as far as I can see, the enum BlockStatus is only used in IsValid / RaiseValidity, but not for nStatus, which is a uint32_t.

    This is mostly just for documentation. I think it’s useful to keep a note that the field was previously used and may be saved to disk. Probably though it would not break anything if the field were reused in the future, so it should be possible to reclaim this if we were running out of flags.


    ryanofsky commented at 4:28 pm on February 15, 2024:

    re: #29370 (review)

    Is it necessary to keep a placeholder / would just removing it break something? I’m asking beacuse as far as I can see, the enum BlockStatus is only used in IsValid / RaiseValidity, but not for nStatus, which is a uint32_t.

    If this asking whether dropping the placeholder would change the underlying enum type, I think it wouldn’t because it it’s declared 32-bit unsigned explicitly (enum BlockStatus : uint32_t)


    Sjors commented at 4:06 pm on March 11, 2024:
    If we keep the comment, perhaps add that this has never been set for any (released or master branch) mainnet node. So it can be repurposed, especially if at some point in the future both signet and testnet have been reset.

    ryanofsky commented at 5:53 pm on March 11, 2024:

    re: #29370 (review)

    If we keep the comment, perhaps add that this has never been set for any (released or master branch) mainnet node. So it can be repurposed, especially if at some point in the future both signet and testnet have been reset.

    Would prefer not to do this, since part of the point of the comment is to discourage reusing the flag. I think just the fact that flag isn’t used, should be enough clue that it could be used if necessary.

  40. in doc/design/assumeutxo.md:54 in c9ca4ca4d9 outdated
    50@@ -51,12 +51,6 @@ The utility script
    51 
    52 ## Design notes
    53 
    54-- A new block index `nStatus` flag is introduced, `BLOCK_ASSUMED_VALID`, to mark block
    


    mzumsande commented at 7:12 pm on February 9, 2024:
    The next paragraph refers to this deleted comment (“using the nStatus flag mentioned above”) and should be updated too.

    ryanofsky commented at 3:46 pm on February 15, 2024:

    re: #29370 (review)

    The next paragraph refers to this deleted comment (“using the nStatus flag mentioned above”) and should be updated too.

    Thanks, fixed

  41. mzumsande commented at 7:13 pm on February 9, 2024: contributor
    Concept ACK This looks like a nice simplification.
  42. ryanofsky referenced this in commit 67dd1b146d on Feb 15, 2024
  43. ryanofsky force-pushed on Feb 15, 2024
  44. ryanofsky commented at 4:15 pm on February 15, 2024: contributor

    Thanks for the review!

    Updated c9ca4ca4d970f9d42d0254775c04d114bd2152c3 -> 8ab4e07a1a54c9f4818a557fd1182bbb59e1cf9b (pr/nofake.7 -> pr/nofake.8, compare) with suggestions

  45. DrahtBot added the label CI failed on Feb 15, 2024
  46. in test/functional/feature_assumeutxo.py:238 in 50273702d7 outdated
    242-        # between these should be fake values set by snapshot loading code.
    243+        # between these should be unset.
    244         assert_equal(n1.getchaintxstats(nblocks=1, blockhash=start_hash)["txcount"], START_HEIGHT + 1)
    245-        assert_equal(n1.getchaintxstats(nblocks=1, blockhash=start_next_hash)["txcount"], START_HEIGHT + 2)
    246-        assert_equal(n1.getchaintxstats(nblocks=1, blockhash=snapshot_prev_hash)["txcount"], SNAPSHOT_BASE_HEIGHT)
    247+        assert_equal(n1.getchaintxstats(nblocks=1, blockhash=start_next_hash)["txcount"], 0)
    


    fjahr commented at 3:07 pm on February 20, 2024:
  47. fjahr commented at 3:16 pm on February 20, 2024: contributor
    Concept ACK, looks good at first sight, will do a deeper review when it’s clear if the CI failure is an issue. I couldn’t reproduce it locally so far…
  48. maflcko commented at 3:20 pm on February 20, 2024: member
    To reproduce locally, you’ll have to compile with the integer sanitizer (not to be confused by the undefined sanitizer)
  49. ryanofsky commented at 3:24 pm on February 20, 2024: contributor

    CI failure is just a preexisting bug in a line of code that doesn’t affect anything:

    https://github.com/bitcoin/bitcoin/blob/45b2a91897ca8f4a3e0c1adcfb30cf570971da4e/src/rpc/blockchain.cpp#L1658

    Two nChainTx values in the getchaintxstats are subtracted and this triggers the undefined behavior sanitizer because they are unsigned and the result is negative. getchaintxstats has a number of edge case problems like this that should be cleaned up but don’t really affect anything. I’ll probably just add a ubsan suppression in this PR.

  50. ryanofsky commented at 3:24 pm on February 20, 2024: contributor

    To reproduce locally, you’ll have to compile with the integer sanitizer (not to be confused by the undefined sanitizer)

    Thanks! I was just trying to reproduce this locally using the undefined sanitizer.

  51. ryanofsky referenced this in commit cecf618b68 on Feb 20, 2024
  52. ryanofsky force-pushed on Feb 20, 2024
  53. ryanofsky commented at 7:50 pm on February 20, 2024: contributor
    Updated 8ab4e07a1a54c9f4818a557fd1182bbb59e1cf9b -> 8afcd99435b693f69b8c3a918b0573ef12559b22 (pr/nofake.8 -> pr/nofake.9, compare) adding suppression to avoid getchaintxstats ubsan integer overflow errors exposed by new test coverage
  54. DrahtBot removed the label CI failed on Feb 20, 2024
  55. in test/functional/feature_assumeutxo.py:220 in 32481c30a5 outdated
    215+        start_next_hash = n1.getblockhash(height=START_HEIGHT+1)
    216+        snapshot_prev_hash = n1.getblockhash(height=SNAPSHOT_BASE_HEIGHT-1)
    217+        snapshot_hash = n1.getblockhash(height=SNAPSHOT_BASE_HEIGHT)
    218+
    219+        # nTx of the starting block should be 1 because it actually has one
    220+        # transaction. nTx of later blocks should be fake 1 values set by
    


    maflcko commented at 10:33 am on February 22, 2024:

    nit in the first commit: Why claim they are fake when one can test which ones are fake?

     0diff --git a/test/functional/feature_assumeutxo.py b/test/functional/feature_assumeutxo.py
     1index 49cd2d8ec1..1f0d4bb309 100755
     2--- a/test/functional/feature_assumeutxo.py
     3+++ b/test/functional/feature_assumeutxo.py
     4@@ -309,6 +309,17 @@ class AssumeutxoTest(BitcoinTestFramework):
     5         }
     6         self.wait_until(lambda: n1.getindexinfo() == completed_idx_state)
     7 
     8+        self.log.info("Re-check nTx and nChainTx values")
     9+        assert_equal(n1.getblockheader(start_hash)["nTx"], 1)
    10+        assert_equal(n1.getblockheader(start_next_hash)["nTx"], 2)
    11+        assert_equal(n1.getblockheader(snapshot_prev_hash)["nTx"], 1)
    12+        assert_equal(n1.getblockheader(snapshot_hash)["nTx"], 2)
    13+
    14+        assert_equal(n1.getchaintxstats(nblocks=1, blockhash=start_hash)["txcount"], START_HEIGHT + 1)
    15+        assert_equal(n1.getchaintxstats(nblocks=1, blockhash=start_next_hash)["txcount"], START_HEIGHT + 3)
    16+        assert_equal(n1.getchaintxstats(nblocks=1, blockhash=snapshot_prev_hash)["txcount"], snapshot_nchaintx - 2)
    17+        assert_equal(n1.getchaintxstats(nblocks=1, blockhash=snapshot_hash)["txcount"], snapshot_nchaintx)
    18+
    19 
    20         for i in (0, 1):
    21             n = self.nodes[i]
    

    ryanofsky commented at 1:29 pm on February 26, 2024:

    re: #29370 (review)

    nit in the first commit: Why claim they are fake when one can test which ones are fake?


    ryanofsky commented at 1:56 pm on February 26, 2024:

    re: #29370 (review)

    nit in the first commit: Why claim they are fake when one can test which ones are fake?

    This PR is removing the fake values, so the parts of this test that refer to fake values only last for two commits then are removed along with the fake values.

    I think it would make this test too fragile and difficult to maintain if it hardcoded expressions like START_HEIGHT + 3, and snapshot_nchaintx - 2 that depend on the i % 3 == 0 condition ~150 lines earlier in the test setup. It could be useful to check final nTx and nChainTx values though, not just initial ones after loading the snapshot I expanded the test to check for these values by storing them instead of hardcoding them.

  56. in test/sanitizer_suppressions/ubsan:65 in 32481c30a5 outdated
    61@@ -61,6 +62,7 @@ implicit-integer-sign-change:CBlockPolicyEstimator::processBlockTx
    62 implicit-integer-sign-change:SetStdinEcho
    63 implicit-integer-sign-change:compressor.h
    64 implicit-integer-sign-change:crypto/
    65+implicit-integer-sign-change:getchaintxstats*
    


    maflcko commented at 10:38 am on February 22, 2024:

    nit: In the first commit. I tested this locally, and the suppressions were not needed. The cirrus link you provide says:

    0 node1 stderr rpc/blockchain.cpp:1658:42: runtime error: unsigned integer overflow: 0 - 200 cannot be represented in type 'unsigned int'
    

    Looking at the 0, it seems more likely that this was introduced in a later commit.

    Also, Cirrus will clear the log in 90 days, so it may be better to not link to it.


    ryanofsky commented at 1:59 pm on February 26, 2024:

    re: #29370 (review)

    nit: In the first commit. I tested this locally, and the suppressions were not needed.

    Thanks, I updated the commit description, dropped the link, and moved the ubsan suppressions into a new commit

  57. in src/validation.cpp:5119 in cecf618b68 outdated
    5114@@ -5115,6 +5115,11 @@ double GuessVerificationProgress(const ChainTxData& data, const CBlockIndex *pin
    5115     if (pindex == nullptr)
    5116         return 0.0;
    5117 
    5118+    if (!Assume(pindex->nChainTx > 0)) {
    5119+        LogPrintf("Internal bug detected: block %d has unset nChainTx (%s %s). Please report this issue here: %s\n",
    


    maflcko commented at 10:43 am on February 22, 2024:
    second commit: LogPrintf is deprecated. Also, could use STR_INTERNAL_BUG to reduce boilerplate.

    ryanofsky commented at 2:00 pm on February 26, 2024:

    re: #29370 (review)

    second commit: LogPrintf is deprecated. Also, could use STR_INTERNAL_BUG to reduce boilerplate.

    Changed LogPrintf to LogWarning. I’m still not using STR_INTERNAL_BUG for now though. I actually tried to use STR_INTERNAL_BUG initially here but it seemed like an awkward fit because it prints a multiline string to the log which seems like it would complicate log parsing, and it also duplicates file/line/function name in the log. It would be nice to have a version of STR_INTERNAL_BUG, maybe integrated with the Assume macro that works better for logging, but right now it appears like STR_INTERNAL_BUG is better suited and mostly used for throwing exceptions, not for logging.

  58. in src/validation.cpp:4932 in 594336ae8a outdated
    4943-        } else {
    4944-            // Otherwise there should only be an nTx value if we have
    4945-            // actually seen a block's transactions.
    4946-            assert(((pindex->nStatus & BLOCK_VALID_MASK) >= BLOCK_VALID_TRANSACTIONS) == (pindex->nTx > 0)); // This is pruning-independent.
    4947         }
    4948+        // There there should only be an nTx value if we have
    


    maflcko commented at 1:08 pm on February 22, 2024:
    594336ae8aad026dacb5d536df63bd374e3a89c7: there there

    ryanofsky commented at 2:02 pm on February 26, 2024:
  59. in src/validation.cpp:4964 in 594336ae8a outdated
    4967+            assert((pindex->nChainTx != 0) == (pindex == snap_base));
    4968+        }
    4969+
    4970         // Chainstate-specific checks on setBlockIndexCandidates
    4971         for (auto c : GetAll()) {
    4972+            const bool is_active = c == &ActiveChainstate();
    


    maflcko commented at 1:19 pm on February 22, 2024:

    nit in https://github.com/bitcoin/bitcoin/commit/594336ae8aad026dacb5d536df63bd374e3a89c7: Would be good to avoid the use of = in C++, when possible, as it narrowing and possibly confusing when used in contexts with ==.

    0const bool is_active{c == &ActiveChainstate()};
    

    ryanofsky commented at 2:01 pm on February 26, 2024:

    re: https://github.com/bitcoin/bitcoin/pull/29370/files#r1499230459

    nit in 594336a: Would be good to avoid the use of = in C++, when possible, as it narrowing and possibly confusing when used in contexts with ==.

    Makes sense. In the newest update I was able to eliminate some uses of is_active and drop this.

  60. in src/validation.cpp:4966 in 594336ae8a outdated
    4970         // Chainstate-specific checks on setBlockIndexCandidates
    4971         for (auto c : GetAll()) {
    4972+            const bool is_active = c == &ActiveChainstate();
    4973             if (c->m_chain.Tip() == nullptr) continue;
    4974-            if (!CBlockIndexWorkComparator()(pindex, c->m_chain.Tip()) && pindexFirstNeverProcessed == nullptr) {
    4975+            if (!CBlockIndexWorkComparator()(pindex, c->m_chain.Tip()) && (pindexFirstNeverProcessed == nullptr || (is_active && pindex == snap_base))) {
    


    maflcko commented at 1:45 pm on February 22, 2024:

    Just to clarify, the is_active check is just a belt-and-suspender, since pindex == snap_base implies is_active?

    An alternative may be to split the assumption out into a separate assert.


    ryanofsky commented at 5:24 pm on February 22, 2024:

    re: #29370 (review)

    Just to clarify, the is_active check is just a belt-and-suspender, since pindex == snap_base implies is_active?

    It’s meant to be a real check. I should probably add a comment about it. The idea is that on the snapshot chain, the snapshot block should be in setBlockIndexCandidates even though earlier transactions may not be downloaded.

    I don’t think pindex == snap_base implies anything about is_active, because the for ... GetAll loop above should be iterating over all chains and only one of them should be active.

    However, I think it might be true right now that the snapshot block is added to setBlockIndexCandidates for all chains when it is missing transaction data, even though it only makes sense to add it on the snapshot chain. So the checks below may happen to succeed even if is_active is false. But if that’s the case, it’s just a side-effect of the way some sloppy code around TryAddBlockIndexCandidate is written, not something we should rely on here:

    https://github.com/bitcoin/bitcoin/blob/1ac627c485a43e50a9a49baddce186ee3ad4daad/src/validation.cpp#L4602-L4613

    https://github.com/bitcoin/bitcoin/blob/1ac627c485a43e50a9a49baddce186ee3ad4daad/src/test/validation_chainstatemanager_tests.cpp#L519-L522

    I think we will probably want a separate PR to clean up code around TryAddBlockIndexCandidate and only add blocks to setBlockIndexCandidates that really belong there.


    ryanofsky commented at 2:01 pm on February 26, 2024:

    re: https://github.com/bitcoin/bitcoin/pull/29370/files#r1499266054

    Just to clarify, the is_active check is just a belt-and-suspender, since pindex == snap_base implies is_active?

    Reviewing this more, my comments above about pindex == snap_base being orthogonal to is_active, and about TryAddBlockIndexCandidate adding the snapshot block as a candidate to the background validation chainstate unnecessarily are still accurate, but my conclusion that it makes sense to keep is_active instead of dropping it is wrong.

    It makes more sense to drop is_active && condition to avoid hitting the c->setBlockIndexCandidates.count(pindex) == 0 assert below, and not trigger an error if the snapshot block is added unnecessarily to the background chain candidate set, which can happen in LoadBlockIndex if the node is restarted during background validation as described above.

    So the new push drops is_active && here and should be safer and clearer. I also wrote a lot of new comments to help clarify this code.

  61. in src/validation.cpp:4977 in 594336ae8a outdated
    4984+                    if (pindexFirstMissing == nullptr || pindex == c->m_chain.Tip() || (is_active && pindex == snap_base)) {
    4985                         // The active chainstate should always have this block
    4986                         // as a candidate, but a background chainstate should
    4987                         // only have it if it is an ancestor of the snapshot base.
    4988-                        if (is_active || GetSnapshotBaseBlock()->GetAncestor(pindex->nHeight) == pindex) {
    4989+                        if (is_active || snap_base->GetAncestor(pindex->nHeight) == pindex) {
    


    maflcko commented at 4:20 pm on February 22, 2024:
    Why would the background chainstate have the snapshot base? The comment seems wrong and the second branch of the if condition is never taken, no?

    ryanofsky commented at 5:48 pm on February 22, 2024:

    re: #29370 (review)

    Why would the background chainstate have the snapshot base? The comment seems wrong and the second branch of the if condition is never taken, no?

    I think the comment is just saying that if pindex is an ancestor of the snapshot base, it should be in setBlockIndexCandidates. It’s possible the snapshot base could be in there too, but the code is only checking for it to be there if the chain is active, or if all transactions in and before the snapshot base were downloaded.


    ryanofsky commented at 2:00 pm on February 26, 2024:

    re: https://github.com/bitcoin/bitcoin/pull/29370/files#r1499519079

    Why would the background chainstate have the snapshot base? The comment seems wrong and the second branch of the if condition is never taken, no?

    Answered this in #29370 (review), but now added better comments to the code as well.


    maflcko commented at 10:07 am on March 20, 2024:
    I incorrectly assumed that GetSnapshotBaseBlock was a function on the chainstate, which then would return nullptr for the background chainstate. However, it is a function on the chainstatemanager and it should not return a nullptr if a second chainstate (background chainstate) is loaded. Otherwise, it would be a nullptr deref anyway.
  62. maflcko commented at 4:31 pm on February 22, 2024: member

    ACK 8afcd99435b693f69b8c3a918b0573ef12559b22 🤗

    Signature:

    0untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
    1RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
    2trusted comment: ACK 8afcd99435b693f69b8c3a918b0573ef12559b22 🤗
    3pDg2BPKGH9xuFZAcu+QXuXdu2fcqOaeS2y/Crj2fW2R6q67lpuPvQS+Iq51IomT4ZBoP+oCfeUSQY6QiyCRLCg==
    
  63. DrahtBot requested review from mzumsande on Feb 22, 2024
  64. DrahtBot requested review from fjahr on Feb 22, 2024
  65. ryanofsky commented at 5:52 pm on February 22, 2024: contributor
    Thanks for the review. Tried to answer some questions below and I will work on the suggestions
  66. ryanofsky assigned ryanofsky on Feb 26, 2024
  67. ryanofsky referenced this in commit b64dc6d303 on Feb 26, 2024
  68. ryanofsky referenced this in commit 85dfb7160c on Feb 26, 2024
  69. ryanofsky force-pushed on Feb 26, 2024
  70. ryanofsky commented at 5:32 pm on February 26, 2024: contributor
    Updated 8afcd99435b693f69b8c3a918b0573ef12559b22 -> 7081caca03b8f0d1c96415e0fc839bf4836257be (pr/nofake.9 -> pr/nofake.10, compare) adding check for final nTx and nChainTx values to the python test, moving ubsan suppressions to another commit, and improving setBlockIndexCandidates checks and adding better comments. Updated 7081caca03b8f0d1c96415e0fc839bf4836257be -> 2f3a692cdad60cb771e390bb79bf4da2b40146c9 (pr/nofake.10 -> pr/nofake.11, compare) making minor edits to comments
  71. ryanofsky referenced this in commit c015050cb2 on Feb 26, 2024
  72. ryanofsky referenced this in commit 49426f84b9 on Feb 26, 2024
  73. ryanofsky force-pushed on Feb 26, 2024
  74. in src/validation.cpp:5125 in d351ea2f88 outdated
    4950+        assert(((pindex->nStatus & BLOCK_VALID_MASK) >= BLOCK_VALID_TRANSACTIONS) == (pindex->nTx > 0)); // This is pruning-independent.
    4951         // All parents having had data (at some point) is equivalent to all parents being VALID_TRANSACTIONS, which is equivalent to HaveNumChainTxs().
    4952-        assert((pindexFirstNeverProcessed == nullptr) == pindex->HaveNumChainTxs());
    4953-        assert((pindexFirstNotTransactionsValid == nullptr) == pindex->HaveNumChainTxs());
    4954+        // HaveNumChainTxs will also be set in the assumeutxo snapshot block from snapshot metadata.
    4955+        assert((pindexFirstNeverProcessed == nullptr || pindex == snap_base) == pindex->HaveNumChainTxs());
    


    mzumsande commented at 10:56 pm on February 27, 2024:
    I think that this criterion can fail during the point in time where the background chainstate catches up with the snapshot block. We will request the snapshot block from a peer, and if it doesn’t arrive in order, we can’t connect it yet. Therefore, in in ReceivedBlockTransactions we’ll set nTx to some value but since there are predecessors that haven’t arrived yet we’ll set nChainTx (which was previously set to the chainparams value) back to 0. As a result, we have pindex == snap_base and pindex->HaveNumChainTxs()==false and the assert will be hit.

    ryanofsky commented at 11:36 pm on February 27, 2024:

    re: #29370 (review)

    In commit “assumeutxo: Get rid of faked nTx and nChainTx values” (d351ea2f887dc81993799010ac2fb6a220f07561)

    we’ll set nChainTx (which was previously set to the chainparams value) back to 0

    Wow, great catch. It never occurred to me that we would set the nChainTx back to 0 after setting it to another value. But that’s exactly what we do at the beginning of ReceivedBlockTransactions():

    https://github.com/bitcoin/bitcoin/blob/ba907f96ad37c09c49c0e1532fad118fcb8dd4a8/src/validation.cpp#L3616

    It looks like this bug was actually present in the previous assert too, and not new in this commit?

    In light of that, I guess the check we would need here is:

    0if (pindex->HaveNumChainTxs()) {
    1    assert(pindexFirstNeverProcessed == nullptr || pindex == snap_base);
    2} else {
    3    assert(pindexFirstNeverProcessed != nullptr);   
    4}
    

    But I think a better change would be to make ReceivedBlockTransactions() not reset nChainTx to 0 at all. Setting the snapshot block nChainTx to a real value, then back to 0, then back to a real value again seems like confusing behavior that could break other code aside from this assert.


    mzumsande commented at 4:15 pm on February 28, 2024:

    It looks like this bug was actually present in the previous assert too, and not new in this commit?

    Yes, I encountered this issue testing this branch on signet (and hadn’t tried that with master), but it should also fail on master. So I think it would also be ok to fix this not here but in a separate PR.

    Slightly related: I wonder if we should do something differently in the situation where the chainparams-provided nChainTx turns out to be wrong. Of course that’s trusted data and shouldn’t be incorrect normally - but if it is anyway I think that we’d currently continue with the inconsistent nChainTx data and wouldn’t even log an error - unless -checkblockindex is enabled, in which case we’d terminate with an assert.


    ryanofsky commented at 5:42 pm on February 28, 2024:

    It looks like this bug was actually present in the previous assert too, and not new in this commit?

    Yes, I encountered this issue testing this branch on signet (and hadn’t tried that with master), but it should also fail on master. So I think it would also be ok to fix this not here but in a separate PR.

    Agree it should probably be a separate PR. It might be easier to implement that PR after this one, so the fake values are gone and and ReceivedBlockTransactions could assert that nTx and nChainTx are 0 for new, non-snapshot blocks.

    Slightly related: I wonder if we should do something differently in the situation where the chainparams-provided nChainTx turns out to be wrong. Of course that’s trusted data and shouldn’t be incorrect normally - but if it is anyway I think that we’d currently continue with the inconsistent nChainTx data and wouldn’t even log an error - unless -checkblockindex is enabled, in which case we’d terminate with an assert.

    Agree. We have errors and specific code to handle cases when other trusted data is wrong, so it probably makes sense to handle this case as well. It might be easier to check for this case after the fake values are gone since nonzero nChainTx values should not be changing after that.


    mzumsande commented at 6:10 pm on February 28, 2024:

    Agree it should probably be a separate PR.

    ok - I can work on a follow-up that fixes the bug (preferably in ReceivedBlockTransactions instead of just adjusting the checks) , and add a test case that would trigger the assert on master.


    ryanofsky commented at 7:20 pm on February 28, 2024:

    ok - I can work on a follow-up that fixes the bug (preferably in ReceivedBlockTransactions instead of just adjusting the checks) , and add a test case that would trigger the assert on master.

    That would be great, and I’d be happy to review. Let me know if you think I should update the CheckBlockIndex assert in this PR as described above #29370 (review) too here.

    I did experiment a little with adding asserts to ReceivedBlockTransactions:

     0--- a/src/validation.cpp
     1+++ b/src/validation.cpp
     2@@ -3589,6 +3589,8 @@ void Chainstate::TryAddBlockIndexCandidate(CBlockIndex* pindex)
     3 void ChainstateManager::ReceivedBlockTransactions(const CBlock& block, CBlockIndex* pindexNew, const FlatFilePos& pos)
     4 {
     5     AssertLockHeld(cs_main);
     6+    assert(pindexNew->nTx == 0);
     7+    assert(pindexNew->nChainTx == 0);
     8     pindexNew->nTx = block.vtx.size();
     9     pindexNew->nChainTx = 0;
    10     pindexNew->nFile = pos.nFile;
    

    This makes the two assumeutxo tests fail as expected, but it also makes the rpc_getblockfrompeer.py test fail the nTx == 0 check for some reason. Hopefully a straightforward fix ReceivedBlockTransactions should be possible, though.


    mzumsande commented at 7:14 pm on February 29, 2024:

    Let me know if you think I should update the CheckBlockIndex assert in this PR as described above #29370 (review) too here.

    Up to you, but I think it’d be good to have this fixed and would be happy to re-ACK.


    ryanofsky commented at 9:26 pm on February 29, 2024:

    re: #29370 (review)

    Up to you, but I think it’d be good to have this fixed and would be happy to re-ACK.

    Ok, I updated the assertion in a new commit 438ac1ffc23ec5134f2ac74cee75ff145078df09, which could also be a standalone PR. (I’m assuming by “fixed” you mean changing the assertion, not changing ReceivedBlockTransactions, which I think would be harder to do correctly.)


    ryanofsky commented at 8:36 pm on March 8, 2024:

    re: #29370 (review)

    Ok, I updated the assertion in a new commit 438ac1f, which could also be a standalone PR. (I’m assuming by “fixed” you mean changing the assertion, not changing ReceivedBlockTransactions, which I think would be harder to do correctly.)

    New asserts had some problems reported #29370#pullrequestreview-1912039200 with a test case triggering failures posted #29370 (comment). It was possible to fix the asserts in CheckBlockIndex to cope with ReceivedBlockTransactions setting 0 nChainTx values, but the checks started getting more complicated and it turned out to be simpler to just update ReceivedBlockTransactions not to mess up nChainTx instead.

    Also debugging the test case revealed that the new crash it was triggering with this PR was not the same as the crash it was triggering before this PR. So commit 438ac1ffc23ec5134f2ac74cee75ff145078df09 would not actually fix anything. The test case works by submitting a snapshot block after loading the snapshot, out of order before previous blocks were downloaded. Before this PR, this would cause ReceivedBlockTransactions to set a fake nChainTx value on the snapshot block and the pindex->nChainTx == pindex->nTx + prev_chain_tx check on the next block to fail. After this PR it would cause ReceivedBlockTransactions to set a 0 nChainTx value and the assert((pindexFirstNeverProcessed == nullptr || pindex == snap_base) == pindex->HaveNumChainTxs()); check to fail.

  75. mzumsande commented at 6:18 pm on February 28, 2024: contributor

    ACK 2f3a692cdad60cb771e390bb79bf4da2b40146c9 - I reviewed the code, and did some testing running with -checkblockindex on signet.

    Would be great if @jamesob you could have a quick look. Not necessarily an in-depth review - I’d just want too make sure we’re not missing some historical context from the time when the faked values and the BLOCK_ASSUMED_VALID flag were introduced originally.

  76. DrahtBot requested review from maflcko on Feb 28, 2024
  77. ryanofsky referenced this in commit 438ac1ffc2 on Feb 29, 2024
  78. ryanofsky force-pushed on Feb 29, 2024
  79. ryanofsky commented at 9:39 pm on February 29, 2024: contributor
    Updated 2f3a692cdad60cb771e390bb79bf4da2b40146c9 -> 7ce1a0e375a2b27614be0ff148aea989d1100ff8 (pr/nofake.11 -> pr/nofake.12, compare) adding new commit 438ac1ffc23ec5134f2ac74cee75ff145078df09 to fix a pre-existing bug #29370 (review) in CheckBlockIndex
  80. in src/validation.cpp:4948 in 438ac1ffc2 outdated
    4944-        assert((pindexFirstNotTransactionsValid == nullptr) == pindex->HaveNumChainTxs());
    4945+        if (pindex->HaveNumChainTxs()) {
    4946+            assert(pindexFirstNotTransactionsValid == nullptr);
    4947+        } else {
    4948+            const bool unprocessed_snapshot_block{pindex == snap_base && pindex->nTx == 0};
    4949+            assert(pindexFirstNotTransactionsValid != nullptr || unprocessed_snapshot_block);
    


    mzumsande commented at 9:21 pm on March 1, 2024:
    I think it must be pindex->nTx != 0 in the line above - since the situation is that we have received the snapshot block transactions (so nTx!=0), but can’t connect the block yet (so nChainTx==0). Also, maybe unconnected_snapshot_block is a better name?

    ryanofsky commented at 11:11 pm on March 1, 2024:

    re: #29370 (review)

    I think it must be pindex->nTx != 0 in the line above - since the situation is that we have received the snapshot block transactions (so nTx!=0), but can’t connect the block yet (so nChainTx==0). Also, maybe unconnected_snapshot_block is a better name?

    Oh, I forgot that nTx would still be set in that case. I was trying to make the check stricter so snap_base would have to have nChainTx set after it was fully connected, and it wouldn’t be able to escape this check after that happened. But I guess there is no way to determine that and the best that can be done is to check that it nChainTx is nonzero after it is validated. Can make that update


    ryanofsky commented at 8:25 pm on March 8, 2024:

    re: #29370 (review)

    Could reproduce this failure with the test change suggested in #29370 (comment). Changing this to const bool unprocessed_snapshot_block{pindex == snap_base && (pindex->nStatus & BLOCK_VALID_MASK) < BLOCK_VALID_SCRIPTS} did fix this but then there was just another assert failure in the assert((pindex->nChainTx != 0) == (pindex == snap_base)); line below, so opted to revert this and make a fix in ReceivedBlockTransactions instead

  81. in src/validation.cpp:5148 in 6c154da50e outdated
    4979+        } else if (pindex->pprev->nChainTx > 0 && pindex->nTx > 0) {
    4980+            // If previous nChainTx is set and number of transactions in block is known, sum must be set.
    4981+            assert(pindex->nChainTx == pindex->nTx + pindex->pprev->nChainTx);
    4982+        } else {
    4983+            // Otherwise nChainTx should only be set if this is a snapshot block.
    4984+            assert((pindex->nChainTx != 0) == (pindex == snap_base));
    


    mzumsande commented at 9:57 pm on March 1, 2024:
    This is also violated in the situation of a snapshot block that is sent out of order (pindex->nChainTx == 0, pindex == snap_base )

    ryanofsky commented at 11:19 pm on March 1, 2024:

    re: #29370 (review)

    This is also violated in the situation of a snapshot block that is sent out of order (pindex->nChainTx == 0, pindex == snap_base )

    Ok, I guess it needs to be changed to:

    0assert(pindex->nChainTx == 0 || pindex == snap_base);
    

    ryanofsky commented at 8:30 pm on March 8, 2024:

    re: #29370 (review)

    Could reproduce this failure with the test change suggested in #29370 (comment). It actually needed a slightly different fix assert(pindex->nChainTx == 0 || pindex == snap_base || pindex->pprev == snap_base) because there if the snapshot block temporarily had nChainTx set to 0, and the block after the snapshot block was previously downloaded, the assert could trigger again without the pindex->pprev == snap_base condition. Explaining this in another code comment seemed too complicated though, so opted to revert this and make a fix in ReceivedBlockTransactions instead

  82. mzumsande commented at 10:02 pm on March 1, 2024: contributor
    The situation with a snapshot block that is sent out of order is still asserting (in two places). As mentioned before, I’d understand if you wanted to delay the fix to a different PR.
  83. ryanofsky commented at 11:25 pm on March 1, 2024: contributor
    Thanks for the review. It seems like my workaround for the (preexisting) bug reported in the #29370 (review) thread doesn’t work, and I’m undecided whether it’s better to make tweaks to fix it or just leave it alone to be solved separately. @mzumsande I’m really curious how you’re testing it, because it is a little hard to reason about abstractly without a test.
  84. mzumsande commented at 11:53 pm on March 1, 2024: contributor

    @mzumsande I’m really curious how you’re testing it, because it is a little hard to reason about abstractly without a test.

    I’ve been testing this in two ways:

    1. On signet, syncing from a snapshot I created until I am close to the point where the background chainstate is finished, and then running with -checkblockindex for the last couple of blocks until it crashes. After several tries I’ve now saved a copy of a datadir where the background sync is finished up to snapshot_height - 150 or so, and then I run only the last few blocks with the slow -checkblockindex. The signet testing is also how I encountered #29519.

    2. Adjusting feature_assumeutxo.py, so far I’ve just added the code

    0        # Send snapshot block to n1 out of order, to test checkblockindex behavior
    1        snapshot_hash = n0.getblockhash(SNAPSHOT_BASE_HEIGHT)
    2        snapshot_block = n0.getblock(snapshot_hash, 0)
    3        n1.submitblock(snapshot_block)
    

    before self.connect_nodes(0, 1) in line ~308. This triggers the bug at the snapshot height, but is not ideal otherwise, because it means that the order of blocks is really strange, e.g. we receive the block snapshot+1 after snapshot which would not usually happening when syncing. I’ve run into additional asserts failures this way (not mentioned above) that will hopefully be no longer an issue after changing the logic in ReceivedBlockTransactions. But I’m thinking of generalising this approach - maybe even randomize the order in which all blocks are sent - but I haven’t done that yet.

  85. ryanofsky commented at 0:11 am on March 2, 2024: contributor
    Thank you that is very helpful! I should have thought of submitting the snapshot block to test this. For some reason I was thinking the bug would be harder to trigger.
  86. mzumsande commented at 7:25 pm on March 8, 2024: contributor
    Fixing the issue in ReceivedBlockTransactions turned out to be really simple, see 4cba9f4f0693ecae23f8c02afe354c5893c2684c in 202403_assumeutxo_nchaintx that builds on this branch, and also adds a test to submit blocks in a randomized order. So I’m not sure if it’s really worth the effort to get the checks right in the current world where nChainTx for the snapshot block can be set back to 0.
  87. ryanofsky commented at 7:53 pm on March 8, 2024: contributor

    Fixing the issue in ReceivedBlockTransactions turned out to be really simple, see 4cba9f4 in 202403_assumeutxo_nchaintx that builds on this branch, and also adds a test to submit blocks in a randomized order. So I’m not sure if it’s really worth the effort to get the checks right in the current world where nChainTx for the snapshot block can be set back to 0.

    Yeah after putting in more effort, I gave up on the checks and came to a similar conclusion. My fix is more strict, though and looks like:

     0--- a/src/validation.cpp
     1+++ b/src/validation.cpp
     2@@ -3590,7 +3590,14 @@ void ChainstateManager::ReceivedBlockTransactions(const CBlock& block, CBlockInd
     3 {
     4     AssertLockHeld(cs_main);
     5     pindexNew->nTx = block.vtx.size();
     6-    pindexNew->nChainTx = 0;
     7+    // Typically nChainTx will be 0 at this point, but it can be nonzero if this
     8+    // is a pruned block which is being downloaded again, or if this is an
     9+    // assumeutxo snapshot block which has a hardcoded nChainTx value from the
    10+    // snapshot metadata. If the pindex is not the snapshot block and the nChainTx
    11+    // value is not zero, assert that value is actually correct.
    12+    assert(pindexNew->nChainTx == 0 ||
    13+           pindexNew->nChainTx == (pindexNew->pprev ? pindexNew->pprev->nChainTx : 0) + pindexNew->nTx ||
    14+           pindexNew == GetSnapshotBaseBlock());
    15     pindexNew->nFile = pos.nFile;
    16     pindexNew->nDataPos = pos.nPos;
    17     pindexNew->nUndoPos = 0;
    18@@ -3610,7 +3617,13 @@ void ChainstateManager::ReceivedBlockTransactions(const CBlock& block, CBlockInd
    19         while (!queue.empty()) {
    20             CBlockIndex *pindex = queue.front();
    21             queue.pop_front();
    22-            pindex->nChainTx = (pindex->pprev ? pindex->pprev->nChainTx : 0) + pindex->nTx;
    23+            auto chain_tx{(pindex->pprev ? pindex->pprev->nChainTx : 0) + pindex->nTx};
    24+            // Before setting nChainTx assert that it is 0 or already set to
    25+            // the correct value. This assert will fail after receiving the
    26+            // assumeutxo snapshot block if assumeutxo snapshot metadata has an
    27+            // incorrect AssumeutxoData::nChainTx value.
    28+            assert(pindex->nChainTx == 0 || pindex->nChainTx == chain_tx);
    29+            pindex->nChainTx = chain_tx;
    30             pindex->nSequenceId = nBlockSequenceId++;
    31             for (Chainstate *c : GetAll()) {
    32                 c->TryAddBlockIndexCandidate(pindex);
    
  88. mzumsande commented at 8:48 pm on March 8, 2024: contributor

    My fix is more strict, though and looks like: (…)

    Seems fine to me, though I’m not in favor of adding nontrivial asserts like this outside of CheckBlockIndex - I’d prefer them to be Assume in production code, since the consequences of them breaking at some point could be very serious, whereas the consequence of a wrong (non-zero) nChainTx value would be mostly just that progress estimation logging would be broken.

  89. ryanofsky referenced this in commit fdabd745d1 on Mar 8, 2024
  90. ryanofsky force-pushed on Mar 8, 2024
  91. ryanofsky commented at 8:56 pm on March 8, 2024: contributor
    Updated 7ce1a0e375a2b27614be0ff148aea989d1100ff8 -> 35b0964954afb88a4ac0e64ebe777f194946bf38 (pr/nofake.12 -> pr/nofake.13, compare) incorporating a different fix for the bug reported #29370 (review) triggered by test code in #29370 (comment). Instead of updating CheckBlockIndex asserts to cope with ReceivedBlockTransactions temporarily setting the snapshot block nChainTx to 0 if it is downloaded out of order, it updates ReceivedBlockTransactions not to do that. This avoids problems with the previous fix reported #29370#pullrequestreview-1912039200 which made CheckBlockIndex checks more complicated
  92. ryanofsky referenced this in commit 95474d9f3b on Mar 8, 2024
  93. ryanofsky force-pushed on Mar 8, 2024
  94. ryanofsky commented at 9:12 pm on March 8, 2024: contributor

    re: #29370 (comment)

    Seems fine to me, though I’m not in favor of adding nontrivial asserts like this outside of CheckBlockIndex - I’d prefer them to be Assume in production code, since the consequences of them breaking at some point could be very serious, whereas the consequence of a wrong (non-zero) nChainTx value would be mostly just that progress estimation logging would be broken.

    Updated 35b0964954afb88a4ac0e64ebe777f194946bf38 -> 1e227bc2139bffe6fd3e3f25d0594c9443836420 (pr/nofake.13 -> pr/nofake.14, compare) replacing assert with Assume Updated 1e227bc2139bffe6fd3e3f25d0594c9443836420 -> 065952806234e5a0a6b87df5a5067d0f4062d25e (pr/nofake.14 -> pr/nofake.15, compare) fixing lint error https://cirrus-ci.com/task/6070075329871872?logs=lint#L743

  95. ryanofsky referenced this in commit 85233b2032 on Mar 8, 2024
  96. ryanofsky force-pushed on Mar 8, 2024
  97. DrahtBot added the label CI failed on Mar 8, 2024
  98. DrahtBot commented at 9:20 pm on March 8, 2024: contributor

    🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the documentation.

    Possibly this is due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    Leave a comment here, if you need help tracking down a confusing failure.

    Debug: https://github.com/bitcoin/bitcoin/runs/22453844886

  99. DrahtBot removed the label CI failed on Mar 8, 2024
  100. ryanofsky unassigned ryanofsky on Mar 11, 2024
  101. in src/validation.cpp:5698 in 0659528062 outdated
    5575@@ -5579,21 +5576,11 @@ bool ChainstateManager::PopulateAndValidateSnapshot(
    5576     CBlockIndex* index = nullptr;
    5577 
    5578     // Don't make any modifications to the genesis block.
    5579-    // This is especially important because we don't want to erroneously
    5580-    // apply BLOCK_ASSUMED_VALID to genesis, which would happen if we didn't skip
    5581-    // it here (since it apparently isn't BLOCK_VALID_SCRIPTS).
    


    Sjors commented at 4:22 pm on March 11, 2024:

    One useful bit of this comment might be worth preserving:

    0// Note that the genesis block isn't BLOCK_VALID_SCRIPTS.
    

    ryanofsky commented at 6:08 pm on March 11, 2024:

    re: #29370 (review)

    One useful bit of this comment might be worth preserving:

    Thanks, added this back

  102. in test/functional/feature_assumeutxo.py:281 in 85233b2032 outdated
    253@@ -247,6 +254,15 @@ def check_tx_counts(final: bool) -> None:
    254 
    255         assert_equal(n1.getblockchaininfo()["blocks"], SNAPSHOT_BASE_HEIGHT)
    256 
    257+        self.log.info("Submit a stale block that forked off the chain before the snapshot")
    


    Sjors commented at 4:25 pm on March 11, 2024:

    85233b203231f39c4436017860248ff9ad08ad42: This is the test mentioned in the PR description? Maybe introduce it in a separate commit so it’s a bit easier to test the bug fix.

    When I undo the changes in chain.h and validation.cpp the test fails earlier (line 243), so it tricky to test this part now.


    ryanofsky commented at 6:09 pm on March 11, 2024:

    re: #29370 (review)

    85233b2: This is the test mentioned in the PR description? Maybe introduce it in a separate commit so it’s a bit easier to test the bug fix.

    Good idea, I moved the tests for the two crashes into separate commits.


    Sjors commented at 7:04 pm on March 11, 2024:
    Nice, I was able to produce the crashes - by reverting the main commit and then keeping one of the two crash test commits. Didn’t look at the crash log itself.
  103. Sjors commented at 4:39 pm on March 11, 2024: member
    I’m still reviewing, but I like how the last commit 065952806234e5a0a6b87df5a5067d0f4062d25e makes it clear that quite a bit of complexity goes away.
  104. in src/validation.cpp:5347 in c015050cb2 outdated
    5114@@ -5115,6 +5115,11 @@ double GuessVerificationProgress(const ChainTxData& data, const CBlockIndex *pin
    5115     if (pindex == nullptr)
    5116         return 0.0;
    5117 
    5118+    if (!Assume(pindex->nChainTx > 0)) {
    5119+        LogWarning("Internal bug detected: block %d has unset nChainTx (%s %s). Please report this issue here: %s\n",
    5120+                   pindex->nHeight, PACKAGE_NAME, FormatFullVersion(), PACKAGE_BUGREPORT);
    


    Sjors commented at 6:05 pm on March 11, 2024:
    c015050cb2cf35303c05c7eb86d3546403e330eb: maybe also return 0.0;?

    ryanofsky commented at 2:28 pm on March 12, 2024:

    re: #29370 (review)

    c015050: maybe also return 0.0;?

    Good idea, added

  105. ryanofsky referenced this in commit 7811676331 on Mar 11, 2024
  106. ryanofsky force-pushed on Mar 11, 2024
  107. ryanofsky commented at 6:10 pm on March 11, 2024: contributor
    Updated 065952806234e5a0a6b87df5a5067d0f4062d25e -> 69f9f5df4df73d02c4cd184addc36b9fc8f3f507 (pr/nofake.15 -> pr/nofake.16, compare) adding back a comment and splitting commits as suggested.
  108. in src/validation.cpp:4976 in 49426f84b9 outdated
    4976-                    // is valid and we have all data for its parents, it must be in
    4977-                    // setBlockIndexCandidates.  m_chain.Tip() must also be there
    4978-                    // even if some data has been pruned.
    4979+                    // If pindex and all its parents downloaded transactions, and
    4980+                    // the transactions were not pruned (pindexFirstMissing is
    4981+                    // null), it is a potential candidate.
    


    Sjors commented at 6:21 pm on March 11, 2024:
    49426f84b9a15aaae1b3a3c9c5d2ac4ee5855b11: the description matches the code, but I’m a bit confused why an index with a pruned ancestor can’t be a candidate. Or is it the case that it may be a candidate, but we just don’t have a check for that?

    ryanofsky commented at 7:33 pm on March 11, 2024:

    re: #29370 (review)

    49426f8: the description matches the code, but I’m a bit confused why an index with a pruned ancestor can’t be a candidate. Or is it the case that it may be a candidate, but we just don’t have a check for that?

    I think it’s the latter. It may be a candidate but we can’t have a check ensuring that it is a candidate because if blocks are missing between pindex and the current chain tip, FindMostWorkChain could remove it from the candidate set:

    https://github.com/bitcoin/bitcoin/blob/a945f09fa6e0f94cc424da9e06516f9cfa192545/src/validation.cpp#L3073

    Probably ideally the check below would be pindexFirstMissingSinceChainTip == nullptr instead of pindexFirstMissing == nullptr, but it would take more work to be able to write that.


    Sjors commented at 9:26 am on March 12, 2024:
    That makes sense. So instead of checking if there is a block with missing data between the tip and the new candidate, the code here checks if there’s any missing data between genesis and the new candidate. Which is typically the case on a pruned node, which means this check is useless on a pruned node. But that’s outside the scope of this PR.

    ryanofsky commented at 2:32 pm on March 12, 2024:

    re: #29370 (review)

    That makes sense. So instead of checking if there is a block with missing data between the tip and the new candidate, the code here checks if there’s any missing data between genesis and the new candidate.

    Yes, it’s pretty confusing so I added some more comments to help clarify.

  109. in src/validation.cpp:4874 in 6096542e4c outdated
    4870@@ -4852,13 +4871,30 @@ void ChainstateManager::CheckBlockIndex()
    4871     size_t nNodes = 0;
    4872     int nHeight = 0;
    4873     CBlockIndex* pindexFirstInvalid = nullptr; // Oldest ancestor of pindex which is invalid.
    4874-    CBlockIndex* pindexFirstMissing = nullptr; // Oldest ancestor of pindex which does not have BLOCK_HAVE_DATA.
    4875-    CBlockIndex* pindexFirstNeverProcessed = nullptr; // Oldest ancestor of pindex for which nTx == 0.
    4876+    CBlockIndex* pindexFirstMissing = nullptr; // Oldest ancestor of pindex which does not have BLOCK_HAVE_DATA, since assumeutxo snapshot.
    


    Sjors commented at 7:37 pm on March 11, 2024:
    Do you mean “since assumeutxo snapshot if used”?

    ryanofsky commented at 8:21 pm on March 11, 2024:

    re: #29370 (review)

    Do you mean “since assumeutxo snapshot if used”?

    Yes, added “if used” to clarify

  110. Sjors commented at 7:53 pm on March 11, 2024: member

    Mostly happy with 69f9f5df4df73d02c4cd184addc36b9fc8f3f507

    snap_update_firsts makes me a bit dizzy, mostly because I don’t understand what the original code is trying to achieve. I’ll try to wrap my ahead around that another time.

    Update 2024-03-12: ok, I wrapped my ahead around it. The call sites for snap_update_firsts are where we are navigating to the next CBlockIndex to check.

    There are two sets of pindexFirst... variables. The first set starts from genesis + 1 and does not involve assumeutxo. The second set starts from snapshot. Blocks are checked depth-first starting from genesis + 1. At some point we may encounter a snapshot block, at which point snap_update_firsts swaps the pindexFirst... variables. Since we may have a few stale blocks below the snapshot, or even entire fork that replaces the snapshot, snap_update_firsts is also called on the way down / sideways.

  111. Sjors approved
  112. Sjors commented at 12:15 pm on March 12, 2024: member

    ACK 69f9f5df4df73d02c4cd184addc36b9fc8f3f507

    I tested loading the signet snapshot with -checkblockindex (didn’t wait for tip or background sync to finish though).

    Note for anyone else testing this: the headers pre-sync phase is agonisingly slow, so you may want to do that bit without -checkblockindex. I also ran into #29519 where all my peers were stuck downloading only the background thread (fixed by toggling setnetworkactive).

  113. DrahtBot requested review from mzumsande on Mar 12, 2024
  114. ryanofsky referenced this in commit f37038758d on Mar 12, 2024
  115. ryanofsky referenced this in commit 187e96b42a on Mar 12, 2024
  116. ryanofsky referenced this in commit 5b334a821c on Mar 12, 2024
  117. ryanofsky force-pushed on Mar 12, 2024
  118. ryanofsky commented at 2:34 pm on March 12, 2024: contributor

    Thanks for the review!

    Updated 69f9f5df4df73d02c4cd184addc36b9fc8f3f507 -> ce9388084cb130825ab9faf2ee9e5a6b548a1a07 (pr/nofake.16 -> pr/nofake.17, compare) adding more comments to clarify things and adding early return in GuessVerificationProgress if Assume check fails Updated ce9388084cb130825ab9faf2ee9e5a6b548a1a07 -> ae0d1c266898dc1e3c2f72e0d7a7d4d57b164f45 (pr/nofake.17 -> pr/nofake.18, compare) fixing new comments.

  119. ryanofsky referenced this in commit b2f2c0a210 on Mar 12, 2024
  120. ryanofsky referenced this in commit babe5d0c20 on Mar 12, 2024
  121. ryanofsky force-pushed on Mar 12, 2024
  122. Sjors commented at 3:01 pm on March 12, 2024: member
    re-ACK ae0d1c266898dc1e3c2f72e0d7a7d4d57b164f45
  123. ryanofsky referenced this in commit fe0633e6df on Mar 13, 2024
  124. ryanofsky referenced this in commit da18416b7f on Mar 13, 2024
  125. ryanofsky force-pushed on Mar 13, 2024
  126. ryanofsky referenced this in commit 187b4df521 on Mar 13, 2024
  127. ryanofsky force-pushed on Mar 13, 2024
  128. ryanofsky commented at 1:35 pm on March 13, 2024: contributor
    Updated ae0d1c266898dc1e3c2f72e0d7a7d4d57b164f45 -> 37e54d798a29b4798402db5c4163537046caf536 (pr/nofake.18 -> pr/nofake.19, compare) trying to improve a few comments. Updated 37e54d798a29b4798402db5c4163537046caf536 -> 345ac98bb893da7c1f0b573d4da73711b2139391 (pr/nofake.19 -> pr/nofake.20, compare) adding back nChainTx = 0 fallback in ReceivedBlockTransactions
  129. DrahtBot added the label Needs rebase on Mar 13, 2024
  130. in src/test/util/chainstate.h:94 in b3d695f913 outdated
     96-                // We have to set the ASSUMED_VALID flag, because otherwise it
     97-                // would not be possible to have a block index entry without HAVE_DATA
     98-                // and with nTx > 0 (since we aren't setting the pruned flag);
     99-                // see CheckBlockIndex().
    100-                pindex->nStatus |= BLOCK_ASSUMED_VALID;
    101+                // Remove all data and validity flags by just setting
    


    fjahr commented at 6:30 pm on March 15, 2024:
    The comment above (Reset the HAVE_DATA flags...) could be removed or at least changed to match the changed code here.

    ryanofsky commented at 6:10 pm on March 18, 2024:

    re: #29370 (review)

    The comment above (Reset the HAVE_DATA flags...) could be removed or at least changed to match the changed code here.

    The comment above is still accurate and still describes the main thing this is trying to do. Now just other flags are reset as well. Happy to change any of the comments if you have a specific suggestion though.


    fjahr commented at 10:03 pm on March 18, 2024:

    Before the comment could be clearly matched to what was happening below because the HAVE_DATA flags were unset explicitly. Now they are unset implicitly and the new comment states what is going on again more clearly. So this would be confusing as we would have two split comments making the same statement in different words and I would think I might be missing something special about the HAVE_DATA here.

    To keep it simple, I would change this:

    0            // Reset the HAVE_DATA flags below the snapshot height, simulating
    1            // never-having-downloaded them in the first place.
    

    to this:

    0            // Simulate never-having-downloaded the blocks below the snapshot height.
    

    fjahr commented at 10:23 pm on March 18, 2024:
    It’s a nit, so feel free to keep as is unless you have to retouch.
  131. fjahr commented at 8:32 pm on March 15, 2024: contributor
    Code looks good to me after a latest pass, I will quickly re-review once this has been rebased.
  132. assumeutxo test: Add RPC test for fake nTx and nChainTx values
    The fake values will be removed in an upcoming commit, so it is useful to have
    test coverage confirming the change in behavior.
    f252e687ec
  133. ci: add getchaintxstats ubsan suppressions
    Add ubsan suppressions for integer overflows in the getchaintxstats RPC.
    
    getchainstatstx line "int nTxDiff = pindex->nChainTx - past_block.nChainTx" can
    trigger ubsan integer overflows when assumeutxo snapshots are loaded, from
    subtracting unsigned values and assigning the result to a signed int.
    
    The overflow behavior probably exists in current code but is hard to trigger
    because it would require calling getchainstatstx at the right time with
    specific parameters as background blocks are being downloaded. But the overflow
    behavior becomes easier to trigger in the upcoming commit removing fake
    nChainTx values, so a suppression needs to be added before then for CI to pass.
    
    getchainstatstx should probably be improved separately in another PR to not
    need this suppression, and handle edge cases and missing nChainTx values more
    carefully.
    63e8fc912c
  134. validation: Check GuessVerificationProgress is not called with disconnected block
    Use Assume macro as suggested https://github.com/bitcoin/bitcoin/pull/29370#discussion_r1479427801
    0fd915ee6b
  135. doc: Improve comments describing setBlockIndexCandidates checks
    The checks are changing slightly in the next commit, so try to explains the
    ones that exist to avoid confusion
    (https://github.com/bitcoin/bitcoin/pull/29370#discussion_r1499519079)
    9b97d5bbf9
  136. assumeutxo: Get rid of faked nTx and nChainTx values
    The `PopulateAndValidateSnapshot` function introduced in
    f6e2da5fb7c6406c37612c838c998078ea8d2252 from #19806 has been setting fake
    `nTx` and `nChainTx` values that can show up in RPC results (see #29328) and
    make `CBlockIndex` state hard to reason about, because it is difficult to know
    whether the values are real or fake.
    
    Revert to previous behavior of setting `nTx` and `nChainTx` to 0 when the
    values are unknown, instead of faking them.
    
    This commit fixes at least two assert failures in the (pindex->nChainTx ==
    pindex->nTx + prev_chain_tx) check that would happen previously. Tests for
    these failures are added separately in the next two commits.
    
    Compatibility note: This change could result in -checkblockindex failures if a
    snapshot was loaded by a previous version of Bitcoin Core and not fully
    validated, because fake nTx values will have been saved to the block index. It
    would be pretty easy to avoid these failures by adding some compatibility code
    to `LoadBlockIndex` and changing `nTx` values from 1 to 0 when they are fake
    (when `(pindex->nStatus & BLOCK_VALID_MASK) < BLOCK_VALID_TRANSACTIONS`), but a
    little simpler not to worry about being compatible in this case.
    ef29c8b662
  137. test: assumeutxo stale block CheckBlockIndex crash test
    Add a test for a CheckBlockIndex crash that would happen before previous
    "assumeutxo: Get rid of faked nTx and nChainTx values" commit.
    
    The crash was an assert failure in the (pindex->nChainTx == pindex->nTx +
    prev_chain_tx) check that would previously happen if a snapshot was loaded, and
    a block was submitted which forked from the chain before the snapshot block and
    after the last downloaded background chain block. This block would not be
    marked assumed-valid because it would not be an ancestor of the snapshot, and
    it would have nTx set, nChainTx unset, and prev->nChainTx set with a fake
    value, so the assert would fail. After the fix, prev->nChainTx is unset instead
    of being set to a fake value, so the assert succeeds. This test was originally
    posted by maflcko in
    https://github.com/bitcoin/bitcoin/issues/29261#issuecomment-1918947945
    
    Co-authored-by: MarcoFalke <*~=`'#}+{/-|&$^_@721217.xyz>
    0391458d76
  138. test: assumeutxo snapshot block CheckBlockIndex crash test
    Add a test for a CheckBlockIndex crash that would happen before previous
    "assumeutxo: Get rid of faked nTx and nChainTx values" commit.
    
    The crash was an assert failure in the (pindex->nChainTx == pindex->nTx +
    prev_chain_tx) check that would previously happen if the snapshot block was
    submitted after loading the snapshot and downloading a few blocks after the
    snapshot. In that case ReceivedBlockTransactions() previously would overwrite
    the nChainTx value of the submitted snapshot block with a fake value based on
    the previous block, so the (pindex->nChainTx == pindex->nTx + prev_chain_tx)
    check would later fail on the first block after the snapshot. This test was
    originally posted by Martin Zumsande <mzumsande@gmail.com> in
    https://github.com/bitcoin/bitcoin/pull/29370#issuecomment-1974096225
    
    Co-authored-by: Martin Zumsande <mzumsande@gmail.com>
    ef174e9ed2
  139. assumeutxo: Remove BLOCK_ASSUMED_VALID flag
    Flag adds complexity and is not currently used for anything.
    9d9a7458a2
  140. ryanofsky force-pushed on Mar 18, 2024
  141. ryanofsky referenced this in commit 302b878ba2 on Mar 18, 2024
  142. ryanofsky referenced this in commit d1edd8ee36 on Mar 18, 2024
  143. ryanofsky referenced this in commit 6e634078a1 on Mar 18, 2024
  144. ryanofsky commented at 6:12 pm on March 18, 2024: contributor
    Rebased 345ac98bb893da7c1f0b573d4da73711b2139391 -> 9d9a7458a2570f7db56ab626b22010591089c312 (pr/nofake.20 -> pr/nofake.21, compare) due to conflict with #29478
  145. DrahtBot removed the label Needs rebase on Mar 18, 2024
  146. Sjors commented at 10:42 am on March 19, 2024: member

    re-ACK 9d9a7458a2570f7db56ab626b22010591089c312

    I also tried generating and loading a signet snapshot with this PR combined with #29612.

  147. DrahtBot requested review from fjahr on Mar 19, 2024
  148. mzumsande commented at 9:45 pm on March 19, 2024: contributor

    Tested ACK 9d9a7458a2570f7db56ab626b22010591089c312

    I reviewed the code and tested this running with checkblockindex on signet.

  149. in src/validation.cpp:5149 in 9b97d5bbf9 outdated
    5149-                    // even if some data has been pruned.
    5150+                    // If pindex and all its parents downloaded transactions,
    5151+                    // and the transactions were not pruned (pindexFirstMissing
    5152+                    // is null), it is a potential candidate. The check
    5153+                    // excludes pruned blocks, because if any blocks were
    5154+                    // pruned between pindex the current chain tip, pindex will
    


    maflcko commented at 9:56 am on March 20, 2024:
    nit: 9b97d5bbf980d657a277c85d113c2ae3e870e0ec: pindex *and* the current
  150. DrahtBot requested review from maflcko on Mar 20, 2024
  151. in test/functional/feature_assumeutxo.py:335 in ef174e9ed2 outdated
    330+        # blocks downloaded, but its useful to test because it triggers more
    331+        # corner cases in ReceivedBlockTransactions() and CheckBlockIndex()
    332+        # setting and testing nChainTx values, and it exposed previous bugs.
    333+        snapshot_hash = n0.getblockhash(SNAPSHOT_BASE_HEIGHT)
    334+        snapshot_block = n0.getblock(snapshot_hash, 0)
    335+        n1.submitblock(snapshot_block)
    


    maflcko commented at 11:35 am on March 20, 2024:

    nit in ef174e9ed21c08f38e5d4b537b6decfd1f646db9:

    I wonder if it would be better to randomly either submit or not submit the block, because the test should be passing in both cases.

    If a crash were to happen, it would be intermittent. However, it would still be deterministic, because the randomness seed is printed/logged.

  152. maflcko approved
  153. maflcko commented at 11:47 am on March 20, 2024: member

    ACK 9d9a7458a2570f7db56ab626b22010591089c312 🎯

    Signature:

    0untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
    1RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
    2trusted comment: ACK 9d9a7458a2570f7db56ab626b22010591089c312 🎯
    3m+XWEuj+TMxTAm2vqwQ2KlIlqiGQf/ADm9CzC8dphRCYKKQo4VacrJ9sy4MIZjiOABr94CzwXsoVSmSPlrElBw==
    
  154. achow101 assigned achow101 on Mar 20, 2024
  155. achow101 commented at 4:49 pm on March 20, 2024: member
    ACK 9d9a7458a2570f7db56ab626b22010591089c312
  156. achow101 merged this on Mar 20, 2024
  157. achow101 closed this on Mar 20, 2024

  158. fjahr commented at 7:11 pm on March 20, 2024: contributor
    Post merge utACK 9d9a7458a2570f7db56ab626b22010591089c312

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-22 00:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me