validation: ensure assumevalid is always used during reindex #31615

pull Eunovo wants to merge 2 commits into bitcoin:master from Eunovo:31494-reindex-assumevalid changing 5 files +150 −17
  1. Eunovo commented at 11:52 am on January 7, 2025: contributor

    assumevalid may be ignored during reindex for the following reasons:

    • the assumevalid block is not in the block index: this happens when the assumevalid block could not be downloaded before the previous IBD run was interrupted
    • the chainwork of the best header is less than the minimumchainwork: happens if the previous IBD was interrupted before it could connect blocks up to minimumchainwork. See Issue #31494

    This PR fixes this by skipping script verification during reindex if assumevalid is enabled.

    EDIT: While the minimumchainwork is typically set to match the assumevalid block height, users typically adjust the assumevalid block without also adjusting the minimumchainwork; Scenario 2 from the list can occur if the assumevalid block is set to an older block without also reducing the minimumchainwork.

  2. DrahtBot commented at 11:52 am on January 7, 2025: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/31615.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Concept NACK stickies-v
    Concept ACK mzumsande, TheCharlatan
    Stale ACK stratospher, l0rinc

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    No conflicts as of last run.

  3. in src/validation.cpp:2502 in 951812b0ac outdated
    2498@@ -2499,7 +2499,7 @@ bool Chainstate::ConnectBlock(const CBlock& block, BlockValidationState& state,
    2499         if (it != m_blockman.m_block_index.end()) {
    2500             if (it->second.GetAncestor(pindex->nHeight) == pindex &&
    2501                 m_chainman.m_best_header->GetAncestor(pindex->nHeight) == pindex &&
    2502-                m_chainman.m_best_header->nChainWork >= m_chainman.MinimumChainWork()) {
    2503+                (m_chainman.m_best_header->nChainWork >= m_chainman.MinimumChainWork() || m_chainman.m_blockman.LoadingBlocks())) {
    


    mzumsande commented at 3:39 pm on January 7, 2025:
    I think that this won’t achieve the desired result: If the assumevalid block is not in the block index (which will be the case if IBD crashed midway before that block was downloaded, and a reindex is done) we’ll abort above in line 2499 and never get to this spot.

    Eunovo commented at 6:29 pm on January 7, 2025:
    That’s true. I thought #31494 was focused on the best header chainwork not being up to minimumchainwork. I could change this to always skip script checks during a reindex, but IIUC having a block on disk doesn’t mean that it was connected during the previous IBD. Might be worth it to sync headers first then connect blocks even during reindex.

    mzumsande commented at 9:23 pm on January 7, 2025:

    I really wasn’t sure what the best solution is when I opened the issue and still am not, but these are my thoughts:

    Yes, there are two reasons why we might not apply assumevalid during a reindex currently: 1.) The best header has not enough chainwork 2.) The assumevalid block is not in our block index

    I could change this to always skip script checks during a reindex, but IIUC having a block on disk doesn’t mean that it was connected during the previous IBD.

    I think the question is rather how well we can avoid to connect blocks skipping the script checks during the reindex that we would have fully validated in the original run if we hadn’t crashed. Assuming that the user provides the same -assumevalid arg to the reindex run that they did in the original run I can see the following cases: 1.) The user provides an existing block (or nothing, so that the default hash from chainparams is used) that we just haven’t downloaded yet because it’s later in the best chain - if we don’t have it in our index we’d be justified to skip script checks - if we do have it in our index, no special rules are necessary for reindexing. I think that’s the most common case for users. 2.) The user provides -assumevalid=0 - we should respect that also in the reindex run and fully validate everything. 3.) The user provides a block hash with -assumevalid that is non-existing / not in the best chain. I think that’s the critical case because here we would have fully validated everything in the original run, but won’t validate anything during the reindex. But arguably that’s not a common case / the user’s fault for providing this bad block, so maybe it can be tolerated?

    Might be worth it to sync headers first then connect blocks even during reindex.

    Yes, that’s the second option I thought of. I think we can’t just stop and wait for the headers to be downloaded - reindex should be able to finish without having a p2p connection, even if that means no assumevalid. But I was thinking of maybe just disabling the relevant LoadingBlocks() guards in net_processing, so that we’d download headers in parallel to ActivateBestChain(), and would change to use assumevalid after a few thousand blocks or so. That seem a bit messy (we’d still connect some blocks without assumevalid) and requires testing but might work well in practice?!


    Eunovo commented at 1:59 pm on January 10, 2025:

    I created another test to check if blocks rejected in the normal IBD run get loaded during a -reindex. The test reveals that it does load the invalid block and it will connect it if script checks are skipped, see https://github.com/Eunovo/bitcoin/commit/1ff4e5ee78418a30ba1ea9c384b8c7ee43435ce6

    IIUC During a reindex, the node does not know what blocks were connected in the previous run so it can’t safely skip checks. It looks to me like we have to:

    1. Sync headers before connecting blocks during reindex OR
    2. Persist data to show that we previously connected this block (IIUC this exact data exists but gets deleted during reindex, maybe we can preserve the relevant data we need?)

    Eunovo commented at 2:04 pm on January 10, 2025:

    The test reveals that it does load the invalid block and it will connect it if script checks are skipped, see Eunovo@1ff4e5e

    I’m still not sure if this causes any damage because I expect the node to eventually pick the most-work chain when it syncs headers

  4. Eunovo commented at 11:56 am on January 18, 2025: contributor

    I had an offline conversation with @mzumsande. During the reindex, we can apply assumevalid for following cases:

    • If the assumevalid block is the index, then we can skip script checks for all ancestors of the assumevalid block
    • If the assumevalid block is not in the index, then the initial IBD run had not downloaded the assumevalid block yet then the blocks on disk are ancestors of the assumevalid block and we can still skip script verification

    Then we download headers from peers in parallel to ActivateBestChain()

    I’ll put this in draft while I implement and test this over the next few days

  5. Eunovo marked this as a draft on Jan 18, 2025
  6. Eunovo force-pushed on Jan 29, 2025
  7. DrahtBot added the label CI failed on Jan 29, 2025
  8. Eunovo force-pushed on Jan 29, 2025
  9. DrahtBot removed the label CI failed on Jan 29, 2025
  10. l0rinc commented at 3:17 pm on January 30, 2025: contributor

    I’m not sure whether LoadingBlocks is the correct way to solve it or not (I would need someone else to opine on that, I can’t tell if this can err on the side of not checking the scripts at all for some edge case).

    I can confirm however that this does seem to solve the issue I had, reported in #31494 (i.e. reindex(-chainstate) after a partial IBD always enabling script verification).


    To test it, I have added a LogWarning("fScriptChecks is %s", fScriptChecks); after the assignments and ran it with master vs the branch:

    0   cmake -B build -DCMAKE_BUILD_TYPE=Debug >/dev/null 2>&1 && cmake --build build -j$(nproc) >/dev/null 2>&1\
    1&& mkdir -p demo && build/bin/bitcoind -datadir=demo -stopatheight=1000 >/dev/null && build/bin/bitcoind -datadir=demo -reindex -stopatheight=100 >/dev/null \
    2&& build/bin/bitcoind -datadir=demo -reindex-chainstate -stopatheight=10 | grep "script verification"
    

    master

    0[warning] fScriptChecks is true
    

    Eunovo:31494-reindex-assumevalid

    0[warning] fScriptChecks is false
    

    I can also confirm that adding -assumevalid=0 will enable script checking from the beginning and that -assumevalid=000000009b7262315dbf071787ad3656097b892abffd1f95a1a022f896f533fc has 5 fScriptChecks=false values after which script checking is (correctly) enabled again.


    I also checked that the new assume_valid test passes here and fails correctly on master (though it’s very slow, we have to check if there’s a way to speed it up somewhat)

     02025-01-30T15:08:22.897000Z TestFramework (INFO): Success
     12025-01-30T15:08:31.343000Z TestFramework (INFO): Restarting node while reindexing..
     22025-01-30T15:08:32.551000Z TestFramework (INFO): Testing reindex with -assumevalid
     32025-01-30T15:09:23.963000Z TestFramework (INFO): Testing reindex with assumevalid block in block index and minchainwork > best_header chainwork
     42025-01-30T15:09:52.592000Z TestFramework (ERROR): Assertion failed
     5Traceback (most recent call last):
     6  File "/Users/lorinc/IdeaProjects/bitcoin/test/functional/test_framework/test_framework.py", line 135, in main
     7    self.run_test()
     8    ~~~~~~~~~~~~~^^
     9  File "/Users/lorinc/IdeaProjects/bitcoin/build/test/functional/feature_reindex.py", line 203, in run_test
    10    self.assume_valid()
    11    ~~~~~~~~~~~~~~~~~^^
    

    So from me it’s an approach ACK, but someone else needs to validate the implementation details and the corner cases that can result from this.

  11. Eunovo commented at 3:51 pm on January 30, 2025: contributor

    I’m not sure whether LoadingBlocks is the correct way to solve it or not (I would need someone else to opine on that, I can’t tell if this can err on the side of not checking the scripts at all for some edge case).

    I agree too. LoadingBlocks() is true when blocks are being reindexed or when new blocks are being loaded with -loadblocks . I intend to move away from LoadingBlocks() to something that’s only true when blockfiles are being reindexed.

    I also checked that the new assume_valid test passes here and fails correctly on master (though it’s very slow, we have to check if there’s a way to speed it up somewhat)

    I’ll look into this!

  12. Eunovo force-pushed on Feb 13, 2025
  13. mzumsande commented at 9:49 pm on March 7, 2025: contributor

    Concept ACK

    I think the current approach would work. Even though it means that if the user provided a bad / non-existing assumeutxo hash they could skip script validation during a reindex when they wouldn’t have in the original run, I’d say that’s not a major problem because a reindex should only be necessary in case of a a db corruption, which should not be remotely triggerable.

    So it will hopefully make the reindex a bit faster in case of a database corruption, while not changing the assumevalid security assumption very much, and it does respect -assumevalid=0.

    Though it would be good to have some experienced contributors chime in that were around when assumevalid was introduced (#9484).

  14. Eunovo marked this as ready for review on Mar 8, 2025
  15. Eunovo commented at 6:33 am on March 8, 2025: contributor
    Thanks for the reviews @l0rinc and @mzumsande . Due to your feedback, I have opened the PR for general review.
  16. DrahtBot added the label CI failed on Mar 21, 2025
  17. DrahtBot commented at 7:47 am on March 21, 2025: contributor
    Needs rebase, if still relevant
  18. Eunovo force-pushed on Mar 24, 2025
  19. Eunovo renamed this:
    Ensure assumevalid is always used during reindex
    validation: ensure assumevalid is always used during reindex
    on Mar 24, 2025
  20. Eunovo commented at 11:39 am on March 24, 2025: contributor

    Rebased on master https://github.com/bitcoin/bitcoin/commit/770d39a37652d40885533fecce37e9f71cc0d051

  21. DrahtBot removed the label CI failed on Mar 24, 2025
  22. DrahtBot added the label Validation on Mar 24, 2025
  23. in src/node/blockstorage.cpp:1232 in e1b42458cc outdated
    1227@@ -1213,6 +1228,11 @@ void ImportBlocks(ChainstateManager& chainman, std::span<const fs::path> import_
    1228             }
    1229             nFile++;
    1230         }
    1231+
    1232+        // Call ActivateBestChain here so we can skip script verification
    


    mzumsande commented at 8:41 pm on May 15, 2025:
    nit: I’d replace “here” with “before setting m_blockfiles_indexed” so the reason becomes more clear.

    Eunovo commented at 11:59 am on June 18, 2025:
    Done
  24. in src/node/blockstorage.cpp:1193 in e1b42458cc outdated
    1189@@ -1190,6 +1190,21 @@ void ImportBlocks(ChainstateManager& chainman, std::span<const fs::path> import_
    1190 {
    1191     ImportingNow imp{chainman.m_blockman.m_importing};
    1192 
    1193+    auto update_all_chainstate = [&]() {
    


    mzumsande commented at 8:42 pm on May 15, 2025:
    I’d prefer to call it activate_all_chainstates to not introduce a less precise term (“update”).

    Eunovo commented at 11:59 am on June 18, 2025:
    Done
  25. in src/node/blockstorage.cpp:1258 in e1b42458cc outdated
    1265-        if (!chainstate->ActivateBestChain(state, nullptr)) {
    1266-            chainman.GetNotifications().fatalError(strprintf(_("Failed to connect best block (%s)."), state.ToString()));
    1267-            return;
    1268-        }
    1269-    }
    1270+    update_all_chainstate();
    


    mzumsande commented at 9:27 pm on May 15, 2025:
    This means that in case of a reindex, ActivateBestChain(), will be called twice now from ImportBlocks, but that shouldn’t be a problem because the second call just won’t do any work.
  26. in test/functional/feature_reindex.py:206 in cb29bc3c30 outdated
    188+
    189+        # node0 should still sync to tip of the chain, ignoring the invalid signature
    190+        assert_equal(node.getblock(node.getbestblockhash())['height'], expected_height)
    191+        assert_equal(int(node.getbestblockhash(), 16), blocks[2100].sha256)
    192+
    193+        self.log.info("Success")
    


    mzumsande commented at 9:54 pm on May 15, 2025:
    would be good to add a subtest that a run without -assumevalid will not connect the invalid block.

    Eunovo commented at 8:12 am on May 16, 2025:
    Will do

    Eunovo commented at 11:59 am on June 18, 2025:
    Done
  27. in test/functional/feature_reindex.py:158 in cb29bc3c30 outdated
    153+            block = create_block(tip, create_coinbase(height), block_time)
    154+            block.solve()
    155+            blocks.append(block)
    156+            tip = block.sha256
    157+
    158+        # Start node0 with -assumevalid pointing to the block with invalid signature
    


    mzumsande commented at 9:57 pm on May 15, 2025:
    Comment needs updating, -assumevalid points to the last block, not the invalid one.

    Eunovo commented at 12:00 pm on June 18, 2025:
    Done
  28. in test/functional/feature_reindex.py:151 in cb29bc3c30 outdated
    145+        invalid_block.solve()
    146+        tip = invalid_block.sha256
    147+        blocks.append(invalid_block)
    148+
    149+        # Bury that block with roughly two weeks' worth of blocks (2100 blocks) and an extra 100 blocks
    150+        for _ in range(2200):
    


    mzumsande commented at 10:02 pm on May 15, 2025:
    is there a reason we need to mine that many blocks, and not just a few more to make the test faster?

    Eunovo commented at 8:31 am on May 16, 2025:
    Are you referring to the 100 blocks mined after the first 2100 blocks?

    mzumsande commented at 3:31 pm on May 16, 2025:
    I meant the 2100 (two weeks) of blocks. I didn’t test this, but is there a reason this couldn’t just be a few blocks (plus a few more that aren’t sent to the node)?

    Eunovo commented at 4:42 pm on May 16, 2025:

    Script checks are only skipped when the current block is roughly 2 weeks away from the tip. See https://github.com/bitcoin/bitcoin/blob/3f83c744ac28b700090e15b5dda2260724a56f49/src/validation.cpp#L2476-L2494

    We have to bury the test block 2 weeks deep for script verification to the skipped


    mzumsande commented at 5:30 pm on May 16, 2025:
    oh I see, forgot about that check - can be resolved.

    stratospher commented at 8:15 am on July 4, 2025:
    the tests now take 2m34.878s on my system (was around 18 seconds before). since the tests takes much more time now, maybe remove it from the “Tests less than 30s” section in test_runner.py?

    Eunovo commented at 1:53 pm on July 4, 2025:
    @stratospher the test completes much faster on my system, but by checking the timing of other tests, I put it among “Tests less than 2m” section because it completes faster than mining_getblocktemplate_longpoll.py than slower than mempool_persist.py
  29. DrahtBot added the label CI failed on Jun 15, 2025
  30. in test/functional/feature_reindex.py:143 in cb29bc3c30 outdated
    138+
    139+        # Create a block with an invalid signature
    140+        tx = CTransaction()
    141+        tx.vin.append(CTxIn(COutPoint(int(coinbase_to_spend, 16), 0), scriptSig=b""))
    142+        tx.vout.append(CTxOut(49 * 10000, CScript([OP_TRUE])))
    143+        tx.calc_sha256()
    


    DrahtBot commented at 8:49 am on June 16, 2025:
                                   AttributeError: 'CTransaction' object has no attribute 'calc_sha256'
    
  31. Eunovo force-pushed on Jun 18, 2025
  32. Eunovo commented at 12:02 pm on June 18, 2025: contributor
  33. DrahtBot removed the label CI failed on Jun 18, 2025
  34. in test/functional/feature_reindex.py:178 in 11a9d55e84 outdated
    173+        expected_height = height - 100
    174+        # node0 should sync to tip of the chain, ignoring the invalid signature
    175+        assert_equal(node.getblock(node.getbestblockhash())['height'], expected_height)
    176+
    177+        self.log.info("Testing reindex with assumevalid block in block index and minchainwork > best_header chainwork")
    178+        # Restart node0 but raise minimumchainwork to that of a chain 32256 blocks long
    


    stratospher commented at 7:04 am on July 4, 2025:
    any particular reason for having this high minimum chain work? (ex: for 32256 blocks instead of 3256 blocks). also curious how you computed it?

    Eunovo commented at 1:46 pm on July 4, 2025:

    For this test, I need the minimumchainwork to be higher than the best_header chainwork. The height of the best_header here is 3714, so the minimumchainwork needs to be for a chain considerably higher than 3714 blocks. Antoine had provided the exact chain work for 32256 blocks in this stack exchange answer, so I used that.

    Note that the minimumchainwork used doesn’t affect the duration of the test, it just checks that the minimumchainwork parameter is ignored during the reindex.


    stratospher commented at 3:09 pm on July 4, 2025:
    ah I see, but that’s on mainnet and not on regtest, so comment might need an edit. agree that the value doesn’t affect the test!

    Eunovo commented at 3:51 am on July 7, 2025:
    I modified the test to calculate chainwork to be approximately 100 blocks greater than the current best_header, and I checked that the test still verifies the expected behaviour. See https://github.com/bitcoin/bitcoin/pull/31615/commits/a53d43bc0f010513b0922ad48d611f402ec0e511
  35. stratospher commented at 8:31 am on July 4, 2025: contributor

    ACK 11a9d55e.

    Agree that applying reindex-assumevalid combo can be useful in the special cases mentioned above to speedup reindexing when we don’t have all blocks.

    A behaviour change during assumevalid+reindex now would be:

    • we’d skip script checks even when the assumevalid blockindex doesn’t exist (ex:we put assumevalid=somegarbage(or)blockhash-with-typo)
    • this was not the case on master where we skip script checks only when know that the assumevalid blockindex exists.

    Don’t think it’s a problem since assumevalid already puts responsibility on the user and we’d need to have tampered block files on disk from some external source (in which case user shouldn’t be using assumevalid in the first place). just wanted to mention it + curious about what others think about this/any other possible behaviour change.

  36. DrahtBot requested review from mzumsande on Jul 4, 2025
  37. DrahtBot requested review from l0rinc on Jul 4, 2025
  38. Eunovo force-pushed on Jul 4, 2025
  39. Eunovo commented at 1:49 pm on July 4, 2025: contributor

    Rebased https://github.com/bitcoin/bitcoin/pull/31615/commits/11a9d55e84af2d6ff138b99310720667cd59dcc3 to https://github.com/bitcoin/bitcoin/pull/31615/commits/e260e23d88cd6b2aaa03d9d45dae8b174f2cfe1f

    • moved feature_reindex.py test to “Tests less than 2m” section in test_runner.py, because it is now much slower and has similar runtimes to other tests in this section
  40. Eunovo force-pushed on Jul 7, 2025
  41. stratospher commented at 8:25 am on July 7, 2025: contributor
    reACK a53d43b.
  42. DrahtBot added the label CI failed on Jul 18, 2025
  43. in test/functional/feature_reindex.py:147 in a53d43bc0f outdated
    142+        tx.vin.append(CTxIn(COutPoint(int(coinbase_to_spend, 16), 0), scriptSig=b""))
    143+        tx.vout.append(CTxOut(49 * 10000, CScript([OP_TRUE])))
    144+
    145+        invalid_block = create_block(tip, create_coinbase(height), block_time, txlist=[tx])
    146+        invalid_block.solve()
    147+        tip = invalid_block.sha256
    


    DrahtBot commented at 6:06 am on July 21, 2025:
                                   AttributeError: 'CBlock' object has no attribute 'sha256'
    
  44. Eunovo force-pushed on Jul 21, 2025
  45. DrahtBot removed the label CI failed on Jul 21, 2025
  46. Eunovo commented at 9:23 am on July 21, 2025: contributor
  47. in src/validation.cpp:2488 in 01e372b258 outdated
    2484@@ -2485,9 +2485,15 @@ bool Chainstate::ConnectBlock(const CBlock& block, BlockValidationState& state,
    2485                 //  that are hardly doing any signature verification at all in testing without having to
    2486                 //  artificially set the default assumed verified block further back.
    2487                 // The test against the minimum chain work prevents the skipping when denied access to any chain at
    2488-                //  least as good as the expected chain.
    2489+                // least as good as the expected chain.
    


    l0rinc commented at 6:27 pm on July 21, 2025:
    nit: the leading space was deliberate, it’s used in other sentences as well to signal that it’s a continuation. We can either keep this style or remove it from this block comment, but only removing it on this line only is inconsistent. For simplicity and diff minimization I’d just revert this line.
  48. in src/node/blockstorage.cpp:1221 in 01e372b258 outdated
    1217@@ -1218,6 +1218,21 @@ void ImportBlocks(ChainstateManager& chainman, std::span<const fs::path> import_
    1218 {
    1219     ImportingNow imp{chainman.m_blockman.m_importing};
    1220 
    1221+    auto activate_all_chainstate = [&]() {
    


    l0rinc commented at 6:30 pm on July 21, 2025:

    nit: we don’t need to drag all context into the lambda, consider (+ () is redundant):

    0    auto activate_all_chainstate = [&chainman] {
    
  49. in src/node/blockstorage.cpp:1262 in 01e372b258 outdated
    1255@@ -1241,6 +1256,11 @@ void ImportBlocks(ChainstateManager& chainman, std::span<const fs::path> import_
    1256             }
    1257             nFile++;
    1258         }
    1259+
    1260+        // Call ActivateBestChain before setting m_blockfiles_indexed so we can skip script verification
    1261+        // during reindex if assumevalid is enabled
    1262+        activate_all_chainstate();
    


    l0rinc commented at 6:42 pm on July 21, 2025:
    could we do this change in a separate commit, explaining in the commit messages separately why this needed? And I might be wrong here but I don’t understand why we need to run this twice - which is the case for reindex as far as I can tell.

    Eunovo commented at 8:54 am on July 22, 2025:

    ImportBlocks first tries to load blocks from existing block files during the reindex process. After the reindex process, it tries to load any external blocks provided by the user. We only want to force the use of assumevalid(if set) during the reindex process, not when loading external blocks.

    We call activate_all_chainstate the first time, during the reindex process. This allows us to skip script checks using modified assumevalid rules for reindex. We call activate_all_chainstate the second time, after external blocks have been loaded, so we can search for better chains among the new blocks. If no external blocks were loaded activate_all_chainstate will have no effect.

    I have added a comment explaining that the second call will have no effect if no external blocks were imported.

    I am not convinced that this should be in a separate commit. This change only makes sense in the context of this commit.


    TheCharlatan commented at 11:21 am on October 13, 2025:
    I’m a bit confused by this change too. Previously we wrote the reindexing flag before starting to connect blocks (through the call to ABC), but now it seems like we are connecting blocks first, and only then writing the reindexing flag again?

    mzumsande commented at 8:35 pm on October 14, 2025:
    #31615 (review) is relevant to this, too. Basically, we want to connect blocks with the reindexing flag set during a reindex, but need a ABC call in the case of no -reindex.
  50. in src/node/blockstorage.cpp:1292 in 01e372b258 outdated
    1294-            chainman.GetNotifications().fatalError(strprintf(_("Failed to connect best block (%s)."), state.ToString()));
    1295-            return;
    1296-        }
    1297-    }
    1298+    activate_all_chainstate();
    1299     // End scope of ImportingNow
    


    l0rinc commented at 6:44 pm on July 21, 2025:
    does this comment still make sense?

    Eunovo commented at 7:51 am on July 22, 2025:
    Yes, it does. The ImportingNow object is destroyed at the end of the function.
  51. in test/functional/feature_reindex.py:124 in 01e372b258 outdated
    120@@ -97,6 +121,90 @@ def continue_reindex_after_shutdown(self):
    121             node.wait_for_rpc_connection(wait_for_import=False)
    122         node.stop_node()
    123 
    124+    # Check that reindex uses -assumevalid
    


    l0rinc commented at 6:58 pm on July 21, 2025:
    pretty good test, it covers every new added line!
  52. in test/functional/test_runner.py:107 in 01e372b258 outdated
    102@@ -103,6 +103,7 @@
    103     'p2p_segwit.py',
    104     'feature_maxuploadtarget.py',
    105     'feature_assumeutxo.py',
    106+    'feature_reindex.py',
    


    l0rinc commented at 7:00 pm on July 21, 2025:
    What’s the reason for moving this? If it’s needed, could you add it to the commit message?

    Eunovo commented at 8:28 am on July 22, 2025:
    Done.
  53. l0rinc approved
  54. l0rinc commented at 7:01 pm on July 21, 2025: contributor

    ACK 01e372b25855a21d205c6ded83f6849691111d42

    Thanks for fixing it - someone with more knowledge in this area should also validate it.

  55. DrahtBot requested review from stratospher on Jul 21, 2025
  56. Eunovo force-pushed on Jul 22, 2025
  57. Eunovo force-pushed on Jul 22, 2025
  58. in src/node/blockstorage.cpp:1287 in 89b5b607e3 outdated
    1294-            chainman.GetNotifications().fatalError(strprintf(_("Failed to connect best block (%s)."), state.ToString()));
    1295-            return;
    1296-        }
    1297-    }
    1298+    // Call activate_all_chainstate() again, after loading blocks from files,
    1299+    // If no blocks were imported, ActivateBestChain will nothing to do
    


    l0rinc commented at 5:39 pm on July 22, 2025:
    0    // If no blocks were imported, ActivateBestChain will have nothing to do
    

    Eunovo commented at 6:47 am on July 23, 2025:
    Done
  59. in src/validation.cpp:2495 in 89b5b607e3 outdated
    2490+                //  before it could connect enough blocks to reach the minimumchainwork.
    2491                 fScriptChecks = (GetBlockProofEquivalentTime(*m_chainman.m_best_header, *pindex, *m_chainman.m_best_header, params.GetConsensus()) <= 60 * 60 * 24 * 7 * 2);
    2492             }
    2493+        } else if (m_chainman.m_blockman.IsReindexing()) {
    2494+            // During a reindex, the assumed valid block may not be in the index
    2495+            // if the previous IBD run was interrupted before it downloaded the assume valid block.
    


    l0rinc commented at 5:39 pm on July 22, 2025:

    To be consistent with previous line:

    0            // if the previous IBD run was interrupted before it downloaded the assumed valid block.
    

    Eunovo commented at 6:48 am on July 23, 2025:
    Done
  60. in src/node/blockstorage.cpp:1 in 25b875a95c outdated


    l0rinc commented at 5:43 pm on July 22, 2025:

    Commit message is inconsistent with comments: assumedvalid block is written in your code as assumed valid block. I would be fine with assumevalid block as well, as long as they’re consistent.

    nit: sentence ends in full block yet - we could add a fullstop.


    Eunovo commented at 6:48 am on July 23, 2025:
    Done
  61. l0rinc approved
  62. l0rinc commented at 5:46 pm on July 22, 2025: contributor

    ACK 89b5b607e3380ff2cf03d8199c70e655e8c265cb

    Consider unifying the way your refer to the first block that fulfills the assumevalid condition

  63. Eunovo force-pushed on Jul 23, 2025
  64. l0rinc commented at 4:27 am on September 10, 2025: contributor

    Just tried this on my rpi4b server which keeps enabling signature validation after a reindex. Restarting it with only the args ./build/bin/bitcoind -dbcache=5000 -datadir=$DATA_DIR -assumevalid=0000000000000000000087564caa77e7b3f29d0464256c04d5539e43663f8698 still shows heavy signature validation (based on perf top).

    Adding some logging reveals that the reason for reenabling it is that the assumedcalid block is not actually found in the block index for some reason: https://github.com/bitcoin/bitcoin/blob/master/src/validation.cpp#L2433-L2434

  65. Eunovo commented at 1:13 pm on September 10, 2025: contributor

    Just tried this on my rpi4b server which keeps enabling signature validation after a reindex. Restarting it with only the args ./build/bin/bitcoind -dbcache=5000 -datadir=$DATA_DIR -assumevalid=0000000000000000000087564caa77e7b3f29d0464256c04d5539e43663f8698 still shows heavy signature validation @l0rinc Can you share steps to reproduce? Did the previous reindex use signature validation? Was your node pruned?

  66. l0rinc commented at 6:13 pm on September 10, 2025: contributor
    Not pruned, but did many reindex-chainstates (it’s a rpi4b benchmarking server), maybe a few reindexes as well. I don’t know how to reliably reproduce it, but it always happen on that server with the given state. I will try to copy the data over when the other servers free up.
  67. l0rinc commented at 4:11 pm on September 11, 2025: contributor

    While just restarting the node with this commit didn’t help for some reason, doing a reindex with it does seem to disable signature verification successfully. But it doesn’t actually seem to fix the underlying problem, since ./build/bin/bitcoin-cli -datadir=$DATA_DIR getblockchaininfo | jq .headers still returns 841150 for me.

    So while it does seem to work around the issue described in #33336 (comment), it doesn’t seem to solve the underlying problem of reindex ignoring the later headers.

  68. in src/validation.cpp:2490 in 6923417e09 outdated
    2485@@ -2486,8 +2486,14 @@ bool Chainstate::ConnectBlock(const CBlock& block, BlockValidationState& state,
    2486                 //  artificially set the default assumed verified block further back.
    2487                 // The test against the minimum chain work prevents the skipping when denied access to any chain at
    2488                 //  least as good as the expected chain.
    2489+                // During a reindex, skip the minimumchainwork check because the previous IBD run may have been interrupted
    2490+                //  before it could connect enough blocks to reach the minimumchainwork.
    


    ajtowns commented at 0:49 am on September 12, 2025:

    The minimumchainwork should match the assumevalid block, so if minimumchainwork isn’t reached the assumevalid block won’t be present and this code block won’t be executed.

    So this is only relevant for reindexing when you’re manually choosing an older assumevalid block than the default, and aren’t also adjusting the minchainwork. (If you choose a more recent assumevalid block, then the default minchainwork will be satisfied anyway)


    Eunovo commented at 9:00 am on September 15, 2025:
    I think it still makes sense to account for this scenario since minimumchainwork and assumevalid block are set separately. I’ll update the comment to explain that minimumchainwork only needs to be skipped if you set an older assumevalid block without also reducing the minimumchainwork.
  69. DrahtBot added the label Needs rebase on Sep 12, 2025
  70. Eunovo force-pushed on Sep 15, 2025
  71. Eunovo requested review from DrahtBot on Sep 15, 2025
  72. Eunovo requested review from ajtowns on Sep 15, 2025
  73. Eunovo requested review from l0rinc on Sep 15, 2025
  74. Eunovo commented at 10:33 am on September 15, 2025: contributor
    Rebased on master and added a comment explaining why we bother to skip minimumchainwork, even though we expect minimumchainwork to match assumevalid block height most times.
  75. DrahtBot removed the label Needs rebase on Sep 15, 2025
  76. in src/validation.cpp:2460 in d7e4854990 outdated
    2455+                // Therefore, during reindex operations, we bypass the minimumchainwork check since a prior IBD
    2456+                //  may have been interrupted before reaching the required chainwork threshold.
    2457                 fScriptChecks = (GetBlockProofEquivalentTime(*m_chainman.m_best_header, *pindex, *m_chainman.m_best_header, params.GetConsensus()) <= 60 * 60 * 24 * 7 * 2);
    2458             }
    2459+        } else if (m_chainman.m_blockman.IsReindexing()) {
    2460+            // During a reindex, the assumed valid block may not be in the index
    


    l0rinc commented at 6:59 pm on September 25, 2025:
    While the change does seem to fix my personal experience of unexpected script validations, I don’t yet understand why reindex does this in the first place - can we fix that instead of adding extra conditions to the script validation checks? Seems a bit hacky and only solves part of the problem, doesn’t it?

    Eunovo commented at 7:49 am on September 26, 2025:
    Looks like you are suggesting we do something more invasive. Can you clarify what you mean? I do not believe something more invasive is required since this simple fix works, but I’m happy to hear what others think.

    l0rinc commented at 6:06 pm on September 26, 2025:
    Not necessarily, but it bothers me that we (or at least I) don’t understand what’s causing the problem in the first place, the current solution feels like a work-around to me - which is still better than what we had, but it’s also arguably more complicated and it doesn’t seem to fix the underlying problem of why reindex is missing headers in the first place. And are we sure it’s a reindex-only bug or can it happen for reindex-chainstate or IBD as well? I haven’t dug too deep into this, but I want it to be fixed and was wondering if the apparent lack of interest in the reviews is because others also felt it’s a workaround. Thanks for working on this.

    mzumsande commented at 6:29 pm on September 26, 2025:

    Not necessarily, but it bothers me that we (or at least I) don’t understand what’s causing the problem in the first place

    I tried to describe this in the linked issue, I think the problem is understood. To rephrase it:

    Normal IBD has the order 1) Download entire header chain until we have crossed minchainwork 2) Start downloading blocks => assumevalid will be applied because we have a header with minchainwork and the assumevalid block in our index

    During reindex, we 1) Delete all existing headers 2) Read blocks and headers of just the blocks we have saved on disk 3) Start connecting blocks - without ever downloading the full header chain from peers.

    So if we didn’t have the assumevalid block / a minchainwork full block on disk before starting the reindex (because we didn’t get that far in our previous IBD) assumevalid will not be applied.

  77. TheCharlatan commented at 11:23 am on October 13, 2025: contributor
    Concept ACK
  78. in src/node/blockstorage.cpp:1287 in 45126d1ed9 outdated
    1294-            chainman.GetNotifications().fatalError(strprintf(_("Failed to connect best block (%s)."), state.ToString()));
    1295-            return;
    1296-        }
    1297-    }
    1298+    // Call activate_all_chainstate() again, after loading blocks from files,
    1299+    // If no blocks were imported, ActivateBestChain will have nothing to do
    


    mzumsande commented at 8:17 pm on October 14, 2025:

    I think that this comment is misleading, the call is not only here for -loadblock.

    This is (unless -reindex is used) the first and only time ActivateBestChain() is invoked as part of init. Even without -reindex and without -loadblocks it does important work: It connects the genesis block on first startup (without which init cannot finish), and on later starts it will connect blocks that were eligible to be connected in the previous run, but weren’t because of the shutdown interruption (you’ll often observe this when restarting nodes during IBD). So the main reason the call needs to be there is so that ActivateBestChain() is invoked even if no -reindex is specified.


    Eunovo commented at 9:40 am on October 15, 2025:
    Thanks for pointing this out. I will update the comment.

    Eunovo commented at 1:46 pm on October 20, 2025:
    Done.
  79. Eunovo force-pushed on Oct 20, 2025
  80. in test/functional/feature_reindex.py:206 in 06ab084436 outdated
    201+        self.start_node(0, extra_args=["-reindex"])
    202+        best_blockhash = node.getbestblockhash()
    203+        assert_equal(node.getblock(best_blockhash)['height'], block_info['height'])
    204+        assert_equal(best_blockhash, block_info['hash'])
    205+
    206+        self.log.info("Success")
    


    hodlinator commented at 8:57 am on October 22, 2025:

    nit: Few tests report success, the test framework always outputs “Tests successful” if the test(s) it runs succeed already, so seems unnecessary.

    0₿ git grep 'self.log.info("Success")'
    1test/functional/feature_pruning.py:        self.log.info("Success")
    2test/functional/feature_pruning.py:        self.log.info("Success")
    3test/functional/feature_pruning.py:        self.log.info("Success")
    4test/functional/feature_pruning.py:        self.log.info("Success")
    5test/functional/feature_pruning.py:        self.log.info("Success")
    6test/functional/feature_reindex.py:        self.log.info("Success")
    7test/functional/feature_reindex.py:        self.log.info("Success")
    
  81. DrahtBot added the label Needs rebase on Oct 24, 2025
  82. Eunovo force-pushed on Oct 27, 2025
  83. validation.cpp: use assumevalid during reindex
    -reindex deletes the block tree db and attempts to rebuild it based on local block files,
    without attempting to re-request headers for which we don't have the full block.
    This might cause:
    - the best known chain to have a chainwork less than the minimumchainwork
    - the assumedvalid block to not be in the block index if the previous IBD was interrupted
    before downloading it
    32eceee45a
  84. test: Check that reindex always uses -assumevalid
    Also move feature_reindex.py test to "Tests less than 2m" section in test_runner.py,
    because it is now much slower and has similar runtimes to other tests in this section
    55daf0e060
  85. Eunovo force-pushed on Oct 27, 2025
  86. Eunovo commented at 3:41 pm on October 27, 2025: contributor
    Rebased on master
  87. DrahtBot removed the label Needs rebase on Oct 27, 2025
  88. stickies-v commented at 9:58 am on October 31, 2025: contributor

    Concept ACK: reindex validation behaviour being different to IBD validation behaviour is undesirable and should be avoided.

    Slight approach NACK: I think this is probably the smallest diff we can get to improve reindex behaviour, but I think there are drawbacks that outweigh the benefits:

    1. it still doesn’t make IBD and reindex validation behaviour the same: with the current approach blocks that are NOT part of the assumevalid chain will still be validated as assumevalid. While it doesn’t seem likely that this can currently be abused by an attacker, I’m not certain and I’d prefer avoiding having to think about it now and in the future.
    2. this makes validation slightly more complex by adding another edge case to solve a problem which should only (?) occur in case of corruption, most likely caused by faulty hardware. While I’m not against improving reindex behaviour, I’m not convinced it’s important enough to complicate overall validation logic. If the user has fixed their hardware faults, just running a fresh IBD doesn’t seem like a big deal for these rare occurences?

    Generally speaking, I think we should move towards IBD and reindex validation behaviour converging rather than diverging. Three alternative approaches, sorted by decreasing feasibility, include:

    1. Log an error that assumevalid is enabled but reindex could not find the specified assumevalid header, print the highest available header in blockindex and inform the user that they can restart with -assumevalid=0 or -assumevalid=<best_available_header>or a fresh IBD, and then shutdown the node.
    2. rearchitect our blockindex storage so that we don’t need reindex, or so that it doesn’t wipe the assumevalid state. Relevant work is being done in #32427, and @cfields shared some relevant thoughts offline too.
    3. generalize validation logic so that there is no difference between IBD and reindex, making it possible to do p2p during reindex

    My preference for this PR would be to implement approach 1) to not change logic but just increase visibility for the user, and then actually improve things by improving our architecture (approaches 2) and/or 3)) rather than patching things up.

  89. Eunovo commented at 11:11 am on October 31, 2025: contributor
    1. Log an error that assumevalid is enabled but reindex could not find the specified assumevalid header, print the highest available header in blockindex and inform the user that they can restart with -assumevalid=0 or -assumevalid=<best_available_header>or a fresh IBD, and then shutdown the node.

    I agree with this approach, but we would still have to update the minimumchainwork logic for reindex, or we ask the user to also update the minimumchainwork too.

    1. generalize validation logic so that there is no difference between IBD and reindex, making it possible to do p2p during reindex

    I’m not sure how important it is for reindex to be able to work without a network connection.

  90. stickies-v commented at 12:38 pm on October 31, 2025: contributor

    I agree with this approach, but we would still have to update the minimumchainwork logic for reindex, or we ask the user to also update the minimumchainwork too.

    That seems sensible. I’m currently unsure what the best approach is here, but informing in logging seems like the minimum we can do.

    I’m not sure how important it is for reindex to be able to work without a network connection.

    I don’t think it is. Reindex being fully offline to me seems like an artifact of our current startup and locking logic, rather than a stated goal. Note also that my suggestions don’t mean reindex requires network connection, just that it doesn’t preclude it so headers can be synced if the user allows network connections.

  91. l0rinc commented at 4:19 pm on October 31, 2025: contributor

    For my benchmarking needs reindex-chainstate suffices, especially now that I know about this corner-case.

    Log an error that assumevalid is enabled but reindex could not find the specified assumevalid header

    Now that #33336 was merge we’re getting part of that message already which should already help a lot in this situation.

    Thanks for taking care of this issue, agree with the concerns of @stickies-v, but I personally would either keep reindex offline (I don’t like surprises, I often start benchmarks with -connect=0) or add an optional header update between reindex and reindex-chainstate steps which could time out to degrade gracefully.

  92. in src/validation.cpp:2444 in 55daf0e060
    2441+            // Therefore, bypass this check during a reindex.
    2442             script_check_reason = "assumevalid hash not in headers";
    2443-        } else if (it->second.GetAncestor(pindex->nHeight) != pindex) {
    2444+        } else if (is_present_in_index && it->second.GetAncestor(pindex->nHeight) != pindex) {
    2445             script_check_reason = (pindex->nHeight > it->second.nHeight) ? "block height above assumevalid height" : "block not in assumevalid chain";
    2446         } else if (m_chainman.m_best_header->GetAncestor(pindex->nHeight) != pindex) {
    


    hodlinator commented at 11:13 am on November 7, 2025:
    (Prior to this PR, all if’s from here on down expected the current block to be an ancestor of the assumed valid block).
  93. in src/validation.cpp:2437 in 55daf0e060
    2432@@ -2433,13 +2433,22 @@ bool Chainstate::ConnectBlock(const CBlock& block, BlockValidationState& state,
    2433         // This setting doesn't force the selection of any particular chain but makes validating some faster by
    2434         //  effectively caching the result of part of the verification.
    2435         BlockMap::const_iterator it{m_blockman.m_block_index.find(m_chainman.AssumedValidBlock())};
    2436-        if (it == m_blockman.m_block_index.end()) {
    2437+        bool is_present_in_index{it != m_blockman.m_block_index.end()};
    2438+        bool is_reindexing{m_chainman.m_blockman.IsReindexing()};
    


    hodlinator commented at 11:54 am on November 7, 2025:
    nit: if we keep these, const bool might be nicer, although the scope is small.
  94. in src/validation.cpp:2442 in 55daf0e060
    2443-        } else if (it->second.GetAncestor(pindex->nHeight) != pindex) {
    2444+        } else if (is_present_in_index && it->second.GetAncestor(pindex->nHeight) != pindex) {
    2445             script_check_reason = (pindex->nHeight > it->second.nHeight) ? "block height above assumevalid height" : "block not in assumevalid chain";
    2446         } else if (m_chainman.m_best_header->GetAncestor(pindex->nHeight) != pindex) {
    2447             script_check_reason = "block not in best header chain";
    2448-        } else if (m_chainman.m_best_header->nChainWork < m_chainman.MinimumChainWork()) {
    


    hodlinator commented at 1:21 pm on November 7, 2025:

    remark: Didn’t get this at first - if our block is an ancestor of AssumedValidBlock(), why would either of: a) it not being part of the best header chain b) the best header chain being below minimum chainwork …make us turn on script checking for this assumevalid ancestor?

    It’s been this way since assumevalid was introduced #9484 / e440ac7ef3b6f3ad1cd8fc7027cece40413202d9.

    I guess a) could be to cover the case where a node runner is instructed by a malicious party to run with an -assumevalid hash from a bogus chain fork which has less PoW. If you’re off the main fork, we need to make sure you’re not on a fork that contains invalid blocks, so we perform full validation.

    b) Could be to handle network eclipse attacks in combination with -assumevalid where the malicious party is too lazy to also make users lower the -minimumchainwork setting.

    Maybe we could add some clarifying comments to that affect?


    Eunovo commented at 11:19 am on November 10, 2025:
    From what I can infer, we use MinimumChainWork to set an “expected chain length”. This protects the node from any chain that is not at least as good as the set expected chain length, especially when it is eclipsed. Since the assumevalid block is typically set at the MinimumChainWork height, it would seem that we can remove the MinimumChainWork check from here by checking that the assumevalid block is an ancestor of (or is) the best_header, but in the case that we are using a malicious assumevalid block hash, we can still choose the expected chain by setting our MinimumChainWork appropriately. I see this as the last defence against such an attack; the attacker would also have to convince you to also reduce your MinimumChainWork.

    Eunovo commented at 11:21 am on November 10, 2025:
    I wouldn’t mind adding a comment stating this. Reviewers can confirm if it correctly reflects the intent of the code.
  95. hodlinator commented at 3:52 pm on November 7, 2025: contributor

    Agree this issue would be good to fix!

    However also agree with #31615#pullrequestreview-3403430467 that logging a warning/error and doing some kind of higher level change instead of patching things up here would be preferable.

    Higher level proposal

    Add an earlier phase for -reindex where it rebuilds the headers chain/BlockIndex from blocks on disk, and if we’re still below minimumchainwork, perform headerssync to populate the index prior to this phase of processing blocks.

    A condition for exiting this new phase would be that we fulfill minimumchainwork, that way we shouldn’t (ever?) have an issue where best header is below minimum chainwork when we reach ConnectBlock().

    There may be a concern that some of the headers taken from blocks on disk somehow got corrupted, leading to us filling the block index with garbage. But if we verify headers connect by prev-hash and obey thresholds and difficulty adjustments before feeding them to the index we should be okay (same as we do for headerssync).

    Alternative

    An alternative to the above would be to have a completely different fast path for -reindex where if the block exists on disk, we only do minimal hashing to verify that the entire block data is not corrupt. This assumes we do more complete validation before storing blocks to disk (and doesn’t account for switching between binaries with different consensus rules).

    This would however enable a malicious party to give invalid blocks data to a victim while offline, “here, I optimized IBD for you”. But it would still require an eclipse attack / network split to maintain.

    (There was an issue with the prior 661a32305c13fdffc7dc4420aa44eb503739fc12 where if we halt IBD before reaching the assumevalid block, and restart with -reindex, it looks like we would have disabled script validation even though the block was younger than 2 weeks.

    https://github.com/bitcoin/bitcoin/commit/661a32305c13fdffc7dc4420aa44eb503739fc12#diff-97c3a52bc5fad452d82670a7fd291800bae20c7bc35bb82686c2c0a4ea7b5b98R2462

    This has been fixed in newer pushes AFAICS).

  96. Eunovo commented at 2:23 pm on November 11, 2025: contributor
    Thanks for the reviews and suggestions @l0rinc @stickies-v @hodlinator @mzumsande @TheCharlatan @stratospher . I have implemented an approach that uses headers-sync in #33854 and will be closing this PR.
  97. Eunovo closed this on Nov 11, 2025


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-11-29 06:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me