validation: ensure assumevalid is always used during reindex #31615

pull Eunovo wants to merge 2 commits into bitcoin:master from Eunovo:31494-reindex-assumevalid changing 4 files +127 −15
  1. Eunovo commented at 11:52 am on January 7, 2025: contributor

    assumevalid may be ignored during reindex for the following reasons:

    • the assumevalid block is not in the block index: this happens when the assumevalid block could not be downloaded before the previous IBD run was interrupted
    • the chainwork of the best header is less than the minimumchainwork: happens if the previous IBD was interrupted before it could connect blocks up to minimumchainwork. See Issue #31494

    This PR fixes this by skipping script verification during reindex if assumevalid is enabled.

  2. DrahtBot commented at 11:52 am on January 7, 2025: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/31615.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Concept ACK l0rinc, mzumsande

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #30214 (refactor: Improve assumeutxo state representation by ryanofsky)
    • #29700 (kernel, refactor: return error status on all fatal errors by ryanofsky)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. in src/validation.cpp:2502 in 951812b0ac outdated
    2498@@ -2499,7 +2499,7 @@ bool Chainstate::ConnectBlock(const CBlock& block, BlockValidationState& state,
    2499         if (it != m_blockman.m_block_index.end()) {
    2500             if (it->second.GetAncestor(pindex->nHeight) == pindex &&
    2501                 m_chainman.m_best_header->GetAncestor(pindex->nHeight) == pindex &&
    2502-                m_chainman.m_best_header->nChainWork >= m_chainman.MinimumChainWork()) {
    2503+                (m_chainman.m_best_header->nChainWork >= m_chainman.MinimumChainWork() || m_chainman.m_blockman.LoadingBlocks())) {
    


    mzumsande commented at 3:39 pm on January 7, 2025:
    I think that this won’t achieve the desired result: If the assumevalid block is not in the block index (which will be the case if IBD crashed midway before that block was downloaded, and a reindex is done) we’ll abort above in line 2499 and never get to this spot.

    Eunovo commented at 6:29 pm on January 7, 2025:
    That’s true. I thought #31494 was focused on the best header chainwork not being up to minimumchainwork. I could change this to always skip script checks during a reindex, but IIUC having a block on disk doesn’t mean that it was connected during the previous IBD. Might be worth it to sync headers first then connect blocks even during reindex.

    mzumsande commented at 9:23 pm on January 7, 2025:

    I really wasn’t sure what the best solution is when I opened the issue and still am not, but these are my thoughts:

    Yes, there are two reasons why we might not apply assumevalid during a reindex currently: 1.) The best header has not enough chainwork 2.) The assumevalid block is not in our block index

    I could change this to always skip script checks during a reindex, but IIUC having a block on disk doesn’t mean that it was connected during the previous IBD.

    I think the question is rather how well we can avoid to connect blocks skipping the script checks during the reindex that we would have fully validated in the original run if we hadn’t crashed. Assuming that the user provides the same -assumevalid arg to the reindex run that they did in the original run I can see the following cases: 1.) The user provides an existing block (or nothing, so that the default hash from chainparams is used) that we just haven’t downloaded yet because it’s later in the best chain - if we don’t have it in our index we’d be justified to skip script checks - if we do have it in our index, no special rules are necessary for reindexing. I think that’s the most common case for users. 2.) The user provides -assumevalid=0 - we should respect that also in the reindex run and fully validate everything. 3.) The user provides a block hash with -assumevalid that is non-existing / not in the best chain. I think that’s the critical case because here we would have fully validated everything in the original run, but won’t validate anything during the reindex. But arguably that’s not a common case / the user’s fault for providing this bad block, so maybe it can be tolerated?

    Might be worth it to sync headers first then connect blocks even during reindex.

    Yes, that’s the second option I thought of. I think we can’t just stop and wait for the headers to be downloaded - reindex should be able to finish without having a p2p connection, even if that means no assumevalid. But I was thinking of maybe just disabling the relevant LoadingBlocks() guards in net_processing, so that we’d download headers in parallel to ActivateBestChain(), and would change to use assumevalid after a few thousand blocks or so. That seem a bit messy (we’d still connect some blocks without assumevalid) and requires testing but might work well in practice?!


    Eunovo commented at 1:59 pm on January 10, 2025:

    I created another test to check if blocks rejected in the normal IBD run get loaded during a -reindex. The test reveals that it does load the invalid block and it will connect it if script checks are skipped, see https://github.com/Eunovo/bitcoin/commit/1ff4e5ee78418a30ba1ea9c384b8c7ee43435ce6

    IIUC During a reindex, the node does not know what blocks were connected in the previous run so it can’t safely skip checks. It looks to me like we have to:

    1. Sync headers before connecting blocks during reindex OR
    2. Persist data to show that we previously connected this block (IIUC this exact data exists but gets deleted during reindex, maybe we can preserve the relevant data we need?)

    Eunovo commented at 2:04 pm on January 10, 2025:

    The test reveals that it does load the invalid block and it will connect it if script checks are skipped, see Eunovo@1ff4e5e

    I’m still not sure if this causes any damage because I expect the node to eventually pick the most-work chain when it syncs headers

  4. Eunovo commented at 11:56 am on January 18, 2025: contributor

    I had an offline conversation with @mzumsande. During the reindex, we can apply assumevalid for following cases:

    • If the assumevalid block is the index, then we can skip script checks for all ancestors of the assumevalid block
    • If the assumevalid block is not in the index, then the initial IBD run had not downloaded the assumevalid block yet then the blocks on disk are ancestors of the assumevalid block and we can still skip script verification

    Then we download headers from peers in parallel to ActivateBestChain()

    I’ll put this in draft while I implement and test this over the next few days

  5. Eunovo marked this as a draft on Jan 18, 2025
  6. Eunovo force-pushed on Jan 29, 2025
  7. DrahtBot added the label CI failed on Jan 29, 2025
  8. Eunovo force-pushed on Jan 29, 2025
  9. DrahtBot removed the label CI failed on Jan 29, 2025
  10. l0rinc commented at 3:17 pm on January 30, 2025: contributor

    I’m not sure whether LoadingBlocks is the correct way to solve it or not (I would need someone else to opine on that, I can’t tell if this can err on the side of not checking the scripts at all for some edge case).

    I can confirm however that this does seem to solve the issue I had, reported in #31494 (i.e. reindex(-chainstate) after a partial IBD always enabling script verification).


    To test it, I have added a LogWarning("fScriptChecks is %s", fScriptChecks); after the assignments and ran it with master vs the branch:

    0   cmake -B build -DCMAKE_BUILD_TYPE=Debug && cmake --build build -j$(nproc) \
    1&& mkdir -p demo && build/src/bitcoind -datadir=demo -stopatheight=100 \
    2&& build/src/bitcoind -datadir=demo -reindex-chainstate -stopatheight=10
    

    master

    0[warning] fScriptChecks is true
    

    Eunovo:31494-reindex-assumevalid

    0[warning] fScriptChecks is false
    

    I can also confirm that adding -assumevalid=0 will enable script checking from the beginning and that -assumevalid=000000009b7262315dbf071787ad3656097b892abffd1f95a1a022f896f533fc has 5 fScriptChecks=false values after which script checking is (correctly) enabled again.


    I also checked that the new assume_valid test passes here and fails correctly on master (though it’s very slow, we have to check if there’s a way to speed it up somewhat)

     02025-01-30T15:08:22.897000Z TestFramework (INFO): Success
     12025-01-30T15:08:31.343000Z TestFramework (INFO): Restarting node while reindexing..
     22025-01-30T15:08:32.551000Z TestFramework (INFO): Testing reindex with -assumevalid
     32025-01-30T15:09:23.963000Z TestFramework (INFO): Testing reindex with assumevalid block in block index and minchainwork > best_header chainwork
     42025-01-30T15:09:52.592000Z TestFramework (ERROR): Assertion failed
     5Traceback (most recent call last):
     6  File "/Users/lorinc/IdeaProjects/bitcoin/test/functional/test_framework/test_framework.py", line 135, in main
     7    self.run_test()
     8    ~~~~~~~~~~~~~^^
     9  File "/Users/lorinc/IdeaProjects/bitcoin/build/test/functional/feature_reindex.py", line 203, in run_test
    10    self.assume_valid()
    11    ~~~~~~~~~~~~~~~~~^^
    

    So from me it’s an approach ACK, but someone else needs to validate the implementation details and the corner cases that can result from this.

  11. Eunovo commented at 3:51 pm on January 30, 2025: contributor

    I’m not sure whether LoadingBlocks is the correct way to solve it or not (I would need someone else to opine on that, I can’t tell if this can err on the side of not checking the scripts at all for some edge case).

    I agree too. LoadingBlocks() is true when blocks are being reindexed or when new blocks are being loaded with -loadblocks . I intend to move away from LoadingBlocks() to something that’s only true when blockfiles are being reindexed.

    I also checked that the new assume_valid test passes here and fails correctly on master (though it’s very slow, we have to check if there’s a way to speed it up somewhat)

    I’ll look into this!

  12. Eunovo force-pushed on Feb 13, 2025
  13. mzumsande commented at 9:49 pm on March 7, 2025: contributor

    Concept ACK

    I think the current approach would work. Even though it means that if the user provided a bad / non-existing assumeutxo hash they could skip script validation during a reindex when they wouldn’t have in the original run, I’d say that’s not a major problem because a reindex should only be necessary in case of a a db corruption, which should not be remotely triggerable.

    So it will hopefully make the reindex a bit faster in case of a database corruption, while not changing the assumevalid security assumption very much, and it does respect -assumevalid=0.

    Though it would be good to have some experienced contributors chime in that were around when assumevalid was introduced (#9484).

  14. Eunovo marked this as ready for review on Mar 8, 2025
  15. Eunovo commented at 6:33 am on March 8, 2025: contributor
    Thanks for the reviews @l0rinc and @mzumsande . Due to your feedback, I have opened the PR for general review.
  16. DrahtBot added the label CI failed on Mar 21, 2025
  17. DrahtBot commented at 7:47 am on March 21, 2025: contributor
    Needs rebase, if still relevant
  18. validation.cpp: use assumevalid during reindex
    -reindex deletes the block tree db and attempts to rebuild it based on local block files,
    without attempting to re-request headers for which we don't have the full block yet
    This might cause:
    - the best known chain to have a chainwork less than the minimumchainwork
    - the assumedvalid block to not be in the block index if the previous IBD was interrupted
    before downloading it
    e1b42458cc
  19. test: Check that reindex always uses -assumevalid cb29bc3c30
  20. Eunovo force-pushed on Mar 24, 2025
  21. Eunovo renamed this:
    Ensure assumevalid is always used during reindex
    validation: ensure assumevalid is always used during reindex
    on Mar 24, 2025
  22. Eunovo commented at 11:39 am on March 24, 2025: contributor

    Rebased on master https://github.com/bitcoin/bitcoin/commit/770d39a37652d40885533fecce37e9f71cc0d051

  23. DrahtBot removed the label CI failed on Mar 24, 2025
  24. DrahtBot added the label Validation on Mar 24, 2025

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-03-31 09:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me