Ensure assumevalid is always used during reindex #31615

pull Eunovo wants to merge 2 commits into bitcoin:master from Eunovo:31494-reindex-assumevalid changing 2 files +85 −3
  1. Eunovo commented at 11:52 am on January 7, 2025: none

    assumevalid is not always used during reindex because the chainwork of the best header might be less than the minimumchainwork if the previous IBD was interrupted before it could connect blocks up to minimumchainwork. See Issue #31494

    This does not occur during normal IBD because the node does not connect blocks before minchainwork headers.

  2. validation.cpp: Bypass minimumchainwork test during reindex
    -reindex deletes the block tree db and attempts to rebuild it based on local block files, without attempting to re-request headers for which we don't have the full block yet (because we didn't get to that point in the previous IBD). This might cause the best known chain to have a chainwork less than the minimumchainwork
    
    Bypassing the minimuchainwork test while `LoadingBlocks() == true` allows reindex to use assumevalid
    951812b0ac
  3. test: Check that reindex always uses -assumevalid 57bd31eac4
  4. DrahtBot commented at 11:52 am on January 7, 2025: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/31615.

    Reviews

    See the guideline for information on the review process. A summary of reviews will appear here.

  5. in src/validation.cpp:2502 in 951812b0ac outdated
    2498@@ -2499,7 +2499,7 @@ bool Chainstate::ConnectBlock(const CBlock& block, BlockValidationState& state,
    2499         if (it != m_blockman.m_block_index.end()) {
    2500             if (it->second.GetAncestor(pindex->nHeight) == pindex &&
    2501                 m_chainman.m_best_header->GetAncestor(pindex->nHeight) == pindex &&
    2502-                m_chainman.m_best_header->nChainWork >= m_chainman.MinimumChainWork()) {
    2503+                (m_chainman.m_best_header->nChainWork >= m_chainman.MinimumChainWork() || m_chainman.m_blockman.LoadingBlocks())) {
    


    mzumsande commented at 3:39 pm on January 7, 2025:
    I think that this won’t achieve the desired result: If the assumevalid block is not in the block index (which will be the case if IBD crashed midway before that block was downloaded, and a reindex is done) we’ll abort above in line 2499 and never get to this spot.

    Eunovo commented at 6:29 pm on January 7, 2025:
    That’s true. I thought #31494 was focused on the best header chainwork not being up to minimumchainwork. I could change this to always skip script checks during a reindex, but IIUC having a block on disk doesn’t mean that it was connected during the previous IBD. Might be worth it to sync headers first then connect blocks even during reindex.

    mzumsande commented at 9:23 pm on January 7, 2025:

    I really wasn’t sure what the best solution is when I opened the issue and still am not, but these are my thoughts:

    Yes, there are two reasons why we might not apply assumevalid during a reindex currently: 1.) The best header has not enough chainwork 2.) The assumevalid block is not in our block index

    I could change this to always skip script checks during a reindex, but IIUC having a block on disk doesn’t mean that it was connected during the previous IBD.

    I think the question is rather how well we can avoid to connect blocks skipping the script checks during the reindex that we would have fully validated in the original run if we hadn’t crashed. Assuming that the user provides the same -assumevalid arg to the reindex run that they did in the original run I can see the following cases: 1.) The user provides an existing block (or nothing, so that the default hash from chainparams is used) that we just haven’t downloaded yet because it’s later in the best chain - if we don’t have it in our index we’d be justified to skip script checks - if we do have it in our index, no special rules are necessary for reindexing. I think that’s the most common case for users. 2.) The user provides -assumevalid=0 - we should respect that also in the reindex run and fully validate everything. 3.) The user provides a block hash with -assumevalid that is non-existing / not in the best chain. I think that’s the critical case because here we would have fully validated everything in the original run, but won’t validate anything during the reindex. But arguably that’s not a common case / the user’s fault for providing this bad block, so maybe it can be tolerated?

    Might be worth it to sync headers first then connect blocks even during reindex.

    Yes, that’s the second option I thought of. I think we can’t just stop and wait for the headers to be downloaded - reindex should be able to finish without having a p2p connection, even if that means no assumevalid. But I was thinking of maybe just disabling the relevant LoadingBlocks() guards in net_processing, so that we’d download headers in parallel to ActivateBestChain(), and would change to use assumevalid after a few thousand blocks or so. That seem a bit messy (we’d still connect some blocks without assumevalid) and requires testing but might work well in practice?!


    Eunovo commented at 1:59 pm on January 10, 2025:

    I created another test to check if blocks rejected in the normal IBD run get loaded during a -reindex. The test reveals that it does load the invalid block and it will connect it if script checks are skipped, see https://github.com/Eunovo/bitcoin/commit/1ff4e5ee78418a30ba1ea9c384b8c7ee43435ce6

    IIUC During a reindex, the node does not know what blocks were connected in the previous run so it can’t safely skip checks. It looks to me like we have to:

    1. Sync headers before connecting blocks during reindex OR
    2. Persist data to show that we previously connected this block (IIUC this exact data exists but gets deleted during reindex, maybe we can preserve the relevant data we need?)

    Eunovo commented at 2:04 pm on January 10, 2025:

    The test reveals that it does load the invalid block and it will connect it if script checks are skipped, see Eunovo@1ff4e5e

    I’m still not sure if this causes any damage because I expect the node to eventually pick the most-work chain when it syncs headers

  6. Eunovo commented at 11:56 am on January 18, 2025: none

    I had an offline conversation with @mzumsande. During the reindex, we can apply assumevalid for following cases:

    • If the assumevalid block is the index, then we can skip script checks for all ancestors of the assumevalid block
    • If the assumevalid block is not in the index, then the initial IBD run had not downloaded the assumevalid block yet then the blocks on disk are ancestors of the assumevalid block and we can still skip script verification

    Then we download headers from peers in parallel to ActivateBestChain()

    I’ll put this in draft while I implement and test this over the next few days

  7. Eunovo marked this as a draft on Jan 18, 2025

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-01-21 03:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me