validation: stricter internal handling of invalid blocks #31405

pull mzumsande wants to merge 7 commits into bitcoin:master from mzumsande:202411_stricter_invalidblock_handling changing 2 files +52 −83
  1. mzumsande commented at 7:29 pm on December 2, 2024: contributor

    Some fields in validation are set opportunistically by “best effort”:

    • The BLOCK_FAILED_CHILD status (which means that the block index has an invalid predecessor)
    • m_best_header (the most-work header not known to be invalid).

    This means that there are known situations in which these fields are not set when they should be, or set to wrong values. This is tolerated because the fields are not used for anything consensus-critical and triggering these situations involved creating invalid blocks with valid PoW header, so would have a cost attached. Also, having stricter guarantees for these fields requires iterating over the entire block index, which has some DoS potential, especially with any header above the checkpoint being accepted int he past (see e.g. #11531).

    However, there are reasons to change this now:

    • RPCs use these fields and can report wrong results
    • There is the constant possibility that someone could add code that expects these fields to be correct, especially because it is not well documented that these fields cannot always be relied upon.
    • DoS concerns have become less of an issue after #25717 - now an attacker would need to invest much more work because they can’t fork off the last checkpoint anymore

    This PR continues the work from #30666 to ensure that BLOCK_FAILED_CHILD status and m_best_header are always correct:

    • it adds a call to InvalidChainFound() in AcceptBlock().
    • it adds checks for BLOCK_FAILED_CHILD and m_best_header to CheckBlockIndex(). In order to be able to do this, the existing cache in the RPC-only InvalidateBlock() is adjusted to handle these as well. These are performance optimizations with the goal of avoiding having a call of InvalidChainFound() / looping over the block index after each disconnected block. I also wrote a fuzz test to find possible edge cases violating CheckBlockIndex, which I will PR separately soon.
    • it removes the m_failed_blocks set, which was a heuristic necessary when we couldn’t be sure if a given block index had an invalid predecessor or not. Now that we have that guarantee, the set is no longer needed.
  2. DrahtBot commented at 7:29 pm on December 2, 2024: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/31405.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK ryanofsky, stickies-v, TheCharlatan, achow101
    Concept ACK stringintech
    Stale ACK sipa, stratospher

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #30479 (validation: Add eligible ancestors of reconsidered block to setBlockIndexCandidates by mzumsande)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. DrahtBot added the label Validation on Dec 2, 2024
  4. mzumsande force-pushed on Dec 2, 2024
  5. DrahtBot added the label CI failed on Dec 2, 2024
  6. DrahtBot commented at 8:47 pm on December 2, 2024: contributor

    🚧 At least one of the CI tasks failed. Debug: https://github.com/bitcoin/bitcoin/runs/33808912013

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  7. DrahtBot removed the label CI failed on Dec 2, 2024
  8. mzumsande marked this as ready for review on Dec 3, 2024
  9. stratospher commented at 2:14 pm on January 13, 2025: contributor

    ACK b2136da9

    • Went through the logs in rpc_invalidateblock.py and checked how the children were marked invalid/m_best_header was set on master vs in this PR
    • Went through logs of some functional tests like feature_block.py and verified that BLOCK_FAILED_CHILD wasn’t being set in m_failed_blocks loop
    • Ran invalidateblock RPC on a full node.
  10. in src/validation.cpp:5379 in b2136da98d outdated
    5358@@ -5367,6 +5359,8 @@ void ChainstateManager::CheckBlockIndex()
    5359         if (pindexFirstInvalid == nullptr) {
    5360             // Checks for not-invalid blocks.
    5361             assert((pindex->nStatus & BLOCK_FAILED_MASK) == 0); // The failed mask cannot be set for blocks without invalid parents.
    5362+        } else {
    


    stratospher commented at 6:41 am on January 14, 2025:

    528b8cc: this would be more accurate for descendants.

    0        } else if (pindexFirstInvalid != pindex) {
    

    mzumsande commented at 6:07 pm on January 28, 2025:
    I changed the comment to “Invalid blocks and their descendants must be marked as invalid” - doesn’t really make sense to me to make an exception for the block itself.
  11. in src/validation.cpp:3745 in b2136da98d outdated
    3738@@ -3729,6 +3739,28 @@ bool Chainstate::InvalidateBlock(BlockValidationState& state, CBlockIndex* pinde
    3739             m_blockman.m_dirty_blockindex.insert(to_mark_failed);
    3740         }
    3741 
    3742+        // Mark descendants of the invalidated block as invalid
    3743+        auto range = cand_invalid_descendants.equal_range(invalid_walk_tip);
    3744+        for (auto it = range.first; it != range.second; ++it) {
    3745+            it->second->nStatus |= BLOCK_FAILED_CHILD;
    


    stratospher commented at 7:33 am on January 14, 2025:

    7f88af8b: when invalidateblock is called multiple times, we would need to clear the BLOCK_FAILED_VALID from previous iteration so that both BLOCK_FAILED_VALID and BLOCK_FAILED_CHILD don’t happen together on the same block. ex: this situation

    0if (it->second->nStatus & BLOCK_FAILED_VALID) {
    1    it->second->nStatus = (it->second->nStatus ^ BLOCK_FAILED_VALID) | BLOCK_FAILED_CHILD;
    2} else {
    3    it->second->nStatus |= BLOCK_FAILED_CHILD;
    4}
    

    mzumsande commented at 6:08 pm on January 28, 2025:
    Done, made it a bit shorter: it->second->nStatus = (it->second->nStatus & ~BLOCK_FAILED_VALID) | BLOCK_FAILED_CHILD;
  12. mzumsande force-pushed on Jan 28, 2025
  13. mzumsande force-pushed on Jan 28, 2025
  14. mzumsande commented at 6:10 pm on January 28, 2025: contributor
    b2136da to 4ba2e48: Addressed review comments by @stratospher (Thanks!) and rebased.
  15. in src/validation.cpp:5382 in eca8d9182e outdated
    5434@@ -5434,6 +5435,8 @@ void ChainstateManager::CheckBlockIndex()
    5435         if (pindexFirstInvalid == nullptr) {
    5436             // Checks for not-invalid blocks.
    5437             assert((pindex->nStatus & BLOCK_FAILED_MASK) == 0); // The failed mask cannot be set for blocks without invalid parents.
    5438+        } else {
    5439+            assert((pindex->nStatus & BLOCK_FAILED_MASK)); // Invalid blocks and their descendants must be marked as invalid
    


    stratospher commented at 8:40 am on February 6, 2025:
    eca8d91: micro-style-nit only if you have to retouch - could make double (()) to ().

    mzumsande commented at 4:17 pm on April 1, 2025:
    I missed this before, but it’s fixed now!
  16. stratospher commented at 3:17 pm on February 10, 2025: contributor
    reACK 4ba2e48.
  17. stickies-v commented at 4:48 pm on March 3, 2025: contributor
    PSA: Bitcoin Core PR Review Club will cover this PR in its next meeting at 2025-03-05 at 17:00 UTC. See https://bitcoincore.reviews/31405 for notes, questions, and instructions on how to join.
  18. in src/validation.cpp:3773 in 4ba2e480ff outdated
    3763@@ -3754,6 +3764,29 @@ bool Chainstate::InvalidateBlock(BlockValidationState& state, CBlockIndex* pinde
    3764             m_blockman.m_dirty_blockindex.insert(to_mark_failed);
    3765         }
    3766 
    3767+        // Mark descendants of the invalidated block as invalid
    3768+        // (possibly replacing a pre-existing BLOCK_FAILED_VALID with BLOCK_FAILED_CHILD)
    3769+        auto range = cand_invalid_descendants.equal_range(invalid_walk_tip);
    3770+        for (auto it = range.first; it != range.second; ++it) {
    3771+            it->second->nStatus = (it->second->nStatus & ~BLOCK_FAILED_VALID) | BLOCK_FAILED_CHILD;
    3772+        }
    


    stickies-v commented at 9:32 pm on March 3, 2025:
    Is there a reason why we forcefully unset BLOCK_FAILED_VALID? Since both flags store orthogonal information in separate bits, intuitively I’d expect we could have both co-exist? If this breaks other assumptions, perhaps good to add to the documentation here?

    mzumsande commented at 6:34 pm on March 4, 2025:
    There is some more discussion about this in #31835, where I wrote my opinion. Since BLOCK_FAILED_CHILD has never been used for anything, it’s hard to say what the intentions really were, but my impression is that the first invalid block should be marked as BLOCK_FAILED_VALID and all of its descendants as BLOCK_FAILED_CHILD - so having both BLOCK_FAILED_VALID and BLOCK_FAILED_CHILD at the same time doesn’t make sense to me.
  19. in src/validation.cpp:3717 in 4ba2e480ff outdated
    3710@@ -3707,6 +3711,12 @@ bool Chainstate::InvalidateBlock(BlockValidationState& state, CBlockIndex* pinde
    3711                     candidate->HaveNumChainTxs()) {
    3712                 candidate_blocks_by_work.insert(std::make_pair(candidate->nChainWork, candidate));
    3713             }
    3714+            // Similarly, populate cache for blocks not in main chain to invalidate
    3715+            if (!m_chain.Contains(candidate) &&
    3716+                !CBlockIndexWorkComparator()(candidate, pindex->pprev)) {
    3717+                cand_invalid_descendants.insert(std::make_pair(m_chain.FindFork(candidate), candidate));
    


    stickies-v commented at 9:48 pm on March 3, 2025:
    It’s not clear to me why we would only want to mark blocks with large-enough PoW as invalid descendants. If this is not a bug, I think behaviour is inconsistent with the commit message and should probably be document in the code too?

    mzumsande commented at 6:52 pm on March 4, 2025:
    With pindex->pprev being the predecessor of the block passed to the invalidateblock rpc call, the logic in InvalidateBlock() is to go from the tip back to pindex->pprev - and we may need to update other forked descendants of the blocks encountered along that way (which must have more accumulated work than their parents), which is why we put them in the cache. But all blocks with lower work can never be encountered by that process and therefore don’t need to be in the caches. I’ll add more documentation with my next push.

    stickies-v commented at 12:07 pm on March 5, 2025:
    Oh sorry, I get it now, thanks. For some reason (I think I had just been reviewing the next section dealing with disconnecting the tip) I was confusing pindex for the chain tip here. I don’t really have any doc suggestions, so can be marked as resolved.
  20. in src/validation.cpp:3697 in 4ba2e480ff outdated
    3689@@ -3691,6 +3690,11 @@ bool Chainstate::InvalidateBlock(BlockValidationState& state, CBlockIndex* pinde
    3690     // build a map once so that we can look up candidate blocks by chain
    3691     // work as we go.
    3692     std::multimap<const arith_uint256, CBlockIndex *> candidate_blocks_by_work;
    3693+    // Map to cache candidates not in the main chain that might need invalidating.
    3694+    // Maps fork block in chain to the candidates for invalidation.
    3695+    std::multimap<const CBlockIndex*, CBlockIndex*> cand_invalid_descendants;
    3696+    // Map to cache candidates for m_best_header
    3697+    std::multimap<const arith_uint256, CBlockIndex*> best_header_blocks_by_work;
    


    stickies-v commented at 0:49 am on March 4, 2025:

    Since this map is a superset of candidate_blocks_by_work , I think we could just have one, and instead check for IsValid(BLOCK_VALID_TRANSACTIONS) && HaveNumChainTxs() further down when we update setBlockIndexCandidates? I suspect the memory impllications should be negligible either way, but I think it would clean up the code a bit?

    E.g. something like:

     0diff --git a/src/validation.cpp b/src/validation.cpp
     1index f9c900ef27..ead0454f7b 100644
     2--- a/src/validation.cpp
     3+++ b/src/validation.cpp
     4@@ -3689,12 +3689,10 @@ bool Chainstate::InvalidateBlock(BlockValidationState& state, CBlockIndex* pinde
     5     // To avoid walking the block index repeatedly in search of candidates,
     6     // build a map once so that we can look up candidate blocks by chain
     7     // work as we go.
     8-    std::multimap<const arith_uint256, CBlockIndex *> candidate_blocks_by_work;
     9+    std::multimap<const arith_uint256, CBlockIndex *> highpow_outofchain_headers;
    10     // Map to cache candidates not in the main chain that might need invalidating.
    11     // Maps fork block in chain to the candidates for invalidation.
    12     std::multimap<const CBlockIndex*, CBlockIndex*> cand_invalid_descendants;
    13-    // Map to cache candidates for m_best_header
    14-    std::multimap<const arith_uint256, CBlockIndex*> best_header_blocks_by_work;
    15 
    16     {
    17         LOCK(cs_main);
    18@@ -3706,16 +3704,9 @@ bool Chainstate::InvalidateBlock(BlockValidationState& state, CBlockIndex* pinde
    19             // Instead, consider only non-active-chain blocks that have at
    20             // least as much work as where we expect the new tip to end up.
    21             if (!m_chain.Contains(candidate) &&
    22-                    !CBlockIndexWorkComparator()(candidate, pindex->pprev) &&
    23-                    candidate->IsValid(BLOCK_VALID_TRANSACTIONS) &&
    24-                    candidate->HaveNumChainTxs()) {
    25-                candidate_blocks_by_work.insert(std::make_pair(candidate->nChainWork, candidate));
    26-            }
    27-            // Similarly, populate cache for blocks not in main chain to invalidate
    28-            if (!m_chain.Contains(candidate) &&
    29-                !CBlockIndexWorkComparator()(candidate, pindex->pprev)) {
    30+                    !CBlockIndexWorkComparator()(candidate, pindex->pprev)) {
    31+                highpow_outofchain_headers.insert(std::make_pair(candidate->nChainWork, candidate));
    32                 cand_invalid_descendants.insert(std::make_pair(m_chain.FindFork(candidate), candidate));
    33-                best_header_blocks_by_work.insert(std::make_pair(candidate->nChainWork, candidate));
    34             }
    35         }
    36     }
    37@@ -3774,11 +3765,9 @@ bool Chainstate::InvalidateBlock(BlockValidationState& state, CBlockIndex* pinde
    38         // Determine new best header
    39         if (m_chainman.m_best_header->nStatus & BLOCK_FAILED_MASK) {
    40             m_chainman.m_best_header = invalid_walk_tip->pprev;
    41-            auto best_header_it = best_header_blocks_by_work.lower_bound(m_chainman.m_best_header->nChainWork);
    42-            while (best_header_it != best_header_blocks_by_work.end()) {
    43-                if (best_header_it->second->nStatus & BLOCK_FAILED_MASK) {
    44-                    best_header_it = best_header_blocks_by_work.erase(best_header_it);
    45-                } else {
    46+            auto best_header_it = highpow_outofchain_headers.lower_bound(m_chainman.m_best_header->nChainWork);
    47+            while (best_header_it != highpow_outofchain_headers.end()) {
    48+                if (!(best_header_it->second->nStatus & BLOCK_FAILED_MASK)) {
    49                     if (!CBlockIndexWorkComparator()(best_header_it->second, m_chainman.m_best_header)) {
    50                         m_chainman.m_best_header = best_header_it->second;
    51                     }
    52@@ -3788,11 +3777,12 @@ bool Chainstate::InvalidateBlock(BlockValidationState& state, CBlockIndex* pinde
    53         }
    54 
    55         // Add any equal or more work headers to setBlockIndexCandidates
    56-        auto candidate_it = candidate_blocks_by_work.lower_bound(invalid_walk_tip->pprev->nChainWork);
    57-        while (candidate_it != candidate_blocks_by_work.end()) {
    58-            if (!CBlockIndexWorkComparator()(candidate_it->second, invalid_walk_tip->pprev)) {
    59-                setBlockIndexCandidates.insert(candidate_it->second);
    60-                candidate_it = candidate_blocks_by_work.erase(candidate_it);
    61+        auto candidate_it = highpow_outofchain_headers.lower_bound(invalid_walk_tip->pprev->nChainWork);
    62+        while (candidate_it != highpow_outofchain_headers.end()) {
    63+            if (!CBlockIndexWorkComparator()(candidate_it->second, invalid_walk_tip->pprev) &&
    64+                candidate_it->second->IsValid(BLOCK_VALID_TRANSACTIONS) &&
    65+                candidate_it->second->HaveNumChainTxs()) {
    66+                    setBlockIndexCandidates.insert(candidate_it->second);
    67             } else {
    68                 ++candidate_it;
    69             }
    

    mzumsande commented at 10:26 pm on March 18, 2025:
    That’s a good idea! I think we can even go one step further and use highpow_outofchain_headers for everything, also for cand_invalid_descendants (at the cost of doing a few GetAncestor() calls, which should be cheap enough). I pushed an updated version that uses this approach.
  21. stickies-v commented at 0:58 am on March 4, 2025: contributor
    Concept ACK for making m_best_header and nStatus more reliably correct, and the trade-offs seem reasonable. Even though the diff is small, I find reasoning about these changes fairly complex and require a lot of context, so I’ll need to think through the implications more.
  22. stringintech commented at 6:37 pm on March 8, 2025: contributor

    Concept ACK.

    I wrote two test cases to understand the general behavior of InvalidateBlock, not specifically to test this PR’s changes:

    0A->B->C->D->E->F (main chain)
    1A->B->C->G (fork 1)
    2A->B->H->I->J (fork 2)
    3A->B->H->K (fork 3)
    

    In the first test, I invalidate block C (on active chain) which should invalidate C, D, E, F, and G. The second test invalidates H (on a fork) which should invalidate H, I, J, and K.

    Both tests behave the same on the master and PR branches. The first test fails on both because D, E, and F are not marked as BLOCK_FAILED_CHILD, which I did not expect.

    I have two questions:

    1 ) I found what appears to be a bug in InvalidateBlock: https://github.com/bitcoin/bitcoin/blob/4637cb1eec48d1af8d23eeae1bb4c6f8de55eed9/src/validation.cpp#L3750

    The first part of the condition seems incorrect and should probably be:

    0if (to_mark_failed->pprev == invalid_walk_tip ...) {
    

    The assertions that fail in the first test are for blocks D, E, and F which were all part of the best chain before invalidation, which led me to this part of the code. I’m not sure if fixing this is in scope for this PR.

    Update: Just saw the PR #31835 which fixes this issue. :)

    2 ) To validate my understanding: Since both branches behave similarly (even on master the m_best_header and BLOCK_FAILED_CHILD statuses are set correctly because of the final call to InvalidChainFound - except for the ones not set as BLOCK_FAILED_CHILD because of the potential bug), is it correct that the main goal of changes applied in InvalidateBlock is to maintain proper block statuses and best header between tip disconnections by marking children as invalid right away and recalculating the best header immediately rather than doing it later?

  23. mzumsande commented at 7:23 pm on March 17, 2025: contributor

    Update: Just saw the PR #31835 which fixes this issue. :)

    Heh, looks like you are at least the third person who found that bug independently, first one was in #16856!

    Since both branches behave similarly (even on master the m_best_header and BLOCK_FAILED_CHILD statuses are set correctly because of the final call to InvalidChainFound - except for the ones not set as BLOCK_FAILED_CHILD because of the potential bug), is it correct that the main goal of changes applied in InvalidateBlock is to maintain proper block statuses and best header between tip disconnections by marking children as invalid right away and recalculating the best header immediately rather than doing it later?

    Yes, that’s correct. The main goal of this PR to add these assumptions to CheckBlockIndex() - this means they should always be true when cs_main is not held. Since InvalidateBlock will release cs_main and call CheckBlockIndex() after each step, some tests would fail without the changes to InvalidateBlock() - even if the final InvalidChainFound() in that function will eventually correct it!

  24. mzumsande marked this as a draft on Mar 17, 2025
  25. mzumsande force-pushed on Mar 18, 2025
  26. mzumsande force-pushed on Mar 18, 2025
  27. mzumsande commented at 10:37 pm on March 18, 2025: contributor

    4ba2e48 to fe83133: I changed the approach to InvalidateBlock() after the suggestion by @stickies-v: Instead of having 3 slightly different caches, now only one cache is used for everything (highpow_outofchain_headers). This should make the changes more simple and have a negligible overhead of a few additional GetAncestor() calls.

    I also added a call to CheckBlockIndex after each iteration of InvalidateBlock() (which releases cs_main, so any other thread could call CBI, which means that the block index cannot be in an inconsistent state, even temporarily). This should make it clear why the changes to InvalidateBlock() are necessary, even if (as @stringintech noted) the additional checks to CheckBlockIndex might not have led to an failure otherwise.

  28. mzumsande marked this as ready for review on Mar 19, 2025
  29. in src/validation.cpp:3791 in fe83133998 outdated
    3787+                if (!CBlockIndexWorkComparator()(best_header_it->second, m_chainman.m_best_header)) {
    3788+                    m_chainman.m_best_header = best_header_it->second;
    3789+                }
    3790+                ++best_header_it;
    3791             }
    3792         }
    


    stringintech commented at 10:03 pm on March 19, 2025:

    Can the m_best_header update logic be merged into the while loop that handles the nStatus updates? Not sure if it is necessarily better; but I was thinking of something like:

    Adding this just before the while loop:

    0const bool best_header_needs_update = m_chainman.m_best_header->GetAncestor(invalid_walk_tip->nHeight) == invalid_walk_tip;
    1if (best_header_needs_update) {
    2  m_chainman.m_best_header = invalid_walk_tip->pprev;
    3}
    

    And then here:

    0            if (best_header_needs_update &&
    1                !CBlockIndexWorkComparator()(candidate_it->second, m_chainman.m_best_header)) {
    2                m_chainman.m_best_header = candidate_it->second;
    3            }
    4            ++candidate_it;
    5        }
    

    mzumsande commented at 7:54 pm on March 21, 2025:
    Yes, I like that solution, it’s less code and since we would continue when encountering an invalid header before we’d never set an invalid block as best header, so it’s correct. I added you as co-author of that commit with the email from your github profile - let me know if you want me to change anything about that!

    stringintech commented at 10:30 am on March 22, 2025:
    Thanks a lot! Yes the email is okay.
  30. in src/validation.cpp:3769 in fe83133998 outdated
    3761+        // (possibly replacing a pre-existing BLOCK_FAILED_VALID with BLOCK_FAILED_CHILD)
    3762+        // Add any equal or more work headers that are not invalidated to setBlockIndexCandidates
    3763+        auto candidate_it = highpow_outofchain_headers.lower_bound(invalid_walk_tip->pprev->nChainWork);
    3764+        while (candidate_it != highpow_outofchain_headers.end()) {
    3765+            if (candidate_it->second->GetAncestor(invalid_walk_tip->nHeight) == invalid_walk_tip) {
    3766+                candidate_it->second->nStatus = (candidate_it->second->nStatus & ~BLOCK_FAILED_VALID) | BLOCK_FAILED_CHILD;
    


    stringintech commented at 10:04 pm on March 19, 2025:
    Shouldn’t we mark the candidate dirty after this?

    mzumsande commented at 7:49 pm on March 21, 2025:
    yes added - nothing bad would happen if we didn’t (because BLOCK_FAILED_CHILD would be recalculated at next startup) but I think it’s definitely better to do it.
  31. in src/validation.cpp:3763 in fe83133998 outdated
    3762+        // Add any equal or more work headers that are not invalidated to setBlockIndexCandidates
    3763+        auto candidate_it = highpow_outofchain_headers.lower_bound(invalid_walk_tip->pprev->nChainWork);
    3764+        while (candidate_it != highpow_outofchain_headers.end()) {
    3765+            if (candidate_it->second->GetAncestor(invalid_walk_tip->nHeight) == invalid_walk_tip) {
    3766+                candidate_it->second->nStatus = (candidate_it->second->nStatus & ~BLOCK_FAILED_VALID) | BLOCK_FAILED_CHILD;
    3767+                // If invalidated, the block is irrelevant for setBlockIndexCandidates and m_best_header and can be removed fro mthe cache
    


    stringintech commented at 10:06 pm on March 19, 2025:
    Small typo at the end of the comment

    mzumsande commented at 7:48 pm on March 21, 2025:
    fixed
  32. mzumsande force-pushed on Mar 21, 2025
  33. mzumsande force-pushed on Mar 21, 2025
  34. mzumsande commented at 7:55 pm on March 21, 2025: contributor
    fe83133 to 7b6fe5a - addressed feedback by @stringintech, thanks!
  35. sipa commented at 3:48 pm on March 28, 2025: member

    utACK 7b6fe5a2196d53a3024dfe7f11b316a50acc3cb7

    The code changes make sense, I think, given the fact that the block index failed flags are generally kept consistent already, apart from some edge cases around invalidateblock. I think the existing design stemmed from a desire not to have full block index iterations wherever possible, but if we’ve already bit the bullet on that, there is no need to for the complexity laziness brings here.

  36. DrahtBot requested review from stratospher on Mar 28, 2025
  37. DrahtBot requested review from stringintech on Mar 28, 2025
  38. DrahtBot requested review from stickies-v on Mar 28, 2025
  39. in src/validation.cpp:3762 in fbe9e2a391 outdated
    3762+        while (candidate_it != highpow_outofchain_headers.end()) {
    3763+            if (!CBlockIndexWorkComparator()(candidate_it->second, invalid_walk_tip->pprev) &&
    3764+                candidate_it->second->IsValid(BLOCK_VALID_TRANSACTIONS) &&
    3765+                candidate_it->second->HaveNumChainTxs()) {
    3766                 setBlockIndexCandidates.insert(candidate_it->second);
    3767-                candidate_it = candidate_blocks_by_work.erase(candidate_it);
    


    TheCharlatan commented at 2:47 pm on March 29, 2025:

    In commit fbe9e2a391d5f6e14998776b03d4ffbbf43f33c8:

    Nit: Should this change really be in this commit? It does make sense in the next commit. I don’t think it would be a bug per se to never erase, but it might be less efficient.


    mzumsande commented at 4:16 pm on April 1, 2025:
    yes - I’ve moved it into the next commit and added an explanation to the commit message.
  40. TheCharlatan approved
  41. TheCharlatan commented at 9:11 pm on March 29, 2025: contributor

    crACK 7b6fe5a2196d53a3024dfe7f11b316a50acc3cb7

    This reduces complexity overall and I am happy that the slightly verbose block in AcceptBlockHeader and the m_dirty_blockindex are gone. It is not really straight forward to review though and I hope I did not miss any edge cases or real performance drawbacks. Might be good to get some more eyes on this. Having a better abstraction on the BlockMap / block tree data structure that just handles all of this internally could make things easier in the future.

  42. mzumsande force-pushed on Apr 1, 2025
  43. mzumsande commented at 4:19 pm on April 1, 2025: contributor
    7b6fe5a to 96e6371: Addressed #31405 (review) and removed unnecessary braces.
  44. TheCharlatan approved
  45. TheCharlatan commented at 7:31 pm on April 1, 2025: contributor
    Re-ACK 96e6371328c7715ef814da5831d08b5d30285dd3
  46. DrahtBot requested review from sipa on Apr 1, 2025
  47. stratospher approved
  48. stratospher commented at 8:07 am on April 3, 2025: contributor

    reACK 96e6371.

    I liked the new approach of using 1 cache for everything and the smaller diff. Also went through the logs in rpc_invalidate.py to make sure that everything works as intended.

  49. in src/validation.cpp:4522 in 163b01eae8 outdated
    4544@@ -4545,6 +4545,7 @@ bool ChainstateManager::AcceptBlock(const std::shared_ptr<const CBlock>& pblock,
    4545         if (state.IsInvalid() && state.GetResult() != BlockValidationResult::BLOCK_MUTATED) {
    4546             pindex->nStatus |= BLOCK_FAILED_VALID;
    4547             m_blockman.m_dirty_blockindex.insert(pindex);
    4548+            ActiveChainstate().InvalidChainFound(pindex);
    


    stickies-v commented at 4:46 pm on April 23, 2025:

    My understanding is that state.GetResult() != BlockValidationResult::BLOCK_MUTATED is equivalent to state.GetResult() == BlockValidationResult::BLOCK_CONSENSUS here, because none of the other enum values are returned outside of block header validation, and at this point we’ve already passed AcceptBlockHeader().

    I tested this by adding an Assume and running the unit and functional test suite with it:

    0...
    1            Assume(state.GetResult() == BlockValidationResult::BLOCK_CONSENSUS);
    2            ActiveChainstate().InvalidChainFound(pindex);
    3...
    

    Understanding the DoS safety (as documented in the 163b01eae82a9dd94dcda94cf13001e46b4a8a56 commit message) is fairly straightforward from your documentation, but I spent most time figuring out whether the InvalidChainFound() call would be safe e.g. in the case of a BLOCK_TIME_FUTURE too. If my understanding that it must be a BLOCK_CONSENSUS failure at this point, I think adding the Assume and/or updating the commit message to reflect that might be helpful?


    mzumsande commented at 10:11 pm on May 5, 2025:

    Hmm, this PR doesn’t really change the essence of this: The critical thing here is to set nStatus to BLOCK_FAILED_VALID, and this is already was the case on master. This is irrevocable (outside of the reconsiderblock RPC which is a hack that doesn’t really count) and it means that this block - and therefore all of its descendants - will never be connected. If there was a bug (e.g. if we could fail with BLOCK_TIME_FUTURE after accepting a header and then receiving the full block) this would be a bug on master.

    The added call to InvalidChainFound only fixes the accounting for the descendants.


    stickies-v commented at 10:49 am on May 6, 2025:

    Oh right, I didn’t consider that pindex is already set to BLOCK_FAILED_INVALID a few lines up. I agree that this change does not introduce any additional risks from unexpected BLOCK_TIME_FUTURE etc BlockValidationResult. Thanks for the insight, can be marked as resolved.

    (I still think adding the Assume could be helpful to catch future bugs, but it’s orthogonal to the PR)

  50. in src/validation.cpp:3717 in e2b69d48a0 outdated
    3715@@ -3716,6 +3716,7 @@ bool Chainstate::InvalidateBlock(BlockValidationState& state, CBlockIndex* pinde
    3716 
    3717         // Make sure the queue of validation callbacks doesn't grow unboundedly.
    3718         if (m_chainman.m_options.signals) LimitValidationInterfaceQueue(*m_chainman.m_options.signals);
    3719+        m_chainman.CheckBlockIndex();
    


    stickies-v commented at 4:55 pm on April 23, 2025:

    I was surprised to see that CheckBlockIndex() is not a const method, which I think it should be. To help validate that this call is not having unexpected side-effects, I implemented the const-ness change and found no issues because of it: https://github.com/bitcoin/bitcoin/compare/master...stickies-v:bitcoin:2b68b3f8343fe6582cc2ecbc474f489a50bdeb11

    I think this change would be good to have on master, but it doesn’t have to be in this PR either, so if you’d rather no include it in this PR (it is a non-trivial change), I’m happy to open it as a separate PR? edit: opened #32364


    stickies-v commented at 11:20 am on April 24, 2025:
    I don’t really understand why we need this extra check in every loop iteration. The commit message states that we need to check it again because cs_main was released, but it’s not really obvious to me why that’s a problem?

    mzumsande commented at 9:10 pm on April 29, 2025:

    We don’t “need” the check - but what we want to guarantee is that whenever cs_main is not held, the block index must be in a consistent state (i.e. CheckBlockIndex() would pass) - because at this point, other threads can change the block index / call CBI, leading to issues such as #16444. This spot is pretty critical, because even though InvalidateBlock will clean up in the end, it releases cs_main for stability reasons multiple times (so that the node doesn’t become unresponsive for too long) - so having the BlockIndex in a consistent state at this spot is the reason for the changes to InvalidateBlock in this PR, so I think it’s helpful to also check for this on regtest.

    (I’ll address the rest of the feedback soon, thanks!)


    stickies-v commented at 2:38 pm on May 1, 2025:

    Thank you, I think I understand now. The commit message and the placement of the new CheckBlockIndex() call in 79bef1bc52afab860782d6d9a6016b93f94c4503 being before (rather than after) the block index mutations led me to believe that we were worried about other threads leaving the block index in an inconsistent state, and made this particular addition feel rather arbitrary. Instead, it seems you want to ensure that InvalidateBlock() does not leave the block index in an inconsistent state whenever cs_main is released.

    If that’s correct, then I think:

    • CheckBlockIndex() should be called after, rather than before, logic that mutates the block index. Of course, in a loop (and with the extra call after the loop finishes), this is largely (but not entirely) equivalent, so it’s mostly about readability and signaling intent.
      • One concern here is that the loop could (and sometimes does*) return early, in which case we might never call CheckBlockIndex(), which should probably be addressed with an RAII struct that does the check whenever it goes out of scope? Or alternatively, just add a CheckBlockIndex() call whenever we return early.
    • CheckBlockIndex() should be called right before releasing, but while it still has the cs_main lock, to avoid synchronization issues (effectively making CheckBlockIndex() an EXCLUSIVE_LOCKS_REQUIRED(::cs_main) method)

    * Note: the current implementation in 79bef1bc52afab860782d6d9a6016b93f94c4503 will not call CheckBlockIndex() if DisconnectTip() fails. At first sight, it doesn’t look like DisconnectTip() can bring m_block_index in an inconsistent state, so probably fine?


    mzumsande commented at 10:56 pm on May 5, 2025:

    If that’s correct, then I think:

    Yes, that’s correct.

    CheckBlockIndex() should be called right before releasing, but while it still has the cs_main lock, to avoid synchronization issues (effectively making CheckBlockIndex() an EXCLUSIVE_LOCKS_REQUIRED(::cs_main) method)

    CheckBlockIndex() is called from multiple places in validation, some of which hold cs_main, some of which don’t. My understanding is that it must always pass if cs_main isn’t held (because at that time, other threads could change the block index / invoke CBI themselves), but not necessarily while cs_main  is held (not everything is atomic).

    In general, I think that CBI calls should be strategically placed to spots where (in case of bugs) those are most likely to occur - no need to cover every return path etc.

    On second thought, I think the added call (even if it helped find issues during development) might be overkill, considering that we already have a call here. So I have remove the commit that adds it for now.


    stickies-v commented at 9:56 am on May 6, 2025:

    So I have remove the commit that adds it for now.

    That seems good to me.

    My understanding is that it must always pass if cs_main isn’t held (because at that time, other threads could change the block index / invoke CBI themselves), but not necessarily while cs_main is held (not everything is atomic).

    This still feels awkward to me, like we’re sprinkling tests throughout the code in case they catch something without a clear aim. If we’re trying to ensure that InvalidateBlock() does not leave the block index (BI) in an inconsistent state, then releasing cs_main so another thread can change the BI seems like the test isn’t just testing InvalidateBlock(), but also (especially) those other threads? If we want to test InvalidateBlock(), then imo we must call CBI before releasing cs_main. And if other threads might be problematic, then we should add CBI call(s) in those locations?

    Anyway, commit removed, so this can be resolved now. Thanks!


    stickies-v commented at 10:05 am on May 6, 2025:

    So I have remove the commit that adds it for now.

    Actually, it looks like 79bef1bc52afab860782d6d9a6016b93f94c4503 is still here (with HEAD 9d001dc1b1627f7a973e21227955bba1f473d6aa). I think you accidentally removed 4153139425 instead of 79bef1bc52afab860782d6d9a6016b93f94c4503?


    mzumsande commented at 5:33 pm on May 6, 2025:

    Actually, it looks like https://github.com/bitcoin/bitcoin/commit/79bef1bc52afab860782d6d9a6016b93f94c4503 is still here

    oops, fixed

    releasing cs_main so another thread can change the BI

    I wasn’t suggesting this, the reason why we are releasing cs_main frequently during invalidateblock in the first place is so that the node/GUI doesn’t become unresponsive in case many blocks in a row are disconnected.

    In the past, there was a problem with the intermediate state in InvalidateBlock() (fixed in #16849) after cs_main was released that was only discovered because in a functional test it happened that headers were received/CBI was invoked in a different thread (https://github.com/bitcoin/bitcoin/issues/16444.).


    stickies-v commented at 10:25 am on May 7, 2025:
    Can be marked as resolved, thanks! I think we were talking past each other a bit but resolved offline.
  51. DrahtBot added the label Needs rebase on Apr 23, 2025
  52. in src/validation.cpp:3756 in 96e6371328 outdated
    3752@@ -3754,15 +3753,35 @@ bool Chainstate::InvalidateBlock(BlockValidationState& state, CBlockIndex* pinde
    3753             m_blockman.m_dirty_blockindex.insert(to_mark_failed);
    3754         }
    3755 
    3756-        // Add any equal or more work headers to setBlockIndexCandidates
    3757-        auto candidate_it = candidate_blocks_by_work.lower_bound(invalid_walk_tip->pprev->nChainWork);
    3758-        while (candidate_it != candidate_blocks_by_work.end()) {
    3759-            if (!CBlockIndexWorkComparator()(candidate_it->second, invalid_walk_tip->pprev)) {
    3760+        // Mark descendants of the invalidated block as invalid
    


    stickies-v commented at 2:15 pm on April 24, 2025:

    nit: Would label as out-of-chain (or non-active-chain for consistency)

    0        // Mark out-of-chain descendants of the invalidated block as invalid
    

    mzumsande commented at 10:06 pm on May 5, 2025:
    done
  53. in src/validation.cpp:3759 in 5b92a30a20 outdated
    3753@@ -3754,17 +3754,24 @@ bool Chainstate::InvalidateBlock(BlockValidationState& state, CBlockIndex* pinde
    3754             m_blockman.m_dirty_blockindex.insert(to_mark_failed);
    3755         }
    3756 
    3757-        // Add any equal or more work headers to setBlockIndexCandidates
    3758+        // Mark descendants of the invalidated block as invalid
    3759+        // (possibly replacing a pre-existing BLOCK_FAILED_VALID with BLOCK_FAILED_CHILD)
    3760+        // Add any equal or more work headers that are not invalidated to setBlockIndexCandidates
    3761         auto candidate_it = highpow_outofchain_headers.lower_bound(invalid_walk_tip->pprev->nChainWork);
    


    stickies-v commented at 2:39 pm on April 24, 2025:

    nit: I think it would make the code more readable to introduce a CBlockIndex* candidate here, to avoid all the candidate_it->second in an already verbose code block:

     0diff --git a/src/validation.cpp b/src/validation.cpp
     1index fed9186553..6ceeaa6ab5 100644
     2--- a/src/validation.cpp
     3+++ b/src/validation.cpp
     4@@ -3759,17 +3759,18 @@ bool Chainstate::InvalidateBlock(BlockValidationState& state, CBlockIndex* pinde
     5         // Add any equal or more work headers that are not invalidated to setBlockIndexCandidates
     6         auto candidate_it = highpow_outofchain_headers.lower_bound(invalid_walk_tip->pprev->nChainWork);
     7         while (candidate_it != highpow_outofchain_headers.end()) {
     8-            if (candidate_it->second->GetAncestor(invalid_walk_tip->nHeight) == invalid_walk_tip) {
     9-                candidate_it->second->nStatus = (candidate_it->second->nStatus & ~BLOCK_FAILED_VALID) | BLOCK_FAILED_CHILD;
    10-                m_blockman.m_dirty_blockindex.insert(candidate_it->second);
    11+            CBlockIndex* candidate{candidate_it->second};
    12+            if (candidate->GetAncestor(invalid_walk_tip->nHeight) == invalid_walk_tip) {
    13+                candidate->nStatus = (candidate->nStatus & ~BLOCK_FAILED_VALID) | BLOCK_FAILED_CHILD;
    14+                m_blockman.m_dirty_blockindex.insert(candidate);
    15                 // If invalidated, the block is irrelevant for setBlockIndexCandidates and can be removed from the cache
    16                 candidate_it = highpow_outofchain_headers.erase(candidate_it);
    17                 continue;
    18             }
    19-            if (!CBlockIndexWorkComparator()(candidate_it->second, invalid_walk_tip->pprev) &&
    20-                candidate_it->second->IsValid(BLOCK_VALID_TRANSACTIONS) &&
    21-                candidate_it->second->HaveNumChainTxs()) {
    22-                setBlockIndexCandidates.insert(candidate_it->second);
    23+            if (!CBlockIndexWorkComparator()(candidate, invalid_walk_tip->pprev) &&
    24+                candidate->IsValid(BLOCK_VALID_TRANSACTIONS) &&
    25+                candidate->HaveNumChainTxs()) {
    26+                setBlockIndexCandidates.insert(candidate);
    27             }
    28             ++candidate_it;
    29         }
    

    mzumsande commented at 10:06 pm on May 5, 2025:
    done, good idea!
  54. in src/validation.cpp:3769 in 5b92a30a20 outdated
    3759+        // (possibly replacing a pre-existing BLOCK_FAILED_VALID with BLOCK_FAILED_CHILD)
    3760+        // Add any equal or more work headers that are not invalidated to setBlockIndexCandidates
    3761         auto candidate_it = highpow_outofchain_headers.lower_bound(invalid_walk_tip->pprev->nChainWork);
    3762         while (candidate_it != highpow_outofchain_headers.end()) {
    3763+            if (candidate_it->second->GetAncestor(invalid_walk_tip->nHeight) == invalid_walk_tip) {
    3764+                candidate_it->second->nStatus = (candidate_it->second->nStatus & ~BLOCK_FAILED_VALID) | BLOCK_FAILED_CHILD;
    


    stickies-v commented at 2:54 pm on April 24, 2025:

    nit: I find splitting this up in two statements to be much more readable:

    0                // BLOCK_FAILED_VALID should never be set if BLOCK_FAILED_CHILD is set because <...>
    1                candidate->nStatus &= ~BLOCK_FAILED_VALID;
    2                candidate->nStatus |= BLOCK_FAILED_CHILD;
    

    Performance implications seems irrelevant here, and most likely optimized by the compiler anyway.


    mzumsande commented at 10:06 pm on May 5, 2025:
    done. The suggested comment is a bit too strong for my taste: In reality the distinction doesn’t matter, even if both were set nothing bad would happen, see #31835. This is more for being consistent with how it’s handled elsewhere.

    stickies-v commented at 10:00 am on May 6, 2025:
    Thanks, the new comment “Children of failed blocks should be marked as BLOCK_FAILED_CHILD instead.” works for me, can be marked as resolved.
  55. in src/validation.cpp:3759 in f762a6d8db outdated
    3756@@ -3757,12 +3757,19 @@ bool Chainstate::InvalidateBlock(BlockValidationState& state, CBlockIndex* pinde
    3757         // Mark descendants of the invalidated block as invalid
    3758         // (possibly replacing a pre-existing BLOCK_FAILED_VALID with BLOCK_FAILED_CHILD)
    3759         // Add any equal or more work headers that are not invalidated to setBlockIndexCandidates
    3760+        // Recalculate the best header if it became invalid.
    


    stickies-v commented at 2:59 pm on April 24, 2025:
    nit: for grepping purposes: s/the best header/m_best_header

    mzumsande commented at 10:04 pm on May 5, 2025:
    done
  56. in src/validation.cpp:3764 in f762a6d8db outdated
    3756@@ -3757,12 +3757,19 @@ bool Chainstate::InvalidateBlock(BlockValidationState& state, CBlockIndex* pinde
    3757         // Mark descendants of the invalidated block as invalid
    3758         // (possibly replacing a pre-existing BLOCK_FAILED_VALID with BLOCK_FAILED_CHILD)
    3759         // Add any equal or more work headers that are not invalidated to setBlockIndexCandidates
    3760+        // Recalculate the best header if it became invalid.
    3761         auto candidate_it = highpow_outofchain_headers.lower_bound(invalid_walk_tip->pprev->nChainWork);
    3762+
    3763+        const bool best_header_needs_update{m_chainman.m_best_header->GetAncestor(invalid_walk_tip->nHeight) == invalid_walk_tip};
    3764+        if (best_header_needs_update) {
    3765+            m_chainman.m_best_header = invalid_walk_tip->pprev; // this block is definitely still valid at this point, but there may be better ones
    


    stickies-v commented at 3:01 pm on April 24, 2025:
    nit: s/this block/pprev/ for reduced ambiguity

    mzumsande commented at 10:03 pm on May 5, 2025:
    done
  57. mzumsande force-pushed on Apr 24, 2025
  58. mzumsande commented at 6:46 pm on April 24, 2025: contributor
  59. in src/validation.cpp:3782 in d854a34525 outdated
    3777@@ -3771,6 +3778,10 @@ bool Chainstate::InvalidateBlock(BlockValidationState& state, CBlockIndex* pinde
    3778                 candidate_it->second->HaveNumChainTxs()) {
    3779                 setBlockIndexCandidates.insert(candidate_it->second);
    3780             }
    3781+            if (best_header_needs_update &&
    3782+                !CBlockIndexWorkComparator()(candidate_it->second, m_chainman.m_best_header)) {
    


    stickies-v commented at 7:52 pm on April 24, 2025:

    nit: it’s consistent with how CBlockIndexWorkComparator is used most of the time in our codebase, but I don’t understand why we don’t write this without the negation (swapping the arguments instead)?

    0                CBlockIndexWorkComparator()(m_chainman.m_best_header, candidate_it->second))
    

    mzumsande commented at 10:03 pm on May 5, 2025:

    CBlockIndexWorkComparator()(a, b)) means that a is strictly smaller than b. !CBlockIndexWorkComparator()(b, a)) means that a is smaller or or equal to b (in this case, equal means the same block index because of the pointer address criterion).

    I think that’s why sometimes one or the other is used.

    In this case, it doesn’t really matter because the two elements are chosen such that they cannot be the same, so I’ll change it to the simpler version.


    stickies-v commented at 11:32 am on May 6, 2025:

    I think that’s why sometimes one or the other is used.

    Right, that makes sense.

    I think in the two other cases where CBlockIndexWorkComparator is used in this PR, comparing directly on nChainwork would be more readable and more correct?

     0diff --git a/src/validation.cpp b/src/validation.cpp
     1index b7fe228bbf..b4f6bb40f4 100644
     2--- a/src/validation.cpp
     3+++ b/src/validation.cpp
     4@@ -3701,7 +3701,7 @@ bool Chainstate::InvalidateBlock(BlockValidationState& state, CBlockIndex* pinde
     5             // Instead, consider only non-active-chain blocks that have at
     6             // least as much work as where we expect the new tip to end up.
     7             if (!m_chain.Contains(candidate) &&
     8-                !CBlockIndexWorkComparator()(candidate, pindex->pprev) &&
     9+                (candidate->nChainWork >= pindex->pprev->nChainWork) &&
    10                 !(candidate->nStatus & BLOCK_FAILED_MASK)) {
    11                 highpow_outofchain_headers.insert(std::make_pair(candidate->nChainWork, candidate));
    12             }
    13@@ -3775,7 +3775,7 @@ bool Chainstate::InvalidateBlock(BlockValidationState& state, CBlockIndex* pinde
    14                 candidate_it = highpow_outofchain_headers.erase(candidate_it);
    15                 continue;
    16             }
    17-            if (!CBlockIndexWorkComparator()(candidate, invalid_walk_tip->pprev) &&
    18+            if ((candidate->nChainWork >= invalid_walk_tip->pprev->nChainWork) &&
    19                 candidate->IsValid(BLOCK_VALID_TRANSACTIONS) &&
    20                 candidate->HaveNumChainTxs()) {
    21                 setBlockIndexCandidates.insert(candidate);
    

    Where it is used to update m_best_header, which is just a single block index, it seems sticking to CBlockIndexWorkComparator seems appropriate for determinism.


    mzumsande commented at 5:18 pm on May 6, 2025:

    I think that this wouldn’t be correct: We use CBlockIndexWorkComparator during normal adding to setBlockIndexCandidates (here) and also use it while asserting that blocks that sort worse are not in the set here. If we’d insert blocks in the set that have the same nChainWork but sort worse (due to nSequence), I would expect the assert to be hit.

    That being said, some of the comments about this (e.g. also in TryAddBlockIndexCandidate) are misleading in my opinion. I have reworded the spot in InvalidateBlock, any other spots should be cleaned up separately.

  60. DrahtBot removed the label Needs rebase on Apr 24, 2025
  61. stickies-v commented at 8:10 pm on April 24, 2025: contributor

    The implementation simplifications since my last review are great (and require a small update to the PR description which still mentions the addition of two caches).

    I’m still not 100% comfortable that the changes don’t introduce any unwanted side-effects, so I want to give it another full review first. Approach ACK.

  62. DrahtBot added the label CI failed on Apr 24, 2025
  63. DrahtBot removed the label CI failed on Apr 29, 2025
  64. mzumsande force-pushed on May 5, 2025
  65. mzumsande force-pushed on May 5, 2025
  66. mzumsande force-pushed on May 6, 2025
  67. in src/validation.cpp:4543 in 65de18836e outdated
    4539@@ -4540,6 +4540,7 @@ bool ChainstateManager::AcceptBlock(const std::shared_ptr<const CBlock>& pblock,
    4540         if (state.IsInvalid() && state.GetResult() != BlockValidationResult::BLOCK_MUTATED) {
    4541             pindex->nStatus |= BLOCK_FAILED_VALID;
    4542             m_blockman.m_dirty_blockindex.insert(pindex);
    4543+            ActiveChainstate().InvalidChainFound(pindex);
    


    stickies-v commented at 9:57 am on May 15, 2025:

    This code block now duplicates InvalidBlockFound(), minus the setBlockIndex erasure - but I think that should be a (fast) no-op, perhaps it’s worth deduplicating this?

     0diff --git a/src/validation.cpp b/src/validation.cpp
     1index 31a0f54c11..3312849024 100644
     2--- a/src/validation.cpp
     3+++ b/src/validation.cpp
     4@@ -4516,11 +4516,8 @@ bool ChainstateManager::AcceptBlock(const std::shared_ptr<const CBlock>& pblock,
     5 
     6     if (!CheckBlock(block, state, params.GetConsensus()) ||
     7         !ContextualCheckBlock(block, state, *this, pindex->pprev)) {
     8-        if (state.IsInvalid() && state.GetResult() != BlockValidationResult::BLOCK_MUTATED) {
     9-            pindex->nStatus |= BLOCK_FAILED_VALID;
    10-            m_blockman.m_dirty_blockindex.insert(pindex);
    11-            ActiveChainstate().InvalidChainFound(pindex);
    12-        }
    13+        Assume(state.IsInvalid());
    14+        ActiveChainstate().InvalidBlockFound(pindex, state);
    15         LogError("%s: %s\n", __func__, state.ToString());
    16         return false;
    17     }
    

    From my (limited) understanding, think state.IsInvalid() should always evaluate to true and thus seems like more of an assertion, but it’s orthogonal to my suggestion.


    ryanofsky commented at 4:40 pm on June 2, 2025:

    re: #31405 (review)

    In commit “validation: call InvalidChainfound also from AcceptBlock” (65de18836e5ddbdb8388e3bbdbcebee84c3b4c59)

    The suggestion to call InvalidBlockFound instead of InvalidChainFound to avoid duplicating logic seems like a good idea, and would be interested to see a response, even if the change is not worth making here.


    ryanofsky commented at 4:42 pm on June 2, 2025:

    In commit “validation: call InvalidChainfound also from AcceptBlock” (65de18836e5ddbdb8388e3bbdbcebee84c3b4c59):

    Even after reading PR overview I found this commit description really confusing, because it only explained why it was ok to set flags here, not why it was actually good, or how the code wasn’t broken before this commit. It would be helpful if the commit said these things explicitly:

    • This commit marks descendants of failed blocks with BLOCK_FAILED_CHILD and updates m_best_header when these things weren’t done reliably before.
    • This is not a bugfix, because the flag was only being set on a best-effort basis before.
    • Setting these reliably has a slight performance cost but will make the code less complicated and allow removing m_failed_blocks in a later commit.

    ryanofsky commented at 4:57 pm on June 2, 2025:

    In commit “validation: call InvalidChainfound also from AcceptBlock” (65de18836e5ddbdb8388e3bbdbcebee84c3b4c59):

    Commit description mentions “at the cost of looping over the entire block index” which is referring to the loop in Chainstate::SetBlockFailureFlags. Probably not important in this context, but that function does seem like it could trigger unnecessary work because it is adding blocks to m_dirty_blockindex regardless of whether flags have actually changed, which is different than most of the other code updating m_dirty_blockindex. Maybe it doesn’t matter because the function has no callers that will mark already invalid blocks invalid, but not checking for changes does seem like a footgun that could lead to unnecessary database writes if the function is called other places.

    Just noting in case this could be a potential followup, or in case I’m missing something.

    EDIT: It seems like this behavior was changed recently in aac5488909f72f8c5a91ca7f12398069b7cd9ce4 from #31835. Before that commit there was an incorrect check to see if m_dirty_blockindex should be updated, which the commit removed instead of replacing.


    mzumsande commented at 6:26 pm on June 5, 2025:

    done - I agree. The removal from setBlockIndexCandidates will be (unnecessary) because at this stage the index cannot be in that set - ReceivedBlockTransactiosns is only called later. But this is not on a performance-critical path (since behind PoW) so the added removal shouldn’t matter.

    From my (limited) understanding, think state.IsInvalid() should always evaluate to true and thus seems like more of an assertion, but it’s orthogonal to my suggestion.

    I don’t think calling InvalidBlockFound() if the Assume failed for an unexpected reason in a non-debug build would be great, so I added the Assume but left the if condition in.


    mzumsande commented at 6:41 pm on June 5, 2025:
    I changed the commit message based on your suggestion. I dropped the “make the code less complicated” part - it’s a bit subjective and anyway not my main motivation for this PR: I think that in an area as central as validation we should avoid having structures that are populated on a “best-effort” basis (and therefore cannot be subjected to rigorous checks as in CheckBlockIndex) and trust that no one will ever rely on them for anything too serious in the future.

    mzumsande commented at 6:43 pm on June 5, 2025:

    which is referring to the loop in Chainstate::SetBlockFailureFlags

    It’s referring to SetBlockFailureFlags and RecalculateBestHeader which both loop over the block index.

    unnecessary database writes

    I agree that this is a potential follow-up (to #31835, pinging @stratospher). It’s probably not performance-critical, because it’s also guarded by PoW (creating multiple PoW-valid headers that build on a failed block is very costly).

  68. in src/validation.cpp:3706 in ed172d3002 outdated
    3707-                    candidate->IsValid(BLOCK_VALID_TRANSACTIONS) &&
    3708-                    candidate->HaveNumChainTxs()) {
    3709-                candidate_blocks_by_work.insert(std::make_pair(candidate->nChainWork, candidate));
    3710+                !CBlockIndexWorkComparator()(candidate, pindex->pprev) &&
    3711+                !(candidate->nStatus & BLOCK_FAILED_MASK)) {
    3712+                highpow_outofchain_headers.insert(std::make_pair(candidate->nChainWork, candidate));
    


    stickies-v commented at 10:05 am on May 15, 2025:

    nit

    0                highpow_outofchain_headers.insert({candidate->nChainWork, candidate});
    

    mzumsande commented at 6:26 pm on June 5, 2025:
    done
  69. in src/validation.cpp:3764 in 9304257d67 outdated
    3764+            CBlockIndex* candidate{candidate_it->second};
    3765+            if (!CBlockIndexWorkComparator()(candidate, invalid_walk_tip->pprev) &&
    3766+                candidate->IsValid(BLOCK_VALID_TRANSACTIONS) &&
    3767+                candidate->HaveNumChainTxs()) {
    3768+                setBlockIndexCandidates.insert(candidate);
    3769+                candidate_it = highpow_outofchain_headers.erase(candidate_it);
    


    stickies-v commented at 10:16 am on May 15, 2025:

    In 9304257d67e6f280cc1b45021118c6bb37489d89:

    Since we don’t remove blocks after inserting into setBlockIndexCandidates anymore, this means that blocks could be inserted multiple times now into the set - this won’t have any effect, so behavior isn’t changed.

    I suspect with “remove blocks” you mean erasing from highpow_outofchain_headers? In which case, I don’t think this part of the commit message is correct, at least for this commit?


    ryanofsky commented at 5:09 pm on June 2, 2025:

    re: #31405 (review)

    In commit “validation: cache all headers with enough PoW in invalidateblock” (9304257d67e6f280cc1b45021118c6bb37489d89)

    In which case, I don’t think this part of the commit message is correct, at least for this commit?

    Yes, I was going to ask about this same thing. I don’t understand what “Since we don’t remove blocks after inserting into setBlockIndexCandidates anymore” is referring to.

    EDIT: I guess it refers to the erase() call dropped in the subsequent commit eb973022dc2a324af4d61d9b55134577788bc83e, so maybe this comment should be moved there?


    mzumsande commented at 6:26 pm on June 5, 2025:
    good point - yes, that comment belonged to the following commit. I moved it there in a slightly shorter form.
  70. in src/validation.cpp:3773 in ed172d3002 outdated
    3776+            if (candidate->GetAncestor(invalid_walk_tip->nHeight) == invalid_walk_tip) {
    3777+                // Children of failed blocks should be marked as BLOCK_FAILED_CHILD instead.
    3778+                candidate->nStatus &= ~BLOCK_FAILED_VALID;
    3779+                candidate->nStatus |= BLOCK_FAILED_CHILD;
    3780+                m_blockman.m_dirty_blockindex.insert(candidate);
    3781+                // If invalidated, the block is irrelevant for setBlockIndexCandidates and m_best_header and can be removed from the cache
    


    stickies-v commented at 10:51 am on May 15, 2025:

    grammar nit: parallel structures (and line-length)

    0                // If invalidated, the block is irrelevant for setBlockIndexCandidates and for
    1                // m_best_header so it can be removed from the cache
    

    mzumsande commented at 6:26 pm on June 5, 2025:
    done
  71. in src/validation.cpp:3783 in ed172d3002 outdated
    3786+                candidate->IsValid(BLOCK_VALID_TRANSACTIONS) &&
    3787+                candidate->HaveNumChainTxs()) {
    3788+                setBlockIndexCandidates.insert(candidate);
    3789+            }
    3790+            if (best_header_needs_update &&
    3791+                CBlockIndexWorkComparator()(m_chainman.m_best_header, candidate)) {
    


    stickies-v commented at 11:31 am on May 15, 2025:

    note: this logic is different to RecalculateBestHeader: https://github.com/bitcoin/bitcoin/blob/c779ee3a4044df3a263394bbb8b104aeeaa7c727/src/validation.cpp#L6404

    From our earlier conversation it seems like CBlockIndexWorkComparator is the correct choice here, so probably this can be fixed (if necessary) in a separate pull, but it seems worth raising here? Relatedly, I tried modifying CheckBlockIndex to use CBlockIndexWorkComparator here, and that leads to assertions being raised, so it seems like m_best_header is currently guaranteed to be highest-work, but not necessarily highest-score (even when patching RecalculateBestHeader to use CBlockIndexWorkComparator)


    ryanofsky commented at 5:51 pm on June 2, 2025:

    re: #31405 (review)

    In commit “validation: in invalidateblock, calculate m_best_header right away” (a335d003eba7bf8204db457a2a6dcca0b2f53a43)

    It does seem surprising that if both RecalculateBestHeader and the CheckBlockIndex pindex->nChainWork <= m_best_header->nChainWork were changed to use CBlockIndexWorkComparator the check would fail. Would be good to know what causes this and it would be good to have a comment in CheckBlockIndex or on the m_best_header variable about which comparison is actually supposed to be used.


    mzumsande commented at 6:37 pm on June 5, 2025:

    I think that this behavior is because the existing behavior during normal addition (here), is to set it to the header received first, and it is not changed in ReceivedBlockTransactions based on which full blocks are first received in full/connected.

    So in this scenario 1.) Receive Header 1 2.) Receive Header 2 of same height 3.) Receive/Connect Block 2 4.) Receive Block 1 the tip will be at Block 2, but m_best_header will be at header 1.

    This is why changing the CheckBlockIndex assert to use CBlockIndexWorkComparator should lead to assert failures. This could be changed in a follow-up, but we should check more closely if this would be beneficial to any of the uses of m_best_header (maybe not?).

    In this PR, I changed the InvalidateBlock code to use nChainWork to be similar to the way it’s done in RecalculateBlockHeader (to not raise the expectation that it needs to fulfill CBlockIndexWorkComparator ordering) - although it’s arbitrary which block will be picked in case of multiple candidates. I also added a comment to m_best_header that it does not follow CBlockIndexWorkComparator tiebreak scores.

  72. in src/validation.cpp:3765 in ed172d3002 outdated
    3767+        auto candidate_it = highpow_outofchain_headers.lower_bound(invalid_walk_tip->pprev->nChainWork);
    3768+
    3769+        const bool best_header_needs_update{m_chainman.m_best_header->GetAncestor(invalid_walk_tip->nHeight) == invalid_walk_tip};
    3770+        if (best_header_needs_update) {
    3771+            m_chainman.m_best_header = invalid_walk_tip->pprev; // pprev is definitely still valid at this point, but there may be better ones
    3772+        }
    


    stickies-v commented at 11:40 am on May 15, 2025:

    nit: developer notes suggest line length < 100

    0        const bool best_header_needs_update{
    1            m_chainman.m_best_header->GetAncestor(invalid_walk_tip->nHeight) == invalid_walk_tip
    2        };
    3        if (best_header_needs_update) {
    4            // pprev is definitely still valid at this point, but there may be better ones
    5            m_chainman.m_best_header = invalid_walk_tip->pprev;
    6        }
    

    mzumsande commented at 6:37 pm on June 5, 2025:
    fixed
  73. stickies-v approved
  74. stickies-v commented at 12:10 pm on May 15, 2025: contributor

    ACK ed172d3002da68a3ddbd5d13e7d3f0618c1378d4

    I’ve thoroughly gone through this PR many times now, and I don’t think I can assure myself anymore than I am now that it is correct and safe. Still, the code touched in this PR is fairly complicated and coupled, and I am not as familiar with this validation logic as others, so my main concern is that there are unexpected side-effects I’ve not considered.

    Nits can be ignored, comments (if correct) can be addressed in other pulls.

  75. DrahtBot requested review from TheCharlatan on May 15, 2025
  76. DrahtBot requested review from stratospher on May 15, 2025
  77. TheCharlatan approved
  78. TheCharlatan commented at 1:27 pm on May 17, 2025: contributor
    Re-ACK ed172d3002da68a3ddbd5d13e7d3f0618c1378d4
  79. achow101 referenced this in commit aee7cec0db on May 27, 2025
  80. achow101 commented at 11:06 pm on May 27, 2025: member

    ACK ed172d3002da68a3ddbd5d13e7d3f0618c1378d4

    Would prefer if others more familiar with consensus could review this before merging.

  81. in src/validation.cpp:3764 in eb973022dc outdated
    3771+            }
    3772             if (!CBlockIndexWorkComparator()(candidate, invalid_walk_tip->pprev) &&
    3773                 candidate->IsValid(BLOCK_VALID_TRANSACTIONS) &&
    3774                 candidate->HaveNumChainTxs()) {
    3775                 setBlockIndexCandidates.insert(candidate);
    3776-                candidate_it = highpow_outofchain_headers.erase(candidate_it);
    


    ryanofsky commented at 6:34 pm on June 2, 2025:

    In commit “validation: in invalidateblock, mark children as invalid right away” (eb973022dc2a324af4d61d9b55134577788bc83e)

    Comment in commit message about the candidate getting marked failed later seems like useful information to have in the actual code.

    Maybe add comment here like “// Do not remove candidate from the highpow cache, because it might be a descendant of the block being invalidated which needs to be marked failed later.”


    ryanofsky commented at 1:45 pm on June 6, 2025:

    re: #31405 (review)

    In commit “validation: in invalidateblock, mark children as invalid right away” (328345d571d9b86af54e915b30183b91d28cab2f)

    Thanks for the new comment. Now that there’s an explanation directly in the code, I think it would be good to replace the text “Entries from…” paragraph with a simple statement that the commit doesn’t change behavior, since the paragraph only repeats the code comment out of context.

  82. in src/validation.cpp:4385 in ed172d3002 outdated
    4398-        if (!pindexPrev->IsValid(BLOCK_VALID_SCRIPTS)) {
    4399-            // The above does not mean "invalid": it checks if the previous block
    4400-            // hasn't been validated up to BLOCK_VALID_SCRIPTS. This is a performance
    4401-            // optimization, in the common case of adding a new block to the tip,
    4402-            // we don't need to iterate over the failed blocks list.
    4403-            for (const CBlockIndex* failedit : m_failed_blocks) {
    


    ryanofsky commented at 6:45 pm on June 2, 2025:

    In commit “validation: remove m_failed_blocks” (ed172d3002da68a3ddbd5d13e7d3f0618c1378d4)

    Found this commit message confusing because it is talking about changes that actually happened in previous commits, and only referring to the code that sets flags, not the code that checks them.

    Would suggest a more specific description like “After previous changes, the pindexPrev->nStatus check in AcceptBlockHeader is now sufficient to detect invalid blocks and checking m_failed_blocks there is no longer necessary.


    mzumsande commented at 6:44 pm on June 5, 2025:
    I rewrote the commit message - hope it’s clearer now.
  83. ryanofsky approved
  84. ryanofsky commented at 7:08 pm on June 2, 2025: contributor

    Code review ACK ed172d3002da68a3ddbd5d13e7d3f0618c1378d4. Seems a lot nicer to track more reliable information instead of relying on best-effort heuristics if this is now possible.

    This change looks ready to merge with all the reviews and discussion so far, but there are some suggestions from 3 weeks ago #31405#pullrequestreview-2843060953 that could be responded to. Could merge this or hold off depending on your preference @mzumsande

  85. mzumsande commented at 7:23 pm on June 2, 2025: contributor

    This change looks ready to merge with all the reviews and discussion so far, but there are some suggestions from 3 weeks ago #31405 (review) that could be responded to. Could merge this or hold off depending on your preference @mzumsande

    Please hold off merging, I’ll address @stickies-v and your comments later this week!

  86. mzumsande force-pushed on Jun 5, 2025
  87. mzumsande commented at 6:51 pm on June 5, 2025: contributor
    ed172d3 to fdcdc2c: Addressed feedback by @stickies-v and @ryanofsky - thanks!
  88. mzumsande force-pushed on Jun 5, 2025
  89. in src/validation.cpp:3783 in fdcdc2cdea outdated
    3786+            }
    3787+            if (!CBlockIndexWorkComparator()(candidate, invalid_walk_tip->pprev) &&
    3788+                candidate->IsValid(BLOCK_VALID_TRANSACTIONS) &&
    3789+                candidate->HaveNumChainTxs()) {
    3790+                setBlockIndexCandidates.insert(candidate);
    3791+                // Do not remove candidate from the highpow cache, because it might be a descendant of the block being invalidated
    


    stickies-v commented at 12:25 pm on June 6, 2025:

    nit for greppability (and line lenght)

    0                // Do not remove candidate from highpow_outofchain_headers, because it might be a
    1                // descendant of the block being invalidated which needs to be marked failed later.
    

    mzumsande commented at 3:20 pm on June 6, 2025:
    done
  90. in src/validation.cpp:4524 in fdcdc2cdea outdated
    4519@@ -4537,9 +4520,9 @@ bool ChainstateManager::AcceptBlock(const std::shared_ptr<const CBlock>& pblock,
    4520 
    4521     if (!CheckBlock(block, state, params.GetConsensus()) ||
    4522         !ContextualCheckBlock(block, state, *this, pindex->pprev)) {
    4523-        if (state.IsInvalid() && state.GetResult() != BlockValidationResult::BLOCK_MUTATED) {
    4524-            pindex->nStatus |= BLOCK_FAILED_VALID;
    4525-            m_blockman.m_dirty_blockindex.insert(pindex);
    4526+        Assume(state.IsInvalid());
    4527+        if (state.IsInvalid()) {
    


    stickies-v commented at 12:36 pm on June 6, 2025:

    inline nit

    0        if (Assume(state.IsInvalid())) {
    

    mzumsande commented at 3:20 pm on June 6, 2025:
    done
  91. in src/validation.h:1043 in fdcdc2cdea outdated
    1060-    std::set<CBlockIndex*> m_failed_blocks;
    1061-
    1062-    /** Best header we've seen so far (used for getheaders queries' starting points). */
    1063+    /** Best header we've seen so far for which the block is not known to be invalid
    1064+        (used, among others, for getheaders queries' starting points).
    1065+        In case of multiple best headers with the same work, it could point to any:
    


    stickies-v commented at 12:38 pm on June 6, 2025:

    nit

    0        In case of multiple best headers with the same work, it could point to any.
    

    mzumsande commented at 3:21 pm on June 6, 2025:
    changed to “because” - it was meant to imply a causality between the sentences.
  92. stickies-v approved
  93. stickies-v commented at 12:43 pm on June 6, 2025: contributor

    re-ACK fdcdc2cdeadb5d36076f1b94b54261e22e031354

    Mostly documentation changes addressing review feedback. Non-trivial code changes are:

  94. DrahtBot requested review from TheCharlatan on Jun 6, 2025
  95. DrahtBot requested review from achow101 on Jun 6, 2025
  96. DrahtBot requested review from ryanofsky on Jun 6, 2025
  97. in src/validation.cpp:3755 in 328345d571 outdated
    3752@@ -3753,18 +3753,30 @@ bool Chainstate::InvalidateBlock(BlockValidationState& state, CBlockIndex* pinde
    3753             m_blockman.m_dirty_blockindex.insert(to_mark_failed);
    3754         }
    3755 
    3756-        // Add any equal or more work headers to setBlockIndexCandidates
    3757+        // Mark out-of-chain descendants of the invalidated block as invalid
    


    ryanofsky commented at 2:14 pm on June 6, 2025:

    In commit “validation: in invalidateblock, mark children as invalid right away” (328345d571d9b86af54e915b30183b91d28cab2f)

    Would seem more accurate to say this code is marking out-of-chain descendants of the “disconnected block” rather than the “invalidated block” since the invalidated block sounds like the block passed to invalidateblock

  98. in src/validation.cpp:3763 in 809413fde6 outdated
    3755@@ -3756,7 +3756,15 @@ bool Chainstate::InvalidateBlock(BlockValidationState& state, CBlockIndex* pinde
    3756         // Mark out-of-chain descendants of the invalidated block as invalid
    3757         // (possibly replacing a pre-existing BLOCK_FAILED_VALID with BLOCK_FAILED_CHILD)
    3758         // Add any equal or more work headers that are not invalidated to setBlockIndexCandidates
    3759+        // Recalculate m_best_header if it became invalid.
    3760         auto candidate_it = highpow_outofchain_headers.lower_bound(invalid_walk_tip->pprev->nChainWork);
    3761+
    3762+        const bool best_header_needs_update{m_chainman.m_best_header->GetAncestor(invalid_walk_tip->nHeight) == invalid_walk_tip};
    3763+        if (best_header_needs_update) {
    3764+            // pprev is definitely still valid at this point, but there may be better ones
    


    ryanofsky commented at 2:26 pm on June 6, 2025:

    In commit “validation: in invalidateblock, calculate m_best_header right away” (809413fde6893e60078e02b401a9d98a3e341fc1)

    Maybe replace “but there may be better ones” with “but there could be a better header, which will be found in the loop below”, because current comment seems to imply that the code will not actually find the best header.

  99. in src/validation.cpp:3787 in 809413fde6 outdated
    3783@@ -3776,6 +3784,10 @@ bool Chainstate::InvalidateBlock(BlockValidationState& state, CBlockIndex* pinde
    3784                 // Do not remove candidate from the highpow cache, because it might be a descendant of the block being invalidated
    3785                 // which needs to be marked failed later.
    3786             }
    3787+            if (best_header_needs_update &&
    3788+                m_chainman.m_best_header->nChainWork < candidate->nChainWork) {
    


    ryanofsky commented at 2:44 pm on June 6, 2025:

    In commit “validation: in invalidateblock, calculate m_best_header right away” (809413fde6893e60078e02b401a9d98a3e341fc1)

    It seems like it could be better to use CBlockIndexWorkComparator here instead of relying highpow_outofchain_headers multimap ordering when choosing m_best_header and two headers have the same mount of work because CBlockIndexWorkComparator should be more deterministic and consistent with surrounding code, while map order will depend on iteration order of the m_block_index unordered set.

    EDIT: Never mind probably. It looks like RecalculateBestHeader is also relying on m_block_index iteration order so this is consistent with that. Could be helpful to just add a comment about intent though.


    TheCharlatan commented at 8:17 am on June 10, 2025:
    I’d like a comment here too.

    mzumsande commented at 4:53 pm on June 10, 2025:
    Will add one in a follow-up if I don’t need to push.
  100. in src/validation.cpp:3758 in 809413fde6 outdated
    3755@@ -3756,7 +3756,15 @@ bool Chainstate::InvalidateBlock(BlockValidationState& state, CBlockIndex* pinde
    3756         // Mark out-of-chain descendants of the invalidated block as invalid
    3757         // (possibly replacing a pre-existing BLOCK_FAILED_VALID with BLOCK_FAILED_CHILD)
    3758         // Add any equal or more work headers that are not invalidated to setBlockIndexCandidates
    3759+        // Recalculate m_best_header if it became invalid.
    


    ryanofsky commented at 2:56 pm on June 6, 2025:

    In commit “validation: in invalidateblock, calculate m_best_header right away” (809413fde6893e60078e02b401a9d98a3e341fc1)

    Commit message says “Before, m_best_header would be calculated only after disconnecting multiple blocks” but I’m actually not seeing where this was happening, though I think it must have been happening?

    Could help if commit message was clarified (or not if I’m just missing something obvious).


    stickies-v commented at 10:58 am on June 9, 2025:

    but I’m actually not seeing where this was happening

    There’s a InvalidChainFound(to_mark_failed) call near the end of InvalidateBlock(), which in turn calls RecalculateBestHeader().


    ryanofsky commented at 1:24 pm on June 9, 2025:

    re: #31405 (review)

    In commit “validation: in invalidateblock, calculate m_best_header right away” (9a70883002e1fee76c24810808af4fb43f2c8cf5)

    Thanks! It looks like the later RecalculateBestHeader() call is now unnecessary and could repeat the work this new code is doing, which doesn’t seem great from perspective of duplicating code. InvalidChainFound also seems to be updating m_best_invalid and m_best_header together while the new code is only updating the latter, so that seems a bit fragile.

    I wonder if maybe instead of adding the new code in this commit it might make more sense to just move the InvalidateBlock call above the CheckBlockIndex call?


    mzumsande commented at 2:58 pm on June 9, 2025:

    I wonder if maybe instead of adding the new code in this commit it might make more sense to just move the InvalidateBlock call above the CheckBlockIndex call?

    You probably meant to suggest moving the InvalidChainFound call.

    I think that wouldn’t be ideal: We could put it into the loop (before cs_main is released) and remove all other changed to InvalidateBlock of this PR, but then we’d loop over the entire block index with each iteration, which I’d like to to avoid for performance reasons (e.g. artificial rewinds for ten-thousands of block for AssumeUtxo parameter determination).

    If we put it directly before the CheckBlockIndex call, the block index would be in an inconsistent state after each release of cs_main in the main loop, so that other threads can call CheckBlockIndex() (this happens, for example, if a new header is received via p2p, see ProcessNewBlockHeaders).

    Maybe one possibility would be to add a parameter to InvalidChainFound to skip RecalculateBestHeader and SetBlockFailureFlags (because I think we want to keep setting of m_best_invalid,the logging, and maybe also CheckForkWarningConditions).


    ryanofsky commented at 1:19 pm on June 10, 2025:

    re: #31405 (review)

    In commit “validation: in invalidateblock, calculate m_best_header right away” (9a70883002e1fee76c24810808af4fb43f2c8cf5)

    Thanks. I was just thinking of moving it right before the CheckBlockIndex call, but didn’t think about the fact that CheckBlockIndex could be called by other threads during this time.

    Maybe one possibility would be to add a parameter to InvalidChainFound to skip RecalculateBestHeader

    Yes, this would be helpful especially if it could allow moving the InvalidChainFound call inside the loop instead of after it. Current code only updating m_best_header inside the loop before cs_main is released, and updating both m_best_header and m_best_invalid afterward with no comments about the redundancy or rationale seems fragile.


    mzumsande commented at 4:52 pm on June 10, 2025:
    I’ll look into this for a follow-up. InvalidChainFound logs inconditionally, so just moving it into the loop could be too verbose when disconnecting thousands of blocks.
  101. in src/validation.h:1044 in fdcdc2cdea outdated
    1037@@ -1038,7 +1038,10 @@ class ChainstateManager
    1038     }
    1039 
    1040 
    1041-    /** Best header we've seen so far (used for getheaders queries' starting points). */
    1042+    /** Best header we've seen so far for which the block is not known to be invalid
    1043+        (used, among others, for getheaders queries' starting points).
    1044+        In case of multiple best headers with the same work, it could point to any:
    1045+        CBlockIndexWorkComparator tiebreaker rules don't apply here. */
    


    ryanofsky commented at 3:07 pm on June 6, 2025:

    In commit “doc: Improve m_best_header documentation” (fdcdc2cdeadb5d36076f1b94b54261e22e031354)

    It seems like it would be nice to use CBlockIndexWorkComparator consistently for comparisons, but since it’s probably beyond scope of this PR, it’s good to have this documentation comment mentioning the lack of consistency.

    Just in case its useful for future reference, the post describing the current situation in the most detail is #31405 (review)

  102. mzumsande force-pushed on Jun 6, 2025
  103. mzumsande commented at 3:21 pm on June 6, 2025: contributor
    fdcdc2c tof6b782f: addressed nits
  104. DrahtBot added the label CI failed on Jun 6, 2025
  105. DrahtBot commented at 3:23 pm on June 6, 2025: contributor

    🚧 At least one of the CI tasks failed. Task TSan, depends, gui: https://github.com/bitcoin/bitcoin/runs/43626034007 LLM reason (✨ experimental): The CI failed due to a syntax error in src/validation.cpp at line 4493, caused by an unbalanced parenthesis in the line if (Assume(state.IsInvalid()) {.

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  106. validation: call InvalidBlockFound also from AcceptBlock
    When a block it found invalid during acceptance (but before connection)
    we now mark its descendants with BLOCK_FAILED_CHILD and update
    m_best_header when these things weren't done reliably before.
    
    This does not fix a serious bug because the flags and m_best_header were being set on a best-effort basis before
    and not used for anything critical.
    Setting these reliably has a slight performance cost (iterating over the
    entire block index) but leads to more consistency in validation and allows removing m_failed_blocks in a later commit.
    This can only be triggered by providing a block with sufficient PoW
    that is otherwise invalid, so it is not a DoS vector.
    15fa5b5a90
  107. validation: cache all headers with enough PoW in invalidateblock
    We now include blocks without HaveNumChainTxs() / lower validation status
    than BLOCK_VALID_TRANSACTIONS. These checks are still performed at the
    spot where we use the cache to insert into setBlockIndexCandidates.
    
    This is in preparation for using the cache for more things than
    just setBlockIndexCandidates candidates in the following commits.
    
    Co-authored-by: stickies-v <stickies-v@protonmail.com>
    4c29326183
  108. validation: in invalidateblock, mark children as invalid right away
    Before, they would be marked as invalid only after disconnecting
    multiple blocks, letting go of cs_main in the meantime.
    This is in preparation for adding a check to
    CheckBlockIndex() requiring that descendants of invalid block indexes
    are always marked as invalid.
    Entries from highpow_outofchain_headers are now only removed if made invalid,
    no longer after inserting into setBlockIndexCandidates, because they
    might still become invalid later in the second case.
    This means that blocks could be inserted multiple times now into
    setBlockIndexCandidates - this won't have any effect, so
    behavior isn't changed.
    8e39f2d20d
  109. validation: in invalidateblock, calculate m_best_header right away
    Before, m_best_header would be calculated only after disconnecting
    multiple blocks, letting go of cs_main in the meantime.
    This is in preparation for adding checks to CheckBlockIndex()
    requiring that m_best_header is the most-work header not known to be invalid.
    
    Co-authored-by: stringintech <stringintech@gmail.com>
    9a70883002
  110. validation: Add more checks to CheckBlockIndex()
    This adds checks that
    1) Descendants of invalid block indexes are also marked invalid
    2) m_best_header cannot be invalid, and there can be no valid
    block with more work than it.
    ed764ea2b4
  111. validation: remove m_failed_blocks
    After changes in previous commits, we now mark all blocks that descend from an invalid block
    immediately as the block is found invalid. This happens both in the AcceptBlock
    and ConnectBlock stages of block validation.
    As a result, the pindexPrev->nStatus check in AcceptBlockHeader is now sufficient to detect
    invalid blocks and checking m_failed_blocks there is no longer necessary.
    ee673b9aa0
  112. doc: Improve m_best_header documentation f6b782f3aa
  113. mzumsande force-pushed on Jun 6, 2025
  114. ryanofsky approved
  115. ryanofsky commented at 3:33 pm on June 6, 2025: contributor
    Code review ACK fdcdc2cdeadb5d36076f1b94b54261e22e031354. Looks good, thanks for many new clarifications! Only changes since last review were switching from a CBlockIndexWorkComparator to a nChainWork comparison, switching from InvalidChainFound to InvalidBlockFound without changing behavior, and improving many comment. I left more suggestions below which are not important and all only about comments.
  116. ryanofsky approved
  117. ryanofsky commented at 3:36 pm on June 6, 2025: contributor
    Code review ACK f6b782f3aad4a6bcf823a9a0fabb4418bca1eea1 with only minor code & comment updates
  118. DrahtBot requested review from stickies-v on Jun 6, 2025
  119. DrahtBot removed the label CI failed on Jun 6, 2025
  120. stickies-v commented at 11:06 am on June 9, 2025: contributor

    re-ACK f6b782f3aad4a6bcf823a9a0fabb4418bca1eea1

    No changes except minor ones addressing review feedback wrt documentation updates and inlining.

  121. TheCharlatan approved
  122. TheCharlatan commented at 8:17 am on June 10, 2025: contributor
    Re-ACK f6b782f3aad4a6bcf823a9a0fabb4418bca1eea1
  123. ryanofsky approved
  124. ryanofsky commented at 1:51 pm on June 10, 2025: contributor
    This seems ready to merge, but it’s also a subtle change so it could be helpful if more reviewers want to take a look and ack. I’m planning to merge this in a day or two if nothing new comes up.
  125. achow101 commented at 5:54 pm on June 10, 2025: member
    reACK f6b782f3aad4a6bcf823a9a0fabb4418bca1eea1
  126. ryanofsky assigned ryanofsky on Jun 11, 2025
  127. ryanofsky merged this on Jun 11, 2025
  128. ryanofsky closed this on Jun 11, 2025

  129. mzumsande deleted the branch on Jun 13, 2025

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-06-18 00:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me