validation: prevent FindMostWorkChain from causing UB #35070

pull stratospher wants to merge 3 commits into bitcoin:master from stratospher:2026_04_m_blocks_unlinked_ub changing 3 files +18 −16
  1. stratospher commented at 1:03 PM on April 14, 2026: contributor

    This is joint work with @ mzumsande.

    note: this requires a pruned node with deep reorgs to trigger. still it breaks assumptions in the codebase and is good to fix. A similar UB was fixed in #34521.

    This PR prevents duplicate insertions into m_blocks_unlinked in FindMostWorkChain. There are 3 ways to insert into m_blocks_unlinked:

    1. LoadBlockIndex - not problematic, as each block index is processed only once.
    2. ReceivedBlockTransactions - not problematic, as this is usually only called once per block when it is first accepted in AcceptBlock. in the rare case it’s triggered again after pruning, the block would have been removed from m_blocks_unlinked when it was initially pruned, so duplicates still can’t arise.
    3. FindMostWorkChain - problematic when multiple candidate tips share common chains of ancestors, traversals from each tip to the fork point may insert duplicate (pprev, pindex) entries for blocks whose parents have been pruned.

    When the missing parent is later received and ReceivedBlockTransactions processes m_blocks_unlinked, the same entry may be processed multiple times. This can result in the block being re-added to setBlockIndexCandidates with a modified nSequenceId, violating its ordering invariants and leading to undefined behavior. So avoid duplicate insertions into m_blocks_unlinked in FindMostWorkChain.

    how to test:

    use the updated feature_pruning.py which adds coverage for this scenario.

    • on master: the test (with the below diff) fails since nSequenceId is being modified for an entry in setBlockIndexCandidates
    • on this branch: the test (with the below diff) passes
    --- a/src/validation.cpp
    +++ b/src/validation.cpp
    @@ -3814,6 +3814,12 @@ void ChainstateManager::ReceivedBlockTransactions(const CBlock& block, CBlockInd
                        pindex->nHeight, pindex->m_chain_tx_count, prev_tx_sum(*pindex), CLIENT_NAME, FormatFullVersion(), CLIENT_BUGREPORT);
                 }
                 pindex->m_chain_tx_count = prev_tx_sum(*pindex);
    +            for (const auto& c : m_chainstates) {
    +                if (c->setBlockIndexCandidates.contains(pindex)) {
    +                    LogInfo("### pindex UB = %s", pindex);
    +                    assert(false);
    +                }
    +            }
                 pindex->nSequenceId = nBlockSequenceId++;
                 for (const auto& c : m_chainstates) {
                     c->TryAddBlockIndexCandidate(pindex);
    
  2. DrahtBot added the label Validation on Apr 14, 2026
  3. DrahtBot commented at 1:04 PM on April 14, 2026: contributor

    <!--e57a25ab6845829454e8d69fc972939a-->

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    <!--021abf342d371248e50ceaed478a90ca-->

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Concept ACK stringintech, sedited
    Approach ACK marcofleon

    If your review is incorrectly listed, please copy-paste <code>&lt;!--meta-tag:bot-skip--&gt;</code> into the comment that the bot should ignore.

    <!--5faf32d7da4f0f540f40219e4f7537a3-->

  4. stratospher commented at 1:05 PM on April 14, 2026: contributor

    We also considered changing the type of m_blocks_unlinked from std::multimap<CBlockIndex*, CBlockIndex*> to std::map<CBlockIndex*, std::set<CBlockIndex*>> to prevent duplicate entries at the type level and to better express the parent -> set of unique children relationship.

    However, this broke test_chain_split_in_memory in feature_chain_tiebreaks.py. The root cause was:

    • std::multimap preserves insertion order for entries with equal keys. When a parent's data finally arrives and ReceivedBlockTransactions processes m_blocks_unlinked, children of that parent are dequeued in insertion order (i.e., in the order their block data was received) and each assigned nSequenceId. CBlockIndexWorkComparator uses nSequenceId as the tiebreaker among equal-work chains (lower wins). So insertion order in m_blocks_unlinked directly determines which sibling becomes the chain tip.
    • Switching to std::set<CBlockIndex*> would instead order children by pointer address, not by order in which they were received, and this changed which sibling got the lower nSequenceId and broke the chain tip selection which the test is checking.

    We didn't want to cause any behaviour change in this PR, so went with std::multimap and added an explicit duplicate check at FindMostWorkChain callsite instead. Open to suggestions if someone has a better idea for a more suitable data structure!

  5. in src/validation.cpp:3163 in c80fc367d1 outdated
    3163 | +                            }
    3164 | +                        }
    3165 | +
    3166 | +                        if (!found) {
    3167 | +                            m_blockman.m_blocks_unlinked.emplace(pindexFailed->pprev, pindexFailed);
    3168 | +                        }
    


    stringintech commented at 7:57 PM on April 17, 2026:

    Seems this can become a bit less verbose:

    auto range = m_blockman.m_blocks_unlinked.equal_range(pindexFailed->pprev);
    if (!std::any_of(range.first, range.second, [&](const auto& p) { return p.second == pindexFailed; })) {
        m_blockman.m_blocks_unlinked.emplace(pindexFailed->pprev, pindexFailed);
    }
    

    stratospher commented at 6:05 PM on April 20, 2026:

    nice! taken.

  6. stringintech commented at 8:25 PM on April 17, 2026: contributor

    Concept ACK.

    Reverted the fix and added the suggested diff in the PR description and made sure the assertion is hit.

    While reviewing, noticed another source of duplicate insertion and potential UB though:

    LoadBlockIndex inserts into m_blocks_unlinked for blocks with nTx > 0 without checking BLOCK_HAVE_DATA. When a block is pruned, PruneOneBlockFile explicitly removes its (pprev, pindex) pair from m_blocks_unlinked. But on the next restart, LoadBlockIndex puts it back — violating the invariant that only blocks with BLOCK_HAVE_DATA should be in m_blocks_unlinked.

    When such a pruned block is re-downloaded (perhaps because it is now part of a potential best chain), ReceivedBlockTransactions inserts the same (pprev, pindex) pair again, causing the duplicate. This can happen when a block gets pruned and then re-downloaded before its parent was ever downloaded, which seems like a rare scenario and trickier to reproduce.

    This can be fixed by guarding the insertion in LoadBlockIndex with BLOCK_HAVE_DATA:

    } else {
        pindex->m_chain_tx_count = 0;
        if (pindex->nStatus & BLOCK_HAVE_DATA) {
            m_blocks_unlinked.insert(std::make_pair(pindex->pprev, pindex));
        }
    }
    

    Note that for the duplicate insertion in ReceivedBlockTransactions to be reached in this scenario, CheckBlockIndex must not run in AcceptBlock — since it already asserts !(pindex->nStatus & BLOCK_HAVE_DATA) → !foundInUnlinked and catches the inconsistency.

  7. mzumsande commented at 2:03 AM on April 18, 2026: contributor

    While reviewing, noticed another source of duplicate insertion and potential UB though:

    This can be fixed by guarding the insertion in LoadBlockIndex with BLOCK_HAVE_DATA:

    I agree. The LoadBlockIndex issue is also described in issue #35050, which suggests the same fix.

  8. validation: avoid duplicates in m_blocks_unlinked
    In a pruned node undergoing a deep reorg, FindMostWorkChain can
    insert duplicate entries into m_blocks_unlinked.
    
    This can happen when:
    - Traversing from one candidate tip to the fork point adds blocks
    whose parents have been pruned.
    - Traversing from another candidate tip over the same fork inserts
    the same pairs again, since the blocks are shared across
    both branches.
    
    When we finally download the missing parent from our peer and call
    ReceivedBlockTransactions to process m_blocks_unlinked, the same
    entry may be processed multiple times.
    
    This can lead to re-insertion into setBlockIndexCandidates with
    a modified nSequenceId, violating its ordering invariants and
    causing undefined behavior. So avoid duplicate insertions into
    m_blocks_unlinked here.
    
    Co-authored-by: Martin Zumsande <mzumsande@gmail.com>
    c97a7602a7
  9. test: add coverage for UB caused by FindMostWorkChain
    when pruned node2 finally receives the blocks from
    node0, node2 will process duplicate entires in
    ReceivedBlockTransactions twice but it will only
    insert into setBlockIndexCandidates if the block has
    more work that the current chain tip.
    
    the duplicate entries in m_blocks_unlinked in this
    test are from height 1171 to 1294.
    
    before this commit
    - we invalidated height 1320 and chain tip became
    1319. so we won't add duplicate entries (all have
    <1319) into setBlockIndexCandidates and won't
    have coverage for this UB scenario.
    
    with this commit
    - we invalidate height 1295 and chain tip became
    1294. so we will process the duplicate entry
    1294 in m_blocks_unlinked and have coverage
    for this UB.
    
    Co-authored-by: Martin Zumsande <mzumsande@gmail.com>
    efcf74e07d
  10. test/doc: remove misleading comment and improve tests
    remove misleading comment for m_blocks_unlinked since for
    pruned nodes:
    - usually A is the missing data (just like in non-pruned nodes)
    - in PruneOneBlockFile, we remove entries once data for B is missing.
    
    Co-authored-by: Martin Zumsande <mzumsande@gmail.com>
    b02191feac
  11. stratospher force-pushed on Apr 20, 2026
  12. marcofleon commented at 4:44 PM on April 27, 2026: contributor

    Approach ACK

    The fix looks good to me. I have a suggestion in place of (or in addition to) the test change in efcf74e07d8ab45d89ea0191393307224a5f64bd. The functional test change seemed a bit magic numbery to me.

    In CheckBlockIndex, I replaced this: https://github.com/bitcoin/bitcoin/blob/ad3f73862bdbee0aac106fa9e08c4181ce78ba47/src/validation.cpp#L5339-L5346

    with an assertion that there are no duplicates in m_blocks_unlinked. Something like this:

    size_t numFoundInUnlinked = 0;
    for (auto it = rangeUnlinked.first; it != rangeUnlinked.second; ++it) {
        assert(it->first == pindex->pprev);
        if (it->second == pindex) {
            ++numFoundInUnlinked;
        }
    }
    assert(numFoundInUnlinked <= 1);
    bool foundInUnlinked = (numFoundInUnlinked > 0);
    

    Running feature_pruning.py with this patch on master will fail:

    node2 stderr Assertion failed: (numFoundInUnlinked <= 1), function CheckBlockIndex, file validation.cpp, line 5359.
    

    Then applying the first commit of this PR, the test passes without the second commit's change to which block we invalidate. Maybe there's something I'm missing about efcf74e07d8ab45d89ea0191393307224a5f64bd (covers something the assertion doesn't), but asserting in CheckBlockIndex feels like a more general way of preventing these duplicate insertions from (re)occurring.

  13. sedited commented at 9:02 AM on April 29, 2026: contributor

    Concept ACK

    Given the number of small bugs we've been finding around it, maybe a better way to fix this would be wrapping m_blocks_unlinked in a data structure whose member functions enforce the required invariants (blocks need to have data, no duplicates)?


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-05-16 15:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me