prune, rpc: Check undo data when finding pruneheight #29668

pull fjahr wants to merge 3 commits into bitcoin:master from fjahr:2024-03-first-stored-undo changing 7 files +120 −14
  1. fjahr commented at 4:42 PM on March 17, 2024: contributor

    The function GetFirstStoredBlock() helps us find the first block for which we have data. So far this function only looked for a block with BLOCK_HAVE_DATA. However, this doesn't mean that we also have the undo data of that block, and undo data might be required for what a user would like to do with those blocks. One example of how this might happen is if some blocks were fetched using the getblockfrompeer RPC. Blocks fetched from a peer will have data but no undo data.

    The first commit here allows GetFirstStoredBlock() to check for undo data as well by passing a parameter. This alone is useful for #29553 and I would use it there.

    In the second commit I am applying the undo check to the RPCs that report pruneheight to the user. I find this much more intuitive because I think the user expects to be able to do all operations on blocks up until the pruneheight but that is not the case if undo data is missing. I personally ran into this once before and now again when testing for assumeutxo when I had used getblockfrompeer. The following commit adds test coverage for this change of behavior.

    The last commit adds a note in the docs of getblockfrompeer that undo data will not be available.

  2. DrahtBot commented at 4:42 PM on March 17, 2024: contributor

    <!--e57a25ab6845829454e8d69fc972939a-->

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    <!--006a51241073e994b41acfe9ec718e94-->

    Code Coverage

    For detailed information about the code coverage, see the test coverage report.

    <!--021abf342d371248e50ceaed478a90ca-->

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK furszy, stickies-v, achow101
    Stale ACK TheCharlatan

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    <!--174a7506f384e20aa4161008e828411d-->

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #29770 (index: Check all necessary block data is available before starting to sync by fjahr)
    • #29656 (chainparams: Change nChainTx type to uint64_t by fjahr)
    • #19463 (Prune locks by luke-jr)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. fjahr renamed this:
    prune, rpc: Check undo data when finding for pruneheight
    prune, rpc: Check undo data when finding pruneheight
    on Mar 17, 2024
  4. DrahtBot added the label CI failed on Mar 17, 2024
  5. fjahr force-pushed on Mar 17, 2024
  6. DrahtBot removed the label CI failed on Mar 17, 2024
  7. TheCharlatan approved
  8. TheCharlatan commented at 2:47 PM on March 20, 2024: contributor

    ACK 296243c3153e996d1626b8eabc8f8bce845ce3ad

  9. in src/node/blockstorage.cpp:598 in 296243c315 outdated
     594 | @@ -595,12 +595,16 @@ bool BlockManager::IsBlockPruned(const CBlockIndex& block)
     595 |      return m_have_pruned && !(block.nStatus & BLOCK_HAVE_DATA) && (block.nTx > 0);
     596 |  }
     597 |  
     598 | -const CBlockIndex* BlockManager::GetFirstStoredBlock(const CBlockIndex& upper_block, const CBlockIndex* lower_block)
     599 | +const CBlockIndex* BlockManager::GetFirstStoredBlock(const CBlockIndex& upper_block, const CBlockIndex* lower_block, const bool check_undo)
    


    stickies-v commented at 5:14 PM on March 25, 2024:

    nit: I find has_undo more to the point

    const CBlockIndex* BlockManager::GetFirstStoredBlock(const CBlockIndex& upper_block, const CBlockIndex* lower_block, const bool has_undo)
    

    fjahr commented at 5:03 PM on March 27, 2024:

    Not needed due to approach change

  10. in src/node/blockstorage.cpp:601 in 296243c315 outdated
     594 | @@ -595,12 +595,16 @@ bool BlockManager::IsBlockPruned(const CBlockIndex& block)
     595 |      return m_have_pruned && !(block.nStatus & BLOCK_HAVE_DATA) && (block.nTx > 0);
     596 |  }
     597 |  
     598 | -const CBlockIndex* BlockManager::GetFirstStoredBlock(const CBlockIndex& upper_block, const CBlockIndex* lower_block)
     599 | +const CBlockIndex* BlockManager::GetFirstStoredBlock(const CBlockIndex& upper_block, const CBlockIndex* lower_block, const bool check_undo)
     600 |  {
     601 |      AssertLockHeld(::cs_main);
     602 | +    uint32_t check_status{BLOCK_HAVE_DATA};
    


    stickies-v commented at 5:24 PM on March 25, 2024:

    nit: I think highlighting that this is a mask improves readability

        uint32_t status_mask{BLOCK_HAVE_DATA};
    

    stickies-v commented at 5:29 PM on March 25, 2024:

    nit: also, perhaps good to add a type static_assert here to ensure this doesn't silently introduce bugs when nStatus changes type?

    static_assert(std::is_same<decltype(check_status), decltype(last_block->nStatus)>::value);
    
    

    fjahr commented at 5:03 PM on March 27, 2024:

    Not needed due to approach change

  11. in src/node/blockstorage.h:341 in 296243c315 outdated
     336 | @@ -337,8 +337,8 @@ class BlockManager
     337 |  
     338 |      //! Find the first stored ancestor of start_block immediately after the last
     339 |      //! pruned ancestor. Return value will never be null. Caller is responsible
     340 | -    //! for ensuring that start_block has data is not pruned.
     341 | -    const CBlockIndex* GetFirstStoredBlock(const CBlockIndex& start_block LIFETIMEBOUND, const CBlockIndex* lower_block=nullptr) EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
     342 | +    //! for ensuring that start_block has data.
     343 | +    const CBlockIndex* GetFirstStoredBlock(const CBlockIndex& start_block LIFETIMEBOUND, const CBlockIndex* lower_block=nullptr, const bool check_undo=false) EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
    


    stickies-v commented at 5:35 PM on March 25, 2024:

    nit: unnecessary const

        const CBlockIndex* GetFirstStoredBlock(const CBlockIndex& start_block LIFETIMEBOUND, const CBlockIndex* lower_block=nullptr, bool check_undo=false) EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
    

    stickies-v commented at 5:42 PM on March 25, 2024:

    nit: also, would be nice to update the docs for the check_undo param, and include the exception made for genesis block so new callers are made aware to look out for that?


    fjahr commented at 5:03 PM on March 27, 2024:

    done

  12. in src/rpc/blockchain.cpp:839 in 296243c315 outdated
     835 | @@ -835,7 +836,7 @@ static RPCHelpMan pruneblockchain()
     836 |  
     837 |      PruneBlockFilesManual(active_chainstate, height);
     838 |      const CBlockIndex& block{*CHECK_NONFATAL(active_chain.Tip())};
     839 | -    return block.nStatus & BLOCK_HAVE_DATA ? active_chainstate.m_blockman.GetFirstStoredBlock(block)->nHeight - 1 : block.nHeight;
     840 | +    return block.nStatus & BLOCK_HAVE_MASK ? active_chainstate.m_blockman.GetFirstStoredBlock(block, nullptr, true)->nHeight - 1 : block.nHeight;
    


    stickies-v commented at 5:45 PM on March 25, 2024:

    nit (+ a few lines down)

        return block.nStatus & BLOCK_HAVE_MASK ? active_chainstate.m_blockman.GetFirstStoredBlock(block, /*last_block=*/nullptr, /*check_undo=*/true)->nHeight - 1 : block.nHeight;
    

    fjahr commented at 5:03 PM on March 27, 2024:

    done

  13. stickies-v commented at 8:12 PM on March 25, 2024: contributor

    Concept ACK

    I think this approach works well, but I think I'd prefer passing a mask instead of a bool, which prevents combinatorial explosion of params if we want to add more filtering options in the future. I personally find that a bit easier to quickly understand too. What do you think about something like this (didn't add static asserts and doc updates etc yet):

    <details> <summary>git diff on 296243c315</summary>

    diff --git a/src/node/blockstorage.cpp b/src/node/blockstorage.cpp
    index 74a233f43b..eaefa602d3 100644
    --- a/src/node/blockstorage.cpp
    +++ b/src/node/blockstorage.cpp
    @@ -595,16 +595,12 @@ bool BlockManager::IsBlockPruned(const CBlockIndex& block)
         return m_have_pruned && !(block.nStatus & BLOCK_HAVE_DATA) && (block.nTx > 0);
     }
     
    -const CBlockIndex* BlockManager::GetFirstStoredBlock(const CBlockIndex& upper_block, const CBlockIndex* lower_block, const bool check_undo)
    +const CBlockIndex* BlockManager::GetFirstStoredBlock(const CBlockIndex& upper_block, const CBlockIndex* lower_block, uint32_t status_mask)
     {
         AssertLockHeld(::cs_main);
    -    uint32_t check_status{BLOCK_HAVE_DATA};
    -    if (check_undo) {
    -        check_status |= BLOCK_HAVE_UNDO;
    -    }
         const CBlockIndex* last_block = &upper_block;
    -    assert((last_block->nStatus & check_status) == check_status); // 'upper_block' must have data
    -    while (last_block->pprev && ((last_block->pprev->nStatus & check_status) == check_status)) {
    +    assert((last_block->nStatus & status_mask) == status_mask); // 'upper_block' must have data
    +    while (last_block->pprev && ((last_block->pprev->nStatus & status_mask) == status_mask)) {
             if (lower_block) {
                 // Return if we reached the lower_block
                 if (last_block == lower_block) return lower_block;
    @@ -617,7 +613,7 @@ const CBlockIndex* BlockManager::GetFirstStoredBlock(const CBlockIndex& upper_bl
         assert(last_block != nullptr);
         // In the special case that all blocks up to the Genesis block are not
         // pruned, we return the Genesis block instead of block 1, even though
    -    // it can not have any undo data, since it is properly handled as an
    +    // it may be missing (e.g. undo) data, since it is properly handled as an
         // exception everywhere.
         if (last_block->nHeight == 1) {
             return last_block->pprev;
    diff --git a/src/node/blockstorage.h b/src/node/blockstorage.h
    index 33d78259e6..766a170946 100644
    --- a/src/node/blockstorage.h
    +++ b/src/node/blockstorage.h
    @@ -338,7 +338,7 @@ public:
         //! Find the first stored ancestor of start_block immediately after the last
         //! pruned ancestor. Return value will never be null. Caller is responsible
         //! for ensuring that start_block has data.
    -    const CBlockIndex* GetFirstStoredBlock(const CBlockIndex& start_block LIFETIMEBOUND, const CBlockIndex* lower_block=nullptr, const bool check_undo=false) EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
    +    const CBlockIndex* GetFirstStoredBlock(const CBlockIndex& start_block LIFETIMEBOUND, const CBlockIndex* lower_block=nullptr, uint32_t status_mask=BLOCK_HAVE_DATA) EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
     
         /** True if any block files have ever been pruned. */
         bool m_have_pruned = false;
    diff --git a/src/rpc/blockchain.cpp b/src/rpc/blockchain.cpp
    index 3cd9421343..88ed7ed964 100644
    --- a/src/rpc/blockchain.cpp
    +++ b/src/rpc/blockchain.cpp
    @@ -836,7 +836,8 @@ static RPCHelpMan pruneblockchain()
     
         PruneBlockFilesManual(active_chainstate, height);
         const CBlockIndex& block{*CHECK_NONFATAL(active_chain.Tip())};
    -    return block.nStatus & BLOCK_HAVE_MASK ? active_chainstate.m_blockman.GetFirstStoredBlock(block, nullptr, true)->nHeight - 1 : block.nHeight;
    +    const uint32_t has_undo{BLOCK_HAVE_DATA | BLOCK_HAVE_UNDO};
    +    return block.nStatus & BLOCK_HAVE_MASK ? active_chainstate.m_blockman.GetFirstStoredBlock(block, nullptr, has_undo)->nHeight - 1 : block.nHeight;
     },
         };
     }
    @@ -1290,7 +1291,8 @@ RPCHelpMan getblockchaininfo()
         obj.pushKV("pruned", chainman.m_blockman.IsPruneMode());
         if (chainman.m_blockman.IsPruneMode()) {
             bool has_tip_data = tip.nStatus & BLOCK_HAVE_MASK;
    -        obj.pushKV("pruneheight", has_tip_data ? chainman.m_blockman.GetFirstStoredBlock(tip, nullptr, true)->nHeight : tip.nHeight + 1);
    +        const uint32_t has_undo{BLOCK_HAVE_DATA | BLOCK_HAVE_UNDO};
    +        obj.pushKV("pruneheight", has_tip_data ? chainman.m_blockman.GetFirstStoredBlock(tip, nullptr, has_undo)->nHeight : tip.nHeight + 1);
     
             const bool automatic_pruning{chainman.m_blockman.GetPruneTarget() != BlockManager::PRUNE_TARGET_MANUAL};
             obj.pushKV("automatic_pruning",  automatic_pruning);
    
    

    </details>

    Code LGTM 296243c3153e996d1626b8eabc8f8bce845ce3ad, and I verified that the (updated) feature_pruning.py test fails on master as expected. I'm happy to proceed with your approach too if you don't like mine.

  14. fjahr force-pushed on Mar 27, 2024
  15. fjahr commented at 5:03 PM on March 27, 2024: contributor

    @stickies-v thanks for the feedback! I don't have strong feelings between the approaches but I agree yours is more flexible and maybe a bit more expressive, so I applied that. I also addressed the rest of the feedback and added some more detail to the docs.

  16. DrahtBot added the label CI failed on Mar 28, 2024
  17. DrahtBot commented at 1:49 AM on March 28, 2024: contributor

    <!--85328a0da195eb286784d51f73fa0af9-->

    🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the documentation.

    Possibly this is due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    Leave a comment here, if you need help tracking down a confusing failure.

    <sub>Debug: https://github.com/bitcoin/bitcoin/runs/23163769501</sub>

  18. fjahr force-pushed on Mar 28, 2024
  19. furszy commented at 3:08 PM on March 28, 2024: member

    q: doesn't this mean that if we generate a chain, prune half of it, then fetch the pruned blocks through getblockfrompeer, any fresh indexes sync will throw a different init error after this changes? (StartIndexBackgroundSync() should fail at CheckBlockDataAvailability() now).

    If this is correct, it would be nice to introduce a test presenting the behavior change.

  20. DrahtBot removed the label CI failed on Mar 28, 2024
  21. fjahr commented at 10:57 PM on March 30, 2024: contributor

    q: doesn't this mean that if we generate a chain, prune half of it, then fetch the pruned blocks through getblockfrompeer, any fresh indexes sync will throw a different init error after this changes? (StartIndexBackgroundSync() should fail at CheckBlockDataAvailability() now).

    If this is correct, it would be nice to introduce a test presenting the behavior change.

    Good question but no, CheckBlockDataAvailability() does not change the default mask (HAVE_DATA), so it shouldn't change behavior there for now. But your question made me realize that we should check for undo data for some indices because they need it. I have implemented that in a follow-up here: #29770 and I also implemented the test there that shows the behavior change you had in mind I think.

  22. in src/rpc/blockchain.cpp:840 in 23ac56177d outdated
     835 | @@ -835,7 +836,8 @@ static RPCHelpMan pruneblockchain()
     836 |  
     837 |      PruneBlockFilesManual(active_chainstate, height);
     838 |      const CBlockIndex& block{*CHECK_NONFATAL(active_chain.Tip())};
     839 | -    return block.nStatus & BLOCK_HAVE_DATA ? active_chainstate.m_blockman.GetFirstStoredBlock(block)->nHeight - 1 : block.nHeight;
     840 | +    const uint32_t has_undo{BLOCK_HAVE_DATA | BLOCK_HAVE_UNDO};
     841 | +    return block.nStatus & BLOCK_HAVE_MASK ? active_chainstate.m_blockman.GetFirstStoredBlock(block, /*lower_block=*/nullptr, /*status_mask=*/has_undo)->nHeight - 1 : block.nHeight;
    


    stickies-v commented at 1:35 PM on March 31, 2024:
        return block.nStatus & has_undo ? active_chainstate.m_blockman.GetFirstStoredBlock(block, /*lower_block=*/nullptr, /*status_mask=*/has_undo)->nHeight - 1 : block.nHeight;
    

    fjahr commented at 7:28 PM on April 28, 2024:

    done

  23. in src/rpc/blockchain.cpp:1293 in 23ac56177d outdated
    1290 | @@ -1289,7 +1291,8 @@ RPCHelpMan getblockchaininfo()
    1291 |      obj.pushKV("pruned", chainman.m_blockman.IsPruneMode());
    1292 |      if (chainman.m_blockman.IsPruneMode()) {
    1293 |          bool has_tip_data = tip.nStatus & BLOCK_HAVE_DATA;
    


    stickies-v commented at 1:36 PM on March 31, 2024:

    This should check for has_undo too


    stickies-v commented at 2:13 PM on March 31, 2024:

    Also, it seems like we can just use BLOCK_HAVE_MASK = BLOCK_HAVE_DATA | BLOCK_HAVE_UNDO,


    fjahr commented at 7:28 PM on April 28, 2024:

    done, I liked that BLOCK_HAVE_DATA | BLOCK_HAVE_UNDO is more expressive because we don't use BLOCK_HAVE_MASK much I feel like people would need to look it up more, but I have just added a comment instead now.

  24. in src/node/blockstorage.h:344 in 23ac56177d outdated
     342 | +    //! ancestor that does not match the status mask. Return value will never be
     343 | +    //! null. Caller is responsible for ensuring that start_block has the status
     344 | +    //! mask data. If the whole chain matched the status mask the genesis block
     345 | +    //! will be returned regardless of it's status match because, for example,
     346 | +    //! it can not have undo data by nature.
     347 | +    const CBlockIndex* GetFirstStoredBlock(const CBlockIndex& start_block LIFETIMEBOUND, const CBlockIndex* lower_block=nullptr, uint32_t status_mask=BLOCK_HAVE_DATA) EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
    


    stickies-v commented at 3:35 PM on March 31, 2024:

    nit: it's called upper_block in the implementation, which I think is a better name too (cfr lower_block)

        const CBlockIndex* GetFirstStoredBlock(const CBlockIndex& upper_block LIFETIMEBOUND, const CBlockIndex* lower_block=nullptr, uint32_t status_mask=BLOCK_HAVE_DATA) EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
    

    fjahr commented at 7:28 PM on April 28, 2024:

    done

  25. stickies-v commented at 10:14 PM on March 31, 2024: contributor

    Thanks for incorporating the mask suggestion. Had a deeper look at the code, I think this logic maybe warrants a bit more cleaning up, because with the status_mask, using GetFirstStoredBlock is not very intuitive anymore and potentially a bit footgunny.

    • GetFirstStoredBlock: rename to GetFirstBlock since the Stored bit depends on the status_mask now. Consequently, make status_mask required to avoid footguns.
    • let callsites handle genesis block exceptions, because it depends on the status_mask. As a bonus, everything is a bit more explicit?
    • some other tidying up, e.g. make parameter naming consistent between header and implementation, improve documentation

    Wdyt about this further iteration? I understand it's quite a bit of overhaul compared to your first proposal for what should be a relatively simple change. I'm just hesitant about introducing new boolean parameters when we can avoid it, and I think it's worth improving overall code robustness?

    <details> <summary>git diff on 23ac56177d</summary>

    diff --git a/src/init.cpp b/src/init.cpp
    index 1a4fce4678..a8a5ff0dfd 100644
    --- a/src/init.cpp
    +++ b/src/init.cpp
    @@ -2032,7 +2032,7 @@ bool StartIndexBackgroundSync(NodeContext& node)
         if (indexes_start_block) {
             LOCK(::cs_main);
             const CBlockIndex* start_block = *indexes_start_block;
    -        if (!start_block) start_block = chainman.ActiveChain().Genesis();
    +        if (!start_block) start_block = chainman.ActiveChain()[1];
             if (!chainman.m_blockman.CheckBlockDataAvailability(*index_chain.Tip(), *Assert(start_block))) {
                 return InitError(strprintf(Untranslated("%s best block of the index goes beyond pruned data. Please disable the index or reindex (which will download the whole blockchain again)"), older_index_name));
             }
    diff --git a/src/node/blockstorage.cpp b/src/node/blockstorage.cpp
    index eaefa602d3..8556504453 100644
    --- a/src/node/blockstorage.cpp
    +++ b/src/node/blockstorage.cpp
    @@ -595,11 +595,12 @@ bool BlockManager::IsBlockPruned(const CBlockIndex& block)
         return m_have_pruned && !(block.nStatus & BLOCK_HAVE_DATA) && (block.nTx > 0);
     }
     
    -const CBlockIndex* BlockManager::GetFirstStoredBlock(const CBlockIndex& upper_block, const CBlockIndex* lower_block, uint32_t status_mask)
    +const CBlockIndex* BlockManager::GetFirstBlock(const CBlockIndex& upper_block, uint32_t status_mask,
    +                                               const CBlockIndex* lower_block)
     {
         AssertLockHeld(::cs_main);
         const CBlockIndex* last_block = &upper_block;
    -    assert((last_block->nStatus & status_mask) == status_mask); // 'upper_block' must have data
    +    assert((last_block->nStatus & status_mask) == status_mask); // 'upper_block' must satisfy the status mask
         while (last_block->pprev && ((last_block->pprev->nStatus & status_mask) == status_mask)) {
             if (lower_block) {
                 // Return if we reached the lower_block
    @@ -610,21 +611,13 @@ const CBlockIndex* BlockManager::GetFirstStoredBlock(const CBlockIndex& upper_bl
             }
             last_block = last_block->pprev;
         }
    -    assert(last_block != nullptr);
    -    // In the special case that all blocks up to the Genesis block are not
    -    // pruned, we return the Genesis block instead of block 1, even though
    -    // it may be missing (e.g. undo) data, since it is properly handled as an
    -    // exception everywhere.
    -    if (last_block->nHeight == 1) {
    -        return last_block->pprev;
    -    }
    -    return last_block;
    +    return Assert(last_block);
     }
     
     bool BlockManager::CheckBlockDataAvailability(const CBlockIndex& upper_block, const CBlockIndex& lower_block)
     {
         if (!(upper_block.nStatus & BLOCK_HAVE_DATA)) return false;
    -    return GetFirstStoredBlock(upper_block, &lower_block) == &lower_block;
    +    return GetFirstBlock(upper_block, BLOCK_HAVE_DATA, &lower_block) == &lower_block;
     }
     
     // If we're using -prune with -reindex, then delete block files that will be ignored by the
    diff --git a/src/node/blockstorage.h b/src/node/blockstorage.h
    index 8f83b68558..8f309c31b6 100644
    --- a/src/node/blockstorage.h
    +++ b/src/node/blockstorage.h
    @@ -335,13 +335,29 @@ public:
         //! (part of the same chain).
         bool CheckBlockDataAvailability(const CBlockIndex& upper_block LIFETIMEBOUND, const CBlockIndex& lower_block LIFETIMEBOUND) EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
     
    -    //! Find the first stored ancestor of start_block immediately after the last
    -    //! ancestor that does not match the status mask. Return value will never be
    -    //! null. Caller is responsible for ensuring that start_block has the status
    -    //! mask data. If the whole chain matched the status mask the genesis block
    -    //! will be returned regardless of it's status match because, for example,
    -    //! it can not have undo data by nature.
    -    const CBlockIndex* GetFirstStoredBlock(const CBlockIndex& start_block LIFETIMEBOUND, const CBlockIndex* lower_block=nullptr, uint32_t status_mask=BLOCK_HAVE_DATA) EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
    +    /**
    +     * [@brief](/bitcoin-bitcoin/contributor/brief/) Finds the furthest away ancestor of `upper_block` of which the nStatus matches the `status_mask`
    +     * 
    +     * Starts from `upper_block` and iterates backwards through its ancestors for as long as the block's
    +     * nStatus keeps matching the `status_mask` mask, or until it encounters `lower_block`.
    +     *
    +     * [@pre](/bitcoin-bitcoin/contributor/pre/) `upper_block` must match the `status_mask`.
    +     * 
    +     * [@param](/bitcoin-bitcoin/contributor/param/) upper_block The block from which to start the search, must meet the `status_mask` criteria.
    +     * [@param](/bitcoin-bitcoin/contributor/param/) status_mask A bitmask representing the conditions to check against each block's nStatus.
    +     * [@param](/bitcoin-bitcoin/contributor/param/) lower_block An optional pointer to the `CBlockIndex` that serves as the lower boundary of
    +     *                    the search (inclusive). If `nullptr`, the search includes up to the genesis
    +     *                    block.
    +     * [@return](/bitcoin-bitcoin/contributor/return/) A (non-null) pointer to the `CBlockIndex` of the furthest away ancestor (including
    +     *         `upper_block` itself) that matches the `status_mask`, up to and including
    +     *         `lower_block` if it is part of the ancestry.
    +     */
    +    const CBlockIndex* GetFirstBlock(
    +        const CBlockIndex& upper_block LIFETIMEBOUND, 
    +        uint32_t status_mask,
    +        const CBlockIndex* lower_block = nullptr
    +    ) EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
    +
     
         /** True if any block files have ever been pruned. */
         bool m_have_pruned = false;
    diff --git a/src/rpc/blockchain.cpp b/src/rpc/blockchain.cpp
    index fa83d2e397..15d6f9e9b9 100644
    --- a/src/rpc/blockchain.cpp
    +++ b/src/rpc/blockchain.cpp
    @@ -94,6 +94,18 @@ double GetDifficulty(const CBlockIndex& blockindex)
         return dDiff;
     }
     
    +//! Return height of highest block that has been pruned, or -1 if no blocks have been pruned
    +static int GetPruneHeight(const Chainstate& active_chainstate) EXCLUSIVE_LOCKS_REQUIRED(!cs_main) {
    +    AssertLockHeld(cs_main);
    +
    +    const CBlockIndex& chain_tip{*CHECK_NONFATAL(active_chainstate.m_chain.Tip())};
    +    if (!(chain_tip.nStatus & BLOCK_HAVE_MASK)) return chain_tip.nHeight;
    +    const auto first_pruned{active_chainstate.m_blockman.GetFirstBlock(chain_tip, BLOCK_HAVE_MASK)->nHeight - 1};
    +
    +    // special case: genesis block is never pruned
    +    return first_pruned == 0 ? -1 : first_pruned;
    +}
    +
     static int ComputeNextBlockAndDepth(const CBlockIndex& tip, const CBlockIndex& blockindex, const CBlockIndex*& next)
     {
         next = tip.GetAncestor(blockindex.nHeight + 1);
    @@ -835,9 +847,7 @@ static RPCHelpMan pruneblockchain()
         }
     
         PruneBlockFilesManual(active_chainstate, height);
    -    const CBlockIndex& block{*CHECK_NONFATAL(active_chain.Tip())};
    -    const uint32_t has_undo{BLOCK_HAVE_DATA | BLOCK_HAVE_UNDO};
    -    return block.nStatus & BLOCK_HAVE_MASK ? active_chainstate.m_blockman.GetFirstStoredBlock(block, /*lower_block=*/nullptr, /*status_mask=*/has_undo)->nHeight - 1 : block.nHeight;
    +    return GetPruneHeight(active_chainstate);
     },
         };
     }
    @@ -1290,10 +1300,7 @@ RPCHelpMan getblockchaininfo()
         obj.pushKV("size_on_disk", chainman.m_blockman.CalculateCurrentUsage());
         obj.pushKV("pruned", chainman.m_blockman.IsPruneMode());
         if (chainman.m_blockman.IsPruneMode()) {
    -        bool has_tip_data = tip.nStatus & BLOCK_HAVE_DATA;
    -        const uint32_t has_undo{BLOCK_HAVE_DATA | BLOCK_HAVE_UNDO};
    -        obj.pushKV("pruneheight", has_tip_data ? chainman.m_blockman.GetFirstStoredBlock(tip, /*lower_block=*/nullptr, /*status_mask=*/has_undo)->nHeight : tip.nHeight + 1);
    -
    +        obj.pushKV("pruneheight", GetPruneHeight(active_chainstate) + 1);
             const bool automatic_pruning{chainman.m_blockman.GetPruneTarget() != BlockManager::PRUNE_TARGET_MANUAL};
             obj.pushKV("automatic_pruning",  automatic_pruning);
             if (automatic_pruning) {
    diff --git a/src/test/blockmanager_tests.cpp b/src/test/blockmanager_tests.cpp
    index d7ac0bf823..12f3257700 100644
    --- a/src/test/blockmanager_tests.cpp
    +++ b/src/test/blockmanager_tests.cpp
    @@ -2,6 +2,7 @@
     // Distributed under the MIT software license, see the accompanying
     // file COPYING or http://www.opensource.org/licenses/mit-license.php.
     
    +#include <chain.h>
     #include <chainparams.h>
     #include <clientversion.h>
     #include <node/blockstorage.h>
    @@ -113,7 +114,7 @@ BOOST_FIXTURE_TEST_CASE(blockmanager_block_data_availability, TestChain100Setup)
         };
     
         // 1) Return genesis block when all blocks are available
    -    BOOST_CHECK_EQUAL(blockman.GetFirstStoredBlock(tip), chainman->ActiveChain()[0]);
    +    BOOST_CHECK_EQUAL(blockman.GetFirstBlock(tip, BLOCK_HAVE_DATA), chainman->ActiveChain()[0]);
         BOOST_CHECK(blockman.CheckBlockDataAvailability(tip, *chainman->ActiveChain()[0]));
     
         // 2) Check lower_block when all blocks are available
    @@ -127,7 +128,7 @@ BOOST_FIXTURE_TEST_CASE(blockmanager_block_data_availability, TestChain100Setup)
         func_prune_blocks(last_pruned_block);
     
         // 3) The last block not pruned is in-between upper-block and the genesis block
    -    BOOST_CHECK_EQUAL(blockman.GetFirstStoredBlock(tip), first_available_block);
    +    BOOST_CHECK_EQUAL(blockman.GetFirstBlock(tip, BLOCK_HAVE_DATA), first_available_block);
         BOOST_CHECK(blockman.CheckBlockDataAvailability(tip, *first_available_block));
         BOOST_CHECK(!blockman.CheckBlockDataAvailability(tip, *last_pruned_block));
     }
    
    

    </details>

  26. fjahr force-pushed on Apr 28, 2024
  27. fjahr commented at 7:34 PM on April 28, 2024: contributor

    @stickies-v Thank you, I mostly took your suggestions with some minor tweaks and made you co-author of the refactoring commit. I was a bit torn about letting the caller handle the Genesis block issue but since this was actually the previous behavior I guess it's better to not change it if you like it the way it is. I have adapted GetPruneHeight to return std::optional which I found a good fit and avoids the -1 return value. And I think the change in init.cpp isn't needed here (yet). But I think we will need it in #29770 later.

    EDIT: I had forgotten that pruneblockchain returns -1 when nothing is pruned, but I still think handling it internally with std::optional is nicer.

  28. DrahtBot added the label CI failed on Apr 28, 2024
  29. fjahr force-pushed on Apr 28, 2024
  30. fjahr force-pushed on Apr 28, 2024
  31. fjahr force-pushed on Apr 28, 2024
  32. DrahtBot removed the label CI failed on Apr 28, 2024
  33. in src/rpc/blockchain.cpp:798 in edbb6a3eb1 outdated
     792 | +    if (!(chain_tip.nStatus & BLOCK_HAVE_MASK)) return chain_tip.nHeight;
     793 | +    const auto first_block{active_chainstate.m_blockman.GetFirstBlock(chain_tip, /*status_mask=*/BLOCK_HAVE_MASK)};
     794 | +
     795 | +    // Result can never be 0 because genesis block is never pruned, so the
     796 | +    // result 1 means that no block has been pruned.
     797 | +    if (first_block->nHeight == 1) {
    


    stickies-v commented at 11:51 AM on April 29, 2024:

    Could add a CHECK_NONFATAL(first_block->nHeight > 0); here to put that assumption in code, that would've helped with the segfault in current HEAD


    fjahr commented at 4:25 PM on April 29, 2024:

    done

  34. in src/rpc/blockchain.cpp:795 in edbb6a3eb1 outdated
     790 | +    const CBlockIndex& chain_tip{*CHECK_NONFATAL(active_chainstate.m_chain.Tip())};
     791 | +    // Check for both data and undo data
     792 | +    if (!(chain_tip.nStatus & BLOCK_HAVE_MASK)) return chain_tip.nHeight;
     793 | +    const auto first_block{active_chainstate.m_blockman.GetFirstBlock(chain_tip, /*status_mask=*/BLOCK_HAVE_MASK)};
     794 | +
     795 | +    // Result can never be 0 because genesis block is never pruned, so the
    


    stickies-v commented at 11:52 AM on April 29, 2024:

    This block needs to be in the next commit - we're not passing BLOCK_HAVE_MASK yet in dd7696b97f20d756df1afa523a8f9e7bf3924c7d. This should fix the failing CI


    fjahr commented at 4:25 PM on April 29, 2024:

    done

  35. in src/rpc/blockchain.cpp:801 in edbb6a3eb1 outdated
     796 | +    // result 1 means that no block has been pruned.
     797 | +    if (first_block->nHeight == 1) {
     798 | +        return std::nullopt;
     799 | +    }
     800 | +    // The block before the block with all data is the last that was pruned
     801 | +    return first_block->pprev->nHeight;
    


    stickies-v commented at 11:59 AM on April 29, 2024:

    GetFirstBlock() does not provide any guarantees that pprev is not nullptr (nor should it), so I think we need to assert that here


    fjahr commented at 4:25 PM on April 29, 2024:

    done


    stickies-v commented at 4:10 PM on June 3, 2024:

    I don't think this is fully addressed in 0501ec102f3740049bd2e891cd83527c31b29222 - would update to return Assert(first_unpruned.pprev)->nHeight;?


    fjahr commented at 4:19 PM on June 22, 2024:

    Not sure why this is still needed when we return above if !first_unpruned.pprev but I've added it since it won't hurt.

  36. in src/rpc/blockchain.cpp:786 in edbb6a3eb1 outdated
     782 | @@ -782,6 +783,24 @@ static RPCHelpMan getblock()
     783 |      };
     784 |  }
     785 |  
     786 | +//! Return height of highest block that has been pruned
    


    stickies-v commented at 12:02 PM on April 29, 2024:
    //! Return height of highest block that has been pruned, or std::nullopt if no blocks have been pruned
    

    fjahr commented at 4:25 PM on April 29, 2024:

    added

  37. in src/rpc/blockchain.cpp:857 in edbb6a3eb1 outdated
     852 | @@ -834,8 +853,8 @@ static RPCHelpMan pruneblockchain()
     853 |      }
     854 |  
     855 |      PruneBlockFilesManual(active_chainstate, height);
     856 | -    const CBlockIndex& block{*CHECK_NONFATAL(active_chain.Tip())};
     857 | -    return block.nStatus & BLOCK_HAVE_DATA ? active_chainstate.m_blockman.GetFirstStoredBlock(block)->nHeight - 1 : block.nHeight;
     858 | +    const auto prune_height{GetPruneHeight(active_chainstate)};
     859 | +    return prune_height ? prune_height.value() : -1;
    


    stickies-v commented at 12:07 PM on April 29, 2024:

    nit

        return GetPruneHeight(active_chainstate).value_or(-1);
    

    fjahr commented at 4:26 PM on April 29, 2024:

    taken

  38. in src/rpc/blockchain.cpp:1318 in edbb6a3eb1 outdated
    1306 | @@ -1288,8 +1307,8 @@ RPCHelpMan getblockchaininfo()
    1307 |      obj.pushKV("size_on_disk", chainman.m_blockman.CalculateCurrentUsage());
    1308 |      obj.pushKV("pruned", chainman.m_blockman.IsPruneMode());
    1309 |      if (chainman.m_blockman.IsPruneMode()) {
    1310 | -        bool has_tip_data = tip.nStatus & BLOCK_HAVE_DATA;
    1311 | -        obj.pushKV("pruneheight", has_tip_data ? chainman.m_blockman.GetFirstStoredBlock(tip)->nHeight : tip.nHeight + 1);
    1312 | +        const auto prune_height{GetPruneHeight(active_chainstate)};
    1313 | +        obj.pushKV("pruneheight", prune_height ? prune_height.value() + 1 : 0);
    


    stickies-v commented at 12:14 PM on April 29, 2024:

    nit: more consistent with the (suggested) approach in pruneblockchain, but then "plus one (only present if pruning is enabled)" as per the docs in this RPC

            const auto prune_height{GetPruneHeight(active_chainstate).value_or(-1)};
            obj.pushKV("pruneheight", prune_height + 1);
    

    fjahr commented at 4:26 PM on April 29, 2024:

    I prefer to keep this one as is because I find it more intuitive.

  39. stickies-v commented at 12:18 PM on April 29, 2024: contributor

    nit: I think the last 3 commits should be squashed

  40. in src/node/blockstorage.h:359 in dd7696b97f outdated
     353 | +     *                    the search (inclusive). If `nullptr`, the search includes up to the genesis
     354 | +     *                    block.
     355 | +     * @return A (non-null) pointer to the `CBlockIndex` of the furthest away ancestor (including
     356 | +     *         `upper_block` itself) that matches the `status_mask`, up to and including
     357 | +     *         `lower_block` if it is part of the ancestry.
     358 | +     */
    


    ryanofsky commented at 4:00 PM on April 29, 2024:

    In commit "refactor, blockstorage: Generalize GetFirstStoredBlock" (dd7696b97f20d756df1afa523a8f9e7bf3924c7d)

    I think there are a few issues with this comment:

    • The @<!-- -->brief and @<!-- -->return sections make it sound like it is returning the earliest block matching the flags, when actually it is returning the earliest block matching the flags after the latest block not matching them. This is different because it means not only does the returned block have the flags, but all blocks between the returned block and upper block have them.
    • One precondition is specified but one is missing (this also asserts false if the lower block is not an ancestor of the upper block).
    • I don't like how some pretty important details are written in parentheses. Better to use parentheses for more optional explanatory information.

    Suggestion to fix these things:

        /**
         * [@brief](/bitcoin-bitcoin/contributor/brief/) Returns the earliest block with specified `status_mask` flags set after
         * the latest block _not_ having those flags.
         *
         * This function starts from `upper_block`, which must have all
         * `status_mask` flags set, and iterates backwards through its ancestors. It
         * continues as long as each block has all `status_mask` flags set, until
         * reaching the oldest ancestor or `lower_block`.
         * 
         * [@pre](/bitcoin-bitcoin/contributor/pre/) `upper_block` must have all `status_mask` flags set.
         * [@pre](/bitcoin-bitcoin/contributor/pre/) `lower_block` must be null or an ancestor of `upper_block`
         *
         * [@param](/bitcoin-bitcoin/contributor/param/) upper_block The starting block for the search, which must have all
         *                    `status_mask` flags set.
         * [@param](/bitcoin-bitcoin/contributor/param/) status_mask Bitmask specifying required status flags.
         * [@param](/bitcoin-bitcoin/contributor/param/) lower_block The earliest possible block to return. If null, the
         *                    search can extend to the genesis block.
         *
         * [@return](/bitcoin-bitcoin/contributor/return/) A non-null pointer to the earliest block between `upper_block`
         *         and `lower_block`, inclusive, such that every block between the
         *         returned block and `upper_block` has `status_mask` flags set.
         */
    

    fjahr commented at 6:57 PM on May 30, 2024:

    Taken, thanks!

  41. fjahr force-pushed on Apr 29, 2024
  42. fjahr commented at 4:27 PM on April 29, 2024: contributor

    Adressed feedback from @stickies-v , thanks again!

    nit: I think the last 3 commits should be squashed

    The doc change is only tangentially related so I will keep it separate. But I squashed the 2nd and 3rd.

  43. in src/rpc/blockchain.cpp:797 in 4e62129d8a outdated
     792 | +
     793 | +    if (first_block->nHeight == 0) {
     794 | +        return std::nullopt;
     795 | +    }
     796 | +    // The block before the block with all data is the last that was pruned
     797 | +    return Assert(first_block->pprev)->nHeight;
    


    ryanofsky commented at 4:44 PM on April 29, 2024:

    In commit "refactor, blockstorage: Generalize GetFirstStoredBlock" (4e62129d8aed839c83fa4a595aab06ed5ca7eeb6)

    IMO, this logic is a little confusing because it using both nHeight and pprev instead of just using pprev consistently. Also I think the cases here could have more explanation.

    // Get first block with data, after the last block without data.
    // This is the start of the unpruned range of blocks.
    const auto first_unpruned{active_chainstate.m_blockman.GetFirstBlock(chain_tip, BLOCK_HAVE_DATA)};
    if (!first_unpruned->prev) {
       // No block before the first unpruned block means nothing is pruned.
       return nullopt;
    }
    // Block before the first unpruned block is the last pruned block.
    return first_unpruned->prev->nHeight;
    

    fjahr commented at 6:57 PM on May 30, 2024:

    Taken, thanks!

  44. in src/rpc/blockchain.cpp:791 in 61f352d6bc outdated
     786 | @@ -787,10 +787,14 @@ static std::optional<int> GetPruneHeight(const Chainstate& active_chainstate) EX
     787 |      AssertLockHeld(::cs_main);
     788 |  
     789 |      const CBlockIndex& chain_tip{*CHECK_NONFATAL(active_chainstate.m_chain.Tip())};
     790 | -    if (!(chain_tip.nStatus & BLOCK_HAVE_DATA)) return chain_tip.nHeight;
     791 | -    const auto first_block{active_chainstate.m_blockman.GetFirstBlock(chain_tip, BLOCK_HAVE_DATA)};
     792 | -
     793 | -    if (first_block->nHeight == 0) {
     794 | +    // Check for both data and undo data
     795 | +    if (!(chain_tip.nStatus & BLOCK_HAVE_MASK)) return chain_tip.nHeight;
    


    ryanofsky commented at 5:30 PM on April 29, 2024:

    In commit "rpc: Make pruneheight also reflect undo data presence" (61f352d6bcf485862815fdd68064cc07ea8ab6ce)

    This doesn't seem right. If chain consists of just genesis block, and genesis block doesn't have undo data this will return height 0 as if genesis block were pruned.


    fjahr commented at 6:59 PM on May 30, 2024:

    Addressed in the other comment on this commit

  45. in src/rpc/blockchain.cpp:786 in 4e62129d8a outdated
     781 | @@ -782,6 +782,21 @@ static RPCHelpMan getblock()
     782 |      };
     783 |  }
     784 |  
     785 | +//! Return height of highest block that has been pruned, or std::nullopt if no blocks have been pruned
     786 | +static std::optional<int> GetPruneHeight(const Chainstate& active_chainstate) EXCLUSIVE_LOCKS_REQUIRED(::cs_main) {
    


    ryanofsky commented at 5:41 PM on April 29, 2024:

    In commit "refactor, blockstorage: Generalize GetFirstStoredBlock" (4e62129d8aed839c83fa4a595aab06ed5ca7eeb6)

    I don't think it makes sense to call this parameter active_chainstate. It is true the active chainstate is always passed, but this function should care whether the chainstate is active or not. Probably also it would be better for this function not to access the Chainstate class at all and instead be passed separate BlockMan and CChain parameters.


    fjahr commented at 6:57 PM on May 30, 2024:

    Ok, passing CChain and BlockMan as parameters now.

  46. in src/rpc/blockchain.cpp:794 in 61f352d6bc outdated
     793 | -    if (first_block->nHeight == 0) {
     794 | +    // Check for both data and undo data
     795 | +    if (!(chain_tip.nStatus & BLOCK_HAVE_MASK)) return chain_tip.nHeight;
     796 | +    const auto first_block{active_chainstate.m_blockman.GetFirstBlock(chain_tip, /*status_mask=*/BLOCK_HAVE_MASK)};
     797 | +
     798 | +    // Result can never be 0 because genesis block is never pruned, so the
    


    ryanofsky commented at 6:15 PM on April 29, 2024:

    In commit "rpc: Make pruneheight also reflect undo data presence" (61f352d6bcf485862815fdd68064cc07ea8ab6ce)

    This comment "Result can never be 0 because genesis block is never pruned" is misleading and basically describes the opposite of what is happening. If the genesis block was not pruned you would expect GetFirstBlock to return the genesis block, because it has data, and for the result to still be 0 here like it was in the previous commit. The reason GetFirstBlock is no longer returning the genesis block is because the genesis block does not have undo data, so it is now being considered pruned.

    So code works for a different reason than is being described. I think it would probably clearer to ignore genesis block flags rather than making this code depend on that that genesis undo flag won't be set.

    Would suggest something more like the following (untested):

    //! Return height of highest block on the chain that has been pruned, or std::nullopt if no blocks have been pruned
    static std::optional<int> GetPruneHeight(BlockManager& blockman, const CChain& chain) EXCLUSIVE_LOCKS_REQUIRED(::cs_main) {
        AssertLockHeld(::cs_main);
    
        // Search for the last block missing block data or undo data. Don't let the
        // search consider the genesis block, because the genesis block does not
        // have undo data, but should not be considered pruned.
        const CBlockIndex* first_block{chain[1]};
        const CBlockIndex* chain_tip{chain.Tip()};
    
        // If there are no blocks after the genesis block, or no blocks at all, nothing is pruned.
        if (!first_block || !chain_tip) return std::nullopt;
    
        // If the chain tip is pruned, everything is pruned.
        if (!(chain_tip->nStatus & BLOCK_HAVE_MASK)) return chain_tip->nHeight;
    
        const auto& first_unpruned{*Assert(blockman.GetFirstBlock(*chain_tip, /*status_mask=*/BLOCK_HAVE_MASK, first_block))};
        if (&first_unpruned == first_block) {
            // All blocks between first_block and chain_tip have data, so nothing is pruned.
            return std::nullopt;
        }
    
        // Block before the first unpruned block is the last pruned block.
        return Assert(first_unpruned.pprev)->nHeight;
    }
    

    fjahr commented at 6:59 PM on May 30, 2024:

    I have adopted these changes with minor edits and that addresses your previous comment about the genesis behavior as well.

  47. ryanofsky commented at 6:21 PM on April 29, 2024: contributor

    Code review 697d0a23f5d0a75829ffe1ac874eaf15fff3bfbe. Approach seems good and first commit seems ok in its current form but I think version of GetPruneHeight in the second commit is not handling the genesis block correctly.

  48. fjahr force-pushed on May 30, 2024
  49. fjahr commented at 7:07 PM on May 30, 2024: contributor

    Addressed comments from @ryanofsky , thank you for the thoughtful review!

  50. in src/rpc/blockchain.cpp:786 in dca1ca1f56 outdated
     781 | @@ -782,6 +782,24 @@ static RPCHelpMan getblock()
     782 |      };
     783 |  }
     784 |  
     785 | +//! Return height of highest block that has been pruned, or std::nullopt if no blocks have been pruned
     786 | +static std::optional<int> GetPruneHeight(BlockManager& blockman, const CChain& chain) EXCLUSIVE_LOCKS_REQUIRED(::cs_main) {
    


    stickies-v commented at 1:52 PM on June 3, 2024:

    In dca1ca1f56d8e63929eb7aea64f8fb3e01752357: I think BlockManager::GetFirstBlock() can be made const, which allows us to pass a const BlockManager& blockman here:

    <details> <summary>git diff on dca1ca1f56</summary>

    diff --git a/src/node/blockstorage.cpp b/src/node/blockstorage.cpp
    index 18eb2dc1aa..6dbc111eb6 100644
    --- a/src/node/blockstorage.cpp
    +++ b/src/node/blockstorage.cpp
    @@ -595,7 +595,7 @@ bool BlockManager::IsBlockPruned(const CBlockIndex& block)
         return m_have_pruned && !(block.nStatus & BLOCK_HAVE_DATA) && (block.nTx > 0);
     }
     
    -const CBlockIndex* BlockManager::GetFirstBlock(const CBlockIndex& upper_block, uint32_t status_mask, const CBlockIndex* lower_block)
    +const CBlockIndex* BlockManager::GetFirstBlock(const CBlockIndex& upper_block, uint32_t status_mask, const CBlockIndex* lower_block) const
     {
         AssertLockHeld(::cs_main);
         const CBlockIndex* last_block = &upper_block;
    diff --git a/src/node/blockstorage.h b/src/node/blockstorage.h
    index 917d75232d..d1e46fde2b 100644
    --- a/src/node/blockstorage.h
    +++ b/src/node/blockstorage.h
    @@ -361,7 +361,7 @@ public:
             const CBlockIndex& upper_block LIFETIMEBOUND,
             uint32_t status_mask,
             const CBlockIndex* lower_block = nullptr
    -    ) EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
    +    ) const EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
     
         /** True if any block files have ever been pruned. */
         bool m_have_pruned = false;
    diff --git a/src/rpc/blockchain.cpp b/src/rpc/blockchain.cpp
    index 90cee9901d..b99abe545d 100644
    --- a/src/rpc/blockchain.cpp
    +++ b/src/rpc/blockchain.cpp
    @@ -783,7 +783,7 @@ static RPCHelpMan getblock()
     }
     
     //! Return height of highest block that has been pruned, or std::nullopt if no blocks have been pruned
    -static std::optional<int> GetPruneHeight(BlockManager& blockman, const CChain& chain) EXCLUSIVE_LOCKS_REQUIRED(::cs_main) {
    +static std::optional<int> GetPruneHeight(const BlockManager& blockman, const CChain& chain) EXCLUSIVE_LOCKS_REQUIRED(::cs_main) {
         AssertLockHeld(::cs_main);
     
         const CBlockIndex* chain_tip{chain.Tip()};
    
    

    </details>


    fjahr commented at 4:21 PM on June 22, 2024:

    done

  51. in src/rpc/blockchain.cpp:800 in 0501ec102f outdated
     795 | +
     796 | +    // If there are no blocks after the genesis block, or no blocks at all, nothing is pruned.
     797 | +    if (!first_block || !chain_tip) return std::nullopt;
     798 | +
     799 | +    // If the chain tip is pruned, everything is pruned.
     800 | +    if (!(chain_tip->nStatus & BLOCK_HAVE_MASK)) return chain_tip->nHeight;
    


    stickies-v commented at 3:42 PM on June 3, 2024:

    I think this is buggy, chain tip should be considered pruned if any of the BLOCK_HAVE_MASK flags are missing:

        if (!((chain_tip->nStatus & BLOCK_HAVE_MASK) == BLOCK_HAVE_MASK)) return chain_tip->nHeight;
    

    Can be verified by adding unit test to the first commit, which passes on dca1ca1f56 and fails on current HEAD for CheckGetPruneHeight(blockman, chain, 100):

    $ ./src/test/test_bitcoin --run_test=blockchain_tests/get_prune_height
    ...
    Running 1 test case...
    Assertion failed: ((last_block->nStatus & status_mask) == status_mask), function GetFirstBlock, file blockstorage.cpp, line 602.
    unknown location:0: fatal error: in "blockchain_tests/get_prune_height": signal: SIGABRT (application abort requested)
    test/blockchain_tests.cpp:93: last checkpoint
    

    <details> <summary>git diff on dca1ca1f56</summary>

    diff --git a/src/rpc/blockchain.cpp b/src/rpc/blockchain.cpp
    index 90cee9901d..b7c1bc9673 100644
    --- a/src/rpc/blockchain.cpp
    +++ b/src/rpc/blockchain.cpp
    @@ -783,7 +783,7 @@ static RPCHelpMan getblock()
     }
     
     //! Return height of highest block that has been pruned, or std::nullopt if no blocks have been pruned
    -static std::optional<int> GetPruneHeight(BlockManager& blockman, const CChain& chain) EXCLUSIVE_LOCKS_REQUIRED(::cs_main) {
    +std::optional<int> GetPruneHeight(BlockManager& blockman, const CChain& chain) {
         AssertLockHeld(::cs_main);
     
         const CBlockIndex* chain_tip{chain.Tip()};
    diff --git a/src/rpc/blockchain.h b/src/rpc/blockchain.h
    index c2021c3608..d9bf627df5 100644
    --- a/src/rpc/blockchain.h
    +++ b/src/rpc/blockchain.h
    @@ -21,6 +21,7 @@ class CBlockIndex;
     class Chainstate;
     class UniValue;
     namespace node {
    +class BlockManager;
     struct NodeContext;
     } // namespace node
     
    @@ -57,4 +58,7 @@ UniValue CreateUTXOSnapshot(
         const fs::path& path,
         const fs::path& tmppath);
     
    +//! Return height of highest block that has been pruned, or std::nullopt if no blocks have been pruned
    +std::optional<int> GetPruneHeight(node::BlockManager& blockman, const CChain& chain) EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
    +
     #endif // BITCOIN_RPC_BLOCKCHAIN_H
    diff --git a/src/test/blockchain_tests.cpp b/src/test/blockchain_tests.cpp
    index be515a9eac..2a63d09795 100644
    --- a/src/test/blockchain_tests.cpp
    +++ b/src/test/blockchain_tests.cpp
    @@ -5,7 +5,9 @@
     #include <boost/test/unit_test.hpp>
     
     #include <chain.h>
    +#include <node/blockstorage.h>
     #include <rpc/blockchain.h>
    +#include <sync.h>
     #include <test/util/setup_common.h>
     #include <util/string.h>
     
    @@ -74,4 +76,36 @@ BOOST_AUTO_TEST_CASE(get_difficulty_for_very_high_target)
         TestDifficulty(0x12345678, 5913134931067755359633408.0);
     }
     
    +//! Prune chain from height down to genesis block and check that
    +//! GetPruneHeight returns the correct value
    +static void CheckGetPruneHeight(node::BlockManager& blockman, CChain& chain, int height) EXCLUSIVE_LOCKS_REQUIRED(::cs_main)
    +{
    +    AssertLockHeld(::cs_main);
    +
    +    // Emulate pruning all blocks from `height` down to the genesis block
    +    // by unsetting the `BLOCK_HAVE_DATA` flag from `nStatus`
    +    for (CBlockIndex* it{chain[height]}; it != nullptr && it->nHeight > 0; it = it->pprev) {
    +        it->nStatus &= ~BLOCK_HAVE_DATA;
    +    }
    +
    +    const auto prune_height{GetPruneHeight(blockman, chain)};
    +    BOOST_REQUIRE(prune_height.has_value());
    +    BOOST_CHECK_EQUAL(*prune_height, height);
    +}
    +
    +BOOST_FIXTURE_TEST_CASE(get_prune_height, TestChain100Setup)
    +{
    +    LOCK(::cs_main);
    +    auto& chain = m_node.chainman->ActiveChain();
    +    auto& blockman = m_node.chainman->m_blockman;
    +
    +    // Fresh chain of 100 blocks without any pruned blocks, so std::nullopt should be returned
    +    BOOST_CHECK(!GetPruneHeight(blockman, chain).has_value());
    +
    +    // Start pruning
    +    CheckGetPruneHeight(blockman, chain, 1);
    +    CheckGetPruneHeight(blockman, chain, 99);
    +    CheckGetPruneHeight(blockman, chain, 100);
    +}
    +
     BOOST_AUTO_TEST_SUITE_END()
    
    

    </details>


    ryanofsky commented at 4:48 PM on June 3, 2024:

    re: #29668 (review)

    I think this is buggy, chain tip should be considered pruned if any of the BLOCK_HAVE_MASK flags are missing:

    Nice catch! Would be great to add the test to the first commit.


    fjahr commented at 4:20 PM on June 22, 2024:

    Cool, thanks! Added the fix and the test.

  52. in test/functional/feature_pruning.py:510 in 0501ec102f outdated
     505 | +        fetch_block = node.getblockhash(pruneheight - 1)
     506 | +
     507 | +        self.connect_nodes(1, 2)
     508 | +        peers = node.getpeerinfo()
     509 | +        peer_id = peers[0]["id"]
     510 | +        node.getblockfrompeer(fetch_block, peer_id)
    


    stickies-v commented at 4:20 PM on June 3, 2024:

    nit: peer_id makes things less readable imo

            node.getblockfrompeer(fetch_block, peers[0]["id"])
    

    fjahr commented at 4:19 PM on June 22, 2024:

    done

  53. in src/rpc/blockchain.cpp:786 in dca1ca1f56 outdated
     781 | @@ -782,6 +782,24 @@ static RPCHelpMan getblock()
     782 |      };
     783 |  }
     784 |  
     785 | +//! Return height of highest block that has been pruned, or std::nullopt if no blocks have been pruned
    


    ryanofsky commented at 4:42 PM on June 3, 2024:

    In commit "refactor, blockstorage: Generalize GetFirstStoredBlock" (dca1ca1f56d8e63929eb7aea64f8fb3e01752357)

    I think the code would be simpler (and have fewer cases to think about) if this function just returned -1 when no blocks were pruned, instead of nullopt:

    --- a/src/rpc/blockchain.cpp
    +++ b/src/rpc/blockchain.cpp
    @@ -782,8 +782,8 @@ static RPCHelpMan getblock()
         };
     }
     
    -//! Return height of highest block that has been pruned, or std::nullopt if no blocks have been pruned
    -static std::optional<int> GetPruneHeight(BlockManager& blockman, const CChain& chain) EXCLUSIVE_LOCKS_REQUIRED(::cs_main) {
    +//! Return height of highest block that has been pruned, or -1 if no blocks have been pruned
    +int GetPruneHeight(BlockManager& blockman, const CChain& chain) EXCLUSIVE_LOCKS_REQUIRED(::cs_main) {
         AssertLockHeld(::cs_main);
     
         const CBlockIndex* chain_tip{chain.Tip()};
    @@ -794,7 +794,7 @@ static std::optional<int> GetPruneHeight(BlockManager& blockman, const CChain& c
         const auto& first_unpruned{*Assert(blockman.GetFirstBlock(*chain_tip, /*status_mask=*/BLOCK_HAVE_DATA))};
         if (!first_unpruned.pprev) {
            // No block before the first unpruned block means nothing is pruned.
    -       return std::nullopt;
    +       return -1;
         }
         // Block before the first unpruned block is the last pruned block.
         return first_unpruned.pprev->nHeight;
    @@ -852,7 +852,7 @@ static RPCHelpMan pruneblockchain()
         }
     
         PruneBlockFilesManual(active_chainstate, height);
    -    return GetPruneHeight(chainman.m_blockman, active_chain).value_or(-1);
    +    return GetPruneHeight(chainman.m_blockman, active_chain);
     },
         };
     }
    @@ -1305,8 +1305,7 @@ RPCHelpMan getblockchaininfo()
         obj.pushKV("size_on_disk", chainman.m_blockman.CalculateCurrentUsage());
         obj.pushKV("pruned", chainman.m_blockman.IsPruneMode());
         if (chainman.m_blockman.IsPruneMode()) {
    -        const auto prune_height{GetPruneHeight(chainman.m_blockman, active_chainstate.m_chain)};
    -        obj.pushKV("pruneheight", prune_height ? prune_height.value() + 1 : 0);
    +        obj.pushKV("pruneheight", GetPruneHeight(chainman.m_blockman, active_chainstate.m_chain) + 1);
     
             const bool automatic_pruning{chainman.m_blockman.GetPruneTarget() != BlockManager::PRUNE_TARGET_MANUAL};
             obj.pushKV("automatic_pruning",  automatic_pruning);
    

    fjahr commented at 4:20 PM on June 22, 2024:

    I disagree here. I think this is the correct use of an optional when we may or may not have a value to return. And I think it's easier to understand because the code is more expressive. Handing a magical -1 around is just not intuitive IMO.

  54. ryanofsky approved
  55. ryanofsky commented at 4:54 PM on June 3, 2024: contributor

    Code review 0501ec102f3740049bd2e891cd83527c31b29222. Code looks very good, but the nStatus & BLOCK_HAVE_MASK bug found by stickies-v #29668 (review) needs to be fixed, and the other suggested changes could be considered too

  56. refactor, blockstorage: Generalize GetFirstStoredBlock
    GetFirstStoredBlock is generalized to check for any data status with a
    status mask that needs to be passed as a parameter. To reflect this the
    function is also renamed to GetFirstBlock.
    
    Co-authored-by: stickies-v <stickies-v@protonmail.com>
    96b4facc91
  57. fjahr force-pushed on Jun 22, 2024
  58. fjahr commented at 4:23 PM on June 22, 2024: contributor

    Addressed feedback from @stickies-v and @ryanofsky, thank you!

  59. DrahtBot commented at 5:37 PM on June 22, 2024: contributor

    <!--85328a0da195eb286784d51f73fa0af9-->

    🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the documentation.

    Possibly this is due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    Leave a comment here, if you need help tracking down a confusing failure.

    <sub>Debug: https://github.com/bitcoin/bitcoin/runs/26553131801</sub>

  60. DrahtBot added the label CI failed on Jun 22, 2024
  61. rpc: Make pruneheight also reflect undo data presence 4a1975008b
  62. doc: Add note to getblockfrompeer on missing undo data 8789dc8f31
  63. fjahr force-pushed on Jun 22, 2024
  64. DrahtBot removed the label CI failed on Jun 22, 2024
  65. in src/rpc/blockchain.cpp:432 in 8789dc8f31
     428 | @@ -429,6 +429,7 @@ static RPCHelpMan getblockfrompeer()
     429 |          "getblockfrompeer",
     430 |          "Attempt to fetch block from a given peer.\n\n"
     431 |          "We must have the header for this block, e.g. using submitheader.\n"
     432 | +        "The block will not have any undo data which can limit the usage of the block data in a context where the undo data is needed.\n"
    


    furszy commented at 4:48 PM on June 23, 2024:

    q: is the "undo data" term well known outside of core? It seems to be an implementation detail to me. Maybe, could write this differently: "The block data will be stored alone; there will be no available information associated with the outputs this block spent. This limits the usage.. etc".


    fjahr commented at 10:19 PM on June 23, 2024:

    Hm, not sure how much it is known but it is referenced in other places already, like the getblock RPC help. And I'm not sure if a more generic text helps more people because it's a quite technical reason and I don't know how to explain it in a way that works for everyone and is still concise. With the term undo data people at least have something they can search on Bitcoin SE etc.


    furszy commented at 1:31 PM on June 24, 2024:

    I don't know how to explain it in a way that works for everyone and is still concise

    No problem on leaving it as is if there is other RPC command using this term. What I wrote seems generally useful to me; it means that the node might not be able to access the spent outputs script and value. Maybe adding few scenarios where the undo data is needed could clarify it usage.

  66. in src/rpc/blockchain.cpp:1318 in 8789dc8f31
    1313 | @@ -1288,8 +1314,8 @@ RPCHelpMan getblockchaininfo()
    1314 |      obj.pushKV("size_on_disk", chainman.m_blockman.CalculateCurrentUsage());
    1315 |      obj.pushKV("pruned", chainman.m_blockman.IsPruneMode());
    1316 |      if (chainman.m_blockman.IsPruneMode()) {
    1317 | -        bool has_tip_data = tip.nStatus & BLOCK_HAVE_DATA;
    1318 | -        obj.pushKV("pruneheight", has_tip_data ? chainman.m_blockman.GetFirstStoredBlock(tip)->nHeight : tip.nHeight + 1);
    1319 | +        const auto prune_height{GetPruneHeight(chainman.m_blockman, active_chainstate.m_chain)};
    1320 | +        obj.pushKV("pruneheight", prune_height ? prune_height.value() + 1 : 0);
    


    furszy commented at 1:55 PM on June 24, 2024:

    little topic: maybe changing the RPC docs to say that "pruneheight" returns "the first block unpruned, all previous blocks were pruned" is more useful than saying "height of the last block pruned, plus one" as is now.


    fjahr commented at 4:02 PM on July 12, 2024:

    sounds good, taken as you suggested in #29770

  67. in src/rpc/blockchain.h:62 in 8789dc8f31
      57 | @@ -57,4 +58,7 @@ UniValue CreateUTXOSnapshot(
      58 |      const fs::path& path,
      59 |      const fs::path& tmppath);
      60 |  
      61 | +//! Return height of highest block that has been pruned, or std::nullopt if no blocks have been pruned
      62 | +std::optional<int> GetPruneHeight(const node::BlockManager& blockman, const CChain& chain) EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
    


    stickies-v commented at 2:27 PM on June 24, 2024:

    nit: missing some includes / fwd decls for the newly added line:

    <details> <summary>git diff on 8789dc8f31</summary>

    diff --git a/src/rpc/blockchain.h b/src/rpc/blockchain.h
    index f6a7fe236c..23909c334e 100644
    --- a/src/rpc/blockchain.h
    +++ b/src/rpc/blockchain.h
    @@ -9,15 +9,18 @@
     #include <core_io.h>
     #include <streams.h>
     #include <sync.h>
    +#include <threadsafety.h>
     #include <util/fs.h>
     #include <validation.h>
     
     #include <any>
    +#include <optional>
     #include <stdint.h>
     #include <vector>
     
     class CBlock;
     class CBlockIndex;
    +class CChain;
     class Chainstate;
     class UniValue;
     namespace node {
    
    

    </details>


    fjahr commented at 4:02 PM on July 12, 2024:

    Done in #29770

  68. in src/test/blockchain_tests.cpp:81 in 8789dc8f31
      75 | @@ -74,4 +76,36 @@ BOOST_AUTO_TEST_CASE(get_difficulty_for_very_high_target)
      76 |      TestDifficulty(0x12345678, 5913134931067755359633408.0);
      77 |  }
      78 |  
      79 | +//! Prune chain from height down to genesis block and check that
      80 | +//! GetPruneHeight returns the correct value
      81 | +static void CheckGetPruneHeight(node::BlockManager& blockman, CChain& chain, int height) EXCLUSIVE_LOCKS_REQUIRED(::cs_main)
    


    stickies-v commented at 2:37 PM on June 24, 2024:

    nit

    static void CheckGetPruneHeight(const node::BlockManager& blockman, const CChain& chain, int height) EXCLUSIVE_LOCKS_REQUIRED(::cs_main)
    

    and related:

    <details> <summary>git diff on 8789dc8f31</summary>

    diff --git a/src/test/blockchain_tests.cpp b/src/test/blockchain_tests.cpp
    index 2a63d09795..7a5d175c8c 100644
    --- a/src/test/blockchain_tests.cpp
    +++ b/src/test/blockchain_tests.cpp
    @@ -96,8 +96,8 @@ static void CheckGetPruneHeight(node::BlockManager& blockman, CChain& chain, int
     BOOST_FIXTURE_TEST_CASE(get_prune_height, TestChain100Setup)
     {
         LOCK(::cs_main);
    -    auto& chain = m_node.chainman->ActiveChain();
    -    auto& blockman = m_node.chainman->m_blockman;
    +    const auto& chain = m_node.chainman->ActiveChain();
    +    const auto& blockman = m_node.chainman->m_blockman;
     
         // Fresh chain of 100 blocks without any pruned blocks, so std::nullopt should be returned
         BOOST_CHECK(!GetPruneHeight(blockman, chain).has_value());
    
    

    </details>


    fjahr commented at 4:02 PM on July 12, 2024:

    Done in #29770

  69. in src/rpc/blockchain.cpp:802 in 4a1975008b outdated
     804 | +
     805 | +    // If the chain tip is pruned, everything is pruned.
     806 | +    if (!((chain_tip->nStatus & BLOCK_HAVE_MASK) == BLOCK_HAVE_MASK)) return chain_tip->nHeight;
     807 | +
     808 | +    const auto& first_unpruned{*Assert(blockman.GetFirstBlock(*chain_tip, /*status_mask=*/BLOCK_HAVE_MASK, first_block))};
     809 | +    if (&first_unpruned == first_block) {
    


    furszy commented at 2:38 PM on June 24, 2024:

    Can remove the back and forth.

        const CBlockIndex* first_unpruned{Assert(blockman.GetFirstBlock(*chain_tip, /*status_mask=*/BLOCK_HAVE_MASK, first_block))};
        if (first_unpruned == first_block) {
    

    fjahr commented at 4:02 PM on July 12, 2024:

    done in #29770

  70. in src/rpc/blockchain.cpp:799 in 4a1975008b outdated
     801 | -       return std::nullopt;
     802 | +    // If there are no blocks after the genesis block, or no blocks at all, nothing is pruned.
     803 | +    if (!first_block || !chain_tip) return std::nullopt;
     804 | +
     805 | +    // If the chain tip is pruned, everything is pruned.
     806 | +    if (!((chain_tip->nStatus & BLOCK_HAVE_MASK) == BLOCK_HAVE_MASK)) return chain_tip->nHeight;
    


    furszy commented at 2:41 PM on June 24, 2024:

    could be simpler:

        if ((chain_tip->nStatus & BLOCK_HAVE_MASK) != BLOCK_HAVE_MASK) return chain_tip->nHeight;
    

    fjahr commented at 4:02 PM on July 12, 2024:

    done in #29770

  71. furszy commented at 2:43 PM on June 24, 2024: member

    Code review ACK 8789dc8f315a9d9ad7142d831bc9412f780248e7. Left few nits.

  72. DrahtBot requested review from stickies-v on Jun 24, 2024
  73. DrahtBot requested review from TheCharlatan on Jun 24, 2024
  74. stickies-v approved
  75. stickies-v commented at 2:45 PM on June 24, 2024: contributor

    ACK 8789dc8f315a9d9ad7142d831bc9412f780248e7

  76. achow101 commented at 7:18 PM on July 10, 2024: member

    ACK 8789dc8f315a9d9ad7142d831bc9412f780248e7

  77. achow101 merged this on Jul 10, 2024
  78. achow101 closed this on Jul 10, 2024

  79. fjahr commented at 11:49 PM on July 10, 2024: contributor

    With this merged, I will take #29770 out of draft mode shortly and also address the left-over comments here. Thanks @stickies-v and @furszy !

  80. fjahr referenced this in commit 34fdeface4 on Jul 12, 2024
  81. fjahr referenced this in commit 245c09bc85 on Jul 12, 2024
  82. fjahr referenced this in commit 0354df3608 on Jul 15, 2024
  83. fjahr referenced this in commit d855c594c3 on Jul 17, 2024
  84. fjahr referenced this in commit 3d72c9fd05 on Mar 29, 2025
  85. fjahr referenced this in commit 401e59df45 on Mar 30, 2025
  86. fjahr referenced this in commit 5ba0d1d0dc on Apr 3, 2025
  87. Fabcien referenced this in commit cd986cbab6 on Apr 14, 2025
  88. Fabcien referenced this in commit ab94fbe742 on Apr 14, 2025
  89. fjahr referenced this in commit 60786db022 on May 31, 2025
  90. fjahr referenced this in commit c354abeb28 on Jun 4, 2025
  91. fjahr referenced this in commit 342cbec6b2 on Jun 15, 2025
  92. fjahr referenced this in commit 4914105197 on Jun 17, 2025
  93. fjahr referenced this in commit 5ce8336aec on Jun 18, 2025
  94. bitcoin locked this on Jul 12, 2025

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-05-01 06:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me