prune, rpc: Check undo data when finding pruneheight #29668

pull fjahr wants to merge 3 commits into bitcoin:master from fjahr:2024-03-first-stored-undo changing 7 files +120 −14
  1. fjahr commented at 4:42 pm on March 17, 2024: contributor

    The function GetFirstStoredBlock() helps us find the first block for which we have data. So far this function only looked for a block with BLOCK_HAVE_DATA. However, this doesn’t mean that we also have the undo data of that block, and undo data might be required for what a user would like to do with those blocks. One example of how this might happen is if some blocks were fetched using the getblockfrompeer RPC. Blocks fetched from a peer will have data but no undo data.

    The first commit here allows GetFirstStoredBlock() to check for undo data as well by passing a parameter. This alone is useful for #29553 and I would use it there.

    In the second commit I am applying the undo check to the RPCs that report pruneheight to the user. I find this much more intuitive because I think the user expects to be able to do all operations on blocks up until the pruneheight but that is not the case if undo data is missing. I personally ran into this once before and now again when testing for assumeutxo when I had used getblockfrompeer. The following commit adds test coverage for this change of behavior.

    The last commit adds a note in the docs of getblockfrompeer that undo data will not be available.

  2. DrahtBot commented at 4:42 pm on March 17, 2024: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage

    For detailed information about the code coverage, see the test coverage report.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK furszy, stickies-v, achow101
    Stale ACK TheCharlatan

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #29770 (index: Check all necessary block data is available before starting to sync by fjahr)
    • #29656 (chainparams: Change nChainTx type to uint64_t by fjahr)
    • #19463 (Prune locks by luke-jr)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. fjahr renamed this:
    prune, rpc: Check undo data when finding for pruneheight
    prune, rpc: Check undo data when finding pruneheight
    on Mar 17, 2024
  4. DrahtBot added the label CI failed on Mar 17, 2024
  5. fjahr force-pushed on Mar 17, 2024
  6. DrahtBot removed the label CI failed on Mar 17, 2024
  7. TheCharlatan approved
  8. TheCharlatan commented at 2:47 pm on March 20, 2024: contributor
    ACK 296243c3153e996d1626b8eabc8f8bce845ce3ad
  9. in src/node/blockstorage.cpp:598 in 296243c315 outdated
    594@@ -595,12 +595,16 @@ bool BlockManager::IsBlockPruned(const CBlockIndex& block)
    595     return m_have_pruned && !(block.nStatus & BLOCK_HAVE_DATA) && (block.nTx > 0);
    596 }
    597 
    598-const CBlockIndex* BlockManager::GetFirstStoredBlock(const CBlockIndex& upper_block, const CBlockIndex* lower_block)
    599+const CBlockIndex* BlockManager::GetFirstStoredBlock(const CBlockIndex& upper_block, const CBlockIndex* lower_block, const bool check_undo)
    


    stickies-v commented at 5:14 pm on March 25, 2024:

    nit: I find has_undo more to the point

    0const CBlockIndex* BlockManager::GetFirstStoredBlock(const CBlockIndex& upper_block, const CBlockIndex* lower_block, const bool has_undo)
    

    fjahr commented at 5:03 pm on March 27, 2024:
    Not needed due to approach change
  10. in src/node/blockstorage.cpp:601 in 296243c315 outdated
    594@@ -595,12 +595,16 @@ bool BlockManager::IsBlockPruned(const CBlockIndex& block)
    595     return m_have_pruned && !(block.nStatus & BLOCK_HAVE_DATA) && (block.nTx > 0);
    596 }
    597 
    598-const CBlockIndex* BlockManager::GetFirstStoredBlock(const CBlockIndex& upper_block, const CBlockIndex* lower_block)
    599+const CBlockIndex* BlockManager::GetFirstStoredBlock(const CBlockIndex& upper_block, const CBlockIndex* lower_block, const bool check_undo)
    600 {
    601     AssertLockHeld(::cs_main);
    602+    uint32_t check_status{BLOCK_HAVE_DATA};
    


    stickies-v commented at 5:24 pm on March 25, 2024:

    nit: I think highlighting that this is a mask improves readability

    0    uint32_t status_mask{BLOCK_HAVE_DATA};
    

    stickies-v commented at 5:29 pm on March 25, 2024:

    nit: also, perhaps good to add a type static_assert here to ensure this doesn’t silently introduce bugs when nStatus changes type?

    0static_assert(std::is_same<decltype(check_status), decltype(last_block->nStatus)>::value);
    

    fjahr commented at 5:03 pm on March 27, 2024:
    Not needed due to approach change
  11. in src/node/blockstorage.h:341 in 296243c315 outdated
    336@@ -337,8 +337,8 @@ class BlockManager
    337 
    338     //! Find the first stored ancestor of start_block immediately after the last
    339     //! pruned ancestor. Return value will never be null. Caller is responsible
    340-    //! for ensuring that start_block has data is not pruned.
    341-    const CBlockIndex* GetFirstStoredBlock(const CBlockIndex& start_block LIFETIMEBOUND, const CBlockIndex* lower_block=nullptr) EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
    342+    //! for ensuring that start_block has data.
    343+    const CBlockIndex* GetFirstStoredBlock(const CBlockIndex& start_block LIFETIMEBOUND, const CBlockIndex* lower_block=nullptr, const bool check_undo=false) EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
    


    stickies-v commented at 5:35 pm on March 25, 2024:

    nit: unnecessary const

    0    const CBlockIndex* GetFirstStoredBlock(const CBlockIndex& start_block LIFETIMEBOUND, const CBlockIndex* lower_block=nullptr, bool check_undo=false) EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
    

    stickies-v commented at 5:42 pm on March 25, 2024:
    nit: also, would be nice to update the docs for the check_undo param, and include the exception made for genesis block so new callers are made aware to look out for that?

    fjahr commented at 5:03 pm on March 27, 2024:
    done
  12. in src/rpc/blockchain.cpp:839 in 296243c315 outdated
    835@@ -835,7 +836,7 @@ static RPCHelpMan pruneblockchain()
    836 
    837     PruneBlockFilesManual(active_chainstate, height);
    838     const CBlockIndex& block{*CHECK_NONFATAL(active_chain.Tip())};
    839-    return block.nStatus & BLOCK_HAVE_DATA ? active_chainstate.m_blockman.GetFirstStoredBlock(block)->nHeight - 1 : block.nHeight;
    840+    return block.nStatus & BLOCK_HAVE_MASK ? active_chainstate.m_blockman.GetFirstStoredBlock(block, nullptr, true)->nHeight - 1 : block.nHeight;
    


    stickies-v commented at 5:45 pm on March 25, 2024:

    nit (+ a few lines down)

    0    return block.nStatus & BLOCK_HAVE_MASK ? active_chainstate.m_blockman.GetFirstStoredBlock(block, /*last_block=*/nullptr, /*check_undo=*/true)->nHeight - 1 : block.nHeight;
    

    fjahr commented at 5:03 pm on March 27, 2024:
    done
  13. stickies-v commented at 8:12 pm on March 25, 2024: contributor

    Concept ACK

    I think this approach works well, but I think I’d prefer passing a mask instead of a bool, which prevents combinatorial explosion of params if we want to add more filtering options in the future. I personally find that a bit easier to quickly understand too. What do you think about something like this (didn’t add static asserts and doc updates etc yet):

     0diff --git a/src/node/blockstorage.cpp b/src/node/blockstorage.cpp
     1index 74a233f43b..eaefa602d3 100644
     2--- a/src/node/blockstorage.cpp
     3+++ b/src/node/blockstorage.cpp
     4@@ -595,16 +595,12 @@ bool BlockManager::IsBlockPruned(const CBlockIndex& block)
     5     return m_have_pruned && !(block.nStatus & BLOCK_HAVE_DATA) && (block.nTx > 0);
     6 }
     7 
     8-const CBlockIndex* BlockManager::GetFirstStoredBlock(const CBlockIndex& upper_block, const CBlockIndex* lower_block, const bool check_undo)
     9+const CBlockIndex* BlockManager::GetFirstStoredBlock(const CBlockIndex& upper_block, const CBlockIndex* lower_block, uint32_t status_mask)
    10 {
    11     AssertLockHeld(::cs_main);
    12-    uint32_t check_status{BLOCK_HAVE_DATA};
    13-    if (check_undo) {
    14-        check_status |= BLOCK_HAVE_UNDO;
    15-    }
    16     const CBlockIndex* last_block = &upper_block;
    17-    assert((last_block->nStatus & check_status) == check_status); // 'upper_block' must have data
    18-    while (last_block->pprev && ((last_block->pprev->nStatus & check_status) == check_status)) {
    19+    assert((last_block->nStatus & status_mask) == status_mask); // 'upper_block' must have data
    20+    while (last_block->pprev && ((last_block->pprev->nStatus & status_mask) == status_mask)) {
    21         if (lower_block) {
    22             // Return if we reached the lower_block
    23             if (last_block == lower_block) return lower_block;
    24@@ -617,7 +613,7 @@ const CBlockIndex* BlockManager::GetFirstStoredBlock(const CBlockIndex& upper_bl
    25     assert(last_block != nullptr);
    26     // In the special case that all blocks up to the Genesis block are not
    27     // pruned, we return the Genesis block instead of block 1, even though
    28-    // it can not have any undo data, since it is properly handled as an
    29+    // it may be missing (e.g. undo) data, since it is properly handled as an
    30     // exception everywhere.
    31     if (last_block->nHeight == 1) {
    32         return last_block->pprev;
    33diff --git a/src/node/blockstorage.h b/src/node/blockstorage.h
    34index 33d78259e6..766a170946 100644
    35--- a/src/node/blockstorage.h
    36+++ b/src/node/blockstorage.h
    37@@ -338,7 +338,7 @@ public:
    38     //! Find the first stored ancestor of start_block immediately after the last
    39     //! pruned ancestor. Return value will never be null. Caller is responsible
    40     //! for ensuring that start_block has data.
    41-    const CBlockIndex* GetFirstStoredBlock(const CBlockIndex& start_block LIFETIMEBOUND, const CBlockIndex* lower_block=nullptr, const bool check_undo=false) EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
    42+    const CBlockIndex* GetFirstStoredBlock(const CBlockIndex& start_block LIFETIMEBOUND, const CBlockIndex* lower_block=nullptr, uint32_t status_mask=BLOCK_HAVE_DATA) EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
    43 
    44     /** True if any block files have ever been pruned. */
    45     bool m_have_pruned = false;
    46diff --git a/src/rpc/blockchain.cpp b/src/rpc/blockchain.cpp
    47index 3cd9421343..88ed7ed964 100644
    48--- a/src/rpc/blockchain.cpp
    49+++ b/src/rpc/blockchain.cpp
    50@@ -836,7 +836,8 @@ static RPCHelpMan pruneblockchain()
    51 
    52     PruneBlockFilesManual(active_chainstate, height);
    53     const CBlockIndex& block{*CHECK_NONFATAL(active_chain.Tip())};
    54-    return block.nStatus & BLOCK_HAVE_MASK ? active_chainstate.m_blockman.GetFirstStoredBlock(block, nullptr, true)->nHeight - 1 : block.nHeight;
    55+    const uint32_t has_undo{BLOCK_HAVE_DATA | BLOCK_HAVE_UNDO};
    56+    return block.nStatus & BLOCK_HAVE_MASK ? active_chainstate.m_blockman.GetFirstStoredBlock(block, nullptr, has_undo)->nHeight - 1 : block.nHeight;
    57 },
    58     };
    59 }
    60@@ -1290,7 +1291,8 @@ RPCHelpMan getblockchaininfo()
    61     obj.pushKV("pruned", chainman.m_blockman.IsPruneMode());
    62     if (chainman.m_blockman.IsPruneMode()) {
    63         bool has_tip_data = tip.nStatus & BLOCK_HAVE_MASK;
    64-        obj.pushKV("pruneheight", has_tip_data ? chainman.m_blockman.GetFirstStoredBlock(tip, nullptr, true)->nHeight : tip.nHeight + 1);
    65+        const uint32_t has_undo{BLOCK_HAVE_DATA | BLOCK_HAVE_UNDO};
    66+        obj.pushKV("pruneheight", has_tip_data ? chainman.m_blockman.GetFirstStoredBlock(tip, nullptr, has_undo)->nHeight : tip.nHeight + 1);
    67 
    68         const bool automatic_pruning{chainman.m_blockman.GetPruneTarget() != BlockManager::PRUNE_TARGET_MANUAL};
    69         obj.pushKV("automatic_pruning",  automatic_pruning);
    

    Code LGTM 296243c3153e996d1626b8eabc8f8bce845ce3ad, and I verified that the (updated) feature_pruning.py test fails on master as expected. I’m happy to proceed with your approach too if you don’t like mine.

  14. fjahr force-pushed on Mar 27, 2024
  15. fjahr commented at 5:03 pm on March 27, 2024: contributor
    @stickies-v thanks for the feedback! I don’t have strong feelings between the approaches but I agree yours is more flexible and maybe a bit more expressive, so I applied that. I also addressed the rest of the feedback and added some more detail to the docs.
  16. DrahtBot added the label CI failed on Mar 28, 2024
  17. DrahtBot commented at 1:49 am on March 28, 2024: contributor

    🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the documentation.

    Possibly this is due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    Leave a comment here, if you need help tracking down a confusing failure.

    Debug: https://github.com/bitcoin/bitcoin/runs/23163769501

  18. fjahr force-pushed on Mar 28, 2024
  19. furszy commented at 3:08 pm on March 28, 2024: member

    q: doesn’t this mean that if we generate a chain, prune half of it, then fetch the pruned blocks through getblockfrompeer, any fresh indexes sync will throw a different init error after this changes? (StartIndexBackgroundSync() should fail at CheckBlockDataAvailability() now).

    If this is correct, it would be nice to introduce a test presenting the behavior change.

  20. DrahtBot removed the label CI failed on Mar 28, 2024
  21. fjahr commented at 10:57 pm on March 30, 2024: contributor

    q: doesn’t this mean that if we generate a chain, prune half of it, then fetch the pruned blocks through getblockfrompeer, any fresh indexes sync will throw a different init error after this changes? (StartIndexBackgroundSync() should fail at CheckBlockDataAvailability() now).

    If this is correct, it would be nice to introduce a test presenting the behavior change.

    Good question but no, CheckBlockDataAvailability() does not change the default mask (HAVE_DATA), so it shouldn’t change behavior there for now. But your question made me realize that we should check for undo data for some indices because they need it. I have implemented that in a follow-up here: #29770 and I also implemented the test there that shows the behavior change you had in mind I think.

  22. in src/rpc/blockchain.cpp:840 in 23ac56177d outdated
    835@@ -835,7 +836,8 @@ static RPCHelpMan pruneblockchain()
    836 
    837     PruneBlockFilesManual(active_chainstate, height);
    838     const CBlockIndex& block{*CHECK_NONFATAL(active_chain.Tip())};
    839-    return block.nStatus & BLOCK_HAVE_DATA ? active_chainstate.m_blockman.GetFirstStoredBlock(block)->nHeight - 1 : block.nHeight;
    840+    const uint32_t has_undo{BLOCK_HAVE_DATA | BLOCK_HAVE_UNDO};
    841+    return block.nStatus & BLOCK_HAVE_MASK ? active_chainstate.m_blockman.GetFirstStoredBlock(block, /*lower_block=*/nullptr, /*status_mask=*/has_undo)->nHeight - 1 : block.nHeight;
    


    stickies-v commented at 1:35 pm on March 31, 2024:
    0    return block.nStatus & has_undo ? active_chainstate.m_blockman.GetFirstStoredBlock(block, /*lower_block=*/nullptr, /*status_mask=*/has_undo)->nHeight - 1 : block.nHeight;
    

    fjahr commented at 7:28 pm on April 28, 2024:
    done
  23. in src/rpc/blockchain.cpp:1293 in 23ac56177d outdated
    1290@@ -1289,7 +1291,8 @@ RPCHelpMan getblockchaininfo()
    1291     obj.pushKV("pruned", chainman.m_blockman.IsPruneMode());
    1292     if (chainman.m_blockman.IsPruneMode()) {
    1293         bool has_tip_data = tip.nStatus & BLOCK_HAVE_DATA;
    


    stickies-v commented at 1:36 pm on March 31, 2024:
    This should check for has_undo too

    stickies-v commented at 2:13 pm on March 31, 2024:
    Also, it seems like we can just use BLOCK_HAVE_MASK = BLOCK_HAVE_DATA | BLOCK_HAVE_UNDO,

    fjahr commented at 7:28 pm on April 28, 2024:
    done, I liked that BLOCK_HAVE_DATA | BLOCK_HAVE_UNDO is more expressive because we don’t use BLOCK_HAVE_MASK much I feel like people would need to look it up more, but I have just added a comment instead now.
  24. in src/node/blockstorage.h:344 in 23ac56177d outdated
    342+    //! ancestor that does not match the status mask. Return value will never be
    343+    //! null. Caller is responsible for ensuring that start_block has the status
    344+    //! mask data. If the whole chain matched the status mask the genesis block
    345+    //! will be returned regardless of it's status match because, for example,
    346+    //! it can not have undo data by nature.
    347+    const CBlockIndex* GetFirstStoredBlock(const CBlockIndex& start_block LIFETIMEBOUND, const CBlockIndex* lower_block=nullptr, uint32_t status_mask=BLOCK_HAVE_DATA) EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
    


    stickies-v commented at 3:35 pm on March 31, 2024:

    nit: it’s called upper_block in the implementation, which I think is a better name too (cfr lower_block)

    0    const CBlockIndex* GetFirstStoredBlock(const CBlockIndex& upper_block LIFETIMEBOUND, const CBlockIndex* lower_block=nullptr, uint32_t status_mask=BLOCK_HAVE_DATA) EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
    

    fjahr commented at 7:28 pm on April 28, 2024:
    done
  25. stickies-v commented at 10:14 pm on March 31, 2024: contributor

    Thanks for incorporating the mask suggestion. Had a deeper look at the code, I think this logic maybe warrants a bit more cleaning up, because with the status_mask, using GetFirstStoredBlock is not very intuitive anymore and potentially a bit footgunny.

    • GetFirstStoredBlock: rename to GetFirstBlock since the Stored bit depends on the status_mask now. Consequently, make status_mask required to avoid footguns.
    • let callsites handle genesis block exceptions, because it depends on the status_mask. As a bonus, everything is a bit more explicit?
    • some other tidying up, e.g. make parameter naming consistent between header and implementation, improve documentation

    Wdyt about this further iteration? I understand it’s quite a bit of overhaul compared to your first proposal for what should be a relatively simple change. I’m just hesitant about introducing new boolean parameters when we can avoid it, and I think it’s worth improving overall code robustness?

      0diff --git a/src/init.cpp b/src/init.cpp
      1index 1a4fce4678..a8a5ff0dfd 100644
      2--- a/src/init.cpp
      3+++ b/src/init.cpp
      4@@ -2032,7 +2032,7 @@ bool StartIndexBackgroundSync(NodeContext& node)
      5     if (indexes_start_block) {
      6         LOCK(::cs_main);
      7         const CBlockIndex* start_block = *indexes_start_block;
      8-        if (!start_block) start_block = chainman.ActiveChain().Genesis();
      9+        if (!start_block) start_block = chainman.ActiveChain()[1];
     10         if (!chainman.m_blockman.CheckBlockDataAvailability(*index_chain.Tip(), *Assert(start_block))) {
     11             return InitError(strprintf(Untranslated("%s best block of the index goes beyond pruned data. Please disable the index or reindex (which will download the whole blockchain again)"), older_index_name));
     12         }
     13diff --git a/src/node/blockstorage.cpp b/src/node/blockstorage.cpp
     14index eaefa602d3..8556504453 100644
     15--- a/src/node/blockstorage.cpp
     16+++ b/src/node/blockstorage.cpp
     17@@ -595,11 +595,12 @@ bool BlockManager::IsBlockPruned(const CBlockIndex& block)
     18     return m_have_pruned && !(block.nStatus & BLOCK_HAVE_DATA) && (block.nTx > 0);
     19 }
     20 
     21-const CBlockIndex* BlockManager::GetFirstStoredBlock(const CBlockIndex& upper_block, const CBlockIndex* lower_block, uint32_t status_mask)
     22+const CBlockIndex* BlockManager::GetFirstBlock(const CBlockIndex& upper_block, uint32_t status_mask,
     23+                                               const CBlockIndex* lower_block)
     24 {
     25     AssertLockHeld(::cs_main);
     26     const CBlockIndex* last_block = &upper_block;
     27-    assert((last_block->nStatus & status_mask) == status_mask); // 'upper_block' must have data
     28+    assert((last_block->nStatus & status_mask) == status_mask); // 'upper_block' must satisfy the status mask
     29     while (last_block->pprev && ((last_block->pprev->nStatus & status_mask) == status_mask)) {
     30         if (lower_block) {
     31             // Return if we reached the lower_block
     32@@ -610,21 +611,13 @@ const CBlockIndex* BlockManager::GetFirstStoredBlock(const CBlockIndex& upper_bl
     33         }
     34         last_block = last_block->pprev;
     35     }
     36-    assert(last_block != nullptr);
     37-    // In the special case that all blocks up to the Genesis block are not
     38-    // pruned, we return the Genesis block instead of block 1, even though
     39-    // it may be missing (e.g. undo) data, since it is properly handled as an
     40-    // exception everywhere.
     41-    if (last_block->nHeight == 1) {
     42-        return last_block->pprev;
     43-    }
     44-    return last_block;
     45+    return Assert(last_block);
     46 }
     47 
     48 bool BlockManager::CheckBlockDataAvailability(const CBlockIndex& upper_block, const CBlockIndex& lower_block)
     49 {
     50     if (!(upper_block.nStatus & BLOCK_HAVE_DATA)) return false;
     51-    return GetFirstStoredBlock(upper_block, &lower_block) == &lower_block;
     52+    return GetFirstBlock(upper_block, BLOCK_HAVE_DATA, &lower_block) == &lower_block;
     53 }
     54 
     55 // If we're using -prune with -reindex, then delete block files that will be ignored by the
     56diff --git a/src/node/blockstorage.h b/src/node/blockstorage.h
     57index 8f83b68558..8f309c31b6 100644
     58--- a/src/node/blockstorage.h
     59+++ b/src/node/blockstorage.h
     60@@ -335,13 +335,29 @@ public:
     61     //! (part of the same chain).
     62     bool CheckBlockDataAvailability(const CBlockIndex& upper_block LIFETIMEBOUND, const CBlockIndex& lower_block LIFETIMEBOUND) EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
     63 
     64-    //! Find the first stored ancestor of start_block immediately after the last
     65-    //! ancestor that does not match the status mask. Return value will never be
     66-    //! null. Caller is responsible for ensuring that start_block has the status
     67-    //! mask data. If the whole chain matched the status mask the genesis block
     68-    //! will be returned regardless of it's status match because, for example,
     69-    //! it can not have undo data by nature.
     70-    const CBlockIndex* GetFirstStoredBlock(const CBlockIndex& start_block LIFETIMEBOUND, const CBlockIndex* lower_block=nullptr, uint32_t status_mask=BLOCK_HAVE_DATA) EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
     71+    /**
     72+     * [@brief](/bitcoin-bitcoin/contributor/brief/) Finds the furthest away ancestor of `upper_block` of which the nStatus matches the `status_mask`
     73+     * 
     74+     * Starts from `upper_block` and iterates backwards through its ancestors for as long as the block's
     75+     * nStatus keeps matching the `status_mask` mask, or until it encounters `lower_block`.
     76+     *
     77+     * [@pre](/bitcoin-bitcoin/contributor/pre/) `upper_block` must match the `status_mask`.
     78+     * 
     79+     * [@param](/bitcoin-bitcoin/contributor/param/) upper_block The block from which to start the search, must meet the `status_mask` criteria.
     80+     * [@param](/bitcoin-bitcoin/contributor/param/) status_mask A bitmask representing the conditions to check against each block's nStatus.
     81+     * [@param](/bitcoin-bitcoin/contributor/param/) lower_block An optional pointer to the `CBlockIndex` that serves as the lower boundary of
     82+     *                    the search (inclusive). If `nullptr`, the search includes up to the genesis
     83+     *                    block.
     84+     * [@return](/bitcoin-bitcoin/contributor/return/) A (non-null) pointer to the `CBlockIndex` of the furthest away ancestor (including
     85+     *         `upper_block` itself) that matches the `status_mask`, up to and including
     86+     *         `lower_block` if it is part of the ancestry.
     87+     */
     88+    const CBlockIndex* GetFirstBlock(
     89+        const CBlockIndex& upper_block LIFETIMEBOUND, 
     90+        uint32_t status_mask,
     91+        const CBlockIndex* lower_block = nullptr
     92+    ) EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
     93+
     94 
     95     /** True if any block files have ever been pruned. */
     96     bool m_have_pruned = false;
     97diff --git a/src/rpc/blockchain.cpp b/src/rpc/blockchain.cpp
     98index fa83d2e397..15d6f9e9b9 100644
     99--- a/src/rpc/blockchain.cpp
    100+++ b/src/rpc/blockchain.cpp
    101@@ -94,6 +94,18 @@ double GetDifficulty(const CBlockIndex& blockindex)
    102     return dDiff;
    103 }
    104 
    105+//! Return height of highest block that has been pruned, or -1 if no blocks have been pruned
    106+static int GetPruneHeight(const Chainstate& active_chainstate) EXCLUSIVE_LOCKS_REQUIRED(!cs_main) {
    107+    AssertLockHeld(cs_main);
    108+
    109+    const CBlockIndex& chain_tip{*CHECK_NONFATAL(active_chainstate.m_chain.Tip())};
    110+    if (!(chain_tip.nStatus & BLOCK_HAVE_MASK)) return chain_tip.nHeight;
    111+    const auto first_pruned{active_chainstate.m_blockman.GetFirstBlock(chain_tip, BLOCK_HAVE_MASK)->nHeight - 1};
    112+
    113+    // special case: genesis block is never pruned
    114+    return first_pruned == 0 ? -1 : first_pruned;
    115+}
    116+
    117 static int ComputeNextBlockAndDepth(const CBlockIndex& tip, const CBlockIndex& blockindex, const CBlockIndex*& next)
    118 {
    119     next = tip.GetAncestor(blockindex.nHeight + 1);
    120@@ -835,9 +847,7 @@ static RPCHelpMan pruneblockchain()
    121     }
    122 
    123     PruneBlockFilesManual(active_chainstate, height);
    124-    const CBlockIndex& block{*CHECK_NONFATAL(active_chain.Tip())};
    125-    const uint32_t has_undo{BLOCK_HAVE_DATA | BLOCK_HAVE_UNDO};
    126-    return block.nStatus & BLOCK_HAVE_MASK ? active_chainstate.m_blockman.GetFirstStoredBlock(block, /*lower_block=*/nullptr, /*status_mask=*/has_undo)->nHeight - 1 : block.nHeight;
    127+    return GetPruneHeight(active_chainstate);
    128 },
    129     };
    130 }
    131@@ -1290,10 +1300,7 @@ RPCHelpMan getblockchaininfo()
    132     obj.pushKV("size_on_disk", chainman.m_blockman.CalculateCurrentUsage());
    133     obj.pushKV("pruned", chainman.m_blockman.IsPruneMode());
    134     if (chainman.m_blockman.IsPruneMode()) {
    135-        bool has_tip_data = tip.nStatus & BLOCK_HAVE_DATA;
    136-        const uint32_t has_undo{BLOCK_HAVE_DATA | BLOCK_HAVE_UNDO};
    137-        obj.pushKV("pruneheight", has_tip_data ? chainman.m_blockman.GetFirstStoredBlock(tip, /*lower_block=*/nullptr, /*status_mask=*/has_undo)->nHeight : tip.nHeight + 1);
    138-
    139+        obj.pushKV("pruneheight", GetPruneHeight(active_chainstate) + 1);
    140         const bool automatic_pruning{chainman.m_blockman.GetPruneTarget() != BlockManager::PRUNE_TARGET_MANUAL};
    141         obj.pushKV("automatic_pruning",  automatic_pruning);
    142         if (automatic_pruning) {
    143diff --git a/src/test/blockmanager_tests.cpp b/src/test/blockmanager_tests.cpp
    144index d7ac0bf823..12f3257700 100644
    145--- a/src/test/blockmanager_tests.cpp
    146+++ b/src/test/blockmanager_tests.cpp
    147@@ -2,6 +2,7 @@
    148 // Distributed under the MIT software license, see the accompanying
    149 // file COPYING or http://www.opensource.org/licenses/mit-license.php.
    150 
    151+#include <chain.h>
    152 #include <chainparams.h>
    153 #include <clientversion.h>
    154 #include <node/blockstorage.h>
    155@@ -113,7 +114,7 @@ BOOST_FIXTURE_TEST_CASE(blockmanager_block_data_availability, TestChain100Setup)
    156     };
    157 
    158     // 1) Return genesis block when all blocks are available
    159-    BOOST_CHECK_EQUAL(blockman.GetFirstStoredBlock(tip), chainman->ActiveChain()[0]);
    160+    BOOST_CHECK_EQUAL(blockman.GetFirstBlock(tip, BLOCK_HAVE_DATA), chainman->ActiveChain()[0]);
    161     BOOST_CHECK(blockman.CheckBlockDataAvailability(tip, *chainman->ActiveChain()[0]));
    162 
    163     // 2) Check lower_block when all blocks are available
    164@@ -127,7 +128,7 @@ BOOST_FIXTURE_TEST_CASE(blockmanager_block_data_availability, TestChain100Setup)
    165     func_prune_blocks(last_pruned_block);
    166 
    167     // 3) The last block not pruned is in-between upper-block and the genesis block
    168-    BOOST_CHECK_EQUAL(blockman.GetFirstStoredBlock(tip), first_available_block);
    169+    BOOST_CHECK_EQUAL(blockman.GetFirstBlock(tip, BLOCK_HAVE_DATA), first_available_block);
    170     BOOST_CHECK(blockman.CheckBlockDataAvailability(tip, *first_available_block));
    171     BOOST_CHECK(!blockman.CheckBlockDataAvailability(tip, *last_pruned_block));
    172 }
    
  26. fjahr force-pushed on Apr 28, 2024
  27. fjahr commented at 7:34 pm on April 28, 2024: contributor

    @stickies-v Thank you, I mostly took your suggestions with some minor tweaks and made you co-author of the refactoring commit. I was a bit torn about letting the caller handle the Genesis block issue but since this was actually the previous behavior I guess it’s better to not change it if you like it the way it is. I have adapted GetPruneHeight to return std::optional which I found a good fit and avoids the -1 return value. And I think the change in init.cpp isn’t needed here (yet). But I think we will need it in #29770 later.

    EDIT: I had forgotten that pruneblockchain returns -1 when nothing is pruned, but I still think handling it internally with std::optional is nicer.

  28. DrahtBot added the label CI failed on Apr 28, 2024
  29. fjahr force-pushed on Apr 28, 2024
  30. fjahr force-pushed on Apr 28, 2024
  31. fjahr force-pushed on Apr 28, 2024
  32. DrahtBot removed the label CI failed on Apr 28, 2024
  33. in src/rpc/blockchain.cpp:798 in edbb6a3eb1 outdated
    792+    if (!(chain_tip.nStatus & BLOCK_HAVE_MASK)) return chain_tip.nHeight;
    793+    const auto first_block{active_chainstate.m_blockman.GetFirstBlock(chain_tip, /*status_mask=*/BLOCK_HAVE_MASK)};
    794+
    795+    // Result can never be 0 because genesis block is never pruned, so the
    796+    // result 1 means that no block has been pruned.
    797+    if (first_block->nHeight == 1) {
    


    stickies-v commented at 11:51 am on April 29, 2024:
    Could add a CHECK_NONFATAL(first_block->nHeight > 0); here to put that assumption in code, that would’ve helped with the segfault in current HEAD

    fjahr commented at 4:25 pm on April 29, 2024:
    done
  34. in src/rpc/blockchain.cpp:795 in edbb6a3eb1 outdated
    790+    const CBlockIndex& chain_tip{*CHECK_NONFATAL(active_chainstate.m_chain.Tip())};
    791+    // Check for both data and undo data
    792+    if (!(chain_tip.nStatus & BLOCK_HAVE_MASK)) return chain_tip.nHeight;
    793+    const auto first_block{active_chainstate.m_blockman.GetFirstBlock(chain_tip, /*status_mask=*/BLOCK_HAVE_MASK)};
    794+
    795+    // Result can never be 0 because genesis block is never pruned, so the
    


    stickies-v commented at 11:52 am on April 29, 2024:
    This block needs to be in the next commit - we’re not passing BLOCK_HAVE_MASK yet in dd7696b97f20d756df1afa523a8f9e7bf3924c7d. This should fix the failing CI

    fjahr commented at 4:25 pm on April 29, 2024:
    done
  35. in src/rpc/blockchain.cpp:801 in edbb6a3eb1 outdated
    796+    // result 1 means that no block has been pruned.
    797+    if (first_block->nHeight == 1) {
    798+        return std::nullopt;
    799+    }
    800+    // The block before the block with all data is the last that was pruned
    801+    return first_block->pprev->nHeight;
    


    stickies-v commented at 11:59 am on April 29, 2024:
    GetFirstBlock() does not provide any guarantees that pprev is not nullptr (nor should it), so I think we need to assert that here

    fjahr commented at 4:25 pm on April 29, 2024:
    done

    stickies-v commented at 4:10 pm on June 3, 2024:
    I don’t think this is fully addressed in 0501ec102f3740049bd2e891cd83527c31b29222 - would update to return Assert(first_unpruned.pprev)->nHeight;?

    fjahr commented at 4:19 pm on June 22, 2024:
    Not sure why this is still needed when we return above if !first_unpruned.pprev but I’ve added it since it won’t hurt.
  36. in src/rpc/blockchain.cpp:786 in edbb6a3eb1 outdated
    782@@ -782,6 +783,24 @@ static RPCHelpMan getblock()
    783     };
    784 }
    785 
    786+//! Return height of highest block that has been pruned
    


    stickies-v commented at 12:02 pm on April 29, 2024:
    0//! Return height of highest block that has been pruned, or std::nullopt if no blocks have been pruned
    

    fjahr commented at 4:25 pm on April 29, 2024:
    added
  37. in src/rpc/blockchain.cpp:857 in edbb6a3eb1 outdated
    852@@ -834,8 +853,8 @@ static RPCHelpMan pruneblockchain()
    853     }
    854 
    855     PruneBlockFilesManual(active_chainstate, height);
    856-    const CBlockIndex& block{*CHECK_NONFATAL(active_chain.Tip())};
    857-    return block.nStatus & BLOCK_HAVE_DATA ? active_chainstate.m_blockman.GetFirstStoredBlock(block)->nHeight - 1 : block.nHeight;
    858+    const auto prune_height{GetPruneHeight(active_chainstate)};
    859+    return prune_height ? prune_height.value() : -1;
    


    stickies-v commented at 12:07 pm on April 29, 2024:

    nit

    0    return GetPruneHeight(active_chainstate).value_or(-1);
    

    fjahr commented at 4:26 pm on April 29, 2024:
    taken
  38. in src/rpc/blockchain.cpp:1318 in edbb6a3eb1 outdated
    1306@@ -1288,8 +1307,8 @@ RPCHelpMan getblockchaininfo()
    1307     obj.pushKV("size_on_disk", chainman.m_blockman.CalculateCurrentUsage());
    1308     obj.pushKV("pruned", chainman.m_blockman.IsPruneMode());
    1309     if (chainman.m_blockman.IsPruneMode()) {
    1310-        bool has_tip_data = tip.nStatus & BLOCK_HAVE_DATA;
    1311-        obj.pushKV("pruneheight", has_tip_data ? chainman.m_blockman.GetFirstStoredBlock(tip)->nHeight : tip.nHeight + 1);
    1312+        const auto prune_height{GetPruneHeight(active_chainstate)};
    1313+        obj.pushKV("pruneheight", prune_height ? prune_height.value() + 1 : 0);
    


    stickies-v commented at 12:14 pm on April 29, 2024:

    nit: more consistent with the (suggested) approach in pruneblockchain, but then “plus one (only present if pruning is enabled)” as per the docs in this RPC

    0        const auto prune_height{GetPruneHeight(active_chainstate).value_or(-1)};
    1        obj.pushKV("pruneheight", prune_height + 1);
    

    fjahr commented at 4:26 pm on April 29, 2024:
    I prefer to keep this one as is because I find it more intuitive.
  39. stickies-v commented at 12:18 pm on April 29, 2024: contributor
    nit: I think the last 3 commits should be squashed
  40. in src/node/blockstorage.h:359 in dd7696b97f outdated
    353+     *                    the search (inclusive). If `nullptr`, the search includes up to the genesis
    354+     *                    block.
    355+     * @return A (non-null) pointer to the `CBlockIndex` of the furthest away ancestor (including
    356+     *         `upper_block` itself) that matches the `status_mask`, up to and including
    357+     *         `lower_block` if it is part of the ancestry.
    358+     */
    


    ryanofsky commented at 4:00 pm on April 29, 2024:

    In commit “refactor, blockstorage: Generalize GetFirstStoredBlock” (dd7696b97f20d756df1afa523a8f9e7bf3924c7d)

    I think there are a few issues with this comment:

    • The @brief and @return sections make it sound like it is returning the earliest block matching the flags, when actually it is returning the earliest block matching the flags after the latest block not matching them. This is different because it means not only does the returned block have the flags, but all blocks between the returned block and upper block have them.
    • One precondition is specified but one is missing (this also asserts false if the lower block is not an ancestor of the upper block).
    • I don’t like how some pretty important details are written in parentheses. Better to use parentheses for more optional explanatory information.

    Suggestion to fix these things:

     0    /**
     1     * [@brief](/bitcoin-bitcoin/contributor/brief/) Returns the earliest block with specified `status_mask` flags set after
     2     * the latest block _not_ having those flags.
     3     *
     4     * This function starts from `upper_block`, which must have all
     5     * `status_mask` flags set, and iterates backwards through its ancestors. It
     6     * continues as long as each block has all `status_mask` flags set, until
     7     * reaching the oldest ancestor or `lower_block`.
     8     * 
     9     * [@pre](/bitcoin-bitcoin/contributor/pre/) `upper_block` must have all `status_mask` flags set.
    10     * [@pre](/bitcoin-bitcoin/contributor/pre/) `lower_block` must be null or an ancestor of `upper_block`
    11     *
    12     * [@param](/bitcoin-bitcoin/contributor/param/) upper_block The starting block for the search, which must have all
    13     *                    `status_mask` flags set.
    14     * [@param](/bitcoin-bitcoin/contributor/param/) status_mask Bitmask specifying required status flags.
    15     * [@param](/bitcoin-bitcoin/contributor/param/) lower_block The earliest possible block to return. If null, the
    16     *                    search can extend to the genesis block.
    17     *
    18     * [@return](/bitcoin-bitcoin/contributor/return/) A non-null pointer to the earliest block between `upper_block`
    19     *         and `lower_block`, inclusive, such that every block between the
    20     *         returned block and `upper_block` has `status_mask` flags set.
    21     */
    

    fjahr commented at 6:57 pm on May 30, 2024:
    Taken, thanks!
  41. fjahr force-pushed on Apr 29, 2024
  42. fjahr commented at 4:27 pm on April 29, 2024: contributor

    Adressed feedback from @stickies-v , thanks again!

    nit: I think the last 3 commits should be squashed

    The doc change is only tangentially related so I will keep it separate. But I squashed the 2nd and 3rd.

  43. in src/rpc/blockchain.cpp:797 in 4e62129d8a outdated
    792+
    793+    if (first_block->nHeight == 0) {
    794+        return std::nullopt;
    795+    }
    796+    // The block before the block with all data is the last that was pruned
    797+    return Assert(first_block->pprev)->nHeight;
    


    ryanofsky commented at 4:44 pm on April 29, 2024:

    In commit “refactor, blockstorage: Generalize GetFirstStoredBlock” (4e62129d8aed839c83fa4a595aab06ed5ca7eeb6)

    IMO, this logic is a little confusing because it using both nHeight and pprev instead of just using pprev consistently. Also I think the cases here could have more explanation.

    0// Get first block with data, after the last block without data.
    1// This is the start of the unpruned range of blocks.
    2const auto first_unpruned{active_chainstate.m_blockman.GetFirstBlock(chain_tip, BLOCK_HAVE_DATA)};
    3if (!first_unpruned->prev) {
    4   // No block before the first unpruned block means nothing is pruned.
    5   return nullopt;
    6}
    7// Block before the first unpruned block is the last pruned block.
    8return first_unpruned->prev->nHeight;
    

    fjahr commented at 6:57 pm on May 30, 2024:
    Taken, thanks!
  44. in src/rpc/blockchain.cpp:791 in 61f352d6bc outdated
    786@@ -787,10 +787,14 @@ static std::optional<int> GetPruneHeight(const Chainstate& active_chainstate) EX
    787     AssertLockHeld(::cs_main);
    788 
    789     const CBlockIndex& chain_tip{*CHECK_NONFATAL(active_chainstate.m_chain.Tip())};
    790-    if (!(chain_tip.nStatus & BLOCK_HAVE_DATA)) return chain_tip.nHeight;
    791-    const auto first_block{active_chainstate.m_blockman.GetFirstBlock(chain_tip, BLOCK_HAVE_DATA)};
    792-
    793-    if (first_block->nHeight == 0) {
    794+    // Check for both data and undo data
    795+    if (!(chain_tip.nStatus & BLOCK_HAVE_MASK)) return chain_tip.nHeight;
    


    ryanofsky commented at 5:30 pm on April 29, 2024:

    In commit “rpc: Make pruneheight also reflect undo data presence” (61f352d6bcf485862815fdd68064cc07ea8ab6ce)

    This doesn’t seem right. If chain consists of just genesis block, and genesis block doesn’t have undo data this will return height 0 as if genesis block were pruned.


    fjahr commented at 6:59 pm on May 30, 2024:
    Addressed in the other comment on this commit
  45. in src/rpc/blockchain.cpp:786 in 4e62129d8a outdated
    781@@ -782,6 +782,21 @@ static RPCHelpMan getblock()
    782     };
    783 }
    784 
    785+//! Return height of highest block that has been pruned, or std::nullopt if no blocks have been pruned
    786+static std::optional<int> GetPruneHeight(const Chainstate& active_chainstate) EXCLUSIVE_LOCKS_REQUIRED(::cs_main) {
    


    ryanofsky commented at 5:41 pm on April 29, 2024:

    In commit “refactor, blockstorage: Generalize GetFirstStoredBlock” (4e62129d8aed839c83fa4a595aab06ed5ca7eeb6)

    I don’t think it makes sense to call this parameter active_chainstate. It is true the active chainstate is always passed, but this function should care whether the chainstate is active or not. Probably also it would be better for this function not to access the Chainstate class at all and instead be passed separate BlockMan and CChain parameters.


    fjahr commented at 6:57 pm on May 30, 2024:
    Ok, passing CChain and BlockMan as parameters now.
  46. in src/rpc/blockchain.cpp:794 in 61f352d6bc outdated
    793-    if (first_block->nHeight == 0) {
    794+    // Check for both data and undo data
    795+    if (!(chain_tip.nStatus & BLOCK_HAVE_MASK)) return chain_tip.nHeight;
    796+    const auto first_block{active_chainstate.m_blockman.GetFirstBlock(chain_tip, /*status_mask=*/BLOCK_HAVE_MASK)};
    797+
    798+    // Result can never be 0 because genesis block is never pruned, so the
    


    ryanofsky commented at 6:15 pm on April 29, 2024:

    In commit “rpc: Make pruneheight also reflect undo data presence” (61f352d6bcf485862815fdd68064cc07ea8ab6ce)

    This comment “Result can never be 0 because genesis block is never pruned” is misleading and basically describes the opposite of what is happening. If the genesis block was not pruned you would expect GetFirstBlock to return the genesis block, because it has data, and for the result to still be 0 here like it was in the previous commit. The reason GetFirstBlock is no longer returning the genesis block is because the genesis block does not have undo data, so it is now being considered pruned.

    So code works for a different reason than is being described. I think it would probably clearer to ignore genesis block flags rather than making this code depend on that that genesis undo flag won’t be set.

    Would suggest something more like the following (untested):

     0//! Return height of highest block on the chain that has been pruned, or std::nullopt if no blocks have been pruned
     1static std::optional<int> GetPruneHeight(BlockManager& blockman, const CChain& chain) EXCLUSIVE_LOCKS_REQUIRED(::cs_main) {
     2    AssertLockHeld(::cs_main);
     3
     4    // Search for the last block missing block data or undo data. Don't let the
     5    // search consider the genesis block, because the genesis block does not
     6    // have undo data, but should not be considered pruned.
     7    const CBlockIndex* first_block{chain[1]};
     8    const CBlockIndex* chain_tip{chain.Tip()};
     9
    10    // If there are no blocks after the genesis block, or no blocks at all, nothing is pruned.
    11    if (!first_block || !chain_tip) return std::nullopt;
    12
    13    // If the chain tip is pruned, everything is pruned.
    14    if (!(chain_tip->nStatus & BLOCK_HAVE_MASK)) return chain_tip->nHeight;
    15
    16    const auto& first_unpruned{*Assert(blockman.GetFirstBlock(*chain_tip, /*status_mask=*/BLOCK_HAVE_MASK, first_block))};
    17    if (&first_unpruned == first_block) {
    18        // All blocks between first_block and chain_tip have data, so nothing is pruned.
    19        return std::nullopt;
    20    }
    21
    22    // Block before the first unpruned block is the last pruned block.
    23    return Assert(first_unpruned.pprev)->nHeight;
    24}
    

    fjahr commented at 6:59 pm on May 30, 2024:
    I have adopted these changes with minor edits and that addresses your previous comment about the genesis behavior as well.
  47. ryanofsky commented at 6:21 pm on April 29, 2024: contributor
    Code review 697d0a23f5d0a75829ffe1ac874eaf15fff3bfbe. Approach seems good and first commit seems ok in its current form but I think version of GetPruneHeight in the second commit is not handling the genesis block correctly.
  48. fjahr force-pushed on May 30, 2024
  49. fjahr commented at 7:07 pm on May 30, 2024: contributor
    Addressed comments from @ryanofsky , thank you for the thoughtful review!
  50. in src/rpc/blockchain.cpp:786 in dca1ca1f56 outdated
    781@@ -782,6 +782,24 @@ static RPCHelpMan getblock()
    782     };
    783 }
    784 
    785+//! Return height of highest block that has been pruned, or std::nullopt if no blocks have been pruned
    786+static std::optional<int> GetPruneHeight(BlockManager& blockman, const CChain& chain) EXCLUSIVE_LOCKS_REQUIRED(::cs_main) {
    


    stickies-v commented at 1:52 pm on June 3, 2024:

    In dca1ca1f56d8e63929eb7aea64f8fb3e01752357: I think BlockManager::GetFirstBlock() can be made const, which allows us to pass a const BlockManager& blockman here:

     0diff --git a/src/node/blockstorage.cpp b/src/node/blockstorage.cpp
     1index 18eb2dc1aa..6dbc111eb6 100644
     2--- a/src/node/blockstorage.cpp
     3+++ b/src/node/blockstorage.cpp
     4@@ -595,7 +595,7 @@ bool BlockManager::IsBlockPruned(const CBlockIndex& block)
     5     return m_have_pruned && !(block.nStatus & BLOCK_HAVE_DATA) && (block.nTx > 0);
     6 }
     7 
     8-const CBlockIndex* BlockManager::GetFirstBlock(const CBlockIndex& upper_block, uint32_t status_mask, const CBlockIndex* lower_block)
     9+const CBlockIndex* BlockManager::GetFirstBlock(const CBlockIndex& upper_block, uint32_t status_mask, const CBlockIndex* lower_block) const
    10 {
    11     AssertLockHeld(::cs_main);
    12     const CBlockIndex* last_block = &upper_block;
    13diff --git a/src/node/blockstorage.h b/src/node/blockstorage.h
    14index 917d75232d..d1e46fde2b 100644
    15--- a/src/node/blockstorage.h
    16+++ b/src/node/blockstorage.h
    17@@ -361,7 +361,7 @@ public:
    18         const CBlockIndex& upper_block LIFETIMEBOUND,
    19         uint32_t status_mask,
    20         const CBlockIndex* lower_block = nullptr
    21-    ) EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
    22+    ) const EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
    23 
    24     /** True if any block files have ever been pruned. */
    25     bool m_have_pruned = false;
    26diff --git a/src/rpc/blockchain.cpp b/src/rpc/blockchain.cpp
    27index 90cee9901d..b99abe545d 100644
    28--- a/src/rpc/blockchain.cpp
    29+++ b/src/rpc/blockchain.cpp
    30@@ -783,7 +783,7 @@ static RPCHelpMan getblock()
    31 }
    32 
    33 //! Return height of highest block that has been pruned, or std::nullopt if no blocks have been pruned
    34-static std::optional<int> GetPruneHeight(BlockManager& blockman, const CChain& chain) EXCLUSIVE_LOCKS_REQUIRED(::cs_main) {
    35+static std::optional<int> GetPruneHeight(const BlockManager& blockman, const CChain& chain) EXCLUSIVE_LOCKS_REQUIRED(::cs_main) {
    36     AssertLockHeld(::cs_main);
    37 
    38     const CBlockIndex* chain_tip{chain.Tip()};
    

    fjahr commented at 4:21 pm on June 22, 2024:
    done
  51. in src/rpc/blockchain.cpp:800 in 0501ec102f outdated
    795+
    796+    // If there are no blocks after the genesis block, or no blocks at all, nothing is pruned.
    797+    if (!first_block || !chain_tip) return std::nullopt;
    798+
    799+    // If the chain tip is pruned, everything is pruned.
    800+    if (!(chain_tip->nStatus & BLOCK_HAVE_MASK)) return chain_tip->nHeight;
    


    stickies-v commented at 3:42 pm on June 3, 2024:

    I think this is buggy, chain tip should be considered pruned if any of the BLOCK_HAVE_MASK flags are missing:

    0    if (!((chain_tip->nStatus & BLOCK_HAVE_MASK) == BLOCK_HAVE_MASK)) return chain_tip->nHeight;
    

    Can be verified by adding unit test to the first commit, which passes on dca1ca1f56 and fails on current HEAD for CheckGetPruneHeight(blockman, chain, 100):

    0$ ./src/test/test_bitcoin --run_test=blockchain_tests/get_prune_height
    1...
    2Running 1 test case...
    3Assertion failed: ((last_block->nStatus & status_mask) == status_mask), function GetFirstBlock, file blockstorage.cpp, line 602.
    4unknown location:0: fatal error: in "blockchain_tests/get_prune_height": signal: SIGABRT (application abort requested)
    5test/blockchain_tests.cpp:93: last checkpoint
    
     0diff --git a/src/rpc/blockchain.cpp b/src/rpc/blockchain.cpp
     1index 90cee9901d..b7c1bc9673 100644
     2--- a/src/rpc/blockchain.cpp
     3+++ b/src/rpc/blockchain.cpp
     4@@ -783,7 +783,7 @@ static RPCHelpMan getblock()
     5 }
     6 
     7 //! Return height of highest block that has been pruned, or std::nullopt if no blocks have been pruned
     8-static std::optional<int> GetPruneHeight(BlockManager& blockman, const CChain& chain) EXCLUSIVE_LOCKS_REQUIRED(::cs_main) {
     9+std::optional<int> GetPruneHeight(BlockManager& blockman, const CChain& chain) {
    10     AssertLockHeld(::cs_main);
    11 
    12     const CBlockIndex* chain_tip{chain.Tip()};
    13diff --git a/src/rpc/blockchain.h b/src/rpc/blockchain.h
    14index c2021c3608..d9bf627df5 100644
    15--- a/src/rpc/blockchain.h
    16+++ b/src/rpc/blockchain.h
    17@@ -21,6 +21,7 @@ class CBlockIndex;
    18 class Chainstate;
    19 class UniValue;
    20 namespace node {
    21+class BlockManager;
    22 struct NodeContext;
    23 } // namespace node
    24 
    25@@ -57,4 +58,7 @@ UniValue CreateUTXOSnapshot(
    26     const fs::path& path,
    27     const fs::path& tmppath);
    28 
    29+//! Return height of highest block that has been pruned, or std::nullopt if no blocks have been pruned
    30+std::optional<int> GetPruneHeight(node::BlockManager& blockman, const CChain& chain) EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
    31+
    32 #endif // BITCOIN_RPC_BLOCKCHAIN_H
    33diff --git a/src/test/blockchain_tests.cpp b/src/test/blockchain_tests.cpp
    34index be515a9eac..2a63d09795 100644
    35--- a/src/test/blockchain_tests.cpp
    36+++ b/src/test/blockchain_tests.cpp
    37@@ -5,7 +5,9 @@
    38 #include <boost/test/unit_test.hpp>
    39 
    40 #include <chain.h>
    41+#include <node/blockstorage.h>
    42 #include <rpc/blockchain.h>
    43+#include <sync.h>
    44 #include <test/util/setup_common.h>
    45 #include <util/string.h>
    46 
    47@@ -74,4 +76,36 @@ BOOST_AUTO_TEST_CASE(get_difficulty_for_very_high_target)
    48     TestDifficulty(0x12345678, 5913134931067755359633408.0);
    49 }
    50 
    51+//! Prune chain from height down to genesis block and check that
    52+//! GetPruneHeight returns the correct value
    53+static void CheckGetPruneHeight(node::BlockManager& blockman, CChain& chain, int height) EXCLUSIVE_LOCKS_REQUIRED(::cs_main)
    54+{
    55+    AssertLockHeld(::cs_main);
    56+
    57+    // Emulate pruning all blocks from `height` down to the genesis block
    58+    // by unsetting the `BLOCK_HAVE_DATA` flag from `nStatus`
    59+    for (CBlockIndex* it{chain[height]}; it != nullptr && it->nHeight > 0; it = it->pprev) {
    60+        it->nStatus &= ~BLOCK_HAVE_DATA;
    61+    }
    62+
    63+    const auto prune_height{GetPruneHeight(blockman, chain)};
    64+    BOOST_REQUIRE(prune_height.has_value());
    65+    BOOST_CHECK_EQUAL(*prune_height, height);
    66+}
    67+
    68+BOOST_FIXTURE_TEST_CASE(get_prune_height, TestChain100Setup)
    69+{
    70+    LOCK(::cs_main);
    71+    auto& chain = m_node.chainman->ActiveChain();
    72+    auto& blockman = m_node.chainman->m_blockman;
    73+
    74+    // Fresh chain of 100 blocks without any pruned blocks, so std::nullopt should be returned
    75+    BOOST_CHECK(!GetPruneHeight(blockman, chain).has_value());
    76+
    77+    // Start pruning
    78+    CheckGetPruneHeight(blockman, chain, 1);
    79+    CheckGetPruneHeight(blockman, chain, 99);
    80+    CheckGetPruneHeight(blockman, chain, 100);
    81+}
    82+
    83 BOOST_AUTO_TEST_SUITE_END()
    

    ryanofsky commented at 4:48 pm on June 3, 2024:

    re: #29668 (review)

    I think this is buggy, chain tip should be considered pruned if any of the BLOCK_HAVE_MASK flags are missing:

    Nice catch! Would be great to add the test to the first commit.


    fjahr commented at 4:20 pm on June 22, 2024:
    Cool, thanks! Added the fix and the test.
  52. in test/functional/feature_pruning.py:510 in 0501ec102f outdated
    505+        fetch_block = node.getblockhash(pruneheight - 1)
    506+
    507+        self.connect_nodes(1, 2)
    508+        peers = node.getpeerinfo()
    509+        peer_id = peers[0]["id"]
    510+        node.getblockfrompeer(fetch_block, peer_id)
    


    stickies-v commented at 4:20 pm on June 3, 2024:

    nit: peer_id makes things less readable imo

    0        node.getblockfrompeer(fetch_block, peers[0]["id"])
    

    fjahr commented at 4:19 pm on June 22, 2024:
    done
  53. in src/rpc/blockchain.cpp:786 in dca1ca1f56 outdated
    781@@ -782,6 +782,24 @@ static RPCHelpMan getblock()
    782     };
    783 }
    784 
    785+//! Return height of highest block that has been pruned, or std::nullopt if no blocks have been pruned
    


    ryanofsky commented at 4:42 pm on June 3, 2024:

    In commit “refactor, blockstorage: Generalize GetFirstStoredBlock” (dca1ca1f56d8e63929eb7aea64f8fb3e01752357)

    I think the code would be simpler (and have fewer cases to think about) if this function just returned -1 when no blocks were pruned, instead of nullopt:

     0--- a/src/rpc/blockchain.cpp
     1+++ b/src/rpc/blockchain.cpp
     2@@ -782,8 +782,8 @@ static RPCHelpMan getblock()
     3     };
     4 }
     5 
     6-//! Return height of highest block that has been pruned, or std::nullopt if no blocks have been pruned
     7-static std::optional<int> GetPruneHeight(BlockManager& blockman, const CChain& chain) EXCLUSIVE_LOCKS_REQUIRED(::cs_main) {
     8+//! Return height of highest block that has been pruned, or -1 if no blocks have been pruned
     9+int GetPruneHeight(BlockManager& blockman, const CChain& chain) EXCLUSIVE_LOCKS_REQUIRED(::cs_main) {
    10     AssertLockHeld(::cs_main);
    11 
    12     const CBlockIndex* chain_tip{chain.Tip()};
    13@@ -794,7 +794,7 @@ static std::optional<int> GetPruneHeight(BlockManager& blockman, const CChain& c
    14     const auto& first_unpruned{*Assert(blockman.GetFirstBlock(*chain_tip, /*status_mask=*/BLOCK_HAVE_DATA))};
    15     if (!first_unpruned.pprev) {
    16        // No block before the first unpruned block means nothing is pruned.
    17-       return std::nullopt;
    18+       return -1;
    19     }
    20     // Block before the first unpruned block is the last pruned block.
    21     return first_unpruned.pprev->nHeight;
    22@@ -852,7 +852,7 @@ static RPCHelpMan pruneblockchain()
    23     }
    24 
    25     PruneBlockFilesManual(active_chainstate, height);
    26-    return GetPruneHeight(chainman.m_blockman, active_chain).value_or(-1);
    27+    return GetPruneHeight(chainman.m_blockman, active_chain);
    28 },
    29     };
    30 }
    31@@ -1305,8 +1305,7 @@ RPCHelpMan getblockchaininfo()
    32     obj.pushKV("size_on_disk", chainman.m_blockman.CalculateCurrentUsage());
    33     obj.pushKV("pruned", chainman.m_blockman.IsPruneMode());
    34     if (chainman.m_blockman.IsPruneMode()) {
    35-        const auto prune_height{GetPruneHeight(chainman.m_blockman, active_chainstate.m_chain)};
    36-        obj.pushKV("pruneheight", prune_height ? prune_height.value() + 1 : 0);
    37+        obj.pushKV("pruneheight", GetPruneHeight(chainman.m_blockman, active_chainstate.m_chain) + 1);
    38 
    39         const bool automatic_pruning{chainman.m_blockman.GetPruneTarget() != BlockManager::PRUNE_TARGET_MANUAL};
    40         obj.pushKV("automatic_pruning",  automatic_pruning);
    

    fjahr commented at 4:20 pm on June 22, 2024:
    I disagree here. I think this is the correct use of an optional when we may or may not have a value to return. And I think it’s easier to understand because the code is more expressive. Handing a magical -1 around is just not intuitive IMO.
  54. ryanofsky approved
  55. ryanofsky commented at 4:54 pm on June 3, 2024: contributor
    Code review 0501ec102f3740049bd2e891cd83527c31b29222. Code looks very good, but the nStatus & BLOCK_HAVE_MASK bug found by stickies-v #29668 (review) needs to be fixed, and the other suggested changes could be considered too
  56. refactor, blockstorage: Generalize GetFirstStoredBlock
    GetFirstStoredBlock is generalized to check for any data status with a
    status mask that needs to be passed as a parameter. To reflect this the
    function is also renamed to GetFirstBlock.
    
    Co-authored-by: stickies-v <stickies-v@protonmail.com>
    96b4facc91
  57. fjahr force-pushed on Jun 22, 2024
  58. fjahr commented at 4:23 pm on June 22, 2024: contributor
    Addressed feedback from @stickies-v and @ryanofsky, thank you!
  59. DrahtBot commented at 5:37 pm on June 22, 2024: contributor

    🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the documentation.

    Possibly this is due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    Leave a comment here, if you need help tracking down a confusing failure.

    Debug: https://github.com/bitcoin/bitcoin/runs/26553131801

  60. DrahtBot added the label CI failed on Jun 22, 2024
  61. rpc: Make pruneheight also reflect undo data presence 4a1975008b
  62. doc: Add note to getblockfrompeer on missing undo data 8789dc8f31
  63. fjahr force-pushed on Jun 22, 2024
  64. DrahtBot removed the label CI failed on Jun 22, 2024
  65. in src/rpc/blockchain.cpp:432 in 8789dc8f31
    428@@ -429,6 +429,7 @@ static RPCHelpMan getblockfrompeer()
    429         "getblockfrompeer",
    430         "Attempt to fetch block from a given peer.\n\n"
    431         "We must have the header for this block, e.g. using submitheader.\n"
    432+        "The block will not have any undo data which can limit the usage of the block data in a context where the undo data is needed.\n"
    


    furszy commented at 4:48 pm on June 23, 2024:
    q: is the “undo data” term well known outside of core? It seems to be an implementation detail to me. Maybe, could write this differently: “The block data will be stored alone; there will be no available information associated with the outputs this block spent. This limits the usage.. etc”.

    fjahr commented at 10:19 pm on June 23, 2024:
    Hm, not sure how much it is known but it is referenced in other places already, like the getblock RPC help. And I’m not sure if a more generic text helps more people because it’s a quite technical reason and I don’t know how to explain it in a way that works for everyone and is still concise. With the term undo data people at least have something they can search on Bitcoin SE etc.

    furszy commented at 1:31 pm on June 24, 2024:

    I don’t know how to explain it in a way that works for everyone and is still concise

    No problem on leaving it as is if there is other RPC command using this term. What I wrote seems generally useful to me; it means that the node might not be able to access the spent outputs script and value. Maybe adding few scenarios where the undo data is needed could clarify it usage.

  66. in src/rpc/blockchain.cpp:1318 in 8789dc8f31
    1313@@ -1288,8 +1314,8 @@ RPCHelpMan getblockchaininfo()
    1314     obj.pushKV("size_on_disk", chainman.m_blockman.CalculateCurrentUsage());
    1315     obj.pushKV("pruned", chainman.m_blockman.IsPruneMode());
    1316     if (chainman.m_blockman.IsPruneMode()) {
    1317-        bool has_tip_data = tip.nStatus & BLOCK_HAVE_DATA;
    1318-        obj.pushKV("pruneheight", has_tip_data ? chainman.m_blockman.GetFirstStoredBlock(tip)->nHeight : tip.nHeight + 1);
    1319+        const auto prune_height{GetPruneHeight(chainman.m_blockman, active_chainstate.m_chain)};
    1320+        obj.pushKV("pruneheight", prune_height ? prune_height.value() + 1 : 0);
    


    furszy commented at 1:55 pm on June 24, 2024:
    little topic: maybe changing the RPC docs to say that “pruneheight” returns “the first block unpruned, all previous blocks were pruned” is more useful than saying “height of the last block pruned, plus one” as is now.

    fjahr commented at 4:02 pm on July 12, 2024:
    sounds good, taken as you suggested in #29770
  67. in src/rpc/blockchain.h:62 in 8789dc8f31
    57@@ -57,4 +58,7 @@ UniValue CreateUTXOSnapshot(
    58     const fs::path& path,
    59     const fs::path& tmppath);
    60 
    61+//! Return height of highest block that has been pruned, or std::nullopt if no blocks have been pruned
    62+std::optional<int> GetPruneHeight(const node::BlockManager& blockman, const CChain& chain) EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
    


    stickies-v commented at 2:27 pm on June 24, 2024:

    nit: missing some includes / fwd decls for the newly added line:

     0diff --git a/src/rpc/blockchain.h b/src/rpc/blockchain.h
     1index f6a7fe236c..23909c334e 100644
     2--- a/src/rpc/blockchain.h
     3+++ b/src/rpc/blockchain.h
     4@@ -9,15 +9,18 @@
     5 #include <core_io.h>
     6 #include <streams.h>
     7 #include <sync.h>
     8+#include <threadsafety.h>
     9 #include <util/fs.h>
    10 #include <validation.h>
    11 
    12 #include <any>
    13+#include <optional>
    14 #include <stdint.h>
    15 #include <vector>
    16 
    17 class CBlock;
    18 class CBlockIndex;
    19+class CChain;
    20 class Chainstate;
    21 class UniValue;
    22 namespace node {
    

    fjahr commented at 4:02 pm on July 12, 2024:
    Done in #29770
  68. in src/test/blockchain_tests.cpp:81 in 8789dc8f31
    75@@ -74,4 +76,36 @@ BOOST_AUTO_TEST_CASE(get_difficulty_for_very_high_target)
    76     TestDifficulty(0x12345678, 5913134931067755359633408.0);
    77 }
    78 
    79+//! Prune chain from height down to genesis block and check that
    80+//! GetPruneHeight returns the correct value
    81+static void CheckGetPruneHeight(node::BlockManager& blockman, CChain& chain, int height) EXCLUSIVE_LOCKS_REQUIRED(::cs_main)
    


    stickies-v commented at 2:37 pm on June 24, 2024:

    nit

    0static void CheckGetPruneHeight(const node::BlockManager& blockman, const CChain& chain, int height) EXCLUSIVE_LOCKS_REQUIRED(::cs_main)
    

    and related:

     0diff --git a/src/test/blockchain_tests.cpp b/src/test/blockchain_tests.cpp
     1index 2a63d09795..7a5d175c8c 100644
     2--- a/src/test/blockchain_tests.cpp
     3+++ b/src/test/blockchain_tests.cpp
     4@@ -96,8 +96,8 @@ static void CheckGetPruneHeight(node::BlockManager& blockman, CChain& chain, int
     5 BOOST_FIXTURE_TEST_CASE(get_prune_height, TestChain100Setup)
     6 {
     7     LOCK(::cs_main);
     8-    auto& chain = m_node.chainman->ActiveChain();
     9-    auto& blockman = m_node.chainman->m_blockman;
    10+    const auto& chain = m_node.chainman->ActiveChain();
    11+    const auto& blockman = m_node.chainman->m_blockman;
    12 
    13     // Fresh chain of 100 blocks without any pruned blocks, so std::nullopt should be returned
    14     BOOST_CHECK(!GetPruneHeight(blockman, chain).has_value());
    

    fjahr commented at 4:02 pm on July 12, 2024:
    Done in #29770
  69. in src/rpc/blockchain.cpp:802 in 4a1975008b outdated
    804+
    805+    // If the chain tip is pruned, everything is pruned.
    806+    if (!((chain_tip->nStatus & BLOCK_HAVE_MASK) == BLOCK_HAVE_MASK)) return chain_tip->nHeight;
    807+
    808+    const auto& first_unpruned{*Assert(blockman.GetFirstBlock(*chain_tip, /*status_mask=*/BLOCK_HAVE_MASK, first_block))};
    809+    if (&first_unpruned == first_block) {
    


    furszy commented at 2:38 pm on June 24, 2024:

    Can remove the back and forth.

    0    const CBlockIndex* first_unpruned{Assert(blockman.GetFirstBlock(*chain_tip, /*status_mask=*/BLOCK_HAVE_MASK, first_block))};
    1    if (first_unpruned == first_block) {
    

    fjahr commented at 4:02 pm on July 12, 2024:
    done in #29770
  70. in src/rpc/blockchain.cpp:799 in 4a1975008b outdated
    801-       return std::nullopt;
    802+    // If there are no blocks after the genesis block, or no blocks at all, nothing is pruned.
    803+    if (!first_block || !chain_tip) return std::nullopt;
    804+
    805+    // If the chain tip is pruned, everything is pruned.
    806+    if (!((chain_tip->nStatus & BLOCK_HAVE_MASK) == BLOCK_HAVE_MASK)) return chain_tip->nHeight;
    


    furszy commented at 2:41 pm on June 24, 2024:

    could be simpler:

    0    if ((chain_tip->nStatus & BLOCK_HAVE_MASK) != BLOCK_HAVE_MASK) return chain_tip->nHeight;
    

    fjahr commented at 4:02 pm on July 12, 2024:
    done in #29770
  71. furszy commented at 2:43 pm on June 24, 2024: member
    Code review ACK 8789dc8f315a9d9ad7142d831bc9412f780248e7. Left few nits.
  72. DrahtBot requested review from stickies-v on Jun 24, 2024
  73. DrahtBot requested review from TheCharlatan on Jun 24, 2024
  74. stickies-v approved
  75. stickies-v commented at 2:45 pm on June 24, 2024: contributor
    ACK 8789dc8f315a9d9ad7142d831bc9412f780248e7
  76. achow101 commented at 7:18 pm on July 10, 2024: member
    ACK 8789dc8f315a9d9ad7142d831bc9412f780248e7
  77. achow101 merged this on Jul 10, 2024
  78. achow101 closed this on Jul 10, 2024

  79. fjahr commented at 11:49 pm on July 10, 2024: contributor
    With this merged, I will take #29770 out of draft mode shortly and also address the left-over comments here. Thanks @stickies-v and @furszy !
  80. fjahr referenced this in commit 34fdeface4 on Jul 12, 2024

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-09-29 01:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me