blockstorage: Separate reindexing from saving new blocks #29975

pull mzumsande wants to merge 6 commits into bitcoin:master from mzumsande:202404_blockstorage_split_reindex changing 5 files +143 −105
  1. mzumsande commented at 9:30 pm on April 26, 2024: contributor

    SaveBlockToDisk / FindBlockPos are used for two purposes, depending on whether they are called during reindexing (dbp set,  fKnown = true) or in the “normal” case when adding new blocks (dbp == nullptr,  fKnown = false). The actual tasks are quite different

    • In normal mode, preparations for saving a new block are made, which is then saved: find the correct position on disk (maybe skipping to a new blk file), check for available disk space, update the blockfile info db, save the block.
    • during reindex, most of this is not necessary (the block is already on disk after all), only the blockfile info needs to rebuilt because reindex wiped the leveldb it’s saved in.

    Using one function with many conditional statements for this leads to code that is hard to read / understand and bug-prone:

    • many code paths in FindBlockPos are conditional on fKnown or !fKnown
    • It’s not really clear what actually needs to be done during reindex (we don’t need to “save a block to disk” or “find a block pos” as the function names suggest)
    • logic that should be applied to only one of the two modes is sometimes applied to both (see first commit, or #27039)

    #24858 and #27039 were recent bugs directly related to the differences between reindexing and normal mode, and in both cases the simple fix took a long time to be reviewed and merged.

    This PR proposes to clean this code up by splitting out the reindex logic into a separate function (UpdateBlockInfo) which will be called directly from validation. As a result, SaveBlockToDisk and FindBlockPos only need to cover the non-reindex logic.

  2. DrahtBot commented at 9:30 pm on April 26, 2024: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage

    For detailed information about the code coverage, see the test coverage report.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK paplorinc, TheCharlatan, ryanofsky
    Concept ACK furszy, BrandonOdiwuor

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #29700 (kernel, refactor: return error status on all fatal errors by ryanofsky)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. DrahtBot added the label Block storage on Apr 26, 2024
  4. mzumsande force-pushed on Apr 26, 2024
  5. DrahtBot commented at 9:42 pm on April 26, 2024: contributor

    🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the documentation.

    Possibly this is due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    Leave a comment here, if you need help tracking down a confusing failure.

    Debug: https://github.com/bitcoin/bitcoin/runs/24317664317

  6. DrahtBot added the label CI failed on Apr 26, 2024
  7. TheCharlatan commented at 9:57 pm on April 26, 2024: contributor
    Concept ACK
  8. DrahtBot removed the label CI failed on Apr 27, 2024
  9. in src/test/blockmanager_tests.cpp:189 in 39ad8d825e outdated
    188@@ -189,7 +189,7 @@ BOOST_AUTO_TEST_CASE(blockmanager_flush_block_file)
    189     // to block 2 location.
    190     CBlockFileInfo* block_data = blockman.GetBlockFileInfo(0);
    191     BOOST_CHECK_EQUAL(block_data->nBlocks, 2);
    192-    BOOST_CHECK(blockman.SaveBlockToDisk(block3, /*nHeight=*/3, /*dbp=*/&pos2) == pos2);
    193+    blockman.AddToBlockFileInfo(block3, /*nHeight=*/3, /*pos=*/pos2);
    


    paplorinc commented at 10:00 am on April 27, 2024:
    seems to me the related comments needs to be updated in this file (e.g line 184 and 199)

    mzumsande commented at 7:40 pm on April 29, 2024:
    I updated some comments. The entire test setup probably makes a bit less sense after the refactor, users unfamiliar with the history might ask themselves why someone could think that Reindex / AddToBlockFileInfo would change the block files so that we’d require a test making sure it doesn’t.
  10. in src/node/blockstorage.cpp:946 in 39ad8d825e outdated
    1002+void BlockManager::AddToBlockFileInfo(const CBlock& block, unsigned int nHeight, const FlatFilePos& pos)
    1003+{
    1004+    LOCK(cs_LastBlockFile);
    1005+
    1006+    unsigned int nAddSize = ::GetSerializeSize(TX_WITH_WITNESS(block));
    1007+    const BlockfileType chain_type = BlockfileTypeForHeight(nHeight);
    


    paplorinc commented at 12:28 pm on April 27, 2024:
    nit: is the naming style deliberate here? When is it camel and when snake?

    mzumsande commented at 7:36 pm on April 29, 2024:
    I took both from FindBlockPos, snake case is correct, but for historical reasons camel case is still used in lots of places. But since this is arguably new code I renamed nAddSize to added_size
  11. in src/node/blockstorage.cpp:866 in 39ad8d825e outdated
    862@@ -863,90 +863,104 @@ bool BlockManager::FindBlockPos(FlatFilePos& pos, unsigned int nAddSize, unsigne
    863     }
    864     const int last_blockfile = m_blockfile_cursors[chain_type]->file_num;
    865 
    866-    int nFile = fKnown ? pos.nFile : last_blockfile;
    867+    int nFile = last_blockfile;
    


    paplorinc commented at 12:41 pm on April 27, 2024:
    Does this reassignment still make sense?

    mzumsande commented at 7:35 pm on April 29, 2024:
    Yes, I think so. We later may increase nFile and then compare it to last_blockfile (line if (nFile != last_blockfile))
  12. in src/node/blockstorage.cpp:961 in 39ad8d825e outdated
    1013+
    1014+    if (static_cast<int>(m_blockfile_info.size()) <= nFile) {
    1015+        m_blockfile_info.resize(nFile + 1);
    1016+    }
    1017+    if (nFile != last_blockfile) {
    1018+        m_blockfile_cursors[chain_type] = BlockfileCursor{nFile};
    


    paplorinc commented at 12:42 pm on April 27, 2024:
    should we keep the original comment here?

    mzumsande commented at 7:35 pm on April 29, 2024:
    done.
  13. in src/node/blockstorage.h:164 in 39ad8d825e outdated
    160+     * Helper function performing various preparations before a block can be saved to disk:
    161+     * Locates the correct position for the block to be saved, which may be in the current or a new
    162+     * block file depending on nAddSize. May flush the previous blockfile to disk if full, updates
    163+     * blockfile info, and checks if there is enough disk space to save the block.
    164+     */
    165+    [[nodiscard]] bool FindBlockPos(FlatFilePos& pos, unsigned int nAddSize, unsigned int nHeight, uint64_t nTime);
    


    paplorinc commented at 12:46 pm on April 27, 2024:
    Does the comment on line 215 need any update after the change?

    mzumsande commented at 7:37 pm on April 29, 2024:
    You mean the comment “The ASSUMED state is initialized, when necessary, in FindBlockPos().”? That behavior is unchanged, it’s part of the normal usage of FindBlockPos and wasn’t happening during reindex anyway (see first commit), so I don’t think it needs to be updated.
  14. in src/validation.cpp:4348 in 39ad8d825e outdated
    4347-            state.Error(strprintf("%s: Failed to find position to write new block to disk", __func__));
    4348-            return false;
    4349+        FlatFilePos blockPos{};
    4350+        if (dbp) {
    4351+            blockPos = *dbp;
    4352+            m_blockman.AddToBlockFileInfo(block, pindex->nHeight, *dbp);
    


    paplorinc commented at 12:48 pm on April 27, 2024:

    we might as well use blockPos here

    0m_blockman.AddToBlockFileInfo(block, pindex->nHeight, blockPos);
    

    mzumsande commented at 7:37 pm on April 29, 2024:
    done
  15. in src/node/blockstorage.cpp:950 in 20efb411bd outdated
    1006+    unsigned int nAddSize = ::GetSerializeSize(TX_WITH_WITNESS(block));
    1007+    const BlockfileType chain_type = BlockfileTypeForHeight(nHeight);
    1008+    // -reindex and -reindex-chainstate delete any snapshot chainstate during init
    1009+    Assume(chain_type == BlockfileType::NORMAL);
    1010+
    1011+    const int last_blockfile = m_blockfile_cursors[chain_type]->file_num;
    


    paplorinc commented at 12:55 pm on April 27, 2024:
    we could move this closer to the usage

    mzumsande commented at 7:38 pm on April 29, 2024:
    done
  16. in src/node/blockstorage.cpp:951 in 20efb411bd outdated
    1004+    LOCK(cs_LastBlockFile);
    1005+
    1006+    unsigned int nAddSize = ::GetSerializeSize(TX_WITH_WITNESS(block));
    1007+    const BlockfileType chain_type = BlockfileTypeForHeight(nHeight);
    1008+    // -reindex and -reindex-chainstate delete any snapshot chainstate during init
    1009+    Assume(chain_type == BlockfileType::NORMAL);
    


    paplorinc commented at 12:57 pm on April 27, 2024:

    This is the reason why

    0if (!m_blockfile_cursors[chain_type]) {
    1    // If a snapshot is loaded during runtime, we may not have initialized this cursor yet.
    2    assert(chain_type == BlockfileType::ASSUMED);
    3    const auto new_cursor = BlockfileCursor{this->MaxBlockfileNum() + 1};
    4    m_blockfile_cursors[chain_type] = new_cursor;
    5    LogPrint(BCLog::BLOCKSTORAGE, "[%s] initializing blockfile cursor to %s\n", chain_type, new_cursor);
    6}
    

    is not applicable here, right? Would it make sense to also assume m_blockfile_cursors[chain_type]?


    mzumsande commented at 7:39 pm on April 29, 2024:

    Yes. I added this in the first commit, so that the second commit doesn’t change behavior.

    As for your second q: Well, it’d be undefined behavior if m_blockfile_cursors didn’t haven an element for BlockfileType::NORMAL. On the other hand, m_blockfile_cursors[...] is used all over the place, I’m not sure if we want to have an assert for each occurence. Other opinions?


    ryanofsky commented at 7:53 pm on May 6, 2024:

    In commit “blockstorage: split up FindBlockPos function” (a2ae0a33c7c30678721d7e7d37d8e6170b013383)

    re: #29975 (review)

    I’m not sure if we want to have an assert for each occurence. Other opinions?

    Not sure about other places, but it seems worth asserting here to avoid undefined behavior on line 955 and to be able to document assumptions this code is making. I suggested a comment in the previous commit that explains both assumptions and could be reused here (https://github.com/bitcoin/bitcoin/pull/29975#discussion_r1591461829).


    mzumsande commented at 10:24 pm on May 8, 2024:
    Added the assumptions.
  17. paplorinc commented at 12:59 pm on April 27, 2024: contributor
    Recreated the change to understand it better, commented on what I’ve noticed.
  18. furszy commented at 2:04 pm on April 27, 2024: member
    Concept ACK
  19. BrandonOdiwuor commented at 11:50 am on April 29, 2024: contributor
    Concept ACK
  20. mzumsande force-pushed on Apr 29, 2024
  21. mzumsande commented at 7:42 pm on April 29, 2024: contributor
    39ad8d8 to 194e84a: Addressed review feedback by @paplorinc, thanks!
  22. in src/validation.cpp:4347 in a17eacab1f outdated
    4346-        if (blockPos.IsNull()) {
    4347-            state.Error(strprintf("%s: Failed to find position to write new block to disk", __func__));
    4348-            return false;
    4349+        FlatFilePos blockPos{};
    4350+        if (dbp) {
    4351+            blockPos = *dbp;
    


    ryanofsky commented at 5:32 pm on May 6, 2024:

    In commit “validation, blockstorage: Separate code paths for reindex and saving new blocks” (a17eacab1f8790afc5f89ba2ee3e34da4c9369e1)

    It looks like previously there would have been an error here if dbp->IsNull() was true, and now there will not be an error. This is probably a good change, since AcceptBlock should not be looking at block positions, just passing them on.


    mzumsande commented at 8:11 pm on May 8, 2024:
    Yes, in the reindex case; In this case the passed dbp isn’t changed (it’s now a const arg to AddToBlockFileInfo). If a dpb was passed to AcceptBlock for which dbp->IsNull(), the error message (“Failed to find position to write new block to disk”) would have been very confusing anyway, because we don’t write a block to disk during reindex anyway.
  23. in src/node/blockstorage.cpp:859 in d6b0bb6dd0 outdated
    852@@ -853,8 +853,10 @@ bool BlockManager::FindBlockPos(FlatFilePos& pos, unsigned int nAddSize, unsigne
    853     LOCK(cs_LastBlockFile);
    854 
    855     const BlockfileType chain_type = BlockfileTypeForHeight(nHeight);
    856+    // -reindex and -reindex-chainstate delete any snapshot chainstate during init
    857+    if (fKnown) Assume(chain_type == BlockfileType::NORMAL);
    858 
    859-    if (!m_blockfile_cursors[chain_type]) {
    860+    if (!fKnown && !m_blockfile_cursors[chain_type]) {
    


    ryanofsky commented at 7:28 pm on May 6, 2024:

    In commit “blockstorage: Add Assume for fKnown / snapshot chainstate” (d6b0bb6dd0d26f3e10386d1afbafd8d52a12b2c5)

    I don’t think changing this if statement is good.

    Adding the Assume call above seems good, since it provides information about the context this code is called in and could potentially catch bugs if the code is run in an unanticipated state.

    But It’s less clear what benefit there is to adding the !fKnown && condition to this if statement. It just makes the logic more complicated without providing any extra explanation. And if outside code were changed such that fKnown was false while the cursor was null, undefined behavior would now happen on line 866 below.

    I think this commit would be a clearer if it avoided changing any existing logic and instead just added two Assume statements:

    0// Check that chain type is NORMAL if fKnown is true, because fKnown is only
    1// true during reindexing, and reindexing deletes snapshot chainstates, so
    2// chain_type will not be SNAPSHOT. Also check that cursor exists, because
    3// the normal cursor should never be null.
    4if (fKnown) {
    5    Assume(chain_type == BlockfileType::NORMAL);
    6    Assume(m_blockfile_cursors[chain_type]);    
    7}
    

    mzumsande commented at 7:28 pm on May 8, 2024:

    The reason I added it was to make the following commit, in which every line of code that can be reached with fKnown==true is moved into its own function, a mechanical refactor that is easier to review: Lines dependent on fKnown move, lines dependent on !fKnown stay, independent lines go into both. Without it, there would be the question why this block of code does not make it into AddToBlockFileInfo. So the reason was to move the potential behavior change (which would only be an actual one if our assumptions about reindexing were incorrect) into its own commit.

    Happy to add the suggestion though.


    ryanofsky commented at 3:22 pm on May 9, 2024:

    re: #29975 (review)

    That makes sense. I didn’t realize that. I can see how it makes the next commit more straightforward, at the cost of introducing a slightly mysterious change to this commit and adding a little more churn to the PR as a whole. Could be a good thing, as the next commit is the most complicated one, so either approach seems fine.

  24. in src/node/blockstorage.cpp:856 in d6b0bb6dd0 outdated
    852@@ -853,8 +853,10 @@ bool BlockManager::FindBlockPos(FlatFilePos& pos, unsigned int nAddSize, unsigne
    853     LOCK(cs_LastBlockFile);
    854 
    855     const BlockfileType chain_type = BlockfileTypeForHeight(nHeight);
    856+    // -reindex and -reindex-chainstate delete any snapshot chainstate during init
    


    ryanofsky commented at 7:29 pm on May 6, 2024:

    In commit “blockstorage: Add Assume for fKnown / snapshot chainstate” (d6b0bb6dd0d26f3e10386d1afbafd8d52a12b2c5)

    This comment seems really disconnected from the statement below it, because the statement does not mention reindexing or the snapshot chainstate at all. I left a suggestion to improve the comment below (https://github.com/bitcoin/bitcoin/pull/29975#discussion_r1591461829).


    mzumsande commented at 10:23 pm on May 8, 2024:
    I extended the comment according to the suggestion above.
  25. in src/node/blockstorage.cpp:914 in a2ae0a33c7 outdated
    955+        // here, and we crash, there is no expected additional block data
    956+        // inconsistency arising from the flush failure here. However, the undo
    957+        // data may be inconsistent after a crash if the flush is called during
    958+        // a reindex. A flush error might also leave some of the data files
    959+        // untrimmed.
    960+        if (!FlushBlockFile(last_blockfile, true, finalize_undo)) {
    


    ryanofsky commented at 7:45 pm on May 6, 2024:

    In commit “blockstorage: split up FindBlockPos function” (a2ae0a33c7c30678721d7e7d37d8e6170b013383)

    Would be helpful to clarify with meaning of true with /*fFinalize*/=true


    mzumsande commented at 10:23 pm on May 8, 2024:
    Done
  26. in src/node/blockstorage.cpp:851 in a2ae0a33c7 outdated
    847@@ -848,15 +848,13 @@ fs::path BlockManager::GetBlockPosFilename(const FlatFilePos& pos) const
    848     return BlockFileSeq().FileName(pos);
    849 }
    850 
    851-bool BlockManager::FindBlockPos(FlatFilePos& pos, unsigned int nAddSize, unsigned int nHeight, uint64_t nTime, bool fKnown)
    852+bool BlockManager::FindBlockPos(FlatFilePos& pos, unsigned int nAddSize, unsigned int nHeight, uint64_t nTime)
    


    ryanofsky commented at 3:45 pm on May 7, 2024:

    In commit “blockstorage: split up FindBlockPos function” (a2ae0a33c7c30678721d7e7d37d8e6170b013383)

    Two suggestions, maybe for later commits or a followup PR:

    Now that pos is an output parameter instead of being an in/out parameter, it would be better to just drop it entirely and make FindBlockPos return FlatFilePos like SaveBlockToDisk, instead of returning bool. This would make it more obvious what the function inputs and outputs are, and also make sure the output value is consistently initialized.

    Also, I think it would make sense to rename FindBlockPos to FindNextBlockPos to be clear this function is only called to find the position where the next block should be written, and no longer has anything to do with positions of existing blocks.


    mzumsande commented at 9:13 pm on May 10, 2024:
    Done as an extra commit at the end.
  27. in src/node/blockstorage.h:164 in a2ae0a33c7 outdated
    160+    /** Update blockfile info while processing a block during reindex. The block must be available on disk.
    161+     *
    162+     * @param[in]  block        the block being processed
    163+     * @param[in]  nHeight      the height of the block
    164+     * @param[in]  pos          the position of the block on disk. This must point *after* the
    165+     *                          8 byte serialization header, at the beginning of the actual block data.
    


    ryanofsky commented at 4:08 pm on May 7, 2024:

    In commit “blockstorage: split up FindBlockPos function” (a2ae0a33c7c30678721d7e7d37d8e6170b013383)

    I think this description is technically accurate, but I got confused by it and thought it was wrong because “the 8 byte serialization header” sounds like something that is part of CBlock serialization, when actually it is referring to separator fields written by WriteBlockToDisk before the serialized CBlock.

    Would suggest changing comment to “pos: the position of the serialized CBlock on disk. This is the position returned by WriteBlockToDisk pointing at the CBlock, not the separator fields before it.”

    I would also suggesting adding two more comments to this commit to make it clear what it happening at this stage of the PR.

    In WriteBlockToDisk documentation, “// The pos argument passed to this function is modified by this call. Before this call, it should point to an unused file location where separator fields will be written followed by the serialized CBlock data. After this call, it will point to the beginning of the serialized CBlock data, after the separator fields”

    In FindBlockPos documentation, “// The nAddSize argument passed to this function should include not just the size of the serialized CBlock, but the also size of separator fields which are written before it by WriteBlockToDisk (BLOCK_SERIALIZATION_HEADER_SIZE).


    mzumsande commented at 9:13 pm on May 10, 2024:
    I have done that, but bundled most of the doc changes to a doc-only commit at the beginning in order to not have unrelated things in the “blockstorage: split up FindBlockPos function” commit.
  28. in src/node/blockstorage.cpp:1155 in a2ae0a33c7 outdated
    1151@@ -1139,17 +1152,17 @@ FlatFilePos BlockManager::SaveBlockToDisk(const CBlock& block, int nHeight, cons
    1152     const auto position_known {dbp != nullptr};
    1153     if (position_known) {
    1154         blockPos = *dbp;
    1155+        // During reindex, no blocks need to be written, only the blockfile info database needs to be rebuilt
    


    ryanofsky commented at 5:16 pm on May 7, 2024:

    In commit “blockstorage: split up FindBlockPos function” (a2ae0a33c7c30678721d7e7d37d8e6170b013383)

    I think this comment is a little confusing, because it isn’t obvious that position_known can only be true during reindexing. Could potentially clarify this, though not necessary since this code will be deleted in the next commit.


    mzumsande commented at 10:26 pm on May 8, 2024:
    I added a clarification.
  29. in src/node/blockstorage.h:166 in a2ae0a33c7 outdated
    162+     * @param[in]  block        the block being processed
    163+     * @param[in]  nHeight      the height of the block
    164+     * @param[in]  pos          the position of the block on disk. This must point *after* the
    165+     *                          8 byte serialization header, at the beginning of the actual block data.
    166+     */
    167+    void AddToBlockFileInfo(const CBlock& block, unsigned int nHeight, const FlatFilePos& pos);
    


    ryanofsky commented at 5:23 pm on May 7, 2024:

    In commit “blockstorage: split up FindBlockPos function” (a2ae0a33c7c30678721d7e7d37d8e6170b013383)

    I think SaveBlockPos would be a less ambiguous name for this function than AddToBlockFileInfo. It would also be consistent with FindBlockPos and I think make the AcceptBlock code more obvious (like if (dbp) SaveBlockPos(...) else SaveBlockToDisk(...))

    EDIT: SaveBlockInfo or UpdateBlockInfo might be a better name than SaveBlockPos since block position isn’t really what is updated here, other information is just updated based on the position.


    mzumsande commented at 10:26 pm on May 8, 2024:
    Renamed to UpdateBlockInfo
  30. in src/node/blockstorage.h:159 in a2ae0a33c7 outdated
    154@@ -155,7 +155,15 @@ class BlockManager
    155     /** Return false if undo file flushing fails. */
    156     [[nodiscard]] bool FlushUndoFile(int block_file, bool finalize = false);
    157 
    158-    [[nodiscard]] bool FindBlockPos(FlatFilePos& pos, unsigned int nAddSize, unsigned int nHeight, uint64_t nTime, bool fKnown);
    159+    [[nodiscard]] bool FindBlockPos(FlatFilePos& pos, unsigned int nAddSize, unsigned int nHeight, uint64_t nTime);
    160+    /** Update blockfile info while processing a block during reindex. The block must be available on disk.
    


    ryanofsky commented at 5:31 pm on May 7, 2024:

    In commit “blockstorage: split up FindBlockPos function” (a2ae0a33c7c30678721d7e7d37d8e6170b013383)

    Instead of adding this method and comment in the private section of the class this commit, and then moving it to the public section of the class in the next commit, and I think it would be better just to add it to the public section initially to make the PR easier to review, since it already makes sense as part of public interface.


    mzumsande commented at 10:27 pm on May 8, 2024:
    done
  31. in src/test/blockmanager_tests.cpp:183 in a17eacab1f outdated
    179@@ -181,22 +180,19 @@ BOOST_AUTO_TEST_CASE(blockmanager_flush_block_file)
    180         BOOST_CHECK_EQUAL(read_block.nVersion, 2);
    181     }
    182 
    183-    // When FlatFilePos* dbp is given, SaveBlockToDisk() will not write or
    184-    // overwrite anything to the flat file block storage. It will, however,
    185-    // update the blockfile metadata. This is to facilitate reindexing
    186-    // when the user has the blocks on disk but the metadata is being rebuilt.
    187+    // During reindex, the flat file block storage will not be written do.
    


    ryanofsky commented at 5:38 pm on May 7, 2024:

    In commit “validation, blockstorage: Separate code paths for reindex and saving new blocks” (a17eacab1f8790afc5f89ba2ee3e34da4c9369e1)

    s/do/to/


    mzumsande commented at 10:29 pm on May 8, 2024:
    fixed
  32. ryanofsky approved
  33. ryanofsky commented at 6:02 pm on May 7, 2024: contributor
    Code review ACK 194e84accced947ef63c6db389bc62a2b58cffa3. I left a lot of comments, but everything looks right here and the code is a lot nicer than before.
  34. DrahtBot requested review from BrandonOdiwuor on May 7, 2024
  35. DrahtBot requested review from furszy on May 7, 2024
  36. DrahtBot requested review from TheCharlatan on May 7, 2024
  37. ryanofsky commented at 6:45 pm on May 7, 2024: contributor

    With 194e84accced947ef63c6db389bc62a2b58cffa3, since reindexing regenerates undo data, and undo data shouldn’t be added until all existing blocks are, it seems like there is no reason for the AddToBlockFileInfo function to worry about resetting the BlockfileCursor::undo_file field or even accessing the block storage cursors at all. So I think the following simplification would make sense: [ EDIT: This is wrong and the cursors do need to be updated to record the max file number. The diff below is wrong and a better change would be #29975 (review) ]

     0--- a/src/node/blockstorage.cpp
     1+++ b/src/node/blockstorage.cpp
     2@@ -941,22 +941,11 @@ bool BlockManager::FindBlockPos(FlatFilePos& pos, unsigned int nAddSize, unsigne
     3 void BlockManager::AddToBlockFileInfo(const CBlock& block, unsigned int nHeight, const FlatFilePos& pos)
     4 {
     5     LOCK(cs_LastBlockFile);
     6-
     7     const unsigned int added_size = ::GetSerializeSize(TX_WITH_WITNESS(block));
     8-    const BlockfileType chain_type = BlockfileTypeForHeight(nHeight);
     9-    // -reindex and -reindex-chainstate delete any snapshot chainstate during init
    10-    Assume(chain_type == BlockfileType::NORMAL);
    11-
    12     const int nFile = pos.nFile;
    13     if (static_cast<int>(m_blockfile_info.size()) <= nFile) {
    14         m_blockfile_info.resize(nFile + 1);
    15     }
    16-
    17-    const int last_blockfile = m_blockfile_cursors[chain_type]->file_num;
    18-    if (nFile != last_blockfile) {
    19-        // No undo data yet in the new file, so reset our undo-height tracking.
    20-        m_blockfile_cursors[chain_type] = BlockfileCursor{nFile};
    21-    }
    22     m_blockfile_info[nFile].AddBlock(nHeight, block.GetBlockTime());
    23     m_blockfile_info[nFile].nSize = std::max(pos.nPos + added_size, m_blockfile_info[nFile].nSize);
    24     m_dirty_fileinfo.insert(nFile);
    
  38. blockstorage: Add Assume for fKnown / snapshot chainstate
    fKnown is true during reindex (and only then), which deletes
    any existing snapshot chainstate. As a result, this function can never
    be called wth fKnown set and a snapshot chainstate.
    Add an Assume for this, and make the code initializing a blockfile cursor
    for the snapshot conditional on !fKnown.
    
    This is a preparation for splitting the reindex logic out of
    FindBlockPos in the following commits.
    0d114e3cb2
  39. mzumsande force-pushed on May 8, 2024
  40. mzumsande commented at 10:32 pm on May 8, 2024: contributor
    Thanks for the detailed review @ryanofsky! With the latest push, I addressed the feedback partially, see in particular #29975 (review). I will address the remaining comments soon.
  41. ryanofsky approved
  42. ryanofsky commented at 3:33 pm on May 9, 2024: contributor
    Code review ACK 6a22eede2083616ecc7558a16d8189c22b46403d. Just some suggested changes were made since the last review: adding more Assume checks, renaming a function, and moving a declaration. Everything still looks good.
  43. in src/node/blockstorage.cpp:960 in 2b6d274af0 outdated
    1016+        m_blockfile_info.resize(nFile + 1);
    1017+    }
    1018+
    1019+    const int last_blockfile = m_blockfile_cursors[chain_type]->file_num;
    1020+    if (nFile != last_blockfile) {
    1021+        // No undo data yet in the new file, so reset our undo-height tracking.
    


    ryanofsky commented at 4:11 pm on May 9, 2024:

    In commit “blockstorage: split up FindBlockPos function” (2b6d274af050ea21e39ce59b6a6b3b7fb61e8cbd)

    I think this comment is not really accurate in this context. There should be no need to reset BlockfileCursor::undo_height here, only to set a new BlockfileCursor::file_num value so MaxBlockfileNum() returns the right thing. So the comment could be changed to something like “update the cursor so it points to the last file”.

    Maybe it would also make sense to make this a little more robust so doesn’t move the cursor backwards if UpdateBlockInfo calls are made out of order. We are already doing this for blocks within the file by using std::max to set the file size below. Generalizing this would also allow dropping the Assume() checks:

     0--- a/src/node/blockstorage.cpp
     1+++ b/src/node/blockstorage.cpp
     2@@ -942,24 +942,19 @@ void BlockManager::UpdateBlockInfo(const CBlock& block, unsigned int nHeight, co
     3 {
     4     LOCK(cs_LastBlockFile);
     5 
     6+    // Update the cursor so it points to the last file.
     7+    const BlockfileType chain_type{BlockfileTypeForHeight(nHeight)};
     8+    auto& cursor{m_blockfile_cursors[chain_type]};
     9+    if (!cursor || cursor->file_num < pos.nFile) {
    10+        m_blockfile_cursors[chain_type] = BlockfileCursor{pos.nFile};
    11+    }
    12+
    13+    // Update the file information so it points to the last block.
    14     const unsigned int added_size = ::GetSerializeSize(TX_WITH_WITNESS(block));
    15-    const BlockfileType chain_type = BlockfileTypeForHeight(nHeight);
    16-    // Check that chain type is NORMAL, because this function is only
    17-    // called during reindexing, and reindexing deletes snapshot chainstates, so
    18-    // chain_type will not be SNAPSHOT. Also check that cursor exists, because
    19-    // the normal cursor should never be null.
    20-    Assume(chain_type == BlockfileType::NORMAL);
    21-    Assume(m_blockfile_cursors[chain_type]);
    22     const int nFile = pos.nFile;
    23     if (static_cast<int>(m_blockfile_info.size()) <= nFile) {
    24         m_blockfile_info.resize(nFile + 1);
    25     }
    26-
    27-    const int last_blockfile = m_blockfile_cursors[chain_type]->file_num;
    28-    if (nFile != last_blockfile) {
    29-        // No undo data yet in the new file, so reset our undo-height tracking.
    30-        m_blockfile_cursors[chain_type] = BlockfileCursor{nFile};
    31-    }
    32     m_blockfile_info[nFile].AddBlock(nHeight, block.GetBlockTime());
    33     m_blockfile_info[nFile].nSize = std::max(pos.nPos + added_size, m_blockfile_info[nFile].nSize);
    34     m_dirty_fileinfo.insert(nFile);
    

    mzumsande commented at 9:44 pm on May 9, 2024:

    The current logic seems really brittle to me, I suspected that there might be a bug hiding somewhere, but it barely works out in all scenarios I could think of. With blocks being save on disk out of order, it is possible that at the end of a reindex, the cursor points to an older block file. Yet, it appears that nothing really bad happens as a result: When a new blocks arrives, FindBlockPos will still find the correct position in this while loop, skipping ahead. In a similar way, MaxBlockfileNum() being incorrect can result in an incorrect DB_LAST_BLOCK being written to disk on shutdown. However, we also recover from that because at next startup, LoadBlockIndexDB uses DB_LAST_BLOCK basically only for logging but doesn’t trust it’s actually pointing to the last block, searching for more db entries here, so only the log messages on startup will be wrong.

    This seems really fragile to me, so I think that your suggestion is a good idea, and I’ll add a commit for it!


    mzumsande commented at 9:14 pm on May 10, 2024:
    I have added this with the latest push. Changed the // Update the file information so it points to the last block. comment slightly because the file information doesn’t really point to anything.
  44. mzumsande force-pushed on May 10, 2024
  45. mzumsande force-pushed on May 10, 2024
  46. DrahtBot commented at 9:18 pm on May 10, 2024: contributor

    🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the documentation.

    Possibly this is due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    Leave a comment here, if you need help tracking down a confusing failure.

    Debug: https://github.com/bitcoin/bitcoin/runs/24838737665

  47. DrahtBot added the label CI failed on May 10, 2024
  48. mzumsande commented at 9:25 pm on May 10, 2024: contributor

    6a22eed to 9cf475f:

    I think I addressed all feedback now. Added 2 additional commits (one for renaming / updating FindNextBlockPos, one for not moving the cursor backwards in UpdateBlockInfo) and reorganized the other commit slightly by having a doc-only commit at the beginning.

  49. DrahtBot removed the label CI failed on May 10, 2024
  50. in src/node/blockstorage.h:164 in 9cf475ffff outdated
    160+     * Helper function performing various preparations before a block can be saved to disk:
    161+     * Returns the correct position for the block to be saved, which may be in the current or a new
    162+     * block file depending on nAddSize. May flush the previous blockfile to disk if full, updates
    163+     * blockfile info, and checks if there is enough disk space to save the block.
    164+     *
    165+     * The nAddSize argument passed to this function should include not just the size of the serialized CBlock, but the also size of
    


    TheCharlatan commented at 8:33 pm on May 11, 2024:
    Nit: s/the also/also the/
  51. TheCharlatan approved
  52. TheCharlatan commented at 8:38 pm on May 11, 2024: contributor
    ACK 9cf475ffffb869cd55c2b2f3be84d7c90b199521
  53. DrahtBot requested review from ryanofsky on May 11, 2024
  54. in src/node/blockstorage.h:339 in f90098e57d outdated
    331+     * @param[in]  nHeight      the height of the block
    332+     *
    333+     * @returns in case of success, the position to which the block was written to
    334+     *          in case of an error, an empty FlatFilePos
    335+     */
    336+
    


    paplorinc commented at 9:45 am on May 12, 2024:
    nit: extra line
  55. in src/node/blockstorage.h:173 in f90098e57d outdated
    167@@ -164,6 +168,12 @@ class BlockManager
    168 
    169     AutoFile OpenUndoFile(const FlatFilePos& pos, bool fReadOnly = false) const;
    170 
    171+    /**
    172+     * Write a block to disk. The pos argument passed to this function is modified by this call. Before this call, it should
    173+     * point to an unused file location where separator fields will be written followed by the serialized CBlock data.
    


    paplorinc commented at 9:47 am on May 12, 2024:

    nit:

    0     * point to an unused file location where separator fields will be written, followed by the serialized CBlock data.
    
  56. in src/node/blockstorage.h:326 in f90098e57d outdated
    321@@ -312,7 +322,17 @@ class BlockManager
    322     bool WriteUndoDataForBlock(const CBlockUndo& blockundo, BlockValidationState& state, CBlockIndex& block)
    323         EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
    324 
    325-    /** Store block on disk. If dbp is not nullptr, then it provides the known position of the block within a block file on disk. */
    326+    /** Store block on disk and update block file statistics.
    327+     *  During reindex, only the block file statistics is updated.
    


    paplorinc commented at 9:49 am on May 12, 2024:

    nit:

    0     *  During reindex, only the block file statistics are updated.
    
  57. in src/node/blockstorage.cpp:1156 in 2a355dd2fb outdated
    1170-        if (!WriteBlockToDisk(block, blockPos)) {
    1171-            m_opts.notifications.fatalError(_("Failed to write block."));
    1172-            return FlatFilePos();
    1173-        }
    1174+    // Must also account for the serialization header
    1175+    // (the 4 magic message start bytes + the 4 length bytes = 8 bytes = BLOCK_SERIALIZATION_HEADER_SIZE)
    


    paplorinc commented at 9:53 am on May 12, 2024:

    nit:

    0    // Account for the 4 magic message start bytes + the 4 length bytes (8 bytes total, defined as BLOCK_SERIALIZATION_HEADER_SIZE).
    

    mzumsande commented at 6:55 pm on May 14, 2024:
    done
  58. in src/node/blockstorage.h:167 in d5904bd250 outdated
    163+     *
    164      * The nAddSize argument passed to this function should include not just the size of the serialized CBlock, but the also size of
    165      * separator fields which are written before it by WriteBlockToDisk (BLOCK_SERIALIZATION_HEADER_SIZE).
    166      */
    167-    [[nodiscard]] bool FindBlockPos(FlatFilePos& pos, unsigned int nAddSize, unsigned int nHeight, uint64_t nTime);
    168+    [[nodiscard]] FlatFilePos FindNextBlockPos(unsigned int nAddSize, unsigned int nHeight, uint64_t nTime);
    


    paplorinc commented at 9:55 am on May 12, 2024:
    does [[nodiscard]] still make sense now?

    ryanofsky commented at 3:15 pm on May 13, 2024:

    re: #29975 (review)

    does [[nodiscard]] still make sense now?

    I think it does make sense, since the function can still fail and checking the return value is the only way to know.


    mzumsande commented at 6:56 pm on May 14, 2024:
    Agree. Not only for error-checking, we’d also expect callers to do something with the FlatFilePos that FindNextBlockPos now returns.

    paplorinc commented at 8:52 pm on May 14, 2024:
    Would it still make sense to call this method and discard the returned value, like before, when the output param was mutated?

    mzumsande commented at 9:33 pm on May 14, 2024:
    I don’t think so, with the way the function is now only called if we don’t know the BlockPos already it seems logical for callers to do something with the result. Of course, if someone comes up with such a use case in the future, they can just remove the [[nodiscard]] - after all its point is just to prevent future programming errors.
  59. in src/node/blockstorage.cpp:851 in d5904bd250 outdated
    847@@ -848,7 +848,7 @@ fs::path BlockManager::GetBlockPosFilename(const FlatFilePos& pos) const
    848     return BlockFileSeq().FileName(pos);
    849 }
    850 
    851-bool BlockManager::FindBlockPos(FlatFilePos& pos, unsigned int nAddSize, unsigned int nHeight, uint64_t nTime)
    852+FlatFilePos BlockManager::FindNextBlockPos(unsigned int nAddSize, unsigned int nHeight, uint64_t nTime)
    


    paplorinc commented at 9:55 am on May 12, 2024:
    +1

    mzumsande commented at 6:58 pm on May 14, 2024:
    Sorry, missing the context here, is this suggesting a change?

    paplorinc commented at 8:52 pm on May 14, 2024:
    Ah, no, I just like this change
  60. in src/node/blockstorage.cpp:945 in 9cf475ffff outdated
    942@@ -943,24 +943,19 @@ void BlockManager::UpdateBlockInfo(const CBlock& block, unsigned int nHeight, co
    943 {
    944     LOCK(cs_LastBlockFile);
    945 
    946+    // Update the cursor so it points to the last file.
    


    paplorinc commented at 9:58 am on May 12, 2024:

    nit (in commit message):

    0This would means that MaxBlockfileNum
    

    vs

    0This would mean that MaxBlockfileNum
    

    mzumsande commented at 6:59 pm on May 14, 2024:
    fixed
  61. paplorinc approved
  62. paplorinc commented at 9:59 am on May 12, 2024: contributor
    Nice, only left a few nit comments
  63. in src/node/blockstorage.h:159 in f90098e57d outdated
    154@@ -155,6 +155,10 @@ class BlockManager
    155     /** Return false if undo file flushing fails. */
    156     [[nodiscard]] bool FlushUndoFile(int block_file, bool finalize = false);
    157 
    158+    /**
    159+     * The nAddSize argument passed to this function should include not just the size of the serialized CBlock, but the also size of
    


    ryanofsky commented at 2:44 pm on May 13, 2024:

    In commit “doc: Improve doc for functions involved in saving blocks to disk” (f90098e57dd8dc77a9788f5af7a529b32ca37df6)

    This comment is wrong at this point in the PR. The way FindBlockPos works right now, you only pass it the size of the serialized CBlock plus the size of the separator fields when fKnown is false. When fKnown is true, you are supposed to pass it just the size of serialized CBlock, without the size of the separator fields.


    mzumsande commented at 6:59 pm on May 14, 2024:
    Expanded the comment with the fKnown behavior.
  64. in src/node/blockstorage.h:327 in f90098e57d outdated
    321@@ -312,7 +322,17 @@ class BlockManager
    322     bool WriteUndoDataForBlock(const CBlockUndo& blockundo, BlockValidationState& state, CBlockIndex& block)
    323         EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
    324 
    325-    /** Store block on disk. If dbp is not nullptr, then it provides the known position of the block within a block file on disk. */
    326+    /** Store block on disk and update block file statistics.
    327+     *  During reindex, only the block file statistics is updated.
    328+     *  In that case, dbp is not nullptr and provides the known position of the block within a block file on disk.
    


    ryanofsky commented at 2:59 pm on May 13, 2024:

    In commit “doc: Improve doc for functions involved in saving blocks to disk” (f90098e57dd8dc77a9788f5af7a529b32ca37df6)

    I think it is confusing that this says “during reindex, only the statistics are updated” instead of “If the block is already stored, only the statistics are updated.” It is true that this function is only called with already-stored blocks during reindexing, but that doesn’t seem obvious, and you wouldn’t know it without checking every code path that calls this function.

    Would suggest changing to something like “If dbp is non-null, it means the block data is already stored, and dbp contains the file position. In this case, the block data will not be written, only the block file statistics will be updated. This case should only happen during reindexing”


    mzumsande commented at 7:00 pm on May 14, 2024:
    done as suggested.
  65. in src/node/blockstorage.h:163 in d5904bd250 outdated
    155@@ -156,10 +156,15 @@ class BlockManager
    156     [[nodiscard]] bool FlushUndoFile(int block_file, bool finalize = false);
    157 
    158     /**
    159+     * Helper function performing various preparations before a block can be saved to disk:
    160+     * Returns the correct position for the block to be saved, which may be in the current or a new
    161+     * block file depending on nAddSize. May flush the previous blockfile to disk if full, updates
    162+     * blockfile info, and checks if there is enough disk space to save the block.
    163+     *
    


    ryanofsky commented at 3:35 pm on May 13, 2024:

    In commit “blockstorage: Rename FindBlockPos and have it return a FlatFilePos” (d5904bd250f41b935d6ec776373d05b42d71b04f)

    Seems like it would be good to add this comment in earlier commit “doc: Improve doc for functions involved in saving blocks to disk” (f90098e57dd8dc77a9788f5af7a529b32ca37df6), since the information does apply there, so the comment is not changing unnecessarily here.


    mzumsande commented at 7:00 pm on May 14, 2024:
    moved to the doc commit.
  66. ryanofsky approved
  67. ryanofsky commented at 3:44 pm on May 13, 2024: contributor
    Code review ACK 9cf475ffffb869cd55c2b2f3be84d7c90b199521. I left some suggestions to clean up comments in intermediate commits, but everything looks good in the end.
  68. doc: Improve doc for functions involved in saving blocks to disk
    In particular, document the flat file positions expected and
    returned by functions better.
    
    Co-authored-by: Ryan Ofsky <ryan@ofsky.org>
    fdae638e83
  69. blockstorage: split up FindBlockPos function
    FindBlockPos does different things depending on whether the block is known
    or not, as shown by the fact that much of the existing code is conditional on fKnown set or not.
    
    If the block position is known (during reindex) the function only updates the block info
    statistics. It doesn't actually find a block position in this case.
    
    This commit removes fKnown and splits up these two code paths by introducing a separate function
    for the reindex case when the block position is known.
    It doesn't change behavior.
    064859bbad
  70. validation, blockstorage: Separate code paths for reindex and saving new blocks
    By calling SaveBlockToDisk only when we actually want to save a new
    block to disk. In the reindex case, we now call UpdateBlockInfo
    directly from validation.
    
    This commit doesn't change behavior.
    d9e477c4dc
  71. blockstorage: Rename FindBlockPos and have it return a FlatFilePos
    The new name reflects that it is no longer called with existing blocks
    for which the position is already known.
    
    Returning a FlatFilePos directly simplifies the interface.
    17103637c6
  72. blockstorage: Don't move cursor backwards in UpdateBlockInfo
    Previously, it was possible to move the cursor back to an older file
    during reindex if blocks are enocuntered out of order during reindex.
    This would mean that MaxBlockfileNum() would be incorrect, and
    a wrong DB_LAST_BLOCK could be written to disk.
    
    This improves the logic by only ever moving the cursor forward (if possible)
    but not backwards.
    
    Co-authored-by: Martin Zumsande <mzumsande@gmail.com>
    e41667b720
  73. mzumsande force-pushed on May 14, 2024
  74. mzumsande commented at 7:01 pm on May 14, 2024: contributor
    9cf475f to e41667b: Addressed review feedback.
  75. paplorinc commented at 8:58 pm on May 14, 2024: contributor
    ACK e41667b720372dae8438ea86e9819027e62b54e0 Great job, love the results of these untangling tasks!
  76. DrahtBot requested review from ryanofsky on May 14, 2024
  77. DrahtBot requested review from TheCharlatan on May 14, 2024
  78. TheCharlatan approved
  79. TheCharlatan commented at 11:05 am on May 15, 2024: contributor
    Re-ACK e41667b720372dae8438ea86e9819027e62b54e0
  80. in src/node/blockstorage.cpp:944 in 064859bbad outdated
    1000 
    1001+void BlockManager::UpdateBlockInfo(const CBlock& block, unsigned int nHeight, const FlatFilePos& pos)
    1002+{
    1003+    LOCK(cs_LastBlockFile);
    1004+
    1005+    const unsigned int added_size = ::GetSerializeSize(TX_WITH_WITNESS(block));
    


    ryanofsky commented at 2:54 pm on May 16, 2024:

    In commit “blockstorage: split up FindBlockPos function” (064859bbad6984a6ec85c744064abdf757807c58)

    note: At this point in the PR, GetSerializeSize is already called by the SaveBlockToDisk function calling it, so calling it again here is a bit inefficient. But this is resolved by the next commit moving the the UpdateBlockInfo call

  81. ryanofsky approved
  82. ryanofsky commented at 2:55 pm on May 16, 2024: contributor
    Code review ACK e41667b720372dae8438ea86e9819027e62b54e0. Just improvements to comments since last review.
  83. ryanofsky merged this on May 16, 2024
  84. ryanofsky closed this on May 16, 2024

  85. mzumsande deleted the branch on May 17, 2024

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-11-23 06:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me