index: During sync, commit best block after indexing #25074

pull mzumsande wants to merge 1 commits into bitcoin:master from mzumsande:202205_index_sync_order changing 1 files +1 −1
  1. mzumsande commented at 12:34 pm on May 6, 2022: member

    This changes the periodic commit of the best block during the index sync phase to use the already indexed predecessor of the current block index, instead of committing the current one that will only be indexed (by calling WriteBlock()) after committing the best block.

    The previous code would leave the index database in an inconsistent state until the block is actually indexed - if an unclean shutdown happened at just this point in time, the index could get corrupted because at next startup, we’d assume that we have already indexed this block.

  2. fanquake added the label UTXO Db and Indexes on May 6, 2022
  3. fanquake requested review from ryanofsky on May 6, 2022
  4. theStack commented at 12:54 pm on May 6, 2022: member
    Concept ACK
  5. w0xlt approved
  6. pk-b2 approved
  7. luke-jr commented at 9:33 pm on May 9, 2022: member
    I wonder if it would make sense to always re-index the initial best block too, just to avoid accidentally running with such corruption undetected.
  8. in src/index/base.cpp:182 in 75f53f1e1f outdated
    177@@ -185,6 +178,13 @@ void BaseIndex::ThreadSync()
    178                            __func__, pindex->GetBlockHash().ToString());
    179                 return;
    180             }
    181+
    182+            if (last_locator_write_time + SYNC_LOCATOR_WRITE_INTERVAL < current_time) {
    


    luke-jr commented at 9:39 pm on May 9, 2022:
    current_time is stale here. Maybe instead of moving this, it’d be better to just SetBestBlockIndex(pindex->pprev);?

    ryanofsky commented at 2:02 am on May 11, 2022:

    re: #25074 (review)

    current_time is stale here. Maybe instead of moving this, it’d be better to just SetBestBlockIndex(pindex->pprev);?

    Yeah probably the reason this bug was introduced is that original author was trying to avoid current_time being stale. But pindex->pprev would be the wrong best block index to use in the Rewind() case above. It would be pointing to parent of the rewind block instead of the block that was actually rewound to. I guess this would be better than current behavior, but still technically incorrect, and seems like it would be fragile even if current index subclasses are ok with it.

    I think using stale rewind time is ok and probably better than the alternatives, even if it seems a little wonky. This is all replaced anyway in 1819bdb7cb0634dad2e07daccfe7e8a122a943d6 from #24230, as Martin noted in #24230 (review)


    luke-jr commented at 2:26 am on May 11, 2022:

    But pindex->pprev would be the wrong best block index to use in the Rewind() case above. It would be pointing to parent of the rewind block instead of the block that was actually rewound to. I guess this would be better than current behavior, but still technically incorrect,

    I don’t see how it would even be incorrect… After the rewind, pindex is changed to the block one beyond that which the index should be current to…?


    ryanofsky commented at 2:41 am on May 11, 2022:

    I don’t see how it would even be incorrect… After the rewind, pindex is changed to the block one beyond that which the index should be current to…?

    Oops, you’re right. I didn’t notice NextSyncBlock was returning the next block after the fork. So I think your pindex->pprev suggestion is good and makes sense, and probably a little better than the current fix, though I think both fixes are fine.


    mzumsande commented at 0:12 am on May 13, 2022:
    I changed it to Luke’s suggestion and adjusted the commit message, thanks. While this is simpler, current_time being a bit stale shouldn’t ever be a serious problem in my opinion - I think of updating the best block / committing it every 30s as a non-essential but nice-to-have feature that may save users some time in case of shutdowns, if we removed this block completely, it shouldn’t break anything.
  9. ryanofsky approved
  10. ryanofsky commented at 2:06 am on May 11, 2022: member
    Code review ACK 75f53f1e1fc6917eac40e51ce7b6bec2717bd526. Nice catch. This just changes the index sync thread to commit after writing instead of committing before writing, so the block referenced in the commit that is supposed to have been written is actually written.
  11. naumenkogs commented at 7:48 am on May 12, 2022: member
    Concept ACK
  12. DrahtBot commented at 9:17 am on May 12, 2022: member

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #24230 (indexes: Stop using node internal types and locking cs_main, improve sync logic by ryanofsky)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  13. mzumsande force-pushed on May 13, 2022
  14. mzumsande commented at 0:31 am on May 13, 2022: member

    I pushed an update, implementing Luke’s suggestion.

    I wonder if it would make sense to always re-index the initial best block too, just to avoid accidentally running with such corruption undetected.

    It does get detected for the coinstatsindex, leading to an init error - that’s how I encountered this, playing around with uprobes that send kill signals to bitcoind (branch). But other indexes may not. Since this should not be possible unless there is a logic bug in the index code, checking on startup that there is an indexed entry for the best block / aborting otherwise may be another way instead of attempting to recover ?

  15. luke-jr approved
  16. luke-jr commented at 0:50 am on May 13, 2022: member
    utACK
  17. in src/index/base.cpp:171 in 9d4509b64e outdated
    167@@ -168,7 +168,7 @@ void BaseIndex::ThreadSync()
    168             }
    169 
    170             if (last_locator_write_time + SYNC_LOCATOR_WRITE_INTERVAL < current_time) {
    171-                SetBestBlockIndex(pindex);
    172+                SetBestBlockIndex(pindex->pprev);
    


    ryanofsky commented at 0:21 am on May 19, 2022:

    In commit “index: Don’t commit a best block before indexing it during sync” (9d4509b64ee333ef4587c77fb0d52e8e720a28c4)

    I think commit description saying previous code leaves the “index database temporarily in an inconsistent state” is a little misleading, because the bug could permanently corrupt the index if node is killed before the block is committed. I’d drop the word “temporarily” and maybe mention that this could lead to corruption if the sync is interrupted at a bad time.

    Also could add:

    0Co-authored-by: Luke Dashjr <luke-jr+git@utopios.org>
    

    since he spotted the simpler fix


    mzumsande commented at 11:22 am on May 19, 2022:
    Done both. I meant “temporarily” as in “until the committed best block is indexed” and made that more clear in the commit message and the OP.
  18. ryanofsky approved
  19. ryanofsky commented at 0:34 am on May 19, 2022: member

    Code review ACK 9d4509b64ee333ef4587c77fb0d52e8e720a28c4, but it would be good to update commit message & PR description.

    I was thinking probably it would be possible write a unit test for this with a mock BaseIndex subclass that overrides the CommitInternal() and WriteBlock() methods and checks that commit is called with the last block written, not the next block about to be written during sync. I don’t think this is necessary however, since the bug is pretty theoretical and seems unlikely to happen in practice.

    re: #25074#issue-1227811224

    This rearranges the order during the index sync phase to do the periodic update of the best block (Commit()) only after actually indexing this block (WriteBlock()). The previous order would leave the index database temporarily in an inconsistent state - if an unclean shutdown happened at just this point in time, the index could get corrupted because at next startup, we’d assume that we have already indexed this block. Indexing the block ahead of committing the best block is the typical order and not a problem - at worst, we’d have to index this block again.

    This PR description is out of date and would be good to fix because it will become part of the merge history.

  20. index: Don't commit a best block before indexing it during sync
    Committing a block prior to indexing would leave the index database
    in an inconsistent state until it is indexed, which could corrupt the
    index in case of a unclean shutdown. Thus commit its predecessor.
    
    Co-authored-by: Luke Dashjr <luke-jr+git@utopios.org>
    7171ebc7cb
  21. mzumsande force-pushed on May 19, 2022
  22. mzumsande commented at 11:39 am on May 19, 2022: member

    9d4509b -> 7171ebc: Updated the commit message and the PR description.

    I was thinking probably it would be possible write a unit test for this with a mock BaseIndex subclass that overrides the CommitInternal() and WriteBlock() methods and checks that commit is called with the last block written, not the next block about to be written during sync. I don’t think this is necessary however, since the bug is pretty theoretical and seems unlikely to happen in practice.

    I think that having a test-only mock index is a really interesting idea, and that it could help with increasing the test coverage. I’d be interested to look into this as a follow-up, with the goal of improving the test coverage of the base index logic more broadly (not only with respect to this issue, I think if the mock index class tested only this it might be a bit overkill).

  23. ryanofsky approved
  24. ryanofsky commented at 12:05 pm on May 19, 2022: member
    Code review ACK 7171ebc7cbd911fa7ccad732ea7f73bce30928ee. Looks great! Just commit message changes since last review
  25. fanquake merged this on May 19, 2022
  26. fanquake closed this on May 19, 2022

  27. mzumsande deleted the branch on May 19, 2022
  28. luke-jr referenced this in commit 095d0ecc3b on May 22, 2022
  29. sidhujag referenced this in commit 63955f419f on May 28, 2022
  30. DrahtBot locked this on May 19, 2023

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-11-21 09:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me