index: fix wrong assert of current_tip == m_best_block_index #32878

pull HowHsu wants to merge 1 commits into bitcoin:master from HowHsu:rewind changing 1 files +0 −1
  1. HowHsu commented at 3:53 pm on July 4, 2025: none
    In BaseIndex::Sync(), pindex in Rewind(pindex, pindex_next->pprev) isn’t always equal to m_best_block_index since m_best_block_index is updated every SYNC_LOCATOR_WRITE_INTERVAL seconds, during which multiple pindex update could happen. Thus the assert here is wrong.
  2. index: fix wrong assert of current_tip == m_best_block_index
    In BaseIndex::Sync(), pindex in `Rewind(pindex, pindex_next->pprev)` isn't always
    equal to m_best_block_index since m_best_block_index is updated every
    SYNC_LOCATOR_WRITE_INTERVAL seconds, during which multiple pindex update could
    happen. Thus the assert here is wrong.
    
    Signed-off-by: Hao Xu <hao.xu@linux.dev>
    b57eb0a344
  3. DrahtBot added the label UTXO Db and Indexes on Jul 4, 2025
  4. DrahtBot commented at 3:53 pm on July 4, 2025: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/32878.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Concept ACK furszy

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

  5. 1440000bytes commented at 8:36 pm on July 4, 2025: none
    Can you write a test that triggers this assert on master branch?
  6. HowHsu commented at 6:39 am on July 5, 2025: none

    Can you write a test that triggers this assert on master branch?

    Sure, I’ll do that.

  7. mzumsande commented at 9:54 pm on July 6, 2025: contributor

    I think that this assert could be hit if a reorg was happening while Sync() was going on:

    1. Sync() syncs some blocks but isn’t finished (doesn’t update m_best_block)

    2. The node undergoes a reorg, so that some of the synced blocks need to be rewinded.

    3. Rewind() is called and the assert fails.

    I could trigger this on regtest with an added sleep in Sync(), freezing the sync so that I could do the reorg. I think it’s going to be hard / impossible to trigger it in a test without that sleep.

  8. HowHsu force-pushed on Jul 7, 2025
  9. DrahtBot added the label CI failed on Jul 7, 2025
  10. DrahtBot commented at 12:37 pm on July 7, 2025: contributor

    🚧 At least one of the CI tasks failed. Task macOS-cross, gui, no tests: https://github.com/bitcoin/bitcoin/runs/45472933675 LLM reason (✨ experimental): The CI failure is caused by compile errors due to unused lambda capture and thread safety analysis violations in the test source code.

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  11. HowHsu force-pushed on Jul 7, 2025
  12. HowHsu force-pushed on Jul 7, 2025
  13. furszy commented at 1:51 pm on July 7, 2025: member

    I don’t think the test is useful. The change is being tested against a Sync() mechanism introduced within the test itself and not the real one, which can change over time (see for example #26966 that changes it completely). It would be better, and simpler for you, to share a patch that reproduces the issue (as @mzumsande mentioned, just adding a sleep() call and triggering the reorg is enough to reproduce it. See #32835 PR description for a patch example).

    Other than that, concept ACK.

  14. HowHsu force-pushed on Jul 7, 2025
  15. HowHsu commented at 2:40 pm on July 7, 2025: none

    I don’t think the test is useful. The change is being tested against a Sync() mechanism introduced within the test itself and not the real one, which can change over time (see for example #26966 that changes it completely). It would be better, and simpler for you, to share a patch that reproduces the issue (as @mzumsande mentioned, just adding a sleep() call and triggering the reorg is enough to reproduce it. See #32835 PR description for a patch example).

    Other than that, concept ACK.

    Thanks furszy, you are right, I’ll remove this test.

  16. HowHsu commented at 2:41 pm on July 7, 2025: none

    I think that this assert could be hit if a reorg was happening while Sync() was going on:

    1. Sync() syncs some blocks but isn’t finished (doesn’t update m_best_block)
    2. The node undergoes a reorg, so that some of the synced blocks need to be rewinded.
    3. Rewind() is called and the assert fails.

    I could trigger this on regtest with an added sleep in Sync(), freezing the sync so that I could do the reorg. I think it’s going to be hard / impossible to trigger it in a test without that sleep.

    Thanks mzumsande.

  17. HowHsu force-pushed on Jul 7, 2025
  18. HowHsu commented at 2:46 pm on July 7, 2025: none
    Hi all, I’ve written a test with tweaked Sync() to demonstrate this bug: https://github.com/bitcoin/bitcoin/commit/e9f4ffcd7d8b7211cc049a8ab7632b19c06f7b11
  19. DrahtBot removed the label CI failed on Jul 7, 2025
  20. luke-jr commented at 9:01 pm on July 7, 2025: member
    The test should probably be added to the PR after the commit fixing it
  21. HowHsu commented at 2:57 am on July 8, 2025: none

    furszy

    But like furszy said, the Sync() in test uses mutex and conditional variables to sync with the main thread, while the real Sync() in bitcoind changes as times goes by.

  22. furszy commented at 1:30 pm on July 8, 2025: member

    I don’t think the test is useful. Other than that, concept ACK.

    You can ignore it.

    Maybe give another look at the test code before saying that? This was the commit https://github.com/bitcoin/bitcoin/commit/21ade546808fd757bf451db74cea1da3755f6613

    The test isn’t testing the indexes code — just its own test-only implementation of it. It was essentially testing itself. And that’s not useful.

  23. HowHsu commented at 2:31 pm on July 8, 2025: none

    The test isn’t testing the indexes code — just its own test-only implementation of it. It was essentially testing itself. And that’s not useful.

    I now don’t think so, it does test a case that the current tests don’t do: reorg happens in sync period. look at the last part of the test, it tests if the old fork is not in the filter and the new best chain blocks is in the filter, though very basic for now. The Sync in the test is same as in bitcoind, expect the thread synchronization code which is to trigger the reorg. The only cons is we have to keep it updated when the logic of Sync() changes, which is pointed out by you, but I now think this is doable.

  24. HowHsu commented at 2:40 pm on July 8, 2025: none

    I don’t think the test is useful. Other than that, concept ACK.

    You can ignore it.

    Easy, everyone. Without specific reasoning, these kinds of claims don’t really help.

  25. furszy commented at 8:32 pm on July 9, 2025: member

    @HowHsu, check this test https://github.com/furszy/bitcoin-core/commit/f2b8a06060e2a313f35101454c7669d46e0edc38. It triggers the reorg assertion failure without needing to deduplicate the index synchronization code. The test fails without your fix commit and passes with it. Feel free to cherry-pick it.

    The Sync in the test is same as in bitcoind, expect the thread synchronization code which is to trigger the reorg. The only cons is we have to keep it updated when the logic of Sync() changes, which is pointed out by you, but I now think this is doable.

    From a code maintenance perspective, that’s a bad idea. It’s hard to rephrase it in other terms. Even a simple external link or patch that exercises the assertion would still be much better than adding code duplication.

  26. mzumsande commented at 9:48 pm on July 9, 2025: contributor
    General question - should we remove the assert, or would it maybe be better to keep it, but commit the best block index before calling Rewind() in Sync()?
  27. HowHsu commented at 4:54 pm on July 10, 2025: none

    @HowHsu, check this test furszy@f2b8a06. It triggers the reorg assertion failure without needing to deduplicate the index synchronization code. The test fails without your fix commit and passes with it. Feel free to cherry-pick it.

    The Sync in the test is same as in bitcoind, expect the thread synchronization code which is to trigger the reorg. The only cons is we have to keep it updated when the logic of Sync() changes, which is pointed out by you, but I now think this is doable.

    From a code maintenance perspective, that’s a bad idea. It’s hard to rephrase it in other terms. Even a simple external link or patch that exercises the assertion would still be much better than adding code duplication.

    I see, make sense.

  28. furszy commented at 11:00 pm on July 15, 2025: member

    General question - should we remove the assert, or would it maybe be better to keep it, but commit the best block index before calling Rewind() in Sync()?

    I’m not sure about this but.. if we go down that road, wouldn’t it be simpler to just drop the current_tip arg and use m_best_block_index internally?

  29. furszy commented at 9:10 pm on July 23, 2025: member
    @HowHsu, are you going to include the test or a patch to reproduce the issue? I’m happy to ACK it either way.
  30. HowHsu commented at 1:51 am on July 24, 2025: none

    @HowHsu, are you going to include the test or a patch to reproduce the issue? I’m happy to ACK it either way.

    Hi furszy, I thought the test I put already demonstrate the issue, though it is not proper to be included to the codebase.

  31. furszy commented at 3:40 am on July 24, 2025: member

    @HowHsu, are you going to include the test or a patch to reproduce the issue? I’m happy to ACK it either way.

    Hi furszy, I thought the test I put already demonstrate the issue, though it is not proper to be included to the codebase.

    Are you referring to the one you linked in #32878 (comment)? That link returns a 404.


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-07-30 12:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me