Rewind(pindex, pindex_next->pprev)
isn’t always equal to m_best_block_index since m_best_block_index is updated every SYNC_LOCATOR_WRITE_INTERVAL seconds, during which multiple pindex update could happen. Thus the assert here is wrong.
index: fix wrong assert of current_tip == m_best_block_index #32878
pull HowHsu wants to merge 2 commits into bitcoin:master from HowHsu:rewind changing 2 files +84 −8-
HowHsu commented at 3:53 pm on July 4, 2025: noneIn BaseIndex::Sync(), pindex in
-
DrahtBot added the label UTXO Db and Indexes on Jul 4, 2025
-
DrahtBot commented at 3:53 pm on July 4, 2025: contributor
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
Code Coverage & Benchmarks
For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/32878.
Reviews
See the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.
Conflicts
Reviewers, this pull request conflicts with the following ones:
- #26966 (index: initial sync speedup, parallelize process by furszy)
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
-
1440000bytes commented at 8:36 pm on July 4, 2025: noneCan you write a test that triggers this assert on master branch?
-
HowHsu commented at 6:39 am on July 5, 2025: none
Can you write a test that triggers this assert on master branch?
Sure, I’ll do that.
-
mzumsande commented at 9:54 pm on July 6, 2025: contributor
I think that this assert could be hit if a reorg was happening while
Sync()
was going on:-
Sync()
syncs some blocks but isn’t finished (doesn’t update m_best_block) -
The node undergoes a reorg, so that some of the synced blocks need to be rewinded.
-
Rewind()
is called and the assert fails.
I could trigger this on regtest with an added sleep in
Sync()
, freezing the sync so that I could do the reorg. I think it’s going to be hard / impossible to trigger it in a test without that sleep. -
-
HowHsu force-pushed on Jul 7, 2025
-
DrahtBot added the label CI failed on Jul 7, 2025
-
DrahtBot commented at 12:37 pm on July 7, 2025: contributor
🚧 At least one of the CI tasks failed. Task
macOS-cross, gui, no tests
: https://github.com/bitcoin/bitcoin/runs/45472933675 LLM reason (✨ experimental): The CI failure is caused by compile errors due to unused lambda capture and thread safety analysis violations in the test source code.Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:
-
Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.
-
A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.
-
An intermittent issue.
Leave a comment here, if you need help tracking down a confusing failure.
-
-
HowHsu force-pushed on Jul 7, 2025
-
HowHsu force-pushed on Jul 7, 2025
-
furszy commented at 1:51 pm on July 7, 2025: member
I don’t think the test is useful. The change is being tested against a
Sync()
mechanism introduced within the test itself and not the real one, which can change over time (see for example #26966 that changes it completely). It would be better, and simpler for you, to share a patch that reproduces the issue (as @mzumsande mentioned, just adding a sleep() call and triggering the reorg is enough to reproduce it. See #32835 PR description for a patch example).Other than that, concept ACK.
-
HowHsu force-pushed on Jul 7, 2025
-
HowHsu commented at 2:40 pm on July 7, 2025: none
I don’t think the test is useful. The change is being tested against a
Sync()
mechanism introduced within the test itself and not the real one, which can change over time (see for example #26966 that changes it completely). It would be better, and simpler for you, to share a patch that reproduces the issue (as @mzumsande mentioned, just adding a sleep() call and triggering the reorg is enough to reproduce it. See #32835 PR description for a patch example).Other than that, concept ACK.
Thanks furszy, you are right, I’ll remove this test.
-
HowHsu commented at 2:41 pm on July 7, 2025: none
I think that this assert could be hit if a reorg was happening while
Sync()
was going on:Sync()
syncs some blocks but isn’t finished (doesn’t update m_best_block)- The node undergoes a reorg, so that some of the synced blocks need to be rewinded.
Rewind()
is called and the assert fails.
I could trigger this on regtest with an added sleep in
Sync()
, freezing the sync so that I could do the reorg. I think it’s going to be hard / impossible to trigger it in a test without that sleep.Thanks mzumsande.
-
HowHsu force-pushed on Jul 7, 2025
-
HowHsu commented at 2:46 pm on July 7, 2025: noneHi all, I’ve written a test with tweaked Sync() to demonstrate this bug: https://github.com/bitcoin/bitcoin/commit/e9f4ffcd7d8b7211cc049a8ab7632b19c06f7b11
-
DrahtBot removed the label CI failed on Jul 7, 2025
-
luke-jr commented at 9:01 pm on July 7, 2025: memberThe test should probably be added to the PR after the commit fixing it
-
HowHsu commented at 2:57 am on July 8, 2025: none
furszy
But like furszy said, the Sync() in test uses mutex and conditional variables to sync with the main thread, while the real Sync() in bitcoind changes as times goes by.
-
furszy commented at 1:30 pm on July 8, 2025: member
I don’t think the test is useful. Other than that, concept ACK.
You can ignore it.
Maybe give another look at the test code before saying that? This was the commit https://github.com/bitcoin/bitcoin/commit/21ade546808fd757bf451db74cea1da3755f6613
The test isn’t testing the indexes code — just its own test-only implementation of it. It was essentially testing itself. And that’s not useful.
-
HowHsu commented at 2:31 pm on July 8, 2025: none
The test isn’t testing the indexes code — just its own test-only implementation of it. It was essentially testing itself. And that’s not useful.
I now don’t think so, it does test a case that the current tests don’t do: reorg happens in sync period. look at the last part of the test, it tests if the old fork is not in the filter and the new best chain blocks is in the filter, though very basic for now. The Sync in the test is same as in bitcoind, expect the thread synchronization code which is to trigger the reorg. The only cons is we have to keep it updated when the logic of Sync() changes, which is pointed out by you, but I now think this is doable.
-
HowHsu commented at 2:40 pm on July 8, 2025: none
I don’t think the test is useful. Other than that, concept ACK.
You can ignore it.
Easy, everyone. Without specific reasoning, these kinds of claims don’t really help.
-
furszy commented at 8:32 pm on July 9, 2025: member
@HowHsu, check this test https://github.com/furszy/bitcoin-core/commit/f2b8a06060e2a313f35101454c7669d46e0edc38. It triggers the reorg assertion failure without needing to deduplicate the index synchronization code. The test fails without your fix commit and passes with it. Feel free to cherry-pick it.
The Sync in the test is same as in bitcoind, expect the thread synchronization code which is to trigger the reorg. The only cons is we have to keep it updated when the logic of Sync() changes, which is pointed out by you, but I now think this is doable.
From a code maintenance perspective, that’s a bad idea. It’s hard to rephrase it in other terms. Even a simple external link or patch that exercises the assertion would still be much better than adding code duplication.
-
mzumsande commented at 9:48 pm on July 9, 2025: contributorGeneral question - should we remove the assert, or would it maybe be better to keep it, but commit the best block index before calling
Rewind()
inSync()
? -
HowHsu commented at 4:54 pm on July 10, 2025: none
@HowHsu, check this test furszy@f2b8a06. It triggers the reorg assertion failure without needing to deduplicate the index synchronization code. The test fails without your fix commit and passes with it. Feel free to cherry-pick it.
The Sync in the test is same as in bitcoind, expect the thread synchronization code which is to trigger the reorg. The only cons is we have to keep it updated when the logic of Sync() changes, which is pointed out by you, but I now think this is doable.
From a code maintenance perspective, that’s a bad idea. It’s hard to rephrase it in other terms. Even a simple external link or patch that exercises the assertion would still be much better than adding code duplication.
I see, make sense.
-
furszy commented at 11:00 pm on July 15, 2025: member
General question - should we remove the assert, or would it maybe be better to keep it, but commit the best block index before calling
Rewind()
inSync()
?I’m not sure about this but.. if we go down that road, wouldn’t it be simpler to just drop the
current_tip
arg and usem_best_block_index
internally? -
furszy commented at 3:40 am on July 24, 2025: member
@HowHsu, are you going to include the test or a patch to reproduce the issue? I’m happy to ACK it either way.
Hi furszy, I thought the test I put already demonstrate the issue, though it is not proper to be included to the codebase.
Are you referring to the one you linked in #32878 (comment)? That link returns a 404.
-
HowHsu commented at 7:59 am on July 24, 2025: none
@HowHsu, are you going to include the test or a patch to reproduce the issue? I’m happy to ACK it either way.
Hi furszy, I thought the test I put already demonstrate the issue, though it is not proper to be included to the codebase.
Are you referring to the one you linked in #32878 (comment)? That link returns a 404.
Sorry for that, check this one: https://github.com/HowHsu/bitcoin/commit/e9f4ffcd7d8b7211cc049a8ab7632b19c06f7b11
-
mzumsande commented at 7:36 pm on July 30, 2025: contributor
Hi furszy, I thought the test I put already demonstrate the issue, though it is not proper to be included to the codebase.
I agree that a test that copies / amends the actual production code is not great for inclusion, but why not include the test https://github.com/furszy/bitcoin-core/commit/f2b8a06060e2a313f35101454c7669d46e0edc38 from above that doesn’t have this problem?
-
HowHsu commented at 5:43 am on July 31, 2025: none
Hi furszy, I thought the test I put already demonstrate the issue, though it is not proper to be included to the codebase.
I agree that a test that copies / amends the actual production code is not great for inclusion, but why not include the test furszy@f2b8a06 from above that doesn’t have this problem?
I see, sorry for somehow I missed that part. I thought the test furszy mentioned is my test… I’ll pick it.
-
DrahtBot added the label Needs rebase on Jul 31, 2025
-
HowHsu force-pushed on Aug 1, 2025
-
HowHsu commented at 4:42 am on August 1, 2025: none
-
DrahtBot added the label CI failed on Aug 1, 2025
-
DrahtBot commented at 5:08 am on August 1, 2025: contributor
🚧 At least one of the CI tasks failed. Task
TSan, depends, no gui
: https://github.com/bitcoin/bitcoin/runs/47172613862 LLM reason (✨ experimental): Data race in destructor of BaseIndex caused test failure.Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:
-
Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.
-
A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.
-
An intermittent issue.
Leave a comment here, if you need help tracking down a confusing failure.
-
-
DrahtBot removed the label Needs rebase on Aug 1, 2025
-
furszy commented at 3:13 pm on August 1, 2025: member
Hi @furszy , sorry for the delay, I’ve picked your commit of the test, which I missed earlier.
No problem. This needs a rebase.
Hi @furszy , seems the test causes some data race, would be great if you could update it. I tried to fix it but I’m not familiar with the index logic.
It seems thread-related. Wait for the background thread to finish before the test ends: https://github.com/furszy/bitcoin-core/commit/30e0b2929da53a1519e282504720fac43b180ea0 (same code, just added an index
Stop()
call at the end). This will also unregister the index early on. Push it and let me know. If that doesn’t work, will run the CI locally to debug it, but hopefully that won’t be necessary. -
index: fix wrong assert of current_tip == m_best_block_index
In BaseIndex::Sync(), pindex in `Rewind(pindex, pindex_next->pprev)` isn't always equal to m_best_block_index since m_best_block_index is updated every SYNC_LOCATOR_WRITE_INTERVAL seconds, during which multiple pindex update could happen. Thus the assert here is wrong. Signed-off-by: Hao Xu <hao.xu@linux.dev>
-
test: exercise index reorg assertion failure 3aef38f44b
-
HowHsu force-pushed on Aug 1, 2025
-
DrahtBot removed the label CI failed on Aug 1, 2025
-
furszy commented at 11:43 pm on August 1, 2025: memberACK 3aef38f Test fails without the fix commit and passes with it.
-
fanquake requested review from mzumsande on Aug 2, 2025
-
mzumsande commented at 3:32 pm on August 18, 2025: contributorCode Review ACK 3aef38f44b76dfda77f47dc1a0e1fdc6ff3c7766
-
achow101 commented at 7:05 pm on August 19, 2025: memberACK 3aef38f44b76dfda77f47dc1a0e1fdc6ff3c7766
-
achow101 merged this on Aug 19, 2025
-
achow101 closed this on Aug 19, 2025
This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-09-02 15:13 UTC
More mirrored repositories can be found on mirror.b10c.me