validation: Make ReplayBlocks interruptible

mzumsande commented at 3:12 pm on May 22, 2024: contributor

This addresses the problem from #11600 that the Rolling Forward process after a crash (ReplayBlocks()) is uninterruptible. ReplayBlocks can take a long time to finish, and this can be especially annoying to GUI users who are taunted to “press q to shutdown” even though pressing “q” does nothing.

Now, when an interrupt is received during ReplayBlocks(), the intermediate progress is saved: In addition to writing the updated coins to disk, the flush adjusts the lower head block (saved in DB_HEAD_BLOCKS), so that with the next restart, the replay continues from where it was stopped without losing progress.

I tested this manually on signet: A situation where ReplayBlocks() becomes necessary can be created by syncing with -dbcrashratio=1 and stopping the node after ~70k blocks. Then, after the next restart, the replay process can be interrupted. I then used the dumptxoutset output to check that the final utxo-set hash is unaffected by how many times the process is interrupted.

DrahtBot commented at 3:12 pm on May 22, 2024: contributor

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/30155.

Reviews

See the guideline for information on the review process.

Type	Reviewers
Concept ACK	TheCharlatan, andrewtoth

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

No conflicts as of last run.

DrahtBot added the label Validation on May 22, 2024

in src/validation.cpp:4758 in 4bd626ec6e outdated

4753@@ -4754,6 +4754,12 @@ bool Chainstate::ReplayBlocks()
4754         LogPrintf("Rolling forward %s (%i)\n", pindex.GetBlockHash().ToString(), nHeight);
4755         m_chainman.GetNotifications().progress(_("Replaying blocks…"), (int)((nHeight - nForkHeight) * 100.0 / (pindexNew->nHeight - nForkHeight)), false);
4756         if (!RollforwardBlock(&pindex, cache)) return false;
4757+        if (m_chainman.m_interrupt) {
4758+            LogPrintf("Flushing intermediate state of replay\n");

maflcko commented at 6:35 pm on May 22, 2024:

0            LogInfo("Flushing intermediate state of replay\n");

nit: For new code it would be better to use the non-deprecated non-confusing alias LogInfo over LogPrintf.

mzumsande commented at 8:33 pm on May 22, 2024:

sure - updated.

mzumsande force-pushed on May 22, 2024

in src/txdb.cpp:105 in 26e78f2ee4 outdated

100-        // We may be in the middle of replaying.
101+        // We may be in the process of replaying.
102         std::vector<uint256> old_heads = GetHeadBlocks();
103         if (old_heads.size() == 2) {
104             if (old_heads[0] != hashBlock) {
105-                LogPrintLevel(BCLog::COINDB, BCLog::Level::Error, "The coins database detected an inconsistent state, likely due to a previous crash or shutdown. You will need to restart bitcoind with the -reindex-chainstate or -reindex configuration option.\n");

luke-jr commented at 5:17 pm on May 23, 2024:

Is it a safe assumption that this cannot happen except an interrupted replay?

mzumsande commented at 5:59 pm on May 23, 2024:

Currently we abort with an assert in the next line after the log, so the expectation in master is that this shouldn’t happen outside of db corruption. I couldn’t find any reports on the internet of users that have run into this error / assert, so it probably doesn’t happen very often in practice. But theoretically speaking, if the coins db gets corrupted, I guess anything can happen depending on the specifics of the corruption?

While working on this PR, I initially went with the different approach of adding an explicit in_replay flag to BatchWrite(), and setting that flag only during the interrupt flush. I switched to the current approach of detecting this implicitly inside BatchWrite() because it seemed more elegant and led to less boilerplate code, but it’s still an option.

ryanofsky commented at 2:18 pm on June 6, 2024:

In commit “validation: Make ReplayBlocks interruptible” (26e78f2ee4a325f20c3b3bfaac5be532ab9e7bac)

I switched to the current approach of detecting this implicitly inside BatchWrite() because it seemed more elegant and led to less boilerplate code, but it’s still an option.

I would definitely be curious to see the other approach. Not just because it might be better at detecting inconsistent states like Luke is suggesting, but also because it seems like it would make the code more clear.

I think it would be pretty hard to look at the code after this PR (without looking at the PR itself), and understand why old_heads[0] != hashBlock implies that this BatchWrite call is finalizing a replay that is currently being interrupted.

I was also confused by the name of the “unfinished_replay” variable below, because I thought “unfinished replay” was just referring to a replay of an previously unfinished flush. I did not realize it was referring specifically to an interrupted replay of an unfinished flush.

Also as a possible followup, for future maintainability of this code I think it could be a good idea to remove the replay logic entirely from BatchWrite(), since ReplayBlocks() is the only BatchWrite() caller that should need it. Other BatchWrite calls should be able to assume and assert that GetHeadBlocks() is empty, and this is just a normal write, not a replay.

mzumsande commented at 7:16 pm on June 10, 2024:

I would definitely be curious to see the other approach. Not just because it might be better at detecting inconsistent states like Luke is suggesting, but also because it seems like it would make the code more clear.

See https://github.com/mzumsande/bitcoin/commit/db8232d47068bba9ae1bb8135ef456be731e5ca5 - because of the inheritance the bool interrupted_replay needed to be added to quite a few spots, which I didn’t love.

I think it would be pretty hard to look at the code after this PR (without looking at the PR itself), and understand why old_heads[0] != hashBlock implies that this BatchWrite call is finalizing a replay that is currently being interrupted.

I’ll add more documentation (if I don’t switch to the other approach).

I was also confused by the name of the “unfinished_replay” variable below,

Maybe interrupted_replay is better? But the issue remains that it’s the replay that is being interrupted (or unfinished), and not the flush.

Also as a possible followup, for future maintainability of this code I think it could be a good idea to remove the replay logic entirely from BatchWrite(), since ReplayBlocks() is the only BatchWrite() caller that should need it. Other BatchWrite calls should be able to assume and assert that GetHeadBlocks() is empty, and this is just a normal write, not a replay.

Seems reasonable for a follow-up, though I haven’t looked into details yet.

achow101 requested review from sipa on Oct 15, 2024

TheCharlatan commented at 2:54 pm on October 15, 2024: contributor

Concept ACK

andrewtoth commented at 5:52 pm on October 19, 2024: contributor

Concept ACK

It appears to me that the size of the cache used in ReplayBlocks is never checked if it grows too large. Wouldn’t this mean that if the node was sufficiently far behind, the rolling forward would fill up the cache to levels exceeding dbcache? This would result in a cycle of rolling forward and then OOM crashing, never being able to write any chainstate to disk.

DrahtBot added the label CI failed on Mar 17, 2025

validation: Make ReplayBlocks interruptible

Now, when an interrupt is received during ReplayBlocks the
intermediate progress is flushed:
In addition to updating the coins on disk, this
flush adjusts the head blocks, so that with the next
restart, the replay continues from where we stopped.

4d1342fbcf

mzumsande force-pushed on Mar 17, 2025

mzumsande commented at 3:41 pm on March 17, 2025: contributor

It appears to me that the size of the cache used in ReplayBlocks is never checked if it grows too large. Wouldn’t this mean that if the node was sufficiently far behind, the rolling forward would fill up the cache to levels exceeding dbcache? This would result in a cycle of rolling forward and then OOM crashing, never being able to write any chainstate to disk.

Good point - I think that is true, ReplayBlocks() uses its own cache, which doesn’t seem to be subject to any limits. However, the fact that we got into this position in the first place means that the user specified a certain dbcache in their original run that was shutdown uncleanly, and I guess that will be respected by design because we don’t connect any blocks that weren’t also connected in the first run?!

I guess users don’t get the chance to change their preference in between the original run and the replay though, which is not ideal. Maybe the method introduced in this PR could also be used to do regular flushes.

DrahtBot removed the label CI failed on Mar 17, 2025

validation: Make ReplayBlocks interruptible #30155

Code Coverage & Benchmarks

Reviews

Conflicts