Reindex: save progress to continue after interruption #35071

pull pinheadmz wants to merge 2 commits into bitcoin:master from pinheadmz:reindex-continue changing 3 files +128 −6
  1. pinheadmz commented at 2:22 PM on April 14, 2026: member

    Currently, if the reindex process is interrupted, it will start over from blk00000.dat on the next run. Even after reindexing is finished, while the node is in ActivateBestChain(), an interruption may STILL require a full reindex because DB_REINDEX_FLAG is written as false but not flushed.

    Mentioned in #30424 but I couldn't find any specific follow-up:

    There is no reindex progress (it should pick up the previous work and try to make progress)

    The solution in this PR is simply to write a new field DB_REINDEX_LASTFILE when the reindex is interrupted, and to flush the DB_REINDEX_FLAG setting when the process is complete. The complication is that blocks may be out of order on disk, so as we reindex we store orphan blocks temporarily in memory until they are connected to their parents found in later files. To ensure this data survives a restart, the orphan map is serialized and also saved to the database as DB_REINDEX_ORPHAN_BLOCKS.
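    The checkpointing idea above can be illustrated with a minimal standalone sketch. This is NOT Bitcoin Core's actual serialization framework (Core uses its own stream serializers and a real block hash type); the length-prefixed encoding, `OrphanMap` alias, and function names here are assumptions chosen only to show an orphan map surviving a round trip to bytes, as it would on its way into a database value:

    ```cpp
    #include <cstdint>
    #include <cstdio>
    #include <map>
    #include <vector>

    // Hypothetical stand-in for the orphan map: block key -> raw block bytes.
    using OrphanMap = std::map<uint64_t, std::vector<uint8_t>>;

    // Encode as: count, then (key, length, bytes) per entry, little-endian u64s.
    static std::vector<uint8_t> Serialize(const OrphanMap& m)
    {
        std::vector<uint8_t> out;
        auto put64 = [&out](uint64_t v) {
            for (int i = 0; i < 8; ++i) out.push_back(uint8_t(v >> (8 * i)));
        };
        put64(m.size());
        for (const auto& [key, blob] : m) {
            put64(key);
            put64(blob.size());
            out.insert(out.end(), blob.begin(), blob.end());
        }
        return out;
    }

    static OrphanMap Deserialize(const std::vector<uint8_t>& in)
    {
        OrphanMap m;
        size_t pos = 0;
        auto get64 = [&in, &pos]() {
            uint64_t v = 0;
            for (int i = 0; i < 8; ++i) v |= uint64_t(in[pos++]) << (8 * i);
            return v;
        };
        const uint64_t n = get64();
        for (uint64_t i = 0; i < n; ++i) {
            const uint64_t key = get64();
            const uint64_t len = get64();
            m[key].assign(in.begin() + pos, in.begin() + pos + len);
            pos += len;
        }
        return m;
    }

    int main()
    {
        const OrphanMap orphans{{0xdeadbeef, {1, 2, 3}}, {42, {9}}};
        const auto bytes = Serialize(orphans);
        const auto restored = Deserialize(bytes);
        std::printf("roundtrip ok: %d\n", restored == orphans ? 1 : 0);
    }
    ```

    The point of the round trip is that on restart the node can rebuild exactly the in-memory state it had at interruption, rather than re-reading every earlier block file to rediscover the orphans.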

  2. DrahtBot commented at 2:23 PM on April 14, 2026: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/35071.

    Reviews

    See the guideline for information on the review process. A summary of reviews will appear here.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #33854 (fix assumevalid is ignored during reindex by Eunovo)
    • #32427 (kernel: Replace leveldb-based BlockTreeDB with flat-file based store by sedited)
    • #30342 (kernel, logging: Pass Logger instances to kernel objects by ryanofsky)
    • #29700 (kernel, refactor: return error status on all fatal errors by ryanofsky)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. DrahtBot added the label CI failed on Apr 14, 2026
  4. maflcko commented at 5:16 PM on April 14, 2026: member

    Not sure about slowing down the happy path for an edge case: Reindex is already rare (hopefully?), and power outage during reindex should be doubly-rare.

    Also, writing the out-of-order blocks seems duplicate effort. Shouldn't it be trivial and fast to read them from the existing block files instead of going the extra hop through the leveldb?

    I guess it could make sense to have a flame graph showing the actual overhead that is seen when continuing a reindex. Without actual data it is hard to optimize it.

    If I had to guess, is the overhead from FindByte? If yes, my preference would be to just remove it, see #34044 (comment)

    Alternatively, the overhead is so minimal, that it doesn't matter?

  5. pinheadmz commented at 7:03 PM on April 14, 2026: member

    I could've used this, at least for the interrupt case, last week. I was moving data to a bigger drive on my RPi node, messed something up, and had to reindex. A few hours in I wanted to change something and hit ctrl-c. When I restarted, I wondered why I had lost those hours of progress.

    Also, writing the out-of-order blocks seems duplicate effort. Shouldn't it be trivial and fast to read them from the existing block files instead of going the extra hop through the leveldb?

    Yeah, saving the map after every file is a bummer, but we only need to read the map if we restart after an interruption, so there shouldn't be any hopping.

    Alternatively, the overhead is so minimal, that it doesn't matter?

    I could use a bit of clarity on what you're referring to as overhead here?

  6. maflcko commented at 7:27 PM on April 14, 2026: member

    I could use a bit of clarity on what you're referring to as overhead here?

    Well, I couldn't find a large overhead myself (but I only tried signet so far), so maybe I am missing something. Let's recall that AcceptBlock is guarded on current master, so any progress in deserializing and accepting blocks is already properly saved. The only remaining overhead comes from BufferedFile, but locally, and for signet, it was small enough not to matter.

    So I guess it could make sense to see a flame graph or anything else to see where the bottleneck is on your side. I see you have measured a reindex with this branch and saw that it is slower. But have you measured a resume and seen that it is faster? If yes, why is it faster? Knowing this will make it easier to find alternative solutions.

    I can also imagine that the performance depends on the storage device that hosts the blocks dir. If that is on a network drive, then BufferedFile may be slow enough to matter?

  7. mzumsande commented at 1:05 PM on April 15, 2026: contributor

    Not sure about slowing down the happy path for an edge case: Reindex is already rare (hopefully?), and power outage during reindex should be doubly-rare.

    I agree with that. I think there is a use case for handling user interrupts, or for flushing after the first phase when all block files are indexed, but accommodating unclean restarts during reindex seems too much of a special case.

  8. pinheadmz force-pushed on May 7, 2026
  9. pinheadmz force-pushed on May 7, 2026
  10. pinheadmz commented at 3:41 PM on May 7, 2026: member

    push to 14b586d1c3:

    • Changed behavior to only save progress on interrupt instead of after every file (so happy path remains unaffected)
    • Covered a few edge cases
      • don't proceed if orphan map was not read from database
      • don't proceed if start file number is greater than total files (something got deleted since last run)
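    The two edge-case guards above amount to a small "may we resume?" predicate. The sketch below is a hypothetical standalone illustration, not the PR's actual code: the name `CanResume`, the `std::optional` checkpoint, and the exact comparison are assumptions based on the bullet points:

    ```cpp
    #include <cstdio>
    #include <optional>

    // Only continue from a saved checkpoint if the orphan map was read back
    // successfully and the saved file number still exists on disk; otherwise
    // fall back to a full reindex from blk00000.dat.
    static bool CanResume(bool orphan_map_loaded,
                          std::optional<int> last_file,
                          int total_files)
    {
        if (!orphan_map_loaded) return false;        // checkpoint data missing or corrupt
        if (!last_file) return false;                // no checkpoint was recorded
        if (*last_file >= total_files) return false; // block files deleted since last run
        return true;
    }

    int main()
    {
        std::printf("%d %d %d\n",
                    CanResume(true, 10, 100),    // normal resume
                    CanResume(false, 10, 100),   // orphan map failed to load
                    CanResume(true, 150, 100));  // files disappeared since last run
    }
    ```

    Failing closed here (wiping and restarting rather than resuming from suspect state) matches the conservative behavior the push description outlines.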
  11. test: assert current interrupted-reindex behavior: wipe and start over fb7b803dac
  12. blockstorage: save reindex progress upon interrupt to resume after restart
    Adds two new keys to the BlockTreeDB that are written only if
    reindex is interrupted:
    - The last file read
    - A serialized map of orphan blocks
    
    If a reindex is interrupted, these values are read on restart and
    the reindex progress continues from the checkpoint. This does not
    affect runs with the -reindex flag explicitly set, which always
    wipes the index and starts from blk00000.dat
    99d0d61cab
  13. pinheadmz force-pushed on May 7, 2026
  14. DrahtBot removed the label CI failed on May 7, 2026
  15. pinheadmz commented at 7:59 PM on May 7, 2026: member

    Push to 99d0d61cabe0f6f4b3a4c79a46a73d075e4fc3ee:

    • Fix flakiness in tests by using the last blkXXXXX.dat file to assert that reindexing has / has not finished
  16. pinheadmz marked this as ready for review on May 7, 2026
  17. maflcko commented at 5:27 AM on May 8, 2026: member

    I still don't understand what this is trying to do. If this is a performance improvement, you should say so, and also provide the reason for the improvement (possibly with flamegraphs or so), and benchmarks.

  18. pinheadmz commented at 9:44 AM on May 8, 2026: member

    It is not a performance improvement. It addresses grief from restarting a lengthy reindex process if it was interrupted by the user. Motivated by my own experience.

    The only reason performance was mentioned is because my original solution reduced performance. The current version of the branch does not because now we only write reindex state once, on interruption, as opposed to 5,000 times during the process.

  19. maflcko commented at 9:55 AM on May 8, 2026: member

    Sorry, I meant how much of a performance improvement is this under the scenario of an interrupt. Because personally I didn't see any overhead on restarts, see #35071 (comment).

    Without seeing that this improves anything for anyone (with performance numbers or graphs), I don't see the point.

  20. pinheadmz commented at 10:10 AM on May 8, 2026: member

    Master: Start a reindex. Three hours into that, you realize your drive won't be big enough and you want to move some stuff around. Start over. Three hours to redo the work, then finish half an hour later.

    Branch: Just the half hour you didn't do before.

    If you want me to look into why a reindex takes 3.5 hours, I'm happy to do that. It's just not the UX I'm going for here.

  21. mzumsande commented at 12:54 PM on May 8, 2026: contributor

    As long as you drop the -reindex parameter in the restart run, it won't start indexing from scratch again after an interrupt (on master). While it will scan through the block files that were already indexed, this is very fast, at least on signet.

  22. maflcko commented at 12:56 PM on May 8, 2026: member

    Huh, I thought the bulk of the work should be in AcceptBlock, which is skipped on the second run, so it should take less than 3 hours.

    Edit: Ok, so you didn't disable the -reindex on the second run?

    In that case, my suggested solution would be to always stop the node immediately when -reindex is supplied. This way, the user is forced to remove it before the reindex actually starts.

  23. pinheadmz commented at 1:18 PM on May 8, 2026: member

    As long as you drop the -reindex parameter in the restart run it won't start indexing from scratch again after a interrupt (on master). While it will scan through the block files that were already indexed, this is very fast, at least on signet.

    Oh crap I didn't realize this. I just saw Reindexing blk00000.dat in the log the second time and wondered why it was redoing work.

    Edit: Ok, so you didn't disable the -reindex on the second run?

    No, I did, but I was misunderstanding our dialog. I'll switch to draft for now and get actual measurements on re-re-index to see if this branch actually improves anything.

  24. maflcko commented at 1:39 PM on May 8, 2026: member

    I actually think this branch doesn't change anything, if the -reindex is supplied twice and started again from scratch. At least, I would object if dirty state is persisted in the db between reindex runs.

  25. pinheadmz marked this as a draft on May 8, 2026
  26. pinheadmz commented at 1:44 PM on May 8, 2026: member

    I actually think this branch doesn't change anything, if the -reindex is supplied twice and started again from scratch.

    Agreed, yeah, here's the question: does the user-experienced time of run 1 + run 2 = run 3?

    • run 1: -reindex, interrupt at 90%
    • run 2: restart with no extra args, observe until 100%
    • run 3: -reindex, observe until 100%


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-05-15 03:12 UTC
