rpc: rollback

Sjors commented at 10:50 am on March 5, 2024: member

Please describe the feature you’d like to see added.

bitcoin-cli rollback

Arguments:

height or hash
delete blocks from storage
delete headers

This should be used with networking turned off. As soon as you connect to a peer, it will sync back to the tip as normal.

It should rollback indexes too.

It should probably warn users if they have a wallet loaded, especially if the rollback is more than 10 (?) block; that’s the point where the rollback process stops placing transactions back into the mempool.

We currently rely on invalidateblock to roll back the chain to an arbitrary height. And then reconsiderblock to go back to the tip. Rolling back the chain is used when creating snapshots for assume utxo.

This works but the RPC methods are not designed for this. It has annoying side-effects like disconnecting peers because they have “invalid” blocks. If there are stale blocks near the height you want to roll back to, you have to make sure to invalidate those too. IIRC there’s also no way to abort the process.

For functional tests it can be useful to forget a block and a header, e.g. to test behavior when a header is received multiple times from different peers, when receiving an unsolicited or mutated block. Currently such a test has to carefully sequence events, or use multiple unconnected nodes. This RPC would effectively add “undo” for such tests.

Describe the solution you’d like

See above

Describe any alternatives you’ve considered

Status quo.

Please leave any additional context

I’m not advocating to drop invalidateblock entirely. For example it’s still useful if you want to fully validate a stale block.

Sjors added the label Feature on Mar 5, 2024

maflcko commented at 8:26 am on March 12, 2024: member

Not sure about adding a heavy and complicated feature such as this to real code, when this feature will only be used in tests. AU will likely use #29553, IIUC?

Sjors commented at 8:33 am on March 12, 2024: member

My suggestion in #29553 was to use a proper rollback mechanism (for which you might as well add an RPC call). Right now that PR effectively uses invalidateblock and reconsiderblock internally, as the shell scripts do, which is brittle.

maflcko commented at 9:02 am on March 13, 2024: member

Deleting block files and headers from storage (and memory) seems more brittle and unclean than, let’s say, simply rolling back a view of the utxo set (similar to validatechain). I guess this requires RAM (not sure how easy it would be to change), but otherwise this seems cleaner.

Sjors commented at 10:54 am on March 18, 2024: member

@maflcko maybe “clean” in terms of code, but not in terms of behaviour. Our validation and p2p code behaves different when blocks are stored or headers are known. We also disconnect peers that send us “invalid” blocks. This is why the UTXO snapshot generation script has to turn off the network. But if anything goes wrong, you might end up with blocks marked invalid.

If we want e.g. 50 people to generate and attest their own snapshot, it seems quite likely that at least some of them will kill the node halfway because it takes too long. It seems safer to me if the node can recover from that with regular IBD behavior, and we don’t have to tell users to call getchaintips and call reconsiderblock on various blocks.

Also, the only way at the moment to test if a given snapshot is valid, is to start a new node. With a rollback RPC you can test it on an existing node.

I’d like to see if the code for such an RPC is actually messy and complicated. Perhaps it can be done in a straight-forward way. I haven’t tried (yet).

sipa commented at 11:35 am on March 18, 2024: member

Could this be accomplished using a (test-only) RPC that marks block data of an inactive block deleted, and/or one that deletes the block index entry of an inactive block?

Then you could use invalidateblock + deleteblockdata to simulate the behavior you want?

Because internally, that’s how I’d imagine the functionality you’re asking for anyway.

maflcko commented at 11:40 am on March 18, 2024: member

Then you could use invalidateblock + deleteblockdata to simulate the behavior you want?

I don’t think this works, in the unlikely case, as explained by sjors, where a stale block (a sibling of the invalidated block) happens to exist and is activated.

maflcko commented at 11:43 am on March 18, 2024: member

@maflcko maybe “clean” in terms of code, but not in terms of behaviour. Our validation and p2p code behaves different when blocks are stored or headers are known. We also disconnect peers that send us “invalid” blocks. This is why the UTXO snapshot generation script has to turn off the network. But if anything goes wrong, you might end up with blocks marked invalid.

Why would it be cleaner in terms of behavior to delete blocks when rolling back the utxo set? And the p2p issues you mention would remain present with your suggested solution, unless you also disable the network.

Sjors commented at 12:03 pm on March 18, 2024: member

It would still be better to disable the network during the rollback, because otherwise you’d immediately get the headers again and start downloading. We would not disconnect peers, though they might replace us for not having recent stuff (or for not delivering blocks at a height that we previously announced).

Why would it be cleaner in terms of behavior to delete blocks when rolling back the utxo set?

For the use case of assume utxo, there might be alternative ways to roll back the UTXO set. Doing it purely in memory (as a cache) - as you suggested above - would have the advantage of making it safe and quick to abort.

For the use case of p2p test code (compact blocks, modified blocks, header spam, etc), you need to delete blocks and headers.

I don’t know a priori which approach is easier to implement.

in the unlikely case

I agree it seems unlikely on mainnet, unless someone does it on purpose to frustrate the process. That could happen if we were to pick a predictable interval. That could be useful if we share (partial) snapshots via the p2p network, where nodes know in advance at which height the next snapshot will be.

maflcko commented at 12:29 pm on March 18, 2024: member

For the use case of p2p test code (compact blocks, modified blocks, header spam, etc), you need to delete blocks and headers.

For p2p tests, my preference would to just use what has been used previously: Spin up a node and sync the blocks/headers from scratch, when invalidateblock doesn’t work. There needs to be a substantial benefit, to warrant writing a complex, new, test-only RPC for an alternative solution to an already solved edge case.

Sjors commented at 10:40 am on December 6, 2024: member

#29553 has been merged, so closing this for now as it’s indeed perhaps a bit overkill for just tests.

Sjors closed this on Dec 6, 2024

rpc: rollback #29565

Please describe the feature you’d like to see added.

Is your feature related to a problem, if so please describe it.

Describe the solution you’d like

Describe any alternatives you’ve considered

Please leave any additional context