validation: Improve warnings in case of chain corruption #33553

pull mzumsande wants to merge 2 commits into bitcoin:master from mzumsande:202510_headersync_loop changing 2 files +10 −5
  1. mzumsande commented at 9:49 pm on October 6, 2025: contributor

    In case of corruption that leads to a block being marked as invalid that is seen as valid by the rest of the network, the user currently doesn’t receive good error messages, but will often be stuck in an endless headers-sync loop with no explanation (#26391).

    This PR improves warnings in two ways:

    • A warning is now logged when a peer sends us a header for a block that we have previously marked as invalid, pointing at likely database corruption (p2p commit).

    • CheckForkWarningConditions is now also called during IBD and at startup, so its existing warning about a more-work invalid chain is no longer suppressed in those cases (validation commit).

    Fixes #26391 (We’ll still do the endless looping, trying to find a peer with headers that we can use, but will now repeatedly log warnings while doing so).
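
    For orientation, the p2p warning discussed in the review threads below ends up with roughly the following shape (pieced together from the review snippets quoted in this thread; the exact message text was still being adjusted during review):

        // Sketch of the check added to PeerManagerImpl::ProcessHeadersMessage() in
        // net_processing.cpp, reconstructed from the diffs quoted below.
        if (!processed) {
            if (state.IsInvalid()) {
                // Only warn for outbound peers, consistent with only disconnecting those.
                if (!pfrom.IsInboundConn() && state.GetResult() == BlockValidationResult::BLOCK_CACHED_INVALID) {
                    LogWarning("Header received from peer=%i was previously marked as invalid. "
                               "If this happens with all peers, database corruption is likely and "
                               "-reindex may be necessary to recover.",
                               pfrom.GetId());
                }
                MaybePunishNodeForBlock(pfrom.GetId(), state, via_compact_block, "invalid header received");
                return;
            }
        }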

  2. DrahtBot added the label Validation on Oct 6, 2025
  3. DrahtBot commented at 9:49 pm on October 6, 2025: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/33553.

    Reviews

    See the guideline for information on the review process.

    Type         Reviewers
    ACK          stickies-v, polespinasa
    Concept ACK  janb84, hodlinator

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #30214 (refactor: Improve assumeutxo state representation by ryanofsky)
    • #29641 (scripted-diff: Use LogInfo over LogPrintf [WIP, NOMERGE, DRAFT] by maflcko)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  4. in src/validation.cpp:2035 in d55830cd50
             return;
         }
     
         if (m_chainman.m_best_invalid && m_chainman.m_best_invalid->nChainWork > m_chain.Tip()->nChainWork + (GetBlockProof(*m_chain.Tip()) * 6)) {
    -        LogPrintf("%s: Warning: Found invalid chain at least ~6 blocks longer than our best chain.\nChain state database corruption likely.\n", __func__);
    +        LogPrintf("%s: Warning: Found invalid chain at least ~6 blocks longer than our best chain. Chain state database corruption likely.\n", __func__);
    


    maflcko commented at 7:52 am on October 7, 2025:
            LogWarning("Found invalid chain at least ~6 blocks longer than our best chain. Chain state database corruption likely.");
    

    nit: while touching this, could add the correct log severity?


    mzumsande commented at 2:43 pm on October 7, 2025:
    done
  5. in src/validation.cpp:2029 in d55830cd50 outdated
    @@ -2026,15 +2026,13 @@ void Chainstate::CheckForkWarningConditions()
     {
         AssertLockHeld(cs_main);
     
    -    // Before we get past initial download, we cannot reliably alert about forks
    -    // (we assume we don't get stuck on a fork before finishing our initial sync)
    -    // Also not applicable to the background chainstate
    -    if (m_chainman.IsInitialBlockDownload() || this->GetRole() == ChainstateRole::BACKGROUND) {
    +    // Not applicable to the background chainstate
    


    stickies-v commented at 11:05 am on October 7, 2025:
    nit: trivial docstring, would prefer removing (or elaborating)

    mzumsande commented at 4:03 pm on October 8, 2025:
    removed it. I think it’s obvious enough why it’s not applicable.
  6. mzumsande force-pushed on Oct 7, 2025
  7. janb84 commented at 2:56 pm on October 7, 2025: contributor

    Concept ACK

    This PR introduces an informative warning for the user in the case that the user has an invalid block (header) due to corruption or otherwise.

    NIT: why keep looping? Is there a path where the node can recover from this?

    partial code review ✅

    Gives a warning in case of chain corruption:

    2025-10-07T14:24:52Z Pre-synchronizing blockheaders, height: 266146 (~97.53%)
    2025-10-07T14:24:55Z [warning] Received header for a block previously marked as invalid from peer=0. If this happens with all peers, database corruption is likely and -reindex may be necessary to recover.
    2025-10-07T14:24:56Z Pre-synchronizing blockheaders, height: 4146 (~1.54%)
    

    Loops without warning:

     2025-10-07T14:17:05Z Pre-synchronizing blockheaders, height: 258146 (~94.63%)
     2025-10-07T14:17:05Z Pre-synchronizing blockheaders, height: 262146 (~96.10%)
     2025-10-07T14:17:05Z Pre-synchronizing blockheaders, height: 266146 (~97.53%)
     2025-10-07T14:17:07Z Pre-synchronizing blockheaders, height: 4146 (~1.54%)
     2025-10-07T14:17:07Z Pre-synchronizing blockheaders, height: 16146 (~6.00%)
     2025-10-07T14:17:08Z Pre-synchronizing blockheaders, height: 28146 (~10.45%)
     2025-10-07T14:17:08Z Pre-synchronizing blockheaders, height: 42146 (~15.66%)
     2025-10-07T14:17:08Z Pre-synchronizing blockheaders, height: 54146 (~20.12%)
     2025-10-07T14:17:08Z Pre-synchronizing blockheaders, height: 64146 (~23.84%)
     2025-10-07T14:17:09Z Pre-synchronizing blockheaders, height: 78146 (~29.03%)
     2025-10-07T14:17:09Z Pre-synchronizing blockheaders, height: 92146 (~34.24%)
    
  8. mzumsande commented at 3:23 pm on October 7, 2025: contributor

    NIT: why keep looping? Is there a path where the node can recover from this?

    Not if we are actually corrupted, but we can’t be 100% sure. Maybe we are right, and the next peer would extend the chain that we see as valid.

    There is the possibility of an attack where a miner creates an invalid block with valid PoW (which is costly), and has some sybil nodes feed that chain to others; it wouldn’t be good if this could be used to shut down nodes. Or there could be some kind of chaotic chain split / hard fork scenario. This is not very likely to happen (hopefully), but I don’t think shutting down the node automatically would be a good idea.

  9. in src/net_processing.cpp:2960 in 6ae9af6b98
    @@ -2956,6 +2956,10 @@ void PeerManagerImpl::ProcessHeadersMessage(CNode& pfrom, Peer& peer,
                                                                state, &pindexLast)};
         if (!processed) {
             if (state.IsInvalid()) {
    +            if (state.GetResult() == BlockValidationResult::BLOCK_CACHED_INVALID) {
    +                LogWarning("Received header for a block previously marked as invalid from peer=%d. "
    


    stickies-v commented at 6:09 pm on October 7, 2025:

    nit: to respect line-length and clang-format, and to make it grammatically clearer that the peer in question did not send the previous version of the header:

     diff --git a/src/net_processing.cpp b/src/net_processing.cpp
     index 073a434335..4574c6cf19 100644
     --- a/src/net_processing.cpp
     +++ b/src/net_processing.cpp
     @@ -2957,8 +2957,10 @@ void PeerManagerImpl::ProcessHeadersMessage(CNode& pfrom, Peer& peer,
          if (!processed) {
              if (state.IsInvalid()) {
                  if (state.GetResult() == BlockValidationResult::BLOCK_CACHED_INVALID) {
     -                LogWarning("Received header for a block previously marked as invalid from peer=%d. "
     -                           "If this happens with all peers, database corruption is likely and -reindex may be necessary to recover.", pfrom.GetId());
     +                LogWarning("Header received from peer=%d was previously marked as invalid. If this "
     +                           "happens with all peers, database corruption is likely and restarting "
     +                           "with -reindex may be necessary to recover.",
     +                           pfrom.GetId());
                  }
                  MaybePunishNodeForBlock(pfrom.GetId(), state, via_compact_block, "invalid header received");
                  return;
    

    mzumsande commented at 4:04 pm on October 8, 2025:
    done.
  10. stickies-v commented at 6:12 pm on October 7, 2025: contributor
    Approach ACK, agreed that something non-invasive like logging is the way to go here.
  11. in src/net_processing.cpp:2959 in 6ae9af6b98
    @@ -2956,6 +2956,10 @@ void PeerManagerImpl::ProcessHeadersMessage(CNode& pfrom, Peer& peer,
                                                                state, &pindexLast)};
         if (!processed) {
             if (state.IsInvalid()) {
    +            if (state.GetResult() == BlockValidationResult::BLOCK_CACHED_INVALID) {
    


    stickies-v commented at 11:38 am on October 8, 2025:

    I think there is potential for unconditional logging abuse here. If an inbound peer (which doesn’t get disconnected for sending BLOCK_CACHED_INVALID headers) knows of (or spends the PoW to construct) an invalid block with a valid header, they can resend the same header and spam the log. The impacts should be limited since we now have disk filling mitigation in place, but it’s still annoying at the very least, and it’s very cheap to execute on the entire network when e.g. a miner accidentally constructs such a block.

    Minimal test to showcase:

     diff --git a/test/functional/p2p_invalid_block.py b/test/functional/p2p_invalid_block.py
     index 88bdbe22a8..c5d22c3382 100755
     --- a/test/functional/p2p_invalid_block.py
     +++ b/test/functional/p2p_invalid_block.py
     @@ -104,6 +104,8 @@ class InvalidBlockRequestTest(BitcoinTestFramework):
              block3.solve()
      
              peer.send_blocks_and_test([block3], node, success=False, reject_reason='bad-cb-amount')
     +        for i in range(100):
     +            peer.send_blocks_and_test([block3], node, success=False, reject_reason='Received header for a block previously marked')
      
      
              # Complete testing of CVE-2012-2459 by sending the original block.
    

    mzumsande commented at 4:15 pm on October 8, 2025:

    good point - I restricted the log to outbound peers, which makes sense anyway, because it’s consistent with the fact that we only disconnect those.

    (Besides, I’m curious what the current way to think about unconditional logging should be - be more liberal / care less about spamming? Or keep on being careful about log levels just as before when the rate limiting didn’t exist? A bit offtopic here, but maybe something to discuss at CoreDev…)


    stickies-v commented at 11:10 am on October 30, 2025:

    Or keep on being careful about log levels just as before when the rate limiting didn’t exist?

    Imo rate limiting changes nothing about how we should think about carefully using log levels. Logs are meant to be informative to the user, and any potential for uninformative entries to flood the log, potentially burying important information, should be avoided. Rate limiting is just a belt-and-suspenders, in my view.


    polespinasa commented at 3:27 am on October 31, 2025:

    If an inbound peer (which doesn’t get disconnected for sending BLOCK_CACHED_INVALID headers) knows of (or spends the PoW to construct) an invalid block with a valid header, they can resend the same header and spam the log.

    Just out of curiosity, why don’t we disconnect it?

  12. mzumsande force-pushed on Oct 8, 2025
  13. in src/net_processing.cpp:2965 in 04b943b63c outdated
    +                LogWarning("Header received from peer=%i was previously marked as invalid. "
    +                           "If this happens with all peers, database corruption is likely and "
    +                           "-reindex may be necessary to recover.",
    +                           pfrom.GetId());
    +            }
                 MaybePunishNodeForBlock(pfrom.GetId(), state, via_compact_block, "invalid header received");
    


    hodlinator commented at 9:05 pm on October 8, 2025:

    (Inline comment in random place): Heads up that I’ve started working on a functional test for this at https://github.com/bitcoin/bitcoin/compare/04b943b63ca4c27aa857b03e5de82ed8cae7b17d...hodlinator:bitcoin:pr/33553_suggestions

    It’s super-slow (partially due to HeadersSyncState only being created when receiving the very hardcoded 2000 headers message), and still of questionable usefulness.

  14. hodlinator commented at 9:06 pm on October 8, 2025: contributor

    Concept ACK 04b943b63ca4c27aa857b03e5de82ed8cae7b17d

    Good to see progress on this issue!

  15. stickies-v commented at 1:30 pm on October 13, 2025: contributor

    In case of corruption that leads to a block being marked as invalid

    Is the working assumption that the errors users are reporting in #26391 are due to disk corruption causing exactly the serialized nStatus::BLOCK_FAILED_{VALID,CHILD} bit to be flipped, causing the block to be loaded as invalid instead of valid, without causing any other corruption that would likely crash the application? Because while that wouldn’t be impossible, it seems like an incredibly unlikely event to me?

  16. mzumsande commented at 7:17 pm on October 14, 2025: contributor

    Is the working assumption that the errors users are reporting in #26391 are due to disk corruption causing exactly the serialized nStatus::BLOCK_FAILED_{VALID,CHILD} bit to be flipped, causing the block to be loaded as invalid instead of valid, without causing any other corruption that would likely crash the application? Because while that wouldn’t be impossible, it seems like an incredibly unlikely event to me?

    No, that would indeed be unlikely. The assumption is that some kind of corruption led to a downloaded block being marked as invalid during ConnectBlock() erroneously - the BLOCK_FAILED flag is then set by the validation code as a result. The original corruption could also be with the UTXO set (e.g. an input needed by the block is no longer there) or with the block.

  17. hodlinator commented at 1:36 pm on October 15, 2025: contributor

    Please excuse me while I beat the dead horse before digging deeper. I was also assuming this had to do with some kind of corruption in the bits of CBlockIndex::nStatus while at rest on disk, thanks for clearing this up! In #26391 (comment) we find what triggered the original issue:

    ERROR: ConnectBlock: Consensus::CheckTxInputs: 878d6685666400b75a1947ccfc676249ecdf52678b2dc0d83e0328f8c24a951a, bad-txns-inputs-missingorspent, CheckTxInputs: inputs missing/spent
    InvalidChainFound: invalid block=000000000000032021a6d18011d202df36cf07822a657b47390ab90568bb14e2  height=224855  log2_work=69.513793  date=2013-03-08T15:58:52Z
    

    When checking the inputs of the above transaction https://mempool.space/tx/878d6685666400b75a1947ccfc676249ecdf52678b2dc0d83e0328f8c24a951a, the UTXO set (chainstate) for some reason doesn’t contain the expected entries. We mark the block as invalid, even though the UTXO set is the culprit. The transaction itself is not corrupt, since it hashed into a valid txid. And the block (224855) is probably intact, as the inputs are from older blocks, 224108 and 224589, and earlier checks in ConnectBlock() would probably have failed too if the block data had been mutated.
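
    The path described here is roughly the following, simplified from ConnectBlock() in validation.cpp (signatures approximate, surrounding code omitted):

        // Inside ConnectBlock(): a missing/spent input makes CheckTxInputs fail, and that
        // failure is treated as a consensus failure of the block itself, even if the real
        // problem is a corrupted UTXO set.
        TxValidationState tx_state;
        CAmount txfee = 0;
        if (!Consensus::CheckTxInputs(tx, tx_state, view, pindex->nHeight, txfee)) {
            // e.g. "bad-txns-inputs-missingorspent", as in the log above
            state.Invalid(BlockValidationResult::BLOCK_CONSENSUS,
                          tx_state.GetRejectReason(), tx_state.GetDebugMessage());
            return false; // the block then gets BLOCK_FAILED_VALID set and InvalidChainFound is logged
        }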

    Q1: Maybe bitcoind could automatically wipe the UTXO set if the current block is an ancestor of the assume valid hash, instead of marking the block as invalid?

    Another, less drastic alternative could be to not automatically wipe the UTXO set, but log a suggestion to -reindex and, if the block is an ancestor of the assume-valid point, shut the node down.

    In the full debug.log from the original issue one can see that 224108 and 224589 were downloaded within the same process lifetime as 224855. So this is the UTXO set being corrupted while bitcoind is running.

    Q2: Do we suspect that the UTXO set was corrupted / out of sync due to a logic error, or due to a cosmic-ray-type event out of our control?

    Edit: Clarified Qs and avoided negation. Edit2: Suggested less drastic alternative to wiping UTXO set. Corrected assume UTXO -> assume valid.

  18. in src/net_processing.cpp:2960 in 04b943b63c outdated
    @@ -2956,6 +2956,12 @@ void PeerManagerImpl::ProcessHeadersMessage(CNode& pfrom, Peer& peer,
                                                                state, &pindexLast)};
         if (!processed) {
             if (state.IsInvalid()) {
    +            if (!pfrom.IsInboundConn() && state.GetResult() == BlockValidationResult::BLOCK_CACHED_INVALID) {
    +                LogWarning("Header received from peer=%i was previously marked as invalid. "
    


    stickies-v commented at 3:11 pm on October 15, 2025:

    This doesn’t combine well with invalidateblock: e.g. when invalidating the tip, the log just gets spammed with warnings as we cycle through new peers:

     2025-10-15T14:45:32Z UpdateTip: new best=000000092ca51c152010fd76552e12d9c9c7f410ebd04012dcff49eeecb6cfff height=274019 version=0x20000000 log2_work=42.943635 tx=27881630 date='2025-10-15T14:27:13Z' progress=1.000000 cache=71.8MiB(503926txo)
     ...
     2025-10-15T14:45:50Z UpdateTip: new best=0000000b9e7f0760c48d76ef301616dbaa1421788b29911c02f7beacbd20b867 height=274018 version=0x20000000 log2_work=42.943600 tx=27881585 date='2025-10-15T14:12:48Z' progress=0.999984 cache=71.8MiB(503258txo)
     2025-10-15T14:45:50Z InvalidChainFound: invalid block=000000092ca51c152010fd76552e12d9c9c7f410ebd04012dcff49eeecb6cfff  height=274019  log2_work=42.943635  date=2025-10-15T14:27:13Z
     2025-10-15T14:45:50Z InvalidChainFound:  current best=0000000b9e7f0760c48d76ef301616dbaa1421788b29911c02f7beacbd20b867  height=274018  log2_work=42.943600  date=2025-10-15T14:12:48Z
     2025-10-15T14:46:01Z New outbound-full-relay v2 peer connected: version: 70016, blocks=274019, peer=8
     2025-10-15T14:46:01Z [warning] Header received from peer=8 was previously marked as invalid. If this happens with all peers, database corruption is likely and -reindex may be necessary to recover.
     2025-10-15T14:46:03Z New outbound-full-relay v2 peer connected: version: 70016, blocks=274019, peer=9
     2025-10-15T14:46:03Z [warning] Header received from peer=9 was previously marked as invalid. If this happens with all peers, database corruption is likely and -reindex may be necessary to recover.
     2025-10-15T14:46:03Z New outbound-full-relay v2 peer connected: version: 70016, blocks=274019, peer=10
     2025-10-15T14:46:03Z [warning] Header received from peer=10 was previously marked as invalid. If this happens with all peers, database corruption is likely and -reindex may be necessary to recover.
    

    It seems like the main purpose of this new log message is to help users understand endless pre-sync loops. In that case, maybe we can improve on this by making the warning specific to the pre-sync?

    I’ve created a branch which implements that approach, see diff: https://github.com/mzumsande/bitcoin/compare/202510_headersync_loop..stickies-v:bitcoin:2025-10/log-presync-invalid-chain

    (note: I used static std::atomic<int64_t> previous_height{height} as a quick fix that works well enough, but adding a ChainstateManager::m_last_presync_height would be better)
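
    A rough illustration of that idea (hypothetical member and function names, not code from this PR or the linked branch): track the last reported pre-sync height and warn when a new pre-sync starts over below it.

        // Hypothetical sketch: a ChainstateManager member instead of a function-local static.
        std::atomic<int64_t> m_last_presync_height{0};

        void NotePresyncHeight(int64_t height)
        {
            const int64_t prev{m_last_presync_height.exchange(height)};
            if (prev > 0 && height < prev) {
                // Pre-sync went backwards: the previous attempt was abandoned and restarted.
                LogWarning("Headers pre-sync restarted below the previously reached height (%d). "
                           "If this keeps happening with all peers, database corruption or a "
                           "consensus failure is possible.", prev);
            }
        }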


    mzumsande commented at 1:57 am on October 16, 2025:

    This doesn’t combine well with invalidateblock

    Being out of consensus with all outbound peers is a situation as catastrophic as it gets, and probably requires intervention by the user, so why wouldn’t we want to be a bit spammy in this scenario? Outside of a real catastrophic chain split, one could of course artificially induce this situation with invalidateblock (which is debug-only) and then cycle through outbound peers, but I don’t see the use case: why would anyone do this, and why should we care about it?


    stickies-v commented at 11:12 am on October 16, 2025:

    why wouldn’t we want to be a bit spammy in this scenario?

    I don’t disagree, but it seems orthogonal to this PR which is addressing db corruption issues? It’s okay for warning messages to not cover all edge cases, but I think it is good to try to avoid incorrect diagnostics or solutions if we can help it without too much complexity. invalidateblock is debug-only, but is also commonly referenced as a break-in-case-of-emergency RPC. With the current change, usage of the RPC is guaranteed to produce incorrect and undocumented warning logs, so I don’t think that’s ideal.

    In this case, it’s tricky because we have 2 different and quite catastrophic issues ((1) db corruption and (2) consensus issues) with overlapping symptoms: receiving seemingly invalid headers from peers. However, since statistically speaking the db corruption is very likely to affect historical blocks and thus trigger a new headers (pre-)sync, logging that at the header (pre-)sync stage seems more accurate to me, without downside? Consensus issues (that don’t seem likely to be db corruption issues) should in most cases happen at the tip, and can still be logged/warned as such, with a more accurate message.


    stickies-v commented at 11:14 am on October 16, 2025:
    Alternatively, I would be okay with (but not prefer) the log message to be generalized to highlight that db corruption or consensus failure are possible, and the invalidateblock RPC documentation updated to highlight that this warning will be flooded upon usage.

    mzumsande commented at 8:49 pm on October 29, 2025:

    I agree that mentioning a consensus error makes sense, so I added it to the warning message. Note that I didn’t change the existing CheckForkWarningConditions() message as well, but could do that too if reviewers suggest it.

    In principle, I don’t see a difference whether this scenario happens during IBD or close to the tip. In both cases, it is most likely corruption, but a consensus error is possible too. Current behaviour is that in the IBD case, we will cycle through header-syncs with no warning (but unrelated logging); in the tip case, we will silently cycle through outbound peers, all of which we disconnect, with no warning at all. I think it makes sense to alert the user in both situations.

    As for invalidateblock: If it is used in an actual emergency case, I think that it is correct to repeatedly warn the user about what’s going on because they should probably do something to get compatible outbound peers - a few repeated log messages shouldn’t be a concern there, and eventually rate-limiting will kick in.

    If it is used for testing, having these warnings could remind the user to switch off network connectivity, as the -dumptxoutset use case already does.


    stickies-v commented at 11:05 am on October 30, 2025:

    As for invalidateblock:… because they should probably do something to get compatible outbound peers…

    I don’t think that’s necessarily true. In the “happy” path use case of invalidateblock, a large number of nodes will have done the same, so compatible nodes should be found automatically. Other approaches to prevent connecting with incompatible nodes would of course be better, but we’re talking about an emergency here.

    … and eventually rate-limiting will kick in.

    Yes, but rate limiting still allows 1MB of logging every hour, so while that mitigates crash issues, it definitely isn’t a solution for UX.

    That said, I agree that being noisy about receiving invalid headers is better than being quiet, and that my earlier suggested approach is no improvement. In case of a legitimate network-wide use case for invalidateblock, sustained noisy logs will be the last of people’s concerns, and they can always be filtered until a patch is rolled out. Perhaps someone comes up with a better alternative later on, but I think this PR is an improvement to master and I can’t think of a better approach myself.


    stickies-v commented at 11:12 am on October 30, 2025:
    I think logging the block hash would be helpful here?
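
    For illustration, the hash could be added along these lines, with header_hash standing in for however the rejected header's hash would be obtained at this point (hypothetical, not part of the PR):

        LogWarning("Header %s received from peer=%i was previously marked as invalid. "
                   "This may be due to database corruption or a consensus failure; if this "
                   "happens with all peers, -reindex may be necessary to recover.",
                   header_hash.ToString(), pfrom.GetId());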
  19. p2p: Add warning message when receiving headers for blocks cached as invalid
    Currently, if database corruption leads to a block being marked as
    invalid incorrectly, we can get stuck in an infinite headerssync
    loop with no indication what went wrong or how to fix it.
    With the added log message, users will receive an explicit warning after each
    failed headerssync attempt with an outbound peer.
    f8955eaee1
  20. validation: call CheckForkWarningConditions during IBD, and at startup
    The existing IBD check was added at a time when CheckForkWarningConditions
    also did sophisticated fork detection that could lead to false positives
    during IBD (55ed3f14751206fc87f0cbf8cb4e223efacef338).
    
    The fork detection logic doesn't exist anymore
    (since fa62304c9760f0de9838e56150008816e7a9bacb), so the IBD check is no
    longer necessary.
    
    Displaying the log at startup will help node operators diagnose the
    problem better.
    6d2c8ea9db
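
    Taken together with the review diffs above, the guard in Chainstate::CheckForkWarningConditions() roughly ends up as follows (sketch reconstructed from the snippets in this thread, not the exact merged code; the new startup call site is not shown):

        void Chainstate::CheckForkWarningConditions()
        {
            AssertLockHeld(cs_main);

            // The IBD early-return is gone; only the background chainstate is skipped.
            if (this->GetRole() == ChainstateRole::BACKGROUND) {
                return;
            }

            if (m_chainman.m_best_invalid && m_chainman.m_best_invalid->nChainWork > m_chain.Tip()->nChainWork + (GetBlockProof(*m_chain.Tip()) * 6)) {
                LogWarning("Found invalid chain at least ~6 blocks longer than our best chain. Chain state database corruption likely.");
                // ...
            }
            // ...
        }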
  21. mzumsande force-pushed on Oct 29, 2025
  22. stickies-v approved
  23. stickies-v commented at 11:17 am on October 30, 2025: contributor
    ACK 6d2c8ea9dbd77c71051935b5ab59224487509559
  24. DrahtBot requested review from hodlinator on Oct 30, 2025
  25. DrahtBot requested review from janb84 on Oct 30, 2025
  26. polespinasa approved
  27. polespinasa commented at 3:28 am on October 31, 2025: member

    code reviewed and tested ACK 6d2c8ea9dbd77c71051935b5ab59224487509559

    This PR introduces an informative warning about an invalid block header, which may fix issues like #26391 by giving more information to the user.

    Tried to force the warning (f8955eaee19d592cc20c497d145f5eae25277014) following #26391 (comment) and it behaves as expected.

    Non-blocking nit: I agree with @stickies-v that the block hash may be useful in the message.

