A compact block filter request might arrive when a block has already been processed (and advertised) but its filter has not yet been constructed. In that window the node fails to respond; the requesting peer treats this as misbehaviour and may disconnect or ban the node.
Fixes: #29655, #27085

Related discussion: 1, 2
The problem: Bitcoin Core receives a new block and processes it, then advertises it; however, by that moment the compact filter might not yet have been constructed. If a peer asks for a filter in that window, Bitcoin Core simply does not respond. From the peer's perspective this is a misbehaviour (we advertised a block and don't provide filters for it), which results in disconnects or bans. While it does not critically affect core nodes, it breaks the network topology for BIP 157 clients.
More precisely, BIP 157 specifies the following for a getcfilters request:
StopHash MUST be known to belong to a block accepted by the receiving peer. This is the case if the peer had previously sent a headers or inv message with that block or any descendents. A node that receives getcfilters with an unknown StopHash SHOULD NOT respond.
Current behaviour clearly contradicts the above specification.
I (re-)discovered it while testing the "light" compact block filter node kyoto and its addition to ldk-node; it was also discussed in the context of neutrino.
Proposed solution: compute filters on the fly when they are requested for blocks we already have. Unlike filter headers, individual filters do not commit to their predecessors, so we can compute them independently for a set of blocks. Moreover we compute each filter at most once and persist it, so subsequent requests proceed as usual by reading the already-computed filter from disk.
Changes:
- `LookupFilter`
- `LookupFilterHeader`
- `LookupFilterRange`
- `LookupFilterHashRange`
Schematically LookupFilter now works as follows:
- On filter request, check if we know the requested block. If not, abort.
- Check if the filter is already computed and stored. If so, respond with it.
- Otherwise compute the filter, persist it, respond with it.
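The three steps above can be sketched as follows. This is a toy model only, not the code in this PR: `ToyFilterIndex`, its members and the string-based `BlockHash`/`Filter` types are hypothetical stand-ins for the real index and block data.

```cpp
#include <map>
#include <optional>
#include <set>
#include <string>

// Hypothetical stand-ins for illustration; not Bitcoin Core's real types.
using BlockHash = std::string;
using Filter = std::string;

struct ToyFilterIndex {
    std::set<BlockHash> known_blocks;    // blocks we have accepted and stored
    std::map<BlockHash, Filter> on_disk; // filters already persisted
    int computed = 0;                    // number of filters built on demand

    // Placeholder for real filter construction from block data.
    Filter Compute(const BlockHash& hash)
    {
        ++computed;
        return "filter(" + hash + ")";
    }

    std::optional<Filter> LookupFilter(const BlockHash& hash)
    {
        // 1. Unknown block: abort.
        if (known_blocks.count(hash) == 0) return std::nullopt;
        // 2. Filter already stored: respond with it.
        if (auto it = on_disk.find(hash); it != on_disk.end()) return it->second;
        // 3. Otherwise compute, persist, and respond.
        Filter filter = Compute(hash);
        on_disk.emplace(hash, filter);
        return filter;
    }
};
```

In this model a second request for the same block hits step 2, so `computed` increases at most once per block.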
LookupFilterRange (and LookupFilterHashRange) now work as follows:
- We are asked for a filter range, say `[tip-N, tip]` (BIP 157 caps `N` at 1000).
- Check if filters for the full range `[tip-N, tip]` are already on disk. If so, respond with them.
- Otherwise check if filters for the stable subrange `[tip-N, tip-10]` are all on disk. If any are missing, fail: those are too far behind the tip to be a race-window miss, and we refuse to recompute them on demand.
- For each block in `[tip-9, tip]` that is missing from the index, compute its filter on the fly. Then respond with the complete range.
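A minimal sketch of the range logic, under simplifying assumptions: blocks are identified by height only, `ToyRangeIndex` and its members are hypothetical, and the "full range already on disk" fast path is folded into the main loop rather than checked separately. Only the constant name `ONTHEFLY_TIP_WINDOW` and its value 10 come from this description.

```cpp
#include <map>
#include <optional>
#include <string>
#include <vector>

constexpr int ONTHEFLY_TIP_WINDOW = 10; // value stated in this description

// Hypothetical toy index keyed by height; not the real BlockFilterIndex.
struct ToyRangeIndex {
    int tip_height = 0;
    std::map<int, std::string> on_disk; // height -> persisted filter
    int computed = 0;                   // number of filters built on demand

    // Placeholder for real filter construction from block data.
    std::string Compute(int height)
    {
        ++computed;
        return "filter@" + std::to_string(height);
    }

    std::optional<std::vector<std::string>> LookupFilterRange(int start, int stop)
    {
        if (start < 0 || start > stop || stop > tip_height) return std::nullopt;
        // First height eligible for on-the-fly computation (tip-9).
        const int window_start = tip_height - ONTHEFLY_TIP_WINDOW + 1;
        // Stable subrange: every filter must already be on disk, else fail.
        for (int h = start; h <= stop && h < window_start; ++h) {
            if (on_disk.count(h) == 0) return std::nullopt;
        }
        // Read stored filters; compute and persist any missing ones near the tip.
        std::vector<std::string> filters;
        for (int h = start; h <= stop; ++h) {
            auto it = on_disk.find(h);
            if (it == on_disk.end()) it = on_disk.emplace(h, Compute(h)).first;
            filters.push_back(it->second);
        }
        return filters;
    }
};
```

With the tip at height 100 and filters stored up to height 95, a request for `[80, 100]` computes five filters in this model; repeating the request computes none.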
LookupFilterHeader is used by ProcessGetCFHeaders to fetch prev_filter_header at start - 1 of a requested range. For a getcfheaders [tip, tip] request, that predecessor is at tip - 1, which can itself be inside the race window. Without on-the-fly fallback on this path, the response would fail even though the filter hashes are recoverable. All four lookup paths now share the same chain-aware fallback.
This change exposes the possibility that an external peer can force our node to compute filters on the fly and thus burn CPU. However this is not a concern, because:
- It can only happen once per filter — subsequent requests read from disk.
- There is a hard limit on the depth of on-the-fly computation: only the 10 blocks below the active chain tip (`ONTHEFLY_TIP_WINDOW`).
- The whole window is tiny (single-digit milliseconds on testnet4, tens of milliseconds on mainnet; see numbers below).
The value 10 for the constant ONTHEFLY_TIP_WINDOW is somewhat arbitrary and could perhaps be lowered. The deepest window I observed myself was 2 blocks, on regtest.
Benchmarks
I benchmarked the four scenarios below on testnet4 (heights 134703–134802). The benchmark files are not included in this PR because they require a synced node and real block data, but I can include them in an additional commit on request.
- 1a. Request 100 filters, all already computed and stored.
- 1b. Request 100 filters, 90 already stored, 10 computed on the fly (the worst case for a 100-block request).
- 2a. Request 10 filters, all already stored.
- 2b. Request 10 filters, none stored — all computed on the fly (the worst case for a 10-block request).
Results on testnet4:
| Scenario | Per-request total | Per-block on-the-fly cost | Overhead vs all-indexed |
|---|---|---|---|
| 1a (100 indexed) | 0.97 ms | — | baseline |
| 1b (100, 10 on the fly) | 1.25 ms | ~28 µs | +0.28 ms |
| 2a (10 indexed) | 0.099 ms | — | baseline |
| 2b (10 all on the fly) | 0.50 ms | ~40 µs | +0.40 ms |
Extrapolated mainnet figures (compute scales roughly with block size and script count; estimated from a per-block compute cost of ~3.4 ms measured against a real mainnet datadir):
| Scenario | Per-request total | Overhead vs all-indexed |
|---|---|---|
| 1a (100 indexed) | ~2.5 ms | baseline |
| 1b (100, 10 on the fly) | ~36 ms | +~34 ms |
| 2a (10 indexed) | ~0.25 ms | baseline |
| 2b (10 all on the fly) | ~34 ms | +~34 ms |
In other words, the worst-case extra cost a hostile peer can extract is ~34 ms per request, bounded to the 10 tip blocks, paid at most once per block per node lifetime due to the write-back.
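The ~34 ms bound is just the window size times the per-block compute cost. A small sketch of that arithmetic, using the ~3.4 ms mainnet estimate and the window of 10 quoted above (the function and constant names are made up for illustration):

```cpp
// Figures taken from this description; the names are hypothetical.
constexpr double kPerBlockComputeMs = 3.4; // estimated mainnet cost per filter
constexpr int kOnTheFlyWindow = 10;        // ONTHEFLY_TIP_WINDOW

// Extra compute a single request can trigger, given how many filters
// in the requested range are missing. Anything beyond the tip window is
// refused rather than computed, so the count is clamped to the window.
constexpr double WorstCaseOverheadMs(int missing_filters)
{
    const int clamped = missing_filters < 0 ? 0
                      : missing_filters > kOnTheFlyWindow ? kOnTheFlyWindow
                      : missing_filters;
    return clamped * kPerBlockComputeMs;
}
```

For ten missing filters this gives roughly 34 ms, and the bound does not grow with the size of the requested range.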
In theory, on-the-fly entries interact correctly with reorgs: the standard hash-vs-height fallback in LookUpOne ensures CustomAppend overwrites them with the new chain's filter, and CustomRemove preserves them via the hash index just like any other filter. I haven't tested it with block reorganizations.
This PR does not affect block propagation, as the very root of this bug is the independence of block processing and filter construction in CustomAppend. Nor does it affect the initial block download, since during IBD a node does not respond to compact filter messages.
I've used Claude Code for this PR.