net: check for empty header before calling FillBlock #33296

pull Crypt-iQ wants to merge 2 commits into bitcoin:master from Crypt-iQ:cmpctblock_assume_fix_09032025 changing 2 files +55 −0
  1. Crypt-iQ commented at 5:54 pm on September 3, 2025: contributor
    This avoids an Assume crash if multiple blocktxn messages are received. The first call to FillBlock would make the header empty via SetNull and the call right before the second FillBlock would crash here since LookupBlockIndex won’t find anything. Fix that by checking for an empty header before the Assume.
  2. DrahtBot added the label P2P on Sep 3, 2025
  3. DrahtBot commented at 5:54 pm on September 3, 2025: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/33296.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK instagibbs, mzumsande, fjahr, achow101

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

  4. in src/net_processing.cpp:3332 in eb8bc005e0 outdated
    3328@@ -3329,6 +3329,11 @@ void PeerManagerImpl::ProcessCompactBlockTxns(CNode& pfrom, Peer& peer, const Bl
    3329 
    3330         PartiallyDownloadedBlock& partialBlock = *range_flight.first->second.second->partialBlock;
    3331 
    3332+        if (partialBlock.header.IsNull()) {
    


    instagibbs commented at 5:59 pm on September 3, 2025:
    might note that the header is blanked in FillBlock, but the partial block never deleted, which is why this would catch the behavior

    Crypt-iQ commented at 6:05 pm on September 3, 2025:
    Yup good point, will do.

    Crypt-iQ commented at 9:40 pm on September 3, 2025:
    Done in 0df02ac0e870bc34c90f95d2a429e6dcc9217dfa

    dergoegge commented at 9:06 am on September 4, 2025:
    Wondering if we should always wipe the partialBlock (i.e. downgrade the request to a regular block request) when FillBlock fails. Then it would no longer be possible to process two blocktxn messages for the same block.

    Crypt-iQ commented at 12:35 pm on September 4, 2025:

    @instagibbs suggested this as well offline. I looked into it and the main issue is that it allows a peer to stall block download:

    1. malicious peer sends compact block, reconstruction is required
    2. malicious peer sends blocktxn and reconstruction fails, fallback to regular block request with this peer (and wiping partialBlock)
    3. malicious peer goes back to step 1, preventing honest non-compact-block-relay peers from giving the block. This does not work in current master because of this else statement.

    I think there is probably a solution here where the malicious peer couldn’t keep doing this, but it might require some additional state? I think it would be better to not keep the partialBlock around.


    mzumsande commented at 10:27 pm on September 4, 2025:

    allows a peer to stall block download

    I guess this would only be a serious problem in the case where all of our other peers don’t support compact block (because otherwise parallel compact blocks would allow us to download the block).

    In any case, this was non-obvious to me. Maybe add a comment in the spot where we fall back to GETDATA explaining that we don’t wipe the partialBlock there on purpose?


    Crypt-iQ commented at 12:38 pm on September 5, 2025:

    where all of our other peers don’t support compact block

    This, or a malicious peer somehow controls all three slots (harder since one slot is outbound), or we are in -blocksonly mode.

    In any case, this was non-obvious to me. Maybe add a comment in the spot where we fall back to GETDATA explaining that we don’t wipe the partialBlock there on purpose?

    Good idea, I will add a comment. I am not sure if the existing logic is on purpose, but the way it is written in master makes it impossible to stall block download indefinitely since the first_in_flight peer needs to make forward progress after the getdata fallback after 10 minutes or get disconnected.


    Crypt-iQ commented at 12:56 pm on September 5, 2025:
    It may be possible to call reset() on partialBlock to wipe it. I was assuming that wiping would mean calling RemoveBlockRequest (resets m_downloading_since which allows block download stalling) followed by BlockRequested. I will have to think more on this because I am not sure if it introduces a footgun.

    Crypt-iQ commented at 2:01 pm on September 5, 2025:
    Comment added in 7c1b388c7ab2059b562c89b1e7c7dd5b641a9cb8
  5. Crypt-iQ force-pushed on Sep 3, 2025
  6. Crypt-iQ marked this as ready for review on Sep 3, 2025
  7. fanquake added this to the milestone 30.0 on Sep 4, 2025
  8. instagibbs commented at 4:11 pm on September 4, 2025: member

    concept ACK, as per offline conversations. It would be really nice to make the state machine better, for now we can at least avoid the Assume in debug builds when a peer does this.

    We could also disconnect the peer for this behavior, I think, but not a priority

  9. in src/net_processing.cpp:3339 in 0df02ac0e8 outdated
    3328@@ -3329,6 +3329,14 @@ void PeerManagerImpl::ProcessCompactBlockTxns(CNode& pfrom, Peer& peer, const Bl
    3329 
    3330         PartiallyDownloadedBlock& partialBlock = *range_flight.first->second.second->partialBlock;
    3331 
    3332+        if (partialBlock.header.IsNull()) {
    3333+            // It is possible for the header to be empty if a previous call to FillBlock wiped the header, but left
    3334+            // the PartiallyDownloadedBlock pointer around (i.e. did not call RemoveBlockRequest). In this case, we
    3335+            // should not call LookupBlockIndex below.
    3336+            LogDebug(BCLog::NET, "Peer %d sent compact block transactions multiple times\n", pfrom.GetId());
    3337+            return;
    


    mzumsande commented at 10:36 pm on September 4, 2025:
    I think this will change behavior in non-debug builds such that repeated blocktxn messages will now be ignored (whereas they would result in READ_STATUS_INVALID / disconnection before). Shouldn’t we rather keep on disconnecting the peer?

    Crypt-iQ commented at 12:39 pm on September 5, 2025:
    Good point, I will preserve the logic by disconnecting the peer.

    Crypt-iQ commented at 2:01 pm on September 5, 2025:
    Added this in 7c1b388c7ab2059b562c89b1e7c7dd5b641a9cb8
  10. mzumsande commented at 10:36 pm on September 4, 2025: contributor
    Concept ACK
  11. Crypt-iQ force-pushed on Sep 5, 2025
  12. in src/net_processing.cpp:3336 in 7c1b388c7a outdated
    3328@@ -3329,6 +3329,16 @@ void PeerManagerImpl::ProcessCompactBlockTxns(CNode& pfrom, Peer& peer, const Bl
    3329 
    3330         PartiallyDownloadedBlock& partialBlock = *range_flight.first->second.second->partialBlock;
    3331 
    3332+        if (partialBlock.header.IsNull()) {
    3333+            // It is possible for the header to be empty if a previous call to FillBlock wiped the header, but left
    3334+            // the PartiallyDownloadedBlock pointer around (i.e. did not call RemoveBlockRequest). In this case, we
    3335+            // should not call LookupBlockIndex below.
    3336+            RemoveBlockRequest(block_transactions.blockhash, pfrom.GetId());
    


    instagibbs commented at 2:08 pm on September 5, 2025:
    IIUC block requests are cleaned up during disconnect, so this would allow noban/manual connections to re-start trying to give us blocks vs freeze them out?

    Crypt-iQ commented at 2:31 pm on September 5, 2025:

    Yup, like @mzumsande pointed out, this shouldn’t be a change in behavior since they would previously hit the READ_STATUS_INVALID case.

    I didn’t consider noban/manual connections for stalling block download as it seems like this is already possible by triggering either of the READ_STATUS_INVALID cases (whether here or here) and then having SendMessages request it again here.

    I think it shouldn’t be allowed for a noban/manual connection to do this, but since this is pre-existing I didn’t really consider it.

  13. in src/net_processing.cpp:3353 in 7c1b388c7a outdated
    3349@@ -3340,6 +3350,11 @@ void PeerManagerImpl::ProcessCompactBlockTxns(CNode& pfrom, Peer& peer, const Bl
    3350         } else if (status == READ_STATUS_FAILED) {
    3351             if (first_in_flight) {
    3352                 // Might have collided, fall back to getdata now :(
    3353+                // If we wipe the partialBlock here via RemoveBlockRequest followed by BlockRequested instead of leaving
    


    instagibbs commented at 2:20 pm on September 5, 2025:

    How about something like:

    // We keep the failed partialBlock to disallow processing another compact block announcement from the same // peer for the same block. We let the full block download below continue under the same m_downloading_since timer.


    Crypt-iQ commented at 2:42 pm on September 5, 2025:
    Thanks, this is less verbose. I’ve given commit attribution since I copied the text verbatim.

    instagibbs commented at 2:43 pm on September 5, 2025:
    thanks, as a future reader I always get suspicious the longer the comment is :)
  14. Crypt-iQ force-pushed on Sep 5, 2025
  15. instagibbs approved
  16. instagibbs commented at 1:32 pm on September 8, 2025: member
    ACK 2d31c5ae72940c594ec19d745bdbf09284c257d2
  17. DrahtBot requested review from mzumsande on Sep 8, 2025
  18. in src/net_processing.cpp:3337 in be0fb2cecb outdated
    3328@@ -3329,6 +3329,16 @@ void PeerManagerImpl::ProcessCompactBlockTxns(CNode& pfrom, Peer& peer, const Bl
    3329 
    3330         PartiallyDownloadedBlock& partialBlock = *range_flight.first->second.second->partialBlock;
    3331 
    3332+        if (partialBlock.header.IsNull()) {
    3333+            // It is possible for the header to be empty if a previous call to FillBlock wiped the header, but left
    3334+            // the PartiallyDownloadedBlock pointer around (i.e. did not call RemoveBlockRequest). In this case, we
    3335+            // should not call LookupBlockIndex below.
    3336+            RemoveBlockRequest(block_transactions.blockhash, pfrom.GetId());
    3337+            Misbehaving(peer, "invalid compact block/non-matching block transactions");
    


    fjahr commented at 6:19 pm on September 8, 2025:
    Hm, curious if this should maybe be a different (unique) message so we can clearly distinguish which case between this one and the one below was hit?

    Crypt-iQ commented at 9:05 pm on September 8, 2025:
    I originally decided not to, but makes sense. Done in 5e585a0fc4fd68dd7b4982054b34deae2e7aeb89
  19. in src/net_processing.cpp:3338 in be0fb2cecb outdated
    3333+            // It is possible for the header to be empty if a previous call to FillBlock wiped the header, but left
    3334+            // the PartiallyDownloadedBlock pointer around (i.e. did not call RemoveBlockRequest). In this case, we
    3335+            // should not call LookupBlockIndex below.
    3336+            RemoveBlockRequest(block_transactions.blockhash, pfrom.GetId());
    3337+            Misbehaving(peer, "invalid compact block/non-matching block transactions");
    3338+            LogDebug(BCLog::NET, "Peer %d sent compact block transactions multiple times\n", pfrom.GetId());
    


    fjahr commented at 6:47 pm on September 8, 2025:
    Log strings don’t require a newline at the end anymore since #30929

    Crypt-iQ commented at 9:06 pm on September 8, 2025:
    Thanks, done in 5e585a0fc4fd68dd7b4982054b34deae2e7aeb89
  20. fjahr commented at 6:53 pm on September 8, 2025: contributor

    Concept ACK

    It’s great that there is now documentation for this but it’s still not ideal that some magic state has this special meaning that is very implicit. I would have liked it more if it was explicit but I haven’t spend enough time on this part of the code base what would be ideal and I don’t want this to block v30 more than necessary. Maybe a simple bool flag failed_reconstruction on PartiallyDownloadedBlock would work? Feel free to ignore.

  21. net: check for empty header before calling FillBlock
    Previously in debug builds, this would cause an Assume crash if
    FillBlock had been called previously. This could happen when multiple
    blocktxn messages were received.
    
    Co-Authored-By: Greg Sanders <gsanders87@gmail.com>
    5e585a0fc4
  22. test: send duplicate blocktxn message in p2p_compactblocks.py
    Add test_multiple_blocktxn_response that checks that the peer is
    disconnected.
    8b62647680
  23. Crypt-iQ force-pushed on Sep 8, 2025
  24. instagibbs commented at 9:14 pm on September 8, 2025: member
    @fjahr definitely not ideal, this PR is basically only removing the Assume crash while better state machine can be worked on
  25. DrahtBot requested review from fjahr on Sep 8, 2025
  26. Crypt-iQ commented at 9:28 pm on September 8, 2025: contributor

    Maybe a simple bool flag failed_reconstruction on PartiallyDownloadedBlock would work? Feel free to ignore.

    This would work, but I think we should wait until there is an in-repo fuzz harness with good coverage before trying to refactor the net_processing state or the PartiallyDownloadedBlock state. I think a bool attempted_reconstruction could replace the calls to SetNull() / IsNull(). An ideal solution would be to get rid of the unusable PartiallyDownloadedBlock while making sure a malicious peer that is first_in_flight can’t stall block download by resetting the m_downloading_since timer described here (there is some nuance here related to #22144 that I need to remember, the second and third peers can safely and freely reset their timer in this call to RemoveBlockRequest).

  27. mzumsande commented at 10:21 pm on September 8, 2025: contributor
    Code Review ACK 8b6264768030db1840041abeeaeefd6c227a2644
  28. fjahr commented at 11:44 pm on September 8, 2025: contributor

    tACK 8b6264768030db1840041abeeaeefd6c227a2644

    Reproduced the issue on debug build master using the test and ensured the changes here make the test pass.

  29. achow101 commented at 0:02 am on September 9, 2025: member
    ACK 8b6264768030db1840041abeeaeefd6c227a2644
  30. achow101 merged this on Sep 9, 2025
  31. achow101 closed this on Sep 9, 2025

  32. fanquake added the label Needs backport (29.x) on Sep 9, 2025
  33. fanquake referenced this in commit 569ceb0df4 on Sep 12, 2025
  34. fanquake referenced this in commit 1288d44804 on Sep 12, 2025
  35. fanquake removed the label Needs backport (29.x) on Sep 12, 2025
  36. fanquake commented at 2:46 pm on September 12, 2025: member
    Backported to 29.x in #33344.

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-10-10 15:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me