refactor: extract per-message helpers from ProcessMessage (move-only) #35502

pull w0xlt wants to merge 7 commits into bitcoin:master from w0xlt:refactor/extract-processmessage-helpers changing 1 files +485 −429
  1. w0xlt commented at 1:54 AM on June 10, 2026: contributor

    PeerManagerImpl::ProcessMessage() is a ~1000-line function handling every p2p message type inline, which makes it hard to navigate and review.

    This PR continues splitting it into per-message helper functions, following the pattern of the recently merged fa5ab0220e02377c3c855042ecdf1f5f950d0965 (ProcessPong()) and fa55723b8fbd4fd056dddac5b35daf2e86021422 (ProcessAddrs()),, as suggested by maflcko #34588 (comment).

    Seven handlers are extracted, one commit each:

    • ProcessGetAddr()getaddr
    • ProcessGetDataMessage()getdata (named to avoid colliding with the existing ProcessGetData(), which services the request queue this handler fills)
    • ProcessGetBlocks()getblocks
    • ProcessGetHeaders()getheaders
    • ProcessInv()inv
    • ProcessSendTxRcncl()sendtxrcncl
    • ProcessTx()tx

    Each commit is a pure code move: the only new lines are the declaration (with thread-safety annotations), the function signature, and the one-line call site. Each call site is Helper(...); return;, so return statements inside the moved bodies keep identical semantics. No behavior change.

    Every commit can be reviewed with the git options: --color-moved=dimmed-zebra --color-moved-ws=ignore-all-space

    Declarations and definitions are placed next to related existing helpers (e.g. ProcessGetDataMessage next to ProcessGetData, ProcessGetBlocks next to ProcessGetBlockData, ProcessGetAddr after ProcessAddrs).

  2. move-only: Extract ProcessGetAddr() helper
    This commit can be reviewed with the git options:
    --color-moved=dimmed-zebra --color-moved-ws=ignore-all-space
    2afa25ef8e
  3. move-only: Extract ProcessGetDataMessage() helper
    This commit can be reviewed with the git options:
    --color-moved=dimmed-zebra --color-moved-ws=ignore-all-space
    bc9154a4b2
  4. move-only: Extract ProcessGetBlocks() helper
    This commit can be reviewed with the git options:
    --color-moved=dimmed-zebra --color-moved-ws=ignore-all-space
    cb03c39cc5
  5. move-only: Extract ProcessGetHeaders() helper
    This commit can be reviewed with the git options:
    --color-moved=dimmed-zebra --color-moved-ws=ignore-all-space
    fb9216672c
  6. move-only: Extract ProcessInv() helper
    This commit can be reviewed with the git options:
    --color-moved=dimmed-zebra --color-moved-ws=ignore-all-space
    400857c65c
  7. move-only: Extract ProcessSendTxRcncl() helper
    This commit can be reviewed with the git options:
    --color-moved=dimmed-zebra --color-moved-ws=ignore-all-space
    19b613f4d7
  8. move-only: Extract ProcessTx() helper
    This commit can be reviewed with the git options:
    --color-moved=dimmed-zebra --color-moved-ws=ignore-all-space
    4fe745f27b
  9. DrahtBot added the label Refactoring on Jun 10, 2026
  10. DrahtBot commented at 1:54 AM on June 10, 2026: contributor

    <!--e57a25ab6845829454e8d69fc972939a-->

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    <!--006a51241073e994b41acfe9ec718e94-->

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/35502.

    <!--021abf342d371248e50ceaed478a90ca-->

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK pseudoramdom, thomasbuilds, pablomartin4btc
    Concept ACK stickies-v, theStack

    If your review is incorrectly listed, please copy-paste <code>&lt;!--meta-tag:bot-skip--&gt;</code> into the comment that the bot should ignore.

    <!--174a7506f384e20aa4161008e828411d-->

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #35591 ([DO NOT MERGE] Erlay: bandwidth-efficient transaction relay protocol (Full implementation) by sr-gi)
    • #35558 (p2p: Prefill compact blocks by davidgumberg)
    • #35522 (refactor: Extract per-message helpers from SendMessages() (move-only) by pablomartin4btc)
    • #35315 (refactor: Use NodeClock::time_point in more places by maflcko)
    • #35252 (net: send decoy transactions via private broadcast by andrewtoth)
    • #34824 (net: encapsulate TxRelay state and replace recursive mutexes by w0xlt)
    • #34707 (net: keep finished private broadcast txs in memory by andrewtoth)
    • #34628 (p2p: Replace per-peer transaction rate-limiting with global rate limits by ajtowns)
    • #34565 (refactor: extract BlockDownloadManager from PeerManagerImpl by w0xlt)
    • #34271 (net_processing: make m_tx_for_private_broadcast optional by vasild)
    • #33854 (fix assumevalid is ignored during reindex by Eunovo)
    • #29700 (kernel, refactor: return error status on all fatal errors by ryanofsky)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

    <!--5faf32d7da4f0f540f40219e4f7537a3-->

  11. stickies-v commented at 10:49 AM on June 10, 2026: contributor

    Concept ACK. Pretty straightforward change that makes it easier to navigate net_processing and better encapsulates logic.

  12. theStack commented at 1:19 PM on June 10, 2026: contributor

    Concept ACK

    Fwiw this has been proposed at least once: #9608 (it seems to have failed more due to lack of review back then rather than on strong pushback, as far as I understand)

  13. pablomartin4btc commented at 3:06 PM on June 10, 2026: member

    Concept ACK.

    I'm in favour of these refactoring, while reviewing #34824 also identified another possible candidate SendMessages() which is in the same file.

  14. pseudoramdom commented at 4:21 AM on June 11, 2026: none

    code review ACK 4fe745f27bd8a4a637df522421a4a7104923ca3e Except for a new comment for ProcessGetDataMessage, verified the change is a move-only refactor. Also verified using git show--color-moved=dimmed-zebra --color-moved-ws=allow-indentation-change <commit> against each commit.

  15. DrahtBot requested review from stickies-v on Jun 11, 2026
  16. DrahtBot requested review from theStack on Jun 11, 2026
  17. DrahtBot requested review from pablomartin4btc on Jun 11, 2026
  18. thomasbuilds commented at 6:08 AM on June 11, 2026: contributor

    code review ACK 4fe745f

    Verified each commit is a pure code move (only new lines are declarations, call sites, and one doc comment), call sites preserve return semantics, and the thread-safety annotations match what each body locks. It compiles cleanly with clang -Wthread-safety.

  19. pablomartin4btc commented at 4:01 AM on June 12, 2026: member

    ACK https://github.com/bitcoin/bitcoin/commit/4fe745f27bd8a4a637df522421a4a7104923ca3e.

    ProcessMessage() went from ~1500 to ~1000 lines, but CMPCTBLOCK (248 lines) and VERSION (235 lines) remain the two dominant inline blocks — extracting those two alone would drop it another ~480 lines, bringing it under 550. The same move-only pattern applied here would make both independently reviewable. Not a blocker, perhaps worth a follow-up PR in the same vein?

    On a side note, on the previous attempt mentioned above, there were concerns about "obscuring the control flow", I don't think this is the case here.

  20. davidgumberg commented at 12:44 AM on June 19, 2026: contributor

    Concept -0

    Let's say I'm reading a function called ProcessBlock() in net_processing.cpp:

    • Is this function called by ProcessMessage()?
    • Does ProcessMessage() do any set up before invoking ProcessBlock()?
    • Are there any other callers of ProcessBlock() and what are their expectations?
    • Are all the callers of ProcessBlock() in the process messages thread?

    These questions have to be answered when reading / modifying code and for all of the logic that lives in the ProcessMessage() ~switch statement these ambiguities don't exist. I think ProcessBlock() is a good example because it has multiple callers, and anyone reading it or modifying it should find them and think carefully about what they all do. That is a cost paid for by the reusability of ProcessBlock().

    I think the main thing I don't share is the feeling that ProcessMessage() is too long. It's pretty flat, and there is ~0 state that lives outside of the if(msg_type) {, so I don't feel burdened by any of the rest of the function when I'm reading the logic for the message I'm interested in.

    Not trying to bikeshed, the PR seems reasonable and everything would be fine if it was merged, just the POV of one person that happens to like ProcessMessage().

  21. w0xlt commented at 7:33 AM on June 19, 2026: contributor

    @davidgumberg Fair point. I agree the cost is real when a helper has multiple callers like ProcessBlock(), because setup and caller assumptions become less local.

    These extractions are different though: they are private, single-caller helpers from ProcessMessage(), called as Helper(...); return;, so the control flow remains simple.

    The benefit I see is lock reasoning. ProcessMessage() has to carry the broad lock contract needed by all branches:

    EXCLUSIVE_LOCKS_REQUIRED(!m_peer_mutex, !m_most_recent_block_mutex,
                             !m_headers_presync_mutex, g_msgproc_mutex,
                             !m_tx_download_mutex)
    

    After extraction, each handler states only what it actually needs. For example, ProcessGetAddr() requires only g_msgproc_mutex, while ProcessTx() requires g_msgproc_mutex, !m_peer_mutex, !m_tx_download_mutex.

    So this does not add runtime safety by itself, but it makes the per-message lock contract more local and compiler-checked.

    Besides that, ProcessMessage() arguably accumulates disproportionate responsibility, and this extraction pattern addresses that, though I agree this is a more subjective point.

  22. maflcko commented at 8:17 AM on June 19, 2026: member

    Let's say I'm reading a function called ProcessBlock() in net_processing.cpp:

    * Is this function called by `ProcessMessage()`?

    Should be easy to answer with 'yes', otherwise the function wouldn't sit in the peer manager impl?

    * Does `ProcessMessage()` do any set up before invoking `ProcessBlock()`?

    Edit: I'd say generally the goal should be to avoid setup, but here it should be obvious from the function signature, which is passed min_pow_checked (and other stuff). There is also the possibility of different message types doing different setup steps, one would have to read the handshake code either way, before and after this pull request.

    * Are there any other callers of `ProcessBlock()` and what are their expectations?

    Should be easy to answer by looking at the header: If the function is public in the header, it is called externally. If it is not in the header, then it is not called externally. Generally, I don't think any Process* handler was ever exposed in the header?

    * Are all the callers of `ProcessBlock()` in the process messages thread?

    Should be easy to answer with yes, because no process handler was ever exposed publicly. Also, it should be easy to verify with a single call to git grep.


    Generally, I think the benefits here are limited, because devs generally know that at most a single message is handled in the large body (and thus no variables leak from one message handling block to the next). However, I think there is still a benefit in being able to review code changes with the git option git diff --function-context easier.

    Previously, it would basically print the whole file, even if only a single line in a single message type handling was changed. At least for me this makes review harder because I have to scroll past the irrelevant white lines every time and risk missing a green or red line.

    After this change, git diff --function-context nicely prints only the relevant context, so at least for me review would be easier. Also, parts of this file already use this pattern, so for consistency it also makes sense. So I am Concept +1, but this is just me, and maybe other people are using a different review flow?


    @w0xlt Please don't @ in pull descriptions. If this pull was merged, it would lead to ping spam every time the merge is cherry-picked.

  23. davidgumberg commented at 2:15 AM on June 25, 2026: contributor

    Let's say I'm reading a function called ProcessBlock() in net_processing.cpp:

    • Is this function called by ProcessMessage()?

    Should be easy to answer with 'yes', otherwise the function wouldn't sit in the peer manager impl?

    I meant directly, this is basically the same point as the set-up one.

    • Does ProcessMessage() do any set up before invoking ProcessBlock()?

    Edit: I'd say generally the goal should be to avoid setup, but here it should be obvious from the function signature, which is passed min_pow_checked (and other stuff). There is also the possibility of different message types doing different setup steps, one would have to read the handshake code either way, before and after this pull request.

    Right, but after this PR you have to look in multiple places, even if there is only one caller that doesn't do any setup and it's ProcessMessage()

    • Are there any other callers of ProcessBlock() and what are their expectations?

    Should be easy to answer by looking at the header: If the function is public in the header, it is called externally. If it is not in the header, then it is not called externally. Generally, I don't think any Process* handler was ever exposed in the header?

    • Are all the callers of ProcessBlock() in the process messages thread?

    Should be easy to answer with yes, because no process handler was ever exposed publicly. Also, it should be easy to verify with a single call to git grep.

    Yes, all of these questions can be answered quite easily by checking in one or two files and doing a grep or two, my point is that one doesn't have to just check the header and do an extra grep or two now.

    e.g. I don't use git diff --function-context but it sounds like reviewing ProcessMessage() might be pretty annoying if one does that. Of course I could recommend that you use a different set of flags or a different tool, or just open another window with the function in it, but any of these "solutions" would add friction for you i.e. they suck in comparison to using the tool / workflow you already like and are used to.

    That being said, it sounds like other contributors prefer the style of this PR to the style of master, and given that this is a stylistic question the thing that people prefer is definitely the thing that should happen.


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-07-01 08:51 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me