p2p: TxOrphanage revamp cleanups #32941

pull glozow wants to merge 12 commits into bitcoin:master from glozow:2025-07-orphanage-followups changing 14 files +284 −276
  1. glozow commented at 4:05 pm on July 10, 2025: member

    Followup to #31829:

    • Release notes
    • Have the orphanage auto-trim itself whenever necessary (and test changes) #31829 (review)
    • Reduce duplicate reconsiderations by keeping track of which txns are already reconsiderable so we only mark it for reconsideration for 1 peer at a time #31829 (comment)
    • Rename OrphanTxBase to OrphanInfo
    • Get rid of Size() method by replacing all calls with CountUniqueOrphans
    • Rename outpoints index since they point to wtxids, not iterators #31829 (review)
    • Add more logging in the LimitOrphans inner loop to make it easy to see which peers are being trimmed #31829 (comment)
  2. glozow added the label P2P on Jul 10, 2025
  3. DrahtBot commented at 4:05 pm on July 10, 2025: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/32941.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK sipa, marcofleon
    Stale ACK theStack

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #33125 (ci: Use mlc v1 and ignore depends/work by fanquake)
    • #33116 (refactor: Convert uint256 to Txid by marcofleon)
    • #33066 (p2p: never check tx rejections by txid by glozow)
    • #32430 (test: Add and use ElapseTime helper by maflcko)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

    LLM Linter (✨ experimental)

    Possible typos and grammar issues:

    • possibile -> possible [misspelling in comment]
    • DoSiest -> Dosiest [consistency: “Dosiest” is the more standard spelling for “most DoS-heavy” peer]

    drahtbot_id_4_m

  4. glozow renamed this:
    TxOrphanage revamp cleanups
    p2p: TxOrphanage revamp cleanups
    on Jul 10, 2025
  5. glozow force-pushed on Jul 10, 2025
  6. DrahtBot added the label CI failed on Jul 10, 2025
  7. DrahtBot commented at 9:04 pm on July 10, 2025: contributor

    🚧 At least one of the CI tasks failed. Task fuzzer,address,undefined,integer, no depends: https://github.com/bitcoin/bitcoin/runs/45743710188 LLM reason (✨ experimental): Fuzz test failed due to an assertion failure causing a deadly signal crash.

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  8. DrahtBot added the label Needs rebase on Jul 11, 2025
  9. glozow force-pushed on Jul 11, 2025
  10. DrahtBot removed the label Needs rebase on Jul 11, 2025
  11. glozow force-pushed on Jul 14, 2025
  12. glozow force-pushed on Jul 15, 2025
  13. DrahtBot removed the label CI failed on Jul 15, 2025
  14. glozow force-pushed on Jul 20, 2025
  15. glozow marked this as ready for review on Jul 21, 2025
  16. glozow commented at 3:15 pm on July 21, 2025: member
    This is ready for review. Would also be great to get https://github.com/bitcoin-core/qa-assets/pull/231 in for the 2 new fuzzies
  17. marcofleon commented at 3:24 pm on July 21, 2025: contributor
    Looks like the txorphanage_sim inputs might be invalidated with this PR. Not sure if we should wait to add those inputs until after this gets in or if it’s worth adding them before.
  18. in doc/release-notes-31829.md:9 in 74c664d0f3 outdated
    0@@ -0,0 +1,9 @@
    1+P2P
    2+
    3+- The transaction orphanage, which holds transactions with missing inputs temporarily while the node attempts to fetch
    4+its parents, now has improved Denial of Service protections. Previously, it enforced a maximum number of unique
    5+transactions (default 100, configurable using `-maxorphantxs`). Now, its limits are as follows: the number of entries
    6+plus each unique transaction's input count, divided by 10, must not exceed 3,000. The total weight of unique
    7+transactions must not exceed 404,000Wu multiplied by the number of peers.
    8+
    9+- The `-maxorphantxs` option has been removed, since the orphanage no longer limits the number of unique transactions.
    


    sipa commented at 3:45 pm on July 21, 2025:
    Hmm, this means that anyone with a bitcoin.conf that lists maxorphantxs=N will now get an error. An alternative could be to leave the option in for a while, but make it just emit a warning. Unsure if that’s necessary; I doubt many people configure this, but I don’t remember how we’ve dealt with similar changes in recent history.

    glozow commented at 8:49 pm on July 22, 2025:

    Yeah, I think it’s a good idea to add an error saying “this option doesn’t do anything anymore, remove it from your bitcoin.conf” and continue.

    Do you think it’s worthwhile to still support -maxorphantxs=0 e.g. for memory-conscious users?


    glozow commented at 6:01 pm on July 23, 2025:
    Added the option again, a warning, and a more explicit recommendation in the release notes.
  19. glozow commented at 4:18 pm on July 21, 2025: member

    Looks like the txorphanage_sim inputs might be invalidated with this PR.

    Ah good point, yeah.

  20. in src/net_processing.cpp:536 in 8c7ca46685 outdated
    532@@ -533,7 +533,7 @@ class PeerManagerImpl final : public PeerManager
    533     std::optional<std::string> FetchBlock(NodeId peer_id, const CBlockIndex& block_index) override
    534         EXCLUSIVE_LOCKS_REQUIRED(!m_peer_mutex);
    535     bool GetNodeStateStats(NodeId nodeid, CNodeStateStats& stats) const override EXCLUSIVE_LOCKS_REQUIRED(!m_peer_mutex);
    536-    std::vector<node::TxOrphanage::OrphanTxBase> GetOrphanTransactions() override EXCLUSIVE_LOCKS_REQUIRED(!m_tx_download_mutex);
    537+    std::vector<node::TxOrphanage::OrphanInfo> GetOrphanTransactions() override EXCLUSIVE_LOCKS_REQUIRED(!m_tx_download_mutex);
    


    sipa commented at 4:33 pm on July 21, 2025:
    Use a scripted diff?

    glozow commented at 5:58 pm on July 23, 2025:
    done
  21. in src/node/txorphanage.cpp:53 in 2242972f4b outdated
    49@@ -50,17 +50,17 @@ class TxOrphanageImpl final : public TxOrphanage {
    50         { }
    51 
    52         /** Get an approximation for "memory usage". The total memory is a function of the memory used to store the
    53-         * transaction itself, each entry in m_orphans, and each entry in m_outpoint_to_orphan_it. We use weight because
    54+         * transaction itself, each entry in m_orphans, and each entry in m_outpoint_to_orphan_wtxids. We use weight because
    


    sipa commented at 4:34 pm on July 21, 2025:
    Use a scripted-diff?

    glozow commented at 5:58 pm on July 23, 2025:
    done
  22. in src/test/fuzz/txorphan.cpp:626 in 473bf54ce4 outdated
    622@@ -623,7 +623,7 @@ FUZZ_TARGET(txorphanage_sim)
    623             } else if (command-- == 0) {
    624                 // AddChildrenToWorkSet
    625                 auto tx = read_tx_fn();
    626-                FastRandomContext rand_ctx(rng.rand256());
    627+                FastRandomContext rand_ctx{ConsumeUInt256(provider)};
    


    sipa commented at 4:36 pm on July 21, 2025:
    That’s a lot of fuzzer bytes (32) for very little gain I think. Since it’s already fed through a cryptographic RNG anyway, I don’t think consuming bytes from the fuzzer has any advantage over using the harness’ rng (which is already seeded by fuzz input anyway).

    glozow commented at 8:50 pm on July 22, 2025:
    Hm, I didn’t think so either, but wonder what could have caused #31829 (review) 🤔

    sipa commented at 8:54 pm on July 22, 2025:

    Hmm, my best guess is a compiler/sanitizer bug.

    Try something like

    0auto rng_seed = rng.rand256();
    1FastRandomContext rand_ctx(rng_seed);
    

    ?


    maflcko commented at 6:45 am on July 23, 2025:

    Hmm, my best guess is a compiler/sanitizer bug.

    Jup, gcc uses heuristics that are sometimes off, see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=Wstringop-overflow.

    If you have access to the affected gcc version and confirm that sipa’s patch works, you can push it. If not, it seems fine to leave as-is and let a dev using the affected gcc version submit a workaround, if they want to.


    glozow commented at 5:59 pm on July 23, 2025:
    Thanks, removed that commit and leaving as is
  23. in doc/release-notes-31829.md:1 in 473bf54ce4 outdated
    0@@ -0,0 +1,9 @@
    1+P2P
    


    maflcko commented at 10:27 am on July 22, 2025:

    Could also mention the RPC field removals? Edit: I guess the RPC is marked EXPERIMENTAL, so this can be ignored.

    Also, if you want to fix typos, the LLM found a few:

    tranaction -> transaction [misspelling in comment “memory usage of the tranaction”, and "latency score for all tranactions"]
    comphehensive -> comprehensive [misspelling in comment “This is a comphehensive simulation…”]
    Remove trailing “)” in “We must now be within limits, otherwise LimitOrphans should have continued further).” [unmatched parenthesis]
    

    But up to you.


    glozow commented at 5:58 pm on July 23, 2025:
    Fixed the typos. I didn’t mention the getorphantxs fields since we haven’t mentioned them in release notes before
  24. glozow force-pushed on Jul 23, 2025
  25. monlovesmango commented at 10:40 pm on July 24, 2025: contributor

    My feedback is mostly a lot of annoying nits mainly from leftover variables and comments still referring to old max announcements DoS scoring. I compiled them in my own branch so you can just cherry pick whatever you think is actually worth taking. Overview below.

    There were also a lot of leftover references to max announcement counts (rather than latency score) in orphange_tests.cpp and would be happy to add a commit to clean those up if you think its worthwhile.

  26. glozow force-pushed on Jul 25, 2025
  27. glozow commented at 6:25 pm on July 25, 2025: member

    I cherry picked https://github.com/monlovesmango/bitcoin/commit/174827285797794928cd87732e2587fea70c0157 and fixed some of the incorrect comments.

    doc: Fixes GetChildrenFromSamePeer comments - Updates comments to reflect that non-reconsiderable announcements are sorted before reconsiderable announcements.

    Note that it’s the other way around - reconsiderable ones come first

  28. marcofleon commented at 8:45 pm on July 25, 2025: contributor

    Crash in the txorphan target:

    0/src/test/fuzz/txorphan.cpp:145 operator(): Assertion `orphanage->TotalOrphanUsage() == total_bytes_start' failed.
    

    txorphan_crash.txt

    The LimitOrphans in AddTx ends up decreasing m_unique_orphan_usage in this case. If we only add an announcer, it makes sense that memory usage shouldn’t increase, but it could be decreased if a transaction is completely removed from the orphanage iiuc. So maybe that assertion should be <= instead of ==?

  29. theStack commented at 4:11 pm on July 28, 2025: contributor

    Concept ACK

    Makes sense to keep the maxorphantx option for a while to only show a warning. Changes look good at first glance, will do a deeper dive with focus on test and fuzzing changes (considering #32941 (comment)) during the week.

  30. glozow force-pushed on Jul 29, 2025
  31. glozow commented at 1:25 pm on July 29, 2025: member

    txorphan_crash.txt

    The LimitOrphans in AddTx ends up decreasing m_unique_orphan_usage in this case.

    Thank you, fixed

  32. glozow force-pushed on Jul 30, 2025
  33. glozow commented at 6:36 pm on July 30, 2025: member

    Thank you, fixed

    And fixed the one a few lines after too (thanks @marcofleon)

  34. fanquake added this to the milestone 30.0 on Jul 31, 2025
  35. fanquake requested review from instagibbs on Aug 1, 2025
  36. fanquake requested review from marcofleon on Aug 1, 2025
  37. in doc/release-notes-31829.md:7 in 0e4daec11c outdated
    0@@ -0,0 +1,11 @@
    1+P2P
    2+
    3+- The transaction orphanage, which holds transactions with missing inputs temporarily while the node attempts to fetch
    4+its parents, now has improved Denial of Service protections. Previously, it enforced a maximum number of unique
    5+transactions (default 100, configurable using `-maxorphantx`). Now, its limits are as follows: the number of entries
    6+(unique by wtxid and peer), plus each unique transaction's input count divided by 10, must not exceed 3,000. The total
    7+weight of unique transactions must not exceed 404,000Wu multiplied by the number of peers.
    


    theStack commented at 12:15 pm on August 1, 2025:

    nit

    0weight of unique transactions must not exceed 404,000 WU multiplied by the number of peers.
    

    glozow commented at 3:53 pm on August 1, 2025:
    done
  38. in src/node/txorphanage.cpp:533 in fac2cc90ee outdated
    528@@ -521,6 +529,9 @@ std::vector<std::pair<Wtxid, NodeId>> TxOrphanageImpl::AddChildrenToWorkSet(cons
    529         const auto it_by_prev = m_outpoint_to_orphan_it.find(COutPoint(tx.GetHash(), i));
    530         if (it_by_prev != m_outpoint_to_orphan_it.end()) {
    531             for (const auto& wtxid : it_by_prev->second) {
    532+                // If a reconsiderable announcement for this wtxid already exists, skip it.
    533+                if (m_reconsiderable_wtxids.count(wtxid)) continue;
    


    theStack commented at 12:29 pm on August 1, 2025:

    nit

    0                if (m_reconsiderable_wtxids.contains(wtxid)) continue;
    

    glozow commented at 3:53 pm on August 1, 2025:
    done
  39. in src/test/fuzz/txorphan.cpp:150 in f56eda2cac outdated
    160-                        // if have_tx is still false, it must be too big
    161-                        Assert(!have_tx == (tx_weight > MAX_STANDARD_TX_WEIGHT));
    162-                        Assert(!have_tx || !add_tx);
    163-                    }
    164+                    // We are not guaranteed to have_tx after AddTx. There are a few possibile reasons:
    165+                    // - tx itself exceeds the per-peer memory usage limit, so LimitOrphans had to remove it immediately
    


    theStack commented at 12:50 pm on August 1, 2025:
    comment nit/question: I think this scenario could only ever happen if the per-peer memory usage limit was set smaller than MAX_STANDARD_TX_WEIGHT (which would obviously be a bad idea), right? Otherwise AddTx would return early for such txs and doesn’t reach LimitOrphans.

    glozow commented at 3:30 pm on August 1, 2025:
    Yes exactly!
  40. in src/test/fuzz/txorphan.cpp:734 in f56eda2cac outdated
    779+                }
    780+                if (done) break;
    781+            }
    782+            assert(done);
    783         }
    784+        // We must now be within limits, otherwise LimitOrphans should have continued further).
    


    theStack commented at 12:56 pm on August 1, 2025:
    0        // We must now be within limits, otherwise LimitOrphans should have continued further.
    

    glozow commented at 3:54 pm on August 1, 2025:
    already fixed I think
  41. in src/init.cpp:901 in 2b93c44146 outdated
    894@@ -893,6 +895,11 @@ bool AppInitParameterInteraction(const ArgsManager& args)
    895         InitWarning(_("Options '-datacarrier' or '-datacarriersize' are set but are marked as deprecated. They will be removed in a future version."));
    896     }
    897 
    898+    // We no longer limit the orphanage based on number of transactions but keep the option to warn users who still have it in their config.
    899+    if (args.IsArgSet("-maxorphantx")) {
    900+        InitWarning(_("Option '-maxorphantx' is set but no longer has any effect (see release notes). Please remove it from your configuration."));
    901+    }
    


    theStack commented at 3:25 pm on August 1, 2025:
    Might be nice to set -maxorphantx in one functional test to ensure it doesn’t error.

    glozow commented at 3:53 pm on August 1, 2025:
    added
  42. theStack approved
  43. theStack commented at 3:27 pm on August 1, 2025: contributor

    Code-review ACK 72c5f0321e90b324d125352a8ff720479af0382b

    Only with some nitty nits / follow-up to the follow-up ideas :)

  44. [doc] 31829 release note d1fac25ff3
  45. [refactor] make TxOrphanage keep itself trimmed af7402ccfa
  46. [optimization] Maintain at most 1 reconsiderable announcement per wtxid
    This introduces an invariant that TxOrphanageImpl never holds more than one
    announcement with m_reconsider=true for a given wtxid. This avoids duplicate
    work, both in the caller might otherwise reconsider the same transaction multiple
    times before it is ready, and internally in AddChildrenToWorkSet, which might
    otherwise iterate over all announcements multiple times.
    ed24e01696
  47. [cleanup] replace TxOrphanage::Size() with CountUniqueOrphans cc50f2f0df
  48. scripted-diff: rename OrphanTxBase to OrphanInfo
    -BEGIN VERIFY SCRIPT-
    sed -i 's/OrphanTxBase/OrphanInfo/g' $(git grep -l 'OrphanTxBase')
    -END VERIFY SCRIPT-
    8a58d0e87d
  49. [logging] add logs for inner loop of LimitOrphans edb97bb3f1
  50. scripted-diff: rename TxOrphanage outpoints index
    -BEGIN VERIFY SCRIPT-
    sed -i 's/m_outpoint_to_orphan_it/m_outpoint_to_orphan_wtxids/g' src/node/txorphanage.cpp
    -END VERIFY SCRIPT-
    cfd71c6704
  51. fix up TxOrphanage lower_bound sanity checks
    Co-authored-by: Sebastian Falbesoner <sebastian.falbesoner@gmail.com>
    b10c55b298
  52. [config] emit warning for -maxorphantx, but allow it to be set 1384dbaf6d
  53. [doc] comment fixups for orphanage changes 3b92448923
  54. scripted-diff: rename "ann" variables to "latency_score"
    -BEGIN VERIFY SCRIPT-
    sed -i 's/max_global_ann/max_global_latency_score/g' src/node/txorphanage.cpp
    sed -i 's/max_global_ann/max_global_latency_score/g' src/node/txorphanage.h
    sed -i 's/max_global_ann/max_global_latency_score/g' src/test/orphanage_tests.cpp
    sed -i 's/max_global_ann/max_global_latency_score/g' src/test/fuzz/txorphan.cpp
    sed -i 's/max_global_ann/max_global_latency_score/g' src/bench/txorphanage.cpp
    sed -i 's/max_ann/max_lat/g' src/node/txorphanage.cpp
    -END VERIFY SCRIPT-
    3d4d4f0d92
  55. glozow force-pushed on Aug 1, 2025
  56. theStack approved
  57. theStack commented at 7:15 pm on August 1, 2025: contributor
    re-ACK 3d4d4f0d92d42809e74377e4380abdc70f74de5d
  58. marcofleon commented at 1:59 pm on August 4, 2025: contributor

    Code review ACK 3d4d4f0d92d42809e74377e4380abdc70f74de5d

    Fuzzed all three targets over the weekend, no crashes. Only small thing I caught while reading the txorphan_protected target: https://github.com/bitcoin/bitcoin/blob/3d4d4f0d92d42809e74377e4380abdc70f74de5d/src/test/fuzz/txorphan.cpp#L331 That line should be (tx->vin.size() / 10) + 1 but it doesn’t result in a crash because it’s overestimating the latency score for an additional announcer.

  59. [fuzz] fix latency score check in txorphan_protected c0642e558a
  60. glozow commented at 2:50 pm on August 4, 2025: member

    That line should be (tx->vin.size() / 10) + 1

    Thanks @marcofleon! Just added a commit on top to fix that.

  61. sipa commented at 3:32 pm on August 4, 2025: member
    utACK c0642e558a02319ade33dc1014e7ae981663ea46
  62. DrahtBot requested review from marcofleon on Aug 4, 2025
  63. DrahtBot requested review from theStack on Aug 4, 2025
  64. marcofleon commented at 3:41 pm on August 4, 2025: contributor
    Nice, ACK c0642e558a02319ade33dc1014e7ae981663ea46
  65. fanquake merged this on Aug 4, 2025
  66. fanquake closed this on Aug 4, 2025

  67. glozow deleted the branch on Aug 4, 2025
  68. theStack commented at 5:50 pm on August 4, 2025: contributor
    post-merge ACK c0642e558a02319ade33dc1014e7ae981663ea46

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-08-15 15:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me