net: Filter addrman during address selection via AddrPolicy to avoid underfill #33663

pull waketraindev wants to merge 1 commits into bitcoin:master from waketraindev:2025-10-getaddressesunsafe-underfill changing 6 files +86 −18
  1. waketraindev commented at 6:03 pm on October 20, 2025: contributor

    This PR introduces AddrMan::AddrPolicy, a predicate that allows callers to exclude addresses during selection. The policy returns true to skip a given entry and is evaluated while holding AddrMan::cs.

    The mechanism is used in CConnman::GetAddressesUnsafe() to filter out banned and discouraged peers during address selection instead of removing them afterward. This prevents P2P responses (GETADDR) and RPCs (getnodeaddresses) from returning fewer results than requested when large portions of the address space are filtered.

    Additional improvements:

    • Logs the number of filtered entries for diagnostics
    • Avoids redundant filtering passes in higher-level callers
    • Keeps locking behavior unchanged (the policy runs under AddrMan::cs)

    Testing

    • Repro: ban or discourage a large set of addresses, then call getnodeaddresses 1, previously underfilled.
    • After this change, the RPC returns up to the requested count as expected.

    No behavior change for unaffected callers.

  2. DrahtBot commented at 6:03 pm on October 20, 2025: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/33663.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Concept ACK frankomosh
    Stale ACK mzumsande

    If your review is incorrectly listed, please copy-paste <!–meta-tag:bot-skip–> into the comment that the bot should ignore.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #34162 (net: Avoid undershooting in GetAddressesUnsafe by fjahr)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

    LLM Linter (✨ experimental)

    Possible places where named args for integral literals may be used (e.g. func(x, /*named_arg=*/0) in C++, and func(x, named_arg=0) in Python):

    • addrman->GetAddr(/max_addresses=/0, /max_pct=/0, /network=/std::nullopt, /filtered=/false) in src/test/addrman_tests.cpp
    • addrman->GetAddr(/max_addresses=/0, /max_pct=/0, /network=/std::nullopt, /filtered=/false, /policy=/policyNoSkip) in src/test/addrman_tests.cpp
    • addrman->GetAddr(/max_addresses=/0, /max_pct=/0, /network=/std::nullopt, /filtered=/false, /policy=/policySkipAll) in src/test/addrman_tests.cpp
    • addrman->GetAddr(/max_addresses=/0, /max_pct=/0, /network=/std::nullopt, /filtered=/false, /policy=/policyPortPolicy) in src/test/addrman_tests.cpp

    2025-12-28

  3. waketraindev marked this as ready for review on Oct 20, 2025
  4. in src/addrman.cpp:843 in c8cb2ea114
    839@@ -839,14 +840,17 @@ std::vector<CAddress> AddrManImpl::GetAddr_(size_t max_addresses, size_t max_pct
    840         const AddrInfo& ai{it->second};
    841 
    842         // Filter by network (optional)
    843-        if (network != std::nullopt && ai.GetNetClass() != network) continue;
    844+        if (network != std::nullopt && ai.GetNetClass() != network) { ++filtered_addresses; continue; }
    


    mzumsande commented at 1:38 pm on October 21, 2025:
    here and below: would be good to expand this over multiple lines (clang-format)
  5. mzumsande commented at 8:18 am on October 22, 2025: contributor

    Concept ACK

    The main function of this is p2p (answering GetAddr requests from peers), RPCs like getnodeaddresses are of secondary importance. Since the p2p use is also affected by this, the PR description should be changed to reflect this.

    That said, it seems good to always try to return 1000 addresses in a GetAddr answer, so that peers don’t know whether we have a long list of banned addresses or not.

  6. waketraindev force-pushed on Oct 22, 2025
  7. waketraindev commented at 1:02 pm on October 22, 2025: contributor
    • updated PR description and commit message to highlight P2p getaddr responses
    • corrected formatting
  8. waketraindev requested review from mzumsande on Oct 24, 2025
  9. in src/addrman.cpp:855 in 70e2204b61 outdated
    851+        if (ai.IsTerrible(now) && filtered) {
    852+            ++filtered_addresses;
    853+            continue;
    854+        }
    855+
    856+        // Filter by policy
    


    brunoerg commented at 2:49 pm on October 24, 2025:
    You should add a functional/unit test for this. See uncovered new code: https://corecheck.dev/bitcoin/bitcoin/pulls/33663

    waketraindev commented at 2:51 pm on October 24, 2025:
    Will do, looking into coverege and sonarcloud right now
  10. waketraindev force-pushed on Oct 28, 2025
  11. waketraindev commented at 6:55 am on October 28, 2025: contributor
    Added AddrPolicy coverage in addrman_tests
  12. waketraindev force-pushed on Oct 28, 2025
  13. waketraindev requested review from brunoerg on Oct 28, 2025
  14. waketraindev force-pushed on Oct 28, 2025
  15. in src/addrman.cpp:858 in d7c5cbb7f9 outdated
    854+        }
    855+
    856+        // Filter by policy
    857+        if (policy && policy(ai)) {
    858+            ++filtered_addresses;
    859+            continue;
    


    brunoerg commented at 1:59 pm on October 28, 2025:

    Thanks for adding test coverage. By doing mutation testing for this PR, I noticed that the following mutant has not been killed (it means tests were not able to detect this change):

     0diff --git a/src/addrman.cpp b/src/addrman.cpp
     1index c06fbb0673..4cc7073c3b 100644
     2--- a/src/addrman.cpp
     3+++ b/src/addrman.cpp
     4@@ -854,7 +854,7 @@ std::vector<CAddress> AddrManImpl::GetAddr_(size_t max_addresses, size_t max_pct
     5         // Filter by policy
     6         if (policy && policy(ai)) {
     7             ++filtered_addresses;
     8-            continue;
     9+            break;
    10         }
    

    This is because you’re checking that the addresses that are returned when filtering by port are - de facto - according to the policy. However, you could also check that all the proper addresses are being returned, I think just adding the following check would be enough:

    0diff --git a/src/test/addrman_tests.cpp b/src/test/addrman_tests.cpp
    1index 987249104d..3073656ed5 100644
    2--- a/src/test/addrman_tests.cpp
    3+++ b/src/test/addrman_tests.cpp
    4@@ -421,6 +421,7 @@ BOOST_AUTO_TEST_CASE(addrman_getaddr)
    5     BOOST_CHECK_EQUAL(vAddrNoPolicy.size(), addedAddresses);
    6     BOOST_CHECK_EQUAL(vAddrPolicyNoSkip.size(), vAddrNoPolicy.size());
    7     BOOST_CHECK_EQUAL(vAddrPolicySkipAll.size(), 0);
    8+    BOOST_CHECK_EQUAL(vAddrPolicyPort.size(), 4U);
    9     BOOST_CHECK(std::none_of(vAddrPolicyPort.begin(), vAddrPolicyPort.end(), policyPortPolicy));
    

    waketraindev commented at 2:29 pm on October 28, 2025:

    Thanks for running mutation testing and for review!

    I’m thinking:

    0    const std::set<CAddress> vExpectedAddresses = {addr1, addr3, addr4, addr5};
    1    BOOST_CHECK_EQUAL(vAddrPolicyPort.size(), vExpectedAddresses.size());
    2    BOOST_CHECK(std::all_of(vAddrPolicyPort.begin(), vAddrPolicyPort.end(), [&](const CAddress& a){ return vExpectedAddresses.contains(a); }));
    3    BOOST_CHECK(std::none_of(vAddrPolicyPort.begin(), vAddrPolicyPort.end(), policyPortPolicy));
    
  16. waketraindev force-pushed on Oct 28, 2025
  17. waketraindev commented at 2:39 pm on October 28, 2025: contributor
    Added test verifying only expected addresses are returned and filtered ones excluded.
  18. waketraindev requested review from brunoerg on Oct 29, 2025
  19. waketraindev renamed this:
    addrman, net: filter during address selection via AddrPolicy to avoid underfill
    addrman, net: Filter during address selection via AddrPolicy to avoid underfill
    on Nov 3, 2025
  20. waketraindev renamed this:
    addrman, net: Filter during address selection via AddrPolicy to avoid underfill
    net: Filter during address selection via AddrPolicy to avoid underfill
    on Nov 8, 2025
  21. DrahtBot added the label P2P on Nov 8, 2025
  22. waketraindev renamed this:
    net: Filter during address selection via AddrPolicy to avoid underfill
    net: Filter addrman during address selection via AddrPolicy to avoid underfill
    on Nov 8, 2025
  23. in src/net.cpp:3532 in 4f39b9255b outdated
    3532-        addresses.erase(std::remove_if(addresses.begin(), addresses.end(),
    3533-                        [this](const CAddress& addr){return m_banman->IsDiscouraged(addr) || m_banman->IsBanned(addr);}),
    3534-                        addresses.end());
    3535-    }
    3536-    return addresses;
    3537+        const AddrMan::AddrPolicy policy = [this](const CAddress& addr) {
    


    frankomosh commented at 10:46 am on November 14, 2025:

    Is there a way/test to verify that the BanMan methods here are safe to call while AddrMan’s lock is held?

    Can see a comment in addrman.h says the policy “must not acquire locks that can conflict with AddrMan::cs”. So it would make sense to ensure deadlock scenario is not accidentally created.


    waketraindev commented at 1:01 pm on November 14, 2025:

    The BanMan methods used here take m_banned_mutex and never call back into AddrMan, so the lock order in this policy is currently AddrMan::cs>BanMan::m_banned_mutex with no reverse lock order in the codebase.

    Using the predicate here also enforces that AddrMan::cs is always taken first, before any BanMan locks which helps maintain a consistent lock order.

    GetAddressesUnsafe() is invoked on startup to load peers.dat with the responses being cached for CConnman::GetAddresses(), every 24h for CConnman::ASMapHealthCheck(), when a GETADDR message is called and the peer has permission NetPermissionFlags::Addr and finally by RPC getnodeaddresses. I tested each of these manually and checked the lock ordering and lock wait times.

    Regarding ways to test for lock ordering, a debug build will enable DEBUG_LOCKORDER and DEBUG_LOCKCONTENTION which will log issues with lock ordering or lock waits (--debug=lock). (see: https://github.com/bitcoin/bitcoin/blob/master/doc/developer-notes.md#debug_lockorder)


    frankomosh commented at 8:13 am on November 17, 2025:

    The BanMan methods used here take m_banned_mutex and never call back into AddrMan, so the lock order in this policy is currently AddrMan::cs>BanMan::m_banned_mutex with no reverse lock order in the codebase.

    Thanks. Might be worth a comment somewhere in the code documenting this ordering requirement so someone does not accidentally introduce the reverse path in future ?


    waketraindev commented at 11:27 am on November 17, 2025:

    Updated comments for more clarity in addrman.h:

    0    /** Predicate used to exclude addresses during selection.
    1     *  Return true to skip the given address.
    2     *
    3     *  Runs while holding AddrMan::cs, so it must be non-blocking, must not
    4     *  attempt to reacquire AddrMan::cs, and must preserve lock ordering to
    5     *  avoid deadlocks.
    6     */
    

    in net.cpp -> GetAddressesUnsafe()

    0        // This runs under AddrMan::cs, and BanMan checks take m_banned_mutex.
    1        // The lock order here is AddrMan::cs -> m_banned_mutex.
    

    frankomosh commented at 4:01 pm on November 17, 2025:
    Thanks
  24. in src/addrman.cpp:844 in 4f39b9255b
    839@@ -839,14 +840,26 @@ std::vector<CAddress> AddrManImpl::GetAddr_(size_t max_addresses, size_t max_pct
    840         const AddrInfo& ai{it->second};
    841 
    842         // Filter by network (optional)
    843-        if (network != std::nullopt && ai.GetNetClass() != network) continue;
    844+        if (network != std::nullopt && ai.GetNetClass() != network) {
    845+            ++filtered_addresses;
    


    frankomosh commented at 10:53 am on November 14, 2025:
    Would network mismatches also count as filtered i this case?

    waketraindev commented at 1:06 pm on November 14, 2025:
    Yes; Currently all skipped addresses in AddrMan::GetAddr are counted as filtered; If you’d prefer (or find it useful) separate counters for each reason (Network, Quality or Policy), I can add that.

    frankomosh commented at 8:07 am on November 17, 2025:
    Maybe separate would be clearer for diagnostics. Up to you though,.. if you think the current logging is clear enough then that’s fine too.

    waketraindev commented at 11:25 am on November 17, 2025:

    Went with separate counters and also added the selected network to the log output for clarity. Example log entry:

    0[addrman] GetAddr returned 62823 random addresses from ipv4; 1104 filtered (0 network, 0 quality, 1104 policy)
    1[addrman] GetAddr returned 0 random addresses from ipv6; 63927 filtered (63927 network, 0 quality, 0 policy)
    2[addrman] GetAddr returned 0 random addresses from onion; 63927 filtered (66927 network, 0 quality, 0 policy)
    

    frankomosh commented at 4:08 pm on November 17, 2025:
    Awesome
  25. frankomosh commented at 10:54 am on November 14, 2025: contributor
    Concept ACK
  26. waketraindev force-pushed on Nov 17, 2025
  27. in src/net.cpp:3535 in fb3efdb3d2
    3537+        // This runs under AddrMan::cs, and BanMan checks take m_banned_mutex.
    3538+        // The lock order here is AddrMan::cs -> m_banned_mutex.
    3539+        const AddrMan::AddrPolicy policy = [this](const CAddress& addr) {
    3540+            return m_banman && (m_banman->IsDiscouraged(addr) || m_banman->IsBanned(addr));
    3541+        };
    3542+        return addrman.GetAddr(/*max_addresses=*/max_addresses, /*max_pct=*/max_pct, /*network=*/network, /*filtered=*/filtered, /*policy=*/policy);
    


    mzumsande commented at 10:33 pm on December 22, 2025:
    nit: I don’t think hints are helpful if the passed variables already have the same name, they just add noise.
  28. in src/net.cpp:3530 in fb3efdb3d2
    3532-        addresses.erase(std::remove_if(addresses.begin(), addresses.end(),
    3533-                        [this](const CAddress& addr){return m_banman->IsDiscouraged(addr) || m_banman->IsBanned(addr);}),
    3534-                        addresses.end());
    3535-    }
    3536-    return addresses;
    3537+        // This runs under AddrMan::cs, and BanMan checks take m_banned_mutex.
    


    mzumsande commented at 10:35 pm on December 22, 2025:
    indentation is incorrect here.
  29. in src/test/addrman_tests.cpp:410 in fb3efdb3d2 outdated
    406@@ -407,6 +407,27 @@ BOOST_AUTO_TEST_CASE(addrman_getaddr)
    407     // Net processing asks for 23% of addresses. 23% of 5 is 1 rounded down.
    408     BOOST_CHECK_EQUAL(addrman->GetAddr(/*max_addresses=*/2500, /*max_pct=*/23, /*network=*/std::nullopt).size(), 1U);
    409 
    410+    // Test AddrMan::GetAddr filtering with various AddrPolicy rules.
    


    mzumsande commented at 10:50 pm on December 22, 2025:
    Would be nice to also have some coverage in the addrman fuzz test.
  30. mzumsande commented at 11:00 pm on December 22, 2025: contributor

    Code Review ACK fb3efdb3d267924266134f63b1ba6ddda0b6ce81

    As far as I can see, the reason for creating and passing an AddrPolicy is that the alternative (having addrman reach into banman to request if an addr is banned) would introduce coupling between the modules, while the current approach only introduces an implied lock order, which is somewhat of a potential footgun - but I don’t really see a better solution. For example, we probably wouldn’t want to remove entries from addrman when they are banned, because bans are often temporary.

  31. DrahtBot requested review from frankomosh on Dec 22, 2025
  32. in src/addrman.h:182 in fb3efdb3d2
    178@@ -169,10 +179,11 @@ class AddrMan
    179      * @param[in] max_pct        Maximum percentage of addresses to return (0 = all). Value must be from 0 to 100.
    180      * @param[in] network        Select only addresses of this network (nullopt = all).
    181      * @param[in] filtered       Select only addresses that are considered good quality (false = all).
    182+     * @param[in] policy         Optional predicate to exclude candidates during selection (true = exclude).
    


    fjahr commented at 10:05 pm on December 25, 2025:
    Hm, the true = exclude part doesn’t have the same meaning as above because this is not a bool parameter. Maybe update this to be more precise. The filtered option doesn’t say that it’s optional so you might add that while you touch this.
  33. in src/addrman.cpp:831 in fb3efdb3d2 outdated
    827@@ -827,6 +828,7 @@ std::vector<CAddress> AddrManImpl::GetAddr_(size_t max_addresses, size_t max_pct
    828     const auto now{Now<NodeSeconds>()};
    829     std::vector<CAddress> addresses;
    830     addresses.reserve(nNodes);
    831+    size_t filtered_addr_net = 0, filtered_addr_quality = 0, filtered_addr_policy = 0;
    


    fjahr commented at 10:09 pm on December 25, 2025:
    A general problem with the separate tracking like this is that it’s order dependent and so the values may actually be somewhat misleading. I.e. if there are some addrs filtered by policy but they are already caught by the net filter they won’t be included here. Doing this in a cleaner way may be overengineering but I wanted to mention that this isn’t ideal.

    fjahr commented at 10:15 pm on December 25, 2025:
    Adding to that: the non-optional filter is now in between two optional filters. Feels to me like the optional checks should go after the mandatory but maybe there are arguments for doing it the other way around.

    fjahr commented at 10:31 pm on December 25, 2025:
    If my suggestion from the top level comment is taken this is kind of obsolete but if not i think rather than trying to count each filter individually I would rather just log the total filtered count and which filters were active. I think that’s simpler and least confusing.
  34. in src/addrman.cpp:864 in fb3efdb3d2 outdated
    862 
    863         addresses.push_back(ai);
    864     }
    865-    LogDebug(BCLog::ADDRMAN, "GetAddr returned %d random addresses\n", addresses.size());
    866+    LogDebug(BCLog::ADDRMAN, "GetAddr returned %zu random addresses from %s; %zu filtered "
    867+                             "(%zu network, %zu quality, %zu policy)\n",
    


    fjahr commented at 10:21 pm on December 25, 2025:
    Would be generally nice if the options that weren’t used are also not logged.
  35. fjahr commented at 10:29 pm on December 25, 2025: contributor
    While leaving some nit-ish comments I have realized that if we want to add such a generalized concept for filtering, we should implement the current filtering with it as well. E.g. network and filtered params go away and we only have AddrPolicy as param to GetAddr. Then use AddrPolicy to implement what network and filtered were doing. That’s a bigger refactor but having this current mix with this redundancy is kind of ugly and I don’t really think it’s worth it alone.
  36. waketraindev commented at 4:55 am on December 27, 2025: contributor

    While leaving some nit-ish comments I have realized that if we want to add such a generalized concept for filtering, we should implement the current filtering with it as well. E.g. network and filtered params go away and we only have AddrPolicy as param to GetAddr. Then use AddrPolicy to implement what network and filtered were doing. That’s a bigger refactor but having this current mix with this redundancy is kind of ugly and I don’t really think it’s worth it alone.

    Thanks for the suggestion! network and filtered serve distinct, standard purposes (network filters by network, filtered disables quality filtering), while AddrPolicy is specifically for custom filters that skip entries without reducing the returned count. Using AddrPolicy also avoids coupling AddrMan to other modules (like BanMan) for filtering.

  37. fjahr commented at 2:03 pm on December 27, 2025: contributor

    network and filtered serve distinct, standard purposes (network filters by network, filtered disables quality filtering), while AddrPolicy is specifically for custom filters

    What distinguishes a “standard” from a “custom” filter? Is that documented somewhere? And if we need to add another filter that is “standard” we would add another param to the function rather than using AddrPolicy?

    that skip entries without reducing the returned count

    Huh? If they don’t reduce the count how are they filtering anything? This is not what the code is doing, the code is filtering and reducing the returned count with it as well of course. Also demonstrated by the tests. Am I misunderstanding what you mean with count?

    Using AddrPolicy also avoids coupling AddrMan to other modules (like BanMan) for filtering.

    I don’t see how that is an argument against using it for network and quality filters.

  38. waketraindev commented at 6:25 pm on December 27, 2025: contributor

    network and filtered serve distinct, standard purposes (network filters by network, filtered disables quality filtering), while AddrPolicy is specifically for custom filters

    What distinguishes a “standard” from a “custom” filter? Is that documented somewhere? And if we need to add another filter that is “standard” we would add another param to the function rather than using AddrPolicy?

    By “standard” I mean filters that are intrinsic to AddrMan and operate on AddrInfo (for example network classification and quality filtering via ai.IsTerrible(now)), and whose logic does not depend on external subsystems. By “custom” I mean filters that depend on external state or policy decisions outside of AddrMan (for example BanMan), which is what AddrPolicy is intended to support. If a future filter is intrinsic to AddrMan and depends on AddrInfo, adding it as a separate parameter would be consistent with the existing design. If it depends on external state, AddrPolicy is the appropriate mechanism.

    that skip entries without reducing the returned count

    Huh? If they don’t reduce the count how are they filtering anything? This is not what the code is doing, the code is filtering and reducing the returned count with it as well of course. Also demonstrated by the tests. Am I misunderstanding what you mean with count?

    Candidates are filtered and skipped during selection. The change here is when filtering happens. Previously, addresses were selected up to the target count and then banned or discouraged addresses were removed afterward, which caused the final CConnman::GetAddressesUnsafe result to be underfilled. With the current approach, filtering happens during selection, and the loop continues until the requested number of unfiltered addresses is reached or the address space is exhausted. So filtering still occurs, but it no longer causes CConnman::GetAddressesUnsafe to be underfilled relative to the request.

    Using AddrPolicy also avoids coupling AddrMan to other modules (like BanMan) for filtering.

    I don’t see how that is an argument against using it for network and quality filters.

    The point is not that AddrPolicy cannot express network or quality filters, but that doing so would conflate concerns and duplicate logic. Network and quality filtering are AddrMan responsibilities and require access to AddrInfo. Expressing them via AddrPolicy would either broaden the policy contract or require reimplementing intrinsic AddrMan logic in call sites, without addressing a concrete issue in this PR. This change keeps the PR scoped to fixing underfilling and avoids introducing additional coupling or refactors.


    To keep this PR focused: its purpose is to fix the underfilling issue by moving ban and discourage filtering into the selection phase. The current design achieves that with minimal surface area change and without refactoring existing AddrMan responsibilities or introducing new abstractions. Broader refactors (for example unifying all filters behind AddrPolicy or introducing builders/factories) can be explored separately if there is concrete motivation, but are out of scope for this change. Unless there are functional issues with the current approach, I intend to keep the implementation as is.

  39. in src/addrman.cpp:26 in fb3efdb3d2
    22@@ -23,6 +23,7 @@
    23 
    24 #include <cmath>
    25 #include <optional>
    26+#include <netbase.h>
    


    fjahr commented at 9:42 pm on December 27, 2025:
    nit: ordering
  40. in src/addrman.h:21 in fb3efdb3d2
    17@@ -18,6 +18,7 @@
    18 #include <unordered_set>
    19 #include <utility>
    20 #include <vector>
    21+#include <functional>
    


    fjahr commented at 9:42 pm on December 27, 2025:
    nit: ordering
  41. fjahr commented at 0:22 am on December 28, 2025: contributor

    By “standard” I mean filters that are intrinsic to AddrMan and operate on AddrInfo (for example network classification and quality filtering via ai.IsTerrible(now)), and whose logic does not depend on external subsystems. By “custom” I mean filters that depend on external state or policy decisions outside of AddrMan (for example BanMan), which is what AddrPolicy is intended to support. If a future filter is intrinsic to AddrMan and depends on AddrInfo, adding it as a separate parameter would be consistent with the existing design. If it depends on external state, AddrPolicy is the appropriate mechanism.

    The docs don’t mention any of this but instead the naming AddrPolicy and the docs suggest that it’s a generalized concept and could be used for any kind of filtering. I still think there is no reason to restrict usage of AddrPolicy in this way when it appears to be possible to just do both.

    To keep this PR focused: its purpose is to fix the underfilling issue by moving ban and discourage filtering into the selection phase. The current design achieves that with minimal surface area change and without refactoring existing AddrMan responsibilities or introducing new abstractions.

    If preventing underfilling is the only goal of this PR I would suggest a simpler approach: For example request more addresses than max_addresses and then trim at the end. Could be done smartly with using m_banned.size() to make sure we only overshoot as much as necessary. But simplest would be to just request with max_addresses = 0. The current approach doesn’t seem minimal and if there is no plan to use it more broadly it seems like overengineering. At least the conceptual limitations and “policy contract” that you are describing would need to be documented clearly in the code as well.

  42. waketraindev force-pushed on Dec 28, 2025
  43. waketraindev commented at 1:21 pm on December 28, 2025: contributor
    added fuzz coverage, fixed identation, removed hints, ordered includes, cleared nits
  44. addrman, net: filter during selection via AddrPolicy to avoid underfill
    Add AddrMan::AddrPolicy, a predicate returning true to exclude an address
    during selection. This allows callers to apply custom filtering directly in
    AddrMan rather than post-processing results.
    
    Use the new policy in CConnman::GetAddressesUnsafe to exclude banned and
    discouraged peers while filling address requests. This fixes underfilled
    results observed when banned peers were removed after selection, causing
    P2P responses from `GETADDR` to return fewer entries than requested.
    
    Also log the number of filtered entries for debugging and diagnostics.
    bb637a85c6
  45. in src/test/fuzz/addrman.cpp:185 in 33e7b66479
    178@@ -179,6 +179,27 @@ FUZZ_TARGET(addrman, .init = initialize_addrman)
    179     auto filtered = fuzzed_data_provider.ConsumeBool();
    180     (void)const_addr_man.GetAddr(max_addresses, max_pct, network, filtered);
    181 
    182+    CallOneOf(
    183+        fuzzed_data_provider,
    184+        [&] {
    185+            (void)const_addr_man.GetAddr(max_addresses, max_pct, network, filtered);
    


    fjahr commented at 2:23 pm on December 28, 2025:
    Why would you call the same line as above again? This line is completely redundant.

    waketraindev commented at 2:52 pm on December 28, 2025:
    oopsie
  46. waketraindev force-pushed on Dec 28, 2025
  47. fjahr commented at 5:02 pm on December 28, 2025: contributor
    It appears you aren’t interested in addressing the rest of my comments @waketraindev so I have implemented my suggested approach in #34162 instead. I prefer this approach due to its simplicity, as stated above. It fixes the overshooting problem in 2 lines and adds a functional test case for the behavior change. I will leave it to the other reviewers to decide which approach they prefer.
  48. Bicaru20 commented at 11:17 pm on December 28, 2025: none

    I think @fjahr has a good point. If the main goal is simply to always return 1000 addresses in GetAddr, introducing a predicate feels like adding unnecessary complexity. I believe his solution is cleaner.

    While leaving some nit-ish comments I have realized that if we want to add such a generalized concept for filtering, we should implement the current filtering with it as well. E.g. network and filtered params go away and we only have AddrPolicy as param to GetAddr. Then use AddrPolicy to implement what network and filtered were doing. That’s a bigger refactor but having this current mix with this redundancy is kind of ugly and I don’t really think it’s worth it alone.

    I agree. If a generalized filtering concept is added, it should replace the current filtering mechanism. This would make it easier to add new filters in the future. However, for the problem trying to be solved, I think it’s complicating things too much.

  49. waketraindev commented at 7:02 pm on December 29, 2025: contributor

    I think @fjahr has a good point. If the main goal is simply to always return 1000 addresses in GetAddr, introducing a predicate feels like adding unnecessary complexity. I believe his solution is cleaner.

    Regarding the “over-engineering” concerns, the lambda(the predicate) here is inline, concise, and used only once. It encapsulates a simple banned/discouraged filter applied during selection to ensure max_addresses are correctly returned. This approach is both functionally necessary and efficient, and does not introduce unnecessary complexity, so I do not consider it over-engineering.

    On the other hand, introducing a Builder to craft the policy, include other “generalized” filters, and export AddrInfo would be over-engineering in this context.

    The #34162 approach is highly inefficient for a database of 70k+ records, with on-demand calls through RPC with a max_addresses limit, and whitelisted nodes. It is insufficient for my use cases.

    Though both performance and simplicity are in the eyes of the beholder ig.

    Thanks for your feedback!

  50. fjahr commented at 3:24 pm on January 5, 2026: contributor

    The #34162 approach is highly inefficient for a database of 70k+ records, with on-demand calls through RPC with a max_addresses limit, and whitelisted nodes. It is insufficient for my use cases.

    If you could be more explicit with what your use-case is then maybe you could convince more people that your approach is really necessary, even when you refuse to use it more broadly. It comes down to if core wants to support your use case bad enough to take on the extra code to maintain, I guess. You have a lot of addresses, ok! But I don’t understand how whitelists play into this. Can you elaborate on that? And why would you do frequent calls to getnodeaddresses with a max_addresses limit, couldn’t you just increase the limit?

    I did some rough high-level benchmarking using the functional test from my PR (code). Due to the bucket collisions I couldn’t get beyond 64k addresses, but I guess this is close enough and I didn’t try to get to an even higher number. I am using an arbitrary 1k banned addresses and 1k max limit.

    This PR (#33663) has the best performance, not surprising:

    02025-12-30T14:34:41.400316Z TestFramework (INFO): Test that getnodeaddresses returns max count even with banned addresses
    12025-12-30T14:34:41.770049Z TestFramework (INFO): Adding addresses to addrman...
    22025-12-30T14:36:31.483147Z TestFramework (INFO): Successfully added 62636 addresses
    32025-12-30T14:36:31.483466Z TestFramework (INFO): Banning 1000 addresses...
    42025-12-30T14:36:32.511395Z TestFramework (INFO): Requesting 1000 addresses...
    52025-12-30T14:36:32.523990Z TestFramework (INFO): getnodeaddresses returned 1000 addresses in 0.013s
    

    My alternative approach (#34162) is much slower, also not surprising, but at 0.5s it doesn’t seem like this would break any workflows I can imagine:

    02025-12-30T14:18:12.848562Z TestFramework (INFO): Test that getnodeaddresses returns max count even with banned addresses
    12025-12-30T14:18:13.218102Z TestFramework (INFO): Adding addresses to addrman...
    22025-12-30T14:19:58.335032Z TestFramework (INFO): Successfully added 63471 addresses
    32025-12-30T14:19:58.335895Z TestFramework (INFO): Banning 1000 addresses...
    42025-12-30T14:19:59.381759Z TestFramework (INFO): Requesting 1000 addresses...
    52025-12-30T14:19:59.940406Z TestFramework (INFO): getnodeaddresses returned 1000 addresses in 0.559s
    

    As I had said previously, my suggest PR could be tweaked to get a middle ground between performance and simplicity, like for example fetch not all but only 2x max limit addresses to get to comparable performance and this should really work in 99% percent of the cases I can imagine. But sure, it’s not as clean as #34162 but performance is comparable to this PR ( demo code):

    02025-12-30T14:52:23.292663Z TestFramework (INFO): Test that getnodeaddresses returns max count even with banned addresses
    12025-12-30T14:52:23.663001Z TestFramework (INFO): Adding addresses to addrman...
    22025-12-30T14:54:09.192166Z TestFramework (INFO): Successfully added 63395 addresses
    32025-12-30T14:54:09.193381Z TestFramework (INFO): Banning 1000 addresses...
    42025-12-30T14:54:10.218400Z TestFramework (INFO): Requesting 1000 addresses...
    52025-12-30T14:54:10.238231Z TestFramework (INFO): getnodeaddresses returned 1000 addresses in 0.020s
    62025-12-30T14:54:11.333769Z TestFramework (INFO): Stopping nodes
    

    I also learned from playing around with the numbers that the total number of addresses only has minor impact on the performance. The big performance impact is coming from the number of banned addresses. So maybe that’s where the discussion on performance (aside from the goals) should focus. I am not sure I have a good grasp on what % of addresses are realistically banned or discouraged. @0xB10C maybe you can weigh in?


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-01-09 00:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me