addrman: cap the max_pct to not exceed the maximum number of addresses #31235

pull brunoerg wants to merge 1 commits into bitcoin:master from brunoerg:2024-11-fuzz-fix-connman-getaddresses changing 5 files +7 −5
  1. brunoerg commented at 6:07 pm on November 6, 2024: contributor

    Fixes #31234

    This PR fixes a bad alloc issue in GetAddresses by capping the value max_pct. In practice, values greater than 100 should be treated as 100 since it’s the percentage of addresses to return. Also, it limites the value max_pct in connman target to exercise values between 0 and 100.

  2. DrahtBot commented at 6:07 pm on November 6, 2024: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/31235.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK marcofleon, mzumsande, adamandrews1, vasild

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

  3. DrahtBot added the label Tests on Nov 6, 2024
  4. maflcko added this to the milestone 29.0 on Nov 6, 2024
  5. vasild commented at 6:10 am on November 7, 2024: contributor

    I think that the problem warrants a fix in the real code, not in the test. AddrManImpl::GetAddr_() would try to overallocate if the caller passes > 100% which is meaningless. It already handles max_addresses properly: nNodes = std::min(nNodes, max_addresses);. If there are 1000 addresses and the caller asks for 5000, it would clamp it down to 1000.

     0diff --git i/src/addrman.cpp w/src/addrman.cpp
     1index 43e7b6a32c..9fb092e9af 100644
     2--- i/src/addrman.cpp
     3+++ w/src/addrman.cpp
     4@@ -812,13 +812,13 @@ nid_type AddrManImpl::GetEntry(bool use_tried, size_t bucket, size_t position) c
     5 std::vector<CAddress> AddrManImpl::GetAddr_(size_t max_addresses, size_t max_pct, std::optional<Network> network, const bool filtered) const
     6 {
     7     AssertLockHeld(cs);
     8 
     9     size_t nNodes = vRandom.size();
    10     if (max_pct != 0) {
    11-        nNodes = max_pct * nNodes / 100;
    12+        nNodes = std::min(nNodes, max_pct * nNodes / 100);
    13     }
    14     if (max_addresses != 0) {
    15         nNodes = std::min(nNodes, max_addresses);
    16     }
    17 
    18     // gather a list of random nodes, skipping those of low quality
    19diff --git i/src/test/fuzz/addrman.cpp w/src/test/fuzz/addrman.cpp
    20index a7e7f49d9f..ce50e22916 100644
    21--- i/src/test/fuzz/addrman.cpp
    22+++ w/src/test/fuzz/addrman.cpp
    23@@ -169,14 +169,14 @@ FUZZ_TARGET(addrman, .init = initialize_addrman)
    24     }
    25     const AddrMan& const_addr_man{addr_man};
    26     std::optional<Network> network;
    27     if (fuzzed_data_provider.ConsumeBool()) {
    28         network = fuzzed_data_provider.PickValueInArray(ALL_NETWORKS);
    29     }
    30-    auto max_addresses = fuzzed_data_provider.ConsumeIntegralInRange<size_t>(0, 4096);
    31-    auto max_pct = fuzzed_data_provider.ConsumeIntegralInRange<size_t>(0, 4096);
    32+    auto max_addresses = fuzzed_data_provider.ConsumeIntegral<size_t>();
    33+    auto max_pct = fuzzed_data_provider.ConsumeIntegral<size_t>();
    34     auto filtered = fuzzed_data_provider.ConsumeBool();
    35     (void)const_addr_man.GetAddr(max_addresses, max_pct, network, filtered);
    36 
    37     std::unordered_set<Network> nets;
    38     for (const auto& net : ALL_NETWORKS) {
    39         if (fuzzed_data_provider.ConsumeBool()) {
    
  6. brunoerg commented at 10:49 am on November 7, 2024: contributor

    I think that the problem warrants a fix in the real code, not in the test.

    We can still limit the values to reflect the reality. I don’t see how to have a huge max_pct in practice. For example, for GETADDR, we use MAX_ADDR_TO_SEND/MAX_PCT_ADDR_TO_SEND. For the RPC getnodeaddresses and other usage, its value is always zero.

  7. vasild commented at 11:24 am on November 7, 2024: contributor

    In the future the callers of AddrManImpl::GetAddr_() can change, or new callers can be added. It is good to ensure that it does not misbehave or crash the program for some input.

    I understand that this is not an universal argument and should be applied with reason, with limits. For example, the following is beyond reason: “std::vector::reserve() is going to crash the program for some input - if asked to allocate 1000TB, lets fix it!”.

    IMO AddrManImpl::GetAddr_() capping max_pct is reasonable. It already caps max_addresses. And then we can ensure it does by throwing any random numbers at it and making sure it does not break.

  8. brunoerg commented at 11:30 am on November 7, 2024: contributor

    IMO AddrManImpl::GetAddr_() capping max_pct is reasonable. It already caps max_addresses. And then we can ensure it does by throwing any random numbers at it and making sure it does not break.

    I agree. I can address it. I’m thinking about If max_pct is the maximum percentage of addresses to return. We could add an assert in GetAddr that this number is not greater than 100 and change the targets to use values from 0 to 100. I doesn’t make sense to try to get 10000000% of the addresses, for example.

  9. vasild commented at 11:35 am on November 7, 2024: contributor

    Yeah, that too. Then the test should indeed not call it with >100. If you go this way, then it should be clearly documented in:

    0    /**
    1     * Return all or many randomly selected addresses, optionally by network.
    2     *
    3     * [@param](/bitcoin-bitcoin/contributor/param/)[in] max_addresses  Maximum number of addresses to return (0 = all).
    4     * [@param](/bitcoin-bitcoin/contributor/param/)[in] max_pct        Maximum percentage of addresses to return (0 = all).
    

    Somehow I feel it is safer to treat values >100 as 100 and document that instead of crashing with an assert (or bad alloc).

  10. brunoerg commented at 11:36 am on November 7, 2024: contributor

    Somehow I feel it is safer to treat values >100 as 100 and document that instead of crashing with an assert (or bad alloc).

    Sure, I will address this.

  11. brunoerg force-pushed on Nov 7, 2024
  12. brunoerg commented at 11:44 am on November 7, 2024: contributor
    Force-pushed adding the cap of max_pct in GetAddresses, changed the PR description as well. Thanks, @vasild.
  13. in src/test/fuzz/connman.cpp:113 in 81f26e9225 outdated
    109@@ -110,13 +110,13 @@ FUZZ_TARGET(connman, .init = initialize_connman)
    110             },
    111             [&] {
    112                 auto max_addresses = fuzzed_data_provider.ConsumeIntegral<size_t>();
    113-                auto max_pct = fuzzed_data_provider.ConsumeIntegral<size_t>();
    114+                auto max_pct = fuzzed_data_provider.ConsumeIntegralInRange<size_t>(0, 200);
    


    vasild commented at 12:34 pm on November 7, 2024:

    It would be good extend the GetAddr() doc to something like the following. And then no need to cap to 200 here.

     0--- i/src/addrman.h
     1+++ w/src/addrman.h
     2@@ -163,13 +163,13 @@ public:
     3     std::pair<CAddress, NodeSeconds> Select(bool new_only = false, const std::unordered_set<Network>& networks = {}) const;
     4 
     5     /**
     6      * Return all or many randomly selected addresses, optionally by network.
     7      *
     8      * [@param](/bitcoin-bitcoin/contributor/param/)[in] max_addresses  Maximum number of addresses to return (0 = all).
     9-     * [@param](/bitcoin-bitcoin/contributor/param/)[in] max_pct        Maximum percentage of addresses to return (0 = all).
    10+     * [@param](/bitcoin-bitcoin/contributor/param/)[in] max_pct        Maximum percentage of addresses to return, values >100 are treated as 100 (0 = all).
    11      * [@param](/bitcoin-bitcoin/contributor/param/)[in] network        Select only addresses of this network (nullopt = all).
    12      * [@param](/bitcoin-bitcoin/contributor/param/)[in] filtered       Select only addresses that are considered good quality (false = all).
    13      *
    14      * [@return](/bitcoin-bitcoin/contributor/return/)                   A vector of randomly selected addresses from vRandom.
    15      */
    16     std::vector<CAddress> GetAddr(size_t max_addresses, size_t max_pct, std::optional<Network> network, const bool filtered = true) const;
    

    brunoerg commented at 12:39 pm on November 7, 2024:
    The cap to 200 is just to facilitate exploring valid ranges and some exceeds.

    brunoerg commented at 12:39 pm on November 7, 2024:
    Going to extend the doc. Thanks.

    brunoerg commented at 12:41 pm on November 7, 2024:
    Done.
  14. vasild approved
  15. vasild commented at 12:36 pm on November 7, 2024: contributor
    ACK 81f26e9225838a1f0e0293c84f875efab95d9724
  16. brunoerg force-pushed on Nov 7, 2024
  17. brunoerg commented at 12:41 pm on November 7, 2024: contributor
    Force-pushed addressing #31235 (review)
  18. vasild approved
  19. vasild commented at 12:53 pm on November 7, 2024: contributor
    ACK 9cdf4e67bf04b5abb46b4ce1cf4c306819afe056
  20. in src/addrman.cpp:818 in db5bd56834 outdated
    814@@ -815,7 +815,7 @@ std::vector<CAddress> AddrManImpl::GetAddr_(size_t max_addresses, size_t max_pct
    815 
    816     size_t nNodes = vRandom.size();
    817     if (max_pct != 0) {
    818-        nNodes = max_pct * nNodes / 100;
    819+        nNodes = std::min(nNodes, max_pct * nNodes / 100);
    


    mzumsande commented at 3:37 pm on November 7, 2024:
    Since there was a bug somewhere if we ever pass max_pct > 100 (with privacy implications, the reason max_pct exists is not to reveal the entire addrman), why not Assume(max_pct <=100) / adjust the fuzzer range instead of trying to accommodate it? There is a difference between max_pct and the max_addresses where a caller could legitimately ask for more addresses than addrman has because they don’t know current addrman size - no caller should ever ask for more than 100%.

    vasild commented at 6:11 am on November 8, 2024:

    I am fine either way and slightly prefer the softer approach of capping it to 100 instead of asserting that it is <=100. There was a discussion some time ago about crashing the program way too easily and reserving the assert/crash only for cases where the continuation of the program really does not make sense. I kind of agree with that.

    Yes, Assume() is some middle ground between assert() and the softer approach. Assume() + cap in release builds looks reasonable to me too. Plus document it properly: “make sure you never pass >100 or you will crash the program” … hmm, could this end up in all callers doing the cap at the call site before calling GetAddr() to make sure to avoid a crash?

    Do you remember the bug number? “The reason max_pct exists is not to reveal the entire addrman”, but then passing even == 100 is a bug?


    If you will be touching this, there is this minor mostly theoretical thing: max_pct * nNodes could overflow and I guess max_pct * nNodes / 100 could end up with a lower value than nNodes for some very high input max_pct. So, if you will be touching this, maybe it is “safer” to implement the cap as:

    0max_pct = std::min(max_pct, 100);
    1nNodes = max_pct * nNodes / 100;
    

    (this occurred to me just recently)


    mzumsande commented at 3:46 pm on November 8, 2024:

    Do you remember the bug number? “The reason max_pct exists is not to reveal the entire addrman”, but then passing even == 100 is a bug?

    GetAddr answers have been capped by percentage at least since addrman was introduced (5fee401fe14aa6459428a26a82f764db70a6a0b9), just that MAX_PCT_ADDR_TO_SEND=23 was hardcoded in the past. It’s kind of still that way (MAX_PCT_ADDR_TO_SEND is used as an arg in net_processing, the rpc code path doesn’t use max_pct.), so in current master nothing but 23 or 0 should ever be passed.

    Assume() + cap in release builds looks reasonable to me too.

    Agree.


    brunoerg commented at 11:36 am on November 11, 2024:

    Yes, Assume() is some middle ground between assert() and the softer approach. Assume() + cap in release builds looks reasonable to me too.

    I agree. I can address it.

    so in current master nothing but 23 or 0 should ever be passed.

    So, we could adjust both addrman and connman fuzz targets to get 23 or 0 to use for max_pct. Sounds good?


    mzumsande commented at 1:03 pm on November 11, 2024:

    so in current master nothing but 23 or 0 should ever be passed.

    So, we could adjust both addrman and connman fuzz targets to get 23 or 0 to use for max_pct. Sounds good?

    I think fuzzing in range between 0 and 100 would make sense - after all that’s what the function is designed to handle


    brunoerg commented at 1:48 pm on November 11, 2024:

    I think fuzzing in range between 0 and 100 would make sense - after all that’s what the function is designed to handle

    I agree.


    brunoerg commented at 2:22 pm on November 11, 2024:
    Done.
  21. brunoerg force-pushed on Nov 11, 2024
  22. brunoerg commented at 2:23 pm on November 11, 2024: contributor

    Force-pushed:

    • Adjusted the cap and added the Assume for the max_pct.
    • Adjusted GetAddr and GetAddress documentation.
    • Adjusted both connman and addrman fuzz targets to use values from 0 to 100 for max_pct.

    Thanks @mzumsande and @vasild.

  23. l0rinc commented at 2:27 pm on November 11, 2024: contributor
    The title still claims “fuzz”, but the fix seems to be in production code - can we update the PR title?
  24. in src/addrman.cpp:819 in 5aea8ca23e outdated
    811@@ -812,9 +812,11 @@ nid_type AddrManImpl::GetEntry(bool use_tried, size_t bucket, size_t position) c
    812 std::vector<CAddress> AddrManImpl::GetAddr_(size_t max_addresses, size_t max_pct, std::optional<Network> network, const bool filtered) const
    813 {
    814     AssertLockHeld(cs);
    815+    Assume(max_pct <= 100);
    816 
    817     size_t nNodes = vRandom.size();
    818     if (max_pct != 0) {
    819+        max_pct = std::min(max_pct, (size_t)100);
    


    maflcko commented at 2:33 pm on November 11, 2024:
    style-nit: For new code it would be good to use size_t{100}, as explained in the dev notes.

    brunoerg commented at 3:49 pm on November 11, 2024:
    Done.
  25. maflcko commented at 2:34 pm on November 11, 2024: member
    you’ll have to squash the commits. Otherwise a fuzz bisect will fail on the addrman target as well. (connman already fails)
  26. addrman: cap the `max_pct` to not exceed the maximum number of addresses
    Co-authored-by: Vasil Dimov <vd@FreeBSD.org>
    9c5775c331
  27. brunoerg force-pushed on Nov 11, 2024
  28. brunoerg commented at 3:49 pm on November 11, 2024: contributor
    Force-pushed squashing and addressing #31235 (review).
  29. brunoerg renamed this:
    fuzz: fix bad alloc in connman target
    addrman: cap the `max_pct` to not exceed the maximum number of addresses
    on Nov 11, 2024
  30. maflcko removed the label Tests on Nov 11, 2024
  31. DrahtBot added the label P2P on Nov 11, 2024
  32. marcofleon commented at 3:40 pm on November 12, 2024: contributor
    Tested ACK 9c5775c331e02dab06c78ecbb58488542d16dda7. Reproduced the crash on master and checked that this fixed it. The checks added to GetAddr_ look reasonable.
  33. DrahtBot requested review from vasild on Nov 12, 2024
  34. mzumsande commented at 4:43 pm on November 12, 2024: contributor
    Code Review ACK 9c5775c331e02dab06c78ecbb58488542d16dda7
  35. adamandrews1 commented at 0:52 am on November 13, 2024: none
  36. vasild approved
  37. vasild commented at 11:31 am on November 13, 2024: contributor
    ACK 9c5775c331e02dab06c78ecbb58488542d16dda7
  38. fanquake merged this on Nov 13, 2024
  39. fanquake closed this on Nov 13, 2024

  40. brunoerg deleted the branch on Nov 13, 2024

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-21 15:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me