fuzz: Make ConsumeNetAddr always produce valid onion addresses #26497

pull dergoegge wants to merge 3 commits into bitcoin:master from dergoegge:2022-11-fuzz-valid-onion-addrs changing 12 files +63 −28
  1. dergoegge commented at 2:05 PM on November 14, 2022: member

    The chance that the fuzzer is able to guess a valid onion address is probably slim, as they are Base32 encoded and include a checksum. Right now, any target using ConsumeNetAddr would have a hard time uncovering bugs that require valid onion addresses as input.

    This PR makes ConsumeNetAddr produce valid onion addresses by using the 32 bytes given by the fuzzer as the pubkey for the onion address and forming a valid address according to the torv3 spec.

  2. dergoegge force-pushed on Nov 14, 2022
  3. in src/test/fuzz/util.cpp:530 in 4e1bc19712 outdated
     524 | @@ -524,7 +525,28 @@ CNetAddr ConsumeNetAddr(FuzzedDataProvider& fuzzed_data_provider) noexcept
     525 |      } else if (network == Network::NET_INTERNAL) {
     526 |          net_addr.SetInternal(fuzzed_data_provider.ConsumeBytesAsString(32));
     527 |      } else if (network == Network::NET_ONION) {
     528 | -        net_addr.SetSpecial(fuzzed_data_provider.ConsumeBytesAsString(32));
     529 | +        auto pub_key{fuzzed_data_provider.ConsumeBytes<uint8_t>(torv3::ADDR_SIZE)};
     530 | +        if (pub_key.size() > 0 && pub_key[0] & 1) {
     531 | +            pub_key.resize(32);
    


    vasild commented at 2:58 PM on November 14, 2022:

    nit:

                pub_key.resize(torv3::ADDR_SIZE);
    
  4. in src/test/fuzz/util.cpp:537 in 4e1bc19712 outdated
     533 | +            // being a fuzz blocker.
     534 | +            uint8_t check_sum[torv3::CHECKSUM_LEN];
     535 | +            torv3::Checksum({pub_key.data(), pub_key.size()}, check_sum);
     536 | +
     537 | +            std::vector<uint8_t> onion_addr;
     538 | +            onion_addr.insert(onion_addr.end(), pub_key.begin(), pub_key.end());
    


    vasild commented at 3:01 PM on November 14, 2022:

    nit:

                std::vector<uint8_t> onion_addr{pub_key};
    
  5. in src/test/fuzz/util.cpp:545 in 4e1bc19712 outdated
     541 | +            onion_addr.push_back(torv3::VERSION[0]);
     542 | +            assert(onion_addr.size() == torv3::TOTAL_LEN);
     543 | +
     544 | +            auto encoded{EncodeBase32({onion_addr.data(), onion_addr.size()})};
     545 | +            encoded += ".onion";
     546 | +            net_addr.SetSpecial(encoded);
    


    vasild commented at 3:03 PM on November 14, 2022:

    To ensure we actually did what we intended:

                const bool ok = net_addr.SetSpecial(encoded);
                assert(ok);
    
  6. in src/test/fuzz/util.cpp:548 in 4e1bc19712 outdated
     544 | +            auto encoded{EncodeBase32({onion_addr.data(), onion_addr.size()})};
     545 | +            encoded += ".onion";
     546 | +            net_addr.SetSpecial(encoded);
     547 | +        } else {
     548 | +            std::string addr(reinterpret_cast<const std::string::value_type*>(pub_key.data()), pub_key.size());
     549 | +            net_addr.SetSpecial(addr);
    


    vasild commented at 3:17 PM on November 14, 2022:

    SetSpecial() expects a string like 5g72ppm3krkorsfopcm2bi7wlv4ohhs4u4mlseymasn7g7zhdcyjpfid.onion and this code will feed it a binary string (containing non-printable chars, maybe also '\0') of length 32. It will never succeed because 32 is too short.

    I think that the branch } else if (network == Network::NET_ONION) { should always do its best to generate a valid Tor address. Maybe append Network::NET_UNROUTABLE to the array at the start of ConsumeNetAddr() and don't do anything in that case (leave net_addr default constructed, which is !IsValid()) in order to generate invalid results some of the time.


    dergoegge commented at 11:35 AM on November 15, 2022:

    I wanted to preserve the previous behavior but you are right that will never work in any case because of the length.

    Now it only ever produces valid onion addresses. I didn't add NET_UNROUTABLE as i think that would invalidate all our corpuses that make use of ConsumeNetAddr or at least make them less useful (not sure how much we would/should care about that).

  7. DrahtBot added the label Tests on Nov 14, 2022
  8. vasild commented at 3:28 PM on November 14, 2022: contributor

    Approach ACK 4e1bc1971245714735eff70700dfd0f336169db5

  9. maflcko approved
  10. maflcko commented at 9:18 AM on November 15, 2022: member

    Good catch.

    Nit: Maybe move it to a separate fuzz util helper? It is used in the following tests:

    src/test/fuzz/addrman.cpp
    src/test/fuzz/banman.cpp
    src/test/fuzz/connman.cpp
    src/test/fuzz/netaddress.cpp
    src/test/fuzz/netbase_dns_lookup.cpp
    

    and could be placed in a new src/test/fuzz/util/net.h?

  11. dergoegge force-pushed on Nov 15, 2022
  12. dergoegge commented at 11:38 AM on November 15, 2022: member

    Thanks for the reviews @vasild @MarcoFalke!

    Addressed all the comments and added a new commit for moving ConsumeNetAddr to src/test/fuzz/util/net.h. There are more net related utils that could move there in a future PR (also happy to do that in this PR if preferred).

  13. dergoegge renamed this:
    fuzz: Make ConsumeNetAddr (sometimes) produce valid onion addresses
    fuzz: Make ConsumeNetAddr always produce valid onion addresses
    on Nov 15, 2022
  14. vasild approved
  15. vasild commented at 12:06 PM on November 15, 2022: contributor

    ACK 60523801db69efccbe5e997237dc41a9684bc3a2

  16. in src/test/fuzz/util.cpp:544 in 3afc18f75e outdated
     540 | +        assert(onion_addr.size() == torv3::TOTAL_LEN);
     541 | +
     542 | +        auto encoded{EncodeBase32({onion_addr.data(), onion_addr.size()})};
     543 | +        encoded += ".onion";
     544 | +        const bool ok{net_addr.SetSpecial(encoded)};
     545 | +        assert(ok);
    


    brunoerg commented at 8:28 PM on November 15, 2022:

    wouldn't it be interesting assert that the net_addr is valid (IsValid)?


    dergoegge commented at 12:43 PM on November 16, 2022:

    I don't think so. Looking at CNetAddr::IsValid, there are no validity checks regarding Tor addresses, so even if a Tor address is malformed IsValid would return true.


    brunoerg commented at 1:00 PM on November 16, 2022:

    Interesting, thanks! I didn't notice that IsValid isn't able to check Tor addresses.

  17. DrahtBot commented at 12:44 PM on November 16, 2022: contributor

    <!--e57a25ab6845829454e8d69fc972939a-->

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    <!--021abf342d371248e50ceaed478a90ca-->

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK vasild, brunoerg
    Stale ACK MarcoFalke
  18. in src/util/torv3.h:20 in 834426d9b0 outdated
      15 | +namespace torv3 {
      16 | +// https://gitweb.torproject.org/torspec.git/tree/rend-spec-v3.txt#n2135
      17 | +static constexpr size_t CHECKSUM_LEN = 2;
      18 | +static const unsigned char VERSION[] = {3};
      19 | +static constexpr size_t ADDR_SIZE{32};
      20 | +static constexpr size_t TOTAL_LEN = ADDR_SIZE + CHECKSUM_LEN + sizeof(VERSION);
    


    maflcko commented at 1:50 PM on November 16, 2022:

    Nit in the first commit: Maybe use {} for new code instead of =?

  19. in src/netaddress.cpp:15 in 834426d9b0 outdated
      11 | @@ -12,6 +12,7 @@
      12 |  #include <tinyformat.h>
      13 |  #include <util/strencodings.h>
      14 |  #include <util/string.h>
      15 | +#include <util/torv3.h>
      16 |  
    


    maflcko commented at 1:51 PM on November 16, 2022:

    nit in the first commit: Maybe static_assert that ADDR_TORV3_SIZE is equal to torv3::ADDR_SIZE?


    vasild commented at 2:19 PM on November 16, 2022:

    torv3::ADDR_SIZE was introduced in the "move-only" commit. I think it is ok, but why did you actually add a new variable?


    dergoegge commented at 2:24 PM on November 16, 2022:

    I didn't want util/torv3.h to have netaddress.h as dependency.

  20. in src/test/fuzz/util.cpp:533 in 3afc18f75e outdated
     529 | +        auto pub_key{fuzzed_data_provider.ConsumeBytes<uint8_t>(torv3::ADDR_SIZE)};
     530 | +        pub_key.resize(torv3::ADDR_SIZE);
     531 | +        // Create valid onion address to prevent the address checksum from
     532 | +        // being a fuzz blocker.
     533 | +        uint8_t check_sum[torv3::CHECKSUM_LEN];
     534 | +        torv3::Checksum({pub_key.data(), pub_key.size()}, check_sum);
    


    maflcko commented at 1:59 PM on November 16, 2022:

    nit in 3afc18f75e6d904f1e6eae85317d27b462c9eecb: Why not pass (pub_key, check_sum)?


    brunoerg commented at 1:03 PM on November 17, 2022:

    I had same question from @MarcoFalke, seems similar toOnionToString:

    std::string OnionToString(Span<const uint8_t> addr)
    {
        uint8_t checksum[torv3::CHECKSUM_LEN];
        torv3::Checksum(addr, checksum);
        // TORv3 onion_address = base32(PUBKEY | CHECKSUM | VERSION) + ".onion"
        prevector<torv3::TOTAL_LEN, uint8_t> address{addr.begin(), addr.end()};
        address.insert(address.end(), checksum, checksum + torv3::CHECKSUM_LEN);
        address.insert(address.end(), torv3::VERSION, torv3::VERSION + sizeof(torv3::VERSION));
        return EncodeBase32(address) + ".onion";
    }
    

    vasild commented at 1:13 PM on November 17, 2022:

    Oh, well, I was unaware of OnionToString(). The newly added code is a duplicate of that. @jonatack has done a good job isolating it. Now it can be reused, cool!


    dergoegge commented at 1:23 PM on November 17, 2022:

    Was also unaware of OnionToString. Looks like the better approach here would be to make that public in netaddress.h instead of moving the torv3 utils like i am doing here? Let me know if y'all prefer that (thumbs up/thumbs down).

  21. in src/test/fuzz/util.cpp:541 in 3afc18f75e outdated
     537 | +        onion_addr.push_back(check_sum[0]);
     538 | +        onion_addr.push_back(check_sum[1]);
     539 | +        onion_addr.push_back(torv3::VERSION[0]);
     540 | +        assert(onion_addr.size() == torv3::TOTAL_LEN);
     541 | +
     542 | +        auto encoded{EncodeBase32({onion_addr.data(), onion_addr.size()})};
    


    maflcko commented at 2:00 PM on November 16, 2022:

    Same.

    (If needed, you can call MakeUCharSpan)

  22. maflcko approved
  23. maflcko commented at 2:01 PM on November 16, 2022: member

    tested at runtime that this SetSpecial always fails on master with this diff:

    diff --git a/src/test/fuzz/util.cpp b/src/test/fuzz/util.cpp
    index d495a6bfe3..74cddd037d 100644
    --- a/src/test/fuzz/util.cpp
    +++ b/src/test/fuzz/util.cpp
    @@ -524,7 +524,7 @@ CNetAddr ConsumeNetAddr(FuzzedDataProvider& fuzzed_data_provider) noexcept
         } else if (network == Network::NET_INTERNAL) {
             net_addr.SetInternal(fuzzed_data_provider.ConsumeBytesAsString(32));
         } else if (network == Network::NET_ONION) {
    -        net_addr.SetSpecial(fuzzed_data_provider.ConsumeBytesAsString(32));
    +        Assert(!net_addr.SetSpecial(fuzzed_data_provider.ConsumeBytesAsString(32)));
         }
         return net_addr;
     }
    

    review ACK 60523801db69efccbe5e997237dc41a9684bc3a2 🍦

    <details><summary>Show signature</summary>

    Signature:

    -----BEGIN PGP SIGNED MESSAGE-----
    Hash: SHA512
    
    review ACK 60523801db69efccbe5e997237dc41a9684bc3a2 🍦
    -----BEGIN PGP SIGNATURE-----
    
    iQGzBAEBCgAdFiEE+rVPoUahrI9sLGYTzit1aX5ppUgFAlwqrYAACgkQzit1aX5p
    pUjUcwv+LS3rIkpeUS3dDy6+pcvtTGUM5DUcyT+gYPxd+1vGujS1wk20fN5JqE+n
    ILMK5kgtRcQe/EV8yW9coVjWwiMJ2F5SGuJnw0+1ZeO7B6sCAsgpg7ouLoDyedbv
    H2HK3B8gtsu82SYaWP22Lh3lM5wASWeKdT3lrn+eBHTWBQd4A0+446h3cfM9h0sM
    RFtDdwr21vb5zQoq3KgbN6l8+RQMzn29QnFWYxPkS08pyqVacib4m+htZxFo3XWs
    Sez8d/YTBmg63YC6AdZ7Oz40icJsb75WJwNQxmRQLXpxlwkAaJYchtEKAJ0HRtZJ
    Uqz+OAwnlAkOIypfTH3yDqIFE9I2KbE5kvMlVbUfEnSJ9tVSAjTCgKw8chPMm6iW
    faFSf9fUc0ob0PiJlTXvplG444Sy4iT5C+m2+7LVfz2atVF6bpIUbs2rtGfZ3RWr
    dgv6TeLPAzjmUemug9KBvog4P5DhUYQWghwa6NG75pEkv68ulciyadlN5jNQtZCG
    AfEYffWE
    =yuEE
    -----END PGP SIGNATURE-----
    

    </details>

  24. vasild commented at 2:21 PM on November 16, 2022: contributor

    tested at runtime that this always fails on master ... @MarcoFalke, do you mean that it always succeeds? I mean, SetSpecial() will always return false, so !SetSpecial() will always be true, so the assert will never fail, right? Edit: or maybe you mean "this always fails" --> "SetSpecial() always fails".

  25. dergoegge force-pushed on Nov 17, 2022
  26. vasild approved
  27. vasild commented at 1:06 PM on November 17, 2022: contributor

    ACK 78b6372863a4b22023a7d141ea5c854a67d41363

  28. brunoerg approved
  29. brunoerg commented at 1:09 PM on November 17, 2022: contributor

    crACK 78b6372863a4b22023a7d141ea5c854a67d41363

  30. [netaddress] Make OnionToString public c9ba3f836e
  31. [fuzz] Make ConsumeNetAddr produce valid onion addresses 291c8697d4
  32. [fuzz] Move ConsumeNetAddr to fuzz/util/net.h 0eeb9b0442
  33. dergoegge force-pushed on Nov 17, 2022
  34. dergoegge commented at 6:01 PM on November 17, 2022: member

    Changed the approach a bit to use OnionToString. Thanks @brunoerg, @vasild and @MarcoFalke for the reviews!

  35. in src/netaddress.h:114 in 0eeb9b0442
     110 | @@ -111,6 +111,8 @@ static constexpr size_t ADDR_INTERNAL_SIZE = 10;
     111 |  /// SAM 3.1 and earlier do not support specifying ports and force the port to 0.
     112 |  static constexpr uint16_t I2P_SAM31_PORT{0};
     113 |  
     114 | +std::string OnionToString(Span<const uint8_t> addr);
    


    vasild commented at 12:37 PM on November 21, 2022:

    Since this is now public, it would be nice to document it. Something like:

    /**
     * Generate the ".onion" address which corresponds to a public key.
     * [@param](/bitcoin-bitcoin/contributor/param/)[in] addr public key bytes, must be 32.
     * [@returns](/bitcoin-bitcoin/contributor/returns/) for example 5g72ppm3krkorsfopcm2bi7wlv4ohhs4u4mlseymasn7g7zhdcyjpfid.onion
     */
    
  36. vasild approved
  37. vasild commented at 12:42 PM on November 21, 2022: contributor

    ACK 0eeb9b0442fb2f2da33c04704eefe6a84d28e981

  38. brunoerg approved
  39. brunoerg commented at 12:44 PM on November 21, 2022: contributor

    ACK 0eeb9b0442fb2f2da33c04704eefe6a84d28e981

  40. maflcko merged this on Nov 21, 2022
  41. maflcko closed this on Nov 21, 2022

  42. maflcko commented at 3:05 PM on November 21, 2022: member

    Btw, if there is value in breaking the inputs (https://github.com/bitcoin/bitcoin/pull/26497#discussion_r1022681025), it can be done, because fuzzing is running 24/7 anyway, so any "lost" inputs should be re-discovered eventually.

  43. sidhujag referenced this in commit c37a9c2848 on Nov 21, 2022
  44. vasild commented at 4:02 PM on November 21, 2022: contributor

    A natural followup to this would be to also generate I2P and CJDNS addresses. That needs to modify the network array anyway. At the same time some dummy value can be added to it to return an invalid address. So, invalidate the inputs just once.

  45. vasild commented at 5:15 PM on January 9, 2023: contributor

    A natural followup to this would be to also generate I2P and CJDNS addresses.

    Done in #26859.

  46. bitcoin locked this on Jan 9, 2024

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-28 21:14 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me