test: Fuzz the human-readable part of bech32 as well #30623

pull l0rinc wants to merge 2 commits into bitcoin:master from l0rinc:l0rinc/bech32-testing changing 3 files +51 −26
  1. l0rinc commented at 7:56 am on August 10, 2024: contributor

    Split out of https://github.com/bitcoin/bitcoin/pull/30596/files#r1706957219

    Instead of the static “bc” human-readable part, it’s now randomly generated based on https://github.com/bitcoin/bips/blob/master/bip-0173.mediawiki and the extra restrictions in the code:

    The human-readable part, which is intended to convey the type of data, or anything else that is relevant to the reader. This part MUST contain 1 to 83 US-ASCII characters, with each character having a value in the range [33-126]. HRP validity may be further restricted by specific applications.

    Since bech32::Encode rejects uppercase letters, we’re actually generating values in the [33-126] - ['A'-'Z'] range.

  2. DrahtBot commented at 7:56 am on August 10, 2024: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage

    For detailed information about the code coverage, see the test coverage report.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK marcofleon, brunoerg

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

  3. DrahtBot added the label Tests on Aug 10, 2024
  4. DrahtBot added the label CI failed on Aug 10, 2024
  5. l0rinc force-pushed on Aug 10, 2024
  6. DrahtBot removed the label CI failed on Aug 10, 2024
  7. l0rinc marked this as ready for review on Aug 18, 2024
  8. achow101 requested review from josibake on Oct 15, 2024
  9. achow101 requested review from marcofleon on Oct 15, 2024
  10. in src/test/fuzz/bech32.cpp:19 in 0c5bba5313 outdated
    21-    const auto r1 = bech32::Decode(random_string);
    22-    if (r1.hrp.empty()) {
    23-        assert(r1.encoding == bech32::Encoding::INVALID);
    24-        assert(r1.data.empty());
    25+    FuzzedDataProvider fdp(buffer.data(), buffer.size());
    26+    auto random_string = fdp.ConsumeRandomLengthString();
    


    brunoerg commented at 12:58 pm on October 17, 2024:
    In 0c5bba5313f6eff05676d25c9dca1b73c5a0c95b: I think we could limit the size of random_string based on its max possible size.

    l0rinc commented at 1:07 pm on October 17, 2024:
    Aren’t we introducing bias that way? Can you point me to a BIP or any doc that defines these max values (I don’t know about any, that’s why I’m asking)?

    sipa commented at 1:32 pm on October 19, 2024:

    BIP173/BIP350 specify the maximum length of bech32/bech32m strings as 90 characters.

    And “bias” isn’t really the right metric in fuzz tests, because fuzz tests aren’t trying to be uniformly random (they’re coverage driven, aiming to explore variations that trigger edge cases). Of course, ideally all (hard) cases are (easily) reachable from the fuzz input. If you’d actually restrict the input to 90 characters, then yes, the code for detecting too-long strings would never trigger. But it’s not unreasonable to for example limit it to slightly more than 90 to still exercise that code, or alternatively, have a separate test for (much-)too-long strings, because all of those are expected to be invalid, and cross-pollination between the corpora isn’t likely to be useful.


    l0rinc commented at 5:37 pm on October 19, 2024:

    BIP173/BIP350 specify the maximum length of bech32/bech32m strings as 90 characters.

    Yes, but https://en.bitcoin.it/wiki/BIP_0352 states that:

    Note: BIP173 imposes a 90 character limit for Bech32 segwit addresses and limits versions to 0 through 16, whereas a silent payment address requires at least 117 characters[12] and allows versions up to 31. Additionally, since higher versions may add to the data field, it is recommended implementations use a limit of 1023 characters (see BIP173: Checksum design for more details).

    Why do silent payment addresses need at least 117 characters? A silent payment address is a bech32m encoding comprised of the following parts: HRP [2-3 characters] separator [1 character] version [1-2 characters] payload, 66 bytes concatenated pubkeys [ceil(66*8/5) = 106 characters] checksum [6 characters] For a silent payments v0 address, this results in a 117-character address when using a 3-character HRP. Future versions of silent payment addresses may add to the payload, which is why a 1023-character limit is suggested.

    enum CharLimit only contains BECH32 for now, but we can add the 1023 limit later: https://github.com/bitcoin/bitcoin/compare/12f3f6d6d4b15276d9118f1a55f1a4f4b5d5d111..9b7023d31a3ec95f66b45f0ecb47e79762d74854

    have a separate test for (much-)too-long strings

    https://github.com/bitcoin/bitcoin/blob/master/src/test/bech32_tests.cpp#L131 is already testing that, so I’ll limit it to limit + 1, thanks for the hint @sipa.

    ps. is fuzzing still working on a mac?

    % cmake -B build_fuzz –preset=libfuzzer
    -DCMAKE_C_COMPILER="$(brew –prefix llvm)/bin/clang"
    -DCMAKE_CXX_COMPILER="$(brew –prefix llvm)/bin/clang++"
    -DAPPEND_LDFLAGS=-Wl,-no_warn_duplicate_libraries …

    % cmake –build build_fuzz -j10 … ld: multiple errors: invalid r_symbolnum=1 in ‘bitcoin/build_fuzz/src/test/fuzz/CMakeFiles/fuzz.dir/addition_overflow.cpp.o’; invalid r_symbolnum=1 in ‘bitcoin/build_fuzz/src/test/fuzz/CMakeFiles/fuzz.dir/fees.cpp.o’; invalid r_symbolnum=1 in ‘bitcoin/build_fuzz/src/test/fuzz/CMakeFiles/fuzz.dir/float.cpp.o’; invalid r_symbolnum=1 in ‘bitcoin/build_fuzz/src/test/fuzz/CMakeFiles/fuzz.dir/multiplication_overflow.cpp.o’; invalid r_symbolnum=1 in ‘../../libbitcoin_cli.a2’; invalid r_symbolnum=1 in ‘../../../libcrc32c.a3’; invalid r_symbolnum=1 in ‘../../../libcrc32c.a2’; invalid r_symbolnum=1 in ‘../../../libcrc32c_arm64.a2’; invalid r_symbolnum=5 in ‘../../util/libbitcoin_util.a29’; invalid r_symbolnum=18 in ‘../../crypto/libbitcoin_crypto_arm_shani.a2’; invalid r_symbolnum=1 in ‘../../crypto/libbitcoin_crypto.a15’; invalid r_symbolnum=1 in ‘../../crypto/libbitcoin_crypto.a10’; invalid r_symbolnum=1 in ‘../../crypto/libbitcoin_crypto.a5’; invalid r_symbolnum=1 in ‘../util/libtest_util.a12’; invalid r_symbolnum=1 in ‘../../util/libbitcoin_util.a27’; invalid r_symbolnum=1 in ‘../../util/libbitcoin_util.a24’; invalid r_symbolnum=1 in ‘../../util/libbitcoin_util.a16’; invalid r_symbolnum=1 in ‘../../util/libbitcoin_util.a15’; invalid r_symbolnum=1 in ‘../../util/libbitcoin_util.a14’; invalid r_symbolnum=1 in ‘../util/libtest_util.a4’; invalid r_symbolnum=1 in ‘../util/libtest_util.a3’; invalid r_symbolnum=1 in ‘../../util/libbitcoin_util.a9’; invalid r_symbolnum=1 in ‘../../util/libbitcoin_util.a6’; invalid r_symbolnum=1 in ‘../../libbitcoin_node.a84’; invalid r_symbolnum=1 in ‘../../libbitcoin_node.a79’; invalid r_symbolnum=1 in ‘../../libbitcoin_node.a68’; invalid r_symbolnum=1 in ‘../../libbitcoin_node.a63’; invalid r_symbolnum=1 in ‘../../libbitcoin_node.a55’; invalid r_symbolnum=1 in ‘../../libbitcoin_node.a54’; invalid r_symbolnum=1 in ‘../../libbitcoin_node.a51’; invalid r_symbolnum=1 in ‘../../libbitcoin_node.a50’; invalid r_symbolnum=1 in ‘../../libbitcoin_consensus.a11’; invalid r_symbolnum=1 in ‘../../libbitcoin_consensus.a5’; invalid r_symbolnum=1 in ‘../../libbitcoin_node.a43’; invalid r_symbolnum=1 in ‘../../libbitcoin_node.a41’; invalid r_symbolnum=1 in ‘../../libbitcoin_node.a40’; invalid r_symbolnum=1 in ‘../../libbitcoin_node.a39’; invalid r_symbolnum=1 in ‘../../libbitcoin_node.a36’; invalid r_symbolnum=1 in ‘../../libbitcoin_common.a47’; invalid r_symbolnum=1 in ‘../../libbitcoin_node.a34’; invalid r_symbolnum=1 in ‘../../libbitcoin_node.a31’; invalid r_symbolnum=1 in ‘../../libbitcoin_common.a42’; invalid r_symbolnum=1 in ‘../../libbitcoin_common.a41’; invalid r_symbolnum=1 in ‘../../libbitcoin_node.a28’; invalid r_symbolnum=1 in ‘../../libbitcoin_node.a23’; invalid r_symbolnum=1 in ‘../../libbitcoin_node.a22’; invalid r_symbolnum=1 in ‘../../libbitcoin_common.a30’; invalid r_symbolnum=1 in ‘../../libbitcoin_common.a29’; invalid r_symbolnum=1 in ‘../../libbitcoin_common.a24’; invalid r_symbolnum=1 in ‘../../libbitcoin_common.a21’; invalid r_symbolnum=1 in ‘../../libbitcoin_common.a20’; invalid r_symbolnum=1 in ‘../../libbitcoin_common.a16’; invalid r_symbolnum=1 in ‘../../../libleveldb.a37’; invalid r_symbolnum=1 in ‘../../../libleveldb.a35’; invalid r_symbolnum=1 in ‘../../../libleveldb.a31’; invalid r_symbolnum=1 in ‘../../../libleveldb.a27’; invalid r_symbolnum=1 in ‘../../../libleveldb.a8’; invalid r_symbolnum=1 in ‘../../../libleveldb.a2’ c


    sipa commented at 6:49 pm on October 19, 2024:
    Sure, but support for silent payment hasn’t been merged. Today, the codebase only supports bech32 as specified by bip173/bip350 (see https://github.com/bitcoin/bitcoin/blob/master/doc/bips.md).
  11. DrahtBot added the label CI failed on Oct 18, 2024
  12. Split out bech32 separator char to header c1a5d5c100
  13. l0rinc force-pushed on Oct 19, 2024
  14. Fuzz HRP of bech32 as well
    Also separated the roundtrip testing from the random string decoding for clarity
    
    Note that while BIP 173 claims:
    ```
    The human-readable part, which is intended to convey the type of data, or anything else that is relevant to the reader. This part MUST contain 1 to 83 US-ASCII characters, with each character having a value in the range [33-126]. HRP validity may be further restricted by specific applications.
    ```
    bech32::Encode rejects uppercase letters.
    9b7023d31a
  15. l0rinc force-pushed on Oct 19, 2024
  16. DrahtBot removed the label CI failed on Oct 19, 2024
  17. marcofleon commented at 3:09 pm on October 21, 2024: contributor

    Code review ACK 9b7023d31a3ec95f66b45f0ecb47e79762d74854. The separation into two targets and the new GenerateRandomHRP seem fine to me.

    Decode coverage Roundtrip coverage

  18. in src/bech32.h:24 in 9b7023d31a
    20@@ -21,8 +21,8 @@
    21 namespace bech32
    22 {
    23 
    24-/** The Bech32 and Bech32m checksum size */
    


    brunoerg commented at 11:09 am on October 28, 2024:
    In c1a5d5c100b1628456acfa6129e303737f0ad4d3: nit: Any specific reason to remove the comment?

    l0rinc commented at 11:37 am on October 28, 2024:
    it’s in bech32 and is called CHECKSUM_SIZE - what does the comment add to that?

    brunoerg commented at 11:39 am on October 28, 2024:
    Maybe that this value works for both bech32 and bech32m?

    l0rinc commented at 11:42 am on October 28, 2024:
    The exceptional case is when they’re different (see https://github.com/bitcoin/bitcoin/pull/30623/files#diff-9607135f3b89a188b92ae678f50685a3f65f4e8ef7832e653cc344ec623100dfR39), we assume otherwise that these are common.
  19. brunoerg commented at 11:30 am on October 28, 2024: contributor

    Decode coverage Roundtrip coverage

    I don’t see anything new in these reports compared to https://maflcko.github.io/b-c-cov/fuzz.coverage/src/index.html.

  20. l0rinc commented at 11:40 am on October 28, 2024: contributor

    I don’t see anything new in these reports compared to maflcko.github.io/b-c-cov/fuzz.coverage/src/index.html.

    Thanks for checking! The human readable part is variable now since in Silent Payments it’s not just bc anymore, but the covered lines should be the same (it’s not a perfect metric).

  21. brunoerg approved
  22. brunoerg commented at 11:51 am on October 28, 2024: contributor
    code review ACK 9b7023d31a3ec95f66b45f0ecb47e79762d74854

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-01-22 00:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me