test: add a few more base32/64 calculation corner cases #29847

pull paplorinc wants to merge 1 commits into bitcoin:master from paplorinc:paplorinc/base-32-64-padding-simplification changing 2 files +77 −5
  1. paplorinc commented at 12:43 pm on April 10, 2024: contributor
    Added tests for base32 and base 64 conversions.
  2. DrahtBot commented at 12:43 pm on April 10, 2024: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage

    For detailed information about the code coverage, see the test coverage report.

    Reviews

    See the guideline for information on the review process. A summary of reviews will appear here.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #29071 (refactor: Remove Span operator==, Use std::ranges::equal by maflcko)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. DrahtBot added the label Refactoring on Apr 10, 2024
  4. paplorinc marked this as a draft on Apr 10, 2024
  5. paplorinc force-pushed on Apr 10, 2024
  6. paplorinc force-pushed on Apr 10, 2024
  7. DrahtBot added the label CI failed on Apr 10, 2024
  8. DrahtBot commented at 1:20 pm on April 10, 2024: contributor

    🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the documentation.

    Possibly this is due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    Leave a comment here, if you need help tracking down a confusing failure.

    Debug: https://github.com/bitcoin/bitcoin/runs/23660462203

  9. paplorinc force-pushed on Apr 10, 2024
  10. paplorinc force-pushed on Apr 10, 2024
  11. fanquake commented at 3:26 pm on April 10, 2024: member

    https://github.com/bitcoin/bitcoin/pull/29847/checks?check_run_id=23664375509

     0Run base_encode_decode with args ['/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz', '-runs=1', PosixPath('/ci_container_base/ci/scratch/qa-assets/fuzz_seed_corpus/base_encode_decode')]INFO: Running with entropic power schedule (0xFF, 100).
     1INFO: Seed: 1375269970
     2INFO: Loaded 1 modules   (578208 inline 8-bit counters): 578208 [0x55e843d75228, 0x55e843e024c8), 
     3INFO: Loaded 1 PC tables (578208 PCs): 578208 [0x55e843e024c8,0x55e8446d4ec8), 
     4INFO:     1342 files found in /ci_container_base/ci/scratch/qa-assets/fuzz_seed_corpus/base_encode_decode
     5INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 1048564 bytes
     6util/strencodings.cpp:213:63: runtime error: unsigned integer overflow: 0 - 1 cannot be represented in type 'size_type' (aka 'unsigned long')
     7    [#0](/bitcoin-bitcoin/0/) 0x55e842a7e2c4 in DecodeBase32(std::basic_string_view<char, std::char_traits<char>>) src/util/strencodings.cpp:213
     8    [#1](/bitcoin-bitcoin/1/) 0x55e840ea6c48 in base_encode_decode_fuzz_target(Span<unsigned char const>) src/test/fuzz/base_encode_decode.cpp:34:19
     9    [#2](/bitcoin-bitcoin/2/) 0x55e841344a6d in std::function<void (Span<unsigned char const>)>::operator()(Span<unsigned char const>) const /usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/std_function.h:591:9
    10    [#3](/bitcoin-bitcoin/3/) 0x55e841344a6d in LLVMFuzzerTestOneInput src/test/fuzz/fuzz.cpp:180:5
    11    [#4](/bitcoin-bitcoin/4/) 0x55e840c6cad4 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x1a62ad4) (BuildId: 090547d582a1716a2b3f214f309a94f43fbb1ebd)
    12    [#5](/bitcoin-bitcoin/5/) 0x55e840c6dd01 in fuzzer::Fuzzer::ReadAndExecuteSeedCorpora(std::vector<fuzzer::SizedFile, std::allocator<fuzzer::SizedFile>>&) (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x1a63d01) (BuildId: 090547d582a1716a2b3f214f309a94f43fbb1ebd)
    13    [#6](/bitcoin-bitcoin/6/) 0x55e840c6e2f7 in fuzzer::Fuzzer::Loop(std::vector<fuzzer::SizedFile, std::allocator<fuzzer::SizedFile>>&) (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x1a642f7) (BuildId: 090547d582a1716a2b3f214f309a94f43fbb1ebd)
    14    [#7](/bitcoin-bitcoin/7/) 0x55e840c5b7ef in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x1a517ef) (BuildId: 090547d582a1716a2b3f214f309a94f43fbb1ebd)
    15    [#8](/bitcoin-bitcoin/8/) 0x55e840c85f16 in main (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x1a7bf16) (BuildId: 090547d582a1716a2b3f214f309a94f43fbb1ebd)
    16    [#9](/bitcoin-bitcoin/9/) 0x7f3697ff01c9  (/lib/x86_64-linux-gnu/libc.so.6+0x2a1c9) (BuildId: 232274c0019767b821da1c6ebc2df43e60503035)
    17    [#10](/bitcoin-bitcoin/10/) 0x7f3697ff028a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a28a) (BuildId: 232274c0019767b821da1c6ebc2df43e60503035)
    18    [#11](/bitcoin-bitcoin/11/) 0x55e840c507d4 in _start (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x1a467d4) (BuildId: 090547d582a1716a2b3f214f309a94f43fbb1ebd)
    19
    20SUMMARY: UndefinedBehaviorSanitizer: unsigned-integer-overflow util/strencodings.cpp:213:63 
    21MS: 0 ; base unit: 0000000000000000000000000000000000000000
    

    https://github.com/bitcoin/bitcoin/actions/runs/8632800923/job/23664375055?pr=29847

    0 Assertion failed: (ToLower(encoded_string) == ToLower(TrimString(random_encoded_string))), function base_encode_decode_fuzz_target, file base_encode_decode.cpp, line 45.
    
  12. maflcko commented at 3:35 pm on April 10, 2024: member

    Thank you for your contribution. While this stylistic change makes sense on its own, it comes at a cost and risk for the project as a whole. The weak motivation for the change does not justify the burden that it places on the project. A burden could be any of the following:

    • Time spent on review
    • Accidental introduction of bugs
    • (Silent) merge conflicts, either in the branch or a backport branch. Those conflicts demand further developer and reviewer time or introduce bugs.

    For more information about refactoring changes and stylistic cleanup, see

    Generally, if the style is not mentioned nor enforced by the developer notes, we leave it up to the original author to pick whatever fits them best personally and then leave it that way until the line is touched for other reasons.

    Leave a comment below, if you have any questions.

  13. paplorinc force-pushed on Apr 10, 2024
  14. paplorinc commented at 4:16 pm on April 10, 2024: contributor

    Thanks for the review @fanquake and @maflcko - though it was still in draft form because of the unsigned problems.

    ~I didn’t mean the change as a purely stylistic refactoring.~

    ~In the encodings the original str.reserve doesn’t enforce the final size, but after the change it’s obvious that it’s an exact calculation, not just an upper bound, i.e. while (str.size() % 4) str += '='; vs str.append(size - str.size(), '=');.~

    ~In the second change the point was to unify the suffix removal with other trim algos (e.g. TrimStringView). This would unify the 32 and 64 bit algos and obviate the inconsistency between the 32 and 64 base conversions for paddings (i.e. roundtrip tests aren’t working without padding).~

    ~If you think the changes are controversial, I can commit the test and the proposed changes separately.~

  15. DrahtBot removed the label CI failed on Apr 10, 2024
  16. paplorinc force-pushed on Apr 10, 2024
  17. paplorinc force-pushed on Apr 10, 2024
  18. paplorinc commented at 9:43 pm on April 10, 2024: contributor
    @maflcko, you were right, the changes didn’t add enough value, I’ve reverted them. I’ve kept the tests, since I think they’re valuable and unified the style between base 32 and 64 in the source. If you think that’s still too much, I’ll revert it.
  19. paplorinc renamed this:
    refactor: Simplify base32/64 padding calculations
    test: add a few more base32/64 calculation corner cases
    on Apr 10, 2024
  20. paplorinc renamed this:
    test: add a few more base32/64 calculation corner cases
    refactor/test: add a few more base32/64 calculation corner cases
    on Apr 10, 2024
  21. paplorinc marked this as ready for review on Apr 10, 2024
  22. maflcko commented at 9:28 am on April 11, 2024: member

    Please make sure to re-read the whole comment I shared and make sure to read all pages it links to. The burden to clearly motivate a change (and the review effort it requires) is on the pull request author (in this case here it would be you).

    Open Source does not work by doing random unmotivated changes, opening a pull request, seeing it fail CI and tests, then adjust it in a ping-pong fashion to make CI green and go back and forth with reviewers until they have essentially written the whole changeset themselves.

    At a minimum a (refactoring) pull request should have:

    • A clear motivation
    • Be correct and easy to review
    • Have passing tests and CI

    If any of the above do not apply, a refactoring pull request should not be opened in the first place.

  23. paplorinc commented at 9:31 am on April 11, 2024: contributor
    @maflcko, which of the listed motivation/review/passing CI doesn’t apply here?
  24. maflcko commented at 10:05 am on April 11, 2024: member

    All of it.

    • “unified … code” is not a motivation, but a style choice
    • The CI didn’t pass on several tries
    • Review doesn’t seem worth it, unless there is a motivation
  25. in src/test/base32_tests.cpp:5 in 9c1ae8af1e outdated
    1@@ -2,10 +2,11 @@
    2 // Distributed under the MIT software license, see the accompanying
    3 // file COPYING or http://www.opensource.org/licenses/mit-license.php.
    4 
    5+#include <string>
    


    maflcko commented at 10:06 am on April 11, 2024:
    This is the wrong style in any case, see the dev notes

    paplorinc commented at 10:08 am on April 11, 2024:
    I can reorder them if you think it’s worth it. If you think these contributions are worthless I’ll stop.

    maflcko commented at 10:10 am on April 11, 2024:

    I can reorder them if you think it’s worth it.

    I don’t understand what you are trying to say.

    I am saying that the order was correct before and you changed it to be wrong.


    paplorinc commented at 10:13 am on April 11, 2024:
    I added a new import and the imports were reorganized as such. If you don’t like the order, I can change it, I don’t mind, though I would prefer the CI telling me these in the PR instead of wasting time reviewing styles.

    paplorinc commented at 10:15 am on April 11, 2024:

    I don’t understand what you are trying to say.

    I’m trying to find out if my contributions are welcome, I’ve only received hostility so far in return for my invested effort. It’s a decentralized, open source project, I’ll contribute what I can and want. If that’s not welcome, I’ll find a different project where my optimizations are welcome.


    maflcko commented at 10:28 am on April 11, 2024:

    I’m trying to find out if my contributions are welcome, I’ve only received hostility so far in return for my invested effort.

    Contributions are welcome and contributors are needed. However, refactoring changes are usually the wrong pick for a newcomer to tackle. This is explained https://github.com/bitcoin/bitcoin/blob/master/CONTRIBUTING.md#refactoring


    maflcko commented at 10:29 am on April 11, 2024:

    See also https://bitcoincore.academy/

    If you are looking for useful contributions to help out with, you can

    • Search through the good first issues or the ones that are up for grabs. Some of them might no longer be applicable. So if you are interested, but unsure, you might want to leave a comment on the issue first.
    • Write tests to improve the coverage. Any kind of test is welcome and coverage information can be obtained from a relatively recent coverage report. If you are unsure, don’t hesitate to check back first.
    • Help with review and testing. There are easy ones such as the gui and rpc. However, review on any open pull request is welcome. Review will also help you understand the codebase better.
    • Help on meta projects related to Bitcoin Core, such as https://github.com/corecheck.
    • Join the Bitcoin Core IRC channel(s) and leave a comment to share what you are interested in.

    paplorinc commented at 10:32 am on April 11, 2024:
    I was looking for introductory issues, couldn’t find any that was a good fit for me. But just looking at the code I found several optimization opportunities in relevant parts of the code, so I contributed those instead, e.g. the 4x speedup of base58.

    paplorinc commented at 10:38 am on April 11, 2024:
    Do you consider optimizations useless refactorings? Other projects I’ve worked on consider them features. For me it’s a gentle introduction to the codebase, isolated from other complexities, I poke it left-and-right and see how the code behaves. Is this not a valuable contribution? Especially since I usually cover the corner cases I find with new test cases.

    maflcko commented at 11:24 am on April 11, 2024:

    Tests are generally a good way to start contributing.

    Just generally for each pull request, it needs review before merge, and each reviewer will have to trade-off whether it is worth it to review. There are 316 other pull requests open waiting for review, so that is another thing for reviewers to trade-off on.

  26. paplorinc force-pushed on Apr 11, 2024
  27. in src/util/strencodings.cpp:184 in c6ee976cd9 outdated
    183     ConvertBits<8, 5, true>([&](int v) { str += pbase32[v]; }, input.begin(), input.end());
    184     if (pad) {
    185-        while (str.size() % 8) {
    186-            str += '=';
    187-        }
    188+        while (str.size() % 8) str += '=';
    


    maflcko commented at 6:29 am on April 12, 2024:
    Please no style changes of this kind

    paplorinc commented at 7:28 am on April 12, 2024:
    Reverted. I’ve changed it originally to be in alignment with the EncodeBase64
  28. paplorinc force-pushed on Apr 12, 2024
  29. laanwj commented at 5:47 am on April 16, 2024: member

    Thanks for your first contribution!

    As it is hard to review PRs that change things all over the place, this is unlikely to be merged like this. I’d suggest keeping the code change focused. For example, make it add test cases only, and remove the changes (refactoring and otherwise) to strecodings.cpp and other non-test code.

  30. laanwj added the label Tests on Apr 16, 2024
  31. paplorinc commented at 7:05 am on April 16, 2024: contributor
    Hey @laanwj, thanks for your first review. Please see the commits separately, don’t worry, they’re not all over the place.
  32. in src/util/strencodings.cpp:226 in 8ecae4ddb4 outdated
    220@@ -221,9 +221,9 @@ std::optional<std::vector<unsigned char>> DecodeBase32(std::string_view str)
    221     std::vector<unsigned char> ret;
    222     ret.reserve((str.size() * 5) / 8);
    223     bool valid = ConvertBits<5, 8, false>(
    224-        [&](unsigned char c) { ret.push_back(c); },
    225+        [&](auto c) { ret.push_back(c); },
    226         str.begin(), str.end(),
    227-        [](char c) { return decode32_table[uint8_t(c)]; }
    228+        [](auto c) { return decode32_table[uint8_t(c)]; }
    


    maflcko commented at 7:09 am on April 16, 2024:

    Why this change? This style change makes the code harder to read.

    Unless there is a reason to change the code and review it, it shouldn’t be changed.


    paplorinc commented at 7:14 am on April 16, 2024:
    I’ve changed it to unify it with the previous lambda. Why would I leave the type there when I’ve used auto 2 lines above?

    maflcko commented at 7:39 am on April 16, 2024:
    My comment applies to all changes in this file. I am just trying to explain why such style changes do not make sense.

    paplorinc commented at 9:52 am on April 16, 2024:
    Removed the 3/4 to 6/8 change, kept only the test. I don’t yet understand how we expect this code to become more and more maintainable, if we don’t regularly leave the campground cleaner than we’ve found it.

    maflcko commented at 10:07 am on April 16, 2024:

    I don’t see how using auto makes anything clearer.

    There certainly are style and language features that make maintenance easier, but they need to be well motivated and explained. Also, if there is a good reason for them, they should be applied globally via CI and checks. Otherwise, they are again applied inconsistently.


    paplorinc commented at 10:14 am on April 16, 2024:

    I agree. Is there any solution?

    Is there any non-test change that I could delve into—preferably an optimization that doesn’t require me to understand every part of the code—that would be welcome? For example, like my other base58 or bech32 speedups. Or am I just barking up the wrong tree and should move on to other projects?

  33. maflcko commented at 7:11 am on April 16, 2024: member
    @paplorinc If you are not being receptive to reviewers’ feedback, it is unlikely that anyone will review the pull request and without review it will not be merged.
  34. paplorinc commented at 7:16 am on April 16, 2024: contributor
    @maflcko, I’ve responded to every single comment within minutes. I just don’t appreciate the patronizing style I keep seeing in this repo.
  35. laanwj commented at 7:43 am on April 16, 2024: member
    Take it how you wish, i was just trying to be helpful, not patronizing.
  36. maflcko commented at 7:45 am on April 16, 2024: member

    Please understand that this software project requires review of all code changes. It is not a typical software project that auto-merges any change as soon as the CI is green.

    When writing a patch as a pull request, please consider how reviewers are supposed to review it and how easy it will be for them to follow the changes and their motivation.

  37. paplorinc force-pushed on Apr 16, 2024
  38. paplorinc renamed this:
    refactor/test: add a few more base32/64 calculation corner cases
    test: add a few more base32/64 calculation corner cases
    on Apr 16, 2024
  39. fanquake commented at 10:17 am on April 16, 2024: member

    I don’t yet understand how we expect this code to become more and more maintainable, if we don’t regularly leave the campground cleaner than we’ve found it.

    The issue here is that even if you’ve found the campground, and while maybe it looks like it could use a little maintenance, if you zoom out for a birds-eye view, you’ll actually notice that everyone else is busy somewhere next door, trying to fight the forest fires. Even if the code here is (maybe arbitrarily) “better” or “faster” by some metric, the impact to the whole project is ~0. In fact, it could even be argued that it’s a net negative, given the burden of changing and reviewing the code, probably doesn’t outweigh distracting engineers working on more important things and/or actually breaking something that already works (and will likely continue to work for many many years) + all the time spent here.

    I understand that as a new contributor, you’re looking for things to do, and picking random utility functions/code, and trying to generically improve it may seem easiest / appealing, and potentially something that other open source projects will happily take, however that isn’t how things work here.

    I’d suggest looking through https://github.com/bitcoin/bitcoin/issues (you can filter for “good first issues”), and either picking a bug that needs fixing, and working on fixing it, or, looking through, https://github.com/bitcoin/bitcoin/pulls, and finding something that interests you, and leaving some meaningful review comments. These will be much more productive avenues for you to become a productive contributor to this project. You should also familiarise yourself with https://github.com/bitcoin/bitcoin/blob/master/CONTRIBUTING.md and the https://github.com/bitcoin/bitcoin/blob/master/doc/developer-notes.md.

  40. paplorinc commented at 10:25 am on April 16, 2024: contributor

    Thanks for the detailed answer @fanquake.

    everyone else is busy […] trying to fight the forest fires

    So how do we prevent future forest fires while extinguishing current ones?

    I have looked through good first issue since the beginning, there’s barely any, all of which were started by others, often years ago. That’s why I started working on what I can contribute instead.

  41. maflcko commented at 10:49 am on April 16, 2024: member

    Another great way to contribute would be to spend time on in-depth review of open pull requests and compare your review with the review done by others. Obviously this will be time-consuming, starting out, but if you care about the project, over time it will teach you relevant skills (for this project and for yourself to apply elsewhere).

    On some pull requests, there remain small follow-ups that were found during review, so you’ll naturally find stuff to work on if you start out by reviewing pull requests done by others. Another benefit of this is, that if you end up creating a follow-up, the code will be fresh in the minds of some developers, who may be willing to spend time on review.

  42. paplorinc force-pushed on May 11, 2024
  43. paplorinc force-pushed on May 29, 2024
  44. Add base32 and base64 roundtrip random padding tests
    The new randomized roundtrips for base32 and base64 encoding make sure encoding and decoding are symmetric.
    Note that if we omit the padding in EncodeBase32 we won't be able to decode it with DecodeBase32.
    Added dedicated padding tests to hardcode the failure behavior.
    5323c2c9c9
  45. in src/test/base64_tests.cpp:66 in fb03332664 outdated
    65+
    66+BOOST_AUTO_TEST_CASE(base64_roundtrip) {
    67+    SeedInsecureRand();
    68+
    69+    for (int i = 0; i < 100; ++i) {
    70+        auto randomStr = InsecureRand256().ToString().substr(0, InsecureRandRange(64));
    


    maflcko commented at 8:40 am on May 29, 2024:
    nit: Not sure about using the global for new code. I’d say it is cleaner to replace the SeedInsecureRand with a local FastRandomContext and then call the member methods directly. E.g frc.randrange(64), etc

    paplorinc commented at 9:26 am on May 29, 2024:
    Nice, thanks!
  46. paplorinc force-pushed on May 29, 2024

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-06-29 07:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me