test: add a few more base32/64 calculation corner cases #29847
pull l0rinc wants to merge 1 commits into bitcoin:master from l0rinc:paplorinc/base-32-64-padding-simplification changing 2 files +77 −5-
l0rinc commented at 12:43 pm on April 10, 2024: contributorAdded tests for base32 and base 64 conversions.
-
DrahtBot commented at 12:43 pm on April 10, 2024: contributor
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
Code Coverage
For detailed information about the code coverage, see the test coverage report.
Reviews
See the guideline for information on the review process. A summary of reviews will appear here.
Conflicts
Reviewers, this pull request conflicts with the following ones:
- #29119 (refactor: Use std::span over Span by maflcko)
- #29071 (refactor: Remove Span operator==, Use std::ranges::equal by maflcko)
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
-
DrahtBot added the label Refactoring on Apr 10, 2024
-
l0rinc marked this as a draft on Apr 10, 2024
-
l0rinc force-pushed on Apr 10, 2024
-
l0rinc force-pushed on Apr 10, 2024
-
DrahtBot added the label CI failed on Apr 10, 2024
-
DrahtBot commented at 1:20 pm on April 10, 2024: contributor
🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the documentation.
Possibly this is due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.
Leave a comment here, if you need help tracking down a confusing failure.
-
l0rinc force-pushed on Apr 10, 2024
-
l0rinc force-pushed on Apr 10, 2024
-
fanquake commented at 3:26 pm on April 10, 2024: member
https://github.com/bitcoin/bitcoin/pull/29847/checks?check_run_id=23664375509
0Run base_encode_decode with args ['/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz', '-runs=1', PosixPath('/ci_container_base/ci/scratch/qa-assets/fuzz_seed_corpus/base_encode_decode')]INFO: Running with entropic power schedule (0xFF, 100). 1INFO: Seed: 1375269970 2INFO: Loaded 1 modules (578208 inline 8-bit counters): 578208 [0x55e843d75228, 0x55e843e024c8), 3INFO: Loaded 1 PC tables (578208 PCs): 578208 [0x55e843e024c8,0x55e8446d4ec8), 4INFO: 1342 files found in /ci_container_base/ci/scratch/qa-assets/fuzz_seed_corpus/base_encode_decode 5INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 1048564 bytes 6util/strencodings.cpp:213:63: runtime error: unsigned integer overflow: 0 - 1 cannot be represented in type 'size_type' (aka 'unsigned long') 7 [#0](/bitcoin-bitcoin/0/) 0x55e842a7e2c4 in DecodeBase32(std::basic_string_view<char, std::char_traits<char>>) src/util/strencodings.cpp:213 8 [#1](/bitcoin-bitcoin/1/) 0x55e840ea6c48 in base_encode_decode_fuzz_target(Span<unsigned char const>) src/test/fuzz/base_encode_decode.cpp:34:19 9 [#2](/bitcoin-bitcoin/2/) 0x55e841344a6d in std::function<void (Span<unsigned char const>)>::operator()(Span<unsigned char const>) const /usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/std_function.h:591:9 10 [#3](/bitcoin-bitcoin/3/) 0x55e841344a6d in LLVMFuzzerTestOneInput src/test/fuzz/fuzz.cpp:180:5 11 [#4](/bitcoin-bitcoin/4/) 0x55e840c6cad4 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x1a62ad4) (BuildId: 090547d582a1716a2b3f214f309a94f43fbb1ebd) 12 [#5](/bitcoin-bitcoin/5/) 0x55e840c6dd01 in fuzzer::Fuzzer::ReadAndExecuteSeedCorpora(std::vector<fuzzer::SizedFile, std::allocator<fuzzer::SizedFile>>&) (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x1a63d01) (BuildId: 090547d582a1716a2b3f214f309a94f43fbb1ebd) 13 [#6](/bitcoin-bitcoin/6/) 0x55e840c6e2f7 in fuzzer::Fuzzer::Loop(std::vector<fuzzer::SizedFile, std::allocator<fuzzer::SizedFile>>&) (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x1a642f7) (BuildId: 090547d582a1716a2b3f214f309a94f43fbb1ebd) 14 [#7](/bitcoin-bitcoin/7/) 0x55e840c5b7ef in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x1a517ef) (BuildId: 090547d582a1716a2b3f214f309a94f43fbb1ebd) 15 [#8](/bitcoin-bitcoin/8/) 0x55e840c85f16 in main (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x1a7bf16) (BuildId: 090547d582a1716a2b3f214f309a94f43fbb1ebd) 16 [#9](/bitcoin-bitcoin/9/) 0x7f3697ff01c9 (/lib/x86_64-linux-gnu/libc.so.6+0x2a1c9) (BuildId: 232274c0019767b821da1c6ebc2df43e60503035) 17 [#10](/bitcoin-bitcoin/10/) 0x7f3697ff028a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a28a) (BuildId: 232274c0019767b821da1c6ebc2df43e60503035) 18 [#11](/bitcoin-bitcoin/11/) 0x55e840c507d4 in _start (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x1a467d4) (BuildId: 090547d582a1716a2b3f214f309a94f43fbb1ebd) 19 20SUMMARY: UndefinedBehaviorSanitizer: unsigned-integer-overflow util/strencodings.cpp:213:63 21MS: 0 ; base unit: 0000000000000000000000000000000000000000
https://github.com/bitcoin/bitcoin/actions/runs/8632800923/job/23664375055?pr=29847
0 Assertion failed: (ToLower(encoded_string) == ToLower(TrimString(random_encoded_string))), function base_encode_decode_fuzz_target, file base_encode_decode.cpp, line 45.
-
maflcko commented at 3:35 pm on April 10, 2024: member
Thank you for your contribution. While this stylistic change makes sense on its own, it comes at a cost and risk for the project as a whole. The weak motivation for the change does not justify the burden that it places on the project. A burden could be any of the following:
- Time spent on review
- Accidental introduction of bugs
- (Silent) merge conflicts, either in the branch or a backport branch. Those conflicts demand further developer and reviewer time or introduce bugs.
For more information about refactoring changes and stylistic cleanup, see
- https://github.com/bitcoin/bitcoin/blob/master/CONTRIBUTING.md#refactoring
- https://github.com/bitcoin/bitcoin/blob/master/.github/PULL_REQUEST_TEMPLATE.md
- #15465
- #26621 (comment)
Generally, if the style is not mentioned nor enforced by the developer notes, we leave it up to the original author to pick whatever fits them best personally and then leave it that way until the line is touched for other reasons.
Leave a comment below, if you have any questions.
-
l0rinc force-pushed on Apr 10, 2024
-
l0rinc commented at 4:16 pm on April 10, 2024: contributor
Thanks for the review @fanquake and @maflcko - though it was still in draft form because of the unsigned problems.
I didn’t mean the change as a purely stylistic refactoring.In the encodings the originalstr.reserve
doesn’t enforce the final size, but after the change it’s obvious that it’s an exact calculation, not just an upper bound, i.e.while (str.size() % 4) str += '=';
vsstr.append(size - str.size(), '=');
.In the second change the point was to unify the suffix removal with other trim algos (e.g.TrimStringView
). This would unify the 32 and 64 bit algos and obviate the inconsistency between the 32 and 64 base conversions for paddings (i.e. roundtrip tests aren’t working without padding).If you think the changes are controversial, I can commit the test and the proposed changes separately. -
DrahtBot removed the label CI failed on Apr 10, 2024
-
l0rinc force-pushed on Apr 10, 2024
-
l0rinc force-pushed on Apr 10, 2024
-
l0rinc commented at 9:43 pm on April 10, 2024: contributor@maflcko, you were right, the changes didn’t add enough value, I’ve reverted them. I’ve kept the tests, since I think they’re valuable and unified the style between base 32 and 64 in the source. If you think that’s still too much, I’ll revert it.
-
l0rinc renamed this:
refactor: Simplify base32/64 padding calculations
test: add a few more base32/64 calculation corner cases
on Apr 10, 2024 -
l0rinc renamed this:
test: add a few more base32/64 calculation corner cases
refactor/test: add a few more base32/64 calculation corner cases
on Apr 10, 2024 -
l0rinc marked this as ready for review on Apr 10, 2024
-
maflcko commented at 9:28 am on April 11, 2024: member
Please make sure to re-read the whole comment I shared and make sure to read all pages it links to. The burden to clearly motivate a change (and the review effort it requires) is on the pull request author (in this case here it would be you).
Open Source does not work by doing random unmotivated changes, opening a pull request, seeing it fail CI and tests, then adjust it in a ping-pong fashion to make CI green and go back and forth with reviewers until they have essentially written the whole changeset themselves.
At a minimum a (refactoring) pull request should have:
- A clear motivation
- Be correct and easy to review
- Have passing tests and CI
If any of the above do not apply, a refactoring pull request should not be opened in the first place.
-
maflcko commented at 10:05 am on April 11, 2024: member
All of it.
- “unified … code” is not a motivation, but a style choice
- The CI didn’t pass on several tries
- Review doesn’t seem worth it, unless there is a motivation
-
in src/test/base32_tests.cpp:5 in 9c1ae8af1e outdated
1@@ -2,10 +2,11 @@ 2 // Distributed under the MIT software license, see the accompanying 3 // file COPYING or http://www.opensource.org/licenses/mit-license.php. 4 5+#include <string>
maflcko commented at 10:06 am on April 11, 2024:This is the wrong style in any case, see the dev notes
l0rinc commented at 10:08 am on April 11, 2024:I can reorder them if you think it’s worth it. If you think these contributions are worthless I’ll stop.
maflcko commented at 10:10 am on April 11, 2024:I can reorder them if you think it’s worth it.
I don’t understand what you are trying to say.
I am saying that the order was correct before and you changed it to be wrong.
l0rinc commented at 10:13 am on April 11, 2024:I added a new import and the imports were reorganized as such. If you don’t like the order, I can change it, I don’t mind, though I would prefer the CI telling me these in the PR instead of wasting time reviewing styles.
l0rinc commented at 10:15 am on April 11, 2024:I don’t understand what you are trying to say.
I’m trying to find out if my contributions are welcome, I’ve only received hostility so far in return for my invested effort. It’s a decentralized, open source project, I’ll contribute what I can and want. If that’s not welcome, I’ll find a different project where my optimizations are welcome.
maflcko commented at 10:28 am on April 11, 2024:I’m trying to find out if my contributions are welcome, I’ve only received hostility so far in return for my invested effort.
Contributions are welcome and contributors are needed. However, refactoring changes are usually the wrong pick for a newcomer to tackle. This is explained https://github.com/bitcoin/bitcoin/blob/master/CONTRIBUTING.md#refactoring
maflcko commented at 10:29 am on April 11, 2024:See also https://bitcoincore.academy/
If you are looking for useful contributions to help out with, you can
- Search through the good first issues or the ones that are up for grabs. Some of them might no longer be applicable. So if you are interested, but unsure, you might want to leave a comment on the issue first.
- Write tests to improve the coverage. Any kind of test is welcome and coverage information can be obtained from a relatively recent coverage report. If you are unsure, don’t hesitate to check back first.
- Help with review and testing. There are easy ones such as the gui and rpc. However, review on any open pull request is welcome. Review will also help you understand the codebase better.
- Help on meta projects related to Bitcoin Core, such as https://github.com/corecheck.
- Join the Bitcoin Core IRC channel(s) and leave a comment to share what you are interested in.
l0rinc commented at 10:32 am on April 11, 2024:I was looking for introductory issues, couldn’t find any that was a good fit for me. But just looking at the code I found several optimization opportunities in relevant parts of the code, so I contributed those instead, e.g. the 4x speedup of base58.
l0rinc commented at 10:38 am on April 11, 2024:Do you consider optimizations useless refactorings? Other projects I’ve worked on consider them features. For me it’s a gentle introduction to the codebase, isolated from other complexities, I poke it left-and-right and see how the code behaves. Is this not a valuable contribution? Especially since I usually cover the corner cases I find with new test cases.
maflcko commented at 11:24 am on April 11, 2024:Tests are generally a good way to start contributing.
Just generally for each pull request, it needs review before merge, and each reviewer will have to trade-off whether it is worth it to review. There are 316 other pull requests open waiting for review, so that is another thing for reviewers to trade-off on.
l0rinc force-pushed on Apr 11, 2024in src/util/strencodings.cpp:184 in c6ee976cd9 outdated
183 ConvertBits<8, 5, true>([&](int v) { str += pbase32[v]; }, input.begin(), input.end()); 184 if (pad) { 185- while (str.size() % 8) { 186- str += '='; 187- } 188+ while (str.size() % 8) str += '=';
maflcko commented at 6:29 am on April 12, 2024:Please no style changes of this kind
l0rinc commented at 7:28 am on April 12, 2024:Reverted. I’ve changed it originally to be in alignment with theEncodeBase64
l0rinc force-pushed on Apr 12, 2024laanwj commented at 5:47 am on April 16, 2024: memberThanks for your first contribution!
As it is hard to review PRs that change things all over the place, this is unlikely to be merged like this. I’d suggest keeping the code change focused. For example, make it add test cases only, and remove the changes (refactoring and otherwise) to
strecodings.cpp
and other non-test code.laanwj added the label Tests on Apr 16, 2024in src/util/strencodings.cpp:226 in 8ecae4ddb4 outdated
220@@ -221,9 +221,9 @@ std::optional<std::vector<unsigned char>> DecodeBase32(std::string_view str) 221 std::vector<unsigned char> ret; 222 ret.reserve((str.size() * 5) / 8); 223 bool valid = ConvertBits<5, 8, false>( 224- [&](unsigned char c) { ret.push_back(c); }, 225+ [&](auto c) { ret.push_back(c); }, 226 str.begin(), str.end(), 227- [](char c) { return decode32_table[uint8_t(c)]; } 228+ [](auto c) { return decode32_table[uint8_t(c)]; }
maflcko commented at 7:09 am on April 16, 2024:Why this change? This style change makes the code harder to read.
Unless there is a reason to change the code and review it, it shouldn’t be changed.
l0rinc commented at 7:14 am on April 16, 2024:I’ve changed it to unify it with the previous lambda. Why would I leave the type there when I’ve used auto 2 lines above?
maflcko commented at 7:39 am on April 16, 2024:My comment applies to all changes in this file. I am just trying to explain why such style changes do not make sense.
l0rinc commented at 9:52 am on April 16, 2024:Removed the 3/4 to 6/8 change, kept only the test. I don’t yet understand how we expect this code to become more and more maintainable, if we don’t regularly leave the campground cleaner than we’ve found it.
maflcko commented at 10:07 am on April 16, 2024:I don’t see how using
auto
makes anything clearer.There certainly are style and language features that make maintenance easier, but they need to be well motivated and explained. Also, if there is a good reason for them, they should be applied globally via CI and checks. Otherwise, they are again applied inconsistently.
l0rinc commented at 10:14 am on April 16, 2024:I agree. Is there any solution?
Is there any non-test change that I could delve into—preferably an optimization that doesn’t require me to understand every part of the code—that would be welcome? For example, like my other base58 or bech32 speedups. Or am I just barking up the wrong tree and should move on to other projects?
maflcko commented at 7:11 am on April 16, 2024: member@paplorinc If you are not being receptive to reviewers’ feedback, it is unlikely that anyone will review the pull request and without review it will not be merged.laanwj commented at 7:43 am on April 16, 2024: memberTake it how you wish, i was just trying to be helpful, not patronizing.maflcko commented at 7:45 am on April 16, 2024: memberPlease understand that this software project requires review of all code changes. It is not a typical software project that auto-merges any change as soon as the CI is green.
When writing a patch as a pull request, please consider how reviewers are supposed to review it and how easy it will be for them to follow the changes and their motivation.
l0rinc force-pushed on Apr 16, 2024l0rinc renamed this:
refactor/test: add a few more base32/64 calculation corner cases
test: add a few more base32/64 calculation corner cases
on Apr 16, 2024fanquake commented at 10:17 am on April 16, 2024: memberI don’t yet understand how we expect this code to become more and more maintainable, if we don’t regularly leave the campground cleaner than we’ve found it.
The issue here is that even if you’ve found the campground, and while maybe it looks like it could use a little maintenance, if you zoom out for a birds-eye view, you’ll actually notice that everyone else is busy somewhere next door, trying to fight the forest fires. Even if the code here is (maybe arbitrarily) “better” or “faster” by some metric, the impact to the whole project is ~0. In fact, it could even be argued that it’s a net negative, given the burden of changing and reviewing the code, probably doesn’t outweigh distracting engineers working on more important things and/or actually breaking something that already works (and will likely continue to work for many many years) + all the time spent here.
I understand that as a new contributor, you’re looking for things to do, and picking random utility functions/code, and trying to generically improve it may seem easiest / appealing, and potentially something that other open source projects will happily take, however that isn’t how things work here.
I’d suggest looking through https://github.com/bitcoin/bitcoin/issues (you can filter for “good first issues”), and either picking a bug that needs fixing, and working on fixing it, or, looking through, https://github.com/bitcoin/bitcoin/pulls, and finding something that interests you, and leaving some meaningful review comments. These will be much more productive avenues for you to become a productive contributor to this project. You should also familiarise yourself with https://github.com/bitcoin/bitcoin/blob/master/CONTRIBUTING.md and the https://github.com/bitcoin/bitcoin/blob/master/doc/developer-notes.md.
l0rinc commented at 10:25 am on April 16, 2024: contributorThanks for the detailed answer @fanquake.
everyone else is busy […] trying to fight the forest fires
So how do we prevent future forest fires while extinguishing current ones?
I have looked through
good first issue
since the beginning, there’s barely any, all of which were started by others, often years ago. That’s why I started working on what I can contribute instead.maflcko commented at 10:49 am on April 16, 2024: memberAnother great way to contribute would be to spend time on in-depth review of open pull requests and compare your review with the review done by others. Obviously this will be time-consuming, starting out, but if you care about the project, over time it will teach you relevant skills (for this project and for yourself to apply elsewhere).
On some pull requests, there remain small follow-ups that were found during review, so you’ll naturally find stuff to work on if you start out by reviewing pull requests done by others. Another benefit of this is, that if you end up creating a follow-up, the code will be fresh in the minds of some developers, who may be willing to spend time on review.
l0rinc force-pushed on May 11, 2024l0rinc force-pushed on May 29, 2024Add base32 and base64 roundtrip random padding tests
The new randomized roundtrips for base32 and base64 encoding make sure encoding and decoding are symmetric. Note that if we omit the padding in EncodeBase32 we won't be able to decode it with DecodeBase32. Added dedicated padding tests to hardcode the failure behavior.
in src/test/base64_tests.cpp:66 in fb03332664 outdated
65+ 66+BOOST_AUTO_TEST_CASE(base64_roundtrip) { 67+ SeedInsecureRand(); 68+ 69+ for (int i = 0; i < 100; ++i) { 70+ auto randomStr = InsecureRand256().ToString().substr(0, InsecureRandRange(64));
maflcko commented at 8:40 am on May 29, 2024:nit: Not sure about using the global for new code. I’d say it is cleaner to replace the SeedInsecureRand with a local FastRandomContext and then call the member methods directly. E.gfrc.randrange(64)
, etc
l0rinc commented at 9:26 am on May 29, 2024:Nice, thanks!l0rinc force-pushed on May 29, 2024DrahtBot commented at 10:03 am on August 28, 2024: contributor🐙 This pull request conflicts with the target branch and needs rebase.
DrahtBot added the label Needs rebase on Aug 28, 2024maflcko commented at 5:37 am on August 29, 2024: memberAdding tests seems fine, but maybe open a new pull request, because the discussion here was about a different change to non-test code.l0rinc closed this on Aug 29, 2024
l0rinc deleted the branch on Aug 29, 2024l0rinc commented at 6:59 am on August 29, 2024: contributormoved to a clean slate in https://github.com/bitcoin/bitcoin/pull/30746
github-metadata-mirror
This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-26 12:12 UTC
This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me