Safegcd-based modular inverses in MuHash3072

sipa commented at 4:26 am on April 4, 2021: member

This implements a safegcd-based modular inverse for MuHash3072. It is a fairly straightforward translation of the libsecp256k1 implementation, with the following changes:

Generic for 32-bit and 64-bit
Specialized for the specific MuHash3072 modulus (2^3072 - 1103717).
A bit more C++ish
Far fewer sanity checks

A benchmark is also included for MuHash3072::Finalize. The new implementation is around 100x faster on x86_64 for me (from 5.8 ms to 57 μs); for 32-bit code the factor is likely even larger.

For more information:

Original paper by Daniel J. Bernstein and Bo-Yin Yang
Implementation for libsecp256k1 by Peter Dettman; and the final version
Explanation of the algorithm using Python snippets
Analysis of the maximum number of iterations the algorithm needs
Formal proof in Coq by Russell O’Connor (for the 256-bit version of the algorithm; here we use a 3072-bit one).
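
For readers new to the algorithm, the divstep iteration behind safegcd can be sketched on a small modulus. This is only an illustration modelled on the Python explanation linked above, not the code in this PR: it is variable-time, handles one divstep at a time, and the names `Div2` and `ModInv` are made up for the example. It assumes an odd modulus below 2^31, so that the intermediate values comfortably fit in `int64_t`, and an input coprime to the modulus.

```cpp
#include <cassert>
#include <cstdint>

// Halve x modulo the odd modulus M: if x is odd, add M first so the division is exact.
int64_t Div2(int64_t M, int64_t x)
{
    if (x & 1) x += M;
    return x / 2;
}

// Toy divstep-based inverse of x modulo odd M (hypothetical helper, not the PR's code).
// Invariants maintained by every step: d*x == f (mod M) and e*x == g (mod M).
int64_t ModInv(int64_t M, int64_t x)
{
    assert(M > 0 && (M & 1) && M < (int64_t{1} << 31));
    assert(x > 0 && x < M);
    int64_t delta = 1, f = M, g = x, d = 0, e = 1;
    while (g != 0) {
        if (delta > 0 && (g & 1)) {
            // Swap roles: the new f is the old g, the new g is (g - f) / 2.
            const int64_t nf = g, ng = (g - f) / 2, nd = e, ne = Div2(M, e - d);
            delta = 1 - delta; f = nf; g = ng; d = nd; e = ne;
        } else if (g & 1) {
            // g is odd but no swap: add f to make it even, then halve.
            delta = 1 + delta; g = (g + f) / 2; e = Div2(M, e + d);
        } else {
            // g is even: just halve.
            delta = 1 + delta; g = g / 2; e = Div2(M, e);
        }
    }
    assert(f == 1 || f == -1); // |f| is gcd(M, x), which must be 1
    const int64_t r = (d * f) % M; // 1/x = d/f = d*f since f = +-1
    return r < 0 ? r + M : r;
}
```

For example, `ModInv(13, 5)` walks through seven divsteps and returns 8, since 5 * 8 = 40 = 1 (mod 13).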

DrahtBot added the label Build system on Apr 4, 2021

DrahtBot added the label Utils/log/libs on Apr 4, 2021

sipa commented at 11:11 pm on April 6, 2021: member

Using libgmp for inverses is 1.5x-2x faster still, which is somewhat expected - there are several optimizations to safegcd that become more relevant for larger input sizes but aren’t useful in the 256-bit code from which this is adapted.

I think it’s fine to leave those for future improvements, as this already gets hash finalization down to ~1 signature check worth, which is probably far below what we care about.

laanwj commented at 2:01 pm on May 19, 2021: member

Concept and high-level review ACK. Did not check the algorithm in detail.

theStack commented at 6:56 pm on June 22, 2021: contributor

Concept ACK

DrahtBot commented at 9:07 am on October 15, 2021: contributor

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/21590.

Reviews

See the guideline for information on the review process.

Type	Reviewers
ACK	TheCharlatan, dergoegge, achow101
Concept ACK	laanwj
Stale ACK	theStack

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#31507 ([POC] build: Use clang-cl to build on Windows natively by hebasto)
#31308 (ci, iwyu: Treat warnings as errors for specific targets by hebasto)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

DrahtBot added the label Needs rebase on Nov 13, 2021

sipa force-pushed on Nov 15, 2021

sipa commented at 4:22 pm on November 15, 2021: member

Rebased.

DrahtBot removed the label Needs rebase on Nov 15, 2021

in configure.ac:975 in f1f4eae619 outdated

966@@ -967,6 +967,30 @@ AC_CHECK_DECLS([bswap_16, bswap_32, bswap_64],,,
967                  #include <byteswap.h>
968                  #endif])
969 
970+AC_MSG_CHECKING([for __builtin_ctz])
971+AC_COMPILE_IFELSE([AC_LANG_PROGRAM([[ ]], [[
972+ (void) __builtin_clz(0);
973+  ]])],
974+ [ AC_MSG_RESULT(yes); AC_DEFINE(HAVE_BUILTIN_CTZ, 1, [Define this symbol if you have __builtin_ctz])],
975+ [ AC_MSG_RESULT(no)]

fanquake commented at 11:44 pm on November 15, 2021:

Can you use the following here and below. Functionally there’s no real difference, but we’ve settled on this style for consistency across the build system.

 [ AC_MSG_RESULT([yes]); AC_DEFINE([HAVE_BUILTIN_CTZ], [1], [Define this symbol if you have __builtin_ctz])],
 [ AC_MSG_RESULT([no])]

sipa commented at 9:17 pm on December 1, 2021:

Done.

sipa force-pushed on Dec 1, 2021

sipa commented at 9:17 pm on December 1, 2021: member

I added additional fuzz tests, which seem to pass.

However, test/functional/feature_coinstatsindex.py now fails, and I haven’t figured out why. @fjahr ?

in src/test/fuzz/muhash.cpp:35 in 58b108ebcd outdated

30+    }
31+
32+    void Serialize(Span<uint8_t> bytes) {
33+        assert(bytes.size() % 4 == 0);
34+        assert(bytes.size() <= 768);
35+        for (int i = 0; i*4 < bytes.size(); ++i) {

maflcko commented at 3:07 pm on December 2, 2021:

        for (size_t i{0}; i*4 < bytes.size(); ++i) {

test/fuzz/muhash.cpp:27:29: error: comparison of integers of different signs: 'int' and 'std::size_t' (aka 'unsigned long') [-Werror,-Wsign-compare]
        for (int i = 0; i*4 < bytes.size(); ++i) {
                        ~~~ ^ ~~~~~~~~~~~~
test/fuzz/muhash.cpp:35:29: error: comparison of integers of different signs: 'int' and 'std::size_t' (aka 'unsigned long') [-Werror,-Wsign-compare]
        for (int i = 0; i*4 < bytes.size(); ++i) {
                        ~~~ ^ ~~~~~~~~~~~~
test/fuzz/muhash.cpp:131:13: error: unused variable 'buf' [-Werror,-Wunused-variable]
    uint8_t buf[384];
            ^
3 errors generated.

sipa commented at 4:40 pm on December 2, 2021:

Fixed.

in src/test/fuzz/muhash.cpp:110 in 58b108ebcd outdated

105+} // namespace
106+
107+FUZZ_TARGET(num3072_mul)
108+{
109+    FuzzedDataProvider provider{buffer.data(), buffer.size()};
110+    uint8_t buf[384];

maflcko commented at 3:09 pm on December 2, 2021:

unused

sipa commented at 4:40 pm on December 2, 2021:

Improved in various ways.

sipa force-pushed on Dec 2, 2021

maflcko referenced this in commit fd1c9e26d3 on Dec 3, 2021

sidhujag referenced this in commit 25e1d70230 on Dec 3, 2021

fjahr commented at 10:05 pm on December 5, 2021: contributor

I added additional fuzz tests, which seem to pass.

However, test/functional/feature_coinstatsindex.py now fails, and I haven’t figured out why. @fjahr ?

This isn’t an issue with the implementation here but with the test just being stupid. More info here: #23681.

I will take a closer look at the code here soon!

maflcko referenced this in commit 42b35f17d5 on Dec 6, 2021

sidhujag referenced this in commit 177261cf2d on Dec 6, 2021

in src/crypto/muhash.cpp:147 in 9cb1bc3409 outdated

148+    void FromNum3072(const Num3072& in)
149+    {
150+        double_limb_t c = 0;
151+        int b = 0, outpos = 0;
152+        for (int i = 0; i < LIMBS; ++i) {
153+            c += ((double_limb_t)in.limbs[i]) << b;

PastaPastaPasta commented at 1:44 pm on February 1, 2022:

please avoid c-style cast, use c++11 style functional casts

double_limb_t(in.limbs[i])

sipa commented at 2:04 pm on October 16, 2022:

Done.

in src/crypto/muhash.cpp:150 in 9cb1bc3409 outdated

151+        int b = 0, outpos = 0;
152+        for (int i = 0; i < LIMBS; ++i) {
153+            c += ((double_limb_t)in.limbs[i]) << b;
154+            b += LIMB_SIZE;
155+            while (b >= SIGNED_LIMB_SIZE) {
156+                limbs[outpos++] = (limb_t)c & MAX_SIGNED_LIMB;

PastaPastaPasta commented at 1:44 pm on February 1, 2022:

please use functional cast

sipa commented at 2:05 pm on October 16, 2022:

Done.
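
The conversion loops quoted in this part of the review repack limbs of one width into limbs of another through a double-width accumulator. A scaled-down sketch of that idea, written with the functional casts requested above, might look like the following. It assumes 32-bit input limbs and 30-bit packed limbs on a 128-bit value; all names and sizes here are illustrative and are not the ones used in the PR.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr int LIMB_BITS = 32;    // width of the plain limbs (illustrative)
constexpr int PACKED_BITS = 30;  // width of the packed limbs (illustrative)
constexpr uint32_t PACKED_MASK = (uint32_t{1} << PACKED_BITS) - 1;
constexpr size_t N_PLAIN = 4;    // 4 * 32 = 128 bits
constexpr size_t N_PACKED = 5;   // ceil(128 / 30)

// Repack 32-bit limbs into 30-bit limbs, least significant limb first.
std::array<uint32_t, N_PACKED> ToPacked(const std::array<uint32_t, N_PLAIN>& in)
{
    std::array<uint32_t, N_PACKED> out{};
    uint64_t c = 0;  // accumulator holding bits not yet emitted
    int b = 0;       // number of valid bits currently in c
    size_t outpos = 0;
    for (size_t i = 0; i < N_PLAIN; ++i) {
        c += uint64_t(in[i]) << b;
        b += LIMB_BITS;
        while (b >= PACKED_BITS) {
            out[outpos++] = uint32_t(c) & PACKED_MASK;
            c >>= PACKED_BITS;
            b -= PACKED_BITS;
        }
    }
    out[outpos] = uint32_t(c);  // the remaining 8 bits form the top limb
    return out;
}

// Inverse operation: repack 30-bit limbs back into 32-bit limbs.
std::array<uint32_t, N_PLAIN> FromPacked(const std::array<uint32_t, N_PACKED>& in)
{
    std::array<uint32_t, N_PLAIN> out{};
    uint64_t c = 0;
    int b = 0;
    size_t outpos = 0;
    for (size_t i = 0; i < N_PACKED; ++i) {
        c += uint64_t(in[i]) << b;
        b += PACKED_BITS;
        if (b >= LIMB_BITS && outpos < N_PLAIN) {
            out[outpos++] = uint32_t(c);  // truncation to 32 bits is intended
            c >>= LIMB_BITS;
            b -= LIMB_BITS;
        }
    }
    return out;
}
```

The accumulator never holds more than 61 bits in either direction, which is why a 64-bit type is enough for this toy; the real code needs a correspondingly wider `double_limb_t`.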

in src/crypto/muhash.cpp:167 in 9cb1bc3409 outdated

169+    void ToNum3072(Num3072& out) const
170+    {
171+        double_limb_t c = 0;
172+        int b = 0, outpos = 0;
173+        for (int i = 0; i < SIGNED_LIMBS; ++i) {
174+            c += ((double_limb_t)limbs[i]) << b;

PastaPastaPasta commented at 1:45 pm on February 1, 2022:

use functional cast

sipa commented at 2:05 pm on October 16, 2022:

Done.

in src/crypto/muhash.cpp:170 in 9cb1bc3409 outdated

172+        int b = 0, outpos = 0;
173+        for (int i = 0; i < SIGNED_LIMBS; ++i) {
174+            c += ((double_limb_t)limbs[i]) << b;
175+            b += SIGNED_LIMB_SIZE;
176+            if (b >= LIMB_SIZE) {
177+                out.limbs[outpos++] = (limb_t)c;

PastaPastaPasta commented at 1:45 pm on February 1, 2022:

please use functional cast

sipa commented at 2:05 pm on October 16, 2022:

Done.

PastaPastaPasta changes_requested

PastaPastaPasta commented at 1:49 pm on February 1, 2022: contributor

please change all the c-style casts to functional casts, I gave specific comments on a few instances, but stopped so it’s not as spammy

elichai commented at 3:00 pm on October 4, 2022: contributor

Another option here is to make arith_uint256 generic over the integer size, and then we can get a generic u3072 and implement a simple egcd via the extended Euclidean algorithm (as this doesn’t require constant-timeness)

achow101 commented at 6:26 pm on October 12, 2022: member

Are you still working on this?

sipa force-pushed on Oct 16, 2022

sipa commented at 2:05 pm on October 16, 2022: member

Rebased, and addressed comments.

sipa commented at 2:07 pm on October 16, 2022: member

@elichai That’s a possibility. I expect it’d be an order of magnitude slower, but still significantly faster than what we have. The arith_uint256 code is already templated in the number of bits, so this would not be much work. Are you interested in trying that approach?

DrahtBot added the label Needs rebase on Oct 20, 2022

sipa commented at 9:03 pm on January 18, 2023: member

Rebased.

sipa force-pushed on Jan 18, 2023

DrahtBot removed the label Needs rebase on Jan 18, 2023

DrahtBot added the label Needs rebase on Jan 30, 2023

sipa force-pushed on Apr 25, 2023

achow101 commented at 3:52 pm on April 25, 2023: member

cc @real-or-random @fjahr

DrahtBot removed the label Needs rebase on Apr 25, 2023

DrahtBot added the label CI failed on Apr 25, 2023

maflcko commented at 2:21 pm on April 26, 2023: member

On Windows CI this will eat the bench CPU without terminating?

https://cirrus-ci.com/task/6655367648641024?logs=check#L1394

Fabcien referenced this in commit a045b9e47d on Jun 22, 2023

PastaPastaPasta referenced this in commit 25fcda0180 on Dec 25, 2023

PastaPastaPasta referenced this in commit 9b388c8388 on Dec 25, 2023

PastaPastaPasta referenced this in commit 7d601cfa85 on Dec 27, 2023

theuni commented at 3:43 pm on January 18, 2024: member

Seems the ctz builtins here can be switched to c++20’s countr_zero ? See the libc++ impl here, for an example of how it maps to builtins.

DrahtBot added the label Needs rebase on Mar 1, 2024

achow101 commented at 2:24 pm on April 9, 2024: member

Are you still working on this?

sipa force-pushed on Apr 9, 2024

DrahtBot removed the label Needs rebase on Apr 9, 2024

DrahtBot removed the label CI failed on Apr 9, 2024

sipa commented at 8:44 pm on April 9, 2024: member

Rebased, and switched to std::countr_zero instead of CTZ builtins.

maflcko commented at 10:10 am on April 10, 2024: member

Looks like this also fixed the Windows issue (https://github.com/bitcoin/bitcoin/pull/21590#issuecomment-1523507720), so I guess there may have been a bug in the previous implementation.

fjahr commented at 3:34 pm on April 15, 2024: contributor

@sipa The description still lists “Add more tests” as an open todo. Did you still want to address this before this could be merged? I think we have ok-ish coverage of MuHash but if you think something should be added could you tell us what you had in mind? Are there maybe test vectors we can port over?

sipa commented at 3:36 pm on April 15, 2024: member

@fjahr I’ve dropped the TODO. Feel free to contribute tests of course if you feel that’s helpful.

dergoegge approved

dergoegge commented at 3:42 pm on June 7, 2024: member

tACK 030c9edf5b12033207da2bc0735f97840dc88056

I differentially fuzzed the muhash implementation we have on master against the version in this PR. Given the coverage reached and assuming that the implementation on master is correct, I am confident that the muhash implementation in this PR is also correct. I tested across multiple compilers and optimization levels, i.e. the following tuples: (master clang -O2, pr clang -O2), (master clang -O2, pr gcc -O2), (master clang -O2, pr gcc -O0) (all on x86_64).

Given the amount of testing and work that has already gone into all of this, I’m not surprised that I didn’t find any bugs.

The speedup for the muhash harness is very noticeable, ~60x faster for me.
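
The idea of differential testing can be shown in miniature: run two independent implementations of the same function on the same inputs and flag any disagreement. The sketch below compares an exponentiation-based inverse with an extended-Euclid inverse modulo the small prime 1000003. It uses a plain random loop rather than a coverage-guided fuzzer, and every name in it is made up for the example; it says nothing about how the actual harness is written.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

constexpr int64_t P = 1000003;  // small prime, chosen only for the example

// Reference implementation: x^(P-2) mod P (Fermat's little theorem).
int64_t InverseFermat(int64_t x)
{
    int64_t result = 1, base = x % P, exp = P - 2;
    while (exp > 0) {
        if (exp & 1) result = result * base % P;
        base = base * base % P;
        exp >>= 1;
    }
    return result;
}

// Second implementation: extended Euclidean algorithm.
int64_t InverseEuclid(int64_t x)
{
    int64_t r0 = P, r1 = x % P, t0 = 0, t1 = 1;
    while (r1 != 0) {
        const int64_t q = r0 / r1;
        const int64_t r2 = r0 - q * r1; r0 = r1; r1 = r2;
        const int64_t t2 = t0 - q * t1; t0 = t1; t1 = t2;
    }
    assert(r0 == 1);  // gcd(P, x) must be 1 for the inverse to exist
    return t0 < 0 ? t0 + P : t0;
}

// Feed both implementations the same random inputs and count disagreements.
int CountMismatches(int trials, uint64_t seed)
{
    std::mt19937_64 rng(seed);
    std::uniform_int_distribution<int64_t> dist(1, P - 1);
    int mismatches = 0;
    for (int i = 0; i < trials; ++i) {
        const int64_t x = dist(rng);
        if (InverseFermat(x) != InverseEuclid(x)) ++mismatches;
    }
    return mismatches;
}
```

As in the review above, agreement only shows that the two implementations match; correctness still rests on the assumption that the reference is right.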

DrahtBot requested review from laanwj on Jun 7, 2024

DrahtBot requested review from theStack on Jun 7, 2024

hebasto added the label Needs CMake port on Aug 16, 2024

DrahtBot added the label Needs rebase on Aug 28, 2024

maflcko removed the label Needs CMake port on Aug 29, 2024

sipa force-pushed on Oct 15, 2024

DrahtBot added the label CI failed on Oct 15, 2024

DrahtBot commented at 2:49 pm on October 15, 2024: contributor

🚧 At least one of the CI tasks failed. Debug: https://github.com/bitcoin/bitcoin/runs/31561424673

Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.
A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.
An intermittent issue.

Leave a comment here, if you need help tracking down a confusing failure.

maflcko commented at 2:54 pm on October 15, 2024: member

CI fails, presumably after https://github.com/bitcoin/bitcoin/pull/29071

DrahtBot removed the label Needs rebase on Oct 15, 2024

TheCharlatan commented at 4:31 pm on October 15, 2024: contributor

Concept ACK

sipa force-pushed on Oct 16, 2024

DrahtBot removed the label CI failed on Oct 16, 2024

in src/crypto/muhash.cpp:227 in 310b778eb1 outdated

251+ * f:   bottom SIGNED_LIMB_SIZE bits of initial f value
252+ * g:   bottom SIGNED_LIMB_SIZE bits of initial g value
253+ * out: resulting transformation matrix, scaled by 2^SIGNED_LIMB_SIZE
254+ * return: eta value after SIGNED_LIMB_SIZE divsteps
255+ */
256+limb_t ComputeDivstepMatrix(signed_limb_t eta, limb_t f, limb_t g, SignedMatrix& out)

TheCharlatan commented at 12:36 pm on October 31, 2024:

Nit: Not important, but could inline these same as the other functions only used within this module?

sipa commented at 3:12 pm on January 9, 2025:

Done.

in src/crypto/muhash.cpp:398 in ad67fd2e0b outdated

451+    Num3072Signed d, e, f, g;
452+    e.limbs[0] = 1;
453+    // F is initialized as modulus, which in signed limb representation can be expressed
454+    // simply as 2^3072 + -MAX_PRIME_DIFF.
455+    f.limbs[0] = -MAX_PRIME_DIFF;
456+    f.limbs[3072 / SIGNED_LIMB_SIZE] = ((limb_t)1) << (3072 % SIGNED_LIMB_SIZE);

TheCharlatan commented at 2:40 pm on November 6, 2024:

Nit: Both 3072 / SIGNED_LIMB_SIZE and 3072 % SIGNED_LIMB_SIZE are used a couple of times. Might it be more clear to give them their own descriptive constants? I was initially thinking FINAL_LIMB_POSITION and FINAL_LIMB_MODULUS_BITS, but not sure if that really feels clearer either.

sipa commented at 3:12 pm on January 9, 2025:

Done.

in src/crypto/muhash.cpp:304 in ad67fd2e0b outdated

340+    /* Begin computing t*[d,e]. */
341+    signed_limb_t di = d.limbs[0], ei = e.limbs[0];
342+    signed_double_limb_t cd = (signed_double_limb_t)u * di + (signed_double_limb_t)v * ei;
343+    signed_double_limb_t ce = (signed_double_limb_t)q * di + (signed_double_limb_t)r * ei;
344+    /* Correct md,me so that t*[d,e]+modulus*[md,me] has SIGNED_LIMB_SIZE zero bottom bits. */
345+    md -= (limb_t(0x70a1421da087d93) * limb_t(cd) + md) & MAX_SIGNED_LIMB;

TheCharlatan commented at 9:33 am on November 7, 2024:

Nit: The constant here seems clearer in modinv32_impl.h. Could it get a name here too, or an explanation how it was computed?

sipa commented at 3:12 pm on January 9, 2025:

Added a constant for it.
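
The thread does not say how the constant was derived. One common way to obtain the inverse of an odd number modulo a power of two is Newton iteration, sketched below for a 62-bit limb. This is an assumption about a possible derivation, not a statement about the PR; the names are hypothetical, and the sketch does not claim to reproduce the exact constant quoted above.

```cpp
#include <cassert>
#include <cstdint>

constexpr int SIGNED_LIMB_BITS = 62;  // assumed limb width for this example
constexpr uint64_t LIMB_MASK = (uint64_t{1} << SIGNED_LIMB_BITS) - 1;

// Inverse of an odd m modulo 2^64 via Newton iteration.
// For odd m, m*m == 1 (mod 8), so m itself is an inverse correct to 3 bits;
// each iteration doubles the number of correct bits: 3, 6, 12, 24, 48, 96.
constexpr uint64_t InverseMod2Pow64(uint64_t m)
{
    uint64_t inv = m;
    for (int i = 0; i < 5; ++i) {
        inv *= 2 - m * inv;  // unsigned arithmetic wraps modulo 2^64 by design
    }
    return inv;
}

// Inverse of m modulo 2^62: an inverse modulo 2^64 is also one modulo 2^62.
constexpr uint64_t InverseModLimb(uint64_t m)
{
    return InverseMod2Pow64(m) & LIMB_MASK;
}
```

Applied to the low limb of the modulus, that is (2^62 - 1103717) reduced modulo 2^62, the result can be checked by multiplying back and confirming that the low 62 bits equal 1.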

in src/crypto/muhash.cpp:311 in ad67fd2e0b outdated

347+    /* Update the beginning of computation for t*[d,e]+modulus*[md,me] now md,me are known. */
348+    cd -= (signed_double_limb_t)1103717 * md;
349+    ce -= (signed_double_limb_t)1103717 * me;
350+    /* Verify that the low SIGNED_LIMB_SIZE bits of the computation are indeed zero, and then throw them away. */
351+    assert((cd & MAX_SIGNED_LIMB) == 0);
352+    assert((ce & MAX_SIGNED_LIMB) == 0);

TheCharlatan commented at 9:41 am on November 7, 2024:

Just a question: Compared to the code in modinv32_impl.h, there are fewer bound checks done here. Is this on purpose?

sipa commented at 3:07 pm on January 9, 2025:

Which ones in particular are missing?

TheCharlatan commented at 8:59 pm on January 9, 2025:

I meant these checks: https://github.com/bitcoin/bitcoin/blob/master/src/secp256k1/src/modinv32_impl.h#L414-L419 https://github.com/bitcoin/bitcoin/blob/master/src/secp256k1/src/modinv32_impl.h#L456-L459

But I guess they are not really that important and it is not worth adding the required utilities?

in src/crypto/muhash.cpp:190 in ad67fd2e0b outdated

213+    void Normalize(bool negate)
214+    {
215+        // Add modulus if this was negative. This brings the range of *this to 1-2^3072..2^3072-1.
216+        signed_limb_t cond_add = limbs[SIGNED_LIMBS-1] >> (LIMB_SIZE-1); // -1 if this is negative; 0 otherwise
217+        limbs[0] += signed_limb_t(-MAX_PRIME_DIFF) & cond_add;
218+        limbs[3072 / SIGNED_LIMB_SIZE] += (signed_limb_t(1) << (3072 % SIGNED_LIMB_SIZE)) & cond_add;

TheCharlatan commented at 12:12 pm on November 7, 2024:

Just a question: Compared to modinv32_normalize, this step seems to skip the inner limbs. Is there a reason why there is a difference between the implementations here? IIUC this works out in the end because of the carry step.

sipa commented at 3:10 pm on January 9, 2025:

Yes and no. They aren’t skipped, the carry step is the equivalent of processing the inner limbs. They’re just easier, because the modulus here can be represented as [1 << FINAL_LIMB_MODULUS_BITS, 0, 0, 0, …, 0, 0, -MAX_PRIME_DIFF]. In the modinv32_impl.h code, the modulus is treated as generic, where any limb can be nonzero.
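
A scaled-down model may make the sparse representation easier to see. The sketch below uses a toy modulus 2^40 - 87 in 15-bit signed limbs, so that only the bottom limb (-87) and the top limb (1 << 10) are nonzero, and adds the modulus to a negative value by touching just those two limbs. The sizes and names are chosen for the example and are not the PR's; the sign test uses a comparison instead of the arithmetic shift in the quoted code, and no carry step is shown because the example is picked so that none is needed.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr int LIMB_BITS = 15;      // toy limb width
constexpr int TOTAL_BITS = 40;     // toy modulus is 2^40 - 87
constexpr int64_t PRIME_DIFF = 87;
constexpr int TOP_LIMB = TOTAL_BITS / LIMB_BITS;  // index of the top limb: 2
constexpr int TOP_BITS = TOTAL_BITS % LIMB_BITS;  // bit position within it: 10
using Limbs = std::array<int64_t, TOP_LIMB + 1>;

// Least significant limb first: only the two outer limbs are nonzero.
constexpr Limbs MODULUS{-PRIME_DIFF, 0, int64_t{1} << TOP_BITS};

// Value represented by signed limbs: sum of x[i] * 2^(LIMB_BITS * i).
int64_t Evaluate(const Limbs& x)
{
    int64_t v = 0;
    for (int i = TOP_LIMB; i >= 0; --i) {
        v = v * (int64_t{1} << LIMB_BITS) + x[i];
    }
    return v;
}

// Add the modulus if x is negative, touching only the bottom and top limbs.
// Assumes the lower limbs are in [0, 2^LIMB_BITS), so the top limb carries the sign.
void AddModulusIfNegative(Limbs& x)
{
    const int64_t cond_add = x[TOP_LIMB] < 0 ? -1 : 0;  // all-ones mask if negative
    x[0] += -PRIME_DIFF & cond_add;
    x[TOP_LIMB] += (int64_t{1} << TOP_BITS) & cond_add;
}
```

In general the bottom limb can go negative after the addition, which is where the carry step mentioned above comes in.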

in src/crypto/muhash.cpp:455 in ad67fd2e0b outdated

503+    d.ToNum3072(ret);
504+    return ret;
505 }
506 
507-void Num3072::Square()
508+void Num3072::Multiply(const Num3072& a)

TheCharlatan commented at 7:58 pm on November 7, 2024:

Note for other reviewers: The diff here is confusing, but from what I can tell, this does not actually change the Multiply function, but just gets rid of Square(), muldbladd3, and square_n_mul, which were used in the old Inverse implementation.

in src/crypto/muhash.cpp:314 in ad67fd2e0b outdated

344+    /* Correct md,me so that t*[d,e]+modulus*[md,me] has SIGNED_LIMB_SIZE zero bottom bits. */
345+    md -= (limb_t(0x70a1421da087d93) * limb_t(cd) + md) & MAX_SIGNED_LIMB;
346+    me -= (limb_t(0x70a1421da087d93) * limb_t(ce) + me) & MAX_SIGNED_LIMB;
347+    /* Update the beginning of computation for t*[d,e]+modulus*[md,me] now md,me are known. */
348+    cd -= (signed_double_limb_t)1103717 * md;
349+    ce -= (signed_double_limb_t)1103717 * me;

TheCharlatan commented at 8:19 pm on November 7, 2024:

This tripped me up at first, because something is added instead of subtracted compared to the implementation in modinv32_impl.h. But I think this is correct, because it subtracts the distance (I don’t know what the correct terminology is for this) to the modulus instead of adding the modulus, which should work out to the same. Similarly, the operation in the for loop can also be moved to the final step, which I’m guessing is a further nice optimization.

sipa commented at 3:11 pm on January 9, 2025:

Same comment as above. The modulus here is 2^3072 - MAX_PRIME_DIFF, which is represented in signed-limb representation as [1 << FINAL_LIMB_MODULUS_BITS, 0, 0, 0, …, 0, -MAX_PRIME_DIFF], so we only need to do something for the bottom limb (where our modulus is negative) and the top limb.

TheCharlatan approved

TheCharlatan commented at 8:26 pm on November 7, 2024: contributor

ACK ad67fd2e0bfa6f43f350066596b6cca146391362

Just nits, and all of them can be ignored. This was fun to review, the explanations in the safegcd_implementation.md are excellent. I also profited from the two review clubs and their notes that were done on the original MuHash introduction.

DrahtBot requested review from dergoegge on Nov 7, 2024

theStack approved

theStack commented at 5:21 pm on November 18, 2024: contributor

Fuzz-tested ACK ad67fd2e0bfa6f43f350066596b6cca146391362

With the friendly help of @dergoegge I managed to get differential fuzzing running last week and let it run for the last ~77 hours. Here are the rough instructions for those who also want to give it a try:

1. created branches on top of master and the PR, each adding a characterization to the MuHash fuzz test that writes to shared memory for comparison (see https://github.com/theStack/bitcoin/tree/muhash_characterization_master and https://github.com/theStack/bitcoin/tree/muhash_characterization_pr21590, cherry-picking the commit originally from https://github.com/dergoegge/bitcoin/commit/d3273787bc97f1023259724ddcf2968f3fe12279; note that the environment variable value had to be adapted to SEMSAN_CHARACTERIZATION_SHMEM_ID)

2. built the afl-clang-... binaries for clang 18:

$ git clone https://github.com/AFLplusplus/AFLplusplus
$ cd AFLplusplus
$ LLVM_CONFIG=llvm-config-18 make

3. built both branches mentioned in step 1 above using afl-clang-lto/afl-clang-lto++ (built in step 2):

$ cmake -B build_fuzz -DCMAKE_C_COMPILER="/path/to/AFLplusplus/afl-clang-lto" -DCMAKE_CXX_COMPILER="/path/to/AFLplusplus/afl-clang-lto++" -DBUILD_FOR_FUZZING=ON
...
$ cmake --build build_fuzz/
...

(Note that this can take quite a while. Unfortunately, using the -fast binaries didn’t work for me and resulted in a linker error.)

4. cloned the qa-assets repo for the fuzzing seeds:

$ git clone --depth=1 https://github.com/bitcoin-core/qa-assets

5. built dergoegge’s semsan tool and ran it with each of the built fuzzing binaries above and the fuzzing seeds:

$ git clone https://github.com/dergoegge/semsan
$ cd semsan
$ cargo build --release
$ AFL_DEBUG=1 FUZZ=muhash ./target/release/semsan --debug-children /path/to/master_characterization_branch/build_fuzz/src/test/fuzz/fuzz /path/to/pr21590_characterization_branch/build_fuzz/src/test/fuzz/fuzz fuzz --seeds ~/qa-assets/fuzz_corpora/muhash/ --solutions ./solutions

6. wait and enjoy 🍻 🥃 🥩 🍨

The latest output looked like this on my machine:

[Client Heartbeat #0] run time: 77h-20m-55s, clients: 1, corpus: 22, objectives: 0, executions: 31816828, exec/sec: 114.3, combined-coverage: 262/563840 (0%), stability: 262/262 (100%)

dergoegge approved

dergoegge commented at 10:01 am on January 9, 2025: member

tACK ad67fd2e0bfa6f43f350066596b6cca146391362

Only minor changes since my last review.

Add benchmark for MuHash finalization 91ce8cef2d

Safegcd based modular inverse for Num3072 a26ce62894

Add a fuzz test for Num3072 multiplication and inversion f5883286e3

sipa force-pushed on Jan 9, 2025

sipa commented at 3:12 pm on January 9, 2025: member

Rebased, addressed a few comments, and changed some assert() to Assume().

TheCharlatan approved

TheCharlatan commented at 8:13 pm on January 9, 2025: contributor

Re-ACK f5883286e32b625aab3dd80c74d6adb4f37f0a80

Range-diff’ed from the last push; the changes are marking functions as inline, changing asserts to Assume, and adding some constants. There are some minor formatting issues, which can be fixed by running the commits through clang-format-diff.

DrahtBot requested review from theStack on Jan 9, 2025

DrahtBot requested review from dergoegge on Jan 9, 2025

in CMakeLists.txt:400 in f5883286e3

396@@ -397,6 +397,7 @@ target_link_libraries(core_interface INTERFACE warn_interface)
397 if(MSVC)
398   try_append_cxx_flags("/W3" TARGET warn_interface SKIP_LINK)
399   try_append_cxx_flags("/wd4018" TARGET warn_interface SKIP_LINK)
400+  try_append_cxx_flags("/wd4146" TARGET warn_interface SKIP_LINK)

hebasto commented at 9:18 am on January 10, 2025:

I suggest limiting the scope of this warning suppression to the bitcoin_crypto library only:

diff --git a/CMakeLists.txt b/CMakeLists.txt
index f2a8183c84..2dba6f255d 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -397,7 +397,6 @@ target_link_libraries(core_interface INTERFACE warn_interface)
 if(MSVC)
   try_append_cxx_flags("/W3" TARGET warn_interface SKIP_LINK)
   try_append_cxx_flags("/wd4018" TARGET warn_interface SKIP_LINK)
-  try_append_cxx_flags("/wd4146" TARGET warn_interface SKIP_LINK)
   try_append_cxx_flags("/wd4244" TARGET warn_interface SKIP_LINK)
   try_append_cxx_flags("/wd4267" TARGET warn_interface SKIP_LINK)
   try_append_cxx_flags("/wd4715" TARGET warn_interface SKIP_LINK)
diff --git a/src/crypto/CMakeLists.txt b/src/crypto/CMakeLists.txt
index 03c6972dca..4536801450 100644
--- a/src/crypto/CMakeLists.txt
+++ b/src/crypto/CMakeLists.txt
@@ -22,6 +22,11 @@ add_library(bitcoin_crypto STATIC EXCLUDE_FROM_ALL
   ../support/cleanse.cpp
 )
 
+target_compile_options(bitcoin_crypto
+  PRIVATE
+    $<$<CXX_COMPILER_ID:MSVC>:/wd4146>
+)
+
 target_link_libraries(bitcoin_crypto
   PRIVATE
     core_interface

TheCharlatan commented at 12:22 pm on January 28, 2025:

Do you want to open a follow up for this?

dergoegge approved

dergoegge commented at 7:13 pm on January 10, 2025: member

tACK f5883286e32b625aab3dd80c74d6adb4f37f0a80

in src/crypto/muhash.cpp:313 in a26ce62894 outdated

349+    signed_double_limb_t ce = (signed_double_limb_t)q * di + (signed_double_limb_t)r * ei;
350+    /* Correct md,me so that t*[d,e]+modulus*[md,me] has SIGNED_LIMB_SIZE zero bottom bits. */
351+    md -= (MODULUS_INVERSE * limb_t(cd) + md) & MAX_SIGNED_LIMB;
352+    me -= (MODULUS_INVERSE * limb_t(ce) + me) & MAX_SIGNED_LIMB;
353+    /* Update the beginning of computation for t*[d,e]+modulus*[md,me] now md,me are known. */
354+    cd -= (signed_double_limb_t)1103717 * md;

achow101 commented at 9:22 pm on January 27, 2025:

In a26ce628942243fc9848a63bfdfa5e61f5e936f3 “Safegcd based modular inverse for Num3072”

nit: 1103717 appears to be the value of MAX_PRIME_DIFF, so I think this can use that variable instead.

achow101 commented at 9:42 pm on January 27, 2025: member

ACK f5883286e32b625aab3dd80c74d6adb4f37f0a80

achow101 merged this on Jan 27, 2025

achow101 closed this on Jan 27, 2025
