Make SHA256 compression runtime pluggable

furszy commented at 11:18 pm on November 27, 2025: member

Tackling the long-standing request #702.

Right now we ship our own SHA256 implementation, a standard baseline version that does not take advantage of any hardware-optimized instruction, and it cannot be accessed by the embedding application - it is for internal usage only.

This means embedding applications often have to implement or include a different version for their use cases, wasting space on constrained environments, and in performance-sensitive setups it forces them to use a slower path than what the platform provides. Many projects already rely on tuned SHA-NI / ARMv8 / or other hardware-optimized code, so always using the baseline implementation we ship within the library is not ideal.

These changes allow users to supply their own SHA256 compression function at runtime, while preserving the existing default behavior for everyone else. This is primarily intended for environments where the available SHA256 implementation is detected dynamically and recompiling the library with a different implementation is not feasible (equivalent build-time functionality will come in a follow-up PR).

It introduces a new API:

0secp256k1_context_set_sha256_transform_callback(ctx, fn_transform)

This function installs the optimized SHA256 compression into the secp256k1_context, which is then used by all internal computations. Important: The provided function is verified to be output-equivalent to the original one.

As a quick example, using this functionality in Bitcoin-Core will be very straightforward: https://github.com/furszy/bitcoin-core/commit/f68bef06d95a589859f98fc898dd80ab2e35eb39

furszy force-pushed on Nov 28, 2025

fjahr commented at 4:16 pm on December 1, 2025: contributor

I guess this is more of a draft for initial review but I will spell it out anyway: It’s currently missing some CI coverage for building with an external library as well as some docs. Curious what people think in terms of docs for something like this: Should there be extensive guidance in which context this is safe to use or would users be left on their own to try it out? Or are we assuming all users that go to this length are competent enough to judge if it’s a good idea to bring their own sha256 or use the systems one?

furszy commented at 7:03 pm on December 1, 2025: member

Thanks for the feedback fjahr!

I guess this is more of a draft for initial review but I will spell it out anyway: It’s currently missing some CI coverage for building with an external library as well as some docs.

Yeah, can create a CI job testing the compile-time pluggable compression function very easily. Thanks for reminding that.

And about the missing docs; yeah. I didn’t add it because we currently don’t have much documentation outside configure.ac / CMakeLists.txt. Happy to create a file for it.

Curious what people think in terms of docs for something like this: Should there be extensive guidance in which context this is safe to use or would users be left on their own to try it out? Or are we assuming all users that go to this length are competent enough to judge if it’s a good idea to bring their own sha256 or use the systems one?

It’s not that we’re letting people plug in whatever they want. Both introduced features have guardrails:

Compile-time: we run all current tests against the provided implementation, plus a runtime self-test.
Runtime: besides the runtime self-test, have introduced an “equivalence check” that hashes known inputs and compares them against the library’s internal implementation before accepting the external function.

So you can bring your own compression function, but it still has to prove it behaves exactly like ours bit-for-bit.

Also, the target user here is someone who’s actually written a hardware-optimized SHA256. It’s not like they stumbled into this by accident.

in include/secp256k1.h:96 in 7fefa9c851

92@@ -92,6 +93,7 @@ typedef struct secp256k1_ecdsa_signature {
93  * the message, the algorithm, the key and the attempt.
94  */
95 typedef int (*secp256k1_nonce_function)(
96+    const secp256k1_context *ctx,

real-or-random commented at 8:11 am on December 8, 2025:

Argh, that’s a breaking change. I didn’t see that coming.

But I think it can be avoided. If the user wants to pass a custom nonce function, they anyway have no way of calling our internal function (whether it uses a custom SHA256 transform or not).

So we can keep the struct here and add the context arg only to our built-in nonce function. Our ECDSA signing function (and Schnorr signing, and ECDH, etc.) will need to special-case the built-in nonce function because it has a different function signature. Not elegant but it will do the job.

~~Maybe we can come up with something nicer if we (ever) add a modern ECDSA module and deprecate the ECDSA stuff in the main secp256k1.h file.~~ edit: Probably no because this affects not just ECDSA.

furszy commented at 9:23 pm on December 15, 2025:

So we can keep the struct here and add the context arg only to our built-in nonce function. Our ECDSA signing function (and Schnorr signing, and ECDH, etc.) will need to special-case the built-in nonce function because it has a different function signature. Not elegant but it will do the job.

Done. Spent quite a bit of time thinking about this, mainly because the so-called “default” function (the one exposed in the public header here) is not the one will be used when a custom sha256 transform is provided.., which somewhat defeats the purpose of calling it “default”. Still, after a few rounds of thought and discussions with theStack and sipa, I think this is fine. It’s expected that a “different” function is used when the user provides own sha transform.

in src/hash.h:17 in 7fefa9c851

 9@@ -10,13 +10,20 @@
10 #include <stdlib.h>
11 #include <stdint.h>
12 
13+typedef void (*sha256_transform_callback)(uint32_t *state, const unsigned char *block, size_t rounds);
14+
15+/* Validate user-supplied SHA-256 transform by comparing its output against
16+ * the library's linked implementation */
17+static int secp256k1_sha256_check_transform(sha256_transform_callback fn_transform);

real-or-random commented at 8:22 am on December 8, 2025:

Do we need this function given that there’s already secp256k1_selftest_sha256(void), which is run when a context is created? The existing one seems a bit simpler than your function and it’s already there, so using that will make the diff smaller (but it will need to be modified to obtain an optional context). And perhaps the existing one is a bit slower than yours. I’d say either function is fine but keeping both seems overkill.

(The existing selftest is currently in selftest.h. That’s already wrong; it should have been put in a selftest_impl.h. But your approach is better: the SHA256 test function should be exposed from the internal hash module and then called from selftest.)

edit: Okay, I see now. It’s annoying than I thought. We expose secp256k1_selftest(void) which doesn’t get a context object… The reason it’s exposed is that you can use it when using the static context.

So I believe it makes sense to check the SHA256 at two places:

In secp256k1_selftest(void) because this will catch bugs in the built-in SHA256. This is particularly important with this PR given that it makes possible overriding the built-in implementation it at build time.
When setting a custom SHA256 transform.

This means we may perform two checks in the worst case, but that won’t be the end of the world. You get the overhead only at library initialization time. But the question remains: How should the test look like? And I still think what I said above is true: It’s okay to have a single secp256k1_sha256_selftest function in the hash module, and this can be called from the selftest module and from secp256k1_context_set_sha256_transform_callback.

furszy commented at 3:54 pm on December 16, 2025:

So I believe it makes sense to check the SHA256 at two places:

In secp256k1_selftest(void) because this will catch bugs in the built-in SHA256. This is particularly important with this PR given that it makes possible overriding the built-in implementation it at build time.

When setting a custom SHA256 transform.

This means we may perform two checks in the worst case, but that won’t be the end of the world. You get the overhead only at library initialization time. But the question remains: How should the test look like? And I still think what I said above is true: It’s okay to have a single secp256k1_sha256_selftest function in the hash module, and this can be called from the selftest module and from secp256k1_context_set_sha256_transform_callback.

Yeah, I did start from that idea originally (that’s essentially why we also call to secp256k1_selftest_sha256 inside secp256k1_context_set_sha256_transform first), but I ended up feeling that the existing selftest is a bit too weak for what we want to validate here.

It only checks a single 64-byte message, so it exercises exactly one compression round and never goes through the padding path or the buffering across multiple writes. Which felt insufficient to me. Validating the transform against the internal implementation also seemed like a reasonable extra safety net to ensure parity without adding more hardcoded values.

Structurally, I agree that having a single secp256k1_sha256_selftest living in the hash module, makes sense.

The thing now is that neither test alone fully covers both aspects: we still want a check against hardcoded outputs, but we also want to verify more realistic usage patterns. That’s why I thought having both was useful. That said, we could get similar coverage by extending the existing selftest with more hardcoded values too.. (this comes from a nice talk I had with @theStack). So.. probably we could re-use the existing one for now, and extend its coverage in a follow-up. Happy to hear your thoughts

furszy commented at 3:56 pm on December 17, 2025:

I missed to update this one. The latest push uses only secp256k1_sha256_selftest. Calling it twice as suggested.

in src/hash.h:28 in 7fefa9c851

24 } secp256k1_sha256;
25 
26-static void secp256k1_sha256_initialize(secp256k1_sha256 *hash);
27+static void secp256k1_sha256_initialize(secp256k1_sha256 *hash, sha256_transform_callback fn_transform);
28 static void secp256k1_sha256_write(secp256k1_sha256 *hash, const unsigned char *data, size_t size);
29 static void secp256k1_sha256_finalize(secp256k1_sha256 *hash, unsigned char *out32);

real-or-random commented at 8:31 am on December 8, 2025:

A potential problem with storing the callback in the secp256k1_sha256 struct is that you could have this series of events:

User sets callback to cb1.
User calls API function that calls secp256k1_sha256_initialize, which outputs some opaque object that contains the secp256k1_sha256 struct.
User sets callback go cb2.
Users passes the opaque object to another API function, which calls secp256k1_sha256_write, which calls cb1. UB.

(I believe) this cannot happen in the current code because there’s no pair of API functions that behaves in this way, but it’s a potential footgun for the future.

My suggestion is just passing the context object to every internal function that needs the SHA256 callback. (Another angle: State is annoying. We have state already in the context, so let’s try to keep it there.)

furszy commented at 9:05 pm on December 16, 2025:

My suggestion is just passing the context object to every internal function that needs the SHA256 callback.

I did started there, but didn’t really like where it was leading me. Mainly because going down that path will be significantly more invasive than the current changes: It requires changing every secp256k1_sha256_write call to receive the function arg, every hmac, tagged hash etc. It also includes changing secp256k1_sha256_finalize! which will also require the function arg because we call write during finalization too. Overall, I think the end result would be quite ugly for what we gain here (see below for more rationale).

A potential problem with storing the callback in the secp256k1_sha256 struct is that you could have this series of events:

User sets callback to cb1.

User calls API function that calls secp256k1_sha256_initialize, which outputs some opaque object that contains the secp256k1_sha256 struct.

User sets callback go cb2.

Users passes the opaque object to another API function, which calls secp256k1_sha256_write, which calls cb1. UB.

(I believe) this cannot happen in the current code because there’s no pair of API functions that behaves in this way, but it’s a potential footgun for the future.

I’m not fully convinced it’s a realistic scenario. Changing the sha256 transform callback in the middle of an ongoing procedure doesn’t really make sense in any intended usage. If someone does that, they should already be well outside the model we expect the API to be used under. To me, this is similar to deleting the callback function from another thread while a libsecp operation is executing.

Also, exposing an in-use sha256 object across API boundaries is something we should simply never do right?. That feels really ugly. These structs are only valid under the assumption that they were initialized via secp256k1_sha256_initialize; otherwise, the internal fields would already be invalid, independently of the transform callback. And we cannot ensure that the struct has been initialized properly (or not corrupted) if we share it externally. I mean, if we ever want to share an existing sha256, I would say we should share the state and buffer directly as bytes and then re-initialize the struct on the receiving side.

So overall, I think it worth documenting the expectation clearly for sure: like sha256 structs must not be shared externally, and the provided sha256 callback, must be provided during app’s init, and must remain active for the entire app lifecycle once provided (because it doesn’t make sense to change it). But I’m not sure this is a problem we need to design around. It would add other complications without a clear benefit. But let me know what you think.

furszy commented at 7:04 pm on December 17, 2025:

Maybe I should give the “pass the function to every write call” patch-set another try and compare the resulting changes. I did it early on so maybe it is not as bad as I remember it. Let me see..

real-or-random commented at 8:35 pm on December 17, 2025:

Also, exposing an in-use sha256 object across API boundaries is something we should simply never do right?. That feels really ugly.

It doesn’t feel ugly to me. Of course, we wouldn’t share it standalone but inside some other object. For example, like we expose an secp256k1_scalar inside an opaque secp256k1_musig_keyagg_cache.

And I don’t think the use case would be unrealistic. Having a “hash builder” is common, and it could very well be part of a high-level function. Just imagine an advanced schnorrsig signing object that lets you append to the message multiple times before finalizing the signatures (e.g., for cases in which you want to sign a long message whose parts you receive step by step).

real-or-random commented at 8:39 pm on December 17, 2025:

Maybe I should give the “pass the function to every write call” patch-set another try and compare the resulting changes. I did it early on so maybe it is not as bad as I remember it. Let me see..

Yes, I see that passing the function down is annoying and that it has a non-zero readability and maintenance cost. But I imagine it’s not too bad. (I tend to agree with sipa here: reviewing many obvious diff lines is not an issue.) Seeing the resulting diff would certainly help to see how bad it is.

furszy commented at 6:41 pm on December 18, 2025:

Maybe I should give the “pass the function to every write call” patch-set another try and compare the resulting changes. I did it early on so maybe it is not as bad as I remember it. Let me see..

Yes, I see that passing the function down is annoying and that it has a non-zero readability and maintenance cost. But I imagine it’s not too bad. (I tend to agree with sipa here: reviewing many obvious diff lines is not an issue.) Seeing the resulting diff would certainly help to see how bad it is.

Ok done, both options available #1777#pullrequestreview-3594435763

in src/secp256k1.c:62 in 7fefa9c851

56@@ -56,20 +57,27 @@
57     } \
58 } while(0)
59 
60+struct secp256k1_hash_context {
61+    sha256_transform_callback fn_sha256_transform;
62+};

real-or-random commented at 8:37 am on December 8, 2025:

I think it will be nice to define this in the internal hash module, just like secp256k1_ecmult_gen_context is defined in the ecmult_gen module.

furszy commented at 9:18 pm on December 16, 2025:

I think it will be nice to define this in the internal hash module, just like secp256k1_ecmult_gen_context is defined in the ecmult_gen module.

Sure. Done as suggested.

in include/secp256k1.h:428 in 7fefa9c851

423+ * In: callback: pointer to a function implementing the transform step.
424+ *               (passing NULL restores the default implementation)
425+ */
426+SECP256K1_API void secp256k1_context_set_sha256_transform_callback(
427+        secp256k1_context *ctx,
428+        void (*sha256_transform_callback)(uint32_t *state, const unsigned char *block, size_t rounds)

real-or-random commented at 8:45 am on December 8, 2025:

I was wondering whether it makes sense to allow for a real return value here to make it possible for the callback to indicate some kind of failure (e.g., couldn’t access external hardware, or malloc failed). We could then fall back to our implementation or simply call the error callback in order to crash.

My current conclusion is no: I can’t imagine this being useful in practice. (If you call external hardware, this will be super slow anyway. If you use malloc for SHA256, you’re doing it wrong.) Even if there’s some failure, the callback will still have the possibility to crash the process.

real-or-random commented at 8:55 am on December 8, 2025:

What’s the motivation for the rounds arg? (Do you think we’ll ever call this with any other value than 1?)

furszy commented at 9:29 pm on December 16, 2025:

What’s the motivation for the rounds arg? (Do you think we’ll ever call this with any other value than 1?)

I think it’s better to introduce it now, even if it’s currently always set to 1, so we don’t end up breaking the API later if we ever need it. But can remove it if you are strongly opposed to it.

real-or-random commented at 9:20 am on December 17, 2025:

so we don’t end up breaking the API later if we ever need it.

I might be missing something but I really tend to think that is too obscure.

I can’t imagine a reason why we would ever need it. Applying the transform more than once to the same data is not done within SHA256. So we’ll need this if we ever implement our own hardened hash function SHA256+ based on SHA256?
Even if it turns out that we need this functionality, we can simply call the transform in a loop. Perhaps that will come with a tiny performance disadvantage due to the overhead of the function call, but that will be acceptable.

furszy commented at 7:33 pm on December 17, 2025:

I was wondering whether it makes sense to allow for a real return value here to make it possible for the callback to indicate some kind of failure (e.g., couldn’t access external hardware, or malloc failed). We could then fall back to our implementation or simply call the error callback in order to crash.

hmm, how would we actually call the error callback from inside hash_impl.h? That would require passing the context all the way down into the sha256 sha_write and sha_finalize functions (hmac, tagged hashes, etc), which would introduce a hash_impl.h dependency on secp256k1.h (unless we move the context definition elsewhere). But this seems to be a low-level primitive file that shouldn’t receive the context.

real-or-random commented at 8:26 pm on December 17, 2025:

But this seems to be a low-level primitive file that shouldn’t receive the context.

Hm, I don’t think there is (or should be) a rule that says low-level primitives shouldn’t receive the error callback. It doesn’t need to be the full context.

But let me just resolve this conversation. As I already wrote above, I doubt that my suggestion was useful in the first place.

furszy commented at 7:05 pm on December 18, 2025:

Ok yeah, agree. Adding this just to repeat the compression on the same block doesn’t make much sense, but (idea) what about processing multiple blocks in a row? An API for that could be useful since implementations would be able to skip per-block-compression variables initialization, cleanups, etc.

sipa commented at 9:38 pm on December 18, 2025:

@furszy That makes more sense - and it’s apparently also how the Bitcoin Core transform implementations work (but call it blocks, not rounds).

furszy commented at 4:20 pm on December 19, 2025:

@furszy That makes more sense - and it’s apparently also how the Bitcoin Core transform implementations work (but call it blocks, not rounds).

ha, that explains where the idea came from. It must have been in the back of my mind.

in include/secp256k1_sha.h:1 in 75ccf99a07

real-or-random commented at 9:34 am on December 8, 2025:

What’s the motivation for exposing the SHA256 compression?

Unless I’m missing something, then if anything, I think we’d want to expose the high-level SHA256 function. I’m not sure how useful it is, and we tried to keep the library focused on ECC. On the other hand, we’ll have it anyway in the code base, and it’s required for Schnorr sigs (and we anyway expose a tagged hash for Schnorr sigs). So if you ask me, it’s fine to expose SHA256 as long as we add a comment that says that users shouldn’t expect a highly performant implementation.

But perhaps it’s better to debate the exposing in a separate PR.

furszy commented at 9:42 pm on December 16, 2025:

What’s the motivation for exposing the SHA256 compression?

Unless I’m missing something, then if anything, I think we’d want to expose the high-level SHA256 function. I’m not sure how useful it is, and we tried to keep the library focused on ECC. On the other hand, we’ll have it anyway in the code base, and it’s required for Schnorr sigs (and we anyway expose a tagged hash for Schnorr sigs). So if you ask me, it’s fine to expose SHA256 as long as we add a comment that says that users shouldn’t expect a highly performant implementation.

Actually, the motivation wasn’t so much that I expect people to use it directly. It’s more about making an explicit statement in the API: the sha256 transform is something that can be replaced or disabled if you want to (thus I thought it could be a new module). Exposing it makes that contract clear, instead of it being an internal detail that people can obscurely replace.

That said, I agree this is somewhat orthogonal topic.

real-or-random commented at 9:11 am on December 17, 2025:

It’s more about making an explicit statement in the API: the sha256 transform is something that can be replaced or disabled if you want to (thus I thought it could be a new module). Exposing it makes that contract clear, instead of it being an internal detail that people can obscurely replace.

Maybe my thinking is just too simplistic, but the only thing that exposing makes clear to me (as a user) is that I can call it, (and that there’s some kind of guarantee that future releases will contain a function with the same signature and behavior).

That said, I agree this is somewhat orthogonal topic.

I think so, yes.

What could be done to make the API docs more explicit is to point out in the docs for secp256k1_context_set_sha256_transform_callback where this is used by the library without being an implementation detail (namely Schnorr sigs, ECDH, IIRC also ellswift). Would probably suffice to say something like “e.g., Schnorr signatures and the default key derivation function for ECDH”.

in include/secp256k1.h:426 in 7fefa9c851

421+ *
422+ * Args: ctx:    pointer to a context object.
423+ * In: callback: pointer to a function implementing the transform step.
424+ *               (passing NULL restores the default implementation)
425+ */
426+SECP256K1_API void secp256k1_context_set_sha256_transform_callback(

real-or-random commented at 9:38 am on December 8, 2025:

Naming: I wonder whether it makes sense to avoid the word callback here. Sure, it’s a callback, but whenever we said callback in the past 10 years of this library, it was about error handling. We even have type secp256k1_callback (which probably should have been called secp256k1_error_callback but it’s too late now.)

Maybe we can just call it “function " and “function pointer”, like we do in the bring-your-own-nonce-function interfaces.

furszy commented at 9:19 pm on December 16, 2025:

Naming: I wonder whether it makes sense to avoid the word callback here. Sure, it’s a callback, but whenever we said callback in the past 10 years of this library, it was about error handling. We even have type secp256k1_callback (which probably should have been called secp256k1_error_callback but it’s too late now.)

Sure. Renamed to secp256k1_context_set_sha256_transform 👍🏼 .

in src/secp256k1.c:232 in 7fefa9c851

228@@ -220,6 +229,18 @@ void secp256k1_context_set_error_callback(secp256k1_context* ctx, void (*fun)(co
229     ctx->error_callback.data = data;
230 }
231 
232+void secp256k1_context_set_sha256_transform_callback(secp256k1_context *ctx, void (*sha256_transform_callback_)(uint32_t *state, const unsigned char *block, size_t rounds)) {

real-or-random commented at 9:42 am on December 8, 2025:

It will be nice to have a typedef for the function type in the public header.

real-or-random commented at 7:48 pm on December 19, 2025:

If I’m not mistaken, this comment is still unaddressed.

Let me add that the typedef is a good place to document the behavior of the function (e.g., the blocks arg). The naming of the args should be reconsidered because now block may point to more than one block. Maybe (..., const unsigned char *blocks64, size_t n_blocks).

furszy commented at 8:12 pm on December 19, 2025:

If I’m not mistaken, this comment is still unaddressed.

Let me add that the typedef is a good place to document the behavior of the function (e.g., the blocks arg). The naming of the args should be reconsidered because now block may point to more than one block. Maybe (..., const unsigned char *blocks64, size_t n_blocks).

Done, pushed. Sorry for missing this one. Most of the function docs were inside the build-time feature commit. In the module definition; Moved what had here and renamed the args.

in configure.ac:141 in 7fefa9c851

133@@ -134,6 +134,12 @@ SECP_TRY_APPEND_DEFAULT_CFLAGS(SECP_CFLAGS)
134 ### Define config arguments
135 ###
136 
137+AC_ARG_WITH([external-sha256],
138+  AS_HELP_STRING([--with-external-sha256=PATH], [use external SHA256 compression implementation]),
139+  [external_sha256_path="$withval"],
140+  [external_sha256_path=""]
141+)

real-or-random commented at 10:43 am on December 8, 2025:

Regarding the build stuff for the build-time override: We already have the possibility to have build-time overrides for the error callbacks, e.g., see AC_ARG_ENABLE(external_default_callbacks, below.

When it comes to the implementation of how the new override is configured and performed, I think it’s similar to the existing overrides for the callback. (Except that these don’t allow configuring the header to be included, which makes them rather useless… #1461)

So this will need a rework anyway [1]. This belongs to a different PR, of course, but I think the SHA override here is a good opportunity to think about a better design of a compile-time override, so we may want to iterate on this a bit more than expected. No matter how it’s designed, I think we should, in the long run, have only one (non-deprecated) mechanism for a build-time override of a function.

[1] I don’t think what I did for the error callbacks great, and I guess it would be okay to change it (I doubt that many users rely on this). Or even better, we could deprecate it and simply provide ways to provide build-time overrides for standard library symbols like printf, stderr, and abort. See for example the approach taken by mbedTLS (in the past ?) which is built with embedded systems in mind: https://github.com/Mbed-TLS/mbedtls/blob/550a18d4d6b29e62b6824201d8a49b8224e61c97/tf-psa-crypto%2Fdrivers%2Fbuiltin%2Finclude%2Fmbedtls%2Fplatform.h They provide macros for overriding standard library functions. So you have full flexibility as a user: if you’ve implemented a function that aborts, but it’s not called abort() (perhaps because that is a reserved name) then you can link it to the library also with a different name.

furszy commented at 4:21 pm on December 17, 2025:

Hmm, I see. Those are very good points. Agree this worth a deeper discussion.

No problem splitting the changes. We can keep this PR focused on the runtime feature and handle the build-time override design separately. The only change we need from the build-time commit for this PR is exposing the transform function in hash.h, so we can initialize the context default function.

real-or-random commented at 8:23 pm on December 17, 2025:

We can keep this PR focused on the runtime feature and handle the build-time override design separately

Right, I hadn’t considered this option. That’s a good idea. (And I see that sipa suggested the same below.)

in configure.ac:137 in 7fefa9c851

133@@ -134,6 +134,12 @@ SECP_TRY_APPEND_DEFAULT_CFLAGS(SECP_CFLAGS)
134 ### Define config arguments
135 ###
136 
137+AC_ARG_WITH([external-sha256],

real-or-random commented at 10:43 am on December 8, 2025:

nit: I think this one should also be an enable-style configure option instead of a with-style configure options; with is for compiling with external packages e.g., with-libxyz. And what the user provides here is not a package.

real-or-random commented at 10:44 am on December 8, 2025: contributor

Concept ACK, great to see a PR for this!

real-or-random added the label feature on Dec 8, 2025

real-or-random added the label build on Dec 8, 2025

real-or-random added the label performance on Dec 8, 2025

real-or-random added the label needs-changelog on Dec 8, 2025

hebasto commented at 4:24 pm on December 15, 2025: member

Concept ACK.

furszy force-pushed on Dec 16, 2025

furszy commented at 9:48 pm on December 16, 2025: member

Update pushed, thanks for the deep review real_or_random! I haven’t fully finished addressing all the comments yet, but I’ll hopefully finish tomorrow. So many good topics.

sipa commented at 7:44 pm on December 17, 2025: contributor

A few high-level comments:

I don’t think we should do link-time plugging in the same PR as runtime plugging, and when we do, I don’t think it should be done through the “module” abstractions. We use modules for exporting functionality from the library, not for offering replacement of internal dependencies (I find it strange to have base functionality depend on a module!).
“round” is a term of art in cipher design, and SHA-256 has exactly 64 rounds. Using the same term with another meaning is confusing.
I would try hard to avoid storing the transformation function pointer inside the secp256k1_sha256 struct. Reviewing many lines that just add an extra parameter being pass through is not hard.

furszy commented at 6:31 pm on December 18, 2025: member

Per feedback, this is how it would look if we pass the function everywhere https://github.com/furszy/secp256k1/commit/8149fc119071c532e9d2d6c3b633af7a37e5a0b8 vs how it looks when the function lives in the sha256 struct (an updated version of what we currently have here): https://github.com/furszy/secp256k1/commit/eb570826371f42a5bd320f8e43800281b1f63b0a

Let me know what you guys think.

sipa commented at 9:41 pm on December 18, 2025: contributor

@furszy From a quick glance, the passing of the function everywhere is fine I think.

I’d suggest passing a pointer to secp256k1_hash_context to the sha256 functions rather than the function pointer itself. That makes it easier to (if ever) add additional fields to that context.

furszy force-pushed on Dec 19, 2025

furszy commented at 5:58 pm on December 19, 2025: member

Updated per feedback. We now pass the hash context to all functions that need it, without storing any state inside the sha256 struct. The API arg “rounds” has also been renamed to “blocks” to prevent confusion with existing terms. Also, the commit introducing the compile-time feature has been removed for inclusion at a later stage (see #1777 (review) for more details).

real-or-random commented at 7:49 pm on December 19, 2025: contributor

Nice, I hope I’ll find the time have another look soon

furszy force-pushed on Dec 19, 2025

real-or-random commented at 8:08 am on December 23, 2025: contributor

Nice, I hope I’ll find the time have another look soon

Sorry, I didn’t manage to find the time before the holidays…

in include/secp256k1.h:408 in 041a6217fa

403@@ -404,6 +404,49 @@ SECP256K1_API void secp256k1_context_set_error_callback(
404     const void *data
405 ) SECP256K1_ARG_NONNULL(1);
406 
407+/**
408+ * SHA256 block compression raw primitive.

real-or-random commented at 10:10 am on January 6, 2026:

0 * A pointer to a function that  implementation of SHA256's internal compression function

“A pointer to” is a bit more in line with the other function pointer types. (" * A pointer to a function that implements SHA256’s internal compression function" would be even more in line, but it’s > 80 chars and duplicates “function”)
I’d call it “internal” rather than “raw” because that’s more precise (to me, at least).
The typical name of the thing is “(one-way) compression function”, see https://en.wikipedia.org/wiki/One-way_compression_function

furszy commented at 8:02 pm on January 13, 2026:

Agree. Done as suggested. Thanks

in include/secp256k1.h:412 in 041a6217fa

403@@ -404,6 +404,49 @@ SECP256K1_API void secp256k1_context_set_error_callback(
404     const void *data
405 ) SECP256K1_ARG_NONNULL(1);
406 
407+/**
408+ * SHA256 block compression raw primitive.
409+ *
410+ * This function performs the SHA256 compression on one or more contiguous
411+ * 64-byte message blocks. It does **not** perform padding, message scheduling
412+ * across blocks, or length encoding.

real-or-random commented at 10:21 am on January 6, 2026:

The function pointed to consumes one or more contiguous 64-byte message blocks and updates the internal SHA256 state accordingly. The function is not responsible for counting consumed blocks or bytes, and it is not responsible for performing padding.

What is “message scheduling across blocks”?
“length encoding” is part of padding, no?

furszy commented at 4:42 pm on January 13, 2026:

What is “message scheduling across blocks”?

Tried to state that the function operates only on complete 64-byte blocks and does not retain or combine leftover data between calls. E.g. calling it twice with two 32-byte inputs will not buffer them into a single 64-byte block; the function expects fully formed blocks and anything else is caller misuse. Probably just saying that the function only accepts 64-byte finalized blocks is enough and better.

“length encoding” is part of padding, no?

Yeah, I tend to think of padding as just the zero fill.., but specs wise this is wrong. The msg length is officially part of the padding.

furszy commented at 8:02 pm on January 13, 2026:

Done, updated per suggestion.

in include/secp256k1.h:419 in 041a6217fa

414+ * Parameters:
415+ * - `state`    Pointer to eight 32-bit words representing the current state.
416+ *              The state is updated in place.
417+ * - `blocks64` Pointer to the first 64-byte block; additional blocks must be
418+ *              contiguous in memory.
419+ * - `n_blocks` Number of contiguous 64-byte blocks to process.

real-or-random commented at 10:23 am on January 6, 2026:

nits:

Formatting should be consistent with other docs for function pointer types, see for example secp256k1_nonce_function. That includes s/Parameters/Args
s/current state/current internal state

furszy commented at 8:02 pm on January 13, 2026:

Done as suggested. Thanks

in include/secp256k1.h:448 in 041a6217fa outdated

443+ *               (passing NULL restores the default implementation)
444+ */
445+SECP256K1_API void secp256k1_context_set_sha256_transform(
446+        secp256k1_context *ctx,
447+        secp256k1_sha256_transform_fn fn_transform
448+) SECP256K1_ARG_NONNULL(1);

real-or-random commented at 1:24 pm on January 6, 2026:

We should use the same terminology here, too.

I’d vote for “compression function” instead of “transform”. It’s the term you’d find in the cryptographic literature. Some APIs call the state update “update”, some call it “transform”. But in either case, they refer to updating the entire internal state including the byte/block count, which is not what we’re doing here. Moreover, “SHA256 transform” could be misread as (the whole) “SHA256 function”. So I believe “compression function” is the most precise term for our purposes.

furszy commented at 8:03 pm on January 13, 2026:

Yeah sure, updated to use “compression function” everywhere.

in src/modules/ecdh/main_impl.h:64 in 041a6217fa

59@@ -60,7 +60,12 @@ int secp256k1_ecdh(const secp256k1_context* ctx, unsigned char *output, const se
60     secp256k1_fe_get_b32(x, &pt.x);
61     secp256k1_fe_get_b32(y, &pt.y);
62 
63-    ret = hashfp(output, x, y, data);
64+    if (hashfp == NULL) {
65+        /* Use context-aware function by default */

real-or-random commented at 1:29 pm on January 6, 2026:

nit (multiple times):

0        /* Use ctx-aware function by default */

Or “secp256k1_context-aware”. This makes it clear that the comment refers to the context object. I was confused first because “context-aware” is an ordinary English word used in many contexts in which “context” has a much less specific meaning.

furszy commented at 8:03 pm on January 13, 2026:

Sure, done as suggested.

in src/modules/ellswift/main_impl.h:584 in 041a6217fa

579@@ -572,8 +580,15 @@ int secp256k1_ellswift_xdh(const secp256k1_context *ctx, unsigned char *output,
580     secp256k1_fe_normalize(&px);
581     secp256k1_fe_get_b32(sx, &px);
582 
583-    /* Invoke hasher */
584-    ret = hashfp(output, sx, ell_a64, ell_b64, data);
585+    /* Invoke hasher.
586+     * Use context-aware function when custom SHA256 is provided */

real-or-random commented at 1:31 pm on January 6, 2026:

0     * Use ctx-aware function by default

This may use the context-aware function even if no custom SHA256 (transform) is provided.

furszy commented at 8:04 pm on January 13, 2026:

Done as suggested.

in src/testutil.h:20 in 041a6217fa

15+#define DEFINE_SHA256_TRANSFORM_PROBE(name)                                     \
16+    static int name##_called = 0;                                               \
17+    static void name(uint32_t *s, const unsigned char *msg, size_t rounds) {    \
18+        name##_called = 1;                                                      \
19+        secp256k1_sha256_transform(s, msg, rounds);                             \
20+    }

real-or-random commented at 1:45 pm on January 6, 2026:

I believe what we actually want to test here is not that the custom SHA256 was called but that the ctx-aware function was called. If I’m right, this may be simpler to test: The ctx-aware function will always be the “default” function, e.g., the default ECDH hash function. So shouldn’t it suffice to check correctness of ECDH once with the default function (NULL) and once with a custom hash function (not custom SHA256 transform)?

furszy commented at 7:48 pm on January 13, 2026:

what you mean with “custom hash function (not custom SHA256 transform)”?

The idea of the test was to verify that:

Default behavior remains unchanged when the ctx compression function is not set.
The custom compression function is actually called when provided. And because such function is correct, the complete ECDH test passes (and the same holds for the other processes that rely on the ctx-provided compression function)

Update: I think I understand your point now.. changing the compression function to something that isn’t SHA256 shouldn’t break the test. So this could be simplified. However, it still wouldn’t pass the secp256k1_selftest_sha256 check during the callback set.

real-or-random commented at 8:02 pm on January 22, 2026:

I think I understand your point now..

I don’t think so, but I also think my point was mistaken. Let me resolve this conversation and start a new one. :D

real-or-random commented at 1:46 pm on January 6, 2026: contributor

Approach ACK

furszy force-pushed on Jan 13, 2026

furszy commented at 8:04 pm on January 13, 2026: member

Updated per feedback. Thanks @real-or-random!

in include/secp256k1.h:413 in fd7bf49500

408+ *
409+ * This function processes one or more contiguous 64-byte message blocks and
410+ * updates the internal SHA256 state accordingly. The function is not responsible
411+ * for counting consumed blocks or bytes, nor for performing padding.
412+ *
413+ * Out:     state:     pointer to eight 32-bit words representing the current internal state;

sipa commented at 8:09 pm on January 13, 2026:

This should probably be labelled “In/Out”.

furszy commented at 9:39 pm on January 13, 2026:

upps, yeah. Thanks.

in include/secp256k1.h:415 in fd7bf49500

410+ * updates the internal SHA256 state accordingly. The function is not responsible
411+ * for counting consumed blocks or bytes, nor for performing padding.
412+ *
413+ * Out:     state:     pointer to eight 32-bit words representing the current internal state;
414+ *                     the state is updated in place.
415+ * In:      blocks64:  pointer to the first 64-byte block; additional blocks must

sipa commented at 8:11 pm on January 13, 2026:

“Pointer to concatenation of n_blocks blocks, of 64 bytes each”. Maybe also worth mentioning that there is no alignment guarantee on the input?

furszy commented at 9:39 pm on January 13, 2026:

sure, added.

in include/secp256k1.h:431 in fd7bf49500

426+ * Set a callback function to override the internal SHA256 compression function.
427+ *
428+ * This installs a function to replace the built-in block-compression
429+ * step used by the library's internal SHA256 implementation.
430+ * The provided callback must be functionally identical (bit-for-bit)
431+ * to the default compression function. Any deviation will cause incorrect

sipa commented at 8:14 pm on January 13, 2026:

It seems hard to a reader of this file to guess what the exact behavior of the default compression function is without specifying it.

How about “The provided callback must exactly implement the effect of n_blocks repeated applications of the SHA-256 compression function.”

furszy commented at 9:42 pm on January 13, 2026:

sure. Done as suggested.

in include/secp256k1.h:437 in fd7bf49500

432+ * results and undefined behavior.
433+ *
434+ * This API exists to support environments that wish to route the
435+ * SHA256 compression step through a hardware-accelerated or otherwise
436+ * specialized implementation. It is not meant for modifying the
437+ * semantics of SHA256.

sipa commented at 8:16 pm on January 13, 2026:

Maybe “It is not meant for replacing the use of SHA-256 in this library with a modified hash function”?

Modifying the semantics of SHA-256 sounds hard, as it may involve $5 wrench attacks on a few NIST officials.

furszy commented at 9:43 pm on January 13, 2026:

hehe sure, thanks.

real-or-random commented at 1:46 pm on January 14, 2026:

Good point. A bit simpler even: “It is not meant for replacing SHA-256 with a different hash function”.

in src/modules/schnorrsig/main_impl.h:52 in fd7bf49500

48@@ -49,7 +49,7 @@ static const unsigned char bip340_algo[] = {'B', 'I', 'P', '0', '3', '4', '0', '
49 
50 static const unsigned char schnorrsig_extraparams_magic[4] = SECP256K1_SCHNORRSIG_EXTRAPARAMS_MAGIC;
51 
52-static int nonce_function_bip340(unsigned char *nonce32, const unsigned char *msg, size_t msglen, const unsigned char *key32, const unsigned char *xonly_pk32, const unsigned char *algo, size_t algolen, void *data) {
53+static int nonce_function_bip340_impl(const secp256k1_context *ctx, unsigned char *nonce32, const unsigned char *msg, size_t msglen, const unsigned char *key32, const unsigned char *xonly_pk32, const unsigned char *algo, size_t algolen, void *data) {

sipa commented at 8:35 pm on January 13, 2026:

In these calls (and many similar ones in other parts of the codebase), I think it would be better to have them take a const secp256k1_sha256_context *ctx argument. That would:

Make it clear they’re not actually dependent on other context data.
Avoid the many secp256k1_get_hash_context(ctx) calls in their implementation.
Maybe provide better inlining opportunities for the compiler when invoked with the default transform context as argument, rather than needing to reason through secp256k1_context_static (unsure).

If somehow there is need later for them to access more context information, the argument can always be changed to secp256k1_context as they’re internal only.

furszy commented at 5:20 pm on January 14, 2026:

I actually went back and forth with this topic a few times. But sure. Done as suggested.

real-or-random commented at 9:55 am on January 22, 2026:

All of these comments are related to style, so take these as suggestions but not “hard” requests:

It’s probably a bit arbitrary to draw a line between functions that get the full context and functions that get only the hash context. My feeling is that it should be the hash context when the primary purpose of the function is hashing, and this roughly seems to match what you’ve done, but we could go a bit further even and include functions like secp256k1_schnorrsig_challenge, secp256k1_ellswift_prng, secp256k1_nonce_function_musig/secp256k1_nonce_function_musig_helper, secp256k1_musig_compute_noncehash, nonce_function_rfc6979_impl.
The argument is now sometimes called hash_ctx and sometimes ctx. I prefer the former to distinguish it clearly from the “full” context. It could also be hctx for brevity.
If we feel that secp256k1_get_hash_context(ctx) is a bit long, we could also rename it to secp256k1_get_hash_ctx(ctx) or _get_hctx depending on the previous point. I’d also be fine with &ctx->hash_ctx or &ctx->hctx. Given that we use it many times, it will be an idiom already once this PR is merged. (Also fine with keeping as-is if you like the long name!)
Argument order: We usually have the ctx first. This is a rule for API calls, and we typically follow it internally, too. So it would make sense to have the hash context also first, I think. If I’m not mistaken, you followed this rule for all functions except for the secp256k1_sha256_ functions. (Was this intentional?)

furszy commented at 4:11 pm on January 22, 2026:

If we feel that secp256k1_get_hash_context(ctx) is a bit long, we could also rename it to secp256k1_get_hash_ctx(ctx) or _get_hctx depending on the previous point. I’d also be fine with &ctx->hash_ctx or &ctx->hctx. Given that we use it many times, it will be an idiom already once this PR is merged. (Also fine with keeping as-is if you like the long name!)

I considered renaming it as well, but wanted to keep consistency with the setter naming. All existing callback setters are named secp256k1_context_set_<something>, so the getters “should” be secp256k1_context_get_<something>.

That was my conclusion, though I’m fine shortening the getter name if consistency over this two functions isn’t a concern.

furszy commented at 7:55 pm on January 22, 2026:

All suggestions applied (except the one above). Even when it was a mechanical change, took me more than what I expected to tackle them 🐜

real-or-random commented at 8:01 pm on January 22, 2026:

That makes perfect sense. If a function seems too cluttered with getter calls, we still have the options to store the hash context in a local variable, or move the getter call into the caller.

real-or-random commented at 9:00 pm on January 22, 2026:

All suggestions applied (except the one above). Even when it was a mechanical change, took me more than what I expected to tackle them 🐜

thanks for your stamina!

in src/hash_impl.h:144 in fd7bf49500

140     size_t bufsize = hash->bytes & 0x3F;
141     hash->bytes += len;
142     VERIFY_CHECK(hash->bytes >= len);
143+    VERIFY_CHECK(ctx != NULL);
144+    VERIFY_CHECK(ctx->fn_sha256_compression != NULL);
145     while (len >= 64 - bufsize) {

sipa commented at 8:45 pm on January 13, 2026:

There is an easy way to take advantage of the fact that the compression function can deal with multiple blocks now, by splitting the processing in 3 steps:

if (bufsize && len >= 64 - bufsize): fill the buffer, and invoke compression function on the buffer, wipe the buffer.
if (len >= 64): invoke the compression function directly on the input, with len / 64 blocks.
if (len >= 0): append to the buffer, don’t invoke.

Feel free to do here or in a follow-up.

furszy commented at 3:24 pm on January 14, 2026:

nice idea! The last if (len >= 0) should be if (len > 0) but rest looks great. Applied. Also added test coverage for it.

furszy force-pushed on Jan 14, 2026

in src/tests.c:508 in 8b45a8ff58

503+        secp256k1_sha256_write(&sha256_two, data32, 32, &hash_ctx);
504+        memcpy(data32, data + 32, 32);
505+        secp256k1_sha256_write(&sha256_two, data32, 32, &hash_ctx);
506+        secp256k1_sha256_finalize(&sha256_two, out_two, &hash_ctx);
507+
508+        CHECK(memcmp(out_one, out_two, 32) == 0);

real-or-random commented at 8:26 am on January 22, 2026:

It’s probably debatable now (somewhere else), but CONTRIBUTING.md says you should use secp256k1_memcmp_var instead of memcmp.

(4 times in this function.)

furszy commented at 7:55 pm on January 22, 2026:

Sure, done as suggested.

in src/hash_impl.h:156 in 8b45a8ff58

154         len -= chunk_len;
155         ctx->fn_sha256_compression(hash->s, hash->buf, 1);
156         bufsize = 0;
157     }
158+
159+    /* If we still have data to process; Invoke compression directly on the input */

real-or-random commented at 8:29 am on January 22, 2026:

nit: “If we still have data to process, invoke compression directly on the input” (why a semi-colon?)

furszy commented at 8:02 pm on January 22, 2026:

Done as suggested. Dunno how the semicolon and the capitalized “invoke” got there.. I probably had two different sentences and merged them at the end in the worst possible way.

real-or-random commented at 8:31 am on January 22, 2026: contributor

two nits on the last commit (I like that one!)

in src/secp256k1.c:68 in 8b45a8ff58 outdated

62@@ -63,13 +63,16 @@ struct secp256k1_context_struct {
63     secp256k1_callback illegal_callback;
64     secp256k1_callback error_callback;
65     int declassify;
66+
67+    secp256k1_hash_context hash_context;
68 };

real-or-random commented at 10:07 am on January 22, 2026:

You overlooked the comment. :)
The naming should probably be hash_ctx (or even hctx you prefer this) for consistency with the other member ecmult_gen_ctx.
And I believe we could use a more natural order and move hash_context after ecmult_gen_ctx. (I hate to think about ABI compatibility, but this shouldn’t be a problem here at all because contexts are entirely opaque)?

furszy commented at 8:04 pm on January 22, 2026:

You overlooked the comment. :)

upps, done. Thanks.

The naming should probably be hash_ctx (or even hctx you prefer this) for consistency with the other member ecmult_gen_ctx.

And I believe we could use a more natural order and move hash_context after ecmult_gen_ctx. (I hate to think about ABI compatibility, but this shouldn’t be a problem here at all because contexts are entirely opaque)?

Sure. Done as suggested.

furszy force-pushed on Jan 22, 2026

furszy renamed this:
~~Make SHA256 compression pluggable~~
Make SHA256 compression runtime pluggable
on Jan 22, 2026

in src/modules/ecdh/tests_impl.h:105 in 82519578a6 outdated

108+    /* Re-run using a ctx-provided SHA256 compression */
109+    secp256k1_context_set_sha256_compression(ctx, sha256_ecdh);
110+    test_ecdh_generator_basepoint_impl(ctx);
111+    CHECK(sha256_ecdh_called);
112+    secp256k1_context_destroy(ctx);
113+}

real-or-random commented at 8:58 pm on January 22, 2026:

What is the point of this test? The baseline run makes sense, of course, but it already exists on master.

For the re-run, I’m not entirely sure what it is supposed to test.

The first thing it tests is that test_ecdh_generator_basepoint_impl still passes when an external SHA256 compression function is set. But that’s not very surprising given that the “external” function simply delegates to the internal one. I don’t think that this is a very meaningful test.
The other thing it tests is that the custom function was actually called. This is not entirely a meaningful test because the test function itself test_ecdh_generator_basepoint_impl has secp256k1_sha256_write and secp256k1_finalize calls. So we’re testing the test function here and not secp256k1_ecdh.

I suspect you had the second item in mind, but simply overlooked the internal sha256 calls?

Anyway, I believe a test that this PR should add is:

If an external compression function is set, then secp256k1_ecdh(..., NULL, ...) calls the external compression. We could also add this:
If the external compression function is removed (set to NULL) afterwards, then secp256k1_ecdh(..., NULL, ...) won’t call the external compression again. Not sure if it adds much.

And I suspect that’s enough in the case of ECDH. Another new thing is the branch on hashfp == NULL and we want to make sure to test both sides of it. The desired test 1 I described above exercises the “then” branch. And the “else” branch should already be covered by existing tests (and hashfp function won’t have access to a ctx, so whether it uses an external or internal compression is not a meaningful question).

And I guess what I said here will apply more or less to the tests in the other modules as well.

edit: I still have to look at your most recent updates, but let me say that this one here is my last comment, so everything else is ACK and we’re almost ready to go from my side.

furszy commented at 7:13 pm on January 23, 2026:

Done. Ended up skipping the SHA256 output sanity check during setup so that we can modify the compression function and assert that the results from the default compression versus the ctx-provided one are different.

I know I could have simplified DEFINE_SHA256_TRANSFORM_PROBE even more by not calling the internal function at all and always returning a random or fixed value, but I think behaving like a real one-way compression function is better for testing.

furszy force-pushed on Jan 23, 2026

furszy commented at 7:02 pm on January 23, 2026: member

Updated per feedback regarding the tests. Thanks, real-or-random!

I also ended up reorganizing the PR slightly differently:

The first commit introduces the hash context with all the mechanical changes.
The second commit adds the API to allow supplying the compression function (and new tests).
And the third commit implements the multi-block compression speedup.

real-or-random commented at 7:57 pm on January 23, 2026: contributor

CI has some segfaults unfortunately. Here’s the relevant snippet from valgrind:

 0  ==3305== Invalid read of size 4
 1  ==3305==    at 0x1237F7: secp256k1_read_be32 (util.h:419)
 2  ==3305==    by 0x1237F7: secp256k1_sha256_transform_impl (hash_impl.h:48)
 3  ==3305==    by 0x1237F7: secp256k1_sha256_transform (hash_impl.h:128)
 4  ==3305==    by 0x110EA9: secp256k1_sha256_write (hash_impl.h:160)
 5  ==3305==    by 0x11CE35: ellswift_xdh_hash_function_prefix_impl (main_impl.h:501)
 6  ==3305==    by 0x11CE35: secp256k1_ellswift_xdh (main_impl.h:587)
 7  ==3305==    by 0x1222D6: ellswift_xdh_ctx_sha256_tests (tests_impl.h:450)
 8  ==3305==    by 0x122ECE: run_sequential (unit_test.c:275)
 9  ==3305==    by 0x122ECE: tf_run (unit_test.c:464)
10  ==3305==    by 0x122ECE: main (???:8016)
11  ==3305==  Address 0x0 is not stack'd, malloc'd or (recently) free'd

furszy commented at 8:16 pm on January 23, 2026: member

CI has some segfaults unfortunately. Here’s the relevant snippet from valgrind:

 0  ==3305== Invalid read of size 4
 1  ==3305==    at 0x1237F7: secp256k1_read_be32 (util.h:419)
 2  ==3305==    by 0x1237F7: secp256k1_sha256_transform_impl (hash_impl.h:48)
 3  ==3305==    by 0x1237F7: secp256k1_sha256_transform (hash_impl.h:128)
 4  ==3305==    by 0x110EA9: secp256k1_sha256_write (hash_impl.h:160)
 5  ==3305==    by 0x11CE35: ellswift_xdh_hash_function_prefix_impl (main_impl.h:501)
 6  ==3305==    by 0x11CE35: secp256k1_ellswift_xdh (main_impl.h:587)
 7  ==3305==    by 0x1222D6: ellswift_xdh_ctx_sha256_tests (tests_impl.h:450)
 8  ==3305==    by 0x122ECE: run_sequential (unit_test.c:275)
 9  ==3305==    by 0x122ECE: tf_run (unit_test.c:464)
10  ==3305==    by 0x122ECE: main (???:8016)
11  ==3305==  Address 0x0 is not stack'd, malloc'd or (recently) free'd

yep, was working on it. Nothing serious. See #1806. Will push the small fix here shortly.

furszy force-pushed on Jan 23, 2026

furszy commented at 10:54 pm on January 23, 2026: member

CI green again. Ready to go.

in include/secp256k1.h:439 in a1d98f2025

434+ * SHA256 compression step through a hardware-accelerated or otherwise
435+ * specialized implementation. It is NOT meant for replacing SHA256
436+ * with a different hash function.
437+ *
438+ * Args:    ctx:       pointer to a context object.
439+ * In:      callback:  pointer to a function implementing the compression function;

theStack commented at 5:22 pm on February 2, 2026:

0 * In:      fn_compression:  pointer to a function implementing the compression function;

furszy commented at 6:23 pm on February 2, 2026:

Done. Thanks.

in src/tests.c:7422 in 7ec96fce68

7418@@ -7413,6 +7419,7 @@ static void test_ecdsa_wycheproof(void) {
7419     #include "wycheproof/ecdsa_secp256k1_sha256_bitcoin_test.h"
7420 
7421     int t;
7422+    const secp256k1_hash_ctx* hash_ctx = secp256k1_get_hash_context(CTX);

theStack commented at 5:26 pm on February 2, 2026:

coding style nit (as per conventions):

0    const secp256k1_hash_ctx *hash_ctx = secp256k1_get_hash_context(CTX);

furszy commented at 6:23 pm on February 2, 2026:

Done. Thanks.

theStack commented at 5:39 pm on February 2, 2026: contributor

Concept ACK

I extended your example usage commit in Bitcoin Core to use the SHA256 HW acceleration also for signature verification functions (currently they wouldn’t used it since we pass the static context) and based in on the new P2TR schnorr verification benchmark PR https://github.com/bitcoin/bitcoin/pull/34472: https://github.com/theStack/bitcoin/commit/296ec21c0e9b3612f07295e54b7ca2b2c1b10a19

The relevant benchmarks can be executed via:

0$ ./build/bin/bench_bitcoin -filter=SignTransaction.*\|VerifyScript.*\|ConnectBlock.*

Unfortunately the benchmarks results fluctuate too much on the machine I’m currently working on, so posting the results wouldn’t have much meaning yet. Will try to establish a stable environment and come back to that later. Maybe it is useful for other reviewers already.

furszy force-pushed on Feb 2, 2026

sipa commented at 2:40 pm on February 19, 2026: contributor

Seems like this needs rebase?

furszy force-pushed on Feb 19, 2026

furszy commented at 3:12 pm on February 19, 2026: member

Seems like this needs rebase?

Done, thanks. This repo changes too fast.

furszy force-pushed on Feb 28, 2026

furszy commented at 3:26 pm on February 28, 2026: member

Rebased post #1825 merge. Conflicts solved.

real-or-random commented at 1:55 pm on March 2, 2026: contributor

ACK d6d14d84d07148d91479bdb4c1cf41a8d7231f9a mod the commit message

I think the commit message of the main commit is a bit outdated when it says this:

“This is useful in setups where the optimal implementation is detected dynamically, where rebuilding the library is not possible, or when the compression function is not written in bare C89.”

It took me a while to figure out what you want to say here: this was written when the PR still also had a static override, and these are scenarios in which a static override isn’t suitable. I think this can just be dropped.

Except for this, I have two remaining comments, but I really prefer to see this PR merged sooner rather than later, and these things may totally be addressed in follow-up PRs:

Getter `secp256k1_get_hash_context`

The long getter name still looks a bit weird to me. It makes it hard for the eye to parse lines such as this:

0secp256k1_sha256_write(secp256k1_get_hash_context(ctx), &hash, seckey32, 32);

I admit I contributed to this by suggesting that the hash context object should go first. After reconsidering this, I prefer what we do for the ecmult_gen context:

0secp256k1_ecmult_gen(&ctx->ecmult_gen_ctx, &nonce_ptj[i], &k[i]);

In fact, there are a few calls that just read ctx instead of &ctx->ecmult_gen_ctx. These are certainly code smells because they rely on the order of members in the struct….

In a potential follow-up PR, we could make the way of getting the subcontext consistent across ecmult_gen and hash consistent and also solve these smells. As said above, I’d prefer &ctx->... which is short, and direct accesses to field members are not uncommon in C. The setters still make sense because they validate the args. I don’t insist on that solution, but I think we should at least make it consistent across modules.

An entirely different option will be to pass just ctx and do the unfolding only inside the hash and ecmult_gen modules. But perhaps this makes testing and other stuff harder: now you’ll need a full context to call these functions.

Self-test

I’m not entirely happy with what this PR does to the selftest module. (Again, the current code may have been based on my previous feedback – sorry, if that’s the case.)

At first glance, it feels a bit weird that the API call secp256k1_selftest now uses the built-in implementation even if the caller has set an override. This may be a questionable scenario anyway (secp256k1_selftest is there only for the static context), but nothing in the API says you can’t call it with a proper context. And if you think about it, the behavior in this PR actually makes sense: The docs of secp256k1_selftest say that this function checks for wrong endianness builds, etc., so we probably want to always use the built-in SHA256 implementation in the self-test.

But with that in mind, the function secp256k1_selftest_sha256 should live in the hash module (perhaps with the name secp256k1_sha256_selftest, and the selftest module should call it deliberately (i.e., with an explanatory comment) with the built-in hash implementation.

edit: This may also be an opportunity to split selftest module in selftest.h and selftest_impl.h like every other module.

edit edit: Or get rid of it if only one function remains left in there…

sipa commented at 3:06 pm on March 2, 2026: contributor

ACK d6d14d84d07148d91479bdb4c1cf41a8d7231f9a

I agree with @real-or-random’s comments above, though they seem minor. I’m happy to do a quick re-review if you should choose to address any of them.

in src/secp256k1.c:227 in d6d14d84d0 outdated

222@@ -220,6 +223,21 @@ void secp256k1_context_set_error_callback(secp256k1_context* ctx, void (*fun)(co
223     ctx->error_callback.data = data;
224 }
225 
226+void secp256k1_context_set_sha256_compression(secp256k1_context *ctx, secp256k1_fn_sha256_compression fn_compression) {
227+    VERIFY_CHECK(ctx != NULL);

w0xlt commented at 6:23 pm on March 2, 2026:

Is there a specific reason to use VERIFY_CHECK(ctx != NULL); instead of ARG_CHECK_VOID(ctx != secp256k1_context_static); considering the function is writing to the context ?

furszy commented at 10:16 pm on March 2, 2026:

Is there a specific reason to use VERIFY_CHECK(ctx != NULL); instead of ARG_CHECK_VOID(ctx != secp256k1_context_static); considering the function is writing to the context ?

Not really. Added the check. Thanks.

real-or-random commented at 6:47 am on March 3, 2026:

Oh, that’s a good point. But then the right way is this:

0VERIFY_CHECK(ctx != NULL);
1ARG_CHECK_VOID(secp256k1_context_is_proper(ctx));

ARG_CHECK_VOID(ctx != secp256k1_context_static); misses that ctx could be a copy of the static context, obtained via memcpy. Copying the static context is questionable, but secp256k1_context_is_proper was introduced precisely to catch this case.

furszy commented at 1:30 pm on March 3, 2026:

Done as suggested.

in src/hash.h:13 in d6d14d84d0 outdated

 9@@ -10,6 +10,14 @@
10 #include <stdlib.h>
11 #include <stdint.h>
12 
13+typedef void (*secp256k1_fn_sha256_compression)(uint32_t *state, const unsigned char *blocks64, size_t n_blocks);

w0xlt commented at 6:45 pm on March 2, 2026:

nit:

0typedef void (*secp256k1_sha256_compression_fn)(uint32_t *state, const unsigned char *blocks64, size_t n_blocks);

furszy commented at 10:23 pm on March 2, 2026:

I prefer the current wording. English reads left to right, so seeing the fn prefix early on tells everyone that this is a function pointer type right away. Placing it at the end would be awkward for me, as would have to scan the whole name before I understand what it is.

Just as an extra note, this is already a pattern we follow everywhere in Core. We don’t name numerical variables as keys_num, we named them as num_keys.

real-or-random commented at 7:00 am on March 3, 2026:

Ah, but in the public header we have typedef void (*secp256k1_sha256_compression_fn) .... (I think the reason why the compiler accepts this name clash at all is that the two types are “compatible”).

These should be consistent. And since this name is in the public API, it would probably make sense to see what we’ve done so far:

0> git grep "typedef.*\*" include
1include/secp256k1.h:typedef int (*secp256k1_nonce_function)(
2include/secp256k1_ecdh.h:typedef int (*secp256k1_ecdh_hash_function)(
3include/secp256k1_ellswift.h:typedef int (*secp256k1_ellswift_xdh_hash_function)(
4include/secp256k1_schnorrsig.h:typedef int (*secp256k1_nonce_function_hardened)(

So the most consistent choice would be secp256k1_sha256_compression_function, at least for the type. (And we could just omit the internal typedef and rely on the one in the public header.) I don’t care that much about the name of the corresponding argument to secp256k1_context_set_sha256_compression. Adjusting that one is not a breaking change.

Alternatively, we could also omit the public typedef entirely, and spell out the function type in the signature of secp256k1_context_set_sha256_compression. But I tend to think the typedef is a good idea.

furszy commented at 1:45 pm on March 3, 2026:

ah.. I forgot we had the typedef defined in two places. I did that in order to avoid hash.h including the public header, which would create a cyclic dependency, but at this point in the PR I’m fine either way. Would prefer to move forward. Pushed the public header only typedef definition.

real-or-random commented at 1:48 pm on March 3, 2026:

. I did that in order to avoid hash.h including the public header, which would create a cyclic dependency, but at this point in the PR I’m fine either way. Would prefer to move forward.

Yeah, I can assure you we have “include” stuff much worse than a cycle…

furszy force-pushed on Mar 2, 2026

furszy commented at 11:10 pm on March 2, 2026: member

Updated per feedback. Modified the main commit message to reflect the current changes.

Introduce hash context to support pluggable SHA256 compression

This is purely a mechanical change with no behavior change.

It introduces a secp256k1_hash_ctx struct inside secp256k1_context
and propagates it to all SHA256-related operations.

This sets up the ability to provide a hardware-optimized SHA256
compression function at runtime in a follow-up commit.

fdb6a91a5e

furszy force-pushed on Mar 3, 2026

Add API to override SHA256 compression at runtime

This introduces `secp256k1_context_set_sha256_compression()`,
which allows users to provide their own SHA256 block-compression
function at runtime.

This is useful in setups where the fastest implementation can only
be determined dynamically based on the available CPU features, and
rebuilding the library is not possible.

The callback is installed on the `secp256k1_context` and is then used
by all operations that compute SHA256 hashes. As part of the setup,
the library performs sanity checks to ensure that the supplied
function is equivalent to the default transform.

Passing NULL to the callback setter restores the built-in
implementation.

0753f8b909

sha256: speed up writes using multi-block compression

Multiple 64-byte blocks can now be compressed directly
from the input buffer, without copying them into the
internal buffer.

4d92a083bc

furszy force-pushed on Mar 3, 2026

real-or-random approved

real-or-random commented at 1:50 pm on March 3, 2026: contributor

ACK 4d92a083bc267b87db654ae75cbcaf2a87854c09

furszy commented at 1:51 pm on March 3, 2026: member

Updated per feedback. Diff is quite small.

w0xlt approved

w0xlt commented at 3:15 pm on March 3, 2026: contributor

ACK 4d92a083bc267b87db654ae75cbcaf2a87854c09

theStack approved

theStack commented at 1:21 am on March 4, 2026: contributor

ACK 4d92a083bc267b87db654ae75cbcaf2a87854c09

nit: I also tend to prefer direct &ctx->hash_ctx usage over the secp256k1_get_hash_context getter function, as suggested in #1777 (comment) above.

real-or-random merged this on Mar 4, 2026

real-or-random closed this on Mar 4, 2026

fanquake referenced this in commit a50dbd2bcd on Mar 4, 2026

furszy deleted the branch on Mar 4, 2026

sipa commented at 1:21 pm on March 4, 2026: contributor

Posthumous reACK 4d92a083bc267b87db654ae75cbcaf2a87854c09

Make SHA256 compression runtime pluggable #1777

Getter secp256k1_get_hash_context

Self-test

Getter `secp256k1_get_hash_context`