memset
which can still be optimized out as secnonce
isn’t read later in this function. The API makes it clear the callee is responsible for it, so we need to assure it’s cleared properly.
memset
which can still be optimized out as secnonce
isn’t read later in this function. The API makes it clear the callee is responsible for it, so we need to assure it’s cleared properly.
Replace
memset
which can still be optimized out assecnonce
isn’t read later in this function.
Why do you think it can be optimized out? I think it can be optimized out if the compiler can prove that no other code will read it. That seems unlikely, as the call is an API call. But if the compiler can prove this, then it’s indeed fine to optimize it out? Or are you saying we should make absolutely sure that it’s cleared out, even if the caller won’t read it? Perhaps.
676@@ -677,9 +677,9 @@ int secp256k1_musig_partial_sign(const secp256k1_context* ctx, secp256k1_musig_p
677 ARG_CHECK(secnonce != NULL);
678 /* Fails if the magic doesn't match */
679 ret = secp256k1_musig_secnonce_load(ctx, k, &pk, secnonce);
680- /* Set nonce to zero to avoid nonce reuse. This will cause subsequent calls
681- * of this function to fail */
682- memset(secnonce, 0, sizeof(*secnonce));
683+ /* Clear and invalidate nonce to avoid nonce reuse. This will cause subsequent
684+ * calls of this function to fail */
685+ secp256k1_musig_secnonce_invalidate(ctx, secnonce, 1);
secp256k1_memclear
.
Thanks @john-moffett. I think it would be an improvement to use secp256k1_memclear
in this case.
But if the compiler can prove this, then it’s indeed fine to optimize it out?
I think in this case it still makes sense to clear it to try to avoid having secret data in memory that we don’t use anymore. This is also the argument for clearing local secret variables at the end of functions.
Why do you think it can be optimized out? I think it can be optimized out if the compiler can prove that no other code will read it. That seems unlikely, as the call is an API call.
I agree that it’s unlikely to be optimized out, and I confirmed on my setup (arm64 macOS Clang 17.0.0) that even with -O3 and full LTO that it wasn’t. However, that’s not guaranteed since it’s possible a different compiler (or later version) can prove no later reads, so I figured it’s just best practice to use a wiper much less likely to be DSE’d.
But if the compiler can prove this, then it’s indeed fine to optimize it out? Or are you saying we should make absolutely sure that it’s cleared out, even if the caller won’t read it?
The latter. If they assumed the callee wiped the memory (as is promised in the API comment), they might not bother themselves and it could be left in caller memory.
682- memset(secnonce, 0, sizeof(*secnonce));
683+ * of this function to fail. Declassify as memclear marks the bytes
684+ * as undefined. */
685+ secp256k1_memclear(secnonce, sizeof(*secnonce));
686+ secp256k1_declassify(ctx, secnonce->data, sizeof(secp256k1_musig_secnonce_magic));
687+ secp256k1_declassify(ctx, &secnonce->data[68], 64);
secnonce
anymore after this.
The tests reuse it inentioanlly and valgrind flags it and gives an error on VERIFY builds.
I can declassify in the tests if that’s more natural.
Ok I understand now. I missed that secp256k1_memclear
uses the UNDEFINED macro (in VERIFY builds).
I think it’s indeed preferrable to explicitly “redefine” the macro in the test instead of declassifying in partial_sign. I think the purpose of UNDEFINEing is to catch accidental use of the buffer and there’s no legitimate reason to read the secnonce buffer except in the test that checks that it was cleared correctly.
declassify
call as well?
399@@ -400,7 +400,9 @@ static void musig_api_tests(void) {
400
401 memcpy(&secnonce_tmp, &secnonce[0], sizeof(secnonce_tmp));
402 CHECK(secp256k1_musig_partial_sign(CTX, &partial_sig[0], &secnonce_tmp, &keypair[0], &keyagg_cache, &session) == 1);
403- /* The secnonce is set to 0 and subsequent signing attempts fail */
404+ /* The secnonce is set to 0 and subsequent signing attempts fail.
405+ * Redefine memory first as secp256k1_musig_partial_sign securely clears and undefines the bytes.*/
406+ SECP256K1_CHECKMEM_DEFINE(&secnonce_tmp, sizeof(secnonce_tmp));
0 /* The secnonce is set to 0 and subsequent signing attempts fail.
1 * musig_partial_sign calls secp256k1_memclear to clear the secnonce, which
2 * marks it as undefined for memory checkers. So we need to explicitly mark
3 * secnonce_tmp as defined before we can compare its value. */
4 SECP256K1_CHECKMEM_DEFINE(&secnonce_tmp, sizeof(secnonce_tmp));
I agree it’s better to prevent the compiler from optimizing out that store.
I’m not sure if this PR adds another code smell. I know it’s me who suggested secp256k1_memclear
above. In the rest of the code base, we use secp256k1_memclear
to nuke memory contents. This is true here as well, but here we want the additional guarantee that the memory is overwritten by zero bytes. That fits the current implementation but not the documented contract of secp256k1_memclear
. An implementation that overwrites everything with 0x42 is equally correct. We simply picked 0x00 because it appears to be faster. I doubt that will change in the future, but I still think we should avoid the contract violation.
We could use secp256k1_memclear
but rename secp256k1_memclear
to secp256k1_memzero
and change its docs to guarantee it writes zeros. There’s nothing wrong with adding that guarantee, but I don’t think the result will be much cleaner if we only do this. We use SECP256K1_CHECKMEM_UNDEFINE
inside this function for a reason. If the goal is to nuke a buffer, then there should be no legitimate reason to read the buffer again (as @jonasnick pointed out above). So nuking and zeroing are different concerns, and neither is a strict subset of the other. What we want in this specific case of zeroing the secnonce is simply both.
We could instead call secp256k1_memzero
(after renaming) and then call SECP256K1_CHECKMEM_DEFINE
. That’s totally fine, and that’s my suggestion. But we really should call SECP256K1_CHECKMEM_DEFINE
inside the function and not only in the tests. If we provide the API guarantee that the secnonce is zeroed, then it’s totally legitimate that the API user reads (if only in a test). And we don’t want the user to get a Valgrind false positive when doing that!
We could also call secp256k1_memclear
and then memset(..., 0, ...)
. That’s also good. It may be a performance hit, but in any case a tiny one. (On Linux, the compiler should not be able to remove the second memset
due to the memory barrier. But I wouldn’t care too much.)
edit: @john-moffett Sorry that this PR escalated quickly given that the issue is seemingly simple…
If we provide the API guarantee that the secnonce is zeroed, then it’s totally legitimate that the API user reads (if only in a test). And we don’t want the user to get a Valgrind false positive when doing that!
That’s a good point that I hadn’t considered.
An alternative to your suggestion is to have both a memclear
and memzero
function, where the latter differs from the former by guaranteeing that the memory is overwritten with 0s and not calling CHECKMEM_UNDEFINE
. We’d clear
memory if we never read it again (and therefore do not care what it’s being overwritten with) and we zero
memory if we care what it’s being overwritten with because it might be read again. I think this is cleaner because we wouldn’t have to call CHECKMEM_DEFINE
in musig_partial_sign
(which seems like a code smell in and of itself).
An alternative to your suggestion is to have both a
memclear
andmemzero
function, where the latter differs from the former by guaranteeing that the memory is overwritten with 0s and not callingCHECKMEM_UNDEFINE
I was going to make the same suggestion. But is there any value in just reusing memczero
and making it non-elidable in the same way memclear
is (though SecureZeroMemory
would have to be replaced with something suitable for conditional erase)? As a separate matter, is it worth making memczero
non-elidable anyway? Perhaps not given its current usage.
@john-moffett Sorry that this PR escalated quickly given that the issue is seemingly simple..
No problem! I should’ve been aware of the issue from the start, but I’ve learned a lot, so I’m happy.
At this point, I think the memclear
and (new) memzero
separation is the cleanest.
memczero
suffers from the same potential issue I initially raised in this PR, so maybe memczero
ought to be hardened? I agree that it’s unlikely that any current compiler would actually DSE any of the calls we’ve been discussing, but is the possibility enough to warrant a change?
An alternative to your suggestion is to have both a
memclear
andmemzero
function, where the latter differs from the former by guaranteeing that the memory is overwritten with 0s and not callingCHECKMEM_UNDEFINE
Ok, sure, that’s another option. Concept ACK .
I was considering that option too, but I didn’t mention it because I had the feeling that the existence of two functions may confuse future devs. But now that I think about it, I guess it’s fine. Perhaps then we can make it clear from the naming that both variants protect against optimization, e.g., by adding an _explicit
(inspired by memzero_explicit
in the Linux kernel) or _secure
suffix.
But is there any value in just reusing
memczero
and making it non-elidable in the same waymemclear
is (thoughSecureZeroMemory
would have to be replaced with something suitable for conditional erase)?
As a separate matter, is it worth making
memczero
non-elidable anyway? Perhaps not given its current usage.
That’s an interesting point you raise.
Regarding its usage: I think it was initially introduced to clear the pubkey
when the secret key is invalid, but without leaking the fact that it is. But protecting the fact that it is already “above and beyond” and just defense-in-depth. And perhaps a bit questionable, though it won’t hurt.
But all its three usages in musig/session_impl.h
(two of them wrapped by _musig_secnonce_invalidate
) are for clearing secrets, and this is where it gets interesting. In this case, I believe there’s a tension here between protecting against two kinds of side-channels, namely timing/memory access (protecting the secrecy of the flag
) vs. someone stealing secrets from a memory dump such as a crash dump (protecting the secrecy of the memory region).
How would a function look like that is both non-elidable and protects the flag
? I have no idea, exactly due to platform-specific APIs such as SecureZeroMemory
. That’s the right way(TM) to clear memory on Windows, and I’m not at all convinced that it’s a good idea to replace it, e.g., by a handrolled version. Or by falling back to the “volatile” function pointer trick, which is supposed to be portable, but who knows what compilers will do in the future and what they will do when they encounter “volatile”, which pretty much has implementation-defined semantics. The nice thing about SecureMemoryZero
is really that we get a guarantee from MSVC not to elide it. So I think memczero
should stay entirely separate from memclear
and memzero
, at least for now.
So if we can’t have the best of both worlds, we need to decide what protection we want. Currently, whenever we use memczero
, we implicitly favor the protection of the flag, and this seems to be a questionable choice for its usages in musig/session_impl.h
(or in general when clearing out secrets). I’d much rather give the attacker that one bit whether the secret key was invalid vs. taking the risk to keep the entire secnonce (or the entire nonce seed) somewhere on the stack. If we believe that this PR is a good idea, i.e., the clearing of the secnonce should better be non-elidable, then I think we should arrive at the same conclusion for the other three uses. And in particular this use here, where the flag
is a constant anyway: https://github.com/bitcoin-core/secp256k1/blob/03fb60ad2e38ae9415e64bf5ba523a10100d4c3c/src/modules/musig/session_impl.h#L485
But this may be a larger discussion, and I’d like to hear @jonasnick’s opinion on it. So my suggestion is to leave memczero
for a follow-up PR. This PR here could then just add memzero
as a variant of memclear
, perhaps add a suffix to the names, and use it in _musig_partial_sign
.
secp256k1_memclear has the side effect of undefining bytes for
valgrind checks. In some cases, we may want to zero bytes
but allow subsequent reads. So we split memclear into
memclear_explicit, which makes no guarantees about the content
of the buffer on return, and memzero_explicit, which guarantees
zero value on return.
Change the memset in partial_sign to use memzero_explicit.
we use memczero, we implicitly favor the protection of the flag, and this seems to be a questionable choice for its usages in musig/session_impl.h. […] And in particular this use here, where the flag is a constant anyway
I agree. We have issue #1621 pointing out the inconsistencies in handling constant-timeness with respect to invalid arguments in the library.