Glancing at the internals of Valgrind annotations, it looks like they could have a performance impact even outside Valgrind.
In any case, it’s nice to let the user disable things explicitly.
Glancing at the internals of Valgrind annotations, it looks like they could have a performance impact even outside Valgrind.
In any case, it’s nice to let the user disable things explicitly.
Just for context, the performance impact should be absolutely negligable. The actual valgrind instruction sequence is not invoked unless SECP256K1_CONTEXT_DECLASSIFY is specified at context creation time (which is only done in the ctime test). Without it, calls to secp256k1_declassify
are just a (predictable) conditional jump, and there are just two such calls per signature creation. But indeed, if VALGRIND
is not defined at comile time, it’s nothing at all.
I think it’s conceptually reasonable to permit disabling it though.
I intentionally avoided this because i don’t want the CTime testing testing a different binary than the users are using in practice.
As an option, I suppose I don’t care… however: more options means that the CI needs to test that that combination compiles, and this (useless?) configuration option might not be worth the ci overhead. @luke-jr Those aren’t used at runtime, no _VERIFY macro is. They are purely test-only instrumentation See util.h.
VG_CHECK_VERIFY
calls in non-tests, and those are only enabled when compiled inside unit tests.
I intentionally avoided this because i don’t want the CTime testing testing a different binary than the users are using in practice.
Well, arguably, if you want to guarantee that, it should be forced on (perhaps through this PR’s --with-valgrind
) rather than being auto-detected?
Well, arguably, if you want to guarantee that, it should be forced on (perhaps through this PR’s –with-valgrind) rather than being auto-detected?
If it doesn’t get compiled in you don’t get the test however. :) so if you get the test you test the correct thing. If you don’t have the valgrind headers, you don’t. In that decision I was concerned there about making it “unusable” for people targeting devices/systems where there is no valgrind at all.
I don’t understand what luke is talking about wrt application.
As explained in the thread, the performance overhead of the valgrind stuff is negligible, so I don’t really see a reason for a config option.
@gmaxwell Bitcoin Core may need a high degree of certainty for consensus compatibility, but other applications may not.
Right, but I believe this has nothing to do with our use of valgrind.
Looking at https://github.com/bitcoin/bitcoin/pull/19944, this PR seems to be based on the wrong assumption that we do some magic valgrind calls unconditionally. It has been said in this thread already but let me stress again that this is just not the case.
We have explicit valgrind magic in our code for two purposes.
The only runtime overhead is a single (runtime) check whether we should do valgrind magic, and the result of the check will be “no” unless you’re running a specific test binary. See #708 for context. This is indeed new.
The VG_CHECK macros. These are disabled at compile-time (unless you’re building the tests), so there’s no overhead at all. And those are not new at all, they’ve been there since 2015.
The only runtime overhead is a single (runtime) check whether we should do valgrind magic,
For context, it’s right here: https://github.com/bitcoin/bitcoin/blob/master/src/secp256k1/src/secp256k1.c#L141
and the result of the check will be “no” unless you’re running a specific test binary. See #708 for context. This is indeed new.
It’s relatively recent but as the link demonstrates it’s not something newly introduced in PR 19944.
So I’m super duper confused. There isn’t any valgrind anything in 19944 except for the addition of the valgrind-using test components for the new functions which are pretty much just copy-pasting the existing tests.
The VG_CHECK macros. These are disabled at compile-time (unless you’re building the tests), so there’s no overhead at all. And those are not new at all, they’ve been there since 2015.
Particularly, VG_CHECK_VERIFY which is a total no-op macro except when building ./tests binary. (There is VG_CHECK which is unconditional, but it isn’t and shouldn’t be used outside of the test binary).
FWIW, The valgrind instrumentation has never made a benchmarkable performance difference for me regardless, but as real-or-random notes it isn’t executed in the library for users at all, only in special test binaries.
@luke-jr The runtime valgrind stuff isn’t for consensus compatibility it’s to detect if you compiler ‘optimized’ the code in a way that creates timing sidechannels which leak your secrets.
Can we do that without valgrind? Is the valgrind stuff even supposed to do that, or is it a side effect on certain architectures?
@luke-jr Mind changing the travis configuration so that –with-valgrind and –without-valgrind get exercised at least once?
Which would you recommend?
As explained in the thread, the performance overhead of the valgrind stuff is negligible, so I don’t really see a reason for a config option.
The reason is to ensure the build is deterministic (functionally, not necessarily byte-for-byte) for a given configuration (eg, Gentoo USE flags). With autodetection, the only way to do that is to either forbid or strictly require Valgrind.
Using this patch, the current bitcoin overlay ebuild either DEPENDs on valgrind and uses it (with USE='valgrind'
), or doesn’t build using it (with USE='-valgrind'
) regardless of whether it’s installed or not.
I can’t tell if there is still a miscommunication. There is no valgrind code ran by non-test code, so the library itself is functionally equivalent either way. The macros you’re looking at in libsecp256k1 compile into nothing (for VG_CHECK_VERIFY) outside of the test binaries and are behind a boolean that causes them to never get executed outside of a special test harness:
0/* Define `VG_UNDEF` and `VG_CHECK` when VALGRIND is defined */
1#if !defined(VG_CHECK)
2# if defined(VALGRIND)
3# include <valgrind/memcheck.h>
4# define VG_UNDEF(x,y) VALGRIND_MAKE_MEM_UNDEFINED((x),(y))
5# define VG_CHECK(x,y) VALGRIND_CHECK_MEM_IS_DEFINED((x),(y))
6# else
7# define VG_UNDEF(x,y)
8# define VG_CHECK(x,y)
9# endif
10#endif
-DVERIFY is only set when compiling the tests.c.
Can we do that without valgrind?
No known way.
Is the valgrind stuff even supposed to do that, or is it a side effect on certain architectures?
It’s not much of a side effect of certain architectures but it is a bit of an off-label use.
Valgrind memcheck is a CPU simulator which performs bit-level taint tracking of data. Any data computed from tainted data is treated as tainted (except for operations like xor x,x or mul x,0 which destroy the taint), if memory access index or branch condition depends on tainted data valgrind throws a warning. The original purpose of this is to make uninitialized memory tainted to alert on use of it. This libraries tests use it, among other reasons, to mark secret keys as tainted. It just so happens that the operations that matter the constant time behaviour everywhere are memory indexes and branching.
On some architectures there may be additional relevant operations which data-dependant non-constant time. E.g. s390x has a comparison operation which works basically like memcmp. Valgrind correctly (for our use and its original use) also alerts on the use of this operation on tainted data… which is useful because GCC sometimes emits it for uint64_t comparisons, which brought it to my attention and I have a GCC bug open on that (and it sounds like GCC will fix it). Or– on x86/x86_64 there is division which valgrind does not currently alert on, but this is clearly incorrect for both our usage and the usage memcheck is intended for (especially since division by 0 traps on x86/x86_64, so you really don’t want to divide by uninitialized memory)– so the omission of this check is just an outright bug in memcheck. It’s a ~2 line change to valgrind to also catch tainted inputs to division, I assume upstream will take it if I ever get around to submitting a patch. … but divisions of secret data aren’t a particularly likely error to pass review or get injected by compilers, so I haven’t bothered yet. (though our tests do pass with them applied for me locally, of course).
It’s possible that at some point in the future there would be some arch specific relevant operation which we cared about for constant time behaviour but which memcheck didn’t care about because it never mattered for it’s intended use (rather than being a clear bug like the above mentioned division case). In which case, your “side effect only on some architectures” concern would apply. If that were the case, I’d hope that the valgrind authors would consider constant-time-checking to be a first class supported use of valgrind and add support for catching that operation– because there is no alternative and valgrind’s taint-tracking approach is an extraordinarily useful and effective way of doing exactly what is needed and because doing so is trivial with how valgrind is designed.
For example, if we cared about an architecture where mul secret,0 took non-constant time depending on secret– valgrind’s memcheck might not want to alert on that because mul with zero is a reasonable enough way to untaint memory for initialization tracking purposes. (OTOH, they might not mind alerting on it, because mul is about the slowest way you could untaint memory and so hopefully no compiler ever emits code to do it that way!).
There is no valgrind code ran by non-test code, so the library itself is functionally equivalent either way.
Then why would this be a problem? “i don’t want the CTime testing testing a different binary than the users are using in practice.”
You’ve trimmed the operative part of my comment:
behind a boolean that causes them to never get executed outside of a special test harness:
Valgrind is used for multiple different testing things. Some of them– VG_CHECK_VERIFY – are only for specially instrumented testing builds, these aren’t testing for miscompilation and don’t show up in the library.
The constant time test, however, is substantially a miscompilation test. It uses valgrind in the actual library, but it’s behind a boolean and only runs that code ( https://github.com/bitcoin/bitcoin/blob/master/src/secp256k1/src/secp256k1.c#L231 ) if a special debugging argument is sent to context creation (see link in my prior comment).
I believe the valgrind call there is completely and totally harmless and could be run all the time for everyone, but out of maximum conservativeness, it’s only run with a hidden test argument sent to the context creation.
This allows the constant time test to run on the actual unmodified production binary.
@luke-jr It’s just the fact that compiling with the valgrind macros (even though they’re skipped over in normal operation) changes the binary - a few bytes get inserted, alignments change, sections are ordered differently, … means that the exact library binary tested by the ctime test would be distinct from the one the user will use for production if the latter was compiled without -DVALGRIND. The approach used here means that if Valgrind is available, the ctime test will run on the exact binary used in production.
That is an argument for not compiling the library-as-tested-by-ctime and library-for-production differently, and an argument for having the ctime test and valgrind macro usage default on (especially given the negligible performance impact). I don’t think it’s an argument why the user shouldn’t be able to select whether they want those tests at all or not, though, and the argument about deterministic builds not being dependent on whether or not you had valgrind installed is a very reasonable one IMO.
the argument about deterministic builds not being dependent on whether or not you had valgrind installed is a very reasonable one IMO.
Yes, I think that’s valid point.
167@@ -168,7 +168,19 @@ AC_ARG_WITH([ecmult-gen-precision], [AS_HELP_STRING([--with-ecmult-gen-precision
168 )],
169 [req_ecmult_gen_precision=$withval], [req_ecmult_gen_precision=auto])
170
171-AC_CHECK_HEADER([valgrind/memcheck.h], [enable_valgrind=yes], [enable_valgrind=no], [])
172+AC_ARG_WITH([valgrind], [AS_HELP_STRING([--with-valgrind=yes|no|auto],
173+[Build with extra check for running inside Valgrind [default=auto]]
nit:
0[Build with extra checks for running inside Valgrind [default=auto]]
enable_valgrind=no
if valgrind is disabled explicitly.
A build is going to be byte wise different depending on a multitude of different headers, unfortunately. I assume that’s why luke said “(functionally, not necessarily byte-for-byte)”.
I don’t think it’s an argument why the user shouldn’t be able to select whether they want those tests at all or not, though,
Sure, I don’t see any reason why you shouldn’t be able to force the result of the detection. If nothing else it’ll make it easier for testers to test that the build isn’t broken when the header isn’t there.
Changes look good.
We should also do this:
@luke-jr Mind changing the travis configuration so that –with-valgrind and –without-valgrind get exercised at least once?
But it’s not entirely trivial, we have a VALGRIND
flag already (rename?) and sometimes we pass -DVALGRIND
explicitly (can be done nicer now). Maybe we should enable valgrind explicitly on travis builds where we expect the header to be present, except for one or two where we want to test without valgrind. @luke-jr Feel free to give it a try. Otherwise we can do the travis changes in a second PR.
ACK 412bf874d09517b559eba4f7addb4c181cc2780b
if you want to get this merged
--with-valgrind
, and with --without-valgrind
.
deterministic configure is good
ACK 412bf874d09517b559eba4f7addb4c181cc2780b