Enable configuring Valgrind support

luke-jr commented at 7:18 pm on September 12, 2020: member

Glancing at the internals of Valgrind annotations, it looks like they could have a performance impact even outside Valgrind.

In any case, it’s nice to let the user disable things explicitly.

luke-jr cross-referenced this on Sep 12, 2020 from issue Update secp256k1 subtree (including BIP340 support) by sipa

sipa commented at 8:38 pm on September 12, 2020: contributor

Just for context, the performance impact should be absolutely negligable. The actual valgrind instruction sequence is not invoked unless SECP256K1_CONTEXT_DECLASSIFY is specified at context creation time (which is only done in the ctime test). Without it, calls to secp256k1_declassify are just a (predictable) conditional jump, and there are just two such calls per signature creation. But indeed, if VALGRIND is not defined at comile time, it’s nothing at all.

I think it’s conceptually reasonable to permit disabling it though.

luke-jr commented at 8:55 pm on September 12, 2020: member

I was thinking more about the VG_CHECK calls in {field,scalar}_*_impl.h - not sure how relevant they are, though.

gmaxwell commented at 8:58 pm on September 12, 2020: contributor

I intentionally avoided this because i don’t want the CTime testing testing a different binary than the users are using in practice.

As an option, I suppose I don’t care… however: more options means that the CI needs to test that that combination compiles, and this (useless?) configuration option might not be worth the ci overhead. @luke-jr Those aren’t used at runtime, no _VERIFY macro is. They are purely test-only instrumentation See util.h.

sipa commented at 8:59 pm on September 12, 2020: contributor

There should only be VG_CHECK_VERIFY calls in non-tests, and those are only enabled when compiled inside unit tests.

sipa commented at 9:05 pm on September 12, 2020: contributor

I intentionally avoided this because i don’t want the CTime testing testing a different binary than the users are using in practice.

Well, arguably, if you want to guarantee that, it should be forced on (perhaps through this PR’s --with-valgrind) rather than being auto-detected?

luke-jr commented at 9:16 pm on September 12, 2020: member

The problem with forcing it is that means everyone building needs Valgrind installed. (But forcing it only potentially makes sense for the application using the library, not the library itself, which goes along with @sipa’s idea.)

gmaxwell commented at 11:11 pm on September 12, 2020: contributor

Well, arguably, if you want to guarantee that, it should be forced on (perhaps through this PR’s –with-valgrind) rather than being auto-detected?

If it doesn’t get compiled in you don’t get the test however. :) so if you get the test you test the correct thing. If you don’t have the valgrind headers, you don’t. In that decision I was concerned there about making it “unusable” for people targeting devices/systems where there is no valgrind at all.

I don’t understand what luke is talking about wrt application.

luke-jr commented at 11:17 pm on September 12, 2020: member

@gmaxwell Bitcoin Core may need a high degree of certainty for consensus compatibility, but other applications may not.

sipa commented at 11:46 pm on September 12, 2020: contributor

@luke-jr Mind changing the travis configuration so that –with-valgrind and –without-valgrind get exercised at least once?

gmaxwell commented at 8:27 am on September 13, 2020: contributor

@luke-jr The runtime valgrind stuff isn’t for consensus compatibility it’s to detect if you compiler ‘optimized’ the code in a way that creates timing sidechannels which leak your secrets.

real-or-random commented at 8:53 am on September 13, 2020: contributor

As explained in the thread, the performance overhead of the valgrind stuff is negligible, so I don’t really see a reason for a config option.

@gmaxwell Bitcoin Core may need a high degree of certainty for consensus compatibility, but other applications may not.

Right, but I believe this has nothing to do with our use of valgrind.

real-or-random commented at 9:08 am on September 13, 2020: contributor

Looking at https://github.com/bitcoin/bitcoin/pull/19944, this PR seems to be based on the wrong assumption that we do some magic valgrind calls unconditionally. It has been said in this thread already but let me stress again that this is just not the case.

We have explicit valgrind magic in our code for two purposes.

The only runtime overhead is a single (runtime) check whether we should do valgrind magic, and the result of the check will be “no” unless you’re running a specific test binary. See #708 for context. This is indeed new.
The VG_CHECK macros. These are disabled at compile-time (unless you’re building the tests), so there’s no overhead at all. And those are not new at all, they’ve been there since 2015.

gmaxwell commented at 10:02 am on September 13, 2020: contributor

The only runtime overhead is a single (runtime) check whether we should do valgrind magic,

For context, it’s right here: https://github.com/bitcoin/bitcoin/blob/master/src/secp256k1/src/secp256k1.c#L141

and the result of the check will be “no” unless you’re running a specific test binary. See #708 for context. This is indeed new.

It’s relatively recent but as the link demonstrates it’s not something newly introduced in PR 19944.

So I’m super duper confused. There isn’t any valgrind anything in 19944 except for the addition of the valgrind-using test components for the new functions which are pretty much just copy-pasting the existing tests.

The VG_CHECK macros. These are disabled at compile-time (unless you’re building the tests), so there’s no overhead at all. And those are not new at all, they’ve been there since 2015.

Particularly, VG_CHECK_VERIFY which is a total no-op macro except when building ./tests binary. (There is VG_CHECK which is unconditional, but it isn’t and shouldn’t be used outside of the test binary).

FWIW, The valgrind instrumentation has never made a benchmarkable performance difference for me regardless, but as real-or-random notes it isn’t executed in the library for users at all, only in special test binaries.

luke-jr commented at 4:17 pm on September 13, 2020: member

@luke-jr The runtime valgrind stuff isn’t for consensus compatibility it’s to detect if you compiler ‘optimized’ the code in a way that creates timing sidechannels which leak your secrets.

Can we do that without valgrind? Is the valgrind stuff even supposed to do that, or is it a side effect on certain architectures?

luke-jr commented at 4:54 pm on September 13, 2020: member

@luke-jr Mind changing the travis configuration so that –with-valgrind and –without-valgrind get exercised at least once?

Which would you recommend?

As explained in the thread, the performance overhead of the valgrind stuff is negligible, so I don’t really see a reason for a config option.

The reason is to ensure the build is deterministic (functionally, not necessarily byte-for-byte) for a given configuration (eg, Gentoo USE flags). With autodetection, the only way to do that is to either forbid or strictly require Valgrind.

Using this patch, the current bitcoin overlay ebuild either DEPENDs on valgrind and uses it (with USE='valgrind'), or doesn’t build using it (with USE='-valgrind') regardless of whether it’s installed or not.

gmaxwell commented at 5:34 pm on September 13, 2020: contributor

I can’t tell if there is still a miscommunication. There is no valgrind code ran by non-test code, so the library itself is functionally equivalent either way. The macros you’re looking at in libsecp256k1 compile into nothing (for VG_CHECK_VERIFY) outside of the test binaries and are behind a boolean that causes them to never get executed outside of a special test harness:

 0/* Define `VG_UNDEF` and `VG_CHECK` when VALGRIND is defined  */
 1#if !defined(VG_CHECK)
 2# if defined(VALGRIND)
 3#  include <valgrind/memcheck.h>
 4#  define VG_UNDEF(x,y) VALGRIND_MAKE_MEM_UNDEFINED((x),(y))
 5#  define VG_CHECK(x,y) VALGRIND_CHECK_MEM_IS_DEFINED((x),(y))
 6# else
 7#  define VG_UNDEF(x,y)
 8#  define VG_CHECK(x,y)
 9# endif
10#endif

-DVERIFY is only set when compiling the tests.c.

Can we do that without valgrind?

No known way.

Is the valgrind stuff even supposed to do that, or is it a side effect on certain architectures?

It’s not much of a side effect of certain architectures but it is a bit of an off-label use.

Valgrind memcheck is a CPU simulator which performs bit-level taint tracking of data. Any data computed from tainted data is treated as tainted (except for operations like xor x,x or mul x,0 which destroy the taint), if memory access index or branch condition depends on tainted data valgrind throws a warning. The original purpose of this is to make uninitialized memory tainted to alert on use of it. This libraries tests use it, among other reasons, to mark secret keys as tainted. It just so happens that the operations that matter the constant time behaviour everywhere are memory indexes and branching.

On some architectures there may be additional relevant operations which data-dependant non-constant time. E.g. s390x has a comparison operation which works basically like memcmp. Valgrind correctly (for our use and its original use) also alerts on the use of this operation on tainted data… which is useful because GCC sometimes emits it for uint64_t comparisons, which brought it to my attention and I have a GCC bug open on that (and it sounds like GCC will fix it). Or– on x86/x86_64 there is division which valgrind does not currently alert on, but this is clearly incorrect for both our usage and the usage memcheck is intended for (especially since division by 0 traps on x86/x86_64, so you really don’t want to divide by uninitialized memory)– so the omission of this check is just an outright bug in memcheck. It’s a ~2 line change to valgrind to also catch tainted inputs to division, I assume upstream will take it if I ever get around to submitting a patch. … but divisions of secret data aren’t a particularly likely error to pass review or get injected by compilers, so I haven’t bothered yet. (though our tests do pass with them applied for me locally, of course).

It’s possible that at some point in the future there would be some arch specific relevant operation which we cared about for constant time behaviour but which memcheck didn’t care about because it never mattered for it’s intended use (rather than being a clear bug like the above mentioned division case). In which case, your “side effect only on some architectures” concern would apply. If that were the case, I’d hope that the valgrind authors would consider constant-time-checking to be a first class supported use of valgrind and add support for catching that operation– because there is no alternative and valgrind’s taint-tracking approach is an extraordinarily useful and effective way of doing exactly what is needed and because doing so is trivial with how valgrind is designed.

For example, if we cared about an architecture where mul secret,0 took non-constant time depending on secret– valgrind’s memcheck might not want to alert on that because mul with zero is a reasonable enough way to untaint memory for initialization tracking purposes. (OTOH, they might not mind alerting on it, because mul is about the slowest way you could untaint memory and so hopefully no compiler ever emits code to do it that way!).

luke-jr commented at 8:09 pm on September 13, 2020: member

There is no valgrind code ran by non-test code, so the library itself is functionally equivalent either way.

Then why would this be a problem? “i don’t want the CTime testing testing a different binary than the users are using in practice.”

gmaxwell commented at 8:21 pm on September 13, 2020: contributor

You’ve trimmed the operative part of my comment:

behind a boolean that causes them to never get executed outside of a special test harness:

Valgrind is used for multiple different testing things. Some of them– VG_CHECK_VERIFY – are only for specially instrumented testing builds, these aren’t testing for miscompilation and don’t show up in the library.

The constant time test, however, is substantially a miscompilation test. It uses valgrind in the actual library, but it’s behind a boolean and only runs that code ( https://github.com/bitcoin/bitcoin/blob/master/src/secp256k1/src/secp256k1.c#L231 ) if a special debugging argument is sent to context creation (see link in my prior comment).

I believe the valgrind call there is completely and totally harmless and could be run all the time for everyone, but out of maximum conservativeness, it’s only run with a hidden test argument sent to the context creation.

This allows the constant time test to run on the actual unmodified production binary.

sipa commented at 8:24 pm on September 13, 2020: contributor

@luke-jr It’s just the fact that compiling with the valgrind macros (even though they’re skipped over in normal operation) changes the binary - a few bytes get inserted, alignments change, sections are ordered differently, … means that the exact library binary tested by the ctime test would be distinct from the one the user will use for production if the latter was compiled without -DVALGRIND. The approach used here means that if Valgrind is available, the ctime test will run on the exact binary used in production.

That is an argument for not compiling the library-as-tested-by-ctime and library-for-production differently, and an argument for having the ctime test and valgrind macro usage default on (especially given the negligible performance impact). I don’t think it’s an argument why the user shouldn’t be able to select whether they want those tests at all or not, though, and the argument about deterministic builds not being dependent on whether or not you had valgrind installed is a very reasonable one IMO.

real-or-random commented at 8:39 pm on September 13, 2020: contributor

the argument about deterministic builds not being dependent on whether or not you had valgrind installed is a very reasonable one IMO.

Yes, I think that’s valid point.

in configure.ac:172 in 22da7f79fe outdated

167@@ -168,7 +168,19 @@ AC_ARG_WITH([ecmult-gen-precision], [AS_HELP_STRING([--with-ecmult-gen-precision
168 )],
169 [req_ecmult_gen_precision=$withval], [req_ecmult_gen_precision=auto])
170 
171-AC_CHECK_HEADER([valgrind/memcheck.h], [enable_valgrind=yes], [enable_valgrind=no], [])
172+AC_ARG_WITH([valgrind], [AS_HELP_STRING([--with-valgrind=yes|no|auto],
173+[Build with extra check for running inside Valgrind [default=auto]]

real-or-random commented at 8:53 pm on September 13, 2020:

nit:

0[Build with extra checks for running inside Valgrind [default=auto]]

real-or-random commented at 8:55 pm on September 13, 2020: contributor

This should set enable_valgrind=no if valgrind is disabled explicitly.

gmaxwell commented at 9:32 pm on September 13, 2020: contributor

A build is going to be byte wise different depending on a multitude of different headers, unfortunately. I assume that’s why luke said “(functionally, not necessarily byte-for-byte)”.

I don’t think it’s an argument why the user shouldn’t be able to select whether they want those tests at all or not, though,

Sure, I don’t see any reason why you shouldn’t be able to force the result of the detection. If nothing else it’ll make it easier for testers to test that the build isn’t broken when the header isn’t there.

fanquake referenced this in commit ba4b3fbcf2 on Sep 14, 2020

configure: Allow specifying --with[out]-valgrind explicitly 412bf874d0

luke-jr force-pushed on Sep 14, 2020

luke-jr commented at 9:44 pm on September 14, 2020: member

Changes made

sidhujag referenced this in commit 711b81967a on Sep 15, 2020

real-or-random commented at 8:54 am on September 15, 2020: contributor

Changes look good.

We should also do this:

@luke-jr Mind changing the travis configuration so that –with-valgrind and –without-valgrind get exercised at least once?

But it’s not entirely trivial, we have a VALGRIND flag already (rename?) and sometimes we pass -DVALGRIND explicitly (can be done nicer now). Maybe we should enable valgrind explicitly on travis builds where we expect the header to be present, except for one or two where we want to test without valgrind. @luke-jr Feel free to give it a try. Otherwise we can do the travis changes in a second PR.

real-or-random approved

real-or-random commented at 8:55 am on September 15, 2020: contributor

ACK 412bf874d09517b559eba4f7addb4c181cc2780b

if you want to get this merged

real-or-random cross-referenced this on Sep 15, 2020 from issue Check if variable=yes instead of if var is set in travis.sh by elichai

luke-jr commented at 5:24 am on September 16, 2020: member

Let’s leave the Travis for another PR then, as it isn’t really clear to me which jobs would be best to test each way, and someone more familiar with the CI could probably figure that out easily.

sipa commented at 9:22 pm on September 18, 2020: contributor

ACK 412bf874d09517b559eba4f7addb4c181cc2780b. Tested by running configure on a system with and without valgrind, and with no argument, with --with-valgrind, and with --without-valgrind.

jonasnick approved

jonasnick commented at 8:41 am on September 19, 2020: contributor

deterministic configure is good

ACK 412bf874d09517b559eba4f7addb4c181cc2780b

jonasnick merged this on Sep 19, 2020

jonasnick closed this on Sep 19, 2020

deadalnix referenced this in commit 39a83aca8b on Sep 28, 2020

deadalnix referenced this in commit 73aaab9944 on Sep 29, 2020

UdjinM6 referenced this in commit c2ab8285d8 on Aug 10, 2021

5tefan referenced this in commit abf9ac837f on Aug 12, 2021

Enable configuring Valgrind support #813