Add memory sanitizer test to travis

jonasnick commented at 10:44 pm on November 4, 2019: contributor

In #558 there was an unintialized memory read (in the tests). This happened in all travis test configs but only a single test configuration produced an error in a verify check. In order to catch issues with uninitialized memory reliably, this PR adds a test configuration with the clang memory sanitizer to travis. The memory sanitizer errors out when running with the unfixed schnorrsig PR.

The result of running this PR with an additional commit that adds an uninitialized memory can be viewed at https://travis-ci.org/jonasnick/secp256k1/jobs/607352962

in .travis.yml:70 in 33ad6d62fd outdated

65+    - compiler: clang
66+      # --disable-openssl-tests because openssl uses uninitialized memory. ASM
67+      # and BIGNUM are disabled because clang memory sanitizer does not work
68+      # with inline assembly (https://clang.llvm.org/docs/MemorySanitizer.html).
69+      # The memory sanitizer is instructed to exit with a different exit code
70+      # using MSAN_OPTIONS. This is because the default exit code is 77 - the

sipa commented at 10:51 pm on November 4, 2019:

Wow, how long did it take you to figure this out?

jonasnick commented at 11:20 pm on November 4, 2019:

Haha, not long luckily. The automake doc is pretty clear about exit code 77 being the cause of skipped tests.

sipa commented at 10:52 pm on November 4, 2019: contributor

Concept ACK

real-or-random commented at 10:41 am on November 5, 2019: contributor

Apparently both test programs fail with the sanitizer enabled on travis.

I’m testing this locally and the output is much better with ./configure --enable-coverage --disable-openssl-tests CC=clang CFLAGS="-fsanitize=memory -fsanitize-memory-track-origins -fno-omit-frame-pointer -g"

-fsanitize-memory-track-origins is a no brainer. It needs more ressources but fine
-fno-omit-frame-pointer -g for printing functions and line numbers
--enable-coverage is mostly a hack to force -O0.

Interestingly, I get different results with coverage on and off (due to the enabled VERIFYs), and in both cases the sanitizer fails pretty early here. :/

edit: The caveat of all these options is that they don’t work on the compiler options that we really use for the build in the end. I still think they’re useful, I guess the memory sanitizer is more useful for spotting mistakes in the source code and not issues introduced by the compiler.

jonasnick commented at 10:49 am on November 5, 2019: contributor

@real-or-random you’re missing --without-asm and --with-bignum=no. The problem with the travis test run is that ASM=no doesn’t have an effect. Will write a fix.

real-or-random commented at 10:58 am on November 5, 2019: contributor

Oh indeed. I still suggest adding -fsanitize-memory-track-origins -fno-omit-frame-pointer -g here to make sure that the output of the sanitizer is useful.

By the way, is there a reason why you can’t override the optimization level in the CFLAGS? @gmaxwell @sipa

in .travis.yml:73 in 33ad6d62fd outdated

68+      # with inline assembly (https://clang.llvm.org/docs/MemorySanitizer.html).
69+      # The memory sanitizer is instructed to exit with a different exit code
70+      # using MSAN_OPTIONS. This is because the default exit code is 77 - the
71+      # same exit code that autotools make check interprets as a test that is
72+      # supposed to be skipped.
73+      env: EXTRAFLAGS="--disable-openssl-tests CFLAGS=-fsanitize=memory" ASM=no BIGNUM=no EXPERIMENTAL=yes ENDOMORPHISM=yes RECOVERY=yes ECDH=yes MSAN_OPTIONS=exitcode=42

real-or-random commented at 11:24 am on November 5, 2019:

Maybe it’s a good idea to test with both ENDOMORPHISM=yes and ENDOMORPHISM=no because both code paths are interesting for memory: setting up the contexts involves a lot of pointer arithmetic etc.

jonasnick commented at 1:22 pm on November 5, 2019:

Yeah I can add that.

practicalswift commented at 11:59 am on November 5, 2019: contributor

Strong Concept ACK

Cannot survive over long time horizons without the sanitizers :)

Thanks for doing this @jonasnick!

jonasnick force-pushed on Nov 5, 2019

jonasnick commented at 1:04 pm on November 5, 2019: contributor

Rebased. @real-or-random

Oh indeed. I still suggest adding -fsanitize-memory-track-origins -fno-omit-frame-pointer -g here to make sure that the output of the sanitizer is useful.

When I experimented with this before opening the PR I noticed that with -fsanitize-memory-track-origins the tests take much longer (especially with schnorrsigs). I settled on giving travis only the responsibility to detect errors and not to pretty-print them. That would be done locally, by the devs because they need to verify anyway that their patch works. OTOH I don’t know how to document all the nice flags such that it’s easy to discover. I’ll play with -fsanitize-memory-track-origins in my fork to see how long it takes for travis.

jonasnick force-pushed on Nov 5, 2019

jonasnick commented at 8:23 pm on November 5, 2019: contributor

I’ve added a non-endo test. But my tests with -fsanitize-memory-track-origins didn’t go that well: I didn’t manage to escape quotes in a way that allows specifying multiple options with CFLAGS. Unless someone has an idea for how to do that, I’ll leave the PR as is.

jonasnick force-pushed on Nov 5, 2019

jonasnick force-pushed on Nov 6, 2019

Add a clang memory sanitizer test to travis f3cae6be43

jonasnick commented at 9:16 am on November 6, 2019: contributor

With @real-or-random’s help I added -fno-omit-frame-pointer -g to give a much nicer output in case of failure.

real-or-random commented at 10:37 am on November 6, 2019: contributor

ACK f3cae6be43457e8a7837b53f5f4e60068feffca2

When I experimented with this before opening the PR I noticed that with -fsanitize-memory-track-origins the tests take much longer (especially with schnorrsigs). I settled on giving travis only the responsibility to detect errors and not to pretty-print them. That would be done locally, by the devs because they need to verify anyway that their patch works.

Makes perfect sense by the way.

real-or-random approved

gmaxwell commented at 0:54 am on November 7, 2019: contributor

The msan is a much weaker test than valgrind particularly with -DVALGRIND as that tests properties that can’t be easily tested any other way … does the travis environment have valgrind? We’ve historically used valgrind (so this is not a case where valgrind is too slow to be useful)…

real-or-random commented at 9:38 am on November 7, 2019: contributor

The msan is a much weaker test than valgrind particularly with -DVALGRIND as that tests properties that can’t be easily tested any other way … does the travis environment have valgrind? We’ve historically used valgrind (so this is not a case where valgrind is too slow to be useful)…

My guess was that it is too slow but this was really just a guess. If you say it’s not, we should definitively try it.

elichai commented at 10:39 am on November 7, 2019: contributor

@gmaxwell I’m not an expert on valgrind and/or sanitizers, but I’m not sure that you’re right. I think msan also capture/checks the stack. where valgrind mostly checks only the heap (though I might be confusing msan with asan)

gmaxwell commented at 7:09 pm on November 7, 2019: contributor

@elichai that isn’t the case, memcheck checks the stack/heap, use of freed memory, use of uninitilized memory (regardless of stack or heap). And it isn’t broken by SIMD or assembly … Sometimes optimizations in the compiler can hide issues, since valgrind cannot see any behaviour that doesn’t actually make it into the binary, but the optimizers can hide bugs from msan too.

Moreover, libsecp256k1’s tests have active instrumentation for valgrind which mark memory that calls shouldn’t touch as uninitialized and then after the calls verifies that it’s still uninitialized– making sure that its not moving data out and putting it back from pointers it shouldn’t be accessing at all.

With the exception of bugs totally removed by optimization I’m not aware of any case where msan is more sensitive than valgrind, but I’m aware of many where it’s less. Because valgrind can actually run with the production binaries it can also catch miscompliation. – even the msan page states “MSan implements a subset of functionality found in Valgrind (Memcheck tool)”.

The big thing that msan has going for it is that its faster than memcheck. Of course, there is no harm in using both… but switching from using valgrind in development to msan run in travis would be a step back.

jonasnick commented at 7:17 pm on November 7, 2019: contributor

@elichai valgrind certainly found the uninitialized I temporarily added to #558. @gmaxwell this PR is not intended to replace local testing - it’s just belt and suspenders. Travis is not to be trusted anyway.

I played with running valgrind in travis and it seems to work with “test count” 8 instead of the default 64 https://github.com/jonasnick/secp256k1/pull/10. I’ll open an alternative PR with a more cleaned up version of the valgrind travis config.

in .travis.yml:67 in f3cae6be43

60@@ -61,6 +61,34 @@ matrix:
61           packages:
62             - gcc-multilib
63             - libgmp-dev:i386
64+
65+    # clang with memory sanitizer
66+    #
67+    # --disable-openssl-tests because openssl uses uninitialized memory. ASM

MarcoFalke commented at 10:16 pm on November 7, 2019:

nit: I haven’t checked this myself, but an alternative might be to export MSAN_OPTIONS with a suppressions file for the openssl function that is affected.

real-or-random commented at 9:12 am on November 8, 2019:

Indeed but I don’t think that’s better in the end, and disabling the tests as done here is simpler. Note that the flag here really just influences just the test code. OpenSSL is only used in the tests, not used in the library itself.

jonasnick commented at 1:29 pm on November 8, 2019: contributor

The memory sanitizer documentation says that it implements a subset of valgrind’s memcheck. So I don’t think we need both and can close this PR if we have valgrind.

elichai cross-referenced this on Nov 8, 2019 from issue Add valgrind check to travis by elichai

MarcoFalke cross-referenced this on Nov 14, 2019 from issue Run tests in memory sanitizer by MarcoFalke

jonasnick closed this on Nov 25, 2019

jonasnick referenced this in commit 22a6031184 on Nov 25, 2019

Add memory sanitizer test to travis #687