ci: Run ASan/LSan and reorganize sanitizer and Valgrind jobs #846

pull real-or-random wants to merge 4 commits into bitcoin-core:master from real-or-random:202010_ci_matrix changing 4 files +70 −65
  1. real-or-random commented at 5:02 PM on November 9, 2020: contributor

    This enables asan (including lsan), and restructures the sanitizer builds. It also removes -fno-omit-frame-pointer again. This is debatable. The main reason for removing this is that GCC (but not clang) fails to build (runs out of registers) when combining asan, -fno-omit-frame-pointer, and the x86_64 asm. But I believe without -fno-omit-frame-pointer, the instrumented binary may be closer to the real binary, which is good. If we get unusable debugging output on CI without frame pointers, we can still reproduce the build locally.

    • ECMULTGENPRECISION=8 : We should disable this entirely, not only on CI, because it's slow and needs too much stack space
  2. real-or-random force-pushed on Nov 11, 2020
  3. real-or-random force-pushed on Nov 12, 2020
  4. real-or-random commented at 12:53 PM on November 12, 2020: contributor

    A few things are weird on MacOS, and I don't really understand what's going on:

    • jobs 49 and 50 fail because lsan does not work with Apple's clang, see config.log. That's expected according to https://github.com/google/sanitizers/issues/1026
    • jobs 67 succeeds but 68 fails, with the same error message as above. But why the difference? I have no idea, they should use the same toolchain etc. Is this related to gmp? But 67 seem to fail with the "native compiler", and we don't set $CC_FOR_BUILD correctly for gcc. I've added this now.
    • I'm not convinced that the "gcc" jobs (including 67, 68) really use gcc. It's again Apple's fake gcc according to the $CC --version in after_script. @elichai I added a $CC --version in the script, let's see if this makes a difference. (This will be obsolete anyway when we switch away from travis).

    And job 32 times out... well that's not nice but at least I understand what's going on ...

  5. elichai commented at 1:51 PM on November 12, 2020: contributor
    • I'm not convinced that the "gcc" jobs (including 67, 68) really use gcc. It's again Apple's fake gcc according to the $CC --version in after_script. @elichai I added a $CC --version in the script, let's see if this makes a difference. (This will be obsolete anyway when we switch away from travis).

    IIRC that's why I did CC=gcc-9 because OSX has an alias for gcc=clang https://github.com/bitcoin-core/secp256k1/pull/750/files#diff-5c41b2883cfdc97e6b8a07ebe6456643de0160099b7f15ed04afeb2fd627f630R14

    https://github.com/bitcoin-core/secp256k1/blob/4232e5b7da0a68adc14fa4b481f7e106403c200d/contrib/travis.sh#L10-L13

  6. real-or-random commented at 6:42 PM on November 13, 2020: contributor

    IIRC that's why I did CC=gcc-9 because OSX has an alias for gcc=clang

    Apparently that does not seem to work (anymore?).

    I think we shouldn't try to fix this now but first move to another CI platform. For example, Cirrus seems to ship with GCC https://github.com/cirruslabs/cirrus-ci-docs/issues/67 .

  7. real-or-random cross-referenced this on Jan 18, 2021 from issue Build error with -fsanitize=address,undefined -O0 by guidovranken
  8. real-or-random force-pushed on May 14, 2021
  9. real-or-random commented at 11:57 AM on May 14, 2021: contributor

    Rebased to work with Cirrus. I also updated the initial comment.

  10. real-or-random marked this as ready for review on May 14, 2021
  11. real-or-random renamed this:
    Enable more sanitizers on Travis
    ci: Run asan and add more sanitizer/valgrind jobs
    on May 14, 2021
  12. real-or-random force-pushed on May 17, 2021
  13. real-or-random force-pushed on May 17, 2021
  14. real-or-random force-pushed on May 17, 2021
  15. real-or-random force-pushed on May 17, 2021
  16. real-or-random force-pushed on May 17, 2021
  17. real-or-random renamed this:
    ci: Run asan and add more sanitizer/valgrind jobs
    ci: Run ASan/LSan and reorganize sanitizer and Valgrind jobs
    on May 17, 2021
  18. sipa commented at 10:17 PM on May 18, 2021: contributor

    utACK 1191bb07b0198ed605b30efebc7a542fbc14958f

  19. in .cirrus.yml:267 in 1191bb07b0 outdated
     269 | +        WRAPPER_CMD: env
     270 | +        CFLAGS: "-fsanitize=undefined,address"
     271 | +        CFLAGS_FOR_BUILD: "-fsanitize=undefined,address"
     272 | +        UBSAN_OPTIONS: "print_stacktrace=1:halt_on_error=1"
     273 | +        ASAN_OPTIONS: "strict_string_checks=1:detect_stack_use_after_return=1:detect_leaks=1"
     274 | +        LSAN_OPTIONS: "use_unaligned=0"
    


    jonasnick commented at 1:41 AM on May 21, 2021:

    real-or-random commented at 8:21 AM on May 21, 2021:

    Hm indeed. To be honest, I can't remember where this is from, it was already in my old branch. I think we should set it to 1 actually. Let me do this.

    Some background: Both Valgrind and LSan can check for memory leaks, so we would not need LSan. But LSan is very cheap and none of these tools are perfect. So as long as we don't see false positives, it makes sense to just run LSan even though we have Valgrind. And it turns out that use_unaligned=1 is indeed one thing where LSan has better coverage than Valgrind: https://stackoverflow.com/questions/33178242/valgrind-and-pragma-pack2 On my machine the difference in running time of tests 64 was 99,95s with use_unaligned=0 vs 100,02s use_unaligned=1, so this is just noise.


    jonasnick commented at 11:52 AM on May 21, 2021:

    cool, sgtm

  20. in .cirrus.yml:268 in 1191bb07b0 outdated
     263 | +      env:
     264 | +        # The `--error-exitcode` is required to make the test fail if valgrind found errors, otherwise it'll return 0 (https://www.valgrind.org/docs/manual/manual-core.html)
     265 | +        WRAPPER_CMD: "valgrind --error-exitcode=42"
     266 | +    - name: "UBSan, ASan, LSan"
     267 | +      env:
     268 | +        # Use dummy wrapper command to run the tests outside of make check
    


    jonasnick commented at 1:48 AM on May 21, 2021:

    Remind me, what's the reason for not running this (and valgrind) in make check? If it was just about being able to set the test "count"/TEST_ITERS separately then we do have an env var for this now, and could remove it like this https://github.com/jonasnick/secp256k1/commits/202010_ci_matrix-jn (untested).


    sipa commented at 2:10 AM on May 21, 2021:

    I think it's just that make check will try to run without the wrappers.


    jonasnick commented at 2:44 AM on May 21, 2021:

    You mean make check is run without a wrapper even when a wrapper is set? Afaics no, because we unset BUILD whenever we set WRAPPER_CMD.


    real-or-random commented at 8:01 AM on May 21, 2021:

    The main issue (and I think this is what Pieter is partly saying) is that ${WRAPPER_CMD} make check typically won't work for all wrapper commands.

    • valgrind make check will memcheck make but will not memcheck our tests because they'll be child processes. (Now you could try valgrind --trace-children but then it will still additionally memcheck make which gives false positives and wastes CI time.)
    • wine make check just won't work at all because make is not a Windows executable.
    • I haven't tried QEMU but I expect the same as for Wine (because we cross-build).

    What we could is to tell make to use the wrapper command. Now that I'm reading up on this again, I recall that it's not even more work to do this. There's the (obscurely named) LOG_COMPILER env variable, which does exactly this. You can even pass LOG_COMPILER="./libtool --mode=execute valgrind" etc, which we don't need now but maybe in the future. Running all tests using make check will have the benefit that the logs always end up in the same place in the CI output. Let me give it a shot.

    And thanks for reminding me of SECP256K1_TEST_ITERS. When I was writing this, I didn't know that this exists (I reinvented the name TEST_ITERS). So we should simply set this variable instead, like we do for the benchmarks. This has the nice side effect that it would apply to the ordinary make check builds if we'd like to set it there for some reason.

  21. real-or-random force-pushed on May 21, 2021
  22. real-or-random commented at 10:09 AM on May 21, 2021: contributor

    updated as described above

    edit: pushed again, I had forgotten the use_aligned=1.

  23. ci: Run ASan/LSan and reorganize sanitizer and Valgrind jobs de4157f13a
  24. real-or-random force-pushed on May 21, 2021
  25. real-or-random force-pushed on May 21, 2021
  26. in ci/cirrus.sh:41 in 22e980425b outdated
      53 | -    $QEMU_CMD ./exhaustive_tests
      54 | +    make "$BUILD"
      55 |  fi
      56 |  
      57 | -if [ -n "$WINE_CMD" ]
      58 | +if [ -n "$WRAPPER_CMD" ]
    


    jonasnick commented at 11:50 AM on May 21, 2021:

    If LOG_COMPILER works as described, can't we remove this branch? That's also why the CI jobs currently time out: the tests are being run twice.


    real-or-random commented at 12:32 PM on May 21, 2021:

    fixed

  27. in ci/cirrus.sh:39 in 22e980425b outdated
      45 | +# This limits the iterations in the tests and benchmarks.
      46 | +export SECP256K1_TEST_ITERS="$TEST_ITERS"
      47 | +export SECP256K1_BENCH_ITERS="$BENCH_ITERS"
      48 |  
      49 | -if [ -n "$QEMU_CMD" ]
      50 | +if [ -n "$BUILD" ]
    


    jonasnick commented at 11:51 AM on May 21, 2021:

    We don't need to check for this condition $BUILD is always set.


    real-or-random commented at 12:32 PM on May 21, 2021:

    fixed

  28. real-or-random force-pushed on May 21, 2021
  29. real-or-random commented at 6:43 PM on May 21, 2021: contributor

    Hm, it's still timing out. But everything looks correct including the test iterations (like in the revision yesterday, when the longest job took ~ 40 min.) If I'm not mistaken, the only difference to now is that we run the tests in make check and this means the only real difference is that they run concurrently with -j2 and we allocate only one CPU. I pushed a commit to see if this fixes the timeout.

  30. ci: Simplify to use generic wrapper for QEMU, Valgrind, etc fcfcb97e74
  31. tests: Treat empty SECP2561_TEST_ITERS as if it was unset 489ff5c20a
  32. ci: Make test iterations configurable and tweak for sanitizer builds 02dcea1ad9
  33. real-or-random force-pushed on May 21, 2021
  34. real-or-random commented at 6:49 PM on May 21, 2021: contributor

    Ok, ignore my last comment. The tests were running twice because I screwed up when rebasing. I hope it's really ready now...

  35. jonasnick commented at 9:16 PM on May 21, 2021: contributor

    Nice!

    ACK 02dcea1ad9441f857c7768e2b7d304bb19fd2a0c spot-checked ci output, checked that when valgrind ./tests crashes then LOG_COMPILER=valgrind make check also crashes.

  36. sipa commented at 9:38 PM on May 21, 2021: contributor

    utACK 02dcea1ad9441f857c7768e2b7d304bb19fd2a0c

  37. jonasnick merged this on May 21, 2021
  38. jonasnick closed this on May 21, 2021

  39. Fabcien referenced this in commit 5f47cd0547 on Jan 9, 2023
  40. deadalnix referenced this in commit 826d64fb6d on Jan 10, 2023

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin-core/secp256k1. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-23 03:15 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me