test, ci: -DDEBUG_LOCKORDER flag prevents TSan from deadlock detecting #19047

issue hebasto openend this issue on May 22, 2020
  1. hebasto commented at 6:51 am on May 22, 2020: member

    To detect a potential deadlock in a form of lock order inversion we have two mutually exclusive options:

    • to set DEBUG_LOCKORDER that will cause abort() or std::logic_error exception
    • to use the ThreadSanitizer

    Using both options simultaneously does not work as in the former case lock() is not called if lock order inversion is detected, and the ThreadSanitizer fails to warn about the issue.

    I believe we should allow the ThreadSanitizer to detect potential deadlocks in a CI job.

  2. fanquake added the label Tests on May 22, 2020
  3. MarcoFalke commented at 10:52 am on May 22, 2020: member
    What do you mean? The potential deadlock should still be detected, and then either std::abort, or std::logic_error, or the sanitizer abort will kill the process with a non-zero exit code.
  4. hebasto commented at 11:01 am on May 22, 2020: member

    @MarcoFalke

    What do you mean? The potential deadlock should still be detected, and then either std::abort, or std::logic_error, or the sanitizer abort will kill the process with a non-zero exit code.

    I suggest to use both tools to detect a potential deadlock, each tool in its own test:

    • the -DDEBUG_LOCKORDER flag in the “[bionic] [no depends, only system libs, sanitizers: address/leak (ASan + LSan) + undefined (UBSan) + integer]” CI test (the current state)

    and

    • the ThreadSanitizer in the “[bionic] [no depends, only system libs, sanitizers: thread (TSan), no wallet]” CI test (the subject to change) that is possible by dropping the -DDEBUG_LOCKORDER flag
  5. hebasto commented at 11:08 am on May 22, 2020: member
    And the scope of the ThreadSanitizer is wider than that of our instrumented mutexes.
  6. MarcoFalke commented at 11:11 am on May 22, 2020: member
    So are you saying that DEBUG_LOCKORDER with tsan might result in a false negative?
  7. hebasto commented at 11:14 am on May 22, 2020: member

    So are you saying that DEBUG_LOCKORDER with tsan might result in a false negative?

    No :)

    I only say that we do not test for potential deadlocks by means of the ThreadSanitizer. Always the DEBUG_LOCKORDER facilities are used only (on Travis CI).

  8. hebasto commented at 11:28 am on May 22, 2020: member

    So are you saying that DEBUG_LOCKORDER with tsan might result in a false negative?

    To be exact, when binaries are built with the --with-sanitizers=thread CPPFLAGS=-DDEBUG_LOCKORDER configure options the DEBUG_LOCKORDER code prevents the ThreadSanitizer from rising a warning. If the DEBUG_LOCKORDER code fails to detect a lock order inversion, the ThreadSanitizer should emit a warning.

  9. MarcoFalke commented at 11:54 am on May 22, 2020: member

    If the DEBUG_LOCKORDER code fails to detect a lock order inversion, the ThreadSanitizer should emit a warning.

    It should emit the warning, unless I am missing something and there are false negatives.

  10. hebasto commented at 12:19 pm on May 22, 2020: member

    @MarcoFalke

    0$ ./configure --with-sanitizers=thread && make clean && make
    1$ ./src/test/test_bitcoin --run_test=sync_tests
    2Running 1 test case...
    3...
    4*** No errors detected
    5ThreadSanitizer: reported 2 warnings
    

    vs

    0
    1$ ./configure --with-sanitizers=thread CPPFLAGS=-DDEBUG_LOCKORDER && make clean && make
    2$ ./src/test/test_bitcoin --run_test=sync_tests
    3Running 1 test case...
    4
    5*** No errors detected
    

    Please note no warnings from the ThreadSanitizer in the latter case.

  11. MarcoFalke commented at 12:24 pm on May 22, 2020: member
    The sync tests are causing a potential deadlock on purpose. As far as I can see everything works as expected.
  12. hebasto commented at 12:28 pm on May 22, 2020: member

    The sync tests are causing a potential deadlock on purpose. As far as I can see everything works as expected.

    Of course! But it shows that CPPFLAGS=-DDEBUG_LOCKORDER suppresses the ThreadSanitizer warnings.

  13. MarcoFalke commented at 12:37 pm on May 22, 2020: member
    That is the exact purpose of it in this test case, otherwise ci could never pass under tsan
  14. hebasto commented at 12:42 pm on May 22, 2020: member

    That is the exact purpose of it in this test case, otherwise ci could never pass under tsan

    I think not CPPFLAGS=-DDEBUG_LOCKORDER but the test/sanitizer_suppressions/tsan is responsible for the ThreadSanitizer warning suppression in that case: https://github.com/bitcoin/bitcoin/blob/b5c423c48e094bd098e11c3d1f57acae7502a4da/test/sanitizer_suppressions/tsan#L8

  15. MarcoFalke commented at 12:49 pm on May 22, 2020: member
    DDEBUG_LOCKORDER is used to make the failure “soft” and recoverable. If it wasn’t set either our sync.cpp logic would abort the whole program and mark ci as failed or tsan would abort the whole program and mark the ci as failed.
  16. MarcoFalke commented at 12:50 pm on May 22, 2020: member
    The suppression is needed when someone is compiling without DDEBUG_LOCKORDER
  17. hebasto commented at 1:32 pm on May 22, 2020: member

    @MarcoFalke

    DDEBUG_LOCKORDER is used to make the failure “soft” and recoverable. If it wasn’t set either our sync.cpp logic would abort the whole program and mark ci as failed or tsan would abort the whole program and mark the ci as failed.

    The suppression is needed when someone is compiling without DDEBUG_LOCKORDER

    The CI already uses test/sanitizer_suppressions/tsan: https://github.com/bitcoin/bitcoin/blob/b5c423c48e094bd098e11c3d1f57acae7502a4da/ci/test/04_install.sh#L27

    Mind looking into the following CI jobs:

    ?

  18. MarcoFalke commented at 1:35 pm on May 22, 2020: member
    Ok, but what exactly is the harm or risk? I don’t see why the must be mutually exclusive.
  19. hebasto commented at 1:51 pm on May 22, 2020: member

    Ok, but what exactly is the harm or risk?

    We are just not using a free opportunity to test our code base with both tools, each tool in its own CI job, without interference.

  20. MarcoFalke commented at 2:16 pm on May 22, 2020: member
    I do run my own set of ci configs outside the repo and I do hope that I am not the only one running tests with sanitizers enabled. We can’t possibly fit every matrix combination into a ci build.
  21. hebasto commented at 2:29 pm on May 22, 2020: member
    So you think that opening a pr with this branch is pointless, aren’t you?
  22. vasild commented at 7:53 am on June 2, 2020: member

    Ignoring the very specific test potential_deadlock_detected our DEBUG_LOCKORDER deadlock prevention will call abort() if it detects a deadlock. That is good enough to fail the CI job. If it does not detect the deadlock, then the tsan is not prevented from acting.

    So I think that there is no need to remove DEBUG_LOCKORDER from the tsan CI job as in https://github.com/hebasto/bitcoin/commit/991068d9ea74a865c464c9645ca1fb20e36f41eb. Because a potential deadlock will either be detected by DEBUG_LOCKORDER (leading to abort()) or, if it misses it, then it will be detected by the thread sanitizer. In either case the CI job should fail.

    Using both options simultaneously does not work as in the former case lock() is not called if lock order inversion is detected, and the ThreadSanitizer fails to warn about the issue.

    That is true, but we get an abort() in that case which is also good enough, no?

  23. hebasto closed this on Jun 2, 2020

  24. hebasto commented at 6:20 pm on December 22, 2020: member
    btw, now -DDEBUG_LOCKORDER directly affects TSan run result: #19983 (review)
  25. vasild commented at 9:42 am on December 23, 2020: member
    Is it the case that DEBUG_LOCKORDER fails to detect some double lock? That would be a bug.
  26. DrahtBot locked this on Feb 15, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-19 09:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me