Clang lock debug #6287

pull theuni wants to merge 3 commits into bitcoin:master from theuni:clang-lock-debug changing 5 files +15 −14
  1. theuni commented at 8:27 AM on June 16, 2015: member

    This is part of a much larger networking refactor.

    After spending a while looking at different approaches for cleaning up the networking code, it's become apparent that some functions/classes are going to have to drop cs_main in favor of using their own fine-grained locks.

    Breaking up cs_main would be a monstrous task, and would likely be even harder to review.

    As a first step, I've fixed up our locking so that LOCK() and friends play nicely with clang's -Wthread-safety static-analysis option. This will allow us to document locking assumptions while verifying them at the same time. I added https://github.com/bitcoin/bitcoin/commit/a794284e61988a226ea39327449be9906a1b5abd as an example of how this is useful.

    My goal is to add the EXCLUSIVE_LOCKS_REQUIRED(cs_main)/GUARDED_BY(cs_main) as needed for functions that need to be broken out of the cs_main lock, that way locking changes can be verified much more easily.

    To test:

    ./configure CXXFLAGS="-O2 -g -Wthread-safety" CXX=clang++ CC=clang
    

    To see an example of a failure in action, comment out LOCK(cs_main); in FinalizeNode(). The result should be something like:

    main.cpp:293:5: warning: calling function 'EraseOrphansFor' requires exclusive lock on 'cs_main' [-Wthread-safety-analysis]
        EraseOrphansFor(nodeid);
    
  2. locking: teach Clang's -Wthread-safety to cope with our scoped lock macros
    This allows us to use function/variable/class attributes to specify locking
    requisites, allowing problems to be detected during static analysis.
    
    This works perfectly with newer Clang versions (tested with 3.3-3.7). For older
    versions (tested 3.2), it compiles fine but spews lots of false-positives.
    cd27bba060
  3. locking: fix a few small issues uncovered by -Wthread-safety
    - rpcwallet: No need to lock twice here
    - openssl: Clang doesn't understand selective lock/unlock here. Ignore it.
    - CNode: Fix a legitimate (though very unlikely) locking bug.
    2b890dd424
  4. locking: add a quick example of GUARDED_BY
    This was chosen not because it's necessarily helpful, but because its locking
    assumptions were already correct.
    a794284e61
  5. Diapolo commented at 10:18 AM on June 16, 2015: none

    This looks very nice and it's great to be able to use such static analyzers with our codebase. Looking forward to see what issues can be uncovered with this.

  6. sipa commented at 1:46 PM on June 16, 2015: member

    utACK

  7. fanquake commented at 3:12 PM on June 16, 2015: member

    Tested building with -Wthread-safety. Warning is generated as expected when commenting out the LOCK() in FinalizeNode()

    main.cpp:293:5: warning: calling function 'EraseOrphansFor' requires holding mutex 'cs_main' exclusively [-Wthread-safety-analysis]
        EraseOrphansFor(nodeid);
        ^
    1 warning generated.
    
  8. fanquake commented at 7:00 PM on June 16, 2015: member

    The other warnings fixed by 2b890dd424b32320be6fc0333e67e2d7c9616065

      CXX      libbitcoin_server_a-net.o
    net.cpp:2059:1: warning: mutex 'cs_vSend' is not held on every path through here
          [-Wthread-safety-analysis]
    }
    ^
    net.cpp:2020:26: note: mutex acquired here
    void CNode::EndMessage() UNLOCK_FUNCTION(cs_vSend)
                             ^
    ./threadsafety.h:28:45: note: expanded from macro 'UNLOCK_FUNCTION'
    #define UNLOCK_FUNCTION(...) __attribute__((unlock_function(__VA_ARGS__)))
                                                ^
    1 warning generated.
    
      CXX      libbitcoin_server_a-rpcmining.o
    rpcmining.cpp:463:9: warning: releasing mutex 'cs_main' that was not held
          [-Wthread-safety-analysis]
            LEAVE_CRITICAL_SECTION(cs_main);
            ^
    ./sync.h:177:9: note: expanded from macro 'LEAVE_CRITICAL_SECTION'
            (cs).unlock();             \
            ^
    rpcmining.cpp:487:5: warning: mutex 'cs_main' is not held on every path through
          here [-Wthread-safety-analysis]
        static CBlockIndex* pindexPrev;
        ^
    rpcmining.cpp:479:9: note: mutex acquired here
            ENTER_CRITICAL_SECTION(cs_main);
            ^
    ./sync.h:172:9: note: expanded from macro 'ENTER_CRITICAL_SECTION'
            (cs).lock();                                          \
            ^
    2 warnings generated.
    
      CXX      libbitcoin_util_a-util.o
    util.cpp:121:9: warning: releasing mutex 'ppmutexOpenSSL[i]' that was not held [-Wthread-safety-analysis]
            LEAVE_CRITICAL_SECTION(*ppmutexOpenSSL[i]);
            ^
    ./sync.h:177:9: note: expanded from macro 'LEAVE_CRITICAL_SECTION'
            (cs).unlock();             \
            ^
    util.cpp:123:1: warning: mutex 'ppmutexOpenSSL[i]' is not held on every path through here
          [-Wthread-safety-analysis]
    }
    ^
    util.cpp:119:9: note: mutex acquired here
            ENTER_CRITICAL_SECTION(*ppmutexOpenSSL[i]);
            ^
    ./sync.h:172:9: note: expanded from macro 'ENTER_CRITICAL_SECTION'
            (cs).lock();                                          \
            ^
    2 warnings generated.
    
  9. laanwj commented at 7:00 AM on June 17, 2015: member

    Concept ACK, still need to review in detail. Very nice to see the static analysis annotations finally used.

  10. laanwj added the label Refactoring on Jun 17, 2015
  11. theuni commented at 3:47 PM on June 17, 2015: member

    @laanwj There are a few details that aren't quite right, but I didn't want to change too much in this PR until there was some interest. I can take a stab at fixing them up before/after pull, whichever you'd prefer:

    • during analysis, TRY_LOCKs are interpreted as always gaining the lock. I don't think that's much of an issue in practice, since from a static point-of-view there's not much difference. Just need to remember that it won't point out cases where you TRY_LOCK and forget to test the result.
    • There's no distinction between a mutex/recursive mutex. I think that's not really an issue either, since clang points out when we're locking a mutex that's already locked (see rpcwallet in 2b890dd)

    For examples of real usage, see my play tree here: https://github.com/theuni/bitcoin/commits/clang-lock-debug-more. It fleshed out several missing locks already.

  12. laanwj commented at 2:56 PM on June 19, 2015: member

    I think that's not really an issue either, since clang points out when we're locking a mutex that's already locked (see rpcwallet in 2b890dd)

    Could this be an issue if a lock is held multiple times along one code path, but only once along another code path? If you then remove the locking (according to clang's warning) in one place it may result in the lock not held anymore at all in one of the cases.

  13. theuni commented at 8:14 PM on June 23, 2015: member

    @laanwj I've looked deeper into this, here's a good sample of some snags we hit:

    static RecursiveMutex cs_main;
    static int myint = 3;
    static std::vector<int*> GUARDED_BY(cs_main) vec(1,&myint);
    int* func2()
    {
      LOCK(cs_main);
      return vec[0];
    }
    
    void func1() EXCLUSIVE_LOCKS_REQUIRED(cs_main)
    {
      func2();
      vec.clear();
    }
    
    int main()
    {
        int* unguarded = func2();
        *unguarded = 0;
        LOCK(cs_main);
        func1();
        unguarded = func2();
        return unguarded;
    }
    

    Things clang doesn't complain about:

    • *unguarded = 0 works without locking cs_main. Unfortunately, we return pointers like that pretty often.
    • cs_main is locked twice in the second call to func2().

    Its analysis seems to be entirely within the scope of each function. If you don't lock in func2, it complains about accessing vec, because it doesn't know that it should already be locked. You can tell it that it should assume cs_main is locked by specifying EXCLUSIVE_LOCKS_REQUIRED(cs_main), but then it complains about locking twice.

    After going through and marking a few functions/variables, it becomes very clear how it all fits together. Since our mutexes are recursive anyway, the ignored double-locks aren't a problem anyway. If you really want to prevent double-locks (treat as a non-recursive mutex) you can notate the function with EXCLUSIVE_LOCKS_REQUIRED(!cs_main).

    However, it does point out functions where we needlessly lock twice. rpcwallet in https://github.com/bitcoin/bitcoin/commit/a794284e61988a226ea39327449be9906a1b5abd is a good example of that. If you have 2 scoped locks in the same function, or if you specify that a lock is required then lock again, it'll warn.

    So to answer your question: if you remove a lock because it's locked twice in one path, and that leaves another path exposed, it'll warn about the new exposed path.

    The basic work-flow is:

    • add GUARDED_BY(cs_foo) to a variable.
    • build. You'll get a slew of new warnings because functions aren't marked.
    • go through each warning and add EXCLUSIVE_LOCKS_REQUIRED(cs_foo) to each function where it needs to be locked before entering if that function does not lock first.
    • build again. if you have new warnings, it means that we do unlocked calls into the functions that you just notated. Potential bug squashed! Now you need to either lock inside of the called function and remove the _REQUIRED, or leave it and lock first.

    So the big question is: For this to be useful, tons of functions need to be notated. Would you be ok with that? I'm happy to do the work (i've already done much of it locally), but I'm afraid that it will get pretty noisy. For what it's worth, once the current issues are cleaned up, travis could be set to fail when new problems are detected.

  14. gavinandresen commented at 8:20 PM on June 23, 2015: contributor

    I vote for noisy annotations-- locking bugs are NASTY.

  15. theuni commented at 8:41 PM on June 23, 2015: member

    For anyone wanting to try out the semantics, here's an easy test that simulates our environment and conventions: https://gist.github.com/theuni/f5d6d3db9434ca546422. Compile with: clang++ -std=c++11 -fsyntax-only -Wthread-safety locks.cpp

  16. laanwj commented at 5:26 PM on July 20, 2015: member

    Is this ready for merging? If not, can we get a useful subset in?

  17. theuni commented at 10:33 PM on July 22, 2015: member

    Yes, this is useful as a starting point.

  18. laanwj merged this on Jul 23, 2015
  19. laanwj closed this on Jul 23, 2015

  20. laanwj referenced this in commit d2464dfee9 on Jul 23, 2015
  21. zkbot referenced this in commit 57d420e2f8 on Feb 15, 2017
  22. zkbot referenced this in commit 88c209dba6 on Feb 20, 2017
  23. zkbot referenced this in commit dadb1ab74c on Mar 3, 2017
  24. HLXEasy referenced this in commit 051143ee0f on Nov 25, 2019
  25. MarcoFalke locked this on Sep 8, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-18 15:15 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me