log, sync: change lock contention from preprocessor directive to log category #22736

pull jonatack wants to merge 4 commits into bitcoin:master from jonatack:lockcontention-log-instead-of-define changing 7 files +22 −26
  1. jonatack commented at 1:05 pm on August 18, 2021: member

    To enable lock contention logging, DEBUG_LOCKCONTENTION has to be defined at compilation. Once built, the logging is not limited to a category and is high frequency, verbose and in all-caps. With these factors combined, it seems likely to be rarely used.

    This patch:

    • adds a lock logging category
    • adds a timing macro in microseconds, LOG_TIME_MICROS_WITH_CATEGORY
    • updates BCLog::LogMsg() to omit irrelevant decimals for microseconds and skip unneeded code and math
    • improves the lock contention logging, drops the all-caps, and displays the duration in microseconds
    • removes the conditional compilation directives
    • allows lock contentions to be logged on startup with -debug=lock or at run time with bitcoin-cli logging '["lock"]'
     0$ bitcoind -signet -debug=lock
     12021-09-01T12:40:01Z LockContention: cs_vNodes, net.cpp:1920 started
     22021-09-01T12:40:01Z LockContention: cs_vNodes, net.cpp:1920 completed (4μs)
     32021-09-01T12:40:01Z LockContention: cs_vNodes, net.cpp:1302 started
     42021-09-01T12:40:01Z LockContention: cs_vNodes, net.cpp:1302 completed (4μs)
     52021-09-01T12:40:02Z LockContention: cs_vNodes, net.cpp:2242 started
     62021-09-01T12:40:02Z LockContention: cs_vNodes, net.cpp:2242 completed (20μs)
     72021-09-01T12:43:04Z LockContention: ::cs_main, validation.cpp:4980 started
     82021-09-01T12:43:04Z LockContention: ::cs_main, validation.cpp:4980 completed (3μs)
     9
    10$ bitcoin-cli -signet logging
    11  "lock": true,
    12
    13$ bitcoin-cli -signet logging [] '["lock"]'
    14  "lock": false,
    15
    16$ bitcoin-cli -signet logging '["lock"]'
    17  "lock": true,
    

    I’ve tested this with Clang 13 and GCC 10.2.1, on Debian, with and without --enable-debug.

  2. jonatack marked this as a draft on Aug 18, 2021
  3. in src/sync.h:140 in 6f6abbe7d7 outdated
    136@@ -138,14 +137,10 @@ class SCOPED_LOCKABLE UniqueLock : public Base
    137     void Enter(const char* pszName, const char* pszFile, int nLine)
    138     {
    139         EnterCritical(pszName, pszFile, nLine, Base::mutex());
    140-#ifdef DEBUG_LOCKCONTENTION
    141         if (!Base::try_lock()) {
    


    laanwj commented at 2:11 pm on August 18, 2021:
    I am somewhat worried that the extra calls to Base::try_lock cause performance overhead in the non-lock contention debugging case. It’s unnecessary. We might want to gate that too on the log category being enabled.

    jonatack commented at 2:14 pm on August 18, 2021:
    Thanks, wondering that as well (or alternatively, leave the directives in place and just improve the logging experience when DEBUG_LOCKCONTENTION is defined).

    jonatack commented at 6:03 pm on August 18, 2021:

    Implemented your suggestion (thanks!)

    (Behind the gate it could now be just LogPrintf if anyone prefers.)

  4. DrahtBot added the label Tests on Aug 18, 2021
  5. log, sync: add LOCK logging category, apply it to lock contention b7a17444e0
  6. jonatack force-pushed on Aug 18, 2021
  7. jonatack commented at 8:15 pm on August 18, 2021: member
    Moving this out of draft now, ready for review.
  8. jonatack marked this as ready for review on Aug 18, 2021
  9. DrahtBot commented at 8:35 am on August 19, 2021: member

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #20487 (Add syscall sandboxing using seccomp-bpf (Linux secure computing mode) by practicalswift)
    • #16365 (Log RPC parameters (arguments) if -debug=rpcparams by LarryRuane)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  10. laanwj commented at 2:50 pm on August 19, 2021: member
    Code review ACK ccd73de1dd969097d34634c2be2fc32b03fbd09e
  11. in src/sync.h:12 in ccd73de1dd outdated
     5@@ -6,7 +6,10 @@
     6 #ifndef BITCOIN_SYNC_H
     7 #define BITCOIN_SYNC_H
     8 
     9+#include <logging.h>
    10 #include <threadsafety.h>
    11+#include <tinyformat.h>
    12+#include <util/strencodings.h>
    


    MarcoFalke commented at 6:29 pm on August 19, 2021:
    why are those includes needed?

    jonatack commented at 11:11 pm on August 19, 2021:
    dropped
  12. in src/sync.h:9 in ccd73de1dd outdated
    5@@ -6,7 +6,10 @@
    6 #ifndef BITCOIN_SYNC_H
    7 #define BITCOIN_SYNC_H
    8 
    9+#include <logging.h>
    


    MarcoFalke commented at 6:36 pm on August 19, 2021:
    nit: If PrintLockContention stayed in the cpp file, this include can be avoided (it indirectly includes the heavy filesystem header).

    jonatack commented at 11:14 pm on August 19, 2021:
    Good point, though we would have to essentially move UniqueLock::Enter() to the cpp file, due to calling LogAcceptCategory(BCLog::LOCK). Punting on this for now.

    jonatack commented at 10:44 am on September 1, 2021:

    nit: If PrintLockContention stayed in the cpp file, this include can be avoided (it indirectly includes the heavy filesystem header).

    agree, dropped the commit changing this

  13. MarcoFalke commented at 6:37 pm on August 19, 2021: member
    cr ACK ccd73de1dd969097d34634c2be2fc32b03fbd09e
  14. practicalswift commented at 9:32 pm on August 19, 2021: contributor

    Concept ACK

    Nice idea. I’ve used something similar in my local tree when analysing lock contention locally.

    Non-blocking suggestions:

    • In my version I printed the duration too to make it clear from the log entry if the lock contention was significant or not. Perhaps worth adding here too?
    • In addition to __FILE__ and __LINE__ it would be nice to have __FUNCTION__ printed as part of the log message too.

    To rule out any accidental performance regressions: would it be possible to benchmark before vs after this PR (with and without -debug=lock) in some intelligent way?

  15. jonatack force-pushed on Aug 19, 2021
  16. jonatack commented at 11:20 pm on August 19, 2021: member

    In my version I printed the duration too to make it clear from the log entry if the lock contention was significant or not. Perhaps worth adding here too?

    In addition to __FILE__ and __LINE__ it would be nice to have __FUNCTION__ printed as part of the log message too.

    I think I’d need your help (propose a commit to pull in?) or you could do it as a follow-up.

    To rule out any accidental performance regressions: would it be possible to benchmark before vs after this PR (with and without -debug=lock) in some intelligent way?

    I’ll look. Without -debug=lock it should be the same, and I’m not sure anyone besides developers would use the lock logging; it’s still fairly verbose. In the worst case I suppose we could add a warning in the RPC logging help.

  17. jonatack commented at 11:23 pm on August 19, 2021: member
    @laanwj: following up on our IRC convo in #bitcoin-core-builds, as a test just pushed a version with the HAVE_THREAD_LOCAL conditional compilation directives removed to see what happens…
  18. jonatack commented at 9:35 am on August 20, 2021: member

    To-do:

    • add a smoke test with -debug=lock
    • do some invocation frequency checks and research
    • try benchmarking Mutex::UniqueLock lock() versus try_lock() and the overhead of LogAcceptCategory(BCLog::LOCK), which evaluates to m_categories.load(std::memory_order_relaxed) & category) != 0
  19. MarcoFalke commented at 10:22 am on August 20, 2021: member

    add a smoke test with -debug=lock

    Shouldn’t this be enabled in the functional test already by default?

  20. jonatack commented at 12:55 pm on August 20, 2021: member

    Shouldn’t this be enabled in the functional test already by default?

    You’re right, TIL. I was thinking of adding something like this assertion but probably not worth it or reliable enough:

    0        with node.assert_debug_log(expected_msgs=["Lock contention: cs_main"]):
    1            node.generate(COINBASE_MATURITY)
    
  21. theStack commented at 4:33 pm on August 20, 2021: member
    Concept ACK
  22. jonatack commented at 5:38 pm on August 20, 2021: member

    Observations

    • void Mutex::UniqueLock::lock() acquires the mutex and blocks until it gains access to it

    • bool Mutex::UniqueLock::try_lock() doesn’t block but instead immediately returns whether it acquired the mutex; it may be used by lock() internally as part of the deadlock-avoidance algorithm

    • IIUC the test-and-set logic of these calls is ~constant time, optimised and light/quick if used carefully (i.e. no mutex convoying), compared to system calls, memory/cache coherency and fences, wait queues, lock contentions

    • added the following patch and tested a regular, non-debug clang 13 build of bitcoind synced on mainnet

    0        if (LogAcceptCategory(BCLog::LOCK)) {
    1             if (!Base::try_lock()) {
    2                 LogPrintf("Lock contention: %s, %s:%d\n", pszName, pszFile, nLine);
    3                 Base::lock();
    4+            } else {
    5+                LogPrintf("No lock contention\n");
    6             }
    
    • with this, I’m seeing src/sync.h::UniqueLock::Enter() invoked 2500-5000 times per minute, sampled roughly an hour after startup

    • of those, try_lock() returns true 99.26% to 99.91% of the time. In the true case the lock logging should have essentially the same performance as when the lock logging is off

    • thus 0.09% to 0.74% of the time try_lock() returns false, the contention is logged and lock() is then called, in which case the additional overhead compared to lock() alone, if any (~depending on how long the mutex was blocked), is of trying the lock and initially returning false

    • Overall, lock contention seems to be relatively infrequent as a percentage of locks. I’m seeing a few per minute. The most common one I see is cs_vNodes in net.cpp::2242/2277::ThreadMessageHandler.

    Edit: recalculated the numbers and updated.

    Some resources:

  23. theStack approved
  24. theStack commented at 11:53 pm on August 31, 2021: member
    Code-review ACK 6e7b6bfee377cdb1ea700af8084b08f79cbe5ee5
  25. hebasto commented at 7:57 am on September 1, 2021: member
    Concept ACK.
  26. in src/sync.h:143 in 6e7b6bfee3 outdated
    142+        if (LogAcceptCategory(BCLog::LOCK)) {
    143+            if (!Base::try_lock()) {
    144+                LogPrintf("Lock contention: %s, %s:%d\n", pszName, pszFile, nLine);
    145+                Base::lock();
    146+            }
    147+        } else {
    


    hebasto commented at 8:19 am on September 1, 2021:
    0        if (LogAcceptCategory(BCLog::LOCK) && !Base::try_lock()) {
    1            LogPrintf("Lock contention: %s, %s:%d\n", pszName, pszFile, nLine);
    2        }
    

    ?


    jonatack commented at 9:09 am on September 1, 2021:
    Thanks. If I understand the suggestion, this would only lock if the lock logging is on and try_lock() succeeds.

    hebasto commented at 9:12 am on September 1, 2021:
    0    void Enter(const char* pszName, const char* pszFile, int nLine)
    1    {
    2        EnterCritical(pszName, pszFile, nLine, Base::mutex());
    3        if (LogAcceptCategory(BCLog::LOCK) && !Base::try_lock()) {
    4            LogPrintf("Lock contention: %s, %s:%d\n", pszName, pszFile, nLine);
    5        }
    6        Base::lock();
    7    }
    

    hebasto commented at 9:18 am on September 1, 2021:
    nm, my suggestion is wrong.
  27. hebasto commented at 8:31 am on September 1, 2021: member
    * benchmarking ... the overhead of `LogAcceptCategory(BCLog::LOCK)`, which evaluates to `m_categories.load(std::memory_order_relaxed) & category) != 0`
    

    I think the benchmarking results could depend on a platform.

    Could we sacrifice the ability to turn “lock” debugging on/off at runtime, and use just a const?

  28. jonatack commented at 9:23 am on September 1, 2021: member

    Could we sacrifice the ability to turn “lock” debugging on/off at runtime, and use just a const?

    I could be wrong, and this goes way beyond my competence, but readings suggest that these options are low-level, fast, and near constant time compared to system calls, memory/cache coherency and fences, wait queues and (particularly) lock contentions. (Would a difference be apparent in, say, https://bitcoinperf.com? I don’t know, but if this helps us reduce contentions, that may be a win. @jamesob, @0xB10C, @martinus thoughts?)

  29. hebasto commented at 9:23 am on September 1, 2021: member

    Could we sacrifice the ability to turn “lock” debugging on/off at runtime, and use just a const?

    I could be wrong, and this goes way beyond my competence, but readings suggest that these options are low-level, fast, and near constant time compared to system calls, memory/cache coherency and fences, wait queues and (particularly) lock contentions. Would a difference be apparent in, say, https://bitcoinperf.com? I don’t know, but if this helps us reduce contentions, that may be a win.

    And m_categories.load(std::memory_order_relaxed) could cause cache lines switching, no?

  30. jonatack commented at 9:33 am on September 1, 2021: member

    And m_categories.load(std::memory_order_relaxed) could cause cache lines switching, no?

    Alternative version:

    0    void Enter(const char* pszName, const char* pszFile, int nLine)
    1    {
    2        EnterCritical(pszName, pszFile, nLine, Base::mutex());
    3        if (Base::try_lock()) return;
    4        Base::lock();
    5        LogPrint(BCLog::LOCK, "Lock contention: %s, %s:%d\n", pszName, pszFile, nLine);
    6    }
    
  31. hebasto commented at 9:40 am on September 1, 2021: member

    And m_categories.load(std::memory_order_relaxed) could cause cache lines switching, no?

    Alternative version:

    0    void Enter(const char* pszName, const char* pszFile, int nLine)
    1    {
    2        EnterCritical(pszName, pszFile, nLine, Base::mutex());
    3        if (Base::try_lock()) return;
    4        Base::lock();
    5        LogPrint(BCLog::LOCK, "Lock contention: %s, %s:%d\n", pszName, pszFile, nLine);
    6    }
    

    Nice!

    Maybe put logging before the blocking call Base::lock(); as it is now? The LogPrint call takes some time, and it could increase probability of releasing a mutex by another thread before the Base::lock() call.

  32. jonatack commented at 10:25 am on September 1, 2021: member

    And m_categories.load(std::memory_order_relaxed) could cause cache lines switching, no?

    I don’t know.

    Maybe put logging before the blocking call Base::lock(); as it is now? The LogPrint call takes some time, and it could increase probability of releasing a mutex by another thread before the Base::lock() call.

    Seems like a good idea–done.

  33. jonatack force-pushed on Sep 1, 2021
  34. hebasto approved
  35. hebasto commented at 10:41 am on September 1, 2021: member

    ACK df149f89dcd0c457c61d463e36c057bb064783b8, tested on Linux Mint 20.2 (x86_64).

    In my version I printed the duration too to make it clear from the log entry if the lock contention was significant or not. Perhaps worth adding here too?

    I think I’d need your help (propose a commit to pull in?) or you could do it as a follow-up.

     0--- a/src/sync.h
     1+++ b/src/sync.h
     2@@ -7,6 +7,7 @@
     3 #define BITCOIN_SYNC_H
     4 
     5 #include <logging.h>
     6+#include <logging/timer.h>
     7 #include <threadsafety.h>
     8 #include <util/macros.h>
     9 
    10@@ -136,7 +137,7 @@ private:
    11     {
    12         EnterCritical(pszName, pszFile, nLine, Base::mutex());
    13         if (Base::try_lock()) return;
    14-        LogPrint(BCLog::LOCK, "Lock contention: %s, %s:%d\n", pszName, pszFile, nLine);
    15+        LOG_TIME_MILLIS_WITH_CATEGORY(strprintf("Lock contention: %s, %s:%d", pszName, pszFile, nLine), BCLog::LOCK);
    16         Base::lock();
    17     }
    18 
    

    as a follow up?

  36. jonatack commented at 10:46 am on September 1, 2021: member

    as a follow up?

    Thanks–was about to re-push to drop the third commit. Will try your suggestion before re-pushing.

  37. hebasto commented at 10:48 am on September 1, 2021: member

    as a follow up?

    Thanks–was about to re-push to drop the third commit. Will try your suggestion before re-pushing.

    My only concerns are about wording – “Enter: Lock contention: cs_vNodes, net.cpp:2242 completed (0.07ms)” – “contention completed”?

  38. jonatack commented at 11:06 am on September 1, 2021: member

    My only concerns are about wording – “Enter: Lock contention: cs_vNodes, net.cpp:2242 completed (0.07ms)” – “contention completed”?

    I’m seeing the following (all of the times are 0.00ms or 0.01ms so far, and seems I should remove “PrintLockContention” or “Lock Contention” from the log output now)

    02021-09-01T11:03:59Z PrintLockContention: Lock contention: cs_main, net_processing.cpp:1476 started
    12021-09-01T11:03:59Z PrintLockContention: Lock contention: cs_main, net_processing.cpp:1476 completed (0.00ms)
    22021-09-01T11:03:59Z PrintLockContention: Lock contention: cs_vNodes, net.cpp:2242 started
    32021-09-01T11:03:59Z PrintLockContention: Lock contention: cs_vNodes, net.cpp:2242 completed (0.01ms)
    

    timer.h::LOG_TIME_MILLIS_WITH_CATEGORY times the execution of the whole Enter() function, the placement of Base::lock(); doesn’t matter for it?

  39. fanquake deleted a comment on Sep 1, 2021
  40. fanquake deleted a comment on Sep 1, 2021
  41. jonatack commented at 11:34 am on September 1, 2021: member
     0diff --git a/src/logging/timer.h b/src/logging/timer.h
     1index 159920e397..cadfe395ee 100644
     2--- a/src/logging/timer.h
     3+++ b/src/logging/timer.h
     4@@ -93,6 +93,8 @@ private:
     5 } // namespace BCLog
     6 
     7 
     8+#define LOG_TIME_MICROS_WITH_CATEGORY(end_msg, log_category) \
     9+    BCLog::Timer<std::chrono::microseconds> PASTE2(logging_timer, __COUNTER__)(__func__, end_msg, log_category)
    10 #define LOG_TIME_MILLIS_WITH_CATEGORY(end_msg, log_category) \
    11     BCLog::Timer<std::chrono::milliseconds> PASTE2(logging_timer, __COUNTER__)(__func__, end_msg, log_category)
    12 #define LOG_TIME_SECONDS(end_msg) \
    13diff --git a/src/sync.cpp b/src/sync.cpp
    14index a7b63ea986..eace86d9dd 100644
    15--- a/src/sync.cpp
    16+++ b/src/sync.cpp
    17@@ -24,9 +24,9 @@
    18 #include <utility>
    19 #include <vector>
    20 
    21-void PrintLockContention(const char* pszName, const char* pszFile, int nLine)
    22+void LockContention(const char* pszName, const char* pszFile, int nLine)
    23 {
    24-    LOG_TIME_MILLIS_WITH_CATEGORY(strprintf("%s, %s:%d", pszName, pszFile, nLine), BCLog::LOCK);
    25+    LOG_TIME_MICROS_WITH_CATEGORY(strprintf("%s, %s:%d", pszName, pszFile, nLine), BCLog::LOCK);
    26 }
    27 
    28 #ifdef DEBUG_LOCKORDER
    29diff --git a/src/sync.h b/src/sync.h
    30index 252dc95845..0a8ec9e58f 100644
    31--- a/src/sync.h
    32+++ b/src/sync.h
    33@@ -126,7 +126,8 @@ using RecursiveMutex = AnnotatedMixin<std::recursive_mutex>;
    34 /** Wrapped mutex: supports waiting but not recursive locking */
    35 typedef AnnotatedMixin<std::mutex> Mutex;
    36 
    37-void PrintLockContention(const char* pszName, const char* pszFile, int nLine);
    38+/** Print a lock contention to the log */
    39+void LockContention(const char* pszName, const char* pszFile, int nLine);
    40 
    41 /** Wrapper around std::unique_lock style lock for Mutex. */
    42 template <typename Mutex, typename Base = typename Mutex::UniqueLock>
    43@@ -137,7 +138,7 @@ private:
    44     {
    45         EnterCritical(pszName, pszFile, nLine, Base::mutex());
    46         if (Base::try_lock()) return;
    47-        PrintLockContention(pszName, pszFile, nLine);
    48+        LockContention(pszName, pszFile, nLine);
    49         Base::lock();
    50     }
    
    02021-09-01T11:29:03Z LockContention: cs_vNodes, net.cpp:1251 started
    12021-09-01T11:29:03Z LockContention: cs_vNodes, net.cpp:1251 completed (5μs)
    22021-09-01T11:29:03Z LockContention: pnode->cs_vSend, net.cpp:1373 started
    32021-09-01T11:29:03Z LockContention: pnode->cs_vSend, net.cpp:1373 completed (31μs)
    42021-09-01T11:29:03Z LockContention: cs_vNodes, net.cpp:2277 started
    52021-09-01T11:29:03Z LockContention: cs_vNodes, net.cpp:2277 completed (6μs)
    62021-09-01T11:29:04Z LockContention: cs_vNodes, net.cpp:2242 started
    72021-09-01T11:29:04Z LockContention: cs_vNodes, net.cpp:2242 completed (3μs)
    
  42. martinus commented at 11:47 am on September 1, 2021: contributor

    And m_categories.load(std::memory_order_relaxed) could cause cache lines switching, no?

    yes, the only guarantee with memory_order_relaxed is that the operation itself is atomic, but other operations are allowed to be shuffled that by the compiler. I personally try to avoid using any of the memory order stuff. It’s too easy to get wrong.

    By the way, I think that in that patch:

    0        if (LogAcceptCategory(BCLog::LOCK)) {
    1             if (!Base::try_lock()) {
    2                 LogPrintf("Lock contention: %s, %s:%d\n", pszName, pszFile, nLine);
    3                 Base::lock();
    4+            } else {
    5+                LogPrintf("No lock contention\n");
    6             }
    

    If Base::try_lock() fails strictly speaking that does not necessarily mean that there’s a lock contention. std::mutex’s try_lock() can also spuriously fail even when no other thread is currently holding a lock: https://en.cppreference.com/w/cpp/thread/mutex/try_lock I have no idea though how often that can happen.

  43. jonatack force-pushed on Sep 1, 2021
  44. jonatack force-pushed on Sep 1, 2021
  45. log, timer: add timing macro in usec LOG_TIME_MICROS_WITH_CATEGORY
    and update BCLog::LogMsg() to omit irrelevant decimals for microseconds
    and skip unneeded code and math.
    3f4c6b87f1
  46. jonatack force-pushed on Sep 1, 2021
  47. jonatack commented at 1:21 pm on September 1, 2021: member

    Thanks for the ideas and feedback. Updates:

    • added a timing macro in microseconds, LOG_TIME_MICROS_WITH_CATEGORY
    • improved BCLog::LogMsg() to omit irrelevant decimals for microseconds and skip unnecessary code/math
    • added time duration in microseconds to the lock contention logging

    If Base::try_lock() fails strictly speaking that does not necessarily mean that there’s a lock contention. std::mutex’s try_lock() can also spuriously fail even when no other thread is currently holding a lock: en.cppreference.com/w/cpp/thread/mutex/try_lock. I have no idea though how often that can happen.

    Good point. I read that doc previously and admit to discounting the spurious failures a little, as we don’t know the frequency. It seems to be an additional reason to log the contention duration.

  48. log, sync: improve lock contention logging and add time duration
    in microseconds.
    
    Change the function name in order to print "LockContention" instead
    of "PrintLockContention" to the log.  Add Doxygen documentation.
    
    With this change, the lock contention log prints:
    
    2021-09-01T11:29:03Z LockContention: pnode->cs_vSend, net.cpp:1373 started
    2021-09-01T11:29:03Z LockContention: pnode->cs_vSend, net.cpp:1373 completed (31μs)
    2021-09-01T11:29:03Z LockContention: cs_vNodes, net.cpp:2277 started
    2021-09-01T11:29:03Z LockContention: cs_vNodes, net.cpp:2277 completed (6μs)
    2021-09-01T11:29:04Z LockContention: cs_vNodes, net.cpp:2242 started
    2021-09-01T11:29:04Z LockContention: cs_vNodes, net.cpp:2242 completed (3μs)
    
    Co-authored-by: Hennadii Stepanov <32963518+hebasto@users.noreply.github.com>
    Co-authored-by: practicalswift <practicalswift@users.noreply.github.com>
    9b08006bc5
  49. sync: remove DEBUG_LOCKCONTENTION preprocessor directives
    to allow logging the lock contentions without the need to define
    DEBUG_LOCKCONTENTION at compile time.
    7e69873283
  50. jonatack force-pushed on Sep 1, 2021
  51. hebasto approved
  52. hebasto commented at 8:59 pm on September 2, 2021: member
    re-ACK 7e698732836121912f179b7c743a72dd6fdffa72, added a contention duration to the log message since my previous review.
  53. theStack approved
  54. theStack commented at 0:28 am on September 5, 2021: member

    re-ACK 7e698732836121912f179b7c743a72dd6fdffa72 🔏 ⏲️

    Since my last ACK, the lock contention logging was improved by also showing the duration in micro-seconds, and an alternate (more elegant) version of UniqueLock::Enter is now used that doesn’t need to check whether the lock category is on if there is no contention (https://github.com/bitcoin/bitcoin/pull/22736#issuecomment-910110140).

    Tested by building with clang 10.0.1 on OpenBSD 6.9 and running ./src/bitcoind -signet -debug=lock on an empty signet datadir.

    On my system, during IBD the vast majority of lock contentions happen in

    • cs_main, net_processing.cpp:1476
    • m_mutex, ./checkqueue.h:168
    • m_mutex, ./checkqueue.h:77

    After IBD is finished, all lock contentions are in

    • cs_vNodes, net.cpp (different lines)

    Most contentions are in the range of 5-10us. (Note that these results are not based on deeper analysis, but on what I saw scrolling through the logs after a few minutes).

  55. jonatack commented at 11:14 am on September 5, 2021: member
    Thanks for testing! For me, there seems to be a higher frequency of contention and occuring in more places on mainnet than on signet, presumably as the scale/numbers are larger. Also, a big difference in contention durations between a regular build and a debug build, which makes sense.
  56. MarcoFalke merged this on Sep 6, 2021
  57. MarcoFalke closed this on Sep 6, 2021

  58. jonatack deleted the branch on Sep 6, 2021
  59. in src/logging/timer.h:70 in 7e69873283
    69-        if (std::is_same<TimeType, std::chrono::microseconds>::value) {
    70-            units = "μs";
    71-        } else if (std::is_same<TimeType, std::chrono::milliseconds>::value) {
    72+        if (std::is_same<TimeType, std::chrono::milliseconds>::value) {
    73             units = "ms";
    74             divisor = 1000.;
    


    MarcoFalke commented at 9:23 am on September 6, 2021:
    unrelated: Could make sense to inline those as well to avoid the heterogeneous code flow?

    jonatack commented at 11:09 pm on September 6, 2021:
    good idea, done (and more) in #22904
  60. in src/sync.cpp:33 in 9b08006bc5 outdated
    27@@ -27,10 +28,9 @@
    28 #if !defined(HAVE_THREAD_LOCAL)
    29 static_assert(false, "thread_local is not supported");
    30 #endif
    31-void PrintLockContention(const char* pszName, const char* pszFile, int nLine)
    32+void LockContention(const char* pszName, const char* pszFile, int nLine)
    33 {
    34-    LogPrint(BCLog::LOCK, "LOCKCONTENTION: %s\n", pszName);
    35-    LogPrint(BCLog::LOCK, "Locker: %s:%d\n", pszFile, nLine);
    36+    LOG_TIME_MICROS_WITH_CATEGORY(strprintf("%s, %s:%d", pszName, pszFile, nLine), BCLog::LOCK);
    


    martinus commented at 5:21 pm on September 6, 2021:

    I think LockContention doesn’t do what it is supposed to do. At least it doesn’t log the time that the contention actually takes: the macro creates an BCLog::Timer instance where constructor measures start time, and destructor prints finished. But since all that is done inside the LockContention function, this happens instantly and doesn’t measure the actual lock contention.

    I’ve verified that in #21006 because I wanted to see how long the contentions are, but it always prints 0µs or 1µs. When I move the macro into sync.h and replace

    0LockContention(pszName, pszFile, nLine); // log the contention
    

    with

    0LOG_TIME_MICROS_WITH_CATEGORY(strprintf("%s, %s:%d", pszName, pszFile, nLine), BCLog::LOCK);
    

    I get reasonable contention times, like so:

    02021-09-06T17:18:23Z Enter: cs_main, rest.cpp:281 completed (65406μs)
    12021-09-06T17:18:23Z Enter: cs_main, rest.cpp:281 completed (65331μs)
    22021-09-06T17:18:23Z Enter: cs_main, rest.cpp:281 completed (65074μs)
    32021-09-06T17:18:23Z Enter: cs_main, rest.cpp:281 completed (65103μs)
    42021-09-06T17:18:23Z Enter: cs_main, net_processing.cpp:4051 completed (67687μs)
    

    jonatack commented at 5:53 pm on September 6, 2021:
    Interesting, makes sense that it would only be timing the LockContention() function. Right now on mainnet with a non-debug clang 13 build running since 6 hours (2 cores x 2.50GHz), I see varying durations of 1-50 usec, most often 2-30. I suspect your machine is much faster than mine (and that I may be only seeing the time of the macro function as you suggest). Will try your suggestion.

    jonatack commented at 11:08 pm on September 6, 2021:
    @martinus This does seem better, now seeing durations of 1 to several million usec (mainnet, non-debug build). Thanks for reporting. Proposed fix in #22904.
  61. sidhujag referenced this in commit cac636697a on Sep 7, 2021
  62. MarcoFalke referenced this in commit d2dd1697ce on Sep 9, 2021
  63. sidhujag referenced this in commit 388bc0e0cc on Sep 11, 2021
  64. fanquake referenced this in commit c1059c9fef on Apr 8, 2022
  65. DrahtBot locked this on Sep 6, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-07-08 22:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me