tracing: lock contention analysis #25081

pull martinus wants to merge 6 commits into bitcoin:master from martinus:2022-05-lock-tracking-2 changing 3 files +259 −67
  1. martinus commented at 4:04 pm on May 7, 2022: contributor

    Currently there is a macro DEBUG_LOCKCONTENTION (see developer-notes.md) to enable logging of contention. The disadvantage of this method is that bitcoind needs to be especially compiled with that enabled, and when enabled this quickly produces huge log files, and has a relatively large overhead.

    This PR introduces some tracepoints and adds a tracing script lock_contention_analysis.bt that can be started at any time while bitcoind is running. This has low overhead and gives better analysis possibilities.

    When the script is attached with sudo bpftrace contrib/tracing/lock_contention_analysis.bt, you’ll get a markdown formatted output that looks like this:

    timestamp Lock waiting duration waiting at blocked by
    12:41:05.512089 m_mutex 4491µs ./checkqueue.h ( 175) ./checkqueue.h ( 78)
    12:41:05.861322 cs_main 535058µs net_processing.cpp (1607) validation.cpp (2994)
    12:41:06.655427 m_mutex 4012µs ./checkqueue.h ( 78) ./checkqueue.h ( 78)
    12:41:06.675967 cs_main 650117µs net_processing.cpp (1607) validation.cpp (2994)
    12:41:06.894090 m_mutex 2747µs ./checkqueue.h ( 78) ./checkqueue.h ( 78)
    12:41:06.961872 cs_main 142277µs net_processing.cpp (1607) validation.cpp (2994)
    12:41:07.363535 cs_main 395851µs net_processing.cpp (1607) validation.cpp (2994)
    12:41:07.583226 cs_main 69281µs net_processing.cpp (1607) validation.cpp (2994)
    12:41:07.746665 cs_main 157604µs net_processing.cpp (1607) validation.cpp (2335)
    12:41:07.958787 m_mutex 3041µs ./checkqueue.h ( 175) ./checkqueue.h ( 78)
    12:41:07.958801 m_mutex 2970µs ./checkqueue.h ( 78) ./checkqueue.h ( 175)
    12:41:07.958842 m_mutex 2956µs ./checkqueue.h ( 78) ./checkqueue.h ( 175)
    12:41:07.958853 m_mutex 2898µs ./checkqueue.h ( 78) ./checkqueue.h ( 78)
    12:41:07.958862 m_mutex 2820µs ./checkqueue.h ( 78) ./checkqueue.h ( 78)
    12:41:07.959201 m_mutex 3028µs ./checkqueue.h ( 78) ./checkqueue.h ( 78)
    12:41:07.959218 m_mutex 2916µs ./checkqueue.h ( 78) ./checkqueue.h ( 78)

    When monitoring is stopped (press Ctrl+C), the script prints the source locations where it has waited the longest while the script was running (unfortunately there seems to be no way to format this more nicely):

     0Monitored from 17:12:18.513400 to 17:42:08.565786. top 50 lock contention locations (file, lineNo): µs
     1[...]
     2[@locked](/bitcoin-bitcoin/contributor/locked/)_at_source_line[./net.h, 1233]: 334654
     3[@locked](/bitcoin-bitcoin/contributor/locked/)_at_source_line[txmempool.cpp, 823]: 401472
     4[@locked](/bitcoin-bitcoin/contributor/locked/)_at_source_line[net_processing.cpp, 4504]: 661670
     5[@locked](/bitcoin-bitcoin/contributor/locked/)_at_source_line[net_processing.cpp, 4872]: 668911
     6[@locked](/bitcoin-bitcoin/contributor/locked/)_at_source_line[txmempool.cpp, 1096]: 681892
     7[@locked](/bitcoin-bitcoin/contributor/locked/)_at_source_line[net_processing.cpp, 4857]: 825963
     8[@locked](/bitcoin-bitcoin/contributor/locked/)_at_source_line[net_processing.cpp, 4724]: 826719
     9[@locked](/bitcoin-bitcoin/contributor/locked/)_at_source_line[net_processing.cpp, 4677]: 827335
    10[@locked](/bitcoin-bitcoin/contributor/locked/)_at_source_line[net_processing.cpp, 4154]: 827972
    11[@locked](/bitcoin-bitcoin/contributor/locked/)_at_source_line[net_processing.cpp, 4222]: 828863
    12[@locked](/bitcoin-bitcoin/contributor/locked/)_at_source_line[net.cpp, 2329]: 831334
    13[@locked](/bitcoin-bitcoin/contributor/locked/)_at_source_line[net_processing.cpp, 4217]: 831451
    14[@locked](/bitcoin-bitcoin/contributor/locked/)_at_source_line[net_processing.cpp, 4231]: 839408
    15[@locked](/bitcoin-bitcoin/contributor/locked/)_at_source_line[net_processing.cpp, 4198]: 846353
    16[@locked](/bitcoin-bitcoin/contributor/locked/)_at_source_line[util/system.cpp, 1035]: 883117
    17[@locked](/bitcoin-bitcoin/contributor/locked/)_at_source_line[net.cpp, 1426]: 1398949
    18[@locked](/bitcoin-bitcoin/contributor/locked/)_at_source_line[net.cpp, 1422]: 1416004
    19[@locked](/bitcoin-bitcoin/contributor/locked/)_at_source_line[net.cpp, 1611]: 1564984
    20[@locked](/bitcoin-bitcoin/contributor/locked/)_at_source_line[net_processing.cpp, 4205]: 1668173
    21[@locked](/bitcoin-bitcoin/contributor/locked/)_at_source_line[net_processing.cpp, 1334]: 1704239
    22[@locked](/bitcoin-bitcoin/contributor/locked/)_at_source_line[validation.cpp, 5173]: 2899013
    

    For this to work I had to add traces to UniqueLock and AnnotatedMixin, and move the implementation into the cpp file (without moving, due to inlining the maximum number of traces of 512 is quickly exhausted)

  2. sync: Move UniqueLock implementation into sync.cpp
    This is a templated class, so this also adds the two needed explicit
    instantiations. Why do this? having tracepoints in the header leads
    to lots of traces which quickly exhaust the limit. Therefore this moves
    the implementation into the .cpp file.
    0a2c89313d
  3. sync: remember pszName, pszFile, nLine as members
    Stores the arguments as members so they are available in the destructor.
    This will be useful later when adding tracepoints.
    
    Note that it is remembered as a `char const*`, so this only works for
    string literals where long lifetime is guaranteed. UniqueLock is only
    used with the `LOCK`/`LOCK2` macro, so this is fine. Using
    `std::string` would be an unnecessary overhead.
    4d2cce5216
  4. sync: add trace points to UniqueLock
    This enables tracing of locks & lock contention analysis.
    6afeaa7b6f
  5. sync: move AnnotatedMixin implementation into sync.cpp
    Same as `UniqueLock`, moves the implementation of `AnnotatedMixin` into
    the cpp file. That way we can add tracepoints to the creation and
    destruction of a mutex.
    f815bf99e7
  6. sync: add trace points to AnnotatedMixin ed2c4973f5
  7. tracing: add lock_contention_analysis based on the sync tracers
    This bpftrace script can be used to analyze lock contention. It shows
    code locations that have to wait a long time to actually acquire a
    lock, and also shows the last location that lock was held, to see
    who was actually responsible for blocking.
    
    When monitoring ends, prints the locations with the longest waiting
    durations.
    f7917c2697
  8. jonatack commented at 4:21 pm on May 7, 2022: contributor
    Interesting, would like to give this a try soon.
  9. DrahtBot added the label Scripts and tools on May 7, 2022
  10. 0xB10C commented at 6:56 pm on May 7, 2022: contributor

    This looks interesting. Planning to play around with this, e.g. running this on bitcoind.observer.

    I think it would be good to document these tracepoints in doc/tracing.md and, if possible, have an interface test similar to these added in e.g. #24358 (test/functional/interface_usdt_*.py).

    Edit: Oh also, documentation of the bpftrace script in contrib/tracing/README.md would be nice :).

  11. DrahtBot commented at 6:30 am on May 8, 2022: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Concept ACK laanwj, hebasto, jb55

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #25819 (msvc, refactor: Avoid some rare compiler warnings by hebasto)
    • #25676 (sync: simplify and remove unused code from sync.h by vasild)
    • #24923 (Rework logging timer by MarcoFalke)
    • #16545 (refactor: Implement missing error checking for ArgsManager flags by ryanofsky)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  12. martinus commented at 7:09 am on May 8, 2022: contributor

    have an interface test similar to these added in e.g. #24358 (test/functional/interface_usdt_*.py).

    Oh, I didn’t even know we had something like that… I guess I need to learn more about eBPF and python :sweat_smile:

  13. jonatack commented at 8:00 am on May 8, 2022: contributor

    bitcoind needs to be especially compiled with that enabled

    For info, #24757 would define it by default in debug builds (--enable-debug).

  14. laanwj commented at 7:31 pm on May 9, 2022: member

    Concept ACK, I like using tracepoints for this kind of analysis instead of our own instrumentation.

    For this to work I had to add traces to UniqueLock and AnnotatedMixin, and move the implementation into the cpp file (without moving, due to inlining the maximum number of traces of 512 is quickly exhausted)

    No longer inlining the locks will add some overhead to locking even in the no-tracing case. We’ll need to benchmark this.

  15. hebasto commented at 8:42 pm on May 9, 2022: member
    Concept ACK.
  16. jamesob commented at 8:24 pm on May 17, 2022: member

    Hvaen’t reviewed the code yet, but just ran a quick IBD bench on a “contemporary” region of the chain to get a sense of any performance impact. No significant slowdown there.

    commands index

    bench name command
    ibd.local.range.667200.677200 bitcoind -dbcache=10000 -debug=coindb -debug=bench -listen=0 -connect=0 -addnode=127.0.0.1:8888 -prune=9999999 -printtoconsole=0 -assumevalid=000000000000000000176c192f42ad13ab159fdb20198b87e7ba3c001e47b876

    #25081 vs. $mergebase (absolute)

    bench name x #25081 $mergebase
    ibd.local.range.667200.677200.total_secs 2 1738.2652 (± 3.4354) 1736.2869 (± 0.6808)
    ibd.local.range.667200.677200.peak_rss_KiB 2 3518128.0000 (± 196.0000) 3518346.0000 (± 3050.0000)
    ibd.local.range.667200.677200.cpu_kernel_secs 2 126.4900 (± 1.4500) 125.9750 (± 0.5650)
    ibd.local.range.667200.677200.cpu_user_secs 2 10095.4250 (± 1.1050) 10089.9250 (± 1.4550)

    #25081 vs. $mergebase (relative)

    bench name x #25081 $mergebase
    ibd.local.range.667200.677200.total_secs 2 1.001 1.00
    ibd.local.range.667200.677200.peak_rss_KiB 2 1.000 1.00
    ibd.local.range.667200.677200.cpu_kernel_secs 2 1.004 1.00
    ibd.local.range.667200.677200.cpu_user_secs 2 1.001 1.00

    Reproduce with

    0bitcoinperf bench-pr --num-blocks 10_000 --run-id lock-tracing \
    1  --bitcoind-args='-dbcache=10000' --run-count 2 25081
    
  17. jb55 commented at 10:06 pm on May 17, 2022: contributor

    Concept ACK!

    This was one of the main usecases of tracepoints that I had in mind! Very nice, will test.

  18. 0xB10C commented at 10:19 pm on May 17, 2022: contributor

    I’ve started playing around with these tracepoints. It seems like there are two ELF notes generated for each sync tracepoint.

     0...
     1  stapsdt              0x0000002f	NT_STAPSDT (SystemTap probe descriptors)
     2    Provider: sync
     3    Name: mutex_ctor
     4    Location: 0x00000000005dca7b, Base: 0x000000000086f360, Semaphore: 0x0000000000000000
     5    Arguments: 8@%rdi
     6  stapsdt              0x0000002f	NT_STAPSDT (SystemTap probe descriptors)
     7    Provider: sync
     8    Name: mutex_dtor
     9    Location: 0x00000000005dcab8, Base: 0x000000000086f360, Semaphore: 0x0000000000000000
    10    Arguments: 8@%rdi
    11  stapsdt              0x0000002f	NT_STAPSDT (SystemTap probe descriptors)
    12    Provider: sync
    13    Name: mutex_ctor
    14    Location: 0x00000000005dcc12, Base: 0x000000000086f360, Semaphore: 0x0000000000000000
    15    Arguments: 8@%rdi
    16  stapsdt              0x0000002f	NT_STAPSDT (SystemTap probe descriptors)
    17    Provider: sync
    18    Name: mutex_dtor
    19    Location: 0x00000000005dcc48, Base: 0x000000000086f360, Semaphore: 0x0000000000000000
    20    Arguments: 8@%rdi
    21  stapsdt              0x0000004e	NT_STAPSDT (SystemTap probe descriptors)
    22    Provider: sync
    23    Name: enter
    24    Location: 0x00000000005dcd8c, Base: 0x000000000086f360, Semaphore: 0x0000000000000000
    25    Arguments: 8@(%rdi) 8@16(%rdi) 8@24(%rdi) -4@32(%rdi)
    26  stapsdt              0x0000004d	NT_STAPSDT (SystemTap probe descriptors)
    27    Provider: sync
    28    Name: locked
    29    Location: 0x00000000005dcdb5, Base: 0x000000000086f360, Semaphore: 0x0000000000000000
    30    Arguments: 8@%rdi 8@16(%rbx) 8@24(%rbx) -4@32(%rbx)
    31  stapsdt              0x00000051	NT_STAPSDT (SystemTap probe descriptors)
    32    Provider: sync
    33    Name: try_locked
    34    Location: 0x00000000005dce5c, Base: 0x000000000086f360, Semaphore: 0x0000000000000000
    35    Arguments: 8@%rdi 8@16(%rbx) 8@24(%rbx) -4@32(%rbx)
    36  stapsdt              0x00000044	NT_STAPSDT (SystemTap probe descriptors)
    37    Provider: sync
    38    Name: unlock
    39    Location: 0x00000000005dceae, Base: 0x000000000086f360, Semaphore: 0x0000000000000000
    40    Arguments: 8@(%rdi) 8@24(%rdi) -4@32(%rdi)
    41  stapsdt              0x0000004e	NT_STAPSDT (SystemTap probe descriptors)
    42    Provider: sync
    43    Name: enter
    44    Location: 0x00000000005dd14c, Base: 0x000000000086f360, Semaphore: 0x0000000000000000
    45    Arguments: 8@(%rdi) 8@16(%rdi) 8@24(%rdi) -4@32(%rdi)
    46  stapsdt              0x0000004d	NT_STAPSDT (SystemTap probe descriptors)
    47    Provider: sync
    48    Name: locked
    49    Location: 0x00000000005dd175, Base: 0x000000000086f360, Semaphore: 0x0000000000000000
    50    Arguments: 8@%rdi 8@16(%rbx) 8@24(%rbx) -4@32(%rbx)
    51  stapsdt              0x00000051	NT_STAPSDT (SystemTap probe descriptors)
    52    Provider: sync
    53    Name: try_locked
    54    Location: 0x00000000005dd21c, Base: 0x000000000086f360, Semaphore: 0x0000000000000000
    55    Arguments: 8@%rdi 8@16(%rbx) 8@24(%rbx) -4@32(%rbx)
    56  stapsdt              0x00000044	NT_STAPSDT (SystemTap probe descriptors)
    57    Provider: sync
    58    Name: unlock
    59    Location: 0x00000000005dd26e, Base: 0x000000000086f360, Semaphore: 0x0000000000000000
    60    Arguments: 8@(%rdi) 8@24(%rdi) -4@32(%rdi)
    

    I think, this is caused the templates were using - and that’s not something we want to change. E.g. https://github.com/bitcoin/bitcoin/blob/f7917c26970956bf5fc423b9cc995554694c87be/src/sync.cpp#L329-L333

    Two ELF notes with the same context/provider and name cause problems for me. E.g., when using bcc via the binary path (as we do it in e.g. the interface tests). According to https://github.com/iovisor/bcc/issues/1774#issuecomment-390723275 this works when using the PID, which we can do for the interface tests. Still, supporting hooking into the tracepoints both via PID and binary path would be good, but I’m not sure if that’s possible in this case.

  19. martinus commented at 10:39 am on May 18, 2022: contributor

    there are two ELF notes generated for each sync tracepoint

    Currently the templates are instantiated only for std::mutex and std::recursive_mutex, so workaround might be to do something like this:

     0template <typename PARENT>
     1AnnotatedMixin<PARENT>::AnnotatedMixin()
     2{
     3    if constexpr (std::is_same_v<std::mutex, PARENT>) {
     4        TRACE1(sync, mutex_ctor, this);
     5    } else if constexpr (std::is_same_v<std::recursive_mutex, PARENT>) {
     6        TRACE1(sync, recursive_mutex_ctor, this);
     7    } else {
     8        assert(false && "unknown lock used");
     9    }
    10}
    

    Unfortunately currently I don’t have much time to work on this though

  20. jb55 commented at 4:14 pm on May 18, 2022: contributor

    there are two ELF notes generated for each sync tracepoint

    If the template is getting instantiated twice creating two tracepoints, why not just create a standalone function containing the trace? Then perhaps we wouldn’t need to move the header-only templates as well.

  21. DrahtBot added the label Needs rebase on Oct 3, 2022
  22. DrahtBot commented at 7:04 pm on October 3, 2022: contributor

    🐙 This pull request conflicts with the target branch and needs rebase.

    Want to unsubscribe from rebase notifications on this pull request? Just convert this pull request to a “draft”.

  23. martinus marked this as a draft on Nov 30, 2022
  24. DrahtBot commented at 1:00 am on February 28, 2023: contributor

    There hasn’t been much activity lately and the patch still needs rebase. What is the status here?

    • Is it still relevant? ➡️ Please solve the conflicts to make it ready for review and to ensure the CI passes.
    • Is it no longer relevant? ➡️ Please close.
    • Did the author lose interest or time to work on this? ➡️ Please close it and mark it ‘Up for grabs’ with the label, so that it can be picked up in the future.
  25. martinus commented at 8:05 am on February 28, 2023: contributor
    Hi, I lost interest / don’t have time for this unfortunately. You can close it and mark it as up-for-grabs.
  26. fanquake closed this on Feb 28, 2023

  27. fanquake added the label Up for grabs on Feb 28, 2023
  28. 0xB10C referenced this in commit 598a59c6ec on Mar 14, 2023
  29. bitcoin locked this on Feb 28, 2024

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-09-29 01:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me