[Arch] Bitcoin stalls randomly, pegs a thread until manually stopped #25992

ExperiBass opened this issue on September 3, 2022
  1. ExperiBass commented at 3:57 AM on September 3, 2022: none

    I was told to open an issue here. I've been having a really annoying issue with my Bitcoin client that I can't find a solution to. Randomly, my Bitcoin peer will start failing to send requests. At the same time, it pegs an entire thread, and it stays that way until I intervene and restart it. I've tried everything, even using a different computer with the default config (other than debug=1). I can't find anyone else having this issue, and nothing in my logs stands out to me. One thing to note: when I ran my test with the default config, I used bitcoin-qt to monitor it, and it still reported being connected to 11 peers, although I couldn't add or remove any.

    Expected behavior

    I expect my bitcoin node to stay up without me needing to intervene.

    Actual behavior

    My node instead stalls and pegs a thread.

    To reproduce

    I can reproduce it every time I start my client, although the stall occurs at a random point, so far anywhere from 4 to 48 hours after startup.

    System information

    Main peer:

    • OS: EndeavourOS Linux x86_64
    • Host: MacBookAir7,2 1.0
    • Kernel: 5.19.1-zen1-1-zen
    • CPU: Intel i5-5350U @ 2.900GHz
    • Bitcoin Core v23.0.0 (self-compiled), used bitcoind
    • WM: none (headless)

    Clean test peer:

    • OS: EndeavourOS Linux x86_64
    • Host: MacBookPro16,2 1.0
    • Kernel: 5.18.16-arch1
    • CPU: Intel i5-1038NG7 @ 3.800GHz
    • Bitcoin Core v23.0.0 (downloaded from github releases), used bitcoin-qt
    • WM: i3
    • Graphical Shell: Kitty

    Logs

    Full logs for both machines are here.

  2. ExperiBass added the label Bug on Sep 3, 2022
  3. MarcoFalke commented at 10:35 AM on September 5, 2022: member

    I've tried everything, even using a different computer with the default config

    What are the exact steps to reproduce on a different computer?

  4. ExperiBass commented at 3:26 PM on September 5, 2022: none

    What are the exact steps to reproduce on a different computer?

    1. Download Bitcoin Core 23.0
    2. Run either bitcoind or bitcoin-qt (thereby using only the default config)
    3. Wait for it to eventually stall
  5. MarcoFalke commented at 9:59 AM on September 6, 2022: member
    • The log says something about i2p, which is not enabled by default, I think, so you may not be using the "default config"?
    • Does the issue happen on different OSes as well?
    • Which thread is "pegged"? I presume this means it runs at 100% CPU?
  6. ExperiBass commented at 1:07 PM on September 6, 2022: none
    • The log says something about i2p, which is not enabled by default, I think, so you may not be using the "default config"?

    Yes, that's the main peer. The second log, from my test peer, uses the default config.

    • Does the issue happen on different OSes as well?

    I only have Linux to test on at the moment.

    • Which thread is "pegged"? I presume this means it runs at 100% CPU?

    It's not a specific thread, as it jumps around (Linux CPU scheduling maybe?), but it does hog 100% of a thread until killed.

  7. MarcoFalke commented at 1:14 PM on September 6, 2022: member

    I only have Linux to test on at the moment.

    Oh sorry, I meant if this happens on other Linux distros as well?

    It's not a specific thread, as it jumps around (Linux CPU scheduling maybe?), but it does hog 100% of a thread until killed.

    It would help to see the name of the thread(s) that are at 100%. I think htop with thread names enabled can show this to you.
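
Where htop is not available, the same information can be read from procfs on Linux. The following is only a minimal sketch, not anything from the thread: the function names `parse_stat` and `thread_cpu_times` are hypothetical, and it assumes a Linux system where `/proc/<pid>/task/<tid>/comm` and `/proc/<pid>/task/<tid>/stat` are readable for the target process.

```python
import os


def parse_stat(stat_line):
    """Return (utime, stime) in clock ticks from one procfs stat line."""
    # The thread name is wrapped in parentheses and may contain spaces,
    # so split after the last ')' before indexing the remaining fields.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state); utime and stime are fields 14 and 15.
    return int(rest[11]), int(rest[12])


def thread_cpu_times(pid):
    """Map thread id -> (thread name, CPU seconds used so far) for pid."""
    ticks_per_second = os.sysconf("SC_CLK_TCK")
    task_dir = f"/proc/{pid}/task"
    result = {}
    for tid in os.listdir(task_dir):
        try:
            with open(f"{task_dir}/{tid}/comm") as f:
                name = f.read().strip()
            with open(f"{task_dir}/{tid}/stat") as f:
                utime, stime = parse_stat(f.read())
        except (FileNotFoundError, ProcessLookupError):
            continue  # the thread exited while we were reading it
        result[int(tid)] = (name, (utime + stime) / ticks_per_second)
    return result
```

Calling `thread_cpu_times(pid)` twice a few seconds apart and comparing the CPU seconds per thread shows which named thread is accumulating time.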

  8. ExperiBass commented at 5:17 PM on September 6, 2022: none

    Oh sorry, I meant if this happens on other Linux distros as well?

    Spinning up a Linux Mint VM now to test. I'm using 21, so it's based on Ubuntu 22.04.

    It would help to see the name of the thread(s) that are at 100%. I think htop with thread names enabled can show this to you.

    On my main peer (the one with everything enabled) it appears to be the i2p_accept thread. [image]

    I'm starting the clean peer to see which thread it fails on (because it also failed, but it doesn't have i2p enabled). I'll edit with the result.

    Edit: no result yet, but the main peer stopped again, with a different thread pegging the core. [image]

  9. MarcoFalke added the label P2P on Sep 9, 2022
  10. MarcoFalke commented at 5:44 AM on September 9, 2022: member

    So it looks like the issue happens somewhere low level in the network stack. Can you attach gdb or something like it to see where it spends the cycles in the net thread? Also, could you reproduce on a different distro?

  11. ExperiBass commented at 1:37 PM on September 9, 2022: none

    So it looks like the issue happens somewhere low level in the network stack. Can you attach gdb or something like it to see where it spends the cycles in the net thread?

    Sure, I'll restart and start debugging.

    Also, could you reproduce on a different distro?

    Still waiting for the Linux Mint install to fail, so we'll see.

  12. ExperiBass commented at 2:13 PM on September 11, 2022: none

    Update: I can't get any of them to stall anymore. We are somehow pushing 90 hours of uptime? Very confused, but hoping the issue happens soon; unresolved bugs are the worst...

  13. ghost commented at 10:54 PM on September 11, 2022: none

    Bitcoin Core 22.0 never fails. The solution is to downgrade to 22.0.

  14. MarcoFalke removed the label Bug on Sep 12, 2022
  15. MarcoFalke added the label Questions and Help on Sep 12, 2022
  16. ExperiBass commented at 9:33 PM on September 12, 2022: none

    Yeah, I'm... baffled. We're approaching hour 120 and it shows no sign of stopping. I also can't get any of my other peers to crash, even though they were crashing regularly before.

  17. MarcoFalke commented at 6:04 AM on September 15, 2022: member

    Ok, closing for now. Let us know if you have more details

  18. MarcoFalke closed this on Sep 15, 2022

  19. ExperiBass commented at 4:39 PM on September 19, 2022: none

    b-net has decided to peg a core again! I wonder if it could be an upstream issue? I used trickle to keep the bandwidth down, as mine is limited, but I wasn't able to find any similar issues. I also wasn't able to get the cycles spent; I didn't see a command for that 🤔

  20. MarcoFalke commented at 7:17 AM on September 20, 2022: member

    If it does happen again, please attach gdb, as mentioned in #25992 (comment)
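
For readers unsure how to capture this, one possible way is to run gdb non-interactively against the stalled process and dump every thread's stack. This is a sketch under assumptions, not a procedure given in the thread: the helper names `gdb_backtrace_cmd` and `dump_backtraces` are hypothetical, gdb must be installed, and the user must be permitted to attach to the process.

```python
import subprocess


def gdb_backtrace_cmd(pid):
    """Build a gdb command line that attaches to pid and prints all stacks."""
    # -p attaches to a running process, -batch exits once the commands have
    # run, and -ex supplies a gdb command to execute.
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all bt"]


def dump_backtraces(pid):
    """Run gdb against pid and return whatever it printed to stdout."""
    done = subprocess.run(
        gdb_backtrace_cmd(pid),
        capture_output=True,
        text=True,
        check=False,  # gdb may exit non-zero if attaching is not permitted
    )
    return done.stdout
```

Taking two or three dumps a few seconds apart and checking whether the busy thread keeps showing the same top frames gives a rough picture of where the cycles go.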

  21. ExperiBass commented at 9:59 PM on September 22, 2022: none

    It's pegged again, and gdb is attached. The stack is:

    #0  0x00007f368a6a1a3f in bwstat_getdelay () from /usr/lib/trickle/trickle-overload.so
    #1  0x00007f368a6a0b0e in ?? () from /usr/lib/trickle/trickle-overload.so
    #2  0x00007f368a6a0f39 in poll () from /usr/lib/trickle/trickle-overload.so
    #3  0x0000556d439f2284 in poll (__timeout=50, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:39
    #4  Sock::WaitMany (this=<optimized out>, timeout=..., events_per_sock=std::unordered_map with 32 elements = {...}) at util/sock.cpp:156
    #5  0x0000556d43665612 in CConnman::SocketHandler (this=0x556d46241760) at net.cpp:1540
    #6  0x0000556d436657f8 in CConnman::ThreadSocketHandler (this=0x556d46241760) at net.cpp:1759
    #7  operator() (__closure=<optimized out>) at net.cpp:2718
    #8  std::__invoke_impl<void, CConnman::Start(CScheduler&, const Options&)::<lambda()>&> (__f=...) at /usr/include/c++/12.2.0/bits/invoke.h:61
    #9  std::__invoke_r<void, CConnman::Start(CScheduler&, const Options&)::<lambda()>&> (__fn=...) at /usr/include/c++/12.2.0/bits/invoke.h:111
    #10 std::_Function_handler<void(), CConnman::Start(CScheduler&, const Options&)::<lambda()> >::_M_invoke(const std::_Any_data &) (__functor=...)
        at /usr/include/c++/12.2.0/bits/std_function.h:290
    #11 0x0000556d43a12c15 in std::function<void ()>::operator()() const (this=0x7f358cffebf0) at /usr/include/c++/12.2.0/bits/std_function.h:591
    #12 util::TraceThread(char const*, std::function<void ()>) (thread_name=<optimized out>, thread_func=...) at util/thread.cpp:19
    #13 0x0000556d4364b4cc in std::__invoke_impl<void, void (*)(char const*, std::function<void()>), char const*, CConnman::Start(CScheduler&, const Options&)::<lambda()> > (__f=<optimized out>) at /usr/include/c++/12.2.0/bits/invoke.h:61
    #14 std::__invoke<void (*)(char const*, std::function<void()>), char const*, CConnman::Start(CScheduler&, const Options&)::<lambda()> > (__fn=<optimized out>)
        at /usr/include/c++/12.2.0/bits/invoke.h:96
    #15 std::thread::_Invoker<std::tuple<void (*)(char const*, std::function<void()>), char const*, CConnman::Start(CScheduler&, const Options&)::<lambda()> > >::_M_invoke<0, 1, 2> (this=<optimized out>) at /usr/include/c++/12.2.0/bits/std_thread.h:252
    #16 std::thread::_Invoker<std::tuple<void (*)(char const*, std::function<void()>), char const*, CConnman::Start(CScheduler&, const Options&)::<lambda()> > >::operator() (this=<optimized out>) at /usr/include/c++/12.2.0/bits/std_thread.h:259
    #17 std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (*)(char const*, std::function<void()>), char const*, CConnman::Start(CScheduler&, const Options&)::<lambda()> > > >::_M_run(void) (this=<optimized out>) at /usr/include/c++/12.2.0/bits/std_thread.h:210
    #18 0x00007f368a2d62f3 in std::execute_native_thread_routine (__p=0x556d4ff731c0) at /usr/src/debug/gcc/libstdc++-v3/src/c++11/thread.cc:82
    #19 0x00007f368a09f74d in ?? () from /usr/lib/libc.so.6
    #20 0x00007f368a121700 in ?? () from /usr/lib/libc.so.6
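
The innermost three frames of this stack sit in /usr/lib/trickle/trickle-overload.so, reached through a poll() call made from Sock::WaitMany, which is what points away from Bitcoin Core itself. As a rough illustration of spotting that pattern, the sketch below tallies the shared libraries named in gdb frames of the usual `#N ... from <library>` form. The name `libraries_in_backtrace` is hypothetical, and the regular expression is an assumption about gdb's output layout rather than a documented format.

```python
import re
from collections import Counter

# Matches frames such as:
#   #0  0x00007f... in bwstat_getdelay () from /usr/lib/trickle/trickle-overload.so
FRAME_RE = re.compile(r"^#(\d+)\s+.*\sfrom\s+(\S+)\s*$")


def libraries_in_backtrace(text):
    """Count how many frames gdb attributes to each shared library."""
    libs = Counter()
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            libs[match.group(2)] += 1
    return libs
```

Frames that gdb resolves to a source location (`at file:line`) carry no library name and are skipped, so the result lists only code outside the main binary's debug info.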
    
  22. MarcoFalke commented at 12:49 PM on September 29, 2022: member

    Last time I used trickle (2014 or so) it would just segfault. Pretty sure this is an upstream issue with that library.

  23. fanquake commented at 12:52 PM on September 29, 2022: member

    Previous report of trickle-related issues: #15674.

  24. ExperiBass commented at 1:31 PM on September 29, 2022: none

    That sucks. Do you know of any alternatives? trickle doesn't seem to be actively maintained anymore.

  25. bitcoin locked this on Sep 29, 2023

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-26 06:13 UTC