[wallet] Node process hangs after SIGINT #18517

naumenkogs opened this issue on April 3, 2020
  1. naumenkogs commented at 4:05 pm on April 3, 2020: member

    I noticed this while testing current master (f0d6487e290761a4fb03798240a351b5fddfdb38). The node is in IBD: 2020-04-03T15:23:06Z Synchronizing blockheaders, height: 69998 (~12.07%). I’m using Ubuntu 16.04. I’m happy to give anyone who wants to look into it access to the VM I’m using, and to provide instructions to reproduce.

    Here’s how I compiled it: ./configure --with-incompatible-bdb PYTHONPATH= --disable-shared --with-pic --enable-benchmark=no --with-bignum=no --enable-module-recovery --disable-jni --disable-shared --with-pic --enable-benchmark=no --with-bignum=no --enable-module-recovery --disable-jni --no-create --no-recursion

    Steps to reproduce (it happens every time I repeat this sequence):

    1. src/bitcoind
    2. Wait until the first message New outbound peer connected: version appears
    3. Press Ctrl+C (afaik this sends SIGINT)
    4. The process gets stuck with these logs at the end:
    2020-04-03T15:23:12Z Synchronizing blockheaders, height: 73998 (~12.73%)
    ^C2020-04-03T15:23:13Z P2P peers available. Skipped DNS seeding.
    2020-04-03T15:23:13Z dnsseed thread exit
    2020-04-03T15:23:13Z tor: Thread interrupt
    2020-04-03T15:23:13Z Shutdown: In progress...
    2020-04-03T15:23:13Z torcontrol thread exit
    2020-04-03T15:23:13Z addcon thread exit
    2020-04-03T15:23:13Z net thread exit
    2020-04-03T15:23:13Z msghand thread exit
    2020-04-03T15:23:15Z opencon thread exit
    2020-04-03T15:23:15Z scheduler thread exit
    2020-04-03T15:23:15Z Dumped mempool: 4e-06s to copy, 0.002802s to dump
    2020-04-03T15:23:15Z FlushStateToDisk: write coins cache to disk (0 coins, 0kB) started
    2020-04-03T15:23:15Z FlushStateToDisk: write coins cache to disk (0 coins, 0kB) completed (0.00s)
    2020-04-03T15:23:15Z FlushStateToDisk: write coins cache to disk (0 coins, 0kB) started
    2020-04-03T15:23:15Z FlushStateToDisk: write coins cache to disk (0 coins, 0kB) completed (0.00s)
    
    5. Wait for several minutes, nothing happens.
    6. Try kill $PID, doesn’t do anything
    7. Finally, do kill -9 $PID to get rid of the process

    So, the process doesn’t take much CPU (0.4%), and gdb says it’s 2 threads:

      Id   Target Id                                     Frame
    * 1    Thread 0x7fd51d48a740 (LWP 31824) "b-shutoff" pthread_cond_wait@@GLIBC_2.3.2 ()
        at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
      2    Thread 0x7fd507fff700 (LWP 31834) "bitcoind"  pthread_cond_wait@@GLIBC_2.3.2 ()
        at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
    

    Backtrace is here.

    Thanks to @vasild for letting me know how to call all that gdb stuff :)

  2. naumenkogs added the label Bug on Apr 3, 2020
  3. MarcoFalke commented at 4:49 pm on April 3, 2020: member
    Please upload the backtrace.txt to github. 0bin does not open for me.
  4. MarcoFalke added this to the milestone 0.20.0 on Apr 3, 2020
  5. naumenkogs commented at 4:52 pm on April 3, 2020: member
    @MarcoFalke done!
  6. MarcoFalke renamed this:
    Node process hangs after SIGINT
    gui, wallet: Node process hangs after SIGINT
    on Apr 3, 2020
  7. MarcoFalke renamed this:
    gui, wallet: Node process hangs after SIGINT
    [wallet] Node process hangs after SIGINT
    on Apr 3, 2020
  8. MarcoFalke commented at 4:58 pm on April 3, 2020: member
    The backtrace now has only one thread 🤔 How is it possible that it still writes logs?
  9. naumenkogs commented at 5:02 pm on April 3, 2020: member

    The backtrace now has only one thread 🤔 How is it possible that it still writes logs?

    To be clear, this is a new backtrace :) Sometimes it has 1, sometimes it has 2 threads after all.

  10. ryanofsky commented at 7:32 pm on April 3, 2020: member
    From the stack trace, this is almost definitely caused by ab31b9d6fe7b39713682e3f52d11238dbe042c16 from #18338. UnloadWallet is waiting forever for the shared_ptr reference count to be released, and apparently it doesn’t happen in the Ctrl-C shutdown sequence
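
To illustrate the kind of hang described above, here is a minimal C++ sketch. It assumes a simplified model in which unloading polls the reference count; `Wallet` and `WaitForRelease` are hypothetical names, not Bitcoin Core code, and the real UnloadWallet may wait in a different way. The sketch shows that a stored callback which captured the shared_ptr keeps the count above one, so the wait can only end once that callback is destroyed.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <memory>
#include <thread>

// Hypothetical stand-in for a wallet object; not Bitcoin Core's CWallet.
struct Wallet {};

// Polls until `wallet` is the only remaining owner or `timeout` expires.
// Returns true if every other reference was released in time.
// The timeout is an assumption added so the sketch cannot hang forever.
bool WaitForRelease(const std::shared_ptr<Wallet>& wallet,
                    std::chrono::milliseconds timeout)
{
    assert(wallet); // precondition: the caller still owns the wallet
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    while (wallet.use_count() > 1) {
        if (std::chrono::steady_clock::now() >= deadline) return false;
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
    }
    return true;
}
```

While a `std::function<void()> callback = [wallet] {};` is alive, `WaitForRelease(wallet, ...)` times out. After `callback = nullptr;` it returns true, because the captured copy of the shared_ptr has been destroyed.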
  11. naumenkogs commented at 8:20 pm on April 3, 2020: member
    @ryanofsky you are correct. Before this commit all good, after this commit the problem occurs.
  12. promag commented at 8:57 pm on April 3, 2020: member
    @naumenkogs I can’t reproduce. It always quits: mainnet, testnet, regtest, with multiple wallets. Have you done a make clean?
  13. naumenkogs commented at 9:03 pm on April 3, 2020: member
    @promag yes, I did make clean. I also never touch wallet in my tests or anything. And I run mainnet here.
  14. ryanofsky commented at 9:40 pm on April 3, 2020: member
    I wonder if boost version might be a variable. Maybe different versions of the boost signals library are storing and calling the callback functions (with embedded shared_ptrs) differently.
  15. promag commented at 2:17 am on April 4, 2020: member
    Please detail your dependencies and compiler so we can try to reproduce.
  16. promag commented at 2:19 am on April 4, 2020: member
    Also, does it quit fine with stop RPC? And does it hang with unloadwallet RPC - see the log for “release wallet” or something like that.
  17. naumenkogs commented at 2:25 am on April 4, 2020: member

    @promag gcc version 9.2.1 20191102 BOOST 1.58

    What else is relevant?

  18. naumenkogs commented at 2:35 am on April 4, 2020: member

    I just noticed it starts with a lot of CPU (30%), but slowly releases it by 0.5% every second, stopping at 0.4.

    It also hangs when I do src/bitcoin-cli stop. Unload wallet call takes forever to execute (like more than 10 seconds, but I can wait for more if needed). I’ll just repeat that I’m happy to let you in my VM if that makes life easier :)

  19. promag commented at 9:27 am on April 4, 2020: member

    Does it hang if you run with -nowallet?

    I’ll just repeat that I’m happy to let you in my VM if that makes life easier :)

    Sure, if I can’t reproduce here.

  20. promag commented at 10:17 am on April 4, 2020: member
    @naumenkogs FYI I’ve managed to reproduce the problem: on bionic I downloaded the boost 1.58.0 source and installed the same gcc version. Running with -nowallet quits fine.
  21. promag commented at 10:39 am on April 4, 2020: member
  22. naumenkogs commented at 12:27 pm on April 4, 2020: member

    @promag

    Running with -nowallet quits fine.

    I confirm!

  23. promag commented at 2:18 pm on April 4, 2020: member
    Will push fix, thanks for reporting!
  24. ryanofsky referenced this in commit ba8312c7dc on Apr 4, 2020
  25. ryanofsky commented at 4:13 pm on April 4, 2020: member

    Just dropping boost here could fix this. Would be useful to test with #18524:

    git fetch https://github.com/bitcoin/bitcoin pull/18524/head
    git cherry-pick FETCH_HEAD
    
  26. naumenkogs commented at 4:38 pm on April 4, 2020: member
    @ryanofsky I confirm that #18524 solves the problem.
  27. ryanofsky referenced this in commit ad067a98ea on Apr 4, 2020
  28. ryanofsky referenced this in commit 01639a21d1 on Apr 4, 2020
  29. hebasto commented at 7:04 pm on April 4, 2020: member

    Steps to reproduce (it happens every time I repeat this sequence):

    1. src/bitcoind

    2. wait til see first message New outbound peer connected: version

    3. click Ctrl+ C (afaik this is SIGINT)

    4. The process stuck with these logs at the end:

    5. Wait for several minutes, nothing happens.

    6. Try kill $PID, doesn’t do anything

    7. Finally, do kill -9 $PID to get rid of the process

    Confirm the bug on Ubuntu 16.04.6 LTS.

  30. ryanofsky referenced this in commit 3d463addfe on Apr 4, 2020
  31. ryanofsky referenced this in commit b5fea244e5 on Apr 5, 2020
  32. ryanofsky referenced this in commit c3a471604b on Apr 6, 2020
  33. ryanofsky referenced this in commit 96176004a3 on Apr 6, 2020
  34. ryanofsky referenced this in commit d6815a2313 on Apr 6, 2020
  35. laanwj referenced this in commit fdeb445a34 on Apr 6, 2020
  36. ryanofsky commented at 3:37 pm on April 6, 2020: member
    #18524 (now merged) should fix this issue. The PR description says #18524 is a refactor, but it’s only a refactor for new versions of boost (>=1.59). With an old version of boost it changes behavior and fixes this bug.
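
The behavior change mentioned above can be pictured with a callback list built only on the standard library. This is a hypothetical sketch, not the code from #18524; `CallbackList`, `Register`, `Unregister` and `Notify` are illustrative names. The point it tries to show is that erasing an entry destroys the stored callback at once, so a shared_ptr captured by it is released without depending on how a particular signals library version manages its slots.

```cpp
#include <cassert>
#include <functional>
#include <list>
#include <memory>
#include <utility>

// Hypothetical stand-in for a wallet object; not Bitcoin Core's CWallet.
struct Wallet {};

// Hypothetical std-only callback list. An entry is erased as soon as it is
// unregistered, so anything the callback captured is released right away.
class CallbackList {
public:
    using Handle = std::list<std::function<void()>>::iterator;

    // Stores the callback and returns a handle that identifies it.
    Handle Register(std::function<void()> fn)
    {
        assert(fn); // precondition: the callback must be callable
        return m_callbacks.insert(m_callbacks.end(), std::move(fn));
    }

    // Erases the callback immediately, destroying its captured state.
    void Unregister(Handle handle) { m_callbacks.erase(handle); }

    // Invokes every registered callback in registration order.
    void Notify() const
    {
        for (const auto& fn : m_callbacks) fn();
    }

private:
    std::list<std::function<void()>> m_callbacks;
};
```

A `std::list` is used here because its iterators stay valid when other entries are erased, which makes an iterator usable as a handle. Callers must not reuse a handle after passing it to `Unregister`.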
  37. ryanofsky closed this on Apr 6, 2020

  38. sidhujag referenced this in commit 9e2b04c4ed on Apr 8, 2020
  39. glozow referenced this in commit 4542d48e6d on Apr 10, 2020
  40. MarcoFalke referenced this in commit 3347ca4881 on Apr 10, 2020
  41. sidhujag referenced this in commit 91b634cd18 on Apr 13, 2020
  42. HashUnlimited referenced this in commit b67fc6bde0 on Apr 17, 2020
  43. deadalnix referenced this in commit 27fc048869 on Jun 20, 2020
  44. metalicjames referenced this in commit 09af10dec5 on Aug 5, 2020
  45. janus referenced this in commit 400926d149 on Nov 5, 2020
  46. backpacker69 referenced this in commit 23b252ddaf on Mar 28, 2021
  47. DrahtBot locked this on Feb 15, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-19 00:12 UTC
