Bitcoin Core 0.10 freezing and disconnecting during synchronization #5851

MrKrzYch00 opened this issue on March 3, 2015
  1. MrKrzYch00 commented at 11:20 am on March 3, 2015: none

    Background:

    • Running Bitcoin Core 0.10 x64 on Windows 8.1
    • txindex is enabled
    • first synchronization

    Problem: Bitcoin Core 0.10 constantly freezes during synchronization, especially on newer blocks, losing connections and making RPC unresponsive. This happens when Bitcoin Core uses more than one thread to analyse downloaded blocks (it occurs more often at greater heights, where more threads are used).

    Expected results: the GUI and RPC server should still respond to user input, or at least Bitcoin Core should not lose connections to other Bitcoin peers.

  2. laanwj added the label Bug on May 18, 2015
  3. laanwj commented at 9:00 am on May 18, 2015: member
    I’ve also noticed this. There seems to be heavy lock contention during some of the synchronization phases. This causes e.g. RPC to react slowly. Also other peers sometimes completely time out. Not only during initial sync, but also when catching up a significant number of blocks.
  4. laanwj commented at 6:56 am on May 29, 2015: member

    Made a little progress on finding out why the GUI hangs during intensive catch-up phases. Indeed, cs_main is held for a longer time. The GUI does not wait on cs_main during normal operation; however, when it receives a new transaction:

    2015-05-29 06:48:46 AddToWallet xxx  new
    

    It will try to get the transaction information from the wallet in qt/transactiontablemodel.cpp:130, which takes both a cs_main and wallet lock. It can stay stuck there for minutes at a time. A possible solution is to copy the relevant data in the notification function.
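
    A minimal sketch of the "copy the relevant data in the notification function" idea, not the actual Bitcoin Core code. TxSnapshot and TxNotificationQueue are hypothetical names introduced for illustration: the core thread copies the fields the table needs while it already holds its locks, and the GUI thread later reads the copies through a separate, short-lived mutex.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for the per-row data the transaction table needs.
struct TxSnapshot {
    std::string txid;
    int64_t amount;  // in satoshis
    int depth;       // confirmations at notification time
};

// Hypothetical queue between the core thread and the GUI thread.
// It has its own small mutex, so the GUI never waits on the heavy locks.
class TxNotificationQueue {
public:
    // Called from the core thread, which already holds the heavy locks.
    // The snapshot is stored by value, so nothing points back into the wallet.
    void Push(const TxSnapshot& snap) {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_pending.push_back(snap);
    }

    // Called from the GUI thread; only the short queue lock is taken.
    std::vector<TxSnapshot> Drain() {
        std::lock_guard<std::mutex> lock(m_mutex);
        std::vector<TxSnapshot> out;
        out.swap(m_pending);
        return out;
    }

private:
    std::mutex m_mutex;
    std::vector<TxSnapshot> m_pending;
};
```

    The GUI would call Drain() from its own event loop and update the table from the returned copies. The data may be slightly stale, which is the trade-off for not blocking.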

    This is fighting symptoms, but will lead to a more responsive UI. The root cause would be mitigated by changing ActivateBestChain/ActivateBestChainStep/ConnectTip logic to release the cs_main lock between attaching blocks.
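
    A rough sketch of what "release the lock between attaching blocks" means, using a plain std::mutex as a stand-in for cs_main. The names g_chain_mutex, g_connected and ConnectBlocks are hypothetical and the per-block work is reduced to a push_back; this is not how ActivateBestChain is written.

```cpp
#include <cassert>
#include <mutex>
#include <vector>

std::mutex g_chain_mutex;      // stand-in for cs_main (assumption)
std::vector<int> g_connected;  // heights connected so far

// Connects the pending blocks one at a time, taking the lock once per
// block instead of once for the whole batch. Returns how many times the
// lock was acquired.
int ConnectBlocks(const std::vector<int>& pending) {
    int lock_acquisitions = 0;
    for (int height : pending) {
        {
            std::lock_guard<std::mutex> lock(g_chain_mutex);
            ++lock_acquisitions;
            g_connected.push_back(height);  // placeholder for the real work
        }
        // The lock is released here, so another thread has a chance to
        // acquire it before the next block is attached.
    }
    return lock_acquisitions;
}
```

    Note that std::mutex makes no fairness guarantee: releasing it between iterations gives other threads an opportunity, not a promise, to acquire it.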

    Edit: another consequence is that the block number and progress in the UI is not updated, as it never manages to get the cs_main lock (it polls a few times per second, with TRY_LOCK). The new-block notification path (NotifyBlockTipClientModel) should be restored, with rate limiting.
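
    A minimal sketch of rate limiting for a push-style tip notification, assuming the notification handler can ask a small helper whether enough time has passed since the last forwarded update. RateLimiter is a hypothetical class for illustration, not part of Bitcoin Core or Qt.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: forwards at most one update per min_interval.
class RateLimiter {
public:
    using Clock = std::chrono::steady_clock;

    explicit RateLimiter(std::chrono::milliseconds min_interval)
        : m_min_interval(min_interval) {}

    // Returns true if an update arriving at time `now` should be forwarded
    // to the GUI, false if it should be dropped as too soon after the last.
    bool Allow(Clock::time_point now) {
        if (m_has_last && now - m_last < m_min_interval) {
            return false;
        }
        m_last = now;
        m_has_last = true;
        return true;
    }

private:
    std::chrono::milliseconds m_min_interval;
    Clock::time_point m_last{};
    bool m_has_last = false;
};
```

    A handler would call Allow(RateLimiter::Clock::now()) for each new tip and skip the UI update when it returns false. A real implementation would probably also want to always deliver the final update of a burst, which this sketch does not do.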

  5. laanwj added the label GUI on May 29, 2015
  6. laanwj added the label UTXO Db and Indexes on May 29, 2015
  7. laanwj added the label P2P on May 29, 2015
  8. sipa commented at 2:38 pm on May 29, 2015: member
    ActivateBestChain releases cs_main between blocks, except during a reorg, to avoid exposing a worse state than before.
  9. jonasschnelli commented at 6:52 am on July 1, 2015: contributor

    I think the cs_main LOCK at qt/transactiontablemodel.cpp:130 can be avoided.

    While rewriting most parts of CWalletTx, I think the problematic points are CWalletTx::GetCredit() and mainly GetBlocksToMaturity(), which needs chainActive (a lock on cs_main) to get the current height, but only for coinbase wtxs.

    The rest of decomposeTransaction() needs no cs_main locking IMO.

  10. sipa commented at 12:41 pm on July 1, 2015: member
    Probably makes sense to store a copy of the best chain’s tip CBlockIndex* pointer in the CWallet object, and update it through a signal mechanism. Credit/confirmation querying can then use the cached version instead of needing a lock every time
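
    A simplified sketch of the caching idea. To keep it self-contained it caches only the tip height in a std::atomic rather than a CBlockIndex* pointer, which is a deliberate simplification of the suggestion above. WalletTipCache, OnTipChanged and kMaturity are hypothetical names, and the maturity value of 100 is used here only for illustration.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>

// Hypothetical wallet-side cache of the best chain's tip height.
class WalletTipCache {
public:
    // Would be called from a tip-changed signal handler (assumption).
    void OnTipChanged(int new_height) {
        m_tip_height.store(new_height, std::memory_order_release);
    }

    // Confirmations for a transaction included at tx_height, computed from
    // the cached tip without taking any lock. Returns 0 if the transaction
    // is unconfirmed (negative height) or above the cached tip.
    int GetDepth(int tx_height) const {
        const int tip = m_tip_height.load(std::memory_order_acquire);
        if (tx_height < 0 || tx_height > tip) return 0;
        return tip - tx_height + 1;
    }

    // Remaining blocks until a coinbase at tx_height would be spendable,
    // under the illustrative maturity constant below.
    int BlocksToMaturity(int tx_height) const {
        return std::max(0, kMaturity + 1 - GetDepth(tx_height));
    }

private:
    static constexpr int kMaturity = 100;  // illustrative value
    std::atomic<int> m_tip_height{-1};     // -1 means no tip seen yet
};
```

    The cached value can lag the real tip by a moment, so this only fits display-style queries such as credit and confirmation counts, not consensus-critical checks.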
  11. jonasschnelli commented at 1:06 pm on July 1, 2015: contributor
    Yes, this would be a good way. The new wallet interacts with main/mempool through an extra interface (https://github.com/jonasschnelli/bitcoin/blob/2015/05/corewallet/src/corewallet/coreinterface.cpp). This would be a starting point for optimizing locks and, later, separating the wallet into its own process.
  12. btcdrak commented at 7:46 pm on July 8, 2015: contributor
    @MrKrzYch00 could you test whether the problem still exists in 0.11?
  13. MrKrzYch00 commented at 8:58 pm on July 8, 2015: none

    I will give it a try after I build the newest commit myself. My latest build is based on d0a10c1959176eb40c0ec47a56de00820c59066d, which, I’m pretty sure, was still slowing down RPC (when I was 15h behind and ran my PHP script to manually connect to selected IPs, it started to respond very sluggishly after the 3rd connection). However, to confirm that connections are being lost I would need to re-download the block chain (I think?).

    Also to be 100% sure I would need to note that I have one small error when building with gcc 4.9.4 (20150630 - prerelease; windows mingw) on /leveldb/util/env_win.cc file which tells me that _beginthread is undefined until I compile the file manually while omitting D_REENTRANT. Not sure what causes it and if it could have any impact on tests being correct… btw. my build is x64, uses Ofast and AVX instruction set tuning (including all dependencies, with one exception being O3 due to Ofast failing).

    EDIT: when I reported this issue I was using the standard mingw toolchain with default build optimization, but due to linker issues with another program I was compiling, I was forced to change gcc versions over time. The above gcc version and altered gcc switches were used only in my latest build, based on the commit mentioned above…

  14. MrKrzYch00 commented at 6:07 pm on July 17, 2015: none

    [image: bitcoin_core_first_synchronisation]

    ^ This is what happens when Bitcoin Core is checking blocks: the ping times rise a lot. My speed is ~1.5MB/s download and ~70KB/s upload (bytes).

    I will provide a more detailed text log from the PHP script I’m running, which records the current height, the number of connections, and the time Bitcoin Core took to reply with this data during first synchronization.

  15. MrKrzYch00 commented at 2:58 pm on July 18, 2015: none

    Full log at: http://4my.eu/first_synchronization_log.txt PHP-CLI script run for testing: http://4my.eu/bitcoin_core_test.txt

    Using 2GB db cache, built with: Ofast, AVX and core i7 tuning, without debug. x64 build. Running on core i7-3630QM 2.7GHz, 8GB RAM. ~1.5MB/s download, ~70KB/s upload. Windows 8.1. Last commit in the build: fe3fe547f747b909f66a28cef6addfea3e1606e2. RPC keep-alive connection with 60s timeout. The connections were established to 38 pool IPs and Core was set to not allow connections to/from other nodes.

    Started with 38 connections, dropped to 14 soon after (I guess due to network lag) and finally ended up with 8. So I think it’s not that bad; however, response times from RPC sometimes exceeded 60s (two commands in total, as seen in the PHP script). During very high response times Bitcoin Core was using 60~80% CPU (counting usage for all cores).

  16. 2083236893 commented at 3:03 pm on July 21, 2015: none
    @laanwj That also happens if you load an old wallet file and trigger a rescan, it will take longer than the ping message timeout and on completion all peers will have dropped you. Not an issue under any circumstance but it might explain why I’ve seen peers on the network not responding to messages occasionally.
  17. jonasschnelli commented at 10:37 am on November 27, 2015: contributor
    Partially addressed in #7112. Still, AddToWallet needs better lock handling.
  18. Giszmo commented at 8:08 pm on December 2, 2015: none

    Running bitcoin-qt v0.11.2 I run into “Activating best chain…” with high CPU load for 20 minutes on a slow machine.

    Not sure if relevant: I wanted to “quickly” do a transaction but 0.11.1 crashed, making a re-index necessary. I downloaded 0.11.2 meanwhile, orderly interrupted the re-index to let the latest version take over after 10% progress. It then ran into “system shutting down due to CPU@100°C” or so. Now all I get is indefinite(?) “Activating best chain”. I did the kill -9 and started bitcoin-qt, only to run into this again.

    Also, I understood that incoming transactions trigger this lock? I switched off the wifi on that machine so as not to get those.

  19. jonasschnelli commented at 10:08 am on December 3, 2015: contributor
    @Giszmo: mind re-testing this issue after #7112 has been merged? (compile or https://bitcoin.jonasschnelli.ch/nightlybuilds)
  20. Giszmo commented at 2:31 pm on December 3, 2015: none

    @jonasschnelli sorry but I’d be glad if my primary wallet gave me back my access to my funds asap. Syncing since yesterday and without a clue why it had to re-download 8GB so far. 50% to go.

    If 11.3 has the fix and you can tell me how to test, I could do that with a backup or something, but otherwise I’m a bit paranoid about letting anything but signed software near this machine.

  21. jonasschnelli commented at 2:56 pm on December 3, 2015: contributor
    @Giszmo: Sure. That is wise (and I would do the same). Maybe try compiling bitcoin-core yourself? It’s not that hard (check the docs/ section).
  22. laanwj commented at 3:01 pm on December 3, 2015: member

    Running bitcoin-qt v0.11.2 I run into “Activating best chain…” with high CPU load for 20 minutes on slow machine.

    “Activating best chain…” can take a long time in some cases, see also #7038. It processes the backlog of blocks. It should always finish eventually, though. Kill -9 will only set back progress.

    I don’t think #7112 helps here. Sure, it makes GUI-core interaction somewhat more fluid, but high CPU during initial sync is normal (lot of verification work - with -par=X you can change the number of verification threads, reduce the load at expense of speed).

  23. rebroad commented at 12:47 pm on November 24, 2016: contributor
    I frequently see my node being disconnected from all peers during IBD, as ProcessMessages() often doesn’t get a chance to run for over 20 minutes. Might a solution be to take a break from UpdateTip every now and again to give some received messages some attention?
  24. laanwj commented at 1:10 pm on November 24, 2016: member
    Yes, that’s still the same issue. Though the cs_main lock does get released, this does not guarantee other processes will be able to get it (or at least, often enough to make useful progress). We’ve tried adding a yield() in that loop but to no avail either.
  25. fanquake closed this on Mar 7, 2018

  26. cryptozeny referenced this in commit b6f0a9d6c1 on Feb 8, 2019
  27. cryptozeny referenced this in commit 2a96f22c92 on Feb 8, 2019
  28. cryptozeny referenced this in commit 50403a93c6 on Feb 8, 2019
  29. cryptozeny referenced this in commit ea78562840 on Feb 8, 2019
  30. cryptozeny referenced this in commit fff78b9b6f on Feb 8, 2019
  31. cryptozeny referenced this in commit 72436c90b2 on Feb 8, 2019
  32. DrahtBot locked this on Sep 8, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-11-23 15:12 UTC