The problem: Sometimes, usually after Internet connection disturbances, it is impossible to stop bitcoin-qt or the daemon.
What I expect: On a close command, bitcoin-qt (in GUI or daemon mode) finishes its operation in a reasonable time, and the database contains the last synchronized blocks and is ready to continue from its current state.
Actual behavior: The Qt interface hangs on the “Bitcoin Core is shutting down” window; the daemon simply remains as a process. The stalled process creates no CPU or I/O load: no reads or writes to storage and zero percent CPU, as confirmed with htop and iotop. Bitcoin Core keeps running normally after a WAN connection change or temporary network problems. New peers connect in both directions, the traffic is OK, and everything seems fine, but it will no longer shut down normally. The only way to stop it is to send SIGQUIT to the process.
The worst part is the loss of the data from the last Bitcoin Core session. For example, if Bitcoin Core had been running for two weeks and failed to stop correctly, at the next start it resumes from its state of two weeks ago. The blocks have to be rescanned from the HDD again, which takes a rather long time. After a failed shutdown, later shutdowns may also fail even if there were no WAN problems in between. I had to restart the client several times to get a normal close. Once it had closed correctly, Bitcoin Core operated and shut down normally again.
Here is a log from an unsuccessful shutdown:
2019-08-17T21:13:34Z GUI: requestShutdown : Requesting shutdown
2019-08-17T21:13:34Z GUI: shutdown : Running Shutdown in thread
2019-08-17T21:13:34Z Interrupting HTTP server
2019-08-17T21:13:34Z Interrupting HTTP RPC server
2019-08-17T21:13:34Z Interrupting RPC
2019-08-17T21:13:34Z Shutdown: In progress...
2019-08-17T21:13:34Z Stopping HTTP RPC server
2019-08-17T21:13:34Z addcon thread exit
2019-08-17T21:13:34Z opencon thread exit
2019-08-17T21:13:34Z Stopping RPC
2019-08-17T21:13:34Z Stopping HTTP server
2019-08-17T21:13:34Z Stopped HTTP server
2019-08-17T21:13:34Z BerkeleyEnvironment::Flush: [/media/nikolaypo/bitcoindb] Flush(false)
2019-08-17T21:13:34Z BerkeleyEnvironment::Flush: Flushing wallet.dat (refcount = 0)...
2019-08-17T21:13:34Z BerkeleyEnvironment::Flush: wallet.dat checkpoint
2019-08-17T21:13:34Z BerkeleyEnvironment::Flush: wallet.dat detach
2019-08-17T21:13:34Z net thread exit
2019-08-17T21:13:34Z msghand thread exit
2019-08-17T21:13:34Z BerkeleyEnvironment::Flush: wallet.dat closed
2019-08-17T21:13:34Z BerkeleyEnvironment::Flush: Flush(false) took 86ms
2019-08-17T21:13:34Z BerkeleyEnvironment::Flush: [/media/nikolaypo/bitcoindb] Flush(false)
2019-08-17T21:13:34Z BerkeleyEnvironment::Flush: Flush(false) took 0ms
2019-08-17T21:27:43Z Flushed 66639 addresses to peers.dat 656ms
2019-08-17T21:42:44Z Flushed 66639 addresses to peers.dat 635ms
2019-08-17T21:50:27Z Potential stale tip detected, will try using extra outbound peer (last tip update: 2260 seconds ago)
2019-08-17T21:50:27Z net: setting try another outbound peer=true
2019-08-17T21:57:44Z Flushed 66639 addresses to peers.dat 622ms
2019-08-17T22:00:57Z Potential stale tip detected, will try using extra outbound peer (last tip update: 2890 seconds ago)
2019-08-17T22:00:57Z net: setting try another outbound peer=true
2019-08-17T22:11:27Z Potential stale tip detected, will try using extra outbound peer (last tip update: 3520 seconds ago)
2019-08-17T22:11:27Z net: setting try another outbound peer=true
2019-08-17T22:12:45Z Flushed 66639 addresses to peers.dat 605ms
2019-08-17T22:21:57Z Potential stale tip detected, will try using extra outbound peer (last tip update: 4150 seconds ago)
2019-08-17T22:21:57Z net: setting try another outbound peer=true
2019-08-17T22:58:17Z
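The length of the stall in a log like the one above can be measured with a small script. This is only a minimal sketch: it assumes every debug.log line starts with a UTC timestamp in the form `2019-08-17T21:13:34Z`, and that a clean shutdown ends with a `Shutdown: done` line. The function names are my own and are not part of Bitcoin Core.

```python
from datetime import datetime, timezone


def parse_ts(line):
    """Return the leading UTC timestamp of a log line, or None if there is none."""
    parts = line.split(" ", 1)
    stamp = parts[0].strip()
    try:
        parsed = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ")
    except ValueError:
        return None
    return parsed.replace(tzinfo=timezone.utc)


def shutdown_stall_seconds(lines):
    """Seconds from 'Shutdown: In progress...' to the last timestamped line.

    Returns None if no shutdown was started, or if it completed
    (assumption: completion is logged as 'Shutdown: done').
    """
    started = None
    last = None
    for line in lines:
        ts = parse_ts(line)
        if ts is None:
            continue  # skip lines without a timestamp
        last = ts
        if "Shutdown: In progress" in line:
            started = ts
        elif "Shutdown: done" in line:
            started = None
    if started is None:
        return None
    return int((last - started).total_seconds())


if __name__ == "__main__":
    sample = [
        "2019-08-17T21:13:34Z Shutdown: In progress...",
        "2019-08-17T21:27:43Z Flushed 66639 addresses to peers.dat 656ms",
        "2019-08-17T22:58:17Z",
    ]
    print(shutdown_stall_seconds(sample))
```

For the log above, the shutdown was requested at 21:13:34 and the last line is stamped 22:58:17, so the process had been stuck for more than an hour and a half while still flushing peers.dat periodically.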
Again, when the client hangs, it produces no CPU or storage load.
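The "no CPU, no I/O" observation can also be checked without htop and iotop by comparing two snapshots of the Linux /proc counters for the process. This is a rough sketch, assuming a Linux system where `/proc/<pid>/stat` and `/proc/<pid>/io` are readable by the current user; the helper names are hypothetical and the pid has to be supplied by hand.

```python
import time


def parse_proc_io(text):
    """Parse the 'key: value' lines of /proc/<pid>/io into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            counters[key.strip()] = int(value)
    return counters


def cpu_ticks(stat_text):
    """Return utime + stime (clock ticks) from the text of /proc/<pid>/stat."""
    # The command name is in parentheses and may contain spaces,
    # so split on the last ')' before splitting into fields.
    fields = stat_text.rsplit(")", 1)[1].split()
    # fields[0] is field 3 (state); utime is field 14, stime is field 15.
    return int(fields[11]) + int(fields[12])


def snapshot(pid):
    """Read the current CPU and I/O counters of a process."""
    with open(f"/proc/{pid}/stat") as f:
        ticks = cpu_ticks(f.read())
    with open(f"/proc/{pid}/io") as f:
        io = parse_proc_io(f.read())
    return ticks, io.get("read_bytes"), io.get("write_bytes")


def is_idle(pid, interval_s=10.0):
    """True if no CPU time or storage I/O was accounted during the interval."""
    before = snapshot(pid)
    time.sleep(interval_s)
    return snapshot(pid) == before
```

If `is_idle(pid)` returns True for the stalled bitcoin-qt or bitcoind process, that matches what I saw: the process is waiting on something and not doing any work.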
Reproduction: Keep Bitcoin Core running normally. Then change the WAN IP on the upstream router by restarting the PPTP interface. Let the client run a while longer to confirm normal behavior before stopping. Then try to close bitcoin-qt or send the stop command to the daemon.
The problem does not always manifest itself, but about 10-30% of WAN connection disturbances or reconnects make a correct shutdown impossible.
I have tried v0.18.0 and v0.18.1, on both amd64 and arm architectures, built with GCC 6.3 and GCC 9.1.0. The problem persists across different builds and on different machines. All sources were from the bitcoin project on github.com, the place where I’m leaving this issue description. I have not tested the pre-compiled versions.
1) Intel Core2Duo, Debian 9 Linux, amd64. 2) Rockchip RK3399, Armbian Linux, aarch64-linux-gnu. With both machines I used the same USB 3.0/SATA adapter with the same 3TB Seagate (rotating) HDD, formatted as EXT4.
I had to rescan the database three times because of hardware failures: twice after a power loss and once after a sudden disconnection of the USB interface. There may be problems with the blocks, but fsck with the -P parameter did not find any errors.
As for me, I need more feedback from the Bitcoin Core client when problems occur, and more robustness. Once, a single chainstate file was corrupted on the media and I had to rescan the database again from block zero. It is a pity that the corruption of a single file cannot be recovered automatically without spending a lot of I/O operations and CPU power on rescanning.
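To put a number on how often the shutdown fails, the stop step of the reproduction can be wrapped in a timeout. The following is a minimal sketch, not a finished tool: `pid_alive` and `wait_for_exit` are hypothetical helpers, the pid of the daemon must be found by the caller, and the 600 second limit in the usage comment is an arbitrary choice.

```python
import os
import time


def pid_alive(pid):
    """True if a process with this pid exists (signal 0 only checks, sends nothing)."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def wait_for_exit(is_alive, timeout_s, poll_s=1.0):
    """Poll is_alive() until it returns False; True on exit, False on timeout."""
    deadline = time.monotonic() + timeout_s
    while is_alive():
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)
    return True


# Usage idea (assumes bitcoin-cli is on PATH and `pid` is the bitcoind pid):
#   import subprocess
#   subprocess.run(["bitcoin-cli", "stop"], check=True)
#   if not wait_for_exit(lambda: pid_alive(pid), timeout_s=600):
#       print("shutdown stalled")
```

Running this after each WAN reconnect and counting the timeouts would give a firmer figure than my 10-30% estimate.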
Thank you for your attention!
Nikolay