The goal of continue_reindex_after_shutdown
is to start with -reindex
, then stop as soon as possible and restart the node to pick up the reindex again, then stop the node as soon as possible again.
However, sometimes the second shutdown stalls forever (without progress). So it seems like there are several bugs:
- The shutdown stalls forever (it should interrupt and shut down in a reasonable time)
- There is no reindex progess (it should pick up the previous work and try to make progess)
Apparently this is reproducible for an unknown reason in https://github.com/hebasto/bitcoin/pull/261 (The CI log can be downloaded: https://github.com/user-attachments/files/16166689/ci-log.txt)
The tail would be:
02024-07-05T10:43:00.2453258Z [0;34m node0 2024-07-05T10:32:53.991979Z [initload] [/home/runner/work/_temp/src/validationinterface.cpp:183] [UpdatedBlockTip] [validation] Enqueuing UpdatedBlockTip: new block hash=0a6d527afe45175193bcc158cfad2aa3af7b27f5b1706cb52cd4a9c66e6c2ec5 fork block hash=456fd2eb726ba547a4ca41d2e8f88f8e5bfc30c60ce598c6d8bdf77d17aa01f7 (in IBD=true) [0m
12024-07-05T10:43:00.2453933Z [0;34m node0 2024-07-05T10:32:53.992109Z [initload] [/home/runner/work/_temp/src/validation.cpp:3175] [ConnectTip] [bench] - Load block from disk: 0.05ms [0m
22024-07-05T10:43:00.2454680Z [0;34m node0 2024-07-05T10:32:53.992182Z [initload] [/home/runner/work/_temp/src/validation.cpp:2499] [ConnectBlock] [bench] - Sanity checks: 0.00ms [0.00s (0.00ms/blk)] [0m
32024-07-05T10:43:00.2455405Z [0;34m node0 2024-07-05T10:32:53.992230Z [initload] [/home/runner/work/_temp/src/validation.cpp:2601] [ConnectBlock] [bench] - Fork checks: 0.05ms [0.00s (0.04ms/blk)] [0m
42024-07-05T10:43:00.2456292Z [0;34m node0 2024-07-05T10:32:53.992241Z [initload] [/home/runner/work/_temp/src/validation.cpp:2693] [ConnectBlock] [bench] - Connect 1 transactions: 0.01ms (0.011ms/tx, 0.000ms/txin) [0.00s (0.03ms/blk)] [0m
52024-07-05T10:43:00.2457101Z [0;34m node0 2024-07-05T10:32:53.992251Z [initload] [/home/runner/work/_temp/src/validation.cpp:2711] [ConnectBlock] [bench] - Verify 0 txins: 0.02ms (0.000ms/txin) [0.00s (0.04ms/blk)] [0m
62024-07-05T10:43:00.2457995Z [0;34m node0 2024-07-05T10:32:53.992302Z [initload] [/home/runner/work/_temp/src/validation.cpp:2725] [ConnectBlock] [bench] - Write undo data: 0.05ms [0.00s (0.10ms/blk)] [0m
72024-07-05T10:43:00.2458736Z [0;34m node0 2024-07-05T10:32:53.992308Z [initload] [/home/runner/work/_temp/src/validation.cpp:2740] [ConnectBlock] [bench] - Index writing: 0.01ms [0.00s (0.01ms/blk)] [0m
82024-07-05T10:43:00.2459820Z [0;34m node0 2024-07-05T10:32:53.992314Z [initload] [/home/runner/work/_temp/src/validationinterface.cpp:245] [BlockChecked] [validation] BlockChecked: block hash=15f1512d52773c7f26db99f313fed562748b67bf4ab034e92edd8a856f91de0c state=Valid [0m
92024-07-05T10:43:00.2460665Z [0;34m node0 2024-07-05T10:32:53.992320Z [initload] [/home/runner/work/_temp/src/validation.cpp:3194] [ConnectTip] [bench] - Connect total: 0.21ms [0.00s (0.28ms/blk)] [0m
102024-07-05T10:43:00.2461358Z [0;34m node0 2024-07-05T10:32:53.992401Z [initload] [/home/runner/work/_temp/src/validation.cpp:3203] [ConnectTip] [bench] - Flush: 0.08ms [0.00s (0.10ms/blk)] [0m
112024-07-05T10:43:00.2462099Z [0;34m node0 2024-07-05T10:32:53.992408Z [initload] [/home/runner/work/_temp/src/validation.cpp:3213] [ConnectTip] [bench] - Writing chainstate: 0.01ms [0.00s (0.02ms/blk)] [0m
122024-07-05T10:43:00.2463178Z [0;34m node0 2024-07-05T10:32:53.992426Z [initload] [/home/runner/work/_temp/src/validationinterface.cpp:222] [MempoolTransactionsRemovedForBlock] [validation] Enqueuing MempoolTransactionsRemovedForBlock: block height=5 txs removed=0 [0m
132024-07-05T10:43:00.2464650Z [0;34m node0 2024-07-05T10:32:53.992435Z [initload] [/home/runner/work/_temp/src/validation.cpp:2971] [UpdateTipLog] UpdateTip: new best=15f1512d52773c7f26db99f313fed562748b67bf4ab034e92edd8a856f91de0c height=5 version=0x20000000 log2_work=3.584963 tx=6 date='2024-07-05T10:32:50Z' progress=1.000000 cache=0.3MiB(5txo) [0m
142024-07-05T10:43:00.2465405Z [0;34m node0 2024-07-05T10:32:53.992440Z [initload] [/home/runner/work/_temp/src/validation.cpp:3229] [ConnectTip] [bench] - Connect postprocess: 0.03ms [0.00s (0.06ms/blk)] [0m
152024-07-05T10:43:00.2466115Z [0;34m node0 2024-07-05T10:32:53.992445Z [initload] [/home/runner/work/_temp/src/validation.cpp:3233] [ConnectTip] [bench] - Connect block: 0.38ms [0.00s (0.51ms/blk)] [0m
162024-07-05T10:43:00.2466848Z [0;34m node0 2024-07-05T10:32:53.992463Z [initload] [/home/runner/work/_temp/src/txmempool.cpp:664] [check] [mempool] Checking mempool with 0 transactions and 0 inputs [0m
172024-07-05T10:43:00.2468103Z [0;34m node0 2024-07-05T10:32:53.992531Z [initload] [/home/runner/work/_temp/src/validationinterface.cpp:212] [BlockConnected] [validation] Enqueuing BlockConnected: block hash=15f1512d52773c7f26db99f313fed562748b67bf4ab034e92edd8a856f91de0c block height=5 [0m
182024-07-05T10:43:00.2469712Z [0;34m node0 2024-07-05T10:32:53.992539Z [initload] [/home/runner/work/_temp/src/validationinterface.cpp:183] [UpdatedBlockTip] [validation] Enqueuing UpdatedBlockTip: new block hash=15f1512d52773c7f26db99f313fed562748b67bf4ab034e92edd8a856f91de0c fork block hash=0a6d527afe45175193bcc158cfad2aa3af7b27f5b1706cb52cd4a9c66e6c2ec5 (in IBD=true) [0m
192024-07-05T10:43:00.2470134Z [0;36m test 2024-07-05T10:32:54.087000Z TestFramework.node0 (DEBUG): RPC successfully started [0m
202024-07-05T10:43:00.2470891Z [0;34m node0 2024-07-05T10:32:54.087480Z [http] [/home/runner/work/_temp/src/httpserver.cpp:303] [http_request_cb] [http] Received a POST request for / from 127.0.0.1:35146 [0m
212024-07-05T10:43:00.2471658Z [0;34m node0 2024-07-05T10:32:54.087604Z [httpworker.0] [/home/runner/work/_temp/src/rpc/request.cpp:232] [parse] [rpc] ThreadRPCServer method=getblockcount user=__cookie__ [0m
222024-07-05T10:43:00.2472018Z [0;36m test 2024-07-05T10:32:54.088000Z TestFramework.node0 (DEBUG): Stopping node [0m
232024-07-05T10:43:00.2472756Z [0;34m node0 2024-07-05T10:32:54.088812Z [http] [/home/runner/work/_temp/src/httpserver.cpp:303] [http_request_cb] [http] Received a POST request for / from 127.0.0.1:35146 [0m
242024-07-05T10:43:00.2473575Z [0;34m node0 2024-07-05T10:32:54.088902Z [httpworker.2] [/home/runner/work/_temp/src/rpc/request.cpp:232] [parse] [rpc] ThreadRPCServer method=stop user=__cookie__ [0m
252024-07-05T10:43:00.2474250Z [0;34m node0 2024-07-05T10:32:54.089014Z [init] [/home/runner/work/_temp/src/httpserver.cpp:496] [InterruptHTTPServer] [http] Interrupting HTTP server [0m
262024-07-05T10:43:00.2474902Z [0;34m node0 2024-07-05T10:32:54.089097Z [init] [/home/runner/work/_temp/src/httprpc.cpp:378] [InterruptHTTPRPC] [rpc] Interrupting HTTP RPC server [0m
272024-07-05T10:43:00.2475475Z [0;34m node0 2024-07-05T10:32:54.089111Z [init] [/home/runner/work/_temp/src/rpc/server.cpp:308] [operator()] [rpc] Interrupting RPC [0m
282024-07-05T10:43:00.2476143Z [0;34m node0 2024-07-05T10:32:54.089141Z [init] [/home/runner/work/_temp/src/init.cpp:276] [Shutdown] Shutdown: In progress... [0m
292024-07-05T10:43:00.2476784Z [0;34m node0 2024-07-05T10:32:54.089151Z [shutoff] [/home/runner/work/_temp/src/httprpc.cpp:383] [StopHTTPRPC] [rpc] Stopping HTTP RPC server [0m
302024-07-05T10:43:00.2477590Z [0;34m node0 2024-07-05T10:32:54.089165Z [shutoff] [/home/runner/work/_temp/src/httpserver.cpp:758] [UnregisterHTTPHandler] [http] Unregistering HTTP handler for / (exactmatch 1) [0m
312024-07-05T10:43:00.2478273Z [0;34m node0 2024-07-05T10:32:54.089178Z [shutoff] [/home/runner/work/_temp/src/rpc/server.cpp:320] [operator()] [rpc] Stopping RPC [0m
322024-07-05T10:43:00.2478926Z [0;34m node0 2024-07-05T10:32:54.089237Z [addcon] [/home/runner/work/_temp/src/util/thread.cpp:22] [TraceThread] addcon thread exit [0m
332024-07-05T10:43:00.2479490Z [0;34m node0 2024-07-05T10:32:54.089320Z [shutoff] [/home/runner/work/_temp/src/init.cpp:443] [OnRPCStopped] [rpc] RPC stopped. [0m
342024-07-05T10:43:00.2480124Z [0;34m node0 2024-07-05T10:32:54.089328Z [shutoff] [/home/runner/work/_temp/src/httpserver.cpp:508] [StopHTTPServer] [http] Stopping HTTP server [0m
352024-07-05T10:43:00.2480841Z [0;34m node0 2024-07-05T10:32:54.089333Z [shutoff] [/home/runner/work/_temp/src/httpserver.cpp:510] [StopHTTPServer] [http] Waiting for HTTP worker threads to exit [0m
362024-07-05T10:43:00.2481551Z [0;34m node0 2024-07-05T10:32:54.089404Z [shutoff] [/home/runner/work/_temp/src/httpserver.cpp:538] [StopHTTPServer] [http] Waiting for HTTP event thread to exit [0m
372024-07-05T10:43:00.2482157Z [0;34m node0 2024-07-05T10:32:54.089426Z [http] [/home/runner/work/_temp/src/httpserver.cpp:352] [ThreadHTTP] [http] Exited http event loop [0m
382024-07-05T10:43:00.2482785Z [0;34m node0 2024-07-05T10:32:54.089498Z [shutoff] [/home/runner/work/_temp/src/httpserver.cpp:544] [StopHTTPServer] [http] Stopped HTTP server [0m
392024-07-05T10:43:00.2483325Z [0;34m node0 2024-07-05T10:32:54.136587Z [net] [/home/runner/work/_temp/src/util/thread.cpp:22] [TraceThread] net thread exit [0m
402024-07-05T10:43:00.2483910Z [0;34m node0 2024-07-05T10:32:54.186604Z [msghand] [/home/runner/work/_temp/src/util/thread.cpp:22] [TraceThread] msghand thread exit [0m
412024-07-05T10:43:00.2484577Z [0;34m node0 2024-07-05T10:32:54.187928Z [shutoff] [/home/runner/work/_temp/src/net.cpp:2328] [DumpAddresses] [net] Flushed 0 addresses to peers.dat 1ms [0m
422024-07-05T10:43:00.2485177Z [0;34m node0 2024-07-05T10:32:54.317180Z [scheduler] [/home/runner/work/_temp/src/util/thread.cpp:22] [TraceThread] scheduler thread exit [0m
432024-07-05T10:43:00.2485646Z [0;36m test 2024-07-05T10:42:54.132000Z TestFramework.utils (ERROR): wait_until() failed. Predicate: '''' [0m
442024-07-05T10:43:00.2486144Z [0;36m self.wait_until(lambda: self.is_node_stopped(**kwargs), timeout=timeout)[0m
452024-07-05T10:43:00.2486418Z [0;36m '''[0m
One can see that the initload
thread does not stop, indicating that it hangs forever.
I’ve also reproduced this outside of cmake: (full datadir)
https://drahtbot.space/temp_scratch/feature_reindex_243_st.tar.zstd
The tail looks similar:
0 node0 2024-07-06T14:30:49.994884Z [initload] [validationinterface.cpp:210] [BlockConnected] [validation] Enqueuing BlockConnected: block hash=33b86c0c25bc28b3d58d8dd3d06d927bd2aabc69a604283f3539bf90d0c69085 block height=21
1 node0 2024-07-06T14:30:49.994902Z [initload] [validationinterface.cpp:180] [UpdatedBlockTip] [validation] Enqueuing UpdatedBlockTip: new block hash=33b86c0c25bc28b3d58d8dd3d06d927bd2aabc69a604283f3539bf90d0c69085 fork block hash=6f1cd206b4238dc974a98e62a08a105d6e91021405c57f2be587863340c1f541 (in IBD=true)
2 node0 2024-07-06T14:30:49.995034Z [httpworker.2] [rpc/request.cpp:232] [parse] [rpc] ThreadRPCServer method=stop user=__cookie__
3 node0 2024-07-06T14:30:49.995291Z [init] [httpserver.cpp:498] [InterruptHTTPServer] [http] Interrupting HTTP server
4 node0 2024-07-06T14:30:49.995456Z [init] [httprpc.cpp:378] [InterruptHTTPRPC] [rpc] Interrupting HTTP RPC server
5 node0 2024-07-06T14:30:49.995486Z [init] [rpc/server.cpp:308] [operator()] [rpc] Interrupting RPC
6 node0 2024-07-06T14:30:49.995543Z [init] [init.cpp:276] [Shutdown] Shutdown: In progress...
7 node0 2024-07-06T14:30:49.995571Z [shutoff] [httprpc.cpp:383] [StopHTTPRPC] [rpc] Stopping HTTP RPC server
8 node0 2024-07-06T14:30:49.995588Z [shutoff] [httpserver.cpp:760] [UnregisterHTTPHandler] [http] Unregistering HTTP handler for / (exactmatch 1)
9 node0 2024-07-06T14:30:49.995611Z [addcon] [util/thread.cpp:22] [TraceThread] addcon thread exit
10 node0 2024-07-06T14:30:49.995689Z [shutoff] [httpserver.cpp:760] [UnregisterHTTPHandler] [http] Unregistering HTTP handler for /wallet/ (exactmatch 0)
11 node0 2024-07-06T14:30:49.995712Z [shutoff] [rpc/server.cpp:320] [operator()] [rpc] Stopping RPC
12 node0 2024-07-06T14:30:49.995863Z [shutoff] [init.cpp:443] [OnRPCStopped] [rpc] RPC stopped.
13 node0 2024-07-06T14:30:49.995883Z [shutoff] [httpserver.cpp:510] [StopHTTPServer] [http] Stopping HTTP server
14 node0 2024-07-06T14:30:49.995908Z [shutoff] [httpserver.cpp:512] [StopHTTPServer] [http] Waiting for HTTP worker threads to exit
15 node0 2024-07-06T14:30:49.995963Z [shutoff] [httpserver.cpp:540] [StopHTTPServer] [http] Waiting for HTTP event thread to exit
16 node0 2024-07-06T14:30:49.995987Z [http] [httpserver.cpp:354] [ThreadHTTP] [http] Exited http event loop
17 node0 2024-07-06T14:30:49.996087Z [shutoff] [httpserver.cpp:546] [StopHTTPServer] [http] Stopped HTTP server
18 node0 2024-07-06T14:30:50.087354Z [net] [util/thread.cpp:22] [TraceThread] net thread exit
19 node0 2024-07-06T14:30:50.087685Z [scheduler] [validationinterface.cpp:180] [operator()] [validation] UpdatedBlockTip: new block hash=4ce0c52bd550b08742147a543466e7510f10f889e6370264e9d5bc32fc6980da fork block hash=749675d2e5744e412229528d60749a25079418d36602c7c10497f7ba858e1798 (in IBD=true)
20 node0 2024-07-06T14:30:50.087716Z [scheduler] [validationinterface.cpp:220] [operator()] [validation] MempoolTransactionsRemovedForBlock: block height=18 txs removed=0
21 node0 2024-07-06T14:30:50.087774Z [scheduler] [validationinterface.cpp:210] [operator()] [validation] BlockConnected: block hash=29b484857bff1534ae62576263136066ea100fcaac7ae1b6486e779a4a01b708 block height=18
22 node0 2024-07-06T14:30:50.088010Z [scheduler] [validationinterface.cpp:180] [operator()] [validation] UpdatedBlockTip: new block hash=29b484857bff1534ae62576263136066ea100fcaac7ae1b6486e779a4a01b708 fork block hash=4ce0c52bd550b08742147a543466e7510f10f889e6370264e9d5bc32fc6980da (in IBD=true)
23 node0 2024-07-06T14:30:50.088056Z [scheduler] [validationinterface.cpp:220] [operator()] [validation] MempoolTransactionsRemovedForBlock: block height=19 txs removed=0
24 node0 2024-07-06T14:30:50.088084Z [scheduler] [validationinterface.cpp:210] [operator()] [validation] BlockConnected: block hash=0c48535b5c968dc6af40a9610748eb0578492ae2ba4d5800bc7f43574015f5c5 block height=19
25 node0 2024-07-06T14:30:50.088246Z [scheduler] [validationinterface.cpp:180] [operator()] [validation] UpdatedBlockTip: new block hash=0c48535b5c968dc6af40a9610748eb0578492ae2ba4d5800bc7f43574015f5c5 fork block hash=29b484857bff1534ae62576263136066ea100fcaac7ae1b6486e779a4a01b708 (in IBD=true)
26 node0 2024-07-06T14:30:50.088294Z [scheduler] [validationinterface.cpp:220] [operator()] [validation] MempoolTransactionsRemovedForBlock: block height=20 txs removed=0
27 node0 2024-07-06T14:30:50.088321Z [scheduler] [validationinterface.cpp:210] [operator()] [validation] BlockConnected: block hash=6f1cd206b4238dc974a98e62a08a105d6e91021405c57f2be587863340c1f541 block height=20
28 node0 2024-07-06T14:30:50.088475Z [scheduler] [validationinterface.cpp:180] [operator()] [validation] UpdatedBlockTip: new block hash=6f1cd206b4238dc974a98e62a08a105d6e91021405c57f2be587863340c1f541 fork block hash=0c48535b5c968dc6af40a9610748eb0578492ae2ba4d5800bc7f43574015f5c5 (in IBD=true)
29 node0 2024-07-06T14:30:50.088515Z [msghand] [util/thread.cpp:22] [TraceThread] msghand thread exit
30 node0 2024-07-06T14:30:50.089252Z [shutoff] [net.cpp:2327] [DumpAddresses] [net] Flushed 0 addresses to peers.dat 0ms
31 node0 2024-07-06T14:30:50.089308Z [scheduler] [util/thread.cpp:22] [TraceThread] scheduler thread exit
32 test 2024-07-06T15:10:50.029000Z TestFramework.utils (ERROR): wait_until() failed. Predicate: ''''
33 self.wait_until(lambda: self.is_node_stopped(**kwargs), timeout=timeout)
34 '''